The A&D Solution in IFS Cloud is distributed. It coordinates multiple
services through a series of complex message interactions across various
frameworks and infrastructure, each with its own failure points and
error-surfacing mechanisms. When an error occurs in a communication request
or event, it is surfaced or logged in various ways, which makes it
difficult to handle errors cohesively across the solution.
The monitoring service, which is centered on the Communication Log page
(ADMON_COMMUNICATION_LOG), monitors communication between IFS Cloud and
A&D Services and surfaces errors through a uniform interface.
Using the communication between Mobile Maintenance (MM) for Aviation and
Maintenix services as an example, the following diagram highlights the key
communication stages in a Request pattern. Each stage appears as a log
record on the Communication Log page.
The following table describes the communication log type codes used to identify each stage in the Request communication.
Number | Communication Type | Notes |
1 | Request Initiated | A request processing flow has been started in a service. |
2 | Request Sent | The request has been sent by the client service. |
3 | Request Received | The request has been received by the target service. |
4 | Response Sent | The target service has sent a response. |
5 | Response Received | The client service has received the response. |
6 | Request Completed | The client has completed the processing of the response. |
7 | Error | This is a catch-all state that results when any unexpected (i.e., non-business) error prevents the message from completing successfully. An IFS Application Event is emitted to alert administrators to these failures. |
NA | Exception | An exception refers to a specific event that disrupts the normal flow of communication. In the event of an exception, six retry attempts are made to process the message. |
NA | Ad hoc | An ad hoc entry indicates either a duplicate message that has been skipped to prevent reprocessing, or an entry added to support troubleshooting or development. |
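The stage sequence in the table above can be used to work out how far a given request progressed. The following is a minimal sketch, not the product's implementation: it assumes the log records for one request have already been collected as a list of communication type labels, and the function name `summarize_request` and the shape of its result are hypothetical.

```python
# Ordered request stages, taken from the table above.
REQUEST_STAGES = [
    "Request Initiated",
    "Request Sent",
    "Request Received",
    "Response Sent",
    "Response Received",
    "Request Completed",
]


def summarize_request(communication_types):
    """Summarize one request from the communication types of its log records.

    `communication_types` is assumed to be a list of labels as shown in the
    Communication Type column (hypothetical input format).
    """
    seen = set(communication_types)
    reached = [stage for stage in REQUEST_STAGES if stage in seen]
    missing = [stage for stage in REQUEST_STAGES if stage not in seen]
    failed = "Error" in seen
    return {
        "last_stage": reached[-1] if reached else None,
        "missing_stages": missing,
        "failed": failed,
        "completed": "Request Completed" in seen and not failed,
    }
```

For example, a request whose records stop at Request Received followed by an Error record would be reported as failed, with the three response-side stages listed as missing. This points to the target service as the place to start troubleshooting.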
Using the communication between Mobile Maintenance (MM) for Aviation and Maintenix services as an example, the following diagram highlights the key communication stages in an Event pattern. Each stage will appear as a log record in the Communication Log.
The following table describes the communication log type codes used to identify each step in the Event communication.
Number | Communication Type | Notes |
1 | Event Sent | The event has been produced by a source service. |
2 | Event Received | The event has been received by a consumer service. |
3 | Event Completed | The consumer service has completed processing of the event. |
4 | Error | This is a catch-all state that results when any unexpected error is encountered that prevents the data synchronization communication from completing. An IFS Application Event is emitted to alert administrators to these failures. |
NA | Exception | An exception refers to a specific event that disrupts the normal flow of communication. In the event of an exception, six retry attempts are made to process the message. |
NA | Ad hoc | An ad hoc entry indicates either a duplicate message that has been skipped to prevent reprocessing, or an entry added to support troubleshooting or development. |
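The consumer side of the Event pattern, including the Exception and Ad hoc entries, can be illustrated with a short sketch. This is an assumption-laden illustration rather than the actual service logic: the function `consume_event`, the in-memory `log` list, and the `seen_ids` set are hypothetical. It also assumes that the six retries follow one initial attempt, and that an Error record is written once the retries are exhausted; the table does not state either point.

```python
MAX_RETRIES = 6  # from the table: six retry attempts after an exception


def consume_event(message_id, handler, log, seen_ids):
    """Process one event and append (message_id, communication_type) to `log`.

    Returns True if the event was processed, False if it was skipped or failed.
    """
    if message_id in seen_ids:
        # Duplicate message: skip it and record an Ad hoc entry.
        log.append((message_id, "Ad hoc"))
        return False

    log.append((message_id, "Event Received"))

    # Assumption: one initial attempt followed by MAX_RETRIES retries.
    for _attempt in range(1 + MAX_RETRIES):
        try:
            handler(message_id)
        except Exception:
            log.append((message_id, "Exception"))
            continue
        seen_ids.add(message_id)
        log.append((message_id, "Event Completed"))
        return True

    # Assumption: exhausting the retries ends in the catch-all Error state.
    log.append((message_id, "Error"))
    return False
```

With a handler that fails twice and then succeeds, the log would contain Event Received, two Exception entries, and Event Completed. Delivering the same message again would add a single Ad hoc entry.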
The ADMON_ERROR_ALERT event is triggered when an error occurs in the
communication process (i.e., when a log record is written with the
communication type Error). The event contains information about the
error. An event action can be created to respond to this event, for
example, to send an email to administrators.
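Event actions are normally configured in IFS Cloud itself. Purely as an illustration of what an email-style action might assemble, the sketch below builds a notification message from an error event. The field names `message_id` and `error_text`, the function `build_error_alert`, and the recipient addresses are all hypothetical; the actual attributes carried by ADMON_ERROR_ALERT are not described here.

```python
from email.message import EmailMessage


def build_error_alert(event, recipients):
    """Build an email notification for one error event.

    `event` is assumed to be a dict with hypothetical keys
    'message_id' and 'error_text'.
    """
    message_id = event.get("message_id", "unknown")
    error_text = event.get("error_text", "No error details provided.")

    msg = EmailMessage()
    msg["Subject"] = f"A&D communication error for message {message_id}"
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        "A log record with communication type Error was recorded.\n"
        f"Message: {message_id}\n"
        f"Details: {error_text}\n"
    )
    return msg
```

The returned message could then be handed to whatever mail transport is available, for example `smtplib.SMTP.send_message` from the Python standard library.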