eneter.net: How to Design Interprocess Communication

Designing interprocess communication involves many decisions: which communication protocol or transport mechanism to use, how to serialize data, how the communication should behave, what to do when the connection is broken, and so on.
This article discusses communication between processes and offers a guide to weighing these aspects and options when designing it.

Software Architecture and Interprocess Communication

To design and implement communication between processes, we first need to understand how interprocess communication relates to software architecture.

So, what is software architecture?

There is no single, commonly agreed definition of the term software architecture; there are several. In general, though, they say that software architecture is the way a software system is structured. For example, the definition from [Bas12]:

The software architecture of a system is the set of structures needed to reason about the system, which comprise software elements, relations among them, and properties of both.

The term structure typically suggests something static. It is therefore obvious that software architecture includes a decomposition into elements with their static relationships, such as dependencies or hierarchies. It may be less obvious that software architecture includes dynamic behavior too: how dynamics such as message flow, state transitions, and sequential or parallel execution are organized (structured) in the software system.

The architecture is therefore not just a static organization of elements but also dynamic behavior, which includes the interaction between elements. When the interacting elements run in different processes, that interaction is interprocess communication.

Interprocess communication differs from interaction within one process. It introduces many aspects that are not present inside a single process: transferring data across the network is much slower than a local call, security becomes a concern, one communicating party can be temporarily unavailable, the communicating parties can be implemented in different programming languages and run on different operating systems, and so on.
All these aspects can have a significant impact on the software system, so they need to be considered carefully when creating the software architecture.

Creating Software Architecture

Software architecture (the set of structures representing the system) is derived from requirements. The software architect must recognize those requirements that have a profound effect on the structures representing the system. These are called Architecturally Significant Requirements, and they are not easy to find.
There are three categories of requirements:

Functional requirements
Non-functional requirements
Constraints

Functional requirements describe which features are expected. They specify what the software shall do. Capturing them in some kind of functional specification document is usually not so difficult.
However, these requirements are typically not the ones that drive the structuring (architecting) of the software.
For example, the same functional requirements can be satisfied by different architectures. But will those architectures be equally good for the given product or project?
It is common for a software system to have all the required features and still need rework, e.g. because it is difficult to add a new feature or because the system is expensive to maintain.

Non-functional requirements represent quality attributes that are expected from the whole software system or from individual functions. These quality attributes are typically not obvious from the functional requirements alone; in many cases they are only implicitly assumed. The software architect must capture them by talking to the various stakeholders to understand the wider context of the software system. For example, what drives your business to be successful, and how should the architecture support that? If a business driver is 'time to market', it may be important to be able to introduce new features quickly. Or customers may value your application because you can quickly customize it to their needs. Or your application must provide outstanding performance. Or you want to port the application to various platforms.
When analyzing these requirements you find yourself thinking about modules and layers with certain responsibilities, proposing mechanisms for extending behavior, or thinking about the technologies you may need. Quality attributes force you to organize the software system into structures. Non-functional requirements therefore have a major impact on the software architecture.

Constraints are requirements that specify what is already decided and cannot be changed, e.g. the use of a certain programming language or operating system. Some constraints also have an impact on the architecture. For example, it may be decided that the software system must be cloud based. Or it must be closely integrated with another system across multiple layers, which can result in mirroring that system's architecture.

Quality Attributes and Interprocess Communication

As discussed above, quality attributes have a major impact on the software architecture. There are many types of quality attributes that the architecture can reflect:

performance, usability, reliability, recoverability, security, modifiability, reusability, testability, scalability, extensibility, portability etc.

(You can refer to ISO 25010 for various categories of quality attributes but there are more than specified in this standard.)

Quality attributes affect interprocess communication, but interprocess communication may also introduce new quality attributes to the software system. For example, it may force you to address topics such as a slow network, a broken connection, or confidential data. This can result in additional quality attributes (non-functional requirements) for your system, such as performance, availability, recoverability, or security.

Interprocess communication must provide the interaction between entities in accordance with the quality attributes identified for the system. There are various tactics you can apply to address quality attributes in your architecture. For example, to achieve modifiability you can use tactics such as reducing the size of a module, encapsulation, or restricting dependencies. To address quality attributes in interprocess communication, you need tactics for the communication between processes.

Tactics for Interprocess Communication

The following part discusses quality attributes and tactics which are relevant for the interprocess communication. It summarizes quality attributes from the interprocess communication perspective and shows possibilities for designing the communication.

Performance

Performance is about how much work a software system can perform in a specified time interval. In interprocess communication, performance is related to speed (how fast a message is serialized, sent, received and deserialized) and throughput (how many requests the system can process in a given time).
To address performance:

Consider using a binary serializer instead of XML or JSON.
For interprocess communication on one computer, prefer shared memory or named pipes to TCP, Web Sockets or HTTP.
For communication across the network, consider using plain TCP or Web Sockets instead of HTTP, e.g. if the communication requires notification messages (pushed from the service) or if one request can generate multiple responses. HTTP is a poor fit here because it is a 'one request, one response' protocol in which the client initiates every exchange, so polling mechanisms or other workarounds are needed to achieve notifications or multiple responses. This overhead reduces performance.
Avoid opening and closing connections unnecessarily.
Prefer parallel processing of incoming connections on the service side.
Consider using a load balancer component to distribute the workload across multiple computers.
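
As a rough illustration of the first point, the sketch below encodes the same record once as JSON and once in a fixed binary layout using Python's standard struct module. The record fields (sensor_id, timestamp, value) and the layout are assumptions made up for this example; a real system would more likely use an established binary serializer.

```python
import json
import struct

# Hypothetical record used only for this illustration.
record = {"sensor_id": 17, "timestamp": 1372896000, "value": 21.5}

# Text encoding: field names and digits travel with every message.
json_bytes = json.dumps(record).encode("utf-8")

# Binary encoding: a fixed layout that both sides must agree on.
# "<" = little-endian without padding; H = uint16, I = uint32, d = float64.
LAYOUT = struct.Struct("<HId")
binary_bytes = LAYOUT.pack(record["sensor_id"], record["timestamp"], record["value"])


def decode(data: bytes) -> dict:
    """Rebuild the record from its binary form."""
    sensor_id, timestamp, value = LAYOUT.unpack(data)
    return {"sensor_id": sensor_id, "timestamp": timestamp, "value": value}


# Compare the encoded sizes of the two representations.
print(len(json_bytes), len(binary_bytes))
```

The binary form is smaller and cheaper to parse, but it is only readable by a peer that knows the exact layout, which is the trade-off against interoperability discussed later.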

Resource Friendly

Resource friendliness is the ability to minimize the consumption of resources. In interprocess communication, resource consumption typically concerns battery drain in mobile devices. (How long can the application run and communicate without recharging the battery?)

To address resource friendliness:

To save battery, the device can switch to sleep mode. Prefer handling this state in your communication to programmatically preventing the device from sleeping. For example, consider keeping the connection open only for as long as it is needed, or consider providing a mechanism that recovers the connection after the device wakes up.

Interoperability

Interoperability is the ability of diverse systems to exchange data and interpret it correctly.

To address interoperability:

Design the API precisely and provide comprehensive documentation with examples.
Consider using communication standards, but be careful about relying on standards alone. The same standard can have a "slightly" different implementation or interpretation in two different systems.
If you use a communication middleware, other systems will probably need to use the same middleware to interact with your system. Analyze whether that is a problem. Also consider which middleware to choose. For example, if you need interoperability across multiple platforms, the middleware should be available on all of them.
Consider supporting multiple communication standards, mechanisms and middleware products in your software system.
Consider encapsulating the communication functionality in a separate, extensible component (layer). Then, if needed, you can extend the system with another communication technology to achieve interoperability.

Accessibility

Accessibility is the degree to which a software system is accessible to various people. In interprocess communication, accessibility is about making the system reachable from the various types of devices and platforms people may use.

To address accessibility:

If you provide client applications for multiple devices and platforms (e.g. Android, iPhone, Windows Phone, HTML5, .NET ...), you need to consider the multiplatform aspect in the communication too. For example, platform-specific binary serialization is typically not compatible among these platforms, so you may need to use XML or JSON serialization instead. Be aware of the various string encodings (UTF-8, UTF-16, …). If you transfer binary data, be aware that platforms may differ in the byte order of numbers (little-endian, big-endian).
Consider exposing the service via multiple communication protocols at the same time. For example, the service can listen on TCP, Web Sockets and HTTP. (A Windows Phone 7.0 client uses HTTP, an HTML5 client uses Web Sockets and a .NET client can use TCP.)
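
The byte-order and string-encoding pitfalls mentioned above can be shown in a few lines. This is only a sketch using Python's standard struct module; the helper name encode_text is made up for the example.

```python
import struct

value = 1

# The same 32-bit unsigned integer in the two byte orders.
little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first ("network order")

# Reading bytes with the wrong byte order silently yields a different number.
misread = struct.unpack("<I", big)[0]


def encode_text(text: str) -> bytes:
    """Encode text explicitly as UTF-8 so that both sides agree on the encoding."""
    return text.encode("utf-8")


print(little, big, misread)
print(encode_text("a"), "a".encode("utf-16-le"))
```

Stating the byte order and the text encoding explicitly in the message format, rather than relying on platform defaults, avoids this class of error.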

Availability

Availability is the ability of a system to stay operable and functional for a declared portion of a given time interval. For example, an annual availability of 99.9% means the system can be down for at most 8.76 hours in total during the year.
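
The arithmetic behind the 8.76 hours can be reproduced for other availability targets. The function name below is an assumption for this sketch, and a year is taken as 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the yearly downtime budget for a given availability target."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {allowed_downtime_hours(target):.2f} hours per year")
```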

To address availability:

Consider providing identical backup services, so that if the service becomes unavailable (e.g. because of maintenance or a crash) the communication is automatically rerouted to a backup and the overall system stays functional. (This approach can be a little tricky if the service received a request but failed before sending the response. There should be a mechanism handling this state, e.g. the client may need to repeat the request.)
Alternatively, consider providing multiple identical services. Each incoming message is then duplicated and forwarded to all services for parallel processing. If one service becomes unavailable, the request is still processed by the remaining services, so the overall system stays functional.
Consider providing a mechanism that constantly monitors the network connection, so that a disconnection is detected early and the system can handle it (e.g. by starting a recovery procedure).
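
The first tactic, rerouting to a backup, might look roughly like the sketch below. The services are plain Python callables standing in for real remote endpoints, and ServiceUnavailable, primary and backup are hypothetical names introduced only for this example.

```python
from typing import Callable, Sequence


class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) service stub when it cannot be reached."""


def send_with_failover(request: str,
                       services: Sequence[Callable[[str], str]]) -> str:
    """Try the primary service first, then each backup in the given order."""
    last_error = None
    for service in services:
        try:
            return service(request)
        except ServiceUnavailable as error:
            last_error = error  # remember the failure and try the next service
    raise ServiceUnavailable("all services failed") from last_error


def primary(request: str) -> str:
    raise ServiceUnavailable("primary is down")


def backup(request: str) -> str:
    return f"backup handled {request}"


print(send_with_failover("ping", [primary, backup]))  # prints "backup handled ping"
```

Note that repeating a request on a backup is only safe if processing the same request twice does no harm, which is the tricky case mentioned in the first tactic.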

Recoverability

Recoverability is the ability of a system to recover from a failure and continue functioning. In interprocess communication, recoverability is about recovering from a broken connection.

To address recoverability:

Consider providing a mechanism that constantly monitors the connection, so that a broken connection is detected early.
Consider providing a reconnecting mechanism.
Consider providing a message buffering mechanism, so that messages sent during a disconnection are stored in a buffer until the connection is recovered. Once the connection is reopened, the messages from the buffer are sent to the receiver. This can be useful on unstable networks where short, sporadic disconnections occur (e.g. a mobile device can lose its signal).
If your application consists of multiple interacting processes, consider making the communication among them independent of the startup order. For example, sent messages can be buffered until the message receiver is up and running.
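
A message buffer of the kind described above could be sketched as follows. The class name, the transport_send callable and the on_connected/on_disconnected hooks are assumptions for this example; a real implementation would be driven by whatever connection monitoring and reconnecting mechanism the system uses.

```python
from collections import deque
from typing import Callable, Deque


class BufferedSender:
    """Queues messages while disconnected and flushes them after reconnecting."""

    def __init__(self, transport_send: Callable[[bytes], None],
                 max_buffered: int = 1000):
        self._send = transport_send  # hypothetical low-level send function
        # With maxlen set, the oldest message is dropped when the buffer is full.
        self._buffer: Deque[bytes] = deque(maxlen=max_buffered)
        self._connected = False

    def send(self, message: bytes) -> None:
        if self._connected:
            try:
                self._send(message)
                return
            except ConnectionError:
                self._connected = False  # fall through and buffer the message
        self._buffer.append(message)

    def on_connected(self) -> None:
        """Call when the connection is (re)established; flushes in send order."""
        self._connected = True
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except ConnectionError:
                self._connected = False
                return
            self._buffer.popleft()  # drop a message only after it was sent

    def on_disconnected(self) -> None:
        self._connected = False
```

The same buffer also covers the startup-independence case: messages sent before the receiver is running simply wait until on_connected is called.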

Reliability

Reliability in interprocess communication is the ability to deliver messages.

To address reliability:

Consider using acknowledged messaging. The message receiver sends back a confirmation that the message was received. If the sender does not get the confirmation within a specified time, it considers the message not delivered.
Consider using persistent message queues. Messages are persisted in the queue, so they are not lost, and the receiver can pick them up from the queue whenever it is ready.
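
The bookkeeping behind acknowledged messaging can be sketched as below. The AckTracker class and its method names are hypothetical, and the current time is passed in explicitly so that the example stays deterministic; a real sender would read a clock and actually transmit the payload.

```python
import itertools
from typing import Dict, List, Tuple


class AckTracker:
    """Tracks sent messages until the receiver confirms them."""

    def __init__(self, timeout_seconds: float):
        self._timeout = timeout_seconds
        self._ids = itertools.count(1)
        # message id -> (deadline, payload)
        self._pending: Dict[int, Tuple[float, bytes]] = {}

    def register(self, payload: bytes, now: float) -> int:
        """Record a message that was just sent and return its id."""
        message_id = next(self._ids)
        self._pending[message_id] = (now + self._timeout, payload)
        return message_id

    def acknowledge(self, message_id: int) -> None:
        """Call when the receiver's confirmation arrives."""
        self._pending.pop(message_id, None)

    def take_expired(self, now: float) -> List[Tuple[int, bytes]]:
        """Remove and return messages whose confirmation did not arrive in time."""
        expired = [(mid, payload)
                   for mid, (deadline, payload) in self._pending.items()
                   if now >= deadline]
        for mid, _ in expired:
            del self._pending[mid]
        return expired
```

What to do with the expired messages (resend them, report an error, or both) is a design decision this sketch leaves open.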

Security

Security consists of confidentiality (protecting messages from unauthorized access), integrity (protecting messages from unauthorized manipulation) and authenticity (verifying the identity of the communicating parties, i.e. that they really are who they say they are).

To address security in interprocess communication:

Consider using a secured TCP connection (SSL or TLS). The whole communication is encrypted, so messages are protected from unauthorized access and manipulation. The protocol can also verify the identity of the service by its certificate and, if client certificates are used, the identity of the client too.
Consider using symmetric AES to encrypt messages. It protects messages from unauthorized access. Protection from manipulation additionally requires an authenticated mode (e.g. AES-GCM) or a message authentication code. Because both sides share the same key, AES cannot prove which of them sent a message.
Consider using asymmetric RSA (with public and private keys). Encrypting with the receiver's public key protects messages from unauthorized access; signing with the sender's private key protects them from manipulation and verifies the sender's identity.
Consider using digitally signed messages. This approach does not protect messages from unauthorized access, but it protects them from unauthorized manipulation and can verify the identity of the message sender.
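
Python's standard library has no asymmetric signatures, so the sketch below swaps in a related but different technique: an HMAC, which is a message authentication code based on a shared secret key. Like a digital signature it detects manipulation, and like one it does not hide the content, but it only shows that the sender holds the shared key. The function names and the key are assumptions for this example.

```python
import hashlib
import hmac

TAG_LENGTH = 32  # size of an HMAC-SHA256 tag in bytes


def sign(key: bytes, message: bytes) -> bytes:
    """Return the message with an HMAC-SHA256 tag appended."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return message + tag


def verify(key: bytes, signed: bytes) -> bytes:
    """Return the original message, or raise ValueError if it was altered."""
    if len(signed) < TAG_LENGTH:
        raise ValueError("message too short")
    message, tag = signed[:-TAG_LENGTH], signed[-TAG_LENGTH:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("integrity check failed")
    return message


shared_key = b"k" * 32  # placeholder; a real key must be random and kept secret
protected = sign(shared_key, b"hello")
print(verify(shared_key, protected))
```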

Scalability

Scalability is the ability to handle an increasing workload by adding new resources. In interprocess communication, scalability is about increasing the performance of the system by adding further computers that host processing services.

To address scalability:

Consider using a load balancer component that maintains a list of identical services and distributes the workload among them.
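
One simple distribution strategy is round robin, sketched below. The services are again plain callables standing in for remote processing services, and RoundRobinBalancer and make_worker are names invented for this example; round robin is only one of several possible strategies.

```python
import itertools
from typing import Callable, List


class RoundRobinBalancer:
    """Forwards each request to the next service in a fixed rotation."""

    def __init__(self, services: List[Callable[[str], str]]):
        if not services:
            raise ValueError("at least one service is required")
        self._rotation = itertools.cycle(services)

    def dispatch(self, request: str) -> str:
        service = next(self._rotation)
        return service(request)


def make_worker(name: str) -> Callable[[str], str]:
    """Create a stub service that tags the request with its own name."""
    return lambda request: f"{name}:{request}"


balancer = RoundRobinBalancer([make_worker("A"), make_worker("B")])
print([balancer.dispatch(r) for r in ("r1", "r2", "r3")])
# prints ['A:r1', 'B:r2', 'A:r3']
```

Adding capacity then means adding another service to the list, without changing the clients.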

Monitorability

Monitorability is the ability to monitor how the system is functioning. In interprocess communication, monitorability is about monitoring the availability of the connection or the workload of a service.

To address monitorability:

Consider providing a mechanism that continuously monitors the connection, so that the system always knows whether the connection is working.
Consider providing a service that monitors the workload on server computers. This workload information can be used, for example, by a load balancer to distribute the workload optimally among multiple servers.
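
Both ideas can be sketched in a few lines. The heartbeat approach, the class and function names, and the server names are assumptions for this example, and the time is passed in explicitly to keep the sketch deterministic.

```python
from typing import Dict


class HeartbeatMonitor:
    """Considers a connection alive while heartbeats keep arriving in time."""

    def __init__(self, timeout_seconds: float, now: float):
        self._timeout = timeout_seconds
        self._last_seen = now

    def on_heartbeat(self, now: float) -> None:
        """Call whenever a heartbeat (or any message) arrives from the peer."""
        self._last_seen = now

    def is_alive(self, now: float) -> bool:
        return (now - self._last_seen) <= self._timeout


def least_loaded(workloads: Dict[str, float]) -> str:
    """Pick the server that currently reports the lowest workload."""
    return min(workloads, key=workloads.get)


monitor = HeartbeatMonitor(timeout_seconds=3.0, now=0.0)
monitor.on_heartbeat(now=2.0)
print(monitor.is_alive(now=4.0), monitor.is_alive(now=6.0))
print(least_loaded({"server-1": 0.8, "server-2": 0.3}))
```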

References

[Bas12] Software Architecture in Practice (3rd Edition) by Len Bass, Paul Clements and Rick Kazman.

1 comment:

Henri, July 5, 2013 at 10:32 PM
Nice article Ondrej. Just a few comments:

TCP and HTTP are actually on two different layers (transport and application, respectively), and TCP is actually the most common transport layer protocol used with HTTP. Also, HTTP doesn't prevent both systems from connecting to each other and sending messages in both directions. The tricks you mention are only necessary if only one system (e.g. a client) can connect to the other one (e.g. a server) but not the other way around. Of course, as you mention, Websockets have the advantage that they allow the server to send messages to the client even though the connection has been opened by the client (which requires tricks to work with HTTP).


Thursday, July 4, 2013
