Thursday, July 29, 2010

Interprocess Communication

Summary: This article gives an overview of common approaches to interprocess communication.

When applications need to communicate with each other, you usually consider the following approaches:

File Transfer
Applications communicate via files. Producing applications write data to files and consuming applications read files to get data.

The advantage is that applications are loosely coupled. They do not have to know about each other's internal implementation. They just need to know the file format and the location of the files.

On the other hand, writing and reading files is relatively slow. Therefore, files cannot be updated very frequently (e.g. many times per second); usually they are updated at intervals, e.g. hourly. This causes delays, and applications reading the file must often deal with the fact that the data is not up to date, which leads to synchronization issues.
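
To make the file-based exchange concrete, here is a minimal sketch in Python. The function names `publish` and `consume` and the JSON file format are assumptions chosen for illustration, not something a particular product prescribes. The producer writes to a temporary file and then renames it, so a consumer does not read a half-written file, and the consumer reports how old the data is, because stale data is the main thing it has to cope with.

```python
import json
import os
import tempfile
import time


def publish(data, path):
    """Producer: write the data to a temporary file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(data, f)
    # The rename is a single step (atomic on POSIX systems), so a reader
    # sees either the previous complete file or the new complete file.
    os.replace(tmp_path, path)


def consume(path):
    """Consumer: read the published file and report its age in seconds."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # The modification time tells the consumer how stale the data may be.
    age_seconds = max(0.0, time.time() - os.path.getmtime(path))
    return data, age_seconds
```

Both sides only share the path and the format. Whether the consumer accepts data of a given age is a decision the consuming application has to make itself.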

Database Storage
Applications communicate via shared data in a database. Shared data is written to the database, from where other applications can read it.

The advantage is that the database provides unified access to all data (e.g. via SQL), so applications do not have to deal with different file formats. In addition, the transaction mechanism helps to keep the data consistent.

The drawback is that applications using the database depend on the data schema. Therefore, a change in the data structure (data schema) can force changes in the applications using the database.

Performance can also be problematic, especially if multiple applications frequently read and update the same data or if the database is distributed across different locations.
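
The following sketch illustrates the idea with Python's built-in `sqlite3` module. The table `stock`, its columns and the helper functions are hypothetical and exist only for this example. Each application opens its own connection to the same database file, and the update runs inside a transaction so that the data stays consistent.

```python
import sqlite3


def open_shared_db(path):
    """Each application opens its own connection to the same database file."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stock ("
        "item TEXT PRIMARY KEY, quantity INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def set_stock(conn, item, quantity):
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO stock (item, quantity) VALUES (?, ?)",
            (item, quantity),
        )


def reserve(conn, item, amount):
    """Take `amount` units in one transaction; return False if too few are left."""
    with conn:
        cur = conn.execute(
            "UPDATE stock SET quantity = quantity - ? "
            "WHERE item = ? AND quantity >= ?",
            (amount, item, amount),
        )
        return cur.rowcount == 1


def read_stock(conn, item):
    rows = conn.execute(
        "SELECT quantity FROM stock WHERE item = ?", (item,)
    ).fetchall()
    return rows[0][0] if rows else None
```

Note that every function hard-codes the table and column names. That is exactly the schema coupling described above: renaming the `quantity` column would require changing every application that uses it.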

Remote Procedure Call
Applications communicate via exposed functionality. The application providing the functionality uses middleware (e.g. WCF) to hide the interprocess communication. The intention is that remote calls look like local calls. The communication is usually synchronous.

The advantage of this approach is that the application provides a specified functionality and encapsulates (hides) data.

The disadvantage is that the communicating applications are coupled. E.g.:
The calling application assumes that the other side implements a particular interface. If you add a new method to the interface, client applications are typically affected and must be updated and recompiled.

Also, the illusion that remote calls are the same as local calls can be misleading. Interprocess communication (especially across the network) is much slower and fails more often than simple local calls. Because of these differences, you may have to resolve issues that usually do not exist with local calls. E.g.:
"How long can the call be blocked waiting for the response? Do you need some mechanism to cancel the call?"
"What if the connection is broken? Do you need to reconnect and recover the call?"
Therefore, although the Remote Procedure Call approach provides the abstraction that communication between two applications is a simple local call, you cannot fully rely on it: if you want to avoid trouble, you cannot ignore the specifics of interprocess communication.
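
A small sketch can show both sides of this. It uses Python's standard `xmlrpc` modules as a stand-in for the middleware (the article mentions WCF; this is a different technology, picked only because it ships with Python). The names `add`, `remote_add`, `start_server` and `TimeoutTransport` are made up for this example. The call `proxy.add(a, b)` reads like a local call, but the caller still has to decide how long to wait and what to do when the connection fails.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The functionality exposed by the service application."""
    return a + b


def start_server():
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


class TimeoutTransport(xmlrpc.client.Transport):
    """Transport that limits how long a call may block (an assumption of this sketch)."""

    def __init__(self, timeout):
        super().__init__()
        self.timeout = timeout

    def make_connection(self, host):
        conn = super().make_connection(host)
        conn.timeout = self.timeout  # used when the HTTP connection is opened
        return conn


def remote_add(url, a, b, timeout=2.0):
    """Looks like a local call, but returns None if the remote call fails."""
    try:
        with xmlrpc.client.ServerProxy(url, transport=TimeoutTransport(timeout)) as proxy:
            return proxy.add(a, b)
    except (OSError, xmlrpc.client.Error):
        # Broken connection, timeout or a fault reported by the server.
        return None
```

Returning `None` on failure is just one possible policy. A real client might retry, reconnect or report the error to the user, which is precisely the kind of decision a plain local call never requires.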

Messaging
Applications communicate via messages. Communicating applications use middleware (message-oriented middleware) to send and receive messages. The sending application sends messages through a channel and the receiving application listens on the channel for incoming messages. The communication is usually asynchronous, i.e. the sending application is not blocked until the message is processed by the receiving application.

The advantage is that the communicating applications are loosely coupled. They do not have to know about internal implementations of each other. They just need to know the message format and the channel to send / receive the message.
(If you add a new message, client applications are not affected.)

Another advantage is that the service application provides a specified functionality and encapsulates (hides) data.
(Request message invokes a functionality and response message returns results.)

The price of messaging is that the implementation has to deal with asynchronous programming, which is more difficult and adds complexity to the implementation.
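
The sketch below shows the request/response pattern with messages. To keep it self-contained, a `queue.Queue` shared between two threads stands in for the channel of a message-oriented middleware, which is a simplification of this example. The message fields (`type`, `id`, `value`, `reply_to`) and the message types are hypothetical as well.

```python
import queue
import threading


def service(requests):
    """Receiver: listen on the request channel until a stop message arrives."""
    while True:
        message = requests.get()
        if message["type"] == "stop":
            break
        if message["type"] == "square":
            reply = {"id": message["id"], "result": message["value"] ** 2}
            message["reply_to"].put(reply)
        # Any other message type is ignored by this receiver.


def run_demo():
    requests = queue.Queue()
    replies = queue.Queue()
    worker = threading.Thread(target=service, args=(requests,))
    worker.start()

    # put() returns immediately; the sender does not wait for processing.
    for i, value in enumerate([2, 3, 4]):
        requests.put({"type": "square", "id": i, "value": value, "reply_to": replies})
    requests.put({"type": "audit"})  # a message type this receiver does not know
    requests.put({"type": "stop"})

    worker.join()  # after this, all replies are already in the reply channel
    results = {}
    while not replies.empty():
        reply = replies.get()
        results[reply["id"]] = reply["result"]
    return results
```

The sender has to match replies to requests by `id`, because responses arrive separately from the calls that triggered them. This bookkeeping is a small taste of the extra complexity that asynchronous programming brings.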

It is not possible to say that one communication approach is good and another is bad. Each has its own pros and cons. The point is to choose the approach (or combination of approaches) that best fits the particular project. Communication needs should not be underestimated, but they should not be over-engineered with fancy technologies either. Always applying the same approach does not necessarily work: if the preferred approach does not fit, using it leads to workarounds and additional complexity.
Specifically, I have a feeling that for interactive communication there is a tendency to prefer Remote Procedure Calls. But if coupling or the synchronous approach turns out to be problematic, Messaging could be a better alternative.