

March 14, 2009

Chronistic Coupling, Communications

The comments from Ian and rlb3 have made me think a bit more on what I said last time about Chronistic Coupling. One thing I didn't make perfectly clear is that I'm not advocating avoiding Chronistic Coupling at all costs.

Any real system will require some amount of Chronistic Coupling. The key design decision is how much. Choosing the wrong level of coupling will certainly impact how your system evolves in the future. Over the next few posts, I'm going to explore several of these levels of Chronistic Coupling with examples.

Communications Protocols

Once upon a time, people implementing communication between two processes (or computers) regularly debated how the data should be transferred: ASCII or binary. (This was pre-Unicode.) The advocates of the binary approach argued that it was more efficient for two reasons:

  • Fewer bytes sent over the network
  • No time spent converting to a network format and back

When we transferred data at 1200 or 2400 bps, these arguments were pretty convincing, especially when communicating between processes on the same machine.

However, there were problems when communicating between machines that did not share the same architecture. When crossing the architecture boundary, you had to do conversions anyway. Some places where the binary format might change include (see the sketch after this list):

  • Byte order
  • Size of primitive data types
  • Format of floating point data storage
  • Padding in larger binary structures (structs, etc.)
  • Encoding of strings (nul-terminated, length-prefixed, etc.)
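
To make a couple of these concrete, here is a minimal sketch using Python's struct module to stand in for machines with different conventions. The two-field record and the field sizes are invented for the example, not any particular protocol.

    # Illustrative only: the "same" two-field record, packed three ways.
    import struct

    record = (1, 1000000)                  # a small id and a larger count

    big    = struct.pack('>hi', *record)   # big-endian, standard sizes, no padding
    little = struct.pack('<hi', *record)   # little-endian, standard sizes, no padding
    native = struct.pack('@hi', *record)   # native order, native sizes, native padding

    print(len(big),    big.hex())          # 6 bytes: '0001000f4240'
    print(len(little), little.hex())       # 6 bytes: '010040420f00'
    print(len(native), native.hex())       # often 8 bytes on x86-64: padding appears

Both ends have to agree on every one of these choices, or else explicitly decode each other's native formats.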

Soon, a sizable amount of effort was going into converting binary data from other machines to the native format. The worst part was the lack of information in the data stream to help troubleshoot problems. Normally, you found out your decoding logic was wrong when some portion of the binary stream gave ridiculous results, or when you reached the end of the stream and found you had too little or too much data.

Meanwhile, text-based protocols sent more data over the wire (which became less of a problem as networks became faster). But where a text-based protocol really shines is in debugging the data stream. If the next number in the stream is 1000000 and you expected a 16-bit short int, it's easy to see there's a problem. In a binary stream, the first two bytes of a long int look the same as an actual short int; there's no way to tell (at the protocol level) that something is wrong.
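
As a rough illustration of that debugging difference (the field widths here are made up for the example), consider a receiver that expects a 16-bit value when the sender actually wrote a 32-bit one:

    # Hypothetical sketch: a mis-sized binary field fails quietly,
    # while the text form is readable or fails loudly.
    import struct

    payload = struct.pack('>i', 1000000)            # sender wrote a 4-byte big-endian int

    (value,) = struct.unpack_from('>h', payload)    # receiver reads only 2 bytes
    print(value)                                    # 15 -- plausible-looking garbage

    text_payload = b"1000000"
    print(int(text_payload))                        # 1000000 -- or an obvious parse error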

There were still problems. There was the EBCDIC vs. ASCII issue, which has mostly gone away. There is also the line ending problem (LF vs. CRLF vs. CR).
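
The line-ending problem is easy to see with a few bytes in hand; this small sketch just puts the three conventions side by side:

    # The same three lines, as sent by systems with different line-ending conventions.
    samples = {
        "LF":   b"one\ntwo\nthree\n",
        "CRLF": b"one\r\ntwo\r\nthree\r\n",
        "CR":   b"one\rtwo\rthree\r",
    }

    for name, data in samples.items():
        # bytes.splitlines() understands all three, but naive parsers often do not.
        print(name, len(data), data.splitlines())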

The biggest win for the text-based protocols was the success of TCP/IP protocols on the network. A large number of the protocols that run the Internet are basically text. HTTP, SMTP, FTP, Telnet, and more are essentially a series of text strings sent between the client and server.
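
For a feel of what "a series of text strings" looks like in practice, here is a deliberately simplistic HTTP/1.1 exchange over a raw socket; the host is just a placeholder for the example:

    # Sketch only: an HTTP request is plain text lines ending in CRLF.
    import socket

    host = "example.com"                    # placeholder host for illustration
    request = (
        "GET / HTTP/1.1\r\n"
        "Host: " + host + "\r\n"
        "Connection: close\r\n"
        "\r\n"
    )

    with socket.create_connection((host, 80)) as conn:
        conn.sendall(request.encode("ascii"))
        reply = conn.recv(4096)

    # The reply starts with a readable status line, e.g. "HTTP/1.1 200 OK".
    print(reply.decode("ascii", errors="replace").splitlines()[0])

Every byte of that exchange can be read by a person, with no knowledge of the architecture or language on either end.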

The major solutions to the size issue are relatively straightforward. First, networks got faster, so the problem matters less. Where bandwidth is still a concern, we can compress the text stream (with gzip, for example) to reduce the number of bytes. Since the compression can be used by everyone, it has been greatly optimized over the years, benefiting everyone who uses it.
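
As a rough sketch of that trade-off, Python's standard gzip module shows how well repetitive protocol text compresses; the sample payload is invented for the example:

    # Illustration only: compressing a repetitive text payload before sending it.
    import gzip

    message = ("MAIL FROM:<user@example.com>\r\n" * 200).encode("ascii")

    compressed = gzip.compress(message)
    print(len(message), "bytes of raw text")
    print(len(compressed), "bytes after gzip")

    assert gzip.decompress(compressed) == message   # the round trip is lossless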

The Present

As a result of the (possibly compressed) text-based protocols used on the net today, machines with very different architectures can communicate easily. Text protocols have lower Chronistic coupling than binary protocols. An email client written for 16-bit Windows 3.1 could send messages to a client on a 32-bit Windows XP system. A web page served from a 64-bit Linux box can be viewed comfortably on Mac OS X, Windows Vista, or a mobile phone. More importantly, these clients don't need to know whether the web page was generated by a C++ program, Ruby, Java, Lisp, or even Forth. It just doesn't matter.

Our video and audio formats are still binary because of the large amount of data being transferred, so we still have Chronistic coupling issues there. If you don't have the right codec for the file, you are basically out of luck. Many of these codecs are tied directly to the architecture for which they were written.

In this case, the reduced size still outweighs the ease of porting to multiple architectures.

Posted by GWade at March 14, 2009 11:29 PM.