This site will look much better in a browser that supports web standards, but is accessible to any browser or Internet device.

Anomaly ~ G. Wade Johnson Anomaly Home G. Wade Home

February 08, 2009

Serialized Objects and Chronistic Coupling

Many programs have a need to store program state to disk at various points. An approach used by many of these programs is to serialize the objects representing the program state directly to disk (or a database). Back in 2004 (XML-Serialized Objects and Coupling), I described a coupling problem caused by automatically serializing objects to XML.

Since that time, I have worked with other systems with similar functionality and have decided the problem was worse than I described five years ago. Serializing an object to disk with the intent of reading it in at a later date, couples the structure of the object from a past date to the structure of the object at a future date. If the object never changes form, that is not a problem. If the object structure needs to change, then the serialization process becomes more complicated. It has to take one of three forms:

  1. Convert the object to and from the old format.
  2. Recognize the old object and transform it into the new structure.
  3. Institute a versioning system that allows reading and writing the current format and older formats.

Chronistic Coupling

Recently, I have begun calling this effect Chronistic Coupling. (I like Temporal Coupling better, but that name is already taken.) Although you might think of this as another manifestation of Data Coupling, I think the time element makes Chronistic Coupling stronger (and more subtle) than data coupling. Unlike simple data coupling, object serialization couples the object structure through time. The older object format reaches forward in time to effect how the new program can structure its data.

If we allow saving in old formats, we must be very careful not to introduce an anachronism. This would be an old-style object that is inconsistent with the old program. This can cause problems that are hard to troubleshoot. You have to be able to identify where the old data came from to determine the problem. (In one system I worked on, we augmented the version of the data set with an extra piece of data describing the version of the program that saved this data.)

Costs of Chronistic Coupling

There is a sort of seductive quality to the idea that we can serialize objects and reinstantiate them at another time. This pattern recurs many times in the field of programming. Although it seems like a really good idea to have the data completely encapsulated by the object by serializing and deserializing the data straight to storage and back, the reality is there are still tradeoffs.

The obvious issue is to be certain that the data we read in is consistent with the design of the object. Most serialization needs to be augmented with some form of validation.

A separate issue that people often don't notice is that changes in the responsibilities and structure of an object can be hampered by Chronistic Coupling. At the very least, the code needed to deserialize old objects becomes much more complicated. In the worst case, it may be necessary to keep older classes in the design for the sole purpose of allowing us to convert old object into new object.

Where things really start to go bad is when a substantial portion of an object hierarchy changes. The object you have serialized may not bear any resemblance to the new classes. If the new object hierarchy is different enough, you would have to parse the old serialized object into a neutral format that can be used to instantiate the new objects, Either that, or you don't make the design improvements, because the work is too great (for this release).

In this way, the old design reaches into the future to prevent changes to the design. Often, the only way to fix the problem is to abandon backwards compatibility. This may result in major problems for clients or the need to provide special utility software to convert old data to a new format.

Conclusion

I am not saying that object serialization should always be avoided. The purpose of coining the term Chronistic Coupling is to give name to a cost that you may not realize that you are paying. In some cases, it might be better to store data in an object-neutral format and build new objects to represent the data rather than store the objects themselves. The unfortunate part of this is that there is no magical way to convert your objects to and from this simpler format.

No software can exist without some forms of coupling. However, some of the best minds in our field remind us to reduce coupling where we can. If you decide to use object serialization, remember that you are increasing coupling in the time dimension. It is important to consider whether or not this increased coupling is worth the cost.

Posted by GWade at 10:10 PM. Email comments