Anomaly ~ G. Wade Johnson

March 03, 2004

The Law of Unintended Consequences

One of the fundamental laws of the universe could be called the Law of Unintended Consequences. This law is as universal as Murphy's Law, but not as well recognized. To me, the gut-level understanding of this law is one of the things that shows the difference between a really good programmer and someone who just writes code.

In its simplest form, the law could be stated as:

Every action, no matter how small, has consequences that were not intended.

This law applies in the physical world as well as the world of programming. Heat generated by friction is an unintended consequence of many physical activities. Many historians believe that organized crime was an unintended consequence of prohibition. I saw a report a few years ago that showed a relation between better auto theft deterrent devices and car-jackings.

Fortunately, most of the unintended consequences we deal with as programmers aren't quite this dramatic...right? Let's explore some of the consequences that result from decisions and actions in our code.

For example, choice of language or paradigm has a large number of effects on your design from that point forward. If you choose a non-object-oriented language, you cut yourself off from many new techniques. (Unless, of course, you want to implement the OO primitives yourself.) On the other hand, choosing a strongly-typed OO language may slow you down if you are solving a quick scripting problem.

Choosing a development methodology also has many consequences. If you choose Extreme Programming, there is some evidence that you will be reducing the amount of documentation that accompanies the project. On the other hand, choosing a more structured methodology like the Rational Process does prevent very rapid development.

Most of us can see these kinds of consequences. In fact, those of us who have been working in the field for a while may even take the consequences into account as part of our decisions when starting a project. But, these aren't the only decisions and actions with unintended consequences.

The Y2K Problem

Many people pointed to the Y2K problem as an example of the short-sightedness of programmers, or of bad old programming practices. However, in many cases, these consequences were actually well understood by the people writing the code. In some cases, they had to weigh a data format that would significantly increase their data storage requirements against the possibility that this data or code would still be in use twenty or thirty years later. Remember that in the sixties and early seventies, you couldn't run to a computer store and pick up a 120GB hard drive. Storage space was precious and expensive. A decision that decreased the storage requirements of a system by 10% could mean the difference between success and failure. On the other hand, the idea that the software and data would remain in use thirty years later was not very believable. But this decision did not take into account how infrequently some kinds of systems are changed and how they change.

Many systems accumulate other programs that all communicate through the same data formats. Then, when these programs need to change, you are tied to the original formats because too much code would need to change at one time. Even a small change to the data format becomes a large risk.
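To make the trade-off concrete, here is a minimal sketch in Python. The record layouts and function names are hypothetical, chosen only for illustration; they are not taken from any real system. It shows both sides of the decision: the two-digit year saves two characters per date field, and the arithmetic on it quietly goes wrong once the century rolls over.

```python
# Hypothetical fixed-width date fields, as a storage-conscious record
# format might have defined them. These layouts are assumptions.
TWO_DIGIT_DATE = "850315"     # YYMMDD
FOUR_DIGIT_DATE = "19850315"  # YYYYMMDD

# Characters saved per date field by dropping the century.
SAVED_PER_FIELD = len(FOUR_DIGIT_DATE) - len(TWO_DIGIT_DATE)


def account_age_two_digit(opened_yy: int, today_yy: int) -> int:
    """Age in years computed from two-digit years, with no century."""
    return today_yy - opened_yy


def account_age_four_digit(opened_yyyy: int, today_yyyy: int) -> int:
    """Age in years computed from full four-digit years."""
    return today_yyyy - opened_yyyy


if __name__ == "__main__":
    # An account opened in 1985 and examined in 2000.
    print("saved per field:", SAVED_PER_FIELD)
    print("two-digit age:  ", account_age_two_digit(85, 0))
    print("four-digit age: ", account_age_four_digit(1985, 2000))
```

The two-digit version is perfectly correct for every pair of dates inside the same century, which is exactly why the consequence stayed hidden for so long.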

The Internet

Many of the protocols underlying the Internet are based on straight ASCII text. Although many people try to improve these protocols by making them binary and therefore more efficient, they miss one of the important design decisions: many of these protocols can be tested and debugged by a human being using telnet. If you have ever spent any time troubleshooting a binary protocol, you can appreciate the elegance of this solution. Many years ago, I was able to help my ISP solve an email problem by logging into the mail server directly and reporting to the tech support person how the server responded.
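A small Python sketch of the difference, under stated assumptions: the text request is an ordinary SMTP MAIL FROM command, but the binary framing (a one-byte opcode followed by a two-byte length) is entirely invented for this example and does not correspond to any real protocol.

```python
import struct

# A real SMTP command is just a line of ASCII text.
TEXT_REQUEST = b"MAIL FROM:<user@example.com>\r\n"

# Hypothetical binary framing, invented for illustration only:
# 1-byte opcode, 2-byte big-endian payload length, then the payload.
OPCODE_MAIL_FROM = 0x02
PAYLOAD = b"user@example.com"
BINARY_REQUEST = struct.pack("!BH", OPCODE_MAIL_FROM, len(PAYLOAD)) + PAYLOAD


def describe_binary(frame: bytes) -> str:
    """Decode the invented frame; this needs knowledge of the layout."""
    opcode, length = struct.unpack("!BH", frame[:3])
    payload = frame[3:3 + length].decode("ascii")
    return f"opcode={opcode} length={length} payload={payload}"


if __name__ == "__main__":
    # What a person sees on the wire in each case.
    print(TEXT_REQUEST.decode("ascii").strip())
    print(BINARY_REQUEST.hex())
    # The binary frame only becomes readable with a decoder in hand.
    print(describe_binary(BINARY_REQUEST))
```

The text request can be typed by hand and read off the wire as is. The binary one is shorter, but nobody can troubleshoot it without the specification and a tool that implements it.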

Every now and then someone attempts to "fix" one of these protocols by making it into a binary data stream. Invariably, one of the consequences of this action is to make the protocol harder to troubleshoot. This side effect almost always kills the attempt. One useful approach that has solved some of the bandwidth problems caused by using a straight ASCII protocol has been using a standard compression algorithm on top of the more verbose protocol. This often reduces the bandwidth of the protocol almost to the point of a binary implementation.

One place where I have seen this work particularly well is the design of Scalable Vector Graphics (SVG). SVG is an XML vocabulary for describing vector graphics. The main complaint from many people was that it was extremely verbose. Every graphic element is described with text tags and attributes. However, the SVG committee had considered this issue. There are two defined formats that an SVG processor is required to understand. The normal SVG files (with a .svg extension) are XML. The second format is gzip-compressed SVG files (with a .svgz extension). The compressed files use a standard format to reduce size, but use the more verbose format for flexibility and extensibility.
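The idea is easy to try with nothing but the Python standard library. This is only a sketch: the drawing is a made-up series of circles, and build_svg is a hypothetical helper, but gzip and the XML parser are the standard modules.

```python
import gzip
import xml.etree.ElementTree as ET


def build_svg(circle_count: int) -> bytes:
    """Build a deliberately repetitive SVG document (hypothetical drawing)."""
    circles = "".join(
        f'<circle cx="{10 + i}" cy="{10 + i}" r="5" fill="none" stroke="black"/>'
        for i in range(circle_count)
    )
    document = (
        '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">'
        f"{circles}</svg>"
    )
    return document.encode("utf-8")


svg_bytes = build_svg(100)             # what would go in a .svg file
svgz_bytes = gzip.compress(svg_bytes)  # what would go in a .svgz file

# Decompressing gives back exactly the same verbose, readable XML.
restored = gzip.decompress(svgz_bytes)
root = ET.fromstring(restored)

if __name__ == "__main__":
    print("plain size:     ", len(svg_bytes))
    print("compressed size:", len(svgz_bytes))
    print("elements parsed:", len(root))
```

Because the verbosity is mostly repetition, a general-purpose compressor removes much of it, and the format itself stays plain XML that any text editor can open.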

HTML

One of the original decisions in the design of browsers was to make them forgiving in the HTML they would accept. In the days before HTML authoring tools, everyone who wanted to publish on the web had to do their HTML by hand. The thinking was that people would be more likely to continue using the thing if they didn't have to fight the browsers to make their pages display. Unfortunately, this had a consequence that drove the browser wars. Each different browser rendered invalid HTML slightly differently than the others. Before standard ways of describing the presentation of the HTML, people began using these differences as formatting tricks to get the displays they wanted.

Obviously, it wasn't long before someone would claim that because your browser didn't render their invalid HTML the same way as their browser, your browser was broken. In fact, one major browser added a special tag that was often used to break compatibility with another major browser.

We have spent years cleaning up the results of that decision. Arguably, the original decision probably did contribute to the speed at which the early web grew. One unintended consequence was incredibly complicated code in almost every browser to deal with the many ways that users could mess up HTML.
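The gap between forgiving and strict handling can be sketched with two parsers from the Python standard library. This is an illustration under assumptions: html.parser is a tolerant tokenizer, not a browser, and it stands in here for the forgiving approach, while the XML parser stands in for a strict one. The markup sample and helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Misnested <b>/<i> tags and a <p> that is never closed.
INVALID_MARKUP = "<p><b>bold <i>both</b> italic</i>"


class TagLogger(HTMLParser):
    """Record start and end tags in the order they are seen."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def lenient_parse(markup: str) -> list:
    """Tokenize the markup without complaining about its structure."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


def strict_parse_ok(markup: str) -> bool:
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(lenient_parse(INVALID_MARKUP))
    print(strict_parse_ok(INVALID_MARKUP))
```

The lenient parser reports the tags and leaves the question of what the author meant to whoever consumes them. Every browser had to answer that question for itself, which is where the rendering differences came from.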

Security

The point of this rambling verbal walk through past code/design issues is to remind you that any code and design decisions we make have unintended consequences as well. They may be as simple as an increase in memory use. More likely there are subtle consequences that may not be noticed for quite some time. In recent years, many unintended consequences have surfaced as security holes.

The buffer overflow problem that has been such a bane recently is definitely a result of unintended consequences. Although many scream sloppy coding when they hear of a buffer overrun bug, I don't always think that is the problem. Quite often, I think there is an unconscious assumption that the other end of a connection is friendly or at least not malicious. As such, code that was verified with reasonable inputs turns out to be flawed when the basic assumptions are violated.

I can just hear some of you screaming at this point. I'm not defending the decision that left a flaw in the code. I'm explaining how a (possibly unconscious) assumption had unintended consequences.
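Here is a simulation of that assumption in Python. It is a sketch only: Python itself is memory-safe, so a bytearray stands in for a C-style memory layout in which a fixed-size name buffer sits directly in front of a one-byte privilege flag. The layout and all of the names are hypothetical.

```python
BUFFER_SIZE = 8  # bytes reserved for the name in the simulated layout


def new_memory() -> bytearray:
    """Simulated memory: an 8-byte name buffer, then a 1-byte admin flag."""
    return bytearray(BUFFER_SIZE + 1)


def store_name_trusting(memory: bytearray, name: bytes) -> None:
    """Copy the name, assuming the peer never sends too much."""
    memory[0:len(name)] = name


def store_name_checked(memory: bytearray, name: bytes) -> None:
    """Copy the name only after verifying that it fits the buffer."""
    if len(name) > BUFFER_SIZE:
        raise ValueError("name longer than buffer")
    memory[0:len(name)] = name


if __name__ == "__main__":
    friendly = new_memory()
    store_name_trusting(friendly, b"wade")
    print("flag after friendly input:", friendly[BUFFER_SIZE])

    hostile = new_memory()
    store_name_trusting(hostile, b"A" * BUFFER_SIZE + b"\x01")
    print("flag after hostile input: ", hostile[BUFFER_SIZE])
```

Every test with a reasonable name passes against the trusting version. It only fails when the sender deliberately breaks the assumption, and the checked version differs by a single comparison.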

Conclusion...for now

This has already gone on longer than I intended. But, I believe that it is worth thinking about your assumptions and examining your decisions. When writing code, every one of your assumptions and decisions will have unintended consequences. I believe the more you think about it, the more benign the remaining unintended consequences may be.

Posted by GWade at March 3, 2004 09:53 PM.