Programmer Musings: May 2008 Archives

May 31, 2008

False Lazy Initialization

There is a technique I have seen used many times in my career called Lazy Initialization. The purpose of this technique is to delay an expensive initialization (or object construction, or calculation) until you actually need it. Just like any technique, Lazy Initialization has both advantages and disadvantages.

The advantages are pretty obvious:

If you never use the value, you don't pay for it.
Initialization costs can be more spread out by not initializing everything at once.
In some cases, startup appears to be faster.
Ability to delay until needed information is available.

The disadvantages are not always quite so obvious. In fact, they are often ignored.

Code needed to check to see if it is time to initialize the value each time it is used.
Time spent in the above code each time the value is used.
Overhead for maintaining the inputs for the initialization that may not be needed after the initialization.
More complicated error recovery for dealing with failed initialization far from the point where the inputs were supplied.
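
The last two costs are easier to see in code. The following is a minimal sketch, not taken from any real library: the `LazySetting` class and its `value()` method are hypothetical names invented for illustration. The raw input has to be kept around until first use, and a bad input is only discovered when the value is finally requested, far from where it was supplied.

```python
class LazySetting:
    """Hypothetical example: parse a numeric setting only when first needed."""

    def __init__(self, text):
        # The raw input must be kept until the first use, even though
        # it is not needed once the parsed value exists.
        self.text = text
        self._value = None  # None marks "not initialized yet"

    def value(self):
        # This check runs on every access.
        if self._value is None:
            # A bad input raises ValueError here, at the first use,
            # not in __init__ where the input was supplied.
            self._value = int(self.text)
        return self._value


setting = LazySetting("not a number")  # construction succeeds
try:
    setting.value()
except ValueError:
    print("failure surfaced at first use, not at construction")
```

An eager version would call `int(text)` in `__init__` and fail right where the bad input was handed over.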

Like any technique, you need to weigh the advantages and disadvantages in context to decide if the technique is worth it. If the cost of initialization is low or if you are going to need the value almost immediately, Lazy Initialization is not a good idea. If the initialization process is expensive or time-consuming and you are unlikely to need the value at all, the technique is quite useful.

Done correctly, you should have a single function that does the check and initialize. Then, everywhere you need the value you call this function.
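
As a rough sketch of that single function, assuming a module-level value: `get_config`, `expensive_load`, and `load_calls` are made-up names, and the dictionary stands in for whatever is actually expensive to build.

```python
load_calls = []   # records each real initialization, for demonstration only
_config = None    # None marks "not initialized yet"


def expensive_load():
    """Stand-in for a slow initialization (assumed, not a real loader)."""
    load_calls.append("loaded")
    return {"name": "example", "retries": 3}


def get_config():
    """The one place that checks and initializes."""
    global _config
    if _config is None:
        _config = expensive_load()
    return _config


# Every user of the value goes through get_config(); none touch _config.
def describe():
    return "name=" + get_config()["name"]


def retry_limit():
    return get_config()["retries"]
```

Whichever of `describe()` or `retry_limit()` happens to run first triggers the load, and the load happens only once.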

False Laziness

Unfortunately, just as the programming virtue of Laziness has a dark twin, False Laziness, Lazy Initialization also has a dark copy. I suggest we call this False Lazy Initialization. Unlike proper Lazy Initialization, false lazy initialization does not check for the need to initialize the object everywhere the object is needed. Like false laziness, this method appears to save time and effort by only checking and initializing in one place. After all, "the object is initialized after this call, so we don't really need to check elsewhere". The argument sounds seductively correct, but is, in fact, very wrong.

In reality, the order of calling these methods will end up different than predicted at some point in the use of the code. Once this happens, some piece of code will depend on the object being initialized when it hasn't been. A bug is discovered. Someone fixes it by calling the function doing the initialization at this spot. Later, the same bug pops up in another place. The same fix is applied. Somewhere else, code trips over the uninitialized object again; this time we fix the code by testing whether the object is initialized and failing if it is not.

Over time, these two approaches to fixing the problem proliferate until the original benefit is lost in the noise. To me, false laziness is the epitome of that old saying:

There's never enough time to do it right, but there's always time to do it over.

True Lazy Initialization

Just as true laziness involves extra work up front to save even more work later, proper lazy initialization requires a bit of extra work. Think of lazy initialization as an optimization. When optimizing code, we should not change the real functionality or expectations of how the code should function. We only want to improve performance. If we were not doing lazy initialization, the object in question would have been initialized at the beginning. Everywhere we access this object, it would already have been initialized. The order in which methods involving the object are called should not matter.

In order to keep these conditions the same, every method that accesses the object must go through a single method that checks the object and initializes it if it hasn't been already. This adds a small cost to each access to the object, but saves us a huge amount of maintenance time later.

In an OO system, we might have an object with an internal member that is lazily initialized. The false version would have every (or almost every) method individually check to see if the member is initialized properly and construct the member when needed. An even worse version would only initialize the member in one method and require that all users of the code remember to use it in just the right order or risk failure.
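
A small sketch of what these false versions tend to look like, with hypothetical names throughout (`FalseLazyReport`, `load`, `count`, `total`) and a hard-coded list standing in for the expensive data:

```python
class FalseLazyReport:
    """Hypothetical class showing false lazy initialization."""

    def __init__(self):
        self.rows = None  # lazily initialized member

    def load(self):
        # The one method that was supposed to be called first.
        self.rows = [10, 20, 30]

    def count(self):
        # Patched after a bug report: a private copy of the check.
        if self.rows is None:
            self.load()
        return len(self.rows)

    def total(self):
        # No check: works only if load() or count() ran earlier.
        return sum(self.rows)


def total_without_loading():
    report = FalseLazyReport()
    try:
        return report.total()
    except TypeError:
        # sum(None) raises TypeError because None is not iterable.
        return "failed: rows not initialized"


print(total_without_loading())
```

Calling `count()` before `total()` hides the problem, which is exactly why the call order ends up mattering.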

Both of these false versions tend to be bad for maintenance as well. Any time the code needs a change, you have to go back and review this decision to make sure the initialization has been done correctly. In the end, false lazy initialization ends up costing more than non-lazy initialization would have.

Proper lazy initialization provides one method that is always used to access the lazy member. The code never bypasses this method to access the value directly. Although we always pay a small penalty for testing the value, the maintenance costs are lower and the robustness is higher.
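
One way this could look in Python, as a sketch under assumptions rather than a definitive implementation: the class `LazyReport` and the `load_calls` counter are invented for illustration, and a read-only property plays the role of the single access method.

```python
class LazyReport:
    """Hypothetical class with one accessor guarding the lazy member."""

    def __init__(self):
        self._rows = None    # touched only inside the rows property
        self.load_calls = 0  # counts real initializations, for demonstration

    def _load(self):
        # Stand-in for an expensive construction.
        self.load_calls += 1
        return [10, 20, 30]

    @property
    def rows(self):
        # The single place that checks and initializes.
        if self._rows is None:
            self._rows = self._load()
        return self._rows

    def count(self):
        return len(self.rows)  # always through the accessor

    def total(self):
        return sum(self.rows)  # call order no longer matters


report = LazyReport()
print(report.total(), report.count(), report.load_calls)
```

Every method pays for the `is None` test on each access, but `count()` and `total()` can be called in any order and the load still happens exactly once.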

Posted by GWade at 09:35 PM.