Programmer Musings: BPGB: Don't Optimize Prematurely

This site will look much better in a browser that supports web standards, but is accessible to any browser or Internet device.

January 13, 2015

BPGB: Don't Optimize Prematurely

It is written

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

—Donald Knuth

Most people forget the very next thing he said:

Yet we should not pass up our opportunities in that critical 3%.

Many developers have lost the original meaning of premature optimization and turned the meaning of the first quote into ignore performance until later.

Context

In the context of the time, Knuth was talking about what are currently called micro-optimizations. A micro-optimization modifies an individual statement or expression, or a small set of statements, to get the most performance possible. This kind of optimization often results in hard-to-read code with only a small performance benefit.

One thing many people miss is that many of these kinds of changes are already handled by modern compilers for most languages. They are often referred to as peephole optimizations, because they only take a look at a small amount of code at a time. Even dynamic languages without a traditional compiler often perform some of these kinds of optimizations on their bytecode before execution. They don't usually do as many as a traditional compiler, but they do some.

If you aren't measuring performance, your supposed optimizations could make it harder for the compiler to do a good job.

Premature Pessimization

When you completely ignore performance to the point of writing unnecessarily slow code, you are engaging in premature pessimization. Variations of this I have seen include:

Reading multiple items from a data store individually instead of as a group
Making multiple individual requests to a remote server rather than a more complex single query
Sorting based on an expensive measure instead of pre-processing to reduce the expense
Using an algorithm with a higher algorithmic complexity than necessary

An example of the first case is checking multiple file stats individually rather than getting all of the stats at once (existence, size, readability, writeability, etc.). You often see that second case with someone making database requests for individual rows rather than making a batch request. Sometimes these are the result of a simple piece of code that was extended to more steps. Other times, the person really didn't consider the overhead of the individual pieces and just used a simple, sloppy approach.

The sorting example is a pretty straight-forward time-memory trade-off. Someone not familiar with algorithm costs might decide that the obvious approach is better without thinking about costs. If you are looking at a quicksort-based algorithm, you can expect (n log₂ n) comparisons. Since you will access two elements per comparison, we can look at doing the expensive measure (2*n log₂ n) times. Depending on the expense, that can easily swamp the rest of the cost of the sort. To make this more concrete, the measure would be called 1328 times for a 100 item list and 3057 times for a 200 item list.

An example of the final case I remember, was some code I saw in a review. When I brought up that code had a complexity of O(n⁵), I was accused of optimizing prematurely. A couple months later, I was asked to help because that code ended up incredibly slow, as predicted. It turned out the data set was a little bigger than had been assumed. The corrected code wasn't any harder to read, it was just faster.

Avoiding Premature Optimization Correctly

The only reasonable way to avoid premature optimization without going too far is to write clear, readable code that meets the requirements. Keep important algorithmic and process issues in mind:

measure the code to determine bottlenecks
avoid making extra calls to expensive services
try not to loop more than necessary
consider time-memory trade-offs
pay attention to basic algorithmic complexity issues

As the code grows, periodically profile to see what bottlenecks develop. Only when a measurable, repeatable performance problem has been found, should you consider performing the kind of micro-optimizations that Knuth warned against.

If you can combine multiple expensive calls into one, you often get a benefit for potentially small readability impact. Making one database call to return a set of rows to process makes a lot more sense than reading one at a time.

A lot of people don't think about trading memory for performance. The sorting example from above is a good example. Many programmers assume that if they are using the library sort routine, it's as efficient as possible. But, your part of the sort is just as important. The Perl community has a special programming idiom for this trade-off called the Schwartzian Transform. This results in the expensive measurement only being done once per element, with some O(n) cleanup at the end. As a result of giving the technique a name, this algorithmic improvement is actually becoming more common.

Although not everyone has actual training in algorithmic complexity, you can recognize the basics pretty easily. Any time you have a loop over something the performance is related to the number of times through the loop. If you have a loop in a loop, the performance is going to degrade much faster.

Conclusion

Avoiding premature optimization is often an excuse to write sloppy code. It should not be. You still need to choose good algorithms and think about your design. Once the algorithm works, profile to find where more aggressive optimization is needed. Trust your tools to do the finicky, really low-level stuff unless you have measurable proof that they don't do what you need.

* "Structured Programming with Goto Statements". Computing Surveys 6:4 (December 1974), pp. 261–301, §1

Posted by GWade at January 13, 2015 08:58 AM. Email comments