
Anomaly ~ G. Wade Johnson

January 15, 2015

Design Principle: Just in Time Decisions

One of the classic mistakes of software development results from thinking we know what we are doing. Entirely too many people in software start off each project believing they know enough to lay out the whole design. Except in the rare circumstance that you are building an exact copy of something you have already designed and built, you are almost certainly wrong.

As we develop experience, most of us learn this lesson. The problem is actually pretty simple to state:

At the beginning of any project, you know less than at any other time in the project.

Throughout the project, we should always be learning things: edge cases we didn't consider, business plans that change, users we weren't told about, etc. As the solution unfolds, we discover more and more that will affect the design decisions we make. This is why most software teams have begun to reject Big Design Up Front. However, not doing all of the design at the beginning is not the same as doing no design at all.

Small Design as Needed

Many methodologies have adopted the idea that we should build a list of features, use cases, or user stories laying out what needs to be done, and fill in the details as we go. Done well, this can result in a system that meets the needs of the stakeholders very quickly. Done badly, it can result in an unmaintainable mess.

So, how do we move toward the better designs? I suggest that we do just enough design, just before we need it, to give some structure to the code about to be written, but not design ahead any more than necessary. Unfortunately, this is the kind of advice that sounds good but doesn't help. Kind of like "Buy low, sell high."

One approach that does seem to work is to put off individual design decisions until you absolutely have to make them. If we can continue coding on a particular feature without making a decision, then we shouldn't make it. At some point, the code will close off potential decisions. At that point, we have gone too far, because the decision has effectively been made.

An Example

Let's say that we are making a system that depends on weather information. In the initial design discussions, we determine there are three potential services that contain the information that we need. At the beginning of the project, we may not need to decide which service to use. We could actually build the UI (a decision that we need to make early to get user feedback), with an interface to mock data.
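Coding the UI against an interface backed by mock data might look like the following sketch. All of the names here (WeatherSource, Forecast, MockWeatherSource) are invented for illustration; the real data fields would come out of the UI discussions with users.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Forecast:
    # Hypothetical data shape; the real fields come from UI requirements.
    location: str
    temperature_c: float
    conditions: str

class WeatherSource(ABC):
    """The interface the UI codes against. No provider has been chosen yet."""
    @abstractmethod
    def current(self, location: str) -> Forecast: ...

class MockWeatherSource(WeatherSource):
    """Canned data so the UI can be built and demoed before picking a service."""
    def current(self, location: str) -> Forecast:
        return Forecast(location=location, temperature_c=21.5,
                        conditions="partly cloudy")

def render_summary(source: WeatherSource, location: str) -> str:
    # UI-layer code depends only on the interface, not on any provider.
    f = source.current(location)
    return f"{f.location}: {f.temperature_c}°C, {f.conditions}"
```

When a real service is finally chosen, it gets its own WeatherSource implementation and the UI code doesn't change; that is the decision we deferred.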

This allows someone to explore the APIs to determine how hard each is to use. We can also discover other information we might want. It also gives us a chance to consider whether we want to be able to switch providers, maybe using a secondary service as a backup if our primary one goes down.

All of this should just be research, without building anything. By not making a decision on the service right away, but focusing on the interface, we can get the interface in front of users and get feedback.

What if the users decide there is some information they really need that we haven't put on the UI? What if they decide that the way we are displaying some information is not useful? What if they need a map displaying the data differently? All of these pieces of information may affect which service we need to use. If we had originally guessed wrong, we would spend time re-designing and re-coding access to the service, or, worse, telling the users something can't be done because of our earlier (premature) decision.

Once we have the UI nailed down to the point of knowing what data the users need, we can make a decision about which service to code for. It's entirely possible that for business reasons we will need data from more than one service. That would be a major change in design if we had coded up front assuming only one service.

Keep Collecting Data

As long as the work you need to do does not depend on a decision, put the decision off. The longer you wait the more data you will have. This increases the chance that when you must make the decision, you will have the information you need to make a good decision.

All of us have had the experience of having to tell a user that we can't do some change they want because that's not how the system works. Much of the time, that was because a decision was made before we knew what we know now. Granted, sometimes the request really doesn't make any sense for the current system. But, you know the uncomfortable feeling of having to say no, just because.

Don't Go Too Far

We can't put off making design decisions forever. If we wait too long, someone will write code that implicitly makes the decision for us. Unfortunately, that spot decision may not take into account all that we know about the system. If the developer in question makes a local decision that forces a global design decision, we have waited too long.

The people making the major design decisions need to be close enough to the actual development to be aware of when this sort of issue is likely to occur.

Ideally, all of the developers should be aware of the design decisions that we know are coming. When a developer stumbles into one of these spots where an actual decision must be made, they could talk to the senior people making the decisions or bring the team together to decide.

Conclusion

The process of Just In Time Design Decisions consists of a few parts:

  • Do design before you actually write code.
  • Put off any design decisions until you absolutely must make them.
  • Keep the whole team aware of the goals and data about the system.
  • Get the right people together to make design decisions when you must.

When this works well, the results speak for themselves.

Posted by GWade at 08:16 AM.

January 13, 2015

BPGB: Don't Optimize Prematurely

It is written

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

—Donald Knuth

Most people forget the very next thing he said:

Yet we should not pass up our opportunities in that critical 3%.

Many developers have lost the original meaning of premature optimization and turned the first quote into "ignore performance until later."

Context

In the context of the time, Knuth was talking about what are currently called micro-optimizations. A micro-optimization modifies an individual statement or expression, or a small set of statements, to get the most performance possible. This kind of optimization often results in hard-to-read code with only a small performance benefit.

One thing many people miss is that many of these kinds of changes are already handled by modern compilers for most languages. They are often referred to as peephole optimizations, because they only take a look at a small amount of code at a time. Even dynamic languages without a traditional compiler often perform some of these kinds of optimizations on their bytecode before execution. They don't usually do as many as a traditional compiler, but they do some.

If you aren't measuring performance, your supposed optimizations could make it harder for the compiler to do a good job.

Premature Pessimization

When you completely ignore performance to the point of writing unnecessarily slow code, you are engaging in premature pessimization. Variations of this I have seen include:

  • Reading multiple items from a data store individually instead of as a group
  • Making multiple individual requests to a remote server rather than a more complex single query
  • Sorting based on an expensive measure instead of pre-processing to reduce the expense
  • Using an algorithm with a higher algorithmic complexity than necessary

An example of the first case is checking multiple file stats individually rather than getting all of the stats at once (existence, size, readability, writability, etc.). You often see the second case when someone makes database requests for individual rows rather than a single batch request. Sometimes these are the result of a simple piece of code that was extended to more steps. Other times, the person didn't consider the overhead of the individual pieces and just used a simple, sloppy approach.
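The file-stat case can be sketched in a few lines of Python. Each convenience function below performs its own stat() system call under the hood, so asking three questions costs three calls; one os.stat() answers all of them at once. (The temporary file is just a stand-in for whatever the program actually inspects.)

```python
import os
import stat
import tempfile

# Create a file to inspect (stand-in for the program's real input).
fd, path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)

# Pessimized: three separate calls, each doing its own stat() underneath.
exists = os.path.exists(path)
size = os.path.getsize(path)
readable = os.access(path, os.R_OK)

# Better: one stat() call yields all of these at once.
st = os.stat(path)             # raises FileNotFoundError if it doesn't exist
exists_once = True
size_once = st.st_size
readable_once = bool(st.st_mode & stat.S_IRUSR)

os.unlink(path)
```

The same shape applies to the database case: one query returning a set of rows instead of a query per row.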

The sorting example is a pretty straightforward time-memory trade-off. Someone not familiar with algorithm costs might decide that the obvious approach is better without thinking about those costs. If you are looking at a quicksort-based algorithm, you can expect about n·log₂(n) comparisons. Since each comparison accesses two elements, the expensive measure runs about 2·n·log₂(n) times. Depending on the expense, that can easily swamp the rest of the cost of the sort. To make this concrete, the measure would be called 1328 times for a 100-item list and 3057 times for a 200-item list.
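That back-of-the-envelope arithmetic is easy to check (the exact count varies with the sort implementation; this is only the estimate used above):

```python
import math

def measure_calls(n: int) -> int:
    # A quicksort-style sort makes about n*log2(n) comparisons, and each
    # comparison touches two elements, so the expensive measure runs
    # about 2*n*log2(n) times.
    return int(2 * n * math.log2(n))

# measure_calls(100) -> 1328, measure_calls(200) -> 3057
```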

An example of the final case I remember was some code I saw in a review. When I pointed out that the code had a complexity of O(n⁵), I was accused of optimizing prematurely. A couple of months later, I was asked to help because that code ended up incredibly slow, as predicted. It turned out the data set was a little bigger than had been assumed. The corrected code wasn't any harder to read; it was just faster.

Avoiding Premature Optimization Correctly

The only reasonable way to avoid premature optimization without going too far is to write clear, readable code that meets the requirements. Keep important algorithmic and process issues in mind:

  • measure the code to determine bottlenecks
  • avoid making extra calls to expensive services
  • try not to loop more than necessary
  • consider time-memory trade-offs
  • pay attention to basic algorithmic complexity issues

As the code grows, periodically profile to see what bottlenecks develop. Only when a measurable, repeatable performance problem has been found should you consider performing the kind of micro-optimizations that Knuth warned against.

If you can combine multiple expensive calls into one, you often get a benefit for a potentially small readability cost. Making one database call to return a set of rows to process makes a lot more sense than reading one row at a time.

A lot of people don't think about trading memory for performance. The sorting example above is a case in point. Many programmers assume that if they are using the library sort routine, it's as efficient as possible. But your part of the sort is just as important. The Perl community has a named idiom for this trade-off: the Schwartzian Transform. It results in the expensive measurement being done only once per element, with some O(n) cleanup at the end. As a result of giving the technique a name, this algorithmic improvement is actually becoming more common.
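The same decorate-sort-undecorate idea translates directly to other languages. Here is a sketch in Python, where expensive_measure is a made-up stand-in for a genuinely costly computation:

```python
def expensive_measure(s: str) -> int:
    # Stand-in for a costly computation (parsing, I/O, a network call, ...).
    return len(s.strip().lower())

words = ["  Banana ", "fig", "  Apple  "]

# Decorate: compute the expensive measure exactly once per element.
decorated = [(expensive_measure(w), w) for w in words]
# Sort on the precomputed measure, not the raw value.
decorated.sort(key=lambda pair: pair[0])
# Undecorate: O(n) cleanup recovers the original values in sorted order.
by_measure = [w for _, w in decorated]
```

Python's built-in sorted(key=...) actually performs this decoration internally, calling the key function once per element, which is one reason the naive-looking code is fine there; the explicit version shows what the idiom does in languages whose sort calls a comparison function repeatedly.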

Although not everyone has formal training in algorithmic complexity, you can recognize the basics pretty easily. Any time you have a loop over something, the performance is related to the number of times through the loop. If you have a loop inside a loop, the performance degrades much faster.
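A small illustrative pair (both functions invented for this sketch) shows the difference. Both answer the same question, but one compares every pair of items while the other makes a single pass:

```python
def has_duplicates_nested(items) -> bool:
    # A loop inside a loop: every pair is compared, O(n^2) time.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_set(items) -> bool:
    # One pass with a set: O(n) time, trading a little memory for speed.
    seen = set()
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False
```

At ten items the difference is invisible; at a million items the nested version is the kind of code that comes back to bite you.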

Conclusion

Avoiding premature optimization is often an excuse to write sloppy code. It should not be. You still need to choose good algorithms and think about your design. Once the algorithm works, profile to find where more aggressive optimization is needed. Trust your tools to do the finicky, really low-level stuff unless you have measurable proof that they don't do what you need.

* "Structured Programming with Goto Statements". Computing Surveys 6:4 (December 1974), pp. 261–301, §1

Posted by GWade at 08:58 AM.

January 07, 2015

BPGB: Frost-Bitten Features

In some environments where I've worked, new features were being added and bugs were being fixed up until the very moment that the code was released. This obviously leads to problems where one change generates new bugs that we have not yet had time to find, much less fix. Consequently, another buggy release goes out the door.

A Solution?

Some groups attack this problem with the idea of a Feature Freeze. Some period of time before the scheduled release (maybe two weeks), they declare a freeze on new features. From that point until the actual release, the only changes that can be made to the code are fixes for bugs found during testing. This is intended to serve as a clean-up period before the code is released. It also gives the documentation team at least some chance to look at the actual features before the release.

Once the feature freeze process becomes mandatory (which may be after one or more sort-of freezes), there are normally one or two very painful releases where everyone follows the rules. In many cases there is a fair amount of wailing and gnashing of teeth as promised features don't make it into a release because they were not complete by the feature freeze date.

The Pain

At this point, the project person/team normally gets quite upset about promised features that didn't get done by the deadline. Maybe someone promised one of those features to an important customer, or there was just a feature that they were looking forward to having.

The developers also don't like being in the position of having agreed to something that they cannot deliver.

Feature Slush

If pushed too hard to complete all of the features by a deadline, some developers may be tempted to finesse the definition of "done" a bit. Some will argue that their feature is done except for a little polishing or some minor bug fixes. In some cases, the programmer may actually believe that the feature is a few hours from complete.

In other cases, they may know there is a lot more work to be done. This can result in features that are "complete" (they have code in place that might work) but are not "done" (actually working as specified).

In one place that I worked this had reached an extreme where I saw some completed feature code that basically consisted of a couple of empty subroutines. The feature existed (according to the programmer). It just had a bug (it didn't work at all).

Not a Solution

One way to make this worse is to brow-beat the developers for not getting everything done before feature freeze. Another mistake is to push back the deadline, but add more to be done since we now have more time.

In at least one case I'm aware of, a pre-freeze deadline was instituted to give people time to finish before the actual feature freeze, but the issues that caused the original problem weren't addressed. This just pushed the whole problem back to the new deadline.

Controlling Feature Freeze

There are three things you can do to help make a feature freeze actually work.

  1. A good definition of "done" is critical to deciding when a feature is included. Everyone needs to be aware that "done writing code" is not the same as "the feature is done."

  2. The next part of a real feature freeze is accepting that some features will be left out of a release because of the deadline; that is part of the definition of a feature freeze. The iron triangle of project management still applies: if the deadline is fixed, the scope of development must vary.

  3. A quick release cycle can reduce this problem. If developers know there will be a new release in a few months, they are less likely to fudge their code to get it in now. Likewise, a short release schedule means that the project manager is less likely to scream if a feature doesn't make this release. If the next release is a year or more away, a developer might be tempted to squeeze one more feature in, or a project manager might be tempted to pressure a developer into finishing something for the deadline.


Given a definition of done that everyone agrees to, the ability to modify the list of features that will be in a release, and a relatively short period between releases, it's possible to get feature freezes to work.

Conclusion

The idea of a feature freeze is mostly aimed at getting a better handle on what will actually be in a given release. Without careful consideration of the implications of a freeze, it can make matters worse.

Posted by GWade at 08:45 AM.