
Anomaly ~ G. Wade Johnson

November 25, 2007

Debugging Without a Debugger

I've noticed something about the programmers I have dealt with in the last few years. Many of them seem to equate debugging skill with the ability to use a debugger. In some cases, the idea of troubleshooting a problem outside a debugger is so foreign that it would never occur to them.

A debugger is very helpful for many troubleshooting tasks. If there is a logical error in a localized area of code, a debugger can help you quickly explore the logic and find the problem. This sharp focus is the most important feature of the debugger. However, if you don't have any idea where the problem is located, this narrow focus is more of a hindrance than a help.

Most people who only have debugger-based troubleshooting experience end up scattering breakpoints throughout the code, hoping that one of them will get them close. Unfortunately, if the problem is based on the relationships between different portions of the code, the narrow focus may hide the actual problem. Some defects depend on the sequence in which different pieces of code are called, or on relationships between multiple pieces of code.

In this case, the debugger only gives part of the story. You need to track these relationships separately from the debugger session. The debugger's narrow focus and the need to track relationships and sequencing separately make this kind of troubleshooting difficult. This is not a problem with debuggers; it is the result of using the wrong tool for the job.

Instrumenting code

A completely different approach is to instrument the code with logging statements. Because log entries are written in the order the code runs, the sequencing information is explicitly tracked in the log. A proper choice of information to write to the log can help track the relationships as well. Instrumenting the code does not make it as easy to focus in on localized problems, but it is much better at troubleshooting non-localized problems.
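As a minimal sketch of this idea, the Python fragment below instruments two hypothetical functions, open_resource and close_resource, using the standard logging module. The function names, the logger name, and the planted defect are all invented for illustration. Each log line carries a timestamp and the name of the calling function, so the order of calls can be read straight from the log.

```python
import io
import logging

# Send log output to an in-memory buffer so the example is self-contained;
# in practice you would more likely log to a file or to stderr.
log_buffer = io.StringIO()
handler = logging.StreamHandler(log_buffer)
handler.setFormatter(logging.Formatter("%(asctime)s %(funcName)s %(message)s"))

log = logging.getLogger("instrument_demo")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.addHandler(handler)
log.propagate = False


def open_resource(name):
    log.debug("name=%s", name)
    return {"name": name, "open": True}


def close_resource(res):
    log.debug("name=%s", res["name"])
    res["open"] = False


def process(names):
    for name in names:
        res = open_resource(name)
        if name != "b":  # deliberate defect: "b" is never closed
            close_resource(res)


process(["a", "b", "c"])
log_lines = log_buffer.getvalue().splitlines()
print("\n".join(log_lines))
```

Reading the output top to bottom shows an open_resource entry for "b" that is never followed by a close_resource entry, which is the kind of sequencing defect that is hard to spot from a single breakpoint.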

Either technique can be used on many problems. Some problems are easier to solve with one technique or the other. A few problems are easiest to solve by combining the techniques. You can use instrumentation to find the general shape of the problem. Instrumentation may help to discover which methods are being called in which order, or which methods are called more often than others. This approach is especially helpful for finding code that is never called.

Summarizing Your Output

Since the kinds of problems that work best with instrumenting are problems with relationships between calls, just looking at the output is not always enough to find the problems. Sometimes you need to summarize the data in some way. Maybe you need to count calls to particular routines, or verify that every call to method A has a corresponding call to method B. These kinds of relationships are often easier to see after the data has been re-ordered in some way.
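A small sketch of this kind of summary follows. It assumes log lines of the form "function argument"; that format and the names open_resource and close_resource are made up for the example. It counts calls per function and reports any argument that was opened but never closed.

```python
from collections import Counter

# Hypothetical log lines in the form "<function> <argument>"; the format
# and the names are assumptions made for this sketch.
sample_log = """\
open_resource a
close_resource a
open_resource b
open_resource c
close_resource c
""".splitlines()


def summarize(lines):
    """Return (call counts per function, arguments opened but never closed)."""
    counts = Counter()
    still_open = set()
    for line in lines:
        func, arg = line.split()
        counts[func] += 1
        if func == "open_resource":
            still_open.add(arg)
        elif func == "close_resource":
            still_open.discard(arg)
    return counts, sorted(still_open)


counts, unclosed = summarize(sample_log)
print(dict(counts))
print(unclosed)
```

The counts show three opens against two closes, and the second result names the argument responsible. Neither fact is obvious when the same lines are buried in a long log.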

You can use tools like sort and uniq to do simple reorganization of the data to look for patterns. Sometimes you will need more powerful tools like AWK or Perl to extract relationships from the output. If you format the output of your logging statements appropriately, you can even use a spreadsheet program like Excel to re-organize the output to provide better understanding.
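One way to make log output spreadsheet-friendly is to write it as comma-separated values. The sketch below uses Python's standard csv module; the column names and the trace helper are hypothetical, chosen only for this example.

```python
import csv
import io

# In-memory buffer for the sketch; a real run would write to a file
# and open that file in the spreadsheet program.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["seq", "function", "argument"])  # header row


def trace(seq, function, argument):
    """Append one instrumentation record as a CSV row."""
    writer.writerow([seq, function, argument])


trace(1, "open_resource", "a")
trace(2, "close_resource", "a")
trace(3, "open_resource", "name, with comma")  # csv handles the quoting

# Read the rows back to confirm they parse cleanly.
rows = list(csv.reader(io.StringIO(out.getvalue())))
print(rows)
```

Letting the csv module do the quoting means that a comma inside a logged value does not shift the columns when the file is loaded.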

Different Viewpoints

Instrumenting code provides a different kind of information than you normally get from a debugger. This technique is very useful for dealing with problems that require seeing relationships between multiple different portions of the code. Another place where instrumenting can be more useful than using a debugger is when investigating long loops. If a loop runs a dozen times, setting a breakpoint and inspecting the code on each pass can be useful. If the loop runs half a million times, the breakpoint is basically useless.
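For the long-loop case, a rough sketch of the idea is to record only the passes that look suspicious instead of stopping on every one. The data, the limit, and the function name below are invented for the example.

```python
def find_suspect_passes(values, limit):
    """Return (index, value) for each pass where value exceeds limit."""
    suspects = []
    for i, value in enumerate(values):
        if value > limit:  # record only the interesting passes
            suspects.append((i, value))
    return suspects


# Half a million passes with one bad value planted in the middle.
data = [1] * 500_000
data[123_456] = 99

suspects = find_suspect_passes(data, limit=10)
print(suspects)
```

The loop runs to completion and reports the one pass worth a closer look. That pass is then a reasonable place for a conditional breakpoint if more focused inspection is needed.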

The main use for instrumenting code is getting an overview of how the code behaves. A debugger gives you a highly focused way to inspect the code, while instrumenting is a better tool for getting a broader view. In some cases, once you have digested this broader view, you may find a particular piece of code that needs more focused attention. Switching back to the debugger can be very effective at this point.

Once you are comfortable with both techniques, you will often find yourself switching back and forth between them. You might use some instrumenting to test an idea of why a problem is occurring and then switch to the debugger to look more carefully at a method that has attracted your attention. After doing some focused examination with the debugger, you might decide that another area might be more fruitful. Then, you instrument a different piece of the code to explore another idea. In some situations, the two techniques complement each other.

The Downsides of Instrumenting

One of the main problems with instrumenting code is the need to change the code to add the statements needed to log information. This requires some recompilation and is not as quick as adding and removing breakpoints. Because of the recompilation cost, some people ignore this technique.

Avoiding a useful technique simply because it has a cost is not reasonable. Many of the decisions we make in software development are about trade-offs. You should be able to evaluate your debugging tools in terms of costs and benefits, as well. Obviously, you wouldn't use an expensive technique for a trivial problem. But if the problem is complicated enough, the benefits outweigh the cost.

One tempting way to reduce the cost of recompiling is to add as much logging as possible up front, so that you do not have to recompile again. Printing out too much information with each logging statement, or adding lots of instrumentation just in case, may reduce the amount of time spent recompiling, but it increases the amount of time you spend summarizing and analyzing the output. It's always important to remember that the output from the instrumentation and the summarizing code are not the goal; they are just tools used to find an actual problem.

Instrumenting the code in a way that tests one or two ideas is much better than generating so much output that you will spend a day wading through the logs looking for something important. Eventually, you reach the point where you are spending more time looking at the logs than you would have spent running another compile.

One final downside of instrumenting code is the risk of accidentally leaving the logging code in place after you have found the problem. Your version control system is your friend at this point. Always check the changes you are going to commit to verify that you are only adding actual fixes and not debugging code.

Posted by GWade at November 25, 2007 10:57 AM.