
Anomaly ~ G. Wade Johnson

November 25, 2007

Debugging Without a Debugger

I've noticed something about the programmers I have dealt with in the last few years. Many of them seem to equate debugging skill with ability to use a debugger. In fact, in some instances, the concept of being able to troubleshoot a problem outside a debugger is so foreign it would never occur to them.

A debugger is very helpful for many troubleshooting tasks. If there is a logical error in a localized area of code, a debugger can help you quickly explore the logic and find the problem. This sharp focus is the most important feature of the debugger. However, if you don't have any idea where the problem is located, this narrow focus is more of a hindrance than a help.

Most people who only have debugger-based troubleshooting experience end up scattering breakpoints throughout the code, hoping that one of the breakpoints will get them close. Unfortunately, if the problem is based on the relationships between different portions of the code, the narrow focus may hide the actual problem. Some defects depend on the sequence in which different pieces of code are called, or on relationships between multiple pieces of code.

In this case, the debugger only gives part of the story. You need to track these relationships separately from the debugger session. The debugger's narrow focus and the need to track relationships and sequencing separately make this kind of troubleshooting difficult. This is not a problem with debuggers; it is the result of using the wrong tool for the job.

Instrumenting code

A completely different approach is to instrument the code with logging statements. Because of the nature of logging, the sequencing information is explicitly tracked in the log. Proper choice of information to write to the log can help track the relationships as well. Instrumenting the code does not provide as easy a method of focusing in on more localized problems, but it is much better at troubleshooting non-localized problems.
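As a minimal sketch of what this can look like in Python: the account functions and the logger name below are made up for illustration, and the output goes to an in-memory buffer only to keep the example self-contained. Each log line records which function ran and with what arguments, so the order of calls can be read straight from the log.

```python
import io
import logging


def make_trace_logger(stream):
    """Return a logger that writes one '<function> <message>' line per call."""
    logger = logging.getLogger("trace_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    # %(funcName)s is filled in with the name of the function that logged.
    handler.setFormatter(logging.Formatter("%(funcName)s %(message)s"))
    logger.addHandler(handler)
    return logger


trace_output = io.StringIO()
log = make_trace_logger(trace_output)


# Hypothetical application code, instrumented at each entry point.
def open_account(account_id):
    log.debug("enter id=%s", account_id)


def deposit(account_id, amount):
    log.debug("enter id=%s amount=%s", account_id, amount)


def close_account(account_id):
    log.debug("enter id=%s", account_id)


open_account(7)
deposit(7, 100)
close_account(7)

# The buffer now holds one line per call, in call order.
print(trace_output.getvalue(), end="")
```

In real use the handler would more likely write to a file or to standard error, so that the log survives the run and can be fed to other tools.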

Either technique can be used on many problems. Some problems are easier to solve with one technique or the other. A few problems are easiest to solve by combining the techniques. You can use instrumentation to find the general shape of the problem. Instrumentation may help to discover which methods are being called in which order or which methods are called more often than others. This approach is really helpful in trying to find when code is not called.

Summarizing Your Output

Since the kinds of problems that work best with instrumenting are problems with relationships between calls, just looking at the output is not always enough to find the problems. Sometimes you need to summarize the data in some way. Maybe you need to count calls to particular routines, or verify that every call to method A has a corresponding call to method B. These kinds of relationships are often easier to see after the data has been re-ordered in some way.
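The "every call to A has a corresponding call to B" check can be sketched in a few lines of Python. This assumes each log line has the form "<function> <key>"; that format, and the names acquire, release, and conn1 through conn3, are assumptions made for this example.

```python
from collections import Counter


def unmatched_calls(log_lines, opener="acquire", closer="release"):
    """Return {key: surplus} for keys whose opener and closer counts differ.

    A positive surplus means more opener calls than closer calls for that key.
    """
    balance = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip lines that do not match the assumed format
        name, key = parts[0], parts[1]
        if name == opener:
            balance[key] += 1
        elif name == closer:
            balance[key] -= 1
    return {key: count for key, count in balance.items() if count != 0}


sample = [
    "acquire conn1",
    "acquire conn2",
    "release conn1",
    "acquire conn3",
    "release conn3",
]
print(unmatched_calls(sample))
```

Here conn2 is acquired but never released, which is hard to spot by reading the raw log once it grows past a screenful.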

You can use tools like sort and uniq to do simple reorganization of the data to look for patterns. Sometimes you will need more powerful tools like AWK or Perl to extract relationships from the code. If you format the output of your logging statements appropriately, you can even use a spreadsheet program like Excel to re-organize the output to provide better understanding.
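For readers without those command-line tools at hand, here is a rough Python equivalent of counting lines with sort and uniq -c, plus a CSV rendering that a spreadsheet can open. It is a sketch that assumes the routine name is the first whitespace-separated field of each log line; the sample routine names are invented.

```python
import csv
import io
from collections import Counter


def call_counts(log_lines):
    """Count log lines by first field, most frequent first."""
    names = (line.split()[0] for line in log_lines if line.strip())
    return Counter(names).most_common()


def counts_as_csv(counts):
    """Render (name, count) pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["routine", "calls"])
    writer.writerows(counts)
    return buffer.getvalue()


sample = ["parse a", "render a", "parse b", "parse c", "render c"]
counts = call_counts(sample)
print(counts_as_csv(counts), end="")
```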

Different Viewpoints

Instrumenting code provides a different kind of information than you normally get from a debugger. This technique is very useful for dealing with problems that require seeing relationships between multiple different portions of the code. Another place where instrumenting can be more useful than using a debugger is when investigating long loops. If a loop runs a dozen times, setting a breakpoint and inspecting the code on each pass can be useful. If the loop runs half a million times, the breakpoint is basically useless.
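One way to handle the long-loop case is to log only the passes that violate an expectation, and to cap how many get reported. The sketch below assumes the suspect condition can be written as a simple predicate; the data, the predicate, and the function name are all invented for illustration.

```python
def find_suspect_passes(values, is_suspect, max_reports=5):
    """Scan a long sequence and report only the passes worth a closer look.

    Returns the first max_reports report lines and the total suspect count.
    """
    reports = []
    total_suspect = 0
    for index, value in enumerate(values):
        if is_suspect(value):
            total_suspect += 1
            if len(reports) < max_reports:
                reports.append(f"pass {index}: value={value}")
    return reports, total_suspect


# Half a million passes, two of which carry an unexpected negative value.
data = [i % 1000 for i in range(500_000)]
data[123_456] = -1
data[400_000] = -7

reports, total = find_suspect_passes(data, lambda v: v < 0)
for line in reports:
    print(line)
print("suspect passes:", total)
```

Once the log names the interesting pass numbers, a conditional breakpoint on one of them becomes practical again.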

The main use for instrumenting code is getting an overview of the code. Using a debugger gives a highly focused way to inspect the code. However, instrumenting is a better tool for getting a broader view of the code. In some cases, once you have digested this broader view, you may find a particular piece of code that needs more focused attention. Switching back to the debugger can be very effective at this point.

Once you are comfortable with both techniques, you will often find yourself switching back and forth between the two techniques. You might use some instrumenting to test an idea of why a problem is occurring and then switch to the debugger to look more carefully at a method that has attracted your attention. After doing some focused examination with the debugger, you might decide that another area might be more fruitful. Then, you instrument a different piece of the code to explore another idea. In some situations, the two techniques complement each other.

The Downsides of Instrumenting

One of the main problems with instrumenting code is the need to change the code to add the statements needed to log information. This requires some recompilation and is not as quick as adding and removing breakpoints. Because of the recompilation cost, some people ignore this technique.

Obviously, avoiding a useful technique because it has a cost is not reasonable. Many of the decisions we make in software development are about trade-offs. You should be able to evaluate your debugging tools in terms of costs and benefits, as well. Obviously, you wouldn't use an expensive technique for a trivial problem. But, if the problem is complicated enough, the cost is less than the benefits.

One tempting way to reduce the cost of recompiling is to add as much logging as possible up front, so that you never have to recompile again. Printing out too much information with each logging statement or adding lots of instrumentation just in case may reduce the amount of time spent recompiling, but it increases the amount of time you spend summarizing and analyzing the output. It's always important to remember that the output from the instrumentation and the summarizing code are not the goal; they are just tools used to find an actual problem.

Instrumenting the code in a way that tests one or a couple of ideas is much better than generating so much output that you will spend a day wading through the logs looking for something important. Eventually, you reach the point where you are spending more time looking at the logs than you would have spent running another compile.

One final downside of instrumenting code is the risk of accidentally leaving the logging code in place after you have found the problem. Your version control system is your friend at this point. Always check the changes you are going to commit to verify that you are only adding actual fixes and not debugging code.
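That check can be partly automated if the debugging statements carry a recognizable marker. The sketch below scans unified diff text for added lines containing such a marker; the "TRACE:" marker and the sample diff are assumptions of this example, and it supplements reading the diff yourself rather than replacing it.

```python
def leftover_debug_lines(diff_text, marker="TRACE:"):
    """Return added lines in a unified diff that contain the debug marker."""
    hits = []
    for line in diff_text.splitlines():
        # Added lines start with '+'; the '+++' file header is skipped.
        # (A rare added line whose content starts with '++' is skipped too.)
        is_added = line.startswith("+") and not line.startswith("+++")
        if is_added and marker in line:
            hits.append(line[1:].strip())
    return hits


# A made-up diff in which one debugging statement is about to be committed.
sample_diff = "\n".join([
    "--- a/account.py",
    "+++ b/account.py",
    "@@ -1,2 +1,5 @@",
    " def deposit(account, amount):",
    '+    print("TRACE: deposit", account, amount)',
    "+    if amount <= 0:",
    '+        raise ValueError("amount must be positive")',
    "     account.balance += amount",
])

for hit in leftover_debug_lines(sample_diff):
    print("debugging code still present:", hit)
```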

Posted by GWade at 10:57 AM.

November 12, 2007

Review of Software Configuration Management Patterns

Software Configuration Management Patterns
Stephen P. Berczuk and Brad Appleton
Addison-Wesley, 2003

The first three chapters define the problem space. We get a solid description of Software Configuration Management (SCM) and an introduction to patterns and pattern languages. This section of the book sets up the context that you will need to understand the rest.

Much like the GOF book, this book gives names to different practices that you may now be using. It explains each of these practices as patterns. More importantly, this book relates these patterns to one another as a pattern language that gives more of a big-picture understanding of SCM. In other words, the book not only presents patterns such as Mainline, Integration Build, and Release Line; it also explains how these and other patterns relate to each other to make a strong SCM policy.

I have been using various version control systems for nearly two decades. In that time, I have stumbled my way toward understanding many of these patterns. If you have worked in software for a long time, you might feel that you already know everything you need. One of the things I found most useful in this book (besides the standardized naming) was justification for some of the practices I had come to accept. The way the book related different practices to make the combination stronger was also quite revealing.

If you have not been doing SCM for long or have just begun using some version control tool, this book can give you insight into what you should be doing. Unfortunately, I suspect that some experience is needed to properly appreciate the patterns in the book. If you already know everything you need to about SCM, the book still provides standardized names and relationships that can help when explaining your practices to others.

Overall, I recommend this book for anyone working in software development. While it is not the most exciting topic to read, it is practical and useful to working developers and their support teams.

Posted by GWade at 10:09 PM.

November 10, 2007

The IP Goose, revisited.

A couple of years ago, I wrote in The IP Goose about some effects that too strict an IP policy can have on developers. In the intervening two years, I've had some other insights that I believe are important.

First of all, I stand by my original essay. Companies should be able to protect the software they have hired people to write. They should have some protection against an individual taking the information they have learned to take business away from them. A carefully crafted IP agreement can do that.

In the previous essay, I also described how practice is needed to improve skills and increase knowledge. I still believe that development outside of work is needed to practice, learn new things, and keep unused skills from fading. Unfortunately, I've also come to believe that the effects of a draconian IP policy can be more damaging than I first thought.

Habit

One important part of practice is consistency. As long as you continue to practice anything, you tend to improve or, at least, maintain your skills. Obviously, if you stop practicing, the skills and knowledge begin to degrade. More importantly, you also start to lose the habit of practice. The longer you are not practicing, the more effort is needed to get back into the habit of practicing and the easier it is to backslide. This tends to make regaining the skill you are practicing even harder.

I'm sure many of you have seen this effect in different fields: martial arts, sports, music, cooking, etc.

Well, the time off from practice caused by a draconian IP agreement will have the same effect on your programming as an enforced absence from skiing or judo would have on your performance in those areas. Subtly, priorities have shifted. There's always something else that needs to be done. You haven't had time to practice for x months, what's another couple of days? It takes quite a while to overcome that inertia, and that whole time your skills continue to degrade.

Long Term Effects

I think that an overly strict IP agreement not only has a detrimental effect on your skills while you work under it, but also causes a long-term degradation in your ability to practice.

I worked for a while under a really strict IP agreement. Following the letter of the agreement meant that any software I touched would belong to the company. Working on Open Source projects was obviously out of the question. If I wanted to work on a project on my own, I had to contact the appropriate person in the company's legal department and obtain permission, in advance. If necessary, I had to get sign-off from my supervisor that the project was in no way related to my work with the company. Needless to say, quickie little projects to try out a new technology or technique were no longer worth the effort.

It has taken quite some time to regain the habit of working on projects on my own time. I was able to get back to the little stuff pretty quickly, but bigger projects require a lot more effort to start and work on than they used to. Fortunately, I still enjoy writing software, so I have a strong incentive to keep working at it.

Posted by GWade at 01:36 AM.