Anomaly ~ G. Wade Johnson

August 05, 2010

The Nature of Exceptions

While I was at YAPC::NA this year, I got a chance to catch up with a few old friends and talk with some really knowledgeable people from other parts of the Perl community. One such discussion started when Rob Kinyon told me that "exceptions are fundamentally flawed by design and should never be used." (I'm paraphrasing here. I didn't take notes on his exact words.)

I've used exceptions in multiple languages in the past decade. I've seen both good and bad usages of exceptions. I've also heard many of the arguments against exceptions over the years. (I actually touched on these years ago in the post Joel on Exceptions.) I asked if he could explain, mostly so I could hear which arguments he would use. I really expected the same old arguments. I was quite pleasantly surprised when Rob made some really interesting points in a direction I had not heard before.

If I understood him correctly, there were two major points to his argument.

  • Exceptions thrown by functions down the call stack and caught later introduce coupling between widely separated pieces of code.
  • The exception object is not part of the parameters or return value of any of the functions, so it acts somewhat like a global variable, which is bad.

These were not the usual complaints I've heard about exceptions. Rob's arguments were well thought out and well argued. Along the way, we drew Piers Cawley into the discussion, which turned pretty lively after that.

Coupling

I don't think I presented my thoughts very well in that discussion. So, here goes. Let's start with the coupling argument. The simple example of Rob's argument looks something like this (in pseudo-code).


sub foo {
    ...
    bar();
    ...
}

sub bar {
    ...
    baz();
    ...
}

sub baz {
    ...
    if( bad_stuff ) throw bad( "Shouldn't do this." );
    ...
}

Rob suggested that foo() is now coupled to baz() because it needs to be aware that a bad exception may be thrown. This would obviously be bad, because baz() is an implementation detail of bar() and foo() should only know about bar().

In fact, foo() is not coupled to baz() at all. Calling baz() has just expanded the interface of bar() to include the ability to throw a bad exception. That is no different than if the implementation of bar() itself were changed to throw a bad exception, or if the calling signature of bar() had to change to add a new parameter needed by baz(). The foo() function does not need to know what bar() does with the parameter (or exception), just that it is part of bar()'s interface.
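
The same shape can be sketched in C++. This is only an illustration: the exception type Bad and the bad_stuff flag are hypothetical names chosen to mirror the pseudo-code, and the example() function is just a usage check.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception type standing in for "bad" in the pseudo-code.
struct Bad : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Implementation detail of bar(); foo() never names it.
void baz(bool bad_stuff) {
    if (bad_stuff) throw Bad("Shouldn't do this.");
}

// Interface of bar(): takes a flag, may throw Bad.
// Whether Bad originates here or in a helper is invisible to callers.
void bar(bool bad_stuff) {
    baz(bad_stuff);
}

// foo() depends only on bar()'s interface, including "may throw Bad".
std::string foo(bool bad_stuff) {
    try {
        bar(bad_stuff);
        return "ok";
    } catch (const Bad& e) {
        return std::string("bar failed: ") + e.what();
    }
}

// Usage: both outcomes are visible through bar()'s interface alone.
void example() {
    assert(foo(false) == "ok");
    assert(foo(true) == "bar failed: Shouldn't do this.");
}
```

If baz() were inlined into bar(), or replaced by a different helper, foo() would not change at all, which is the sense in which foo() is coupled only to bar().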

What this argument does point out, however, is that the exceptions that may be thrown by a function are a hidden portion of the interface. Now, I'm sure some Java fans will be getting smug about Java's checked exception system. Using this facility, the programmer must specify all of the exceptions that may be thrown by a method. Unfortunately, it's been my experience that this system breaks down in maintenance, as changes in lower-level code necessitate touching arbitrary amounts of calling code to fix up the throws clauses to deal with new exceptions.

It seems that many programmers eventually either change to unchecked exceptions or declare that all methods can throw an Exception or Throwable, so that anything is allowed. Better design skills or an architecture that converts exceptions to more generic forms as they move between layers can mitigate this issue. But, it does point out an implementation issue with exceptions. Exceptions seem to either result in a hidden interface or a lot of extra maintenance. Perhaps later implementations or better architectures will remove this issue.

Exceptions as Globals

The other argument that Rob made was that exceptions are basically global objects. Like globals, when you access the exception object you have little idea where in the code it was generated. Possibly the biggest problem with a real global is that it may be changed by literally any piece of code in the system. The only way to find out where the global is modified is to examine every piece of code in the system.

Rob's argument was that it is possible that you would need to examine a large amount of code to determine the context for the thrown exception, since the exception might have been thrown from a point far from where we catch it. Unlike a real global, exceptions are only created in code called by the point where we catch the exception. So, in theory, this limits the amount of code that would need to be examined. In some cases, however, this limited amount of code could still be a large section of the codebase.

Java exceptions reduce the effects of this issue to some extent by providing a stack dump in the exception. Depending on the system, that might be almost as bad as searching the code. Other systems support a method of chaining exceptions. In these kinds of systems, you catch an exception when you have more context to add and throw a new exception containing the old exception and further context information. Done correctly, this can reduce the noise of a stack dump type exception and increase the real, useful context available to deal with an exception.
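
As one possible illustration of chaining, C++11 offers std::throw_with_nested and std::rethrow_if_nested. The sketch below assumes hypothetical functions parse_port and load_config and a hypothetical file name; each layer catches, adds the context only it knows, and rethrows with the original cause attached.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Low-level step: knows only that the text did not parse. (Hypothetical helper.)
int parse_port(const std::string& text) {
    try {
        return std::stoi(text);
    } catch (const std::exception&) {
        // Wrap the current exception inside a new one with more context.
        std::throw_with_nested(std::runtime_error("invalid port '" + text + "'"));
    }
}

// Higher-level step: adds the context only it knows, keeping the cause.
int load_config(const std::string& file, const std::string& port_text) {
    try {
        return parse_port(port_text);
    } catch (const std::exception&) {
        std::throw_with_nested(std::runtime_error("while loading " + file));
    }
}

// Walk the chain from the outermost context to the original cause.
std::string describe(const std::exception& e) {
    std::string text = e.what();
    try {
        std::rethrow_if_nested(e);   // does nothing if e carries no nested cause
    } catch (const std::exception& inner) {
        text += ": " + describe(inner);
    } catch (...) {
        text += ": unknown error";   // nested exception of a non-standard type
    }
    return text;
}

// Usage: returns the chained description for a bad port value.
std::string example() {
    try {
        load_config("app.conf", "abc");
    } catch (const std::exception& e) {
        return describe(e);
    }
    return "no error";
}
```

The description starts with "while loading app.conf: invalid port 'abc': " and ends with whatever message the standard library attaches to the failed conversion, which varies between implementations.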

The Meaning of an Exception

One of Rob's arguments was that exceptions are not actually necessary. He argued that when something goes wrong in code, there are basically two possible ways of handling the problem.

  1. The code automatically recovers from the error condition near the point of the problem.
  2. The code gives up and hands control to a human to handle it.

Piers made the very useful observation that exceptions give us another recovery method in between those. This allows the code to basically say "I've got a problem I can't handle, someone handle this for me." If no code handles the exception, it basically degenerates to the second case. But, if higher-level code has more context and can actually handle an issue that the low-level code cannot handle, an exception allows the code to turn the second case into the first.

So an exception can actually be looked at as a call for help from the lower-level code for someone else to deal with a problem.
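
A small C++ sketch of that "call for help", under assumed names: the MissingSetting exception, the settings map, and the two callers are all hypothetical. The low-level lookup cannot decide whether a missing key is fatal; one caller has enough context to recover, the other does not.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical exception: the low-level code's "call for help".
struct MissingSetting : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Low-level lookup: it cannot know whether a missing key is fatal.
std::string lookup(const std::map<std::string, std::string>& settings,
                   const std::string& key) {
    auto it = settings.find(key);
    if (it == settings.end()) throw MissingSetting("no setting named " + key);
    return it->second;
}

// This caller knows a sensible default, so it turns "give up" into "recover".
std::string log_level(const std::map<std::string, std::string>& settings) {
    try {
        return lookup(settings, "log_level");
    } catch (const MissingSetting&) {
        return "warn";
    }
}

// This caller has no useful default; the exception keeps propagating and,
// if nobody above handles it, the problem ends up in front of a human.
std::string database_url(const std::map<std::string, std::string>& settings) {
    return lookup(settings, "database_url");
}

// Usage: with nothing configured, the log level is still recovered.
void example() {
    const std::map<std::string, std::string> settings;
    assert(log_level(settings) == "warn");
}
```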

Conclusion

To me, at least, Rob's arguments were interesting, but not compelling reasons to avoid exceptions. They also weren't strong enough to justify the claim that the design of exceptions is fundamentally and fatally flawed. I would also agree that certain implementations are less than ideal and that the best exception implementations only exist in the future.

Posted by GWade at 07:10 PM.

February 21, 2004

Review of Exceptional C++

Exceptional C++
Herb Sutter
Addison-Wesley, 2000

I had been working with C++ for a number of years before I read this book and I thought I knew the language.

This book provides 47 problems with included solutions. Trying to solve the problems is very important. Each one tests an area of C++ that some people find unclear. In some cases, I didn't realize that I was unclear on the topic until I solved the problem and finished reading the explanation. These problems will stretch your C++ skills and solidify your understanding of the language.

The second section of the book covers exception safety. In some ways, this may be the most important part of the book. Sutter really does a great job of converting gut-level intuition about exceptions into logical, useful knowledge. In this section, he covers different levels of exception safety and what each level guarantees. He then uses these levels and their guarantees to analyze and construct exception-safe code.
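
The levels usually named in this context are the basic, strong, and no-throw guarantees. As a rough illustration of the strong guarantee (my own sketch with a hypothetical Playlist class, not an example taken from the book), an operation can do all of its risky work on a copy and commit with a swap that cannot fail:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical class used only for illustration.
class Playlist {
public:
    // Strong guarantee: either every title is added or the playlist is
    // left exactly as it was before the call.
    void add_all(const std::vector<std::string>& titles) {
        std::vector<std::string> updated = tracks_;   // work on a copy; may throw
        for (const auto& title : titles) {
            if (title.empty()) throw std::invalid_argument("empty title");
            updated.push_back(title);                 // may throw
        }
        tracks_.swap(updated);                        // commit; does not throw
    }

    std::size_t size() const noexcept { return tracks_.size(); }

private:
    std::vector<std::string> tracks_;
};

// Usage: a call that fails part-way through leaves no trace.
void example() {
    Playlist p;
    p.add_all({"Intro", "Outro"});
    try {
        p.add_all({"Bonus", ""});   // second title is rejected
    } catch (const std::invalid_argument&) {
    }
    assert(p.size() == 2);          // "Bonus" was not half-added
}
```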

In addition to explaining why some things that work actually work, this book did a great job of showing when and where other good ideas will blow up in your face.

Exceptionally highly recommended.

Posted by GWade at 09:55 AM.

February 11, 2004

Object Death

What does it mean for an object to die? In C++, there are several distinct and well-defined stages in the death of an object. Other languages do this a little differently, but the general concepts remain the same.

This is the basic chain of events for an item on the stack.

  1. Object becomes inaccessible
  2. Object destructor is called
  3. Memory is freed

Once an object goes out of scope it begins the process of dying. The first step in that process is calling the object's destructor. (To simplify the discussion, we will ignore the destructors of any ancestor classes.) The destructor should undo anything done by the object's constructor. Finally, after all of the destruction of the object is completed, the system gets an opportunity to recover the memory taken by the object.
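
A minimal sketch of that sequence for stack objects, using a hypothetical Tracked class that records when it is constructed and destroyed:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class that records its construction and destruction.
class Tracked {
public:
    Tracked(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("construct " + name_);
    }
    ~Tracked() { log_.push_back("destroy " + name_); }

    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

std::vector<std::string> run_scope() {
    std::vector<std::string> log;
    {
        Tracked first("first", log);
        Tracked second("second", log);
        log.push_back("end of block");
    }   // both names are now inaccessible; destructors run in reverse order
    log.push_back("after block");
    return log;
}

// Usage: the destructors have run before control passes the closing brace.
void example() {
    const std::vector<std::string> log = run_scope();
    assert(log[3] == "destroy second");
    assert(log[4] == "destroy first");
}
```

The memory for both objects is reclaimed only after their destructors have finished, which is the third step in the list above.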

In some other languages, a garbage collection system handles recovering memory. Some systems guarantee destruction when the object leaves scope, even with automatic garbage collection. However, some of them focus so hard on memory recovery that they provide no guarantees about when, or even if, destruction of the object will occur.

Although many people pay a lot of attention to the memory recovery part of this process, it seems to be the least interesting part of the process to me. The destruction of the object often plays a vital role in the lifetime of the object. This destruction often involves releasing resources acquired by the object. Sometimes, memory is the only thing to be cleaned up, but many times other resources must be released. Some examples include

  • closing a file
  • releasing a semaphore or mutex
  • closing a socket
  • closing/releasing a database handle
  • terminating a thread

These are all issues that we would like to take care of as soon as possible. Each of them also has real consequences if the cleanup step is forgotten or missed, such as a leaked file handle or a lock that is never released.

Anytime I have a resource that must be initialized or acquired and then shut down or released, I immediately think of a class that wraps that functionality in the constructor and destructor. This pattern is often known as resource acquisition is initialization (RAII). Following this pattern gives you an easy way to tell when the resource is yours: your ownership of the resource corresponds to the lifetime of the object. You can't forget to clean up, because cleanup is done automatically by the destruction of the object. Most importantly, the resource is cleaned up even in the face of exceptions.
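
Here is a minimal sketch of such a wrapper. The acquire_handle and release_handle functions and the open_handles counter are hypothetical stand-ins for a real resource API such as a file, socket, or lock; they exist only so the effect can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource API standing in for a file, socket, or lock.
int open_handles = 0;
int acquire_handle() { return ++open_handles; }
void release_handle(int) { --open_handles; }

// The wrapper: owning the resource coincides with the object's lifetime.
class Handle {
public:
    Handle() : id_(acquire_handle()) {}
    ~Handle() { release_handle(id_); }

    Handle(const Handle&) = delete;             // one owner per resource
    Handle& operator=(const Handle&) = delete;

    int id() const noexcept { return id_; }

private:
    int id_;
};

void work(bool fail) {
    Handle h;                                   // acquired here
    if (fail) throw std::runtime_error("work failed");
}   // released here on normal return and during stack unwinding

// Usage: nothing is left open even when work() throws.
void example() {
    try {
        work(true);
    } catch (const std::runtime_error&) {
    }
    assert(open_handles == 0);
}
```

Note that work() contains no cleanup code at all; the only way to leak the handle would be to bypass the wrapper.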

In the systems where destruction may be postponed indefinitely, this very useful concept of object death and the related concept of object lifetime are discarded.

Posted by GWade at 05:49 PM.

February 07, 2004

The Forgotten OO Principle

When talking about Object Oriented Programming, there are several principles that are normally associated with the paradigm: polymorphism, inheritance, encapsulation, etc.

I feel that people tend to forget the first, most important principle of OO: object lifetime. One of the first things that struck me when I was learning OO programming in C++ over a decade ago was something very simple: constructors build objects and destructors clean them up. This seems obvious, but like many obvious concepts, it has subtleties that make it worth studying.

In a class with well-done constructors, you can rely on something very important: if the object is constructed, it is valid. This means that you generally don't have to do a lot of grunt work to make sure the object is set up properly before you start using it. If you've only worked with well-done objects, this point may not be obvious. Those of us who programmed before OO got popular remember the redundant validation code that needed to go in a lot of places to make certain that our data structures were set up properly.

Since that time, I have seen many systems where the programmers forgot this basic guarantee. Every time this guarantee is violated in the class, all of the client programmers who use this class have a lot more work on their hands.

I'm talking about the kind of class where you must call an initialise method or a series of set methods on the object immediately after construction, otherwise you aren't guaranteed useful or reliable results. Among other things, these kinds of objects are very hard for new programmers to understand. After all, what is actually required to be set up before the object is valid? There's almost no way to tell, short of reading all of the source of the class and many of the places where it is used.
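
For contrast, here is a sketch of a class that keeps the guarantee. The Account class and its rules are hypothetical; the point is only that everything required for validity is demanded by the constructor, so there is no initialise step for a client to forget.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical class: a constructed Account is always usable.
class Account {
public:
    Account(std::string owner, long opening_balance_cents)
        : owner_(std::move(owner)), balance_cents_(opening_balance_cents) {
        // Refuse to create an invalid object instead of relying on callers
        // to run an initialise step afterwards.
        if (owner_.empty()) throw std::invalid_argument("owner is required");
        if (balance_cents_ < 0) throw std::invalid_argument("negative opening balance");
    }

    const std::string& owner() const noexcept { return owner_; }
    long balance_cents() const noexcept { return balance_cents_; }

private:
    std::string owner_;
    long balance_cents_;
};

// Usage: the object can be used on the very next line.
void example() {
    Account a("Ada", 500);
    assert(a.balance_cents() == 500);
}
```

What the object needs in order to be valid is visible in one place, the constructor's parameter list, instead of being spread across the class source and its callers.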

What tends to happen in these cases is that the new client programmer copies code that works from somewhere else and tweaks it to do what he or she needs. This form of voodoo programming is one of the things that OO was supposed to protect us from. Where this really begins to hurt is when the class must change to add some form of initialisation: how are you going to fix all of the client code written against it? Granted, modern IDEs can make some of this a little easier, but the point is that I, as the client of the class, may need to change my usage of the object many times as the class implementation changes.

That being said, it is still possible to do some forms of lazy initialisation that save time at construction time. But, the guarantee must still apply for a good class. After construction, the object must be valid and usable. If it's not, you don't have an object, you have a mass of data and behavior.
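
One way lazy initialisation can coexist with the guarantee is sketched below, with a hypothetical Samples class. Construction stays cheap, the expensive value is computed on first use, and the client never has to call a setup method. The computations counter is there purely for illustration, and the caching shown is not thread-safe.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical class: construction is cheap, yet the object is usable at once.
class Samples {
public:
    explicit Samples(std::vector<double> values) : values_(std::move(values)) {}

    // The sum is computed on first use and cached; callers never see the
    // difference and never have to ask for it to be prepared.
    double total() const {
        if (!have_total_) {
            total_ = std::accumulate(values_.begin(), values_.end(), 0.0);
            have_total_ = true;
            ++computations_;
        }
        return total_;
    }

    int computations() const noexcept { return computations_; }

private:
    std::vector<double> values_;
    mutable bool have_total_ = false;   // false until the sum is first requested
    mutable double total_ = 0.0;
    mutable int computations_ = 0;      // for illustration only
};

// Usage: repeated calls reuse the cached value.
void example() {
    Samples s({1.0, 2.0, 3.5});
    assert(s.total() == 6.5);
    assert(s.total() == 6.5);
    assert(s.computations() == 1);
}
```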

The other end of the object's lifetime is handled by a destructor. When an object reaches the end of its life, the destructor is called, undoing any work done by the constructor. In the case of objects that hold resources, the destructor returns those resources to the system. Usually, the resource is memory. But, sometimes there are other resources, such as files, database handles, semaphores, mutexes, etc.

If the object is not properly destroyed, then the object may not be accessible, but it doesn't really die. Instead, it becomes kind of an undead object. It haunts the memory and resource space of the process until recovered by the death of the whole process. I know, it's a little corny. But, I kind of like the imagery.

This concept also explains one of the problems I have with some forms of garbage collection. Garbage collection tends to assume that the only thing associated with an object is memory and that, as long as the memory is returned before you need it again, it doesn't really matter when the object dies. This means that we will have many of these undead objects in the system at any time. They are not really alive, but not yet fully dead. In some cases, you are not even guaranteed that the destructor, or finalizer, will be called. As a result, the client programmer has to do all of the end-of-object cleanup explicitly. This once again encourages voodoo programming, as we have to copy the shutdown code from usage to usage throughout the system.

So keep in mind the importance of the lifetime of your objects. This is a fundamental feature of object oriented programming that simplifies the use of your classes, and increases their usefulness.

Posted by GWade at 12:16 PM.

January 16, 2004

Language Book Intros

In the past year, I've had to move my Java skills from "recognize the language at twenty paces" to "professional Java programmer". In the process, I've been reading a number of books on the language. This has been my approach to learning every language I've ever worked with.

Almost all of the Java books have seemed to have a chapter or section in common that I haven't seen anywhere else. Does anyone know for certain if it is mandatory that every Java book has a "bash the other languages" chapter?

Maybe I've just had bad luck in picking books, but it does seem that almost every one that I have read has a chapter like this. They harp on the "obviously inherently insecure" C, the "dangerous, convoluted" C++, "lowly scripting languages" like Perl, and many other real or imagined flaws of other languages.

Now I do understand that programmers can become quite passionate about their favorite language. Ask almost any programmer about which language is best, or most powerful, and you can expect a lively discussion. But, I really don't recall this kind of diatribe in any other language books that I've read.

When I was first learning C++ (nine or ten years ago), some books devoted space to how C++ allowed for better abstractions and potentially more maintainable code than C. But, this information wasn't in every book and it was not an attack on C. It was framed more as enhanced features for solving different kinds of problems.

When I was first learning Perl (over ten years ago), most of the books talked about the ability to get work done and programmer efficiency. I do remember discussions of using Perl instead of combinations of AWK, SED, and shell scripting. But, I don't recall any attacks on other languages.

When I was learning C (over fifteen years ago), there was almost no mention of other languages in the books I read. There was a lot of talk of solving problems and a strong impression that you could solve any kind of problem with C.

Even when I was learning Forth, there was a lot of talk in the books about the Forth way of solving problems, but other languages were not attacked.

The same holds true for every other computer language I have learned, including FORTRAN, LISP, Basic, and x86 assembler. No books on any of these languages spent much time on the flaws of other languages; they focused on getting a job (or all jobs) done using that language.

One of my biggest gripes about this approach is the waste of space I end up paying for when I buy the book. If I'm buying a book on a particular programming language, I've already made the decision that I will be using the language (at least for the current project). At this point, I wish to learn syntax, idioms, tools, and approaches to solving problems with the language. I am not looking to be convinced that this language is the embodiment of the One, True Way to program.

I'm not looking for the One, True Way to program. I have many languages in my toolkit. I try to use the best one for each job.

Posted by GWade at 11:29 PM.

Unit tests that should fail

I was doing a little research on the Java JUnit test framework and ran across the article The Third State of your Binary JUnit Tests.

The author points out that in many test sets there are ignored tests as well as the passing and failing tests. As the author says, you may want to ignore tests that show bugs that you can't fix at this time. He makes a pretty good case for this concept.

The Perl Test::More framework takes a more flexible approach. In this framework you can also have skipped tests and todo tests in addition to tests that actually need to pass. These two different types of tests have very different meanings.

Skipped tests are tests that should not be run for some reason. Many times tests will be skipped that don't apply to a particular platform, or rely on an optional module for functionality. This allows the tests to be run if the conditions are right, but skipped if they would just generate spurious test failures.

Todo tests have a very different meaning. These tests describe the way functionality should work, even if it doesn't at this time. The test is still executed. But, if the test fails, it is not treated as a failure. More interestingly, if a todo test passes, it is flagged as an unexpected success, because the test was not expected to pass. This allows bugs and unfinished features to be tracked in the test suite with a reminder to update the tests when they are completed.

Unlike the idea in the referenced article, these two separate mechanisms don't ignore tests that cannot or should not pass. Instead, we can document two different types of non-passing tests and still monitor them for changes.

Posted by GWade at 12:58 PM.