Programmer Musings: August 2004 Archives

This site will look much better in a browser that supports web standards, but is accessible to any browser or Internet device.

August 26, 2004

The Forgotten Engineering Principle

Over the last few years, I've been frustrated with the "Software Engineering" industry. I was actually trained as an engineer, and much of what I see passed off as software engineering is not engineering. I've been mulling this over for some time without being able to put my finger on the part that bothers me. I think I know what it is, but I need to warn the process people and the methodology people that they should read to the end before disagreeing. This concept is not hard to understand, but I'm pretty certain my writing skills aren't up to making this impossible to misunderstand.

The forgotten principle is the concept of Good Enough. Engineering has never been about making something perfect, it has been about making it good enough. By good enough, I don't mean shoddy. I don't mean cheap. I mean exactly what I said, good enough. Now I'm sure some out there are beginning to mutter something about "undisciplined, cowboy coder," but bear with me.

"Good Enough" in Electrical Engineering

I was trained in electrical engineering, so let's start by looking at good enough in those terms. One component that everyone has at least heard of is the transistor. Most people understand that transistors are used to make amplifiers and that they are used inside CPUs. What people don't realize is that, except in very special circumstances, transistors make lousy amplifiers.

A transistor has several modes of operation. In one mode, the linear region, the transistor performs pretty well as an amplifier. At the edges of this region or outside that region, the transistor's behavior is non-linear. However, some engineer decided that if we could constrain the input and limit the amplification, we would gain a small component that amplifies well enough to be useful. Over the years, several variant transistors have been developed that have somewhat different linear capabilities, by tying different transistors together and making use of the "good enough" capabilities we have very good transistor-based amplifiers.

Outside the linear range, transistors work sort of like voltage-controlled switches. The problem was that they leaked current when they were turned off. They also tended to act as amplifiers when they passed through this "linear region" in the middle of their range. But, once again, some engineers decided that the transistor worked "enough" like a switch to do the job. Over time, the designs were refined and we now have transistor-based CPUs in desktop machines that have operating frequencies in excess of 2GHz.

If the people investigating transistors had taken the original transistor and tried to use it in a 2GHz CPU design or in a fancy multistage ammpilfier for CD-quality sound, they would have given up in despair. There is no way that the early transistors could have worked this way. But they realized that the current design was good enough to use in some applications, and they trusted that it could be improved given time.

There are numerous other components and circuits that have similar histories. None of them behave in a perfect fashion. But, in a controlled environment, these components perform well enough to do useful work.

"Good Enough" in Engineering

In engineering, almost every problem has a huge number of constraints and requirements. Many of these requirements are mutually exclusive. Engineers look at various constraints and compare requirements and find tradeoffs to try to meet all of the requirements well enough to allow the product to be built.

When automotive engineers are designing a car, they have to design for safety. They could design a car that is perfectly safe. You could design one that contains special cushioning that can support every part of the body in any kind of collision. In addition, we could make the car so that it has built-in life support for use in case the car goes under water. While we are at it, let's limit the top speed of the vehicle to 5 mph. Obviously, this would be safer than anything on the road. But, no one would buy it. First of all, it would cost too much. Second of all, we could not use it to get anywhere in a reasonable amount of time.

Anybody who has had foundation work done has probably wondered why the house wasn't built better in the first place. If you can repair the house so that the foundation is in good condition, why couldn't it have been built that way in the first place? I talked with an engineer that inspects foundations about that subject a short while back. He pointed out that the changes to solve this problem are well understood and a builder could easily add them the the construction of a house. Unfortunately, that change would significantly increase the cost of the house. Sure, you would like to have a house that would never have foundation problems. But, are you willing to pay three times as much to protect yourself from something that might happen in ten or twenty years? What if you move before then?

"Good Enough" in Computers

Shortly before I began developing software, a little revolution was going on in the computer world. Any serious computer work was being done on mainframes or mini-computers or on supercomputers. Mini-computers were slowly taking the market that mainframes had enjoyed. There had been quite a bit of excitement about microcomputers and the Personal Computer had only been out for a short time.

Many people, including IBM, expected the PC to be a toy. There was no way this toy would ever compete with the big machines. What many people found was that the PC was good enough to solve their problems. Obviously it didn't have the power of one of the bigger systems. In might need hours or days to solve a problem that would take minutes on a bigger piece of hardware. But, since it was in your office, you didn't need to wait days or weeks to get access to the big machines that others were sharing. And although it couldn't do everything that the big machines could do, the PC was good enough to "get my little job done".

Probably the best "good enough" story has to be the World Wide Web. Many people had looked at large Hypertext systems before Tim Berners-Lee. All of them had been stymied by one fundamental problem: How do you keep all of the resources synchronized? If there are dozens of computers in the network participating in this system, how can you make sure that all of the resources are constantly available.

Berners-Lee took a slightly different view. If we have some way to report a connection as failed and most of the links are available most of the time, maybe that would be good enough. He developed a simple markup language that provided good enough quality content and deployed the first web servers and browsers. Despite the annoying 404 problem and the fact that the early pages were basically ugly, the Web caught on and spread like wildfire. Obviously, a good enough reality was much better than a perfect theory.

Missing "Good Enough" in Process

Lately, some in the software engineering field seem to have lost this concept of good enough. More and more time and money are applied to create the perfect process that will allow for perfect software to be developed. People seem to think that more process will solve all of the problems. Anyone who questions the process is a heretic that wants to just hack at code. But, the fact of the matter is that in every other engineering discipline process supports the engineering principles, it doesn't replace them.

The principle of good enough, is an application of judgement to the problem. We could spend an infinite amount of time optimizing any given requirement. But a good engineer looks at all of the requirements, applies his/her experience, and says this part of the solution is good enough, we can move on. Some of the "process" in other forms of engineering serves to document that judgement call in a way that can be checked later.

Missing "Good Enough" in Design

Process isn't the only issue. Many people seem to be obsessed with big do-everything frameworks. Even simple work is performed using massive frameworks of cooperating components that will handle your little problem and any that you might scale up to. Companies have been built on using these big tools for solving hard problems. Consequently, some people feel the need to apply this big hammer to every problem they encounter.

Not every problem should be solved with a robust web-application. Not every web-based application needs J2EE. Not every data transfer needs an advanced XML application with full XML Schema support. Sometimes a good enough solution this week is better than the be-all-and-end-all solution two years from now.

Conclusion

You may wonder what all this means to you.

The next time you find yourself designing a huge multi-tier, system with clustered databases, stop and think about how little would be needed to solve your problem. Sometimes you may find that you can get away with a static output updated once a week instead of an automatically generated report with 24-7 availablility.

On the other hand, looking at the problem from another direction may force you to look more carefully at your tradeoffs. This may help you focus on what's really critical. This is the path to doing real engineering.

Posted by GWade at 07:25 PM. Email comments

August 11, 2004

Participation vs. Hacking

In The Architecture of Participation vesus(sic) Hacking, Carlos Perez argues against points he sees in the essay Great Hackers by Paul Graham. Having read the two essays, I find Perez's comments enlightening, but maybe not in the way he intended. I found things in both essays that I agree with, and things in both that I disagree with. There are some that I feel deserve comment.

Perez begins by focusing on a comment made by Graham that of all the great programmers he can think of, only one programs in the Java programming language. Perez, and others, take immediate offence at this comment and move to respond.

However, they miss an important point of the comment. Graham specifies of the great programmers he can think of. Later in the essay, Graham describes how hard it is to recognize a great programmer unless you've worked with him or her. I could make a similar comment that none of the great programmers I know program voluntarily in Lisp or Python. This does not mean that great programmers don't write in those languages, I just don't know any.

Furthermore, I think I'd go even farther to point out that the Java programming language currently attracts a large number of programmers of varying levels of skill for one reason only; someone will pay them to work in Java. As a side effect, you are more likely to find average or below average programmers working in Java than great programmers simply because there are a lot more to sort through. That does not detract from the great programmers you will find.

Having blasted Graham for an attack on his favorite language, Perez goes on to attack other "Hacker" languages. He writes that

Nothing has ever been built in Lisp, Python or any other "Hacker" language that comes even close to the innovation in the Eclipse platform. I challenge anyone to point one out.

Many people in the comments section of his essay point out many counterexamples. There's no need for me to repeat those here.

I do find it interesting that Perez starts his whole essay based on one off-handed slap at Java and in the process makes an equivalent unsupported swipe at the Perl programming language (the write-only comment). I still find this ongoing misconception to be annoying. I've programmed professionally in Fortran, C, C++, Perl, Forth, and Java. I haven't seen a single one of these languages that did not allow write-only code. (In fact, I've seen some spectacular examples in Java.) I've also seen beautifully written code in all of these languages. The readability of any given piece of code is more a reflection on the particular programmer and the problem at hand than it is on the language.

Perez also finds confusion by what he calls Graham's definition of the word "Hacker". I find this amusing. Graham is using the definition of hacker that was accepted before the media appropriated it for "people who write viruses and other bad things". (See Jargon 4.2, node: hacker) I remember this definition of hacker in a version of the Jargon file back in the late 80's.

Perez's final mistake is in what he perceives as Paul Graham's fatal flaw. He points out that sooner or later, every programmer enters maintenance mode and that no hacker would remain at that point. Perez defends this idea by quoting Graham's comment that hackers avoid problems that are not interesting. He proceeds to use the legacy code argument to show why you shouldn't rely on hackers or hacker languages.

Unfortunately, Graham didn't say anything about great programmers and maintenance mode or legacy code. So this argument has nothing to do with Graham's essay. Moreover, I have seen many cases where the beginning of the project was better described by Graham's death of a thousand cuts comment than the maintenance of any legacy system I've worked on. More importantly, GNU software and the entire open source movement is driven by programmers (many of them great programmers) maintaining software for years.

All in all, I think Perez would have made his points better if he had not taken the Java comment quite so personally.

Posted by GWade at 11:03 AM. Email comments

August 10, 2004

Review of The Little Schemer

The Little Schemer
Daniel P. Friedman and Matthias Felleisen
The MIT Press, 1996.

One of my wife's friends recommended this book for learning Scheme. He's a big proponent of Scheme and has even done research into teaching Scheme to kids. He is quite knowledgeable in his field. On the other hand, I have never written a line of Scheme; although I did some coursework with LISP during my master's degree. Although I don't normally choose to work in LISP(-based) languages, I can appreciate some of their power.

I realized that this book was going to require a bit of suspension of disbelief in the preface, where I found this gem:

Most collections of data, and hence most programs, are recursive.

I agree that there are many useful collections of data that are recursive. I would even agree that many programs apply recursion. But I find the assertion that most are recursive a little strong. In fact, the only way I could see this is if the language the writers were working in treated almost everything as recursion. And, of course, this is the case.

The other real problem I had with the book is the style. The book is written as a series of questions and answers that lead you to the conclusions that they wish. Some of these question and answer sessions became quite strained; such as trying to explain how a complicated function works. In other spots, the authors asked a question that there is no way the reader could have begun to answer. The authors would then respond with something like:

You were not expected to be able to do this yet, because you are missing some of the ingredients.

I found this style very jarrng. I've learned a dozen or so languages from books (including LISP), and I've never had this much trouble understanding a computer language. The style may work for someone else, but it did nothing for me.

From the comments above, you might decide that I have nothing good to say about the book. That's actually not the case. I found the Five Rules and the Ten Commandments to be very effective ideas. The Five Rules basically define some of the most primitive operations in the language. The Ten Commandments state some important best practices.

I was also surprised at times by really sweet pieces of code and understanding that would come after some prolonged pseudo-Socratic Q&A sessions. Some of the insights and commandments are well worth the read. But, overall I found it difficult going.

Since it is considered one of the classics for learning Scheme, you should probably consider it if you are needing to learn Scheme or LISP. If all you've done is code in C or Java, it might be worth reading to introduce yourself to another way of looking at problems. But, I find it very hard to recommend this book to anyone else.

Posted by GWade at 10:19 PM. Email comments

August 08, 2004

Subversion

If you haven't tried Subversion yet, you really owe it to yourself to give it a try. I've used CVS for over a decade now and I've been trying Subversion for a little less than a year. I haven't yet moved most of my home projects to Subversion, but it's looking more probable every day.

The ability to rename and reorganize your files and directories without losing history is wonderful. The separation of status from update is great. I'm slowly coming to appreciate the properties system. It's really great to have a mime-type associated with each file and all the potential that goes along with that.

If you want to get started with Subversion, you can download a version at the URL above. You'll also want to read Version Control with Subversion, which is available on-line or in hard copy.

Posted by GWade at 11:04 PM. Email comments

Programmers and Pattern Matching

Contrary to expectations, many programmers do not solve problems through logic. This should not be surprising, as the same is true of humans in general. But many programmers I've known whole-heartedly disagree with that comment.

I've noticed that many really good programmers are extremely good at pattern recognition and solve many problems through a creative use of that skill. The human brain actually has amazing capabilities for pattern recognition. Two really good examples of this skill are language and facial recognition. When you think about it, much of language consists of recognizing patterns of sounds at a level below conscious thought and reinterpreting that into a higher level meaning. The ability to pick up subtle clues from a person's expression or recognize a person after many years (and the physical changes that come with that time) is truly amazing when you stop to think about it.

Most really good programmers are fascinated by patterns. They often play with patterns in ways that non-programmers might find bizarre. Many programmers really get caught up in puns, word-play, debate, poetry, and other forms of linguistic patterns. Many are fascinated by puzzles and mazes. Some spend lots of time with art. In all of these cases, the programmers seem to be exercising their pattern skills more than most.

It turns out that pattern recognition can be a really efficient method of solving problems. Either deductive or inductive reasoning requires walking through all of the relevant data of the problem one step at a time and connectinve them all together. Pattern matching allows you to focus on the shape of the problem to recognize a problem you've seen before. If you don't agree that the pattern recognition is easier, compare recognizing the face of a friend to the effort of describing that friend in enough detail for someone to draw them.

This ability allows a programmer to solve complex problems by breaking it into simpler problems that are recognized and therefore already (or simply) solved. This approach leaves the programmer free to concentrate on the new, interesting problems, instead of wasting brain power on the stuff that he/she recognizes. Over time, this develops into a very large mental list of patterns and solutions that can be applied at a moment's notice.

Some programmers allow this habit to get them stuck in a rut. They always apply the same solution even if better techniques exist. The better programmers are constantly learning so that they have a set of tools to apply to most problems. This still allows the efficiency of the pattern recognition approach with the power of a (near) custom solution if necessary.

Posted by GWade at 10:52 PM. Email comments