Anomaly ~ G. Wade Johnson
Notice: Programmer Musings has moved to gwadej.org. This copy of the blog will be closed down presently.

September 03, 2015

Review of The Art of Readable Code

The Art of Readable Code
Dustin Boswell and Trevor Foucher
O'Reilly Media, Inc., 2012

When I saw this book, it seemed like a great idea. I've spent my entire career with legacy codebases, trying to make them better. A book that covers ways to make code more readable, and therefore more maintainable, is a wonderful resource.

The book starts with a discussion of small changes that you can make to improve code: naming, aesthetics, and commenting. The authors do a good job of covering what makes a good name and how careful naming can really help the readability of the code. They touch on the lengths of names and avoid the common mistake of mandating that long names are always best. In the section on aesthetics, they discuss how the look of the code can help with understanding, with similar structure conveying similar intent. They even spend a lot of time on making comments pull their own weight and what makes a good comment.

The second section covers structuring control flow and expressions for readability. In the process, they touch on, but don't really go into, the importance of idioms for making code more recognizable even when you can't read it in depth. Despite that lack, they make a good case for standardizing sections of code into similar forms and simplifying loops and conditionals. Since complicated code structure is hard to read, simplifying these structures improves readability.

The third section moves out of small-scale changes and begins talking about larger restructuring to improve code. In this section, they finally mention some standard forms of refactoring. They also advocate thinking a bit more about what you are trying to accomplish before writing the code. They do not push for full-scale BDUF (big design up front), but instead focus on thinking through subroutines and chunks of code to make certain that you understand what you are doing before writing a bunch of code.

The final major part of the book covers testing and one large case study showing how to apply what you've learned in the book. In the case study, they start with a sub-optimal solution to a problem and make it better and faster while improving its readability. They come very close to implying that the readability is the cause of the performance increase without quite going there.

Overall, I found the book to be quite readable. They did not go as far as I would have in a few areas. They used several programming languages for their examples. For each of those languages, they focused on how to use the idioms of the language to make their point. That handling of languages makes one of my disappointments with the book stand out. In the section on control flow, they make the statement:

"Many respected programming languages, as well as Perl, have a do { expression } while (condition) loop."

The only mention of any language that they were not using in their examples is a snarky reference to Perl. I understand that many people really don't like the language, but the random sniping does get old. I was especially amused when they praised the newer languages on their list for features that have been in Perl for decades. That suggests that they may never have programmed in the one language they bashed.

Despite that one piece of snark, I would recommend this book to a junior programmer trying to learn to code. For intermediate or senior-level programmers, this book could be a start, but I would expect to go further into idioms, choice of audience, code smells, and other issues relating to high-quality code.

Posted by GWade at 01:50 PM.

September 01, 2015

Secure Development: Threat Models

There are numerous issues that you need to consider when developing almost any software. If you are working on software that connects to a network in any way, security is yet another thing that you need to consider.

To introduce this series on Designing Secure software, I'm going to talk about something that normally gets left out of discussions about security: threat assessment.

But first, let's go over how security discussions usually play out...

The Attack

Many companies, or even just development groups, don't think about security until one of a few things happens:

  • They get attacked
  • One of their customers is attacked (and it might be their fault)
  • A competitor gets attacked
  • A big-name company somewhere (Target, etc.) is attacked

In the first two cases, the result is normally yelling at the development staff to find out why they didn't make things secure. The other two cases normally start with an emergency meeting to ask "Are we safe from that?"

In most cases, little to no thought was spared up front for security. Everyone was focused on features, usability, look, and other issues that seem to translate directly into dollars. As usual, hidden issues don't get much attention unless they go wrong.

Once the emergency happens, the powers-that-be want the development staff to smear some security on the system to protect from attack. The big problems with this approach are:

  • There's no thought about what "secure" means
  • There's no thought about what we need to be secure from
  • It doesn't work

The Secure System

I'll cover the myth that a system can be absolutely secure in a later post. Suffice it to say that if money is no object, a sufficiently powerful and motivated attacker can get into any system.

Identifying Attackers

Which leads to the second point: what kinds of attackers are you trying to protect against? To a large extent, the kinds of attackers you expect determine the kinds of attacks you are likely to see. Some of the possible attackers you might need to protect against include:

  • Script kiddies
  • Bored students
  • Disgruntled former employees
  • People with a grudge against the company
  • People with a grudge against an executive in the company
  • Competitors
  • Fledgling hackers practicing their skills
  • Disreputable companies selling security solutions
  • Small hacking groups looking for fun or reputation
  • Large hacking groups with an ideological or political agenda
  • Foreign companies looking for an advantage
  • Criminal organizations looking to make money
  • Nation-funded hackers looking for political advantage
  • Law enforcement agencies investigating your company
  • Law enforcement agencies investigating your customers
  • Three-letter agencies trolling for possible persons of interest

This is not a complete list, but it does a fair job of covering the range of attackers you might face. The kinds of attackers you expect determine what kinds of security you need.

What Applies to You?

If your service serves a community of people who discuss different varieties of carnations, you are not likely to be a target of organized crime, Chinese hackers, or the FBI. There's not much reason for a large hacking group to go after your community. You might be the target of vandalism or someone trying to plant malware on your site, but those attacks are of a much different caliber than a high-end, organized attack.

You need to determine what is important about your site that you need to keep safe. Are you storing passwords, credit card numbers, or personal identification information on your clients? You probably need to be more careful than if you are storing a user-chosen alias. If you are storing serious financial information, you need to be even more secure.

What would happen if someone gained access to all of the information you have on your customers? Would it be embarrassing, a source of financial difficulties, or life-threatening?

What Matters?

Let's look at a few scenarios to see how you might make some decisions.

Carnation Community

Let's start with our example from the last section. Say you have a site that supports a group of flower enthusiasts chatting about carnations. What you have determines who might attack and how.

If you have a username and password for each user to allow them to post, there are a small number of possible attacks. The first is malware or malicious links. Anywhere your users are allowed to post something that you can display will need some level of protection from this threat. But this applies to almost all sites. The important part is what is special about your site.
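
As a minimal sketch of one such protection, the following escapes user-supplied text before it is placed in a page, so posted markup is displayed rather than interpreted by the browser. The function name render_post and the HTML layout are hypothetical, chosen only for illustration. This covers markup injection only; it does nothing about links that point to malicious sites.

```python
import html


def render_post(author: str, body: str) -> str:
    """Return an HTML fragment for a user post with markup neutralized.

    Hypothetical helper: the name and the surrounding <div> are assumptions.
    """
    # html.escape replaces &, <, > and (by default) quote characters with
    # HTML entities, so the browser shows the text instead of running it.
    safe_author = html.escape(author)
    safe_body = html.escape(body)
    return f'<div class="post"><b>{safe_author}</b>: {safe_body}</div>'
```

Escaping at the point of display, rather than when the post is stored, keeps the original text available if the display rules change later.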

Given the known breaches that have released large numbers of passwords, your passwords are a target. Since, in this example, there is no email address or real name associated with the account, the passwords are only mildly valuable.

If your passwords are not stored in the clear, the biggest threat to your system is people hacking in and posting something that harms the reputation of a user in your system.
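
One common way to avoid storing passwords in the clear is to keep only a salted, deliberately slow hash. The sketch below uses PBKDF2 from the Python standard library. The function names and the iteration count are assumptions for illustration, not a recommendation tuned for any particular system.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for illustration; pick a value suited to your hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for the given password.

    A fresh random salt is generated unless one is supplied.
    """
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and digest are stored. A stolen user table then yields values that must be attacked one guess at a time, instead of a list of ready-to-use passwords.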

This would probably not attract the attention of anyone with a large amount of resources. Although, you might have to watch out for the sunflower hacking squad. They might want to deface your pages as part of their ongoing campaign to take the most popular flower spot.

A Bit More Tempting

Let's say we increase the amount of information on the carnation chat site. In addition to the username and password, let's say you include contact information: real name, physical address, and email address. You use this to send personalized offers for flower shops.

This makes the passwords more valuable (many people have one email address and use the same password on multiple sites). The email address is always of interest to spammers. The physical address and real name together give more information for potential identity theft.

Notice how a small amount of information significantly increases the potential threats.

The Carnation Store

Say the site has added the ability to buy carnations and have them shipped to you. If you store credit cards, you have just become the target of larger groups. Organized crime, larger hacking groups, and small-timers trying to make a reputation will all be interested. You now have something that translates directly to cash.
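
The three scenarios above can be summarized as a lookup from what you store to who is likely to care. The sketch below is only that: the category names and the mapping are hypothetical labels drawn loosely from the scenarios in this post, not a complete or authoritative threat model.

```python
# Hypothetical mapping from kinds of stored data to likely attackers,
# based loosely on the carnation scenarios described in this post.
THREATS_BY_DATA = {
    "user_posts": {"vandals", "malware spreaders"},
    "passwords": {"password harvesters"},
    "contact_info": {"spammers", "identity thieves"},
    "credit_cards": {"organized crime", "large hacking groups", "small-timers"},
}


def likely_attackers(stored_data):
    """Return the set of attacker types suggested by the data a site holds."""
    attackers = set()
    for kind in stored_data:
        # Unknown kinds contribute nothing rather than raising an error.
        attackers |= THREATS_BY_DATA.get(kind, set())
    return attackers
```

Calling likely_attackers(["user_posts", "passwords"]) describes the plain chat site, and adding "credit_cards" to the list brings in the larger groups. The point is that each new kind of stored data should prompt a fresh look at the list.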

Summary

The more advanced the attacker, the more effort and expense will be needed to secure the software. Identifying who is likely to attack your software allows you to provide reasonable security without spending too much effort.

What you have and what you do determine who would be interested in attacking you. Identifying potential threats determines how much effort you need to put into securing your (and your users') information.

Putting too much effort into security, on the other hand, could cause a project to miss a deadline or fail completely.

Posted by GWade at 07:55 AM.

August 25, 2015

BPGB: (Dis-)Integration Branches

This is another post in my intermittent series of Best Practices Gone Bad (BPGB).

Today, we are going to take another side-step into version control. Most development groups use version control of some form. Whether you prefer Subversion, Git, Mercurial, Bazaar, ClearCase, or any of the many others, version control is an important technique for keeping your changes under control. This is especially true if you are maintaining multiple releases concurrently or have more than a couple of developers on your team.

Back in BPGB: Feature Branch Fail, we covered what happens when branches live for too long. When you need to merge multiple long-running branches, you increase the probability of conflicts.

Integration Branch

In order to prevent these kinds of conflicts from messing up the main branch, many people discover the idea of an integration branch. You branch from the main line, merge multiple feature or bug fix branches into this integration branch, and fix any conflicts there. When the integration branch is clean and survives the tests, you merge the branch back to main.

This approach seems pretty reasonable and usually solves the first set of problems that people have with branch conflicts. Although the conflicts still exist, the integration branch gives us the time to resolve conflicts without leaving the main branch broken. If the conflicts take time to merge, we don't have main in a broken state. If the conflicts are too overwhelming, we have an easy way to back out. Maybe merging branches in a different order will make the conflict resolution easier. In any case, we have a few more options. Life is good.

Then, someone has the idea that recreating the integration branch each time we want to do this is a waste. The obvious approach is to leave the integration branch around and just keep it in sync with main. Although this seems reasonable, we have just turned the integration branch into the equivalent of a long-lived feature branch. As we found in the previously mentioned post, this tends to result in worse conflicts and pain.

If the integration branch is not kept in sync with main, there is a real possibility of problems when integration is merged to main. I've also seen situations where someone decides that the integration branch is obviously more up-to-date and overwrites (force push) the main branch, potentially losing changes that had already been merged. This becomes the same kind of issue that we were trying to solve with the integration branch in the first place.

Key to making the integration branch strategy work is that this branch starts out identical to your main branch before you begin merging. If there is any difference at all, you court the possibility of doing a bunch of work to get the integration branch functional, only to have the same problems again when you merge to the main line.
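
For Git users, a throwaway integration branch that is recreated from main each time might follow a command plan like the one sketched below. The function only builds the list of commands; it does not run them. The branch names and the test command are assumptions for illustration, and other tools will need different commands.

```python
import shlex


def integration_plan(main, branches, test_cmd, integration="integration"):
    """Build the commands for one short-lived integration cycle.

    All names are illustrative; nothing is executed here.
    """
    plan = [
        # Create the integration branch fresh from main so the two start identical.
        ["git", "checkout", "-b", integration, main],
    ]
    for branch in branches:
        # Merge each feature or bug fix branch; conflicts are resolved here.
        plan.append(["git", "merge", "--no-ff", branch])
    # Run the project's tests before anything touches main.
    plan.append(list(test_cmd))
    plan += [
        ["git", "checkout", main],
        # --ff-only refuses the merge if main has moved since we branched.
        ["git", "merge", "--ff-only", integration],
        # Delete the branch so it cannot become long-lived.
        ["git", "branch", "-d", integration],
    ]
    return plan


def show(plan):
    """Print the plan as shell-style command lines for review."""
    for cmd in plan:
        print(shlex.join(cmd))
```

Using --ff-only for the final merge is one design choice among several. If main has changed in the meantime, the merge is refused, which is a signal to throw the integration branch away and start again from the current main.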

Summary

Most version control tools provide methods for maintaining and merging multiple lines of development. Despite the fact that the tools have become increasingly good at recognizing and resolving simple conflicts, human intervention may still be required. Care is needed to make sure that you reduce the effort needed to make changes rather than just move the effort.

One real anti-pattern for version control is long-lived branches. There are a few cases where they make sense, but those cases are a lot rarer than people believe. Don't ever make a long-term integration branch solely to save the time of setting up and tearing down this branch as needed. The pain will quickly outweigh the minor benefit.

Posted by GWade at 07:46 AM.

August 13, 2015

Review of Release It!

Release It!
Michael T. Nygard
Pragmatic Bookshelf, 2007

I've had this book on my shelf for a few years, and finally got some time to start reading it. I should not have waited.

Nygard takes the position that the life of a piece of software actually only begins when it is released. He spends a lot of time on the way things go wrong in production that you won't see in a development environment. Anyone who has ever survived a push to release and been surprised that there's no time to relax will really appreciate this book.

The book contains a large number of war stories showing things going wrong in real projects. Some of the failures are obvious, others are surprising. Nygard distills these failures down to some anti-patterns that can cause problems with stability or capacity. Then, he provides design patterns that can mitigate or eliminate some of the problems in release.

Some of these design patterns seem obvious: Use Timeouts or Pool Connections. Others are less familiar: Circuit Breaker or Bulkheads. Like the patterns from the Gang of Four book, a large part of the benefit of the patterns is having common names and descriptions of the patterns. This applies whether you have been using them for years or have seen them for the first time. If you are seeing them for the first time, his descriptions are good enough that you should quickly understand the pattern.
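
For readers who have not met Circuit Breaker before, here is a rough sketch of the general idea as it is commonly described: after repeated failures, stop calling the troubled service for a while instead of piling on more requests. This is not code from the book. The class name, parameters, and thresholds are all assumptions for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Minimal illustrative circuit breaker; not production-ready."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable clock, useful for testing
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail fast without touching the service.
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down has elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                # Open (or re-open) the circuit and restart the cool-down.
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote request, for example breaker.call(fetch_inventory, item_id), where fetch_inventory is a hypothetical function that talks to another system, and treat CircuitOpenError as a signal to use a fallback.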

The section on general design principles is very effective, but the part that most developers really need to read is the section on Operations. Too often, those of us developing software forget what the operations people need from the software. When we think of them at all, we try to provide a nice interface for a few admin tasks. Nygard points out that a pretty interface is nowhere near as useful as a scripting or command line interface. He also describes examples of how this kind of approach actually helps operations.

Overall, this is a really great book for anyone who is releasing software that must run in a production environment. If you fall in this camp, you should get the book and read it before your next project.

Posted by GWade at 02:40 PM.