
Anomaly ~ G. Wade Johnson

March 25, 2006

Domain Specific Languages, a Renewed Interest

I've seen quite a bit of interest in Domain Specific Languages (DSLs) on the Internet lately. Some good examples include Martin Fowler's exploration of the subject:

* MF Bliki: DomainSpecificLanguage
* Language Workbenches: The Killer-App for Domain Specific Languages?

He does point out that this is not a new idea, and he uses Unix as an example of a system built around a large number of DSLs. The subject has attracted enough interest that people are now discussing when a DSL is the right approach (Artima Developer Spotlight Forum - How and When to Develop Domain-Specific Languages?). Others are beginning to apply the term DSL to their own areas of interest (Agile Development with Domain Specific Languages).

So, from this we can guess that interest in DSLs is on the rise. As Fowler pointed out, Unix has been a nexus for the creation of DSLs, including:

* make
* regular expressions
* awk
* sed
* yacc
* lex
* dot

and many more.

Recently, I have seen the suggestion that extending or modifying a general purpose language is a powerful way to build a useful DSL. To some extent, this is also well-trodden ground. The classic examples of this approach used preprocessors to provide facilities not present in the original language. Both Ratfor (for structured programming in FORTRAN) and cfront (for object oriented programming on top of C, by translating C++ into C) worked this way.

A recent article, Creating DSLs with Ruby, discusses how the features of the Ruby language make it well suited to building DSLs without a separate preprocessing step. The Ruby language apparently has features that support creating simple DSL syntax that is still legal Ruby code.
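To make the idea of a DSL that is still legal host-language code concrete, here is a minimal sketch in Python rather than Ruby. It is not taken from the article mentioned above; the `Field` and `Rule` classes and the `can_register` rule are hypothetical names invented for illustration. Operator overloading lets a small rule notation read like a specialized language while every line remains ordinary Python.

```python
import operator


class Rule:
    """A predicate over a record (a dict), plus a readable description.

    Hypothetical helper class, invented for this illustration.
    """

    def __init__(self, test, text):
        self.test = test
        self.text = text

    def __call__(self, record):
        return self.test(record)

    def __and__(self, other):
        # `rule_a & rule_b` builds a new rule requiring both.
        return Rule(lambda r: self(r) and other(r),
                    f"({self.text} and {other.text})")

    def __or__(self, other):
        # `rule_a | rule_b` builds a new rule requiring either.
        return Rule(lambda r: self(r) or other(r),
                    f"({self.text} or {other.text})")


class Field:
    """Names a key in the record; comparisons return Rule objects."""

    def __init__(self, name):
        self.name = name

    def _compare(self, op, symbol, value):
        return Rule(lambda r: op(r[self.name], value),
                    f"{self.name} {symbol} {value!r}")

    def __eq__(self, value):
        return self._compare(operator.eq, "==", value)

    def __ge__(self, value):
        return self._compare(operator.ge, ">=", value)

    def __lt__(self, value):
        return self._compare(operator.lt, "<", value)


# The "DSL" part: this reads like a rule language but is plain Python.
age = Field("age")
country = Field("country")
can_register = (age >= 18) & (country == "US")

print(can_register.text)
print(can_register({"age": 20, "country": "US"}))
```

The parentheses around each comparison are required because `&` binds more tightly than `>=` in Python. That is a small reminder that an embedded DSL inherits the host language's grammar.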

This is a very powerful technique that is not very easy to do with most languages. Amusingly enough, this technique is also not very new. In fact, there was a general purpose programming language that was designed around the concept of writing a domain language: Forth. If I remember correctly, Charles Moore once described programming in Forth as writing a vocabulary for the problem domain, defining all of the words necessary to describe the answer, and then writing down the answer.

The Forth language is different than most programming languages you might have encountered because it has almost no syntax. What looks like syntax is actually a powerful technique for simple parsing, combined with the ability to execute code at compile time. This allows for extending the capabilities of the language in a very powerful way. One interesting effect of this feature is that many good Forth programmers naturally gravitate toward the DSL approach when solving problems above a certain level of complexity. We firmly believe that some problems are best served by a language, not an arcane set of configuration options.
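The flavor of this can be sketched with a toy interpreter. The following is a rough Python approximation, not real Forth: the `MiniForth` class is a hypothetical name, only a few words are built in, and a colon definition is stored as a list of tokens and re-interpreted rather than compiled as an actual Forth system would do. It does show the two points above: parsing is little more than splitting on whitespace, and a `: name ... ;` definition adds a new word to the vocabulary.

```python
class MiniForth:
    """Toy Forth-like interpreter (illustrative sketch, not real Forth)."""

    def __init__(self):
        self.stack = []
        # The vocabulary: each word maps to a Python callable.
        self.words = {
            "+": lambda: self._binary(lambda a, b: a + b),
            "*": lambda: self._binary(lambda a, b: a * b),
            "dup": lambda: self.stack.append(self.stack[-1]),
        }

    def _binary(self, fn):
        # Pop two operands and push the result.
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append(fn(a, b))

    def run(self, source):
        # "Parsing" is just splitting the source on whitespace.
        tokens = iter(source.split())
        for token in tokens:
            if token == ":":
                # Colon definition: collect tokens up to ";" as the body.
                name = next(tokens)
                body = []
                for t in tokens:
                    if t == ";":
                        break
                    body.append(t)
                # Simplification: keep the body as text and re-run it on use.
                self.words[name] = lambda body=body: self.run(" ".join(body))
            elif token in self.words:
                self.words[token]()
            else:
                # Anything that is not a known word is treated as an integer.
                self.stack.append(int(token))
        return self.stack


forth = MiniForth()
forth.run(": square dup * ;")            # extend the vocabulary
print(forth.run("3 square 4 square +"))  # then write down the answer
```

Once `square` is defined, it is used exactly like the built-in `dup` or `+`. Nothing in the source marks it as an extension.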

Forth does give us an important insight into the problems with DSLs, as well. There is a well-known joke among Forth programmers:

If you've seen one Forth...you've seen one Forth.

Unlike more traditional programming, Forth programs are built by extending the language. A new programmer trying to learn a Forth system needs to learn the new dialect including the extensions used in this environment. This is not technically much different than learning all of the libraries and utility classes used in a more traditional system, but there is a conceptual difference. In Forth (or a DSL-based system), there is no syntactic difference between the extensions and the base language. The language itself can be extended without an obvious cue in the code to say when you have changed languages. This means that a new programmer may not recognize a new piece to learn as readily as when seeing an obvious library call.

This becomes a very important tradeoff: Which is more important, ease of learning for new programmers or power for advanced users? A well-designed DSL gives the advanced user a succinct notation for expressing his or her requirements precisely. This is the appeal of the DSL. The downside is that each programmer must acquire a new body of knowledge about troubleshooting and debugging in that notation. It also requires more care in the design to develop a consistent and usable notation.

As usual, the tradeoff is the key. We need to be able to decide if the benefits outweigh the disadvantages and build the system accordingly. I wish there were a magic formula that could tell us how or when a DSL would improve a system. Unfortunately, I have not seen a sign of such a formula yet.

Posted by GWade at March 25, 2006 11:40 PM.