
Anomaly ~ G. Wade Johnson

June 19, 2015

LCDC: Fundamental Knowledge

In The Myth of Code Anyone Can Read, I introduced the idea that least common denominator code (LCDC) is not a good approach to writing software. One reason is that there is no single knowledge base shared by every programmer.

Different Programmers Have Different Backgrounds

Programming is still a relatively new field. It's also a pretty broad field. A person claiming to be a programmer or software engineer could have learned their craft in any of several ways:

  • Self-taught: on-line tutorials, books, ongoing self-study
  • Computer science degree
  • Management Information Systems degree
  • Programming course in a different degree program
  • Programming boot camp
  • Internship at a programming shop

Each of these can result in either really good or not-so-good programming skills. In addition, the terms programming and software development are applied in very different areas:

  • Embedded systems
  • Hardware driver development
  • Scientific software
  • Website development
  • SCADA software
  • Financial software
  • Game development
  • Graphics programming
  • Smart phone app development
  • Automotive software
  • High availability software
  • ... and many more

Each of these areas has a very different idea of what knowledge and skills are fundamental. You can't necessarily take a website developer and have them be productive on an embedded systems project. You might not want a game developer working on software for pacemakers.

Given different backgrounds, specifying a minimum level of knowledge becomes much harder.

Data Structures

Let's start simple. If we want to write LCDC, we can't use any data structures that aren't understood by everyone. We can probably assume that most people understand arrays; those are pretty fundamental. What about others?[1]

  • Linked lists
  • Binary trees: basic binary, AVL, or red-black trees
  • Generalized trees: tries, suffix trees, octrees, B-trees, R-trees
  • Graphs: DAGs, spanning trees
  • Stacks
  • Queues: FIFO, deques, priority queues
  • Hash tables, associative arrays, dictionaries
  • Heaps

Most programmers, in my experience, are not familiar with many of the data structures above, much less all of them. Some of these data structures underlie programming tools we use every day. Others are more specialized. Some are extremely well-known in one industry or company and virtually unknown in others.

If we really want LCDC, these data structures and the advantages they give would be unavailable. After all, most programmers don't know how a red-black tree or a hash table works, so how can we write code that uses them?
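
To make that cost concrete, here is a minimal sketch in Python (chosen only for illustration; both function names are hypothetical and not from any library). The first function counts words using nothing but arrays, as strict LCDC would require. The second uses a dict, Python's built-in hash table.

```python
def count_words_arrays_only(words):
    """Count occurrences using only two parallel lists (arrays)."""
    seen = []    # distinct words, in order of first appearance
    counts = []  # counts[i] is the number of times seen[i] occurred
    for word in words:
        # Linear scan: every lookup costs time proportional to len(seen).
        for i in range(len(seen)):
            if seen[i] == word:
                counts[i] += 1
                break
        else:
            # The loop finished without a match, so this word is new.
            seen.append(word)
            counts.append(1)
    return list(zip(seen, counts))


def count_words_hash_table(words):
    """Count occurrences using a dict, which is backed by a hash table."""
    counts = {}
    for word in words:
        # Average-case constant-time lookup and update.
        counts[word] = counts.get(word, 0) + 1
    return counts
```

Both functions produce the same counts. The array-only version does more work per word as the number of distinct words grows, and it has more places for a bug to hide. The reader of the second version does not need to know how hashing works to see what the code does.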

Fundamental Algorithms

Data structures aren't the only fundamentals that we can't rely on everyone understanding. Many of the algorithms that we depend on are opaque to the average developer.[2]

  • Sorting: quicksort, insertion sort, heap sort, merge sort
  • Security: SHA-256, AES, Diffie-Hellman key exchange, cipher block chaining, HMAC
  • Graphics: JPEG compression, ray tracing, Bézier curves
  • Databases: SQL, document databases, object databases, hierarchical databases
  • Randomness: Fisher-Yates shuffle, Mersenne Twister, entropy pools
  • String manipulation: regular expressions, longest common subsequence, Hamming distance, Levenshtein distance, KMP algorithm
  • Graphs: Dijkstra's algorithm, alpha-beta pruning, topological sort

In some fields, each of these algorithms is commonly used. In others, each is completely unknown. Even in the fields where a particular algorithm is used, most developers probably don't understand all of the algorithms used in that field. According to the LCDC premise, we cannot use any algorithm that not everyone understands.
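
As one small illustration, here is a sketch of the Fisher-Yates shuffle from the list above, written in Python. The function name and the optional rng parameter are assumptions made for this example. The algorithm is only a few lines long, yet the details (the loop bounds and the inclusive range for the random index) are exactly the sort of thing a developer who has never seen it tends to get subtly wrong.

```python
import random


def fisher_yates_shuffle(items, rng=random):
    """Return a shuffled copy of items using the Fisher-Yates algorithm.

    rng is any object with a randint(a, b) method that is inclusive on
    both ends, such as the random module or a random.Random instance.
    """
    result = list(items)
    # Walk from the last position down to index 1. At each step, swap the
    # current element with one chosen uniformly from positions 0..i.
    for i in range(len(result) - 1, 0, -1):
        j = rng.randint(0, i)  # j may equal i, leaving the element in place
        result[i], result[j] = result[j], result[i]
    return result
```

Called as fisher_yates_shuffle(range(10)), this returns the ten numbers in a random order and leaves the input untouched. In day-to-day Python code you would more likely call the standard library's random.shuffle; the point here is only that someone had to know the algorithm for that function to exist.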

Summary

Because of the breadth of the programming field and the many different ways that individuals came to work in the field, it is very hard to describe a subset of knowledge that we can claim is known by everyone.

Not all of these apply to every business, but most programs end up touching one or more of these areas somewhere. Our code would be slower, less correct, and harder to maintain without being able to take advantage of well-known and well-tested algorithms, even if they are beyond the grasp of your most junior people.

In the next post, I'll explore using libraries to solve this problem. We'll also see how they would be impacted by the LCDC idea.

Notes

  1. Apologies if I've left out your favorite data structure. I just wanted a list big enough to get the point across.
  2. Since there are even more algorithms than data structures, this is an even more incomplete list. On the other hand, I suspect that more of these will be unknown to more programmers.

Posted by GWade at June 19, 2015 08:25 AM.