
Anomaly ~ G. Wade Johnson

November 04, 2013

Novice Programmers: Your First Skill

The first skill a new programmer must learn is not what most people expect. It is not learning a programming language. It has nothing to do with algorithms or data structures. The first thing a new programmer must learn is how to identify a small piece of a problem.

Solving the Big Problems

Most problems that we want to solve are too big and complex for anyone to design a solution. If the problem is too big to solve, it's too big to code. Faced with a problem that is too big to solve, most people take one of three approaches.

Give up

Most reasonable people take the first approach. They look at the kinds of problems they imagine a computer can solve and realise they don't know where to start.

Giving up or handing the problem to someone else is reasonable if you are never going to be a programmer. After all, no one is good at everything. But, it's not the only answer.

Start writing code to do it all

Some relatively young programmers avoid the underlying problem by just writing code. They stay up late, drink lots of caffeine and code until they have something. If the code doesn't solve the problem, this may result in giving up after writing a lot of code that doesn't really go anywhere. If they do manage to solve the problem, the result is often a big ball of mud. All parts of the problem are tied together with a random spaghetti of control.

While this may work once or twice, it is not a viable strategy for more complex problems. The resulting code also quickly becomes unmaintainable.

Identify a small piece that you can solve

People who can actually program take the third approach. (To be honest, some of the second category do stumble across the third approach at some point and level up.)

As they develop more experience, good programmers apply this approach almost without thinking. They decompose the problem, identify the critical sub-problems, and begin synthesizing code that makes progress on the problem.

To someone without the necessary experience, this looks like the second approach. But, the results are much different.

Identify A Piece You Can Solve

A large portion of the practice of programming involves the following steps.

  1. Break the problem into sub-problems.
  2. Pick a sub-problem to work on.
  3. If you can't solve the sub-problem, go back to 1.
  4. Solve this sub-problem.
  5. Move back to an unsolved sub-problem.
  6. Solve the next sub-problem.
  7. Compose this solution with the last solution.
  8. Repeat until problem is solved.

This is a lousy algorithm, of course. But, it does give a general feel for the process. Actual programmers don't think of it this way, or use this exact process. But, this serves as a first-order approximation to the real process.
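As a rough illustration only, the steps above can be sketched as a small recursive function. Everything here is hypothetical: the names `solve`, `can_solve`, `solve_directly`, `split`, and `combine` are invented for the sketch, and real problems rarely split this neatly.

```python
from typing import Callable, List, TypeVar

P = TypeVar("P")  # a problem
S = TypeVar("S")  # a solution


def solve(
    problem: P,
    can_solve: Callable[[P], bool],
    solve_directly: Callable[[P], S],
    split: Callable[[P], List[P]],
    combine: Callable[[List[S]], S],
) -> S:
    """Solve a problem by breaking it into pieces small enough to solve."""
    if can_solve(problem):
        return solve_directly(problem)
    # Too big: break it into sub-problems and solve each one in turn.
    solutions = [
        solve(sub, can_solve, solve_directly, split, combine)
        for sub in split(problem)
    ]
    # Compose the partial solutions into a solution for the whole problem.
    return combine(solutions)


# Toy usage: total a list by halving it until each piece has at most
# two numbers, which we "know how to solve" with the built-in sum().
def halve(numbers: List[int]) -> List[List[int]]:
    middle = len(numbers) // 2
    return [numbers[:middle], numbers[middle:]]


total = solve(
    [3, 1, 4, 1, 5, 9, 2, 6],
    can_solve=lambda nums: len(nums) <= 2,
    solve_directly=sum,
    split=halve,
    combine=sum,
)
print(total)  # prints 31
```

The toy problem is trivial on purpose. The point is the shape: a check for "can I solve this yet?", a way to split, and a way to put the pieces back together.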

The problem that most novice programmers have involves steps 1 and 3: "How do I break up this problem?" and "How do I know when I have a problem I can solve?" More experienced programmers can draw on previous experience to identify sub-problems quickly and to imagine solutions. Novices normally don't have this experience.

Let's Break Down a Problem

Unfortunately, there is no one best approach. Different problems will decompose in different ways. However, many problems follow certain patterns. The simplest of these patterns would be "Input, Process, Output".

Problems that match this pattern would include:

  1. Extract information about certain projects from a number of documents containing reports on the projects.
  2. Generate a report on capabilities of a bunch of machines on a network.
  3. Identify all of the machines on a network running specific software.
  4. Check an employee registry to find all of the employees that are coming up on a 1, 5, or 10 year anniversary.
  5. Examine the web server logs for potential attempts to hack user accounts.

In all of these cases, we need to retrieve information from somewhere (Input). Then, we need to filter and analyse the data (Process). Finally, we need to send some form of report or result to the screen, a web page, an email, or somewhere else (Output). To a large extent, these steps can be done independently, except for the form of the data that needs to flow between them.
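As a minimal sketch of that split, here is the anniversary check (problem 4) written as three separate functions. The registry format (a `name,hire_date` CSV), the sample names, the 30-day window, and every function name are assumptions made up for the illustration.

```python
import csv
import io
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical registry format: one "name,hire_date" row per employee.
SAMPLE_REGISTRY = """name,hire_date
Ada,2012-11-20
Grace,2008-11-10
Linus,2011-03-02
"""


def read_registry(text: str) -> List[Dict[str, str]]:
    """Input: turn raw registry text into a list of rows."""
    return list(csv.DictReader(io.StringIO(text)))


def add_years(day: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:  # Feb 29 in a non-leap target year
        return day.replace(year=day.year + years, day=28)


def upcoming_anniversaries(
    rows: List[Dict[str, str]], today: date, within_days: int = 30
) -> List[Tuple[str, int]]:
    """Process: keep employees whose 1, 5, or 10 year anniversary is near."""
    found = []
    for row in rows:
        hired = date.fromisoformat(row["hire_date"])
        for years in (1, 5, 10):
            days_away = (add_years(hired, years) - today).days
            if 0 <= days_away <= within_days:
                found.append((row["name"], years))
    return found


def format_report(found: List[Tuple[str, int]]) -> str:
    """Output: render the result as plain text."""
    return "\n".join(f"{name}: {years}-year anniversary" for name, years in found)


rows = read_registry(SAMPLE_REGISTRY)
found = upcoming_anniversaries(rows, today=date(2013, 11, 4))
print(format_report(found))
```

The only thing the three functions share is the shape of the data passed between them: a list of rows going in, a list of (name, years) pairs coming out. Each one can be written and tested without the other two.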

Many novices get stuck on how to process and output the data when they still don't understand how to retrieve it. Alternatively, they try to do all three steps at once.

An Approach

The easiest solution is often to start thinking in the middle, but start solving at the beginning.

First, think about some of the information you will need to do the processing. Don't worry about how you will process it, just identify what you will need. You don't have to get it perfect in this round, because you will be revisiting the processing shortly.

Now, identify where you need to get the information and how you will extract it. This may require multiple sources in order to get the information you need. In that case, start with one, and figure out what you need to do to get data for that one.

An Example

Let's use the first problem in the list above as an example: Extract information about certain projects from a number of documents containing reports on the projects.

Let's say that we want to know a few pieces of information about the projects:

  1. Project name
  2. Start date
  3. End date or projected end date
  4. People working on the project

If you were doing this by hand, you would open each file and read it, looking for the data you need and writing it down. Some people might start by looking at how to generate a list of the files to process, or by trying to figure out how to find the information they need. But, that's not actually the first step.

The first and smallest step you need to solve is knowing how to read a single document file. If it's a text file, most languages make that pretty easy. An HTML page is only slightly more challenging. If it's an MS Word document or a PDF, that's a different level of effort.

For the complicated formats, you probably want to look for a library that reads the intended file format, rather than trying to solve that problem for yourself.
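To make that first step concrete, here is a minimal sketch of reading one document using only the Python standard library. The function name `read_document`, the choice to pick a reader by file extension, and the UTF-8 encoding are assumptions for the example. Word and PDF files are deliberately left unhandled rather than guessing at a library.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import List


class _TextExtractor(HTMLParser):
    """Collect the text between tags, ignoring the markup itself."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        # Simplification: text inside <script> or <style> is not filtered out.
        if data.strip():
            self.chunks.append(data.strip())


def read_document(path: Path) -> str:
    """Return the readable text of one document file."""
    suffix = path.suffix.lower()
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")
    if suffix in (".html", ".htm"):
        extractor = _TextExtractor()
        extractor.feed(path.read_text(encoding="utf-8"))
        extractor.close()
        return "\n".join(extractor.chunks)
    # Word and PDF files need a third-party library; none is assumed here.
    raise NotImplementedError(f"no reader yet for {suffix!r} files")
```

Once `read_document` works for one file, listing the files and searching the text become separate, smaller problems.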

Conclusion

In this entry, I've identified the first skill any novice programmer needs to learn. Some people recognise this issue and can apply the skill intuitively. If you are one of those, good for you.

If you have had trouble getting started on a problem, thinking about it this way may help.

I also started the process of decomposing a straightforward problem. In a later entry, I'll explore this example further.

Posted by GWade at 02:10 PM.