
Anomaly ~ G. Wade Johnson

January 19, 2014

Novice Example: Remote File Copy, Better Problem Breakdown

In the last post, we spent a small amount of time understanding a problem faced by our hypothetical novice programmer, Ned.

As a reminder, let's revisit Novice Ned's breakdown of the problem.

  1. Launch FTP
  2. Connect to the host
  3. Log in with the foobar user and password
  4. Change to the appropriate remote directory
  5. Put each of the files to the remote machine
  6. Log off from the remote machine
  7. Repeat for all servers

The problem with this breakdown is that it is focused on how we are currently solving the problem. Because Ned was handed a manual process, his breakdown is very manual. To help Ned, we're going to take a different approach and break the problem down into its fundamental functional pieces:

  1. Remotely access a machine
  2. Copy files from local machine
  3. ... to remote machine
  4. Repeat the above steps for dozens of machines

The original, manual process requires multiple steps for each of the first three pieces. This is mostly because of the tool chosen to solve the problem: the FTP command line tool. The new breakdown focuses on the functional requirements of solving the problem, not on a particular tool.

In the original process, logging in to the remote machine with FTP is one (multi-step) solution to the task of remotely accessing a machine. If we could find some way to copy files to a remote machine in a more automated way, then maybe we could automate the process of sending to multiple machines. As a first pass, here are some approaches to copying files to a remote system. (I'm leaving out special purpose configuration management tools and novel solutions like netcat. We'll just look at solutions that our novice is likely to be able to master relatively quickly.)

  • Script the ftp session
  • rsync
  • sftp
  • rcp
  • scp
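One quick way to narrow this list is to see which of these tools are actually installed on the local machine. The following is a minimal sketch, not part of Ned's process: `list_copy_tools` is a hypothetical helper name, and `command -v` simply reports whether the shell can find a command on the PATH. It says nothing about what the remote servers accept.

```shell
#!/bin/bash

# Hypothetical helper: print each candidate tool that is installed locally.
list_copy_tools() {
    local tool
    for tool in ftp rsync sftp rcp scp; do
        # command -v succeeds only if the shell can find the command.
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

list_copy_tools
```

The output depends entirely on the machine it runs on, so treat it as a starting point for the conversation with Ned rather than an answer.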

A quick check with Ned shows that he uses SSH to access all of these machines for other purposes. This means that scp is probably a good choice. Since Ned had not seen scp before, we would suggest he spend some time with the man page. The key point is that scp works a lot like the Unix cp command, except it can copy securely to and from a remote machine.

Assuming the files to copy are in the current directory, the command to copy the files is now:

scp foobar.conf input1.txt input2.xml foobar@{server}:data

Here, {server} is replaced with each server name, one by one. This still requires us to type a password for each server (we'll address that next time), but it's much less work than before. To automate this for all of the servers in our list, we just need to write a script.
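Before running the real command, it can help to see exactly what the substitution produces. This is a minimal sketch rather than part of the original process: `show_scp_command` is a hypothetical helper name, and it only prints the command for a given server instead of running it.

```shell
#!/bin/bash

# Hypothetical helper: print (do not run) the scp command for one server.
# The file names, user, and remote directory are the ones used in this post.
show_scp_command() {
    local server="$1"
    echo "scp foobar.conf input1.txt input2.xml foobar@${server}:data"
}

# Prints: scp foobar.conf input1.txt input2.xml foobar@server1:data
show_scp_command server1
```

Because nothing is copied, this is a safe way to check the server name and file list before typing any passwords.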

I'll use bash, because it's what I'm comfortable with. If you are using a different shell, the commands should still be similar. Create the file cpfilestoremotes.sh with a text editor and type in the following (replacing server1, etc. with the correct server names).


#!/bin/bash

# Copy the same three files into the data directory (relative to foobar's home) on each server.
scp foobar.conf input1.txt input2.xml foobar@server1:data
scp foobar.conf input1.txt input2.xml foobar@server2:data
scp foobar.conf input1.txt input2.xml foobar@server3:data
scp foobar.conf input1.txt input2.xml foobar@server4:data


I created the four lines through copy and paste. Then I modified the server name (right after the @) to match the different machines. Obviously, I would need a different line for each machine. Make the file executable with:

chmod +x cpfilestoremotes.sh
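Before the first real run, a couple of cheap checks can catch typos. This is a sketch that assumes the script sits in the current directory; `check_script` is a hypothetical helper name. `bash -n` reads the script and reports syntax errors without executing any of its commands, so no files are copied.

```shell
#!/bin/bash

# Hypothetical helper: sanity-check a script without running it.
check_script() {
    local script="$1"

    # -n: parse the script for syntax errors but execute nothing.
    bash -n "$script" || { echo "syntax error in $script"; return 1; }

    # -x: true only if the file exists and is executable (chmod +x worked).
    [ -x "$script" ] || { echo "$script is not executable"; return 1; }

    echo "$script looks ready to run"
}

# Only check when the script is actually present in this directory.
if [ -f ./cpfilestoremotes.sh ]; then
    check_script ./cpfilestoremotes.sh
fi
```

Once the checks pass, run the script with ./cpfilestoremotes.sh from the directory that holds the files to copy.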

First Step, Conclusion

At this stage, our solution has a number of problems. A real sysadmin-type would be suggesting tools better suited to mass configuration. A programmer-type would be twitching about the code duplication.

Everyone would have to agree that the need to type your password for each of the machines is suboptimal. Also, if you need to change the files to be copied, you will need to edit every line of the script by hand. Likewise, the list of server names is spread throughout the script file.

On the plus side, the amount of typing Ned needs to do has been vastly reduced. Now all he needs to do is type the name of the shell script and the passwords instead of running through all of the FTP commands one by one. For a large number of servers, this could reduce a multi-hour process down to much less than an hour. It's still boring, but at least it's not boring for as long.

Next time, we'll tackle some of the above problems.

Posted by GWade at January 19, 2014 11:28 AM.