Friday, December 16, 2005

Grow a Command, Then Execute It

Let's make a copy of all the files under /etc/ that contain references to httpd, the Apache http daemon.

First, a directory to hold the copies:
mkdir /tmp/Apachefiles
Next, we'll construct the commands, on the fly, by growing a pipeline. Follow along with me by executing these in a terminal window. For each step, just recall the previous line and edit it.

find /etc/ # list all the files under /etc.
find /etc/ | xargs grep -l httpd # look for all those files that contain the string httpd
{ find /etc/ | xargs grep -l httpd; } 2>/dev/null # get rid of annoying warnings
# now transform the list into a series of commands
{ find /etc/ | xargs grep -l httpd; } 2>/dev/null |
perl -lane 'print "cp $_ /tmp/Apachefiles/"'
Okay, these look good. (I've left out all the mistakes I made while developing the pipeline, because I know you would never make any. I do, so command history is my friend.)

Now, how about executing it? We could redirect the output into a file, mark the file as executable, and then execute it as a script.

Or we could just add one step to the pipeline, like this:
{ find /etc/ | xargs grep -l httpd; } 2>/dev/null |
perl -lane 'print "cp $_ /tmp/Apachefiles/"' | bash
I frequently grow a set of commands like this, on the fly. Then, when I have them right, I pipe them into a subshell, which executes them one by one.
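
Here is a sketch of the other option mentioned above: save the generated commands, look them over, and only then run them. The function name gen_copy_cmds and the file /tmp/copy-cmds.sh are made up for illustration. The generated commands wrap each name in single quotes, so names with spaces survive, but a name that itself contains a single quote would still break.

```shell
# gen_copy_cmds SRC DEST PATTERN (hypothetical helper): print one cp command
# for every regular file under SRC that contains PATTERN.
gen_copy_cmds() {
    find "$1" -type f -exec grep -l -- "$3" {} + 2>/dev/null |
    while IFS= read -r file
    do
        # Single quotes protect spaces; they do not protect embedded quotes.
        printf "cp '%s' '%s/'\n" "$file" "$2"
    done
}

# Typical use: save, inspect, then execute.
#   gen_copy_cmds /etc /tmp/Apachefiles httpd > /tmp/copy-cmds.sh
#   less /tmp/copy-cmds.sh
#   bash /tmp/copy-cmds.sh
```

Saving to a file costs one extra step, but it gives you a record of exactly what was run.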

Thursday, December 15, 2005

Command Substitution for Counted Loops

Here's a loop.

for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
do
echo $i
done

Here's another way to do the same thing:

for i in $(seq 20)
do
echo $i
done
Just the command seq 20 by itself spits out the sequence 1 .. 20. (Try it.)

The expression $(command) says "Run command in a subshell, substitute its output for the expression, and then run the resulting command in the current shell."

This is called command substitution.

Since "$" marks variables, think of $(command) as "the value of the temporary variable containing the output of the subshell '(command)'."
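
To push the "temporary variable" picture one step further, you can store the output in a real variable. A small sketch; the variable names count and last are arbitrary:

```shell
# Capture a command's output in an ordinary variable.
count=$(seq 20 | wc -l)
echo "seq 20 printed" $count "lines"

# Substitutions nest: the inner one runs first, and its output
# becomes the arguments of echo.
last=$(echo $(seq 20) | awk '{ print $NF }')
echo "last value: $last"
```

Some versions of wc pad their output with spaces, which is why $count is left unquoted in the echo.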

You have a mystery version of foo.c and you want to know what archival version it's closest to?

for i in $(seq 10 -1 1)
do
cvs co -p -r1.$i foo.c | diff - foo.c | wc -l
done
The version with the smallest diff is the version you're looking for.

If you have a very large number of versions, you can pipe the loop's output through less. Alternatively, if you're using a terminal emulator, you can use the scroll bar to scroll back up the screen, or, in kterm, the convenient Shift+PgUp.
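
If scanning the counts by eye gets tedious, you can label each count and let sort pick the winner. The sketch below assumes the candidate versions are ordinary files on disk, and the function name closest_file is made up. With CVS you would feed diff from cvs co -p, as in the loop above, instead of reading a file.

```shell
# closest_file TARGET CANDIDATE... (hypothetical helper): print the candidate
# whose diff against TARGET has the fewest lines.
closest_file() {
    target=$1
    shift
    for candidate in "$@"
    do
        # Emit "<diff line count> <name>" so sort can rank the candidates.
        echo "$(diff "$candidate" "$target" | wc -l) $candidate"
    done |
    sort -n |
    head -n 1 |
    sed 's/^ *[0-9]* //'    # strip the count, leaving only the name
}

# Example: closest_file foo.c old/foo.c.1 old/foo.c.2 old/foo.c.3
```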

Wednesday, December 14, 2005

Get Scope With Subshells

You can tell the shell to execute a few commands in a subshell by surrounding them with parens:
pwd; ( cd /tmp; pwd ); pwd
You change directories in the subshell, but once you exit that subshell, the original shell hasn't gone anywhere.

This is particularly useful for scripts where you want to go somewhere and do something, temporarily. My scripts are chock-a-block full of code like this:

cvs -Q co modulename
(
cd modulename
make
)
Don't confuse this with curly braces, which group commands in the current shell:
pwd; { cd /tmp; pwd; }; pwd
The two kinds of grouping are for two different things. Curlies are good when you want to do I/O redirection of groups of commands. Parens are good when you want to insulate a parent shell from a bunch of temporary changes, without having to save and restore context by hand.
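
A side-by-side sketch of the difference, using a variable instead of a directory change. The names color, after_parens, and grouped are arbitrary:

```shell
color=red
( color=blue; echo "inside parens: $color" )
after_parens=$color
echo "after parens: $after_parens"   # still red: the change died with the subshell

{ color=green; echo "inside braces: $color"; }
echo "after braces: $color"          # green: the braces ran in the current shell

# Braces with redirection: both echoes land in the same file.
grouped=$(mktemp)
{ echo "first"; echo "second"; } > "$grouped"
cat "$grouped"
```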

Tuesday, December 13, 2005

Use pstree to See Your Tree of Subshells

Execute the command csh. The prompt changes! You are now running an older shell, csh, that has a different syntax. Try, for example, typing in a comment, like this:
# this is a comment
This new shell is a subshell of the bash you were talking to a minute ago. Exit the subshell, and you'll see you're back in the parent shell. You can see this with the ps command.

ps -o pid,ppid,args
csh
ps -o pid,ppid,args
exit
ps -o pid,ppid,args
Each running program is a process, with an identifying number, or Process ID. If that process is a subprocess of some other program, then the ps command shown above also shows the Parent Process ID.

You should be able to trace out what's a child of what in the listings above, pretty easily.

Actually, in Linux, every process is the child of another process, except for the very first process, which is called init. Look:

pstree -A -p | less
The numbers in parens are PIDs. Try invoking some subshells and watch how the process tree changes.
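
You can also ask a shell for these numbers directly. In this sketch, $$ is the shell's own PID and $PPID is its parent's. When you type these at a prompt, the parent PID reported by the child should match the first line.

```shell
echo "this shell: PID $$"

# Start a child shell and have it report its own PID and its parent's.
sh -c 'echo "child shell: PID $$, parent PID $PPID"'
```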

Monday, December 12, 2005

You Can Invoke the Shell By Hand

You invoke a shell the same way you invoke any other program: type its name.

bash
You get back a prompt, but now this is a second shell, a subshell under the previous shell. The shell you were running a minute ago is now patiently waiting for this subshell to finish, so it can put up another prompt and let you issue some other command.

See for yourself? Type exit. You'll exit this subshell, and return to the shell a level up. You're back in the shell you started from.

Want more convincing? Type history. That's your current shell. Now execute a subshell, and look at the history again:

bash
history
Finally, exit the subshell, and look at the history again:

exit
history
You get two different histories because it's two different shells.

If you execute commands, then spawn a subshell, and then try to recall the commands you were just executing before you entered the subshell, you won't find them.

It's important that you have a firm grasp on the idea of subshells before you continue, so I'll return to this tomorrow.

Sunday, December 11, 2005

Subshells

The shell is just a program. It's not wired into the kernel. Many programmers who've been in the business for a while have written a shell, for their own amusement. If you haven't, you can imagine the pseudocode.

while (1) {
    put up a prompt
    read the line the user types in
    parse the line
    execute any commands you find
}
Thinking of the shell as just another program, no different from ls or date or cc, is a big help in dealing with the shell.
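
For the curious, here is the pseudocode above turned into a toy you can run. It's a sketch only, not how bash is actually built: the name tiny_shell and the prompt are made up, and the shell's own eval stands in for the "parse" and "execute" steps.

```shell
# tiny_shell: a toy read-and-execute loop (hypothetical, for illustration).
tiny_shell() {
    while true
    do
        printf 'tiny> ' >&2                 # put up a prompt
        IFS= read -r line || break          # read a line; stop at end of input
        if [ "$line" = "exit" ]; then       # our only built-in command
            break
        fi
        eval "$line"                        # parse and execute (borrowed from the real shell)
    done
}

# Try it: run tiny_shell, type a few commands such as date or ls, then exit.
```

The prompt goes to standard error so that it stays out of the way if you pipe the toy's output somewhere.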

This week, I'll explore that a little.