Friday, December 02, 2005

Remember the Pipe

Sometimes, you want to save output. For that, the shell gives you redirection.

Sometimes, though, you want to save the pipeline that you grew to get the
output.

You could recall it, write it down, and then type it into a file, but there's
an easier way.

After you run the pipeline, make your next command "fc".

Try this:
ls | wc -l
fc
(You've run other things since the pipeline? Recall it with command history, rerun it, then type fc)

Once fc puts the pipeline into a text editor, you can write it out to a file. Presto! A shell script.

And as long as you're in there, you could even take your time, add shebang lines and comments, and change the layout. The shell gives you command-line history and pipes, for quick-and-dirty script development, and then gives you fc, to capture what works and clean it up.
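As a sketch of the end result, here is roughly what the saved file might look like once a shebang line and a comment are added. The script name countfiles and the scratch directory are made up for illustration, and a here document stands in for the editor session.

```shell
# Scratch directory, so nothing in the current directory is touched.
workdir=$(mktemp -d)

# countfiles is a hypothetical name. The here document stands in for
# the file you would write out from the editor that fc opens.
cat > "$workdir/countfiles" <<'__EOF__'
#!/bin/sh
# countfiles: print the number of entries in the current directory
ls | wc -l
__EOF__
chmod +x "$workdir/countfiles"

# Try it in a directory with a known number of files.
mkdir "$workdir/sample"
touch "$workdir/sample/a" "$workdir/sample/b" "$workdir/sample/c"
count=$(cd "$workdir/sample" && "$workdir/countfiles")
echo "$count"
```

In an interactive bash session, fc -ln -1 > FILE should also write the previous command to FILE without opening an editor at all.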

The default editor is vi, but you can make it whatever you want.
EDITOR=kate
ls | wc -l
fc
Notice that when you exit the editor, the command is rerun. fc stands for "fix command". Instead of editing a long command line with arrow keys, you can edit it with your favorite text editor.
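For reference, the bash manual gives the editor that fc picks as ${FCEDIT:-${EDITOR:-vi}}: FCEDIT first, then EDITOR, then vi. The function below only evaluates that same expansion, so you can see which editor a plain fc would open. fc_editor is a made-up helper name, not a shell builtin.

```shell
# fc_editor is a hypothetical helper: it prints the editor that bash's fc
# would choose, using the default expansion documented in the bash manual.
fc_editor() {
    printf '%s\n' "${FCEDIT:-${EDITOR:-vi}}"
}

# Run each case in a subshell so the surrounding environment is untouched.
(unset FCEDIT EDITOR; fc_editor)          # neither set: vi
(unset FCEDIT; EDITOR=kate; fc_editor)    # only EDITOR set: kate
(EDITOR=kate; FCEDIT=nano; fc_editor)     # FCEDIT wins: nano
```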

Being able to write to a file and create a shell script, once you've edited it, is just a side-effect of the shell's clean design.

And now that we're talking about command history, we'll make that next week's introductory topic.

Thursday, December 01, 2005

Pipelines

You can add as many elements to a pipeline as you want.

ls -l /dev/
ls -l /dev/ | grep -v ,
ls -l /dev/ | grep -v , | grep -v -- '->'
ls -l /dev/ | grep -v , | grep -v -- '->' | grep -v ' 0'
This is an everyday, and brilliant, use of command history: exploratory
programming.

Issue a command, look at the output. Filter that output through something else to get closer to what you want. Look at the result of that, and filter it again. When you finally get what you want, redirect the output to save it.
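Here is the same grow-a-stage-at-a-time loop on a small made-up data set, so the effect of each filter is easy to see. The file name sizes and its contents are invented for the example.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'gamma 7' 'beta 0' 'alpha 10' 'delta 0' > "$workdir/sizes"

# Stage 1: look at the raw data.
cat "$workdir/sizes"
# Stage 2: drop the entries whose size is 0.
cat "$workdir/sizes" | grep -v ' 0$'
# Stage 3: sort what is left.
cat "$workdir/sizes" | grep -v ' 0$' | sort
# That is the output we want, so redirect it to save it.
cat "$workdir/sizes" | grep -v ' 0$' | sort > "$workdir/result"
```

Each line is the previous one recalled from history with one more stage added; only the last one writes anything to disk.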

The metaphor of pipes and filters suffuses shell programming. To make your own programs (including your shell scripts) fit in, all you have to do is design them to take input from standard in, and spit output to standard out.

Unless there are multiple inputs or outputs, don't write to files or read from them; use redirection for that.

Suppose, for example, you write a program that collects all the #ifdef variables into a file, one per line. One design choice is to have it called like this:
ifdefvars CFILE VARFILE
Worse, you could even design it to be called like this:
ifdefvars CFILE
and have it automatically generate CFILE.VARS.

If, instead, you have it read from stdin and write to stdout, so it's called like this
ifdefvars < CFILE > VARS
you can pop together a pipeline, at a moment's notice, to scan your code for mistyped variables:
find . -name '*.[ch]' | xargs cat | ifdefvars | sort | uniq -c | awk '$1 == 1'
(Don't know what all these commands do? Type the first one in and run it. Next, use command history to recall it, add the next stage in the pipeline and watch what changes.)
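The post never shows ifdefvars itself, so what follows is only a guess at a minimal version, written as a shell function. It assumes that collecting the #ifdef variables means printing the name that follows each #ifdef or #ifndef, and it ignores #if defined(...) forms. The sample input is invented.

```shell
# Hypothetical stand-in for ifdefvars: read C source on stdin and print
# the name that follows each #ifdef or #ifndef, one per line, on stdout.
ifdefvars() {
    sed -n 's/^[[:space:]]*#[[:space:]]*ifn\{0,1\}def[[:space:]]\{1,\}\([A-Za-z_][A-Za-z0-9_]*\).*/\1/p'
}

# Made-up sample input: DEBUG appears twice, the mistyped DEBGU only once.
sample='#ifdef DEBUG
int verbose = 1;
#endif
#ifdef DEBUG
#endif
#ifndef DEBGU
#endif'

# Names that occur exactly once are the likely typos.
printf '%s\n' "$sample" | ifdefvars | sort | uniq -c | awk '$1 == 1 { print $2 }'
```

Because the function only touches stdin and stdout, it drops into the pipeline unchanged, and redirection still covers the single-file case.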

Wednesday, November 30, 2005

Pipes

You see, wire telegraph is a kind of a very, very long cat. You pull his tail in New York and his head is meowing in Los Angeles. Do you understand this? And radio operates exactly the same way: you send signals here, they receive them there. The only difference is that there is no cat. - Albert Einstein
The shell's biggest innovation is pipes.

To hook stdout of one program to stdin of another, you just connect the two programs with a pipe symbol: '|'.

How many files do I have?
ls | wc -l
You can think of it as shorthand for this:
ls > TMPFILE; wc -l < TMPFILE; rm TMPFILE
The only thing is, there is no temporary file. In a pipeline, the output from one command really is hooked up to the input of the next, and both processes run simultaneously. It's magic.
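To see that the two forms agree, the sketch below runs both against a scratch directory holding three files. The directory and file names are invented, and the temporary file is created outside the directory being listed so that it does not count itself.

```shell
workdir=$(mktemp -d)
touch "$workdir/a" "$workdir/b" "$workdir/c"

# The long way: write the listing to a temporary file, count it, clean up.
tmpfile=$(mktemp)
ls "$workdir" > "$tmpfile"
with_file=$(wc -l < "$tmpfile")
rm "$tmpfile"

# The pipe: same answer, no file to create or remove.
with_pipe=$(ls "$workdir" | wc -l)

echo "$with_file $with_pipe"
```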

If the first process is too slow, and the pipe gets empty, the second process just sits and waits for the first one to send it more stuff. If the first process is too fast, and the pipe starts to clog, it just blocks until the second sucks enough out of the pipe to make it worth starting up again.
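A small way to watch that waiting happen: the producer below pauses between its two lines, and the consumer prints each line as soon as it arrives rather than after the producer has finished. slow_producer is a made-up name, and the one-second pause is arbitrary.

```shell
slow_producer() {
    echo "first"
    sleep 1      # the pipe is empty here, so the reader just waits
    echo "second"
}

slow_producer | while read -r line; do
    echo "consumer got: $line"
done
```

Run it by hand and the first line of output should appear about a second before the second one.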

For me, the ability to pipe into wc -l, alone, is invaluable.

How many processes are running?
ls -d /proc/[0-9]* | wc -l
How many .c and .h files are there in my source tree?
find . -name '*.[ch]' | wc -l
How many files have we had to change since the "FINAL" release?
cvs -nq update -rFINAL | wc -l
Next time you're asking "How many ...?", remember to answer it with a pipe.

Tuesday, November 29, 2005

Here Documents

Output has '>' and '>>'. Now that you've seen '<', what about '<<'?
cat <<FOO
hello world
FOO
This is a here document, which says "Take input from 'here to FOO'."

Actually, FOO can be a little visually confusing, because I often use FOO and foo for variable or file names, so I usually use a marker for "End of Input" or "End of File" instead, like __EOF__ or __EOI__. These stand out and I don't use them for other things.

If you want to embed the contents of a file in a script, a here document is a good way to do it.
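As a sketch of putting text inside a script, the block below captures two here documents into variables. It relies on one standard detail the post does not mention: quoting the marker (here '__EOF__') turns off variable expansion inside the document. The variable names are made up.

```shell
name=world

# Unquoted marker: $name is expanded inside the here document.
expanded=$(cat <<__EOF__
hello, $name
__EOF__
)

# Quoted marker: the text is taken literally, which is usually what you
# want when pasting a file's contents in verbatim.
literal=$(cat <<'__EOF__'
hello, $name
__EOF__
)

echo "$expanded"    # hello, world
echo "$literal"     # hello, $name
```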

As usual, you should try moving the redirects around, and use command history to do it.

There's even a '<<<'
H='hello, world'
cat <<< "$H"
It's sort of a "right here" document, which provides input from the current line.
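'<<<' is a here string. It is available in bash, ksh, and zsh, but it is not part of POSIX sh. In a shell without it, a one-line here document should supply the same input, trailing newline included. The sketch below uses that portable form with the __EOI__ marker.

```shell
H='hello, world'

# Portable equivalent of:  cat <<< "$H"
cat <<__EOI__
$H
__EOI__

# The same trick feeds any command that reads stdin, for example wc.
words=$(wc -w <<__EOI__
$H
__EOI__
)
echo "$words"
```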

Monday, November 28, 2005

Redirecting Standard Input

Start like this:

  1. cat
    hello, world
    hello, world
    ^D

    cat copies the input that you type -- standard input, or stdin -- to the screen.
    Next, redirect output to a file.
  2. cat > FOO
    hello, world
    ^D

    This puts the string "hello, world" into the file, FOO.
    You can check it by looking at the contents of FOO.
  3. cat FOO
    hello, world

    Finally, use the command history and editing keys to recall command [2], change
    the '>' to a '<', and press Return.
  4. cat < FOO

    With '<' the arrowhead points from the file toward the command,
    redirecting the contents of the file FOO into the command cat.
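The same walk-through can be replayed without typing at the keyboard. In this sketch printf stands in for the typed line and the ^D, and FOO lives in a scratch directory; both are assumptions made so the example can run unattended.

```shell
workdir=$(mktemp -d)

# printf supplies the "typed" input; cat copies it, and '>' sends it to FOO.
printf 'hello, world\n' | cat > "$workdir/FOO"

# Here cat opens FOO itself, because the file is named as an argument.
cat "$workdir/FOO"

# Here the shell opens FOO and hands it to cat as standard input.
cat < "$workdir/FOO"

# Either way the bytes are the same, so a copy made through stdin matches.
cat < "$workdir/FOO" > "$workdir/BAR"
cmp "$workdir/FOO" "$workdir/BAR" && echo "FOO and BAR match"
```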

Part of learning the shell is losing your fear of making mistakes. The sooner you make your first 5000 mistakes, the better. What do these do?
cat < FOO > BAR
< FOO > BAR cat
< cat > BAR
BAR < > cat

Sunday, November 27, 2005

Input

We've talked about output, but without input, I/O would just be "O" -- a big
nothing of a subject. This week, I'll sketch some input redirection.