Intro to UNIX, Chapter 4. The Shell and Command Processing

4. The Shell and Command Processing

The UNIX program that interprets your commands is called the ``shell''. The shell examines each command line it receives from your terminal or workstation, expands the command by substituting actual values for special characters, and then either executes the command itself or calls another program to do so. After the command has completed execution, the shell prompts you for a new command.

Most UNIX systems offer a choice of shells--the Bourne shell (sh) and the C shell (csh) are the most common. The default shell for Hardlink's UNIX systems, and the one described in this manual, is the C shell. To confirm that it is the one you are using, type

     echo $SHELL

If the C shell is your login shell, the response will be

     /bin/csh

(The Bourne shell is /bin/sh.) The following sections discuss the shell concepts most needed by novice users.

Command Syntax

UNIX commands begin with a command name, often an abbreviation of the command's action (such as cp for ``copy'' or mv for ``move''). Most commands include ``flags'' and ``arguments''. A flag identifies some optional capability and begins with a hyphen. An argument is usually the name of a file, such as one to be read. For example, the command line

     cat -n stuff

calls the cat program (to ``concatenate'' files). In this case, cat reads the file named ``stuff'' and displays it. The -n flag tells cat to number the lines in the display.

The hyphen that precedes a flag distinguishes flags from filenames; it is interpreted by the command itself, not by the shell. In the example above, the hyphen prevents cat from trying to display a file named ``n''. Commands can also contain characters that are special to the shell. The shell interprets such characters, and sometimes replaces them with other values, before it passes the command with its flags and arguments to the program that actually executes the command.

Remember that uppercase and lowercase letters are not interchangeable. Thus if you tried to give the cat command by typing CAT, it would fail; there is no program named (uppercase) CAT. Likewise, in the echo command shown above, SHELL (and only SHELL) must be all uppercase.

Special Characters

The following sections describe many of the characters that have special meaning to the C shell. Keep in mind that MOST punctuation characters have some special meaning. Therefore, you should not include any punctuation character other than period or underscore in the names you give to files and directories, or you may get surprising results. Also, don't try to use spaces in file and directory names.

Input/Output Redirection (the >, >>, and < Characters)

If a program normally reads its input from the terminal (referred to as ``standard input'' or ``stdin'') or writes its output to the terminal (``standard output'' or ``stdout''), you may want it to read from or write to an alternate file instead. For example, if you give the command

     who

the login names of the users currently logged in are written to standard output--displayed at your terminal. If you want to have this list placed into a file called ``namelist'', you could use the character > to redirect standard output to namelist, like this:

     who > namelist

If you already have a file named namelist, the shell erases its contents, then calls who to write the new information to namelist. If you do not have a file named namelist, the shell creates a new one before it calls the who program.

You can then do what you want with the file namelist--print it, sort its contents, display it with the cat command, or whatever.

To append output to an existing file instead of overwriting it, use the symbol >>. Thus, if you first type

     who > namelist

and later

     who >> namelist

the file namelist will contain two sets of results: the second set is appended to the first instead of destroying it.
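
The difference between > and >> can be sketched with a short script. This is only an illustration: it uses echo in place of who so that the result is predictable, and it assumes that the mktemp command is available to create a scratch directory. The redirection syntax itself is the same in the C shell and the Bourne shell.

```shell
# Work in a scratch directory so no existing file is overwritten
# (the use of mktemp is an assumption of this sketch).
cd `mktemp -d`

# > creates namelist, or erases its old contents, before echo runs.
echo "first run" > namelist

# >> appends to namelist instead of overwriting it.
echo "second run" >> namelist

# namelist now holds both lines, in the order they were written.
cat namelist

# Using > again discards everything written so far.
echo "third run" > namelist
cat namelist
```

After the last command, namelist contains only the line written by the third echo.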

Normally, you will not need to redirect standard input, as most commands that can read from standard input also accept an input filename as an argument. However, for commands that normally expect input from your terminal, the character < followed by a filename tells the command to read from that file instead. For example, one way to send mail is to type

     mail user

and then to type the text of the message from your terminal. If the text of the message is already in a file named letter, you could redirect standard input to send the mail this way:

     mail user < letter
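
Because mail needs a real recipient, the sketch below shows the same idea with tr and wc instead. The contents of the file letter and the scratch directory are hypothetical, chosen only for illustration.

```shell
# Work in a scratch directory (the use of mktemp is an assumption).
cd `mktemp -d`
echo "meeting moved to noon" > letter

# tr takes no filename argument; it reads only standard input.
# The < character makes the shell open letter and hand it to tr.
tr a-z A-Z < letter

# wc accepts either form. Given a filename it reports the name;
# given redirected input it never sees the name and omits it.
wc -l letter
wc -l < letter
```

The two wc commands count the same line, but only the first can print the filename, because in the second it is the shell, not wc, that opens the file.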

Note: The shell performs input and output redirection BEFORE it calls the command you want to execute. This means that files can be accidentally destroyed. For example, the command

     sort < myfile > myfile

destroys ``myfile'' because the shell (1) opens myfile for reading and then (2) opens it again for writing. When you open a file for writing, UNIX destroys the old contents of the file. All this happens BEFORE the shell runs the sort program to actually read the file.
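
One way to see the problem, and two common ways around it, is sketched below. The file named victim and the scratch directory are hypothetical. The -o flag is a standard sort flag that names the output file.

```shell
# Work in a scratch directory (the use of mktemp is an assumption).
cd `mktemp -d`
echo "pear"   > myfile
echo "apple" >> myfile
cp myfile victim

# Unsafe: the shell empties victim before sort ever reads it,
# so sort sees no input and victim ends up empty.
sort < victim > victim
wc -c < victim

# Safe: send the output to a different file, then rename it.
sort < myfile > myfile.sorted
mv myfile.sorted myfile

# Also safe: with -o, sort opens the output file itself, and it
# does so only after it has read all of its input.
sort -o myfile myfile
cat myfile
```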

Pipes (the | Character)

A pipeline is a useful way to use the output from one command as input to another without creating an intermediate file. Suppose that you want to use the who command to see who is logged in, and you want to see the results sorted alphabetically. The sort program reads a file and displays the sorted results on your terminal--the standard output. So you could accomplish what you want with the following sequence of commands:

     who > namelist
     sort namelist

Afterward, you could give another command to get rid of the file namelist.

A pipeline enables you to do all this on a single command line, using the pipe symbol | (vertical bar). When commands are separated by the | symbol, output from the command on the symbol's left side becomes input to the one on the right of it. Thus you could type

     who | sort

so that the results of the who program are passed to the sort program as input and displayed, by default, on your terminal.

More than one pipe symbol can be used to make a series of commands, the standard output from each becoming the standard input to the next.

Note: If a command is not capable of reading from standard input, it cannot be placed to the right of a pipe symbol.
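
As a rough illustration of a longer pipeline, the sketch below counts the distinct names in a file. The file contents are made up so that the result is predictable; in practice the first stage could be who.

```shell
# Work in a scratch directory (the use of mktemp is an assumption).
cd `mktemp -d`
echo "smith"  > namelist
echo "jones" >> namelist
echo "smith" >> namelist

# Two stages: cat writes the file, sort reads it from standard input.
cat namelist | sort

# Three stages: sort groups duplicate lines together, uniq drops the
# repeats, and wc -l counts the distinct names that remain.
sort namelist | uniq | wc -l
```

No intermediate file is created at any stage; each program simply reads what the one before it writes.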

Characters Used to Expand File and Directory Names (the * ? [ ] - and ~ Characters)

The shell also interprets other special characters it finds on a command line before it passes this line to the program that will execute it. These characters are normally used in place of filenames or directory names:

*
An asterisk matches any number of characters in a filename, including none. Thus, the command

     cat a*

would display the file named ``a'', if it exists, as well as any file whose name begins with ``a''. An asterisk will not match a leading period in a file name, but it will match a period in any other position.

?
The question mark matches any single character. Thus

     cat ? F?

would display all files with single-character names, plus all files that have a two-character name beginning with F. Like the asterisk, a question mark does not match a leading period, but does match a period in any other position.

[ ]
Brackets enclose a set of characters, any one of which may match a single character at that position. For example,

     cat draft[125]

displays files draft1, draft2, and draft5 if they exist. Remember that the shell interprets special characters BEFORE it calls the cat program. So there is a difference between

     cat draft[125]     and    cat draft1 draft2 draft5

In the first form, the shell considers it a match if ANY of draft1, draft2, or draft5 exists. It builds a cat command that includes only the names of files it finds. If there is no draft2, for example,

     cat draft[125]

displays draft1 and draft5. However, explicitly giving the command

     cat draft1 draft2 draft5

displays draft1 and draft5, but also gives an error message from the cat program: ``draft2: no such file or directory''. If none of draft1, draft2, or draft5 exists, then the command

     cat draft[125]

produces a message from the C shell: ``no match.'' In this case, the shell doesn't even call the cat command.

-
A hyphen used within [ ] denotes a range of characters. For example,

     cat draft[1-9]

has the same meaning as

     cat draft[123456789]

Again, the shell expands this to build a cat command that includes only the filenames that actually exist.

~
A tilde at the beginning of a word expands to the name of your home directory (the directory in which you are placed when you log in on UNIX). It is a useful special character if you changed to some different directory after you logged in, and want to copy a file to or from your home directory. If you append another user's login name to the ~ character, it refers to that user's home directory--useful if you are sharing files with someone else. For example,

     ~cssmith

means ``the home directory of the user whose login name is cssmith''. The usefulness of this notation will become more apparent in the next chapter.

Note: On most UNIX systems, the ~ character has this special meaning to the C shell only, not to the Bourne shell. In shell scripts, which are generally processed by the Bourne shell, use the variable $HOME to specify your own home directory.
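
The filename patterns *, ?, [ ] and - described above can be tried safely with echo, which simply prints whatever arguments the shell builds for it. The filenames below are hypothetical, and the scratch directory assumes that mktemp is available. Every pattern here matches at least one file, so the C shell and the Bourne shell expand them the same way.

```shell
# Work in a scratch directory (the use of mktemp is an assumption).
cd `mktemp -d`

# Hypothetical files to match against.
touch a abc draft1 draft5 draft7 F1 .hidden

# Names that begin with "a", including "a" itself.
echo a*

# Single-character names, then two-character names starting with F.
echo ? F?

# Only the listed drafts that exist; there is no draft2.
echo draft[125]

# A range: any of draft1 through draft9 that exists.
echo draft[1-9]

# Everything except .hidden, whose leading period is not matched.
echo *
```

Because echo never opens the files, this is a convenient way to check a pattern before using it with a command such as cat or rm.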

Action Characters

As already noted in chapter 2, a ^d signals the end of input. If a program is reading data from standard input (your terminal), it reads everything you type until you use ^d. This is also true if the program is the shell. To make the C shell ignore ^d, so that you don't log out accidentally, disable it with the command

     set ignoreeof

Another special control character for the C shell is ^z. Whenever you type ^z, it suspends the current program. UNIX responds with the message

     stopped

and gives you a new shell prompt. You can later resume the suspended program by giving the fg (foreground) command, or resume it in the background with the bg command. To kill a suspended program, give the command jobs -l to get the program's job number. Then use

     kill %job-no

to terminate it.

If you fail to kill or resume a suspended process, when you try to log out you will get the message: ``There are suspended jobs''. You can disable ^z entirely under UNIX by making the ``suspend'' character undefined. To do so, type the following command or place it in your .login file:

     stty susp ^-

You could also use the stty command to redefine the suspend character to something other than ^z.

Quotation Characters ( \ and ' )

Although you should not use special characters in filenames and directory names, you may need to include these characters in command lines without having the shell treat them as ``special''. To make the shell's special characters be treated as ordinary characters, do one of the following:

prefix a single character with a backslash \
enclose a sequence of characters in apostrophes ' '

For example, suppose you wanted to use a question mark as your C shell prompt. If you give the command

     set prompt=?

the shell will expand the question mark to be the first single-character filename it finds in your directory, and that filename will be your prompt. To avoid that problem, use

     set prompt=\?
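
The effect of both quoting methods can be sketched with echo. The filenames are hypothetical, and the scratch directory assumes that mktemp is available.

```shell
# Work in a scratch directory (the use of mktemp is an assumption).
cd `mktemp -d`
touch a b report.txt

# Unquoted, the shell replaces each pattern with matching filenames.
echo ? *.txt

# A backslash makes the one character that follows it ordinary.
echo \? \*.txt

# Apostrophes make every character between them ordinary.
echo '? *.txt'
```

The first echo prints filenames; the second and third print the characters ? and * themselves.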

History Substitution Characters (the ! ^ and $ Characters)

If you use the command

     set history=n

then the C shell keeps a record of your most recent n events, where an event is one command line. (That is, an event might include several commands in a pipeline, or several commands that you have typed on a single line.) You can use special characters to reissue the commands without retyping them. Here are some simple ways to do this:

!!
On a line by itself, simply reissues the most recent event.

!cmd
Reissues the most recent event that started with the command cmd.

!?string
Reissues the most recent event that contained string.

!-n
Reissues the nth previous event. For example, !-1 reissues the immediately preceding event, and !-2 reissues the one before that.

!n
Reissues command line n. To see how previous commands are numbered, just type:

     history

It will display as many lines as you told set history to keep.

^old^new
Substitutes the string new for the first occurrence of the string old in the most recent event, and reissues that command line. Note: This is the real caret character, not a representation of the Control key.

For example, suppose you wanted to copy a file from one directory to another, using the cp command, and issued the command

     cl my_old_directory/the_file_to_copy my_new_directory/the_new_filename

Then you got the message ``cl: Command not found''. You can easily correct your mistake by simply typing

     ^l^p

:
Selects specific words from an event line, so that you can reissue parts of a command. The : also separates these words from actions to take on them, such as a substitute command to correct a misspelled word. For example,

     !15:s/cot/cat/

would reissue event number 15, substituting the string ``cat'' for the string ``cot''. See how the : is used with $ below.

Examples:

     !f77

This reissues the most recent f77 command--say, after you have corrected a source-file error that caused the previous attempt to abort.

     !!:0-2 !-2:$

This creates a new command from parts of two that have already been issued: words 0, 1 and 2 from the most recent event, followed by the last word from the event before that.

The Command Separator Character ( ; )

To put more than one command on a line, separate complete commands with a semicolon:

     cd ~colleague;ls

The Background Character ( & )

To place a slow-running job in the ``background'' so that you don't have to wait for it to finish before issuing another command, use the ampersand character. For example:

     sort verylargefile &

The shell will notify you when the background job is finished.
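
A minimal sketch of a background job in a script follows. The input file is tiny and hypothetical, standing in for a genuinely large one, and the output is redirected to a file so that it does not interrupt other work at the terminal. The wait command, which both the C shell and the Bourne shell provide, pauses until background jobs have finished.

```shell
# Work in a scratch directory (the use of mktemp is an assumption).
cd `mktemp -d`
echo "pear"   > verylargefile
echo "apple" >> verylargefile

# The & returns control at once while sort runs in the background.
sort verylargefile > sorted &

# Other commands can run while the job is working.
echo "sort is running in the background"

# wait pauses until every background job has finished,
# which makes it safe to read the output file.
wait
cat sorted
```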

Other Special Characters

Besides the special characters discussed in this chapter, there are some others that have special meaning to the shell and which, if not quoted in a command line, will not be treated as ordinary characters. They include:

     `    {    }    #    "

Within a program, you should not need to quote special characters to make them ordinary; it is only when the shell interprets your command that such characters are expanded instead of being treated as text.

Shell Initialization Files

As noted in chapter 2, every time you log in under the C shell, it executes any commands that you have in a file named .login. But logging in is not the only way to start up the C shell. The command csh starts a new copy of the C shell for you if you are already logged in. Some programs, such as editors, allow you to ``escape'' temporarily to the shell and then resume your editing. In this case, you also start up a new copy of the shell instead of going back to the shell that called the editor. Because you are not logging in, this C shell ignores the file .login and executes commands it finds in the file .cshrc. In fact, when you log in, the C shell also executes commands in .cshrc--before it executes those in .login.

If you find that you occasionally need to start new copies of the C shell, be sure to use the file .cshrc for commands you want executed every time you start a new C shell. In general, environment variables specified by a setenv command (such as your setenv EDITOR command) should be in .login. Any set commands (such as set history=40) should be in .cshrc so every new copy of the C shell will be able to use them.

You can also put alias commands in .cshrc. For example, if you want to type just h instead of history to display your recently issued commands, put this line in your .cshrc file:

     alias h history

Keep in mind that any commands you put into your .cshrc file will not become effective until the next time you start a new copy of the C shell. To make the commands effective immediately, type

     source ~/.cshrc

This executes every command within .cshrc in your current shell. The ~/ prefix makes the command work even if you are not in your home directory.

If you do not have a .cshrc file, you can copy the file /usr/local/lib/CSHRC instead of making a file from scratch. See chapter 5 to learn how to do this.

Note: The Bourne shell uses just one initialization file--named .profile.

Search Paths

Remember that the shell is itself a program. After the shell examines and expands each of your command lines, it determines if each command is one of its own built-in commands, which it can execute directly. The commands history, set, and setenv are examples of the C shell's built-in commands. Most built-in commands do not have individual man pages. Instead, they are described within the csh man page (see chapter 3).

However, if the command is not a built-in command, the shell must call an external program to execute it. But such programs are not all stored in one place: they are organized into hierarchies of directories. To determine where to look for the needed program, the shell examines a variable called PATH. Normally, the path tells the shell to look first in one or more system directories in some particular order, then in your own current directory.

To see what your current path is, give the command:

     echo $PATH

You will see a sequence of pathnames, separated by colons. The first pathname on the list is where the shell looks first if the command you gave is not a built-in command; the second pathname is where it looks next if the first path fails, and so on. A null path--a colon alone at the end or the beginning of the list, or a pair of colons--means your current directory. A period, or dot, as a pathname also means your current directory.
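
A long path is easier to read with one directory per line. The sketch below is one possible way to do that, using the standard tr command to turn each colon into a newline.

```shell
# tr replaces each colon with a newline, so every directory in the
# search path appears on its own line, in the order it is searched.
echo $PATH | tr ':' '\n'

# Count the entries in the search path.
echo $PATH | tr ':' '\n' | wc -l
```

An empty line in the first listing corresponds to a null path, that is, your current directory.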

You may never have to do anything to create or change the path variable. But if the shell cannot find a command you want to issue, the reason might be that the command is outside your search path. Also realize that if you give one of your own executable files the same name as an existing command, your own file might never be executed. A built-in command always takes precedence, and for other commands the shell runs the first matching program it finds along the path, so a system directory listed ahead of your own directory wins.

The next chapter discusses paths and pathnames in more detail.
