Chapter 4: UNIX File Processing
Input and Output
Extracting Information from Files
Section A: Objectives
After studying this lesson, you should be able to understand and use:
– Redirection
– Pipes
– Wildcards and Regular Expressions
– Commands: find, sort, cat, grep
Processing Files
• When it runs a command, UNIX receives input from the standard input device (the keyboard) and sends output to the standard output device (the screen).
• System administrators and programmers refer to standard input as stdin
• They refer to standard output as stdout
• The third standard stream is called standard error, or stderr (also typically the screen), where error messages are displayed
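A minimal sketch of how stdout and stderr are separate streams. It uses the 2> operator, which is standard Bourne shell syntax for redirecting stderr but is not covered in this lesson; the directory and file names are hypothetical.

```shell
#!/bin/sh
# Work in a throwaway directory (all names here are made up).
tmp=$(mktemp -d)
cd "$tmp" || exit 1

echo "hello" > present.txt

# ls writes the name of the existing file to stdout and a complaint
# about the missing file to stderr. Each stream goes to its own file.
# ls exits non-zero because of the missing file, hence "|| true".
ls present.txt missing.txt > out.txt 2> err.txt || true

out=$(cat out.txt)                          # what went to stdout
err_lines=$(wc -l < err.txt | tr -d ' ')    # how many lines went to stderr

echo "stdout captured: $out"
echo "stderr lines captured: $err_lines"

cd / && rm -rf "$tmp"
```

The error message never reaches out.txt, which is why a command's normal output can be saved or piped without error text mixed in.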
Using Input and Error Redirection
• File redirection operators
> - redirects output, sending it to a file (any previous contents are replaced)
>> - redirects output, appending it to a file
< - redirects input, reading it from a file
Redirecting Outputs
• $ ls > foo
  Send the directory list to the file named “foo”
• $ cat foo
  Show contents of “foo”
• $ date >> foo
  Append the date to the file named “foo”
Redirecting Outputs (cont)
$ ls -l > bar
  Send the long directory list to “bar”
$ cat foo bar > foobar
  Combine “foo” and “bar” into a new file named “foobar”
$ cat foobar
  Display contents of new file
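The difference between > and >> is easy to miss, so here is a small sketch that can be run safely in a scratch directory. The file contents are made up for illustration.

```shell
#!/bin/sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1

echo "first"  > foo     # creates foo
echo "second" > foo     # > replaces the previous contents
echo "third"  >> foo    # >> adds to the end

foo_contents=$(cat foo)
echo "$foo_contents"

echo "other" > bar
cat foo bar > foobar    # concatenate both files into a new one
foobar_lines=$(wc -l < foobar | tr -d ' ')
echo "foobar has $foobar_lines lines"

cd / && rm -rf "$tmp"
```

After the three echo commands, foo holds only "second" and "third": the line "first" was overwritten by the second >.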
Redirecting Inputs
$ who > people
  Put the list of users currently on the system into the file “people”
$ sort < people
  Sort by sending “people” as input to sort
$ sort < people > speople
  Send sorted list to new file “speople”
Using Pipes
• The pipe operator (|) redirects the output of one command to the input of another command
• The pipe operator is used in the following manner:
first_command | second_command
• The pipe operator connects the output of the first command with the input of the second command
• The pipe operator can connect several commands on the same command line, in the following manner:
first_command | second_command | third_command ...
Pipes Examples
$ who > people
$ sort < people
Is there a shortcut? Yes, using pipes.
$ who | sort      Sorted list of users on the system
$ who | wc -l     How many users are on the system?
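A sketch comparing the temporary-file approach with the pipe shortcut. Because the output of who depends on who is logged in, a hypothetical list_users function with fixed names stands in for it here.

```shell
#!/bin/sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Stand-in for `who`: a fixed list of hypothetical user names,
# so the result does not depend on who is logged in.
list_users() {
    printf '%s\n' carol alice bob
}

# Two-step version with a temporary file.
list_users > people
via_file=$(sort < people)

# One-step version with a pipe; no temporary file is needed.
via_pipe=$(list_users | sort)

# Count lines the same way `who | wc -l` counts users.
user_count=$(list_users | wc -l | tr -d ' ')

echo "$via_pipe"
echo "users: $user_count"

cd / && rm -rf "$tmp"
```

Both versions produce the same sorted list; the pipe simply skips the intermediate file.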
Wildcards
The ‘*’ and ‘?’ characters are wildcards, which match characters in a file or directory name.
‘*’ matches any number of characters (0 or more).
‘?’ matches exactly one character.
[ ] and - : character set and range, a way to specify a set or sub-range of characters; the bracket expression matches exactly one character from it.
Wildcard Examples
$ ls foo*
  Lists all files starting with “foo”
$ ls *.c
  Lists all files ending with “.c”
$ ls f.?
  Lists all files named “f.” plus one extra character, such as “f.1” or “f.c”
$ ls a* d* f*
  List all files starting with a, d, or f
$ ls [adf]*
  Same
$ ls [a-z]*
  List files starting with any lowercase letter
$ ls [a-z] *
  List all one-letter files, then all files (note the space before *)
Wildcard Examples (cont)
$ ls *unix*
  List all files/subdirectories whose names contain the word “unix”
$ ls *s*
  The following example files will match this pattern: grades.txt, prog1.s, summary.doc
$ ls *[0-9._]*
  Files whose names contain a digit 0-9, a “.”, or a “_”
Wildcard Examples (cont)
$ rm *
  Delete ALL files in the current directory (names starting with “.” are not matched by *)
$ rm ???
  Delete all files whose names have exactly 3 characters
$ cp *.txt ..
  Copy all files with a “.txt” extension to the parent directory
Wildcard Examples (cont)
• Example: to list any file whose name starts with a lowercase letter in the range a to z:
ls -l [a-z]*
• Which of the following files will this pattern match?
ls hw[0-9]*
Candidates: hw1, hw3.c, hw50.txt, hw. The first three match; hw does not, because [0-9] requires one digit after “hw”.
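One way to check what a wildcard will match before using it with rm or cp is to hand it to echo, which just prints what the shell expanded. A minimal sketch with hypothetical file names in a scratch directory:

```shell
#!/bin/sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1
LC_ALL=C; export LC_ALL   # keep the ordering of matches predictable

touch hw hw1 hw3.c hw50.txt f.1 f.c f.txt abc

# echo prints whatever the shell expanded each pattern into.
digits=$(echo hw[0-9]*)   # "hw" is skipped: [0-9] needs one digit
any=$(echo hw*)           # * also matches zero characters, so "hw" is included
single=$(echo f.?)        # ? matches exactly one character, so f.txt is skipped
three=$(echo ???)         # names that are exactly three characters long

echo "hw[0-9]* -> $digits"
echo "hw*      -> $any"
echo "f.?      -> $single"
echo "???      -> $three"

cd / && rm -rf "$tmp"
```

Replacing echo with ls, rm, or cp passes the same expanded list of names to that command.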
Finding Files
• The find command searches for files that have a specified name
• The command has the form:
find pathname -name filename
• The find command cannot search directories that you do not have permission to read
• Although Linux does not need it, other UNIX versions require the -print option after the filename to display the names of the files the find command locates
Finding Files Examples
• find /usr -name Chapter1 -type f -print
  Search through the "/usr" directory for the file named "Chapter1".
• find /usr -name "Chapter*" -type f -print
  Search through the "/usr" directory for all files whose names begin with the letters "Chapter". The filename can end with any other combination of characters. This will match filenames such as "Chapter", "Chapter1", "Chapter1.bad", "Chapter_in_life".
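Searching /usr gives different results on every system, so this sketch runs the same two kinds of search against a small, hypothetical directory tree built in a scratch location.

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir -p "$tmp/book/drafts"
touch "$tmp/book/Chapter1" "$tmp/book/Chapter1.bad" \
      "$tmp/book/drafts/Chapter_in_life" "$tmp/book/notes"
mkdir "$tmp/book/Chapters"   # a directory, filtered out by -type f

cd "$tmp" || exit 1

# Exact name: only the file called Chapter1.
exact=$(find book -name Chapter1 -type f -print)

# Quoted pattern: find, not the shell, expands the wildcard,
# so matches in subdirectories are found too.
count=$(find book -name "Chapter*" -type f -print | wc -l | tr -d ' ')

echo "$exact"
echo "files matching Chapter*: $count"

cd / && rm -rf "$tmp"
```

The quotes around "Chapter*" matter: without them the shell would try to expand the pattern against the current directory before find ever runs.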
Using the Sort Command
• Use the sort command to sort a file’s contents alphabetically or numerically
• Here is an example of its use:
sort file1 > file2
• Here are more complex examples:
ls -s | sort -n               list files sorted by size
ls -s | sort -nr              reverse sorted order
ls -s | sort -nr | head -5    show 5 largest files
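The -n option matters because plain sort compares text character by character. A sketch with a made-up listing, roughly in the "size name" shape that ls -s prints, shows the difference:

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # make the plain-text ordering predictable

# Hypothetical sizes and names, not real ls output.
listing='8 notes
120 video
24 report
100 archive'

alpha=$(printf '%s\n' "$listing" | sort)              # character by character
numeric=$(printf '%s\n' "$listing" | sort -n)         # by numeric value
largest=$(printf '%s\n' "$listing" | sort -nr | head -2)

echo "plain sort:";   echo "$alpha"
echo "numeric sort:"; echo "$numeric"
echo "two largest:";  echo "$largest"
```

Plain sort puts "100 archive" before "8 notes" because the character 1 sorts before 8; sort -n compares the numbers themselves.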
grep
• The grep command matches a pattern in one or more files and displays the matching lines on standard output.
• The format of the grep command is:
grep [options] regexp [files]
• grep stands for global regular expression print.
grep
• Example: grep luo /etc/passwd searches for a user on the system who has 'luo' as a first name or last name.
• Quotes (single or double) are needed if regular expressions are to be used
grep olga*      may stall or misbehave, because the shell can expand the * as a filename wildcard before grep sees it
grep 'olga*'    will work fine
• Wildcards in filenames can be used with grep as usual
grep 'olga*' *.txt
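A sketch of why the quotes matter, using hypothetical files in a scratch directory. echo is used to reveal the arguments grep would receive when the pattern is left unquoted.

```shell
#!/bin/sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1

printf '%s\n' 'olga smith' 'oleg jones' > users.txt
touch olga_notes   # a file name that the unquoted pattern olga* matches

# Quoted: grep receives the pattern olga* unchanged and searches users.txt.
quoted=$(grep 'olga*' users.txt)

# Unquoted, the shell first rewrites olga* into the matching file name.
# echo shows what grep would actually be given as its arguments.
expanded=$(echo grep olga* users.txt)

echo "quoted result:    $quoted"
echo "unquoted becomes: $expanded"

cd / && rm -rf "$tmp"
```

In the unquoted case grep would end up searching for the text "olga_notes" rather than the regular expression, which is rarely what was intended.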
Regular Expressions
• b* matches zero or more occurrences of b (the empty string, b, bb, bbb, bbbb, ...)
• . matches any single character
• ^ matches the beginning of the line
grep '^taylor' file1    finds lines that begin with taylor
• $ matches the end of the line
grep ':$' file1         lines that end with :
grep '100\$' f1         searches for the literal text 100$
grep '100$' f1          searches for 100 at the end of a line
Regular Expressions
• [xy] any single character in the set
• [^xy] any single character not in the set
grep '^taylor[a-z]' file1     starts with taylor, followed by a lowercase letter
grep '^taylor[^a-z]' file1    starts with taylor, followed by a character that is not a lowercase letter
grep 'z*' f1                  zero or more occurrences of z (matches every line in f1)
grep 'zz*' f1                 one or more occurrences of z (does not return lines without z)
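The patterns above can be tried against a small sample file. This sketch uses made-up contents for file1 and counts matching lines with grep -c instead of printing them.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL   # keep [a-z] limited to lowercase ASCII letters
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Hypothetical sample data.
cat > file1 <<'EOF'
taylor:
taylors shop
mr taylor
price 100$
room 100
EOF

# grep -c prints the number of matching lines. grep exits non-zero
# when nothing matches, hence "|| true" on each command.
starts=$(grep -c '^taylor' file1 || true)          # taylor at line start
lower=$(grep -c '^taylor[a-z]' file1 || true)      # taylor + lowercase letter
colon=$(grep -c ':$' file1 || true)                # lines ending with :
dollar=$(grep -c '100\$' file1 || true)            # literal 100$
at_end=$(grep -c '100$' file1 || true)             # 100 at end of line
zstar=$(grep -c 'z*' file1 || true)                # zero or more z
zplus=$(grep -c 'zz*' file1 || true)               # one or more z

echo "^taylor=$starts ^taylor[a-z]=$lower :\$=$colon"
echo "100\\\$=$dollar 100\$=$at_end z*=$zstar zz*=$zplus"

cd / && rm -rf "$tmp"
```

Because z* also matches the empty string, it matches all five lines even though none contains a z, while zz* matches none.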
Grep flags
• -c: print only a count of matched lines
• -l: list the names of files that contain the specified pattern
• -i: ignore the case of the letters in the pattern
• -n: include line numbers
• -b: precede each line with its block number
• -h: print matched lines but not filenames (inverse of -l)
• -s: suppress error messages for nonexistent or unreadable files
• -v: print all lines that don't match regexp
grep
• Example 1: count how many users on the system have "marylin" as a first or last name, ignoring upper or lower case.
grep -ci marylin /etc/passwd
• Example 2: count the case-sensitive matches in the file without displaying them on the screen.
grep -c lu /etc/passwd
• Example 3: number the output lines.
grep -n lu /etc/passwd
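Since /etc/passwd differs on every machine, this sketch runs the same flags against a hypothetical three-line file in the same colon-separated layout. The user names are invented.

```shell
#!/bin/sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Hypothetical stand-in for /etc/passwd.
cat > passwd.sample <<'EOF'
mluo:x:1001:100:Marylin Luo:/home/mluo:/bin/sh
jdoe:x:1002:100:John Doe:/home/jdoe:/bin/sh
lucas:x:1003:100:Lucas MARYLIN:/home/lucas:/bin/sh
EOF

count_ci=$(grep -ci marylin passwd.sample)   # -c count, -i ignore case
count_cs=$(grep -c lu passwd.sample)         # case-sensitive count
# -n prefixes each match with "line number:"; cut keeps just that number.
line_numbers=$(grep -n lu passwd.sample | cut -d: -f1)
# -v keeps the lines that do NOT match; cut keeps the login name.
no_match=$(grep -v lu passwd.sample | cut -d: -f1)

echo "marylin (any case): $count_ci"
echo "lu (exact case):    $count_cs"
echo "matching lines:     $line_numbers"
echo "not matching lu:    $no_match"

cd / && rm -rf "$tmp"
```

Flags can be combined, as in -ci, and grep output can be piped into other commands such as cut in the same way as any other command's output.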
Chapter Summary
• Output from a command may be redirected from stdout to a disk file
• Input to a command may be redirected so that stdin comes from a disk file instead of the keyboard
• Use the sort command to sort a file’s contents alphabetically or numerically
• A pipe directs the standard output of one Unix command into the standard input of another Unix command.
• A wildcard is a tool that allows you to specify a group of filenames or directories.
• Useful commands such as find, grep, cat,