Unix grep Utility CS465
Unix grep Utility CS465. The grep utility grep stands for globally search for a regular expression and print the results It is one of the most used Unix.

Dec 20, 2015

Unix grep Utility

CS465

The grep utility

• grep stands for "globally search for a regular expression and print the results"

• It is one of the most used Unix tools. It has even added to the Unix user's vocabulary:

– Verb: “Grep through the files to see what should be changed.”

– Adjective: “The projx file is grepped source code.”

– Noun: “Grepping is the best way to find that information.”

grep commands

• grep is actually a family of commands:

– fgrep

– grep

– egrep

• All three search files for strings which match specified patterns:

fgrep – pattern must be a fixed string

grep – pattern can include regular expressions

egrep – pattern can include extended regular expressions
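One way to see the difference between the three commands is to run similar patterns through each of them. The following is only a sketch: the file sample.txt and its contents are made up for illustration, and it uses grep -F and grep -E, which POSIX defines as the equivalents of fgrep and egrep (some newer systems print a warning for the older names).

```shell
# Hypothetical sample file in a temporary directory.
tmp=$(mktemp -d)
printf 'a.c\nabc\nabbc\n' > "$tmp/sample.txt"

# Fixed string (fgrep): the dot is an ordinary character.
fixed=$(grep -F 'a.c' "$tmp/sample.txt")

# Basic regular expression (grep): the dot matches any single character.
basic=$(grep 'a.c' "$tmp/sample.txt")

# Extended regular expression (egrep): + means one or more of the preceding character.
extended=$(grep -E 'ab+c' "$tmp/sample.txt")

printf 'fixed:\n%s\nbasic:\n%s\nextended:\n%s\n' "$fixed" "$basic" "$extended"
rm -rf "$tmp"
```

The fixed-string search finds only the line a.c, the basic search also finds abc, and the extended search finds abc and abbc.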

Simplest grep

• The simplest grep command is fgrep:

$ fgrep fixed-pattern [file-list]

– Searches all files in the file-list

– Displays the names of the files that contain the fixed-pattern, along with each line on which the pattern was found

– If you list only ONE filename in the file-list, fgrep will NOT include the filename in the results
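The filename rule can be checked with a small experiment. This sketch assumes two made-up files, memo and new.c, created in a temporary directory, and uses grep -F as the POSIX spelling of fgrep.

```shell
# Hypothetical files for the demonstration.
tmp=$(mktemp -d)
printf 'The main point\n' > "$tmp/memo"
printf 'main()\n' > "$tmp/new.c"

# One file in the list: only the matching line is printed.
one=$(cd "$tmp" && grep -F main new.c)

# Several files in the list: each matching line is prefixed with "filename:".
many=$(cd "$tmp" && grep -F main memo new.c)

printf '%s\n' "$one" "$many"
rm -rf "$tmp"
```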

fgrep Example

• Search for all files in the current directory that contain the string "main".

• Example:

$ fgrep main *

memo: The main point is that the

new.c:main()

prog1.c:main()

$

More fgrep examples

Display all lines in file prog.c containing “num”:

$ fgrep num prog.c
num = 0;
while ( num < 5 ) {
num = num + 1;
$

Display the /etc/passwd lines of all users whose entry contains “small”:

$ fgrep small /etc/passwd
small000:x:1164:102:Faculty – Pam Smallwood:/export/home/small000:/bin/ks
$

grep format

$ grep [options] pattern [filelist]

– Search for specified pattern in each line of specified files

– Send lines containing pattern (or other info) to the standard output (i.e. display them)

• Options:

-c display only count of matching lines

-h outputs matched lines but not filenames

-i ignore case when matching

-l display names of files only (no matching lines)

-n display line numbers

-s suppresses error messages for nonexistent or unreadable files

-v display only non-matching lines

-w restricts pattern to matching a whole word

grep examples

$ cat soccer.txt

In Soccer,

There are no time outs.

There are no helmets,

no shoulder pads,

no commercial breaks,

no warm dugouts,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

$ grep -n are soccer.txt

2:There are no time outs.

3:There are no helmets,

$ grep are soccer.txt

There are no time outs.

There are no helmets,

$ grep -c are soccer.txt

2

grep examples

$ cat soccer.txt

In Soccer,

There are no time outs.

There are no helmets,

no shoulder pads,

no commercial breaks,

no warm dugouts,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

Are you ready?

$ grep -v no soccer.txt

In Soccer,

So if that’s what you need,

Are you ready?

$ grep Are soccer.txt

Are you ready?

$ grep -vc no soccer.txt

3

$ grep -i Are soccer.txt

There are no time outs.

There are no helmets,

Are you ready?

grep examples

$ cat soccer.txt

In Soccer,

There are no time outs.

There are no helmets,

no shoulder pads,

no commercial breaks,

no warm dugouts,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

Are you ready?

$ grep -vw no soccer.txt

In Soccer,

So if that’s what you need,

play another sport.

Are you ready?

$ grep -l soccer *

$ grep -l Soccer *

soccer.txt

$ grep -li soccer *

soccer.txt

grep examples

$ cat team1

Rob Murray

Scott Stewart

Martin Jones

Scott Smith

$ cat team2

Scott Jones

Richard Shepard

Doug Stringfellow

John English

$ grep Scott team1 team2

team1:Scott Stewart

team1:Scott Smith

team2:Scott Jones

$ grep -l Scott team1 team2

team1

team2

$ grep -h Scott team1 team2

Scott Stewart

Scott Smith

Scott Jones

grep examples

$ cat team1

Rob Murray

Scott Stewart

Martin Jones

Scott Smith

$ cat team2

Scott Jones

Richard Shepard

Doug Stringfellow

John English

$ grep Scott team1 taem2

team1:Scott Stewart

team1:Scott Smith

grep: can’t open taem2

$ grep -s Scott team1 taem2

team1:Scott Stewart

team1:Scott Smith

grep examples

$ cat team1

Rob Murray

Scott Stewart

Martin Jones

Scott Smith

$ cat team2

Scott Jones

Richard Shepard

Doug Stringfellow

John English

$ grep Scott team*

team1:Scott Stewart

team1:Scott Smith

team2:Scott Jones

$ grep -c Do team* | grep ":0"

team1:0

$ grep -c Doug team*

team1:0

team2:1

More grep examples

Print non-commented lines in file myfile (i.e. lines that do NOT start with the string "#"):

$ egrep -v "^#" myfile
name=bill
echo $name
$

Search files in the sub subdirectory for string “test” (ignore case):

$ grep -i test `ls /sub`
ltr: Test for today
mbox:Subject: test
makefile: test1: test1.c
makefile: gcc test1.c -o test1
$

More grep example

Determine number of users in the projectX group:

$ grep projectX /etc/group

projectX:x:507:Plin_9318,Fyusuf_9287,Rlee_8656,Rdeich_1254,Njuwal_5960,Mmelto_8858,Wbucki_6698,Tespin_0604,Psmallwo_000

$

-c shows matching count

$ grep -c 507 /etc/passwd

9

$

Searching for Multiple Strings

-f option

– If you have multiple strings that you want to search for, you can put all the strings into a string file, and use:

$ cat stringfile
pattern1
pattern2
$ grep -f stringfile filelist

grep examples

$ cat soccer.txt

In Soccer,

There are no time outs.

There are no helmets,

no shoulder pads,

no commercial breaks,

no warm dugouts,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

Are you ready?

$ cat words

Soccer

dugouts

helmets

$ grep -f words soccer.txt

In Soccer,

There are no helmets,

no warm dugouts,

$ grep -v -f words soccer.txt

There are no time outs.

no shoulder pads,

no commercial breaks,

no halftime extravaganzas.

So if that’s what you need,

play another sport.

Are you ready?

grep Exercises

• Display all lines of file “test” that do not contain the string “and”, ignoring case

$ grep -iv "and" test

• Display a count of all of the lines of each “.c” file in current directory that contain the strings “num” or “number”

• Answer:

$ cat strings
num
number
$ grep -c -f strings *.c

Advanced grep

• grep is much more powerful when used with regular expressions to match more complex strings.

• Regular expressions are strings of characters and special symbols that are used to match other strings.

Regular Expressions

• A pattern matching string is called a regular expression (RE)

• grep (and other Unix utilities) can use REs

• Regular Expression Metacharacters:

. (period) match any single character, except newline (similar to wildcard ?)

* (asterisk) match any number (including zero) of the preceding character

.* match any number of any character
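The three metacharacters above can be compared on a small word list. This is a minimal sketch; the file words.txt and its contents are invented for the illustration.

```shell
# Hypothetical word list in a temporary directory.
tmp=$(mktemp -d)
printf 'ct\ncat\ncot\ncoot\ncart\ndog\n' > "$tmp/words.txt"

# c.t : c, exactly one character of any kind, then t.
dot=$(grep 'c.t' "$tmp/words.txt")

# co*t : c, zero or more o characters, then t.
star=$(grep 'co*t' "$tmp/words.txt")

# c.*t : c, any number of any characters, then t.
dotstar=$(grep 'c.*t' "$tmp/words.txt")

printf 'c.t:\n%s\nco*t:\n%s\nc.*t:\n%s\n' "$dot" "$star" "$dotstar"
rm -rf "$tmp"
```

Here c.t selects cat and cot, co*t selects ct, cot and coot, and c.*t selects every line except dog.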

Not Filename Expansion!

• Although there are similarities to the metacharacters used in filename expansion, this is different!

• Filename expansion is done by the shell.

• Regular expressions are used by commands (programs).
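The difference shows up when the same characters are given once to the shell and once to grep. In this sketch the four file names (a, ab, abb, abc) are made up; the shell expands ab* before any command runs, while grep receives the quoted pattern untouched and interprets it as a regular expression.

```shell
# Hypothetical empty files in a temporary directory.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/ab" "$tmp/abb" "$tmp/abc"

# Filename expansion (done by the shell): ab* means "ab followed by anything".
globbed=$(cd "$tmp" && echo ab*)

# Regular expression (done by grep): ab* means "a followed by zero or more b".
matched=$(cd "$tmp" && printf '%s\n' * | grep '^ab*$')

echo "shell: $globbed"
printf 'grep:\n%s\n' "$matched"
rm -rf "$tmp"
```

The shell expansion yields ab, abb and abc, while the regular expression accepts a, ab and abb.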

More RE Metacharacters

^ (caret) match start of line

$ (dollar) match end of line

Character Sets:

[ ] match any of enclosed characters

[^ ] match any single character BUT the enclosed characters

NOTE: If the caret (^) is anywhere inside a character set except right after the opening bracket, it has no special meaning
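The anchors and character sets can be tried together. The sketch below assumes an invented file chars.txt; the third pattern shows a caret that is not the first character inside the brackets and is therefore matched literally.

```shell
# Hypothetical sample file in a temporary directory.
tmp=$(mktemp -d)
printf 'cat\ncut\nc^t\nscat\ncats\n' > "$tmp/chars.txt"

# Whole line is c, then a or u, then t.
inset=$(grep '^c[au]t$' "$tmp/chars.txt")

# Whole line is c, then any character except a or u, then t.
notset=$(grep '^c[^au]t$' "$tmp/chars.txt")

# Caret not in first position: the set contains a and a literal ^.
literal=$(grep '^c[a^]t$' "$tmp/chars.txt")

printf '%s\n---\n%s\n---\n%s\n' "$inset" "$notset" "$literal"
rm -rf "$tmp"
```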

Character Set Ranges

• The hyphen (-) character can be used with the square brackets to indicate a range of characters:

[0-9] is the same as [0123456789]

[a-z] is the same as [abcd...wxyz]

• If the hyphen is placed at the beginning or end of the character set, it has no special meaning (and will match a hyphen in the string)
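A short experiment illustrates both uses of the hyphen. The file ranges.txt and its contents are assumptions made for this sketch.

```shell
# Hypothetical sample file in a temporary directory.
tmp=$(mktemp -d)
printf 'a-1\na51\nab1\n' > "$tmp/ranges.txt"

# Hyphen between two characters: a range of digits.
range=$(grep 'a[0-9]1' "$tmp/ranges.txt")

# Hyphen first in the set: a literal hyphen plus the digit range.
first=$(grep 'a[-0-9]1' "$tmp/ranges.txt")

# Hyphen last in the set: same meaning as above.
last=$(grep 'a[0-9-]1' "$tmp/ranges.txt")

printf '%s\n---\n%s\n---\n%s\n' "$range" "$first" "$last"
rm -rf "$tmp"
```

Only a51 matches the plain range; with the hyphen at either end of the set, a-1 matches as well.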

Other Characters

• Any character other than a metacharacter will accept one of itself:

– A single letter a in a regular expression will accept a single letter a in a string

– This is assumed to be case-sensitive; lowercase a doesn't accept uppercase A

• Use a backslash to turn OFF metacharacter processing (i.e. match a metacharacter to its real value)

• Use quotes around Regular Expressions to prevent SHELL metacharacter interpretation.
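The effect of the backslash can be seen with the asterisk. In this sketch the file math.txt is made up; the single quotes keep the shell from expanding the * or removing the backslash before grep sees the pattern.

```shell
# Hypothetical sample file in a temporary directory.
tmp=$(mktemp -d)
printf '2*3\n23\n223\n5\n' > "$tmp/math.txt"

# Unescaped: zero or more 2 characters followed by 3.
meta=$(grep '2*3' "$tmp/math.txt")

# Escaped: a 2, a literal asterisk, then a 3.
plain=$(grep '2\*3' "$tmp/math.txt")

printf 'metacharacter:\n%s\nescaped:\n%s\n' "$meta" "$plain"
rm -rf "$tmp"
```

The unescaped pattern selects every line that contains a 3; the escaped pattern selects only the line 2*3.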

Example Pattern Matches

• Some sample regular expressions and what they match:

"abc" matches the string abc

"^abc" abc at the beginning of a line

"abc$" abc at the end of a line

"^abc$" abc as the entire line

"[Aa]bc" abc or Abc

"a[aeiuo]c" a, lowercase vowel, c

"a[^aeiou]c" a, not lowercase vowel, c

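Example Pattern Matches

• More regular expressions and what they match:

"[x-z]" matches x or y or z

"[-xz]" matches x or - or z

"[xz-]" matches x or - or z (same)

".c" matches any character followed by a c

"\.c" matches .c

"[a-zA-Z0-9]" matches any letter or digit

"[^0-9]" matches any non-digit

"[^^]" matches any single character except ^

NOTE: Inside square brackets, the period and the backslash are ordinary characters, so a literal hyphen must be placed first or last in the set, and a literal caret anywhere except first.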

More Pattern Matches

"[Pp][Aa][Mm]"

– Matches "Pam" or "pam" or "pAM"

– Does not match "am" or "pa"

"[abc]*"

– matches "aaaaa" or "acbca"

"0*10"

– matches "010" or "0000010" or "10"
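Because * allows zero occurrences, a pattern such as "[abc]*" on its own also matches the empty string and therefore selects every line; anchoring it with ^ and $ limits it to lines made up only of those characters. The sketch below uses two invented files, letters.txt and digits.txt, to show this and the "0*10" pattern.

```shell
# Hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
printf 'aaaaa\nacbca\nxyz\n' > "$tmp/letters.txt"
printf '010\n0000010\n10\n11\n' > "$tmp/digits.txt"

# Unanchored: every line matches, because zero occurrences are allowed.
all=$(grep -c '[abc]*' "$tmp/letters.txt")

# Anchored: only lines consisting entirely of a, b and c.
only=$(grep '^[abc]*$' "$tmp/letters.txt")

# Zero or more 0 characters followed by 10.
tens=$(grep '0*10' "$tmp/digits.txt")

printf 'lines matched unanchored: %s\n%s\n---\n%s\n' "$all" "$only" "$tens"
rm -rf "$tmp"
```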

grep examples

$ cat pattern

Background is black, and white.

I love red,

and I love blue,

but not yellow.

$ grep -i "^b" pattern

Background is black, and white.

but not yellow.

$ grep -i b pattern

Background is black, and white.

and I love blue,

but not yellow.

$ grep "^b" pattern

but not yellow.

$ grep "\." pattern

Background is black, and white.

but not yellow.

$ grep "." pattern

Background is black, and white.

I love red,

and I love blue,

but not yellow.

grep examples

$ cat pattern

Background is black, and white.

I love red,

and I love blue,

but not yellow.

$ grep "," pattern

Background is black, and white.

I love red,

and I love blue,

$ grep ",$" pattern

I love red,

and I love blue,

grep examples

$ cat pattern2

rd

reed

red

reef

ref

reep

$ grep "re[df]" pattern2

red

ref

$ grep "re*d" pattern2

rd

reed

red

$ grep "f$" pattern2

reef

ref

$ grep "re*[dp]" pattern2

rd

reed

red

reep

More grep examples (using RE)

Display names of all files in this directory that refer to Unix (or unix)

$ grep -l "[Uu]nix" *
mbox
myfile.txt
script2
$

List soft-linked files only:

$ ls -l | grep "^l"
lrwx------ 2 small000 faculty 512 Jun 4 13:04 t1 -> t

lrwx------ 2 small000 faculty 512 Jun 2 13:43 t2 -> t

$

grep in a script

Save a long listing of the files, then display the number of "old" files (files last modified in 2007):

$ cat oldfiles.ksh
#! /bin/ksh
ls -l > listfile
num=`grep 2007 listfile | wc -l`
echo Number of old files: $num
rm listfile
exit 0
$
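The same count can be obtained without a temporary file by piping ls -l straight into grep -c. The following is a sketch rather than a replacement for the script: the function name count_old and its directory parameter are additions made for this illustration, and the demonstration files are created with made-up timestamps. Like the original, it counts any listing line that contains 2007, so a file whose name or size happens to contain 2007 would also be counted.

```shell
# count_old DIR: count lines of "ls -l DIR" that contain 2007.
# (DIR is a parameter added for this sketch; the script above uses the current directory.)
count_old() {
    ls -l "$1" | grep -c 2007
}

# Demonstration on made-up files in a temporary directory.
tmp=$(mktemp -d)
touch -t 200703151200 "$tmp/old1" "$tmp/old2"   # timestamps set to March 2007
touch "$tmp/new"                                 # timestamp is the current time
num=$(count_old "$tmp")
echo "Number of old files: $num"
rm -rf "$tmp"
```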

Extended Regular Expression

Additional Metacharacters (available only with egrep)

+ match any number (greater than zero) of preceding character

? match zero or one instances of preceding character

| combines REs with either-or matching

( ) groups pattern matching characters
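The extended metacharacters can be tried on a small file. This sketch assumes an invented file spell.txt and uses grep -E, which POSIX defines as the equivalent of egrep.

```shell
# Hypothetical sample file in a temporary directory.
tmp=$(mktemp -d)
printf 'color\ncolour\ncolouur\ncat\ndog\n' > "$tmp/spell.txt"

# ? : the u may appear zero times or once.
optional=$(grep -E 'colou?r' "$tmp/spell.txt")

# + : the u must appear at least once.
repeated=$(grep -E 'colou+r' "$tmp/spell.txt")

# | : either of the two alternatives.
either=$(grep -E 'cat|dog' "$tmp/spell.txt")

printf '%s\n---\n%s\n---\n%s\n' "$optional" "$repeated" "$either"
rm -rf "$tmp"
```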

egrep examples

$ cat pattern2

rd

reed

red

reef

ref

reep

$ egrep "re+d" pattern2

reed

red

$ egrep "re?d" pattern2

rd

red

$ egrep "re+[dp]" pattern2

reed

red

reep

$ egrep "re?[df]" pattern2

rd

red

ref

grep/egrep examples

$ cat pattern

Background is black, and white.

I love red,

and I love blue,

but not yellow.

$ egrep "I|a" pattern

Background is black, and white.

I love red,

and I love blue,

$ grep "red|blue" pattern

$ egrep "red|blue" pattern

I love red,

and I love blue,

$ egrep "^I|^a" pattern

I love red,

and I love blue,

Extended RE Examples

[abc]+d matches "aaaaad" or "acbcad"

but does NOT match "d"

0+10 matches "010" or "0000010"

but does NOT match "10"

x[abc]?x matches "xax", "xbx", "xcx" or "xx"

A[0-9]?B matches "A8B" or "AB"

but does NOT match "a8b" or "A123B"

Grouping

• The parentheses special characters ( and ) can be used to group several subexpressions together and apply a suffix to them as a group:

ba+d accepts bad, baad, baaad, etc

(ba)+d accepts bad, babad, bababad, etc

(ba)+(cd)+ accepts bacd, babacd, bacdcd, bacdcdcdcd, etc
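The grouping patterns above can be checked against a short list of strings. In this sketch the file groups.txt is made up, the patterns are anchored with ^ and $ so that each must account for the whole line, and grep -E stands in for egrep.

```shell
# Hypothetical sample file in a temporary directory.
tmp=$(mktemp -d)
printf 'bad\nbaad\nbabad\nbd\nbacdcd\n' > "$tmp/groups.txt"

# + applies to the single character a.
single=$(grep -E '^ba+d$' "$tmp/groups.txt")

# + applies to the whole group ba.
group=$(grep -E '^(ba)+d$' "$tmp/groups.txt")

# Two groups, each repeated one or more times.
pair=$(grep -E '^(ba)+(cd)+$' "$tmp/groups.txt")

printf '%s\n---\n%s\n---\n%s\n' "$single" "$group" "$pair"
rm -rf "$tmp"
```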

Alternatives

• The (|) "either-or" choice can also be "grouped"

aa|bb will accept either aa or bb

fr(ie|ei)nd will accept friend or freind, and nothing else

• There can be any number of choices:

m(a|e|ai|oo)n will accept man, men, main, or moon

• If all the choices are single characters, then you might as well use a character set:

p(a|e|i)n will accept pan, pen, or pin, and is equivalent to p[aei]n
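Both points can be verified quickly. This sketch assumes an invented file alt.txt and uses grep -E in place of egrep; the last two commands compare the grouped alternatives with the equivalent character set.

```shell
# Hypothetical sample file in a temporary directory.
tmp=$(mktemp -d)
printf 'friend\nfreind\nfrnd\npan\npun\n' > "$tmp/alt.txt"

# Grouped alternatives inside a longer pattern.
spelling=$(grep -E 'fr(ie|ei)nd' "$tmp/alt.txt")

# Single-character alternatives ...
grouped=$(grep -E 'p(a|e|i)n' "$tmp/alt.txt")

# ... and the equivalent character set, which also works in plain grep.
charset=$(grep 'p[aei]n' "$tmp/alt.txt")

printf '%s\n---\n%s\n---\n%s\n' "$spelling" "$grouped" "$charset"
rm -rf "$tmp"
```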

More egrep Examples

• You must use egrep in order to have access to the EXTENDED regular expressions:

$ egrep 'ab+c?d' file

match lines with a followed by one or more b’s and an optional c followed by a d

$ egrep '(ab)+c?(de)+' file

match lines with one or more ab’s and an optional c followed by one or more de’s

Handout

• See handout for more fgrep, grep, and egrep examples

grep/egrep Exercise

• Display all lines of file “test” that end in the letters x, y, or z

$ grep '[xyz]$' test

OR

$ egrep 'x$|y$|z$' test
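The two answers can be compared on a sample file to confirm that they select the same lines. In this sketch the file contents are made up, and grep -E is used as the POSIX spelling of egrep.

```shell
# Hypothetical file named "test" in a temporary directory.
tmp=$(mktemp -d)
printf 'box\nsay\nbuzz\nxyz1\nfix it\n' > "$tmp/test"

# Character set anchored to the end of the line.
with_set=$(grep '[xyz]$' "$tmp/test")

# Three anchored alternatives.
with_alt=$(grep -E 'x$|y$|z$' "$tmp/test")

printf '%s\n---\n%s\n' "$with_set" "$with_alt"
rm -rf "$tmp"
```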