grep, diff, find
grep, diff, find - Oregon State Universityweb.engr.oregonstate.edu/~rubinma/Mines_274/Content/Slides/10... · UNIX> seq 10 | grep 1 1 10 UNIX> UNIX> cat > input.txt 1 haystack 2 haystack

Feb 02, 2018

danglien
grep, diff, find

‣ grep searches for a pattern
–  Outputs line(s) containing pattern to stdout

‣ grep accepts both stdin and command line arg(s)
UNIX> seq 10 | grep 1
1
10
UNIX>

UNIX> cat > input.txt
1 haystack
2 haystack
3 needle
4 haystack
<CTRL-C>
UNIX> grep needle input.txt
3 needle
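The two input modes can be compared side by side. This is a minimal sketch that recreates input.txt in a throwaway directory; the variable names (workdir, from_arg, from_stdin) are illustrative and not part of the slides.

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
printf '1 haystack\n2 haystack\n3 needle\n4 haystack\n' > "$workdir/input.txt"

# Mode 1: the file name is given as a command line argument.
from_arg=$(grep needle "$workdir/input.txt")

# Mode 2: no file argument, so grep reads standard input instead.
from_stdin=$(grep needle < "$workdir/input.txt")

echo "argument: $from_arg"
echo "stdin:    $from_stdin"

rm -rf "$workdir"
```

Both variables should hold the same matching line, since only the way grep receives its input changes.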

‣  Use double quotes for complex strings
UNIX> cat > input2.txt
1,Marc Rubin,98753,Blue
2,Marc Ruben,96242,Green
3,Marc Reuben,97114,Brown
<CTRL-C>
UNIX> cat input2.txt | grep "Marc Rubin"
1,Marc Rubin,98753,Blue
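To see why the quotes matter, the sketch below runs the same search with and without them. Without quotes the shell hands grep two separate words, so grep takes Marc as the pattern and Rubin as a file name. The temporary directory and variable names are assumptions made for the example.

```shell
workdir=$(mktemp -d)
printf '1,Marc Rubin,98753,Blue\n2,Marc Ruben,96242,Green\n3,Marc Reuben,97114,Brown\n' \
    > "$workdir/input2.txt"

# Quoted: grep receives one pattern, "Marc Rubin".
quoted=$(cd "$workdir" && grep "Marc Rubin" input2.txt)

# Unquoted: the pattern is just Marc, and Rubin is treated as a file to search.
# That file does not exist, so grep reports an error (discarded here) and
# exits non-zero; "|| true" keeps the script going.
unquoted=$(cd "$workdir" && grep Marc Rubin input2.txt 2>/dev/null) || true

echo "quoted match:"
printf '%s\n' "$quoted"
echo "unquoted matches:"
printf '%s\n' "$unquoted"

rm -rf "$workdir"
```

The unquoted run matches every line containing Marc, and each line is prefixed with the file name because grep was given more than one file argument.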

‣ Search multiple files at once

UNIX> grep 3 input.txt input2.txt

input.txt:3 needle

input2.txt:1,Marc Rubin,98753,Blue

input2.txt:3,Marc Reuben,97114,Brown

UNIX>

UNIX> seq -352.5983 0.001 73.9531 | grep .653 | wc -l

3952

UNIX> ls -l /u/sa/br | grep qhan
drwx--S--- 8 qhan qhan 4096 Jul 17 14:28 qhan

‣ grep can make use of regular expressions
–  CS theory tangent: what type of machine recognizes regex?

‣ Beyond time and scope of course…

‣ More information:
–  http://www.robelle.com/smugbook/regexpr.html
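A small taste of what regular expressions change: in a grep pattern, "." matches any single character, not only a literal dot. The sketch below uses three made-up sample lines (not from the slides) and counts matches with -c; -F switches grep to fixed-string matching.

```shell
# Three sample lines; only the first contains a literal ".653".
sample=$(printf '12.653\n126653\n0.0653\n')

# As a regular expression, "." matches any single character,
# so all three lines match.
regex_hits=$(printf '%s\n' "$sample" | grep -c '.653')

# -F treats the pattern as a fixed string, so "." must be a real dot.
fixed_hits=$(printf '%s\n' "$sample" | grep -c -F '.653')

# Anchors: ^ is the start of the line, $ is the end; \. is a literal dot.
anchored_hits=$(printf '%s\n' "$sample" | grep -c '^12\.653$')

echo "regex: $regex_hits  fixed: $fixed_hits  anchored: $anchored_hits"
```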

‣ diff compares two files, outputs differences to stdout
–  No differences, no output

UNIX> echo HELLO > f1.txt

UNIX> echo HELLO > f2.txt

UNIX> diff f1.txt f2.txt

UNIX>

UNIX> echo HELLO > f1.txt
UNIX> echo GOODBYE > f2.txt
UNIX> diff f1.txt f2.txt
1c1
< HELLO
---
> GOODBYE
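Besides printing differences, diff reports the result through its exit status: 0 when the files are identical and 1 when they differ. The sketch below captures both cases; same.txt and the variable names are assumptions added for the example.

```shell
workdir=$(mktemp -d)
echo HELLO   > "$workdir/f1.txt"
echo HELLO   > "$workdir/same.txt"
echo GOODBYE > "$workdir/f2.txt"

# Identical files: no output, exit status 0.
diff "$workdir/f1.txt" "$workdir/same.txt" > /dev/null && same_status=0 || same_status=$?

# Different files: exit status 1. The "1c1" header means line 1 of the first
# file must be changed to match line 1 of the second file.
diff_output=$(diff "$workdir/f1.txt" "$workdir/f2.txt") && diff_status=0 || diff_status=$?

echo "identical files -> status $same_status"
echo "different files -> status $diff_status"
printf '%s\n' "$diff_output"

rm -rf "$workdir"
```

The exit status makes diff usable in scripts, for example to check a program's output against an expected file without reading the printed differences.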

‣ -y option splits output into two columns

‣  ‘|’ indicates differences

UNIX> diff -y f1.txt f2.txt

HELLO | GOODBYE

UNIX>

UNIX> cat > input1.txt
first
second
third
<CTRL-C>
UNIX>

UNIX> cat > input2.txt
first
s e c o n d
third
<CTRL-C>
UNIX>

UNIX> diff -y input1.txt input2.txt
first     first
second  | s e c o n d
third     third

UNIX> diff -y input1.txt input2.txt | grep '|'
second  | s e c o n d

‣ -w option ignores whitespace

UNIX> diff -y input1.txt input2.txt | grep '|'

second  | s e c o n d

UNIX> diff -w input1.txt input2.txt

UNIX>
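For contrast, diff also has a -b option, which ignores only changes in the amount of whitespace. The sketch below (variable names are illustrative) shows that -w treats "second" and "s e c o n d" as equal, while -b still reports a difference because one side has whitespace where the other has none.

```shell
workdir=$(mktemp -d)
printf 'first\nsecond\nthird\n'      > "$workdir/input1.txt"
printf 'first\ns e c o n d\nthird\n' > "$workdir/input2.txt"

# -w ignores all whitespace, so the two files compare as equal (status 0).
diff -w "$workdir/input1.txt" "$workdir/input2.txt" > /dev/null && w_status=0 || w_status=$?

# -b ignores only changes in the amount of existing whitespace, so inserting
# spaces between letters still counts as a difference (status 1).
diff -b "$workdir/input1.txt" "$workdir/input2.txt" > /dev/null && b_status=0 || b_status=$?

echo "diff -w status: $w_status"
echo "diff -b status: $b_status"

rm -rf "$workdir"
```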

‣  find recursively searches filesystem for file(s)

‣ Syntax: find where [..] what [..]
–  where := where to start searching

–  what := what to search for

‣ -name option to specify file name
–  “starting in current directory, find photo1.jpg”

UNIX> mkdir temp
UNIX> touch temp/photo1.jpg
UNIX> find . -name photo1.jpg
./temp/photo1.jpg
UNIX>

‣ Often use wildcards (*)
–  “find all .jpg files starting in temp directory”

UNIX> touch temp/photo2.jpg temp/photo3.jpg

UNIX> find temp -name "*.jpg"

temp/photo1.jpg

temp/photo2.jpg

temp/photo3.jpg
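The quotes around "*.jpg" keep the shell from expanding the wildcard before find sees it. The sketch below illustrates the failure mode under one assumed setup: a file named cover.jpg (hypothetical) sits in the directory where the command is run, so the unquoted pattern silently becomes cover.jpg.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/temp"
touch "$workdir/temp/photo1.jpg" "$workdir/temp/photo2.jpg" "$workdir/temp/notes.txt"
touch "$workdir/cover.jpg"   # a .jpg in the directory the command is run from

# Quoted: find receives the literal pattern *.jpg and tests every file name.
quoted=$(cd "$workdir" && find temp -name "*.jpg" | sort)

# Unquoted: the shell expands *.jpg to cover.jpg first, so find searches temp
# for a file named cover.jpg and prints nothing.
unquoted=$(cd "$workdir" && find temp -name *.jpg | sort)

echo "quoted:";   printf '%s\n' "$quoted"
echo "unquoted:"; printf '%s\n' "$unquoted"

rm -rf "$workdir"
```

If the current directory held several .jpg files, the unquoted form would expand to several words and find would typically fail with a syntax error instead.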

‣  -user option to specify file owner
–  “starting in /data/csci274, find all files owned by qhan”

UNIX> find /data/csci274/ -user qhan

/data/csci274/

/data/csci274/Assignments

/data/csci274/Assignments/14_network.tgz

/data/csci274/Assignments/14_network

/data/csci274/Assignments/14_network/file.txt

‣  -size option to specify file size (e.g., -10k, +100M)
–  see man page for more details.

‣  “starting in current directory, find all files larger than 450 MB”
UNIX> wc -c temp/huge.txt
471904256 temp/huge.txt
UNIX> find . -size +450M
./temp/huge.txt
UNIX>
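The + and - prefixes mean "more than" and "less than". The sketch below tries them on two small files with the k suffix, as supported by GNU and BSD find; the file names are made up, and -type f (not covered in the slides) is added so that directories are not reported.

```shell
workdir=$(mktemp -d)

# big.bin is exactly 2048 bytes; tiny.txt is a single byte.
dd if=/dev/zero of="$workdir/big.bin" bs=1024 count=2 2>/dev/null
printf 'x' > "$workdir/tiny.txt"

# +1k: regular files larger than 1 KiB.
larger=$(cd "$workdir" && find . -type f -size +1k)

# -2k: regular files smaller than 2 KiB.
smaller=$(cd "$workdir" && find . -type f -size -2k)

echo "larger than 1k:  $larger"
echo "smaller than 2k: $smaller"

rm -rf "$workdir"
```

Rounding rules for -size differ between implementations, so boundary cases are worth checking against the local man page.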

‣ Often combine search expressions
–  “Starting in /data/csci274/, find all files with:
–  .jpg extension
–  owned by qhan
–  size > 100 kB”

UNIX> find /data/csci274/ -name "*.jpg" -user qhan -size +100k

/data/csci274/Assignments/14_network/image.jpg

‣  by default, multiple search expressions joined by logical AND

‣  -or option used for logical OR
UNIX> find /data/csci274/ -name "driver.cpp" -or -name "dummy.cpp"

/data/csci274/Assignments/6_makefile/fourth/driver.cpp

/data/csci274/Assignments/6_makefile/fourth/dummy.cpp

/data/csci274/Assignments/6_makefile/second/driver.cpp

/data/csci274/Assignments/6_makefile/second/dummy.cpp
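Because the implicit AND binds more tightly than -or, mixing the two usually needs escaped parentheses. The sketch below is an illustration under assumed names: a src directory containing a regular file driver.cpp and a directory that happens to be called dummy.cpp, with -type f (not covered in the slides) restricting matches to regular files.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/src/dummy.cpp"   # a directory that happens to be named dummy.cpp
touch "$workdir/src/driver.cpp"     # a regular file

# Without grouping this reads as:
#   (-type f AND -name driver.cpp) OR (-name dummy.cpp)
# so the directory dummy.cpp is reported as well.
ungrouped=$(cd "$workdir" && find src -type f -name "driver.cpp" -or -name "dummy.cpp" | sort)

# With grouping, -type f applies to both names. The parentheses are escaped
# so the shell passes them through to find.
grouped=$(cd "$workdir" && find src -type f \( -name "driver.cpp" -or -name "dummy.cpp" \) | sort)

echo "ungrouped:"; printf '%s\n' "$ungrouped"
echo "grouped:";   printf '%s\n' "$grouped"

rm -rf "$workdir"
```

The POSIX spelling of -or is -o; GNU and BSD find accept both.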

UNIX> mkdir a a/b a/b/c

UNIX> touch a/f1.zzzz a/b/f2.zzzz a/b/c/f3.zzzz

UNIX> find a -name "*.zzzz"

a/b/c/f3.zzzz

a/b/f2.zzzz

a/f1.zzzz

‣ xargs: build and execute command lines from standard input

UNIX> find a -name "*.zzzz" | xargs rm

UNIX> find a -name "*.zzzz"

UNIX>

‣ Be VERY CAREFUL with xargs rm

UNIX> find / -user poor_guy | xargs rm
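One cautious routine, sketched below under assumed file names, is to run the find on its own first and only then pipe it to rm. Plain xargs splits its input on whitespace, so a name like "my file.zzzz" would be passed to rm as two arguments; the -print0 and -0 options (available in GNU and BSD find and xargs) separate names with null bytes instead.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/a/b"
touch "$workdir/a/f1.zzzz" "$workdir/a/b/my file.zzzz" "$workdir/a/keep.txt"

# Step 1: preview exactly what would be deleted.
preview=$(cd "$workdir" && find a -name "*.zzzz" | sort)
printf 'would delete:\n%s\n' "$preview"

# Step 2: delete, passing names null-separated so spaces are handled safely.
# "--" tells rm that everything after it is a file name, not an option.
(cd "$workdir" && find a -name "*.zzzz" -print0 | xargs -0 rm --)

# Only keep.txt should be left.
remaining=$(cd "$workdir" && find a -type f | sort)
printf 'remaining:\n%s\n' "$remaining"

rm -rf "$workdir"
```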

‣ Use grep, diff, and find to search

‣ http://eecs.mines.edu/Courses/csci274/Assignments/10_searching.html