NCNU Linux User Group 2012, 王惟綸, 2012/07/10.

Jan 05, 2016


NCNU Linux User Group 2012

<Regular Expression>
王惟綸

2012/07/10


Outline

What’s a Regular Expression?
The Purpose
What’s grep?
Various Operators
Extended Regular Expressions
Exercises
References


What’s a Regular Expression?

A regular expression is a pattern that describes a set of strings.

Examples
X[2-7] = {X2, X3, X4, X5, X6, X7}
T[ae]ste? = {Taste, Tast, Teste, Test}
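One way to check such a set is to feed candidate strings to grep, which is introduced later in these slides. The sketch below assumes a POSIX grep with the -E and -x options; the candidate words and the variable names are made up for illustration.

```shell
# Candidate strings, one per line (made up for illustration).
candidates=$(printf '%s\n' Taste Tast Teste Test Toast Tasted)

# -E enables "?"; -x keeps only lines that match the pattern as a whole.
matched=$(printf '%s\n' "$candidates" | grep -E -x 'T[ae]ste?')
printf '%s\n' "$matched"
```

Only the four strings from the set are printed; Toast and Tasted fall outside it.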


The Purpose

Regular expressions are used to process strings. With the aid of special characters, they make searching, replacement, and deletion easy.

T[ae]ste? = {Taste, Tast, Teste, Test}
These four strings, Taste, Tast, Teste, and Test, can all be found by searching for the single pattern “T[ae]ste?”.
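The three uses named above might look like this in practice. This is only a sketch: sed is not covered in these slides and is assumed here as the usual companion tool for replacement and deletion, and the sample text is invented.

```shell
text=$(printf '%s\n' 'Test one' 'plain line' 'Taste two')

# Searching: print only the lines that contain a match.
found=$(printf '%s\n' "$text" | grep 'T[ae]st')

# Replacement: sed (an assumption, not from the slides) rewrites each match.
replaced=$(printf '%s\n' "$text" | sed 's/T[ae]st/X/')

# Deletion: sed drops every line that contains a match.
deleted=$(printf '%s\n' "$text" | sed '/T[ae]st/d')

printf '%s\n' "$found" "$replaced" "$deleted"
```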


What’s grep?
global regular expression print

The grep command searches for the pattern specified by the Pattern parameter and writes each matching line to standard output.

[-i] : ignore the difference between upper and lower case
[-v] : invert the match, printing the lines that do not match
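The effect of the two options can be seen on a three-line input. This is a minimal sketch with invented sample lines; it reads from a pipe rather than a file so that it is self-contained.

```shell
lines=$(printf '%s\n' 'Linux' 'linux' 'BSD')

# No option: the match is case-sensitive.
plain=$(printf '%s\n' "$lines" | grep 'linux')

# -i: ignore the difference between upper and lower case.
ignore_case=$(printf '%s\n' "$lines" | grep -i 'linux')

# -v: invert the match, printing the lines that do NOT match.
inverted=$(printf '%s\n' "$lines" | grep -v 'linux')

printf '%s\n' "$plain" "$ignore_case" "$inverted"
```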


alias & unalias


Various Operators

1. [ ]   represents any one of the characters inside the brackets.
2. [ - ] represents any one character within the code range.
3. [^ ]  represents any one character that is not in the list.
4. ^     Matches the empty string at the beginning of a line.
5. $     Matches the empty string at the end of a line.
6. .     Matches any single character.
7. *     The preceding item will be matched zero or more times.


1. [ ] represents any one of the characters inside the brackets.

th[ei] = {the, thi}
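A quick check of th[ei] against a few invented words. Without -x, grep prints every line that contains a match anywhere, so "this" is printed because it contains "thi".

```shell
words=$(printf '%s\n' 'the' 'this' 'that' 'tho')

# [ei] stands for exactly one character: either "e" or "i".
bracket=$(printf '%s\n' "$words" | grep 'th[ei]')
printf '%s\n' "$bracket"
```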


2. [ - ] represents any one character within the code range.

LANG=C : 0 1 2 3 4 ... A B C D ... Z a b c d ... z
LANG=zh_TW.Big5 : 0 1 2 3 4 ... a A b B c C d D ... z Z
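Because the meaning of a range depends on the locale's character order, a sketch like the following pins the C locale before using one. It sets LC_ALL rather than LANG, on the assumption that LC_ALL may already be set in the environment and would otherwise take precedence; the sample characters are invented.

```shell
chars=$(printf '%s\n' 'a' 'B' '7')

# In the C locale the order is 0-9, then A-Z, then a-z,
# so [A-Z] covers upper-case letters only.
upper=$(printf '%s\n' "$chars" | LC_ALL=C grep '[A-Z]')

# [0-9] covers the digits.
digit=$(printf '%s\n' "$chars" | LC_ALL=C grep '[0-9]')

printf '%s\n' "$upper" "$digit"
```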



2. [ - ] represents any one character within the code range.

Symbol      Meaning
[:alnum:]   Upper- and lower-case English letters and digits, i.e. 0-9, A-Z, a-z
[:alpha:]   Any upper- or lower-case English letter, i.e. A-Z, a-z
[:blank:]   The space and [Tab] characters
[:cntrl:]   The control keys on the keyboard, including CR, LF, Tab, Del, and so on
[:digit:]   Digits only, i.e. 0-9
[:graph:]   All characters other than whitespace (space and [Tab])
[:lower:]   Lower-case letters, i.e. a-z
[:print:]   Any character that can be printed
[:punct:]   Punctuation symbols, i.e. " ' ? ! ; : # $ ...
[:upper:]   Upper-case letters, i.e. A-Z
[:space:]   Any character that produces whitespace, including space, [Tab], CR, and so on
[:xdigit:]  Hexadecimal digits, i.e. 0-9, A-F, a-f
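A class name from the table is used inside a bracket expression, which gives the doubled brackets seen below. The sketch assumes a POSIX grep and uses invented tokens; -x restricts the output to lines made up entirely of the class.

```shell
tokens=$(printf '%s\n' '2012' 'LUG' 'lug' 'ff00')

# A class name is written inside a bracket expression: [[:digit:]].
digits=$(printf '%s\n' "$tokens" | grep -x '[[:digit:]]*')
uppers=$(printf '%s\n' "$tokens" | grep -x '[[:upper:]]*')
hexes=$(printf '%s\n' "$tokens" | grep -x '[[:xdigit:]]*')

printf '%s\n' "$digits" "$uppers" "$hexes"
```

Unlike a range such as [A-Z], these classes do not depend on the ordering shown for LANG=C and LANG=zh_TW.Big5.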


3. [^ ] represents any one character that is not in the list.
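A small sketch of a negated list, with invented words. Note that [^o] still has to match one character, so a line ending in "go" does not qualify.

```shell
words=$(printf '%s\n' 'google' 'goggle' 'go')

# "go" followed by any one character other than "o".
not_o=$(printf '%s\n' "$words" | grep 'go[^o]')
printf '%s\n' "$not_o"
```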


4. ^ Matches the empty string at the beginning of a line.
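For instance, anchoring "the" to the start of the line might look like this (sample lines invented):

```shell
lines=$(printf '%s\n' 'the start' 'at the end')

# ^the matches "the" only when it begins the line.
at_start=$(printf '%s\n' "$lines" | grep '^the')
printf '%s\n' "$at_start"
```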


5. $ Matches the empty string at the end of a line.
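The end-of-line anchor works the same way from the other side. In this sketch the pattern is kept in single quotes so the shell does not treat "$" specially; the sample lines are invented.

```shell
lines=$(printf '%s\n' 'the end' 'end of line')

# end$ matches "end" only when it finishes the line.
at_end=$(printf '%s\n' "$lines" | grep 'end$')
printf '%s\n' "$at_end"
```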


6. . Matches any single character.
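Each dot consumes exactly one character, which the following sketch illustrates with invented words and -x to match whole lines:

```shell
words=$(printf '%s\n' 'good' 'gold' 'gd' 'glad')

# g..d: "g", then exactly two characters of any kind, then "d".
dots=$(printf '%s\n' "$words" | grep -x 'g..d')
printf '%s\n' "$dots"
```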


7. * The preceding item will be matched zero or more times.

go* = {g, go, goo, gooo, …}

goo* = {go, goo, gooo, …}
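The two sets above can be compared directly. This sketch uses -x so that each line must match the pattern as a whole; the word list is a small invented sample of the infinite sets.

```shell
words=$(printf '%s\n' 'g' 'go' 'goo' 'gooo')

# go*: "g" followed by zero or more "o".
star=$(printf '%s\n' "$words" | grep -x 'go*')

# goo*: "go" followed by zero or more "o", so a bare "g" is excluded.
star_after_go=$(printf '%s\n' "$words" | grep -x 'goo*')

printf '%s\n' "$star" "$star_after_go"
```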


Extended Regular Expressions

In basic regular expressions the metacharacters "?", "+", "{", "|", "(", and ")" lose their special meaning; instead use the backslashed versions "\?", "\+", "\{", "\|", "\(", and "\)".

Use grep -E or egrep instead of grep.

1. +     The preceding item will be matched one or more times.
2. ?     The preceding item will be matched zero or one time.
3. |     matches either the preceding item or the following item.
4. ( )   groups items into a single unit.
5. {N}   The preceding item is matched exactly N times.
6. {N,}  The preceding item is matched N or more times.
7. {N,M} The preceding item is matched at least N times, but not more than M times.
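The difference between the basic and extended syntax can be sketched with the interval operator, which behaves the same way in any POSIX grep. The third sample line is a literal string chosen to show what plain grep does with unescaped braces.

```shell
words=$(printf '%s\n' 'go' 'goo' 'go{2}')

# Basic syntax (plain grep): the interval needs backslashes.
basic=$(printf '%s\n' "$words" | grep -x 'go\{2\}')

# Extended syntax (grep -E): the same interval without backslashes.
extended=$(printf '%s\n' "$words" | grep -E -x 'go{2}')

# Plain grep without backslashes treats "{" and "}" as literal characters.
literal=$(printf '%s\n' "$words" | grep -x 'go{2}')

printf '%s\n' "$basic" "$extended" "$literal"
```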


1. + The preceding item will be matched one or more times.

goo+ = {goo, gooo, goooo, …}
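A sketch of goo+ with grep -E and a few invented words; "go" is left out because the final "o" must appear at least once.

```shell
words=$(printf '%s\n' 'go' 'goo' 'gooo')

# goo+: "go" followed by one or more "o".
plus=$(printf '%s\n' "$words" | grep -E -x 'goo+')
printf '%s\n' "$plus"
```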


2. ? The preceding item will be matched zero or one time.

goog? = {goog, goo}
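The optional final "g" can be checked as follows, again with grep -E, -x, and invented words:

```shell
words=$(printf '%s\n' 'goo' 'goog' 'googg')

# goog?: "goo" followed by at most one "g".
optional=$(printf '%s\n' "$words" | grep -E -x 'goog?')
printf '%s\n' "$optional"
```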


3. | matches either the preceding item or the following item.

goo|fav = {goo, fav}
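Alternation in a minimal sketch, with one invented word that matches neither side:

```shell
words=$(printf '%s\n' 'goo' 'fav' 'gov')

# goo|fav: either the whole string "goo" or the whole string "fav".
either=$(printf '%s\n' "$words" | grep -E -x 'goo|fav')
printf '%s\n' "$either"
```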


4. ( ) groups items into a single unit.

f(oo|ee)d = {food, feed}
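Grouping limits how far "|" reaches. The sketch below contrasts the grouped pattern with an ungrouped one on three invented words; the ungrouped pattern foo|eed is added here only for comparison and is not from the slides.

```shell
words=$(printf '%s\n' 'food' 'feed' 'foo')

# With parentheses, the choice is only between "oo" and "ee".
grouped=$(printf '%s\n' "$words" | grep -E -x 'f(oo|ee)d')

# Without them, "|" splits the whole pattern into "foo" or "eed".
ungrouped=$(printf '%s\n' "$words" | grep -E -x 'foo|eed')

printf '%s\n' "$grouped" "$ungrouped"
```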


5. {N} The preceding item is matched exactly N times.

go\{2\} = {goo}

go\{5\} = {gooooo}
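The two patterns above are written in the backslashed form, so this sketch runs them with plain grep. The word list is invented, and -x makes each line match as a whole.

```shell
words=$(printf '%s\n' 'go' 'goo' 'gooo' 'gooooo')

# go\{2\}: "g" followed by exactly two "o".
exactly_two=$(printf '%s\n' "$words" | grep -x 'go\{2\}')

# go\{5\}: "g" followed by exactly five "o".
exactly_five=$(printf '%s\n' "$words" | grep -x 'go\{5\}')

printf '%s\n' "$exactly_two" "$exactly_five"
```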


6. {N,} The preceding item is matched N or more times.
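The slide gives no example for this form, so the pattern go{2,} below is an illustration chosen to parallel the other slides, run with grep -E on invented words.

```shell
words=$(printf '%s\n' 'go' 'goo' 'gooo')

# go{2,}: "g" followed by two or more "o".
at_least_two=$(printf '%s\n' "$words" | grep -E -x 'go{2,}')
printf '%s\n' "$at_least_two"
```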


7. {N,M} The preceding item is matched at least N times, but not more than M times.

go\{2,5\}g = {goog, gooog, goooog, gooooog}
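Both ends of the interval can be probed with words just inside and just outside the bounds. The sketch uses plain grep because the pattern is in backslashed form; the words have one, two, five, and six "o" characters respectively.

```shell
words=$(printf '%s\n' 'gog' 'goog' 'gooooog' 'goooooog')

# go\{2,5\}g: "g", then two to five "o", then "g".
bounded=$(printf '%s\n' "$words" | grep -x 'go\{2,5\}g')
printf '%s\n' "$bounded"
```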


Exercises

1. What does grep -n '^[^A-z]' mean?
2. What does grep -n 'g.*g' mean?
3. What does egrep '(d(a|u)|cc?)d' mean?
4. How do you print only the non-empty lines?
5. How do you find the string “[LUG\2012]”?
6. Find all files under /etc that contain the symbol “*”, along with the matching contents.


References

http://linux.vbird.org/linux_basic/0330regularex.php

http://tldp.org/LDP/Bash-Beginners-Guide/html/chap_04.html

http://en.wikipedia.org/wiki/Regular_expression

http://www.regular-expressions.info/posix.html