Todd Benson RegEx 101

Jun 30, 2015

A basic introduction to regular expressions and some ways they can be used.
Transcript
Page 1: Regex 101

Todd Benson

RegEx 101

Page 2: Regex 101

Overview

• What is RegEx
• RegEx Basics
• Uses for RegEx
• Useful RegExpressions

Page 3: Regex 101

What is RegEx?

“In computing, a regular expression (abbreviated regex or regexp) is a sequence of characters that forms a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations.” - Wikipedia

Page 4: Regex 101

• “Some people, when confronted with a problem, think ‘I know, I'll use regular expressions.’ Now they have two problems.” - Jamie Zawinski

Page 5: Regex 101

Why RegEx?

• Tools use it: Nessus, Burp, W3AF
• Most programming languages support it
• Excellent tool to have in the toolbox

Page 6: Regex 101

RegEx Basics: Literal Matches

Literal Matches
‘bat’ matches ‘bat’

12 special characters: \ ^ $ . | ? * + ( ) [ ]
To match one of these literally, escape it with a backslash: ‘\\’ ‘\$’

. (dot) matches any single character
‘.at’ matches ‘bat’, ‘cat’, and ‘hat’
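
The literal, escaping, and dot rules above can be tried out in a few lines. This is a minimal sketch using Python's standard `re` module; the sample strings are made up for illustration.

```python
import re

# A literal pattern matches itself anywhere in the text.
literal_hit = re.search(r"bat", "wombat") is not None

# Special characters must be escaped to be matched literally.
price_hit = re.search(r"\$5\.00", "Total: $5.00") is not None

# re.escape() adds the backslashes for you.
escaped = re.escape("1+1=2?")

# The dot matches any single character (except a newline by default).
dot_hits = re.findall(r".at", "bat cat hat")

print(literal_hit, price_hit, escaped, dot_hits)
```

`re.escape` is handy when the text to search for comes from user input and may contain special characters.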

Page 7: Regex 101

RegEx Basics: Character Classes

Character Classes
• [ ]  Matches any one of the listed characters
‘[bc]at’ will match ‘bat’ or ‘cat’
• [^ ]  Matches any one character that is not listed
‘[^A-Z]’ will match any character that is not a capital letter
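
A short sketch of both kinds of character class, again with Python's `re` module and made-up sample strings:

```python
import re

# [bc] matches exactly one character: either 'b' or 'c'.
bc_words = re.findall(r"[bc]at", "bat cat hat")

# [^A-Z] matches any single character that is not a capital letter.
non_upper = re.findall(r"[^A-Z]", "AbCd1")

print(bc_words)   # ['bat', 'cat']
print(non_upper)  # ['b', 'd', '1']
```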

Page 8: Regex 101

RegEx Basics: Shorthand Character Classes

Shorthand Character Classes
• \d  Same as [0-9]
• \D  Same as [^0-9]
• \w  Same as [0-9A-Za-z_]
• \W  Same as [^0-9A-Za-z_]
• \s  Tab, line feed, form feed, carriage return, and space
• \S  Anything other than tab, line feed, etc.
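
The shorthand classes can be sketched as follows. One assumption to note: in Python 3, `\d`, `\w`, and `\s` also match non-ASCII digits, letters, and whitespace for text strings, so the `re.ASCII` flag is used here to get the ASCII-only behaviour listed on the slide. The sample string is made up.

```python
import re

sample = "id_7 costs $42\tnow"

# re.ASCII restricts \d, \w and \s to their ASCII definitions.
digits = re.findall(r"\d+", sample, flags=re.ASCII)
words = re.findall(r"\w+", sample, flags=re.ASCII)
spaces = re.findall(r"\s", sample, flags=re.ASCII)
non_word = re.findall(r"\W", sample, flags=re.ASCII)

print(digits)    # ['7', '42']
print(words)     # ['id_7', 'costs', '42', 'now']
print(spaces)    # [' ', ' ', '\t']
print(non_word)  # [' ', ' ', '$', '\t']
```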

Page 9: Regex 101

RegEx Basics: Anchors

Anchors
• ^  Beginning of line
‘rpm -qa | grep ^ao’ would list all packages that start with ‘ao’

• $  End of line
‘[0-9][0-9][0-9]$’ would find every line that ends with 3 consecutive digits

• \b  Word boundary
‘\bW.n\w*\b’ looks for words that begin with ‘W’, followed by any character, followed by ‘n’, followed by zero or more word characters
‘Win’ ‘Windows’ ‘Won’ ‘Wonton’ ‘Winter’ ‘Wonderland’ ‘Wonder’ all match
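
The three anchors can be sketched in Python as below. The package names and lines are hypothetical sample data. The word-boundary pattern is written as `\bW.n\w*\b`, an assumption chosen to match the description on the slide (‘W’, any character, ‘n’, then zero or more further word characters).

```python
import re

# ^ anchors the match to the start of the string (hypothetical package list).
packages = ["aom-3.6", "bash-5.2", "chrony-4.3", "aopalliance-1.0"]
starts_with_ao = [p for p in packages if re.search(r"^ao", p)]

# $ anchors the match to the end of the string.
lines = ["order 123", "order 12", "abc4567"]
ends_with_3_digits = [s for s in lines if re.search(r"[0-9][0-9][0-9]$", s)]

# \b matches at the edge of a word, so only whole words are returned.
text = "Win Windows Won Wonton Winter Wonderland Wonder Water"
w_words = re.findall(r"\bW.n\w*\b", text)

print(starts_with_ao)
print(ends_with_3_digits)
print(w_words)
```

‘Water’ is left out of the result because its third character is not ‘n’.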

Page 10: Regex 101

RegEx Basics: Non-Printable

Non-printable
• \n  New line
• \r  Carriage return

Page 11: Regex 101

RegEx Basics: Groups

Groups
• ( )  Defines the scope and precedence of operators
‘Write(ln)?’ matches ‘Write’ and ‘Writeln’

• |  OR
‘Gr(a|e)y’ matches ‘Gray’ and ‘Grey’
‘(ITSO|OITS)’ matches ‘ITSO’ or ‘OITS’
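
A small sketch of grouping and alternation in Python. The candidate word lists are made up; `fullmatch` is used so that a word only counts when the whole word matches.

```python
import re

write_re = re.compile(r"Write(ln)?")
grey_re = re.compile(r"Gr(a|e)y")

# The group makes 'ln' optional as a unit.
write_hits = [w for w in ["Write", "Writeln", "Writel"] if write_re.fullmatch(w)]

# The group limits the alternation to the single letter in the middle.
grey_hits = [w for w in ["Gray", "Grey", "Groy"] if grey_re.fullmatch(w)]

# A group also captures what it matched.
captured = grey_re.fullmatch("Grey").group(1)

print(write_hits, grey_hits, captured)
```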

Page 12: Regex 101

RegEx Basics: Quantification

Quantification
Shows how often a token or group is allowed to occur

• ?  Zero or one
‘a?’ will match ‘’ and ‘a’

• *  Zero or more
‘a*’ will match ‘’ and ‘a’ and ‘aaaaaaaaa’

Page 13: Regex 101

RegEx Basics: Quantification (Cont.)

Quantification
Shows how often a token or group is allowed to occur

• +  One or more
‘a+’ will match ‘a’ and ‘aaaaaaaaaaaa’

• { , }  Minimum and maximum
‘a{3,7}’ will match between 3 and 7 consecutive ‘a’ characters
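
The four quantifiers from these two slides can be checked side by side. This sketch uses a small hypothetical helper, `fully_matches`, which reports whether the whole string matches the pattern.

```python
import re

def fully_matches(pattern, text):
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

checks = [
    ("a?", ""), ("a?", "a"), ("a?", "aa"),
    ("a*", ""), ("a*", "aaaaaaaaa"),
    ("a+", ""), ("a+", "aaaaaaaaaaaa"),
    ("a{3,7}", "aa"), ("a{3,7}", "aaa"),
    ("a{3,7}", "aaaaaaa"), ("a{3,7}", "aaaaaaaa"),
]
results = {(p, t): fully_matches(p, t) for p, t in checks}

for (pattern, text), ok in results.items():
    print(f"{pattern!r} on {text!r}: {ok}")
```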

Page 14: Regex 101

Uses: Searches

• Errors
(error|exception|illegal|invalid|fail|stack|access|directory|file|not found|unknown|uid=|varchar|SQL|quotation mark|syntax|password)

• Redirects
(document|window)\.
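
As a rough sketch of how the error pattern might be applied, the snippet below scans a few response bodies and records which keywords appear. The paths and bodies are invented sample data, and case-insensitive matching is an assumption, not something the slide specifies.

```python
import re

ERROR_RE = re.compile(
    r"(error|exception|illegal|invalid|fail|stack|access|directory|file|"
    r"not found|unknown|uid=|varchar|SQL|quotation mark|syntax|password)",
    re.IGNORECASE,  # assumption: error text may use any capitalisation
)

# Hypothetical responses keyed by request path.
responses = {
    "/login": "Welcome back",
    "/item?id='": "You have an error in your SQL syntax",
    "/img": "404 Not Found",
}

flagged = {
    path: sorted({m.group(0).lower() for m in ERROR_RE.finditer(body)})
    for path, body in responses.items()
    if ERROR_RE.search(body)
}

print(flagged)
```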

Page 15: Regex 101

Uses: Searches (Cont.)

• DOM XSS
((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)|((replace|assign|navigate|getResponseHeader|open(Dialog)?|showModalDialog|eval|evaluate|execCommand|execScript|setTimeout|setInterval)\s*["'\]]*\s*\()

• DOM XSS
(location\s*[\[.])|([.\[]\s*["']?\s*(arguments|dialogArguments|innerHTML|write(ln)?|open(Dialog)?|showModalDialog|cookie|URL|documentURI|baseURI|referrer|name|opener|parent|top|content|self|frames)\W)|(localStorage|sessionStorage|Database)
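
One way to use a pattern like these is to scan JavaScript source line by line and list the lines worth a manual look. The sketch below uses the first DOM XSS pattern from this slide. The JavaScript lines are hypothetical, and a hit only means "review this line", not that it is vulnerable.

```python
import re

# First DOM XSS pattern from the slide, split over two raw strings.
SINK_RE = re.compile(
    r"""((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)"""
    r"""|((replace|assign|navigate|getResponseHeader|open(Dialog)?|showModalDialog|eval|evaluate|execCommand|execScript|setTimeout|setInterval)\s*["'\]]*\s*\()"""
)

# Hypothetical JavaScript source lines.
js_lines = [
    "var total = a + b;",
    "img.src = userInput;",
    "eval(payload);",
    'console.log("done");',
]

# Collect (line number, matched text) for every line that hits the pattern.
flagged = [
    (number, match.group(0))
    for number, line in enumerate(js_lines, start=1)
    for match in [SINK_RE.search(line)]
    if match
]

print(flagged)
```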

Page 16: Regex 101

Uses: Searching Logs

• grep -v '156\.132\.142\.1[1-9]\b' /var/log/apache2/other_vhosts_access.log | grep -v '156\.132\.103\.'
‘1[1-9]’ matches 11 through 19; the dots are escaped so they match only a literal ‘.’

• cat /var/log/apache2/other_vhosts_access.log|grep -o '\s[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\s' | sort -t . -k 3,3n -k 4,4n|uniq
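
The extract, sort, and de-duplicate pipeline can also be sketched in Python. The log lines below are hypothetical and assume a "vhost:port client-ip ..." layout. Unlike the `sort -k 3,3n -k 4,4n` call, this sketch sorts on all four octets.

```python
import re

# Hypothetical access-log lines; the client IP is surrounded by whitespace.
log_lines = [
    'example.org:80 192.0.2.20 - - [30/Jun/2015:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    'example.org:80 198.51.100.3 - - [30/Jun/2015:10:00:01 +0000] "GET /a HTTP/1.1" 200 128',
    'example.org:80 192.0.2.7 - - [30/Jun/2015:10:00:02 +0000] "GET /b HTTP/1.1" 404 64',
    'example.org:80 192.0.2.20 - - [30/Jun/2015:10:00:03 +0000] "GET /c HTTP/1.1" 200 256',
]

# Same shape as the grep pattern: four dot-separated groups of 1 to 3 digits.
IP_RE = re.compile(r"\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\s")

def unique_client_ips(lines):
    """Return each distinct IP once, sorted numerically octet by octet."""
    found = {m.group(1) for line in lines for m in IP_RE.finditer(line)}
    return sorted(found, key=lambda ip: tuple(int(part) for part in ip.split(".")))

ips = unique_client_ips(log_lines)
print(ips)
```

Note that this pattern, like the grep version, does not check that each octet is at most 255.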

Page 17: Regex 101

Uses: VI Search and Replace

• SS#
:%s/\d\{3}-\d\{2}-\d\{4}/123-45-6789/g

• email
:%s/[0-9A-Za-z._%+-]\+@[0-9A-Za-z._%+-]\+\.[A-Za-z]\{2,4}/[email protected]/g
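
The same two substitutions can be sketched outside the editor with Python's `re.sub`. The replacement address `user@example.com` and the sample text are placeholders chosen for this example, not values from the slide.

```python
import re

SSN_RE = re.compile(r"\d{3}-\d{2}-\d{4}")
EMAIL_RE = re.compile(r"[0-9A-Za-z._%+-]+@[0-9A-Za-z._%+-]+\.[A-Za-z]{2,4}")

def scrub(text):
    """Replace every SSN-shaped and email-shaped string with a fixed dummy value."""
    text = SSN_RE.sub("123-45-6789", text)
    return EMAIL_RE.sub("user@example.com", text)  # placeholder address

scrubbed = scrub("SSN 078-05-1120, mail bob.smith@corp.example.org")
print(scrubbed)
```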

Page 18: Regex 101

Uses: Command Line

openssl ciphers | sed 's/:/\n/g' | sort

Page 19: Regex 101

Uses: Output Mangling

while read -r line; do host "$line"; done < ips.txt | sed 's/ has address / \/ /g' > foo.txt
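
The `sed` step of this pipeline amounts to a single substitution per line. A minimal Python sketch, using made-up lines in the style of `host` output:

```python
import re

# Hypothetical lines in the style of `host` output.
host_output = [
    "example.org has address 192.0.2.10",
    "example.net has address 198.51.100.25",
]

# Replace the fixed phrase with a slash, as the sed expression does.
rewritten = [re.sub(r" has address ", " / ", line) for line in host_output]

print("\n".join(rewritten))
```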

Page 20: Regex 101

Uses: Programming

• Sanitizing input
$name = preg_replace("/\<\s*?\/?script\s*?>/i", "&lt;script&gt;", $name);
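
For comparison, here is a rough Python equivalent of the `preg_replace` call, with the function name `neutralize_script_tags` invented for this sketch. It also shows one limit of the approach: a tag with attributes slips through, which is why escaping all markup (for example with the standard library's `html.escape`) is usually the safer choice.

```python
import html
import re

# Same pattern as the PHP example: an opening or closing bare <script> tag.
SCRIPT_TAG_RE = re.compile(r"<\s*?/?script\s*?>", re.IGNORECASE)

def neutralize_script_tags(name):
    """Replace bare <script> and </script> tags, as the PHP snippet does."""
    return SCRIPT_TAG_RE.sub("&lt;script&gt;", name)

plain = neutralize_script_tags("<SCRIPT>alert(1)</script>")
missed = neutralize_script_tags("<script src=x>")   # not matched by the pattern
escaped = html.escape("<script src=x>")             # escapes every < and >

print(plain)
print(missed)
print(escaped)
```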

Page 21: Regex 101

Useful RegExes

• SS#
\d{3}-\d{2}-\d{4}

• Phone#
(\(?\d{3}\)?[ .-])?\d{3}[ .-]\d{4}

• IP Addresses
\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b

• email (use with case-insensitive matching)
[0-9A-Z._%+-]+@[0-9A-Z._%+-]+\.[A-Z]{2,4}

• Find Base64
(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?

• Credit Card#
• HTML Tags
• Dates
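
A few of these patterns can be exercised with `fullmatch`, which only succeeds when the whole string matches. In this sketch the IP pattern is written without a space after `{3}` and the phone separator class is written as `[ .-]`; inside brackets, ` -.` is read as a range from space to ‘.’, which also admits characters such as ‘+’. The helper names and sample values are made up.

```python
import base64
import re

IP_RE = re.compile(
    r"\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
)
BASE64_RE = re.compile(r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?")
PHONE_RE = re.compile(r"(\(?\d{3}\)?[ .-])?\d{3}[ .-]\d{4}")
RANGE_SEP_RE = re.compile(r"\d{3}[ -.]\d{4}")  # ' -.' is a range here

def is_ip(text):
    return IP_RE.fullmatch(text) is not None

def is_base64(text):
    return BASE64_RE.fullmatch(text) is not None

def is_phone(text):
    return PHONE_RE.fullmatch(text) is not None

print(is_ip("192.168.1.254"), is_ip("256.1.1.1"))
print(is_base64("aGVsbG8="), base64.b64decode("aGVsbG8="))
print(is_phone("(555) 867-5309"), is_phone("867+5309"))
print(RANGE_SEP_RE.fullmatch("867+5309") is not None)
```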

Page 22: Regex 101

Questions?

Page 23: Regex 101

Go forth and RegEx…

Page 24: Regex 101

References

• Web Application Hacker's Handbook
• http://regex.info/blog/2006-09-15/247#comment-3085
• http://en.wikipedia.org/wiki/Regular_expression
• https://isc.sans.edu/regex.html
• http://www.regular-expressions.info/examples.html
• http://blog.spiderlabs.com/2013/02/easy-dom-based-xss-detection-via-regexes.html
• www.xkcd.com