
String Matching dengan Regular Expressioninformatika.stei.itb.ac.id/~rinaldi.munir/Stmik/2017-2018/String-Matching-dengan-Regex...String Matching dengan Regular Expression Masayu Leylia

Mar 08, 2020


String Matching with Regular Expressions

Masayu Leylia Khodra

References: Chapter 2 of An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, by Daniel Jurafsky and James H. Martin

15-211 Fundamental Data Structures and Algorithms, by Ananda Gunawardena


String Matching: Definition

• Given:

1. T: the text, a (long) string of n characters

2. P: the pattern, a string of m characters (assume m <<< n) to be searched for in the text.

Find (locate) where in the text the pattern occurs.
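The definition above can be illustrated with a minimal brute-force sketch. The class name `NaiveMatcher` and the method `find` are hypothetical names chosen for this illustration; the slides do not prescribe a particular algorithm here. The code tries every start position in T and compares P character by character.

```java
final class NaiveMatcher {
    /** Returns the index of the first occurrence of pattern in text, or -1 if there is none. */
    static int find(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        // Try each candidate start position i in T where P could still fit.
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            // Advance j while the characters of T and P agree.
            while (j < m && text.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i; // all m characters matched
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "a pattern matching algorithm";
        System.out.println(find(text, "rithm"));
        System.out.println(find(text, "regex"));
    }
}
```

In the worst case this sketch performs on the order of n·m character comparisons, which is the cost that automaton-based matching tries to avoid.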


Example 1: Exact Matching


Example 2: Regex Matching


Common Regex Notation


Regex Notation


Example 3: Regex for Email

Determine the regex that matches all of the highlighted email addresses.
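The highlighted addresses from the slide are not reproduced in this transcript, so the following is only a sketch of one possible answer. It assumes a deliberately simplified address shape (a local part, `@`, then dot-separated domain labels) and is not a full validator. The class name `EmailFinder` and the sample text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EmailFinder {
    // Assumed, simplified pattern: local part, '@', one label, then one or more ".label" groups.
    private static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+");

    /** Collects every substring of text that matches the simplified email pattern. */
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("Contact alice@example.com or bob.smith@mail.example.org today."));
    }
}
```

Requiring at least one `.label` group after the first domain label keeps a sentence-final period out of the match, since a trailing `.` with nothing after it cannot complete the group.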


Example 4: Regex for Phone Number


Regular Expressions and Automata

Basic Regular Expression Patterns

• The use of the brackets [] to specify a disjunction of characters.

• The use of the brackets [] plus the dash - to specify a range.
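As a small illustration of bracket disjunctions and ranges, the sketch below uses Java's `java.util.regex` package. The helper `firstMatch`, the class name, and the sample strings are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BracketDemo {
    /** Returns the first substring of text matched by regex, or null if nothing matches. */
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Disjunction: either w or W in the first position.
        System.out.println(firstMatch("[wW]oodchuck", "Woodchuck"));
        // Range: any single digit.
        System.out.println(firstMatch("[0-9]", "Chapter 1: Down the Rabbit Hole"));
        // Range: any single uppercase letter.
        System.out.println(firstMatch("[A-Z]", "we should call it 'Drenched Blossoms'"));
    }
}
```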



Basic Regular Expression Patterns

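• The caret ^ negates a bracket expression when it is the first symbol inside [] (for example, [^a] matches any single character except a); anywhere else inside the brackets it simply means a literal ^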

• The question-mark ? marks optionality of the previous expression.

• The period . matches any single character
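The three operators above can be tried out with a short sketch. The helper `contains`, which reports whether a regex matches anywhere in a string, the class name, and the sample strings are assumptions made for this illustration.

```java
import java.util.regex.Pattern;

final class OperatorDemo {
    /** Returns true if regex matches some substring of text. */
    static boolean contains(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // Caret first inside brackets: any character that is NOT an uppercase letter.
        System.out.println(contains("[^A-Z]", "ABc"));
        // Caret elsewhere inside brackets: a literal ^ (or an e).
        System.out.println(contains("[e^]", "2^10"));
        // Question mark: the preceding u is optional.
        System.out.println(contains("colou?r", "color"));
        // Period: exactly one arbitrary character between "beg" and "n".
        System.out.println(contains("beg.n", "begun"));
    }
}
```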


Finite State Machines (FSM)

• An FSM is a computing machine that

– Takes a string as input

– Outputs a YES/NO answer

• That is, the machine “accepts” or “rejects” the string

[Figure: Input String → FSM → Yes / No]

Reference: Gunawardena, 2006


FSM Model

• Input to an FSM

– Strings built from a fixed alphabet {a,b,c}

– Possible inputs: aa, aabbcc, a, etc.

• The machine

– A directed graph

  • Nodes = states of the machine

  • Edges = transitions from one state to another

• Special states

– Start (q0) and final (or accepting) (q2)

• Assume the alphabet is {a,b}

– Which strings are accepted by this FSM?

Reference: Gunawardena, 2006
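The FSM model can be sketched as a transition table that is walked one input symbol at a time. The slide's figure is not available in this transcript, so the machine below is a hypothetical one, not the FSM the question refers to: it uses the alphabet {a,b}, start state q0 and accepting state q2, and its assumed transitions make it accept exactly the strings that end in "ab".

```java
final class SimpleFsm {
    // Hypothetical transition table (NOT the slide's figure).
    // Rows are states q0..q2, columns are the symbols a and b.
    private static final int[][] DELTA = {
        {1, 0},  // q0: a -> q1, b -> q0
        {1, 2},  // q1: a -> q1, b -> q2
        {1, 0},  // q2: a -> q1, b -> q0
    };
    private static final int START = 0;      // q0
    private static final int ACCEPTING = 2;  // q2

    /** Returns true (YES) if the machine ends in the accepting state, false (NO) otherwise. */
    static boolean accepts(String input) {
        int state = START;
        for (char c : input.toCharArray()) {
            if (c != 'a' && c != 'b') {
                return false; // symbol outside the alphabet {a,b}
            }
            state = DELTA[state][c - 'a']; // follow the edge labelled c
        }
        return state == ACCEPTING;
    }

    public static void main(String[] args) {
        System.out.println(accepts("aab"));
        System.out.println(accepts("aba"));
    }
}
```

Each state is a node of the directed graph and each table entry is an edge, so the loop simply follows one edge per input symbol.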


FSM for String Matching

• Alphabet {a,b,c}

• Pattern “aabc”

• String: aaaaaaaaaaaabcddddddddddddddd

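[Figure: FSM with states 0 (start) through 4 (accepting). Forward edges: 0 →a→ 1 →a→ 2 →b→ 3 →c→ 4. Remaining edge labels in the figure: b|c, b|c, c, a, b, a.]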

Reference: Gunawardena, 2006
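A matcher for the pattern "aabc" can be sketched as a table-driven FSM. The forward transitions follow the pattern itself. The fallback transitions below are derived from the pattern (after a mismatch, move to the longest prefix of "aabc" that is still a suffix of the input read so far) and should be checked against the original figure. Sending symbols outside {a,b,c} back to state 0 is an assumption, added because the sample string contains the letter d. The class name `PatternFsm` is hypothetical.

```java
final class PatternFsm {
    // Transition table for the pattern "aabc".
    // Rows are states 0..3, columns are the symbols a, b, c.
    private static final int[][] DELTA = {
        {1, 0, 0},  // state 0: nothing matched yet
        {2, 0, 0},  // state 1: matched "a"
        {2, 3, 0},  // state 2: matched "aa"; another a still leaves "aa" matched
        {1, 0, 4},  // state 3: matched "aab"
    };
    private static final int ACCEPT = 4;

    /** Returns the start index of the first occurrence of "aabc" in text, or -1. */
    static int find(String text) {
        int state = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Assumption: a symbol outside {a,b,c} resets the machine to state 0.
            state = (c >= 'a' && c <= 'c') ? DELTA[state][c - 'a'] : 0;
            if (state == ACCEPT) {
                return i - 3; // the match ends at i and is 4 characters long
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(find("aaaaaaaaaaaabcddddddddddddddd"));
    }
}
```

Each character of the text is read exactly once and never re-examined, so the scan is linear in the length of the text once the table has been built.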


Regex in Java