Slides: cs.ru.nl/is/education/courses/2014/formal-languages/slides/lec1.pdf

Organisation | Formal Languages | Regular Expressions and Regular Languages | Conclusion

Radboud University Nijmegen

Formal Languages, Grammars and Automata

Helle Hvid Hansen

[email protected]

http://www.cs.ru.nl/~helle/

Foundations Group – Intelligent Systems Section
Institute for Computing and Information Sciences

Radboud University Nijmegen

25 April 2014

Helle Hvid Hansen 25 April 2014 FLGA 1 / 24

Outline

Organisation

Formal Languages

Regular Expressions and Regular Languages

Conclusion

Course Organisation

• Lectures: Fridays, 13:30 - 15:30.

• Register in Blackboard (to get email announcements).

• Course webpage (information, material, exercises):

www.ru.nl/foundations/education/courses/flga-2014/

Check there first, before asking/emailing me!

• Material (PDFs via webpage):
  • Languages and Automata, Lecture Notes (v2), by Alexandra Silva (used in Talen en Automaten, RU)
  • Lecture Notes on Regular Languages and Finite Automata, by Andrew Pitts (earlier version used last year)

• Lots of other material is available on the web.

Examination and Grading

• Final exam (closed-book!): Tue 24 June, 8:30-11:30.

• Midterm test: Friday 23 May 2014 (details TBD).

• Final grade F (given test grade T, exam grade E):
  • If E ≤ 5 (fail), then F = E.
  • If E > 5 (pass), then F = max{(T + E)/2, E}.

• Students with right to extra time, please contact me asap.

Tutorials (Werkcolleges) and Homework

• Time: Fridays 15:30-17:30.

• Tutorial instructors: Lorena van Duuren and Emma Gerritse.

• Group 1: Last name starts with [A-L], instructor: Lorena van Duuren, room: HG00.633.

• Group 2: Last name starts with [M-Z], instructor: Emma Gerritse, room: HG00.065.

• No compulsory homework; option to hand in one exercise per week for feedback.

• Exercises will be made available on the webpage Thursday evening (for the coming day).

How to pass?

• Practice, practice, practice... Do the (non-compulsory) exercises! Yes, it takes discipline, but this is the only way to internalise the material.

• Go to tutorials to get feedback and solutions to exercises.

• Test and exam questions will be in line with exercises.

• 3 ec means 3 × 28 hours = 84 hours in total. With 20 hours for the exam, that leaves 8 hours per week ⇒ 4 hours of self-study and exercises per week (on top of the lecture and the tutorial).

• Re-examination: Tuesday 24 June 2014, 08:30 - 11:30.

Let’s get started

What is a language?

• Natural language (English, Dutch, Chinese, ...). Some words in the English language: “students”, “do”, “homework”.

• Programming language (C, Java, Python, ...). A string of the language C: printf("Hello world.")

• Mathematical language, e.g. “x − (y − x) = 2x − y”

• Logic languages, e.g. first-order logic: ∀x ∈ N ∃y ∈ N : y > x

• ...

A language consists of words (or strings). Words are sequences of letters/symbols from an alphabet.

Alphabet

Def. An alphabet is a finite set, often denoted Σ. Elements of an alphabet are called letters or symbols.

Examples:

Σ_1 = {a}
Σ_2 = {0, 1}
Σ_3 = {A, C, G, T}
Σ_4 = {a, b, c, d, ..., x, y, z}
Σ_5 = Chinese alphabet: approximately 40,000 symbols
Σ_6 = {+, ×, −, 0, 1, 2, 3, ...}: mathematical “alphabet”; countably infinite, so not an alphabet.

Words/Strings

Def. Given an alphabet Σ,

• a word (or string) over Σ is a finite sequence of letters from Σ.

• the empty word (i.e. sequence of length 0) is denoted by λ.

• the set of all words over Σ is denoted by Σ∗.

Examples:

• x − (y − x) = 2x − y is a word of length 12 over the alphabet Σ = {x, y, −, +, (, ), =, 0, 1, 2}

• The students will do their homework is a word of length 11 over the alphabet Σ = {The, students, will, do, their, homework, ␣, a, b, c}, where ␣ denotes the blank.
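The two examples can be checked mechanically. The following is a minimal sketch, assuming a word is modelled as a Python tuple of symbols and an alphabet as a set; the helper name `is_word_over` is hypothetical, and the ASCII hyphen stands in for the minus sign.

```python
def is_word_over(word, alphabet):
    """True if every symbol of the finite sequence `word` belongs to `alphabet`."""
    return all(symbol in alphabet for symbol in word)

# First example: every symbol is a single character.
sigma_math = {"x", "y", "-", "+", "(", ")", "=", "0", "1", "2"}
w1 = tuple("x-(y-x)=2x-y")

# Second example: whole English words and the blank are the symbols.
sigma_english = {"The", "students", "will", "do", "their", "homework", " ", "a", "b", "c"}
w2 = ("The", " ", "students", " ", "will", " ", "do", " ", "their", " ", "homework")

empty_word = ()  # the empty word, a sequence of length 0

print(len(w1), is_word_over(w1, sigma_math))     # 12 True
print(len(w2), is_word_over(w2, sigma_english))  # 11 True
print(len(empty_word))                           # 0
```

Note that the length of a word depends on what counts as a symbol: the second sentence has length 11 only because each English word is a single letter of the alphabet.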

Inductive Definition of Words

Inductive definition of Σ∗:

Σ∗ is the smallest set satisfying the following rules:

1 λ ∈ Σ∗.

2 If w ∈ Σ∗ and a ∈ Σ, then wa ∈ Σ∗ (or equivalently, aw ∈ Σ∗)

(Why is “smallest set” important?)

Properties of Σ∗:

• Σ∗ ≠ ∅ (why?)

• If Σ ≠ ∅, then Σ∗ is infinite
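The two rules can be read as a recipe for generating words. The sketch below applies them a bounded number of times; it assumes letters are single characters and words are Python strings, and the function name `words_up_to` is hypothetical.

```python
def words_up_to(alphabet, max_len):
    """All words over `alphabet` of length <= max_len, built with the two rules."""
    layer = {""}          # rule 1: the empty word is in the set
    result = set(layer)
    for _ in range(max_len):
        # rule 2: extend each word w of the previous layer by one letter a, giving wa
        layer = {w + a for w in layer for a in alphabet}
        result |= layer
    return result

print(sorted(words_up_to({"a", "b"}, 2), key=lambda w: (len(w), w)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
print(words_up_to(set(), 3))
# {''}
```

The second call illustrates both properties: even over the empty alphabet the result contains the empty word, while any non-empty alphabet yields new words at every step.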

Definitions by Induction on Words

• Definition of Σ∗ says: w is a word if and only if

w = λ or w = va for some word v and letter a.

• We can define a function f on words by defining f(w) by induction on w (distinguish cases for f(w)):

Base case (w = λ): f(λ) = ...
Inductive case (w = va): f(va) = ... (may use f(v))

• If f takes several arguments, we can choose one for the induction; for example, define f(u, w) by induction on w (u is fixed with respect to the induction).

Concatenation of Words

• Given words u = ab and w = bc over Σ = {a, b, c}, we can concatenate them to create new words:

u · w = abbc, w · u = bcab, u · u = abab

• Concatenation is a binary operation · on words.

• We define u · w by induction on w . For all u ∈ Σ∗,

Base case: u · λ = u
Inductive case: u · va = (u · v)a for all v ∈ Σ∗ and a ∈ Σ.

(We will often write uv instead of u · v)

• Some properties: u(vw) = (uv)w , λu = u
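The inductive definition translates directly into a recursive function. This is a sketch only, assuming words are Python strings with single-character letters; the name `concat` is hypothetical, and Python's `+` is used solely to append the single letter a.

```python
def concat(u, w):
    """u · w, defined by induction on w (u stays fixed)."""
    if w == "":                  # base case: u · λ = u
        return u
    v, a = w[:-1], w[-1]         # inductive case: w = va
    return concat(u, v) + a      # u · va = (u · v)a

u, w = "ab", "bc"
print(concat(u, w), concat(w, u), concat(u, u))  # abbc bcab abab
```

Checking u(vw) = (uv)w and λu = u on a few sample words is a useful sanity test, although only a proof by induction establishes them for all words.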

More Operations

• Reversal of words, e.g. (abc)^R = cba. Define w^R by induction on w:

Base case: λ^R = λ
Inductive case: (va)^R = a · v^R for all v ∈ Σ∗ and a ∈ Σ

• Repeating a word: e.g. (ab)^2 = abab, (ab)^3 = ababab, etc. Define u^n by induction on n ∈ N (!):

u^0 = λ and u^{n+1} = u · u^n

(Base case: n = 0, Inductive case: n = n′ + 1.)
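Both definitions can be sketched as recursive functions. Note the difference in what the recursion runs over: reversal recurses on the word, repetition on the natural number n. Words are assumed to be Python strings, and the names `reverse` and `power` are hypothetical.

```python
def reverse(w):
    """Reversal of w, by induction on the word w."""
    if w == "":                   # base case: reversing the empty word gives the empty word
        return ""
    v, a = w[:-1], w[-1]          # inductive case: w = va
    return a + reverse(v)         # (va)^R = a · v^R

def power(u, n):
    """u repeated n times, by induction on the number n."""
    if n == 0:                    # base case: u^0 is the empty word
        return ""
    return u + power(u, n - 1)    # inductive case: u^(n+1) = u · u^n

print(reverse("abc"))                   # cba
print(power("ab", 2), power("ab", 3))   # abab ababab
```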

Counting Occurrences and Length

• |w|_a is the number of occurrences of letter a in word w. E.g., |λ|_a = 0, |abb|_a = 1, |abb|_b = 2. Define by induction on w:

|λ|_a = 0 and |vb|_a = |v|_a + 1 if a = b, |vb|_a = |v|_a if a ≠ b

• |w| is the length of the word w. E.g., |abb| = 3. (Exercise: define it by induction.)
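The case distinction in the definition of the occurrence count becomes a conditional in code. A minimal sketch, assuming words are Python strings; the name `count` is hypothetical.

```python
def count(w, a):
    """Number of occurrences of the letter a in the word w, by induction on w."""
    if w == "":                    # base case: the empty word contains no letters
        return 0
    v, b = w[:-1], w[-1]           # inductive case: w = vb
    if a == b:
        return count(v, a) + 1     # last letter is an a
    return count(v, a)             # last letter is some other letter

print(count("", "a"), count("abb", "a"), count("abb", "b"))  # 0 1 2
```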

Proof by Induction

Prove that some property P holds for all words. For example, P(u, v) could be |uv| = |u| + |v|.

A proof by induction works as follows:

• Base case: Show P holds for λ (in the example: P(u, λ)).

• Induction Hypothesis (IH): Assume that P(u, v) holds for all words v of length < n.

• Show that P(u, w) holds for words w of length n (you may use the IH).

We conclude by induction that P(u, w) holds for all words u, w.

See the lecture notes by Silva for more examples. See also the exercises of this week.
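As a worked instance, here is a sketch of the proof that |uv| = |u| + |v|, by induction on v with u fixed. It assumes that length is defined by |λ| = 0 and |va| = |v| + 1; this is one possible answer to the exercise on length and is not taken from the slides. The sketch uses induction on the structure of v rather than on its length n, which for this property amounts to the same argument.

```latex
\textbf{Base case } (v = \lambda):
\[
  |u \cdot \lambda| = |u| = |u| + 0 = |u| + |\lambda| .
\]

\textbf{Inductive case } (v = v'a), \text{ with IH } |u \cdot v'| = |u| + |v'| :
\begin{align*}
  |u \cdot v'a| &= |(u \cdot v')a|    && \text{definition of concatenation} \\
                &= |u \cdot v'| + 1   && \text{assumed definition of length} \\
                &= (|u| + |v'|) + 1   && \text{IH} \\
                &= |u| + (|v'| + 1)   && \text{arithmetic} \\
                &= |u| + |v'a|        && \text{assumed definition of length}
\end{align*}
```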

Formal Language

Def. A language L over Σ is a set of words over Σ, that is, L ⊆ Σ∗.

Examples:

• ∅, {λ} are languages over any Σ.

• L_1 = {a^n ∈ {a, b}∗ | n ∈ N is even}
• L_2 = {a^n b^n ∈ {a, b}∗ | n ∈ N}
• L_3 = {a^n b^n c^n ∈ {a, b, c}∗ | n ∈ N}
• L_4 = {a^n ∈ {a}∗ | n ∈ N is prime}
• L_5 = {w ∈ {0, 1}∗ | w is the binary representation of a prime}
• L_6 = {e | e is a well-formed arithmetical expression}
• L_7 = {P | P is a syntactically correct Java program}
• L_8 = {S | S is a grammatically correct English sentence}
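Since these languages are infinite sets, one convenient way to handle them in code is through a membership test rather than a list of elements. The sketch below does this for the first three examples; it assumes words are Python strings and that N includes 0, and the function names are hypothetical.

```python
def in_L1(w):
    """w consists only of a's and has even length."""
    return set(w) <= {"a"} and len(w) % 2 == 0

def in_L2(w):
    """w is n a's followed by n b's, for some n."""
    n, remainder = divmod(len(w), 2)
    return remainder == 0 and w == "a" * n + "b" * n

def in_L3(w):
    """w is n a's, then n b's, then n c's, for some n."""
    n, remainder = divmod(len(w), 3)
    return remainder == 0 and w == "a" * n + "b" * n + "c" * n

print(in_L1("aaaa"), in_L1("aaa"))        # True False
print(in_L2("aabb"), in_L2("abab"))       # True False
print(in_L3("aabbcc"), in_L3("aabbc"))    # True False
```

The empty word belongs to all three languages, corresponding to n = 0.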

Operations on Languages

Let L, L_1, L_2 ⊆ Σ∗.

Concatenation: L_1 L_2 = {uv ∈ Σ∗ | u ∈ L_1, v ∈ L_2}

Reversal: L^R = {u^R ∈ Σ∗ | u ∈ L}

Union: L_1 ∪ L_2 = {u ∈ Σ∗ | u ∈ L_1 or u ∈ L_2}

Intersection: L_1 ∩ L_2 = {u ∈ Σ∗ | u ∈ L_1 and u ∈ L_2}

Complement: L̄ = {u ∈ Σ∗ | u ∉ L}

Kleene star: L∗ = ⋃_{n ∈ N} L^n = L^0 ∪ L^1 ∪ L^2 ∪ L^3 ∪ ... (where L^0 = {λ} and L^{n+1} = L L^n)

= {u_1 ··· u_n | u_1, ..., u_n ∈ L, n ∈ N}
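For finite languages these operations can be tried out directly with Python sets of strings. The sketch below is illustrative only and all function names are hypothetical. Union and intersection are the built-in set operators `|` and `&`. The Kleene star is infinite for any language containing a non-empty word, so the sketch computes only the finite approximation that stops at a given power.

```python
def concat_lang(l1, l2):
    """Concatenation: all words uv with u in l1 and v in l2."""
    return {u + v for u in l1 for v in l2}

def reverse_lang(l):
    """Reversal: the reversals of all words in l."""
    return {u[::-1] for u in l}

def power_lang(l, n):
    """The n-th power of l: start from {empty word}, prepend l n times."""
    result = {""}
    for _ in range(n):
        result = concat_lang(l, result)
    return result

def star_up_to(l, max_power):
    """Finite approximation of the Kleene star: union of powers 0..max_power."""
    result = set()
    for n in range(max_power + 1):
        result |= power_lang(l, n)
    return result

l1, l2 = {"a", "ab"}, {"b", ""}
print(sorted(concat_lang(l1, l2)))     # ['a', 'ab', 'abb']
print(sorted(reverse_lang(l1)))        # ['a', 'ba']
print(sorted(star_up_to({"ab"}, 2)))   # ['', 'ab', 'abab']
```

Note that concatenating with the empty language gives the empty language, whereas the zeroth power of any language, including the empty one, is the language containing just the empty word.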

Regular Expressions

Def. The set RegEx(Σ) of regular expressions over Σ is the smallest set satisfying:

1 0, 1 and all a ∈ Σ are in RegEx(Σ).

2 If r , s ∈ RegEx(Σ) then also

(r + s), rs, (r)∗

are in RegEx(Σ).

• We assume 0, 1 are not in Σ.

• We will omit parentheses by using the convention that ∗ binds stronger than concatenation, which binds stronger than +. E.g., we write r + st∗ instead of (r + s(t)∗).

Regular Languages

Def. The language L(e) denoted by a regular expression e ∈ RegEx(Σ) is defined inductively by:

L(0) = ∅
L(1) = {λ}
L(a) = {a} for all a ∈ Σ

L(rs) = L(r)L(s)

L(r + s) = L(r) ∪ L(s)

L(r∗) = L(r)∗

Def. A language L ⊆ Σ∗ is regular if there exists a regular expression e ∈ RegEx(Σ) such that L = L(e).
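The inductive definition of L(e) can be turned into a small interpreter. The sketch below represents regular expressions as a syntax tree and computes only the words of L(e) up to a given length, because L(e) itself may be infinite. The class names and the function `lang` are hypothetical, and letters are assumed to be single characters.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Zero:            # the expression 0
    pass

@dataclass(frozen=True)
class One:             # the expression 1
    pass

@dataclass(frozen=True)
class Sym:             # a single letter a
    a: str

@dataclass(frozen=True)
class Plus:            # r + s
    r: object
    s: object

@dataclass(frozen=True)
class Cat:             # rs
    r: object
    s: object

@dataclass(frozen=True)
class Star:            # r*
    r: object

def lang(e, n):
    """All words of length <= n in L(e), one case per clause of the definition."""
    if isinstance(e, Zero):
        return set()
    if isinstance(e, One):
        return {""}
    if isinstance(e, Sym):
        return {e.a} if n >= 1 else set()
    if isinstance(e, Plus):
        return lang(e.r, n) | lang(e.s, n)
    if isinstance(e, Cat):
        return {u + v for u in lang(e.r, n) for v in lang(e.s, n) if len(u + v) <= n}
    if isinstance(e, Star):
        base = lang(e.r, n) - {""}          # empty factors add nothing new
        result, frontier = {""}, {""}
        while frontier:                     # stops: words are bounded by length n
            frontier = {u + v for u in frontier for v in base if len(u + v) <= n} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {e!r}")

# a(a + b)*: words that begin with a, here up to length 2
e = Cat(Sym("a"), Star(Plus(Sym("a"), Sym("b"))))
print(sorted(lang(e, 2)))   # ['a', 'aa', 'ab']
```

Bounding the length keeps every set finite, at the price of seeing only a finite part of the language.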

Examples of Regular Languages

Let Σ = {a, b}.

regular expression e     language L(e)

a + b                    {a, b} = Σ
(a + b)∗                 all words over Σ (Σ∗)
a(a + b)∗                all words that begin with a
b∗(a + 1)b∗              all words that contain zero or one a
a(0 + 1 + b)∗            {a, ab, abb, abbb, ...}
(ab∗)∗0                  the empty language (∅)
((a + b)(a + b))∗        all words of even length
(ab∗)∗a∗                 λ and all words that begin with a

Def. Two regular expressions r and s are equivalent if L(r) = L(s).
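Equivalence can be explored experimentally by comparing two expressions on all words up to some length. The sketch below uses Python's `re` module, whose syntax differs from the notation used here: alternation is written `|` instead of +. The helper names are hypothetical. Such a bounded search can refute equivalence by finding a word on which the two expressions disagree, but finding none is not a proof of equivalence.

```python
import re
from itertools import product

def words_up_to(alphabet, max_len):
    """All words over `alphabet` of length <= max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(sorted(alphabet), repeat=n):
            yield "".join(letters)

def difference_up_to(p, q, alphabet, max_len):
    """First word of length <= max_len that exactly one of p, q matches, else None."""
    for w in words_up_to(alphabet, max_len):
        if bool(re.fullmatch(p, w)) != bool(re.fullmatch(q, w)):
            return w
    return None

sigma = {"a", "b"}
# Two different-looking expressions for all words over {a, b}: no difference found.
print(difference_up_to(r"(a|b)*", r"(a*b)*a*", sigma, 6))        # None
# Words of even length versus all words: they already differ on the word a.
print(difference_up_to(r"((a|b)(a|b))*", r"(a|b)*", sigma, 6))   # a
```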

Some Questions

1 Given a word w and a regular expression e, is there an algorithm that computes whether w ∈ L(e)?

2 Given regular expressions e_1, e_2 over the same alphabet, is there an algorithm that computes whether L(e_1) = L(e_2)?

3 Are all languages regular? If not, then how can we prove that some L is not regular?

Summary

Learning goals of today:

• Notion of formal language

• Operations on words and languages

• Regular expressions for specifying regular languages

Remark on Notation in Lecture Notes

                       [Silva]         [Pitts]

alphabet               A, B            Σ

empty word/string      λ               ε

regular expressions    0, 1, r + s     ∅, ε, r|s

What is this course (not) about?

Formal Languages

Grammars (generators)

Automata (acceptors)

This course: regular and context-free languages, and their automata and grammars.

Other courses: context-sensitive and recursively enumerable languages, Turing machines.
