Computer Science 112 Fundamentals of Programming II Recursive Processing of Languages

Dec 27, 2015

Mabel Roberts
Page 1: Computer Science 112 Fundamentals of Programming II Recursive Processing of Languages.

Computer Science 112

Fundamentals of Programming II

Recursive Processing of Languages

Languages and Grammars

• A grammar specifies the rules for constructing well-formed sentences in a language

• Every language, including a programming language, has a grammar

Applications

• Grammar checkers in word processors

• Programming language compilers

• Natural language queries (Google, etc.)

Generate Sentences in English

• Given a vocabulary and grammar rules, one can generate some random and perhaps rather silly sentences

• Vocabulary - the set of words belonging to the parts of speech (nouns, verbs, articles, prepositions)

• Grammar - the set of rules for building phrases in a sentence (noun phrase, verb phrase, prepositional phrase)

The Structure of a Sentence

sentence

noun phrase verb phrase

A sentence is a noun phrase followed by a verb phrase

The Structure of a Sentence

sentence

noun phrase verb phrase

article noun

A noun phrase is an article followed by a noun

The Structure of a Sentence

Similar to the behavior of strings so far

sentence

noun phrase verb phrase

article noun

the girl

Pick actual words for those parts of speech at random

The Structure of a Sentence

Similar to the behavior of strings so far

sentence

noun phrase verb phrase

article noun verb noun phrase prepositional phrase

the girl

A verb phrase is a verb followed by a noun phrase and a prepositional phrase

The Structure of a Sentence

Similar to the behavior of strings so far

sentence

noun phrase verb phrase

article noun verb noun phrase prepositional phrase

the girl hit

Pick a verb at random

The Structure of a Sentence

Similar to the behavior of strings so far

sentence

noun phrase verb phrase

article noun verb noun phrase prepositional phrase

article noun

the girl hit

Expand a noun phrase again

The Structure of a Sentence

Similar to the behavior of strings so far

sentence

noun phrase verb phrase

article noun verb noun phrase prepositional phrase

article noun

the girl hit the boy

Pick an article and a noun at random

The Structure of a Sentence

Similar to the behavior of strings so far

sentence

noun phrase verb phrase

article noun verb noun phrase prepositional phrase

article noun preposition noun phrase

the girl hit the boy

A prepositional phrase is a preposition followed by a noun phrase

The Structure of a Sentence

Similar to the behavior of strings so far

sentence

noun phrase verb phrase

article noun verb noun phrase prepositional phrase

article noun preposition noun phrase

the girl hit the boy with

Pick a preposition at random

The Structure of a Sentence

Similar to the behavior of strings so far

sentence

noun phrase verb phrase

article noun verb noun phrase prepositional phrase

article noun preposition noun phrase

article noun

the girl hit the boy with

Expand another noun phrase

The Structure of a Sentence

Similar to the behavior of strings so far

sentence

noun phrase verb phrase

article noun verb noun phrase prepositional phrase

article noun preposition noun phrase

article noun

the girl hit the boy with a bat

More random words from the parts of speech

Representing the Vocabulary

nouns = ['bat', 'boy', 'girl', 'dog', 'cat', 'chair', 'fence', 'table', 'computer', 'cake', 'field']

verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']

prepositions = ['with', 'to', 'from', 'on', 'below', 'above', 'beside']

articles = ['a', 'the']

Use a list of words for each part of speech (lexical category)

Picking a Word at Random

nouns = ['bat', 'boy', 'girl', 'dog', 'cat', 'chair', 'fence', 'table', 'computer', 'cake', 'field']

verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']

prepositions = ['with', 'to', 'from', 'on', 'below', 'above', 'beside']

articles = ['a', 'the']

import random

print(random.choice(verbs)) # Prints a randomly chosen verb

The random module includes functions to select numbers, sequence elements, etc., at random
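
A minimal sketch of random.choice on the verb list, plus random.seed to make a run repeatable. The seed value 112 is an arbitrary choice for illustration and is not part of the slides.

```python
import random

verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']

# random.choice returns one element of a non-empty sequence.
word = random.choice(verbs)
print(word)

# Seeding the generator makes a run repeatable, which helps when testing.
random.seed(112)
first = random.choice(verbs)
random.seed(112)
second = random.choice(verbs)
print(first == second)   # same seed, same choice
```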

Grammar Rules

sentence = nounphrase verbphrase

nounphrase = article noun

verbphrase = verb nounphrase prepositionalphrase

prepositionalphrase = preposition nounphrase

A sentence is a noun phrase followed by a verb phrase

Etc., etc.
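
One way to see that these rules are just data is to store them in dictionaries and expand them with a single recursive function. This is an alternative sketch, not the approach the slides take; the names vocabulary, rules, and expand are hypothetical.

```python
import random

# Vocabulary from the slides, keyed by part of speech.
vocabulary = {
    'noun': ['bat', 'boy', 'girl', 'dog', 'cat', 'chair', 'fence',
             'table', 'computer', 'cake', 'field'],
    'verb': ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped'],
    'preposition': ['with', 'to', 'from', 'on', 'below', 'above', 'beside'],
    'article': ['a', 'the'],
}

# Each rule maps a phrase name to the sequence of symbols that defines it.
rules = {
    'sentence': ['nounphrase', 'verbphrase'],
    'nounphrase': ['article', 'noun'],
    'verbphrase': ['verb', 'nounphrase', 'prepositionalphrase'],
    'prepositionalphrase': ['preposition', 'nounphrase'],
}

def expand(symbol):
    """Return the list of words generated from symbol (hypothetical helper)."""
    if symbol in vocabulary:             # part of speech: pick a word
        return [random.choice(vocabulary[symbol])]
    words = []
    for part in rules[symbol]:           # phrase: expand each part in order
        words.extend(expand(part))
    return words

print(' '.join(expand('sentence')))
```

Adding a rule then means adding a dictionary entry rather than writing a new function.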

Define a Function for Each Rule

# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()

Each function builds and returns a string that is an instance of the phrase

Separate phrases and words with a space

Define a Function for Each Rule

# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()

# nounphrase = article noun
def nounphrase():
    return random.choice(articles) + ' ' + random.choice(nouns)

When a part of speech is reached, select an instance at random from the relevant list of words

Call sentence() to Try It Out

# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()

# nounphrase = article noun
def nounphrase():
    return random.choice(articles) + ' ' + random.choice(nouns)

for x in range(10):
    print(sentence())    # Display 10 sentences

You can also generate examples of the other phrases by calling their functions
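
The slides show only sentence() and nounphrase(). A complete, runnable version might look like the sketch below. The bodies of verbphrase() and prepositionalphrase() are filled in here by following the grammar rules in the same pattern; they do not appear in the slides.

```python
import random

nouns = ['bat', 'boy', 'girl', 'dog', 'cat', 'chair', 'fence',
         'table', 'computer', 'cake', 'field']
verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']
prepositions = ['with', 'to', 'from', 'on', 'below', 'above', 'beside']
articles = ['a', 'the']

# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()

# nounphrase = article noun
def nounphrase():
    return random.choice(articles) + ' ' + random.choice(nouns)

# verbphrase = verb nounphrase prepositionalphrase (assumed body)
def verbphrase():
    return (random.choice(verbs) + ' ' + nounphrase() + ' ' +
            prepositionalphrase())

# prepositionalphrase = preposition nounphrase (assumed body)
def prepositionalphrase():
    return random.choice(prepositions) + ' ' + nounphrase()

for x in range(10):
    print(sentence())    # Display 10 sentences
```

Every sentence has eight words because none of the rules is optional or repeated. The functions call one another but none calls itself, so generation always stops.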

Kinds of Symbols in a Grammar

• Terminal symbols: words in the vocabulary of the language

• Non-terminal symbols: words that describe phrases or portions of sentences

• Metasymbols: used to construct rules

Metasymbols for a Grammar

Metasymbol   Use
" "          Enclose literal items
=            Means "is defined as"
[ ]          Enclose optional items
{ }          Enclose zero or more items
( )          Group together required choices
|            Indicates a choice

A Grammar of Arithmetic Expressions

expression = term { addingOperator term }

term = factor { multiplyingOperator factor }

factor = primary ["^" primary ]

primary = number | "(" expression ")"

number = digit { digit }

digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

addingOperator = "+" | "-"

multiplyingOperator = "*" | "/"

Example sentences: 3, 4 + 5, 5 + 2 * 3, (5 + 2) * 3 ^ 4
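
The metasymbols map onto ordinary control structures: a choice (|) becomes a test, and a repetition ({ }) becomes a loop. The sketch below applies this to the digit and number rules, working on characters directly. The function names is_digit and match_number are hypothetical.

```python
DIGITS = "0123456789"

def is_digit(ch):
    # digit = "0" | "1" | ... | "9": a choice becomes a membership test
    return len(ch) == 1 and ch in DIGITS

def match_number(text, pos=0):
    """Return the index just past a number starting at pos, or -1 if none."""
    # number = digit { digit }: one required digit ...
    if pos >= len(text) or not is_digit(text[pos]):
        return -1
    pos += 1
    # ... followed by zero or more digits, which becomes a loop
    while pos < len(text) and is_digit(text[pos]):
        pos += 1
    return pos

print(match_number("345+2"))   # index just past "345"
print(match_number("+2"))      # no number at position 0
```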

Alternative Notation: Train Track

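term = factor { multiplyingOperator factor }

[Train track diagram: a track through factor, with a loop back through "*" or "/" to another factor]

primary = number | "(" expression ")"

[Train track diagram: two alternative tracks, one through number and one through "(" expression ")"]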

Parsing

• A parser analyzes a source program to determine whether or not it is syntactically correct

[Diagram: a source language program flows into the Parser, which outputs OK or not OK, along with syntax error messages]

Scanning

• A scanner picks out words in a source program and sends these to the parser

[Diagram: a source language program flows into the Scanner, which sends tokens to the Parser and reports lexical error messages; the Parser outputs OK or not OK, along with syntax error messages]

The Scanner Interface

Scanner(aString)   Creates a scanner on a source string
get()              Returns the current token (at the cursor)
next()             Advances the cursor to the next token

Tokens

• A Token object has two attributes:
  – type (indicating an operand or operator)
  – value (an int if it’s an operand, or the source string otherwise)

• Token types are
  – Token.EOE
  – Token.PLUS, Token.MINUS
  – Token.MUL, Token.DIV
  – Token.INT
  – Token.UNKNOWN

The Token Interface

Token(source)   Creates a token from a source string
str(aToken)     String representation
isOperator()    True if an operator, false otherwise
getType()       Returns the type
getValue()      Returns the value
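
The slides give only the interfaces, not the implementations. The sketch below is one possible way to satisfy them. Everything beyond the method names is an assumption: the numeric type codes, the use of an empty source string for Token.EOE, and the choice to have the Scanner load its first token on creation.

```python
class Token:
    # Type codes; the numeric values are arbitrary (assumed here).
    UNKNOWN, EOE, INT, PLUS, MINUS, MUL, DIV = range(7)
    _OPERATORS = {'+': PLUS, '-': MINUS, '*': MUL, '/': DIV}

    def __init__(self, source):
        if source == '':                     # assumption: '' marks end of expression
            self._type, self._value = Token.EOE, source
        elif source.isascii() and source.isdigit():
            self._type, self._value = Token.INT, int(source)
        else:
            self._type = Token._OPERATORS.get(source, Token.UNKNOWN)
            self._value = source

    def __str__(self):
        return str(self._value)

    def isOperator(self):
        return self._type in Token._OPERATORS.values()

    def getType(self):
        return self._type

    def getValue(self):
        return self._value


class Scanner:
    def __init__(self, aString):
        self._source = aString
        self._pos = 0
        self._current = None
        self.next()                          # load the first token

    def get(self):
        return self._current

    def next(self):
        s = self._source
        while self._pos < len(s) and s[self._pos].isspace():
            self._pos += 1                   # skip white space
        if self._pos == len(s):
            self._current = Token('')        # nothing left: end of expression
            return
        start = self._pos
        self._pos += 1
        if s[start] in '0123456789':         # gather the rest of a number
            while self._pos < len(s) and s[self._pos] in '0123456789':
                self._pos += 1
        self._current = Token(s[start:self._pos])


scanner = Scanner('34 + 5 * 2')
while scanner.get().getType() != Token.EOE:
    print(scanner.get())
    scanner.next()
```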

Recursive Descent Parsing

• Each rule in the grammar translates to a Python parsing method

expression = term { addingOperator term }

def expression(self):
    self.term()
    token = self.scanner.get()
    while token.getType() in (Token.PLUS, Token.MINUS):
        self.scanner.next()
        self.term()
        token = self.scanner.get()

Recursive Descent Parsing

• Each method is responsible for a phrase in an expression

term = factor { multiplyingOperator factor }

def term(self):
    self.factor()
    token = self.scanner.get()
    while token.getType() in (Token.MUL, Token.DIV):
        self.scanner.next()
        self.factor()
        token = self.scanner.get()

Recursive Descent Parsing

primary = number | "(" expression ")"

def primary(self):
    token = self.scanner.get()
    if token.getType() == Token.INT:
        self.scanner.next()
    elif token.getType() == Token.L_PAR:
        self.scanner.next()
        self.expression()
        self.accept(self.scanner.get(), Token.R_PAR, "')' expected")
        self.scanner.next()
    else:
        self.fatalError(token, "bad primary")
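
To try the three methods above, they can be placed in a class together with the pieces the slides leave out. The sketch below is one possible arrangement, not the course's own code. The following are assumptions: the factor(), accept(), fatalError(), and parse() bodies, the Token.EXPO type, the numeric type codes, the regex-based Scanner, and the isValid() helper.

```python
import re

class Token:
    # Type codes; EXPO and all numeric values are assumed for this sketch.
    EOE, INT, PLUS, MINUS, MUL, DIV, EXPO, L_PAR, R_PAR, UNKNOWN = range(10)
    SYMBOLS = {'+': PLUS, '-': MINUS, '*': MUL, '/': DIV,
               '^': EXPO, '(': L_PAR, ')': R_PAR}

    def __init__(self, source):
        self.source = source
        if source == '':                       # assumption: '' marks the end
            self.type = Token.EOE
        elif re.fullmatch(r'[0-9]+', source):
            self.type = Token.INT
        else:
            self.type = Token.SYMBOLS.get(source, Token.UNKNOWN)

    def getType(self):
        return self.type

    def __str__(self):
        return self.source or 'end of expression'

class Scanner:
    def __init__(self, aString):
        # Tokens are runs of digits or single non-space characters.
        words = re.findall(r'[0-9]+|\S', aString)
        self.tokens = [Token(word) for word in words] + [Token('')]
        self.index = 0

    def get(self):
        return self.tokens[self.index]

    def next(self):
        if self.index < len(self.tokens) - 1:
            self.index += 1

class Parser:
    def __init__(self, source):
        self.scanner = Scanner(source)

    def parse(self):
        self.expression()
        self.accept(self.scanner.get(), Token.EOE, "unexpected token")

    def accept(self, token, expected, message):
        if token.getType() != expected:
            self.fatalError(token, message)

    def fatalError(self, token, message):
        raise SyntaxError(message + " at " + str(token))

    def expression(self):      # expression = term { addingOperator term }
        self.term()
        token = self.scanner.get()
        while token.getType() in (Token.PLUS, Token.MINUS):
            self.scanner.next()
            self.term()
            token = self.scanner.get()

    def term(self):            # term = factor { multiplyingOperator factor }
        self.factor()
        token = self.scanner.get()
        while token.getType() in (Token.MUL, Token.DIV):
            self.scanner.next()
            self.factor()
            token = self.scanner.get()

    def factor(self):          # factor = primary [ "^" primary ]
        self.primary()
        if self.scanner.get().getType() == Token.EXPO:
            self.scanner.next()
            self.primary()

    def primary(self):         # primary = number | "(" expression ")"
        token = self.scanner.get()
        if token.getType() == Token.INT:
            self.scanner.next()
        elif token.getType() == Token.L_PAR:
            self.scanner.next()
            self.expression()
            self.accept(self.scanner.get(), Token.R_PAR, "')' expected")
            self.scanner.next()
        else:
            self.fatalError(token, "bad primary")

def isValid(source):
    """Hypothetical helper: True if source parses as one expression."""
    try:
        Parser(source).parse()
        return True
    except SyntaxError:
        return False

for source in ['3', '4 + 5', '(5 + 2) * 3 ^ 4', '5 +', '(5 + 2']:
    print(source, '->', 'OK' if isValid(source) else 'not OK')
```

The recursion enters through primary(), which calls expression() for a parenthesized expression, so nested parentheses need no extra code. The optional item in the factor rule becomes an if statement where the repetitions in expression and term become while loops.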

For Monday

Expression Trees