Formal Language
Post on 05-Jan-2016
7/16/2019 Formal Language
http://slidepdf.com/reader/full/formal-language 1/48
CogSci 131
Language as a formal system
Tom Griffiths
Admin
• Problem Set 0 is due tomorrow at 5pm
• Problem Set 1 will be out tomorrow – a little harder than Problem Set 0 and more representative of what to expect
– the Turing machine problem is tricky!
Token manipulation systems
• System is defined fully by
– a set of tokens
– starting positions for those tokens
– formal rules stating how token positions can be changed into other token positions
• Rules depend only on current positions, and define only the next positions
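The three ingredients above (tokens, starting positions, rules) can be sketched in a few lines of Python. This is only an illustration: the function `reachable`, the rule `swap_adjacent`, and the tokens A, B, C are hypothetical and do not come from the lecture.

```python
from typing import Callable, Iterable, Set, Tuple

Position = Tuple[str, ...]  # a position is an arrangement of tokens

def reachable(start: Position,
              rules: Iterable[Callable[[Position], Iterable[Position]]],
              max_steps: int) -> Set[Position]:
    """Collect every position reachable from `start` in at most `max_steps` rule applications."""
    rules = list(rules)
    seen = {start}
    frontier = {start}
    for _ in range(max_steps):
        next_frontier = set()
        for position in frontier:
            for rule in rules:
                # A rule sees only the current position and yields possible next positions.
                for new_position in rule(position):
                    if new_position not in seen:
                        seen.add(new_position)
                        next_frontier.add(new_position)
        frontier = next_frontier
    return seen

def swap_adjacent(position: Position):
    """Hypothetical rule: swap any two neighbouring tokens."""
    for i in range(len(position) - 1):
        tokens = list(position)
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
        yield tuple(tokens)

positions = reachable(("A", "B", "C"), [swap_adjacent], max_steps=3)
print(sorted(positions))
```

Because each rule receives nothing but the current position, the system has no memory beyond the arrangement of tokens itself.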
Language as a formal system
Noam Chomsky
Studying the mind in 1950
• Behaviorism
– explaining complex behaviors through simple associative learning mechanisms
– constructing theories of behavior without internal mental states or representations
• e.g. language (= “verbal behavior”)
– speech acts are a response to environmental stimuli, with learned sequential structure
The cognitive revolution
• Chomsky provided evidence for the idea that we can model the mind as a formal system
– rigorous treatment of mental representations
– using human data to evaluate formal proposals
• This was part of a more general revolution in the way we approach behavior
– making the study of cognition respectable
Symposium on Information Theory
• Often considered the birth of cognitive science (on 9/11/56, at MIT)
• Three famous papers presented:
– Allen Newell & Herbert Simon, “The Logic Theory Machine: A complex information processing system”
– Noam Chomsky, “Three models of language”
– George Miller, “The magical number seven”
Behaviorist view of language
• People form associations between words and things (semantics)
• People form associations between words and other words (syntax)
– “the” followed by “word” makes it more likely that “the” will be followed by “word”
“sheep”
What was Chomsky attacking?
• Simplistic behaviorist notions of syntax
• Models of language as sequential
– e.g., n-th order Markov chains: $P(w_{i+n} \mid w_i, \ldots, w_{i+n-1})$
Markov chains
$w_{i+1}$ is independent of its history given $w_i$
[Figure: chain of eight word variables w]
Transition matrix: $P(w_{i+1} \mid w_i)$
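A transition matrix of this kind can be estimated by counting adjacent word pairs. The sketch below assumes a tiny made-up corpus; the function name `transition_probabilities` and the example words are illustrative and not part of the lecture.

```python
from collections import Counter, defaultdict

def transition_probabilities(words):
    """Estimate P(next word | current word) from adjacent word pairs."""
    pair_counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        pair_counts[current][following] += 1
    table = {}
    for current, counts in pair_counts.items():
        total = sum(counts.values())
        # Each row of the table is a probability distribution over next words.
        table[current] = {word: count / total for word, count in counts.items()}
    return table

# Made-up corpus, used only to show the shape of the result.
corpus = "the dog runs the dogs run the dog sleeps".split()
P = transition_probabilities(corpus)
print(P["the"])
```

Each row depends only on the current word, which is exactly the independence assumption stated on the slide.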
Markov chains
A. A. Markov
Chomsky's work in linguistics imply
concomitant understandings of aspects of
mental processing and human nature. His
theory of a universal grammar was seen by
many as a direct challenge to the
established behaviorist theories of the
external environment. The link between
human innate aptitude to language and mind
are innate. The acquisition and
development of innate propensities
triggered by the experiential input of the
time and in later discussions, we are
still far from understanding the genetic
setup of humans and aptitude to language
have been suggested at that time and had
major consequences for understanding how
language is learned by children.
What was Chomsky attacking?
• Simplistic behaviorist notions of syntax
• Models of language as sequential
– e.g., n-th order Markov chains: $P(w_{i+n} \mid w_i, \ldots, w_{i+n-1})$
– or, n-grams: $P(w_i, \ldots, w_{i+n})$
P (model, of, language)
P (model, of, quickly)
Language
“a set (finite or infinite) of sentences, each finite in length and constructed out of a finite set of elements”
[Figure: the language L drawn as a subset of all sequences]
This is a good sentence: 1
Sentence bad this is: 0
linguistic analysis aims to separate the grammatical sequences which are sentences of L from the ungrammatical sequences which are not
Grammatical ≠ meaningful
(1) “Colorless green ideas sleep furiously.”
(2) “Furiously sleep ideas green colorless.”
“It is fair to assume that neither sentence (1) nor (2) (nor indeed any part of these sentences) has ever occurred in an English discourse. Hence, in any statistical model for grammaticalness, these sentences will be ruled out on identical grounds as equally ‘remote’ from English.”
Grammatical ≠ probable*
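The quoted argument can be illustrated with a count-based bigram model that uses no smoothing. This is a minimal sketch: the three-sentence corpus is invented, and `bigram_probability` is a hypothetical helper, not a model from the lecture. Models that smooth their estimates behave differently.

```python
from collections import Counter

def bigram_probability(sentence, corpus_sentences):
    """Unsmoothed estimate: product over adjacent pairs of count(w1, w2) / count(w1 as first word of a pair)."""
    bigrams = Counter()
    starts = Counter()
    for s in corpus_sentences:
        words = s.lower().split()
        for w1, w2 in zip(words, words[1:]):
            bigrams[(w1, w2)] += 1
            starts[w1] += 1
    words = sentence.lower().split()
    probability = 1.0
    for w1, w2 in zip(words, words[1:]):
        if starts[w1] == 0:
            return 0.0  # w1 never begins a pair in the corpus, so this pair is unseen
        probability *= bigrams[(w1, w2)] / starts[w1]
    return probability

# Invented corpus in which neither test sentence (nor any of its word pairs) occurs.
toy_corpus = ["green ideas are rare", "dogs sleep soundly", "he spoke furiously"]
p1 = bigram_probability("colorless green ideas sleep furiously", toy_corpus)
p2 = bigram_probability("furiously sleep ideas green colorless", toy_corpus)
print(p1, p2)
```

Under these assumptions both sentences receive the same probability, so the model cannot tell the grammatical one from the ungrammatical one.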
Grammar
• A formal system…
– tokens
– initial positions
– rules for moving between positions
• …such that final positions are sentences
“a device that generates all of the grammatical sequences of L and none of the ungrammatical ones”
Syntax
• Atomic formulas: proposition symbols (e.g. P, Q), True and False
• Complex formulas built out of simple formulas via rules
– if α and β are okay, (α ∧ β) is okay
– if α and β are okay, (α ∨ β) is okay
– if α and β are okay, (α ⇒ β) is okay
– if α and β are okay, (α ⇔ β) is okay
– if α is okay, ¬α is okay
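These formation rules translate directly into a recursive check. In the sketch below, formulas are nested Python tuples and the connective names ("and", "or", "implies", "iff", "not") are an assumed encoding chosen for the example; the slide's own connective symbols are damaged in this transcript.

```python
ATOMS = {"P", "Q", "True", "False"}
BINARY = {"and", "or", "implies", "iff"}  # assumed names for the four binary connectives

def is_okay(formula):
    """Return True if `formula` can be built from atoms using only the formation rules."""
    if isinstance(formula, str):
        return formula in ATOMS
    if isinstance(formula, tuple):
        # Negation: ("not", sub)
        if len(formula) == 2 and formula[0] == "not":
            return is_okay(formula[1])
        # Binary connective: (left, connective, right)
        if len(formula) == 3 and formula[1] in BINARY:
            return is_okay(formula[0]) and is_okay(formula[2])
    return False

print(is_okay(("P", "and", ("not", "Q"))))
print(is_okay(("P", "and")))
```

The check mirrors the rules one for one: a formula is okay exactly when its parts are okay and it was combined by one of the listed rules.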
Finite state grammar
[Figure: state diagram running from START to FINISH, with transitions labeled THE, DOG, DOGS, RUNS, and RUN; later builds of the slide add a repeatable HAIRY transition]
Build sequence: THE, then THE DOG, then
THE DOG RUNS
THE DOGS RUN
THE HAIRY HAIRY HAIRY HAIRY HAIRY H
The languages generated by finite state grammars are called “regular” languages
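A finite state grammar like this one can be written as a transition table. The diagram itself is not recoverable from the transcript, so the intermediate state names (DET, SINGULAR, PLURAL) and the exact arcs below are assumptions chosen to reproduce the example sentences.

```python
# Assumed transition table: (current state, word) -> next state.
TRANSITIONS = {
    ("START", "THE"): "DET",
    ("DET", "HAIRY"): "DET",        # loop: any number of HAIRYs
    ("DET", "DOG"): "SINGULAR",
    ("DET", "DOGS"): "PLURAL",
    ("SINGULAR", "RUNS"): "FINISH",
    ("PLURAL", "RUN"): "FINISH",
}

def accepts(sentence):
    """Follow one transition per word; accept only if we end in FINISH."""
    state = "START"
    for word in sentence.split():
        state = TRANSITIONS.get((state, word))
        if state is None:
            return False  # no arc for this word from the current state
    return state == "FINISH"

print(accepts("THE HAIRY HAIRY DOG RUNS"))
print(accepts("THE DOG RUN"))
```

The machine remembers nothing except its current state, which is why number agreement here needs separate SINGULAR and PLURAL states.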
English is not a regular language
• Many simple languages are not regular
– e.g. $a^n b^n$ = { ab, aabb, aaabbb, aaaabbbb, … }
• English exhibits similar dependencies
– e.g. the dog the cat chased runs
• This “center embedding” indicates that English is not a regular language
– (provided we include arbitrarily long sentences)
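One way to see why $a^n b^n$ is beyond a finite state machine is that recognizing it requires a count that can grow without bound. The recognizer below is a minimal sketch with an illustrative function name; it uses an integer counter, which is exactly the kind of memory a fixed set of states cannot provide.

```python
def is_anbn(string):
    """Recognize a^n b^n for n >= 1 using an unbounded counter."""
    count = 0
    seen_b = False
    for ch in string:
        if ch == "a":
            if seen_b:
                return False   # an 'a' after a 'b' breaks the pattern
            count += 1
        elif ch == "b":
            seen_b = True
            count -= 1
            if count < 0:
                return False   # more b's than a's so far
        else:
            return False
    return seen_b and count == 0

print(is_anbn("aaabbb"), is_anbn("aab"))
```

The counter plays the role that matching each noun with its verb plays in a center-embedded sentence such as the one above.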
Phrase structure grammar
Tokens: S, NP, VP, T, N, V; the, man, ball, hit, took
Starting positions: S
Formal rules:
S → NP VP
NP → T N
VP → V NP
T → the
N → man, ball, …
V → hit, took, …
Phrase structure grammar
S → NP VP
NP → T N
VP → V NP
T → the
N → man, ball, …
V → hit, took, …
[Figure: parse tree for “the man hit the ball”: [S [NP [T the] [N man]] [VP [V hit] [NP [T the] [N ball]]]]]
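The rewrite rules can be run mechanically to enumerate the sentences they generate. The sketch below encodes only the words listed explicitly on the slide (the elided “…” entries are left out), and the names `RULES` and `expand` are illustrative.

```python
from itertools import product

# Only the words spelled out on the slide; the elided entries are omitted.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["T", "N"]],
    "VP": [["V", "NP"]],
    "T": [["the"]],
    "N": [["man"], ["ball"]],
    "V": [["hit"], ["took"]],
}

def expand(symbol):
    """Yield every word sequence derivable from `symbol` (finite here, since RULES has no recursion)."""
    if symbol not in RULES:
        yield [symbol]  # a terminal word
        return
    for right_hand_side in RULES[symbol]:
        # Combine every expansion of each symbol on the right-hand side.
        for parts in product(*(list(expand(s)) for s in right_hand_side)):
            yield [word for part in parts for word in part]

sentences = [" ".join(words) for words in expand("S")]
print(len(sentences), sentences[0])
```

With two nouns and two verbs the grammar yields eight sentences, including “the man hit the ball” from the tree above.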
Phrase structure grammar
• Context-free languages
– rules of the form X → Y
– e.g. $a^n b^n$: S → aSb, S → ab
• Context-sensitive languages
– rules of the form Z X W → Z Y W
– e.g. $a^n b^n c^n$
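The two context-free rules S → aSb and S → ab are enough to derive any string of the form $a^n b^n$. A minimal sketch of such a derivation follows; the function name `derive_anbn` is hypothetical and n is assumed to be at least 1.

```python
def derive_anbn(n):
    """Apply S -> aSb (n - 1 times), then S -> ab, returning every step of the derivation."""
    steps = ["S"]
    for _ in range(n - 1):
        steps.append(steps[-1].replace("S", "aSb"))  # rule S -> aSb
    steps.append(steps[-1].replace("S", "ab"))       # rule S -> ab ends the derivation
    return steps

print(" => ".join(derive_anbn(3)))
```

Each rule rewrites the single S regardless of what surrounds it, which is what makes the grammar context-free.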
Transformational grammar
• Phrase structure grammars miss some structural connections between sentences
– e.g. active and passive forms
• Hence transformations
– e.g. if active form is grammatical, so is passive
• Transformational grammar is complicated!
– complexity of identifying grammatical sentences:
Regular: O(n)
Context-free: O(n^3)
Context-sensitive: worse
Transformational: undecidable
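The cubic cost for context-free languages can be seen in the CYK algorithm, a standard recognizer for grammars in Chomsky normal form. The lecture does not name an algorithm, so CYK is brought in here purely as an illustration; the grammar reuses the phrase structure rules from the earlier slides, and the rule encoding is an assumption of this sketch.

```python
def cyk_accepts(words, binary_rules, lexical_rules, start="S"):
    """CYK recognition; the three nested loops over positions give the O(n^3) behaviour."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][0] = {lhs for lhs, w in lexical_rules if w == word}
    for span in range(2, n + 1):            # length of the substring
        for i in range(n - span + 1):       # start of the substring
            for split in range(1, span):    # length of the left part
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for lhs, (b, c) in binary_rules:
                    if b in left and c in right:
                        table[i][span - 1].add(lhs)
    return start in table[0][n - 1]

BINARY_RULES = [("S", ("NP", "VP")), ("NP", ("T", "N")), ("VP", ("V", "NP"))]
LEXICAL_RULES = [("T", "the"), ("N", "man"), ("N", "ball"), ("V", "hit"), ("V", "took")]

print(cyk_accepts("the man hit the ball".split(), BINARY_RULES, LEXICAL_RULES))
```

A finite state recognizer, by contrast, reads each word once and never revisits it, which is where the O(n) entry for regular languages comes from.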
Chomsky’s project
• Identifying a formal system that captures the structure of human language (and thought)
• Ignoring the limitations imposed by finite human memory resources
• This project was distinctive in
– postulating rich structures involved in cognition
– using human data (linguistic intuitions) to rule out certain kinds of formal systems
– also having implications for computer science
Marr’s three levels
Computation: “What is the goal of the computation, why is it appropriate, and what is the logic of the strategy by which it can be carried out?”
Representation and algorithm: “What is the representation for the input and output, and the algorithm for the transformation?”
Implementation: “How can the representation and algorithm be realized physically?”
[Figure labels: “constrains” between adjacent levels]
Break
Up next:
The Chomsky hierarchy
If language were finite
(Sung to the tune of “If I were a rich man”, from Fiddler on the Roof. With apologies to Noam Chomsky.)

If language were finite,
One could memorize
All sentences as if they were just lists.
But it’s not. True novelty exists.
Language is no finite sys-tem.

A finite state grammar
Is a tempting second thought
But clearly isn’t what we’ve got
The first words in almost any phrase
Can constrain the end in many ways

So how about a push-down automaton
Might that be the very thing?
Just like a finite-state with a proper stack.
There would be just one symbol popped from the top
Plus one as input from the string
And others there just waiting to pop back.

But hu-man language cannot be context-free
Swiss-German shows why this is true
Thanks to the cross seri-al de-pen-den-cy
So much for the very thought that language could be finite
I’ve shown why the notion just won’t do
But now onto what language has to be! Oy!

Language isn’t finite
Nor is it finite state
Or even possibly push-down
Nor just strings composed of verb and noun
Language is a complex sys-tem.

There are transformations
Over parse-trees that you
simply cannot code as linear strings
Every sentence is hierarchical
With deep structure that can be revealed

I see language as governed by universal grammar
With a hefty set of rules
And acquisition guided by a device
I see grammar as context sensitive and complex
Not like the grammar taught in schools
Wouldn’t such a system be quite nice?
The Chomsky hierarchy
Languages: Machines
Computable: Turing machine
Context sensitive: Bounded TM
Context free: Push-down automaton
Regular: Finite state automaton
Turing machine
[Figure: machine with labeled parts: state, read/write head, rules, tape]
Rules: (state, read, move, state, write)
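The five-part rule format can be turned into a small simulator. The sketch below reads each rule as (current state, symbol read, head move, next state, symbol written), which is an assumed interpretation of the slide's tuple; the bit-flipping machine, the state names "start" and "halt", and the blank symbol "_" are all invented for the example.

```python
def run_turing_machine(rules, tape, state="start", halt="halt", blank="_", max_steps=1000):
    """Run rules of the form (state, read, move, next_state, write) until the halt state or max_steps."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    table = {(s, r): (move, s2, w) for (s, r, move, s2, w) in rules}
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = cells.get(head, blank)
        # Raises KeyError if no rule covers this (state, symbol) pair.
        move, state, write = table[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# Hypothetical machine: flip every bit, halt at the first blank cell.
FLIP_BITS = [
    ("start", "0", "R", "start", "1"),
    ("start", "1", "R", "start", "0"),
    ("start", "_", "N", "halt", "_"),
]

print(run_turing_machine(FLIP_BITS, "0110"))
```

This particular machine only moves right and never rereads what it wrote, so it does not need the full power of a Turing machine; a machine that moves both ways over an unbounded tape is what goes beyond a finite state automaton.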
Finite state automaton
[Figure: the same machine diagram, with labels state, read/write head, rules, tape]
Rules: (state, read, move, state, write)
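[Figure: example languages handled by each class of machine, with small state and stack diagrams]
Finite state: $a^n b^m$
Pushdown: $a^n b^m$, $a^n b^n$
Readable stack / Bounded TM: $a^n b^m$, $a^n b^n$, $a^n b^n c^n$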
Non-context-free constructions
• “Cross-serial dependencies”
– occur in Swiss German and Dutch
– in English: “respectively”
“Bob, Jim, and Ted earned $3, $4, and $5 respectively”
• Cannot be produced by context-free grammar
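The difference between cross-serial and nested dependencies comes down to which items are paired with which. The sketch below contrasts the two pairing orders using the “respectively” example; both function names are illustrative, and the code shows only the pairing pattern, not a proof about grammars.

```python
def cross_serial_pairs(first, second):
    """Cross-serial: the i-th item of one list goes with the i-th item of the other."""
    return list(zip(first, second))

def nested_pairs(first, second):
    """Nested (center-embedded): the first item goes with the last, and so on inward."""
    return list(zip(first, reversed(second)))

names = ["Bob", "Jim", "Ted"]
amounts = ["$3", "$4", "$5"]
print(cross_serial_pairs(names, amounts))
print(nested_pairs(names, amounts))
```

A stack naturally produces the nested order, because the last item pushed is the first one popped; the crossing order of “respectively” is the pattern it cannot reproduce.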
The Chomsky hierarchy
Languages: Machines
Computable: Turing machine
Context sensitive: Bounded TM
Context free: Push-down automaton
Regular: Finite state automaton
[Figure label: Humans]
The power of rules and symbols
• Generativity
– “infinite use of finite means”
– from tokens, initial positions, and rules, infinitely many outcomes result
– captures (constrained) novelty of language
• Structured representations
– e.g. hierarchical representations, expressing relationships at multiple levels of abstraction
Structured representations
• Driving
[Figure: hierarchy of actions for starting a car]
Start car
– Step on gas: Move leg, Push gas
– Turn on ignition: Take out key, Insert key in ignition, Turn key
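A hierarchy of actions like this one can be stored as a tree and flattened into the sequence of primitive actions. The grouping of the leaf actions under “Step on gas” and “Turn on ignition” is inferred from the labels in the transcript, so the structure below is an assumption, as are the names `PLAN`, `primitive_actions`, and `depth`.

```python
# Each node is (name, children); leaves have an empty list of children.
PLAN = ("Start car", [
    ("Step on gas", [("Move leg", []), ("Push gas", [])]),
    ("Turn on ignition", [
        ("Take out key", []),
        ("Insert key in ignition", []),
        ("Turn key", []),
    ]),
])

def primitive_actions(node):
    """Return the leaf actions of the tree in left-to-right order."""
    name, children = node
    if not children:
        return [name]
    actions = []
    for child in children:
        actions.extend(primitive_actions(child))
    return actions

def depth(node):
    """Number of levels of abstraction in the tree."""
    _, children = node
    return 1 + max((depth(child) for child in children), default=0)

print(primitive_actions(PLAN))
print(depth(PLAN))
```

The flat list of leaves is what an observer sees; the tree records the relationships at higher levels of abstraction that the flat sequence leaves out.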
Structured representations
• Driving
• Cooking
(Humphreys & Forde, 1999)
(Cooper & Shallice, 2000)
Structured representations
• Driving
• Cooking
• Music and dance
• Is any behavior not hierarchically organized?
The power of rules and symbols
• Generativity
– “infinite use of finite means”
– from tokens, initial positions, and rules, infinitely many outcomes result
– captures (constrained) novelty of language
• Structured representations
– e.g. hierarchical representations, expressing relationships at multiple levels of abstraction
Next week
• Take a look at Problem Set 1!
– it’s harder, may take longer, plan accordingly
• Tuesday: Learning structured representations
– (or: Not learning structured representations)
– The Poverty of the Stimulus argument