Natural Language Processing Chapter 4
Dec 31, 2015
323-670 Artificial Intelligence, Chapter 4
NLP
• Language translation / multilingual translation
• Language understanding
– Figure 14.5 p. 365 Interaction among components
– Figure 14.6 p. 366 A speech waveform
Figure 14.5: More Interaction among Components
John saw the boy in the park with a telescope.
[Parse tree figure; node labels: S, NP, VP, V, DET, N, PP]
Figure 14.5: More Interaction among Components
John saw the boy in the park with a dog.
[Parse tree figure; node labels: S, NP, VP, V, DET, N, PP]
Figure 14.5: More Interaction among Components
John saw the boy in the park with a statue.
[Parse tree figure; node labels: S, NP, VP, V, DET, N, PP]
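The figure gives a separate parse tree for each of the three sentences, even though they differ only in the final noun. The following Python sketch suggests why syntax alone cannot choose among the trees: under a small grammar, a chart parser finds several structurally valid parses for the same word string. The grammar rules and lexicon here are assumptions made for illustration and are not taken from Figure 14.5.

```python
from collections import defaultdict

# Assumed toy lexicon: one category per word.
LEXICON = {
    "john": "NP", "saw": "V", "the": "DET", "a": "DET",
    "boy": "N", "park": "N", "telescope": "N",
    "in": "P", "with": "P",
}

# Assumed binary rules: (parent, left child, right child).
RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the PP modifies the verb phrase
    ("NP", "DET", "N"),
    ("NP", "NP", "PP"),   # the PP modifies the noun phrase
    ("PP", "P", "NP"),
]

def count_parses(sentence, goal="S"):
    """Count the distinct parse trees for a sentence (CYK-style chart)."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    chart = defaultdict(int)  # (start, end, category) -> number of trees
    for i, word in enumerate(words):
        chart[(i, i + 1, LEXICON[word])] = 1
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, goal)]

print(count_parses("John saw the boy."))                               # 1
print(count_parses("John saw the boy in the park with a telescope."))  # 5
```

Choosing among the five candidate trees takes knowledge beyond the grammar, for example that a telescope is an instrument for seeing.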
Figure 14.6: Local Ambiguity in a Speech Problem
The cat scares all the birds away.
k a t s k a r s
A cat’s cares are few.
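The local ambiguity above can be sketched in Python as a segmentation problem: the same phoneme string splits into known words in more than one way. The phoneme spellings below are assumptions based on the slide's "k a t s k a r s", and the four-word lexicon is purely illustrative.

```python
# Assumed phoneme spellings, following the slide's "k a t s k a r s".
PHONETIC_LEXICON = {
    "kat": "cat",
    "kats": "cat's",
    "skars": "scares",
    "kars": "cares",
}

def segmentations(phonemes):
    """Return every way to split the phoneme string into known words."""
    if not phonemes:
        return [[]]
    results = []
    for end in range(1, len(phonemes) + 1):
        word = PHONETIC_LEXICON.get(phonemes[:end])
        if word is not None:
            for rest in segmentations(phonemes[end:]):
                results.append([word] + rest)
    return results

print(segmentations("katskars"))
# [['cat', 'scares'], ["cat's", 'cares']]
```

Both segmentations are locally valid; only the rest of the sentence settles which one was spoken.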
The Problem: English sentences are incomplete descriptions of the information that they are intended to convey:
Some dogs are outside.
Some dogs are on the lawn.
Three dogs are on the lawn.
Rover, Tripp, and Spot are on the lawn.

I called Lynda to ask her to the movies. She said she'd love to go.
She was home when I called.
She answered the phone.
I actually asked her.
The Good Side: Language allows speakers to be as vague or precise as they like. It also allows speakers to leave out things they believe their hearers already know.
The Problem: The same expression means different things in different contexts:
Where's the water? (in a chemistry lab, it must be pure)
Where's the water? (when you are thirsty, it must be potable)
Where's the water? (dealing with a leaky roof, it can be filthy)
The Good Side: Language lets us communicate about an infinite world using a finite (and thus learnable) number of symbols.
The Problem: No natural language program can be complete because new words, expressions, and meanings can be generated quite freely:
I’ll fax it to you.
The Good Side: Language can evolve as the experiences that we want to communicate about evolve.
The Problem: There are lots of ways to say the same thing:
Mary was born on October 11.
Mary's birthday is October 11.
The Good Side: When you know a lot, facts imply each other. Language is intended to be used by agents who know a lot.
Figure 15.1: Features of Language That Make It Both Difficult and Useful
NLP Problems
• Figure 15.1 p. 378
• English sentences are incomplete descriptions of the information that they are intended to convey.
• The same expression means different things in different contexts.
• No natural language program can be complete because new words, expressions, and meanings can be generated quite freely.
• There are lots of ways to say the same thing.
NLP Problems
1) Processing written text
– uses lexical, syntactic, and semantic knowledge of the language
– requires real-world information
2) Processing spoken language
– uses all of the information needed above, plus additional knowledge about phonology
– must handle ambiguities in speech
Steps in NLP
1) Morphological Analysis
2) Syntactic Analysis
3) Semantic Analysis
4) Discourse Integration
5) Pragmatic Analysis
– The boundaries between these five phases are often fuzzy.
1. Morphological Analysis
• Individual words are analyzed into their components
• Nonword tokens such as punctuation are separated from the words
• I want to print Bill's .init file.
– Bill: proper noun
– 's: possessive suffix
– .init: file extension
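A minimal Python sketch of this step, using a regular expression to split the example sentence into tokens and label them. The token pattern, the label names, and the one-entry proper-noun list are assumptions made for illustration; a real morphological analyzer would use a full lexicon.

```python
import re

# Alternatives are tried left to right at each position.
TOKEN_RE = re.compile(
    r"""
    \.[A-Za-z]\w*   # file extension, e.g. .init
  | 's              # possessive suffix
  | [A-Za-z]+       # ordinary word
  | [^\w\s]         # any other single punctuation mark
    """,
    re.VERBOSE,
)

PROPER_NOUNS = {"Bill"}  # assumed mini-lexicon

def analyze(sentence):
    """Split a sentence into (token, label) pairs."""
    labeled = []
    for token in TOKEN_RE.findall(sentence):
        if token.startswith(".") and len(token) > 1:
            label = "file extension"
        elif token == "'s":
            label = "possessive suffix"
        elif token in PROPER_NOUNS:
            label = "proper noun"
        elif token.isalpha():
            label = "word"
        else:
            label = "punctuation"
        labeled.append((token, label))
    return labeled

for token, label in analyze("I want to print Bill's .init file."):
    print(token, "->", label)
```

The sketch expects a straight apostrophe in the input; text with a typographic apostrophe would need to be normalized first.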
2. Syntactic Analysis
• A linear sequence of words is transformed into structures
• These structures show how the words relate to each other
• English syntactic analyzer
• A word sequence that does not pass the syntactic analyzer is rejected, e.g. (Boy the go to store the)
2. Syntactic Analysis
• Example of syntactic analysis: Figure 15.2 p. 382 (RM2, RM5, RM5)
• A knowledge base fragment: Figure 15.3 p. 383 (User073, F1, Printing, File_Structure, Waiting, Mental Event / Physical Event, Animate / Event)
• Partial meaning for a sentence: Figure 15.4 p. 384
Syntax: The dog bites the man.
Apply rule
Parse Tree: The man bites the dog.
Parse Tree: The dog likes a man.
Internal Representation
Syntactic Processing (2)
• Top-down Parsing
– Begin with the start symbol and apply the grammar rules forward until the symbols at the terminals of the tree correspond to the components of the sentence being parsed.
• Bottom-up Parsing
– Begin with the sentence to be parsed and apply the grammar rules backward until a single tree whose terminals are the words of the sentence and whose top node is the start symbol has been produced.
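The two strategies can be contrasted in a short Python sketch. Both functions only recognize a sentence (answer yes or no) and do not build a tree. The toy grammar and lexicon are assumptions chosen for illustration, and the simple backtracking search is one possible way to realize each strategy.

```python
# Assumed toy grammar: nonterminal -> list of alternative right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["N"]],
    "VP": [["V", "NP"], ["V"]],
}
# Assumed toy lexicon: word -> category.
LEXICON = {"the": "Det", "a": "Det", "dog": "N", "man": "N",
           "bites": "V", "likes": "V"}

def top_down(symbols, words):
    """Expand from the start symbol forward until the words are matched."""
    if not symbols:
        return not words
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        return any(top_down(rhs + rest, words) for rhs in GRAMMAR[first])
    # 'first' is a word category: it must match the next word.
    return (bool(words) and LEXICON.get(words[0]) == first
            and top_down(rest, words[1:]))

def bottom_up(cats):
    """Apply rules backward (reduce) until only the start symbol remains."""
    if cats == ("S",):
        return True
    for lhs, alternatives in GRAMMAR.items():
        for rhs in alternatives:
            n = len(rhs)
            for i in range(len(cats) - n + 1):
                if list(cats[i:i + n]) == rhs:
                    if bottom_up(cats[:i] + (lhs,) + cats[i + n:]):
                        return True
    return False

def accepts(sentence):
    words = sentence.lower().rstrip(".").split()
    cats = tuple(LEXICON.get(w) for w in words)
    return top_down(["S"], words), bottom_up(cats)

print(accepts("The dog bites the man."))   # (True, True)
print(accepts("Dog the bites man the."))   # (False, False)
```

The top-down recognizer terminates because this grammar has no left-recursive rules; a grammar with a rule such as NP → NP PP would need a different strategy, for example a chart.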
Transition Network
The man bites the dog.
ATN: Augmented Transition Network
• Similar to a finite state machine
– Figure 15.8 p. 392 An ATN Network
– Figure 15.9 p. 393 An ATN Grammar in List Form
• Sentence: "The long file has printed."
– S: NP, Q1, AUX, Q3, V, Q4 (F), halt
– NP: Det, Q6, Adj, Q6, N, Q7 (F)
• Result (p. 394):
(S DCL (NP (FILE (LONG) DEFINITE))
       HAS
       (VP PRINTED))
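A much-simplified Python sketch of the trace above. It hard-codes only the arcs listed on the slide (NP, AUX, V in the S network; Det, a looping Adj arc, and N in the NP network) and stores what it finds in local variables that play the role of ATN registers. The lexicon, the function names, and the tuple encoding of the result are assumptions; the ATN of Figure 15.8 has more arcs than this.

```python
# Assumed lexicon for the example sentence.
LEXICON = {"the": "Det", "a": "Det", "long": "Adj", "file": "N",
           "has": "AUX", "printed": "V"}

def category(words, pos):
    """Category of the word at pos, or None past the end of the input."""
    return LEXICON.get(words[pos]) if pos < len(words) else None

def parse_np(words, pos):
    """NP network: Det arc, Adj arc looping on Q6, N arc to the final state."""
    if category(words, pos) != "Det":
        return None
    definiteness = "DEFINITE" if words[pos] == "the" else "INDEFINITE"
    pos += 1
    adjectives = []
    while category(words, pos) == "Adj":      # loop on Q6
        adjectives.append(words[pos].upper())
        pos += 1
    if category(words, pos) != "N":
        return None
    noun = words[pos].upper()
    return ("NP", (noun, tuple(adjectives), definiteness)), pos + 1

def parse_sentence(sentence):
    """S network: NP arc to Q1, AUX arc to Q3, V arc to Q4 (final)."""
    words = sentence.lower().rstrip(".").split()
    np_result = parse_np(words, 0)
    if np_result is None:
        return None
    subject, pos = np_result
    if category(words, pos) != "AUX":
        return None
    aux = words[pos].upper()
    pos += 1
    if category(words, pos) != "V":
        return None
    verb = words[pos].upper()
    pos += 1
    if pos != len(words):                     # input must be fully consumed
        return None
    return ("S", "DCL", subject, aux, ("VP", verb))

print(parse_sentence("The long file has printed."))
```

For the example sentence the returned tuple mirrors the list structure shown on the slide; a sentence that does not follow the arcs yields None.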
3. Semantic Analysis
• The structures created by the syntactic analyzer are assigned meanings
• A mapping is made between the syntactic structures and objects in the task domain
• If no mapping is possible, the structure is rejected (e.g. "Colorless green ideas sleep furiously")
• 1) It must map individual words into appropriate objects in the knowledge base or database.
• 2) It must create the correct structures to correspond to the way the meanings of the individual words combine with each other.
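The two requirements can be sketched in Python: words are first mapped to knowledge-base classes, and a structure is then built only if the classes are compatible. The tiny knowledge base, the verb frames, and the single "agent" constraint are all hypothetical and serve only to show how a semantically anomalous sentence gets rejected.

```python
# Hypothetical knowledge base: word -> class of the object it denotes.
KB_CLASS = {"dog": "Animate", "man": "Animate", "ideas": "Abstract"}

# Hypothetical verb frames: verb -> class required of its agent.
VERB_FRAMES = {"sleep": {"agent": "Animate"},
               "bites": {"agent": "Animate"}}

def interpret(subject, verb):
    """Return a meaning structure, or None if no mapping exists."""
    frame = VERB_FRAMES.get(verb)
    subject_class = KB_CLASS.get(subject)
    if frame is None or subject_class is None:
        return None                 # a word has no object in the KB
    if subject_class != frame["agent"]:
        return None                 # the meanings cannot be combined
    return {"event": verb, "agent": subject}

print(interpret("dog", "sleep"))     # {'event': 'sleep', 'agent': 'dog'}
print(interpret("ideas", "sleep"))   # None
```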
Figure: Result of the syntactic analysis of the sentence "I want to print Bill's .init file."
The result of the semantic analysis is shown in the figure.
The final result of the pragmatic analysis is the UNIX command that tells UNIX to print the requested file:
lpr /wsmith/stuff.init
4. Discourse Integration
• The meaning of an individual sentence may depend on the sentences that precede it and may influence the meanings of the sentences that follow it.
• (Ex. John wants it.) "It" depends on the previous sentence.
• The current user who typed the word "I" is User068 = Susan_Black
• We get F1, with a filename in the /wsmith/ directory
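A minimal Python sketch of the idea: a discourse context remembers who is speaking and which entities were mentioned, so that "I" and "it" can be replaced by specific objects. The class and method names are hypothetical, and resolving "it" to the most recently mentioned entity is a deliberately naive assumption.

```python
class DiscourseContext:
    """Tracks the current user and the entities mentioned so far."""

    def __init__(self, current_user):
        self.current_user = current_user
        self.mentioned = []          # most recently mentioned entity is last

    def mention(self, entity):
        self.mentioned.append(entity)

    def resolve(self, word):
        """Replace a pronoun by the object it refers to in this context."""
        lowered = word.lower()
        if lowered == "i":
            return self.current_user
        if lowered == "it":
            return self.mentioned[-1] if self.mentioned else None
        return word

context = DiscourseContext(current_user="Susan_Black")
print(context.resolve("I"))      # Susan_Black
context.mention("F1")            # an earlier sentence introduced file F1
print(context.resolve("it"))     # F1
```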
5. Pragmatic Analysis
• The structure representing what was said is reinterpreted to determine what was actually meant.
• (Ex. Do you know what time it is?) The system should understand the sentence well enough to decide what to do as a result.
• Representing the intended meaning
– Figure 15.5 p. 385
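In the chapter's running example, this last step ends with the UNIX command lpr /wsmith/stuff.init. The Python sketch below shows one way such a step might look: a meaning structure describing a printing request is turned into the command to execute. The dictionary layout and the function name are hypothetical; only the resulting command comes from the slides.

```python
def to_unix_command(meaning):
    """Translate an intended meaning into a UNIX command string, if possible."""
    if meaning.get("event") == "Printing" and "file_path" in meaning:
        return "lpr " + meaning["file_path"]
    return None   # no known action for this meaning

# Hypothetical output of the earlier analysis steps for
# "I want to print Bill's .init file."
meaning = {"event": "Printing",
           "agent": "Susan_Black",
           "file_path": "/wsmith/stuff.init"}

print(to_unix_command(meaning))   # lpr /wsmith/stuff.init
```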
Turbo Prolog
TURBO PROLOG
ftp://172.28.80.6/older/DosProgram/TPROLOG
Alt + Enter = Big Screen
F1 : Help
F2 : Save
F3 : Load
F6 : Next/Switch
F8 : Previous Goal
F9 : Compile
F10 : Step (For trace) / End
Alt + T : Trace ON/OFF
Set up window size (edit): use the arrow keys to adjust the size
TURBO PROLOG
Use the examples from the EXAMPLE directory to try programming.
Start with EX03EX01.PRO

predicates
  likes(symbol,symbol)

clauses
  /* FACTS */
  likes(ellen, tennis).
  likes(john, football).
  likes(tom, baseball).
  likes(eric, swimming).
  likes(mark, tennis).
  /* RULES */
  likes(bill, Activity) if likes(tom, Activity).
  likes(mark, Activity) :- likes(ellen, Activity).
PROLOG.HELP
ARITHMETIC
Arithmetic operators: +, -, *, /, mod, div
Relational operators: >, <, =, >=, <=, <>, ><
Functions: sin, cos, tan, arctan, ln, log, exp, sqrt, round, trunc, abs
EX: 1 + 2 = 2 + 1, X = 5/2, X = 5 mod 2, 5 <> 9
PREDEFINED DOMAINS
char      1 byte characters
integer   2 byte integer numbers
real      8 byte floating point numbers
symbol    strings inserted in the internal symbol table
string    sequences of chars, e.g. "hello world\n"
SUMMARY OF PROGRAM SECTIONS
CONSTANTS
  const1 = definition
  const2 = definition
[GLOBAL] DOMAINS
  dom [,dom] = [reference] declaration1; declaration2
  listdom = dom*
  dom = <basisdom>
[GLOBAL] DATABASE [ - <databasename> ]
  [determ] pred1(....)
  pred2(.....)
GLOBAL PREDICATES
  [determ|nondeterm] pred1(.........) -(i,i,o,..)(i,o,i,..) [ language c|pascal|fortran ] [ as "name" ]
  pred2(........)
PREDICATES
  [determ|nondeterm] pred1(.........)
  pred2(........)
CLAUSES
  p(....):-p1(...), p2(.....), ... .
  p(....):-p1(...), p2(.....), ... .
include "filename"
  Include a file during compilation.
MISCELLANEOUS
random(RealVariable)
(real) - (o)
random(MaxValue,RandomInt)
(integer,integer) - (i,o)
sound(Duration,Frequency)
(integer,integer) - (i,i)
beep
date(Year,Month,Day)
(integer,integer,integer) - (o,o,o) (i,i,i)
time(Hours,Minutes,Seconds,Hundredths)
(integer,integer,integer,integer) - (o,o,o,o) (i,i,i,i)
trace(on/off)
(string) - (i) (o)
ERROR & BREAK CONTROL
trap(PredicateCall,ExitCode,PredicateToCallOnError)
exit
exit(ExitCode)
(integer) - (i)
On exit to DOS, the DOS errorlevel task processing variable will contain the value given to the exit predicate.
break(on/off)
(string) - (i) (o)
EDITOR
display(String)
(string) - (i)
edit(InputString,OutputString)
(string,string) - (i,o)
edit(InputString,OutputString,Headstr,Headstr2,Msg,Pos,Helpfilename,EditMode,Indent,Insert,TextMode,RetPos,RetStatus)
(string,string,string,string,string,integer,string,integer,integer,integer,integer,integer,integer) - (i,o,i,i,i,i,i,i,i,i,i,o,o)
If the user saves the text from the editor, Headstr2 will be used as the file name.
editmsg(InputString,OutputString,Headstr,Headstr2,Msg,Pos,Helpfilename,RetStatus)
(string,string,string,string,string,integer,string,integer) - (i,o,i,i,i,i,i,o)
WINDOW SYSTEM
makewindow(WindowNo,ScrAtt,FrameAtt,Framestr,Row,Column,Height,Width)
(integer,integer,integer,string,integer,integer,integer,integer)
shiftwindow(WindowNo)
(integer) - (i) (o)
gotowindow(WindowNo)
(integer) - (i)
resizewindow(StartRow,NoOfRows,StartCol,NoOfCols)
(integer,integer,integer,integer) - (i,i,i,i)
colorsetup(Main_Frame)
(integer) - (i)
INPUT
readln(StringVariable)
(string) - (o)
readint(IntgVariable)
(integer) - (o)
readreal(RealVariable)
(real) - (o)
readchar(CharVariable)
(char) - (o)
keypressed
unreadchar(CharToBePushedBack)
(char) - (i)
readterm( Domain, Variable )
(DomainName,Domain) - (i,_)
OUTPUT
write( Variable|Constant * )
nl
writef( FormatString, Variable|Constant* )
In the format string, the following options are recognized after a percent sign:
%d Normal decimal number. (chars and integers)
%u As an unsigned integer. (chars and integers)
%R As a database reference number. (database reference numbers)
%X As a long hexadecimal number. (strings, database reference numbers)
%x As a hexadecimal number. (chars and integers)
%s Strings. (symbols and strings)
%c As a char. (chars and integers)
%g Reals in shortest possible format (default for reals)
%e Reals in exponential notation
%f Reals in fixed notation
%lf Only for C compatibility (fixed reals)
\n - newline
\t - tabulator
\nnn - character with code nnn
Natural Language Processing Using Prolog
Sentence :- Noun_phrase, Verb_phrase.
Noun_phrase :- Det, Noun.
Noun_phrase :- Noun.
Verb_phrase :- Verb, Noun_phrase.
Verb_phrase :- Verb.
EX: The cat eats the fish. A man likes an apple.
EX13EX04.pro NLP.pro

domains
  sentence    = s(noun_phrase,verb_phrase)
  noun_phrase = noun(noun) ; noun_phrase(detrm,noun)
  noun        = string
  verb_phrase = verb(verb) ; verb_phrase(verb,noun_phrase)
  verb        = string
  detrm       = string

predicates
  s_sentence(string,sentence)
  s_noun_phrase(string,string,noun_phrase)
  s_verb_phrase(string,verb_phrase)
  d(string)
  n(string)
  v(string)
  start

goal
  start.

Sample run:
Please enter a sentence > Bill eats apple
EX13EX04.pro NLP.pro (cont)

clauses
  start :-
    write("\n Please enter a sentence > "),
    readln(Str),
    s_sentence(Str,s(_,_)).

  s_sentence(Str, s(N_Phrase,V_Phrase)) :-
    s_noun_phrase(Str, Rest, N_Phrase),
    s_verb_phrase(Rest, V_Phrase).

  s_noun_phrase(Str, Rest, noun_phrase(Detr,Noun)) :-
    fronttoken(Str,Detr,Rest1), d(Detr),
    fronttoken(Rest1,Noun,Rest), n(Noun).
  s_noun_phrase(Str, Rest, noun(Noun)) :-
    fronttoken(Str,Noun,Rest), n(Noun).

  s_verb_phrase(Str, verb_phrase(Verb,N_Phrase)) :-
    fronttoken(Str,Verb,Rest1), v(Verb),
    s_noun_phrase(Rest1,"",N_Phrase).
  s_verb_phrase(Str, verb(Verb)) :-
    fronttoken(Str,Verb,""), v(Verb).
EX13EX04.pro NLP.pro (cont)

  /* determiner */
  d("the"). d("a"). d("an").
  /* nouns */
  n("Bill"). n("dog"). n("cat"). n("fish"). n("ant"). n("apple"). n("man"). n("bus").
  /* verbs */
  v("is"). v("eats"). v("likes"). v("takes").

The cat likes fish
A man takes a bus
The End