Artificial Intelligence CIS 342 The College of Saint Rose David Goldschmidt, Ph.D. March 6, 2009

Dec 24, 2015


Corey Stevens
Page 1:

Artificial Intelligence

CIS 342

The College of Saint Rose

David Goldschmidt, Ph.D.

March 6, 2009

Page 2:

Crossword Puzzle Construction

Given:
– Dictionary of valid words and phrases
– Empty crossword grid

Problem:
– Fill the crossword grid such that all words, both across and down, are valid
– Assign clues

Page 3:

Crossword Puzzle Construction

Depth-First Search (DFS)
– Fill in words until a solution is found or a dead-end is encountered
– Backtrack from dead-ends
– Questions:
    Where do we start?
    What word do we fill in next?
    What backtracking strategies do we use?
    How do we avoid repetition (boring puzzles)?
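The fill-and-backtrack loop can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the slot representation (each slot is a list of (row, col) cells), the names `fill`, `SLOTS`, and `WORDS`, and the rule that a word is used at most once are all assumptions made for the example. Slots are simply filled in the order given, which leaves the "where do we start?" question open.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]

def fill(slots: List[List[Cell]], words: List[str],
         grid: Optional[Dict[Cell, str]] = None,
         used: Optional[set] = None) -> Optional[Dict[Cell, str]]:
    """Depth-first fill: place a word in the next slot, recurse, backtrack on dead-ends."""
    grid = {} if grid is None else grid
    used = set() if used is None else used
    if not slots:                       # every slot is filled: solution found
        return grid
    slot, rest = slots[0], slots[1:]
    for word in words:
        if word in used or len(word) != len(slot):
            continue
        # The word must agree with letters already placed by crossing words.
        if any(grid.get(cell, ch) != ch for cell, ch in zip(slot, word)):
            continue
        new_cells = [cell for cell in slot if cell not in grid]
        for cell, ch in zip(slot, word):
            grid[cell] = ch
        used.add(word)
        result = fill(rest, words, grid, used)
        if result is not None:
            return result
        # Dead-end: backtrack by undoing only the letters this word added.
        used.discard(word)
        for cell in new_cells:
            del grid[cell]
    return None

# Hypothetical 2x2 grid: two across slots followed by two down slots.
SLOTS = [
    [(0, 0), (0, 1)],
    [(1, 0), (1, 1)],
    [(0, 0), (1, 0)],
    [(0, 1), (1, 1)],
]
WORDS = ["AT", "SO", "AS", "TO"]
solution = fill(SLOTS, WORDS)
```

Here `solution` maps each cell to a letter, or is `None` when the dictionary cannot fill the grid.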

Page 4:

Crossword Puzzle Construction

Optimize the DFS:
– Add longer (most constrained) words first
– Associate weights with words in dictionary based on frequency of letters
    Friendly crossword puzzle words include letters: S, R, E, T, D, A, I, L
    Unfriendly crossword puzzle words include letters: J, Q, X, Z, F, V, W
        e.g. quiz, fix, jazz, quaff, xylophone, wax
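One possible way to turn the friendly and unfriendly letter lists into word weights is sketched below. The scoring rule (+1 per friendly letter, -1 per unfriendly letter) and the function names are assumptions for illustration; the slide does not specify how the weights are computed.

```python
FRIENDLY = set("SRETDAIL")     # letters listed as friendly on the slide
UNFRIENDLY = set("JQXZFVW")    # letters listed as unfriendly on the slide

def word_weight(word: str) -> int:
    """Assumed scoring rule: +1 per friendly letter, -1 per unfriendly letter."""
    score = 0
    for ch in word.upper():
        if ch in FRIENDLY:
            score += 1
        elif ch in UNFRIENDLY:
            score -= 1
    return score

def by_friendliness(words):
    """Order dictionary words so that the friendliest are tried first."""
    return sorted(words, key=word_weight, reverse=True)
```

With this rule, a word such as "trail" scores higher than "quiz" or "jazz", so the DFS would try it earlier.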

Page 5:

Crossword Puzzle Construction

Genetic Algorithm (GA)
– Evolve a solution by crossovers and mutations through many generations
– Initial population of crossword grids:
    Random letters?
    Random letters based on Scrabble® frequencies?
    Random words from dictionary?
– Fitness of each grid is number of valid words

[Figure: one cycle of a genetic algorithm. Six chromosomes X1 to X6 in Generation i (fitness values f = 56, 54, 36, 44, 14, 14) undergo crossover and mutation to produce Generation (i + 1) (fitness values f = 54, 56, 56, 50, 44, 44).]
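A rough sketch of such a GA follows, assuming a square grid with no black squares stored as a flat string of letters, an initial population of uniformly random letters, single-point crossover, and truncation selection (the fitter half become parents). All of these choices, and the names `fitness`, `crossover`, `mutate`, and `evolve`, are illustrative assumptions; only the fitness definition (number of valid words) comes from the slide.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def fitness(grid: str, size: int, dictionary: set) -> int:
    """Number of rows and columns of the grid that spell a valid word."""
    rows = [grid[r * size:(r + 1) * size] for r in range(size)]
    cols = [grid[c::size] for c in range(size)]
    return sum(1 for w in rows + cols if w in dictionary)

def crossover(a: str, b: str, rng: random.Random):
    """Single-point crossover: swap the tails of two parent grids."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(grid: str, rate: float, rng: random.Random) -> str:
    """Replace each letter with a random one with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch
                   for ch in grid)

def evolve(dictionary: set, size: int, pop_size: int = 50,
           generations: int = 200, rate: float = 0.05, seed: int = 0) -> str:
    rng = random.Random(seed)
    score = lambda g: fitness(g, size, dictionary)
    population = ["".join(rng.choice(ALPHABET) for _ in range(size * size))
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        if score(population[0]) == 2 * size:   # every row and column is valid
            return population[0]
        parents = population[:pop_size // 2]   # fitter half reproduces
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            for child in crossover(a, b, rng):
                children.append(mutate(child, rate, rng))
        population = children[:pop_size]
    return max(population, key=score)
```

There is no guarantee that `evolve` reaches a fully valid grid; it returns the fittest grid it has when it stops.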

Page 6:

Solving Crossword Puzzles

Given:
– Crossword grid
– Clues

Problem:
– Fill the grid such that all words correctly answer the given clues

Page 7:

Solving Crossword Puzzles

Obtain candidate answers for each clue
– Assign a confidence value to each candidate
– Are we guaranteed to have the correct answer?

Place candidate answers in grid until a solution is found or a dead-end occurs
– Which backtracking strategies should we use?

Page 8:

Solving Crossword Puzzles

PROVERB (Duke University, 1999)
– Modules provide candidate answers from dictionaries, encyclopedias, movie databases, etc.
– One module draws on a Crossword Puzzle Database of exactly 5142 previously solved puzzles
    Pivotal in PROVERB’s success
– Another module generates all combinations of letters (ouch!)

Page 9:

Solving Crossword Puzzles

Google CruciVerbalist (GCV)

Page 10:

Solving Crossword Puzzles

GCV solved a 13x13 puzzle with 68 clues
– Many clues are fill-in-the-blank or pop-culture clues
– Candidate answers obtained from Google results page (top 50)
– Solved using 559 Google queries
– Queries yielded 68 correct answers
    44 correct answers had highest confidence

Page 11:

Solving Crossword Puzzles

Page 12:

Clue Preprocessing

Categorize clues based on text and type of clues:
– Fill-in-the-blank clues
– Synonyms/Antonyms
– “Type of” (or “Kind of”) clues
– Abbreviations
– Clues with “and” or “or”
– Singular or plural
– Number of words in answer
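A simple pattern-based classifier gives a feel for this categorization step. The rules, their order, the category labels, and the default of treating any other clue as a synonym clue are assumptions for illustration; GCV's actual rules are not given in the slides.

```python
import re

def categorize(clue: str) -> str:
    """Assign a clue to one coarse category using surface patterns only."""
    text = clue.strip()
    lower = text.lower()
    if "___" in text:
        return "fill-in-the-blank"
    if "(abbr.)" in lower:
        return "abbreviation"
    if re.match(r"(type|kind) of\b", lower):
        return "type-of"
    if re.search(r"\b(and|or)\b", lower):
        return "and/or"
    if lower.startswith("not "):
        return "antonym"
    return "synonym"   # assumed default when no other pattern applies
```

For example, `categorize("Knight or Danson")` gives `"and/or"`, while a one-word clue such as "Diplomacy" falls through to the synonym default.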

Page 13:

Clue Preprocessing

Translate clues to Google-friendly forms
– “To ___ is human”
    “To * is human”
    “To * * is human”
– “Mary ___ little lamb” (2 words)
    “Mary * * little lamb”
– “___ to Joy” by Beethoven
    “* to Joy” by Beethoven
    “* * to Joy” by Beethoven
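The blank-to-wildcard rewriting in these examples can be sketched as below. The function name `blank_queries` and its parameters are hypothetical; the sketch assumes that a clue with a stated answer length yields one query with that many wildcards, and otherwise one query per length from 1 up to `max_words`.

```python
from typing import List, Optional

def blank_queries(clue: str, max_words: int = 2,
                  exact_words: Optional[int] = None) -> List[str]:
    """Replace the blank with 1..max_words wildcards, or exactly exact_words."""
    counts = [exact_words] if exact_words else range(1, max_words + 1)
    # " ".join("*" * n) builds "*", "* *", "* * *", ...
    return [clue.replace("___", " ".join("*" * n)) for n in counts]

one_or_two = blank_queries("To ___ is human")
two_only = blank_queries("Mary ___ little lamb", exact_words=2)
```

Here `one_or_two` holds the two variants shown for "To ___ is human", and `two_only` holds the single two-wildcard query for the "(2 words)" clue.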

Page 14:

Clue Preprocessing

Translate clues to Google-friendly forms
– Diplomacy
    synonyms of Diplomacy
– Not dry
    opposite of dry
    antonyms of dry
– Joy
    synonyms of Joy

Page 15:

Clue Preprocessing

Translate clues to Google-friendly forms
– Type of dancing [or Kind of dancing]
    * dancing
– Second sight (abbr.)
    Second sight
    abbreviations of Second sight
– Superman’s admirer
    admirer of Superman
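The synonym, antonym, "type of", abbreviation, and possessive rewrites can be approximated with a few regular expressions. This is a sketch under assumptions: the function name `rewrite`, the rule order, and the fallback to "synonyms of" are illustrative and only mirror the slide examples.

```python
import re
from typing import List

def rewrite(clue: str) -> List[str]:
    """Return Google-friendly queries for a clue, mirroring the slide examples."""
    text = clue.strip()
    m = re.match(r"(?:type|kind) of (.+)", text, re.IGNORECASE)
    if m:                                   # "Type of dancing"
        return ["* " + m.group(1)]
    m = re.match(r"(.+?)\s*\(abbr\.\)$", text, re.IGNORECASE)
    if m:                                   # "Second sight (abbr.)"
        return [m.group(1), "abbreviations of " + m.group(1)]
    m = re.match(r"not (.+)", text, re.IGNORECASE)
    if m:                                   # "Not dry"
        return ["opposite of " + m.group(1), "antonyms of " + m.group(1)]
    m = re.match(r"(\w+)['’]s (.+)", text)
    if m:                                   # "Superman's admirer"
        return [m.group(2) + " of " + m.group(1)]
    return ["synonyms of " + text]          # assumed default, e.g. "Diplomacy"
```

Calling `rewrite("Not dry")` yields the two antonym-style queries, and `rewrite("Joy")` falls back to a single synonym query.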

Page 16:

Clue Preprocessing

Translate clues to Google-friendly forms
– Couldn’t move
    Could not move
    Could opposite of move
    Could antonyms of move
– Knight or Danson
    Knight Danson

Page 17:

Clue Preprocessing

Translate clues to Google-friendly forms
– Bosley and Arnold
    Bosley Arnold
    Append an ‘s’
– Henson, and others [or Henson, and namesakes]
    Henson
    Append an ‘s’
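The negation and compound-name rewrites from the last two slides might look like the sketch below. It returns the queries plus a flag; reading "Append an ‘s’" as "pluralize each candidate answer" is an interpretation, not something the slides state, and the function name `rewrite_compound` is hypothetical. The contraction rule is deliberately naive and only handles forms like "Couldn't".

```python
import re
from typing import List, Tuple

def rewrite_compound(clue: str) -> Tuple[List[str], bool]:
    """Return (queries, pluralize) for negated or compound clues."""
    text = clue.strip()
    # "Couldn't move": expand the contraction, then try antonym phrasings.
    m = re.match(r"(\w+)n['’]t (.+)", text)
    if m:
        stem, rest = m.group(1), m.group(2)
        return ([f"{stem} not {rest}",
                 f"{stem} opposite of {rest}",
                 f"{stem} antonyms of {rest}"], False)
    # "Henson, and others" / "Henson, and namesakes": search the name alone.
    m = re.match(r"(.+?),? and (?:others|namesakes)$", text, re.IGNORECASE)
    if m:
        return ([m.group(1)], True)
    # "Bosley and Arnold": search both names; the answer is assumed plural.
    if " and " in text:
        return ([" ".join(text.split(" and "))], True)
    # "Knight or Danson": search both names; the answer stays singular.
    if " or " in text:
        return ([" ".join(text.split(" or "))], False)
    return ([text], False)
```

When the flag is true, a caller would append an "s" to each candidate returned by the search before trying it in the grid.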

Page 18:

Results of Google-Querying

Page 19:

Results of Google-Querying

GCV excels at solving fill-in-the-blank and pop-culture clues
– Why?

Though results are encouraging, using keyword-based searching is limited
– Why?

Page 20:

Populating the Crossword Grid

Use a Depth-First Search (DFS) algorithm:
– Fill in the crossword grid based on confidence values of candidate words
– At each iteration:
    Select candidate word with highest confidence value amongst clues not yet placed
    Attempt to fit candidate word into grid
– Halt when a solution is found or a dead-end occurs
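The confidence-ordered fill can be sketched as a recursive search. The data layout (slots as lists of cells, candidates as (word, confidence) pairs per clue) and the names `fits` and `solve` are assumptions. On a dead-end this sketch simply removes the last word placed; it does not implement any smarter backtracking strategy.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]

def fits(grid: Dict[Cell, str], cells: List[Cell], word: str) -> bool:
    """A word fits if its length matches and it agrees with letters already placed."""
    return len(word) == len(cells) and all(
        grid.get(cell, ch) == ch for cell, ch in zip(cells, word))

def solve(slots: Dict[str, List[Cell]],
          candidates: Dict[str, List[Tuple[str, float]]],
          grid: Optional[Dict[Cell, str]] = None,
          placed: Optional[Dict[str, str]] = None) -> Optional[Dict[str, str]]:
    grid = {} if grid is None else grid
    placed = {} if placed is None else placed
    if len(placed) == len(slots):          # every clue has an answer
        return placed
    # Candidates for clues not yet placed, highest confidence first.
    options = sorted(((conf, clue, word)
                      for clue, cands in candidates.items() if clue not in placed
                      for word, conf in cands), reverse=True)
    for conf, clue, word in options:
        cells = slots[clue]
        if not fits(grid, cells, word):
            continue
        new_cells = [cell for cell in cells if cell not in grid]
        grid.update(zip(cells, word))
        placed[clue] = word
        result = solve(slots, candidates, grid, placed)
        if result is not None:
            return result
        del placed[clue]                   # dead-end: undo the last word placed
        for cell in new_cells:
            del grid[cell]
    return None

# Hypothetical two-clue grid: 1-Across and 1-Down share the top-left square.
slots = {"1A": [(0, 0), (0, 1), (0, 2)], "1D": [(0, 0), (1, 0), (2, 0)]}
candidates = {"1A": [("BAT", 0.9), ("CAT", 0.5)], "1D": [("COW", 0.8)]}
answers = solve(slots, candidates)
```

In this example the highest-confidence candidate BAT leads to a dead-end because COW cannot cross it, so the search undoes BAT and ends with COW and CAT.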

Page 21:

Populating the Crossword Grid

When a dead-end occurs, what do we do?
– Backtrack: Remove last word placed in grid
    Disadvantages?
– Backjump: Identify culprit and remove all words back to culprit word
    Disadvantages?

Page 22:

Populating the Crossword Grid

When a dead-end occurs, what do we do?
– Extricating Backjump: Identify and remove the culprit
    Disadvantages?
– How do we identify the culprit?

Page 23:

Extricating Backjumping

Assign weights to the squares of the grid
– Square weights correspond to confidence values of candidate words placed
– e.g. Place TWAIN with confidence value of 10 at 5-Across

Page 24:

Extricating Backjumping

Weights of interlocking words are multiplied

Page 25:

Extricating Backjumping

Define grid weight of a word as the sum of its individual square weights
– e.g. TWAIN = 100, NOW = 72

Page 26:

Extricating Backjumping

When a dead-end occurs, the culprit is the word with the lowest grid weight
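The weighting scheme can be checked against the TWAIN/NOW example with a short sketch. Two details are assumptions: NOW's confidence of 6 is not stated in the slides (it is the value that reproduces the totals 100 and 72 when NOW crosses TWAIN at the shared N), and the slot label "9-Down" and the cell coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]

def square_weights(placed: Dict[str, Tuple[str, int]],
                   slots: Dict[str, List[Cell]]) -> Dict[Cell, int]:
    """Each square's weight is the product of the confidences of the words covering it."""
    weights: Dict[Cell, int] = {}
    for clue, (word, conf) in placed.items():
        for cell in slots[clue]:
            weights[cell] = weights.get(cell, 1) * conf
    return weights

def grid_weight(clue: str, slots: Dict[str, List[Cell]],
                weights: Dict[Cell, int]) -> int:
    """Grid weight of a word: the sum of its square weights."""
    return sum(weights[cell] for cell in slots[clue])

def culprit(placed: Dict[str, Tuple[str, int]],
            slots: Dict[str, List[Cell]]) -> str:
    """The culprit is the placed word with the lowest grid weight."""
    weights = square_weights(placed, slots)
    return min(placed, key=lambda clue: grid_weight(clue, slots, weights))

# TWAIN runs across row 0; NOW runs down from the final N (hypothetical layout).
slots = {
    "5-Across": [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)],
    "9-Down": [(0, 4), (1, 4), (2, 4)],
}
placed = {"5-Across": ("TWAIN", 10), "9-Down": ("NOW", 6)}  # 6 is an assumed value
weights = square_weights(placed, slots)
twain = grid_weight("5-Across", slots, weights)  # 10 + 10 + 10 + 10 + 60
now = grid_weight("9-Down", slots, weights)      # 60 + 6 + 6
```

Under these assumptions NOW has the lower grid weight, so an extricating backjump would remove NOW and leave TWAIN in place.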

Page 27:

A Sampling of Crossword Puzzles

Page 28:

A Sampling of Crossword Puzzles

New York Times

Page 29:

A Sampling of Crossword Puzzles

Page 30:

A Sampling of Crossword Puzzles

TV Guide #42

Page 31:

A Sampling of Crossword Puzzles

Page 32:

A Sampling of Crossword Puzzles

TV Guide #63

Page 33:

A Sampling of Crossword Puzzles

Page 34:

A Sampling of Crossword Puzzles

Mensa Kids Puzzle #3

Page 35:

Results of Grid Solving

Page 36:

Limitations of Keyword-Based Search

Google and GCV use keyword-based tricks to artificially improve result sets
– Word frequency & proximity to other words
– Additional keywords to help direct queries to good candidate answers
    e.g. synonyms of
– Grammatical and structural rearrangements

Page 37:

Limitations of Keyword-Based Search

Lack of precision in keyword-based search
– Irrelevant results in candidate answer lists
– Confidence values based on word frequency produce many false positives
– Correct answer is often buried in other mediocre (and incorrect!) candidates

Page 38:

In Conclusion...

Other uses of the Web as an automated information source?
– Keyword-based search is insufficient
– Lacks the means for machine-interpretable information
– Semantic Web