Learning Bit by Bit Class 4 - Ngrams
Learning Bit by Bit Class 4 - Ngrams. Ngrams Counting words Using observation to make predictions.

Dec 22, 2015

Transcript
Page 1:

Learning Bit by Bit

Class 4 - Ngrams

Page 2:

Ngrams

• Counting words
• Using observation to make predictions

Page 3:

Ngrams

• Corpus (plural: corpora): the body of text the counts are taken from

Page 4:

Unigram

• “how’s the weather out there?”
• [how’s, the, weather, out, there]
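The split into unigrams can be sketched in a few lines of Python. This is a minimal sketch, not the class's own code: the `tokenize` helper and its regular expression are assumptions made for illustration, and real tokenizers handle many more cases.

```python
import re

def tokenize(text):
    """Lowercase the text and return its words, keeping apostrophes inside words."""
    # Normalise the curly apostrophe so "how’s" and "how's" become the same token.
    text = text.lower().replace("’", "'")
    # A word is a run of letters/digits, optionally joined by apostrophes.
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text)

unigrams = tokenize("how's the weather out there?")
print(unigrams)  # ["how's", 'the', 'weather', 'out', 'there']
```

Punctuation such as the question mark is dropped, which matches the list on the slide.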

Page 5:

Unigram

• How many words are there?

Page 6:

Unigram

• How many times does “weather” occur?

Page 7:

Unigram

• Prob “weather” = occurrences of “weather” / total # of words

Page 8:

Unigram

• P(“weather”) = c(“weather”) / c(total)
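As a rough illustration of the unigram formula, the counts can be kept in a `collections.Counter`. The toy corpus below is invented for this sketch and is not from the class material.

```python
from collections import Counter

# Toy corpus, invented for illustration only.
corpus = "the weather is nice the weather is cold".split()

counts = Counter(corpus)      # c(word) for every word
total = sum(counts.values())  # c(total): number of word tokens

def unigram_prob(word):
    """P(word) = c(word) / c(total); unseen words get probability 0."""
    return counts[word] / total

print(unigram_prob("weather"))  # 2 / 8 = 0.25
```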

Page 9:

Bigram

• “the storm swept through the land”
• [(the, storm), (storm, swept), (swept, through), (through, the), (the, land)]
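One way to produce that list of pairs, assuming the sentence has already been split into words, is to zip the word list with itself shifted by one position. The `bigrams` helper name is an assumption for this sketch.

```python
def bigrams(tokens):
    """Pair every token with the token that follows it."""
    return list(zip(tokens, tokens[1:]))

tokens = "the storm swept through the land".split()
print(bigrams(tokens))
# [('the', 'storm'), ('storm', 'swept'), ('swept', 'through'),
#  ('through', 'the'), ('the', 'land')]
```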

Page 10:

Bigram

• How many times does “storm” follow “the”?

Page 11:

Bigram

• How many times does the word “the” occur?

Page 12:

Bigram

• Prob “storm” given “the” = occurrences of “the storm” / occurrences of “the”

Page 13:

Bigram

• Prob “storm” given “the” = occurrences of “the storm” / occurrences of “the”

• P(word n | word n-1)
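The bigram probability can be sketched the same way as the unigram one, with one counter for pairs and one for the preceding word. One assumption in this sketch: the preceding word is counted only where another word follows it, so the last word of the sentence does not count as a context. For this sentence that gives the same number as a plain count of “the”.

```python
from collections import Counter

tokens = "the storm swept through the land".split()

bigram_counts = Counter(zip(tokens, tokens[1:]))
# Count each word as a context, i.e. only where some word follows it.
context_counts = Counter(tokens[:-1])

def bigram_prob(prev, word):
    """P(word | prev) = c(prev word) / c(prev); 0 if prev was never seen."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]

print(bigram_prob("the", "storm"))  # 1 / 2 = 0.5
```

Here “the” occurs twice and is followed by “storm” once, so the estimate is 0.5.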

Page 14:

Markov Assumption

• The assumption that the probability of a word depends only on the previous word, or on a fixed number of previous words

• P(“land” | “the storm swept through the”) ≈ P(“land” | “the”)
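In code, the assumption amounts to throwing away all of the history except its last word before looking up the counts. The sketch below assumes a first-order (bigram) model; `next_word_prob` is a hypothetical helper name.

```python
from collections import Counter

tokens = "the storm swept through the land".split()
bigram_counts = Counter(zip(tokens, tokens[1:]))
context_counts = Counter(tokens[:-1])

def next_word_prob(history, word):
    """P(word | history) under a first-order Markov assumption."""
    prev = history[-1]  # everything before the last word is ignored
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]

long_history = "the storm swept through the".split()
short_history = ["the"]
# Both histories end in "the", so the model gives them the same estimate.
print(next_word_prob(long_history, "land") == next_word_prob(short_history, "land"))  # True
```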

Page 15:

N gram

• Extends the bigram model to condition on the previous N-1 words
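A general extractor that covers unigrams, bigrams, trigrams and beyond might look like the following sketch. The `ngrams` function name and signature are assumptions for illustration.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    # If there are fewer than n tokens the range is empty and so is the result.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the storm swept through the land".split()
print(ngrams(tokens, 1)[:2])  # [('the',), ('storm',)]
print(ngrams(tokens, 3)[:2])  # [('the', 'storm', 'swept'), ('storm', 'swept', 'through')]
```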

Page 16:

Maximum Likelihood Estimation

• N-gram probability based on corpus counts
• P(word n | word n-1) = count of word n-1 followed by word n / count of all times word n-1 occurs
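Written out with the c(·) count notation from the unigram slide, the bigram estimate and its extension to N words would read as follows. The symbols w_n for “word n” are shorthand introduced here, not notation from the slides.

```latex
% Bigram case: count of the pair divided by count of the first word
P(w_n \mid w_{n-1}) = \frac{c(w_{n-1}\, w_n)}{c(w_{n-1})}

% General N-gram case: condition on the previous N-1 words
P(w_n \mid w_{n-N+1}, \dots, w_{n-1})
  = \frac{c(w_{n-N+1} \dots w_{n-1}\, w_n)}{c(w_{n-N+1} \dots w_{n-1})}
```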

Page 17:

Trigram

• “the quick red fox jumped the quick black bear. The quick red fox hopped away.”

• [(the, quick, red), (quick, red, fox), (red, fox, jumped), (fox, jumped, the), (jumped, the, quick), (the, quick, black), (quick, black, bear), (the, quick, red), (quick, red, fox), (red, fox, hopped), (fox, hopped, away)]
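The list above contains no trigram that crosses the full stop, which suggests the text is split into sentences first. A minimal sketch under that assumption follows; the sentence splitting here is deliberately naive and `sentence_trigrams` is a hypothetical helper.

```python
import re

text = "the quick red fox jumped the quick black bear. The quick red fox hopped away."

def sentence_trigrams(text):
    """Collect trigrams sentence by sentence, lowercased."""
    trigrams = []
    # Split on sentence-ending punctuation so no trigram spans two sentences.
    for sentence in re.split(r"[.!?]", text.lower()):
        words = sentence.split()
        trigrams.extend(zip(words, words[1:], words[2:]))
    return trigrams

result = sentence_trigrams(text)
print(len(result))  # 11
print(result[0])    # ('the', 'quick', 'red')
```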

Page 18:

Trigram

• How many times does “the quick red” occur?

Page 19:

Trigram

• How many times does “the quick” occur?

Page 20:

Trigram

• Prob “red” given “the quick” = occurrences of “the quick red” / occurrences of “the quick”
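For the fox example this works out to 2/3: “the quick” occurs three times and is followed by “red” twice. A small sketch that checks the arithmetic, with the two sentences entered by hand and `trigram_prob` as an assumed helper name:

```python
from collections import Counter

sentences = [
    "the quick red fox jumped the quick black bear".split(),
    "the quick red fox hopped away".split(),
]

trigram_counts = Counter()
bigram_counts = Counter()
for words in sentences:
    trigram_counts.update(zip(words, words[1:], words[2:]))
    bigram_counts.update(zip(words, words[1:]))

def trigram_prob(w1, w2, w3):
    """P(w3 | w1 w2) = c(w1 w2 w3) / c(w1 w2); 0 if the pair was never seen."""
    if bigram_counts[(w1, w2)] == 0:
        return 0.0
    return trigram_counts[(w1, w2, w3)] / bigram_counts[(w1, w2)]

print(trigram_counts[("the", "quick", "red")])  # 2
print(bigram_counts[("the", "quick")])          # 3
print(trigram_prob("the", "quick", "red"))      # 2 / 3
```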

Page 21:

Test it in Google

• Google “the weather”
• How many results?

Page 22:

Test it in Google

• Google “the weather is”
• How many results?

Page 23:

Test it in Google

• Google “the weather out”
• How many results?

Page 24:

Test it in Google

• Google “weather the out”
• How many results?

Page 25:

Test it in Google

• Prob “out” given “the weather” = Count “the weather out” / Count “the weather”
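Once the two result counts have been read off the search page, the estimate is a single division. The numbers in this sketch are placeholders chosen for easy arithmetic, not real Google result counts; substitute whatever the searches actually return.

```python
def conditional_prob(count_phrase, count_prefix):
    """Count of the full phrase divided by count of its prefix."""
    if count_prefix == 0:
        return 0.0
    return count_phrase / count_prefix

# Placeholder numbers for illustration only; these are NOT real search results.
hits_the_weather = 1_000
hits_the_weather_out = 50

print(conditional_prob(hits_the_weather_out, hits_the_weather))  # 0.05
```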

Page 26:

Test it in Google

• Why so few results for “weather the out”?

Page 27:

Training and Testing

• Training set – bigger, e.g. 80-90% of the corpus
• Testing set – smaller, e.g. 10-20% of the corpus
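A split along these lines can be sketched with the standard library. The 80/20 default, the fixed seed, and the `train_test_split` name are assumptions for this example; the slides only give the rough proportions.

```python
import random

def train_test_split(sentences, test_fraction=0.2, seed=0):
    """Shuffle a copy of the sentences and hold out a fraction for testing."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    # Hold out at least one sentence so the test set is not empty.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

sentences = [f"sentence {i}" for i in range(10)]
train, test = train_test_split(sentences)
print(len(train), len(test))  # 8 2
```

Counts are then collected from `train` only, and `test` is kept aside to check how well the model predicts text it has not seen.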

Page 28:

Examples