Natural Language Processing
Giuseppe Attardi
Introduction to Probability
IP notice: some slides from: Dan Jurafsky, Jim Martin, Sandiway Fong, Dan Klein

Outline
• Probability
  • Basic probability
  • Conditional probability

1. Introduction to Probability
Experiment (trial)
• Repeatable procedure with well-defined possible outcomes
Sample Space (S)
• the set of all possible outcomes
• finite or infinite
Example
• coin toss experiment
• possible outcomes: S = {heads, tails}
Example
• die toss experiment
• possible outcomes: S = {1,2,3,4,5,6}
Slides from Sandiway Fong

Introduction to Probability
Definition of sample space depends on what we are asking
• Sample Space (S): the set of all possible outcomes
Example
• die toss experiment for whether the number is even or odd
• possible outcomes: {even, odd}
• not {1,2,3,4,5,6}

More definitions
Events
• an event is any subset of outcomes from the sample space
Example
• die toss experiment
• let A represent the event such that the outcome of the die toss experiment is divisible by 3
• A = {3,6}
• A is a subset of the sample space S = {1,2,3,4,5,6}
Example
• Draw a card from a deck
• suppose sample space S = {heart, spade, club, diamond} (four suits)
• let A represent the event of drawing a heart
• let B represent the event of drawing a red card
• A = {heart}
• B = {heart, diamond}
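
The two event examples above can be sketched with Python sets, since an event is just a subset of the sample space. The variable names are illustrative assumptions, not part of the slides.

```python
# Events modelled as Python sets (subsets of the sample space).
die_space = {1, 2, 3, 4, 5, 6}
divisible_by_3 = {x for x in die_space if x % 3 == 0}  # A = {3, 6}

suit_space = {"heart", "spade", "club", "diamond"}
heart = {"heart"}            # A: drawing a heart
red = {"heart", "diamond"}   # B: drawing a red card

# Every event is a subset of its sample space.
assert divisible_by_3 <= die_space
assert heart <= suit_space and red <= suit_space

# Drawing a heart implies drawing a red card, so A is a subset of B.
print(divisible_by_3, heart <= red)
```

Using sets makes union, intersection and complement available directly as `|`, `&` and `-`.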

Introduction to Probability
Some definitions
Counting
• suppose operation $o_i$ can be performed in $n_i$ ways, then
• a sequence of k operations $o_1 o_2 \ldots o_k$
• can be performed in $n_1 \cdot n_2 \cdots n_k$ ways
Example
• die toss experiment, 6 possible outcomes
• two dice are thrown at the same time
• number of sample points in sample space = 6 × 6 = 36
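
As a quick check of the counting rule, the sketch below enumerates the two-dice sample space and compares its size with the product of the per-operation counts. The helper name `count_sequences` is an assumption made for illustration.

```python
from itertools import product
from math import prod

die = range(1, 7)

# Enumerate every ordered pair (first die, second die).
two_dice = list(product(die, die))
print(len(two_dice))  # 36


def count_sequences(ways):
    """Number of ways to perform k operations when operation i has ways[i] options."""
    return prod(ways)


print(count_sequences([6, 6]))  # 36
```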

Definition of Probability
The probability law assigns to an event A a nonnegative number
• Called P(A)
• Also called the probability of A
• That encodes our knowledge or belief about the collective likelihood of all the elements of A
Probability law must satisfy certain properties

Probability Axioms
Nonnegativity
• $P(A) \geq 0$, for every event A
Additivity
• If A and B are two disjoint events ($A \cap B = \emptyset$), then the probability of their union (either one or the other occurs) satisfies:
  $P(A \cup B) = P(A) + P(B)$
Monotonicity
• $P(A) \leq P(B)$ for any $A \subseteq B$
Normalization
• The probability of the entire sample space S is equal to 1, i.e. $P(S) = 1$
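
A minimal sketch of how these properties could be checked by brute force on a small finite sample space. It assumes the probability law is given as a dictionary from outcomes to probabilities, with P(event) defined as the sum over its outcomes, so additivity holds by construction and is checked only for illustration. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations


def prob(event, law):
    """P(event): sum of the probabilities of the outcomes in the event."""
    return sum((law[o] for o in event), Fraction(0))


def all_events(space):
    """Every subset of a finite sample space."""
    items = list(space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def satisfies_axioms(law):
    space = frozenset(law)
    events = all_events(space)
    nonnegative = all(prob(a, law) >= 0 for a in events)
    normalized = prob(space, law) == 1
    additive = all(
        prob(a | b, law) == prob(a, law) + prob(b, law)
        for a in events for b in events if not (a & b)  # disjoint pairs only
    )
    monotone = all(
        prob(a, law) <= prob(b, law)
        for a in events for b in events if a <= b       # a is a subset of b
    )
    return nonnegative and normalized and additive and monotone


fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(satisfies_axioms(fair_die))  # True
```

The check enumerates all subsets, so it is only practical for very small sample spaces.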

An example
• An experiment involving a single coin toss
• There are two possible outcomes, H and T
• Sample space S is {H,T}
• If the coin is fair, we should assign equal probabilities to the 2 outcomes
• Since they have to sum to 1:
  P({H}) = 0.5
  P({T}) = 0.5
  P({H,T}) = P({H}) + P({T}) = 1.0

Another example
• Experiment involving 3 coin tosses
• Outcome is a 3-long string of H or T
  S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
• Assume each outcome is equiprobable
  “Uniform distribution”
• What is the probability of the event that exactly 2 heads occur?
  A = {HHT, HTH, THH}
  P(A) = P({HHT}) + P({HTH}) + P({THH})
       = 1/8 + 1/8 + 1/8
       = 3/8
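
The three-toss calculation can be reproduced by enumerating the sample space. This is a small sketch under the slide's uniform-distribution assumption; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 3-long strings of H or T.
space = ["".join(t) for t in product("HT", repeat=3)]

# Uniform distribution: each of the 8 outcomes has probability 1/8.
p_outcome = Fraction(1, len(space))

exactly_two_heads = [s for s in space if s.count("H") == 2]
p = p_outcome * len(exactly_two_heads)

print(exactly_two_heads, p)  # ['HHT', 'HTH', 'THH'] 3/8
```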

Probability definitions
In summary:
  $P(E) = \frac{\text{number of outcomes corresponding to event } E}{\text{total number of outcomes}}$
Probability of drawing a spade from 52 well-shuffled playing cards:
  $\frac{13}{52} = \frac{1}{4} = 0.25$
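
The counting definition translates directly into code. The sketch below builds a standard 52-card deck and counts spades; `prob_by_counting` is a hypothetical helper that is valid only when all outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["heart", "spade", "club", "diamond"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def prob_by_counting(event, space):
    """P(E) = (# outcomes in E) / (# outcomes in total), for equally likely outcomes."""
    return Fraction(len(event), len(space))


spades = [card for card in deck if card[1] == "spade"]
p_spade = prob_by_counting(spades, deck)
print(p_spade, float(p_spade))  # 1/4 0.25
```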

Probabilities of two events
If two events A and B are independent
• i.e. P(B) is the same whether or not A occurred
Then
• P(A and B) = P(A) · P(B)
Flip a fair coin twice
• What is the probability that they are both heads?
Draw a card from a deck, then put it back, draw a card from the deck again
• What is the probability that both drawn cards are hearts?
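
Both questions can be answered with the product rule for independent events. A short sketch with exact fractions, assuming a fair coin and a standard 52-card deck with 13 hearts:

```python
from fractions import Fraction
from itertools import product

# Two flips of a fair coin: the flips are independent.
p_heads = Fraction(1, 2)
p_both_heads = p_heads * p_heads
print(p_both_heads)  # 1/4

# Cross-check by enumerating the four equally likely outcomes.
flips = list(product("HT", repeat=2))
assert Fraction(flips.count(("H", "H")), len(flips)) == p_both_heads

# Two draws with replacement: the second draw does not depend on the first.
p_heart = Fraction(13, 52)
p_both_hearts = p_heart * p_heart
print(p_both_hearts)  # 1/16
```

Putting the card back is what makes the two draws independent; without replacement the second probability would change.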

How about non-uniform probabilities? An example
A biased coin, twice as likely to come up tails as heads, is tossed twice
• What is the probability that at least one head occurs?
• Sample space = {hh, ht, th, tt} (h = heads, t = tails)
• Sample points/probability for the event:
  hh  1/3 × 1/3 = 1/9
  ht  1/3 × 2/3 = 2/9
  th  2/3 × 1/3 = 2/9
  tt  2/3 × 2/3 = 4/9
• Answer: 5/9 ≈ 0.56 (sum of the weights of hh, ht and th)
          = 1 - 4/9 (prob. of complement)
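
The biased-coin answer can be verified both ways, by summing the weights of the favorable outcomes and by using the complement. A minimal sketch with P(h) = 1/3 and P(t) = 2/3 as stated in the example:

```python
from fractions import Fraction
from itertools import product

p = {"h": Fraction(1, 3), "t": Fraction(2, 3)}

# Probability of each two-toss outcome; the tosses are independent.
outcomes = {a + b: p[a] * p[b] for a, b in product("ht", repeat=2)}

# Direct: add up the outcomes that contain at least one head.
at_least_one_head = sum(w for o, w in outcomes.items() if "h" in o)

# Indirect: one minus the probability of no heads at all.
via_complement = 1 - outcomes["tt"]

print(at_least_one_head, via_complement)  # 5/9 5/9
```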

Computing Probabilities
Direct counts (when outcomes are equally probable)
• $P(A) = \frac{\#A}{\#S}$
Sum of union of disjoint events
• P(A or B) = P(A) + P(B)
Product of multiple independent events
• P(A and B) = P(A) · P(B)
Indirect probability:
• P(A) = 1 – P(S – A)

Moving toward language
What’s the probability of drawing a 2 from a deck of 52 cards with four 2s?
  $P(\text{drawing a two}) = \frac{4}{52} = \frac{1}{13} \approx 0.077$
What’s the probability of a random word (from a random dictionary page) being a verb?
  $P(\text{drawing a verb}) = \frac{\#\text{ of ways to get a verb}}{\text{all words}}$

Probability and part of speech tags
What’s the probability of a random word (from a random dictionary page) being a verb?
  $P(\text{drawing a verb}) = \frac{\#\text{ of ways to get a verb}}{\text{all words}}$
How to compute each of these
• All words = just count all the words in the dictionary
• # of ways to get a verb: number of words which are verbs!
• If a dictionary has 50,000 entries, and 10,000 are verbs, P(V) is 10000/50000 = 1/5 = .20
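
The same relative-frequency estimate can be sketched over a tagged word list. The tiny lexicon below is a made-up toy example (not from the slides), and the tag labels are assumptions; only the 10,000 / 50,000 figure comes from the text.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy lexicon of (word, part-of-speech) pairs, for illustration only.
lexicon = [
    ("run", "V"), ("dog", "N"), ("eat", "V"), ("blue", "ADJ"), ("tree", "N"),
    ("quickly", "ADV"), ("sing", "V"), ("house", "N"), ("green", "ADJ"), ("table", "N"),
]

tag_counts = Counter(tag for _, tag in lexicon)


def p_tag(tag):
    """P(tag) = (# entries with this tag) / (# entries in the lexicon)."""
    return Fraction(tag_counts[tag], len(lexicon))


print(p_tag("V"))  # 3/10 for this toy lexicon

# The slide's dictionary: 10,000 verbs out of 50,000 entries.
p_verb_slide = Fraction(10_000, 50_000)
print(p_verb_slide, float(p_verb_slide))  # 1/5 0.2
```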

Conditional Probability
A way to reason about the outcome of an experiment based on partial information
• In a word guessing game the first letter for the word is a “t”. What is the likelihood that the second letter is an “h”?
• How likely is it that a person has a disease given that a medical test was negative?
• A spot shows up on a radar screen. How likely is it that it corresponds to an aircraft?

More precisely
• Given an experiment, a corresponding sample space S, and a probability law
• Suppose we know that the outcome is within some given event B
• We want to quantify the likelihood that the outcome also belongs to some other given event A
• We need a new probability law that gives us the conditional probability of A given B: P(A|B)

An intuition
• A is “it’s raining now”
• P(A) in Tuscany is .01
• B is “it was raining ten minutes ago”
• P(A|B) means “what is the probability of it raining now if it was raining 10 minutes ago”
• P(A|B) is probably way higher than P(A)
  • Perhaps P(A|B) is .10
• Intuition: The knowledge about B should change our estimate of the probability of A.

Conditional probability
• One of the following 30 items is chosen at random
• What is P(X), the probability that it is an X?
• What is P(X|red), the probability that it is an X given that it is red?
O X X X O O
O X X O X O
O O O X O X
O O O O X O
O X X X X O

Conditional Probability
• let A and B be events
• P(B|A) = the probability of event B occurring given event A occurred
• definition: $P(B|A) = \frac{P(A \cap B)}{P(A)}$
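
A minimal sketch of the definition P(B|A) = P(A ∩ B) / P(A) for equally likely outcomes, using a die toss. The events chosen here (A: the number is greater than 3, B: the number is even) and the function names are illustrative assumptions.

```python
from fractions import Fraction


def prob(event, space):
    """P(event) by counting, assuming equally likely outcomes."""
    return Fraction(len(event & space), len(space))


def cond_prob(b, a, space):
    """P(B|A) = P(A and B) / P(A); only defined when P(A) > 0."""
    p_a = prob(a, space)
    if p_a == 0:
        raise ValueError("P(A) must be positive")
    return prob(a & b, space) / p_a


space = set(range(1, 7))
greater_than_3 = {4, 5, 6}   # A
even = {2, 4, 6}             # B

print(prob(even, space))                       # 1/2
print(cond_prob(even, greater_than_3, space))  # 2/3
```

Knowing that the toss is greater than 3 raises the probability of an even number from 1/2 to 2/3, because two of the three remaining outcomes are even.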

Conditional Probability
  $P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A,B)}{P(A)}$
Note: P(A,B) = P(B|A) · P(A)
also: P(A,B) = P(B,A)
hence: P(B|A) · P(A) = P(A|B) · P(B)
hence: …

Bayes’ Theorem
  $P(B|A) = \frac{P(A|B) \, P(B)}{P(A)}$
• P(B): prior probability
• P(B|A): posterior probability
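
Bayes' theorem can be checked numerically against the direct definition of conditional probability. The sketch below uses a fair die with A = "greater than 3" and B = "even"; these events and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction


def bayes(p_a_given_b, p_b, p_a):
    """Posterior P(B|A) = P(A|B) * P(B) / P(A)."""
    return p_a_given_b * p_b / p_a


space = set(range(1, 7))
a = {4, 5, 6}   # A: greater than 3
b = {2, 4, 6}   # B: even


def p(event):
    """Probability by counting on the fair die."""
    return Fraction(len(event), len(space))


p_a_given_b = p(a & b) / p(b)                # P(A|B) = 2/3
posterior = bayes(p_a_given_b, p(b), p(a))   # via Bayes' theorem
direct = p(a & b) / p(a)                     # via the definition of P(B|A)

print(posterior, direct)  # 2/3 2/3
```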

Independence
• What is P(A, B) if A and B are independent?
• P(A,B) = P(A) · P(B) iff A, B independent.
  P(heads, tails) = P(heads) · P(tails) = .5 · .5 = .25
• Note: P(A|B) = P(A) iff A, B independent
• Also: P(B|A) = P(B) iff A, B independent

Independent Events
• P(A) = P(A|B): 25/100 = 15/60
• $P(A \cap B) = P(A) \cdot P(B)$: 15/100 = 25/100 · 60/100
[Venn diagram: sample space S of 100 outcomes, with A = 25, B = 60 and overlap A ∩ B = 15]
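
The independence check for this example can be written out with exact fractions. The counts are those of the example (100 outcomes, 25 in A, 60 in B, 15 in both); the variable names are illustrative.

```python
from fractions import Fraction

n_s, n_a, n_b, n_ab = 100, 25, 60, 15

p_a = Fraction(n_a, n_s)    # 1/4
p_b = Fraction(n_b, n_s)    # 3/5
p_ab = Fraction(n_ab, n_s)  # 3/20

# Conditioning on B does not change the probability of A.
p_a_given_b = p_ab / p_b
print(p_a_given_b == p_a)   # True

# Equivalent test: the joint probability factorizes.
independent = p_ab == p_a * p_b
print(independent)          # True
```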

Monty Hall Problem
• The contestant is shown three doors.
• Two of the doors have goats behind them and one has a car.
• The contestant chooses a door.
• Before opening the chosen door, Monty Hall opens a door that has a goat behind it.
• The contestant can then switch to the other unopened door, or stay with the original choice.
• Which is best?

Solution
• Consider the sample space of doors: Car, A, B (A and B hide goats)
• There are three equally likely options:
  1. Contestant chooses Car. If she switches, she loses; if she stays, she wins.
  2. Contestant chooses A with goat. If she switches, she wins; otherwise she loses.
  3. Contestant chooses B with goat. If she switches, she wins; otherwise she loses.
• Switching wins in 2 of the 3 cases, so it gives a 2/3 chance of winning
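
The case analysis can be confirmed by exact enumeration and by a quick simulation. This sketch assumes the contestant's first choice is uniform over the three doors and that Monty always opens a goat door; the function names, trial count and seed are illustrative.

```python
import random
from fractions import Fraction

DOORS = ("car", "goat A", "goat B")


def wins(first_choice, switch):
    """After Monty opens a goat door, switching wins exactly when the first pick was a goat."""
    picked_car = first_choice == "car"
    return (not picked_car) if switch else picked_car


def exact_win_probability(switch):
    """Enumerate the three equally likely first choices."""
    return Fraction(sum(wins(d, switch) for d in DOORS), len(DOORS))


def simulate(switch, trials=100_000, seed=0):
    """Monte Carlo estimate of the winning frequency."""
    rng = random.Random(seed)
    won = sum(wins(rng.choice(DOORS), switch) for _ in range(trials))
    return won / trials


print(exact_win_probability(switch=True))   # 2/3
print(exact_win_probability(switch=False))  # 1/3
print(simulate(switch=True))                # close to 0.667
```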

Summary
• Probability
• Conditional Probability
• Independence

Additional Material
http://onlinestatbook.com/chapter5/probability.html