Folio Add Math 2010 Baru


    PART 1


    HISTORY OF PROBABILITY

    "A gambler's dispute in 1654 led to the creation of a mathematical theory ofprobability by two famous French mathematicians, Blaise Pascal and Pierre deFermat. Antoine Gombaud, Chevalier de Mr, a French nobleman with an interest

    in gaming and gambling questions, called Pascal's attention to an apparentcontradiction concerning a popular dice game. The game consisted in throwing apair of dice 24 times; the problem was to decide whether or not to bet even moneyon the occurrence of at least one "double six" during the 24 throws. A seeminglywell-established gambling rule led de Mr to believe that betting on a double six in24 throws would be profitable, but his own calculations indicated just the opposite.

    This problem and others posed by de Mr led to an exchange of letters betweenPascal and Fermat in which the fundamental principles of probability theory wereformulated for the first time. Although a few special problems on games of chancehad been solved by some Italian mathematicians in the 15th and 16th centuries, no

    general theory was developed before this famous correspondence.

    The Dutch scientist Christian Huygens, a teacher of Leibniz, learned of thiscorrespondence and shortly thereafter (in 1657) published the first book onprobability; entitled De Ratiociniis in Ludo Aleae, it was a treatise on problemsassociated with gambling. Because of the inherent appeal of games of chance,probability theory soon became popular, and the subject developed rapidly duringthe 18th century. The major contributors during this period were Jakob Bernoulli(1654-1705) and Abraham de Moivre (1667-1754).

    In 1812 Pierre de Laplace (1749-1827) introduced a host of new ideas andmathematical techniques in his book, Thorie Analytique des Probabilits. BeforeLaplace, probability theory was solely concerned with developing a mathematicalanalysis of games of chance. Laplace applied probabilistic ideas to many scientificand practical problems. The theory of errors, actuarial mathematics, and statisticalmechanics are examples of some of the important applications of probability theorydeveloped in the l9th century.

    Like so many other branches of mathematics, the development of probability theoryhas been stimulated by the variety of its applications. Conversely, each advance inthe theory has enlarged the scope of its influence. Mathematical statistics is one

    important branch of applied probability; other applications occur in such widelydifferent fields as genetics, psychology, economics, and engineering. Many workershave contributed to the theory since Laplace's time; among the most important areChebyshev, Markov, von Mises, and Kolmogorov.


One of the difficulties in developing a mathematical theory of probability has been to arrive at a definition of probability that is precise enough for use in mathematics, yet comprehensive enough to be applicable to a wide range of phenomena. The search for a widely acceptable definition took nearly three centuries and was marked by much controversy. The matter was finally resolved in the 20th century by treating probability theory on an axiomatic basis. In 1933 a monograph by the Russian mathematician A. Kolmogorov outlined an axiomatic approach that forms the basis for the modern theory. (Kolmogorov's monograph is available in English translation as Foundations of Probability Theory, Chelsea, New York, 1950.) Since then the ideas have been refined somewhat and probability theory is now part of a more general discipline known as measure theory."

Probability theory is a branch of mathematics concerned with determining the long-run frequency or chance that a given event will occur. This chance is determined by dividing the number of selected events by the number of total events possible. For example, each of the six faces of a die has a one in six probability on a single toss.

Inspired by problems encountered by seventeenth-century gamblers, probability theory has developed into one of the most respected and useful branches of mathematics, with applications in many different industries. Perhaps what makes probability theory most valuable is that it can be used to determine the expected outcome in any situation, from the chance that a plane will crash to the probability that a person will win the lottery.

    EMPIRICAL PROBABILITY

Empirical probability, also known as relative frequency or experimental probability, is the ratio of the number of favorable outcomes to the total number of trials, not in a sample space but in an actual sequence of experiments. In a more general sense, empirical probability estimates probabilities from experience and observation. The phrase "a posteriori probability" has also been used as an alternative to empirical probability or relative frequency. This unusual usage of the phrase is not directly related to Bayesian inference, and it is not to be confused with the phrase's equally occasional use to refer to posterior probability, which is something else.

In statistical terms, the empirical probability is an estimate of a probability. If modelling using a binomial distribution is appropriate, it is the maximum likelihood estimate. It is also the Bayesian estimate for the same case if certain assumptions are made for the prior distribution of the probability.

An advantage of estimating probabilities using empirical probabilities is that this procedure is relatively free of assumptions. For example, consider estimating the probability among a population of men that they satisfy two conditions: (i) that they are over 6 feet in height; (ii) that they prefer strawberry jam to raspberry jam. A direct estimate could be found by counting the number of men who satisfy both conditions to give the empirical probability of the combined condition. An alternative estimate could be found by multiplying the proportion of men who are over 6 feet in height by the proportion of men who prefer strawberry jam to raspberry jam, but this estimate relies on the assumption that the two conditions are statistically independent.
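To make the contrast concrete, here is a minimal Python sketch with entirely made-up, synthetically generated data (the heights, preference rate, and sample size are all assumptions chosen for illustration), comparing the direct count with the product-of-proportions estimate that assumes independence:

```python
# A minimal sketch with hypothetical data: compare a direct empirical
# estimate of P(over 6 ft AND prefers strawberry jam) with the product of
# the two marginal proportions, which is valid only under independence.
import random

random.seed(1)

# Hypothetical sample of men: (height in feet, prefers strawberry jam?)
# The two traits are generated independently here by construction.
sample = [(random.gauss(5.8, 0.3), random.random() < 0.4) for _ in range(10_000)]

n = len(sample)
tall = sum(1 for height, _ in sample if height > 6.0)
strawberry = sum(1 for _, prefers in sample if prefers)
both = sum(1 for height, prefers in sample if height > 6.0 and prefers)

direct_estimate = both / n                             # assumption-free count
independence_estimate = (tall / n) * (strawberry / n)  # assumes independence

print(f"direct estimate:       {direct_estimate:.4f}")
print(f"independence estimate: {independence_estimate:.4f}")
```

Because this synthetic sample generates the two traits independently, the two estimates agree; with real data they could differ, and only the direct count would remain trustworthy.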

A disadvantage in using empirical probabilities arises in estimating probabilities which are either very close to zero or very close to one. In these cases very large sample sizes would be needed in order to estimate such probabilities to a good standard of relative accuracy. Here statistical models can help, depending on the context, and in general one can hope that such models would provide improvements in accuracy compared to empirical probabilities, provided that the assumptions involved actually do hold. For example, consider estimating the probability that the lowest of the daily-maximum temperatures at a site in February in any one year is less than zero degrees Celsius. A record of such temperatures in past years could be used to estimate this probability. A model-based alternative would be to select a family of probability distributions and fit it to the dataset containing past yearly values: the fitted distribution would provide an alternative estimate of the required probability. This alternative method can provide an estimate of the probability even if all values in the record are greater than zero.
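A small Python sketch of that model-based alternative, using a hypothetical temperature record in which every past value is above zero; fitting a normal distribution is itself an assumption made here purely for illustration:

```python
# Fit a normal distribution to a hypothetical record of February minima
# (all above zero) and read P(minimum < 0 °C) off the fitted model.
from statistics import NormalDist, mean, stdev

past_minima_celsius = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7, 1.2, 2.6]  # hypothetical data

fitted = NormalDist(mu=mean(past_minima_celsius),
                    sigma=stdev(past_minima_celsius))

# The raw empirical estimate would be 0/8 = 0; the fitted model gives a
# small nonzero tail probability instead.
print(f"P(minimum < 0 °C) ≈ {fitted.cdf(0.0):.4f}")
```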

    THEORETICAL PROBABILITY


Assume you're looking for the probability of getting three heads out of three coin spins, and that you're using a fair coin.

For coin spins, theoretical probability is very simple. The probability of getting three heads in a row is 1/2 × 1/2 × 1/2 = 1/8. This means that if you tossed a coin three times, you'd expect to see three heads once in every 8 trials.

For experimental probability you need to define clear trials. For this experiment you can't just spin a coin over and over and count the number of times you see three heads in a row; for example, suppose you threw the following:

    H T H H T T H H H H H T T H T T T

you have three cases where you have three heads in a row, but they all overlap, so these are not independent trials and cannot be compared to the theoretical result. When conducting your experiment, you know that if you get a T in your trial, it doesn't matter what comes after: that trial has already failed to produce three heads in a row. The trial is deemed a success if you get three heads in a row, naturally. As a result, if you threw the above sequence, you would determine your experimental probability in the following way:

H T (fail) | H H T (fail) | T (fail) | H H H (success) | H H T (fail) | T (fail) | H T (fail) | T (fail) | T (fail)

In this example we have 9 trials and one success, so the experimental probability is 1/9. With so few trials, however, the sampling uncertainty is large: all you really know is that the true probability could plausibly be anywhere between 0 and roughly 1/4. The only way to get the variance down (and therefore narrow your confidence interval) is to perform more and more trials.

It's unlikely for the theoretical probability and experimental probability to be EXACTLY the same, but the more trials you do, the more the experimental probability will converge on the theoretical probability.
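This trial-counting rule is easy to simulate. A minimal Python sketch (assuming, as in the worked example, that any tail ends the current trial as a failure and that three straight heads end it as a success, so each trial succeeds with probability exactly 1/8):

```python
# Scan a long sequence of simulated coin flips, splitting it into trials:
# a tail ends the current trial as a failure, three straight heads end it
# as a success. The success rate should approach the theoretical 1/8.
import random

def run_trials(flips):
    successes = trials = streak = 0
    for flip in flips:
        if flip == "T":
            trials += 1      # any tail ends the current trial as a failure
            streak = 0
        else:
            streak += 1
            if streak == 3:  # three heads in a row: the trial succeeds
                successes += 1
                trials += 1
                streak = 0
    return successes, trials

random.seed(0)
flips = [random.choice("HT") for _ in range(100_000)]
successes, trials = run_trials(flips)
print(f"experimental: {successes / trials:.4f}   theoretical: {1 / 8:.4f}")
```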


Probability describes the chance that an uncertain event will occur.

Empirical Probability of an event is an "estimate" that the event will happen, based on how often the event occurs after collecting data or running an experiment (in a large number of trials). It is based specifically on direct observations or experiences.

    Empirical Probability Formula

P(E) = (number of times event E occurs) / (total number of trials)

where P(E) is the probability that an event, E, will occur; the numerator is the number of times the specific event is observed; and the denominator is the total number of trials of the experiment.

Example: A survey was conducted to determine students' favorite breeds of dogs. Each student chose only one breed.

Dog:  Collie  Spaniel  Lab  Boxer  Pit Bull  Other
#:    10      15       35   8      5         12

What is the probability that a student's favorite dog breed is Lab?

Answer: 35 out of the 85 students chose Lab. The probability is 35/85 = 7/17.
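As a quick check, the same empirical probability can be read off the frequency table in a couple of lines of Python:

```python
# Empirical probability from the survey's frequency table.
counts = {"Collie": 10, "Spaniel": 15, "Lab": 35,
          "Boxer": 8, "Pit Bull": 5, "Other": 12}

total = sum(counts.values())  # 85 students in all
print(f"P(Lab) = {counts['Lab']}/{total} = {counts['Lab'] / total:.3f}")  # 0.412
```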

Theoretical Probability of an event is the number of ways that the event can occur, divided by the total number of outcomes. It is finding the probability of events that come from a sample space of known equally likely outcomes.

    Theoretical Probability Formula

P(E) = n(E) / n(S)

where P(E) is the probability that an event E will occur, n(E) is the number of equally likely outcomes in E, and n(S) is the number of equally likely outcomes in the sample space S.

Example 1: Find the probability of rolling a six on a fair die.

Answer: The sample space for rolling a die has 6 equally likely results: {1, 2, 3, 4, 5, 6}. The probability of rolling a 6 is one out of 6, or 1/6.

Example 2: Find the probability of tossing a fair die and getting an odd number.

Answer:
event E: tossing an odd number
outcomes in E: {1, 3, 5}
sample space S: {1, 2, 3, 4, 5, 6}
P(E) = n(E)/n(S) = 3/6 = 1/2

    Comparing Empirical and Theoretical Probabilities:

Karen and Jason roll two dice 50 times and record their results in the accompanying chart.

1.) What is their empirical probability of rolling a 7?
2.) What is the theoretical probability of rolling a 7?
3.) How do the empirical and theoretical probabilities compare?

Sum of the rolls of the two dice:
3, 5, 5, 4, 6, 7, 7, 5, 9, 10, 12, 9, 6, 5, 7, 8, 7, 4, 11, 6, 8, 8, 10, 6, 7, 4, 4, 5, 7, 9, 9, 7, 8, 11, 6, 5, 4, 7, 7, 4, 3, 6, 7, 7, 7, 8, 6, 7, 8, 9

Solution:
1.) The empirical probability (experimental or observed probability) is 13/50 = 26%.
2.) The theoretical probability (based upon what is possible when working with two dice) is 6/36 = 1/6 ≈ 16.7%, since 6 of the 36 equally likely outcomes for two dice give a sum of 7.
3.) Karen and Jason rolled more 7's than would be expected theoretically.
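A short Python sketch reproduces both numbers, counting the observed 7's in the recorded data and enumerating the 36 equally likely outcomes for the theoretical value:

```python
# Empirical relative frequency of a sum of 7 in the 50 recorded rolls,
# versus the theoretical probability 6/36.
from itertools import product

rolls = [3, 5, 5, 4, 6, 7, 7, 5, 9, 10,
         12, 9, 6, 5, 7, 8, 7, 4, 11, 6,
         8, 8, 10, 6, 7, 4, 4, 5, 7, 9,
         9, 7, 8, 11, 6, 5, 4, 7, 7, 4,
         3, 6, 7, 7, 7, 8, 6, 7, 8, 9]

empirical = rolls.count(7) / len(rolls)
theoretical = sum(1 for a, b in product(range(1, 7), repeat=2) if a + b == 7) / 36

print(f"empirical:   {empirical:.3f}")    # 13/50 = 0.260
print(f"theoretical: {theoretical:.3f}")  # 6/36  ≈ 0.167
```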


    PART 2

Situation: A die is tossed once. The possible outcomes for the number shown on the die are:


    a) Possible outcomes: {1,2,3,4,5,6}

b) (Table: outcomes when two dice are tossed together)

Dice 2 \ Dice 1     1      2      3      4      5      6
1                  1,1    2,1    3,1    4,1    5,1    6,1
2                  1,2    2,2    3,2    4,2    5,2    6,2
3                  1,3    2,3    3,3    4,3    5,3    6,3
4                  1,4    2,4    3,4    4,4    5,4    6,4
5                  1,5    2,5    3,5    4,5    5,5    6,5
6                  1,6    2,6    3,6    4,6    5,6    6,6

(Tree diagram: each outcome 1, 2, 3, 4, 5, 6 of the first die branches to the six outcomes {1, 2, 3, 4, 5, 6} of the second die, giving the same 36 ordered pairs as the table.)
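The 36 ordered outcomes that the table and tree diagram list can be generated in Python with a single product over the two dice:

```python
# Generate the 36 ordered outcomes for two dice, matching the table above.
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))
print(len(sample_space))  # 36
print(sample_space[:6])   # (1, 1), (1, 2), ..., (1, 6)
```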


    PART 3

    a)

Sum, (x)   Number of outcomes   Possible outcomes, P                            P(x)
2          1                    { (1,1) }                                       1/36
3          2                    { (1,2), (2,1) }                                1/18
4          3                    { (2,2), (3,1), (1,3) }                         1/12
5          4                    { (1,4), (4,1), (2,3), (3,2) }                  1/9
6          5                    { (1,5), (5,1), (2,4), (4,2), (3,3) }           5/36
7          6                    { (1,6), (6,1), (2,5), (5,2), (3,4), (4,3) }    1/6
8          5                    { (2,6), (6,2), (3,5), (5,3), (4,4) }           5/36
9          4                    { (3,6), (6,3), (4,5), (5,4) }                  1/9
10         3                    { (4,6), (6,4), (5,5) }                         1/12
11         2                    { (5,6), (6,5) }                                1/18
12         1                    { (6,6) }                                       1/36

( Table 1 )
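Table 1 can be verified with a short Python sketch that groups the 36 outcomes by their sum and expresses each probability as an exact fraction:

```python
# Group the 36 equally likely outcomes by their sum; each probability is
# (number of outcomes with that sum) / 36, kept as an exact fraction.
from collections import defaultdict
from fractions import Fraction
from itertools import product

outcomes_by_sum = defaultdict(list)
for a, b in product(range(1, 7), repeat=2):
    outcomes_by_sum[a + b].append((a, b))

for x in sorted(outcomes_by_sum):
    outcomes = outcomes_by_sum[x]
    print(x, len(outcomes), Fraction(len(outcomes), 36))
```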

    b)

A = (The two numbers are not the same)
= { (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,3), (2,4), (2,5), (2,6), (3,1), (3,2), (3,4), (3,5), (3,6), (4,1), (4,2), (4,3), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4), (5,6), (6,1), (6,2), (6,3), (6,4), (6,5) }

B = (The product of the two numbers is greater than 36)
= { }, the empty set: the largest possible product is 6 × 6 = 36, so this event is impossible and P(B) = 0.

C = (Both numbers are prime or the difference between the two numbers is odd)
= { (1,2), (1,4), (1,6), (2,1), (2,2), (2,3), (2,5), (3,2), (3,3), (3,4), (3,5), (3,6), (4,1), (4,3), (4,5), (5,2), (5,3), (5,4), (5,5), (5,6), (6,1), (6,3), (6,5) }

(Pairs such as (3,3) and (3,5) are included because both entries are prime, while (5,1) is excluded: 1 is not prime and 5 − 1 = 4 is even.)

D = (The sum of the two numbers is even and both numbers are prime)
= { (2,2), (3,3), (3,5), (5,3), (5,5) }
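A Python sketch that enumerates the four events directly from the sample space; each event's probability is its size divided by 36. The modular test for an odd difference and the prime set {2, 3, 5} are the only assumptions encoded:

```python
# Enumerate events A-D over the 36 outcomes for two dice.
from itertools import product

space = list(product(range(1, 7), repeat=2))
primes = {2, 3, 5}

A = [(a, b) for a, b in space if a != b]                    # numbers differ
B = [(a, b) for a, b in space if a * b > 36]                # impossible: max product is 36
C = [(a, b) for a, b in space
     if (a in primes and b in primes) or (a - b) % 2 == 1]  # both prime OR odd difference
D = [(a, b) for a, b in space
     if (a + b) % 2 == 0 and a in primes and b in primes]   # even sum AND both prime

for name, event in [("A", A), ("B", B), ("C", C), ("D", D)]:
    print(name, len(event), f"P = {len(event)}/36")
```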


    PART 4

Situation: Two dice are tossed simultaneously 50 times. The sum of the dots on both turned-up faces is observed and recorded in the table shown below.

    a)

Sum of the two numbers, (x)   Frequency, (f)   x²
2                             5                4
3                             3                9
4                             11               16
5                             7                25
6                             1                36
7                             2                49
8                             3                64
9                             1                81
10                            10               100
11                            3                121
12                            4                144

( Table 2 )

i) Mean:

mean = Σfx / Σf
= [(2×5) + (3×3) + (4×11) + (5×7) + (6×1) + (7×2) + (8×3) + (9×1) + (10×10) + (11×3) + (12×4)] / 50
= 6.64

ii) Variance:

variance = Σfx² / Σf − (mean)²
= 2744/50 − (6.64)²
= 10.79

iii) Standard deviation:

standard deviation = √variance = √10.79 = 3.28
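The grouped-frequency formulas above are straightforward to check in Python; swapping in the frequencies from the 100-toss table in part (c) reproduces those results too:

```python
# Grouped-frequency mean, variance and standard deviation:
# mean = Σfx/Σf, variance = Σfx²/Σf − mean², std dev = √variance.
from math import sqrt

freq = {2: 5, 3: 3, 4: 11, 5: 7, 6: 1, 7: 2,
        8: 3, 9: 1, 10: 10, 11: 3, 12: 4}

n = sum(freq.values())                          # 50 tosses
mean = sum(x * f for x, f in freq.items()) / n  # 6.64
variance = sum(x * x * f for x, f in freq.items()) / n - mean ** 2

print(mean, round(variance, 2), round(sqrt(variance), 2))  # 6.64 10.79 3.28
```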

b) Predict the mean if the number of times the two dice are tossed simultaneously increases to 100.

mean ≈ 7 (by the law of large numbers, the empirical mean is expected to move closer to the theoretical mean calculated in Part 5 as the number of tosses increases).

    c)

Sum of the two numbers, (x)   Frequency, (f)   x²
2                             14               4
3                             7                9
4                             12               16
5                             9                25
6                             5                36
7                             10               49
8                             6                64
9                             4                81
10                            13               100
11                            14               121
12                            6                144

i) Mean:

mean = Σfx / Σf
= [(2×14) + (3×7) + (4×12) + (5×9) + (6×5) + (7×10) + (8×6) + (9×4) + (10×13) + (11×14) + (12×6)] / 100
= 6.82

ii) Variance:

variance = Σfx² / Σf − (mean)²
= 5772/100 − (6.82)²
= 11.21

iii) Standard deviation:

standard deviation = √11.21 = 3.348


    PART 5

Mean = Σ x·P(x)

Variance, σ² = Σ x²·P(x) − (mean)²

Instruction: Using the formulae given, calculate the actual mean, variance and standard deviation based on Table 1.

mean = Σ x·P(x)
= 2(1/36) + 3(1/18) + 4(1/12) + 5(1/9) + 6(5/36) + 7(1/6) + 8(5/36) + 9(1/9) + 10(1/12) + 11(1/18) + 12(1/36)
= 252/36
= 7

variance, σ² = Σ x²·P(x) − (mean)²
= 1974/36 − 7²
= 35/6
≈ 5.83

standard deviation, σ = √(35/6) ≈ 2.415
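A Python sketch of the same calculation with exact fractions, building P(x) from Table 1; the closed form 6 − |x − 7| counts the outcomes for each sum:

```python
# Theoretical mean and variance of the sum of two fair dice, using the
# Σx·P(x) formulas with exact fractions.
from fractions import Fraction
from math import sqrt

# P(x) for x = 2..12, as in Table 1: (6 - |x - 7|) outcomes out of 36.
p = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

mean = sum(x * px for x, px in p.items())                      # 7
variance = sum(x * x * px for x, px in p.items()) - mean ** 2  # 35/6

print(mean, variance, round(sqrt(variance), 3))  # 7 35/6 2.415
```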

b) The mean, variance and standard deviation obtained empirically in Part 4 are 6.64, 10.79 and 3.28 respectively, while the theoretical mean, variance and standard deviation from Part 5 are 7, 5.83 and 2.415 respectively. The empirical mean from the 50 tosses is reasonably close to the theoretical mean of 7.

The difference between empirical probability, which is based on observation or experience, and theoretical (speculative) probability is as clear as night and day. Empirical probability comes from data gathered through actual trials, such as the statistics on accidents that involve driving while under the influence, or the recorded data for smoking-related deaths.

Theoretical probability is more like reasoning out the chance in advance: much like playing a game of cards, you are taking the chance that you have the better hand.

Insurance policies are made possible by empirical probability. We know the number of accidents, and we know the number of times something happens without incident. Based on that, the chance (and thus the cost) of a certain event can be calculated.

(Professional) gambling is about theoretical probability. One can assume that all the chips, cards, tables or whatever are completely fair (or even calculate the unfairness, based on the method of shuffling), so one can calculate the odds of a certain set of cards coming up before they ever have.

Dangerous medical procedures can also have empirical probability playing a part. There is always a chance that someone dies under the knife, or that someone recovers on their own. Based on those odds, a doctor could advise for or against certain procedures. Those odds are based on other patients who have gone through the same thing.

    c)

    FURTHER EXPLORATION

    Law of large numbers


(Figure: an illustration of the law of large numbers using die rolls. As the number of die rolls increases, the average of the values of all the rolls approaches 3.5.)

In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

For example, a single roll of a six-sided die produces one of the numbers 1, 2, 3, 4, 5, 6, each with equal probability. Therefore, the expected value of a single die roll is (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5. According to the law of large numbers, if a large number of dice are rolled, the average of their values (sometimes called the sample mean) is likely to be close to 3.5, with the accuracy increasing as more dice are rolled.

Similarly, when a fair coin is flipped once, the expected value of the number of heads is equal to one half. Therefore, according to the law of large numbers, the proportion of heads in a large number of coin flips should be roughly one half. In particular, the proportion of heads after n flips will almost surely converge to one half as n approaches infinity.

Though the proportion of heads (and tails) approaches one half, almost surely the absolute (nominal) difference in the number of heads and tails will become large as the number of flips becomes large. That is, the probability that the absolute difference is a small number approaches zero as the number of flips becomes large. Also, almost surely the ratio of the absolute difference to the number of flips will approach zero. Intuitively, the expected absolute difference grows, but at a slower rate than the number of flips, as the number of flips grows.

The LLN is important because it "guarantees" stable long-term results for random events. For example, while a casino may lose money in a single spin of the roulette wheel, its earnings will tend towards a predictable percentage over a large number of spins. Any winning streak by a player will eventually be overcome by the parameters of the game. It is important to remember that the LLN only applies (as the name indicates) when a large number of observations are considered. There is no principle that a small number of observations will converge to the expected value or that a streak of one value will immediately be "balanced" by the others. See the gambler's fallacy.

    Central limit theorem


(Figure: histogram plot of the average proportion of heads in a fair coin toss, over a large number of sequences of coin tosses.)

In probability theory, the central limit theorem (CLT) states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed (Rice 1995). The central limit theorem also requires the random variables to be identically distributed, unless certain conditions are met. Since real-world quantities are often the balanced sum of many unobserved random events, this theorem provides a partial explanation for the prevalence of the normal probability distribution. The CLT also justifies the approximation of large-sample statistics to the normal distribution in controlled experiments.

For other generalizations for finite variance which do not require identical distribution, see Lindeberg's condition and Lyapunov's condition, and the results of Gnedenko and Kolmogorov.

In more general probability theory, a central limit theorem is any of a set of weak-convergence theorems. They all express the fact that a sum of many independent random variables will tend to be distributed according to one of a small set of "attractor" (i.e. stable) distributions. Specifically, the sum of a number of random variables with power-law tail distributions decreasing as 1/|x|^(α+1), where 0 < α < 2 (and therefore having infinite variance), will tend to a stable distribution as the number of variables grows.
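A rough Python sketch of the CLT in action (the batch size of 30 and the bucket width are arbitrary choices for illustration): the means of many independent batches of die rolls pile up around 3.5 in a bell-shaped text histogram.

```python
# Means of many independent batches of 30 die rolls: by the CLT the
# distribution of these means is approximately normal around 3.5.
import random
from collections import Counter

random.seed(0)
sample_means = [sum(random.randint(1, 6) for _ in range(30)) / 30
                for _ in range(10_000)]

# Crude text histogram, bucketing each mean to one decimal place.
buckets = Counter(round(m, 1) for m in sample_means)
for value in sorted(buckets):
    print(f"{value:4.1f} {'#' * (buckets[value] // 50)}")
```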