STAT 105 Real-Life Statistics: Your Chance for Happiness (or Misery)?
Dec 30, 2015
© 2008 Department of Statistics, Harvard University
History of Statistics 105
Wee Lee Loh
Linjuan Qian Reetu Kumra
Yves Chretien
Pedagogical Motivation
• To fill in the gap between intro-level courses and higher-level courses
  • Intro “service” courses jam-packed with tools
  • Higher-level courses require advanced maths
• To provide more depth and intuition
  • Useful for Masters and PhD students as well
• Gen-Ed introduction to statistics
• Unforeseen side benefit: The Happy Team
Outcomes (so far)
• Positive mid-term feedback
  • Every student would recommend it to future students
• The process of developing the course
  • Graduate School Dean is recommending institutionalized graduate seminars on designing new courses based on our model
• Attention to the subject and department
  • Media
    • Gazette
    • Crimson
  • Students
  • Administration
FINANCE
• What do you want to learn from this data?
• How do you summarize the data?
• How do you visualize the signal behind the noise?
FINANCE
• Would the “twistogram” idea work for the S&P 500 index over this extended time period?
• The dating world is full of questions we would all love answers to:
• When you meet someone, should you play hard-to-get or make your attraction obvious?
• Where should you go on a first date?
• What is the best thing to do on the first date to impress your date?
• What are the important factors that make two people “click” …
ROMANCE
• Suppose you have been hired by a U.S. online dating company, and they want you to find out people’s opinions here in the US about these questions.
• How would you go about collecting the information?
ROMANCE
ROMANCE: Survey
Q: You just met someone, and are initially interested. Are you more likely to maintain/increase interest in the person if he/she plays hard-to-get, or if he/she is obvious about being into you?
(a) HARD TO GET (I prefer a person who initially plays hard-to-get)
(b) CLEARLY INTO ME (I prefer someone who makes it clear he/she is very into me)
[Clicker results chart: 79%, 21%]
• Suppose during your survey you fell in love with a Chinese person, and subsequently moved to China and now work for a Chinese online dating company.
• You want to impress your new boss (and your new love), so you decide to repeat your U.S. survey, which had 1000 subjects, in China.
ROMANCE
ROMANCE
America has a population of about 304 million but China has a population of about 1.3 billion. How many people would you need to survey in China to get just as reliable results as in the U.S.?
1. 1000
2. 2000
3. 3000
4. 4000
5. > 4000
[Clicker results chart: 28%, 9%, 43%, 11%, 8%]
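One way to explore this question is to compute the margin of error of a sample proportion for both countries. The sketch below is only an illustration: it assumes a simple random sample, a worst-case proportion of 0.5, and a 95% normal approximation, none of which is stated on the slide. The function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(n, population, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample of size n drawn without replacement
    from a finite population; p=0.5 is the worst case and z=1.96
    corresponds to roughly 95% confidence (both are assumptions).
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    # Finite population correction: very close to 1 when n << population.
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc

us_moe = margin_of_error(1000, 304_000_000)
china_moe = margin_of_error(1000, 1_300_000_000)

print(f"U.S.  (n=1000): +/- {us_moe:.4f}")
print(f"China (n=1000): +/- {china_moe:.4f}")
```

Under these assumptions the two margins of error are practically identical, which suggests that reliability is driven by the sample size rather than by the size of the population.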
• How do you test whether a new drug is effective?
• Ideally, we perform a controlled clinical trial, by randomly assigning one group of people to take the drug, and another group to take a placebo.
• It needs to be double-blinded.
• When such an experiment is not possible due to practical or ethical issues, what can go wrong?
MEDICAL
MEDICAL
Kidney stone treatment
C. R. Charig, D. R. Webb, S. R. Payne, O. E. Wickham (March 1986). Br Med J (Clin Res Ed) 292 (6524): 879–882.

              Treatment A      Treatment B
Overall       78% (273/350)    83% (289/350)

              Treatment A      Treatment B
Small Stone   93% (81/87)      87% (234/270)
Large Stone   73% (192/263)    69% (55/80)

Treatment B is better, right?
WRONG!
Simpson’s Paradox
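The reversal can be checked directly from the counts on the slide. The sketch below is a minimal illustration (the dictionary layout and variable names are hypothetical): it recomputes the success rate within each stone size and then after pooling the two sizes.

```python
# (successes, patients) per treatment, taken from the kidney stone tables.
data = {
    "Small Stone": {"A": (81, 87), "B": (234, 270)},
    "Large Stone": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, patients):
    """Success rate as a fraction between 0 and 1."""
    return successes / patients

# Within each stone size, compare the two treatments.
for size, arms in data.items():
    print(f"{size}: A = {rate(*arms['A']):.0%}, B = {rate(*arms['B']):.0%}")

# Pool the stone sizes and compare again.
overall = {}
for treatment in ("A", "B"):
    successes = sum(data[size][treatment][0] for size in data)
    patients = sum(data[size][treatment][1] for size in data)
    overall[treatment] = (successes, patients)
    print(f"Overall {treatment}: {rate(successes, patients):.0%} "
          f"({successes}/{patients})")
```

Treatment A has the higher rate within each stone size, yet Treatment B has the higher pooled rate. The counts also show why: 263 of the 350 Treatment A patients had large stones, while 270 of the 350 Treatment B patients had small ones.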
Small Stones
               Treatment A    Treatment B
Successful     81 (93%)       234 (87%)
Unsuccessful   6              36
Slope = # successful / # unsuccessful = odds
Large Stones
               Treatment A    Treatment B
Successful     192 (73%)      55 (69%)
Unsuccessful   71             25
Slope = # successful / # unsuccessful = odds
Combined
               Treatment A       Treatment B
Successful     81 + 192 = 273    289
Unsuccessful   6 + 71 = 77       61
Combined
               Treatment A    Treatment B
Successful     273 (78%)      289 (83%)
Unsuccessful   77             61
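The slope (odds) defined on the stone-size slides can be computed for all three tables. The sketch below is a minimal illustration using the counts shown above; the odds ratio of A to B is an added summary that does not appear on the slides, and the names are hypothetical.

```python
# (successful, unsuccessful) counts per treatment, from the three tables.
counts = {
    "Small Stones": {"A": (81, 6), "B": (234, 36)},
    "Large Stones": {"A": (192, 71), "B": (55, 25)},
    "Combined": {"A": (273, 77), "B": (289, 61)},
}

def odds(successful, unsuccessful):
    """Odds of success: # successful / # unsuccessful."""
    return successful / unsuccessful

odds_ratios = {}
for table, arms in counts.items():
    odds_a = odds(*arms["A"])
    odds_b = odds(*arms["B"])
    # An odds ratio above 1 favors Treatment A, below 1 favors Treatment B.
    odds_ratios[table] = odds_a / odds_b
    print(f"{table}: odds A = {odds_a:.2f}, odds B = {odds_b:.2f}, "
          f"odds ratio A/B = {odds_ratios[table]:.2f}")
```

The odds ratio is above 1 in each stone-size table and below 1 in the combined table, which is the same reversal seen in the percentages.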
• When and why does Simpson’s paradox occur?
• How do we deal with it?
MEDICAL
• How is statistics an important part of our legal system?
• How might we use a statistic or probability as evidence in a trial?
• How are statistics often misinterpreted by lawyers and juries?
LEGAL
LEGAL
You have just been selected for jury duty. In 1996 in England, Denis Adams was a suspect in a rape trial. Listen closely to the details of the case and the arguments presented before deciding your verdict.
(We have simplified the actual case/arguments for the purpose of this illustration.)
LEGAL: Prosecution Argument
• Adams’ DNA profile matches that of evidence found at the scene of the crime
• If Adams is innocent, there is only a 1 in 20 million chance that his DNA would match that found at the crime
• Therefore, the probability Adams is innocent is only .00000005, hence the probability he is guilty is 1 minus that, .99999995. Thus Adams is guilty beyond the shadow of a doubt.
LEGAL: Defense Argument
• If the chance of a DNA match for any person is 1/20,000,000, then since there are 60 million people in England, there are on average 3 other people with this DNA type (in 1996).
• Since it is equally likely to be any of these others, the probability of Adams’ guilt is 1/3 = .33, which is not enough certainty to convict.
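The two arguments can be compared with a small Bayes' rule calculation. The sketch below is only an illustration under strong assumptions that are not on the slides: before the DNA evidence, every one of the 60 million people is treated as equally likely to be the attacker, and the true attacker always matches. The variable names are hypothetical.

```python
population = 60_000_000            # people in England, as on the slide
match_probability = 1 / 20_000_000  # chance an innocent person matches

# Expected number of people other than Adams who would also match.
expected_other_matches = (population - 1) * match_probability

# Bayes' rule with a uniform prior of 1/population on each person:
# P(guilty | match) = 1 / (1 + expected number of other matching people).
p_guilty_given_match = 1 / (1 + expected_other_matches)

print(f"Expected other matches: {expected_other_matches:.2f}")
print(f"P(guilty | match):      {p_guilty_given_match:.2f}")
```

Counting Adams himself alongside the roughly 3 expected other matches gives about 1/4 under these assumptions, slightly below the 1/3 quoted in the defense argument. Either way, P(guilty | match) is very different from 1 minus P(match | innocent), which is what the prosecution argument computes.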
LEGAL: Defense Argument
• In an identity line up, victim failed to pick out Adams
• Victim describes an attacker in his 20s
• Adams is 37
• Victim guessed Adams to be about 40
• Adams had an alibi for the night of the crime (he spent the night with his girlfriend)
LEGAL
Would you convict Adams?
1. Yes
2. No
[Clicker results chart: 47%, 53%]
1) What is the probability that you drive into a tree given that you are drunk?
2) What is the probability that you are drunk given that you drive into a tree?
Why is it important to distinguish them?
LEGAL
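The two questions above can have very different answers, and Bayes' rule connects them. The sketch below uses three made-up input probabilities purely for illustration; none of them comes from the slides or from real data.

```python
# All three inputs are hypothetical, chosen only for illustration.
p_drunk = 0.01              # P(driver is drunk)
p_tree_given_drunk = 0.05   # P(drive into a tree | drunk)
p_tree_given_sober = 0.0005  # P(drive into a tree | sober)

# Law of total probability: overall chance of driving into a tree.
p_tree = (p_tree_given_drunk * p_drunk
          + p_tree_given_sober * (1 - p_drunk))

# Bayes' rule: reverse the direction of the conditioning.
p_drunk_given_tree = p_tree_given_drunk * p_drunk / p_tree

print(f"P(tree | drunk) = {p_tree_given_drunk:.3f}")
print(f"P(drunk | tree) = {p_drunk_given_tree:.3f}")
```

With these invented numbers, a drunk driver rarely hits a tree, yet about half of the drivers who hit a tree are drunk. Confusing the two conditional probabilities is the same mistake as in the prosecution argument.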
WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what do you think is the probability there is champagne inside?
(a) 0 - .2
(b) .21 - .4
(c) .41 - .6
(d) .61 - .8
(e) .81 - 1
[Clicker results chart: 53%, 30%, 0%, 2%, 14%]
WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what do you think is the probability there is champagne inside?
(a) 0 - .2
(b) .21 - .4
(c) .41 - .6
(d) .61 - .8
(e) .81 - 1
[Clicker results chart: 29%, 14%, 21%, 0%, 36%]
WINE AND CHOCOLATE
How certain are you about your estimate? If you were to give an interval that you are fairly confident contains the truth, how wide would this interval be?
1. .05
2. .1
3. .35
4. .6
5. .75
6. 1
[Clicker results chart: 9%, 12%, 12%, 9%, 24%, 35%]
WINE AND CHOCOLATE
Let’s collect some data!
WINE AND CHOCOLATE
Did your chocolate have champagne in it?
(a) Yes
(b) No
[Clicker results chart: 100%, 0%]
WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what is your best guess for the probability of champagne inside?
(a) 0
(b) .1
(c) .2
(d) .3
(e) .4
(f) .5
(g) .6
(h) .7
(i) .8
(j) .9
[Clicker results chart: 55%, 39%, 6%, and 0% for the remaining seven options]
WINE AND CHOCOLATE
How certain are you about your estimate? If you were to give an interval that you are fairly confident contains the truth, how wide would this interval be?
1. .05
2. .1
3. .35
4. .6
5. .75
6. 1
[Clicker results chart: 14%, 24%, 10%, 21%, 10%, 21%]
Let’s collect more data!
WINE AND CHOCOLATE
Did your chocolate have champagne in it?
(a) Yes
(b) No
[Clicker results chart: 89%, 11%]
WINE AND CHOCOLATE
If I randomly pick up one of these chocolates, what is your best guess for the probability of champagne inside?
(a) 0
(b) .1
(c) .2
(d) .3
(e) .4
(f) .5
(g) .6
(h) .7
(i) .8
(j) .9
[Clicker results chart: 17%, 75%, 8%, and 0% for the remaining seven options]
WINE AND CHOCOLATE
How certain are you about your estimate? If you were to give an interval that you are fairly confident contains the truth, how wide would this interval be?
1. .05
2. .1
3. .35
4. .6
5. .75
6. 1
[Clicker results chart: 22%, 17%, 4%, 26%, 9%, 22%]
And even more data…
WINE AND CHOCOLATE
Did your chocolate have champagne in it?
(a) Yes
(b) No
[Clicker results chart: 83%, 17%]
WINE AND CHOCOLATE
What happens as you accumulate more data?
1) Your estimates become more accurate
2) You can narrow in on your interval prediction (your uncertainty decreases)
3) In this case, you get to enjoy chocolate!
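Points 1 and 2 can be illustrated with a rough interval calculation. The sketch below assumes the simple normal-approximation (Wald) interval for a proportion, a 95% level, and a made-up observed share of 0.85; these choices are illustrative assumptions, not the method used in class, and the approximation is crude for very small samples.

```python
import math

def interval_width(p_hat, n, z=1.96):
    """Width of the normal-approximation (Wald) interval for a proportion."""
    return 2 * z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.85  # hypothetical observed share of champagne chocolates
sample_sizes = [10, 40, 160, 640]
widths = [interval_width(p_hat, n) for n in sample_sizes]

for n, width in zip(sample_sizes, widths):
    print(f"n = {n:4d}: interval width ~ {width:.3f}")
```

Under this approximation, each fourfold increase in the number of chocolates sampled cuts the interval width in half.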
http://movies.aol.com//movie/forrest-gump/1036/video/tom-hanks-greatest-moments/1138699
“Life is like a box of chocolates… you never know what you’re going to get.”
BUT YOU CAN ESTIMATE IT!
(especially after you take STAT 105!)
Things We Do Differently …
• Student/Faculty course design collaboration
• Modules, allowing “out of sequence” teaching in terms of technical material
• The use of “Clickers” (Personal Response Devices)
• Module-based team projects and project presentations
• Module-based guest lecturers
• Assessment
  • Peer evaluation
  • Assignments, projects, no traditional exams
Module-Based Approach (MBA)
Modules (columns): Finance, Romance, Medical, Legal, Wine/Chocolate
Statistical topics (rows):
• Probability: Random variables/Probability Distributions; Rules of probability; Bayes's Rule
• Descriptive Statistics
• Statistical Inference: Hypothesis Testing; Posterior probabilities and p-values; Decision Theory
• Advanced Statistics: Linear Regression; ANOVA; Time Series Models; Logistic Regression; Statistical Interaction; Model/Variable Selection; Simpson's Paradox; Multiple Hypothesis Testing
• Study Design: Survey Methods; Experimental Design; Observational Studies; Selection Bias; Response Bias; Publication Bias
Challenges
• Time management
  • Structured material vs “improvised” discussions
  • So much material, so little time
• Student team dynamics
• Prerequisites
  • Can we offer STAT 105 without prerequisites?
• Funding for course material
  • e.g. wine and chocolate
  • Outside speaker expenses
• Scaling to a (much) larger class size in the future
Future Happiness …
• Developing more modules
  • Sports
  • Nutrition
  • …
• Prepare a multimedia-based teaching package
  • Text book
  • Website
• Similar courses aimed at different levels
  • More advanced
  • Less advanced
• Build more Happy Teams!
Thanks much!
And we welcome your feedback!