STAT 110 - Section 5 Lecture 7 Professor Hao Wang University of South Carolina Spring 2012

Dec 18, 2015


Charlene Little

STAT 110 - Section 5 Lecture 7

Professor Hao Wang

University of South Carolina

Spring 2012


Last time: Picturing Bias and Variability


Last time: Margin of Error

The CNN Poll interviewed 1000 people. The approval rating was 57%. What is the margin of error for 95% confidence (using the quick formula)?

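Answer: Recall that for 95% confidence the quick formula is margin of error = 1/square root of n. With n = 1000, that is 1/square root of 1000 = 0.0316, or about 3%.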


Margin of Error (continued)


Confidence Interval

Use MOE to calculate an interval that we think includes the parameter.

Form for most confidence intervals: estimate ± margin of error

Approximate (because we’re using the quick MOE) 95% confidence interval for p: p-hat ± 1/square root of n


Confidence Statements

A confidence statement interprets a confidence interval and has two parts: a margin of error and a level of confidence.

Margin of error says how close the statistic lies to the parameter.

Level of confidence says what percentage of all possible samples result in a confidence interval which contains the true parameter.
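
The meaning of "level of confidence" can be checked with a small simulation. The sketch below is illustrative only: the true proportion 0.57, the sample size 1000, the number of repeated samples, and the function names are all assumptions chosen for the demonstration, not values from the lecture. It draws many samples, builds the quick interval p-hat ± 1/square root of n for each, and counts how often the interval contains the true p.

```python
import math
import random


def quick_moe(n):
    """Quick 95% margin of error: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)


def coverage(true_p, n, num_samples, seed=0):
    """Fraction of simulated samples whose quick interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    moe = quick_moe(n)
    hits = 0
    for _ in range(num_samples):
        # One simulated sample of n independent yes/no answers.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        if p_hat - moe <= true_p <= p_hat + moe:
            hits += 1
    return hits / num_samples


# Hypothetical population value and sample size, chosen for illustration.
rate = coverage(true_p=0.57, n=1000, num_samples=2000)
print(f"{rate:.1%} of the simulated intervals contain the true p")
```

The printed share should land near 95%. It tends to sit slightly above 95% because the quick formula is a little conservative.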


Example: President Bush

Pre 9/11: 57% with MOE 3%

Post 9/11: 90% with MOE 3%

Interpretations

– We are 95% confident that the percent of all Americans who approved of the job President Bush was doing was between 54% and 60% before 9/11.

– We are 95% confident that the percent of all Americans who approved of the job President Bush was doing was between 87% and 93% after 9/11.


Example: College Education

This May 2011 survey finds that 57% of the 2142 adult Americans polled think that “the higher education system in the United States fails to provide students good value for the money they and their families spend”. Using the quick formula for MOE, compute a 95% confidence interval for p.
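
One way to work this example is sketched below, assuming the quick formula MOE = 1/square root of n and the form estimate ± margin of error. The variable names are illustrative.

```python
import math

n = 2142       # adults polled
p_hat = 0.57   # sample proportion who agree

moe = 1 / math.sqrt(n)  # quick 95% margin of error
lower, upper = p_hat - moe, p_hat + moe

print(f"margin of error: {moe:.3f}")
print(f"approximate 95% CI for p: {lower:.3f} to {upper:.3f}")
```

In percent terms the margin of error is about 2.2%, so the interval runs from roughly 54.8% to 59.2%.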


Example: Coke or Pepsi

Suppose you take a sample of 1231 people and ask them if they prefer Coke over Pepsi. You find that 696 say they do.

What is p-hat, the observed percent from the sample?

A .725 = 72.5%

B .565 = 56.5%

C .029 = 2.9%

D .038 = 3.8%


Example: Coke or Pepsi (continued)

Suppose you take a sample of 1231 people and ask them if they prefer Coke over Pepsi. You find that 696 say they do.

What is the margin of error for 95% confidence?

A square root of 1231 = 35.09 = 35.09%

B square root of 696 = 26.38 = 26.38%

C 1/square root of 1231 = 0.0285 = 2.85%

D 1/square root of 696 = 0.0379 = 3.79%
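
The two Coke or Pepsi questions can be checked with a few lines of Python. This is a sketch that assumes the quick formula for the margin of error; the variable names are illustrative. The quick formula uses the sample size n, not the number of people who said yes.

```python
import math

sample_size = 1231   # people asked
prefer_coke = 696    # people who said they prefer Coke

p_hat = prefer_coke / sample_size   # observed sample proportion
moe = 1 / math.sqrt(sample_size)    # quick 95% margin of error, based on n

print(f"p-hat = {p_hat:.3f}")
print(f"margin of error = {moe:.4f}")
```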


Hints for Interpretation

The conclusion of a confidence statement always applies to the population, not to the sample.

Our conclusion about the population is never completely certain.

If you want a smaller margin of error with the same confidence, take a larger sample.


Hints for Interpretation

It is very common to report the margin of error for 95% confidence.

– If the level of confidence is not mentioned, assume 95% confidence.

Can choose to use a confidence level other than 95%.

– Other popular levels: 80%, 90%, 99%

– For a fixed sample size, if you increase the level of confidence, your interval will become wider.

– For a fixed confidence level, if you increase sample size, your interval will become narrower.
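
The quick formula only covers 95% confidence, so the sketch below uses a different tool to illustrate these hints: the standard large-sample margin of error z × square root of (p-hat(1 − p-hat)/n). That formula, the function name, and the example values (p-hat = 0.5, the sample sizes) are assumptions added for illustration and are not from the lecture.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence):
    """Large-sample margin of error for a proportion at a given confidence level."""
    # Critical value z such that the middle `confidence` of a normal curve
    # lies between -z and +z.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Fixed sample size, increasing confidence: the interval gets wider.
for level in (0.80, 0.90, 0.95, 0.99):
    print(f"n=1000, {level:.0%} confidence: +/- {margin_of_error(0.5, 1000, level):.3f}")

# Fixed confidence, increasing sample size: the interval gets narrower.
for n in (250, 1000, 4000):
    print(f"n={n}, 95% confidence: +/- {margin_of_error(0.5, n, 0.95):.3f}")
```

With p-hat = 0.5 and 95% confidence this formula gives about 0.98/square root of n, which is why 1/square root of n works as a quick, slightly conservative approximation.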


Population Size Doesn’t Matter

The variability of a statistic from an SRS does not depend on the size of the population, as long as the population is at least 100 times larger than the sample.


Example: Population Size Doesn’t Matter

Suppose we take a sample of size 1000 from a population of 4,000,000 (e.g., South Carolina). Then we take a sample of 1000 from a population of 300,000,000 (e.g., the whole US). Which sample statistic would have more variability (i.e., a larger MOE)?

A. The one from 4,000,000

B. The one from 300,000,000

C. They are the same.
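
To see why population size barely matters here, the sketch below compares the standard deviation of a sample proportion for the two populations, using the finite population correction factor square root of ((N − n)/(N − 1)). The correction factor is a standard result but is not part of the lecture, and the value p = 0.5 is a hypothetical choice for illustration.

```python
import math


def sd_of_sample_proportion(p, n, population_size):
    """Standard deviation of p-hat for an SRS of size n from a finite population."""
    # Finite population correction: close to 1 when the population is
    # much larger than the sample.
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return math.sqrt(p * (1 - p) / n) * fpc


p = 0.5  # hypothetical true proportion
sd_sc = sd_of_sample_proportion(p, 1000, 4_000_000)
sd_us = sd_of_sample_proportion(p, 1000, 300_000_000)

print(f"population 4,000,000:   sd = {sd_sc:.6f}")
print(f"population 300,000,000: sd = {sd_us:.6f}")
print(f"relative difference:    {abs(sd_us - sd_sc) / sd_us:.5%}")
```

The two values differ by roughly one hundredth of one percent, so for practical purposes the variability is the same.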


Chapter 4 – Sample Surveys in the Real World


Types of errors:

1. Sampling Errors

a. Random Sampling Error

b. Bad Sampling Methods

2. Non-sampling Errors

a. Processing errors

b. Poorly worded questions

c. Response error

d. Non-Response


Chapter 4 – Sample Surveys in the Real World

sampling errors – errors caused by the act of taking a sample

They cause sample results to be different from the results of a census.

sampling frame – a list of individuals from which we will draw our sample

It should list every individual in the population.


Errors in Sampling

random sampling error – results from chance selection in the simple random sample

• MOE lets us calculate how serious the error is.

• The error is due to chance – always present. A large sample helps control this.

• MOE includes only random sampling error.

• Most sample surveys are afflicted with errors other than random sampling errors.


Errors in Sampling

Bad sampling methods – using a convenience sample or a voluntary response sample is also a form of sampling error.

Voluntary sample

Convenience sample

undercoverage – occurs when some groups in the population are left out of the process of choosing the sample


nonsampling errors – errors not related to the act of selecting a sample from the population

can even be present in a census

• nonresponse (missing data)

• response errors

• processing errors

• effects of data collection procedure


Example

The subject lies about past drug use.

A. Sampling Error: Bad Sampling Method

B. Non Sampling Error: Response Error

C. Non Sampling Error: Non Response Error

D. Non Sampling Error: Processing Error


Example

The subject cannot be contacted after five calls.

A. Sampling Error: Bad Sampling Method

B. Non Sampling Error: Response Error

C. Non Sampling Error: Non Response Error

D. Non Sampling Error: Processing Error


Example

Interviewers choose people on the street to interview.

A. Sampling Error: Bad Sampling Method

B. Non Sampling Error: Response Error

C. Non Sampling Error: Non Response Error

D. Non Sampling Error: Processing Error


Consider Wording

Be aware that the wording of a question influences the answers.

Examples:

Is our government providing too much money for welfare programs?

– 44% said “yes”

Is our government providing too much money for assistance to the poor?

– 13% said “yes”


More Complex Sample Designs

• Sometimes a strict simple random sample is difficult to obtain.

- Multistage Sampling Design

- Cluster Sampling

- Systematic Sampling

- Stratified Random Sampling


Stratified Random Sample

• Step 1: Divide the sampling frame into distinct groups of individuals, called strata.

– Choose strata because you have an interest in the groups or because the individuals within each group are similar.

– Example: graduate/undergraduate students

• Step 2: Take a separate SRS in each stratum and combine these to make up the complete sample.


Stratified Random Sample

A club has 25 student members and 10 faculty members. The club can send 4 students and 2 faculty members to a convention.

Students

01 Barrett    06 Frazier      11 Hu             16 Liu         21 Ren
02 Brady      07 Gibellato    12 Jimenez        17 Marin       22 Santos
03 Chen       08 Gulati       13 Katsaounis     18 Nemeth      23 Sroka
04 Draper     09 Han          14 Kim            19 O’Rourke    24 Tordoff
05 Duncan     10 Hostetler    15 Kohlschmidt    20 Paul        25 Wang

Faculty

0 Berliner     2 Dean       4 Goel    6 Moore    8 Stasney
1 Craigmile    3 Fligner    5 Lee     7 Pearl    9 Wolfe

Line 116: 14459 26056 31424 80371 65103 62253 50490 61181

Choose a stratified random sample of 4 students, then of 2 faculty.
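
The selection can be traced with a short Python sketch. It reads line 116 in two-digit groups for the students (labels 01 to 25), skipping groups that are out of range or repeated, then reads single digits for the faculty (labels 0 to 9). One assumption is made here that the slide does not state: the faculty digits are read starting where the student selection stopped, rather than restarting the line. The helper name `draw` is hypothetical.

```python
STUDENTS = [
    "Barrett", "Brady", "Chen", "Draper", "Duncan",
    "Frazier", "Gibellato", "Gulati", "Han", "Hostetler",
    "Hu", "Jimenez", "Katsaounis", "Kim", "Kohlschmidt",
    "Liu", "Marin", "Nemeth", "O'Rourke", "Paul",
    "Ren", "Santos", "Sroka", "Tordoff", "Wang",
]  # labels 01-25, in order
FACULTY = [
    "Berliner", "Craigmile", "Dean", "Fligner", "Goel",
    "Lee", "Moore", "Pearl", "Stasney", "Wolfe",
]  # labels 0-9, in order
LINE_116 = "14459 26056 31424 80371 65103 62253 50490 61181"


def draw(digits, start, width, labels, how_many):
    """Read `width`-digit groups from `digits` beginning at `start`.

    Keeps the first `how_many` distinct groups that are valid labels.
    Returns the chosen names and the position of the next unread digit.
    """
    chosen = []
    pos = start
    while len(chosen) < how_many:
        group = digits[pos:pos + width]
        if len(group) < width:
            raise ValueError("ran out of random digits")
        pos += width
        name = labels.get(group)
        if name is not None and name not in chosen:
            chosen.append(name)
    return chosen, pos


digits = LINE_116.replace(" ", "")
student_labels = {f"{i:02d}": name for i, name in enumerate(STUDENTS, start=1)}
faculty_labels = {str(i): name for i, name in enumerate(FACULTY)}

# Separate SRS within each stratum, then combine.
students, next_pos = draw(digits, 0, 2, student_labels, 4)
faculty, _ = draw(digits, next_pos, 1, faculty_labels, 2)

print("Students:", students)
print("Faculty:", faculty)
```

Under that assumption the student labels picked are 14, 03, 10 and 22, and the faculty labels are 5 and 3.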


Cluster Sampling

• In order to reduce costs in sampling, researchers focus on efficiency by sampling from clusters

• Clusters are often formed by geographic location, resulting in decreased travel costs for the research company.

• Randomly sample clusters, then survey everyone in each cluster.


Cluster Sample - Divide population into clusters.

Select one or more clusters and include everyone in those clusters in the sample.

• Example: SC has 46 counties. Select 5 counties at random, use all households in each selected county as the sample.

• Example: USC has 5000 dorms. Select 100 dorms at random, use all students in each selected dorm as the sample.
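
The county example can be sketched in Python as follows. The county names and household lists are made-up placeholders, and the function name `cluster_sample` is hypothetical. The point is only the two steps: pick whole clusters at random, then keep everyone inside the chosen clusters.

```python
import random


def cluster_sample(clusters, num_clusters, seed=None):
    """Pick `num_clusters` clusters at random and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # SRS of cluster names
    return {name: clusters[name] for name in chosen}


# Placeholder data: 46 counties with 3 made-up households each.
counties = {
    f"County {i:02d}": [f"household {i:02d}-{j}" for j in range(1, 4)]
    for i in range(1, 47)
}

sample = cluster_sample(counties, 5, seed=1)
households = [h for members in sample.values() for h in members]

print("Selected counties:", sorted(sample))
print("Households in sample:", len(households))
```

Note the contrast with a stratified sample: here only some groups are chosen and everyone in them is surveyed, while a stratified sample takes an SRS from every group.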


Want to find the opinions of US adults, but want to save on time and money by randomly selecting residences. All adults residing in a sampled residence will be interviewed.

A. Stratified B. Cluster C. Both


• Want to find the opinions of US adults and need to make sure that 3 specific religious groups are represented. You sample 100 Christians, 100 Jews, and 100 Muslims.

A. Stratified B. Cluster C. Both


• Want to find the opinions of city-dwelling US adults and need to make sure that the east and west coasts are represented. You send 5 interviewers to the east coast and 5 to the west coast. On the west coast, 5 city blocks are chosen at random, and everyone living in a chosen city block is interviewed (similarly for the east coast).

A. Stratified B. Cluster C. Both


Questions to Ask Before You Believe a Poll

• Who carried out the survey?

• What was the population?

• How was the sample selected?

• How large was the sample?

• What was the margin of error?

• What was the response rate?

• How were the subjects contacted?

• When was the survey conducted?

• What questions were asked?


USC has 20,065 undergraduates and 7,423 graduate students. In an effort to gauge the opinions of all students on campus parking issues, a simple random sample consisting of 201 undergraduates and a simple random sample of 74 graduate students are taken. This is an example of:

A – a cluster sample

B – a systematic sample

C – a stratified random sample

D – undercoverage
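
A side observation on the numbers in this question: 201 and 74 are each about 1% of their group. The slide does not say how the sample sizes were chosen, so treat the sketch below as one possible reading, in which a total sample of 275 is split across the two groups in proportion to their sizes. The variable names are illustrative.

```python
group_sizes = {"undergraduate": 20065, "graduate": 7423}
total_sample = 201 + 74  # 275 students in all

population = sum(group_sizes.values())

# Proportional allocation: each group's share of the sample matches
# its share of the population (rounded to whole students).
allocation = {
    name: round(total_sample * size / population)
    for name, size in group_sizes.items()
}

print(allocation)
```

Under this reading the allocation reproduces the 201 and 74 given in the question.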


USC has 20,065 undergraduates and 7,423 graduate students. In an effort to gauge the opinions of all students on campus parking issues, a simple random sample consisting of 201 undergraduates is taken. This is an example of:

A – a cluster sample

B – a systematic sample

C – a stratified random sample

D – undercoverage