
Jan 02, 2016


Warren Barnett
Transcript
Page 1: What is Statistics? Statistics The science of collecting, organizing, analyzing, and interpreting data in order to make decisions. 1 Data Information coming.

What is Statistics?

Statistics
The science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

Data
Information coming from observations, counts, measurements, or responses.

Why should you care about statistics?
• Statistics helps you make informed decisions that affect your life.
• Statistics helps the government make decisions that affect many people.

Medical & Lifestyle Decisions
• Vaccines: Polio, Measles, Flu, HPV
• Meds: Blood Pressure, Cholesterol
• Hormone Replacement, Chemo
• Smoking
• Home in City/Country/Suburb
• College/Major
• Invest in Stock Market
• Marriage/Divorce/Children/Adopt

Government Decisions
• Raise Retirement Age (Soc. Sec.)
• Drinking/Driving/Seatbelt Laws
• Mandatory School for children

Common Statistical Data
• Census
• Health/Medical
• Crime
• Scientific
• Education
• Economic

Page 2

Statements Based on Data Collection

• “People who eat three daily servings of whole grains have been shown to reduce their risk of…stroke by 37%.” (Source: Whole Grains Council)

•“Seventy percent of the 1500 U.S. spinal cord injuries to minors result from vehicle accidents, and 68 percent were not wearing a seatbelt.” (Source: UPI)

• In 2007, Florida’s high school graduation rate was 65%, compared with 74% for the U.S. as a whole. In 2006, the Florida graduation rate was 73%. (Data Source: U.S. Department of Education Website)

What makes a statement based on data correct or incorrect?
• Statements may be correct based on the data.
• Statements may be incorrect for the data due to a calculation/analysis error.
• Statements may be correct for the data, but may be incorrect because the data collection was not done properly. (Difficult for the average person to know.)

Controversy – Interpretation – Biased Data
• Can you think of some controversial statistical statements?
• Do TV advertisements use statistical statements?
• Do politicians use statistical statements?

Page 3

Data Sets

Population
The collection of all outcomes, responses, measurements, or counts that are of interest.

Sample
A subset of the population.

Larson/Farber 4th ed.

Populations (“ALL”)

1. Collection of all black men over 40 in the U.S.

2. Collection of all computer users

3. Major of each college student at VCC

Samples (“SOME”)

1. Collection of 10,000 black men over 40 who participated in a study on blood pressure.

2. Collection of 567 computer users surveyed.

3. Major of college students at VCC who take statistics.

Some Statistical Data WebSites:

•http://www.usa.gov/Topics/Reference_Shelf/Data.shtml

•http://www.cdc.gov/DataStatistics/

Note that #3 is a sample, but it is NOT representative of the population given. Why? How can this be fixed?

Page 4

Example: Identifying Data Sets

In a recent survey, 1708 adults in the United States were asked if they think global warming is a problem that requires immediate government action. Nine hundred thirty-nine of the adults said yes. Identify the population and the sample. Describe the data set. (Adapted from: Pew Research Center)

• The population consists of the responses of all adults in the U.S.

• The sample consists of the responses of the 1708 adults in the U.S. in the survey.

• The sample is a subset of the responses of all adults in the U.S.

• The data set consists of 939 “yes” responses and 769 “no” responses.
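
The counts in this example can be checked with a few lines of arithmetic. The sketch below uses only the two numbers given in the survey description; the variable names are illustrative and not part of the original slides.

```python
# Counts taken from the survey example above.
sample_size = 1708   # adults surveyed
yes_count = 939      # adults who said yes

# Every response that was not "yes" is counted as "no".
no_count = sample_size - yes_count

# Proportion of the sample that said yes (a sample statistic).
yes_proportion = yes_count / sample_size

print(f"no responses: {no_count}")
print(f"share answering yes: {yes_proportion:.1%}")
```

The proportion comes out to roughly 55%. It describes only the 1708 adults surveyed, not all adults in the U.S.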

Page 5

Parameter and Statistic

Parameter
A number that describes a population characteristic.
Example: Average age of all people in the United States

Statistic
A number that describes a sample characteristic.
Example: Average age of people from a sample of three states

Parameter or Statistic?

1. A recent survey of a sample of MBAs reported that the average salary for an MBA is more than $82,000. (Source: The Wall Street Journal)

Sample statistic (the average of $82,000 is based on a subset of the population)

2. Starting salaries for the 667 MBA graduates from the University of Chicago Graduate School of Business increased 8.5% from the previous year.

Population parameter (the percent increase of 8.5% is based on all 667 graduates’ starting salaries)
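
The distinction between a parameter and a statistic can be sketched in code. The ages below are made-up values generated only for illustration (they are not real census data), and the population size, sample size, and seed are arbitrary assumptions.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: made-up ages for 10,000 people.
population_ages = [random.randint(0, 90) for _ in range(10_000)]

# Parameter: a number that describes the whole population.
parameter = mean(population_ages)

# Statistic: a number that describes a sample from that population.
sample_ages = random.sample(population_ages, 200)
statistic = mean(sample_ages)

print(f"population mean (parameter): {parameter:.1f}")
print(f"sample mean (statistic):     {statistic:.1f}")
```

The two means are usually close but not equal, which is why a statistic is treated as an estimate of the parameter rather than the parameter itself.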

Page 6

Branches of Statistics

Descriptive Statistics
Involves organizing, summarizing, and displaying data.
e.g., tables, charts, averages

Inferential Statistics
Involves using sample data to draw conclusions about a population.

Page 7

Example: Descriptive and Inferential Statistics

A large sample of men, aged 48, was studied for 18 years. For unmarried men, approximately 70% were alive at age 65. For married men, 90% were alive at age 65. (Source: The Journal of Family Issues)

1. Which part of the study represents the descriptive branch of statistics?

2. What conclusions might be drawn from the study using inferential statistics?

Descriptive Statistics
• For unmarried men, approximately 70% were alive at age 65.
• For married men, 90% were alive at age 65.
• The chart

A possible conclusion
• Being married is associated with a longer life for men.

Page 8

1.2 Types of Data

Qualitative data
Consists of attributes, labels, or nonnumerical entries.
Examples: Major, Place of birth, Eye color

Quantitative data
Numerical measurements or counts.
Examples: Age, Weight of a letter, Temperature

Example: The base prices of several vehicles are shown in the table. Which data are qualitative data and which are quantitative data? (Source: Ford Motor Company)

Quantitative Data (Base prices of vehicle models are numerical entries)

Qualitative Data (Names of vehicle models are nonnumerical entries)

Page 9

Levels of Measurement

Nominal level of measurement
• Qualitative data only
• Categorized with names, labels, or qualities
• No mathematical computations can be made
Example: Lists the call letters (names) of each network affiliate.

Ordinal level of measurement
• Qualitative or quantitative data
• Data can be arranged in order
• Differences between data entries are not meaningful
Example: Lists the rank of five TV programs. The data can be ordered, but the difference between ranks has no meaning. (Source: Nielsen Media Research)

Interval level of measurement
• Quantitative data only
• Data can be ordered
• Meaningful difference between data entries
• Zero represents a position on a scale (not an inherent zero; zero does not imply “none”)
Example: Quantitative data. Can find a difference between two dates, but a ratio does not make sense. (Source: Major League Baseball)

Ratio level of measurement
• Similar to interval, WITH a zero entry that is an inherent zero (implies “none”)
• A ratio of two data values can be formed
• One data value can meaningfully be expressed as a multiple of another
Example: Can find differences. Can write ratios. “Twice as much” makes sense.

Page 10

Summary of Four Levels of Measurement

Level of      Put data in   Arrange data   Subtract      Determine if one data value
measurement   categories    in order       data values   is a multiple of another
Nominal       Yes           No             No            No
Ordinal       Yes           Yes            No            No
Interval      Yes           Yes            Yes           No
Ratio         Yes           Yes            Yes           Yes

There is a good summary of the levels with examples on P. 14 of your book.
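
The summary table can also be written as a small lookup, which makes it easy to check whether an operation is meaningful at a given level. This is only a sketch: the dictionary keys and the `allows` helper are names invented for illustration, and the Yes/No values are copied from the table.

```python
# Which operations are meaningful at each level, copied from the summary table.
LEVELS = {
    "nominal":  {"categorize": True, "order": False, "subtract": False, "multiple": False},
    "ordinal":  {"categorize": True, "order": True,  "subtract": False, "multiple": False},
    "interval": {"categorize": True, "order": True,  "subtract": True,  "multiple": False},
    "ratio":    {"categorize": True, "order": True,  "subtract": True,  "multiple": True},
}

def allows(level, operation):
    """Return True if the operation is meaningful at this level of measurement."""
    return LEVELS[level][operation]

# Dates are interval data: differences make sense, ratios do not.
print(allows("interval", "subtract"))
print(allows("interval", "multiple"))
```

Each level permits everything the level before it permits, plus one more operation, so the rows of the table form a staircase.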

Page 11

Section 1.3 Experimental Design

Goal: Collect data and use the data to make decisions

Topics

• Data Collection Techniques: Observation, Experiment, Simulation, Survey

• Designing a statistical study (*)

• Designing an experiment (*)

• Sampling Techniques: Random, Stratified, Cluster, Systematic, Biased

(*) In a study, the researcher does not influence the responses. In an experiment, the researcher applies a ‘treatment’, then observes the responses.

You may never need to develop a statistical study, BUT you will likely need to interpret the results of one, AND you WILL need to determine if the results are valid!

Page 12

Designing a Statistical Study

1. Identify the variable(s) of interest (the focus) and the population of the study.

2. Develop a detailed plan for collecting data. If you use a sample, make sure the sample is representative of the population.

3. Collect the data.

4. Describe the data using descriptive statistics techniques.

5. Interpret the data and make decisions about the population using inferential statistics.

6. Identify any possible errors.

Page 13

Data Collection

Observational study

• A researcher observes and measures characteristics of interest of part of a population but does not change existing conditions.

Example: Researchers observed and recorded the mouthing behavior on nonfood objects of children up to three years old. (Source: Pediatric Magazine)

Experiment

• A treatment is applied to part of a population and responses are observed. Another part of the population may be used as a ‘control’ group, which receives a ‘placebo’ (no treatment).

Example: A group of diabetics took cinnamon extract daily while a control group took none. After 40 days, diabetics who had cinnamon reduced their risk of heart disease while the control group experienced no change. (Source: Diabetes Care)

Simulation

• Uses a mathematical or physical model to reproduce the conditions of a situation or process. Often involves the use of computers. Used when reproducing the real situation would be dangerous or impractical.

Example: Automobile manufacturers use simulations with dummies to study the effects of crashes on humans.

Survey

• An investigation of one or more characteristics of a population, commonly done by interview, mail or telephone. (Survey questions must be without bias)

Example: A survey is conducted on a sample of female physicians to determine whether the primary reason for their career choice is financial stability.

Page 14

Example: Methods of Data Collection

Which method of data collection would you use for each study?

1. A study of the effect of changing flight patterns on the number of airplane accidents.

Solution: Simulation (Impractical to create this situation)

2. A study of the effect of eating oatmeal on lowering blood pressure.

Solution: Experiment (Measure the effect of a treatment – eating oatmeal)

3. A study of how fourth grade students solve a puzzle.

Solution: Observational study (observe and measure certain characteristics of part of a population)

4. A study of U.S. residents’ approval rating of the U.S. president.

Solution: Survey (Ask “Do you approve of the way the president is handling his job?”)

Page 15

Key Elements of Experimental Design

Control, Randomization, Replication

• Control for effects other than the one being measured.

• Confounding variables
Occur when an experimenter cannot tell the difference between the effects of different factors on a variable.
Example: A coffee shop owner remodels her shop at the same time a nearby mall has its grand opening. If business at the coffee shop increases, it cannot be determined whether it is because of the remodeling or the new mall.

• Placebo effect
A subject reacts favorably to a placebo when in fact he or she has been given no medical treatment at all.
Blinding is a technique where the subject does not know whether he or she is receiving a treatment or a placebo.
In a double-blind experiment, neither the subject nor the experimenter knows if the subject is receiving a treatment or a placebo. (Researchers prefer this.)

Page 16

Key Elements of Experimental Design

• Randomization: randomly assigning subjects to different treatment groups.

• Completely randomized design
Subjects are assigned to different treatment groups through random selection.

• Randomized block design
Divide subjects with similar characteristics into blocks, and then within each block, randomly assign subjects to treatment groups.
Example: An experimenter testing the effects of a new weight loss drink may first divide the subjects into age categories. Then within each age group, randomly assign subjects to either the treatment group or control group.

• Matched pairs design
Subjects are paired up according to a similarity. One subject in the pair is randomly selected to receive one treatment while the other subject receives a different treatment.
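
The completely randomized and randomized block designs can be sketched in a few lines of Python. The subject list, the age cutoff of 40, the function names, and the fixed seed are all assumptions made for illustration; the slides do not prescribe any of them.

```python
import random

def completely_randomized(subjects, rng):
    """Shuffle all subjects, then split them into treatment and control."""
    shuffled = subjects[:]          # copy so the caller's list is unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

def randomized_block(subjects, block_of, rng):
    """Group subjects into blocks, then randomize separately inside each block."""
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)
    return {name: completely_randomized(members, rng)
            for name, members in blocks.items()}

def age_group(subject):
    """Hypothetical blocking rule: split subjects at age 40."""
    _, age = subject
    return "under 40" if age < 40 else "40 and over"

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical subjects as (id, age) pairs.
subjects = list(enumerate([23, 35, 47, 52, 29, 61, 38, 44]))

design = randomized_block(subjects, age_group, rng)
for block, groups in design.items():
    print(block, groups)
```

Because the random assignment happens inside each age block, both the treatment and control groups contain subjects from every block, which is the point of blocking.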

Page 17

Key Elements of Experimental Design

Replication is the repetition of an experiment using a large group of subjects.

Example: To test a vaccine against a strain of influenza, 10,000 people are given the vaccine and another 10,000 people are given a placebo. Because of the sample size, the effectiveness of the vaccine would most likely be observed.

Another Key Element: Sample Size
The number of subjects in a study is very important and will be discussed at various times throughout the course.

Page 18

Example: Experimental Design

A company wants to test the effectiveness of a new gum developed to help people quit smoking. Identify a potential problem with the given experimental design and suggest a way to improve it.

The company identifies one thousand adults who are heavy smokers. The subjects are divided into blocks according to gender. After two months, the female group has a significant number of subjects who have quit smoking.

Problem: The groups are not similar. The new gum may have a greater effect on women than men, or vice versa.

Correction: The subjects can be divided into blocks according to gender, but then within each block, they must be randomly assigned to be in the treatment group or the control group.

Page 19

Sampling Techniques

Simple Random Sample
Every possible sample of the same size has the same chance of being selected.


• Assign a number to each member of the population.

• Generate random numbers using a random number table, a software program, or a calculator.

• Members of the population that correspond to these numbers become members of the sample.
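
When software is used instead of a random number table, the same procedure takes only a few lines. This is a minimal sketch using Python's standard `random` module; the population of 50 numbered members, the sample size of 5, and the seed are arbitrary assumptions.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Assign a number to each member of a hypothetical population of 50.
population = list(range(1, 51))

# Draw 5 distinct members; every 5-member subset is equally likely.
sample = rng.sample(population, 5)

print(sorted(sample))
```

`sample` draws without replacement, so no member can be selected twice.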

Page 20

Simple Random Sample

Example: There are 731 students currently enrolled in statistics at your school. You wish to form a sample of eight students to answer some survey questions. Select the students who will belong to the simple random sample.

Step 1: Assign numbers 1 to 731 to each student taking statistics.

Step 2: On the table of random numbers, choose a starting place at random (suppose you start in the third row, second column).

Step 3: Read the digits in groups of three.

Step 4: Ignore numbers greater than 731.

The students assigned numbers 719, 662, 650, 4, 53, 589, 403, and 129 would make up the sample.
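
The table-reading procedure in this example can be sketched as a small function. The digit string below is NOT the textbook's random number table; it is a hypothetical string constructed so that the reading reproduces the eight student numbers given in the example, with two out-of-range groups (858 and 940) added to show the skipping rule. The function name and parameters are also illustrative, and skipping repeated numbers is an added assumption.

```python
def read_sample(digits, population_size, sample_size, width=3):
    """Read digits in fixed-width groups, keeping valid, unused numbers."""
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        # Skip 000, numbers above the population size, and repeats.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

# Hypothetical digits standing in for a row of a random number table.
digits = "719662650004053858940589403129"

sample = read_sample(digits, population_size=731, sample_size=8)
print(sample)
```

Leading zeros matter here: the group 004 is read as student 4 and 053 as student 53, which is how single- and two-digit student numbers can be selected from three-digit groups.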

Page 21

Other Sampling Techniques

Stratified sample

• Divide a population into groups (strata) and select a random sample from each group.

• To collect a stratified sample of the number of people who live in West Ridge County households, you could divide the households into socioeconomic levels and then randomly select households from each level.

Cluster sample

• Divide the population into groups (clusters) and select all of the members in one or more, but not all, of the clusters.

• In the West Ridge County example you could divide the households into clusters according to zip codes, then select all the households in one or more, but not all, zip codes.

Systematic sample

• Choose a starting value at random. Then choose every kth member of the population.

• In the West Ridge County example you could assign a different number to each household, randomly choose a starting number, then select every 100th household.
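
The three techniques can be compared side by side in code. This is a sketch under stated assumptions: the 1,000 numbered households, the three socioeconomic strata, the four zip-code clusters, and all function names are hypothetical and chosen only to mirror the West Ridge County example.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Draw a random sample from every stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick some clusters at random and keep ALL members of those clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def systematic_sample(population, k, rng):
    """Pick a random start within the first k members, then take every kth."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical county of 1,000 households, numbered 1 to 1000.
households = list(range(1, 1001))
strata = {"low": households[:400], "middle": households[400:800], "high": households[800:]}
clusters = {"zip A": households[:250], "zip B": households[250:500],
            "zip C": households[500:750], "zip D": households[750:]}

by_stratum = stratified_sample(strata, per_stratum=5, rng=rng)
by_cluster = cluster_sample(clusters, n_clusters=2, rng=rng)
every_100th = systematic_sample(households, k=100, rng=rng)

print(by_stratum)
print(len(by_cluster))
print(every_100th)
```

The key contrast is that stratified sampling takes some members from every group, while cluster sampling takes every member from some groups.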

Page 22

Example: Identifying Sampling Techniques

You are doing a study to determine the opinion of students at your school regarding stem cell research. Identify the sampling technique used.

1. You divide the student population with respect to majors and randomly select and question some students in each major.

Solution: Stratified sampling (the students are divided into strata (majors) and a sample is selected from each major)

2. You assign each student a number and generate random numbers. You then question each student whose number is randomly selected.

Solution: Simple random sample (each sample of the same size has an equal chance of being selected and each student has an equal chance of being selected)