
PSY 1950
Descriptive Statistics
September 17, 2008

• “It is well known to your Lordship, that the method practised by astronomers, in order to diminish the errors arising from the imperfections of instruments, and of the organs of sense, by taking the Mean of several observations, has not been so generally received, but that some persons, of considerable note, have been of opinion, and even publicly maintained, that one single observation, taken with due care, was as much to be relied on as the Mean of a great number.”
– Thomas Simpson to the Earl of Macclesfield, “On the Advantage of Taking the Mean of a Number of Observations, in practical Astronomy” (1755)

• “The science of Means may be summed up in two problems; (1) To find how far the difference between any proposed Means (e.g. the average mortalities in different occupations) is accidental, or indicative of a law; (2) To find what is the best Mean, whether for the purpose contemplated by the first problem, The Elimination of Chance, or other purposes.”
– F.Y. Edgeworth’s (1885) “Logic of Statistics”

Population versus Samples

– Everyone born in Scotland in 1932 (n = 87,498)


Types of Variables
• Discrete versus Continuous
• Physical versus social sciences
• Make an argument
– Group 1: Perception is discrete
– Group 2: Perception is continuous
– Group 3: Depression is discrete
– Group 4: Depression is continuous

• What data/knowledge would support or even prove your claim?

Types of measurement
Variable = academic interest in science
central tendency = mode
dispersion = bupkis (sort of)

Nominal
Individual   Interest
Paul         Science
Dave         Science
Larry        Humanities
Adam         Humanities

[Figure: bar chart over the categories Science and Humanities, axis 0 to 6; labels “Interest” and “Activism”]

Types of measurement
Variable = academic interest in science
central tendency = mode, median
dispersion = range, interquartile range

Ordinal
Individual   Interest
Paul         medium
Dave         high
Larry        low
Adam         low

[Figure: bar chart over the categories Low, Medium and High, axis 0 to 7; labels “Interest” and “Activism”]

Types of measurement
Variable = academic interest in science
central tendency = mode, median, mean
dispersion = range, interquartile range, std deviation

Interval
Individual   Interest
Paul         4
Dave         8
Larry        2
Adam         3

[Figure: scale running from “No interest” to “Extreme interest”, marked with an x]

Types of measurement
Variable = academic interest in science
central tendency = mode, median, mean
dispersion = range, interquartile range, std deviation

Ratio
Individual   Interest
Paul         3
Dave         7
Larry        1
Adam         2

[Figure: scale running from “No interest” to “Extreme interest”, marked with an x]
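The four measurement slides can be tried out directly on their own data. The sketch below uses only Python's standard `statistics` module and the values for Paul, Dave, Larry and Adam; the numeric codes for low/medium/high are an assumption made for illustration, not part of the slides.

```python
import statistics

# Interest data for Paul, Dave, Larry, Adam, copied from the four slides.
nominal = ["Science", "Science", "Humanities", "Humanities"]
ordinal = ["medium", "high", "low", "low"]
interval = [4, 8, 2, 3]
ratio = [3, 7, 1, 2]

# Nominal: only the mode is meaningful (here two categories tie).
nominal_modes = statistics.multimode(nominal)

# Ordinal: order supports a median. The 0/1/2 codes are an assumed ranking.
rank = {"low": 0, "medium": 1, "high": 2}
label = {code: name for name, code in rank.items()}
codes = [rank[x] for x in ordinal]
ordinal_mode = statistics.mode(ordinal)
ordinal_median_low = label[statistics.median_low(codes)]
ordinal_median_high = label[statistics.median_high(codes)]

# Interval and ratio: mean and standard deviation become meaningful.
interval_mean = statistics.mean(interval)
interval_sd = statistics.pstdev(interval)
ratio_mean = statistics.mean(ratio)
ratio_sd = statistics.pstdev(ratio)

# Ratio only: a true zero makes "Dave has 7 times Larry's interest" meaningful.
dave_vs_larry = ratio[1] / ratio[2]

print(nominal_modes, ordinal_mode, ordinal_median_low, ordinal_median_high)
print(interval_mean, ratio_mean, interval_sd, ratio_sd, dave_vs_larry)
```

The ratio scores are the interval scores shifted down by one, so the mean moves (4.25 to 3.25) while the standard deviation stays the same.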

• “The numbers do not remember where they came from.”
– Lord, F.M. (1953). On the statistical treatment of football numbers. The American Psychologist, 8, 750-751.

Correlational and Experimental Methods

• Complementary methods
• Ecological validity versus inferential power
• Hypothesis generation versus hypothesis testing
• Others?

Correlational vs. Experimental Methods

• Cronbach, L. (1957). The two disciplines of scientific psychology, American Psychologist, 12, 671-684. (http://psychclassics.yorku.ca/Cronbach/Disciplines/)

– Cattell (1898): experimentalists’ “regard for the body of nature becomes that of the anatomist rather than that of the lover”

– Bartlett (1955): correlationists “chanting in unaccustomed harmony the words of the old jingle ‘God has a plan for every man, and he has a plan for you’”

• “We will come to realize that organism and treatment are an inseparable pair and that no psychologist can dismiss one or the other as error variance.”

Statistics and Methodology are Inseparable

• The use and even conception of descriptive statistics varies with experimental/correlational approach
– Shape
– Central tendency
– Dispersion

Shape
• Modality
• Symmetry
• Kurtosis

Shape: Modality


Shape: Symmetry/Skewness

Shape: Kurtosis

Joanes, D. N., & Gill, C. A. (1998). Comparing measures of sample skewness and kurtosis. Journal of the Royal Statistical Society (Series D): The Statistician, 47, 183-189.

DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological Methods, 2, 292-306.
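For a concrete handle on symmetry and kurtosis, here is a minimal sketch of the simplest moment-based sample measures, usually written g1 (skewness) and g2 (excess kurtosis). Several competing sample estimators exist; this shows only the plain moment ratios, and the two small data sets are made up for illustration.

```python
def central_moment(xs, k):
    """k-th central moment, dividing by the number of scores."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)

def skewness_g1(xs):
    """Third central moment scaled by the 1.5 power of the second."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis_g2(xs):
    """Fourth central moment over the squared second, minus 3 (normal = 0)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0

symmetric = [1, 2, 3, 4, 5]        # hypothetical, perfectly symmetric scores
right_skewed = [1, 1, 1, 2, 10]    # hypothetical, one large score in the right tail

print(skewness_g1(symmetric), excess_kurtosis_g2(symmetric))
print(skewness_g1(right_skewed))
```

A symmetric set gives g1 = 0; a long right tail gives g1 > 0. The flat five-point set has negative excess kurtosis, i.e. lighter tails than a normal distribution.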

Histogram Bin Size

Shimazaki, H. and Shinomoto, S. (2007) A method for selecting the bin size of a time histogram. Neural Computation, 19(6), 1503-1527.

http://www.ton.scphys.kyoto-u.ac.jp/~hideaki/res/sshist/cur/bin/histogram_appli.html
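A rough sketch of the bin-size idea, as I understand the cost function from the paper cited above: for each candidate bin width Δ, count the scores per bin, take the mean k and the (divide-by-number-of-bins) variance v of those counts, and pick the width that minimises (2k − v)/Δ². The simulated scores, the seed and the candidate range of 2 to 50 bins are all assumptions for illustration.

```python
import random

def bin_counts(data, n_bins):
    """Counts in n_bins equal-width bins spanning min(data)..max(data)."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        i = min(int((x - lo) / width), n_bins - 1)  # the maximum goes in the last bin
        counts[i] += 1
    return counts, width

def bin_cost(data, n_bins):
    """Cost (2k - v) / width^2; smaller is better."""
    counts, width = bin_counts(data, n_bins)
    k_mean = sum(counts) / n_bins
    k_var = sum((k - k_mean) ** 2 for k in counts) / n_bins  # biased variance
    return (2 * k_mean - k_var) / width ** 2

def best_bin_count(data, candidates=range(2, 51)):
    """Candidate number of bins with the lowest cost."""
    return min(candidates, key=lambda m: bin_cost(data, m))

rng = random.Random(0)                                # fixed seed, arbitrary choice
scores = [rng.gauss(30, 5) for _ in range(1000)]      # assumed normal scores
best = best_bin_count(scores)
print("bins chosen:", best)
```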


Central Tendency
• Mean, Median, Mode
• Selection criteria
– skew, extreme scores
– undetermined values
– open-ended distributions
– discrete variables
– (arithmetic manipulability)
– (population estimator)

• Mean number of calories burned in an “extremely passionate” one-minute kiss:
– 26

• Mean number of calories in a Hershey’s kiss:
– 25

• Percentage of participants predicted to administer an apparently lethal shock to a stranger: 1%

• Percentage of participants who actually administered an apparently lethal shock to a stranger: 65%


Estimating Population Central Tendency

• Efficiency

(10,000 samples of n=100 from population with µ=30, σ=5)
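The efficiency comparison can be re-run as a small simulation. This is a sketch, not the code behind the original figure: it assumes the population is normal (the slide gives only µ and the standard deviation) and uses an arbitrary seed. The more efficient estimator is the one whose sampling distribution is narrower.

```python
import random

def median(xs):
    """Middle score, or the average of the two middle scores."""
    s = sorted(xs)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

def sd(xs):
    """Standard deviation of a list, dividing by its length."""
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

rng = random.Random(1950)                 # arbitrary seed
MU, SIGMA, N, REPS = 30.0, 5.0, 100, 10_000

means, medians = [], []
for _ in range(REPS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]  # normality is assumed
    means.append(sum(sample) / N)
    medians.append(median(sample))

se_mean = sd(means)       # spread of the sample means
se_median = sd(medians)   # spread of the sample medians
print(round(se_mean, 3), round(se_median, 3))
```

Under these assumptions the sample means cluster more tightly around 30 than the sample medians do, so the mean is the more efficient estimator of a normal population's centre.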

Estimating Population Central Tendency

• Resistance

(10,000 samples of n=99 from population with µ=30, σ=5; 1 outlier added with value of 100)
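Resistance can be checked the same way. The sketch below again assumes a normal population and an arbitrary seed, draws 99 scores per sample, appends a single score of 100, and records how far the mean and the median are pulled away from 30.

```python
import random

def median(xs):
    """Middle score, or the average of the two middle scores."""
    s = sorted(xs)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

rng = random.Random(2008)                 # arbitrary seed
MU, SIGMA, N, REPS, OUTLIER = 30.0, 5.0, 99, 10_000, 100.0

means, medians = [], []
for _ in range(REPS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]  # normality is assumed
    sample.append(OUTLIER)                              # the one added outlier
    means.append(sum(sample) / len(sample))
    medians.append(median(sample))

mean_of_means = sum(means) / REPS
mean_of_medians = sum(medians) / REPS
print(round(mean_of_means, 2), round(mean_of_medians, 2))
```

The expected sample mean becomes (99 × 30 + 100)/100 = 30.7, while the median barely moves: the median is resistant to the outlier and the mean is not.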

Dispersion/Variability/Spread

• Range: Xmax − Xmin
• Interquartile range: Q3 − Q1
• Deviation-based measures

Deviation
• Deviation = X − µ
– Direction
– Distance
• Mean deviation = Σ(X − µ)/N
– Always zero
• Mean absolute deviation (MAD) = Σ|X − µ|/N
– Advantage: intuitive
– Disadvantage: mathematically cumbersome
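A worked example of why the mean deviation is useless as a spread measure and the MAD is not. It treats the four interval-scale interest scores from the earlier slide (4, 8, 2, 3) as a complete population, which is an assumption made only for this illustration.

```python
scores = [4, 8, 2, 3]                 # treated here as a whole population
N = len(scores)
mu = sum(scores) / N                  # 4.25

deviations = [x - mu for x in scores]             # sign = direction, size = distance
mean_deviation = sum(deviations) / N              # positives and negatives cancel
mad = sum(abs(d) for d in deviations) / N         # keep distance, drop direction

print(deviations)       # [-0.25, 3.75, -2.25, -1.25]
print(mean_deviation)   # 0.0
print(mad)              # 1.875
```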

Deviation
• Mean squared deviation (variance) = Σ(X − µ)²/N = SS/N
– Advantage: mathematically practical
– Disadvantage: non-intuitive, different units
• Standard deviation (σ) = √variance
• Computational vs. definitional formulae
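The definitional and computational formulae give the same sum of squares: SS = Σ(X − µ)² = ΣX² − (ΣX)²/N. A short check, assuming the scores 4, 8, 2, 3 from the interval-scale slide form the whole population:

```python
scores = [4, 8, 2, 3]                 # treated here as a whole population
N = len(scores)
mu = sum(scores) / N

# Definitional formula: sum the squared deviations from the mean.
ss_definitional = sum((x - mu) ** 2 for x in scores)

# Computational formula: needs only the sum of X and the sum of X squared.
sum_x = sum(scores)                   # 17
sum_x2 = sum(x * x for x in scores)   # 93
ss_computational = sum_x2 - sum_x ** 2 / N

variance = ss_definitional / N        # in squared units
sd = variance ** 0.5                  # back in the original units

print(ss_definitional, ss_computational)   # 20.75 20.75
print(variance, round(sd, 3))              # 5.1875 2.278
```

The computational form is handy for hand calculation, but with large values and small spread it can lose precision in floating point, so the definitional form is usually the safer one in code.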


Estimating population dispersion
• Bias
– See http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html

Why n-1 for samples?
• Population with σ² = 25
• 10,000 samples of n = 25
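The n-1 demonstration is easy to reproduce. This sketch assumes a normal population with variance 25; the mean of 30 and the seed are further assumptions, and the mean does not affect the result. It averages SS/n and SS/(n-1) over 10,000 samples of n = 25.

```python
import random

rng = random.Random(25)                   # arbitrary seed
MU, SIGMA, N, REPS = 30.0, 5.0, 25, 10_000   # MU and normality are assumptions

divide_by_n, divide_by_n_minus_1 = [], []
for _ in range(REPS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    m = sum(sample) / N                            # sample mean, not µ
    ss = sum((x - m) ** 2 for x in sample)         # sum of squared deviations
    divide_by_n.append(ss / N)
    divide_by_n_minus_1.append(ss / (N - 1))

mean_biased = sum(divide_by_n) / REPS
mean_unbiased = sum(divide_by_n_minus_1) / REPS
print(round(mean_biased, 2), round(mean_unbiased, 2))
```

Dividing by n underestimates the population variance on average (expected value 25 × 24/25 = 24), because deviations are taken from the sample mean, which sits closer to the sample's own scores than µ does. Dividing by n-1 removes that bias.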

Why n-1 for samples?
• degrees of freedom
– the rank of a quadratic form
– the number of independent observations in a sample of data that are available to estimate a parameter of the population from which that sample is drawn
– the number of scores in the sample that are independent and free to vary
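The "free to vary" definition can be made concrete. Once the sample mean is fixed, the deviations must sum to zero, so any n-1 of them determine the last one. The scores 4, 8, 2, 3 are borrowed from the interval-scale slide purely as an example sample.

```python
scores = [4, 8, 2, 3]                     # example sample, n = 4
n = len(scores)
mean = sum(scores) / n

deviations = [x - mean for x in scores]   # these always sum to zero
free = deviations[:-1]                    # n - 1 deviations can be anything...
implied_last = -sum(free)                 # ...but the last is then forced

print(deviations[-1], implied_last)       # -1.25 -1.25
```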

Walker, H.M. (1940). Degrees of freedom. Journal of Educational Psychology, 31, 253-269.

Good, I.J. (1973). What are degrees of freedom? The American Statistician, 27(5), 227-228.

http://www.tufts.edu/~gdallal/dof.htm
http://www.creative-wisdom.com/computer/sas/df.html

Why not other dispersion measures?
– MAD (population MAD = 3.99), range
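A sketch of how the sample MAD and the sample range behave as estimators. It assumes a normal population with a standard deviation of 5, for which the population MAD is 5 × √(2/π) ≈ 3.99, matching the figure on the slide; the mean of 30, the sample sizes and the seed are assumptions.

```python
import math
import random

MU, SIGMA = 30.0, 5.0                               # assumed normal population
population_mad = SIGMA * math.sqrt(2 / math.pi)     # about 3.99

def simulate(n, reps, rng):
    """Average sample MAD (about the sample mean) and average sample range."""
    mad_total = range_total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
        m = sum(sample) / n
        mad_total += sum(abs(x - m) for x in sample) / n
        range_total += max(sample) - min(sample)
    return mad_total / reps, range_total / reps

rng = random.Random(399)                            # arbitrary seed
mean_mad_25, mean_range_25 = simulate(25, 10_000, rng)
mean_mad_100, mean_range_100 = simulate(100, 2_000, rng)

print(round(population_mad, 2), round(mean_mad_25, 2))
print(round(mean_range_25, 1), round(mean_range_100, 1))
```

Under these assumptions the sample MAD runs a little below the population MAD on average, and the sample range keeps growing with n, so it is not estimating any fixed population value.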

Choosing a DV: Central Tendency vs. Dispersion


• What is the mechanism?
– e.g., sex differences in IQ
• Are deviations nuisance or essence?
– e.g., heart rate
• Equal and opposite effects?
– e.g., subject × treatment interactions
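A made-up illustration of the "equal and opposite effects" point: if a treatment raises scores for half the subjects and lowers them by the same amount for the other half, the group mean does not move but the dispersion does. All numbers below are hypothetical.

```python
import statistics

# Hypothetical baseline scores for ten subjects.
baseline = [8, 9, 10, 11, 12, 8, 9, 10, 11, 12]

# Hypothetical subject x treatment interaction: +3 for five subjects, -3 for five.
effects = [+3] * 5 + [-3] * 5
treated = [b + e for b, e in zip(baseline, effects)]

mean_before = statistics.mean(baseline)
mean_after = statistics.mean(treated)
sd_before = statistics.pstdev(baseline)
sd_after = statistics.pstdev(treated)

print(mean_before, mean_after)                  # 10 10
print(round(sd_before, 2), round(sd_after, 2))  # 1.41 3.32
```

A comparison of means would report no treatment effect here; a comparison of standard deviations would catch it.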
