PSY 307 – Statistics for the Behavioral Sciences Chapter 2 – Describing Data with Tables and Graphs

BHS 307 – Statistics for the Behavioral Sciencesnalvarado/BHS307PPTs/Witte PDFs/Chap2.pdf · PSY 307 – Statistics for the Behavioral Sciences Chapter 2 – Describing Data with

Sep 04, 2018

ngokhue

PSY 307 – Statistics for the Behavioral Sciences

Chapter 2 – Describing Data with Tables and Graphs

Frequency Distributions

One of the simplest forms of measurement is counting

How many people show a characteristic, have a given value or are members of a category.

Frequency distributions count how many observations exist for each value for a particular variable.

Frequency Table

A frequency table is a collection of observations:

Sorted into classes.

Showing the frequency for each class.

A “class” is a group of observations.

When each class consists of a single value, the data are considered to be ungrouped.

Creating a Table

List the possible values.

Count how many observations exist for each possible value.

One way to do this is using hash-marks and crossing off each value.

Find the corresponding percent for each class by dividing each frequency by the total number of scores and multiplying by 100.

Unorganized Data

1, 5, 3, 3, 6, 2, 1, 5, 2, 1, 2, 6, 3, 4, 1, 6, 2, 4, 4, 2

A set of observations like this is difficult to find patterns in or interpret.
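As a rough illustration, the counting step can be sketched in Python with the 20 observations listed above. The variable names are chosen for this sketch only and are not part of the slides.

```python
from collections import Counter

# The 20 observations from the slide above.
scores = [1, 5, 3, 3, 6, 2, 1, 5, 2, 1, 2, 6, 3, 4, 1, 6, 2, 4, 4, 2]

# Count how many observations exist for each value.
counts = Counter(scores)

# List every possible value, including any that would have zero frequency.
possible_values = range(min(scores), max(scores) + 1)
frequency_table = [(value, counts[value]) for value in possible_values]

print("Value  Frequency")
for value, frequency in frequency_table:
    print(f"{value:>5}  {frequency:>9}")
print(f"Total  {sum(counts.values()):>9}")
```

Once the scores are sorted into classes, the pattern (for example, that 2 is the most frequent value) is much easier to see than in the raw list.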

Example

When to Create Groups

Grouping is a convenience that makes it easier for people to understand the data.

Ungrouped data should have fewer than 20 possible values or classes (not fewer than 20 scores, cases or observations).

Identities of individual observations are lost when groups are created.

Guidelines for Grouping

See pgs 29-30 in text.

Each observation should be included in one and only one class.

List all classes, even those with 0 frequency (no observations).

All classes with upper & lower boundaries should be equal in width.

Optional Guidelines

All classes should have an upper and lower boundary.

Open-ended classes do occur.

Select an interval (width) that is natural to think about:

5 or 10 are convenient, 13 is not.

The lower boundary should be a multiple of class width (245-249).

Aim for a total of about 10 classes.
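The grouping guidelines can be sketched in code. This is a minimal illustration, assuming whole-number scores (so the unit of measurement is 1); the function names and the weights are invented for this sketch and do not come from the text.

```python
def class_for(score, width):
    """Return the (lower, upper) boundaries of the class containing score."""
    lower = (score // width) * width   # lower boundary is a multiple of the width
    upper = lower + width - 1          # leaves a one-unit gap before the next class
    return lower, upper


def grouped_frequencies(scores, width):
    """Count whole-number scores per class, listing empty classes too."""
    first_lower = class_for(min(scores), width)[0]
    last_lower = class_for(max(scores), width)[0]
    table = {}
    for lower in range(first_lower, last_lower + 1, width):
        table[(lower, lower + width - 1)] = 0   # every class starts at 0
    for score in scores:
        table[class_for(score, width)] += 1     # one and only one class per score
    return table


# Hypothetical weights in pounds, invented for this sketch.
weights = [132, 147, 150, 158, 160, 161, 165, 172, 189, 201, 245]
for (lower, upper), frequency in grouped_frequencies(weights, 10).items():
    print(f"{lower}-{upper}: {frequency}")
```

With a width of 10 these weights fall into 12 equal-width classes, close to the suggested total of about 10, and classes with no observations are still listed.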

Gaps Between Classes

With continuous data, there is an implied gap between where one boundary ends and the other starts.

The size of the gap equals one unit of measurement – the smallest possible difference between scores.

That way no observations can ever fall within that gap.

Class sizes account for this.

Relative Frequency

Relative frequency – frequency of each class as a fraction (%) of the total frequency for the distribution.

Relative frequency lets you compare two distributions of different sizes.

Obtain the fraction by dividing the frequency for each group by the total frequency.

Total = 1.00 (100%)

Example

Total = 20

4/20 = .20 or 20%

5/20 = .25 or 25%

3/20 = .15 or 15%

3/20 = .15 or 15%

2/20 = .10 or 10%

3/20 = .15 or 15%

Total = 1.0 or 100%
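The same arithmetic can be sketched in Python. The frequencies below are the counts of the values 1 through 6 in the 20 unorganized observations listed earlier; the variable names are assumptions of this sketch.

```python
# Frequency of each value (1 through 6) among the 20 observations.
frequencies = {1: 4, 2: 5, 3: 3, 4: 3, 5: 2, 6: 3}
total = sum(frequencies.values())

# Relative frequency = class frequency divided by total frequency.
relative = {value: f / total for value, f in frequencies.items()}

for value, proportion in relative.items():
    print(f"{value}: {frequencies[value]}/{total} = {proportion:.2f} or {proportion:.0%}")

# The proportions should sum to 1.00 (100%), apart from rounding error.
print(f"Total = {sum(relative.values()):.2f}")
```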

Cumulative Frequency

Cumulative frequency – the total number of observations in a class plus all lower-ranked classes.

Used to compare relative standing of individual scores within two distributions.

Add the frequency of each class to the frequencies of those below it.
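A running total is one way to sketch this. The example assumes the classes are listed from lowest to highest and reuses the frequencies of the values 1 through 6 from the earlier 20 observations.

```python
from itertools import accumulate

values = [1, 2, 3, 4, 5, 6]          # classes, lowest first
frequencies = [4, 5, 3, 3, 2, 3]     # frequency of each class

# Each entry is the class frequency plus the frequencies of all lower classes.
cumulative = list(accumulate(frequencies))

for value, frequency, running_total in zip(values, frequencies, cumulative):
    print(f"{value}: f = {frequency}, cumulative f = {running_total}")
```

The cumulative frequency of the highest class always equals the total number of observations, here 20.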

Relative Frequency (Percent) and Cumulative Frequency

Cumulative Proportion (Percent)

The cumulative proportion or percent is the relative cumulative frequency.

Percent = proportion × 100

It allows comparison of cumulative frequencies across two distributions.

To obtain cumulative proportions, divide the cumulative frequency by the total frequency for each class.

Highest class = 1.00 (100%)
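As a hedged sketch, the cumulative proportions and percents for the earlier 20 observations (values 1 through 6, lowest class first) could be computed as follows; the variable names are illustrative only.

```python
from itertools import accumulate

frequencies = [4, 5, 3, 3, 2, 3]   # classes 1 through 6, lowest first
total = sum(frequencies)

cumulative = list(accumulate(frequencies))

# Cumulative proportion = cumulative frequency / total frequency.
cumulative_proportion = [c / total for c in cumulative]

# Percent = proportion x 100.
cumulative_percent = [100 * p for p in cumulative_proportion]

for c, p, pct in zip(cumulative, cumulative_proportion, cumulative_percent):
    print(f"cumulative f = {c:>2}, proportion = {p:.2f}, percent = {pct:.0f}%")
```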

Percentile Ranks

Percentile rank – percent of observations with the same or lower values than a given observation.

Find the score, then use the cumulative percent as the percentile rank:

Exact ranks can be found from ungrouped data.

Only approximate ranks can be found from grouped data.
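For ungrouped data, the definition above translates directly into a short function. This is a sketch using the 20 observations from earlier in the chapter; `percentile_rank` is a name made up for this example.

```python
def percentile_rank(scores, score):
    """Percent of observations with the same or lower value than score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


scores = [1, 5, 3, 3, 6, 2, 1, 5, 2, 1, 2, 6, 3, 4, 1, 6, 2, 4, 4, 2]

for value in sorted(set(scores)):
    print(f"score {value}: percentile rank {percentile_rank(scores, value):.0f}")
```

The result for each score matches the cumulative percent of its class, which is why the cumulative percent column can be read as a percentile rank.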

Qualitative Data

Some categories are ordered (can be placed in a meaningful order):

Military ranks, levels of schooling (elementary, high school, college)

Frequencies can be converted to relative frequencies.

Cumulative frequencies only make sense for ordered categories.

Interpreting Tables

First read the title, column headings and any footnotes.

Where do the data come from (what is the source)?

Next, consider whether the table is well constructed: does it follow the grouping guidelines?

Finally, look at the data and think about whether they make sense.

Focus on overall trends, not details.

Histograms

For quantitative data only.

Equal units across x axis represent groups.

Equal units across y axis represent frequency.

Use wiggly line to show breaks in the scale.

Bars are adjacent – no gaps.
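A rough text rendering can convey the idea without any plotting library. Note the simplification: this sketch draws the bars sideways, with classes down the side and frequency as bar length, rather than with classes along the x axis. The function name is an assumption of this example.

```python
from collections import Counter


def text_histogram(scores):
    """Return one line per value, with a bar of '#' marks for its frequency."""
    counts = Counter(scores)
    lines = []
    # Walk every value in order so adjacent classes produce adjacent bars.
    for value in range(min(scores), max(scores) + 1):
        lines.append(f"{value:>3} | {'#' * counts[value]}")
    return lines


scores = [1, 5, 3, 3, 6, 2, 1, 5, 2, 1, 2, 6, 3, 4, 1, 6, 2, 4, 4, 2]
print("\n".join(text_histogram(scores)))
```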

Histogram Applets

http://www.stat.sc.edu/~west/javahtml/Histogram.html

Uses Old Faithful geyser data

http://www.shodor.org/interactivate/activities/histogram/?version=1.6.0_11&browser=MSIE&vendor=Sun_Microsystems_Inc.

Uses math SAT data

Notice that “bin width” refers to class or interval size.

SPSS automatically creates classes or intervals.

Frequency Polygons

Also called a line graph.

A histogram can be converted to a frequency polygon by connecting the midpoints of the tops of the bars.

Anchor the line to the x axis at the beginning and end of the distribution.

Two frequency polygons can be superimposed for comparison.
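The conversion can be sketched as computing the points the line passes through. This assumes equal-width classes given as (lower, upper) limits; the function name and the three classes below are hypothetical.

```python
def polygon_points(class_limits, frequencies):
    """Return (x, y) points of a frequency polygon, anchored to the x axis."""
    width = class_limits[1][0] - class_limits[0][0]   # equal-width classes assumed
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    points = list(zip(midpoints, frequencies))
    # Anchor the line at zero frequency one class width beyond each end.
    start = (midpoints[0] - width, 0)
    end = (midpoints[-1] + width, 0)
    return [start] + points + [end]


# Hypothetical classes and frequencies, invented for this sketch.
limits = [(130, 139), (140, 149), (150, 159)]
counts = [1, 4, 2]
for x, y in polygon_points(limits, counts):
    print(f"({x}, {y})")
```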

Creating a Line Graph from a Histogram

Stem-and-Leaf Displays

Constructing a display:

Notice the highest and lowest 10s.

Arrange 10s in ascending order.

Copy right-hand digits as leaves.

The resulting display resembles a frequency histogram.

Stems are whatever digits make sense to use.

Sample

Stem and leaf display showing the number of passing touchdowns.

3|2337

2|001112223889

1|2244456888899
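The display above can be reproduced with a short sketch. The scores are read back from the stems and leaves shown (for example, `3|2337` gives 32, 33, 33, 37); the function name is made up here, and stems are printed highest first only to match the sample.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Return display lines with the tens digit as stem and units as leaves."""
    leaves = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)   # tens digit, right-hand digit
        leaves[stem].append(str(leaf))
    # List every stem between the highest and lowest, highest first.
    stems = range(max(leaves), min(leaves) - 1, -1)
    return [f"{stem}|{''.join(leaves[stem])}" for stem in stems]


touchdowns = [
    32, 33, 33, 37,
    20, 20, 21, 21, 21, 22, 22, 22, 23, 28, 28, 29,
    12, 12, 14, 14, 14, 15, 16, 18, 18, 18, 18, 19, 19,
]
print("\n".join(stem_and_leaf(touchdowns)))
```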

The Best Graph Ever Drawn

Source: http://strangemaps.wordpress.com/

Details About the Graph

The map was the work of Charles Joseph Minard (1781-1870), a French civil engineer who was an inspector-general of bridges and roads, but whose most remembered legacy is in the field of statistical graphics.

The chart, or statistical graphic, is also a map. And a strange one at that. It depicts the advance into and retreat from Russia (1812-1813) by Napoleon’s Grande Armée, which was decimated by a combination of the Russian winter, the Russian army and its scorched-earth tactics. To my knowledge, this is the origin of the term ‘scorched earth’ – the retreating Russians burnt anything that might feed or shelter the French, thereby severely weakening Napoleon’s army. It unites temperature, time, geography and number of soldiers, all in one picture.

Purpose of Frequency Graphs

In statistics, we are interested in the shapes of distributions because they tell us what statistics to use.

They let us identify outliers that might distort the statistics we will be using.

They present data so that readers can quickly and easily grasp its meaning.

Shapes of Distributions

Normal – bell-shaped and symmetrical.

Bimodal – two peaks.

Suggests presence of two different types of observations in the same data.

Positively skewed – lopsided due to extreme observations in right tail.

Negatively skewed – extreme observations in left tail.
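The direction of skew can also be checked numerically. The slides give no formula, so this sketch assumes one common definition, the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second); the data sets are invented.

```python
from statistics import fmean


def skewness(scores):
    """Moment-based skewness: positive for a long right tail, negative for left."""
    mean = fmean(scores)
    m2 = fmean([(x - mean) ** 2 for x in scores])   # second central moment
    m3 = fmean([(x - mean) ** 3 for x in scores])   # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data sets, invented for this sketch.
right_tail = [1, 2, 2, 3, 3, 3, 4, 20]    # one extreme high score
left_tail = [-x for x in right_tail]      # mirror image: extreme low score
symmetric = [1, 2, 3, 4, 5]

print(f"right tail: {skewness(right_tail):+.2f}")
print(f"left tail:  {skewness(left_tail):+.2f}")
print(f"symmetric:  {skewness(symmetric):+.2f}")
```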

Shapes of Graphs

[Figure: four example distributions, labeled bimodal, normal, positive skew and negative skew]

Heavy vs Light-tailed Distributions

Heavy-tailed – a distribution with more observations in its tails.

Light-tailed – a distribution with fewer observations in its tails and more in the center.

Kurtosis – a statistic that measures the shape of the distribution and the size of the tails.
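As an illustration only, kurtosis can be computed from central moments. The slides do not specify a formula; this sketch assumes the common "excess kurtosis" definition (fourth central moment divided by the squared second moment, minus 3), under which a normal distribution scores about 0. The data sets are invented.

```python
from statistics import fmean


def excess_kurtosis(scores):
    """Moment-based excess kurtosis: above 0 suggests heavier tails than normal."""
    mean = fmean(scores)
    m2 = fmean([(x - mean) ** 2 for x in scores])   # second central moment
    m4 = fmean([(x - mean) ** 4 for x in scores])   # fourth central moment
    return m4 / m2 ** 2 - 3


# Hypothetical data sets, invented for this sketch.
heavy = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]   # most scores central, two far out
light = [1, 2, 3, 4, 5]                      # evenly spread, no extreme scores

print(f"heavy-tailed example: {excess_kurtosis(heavy):+.2f}")
print(f"light-tailed example: {excess_kurtosis(light):+.2f}")
```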

Qualitative Data

Bar graphs – similar to histograms.

Bars do not touch.

Categorical groups are on x-axis.

Pie charts

Where tax money goes.

Misleading Graphs

Bars should be equal widths.

Bars should be two-dimensional, not three-dimensional.

When the lower bound of the y-axis (frequency) is cut off (not 0), the differences are exaggerated.

Height and width of the graph should be approximately equal.
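The effect of a cut-off y-axis can be shown with simple arithmetic. The two percentages and the baseline below are hypothetical numbers chosen for this sketch, and `apparent_ratio` is a made-up helper.

```python
def apparent_ratio(a, b, baseline=0):
    """Ratio of the drawn bar heights when the y-axis starts at baseline."""
    return (a - baseline) / (b - baseline)


# Hypothetical results for two groups, in percent.
group_a, group_b = 62, 54

honest = apparent_ratio(group_a, group_b)                  # y-axis starts at 0
truncated = apparent_ratio(group_a, group_b, baseline=53)  # y-axis starts at 53

print(f"Axis from 0:  bar A looks {honest:.2f} times as tall as bar B")
print(f"Axis from 53: bar A looks {truncated:.2f} times as tall as bar B")
```

The underlying values differ by only 8 percentage points, but on the truncated axis the first bar is drawn many times taller than the second.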

Gallup’s Terri Schiavo Poll

Misleading Scales

More Misleading Graphs

http://www.coolschool.ca/lor/AMA11/unit1/U01L02.htm

Constructing Graphs

Select the type of graph.

Place groups on the x-axis.

Place frequency on the y-axis.

Values for the groups and frequencies depend on the data.

Label the axes and give a title to the graph.

Parts of a Graph

Other Kinds of Graphs

Frequency is not the only measure that can be displayed on the y-axis.

We are using a graph to explore the shape of a distribution in this chapter.

Usually the y-axis shows the dependent variable while the x-axis shows groups (independent variable).

Graphs can be visually interesting!