My PPT on graphical representation of data

Post on 15-Nov-2014

DESCRIPTION

This PPT was prepared for 3MPh, 3MPA, 3MPB, and 3MIT.

Transcript

Graphical representation of data

Compiled by Pedup Dukpa (for third-year Paro College of Education students)

Now, let's get started

What do we mean by graphs?
Who invented this versatile device called graphs?
Why are graphs important?
When do we use/apply graphs?
How do we know what kinds of graphs to use?
Where do we usually see graphs?

My personal definition of graphs:

I believe graphs to be a universal language that speaks volumes in terms of visual representation of data; that can be easily read and understood by everyone.

Inventor of graphs:

William Playfair (1759-1823), Scottish engineer and political economist, is the principal inventor of statistical graphs.

In 1786, he published the “Commercial and Political Atlas”, which contained 44 charts.

He invented three of the four basic forms of graph:
The statistical line graph
The bar graph
The pie graph

Different forms/types of graphical representation of data:
Pictograph
Line graph
Bar graph
Histogram
Dots
Scatter diagram
Pie chart
Dendrogram
Frequency polygon
Ogive
Stem and leaf
Box and whisker
Directed and undirected graphs
Polar coordinate graphs
Three-dimensional graphs, etc.

Choosing the right graphs

As discussed in the previous class…

Skills and techniques required… are….????

Pictograph/ Pictorial graph:

A pictograph or pictorial graph involves categories and counts of the number of people or things in a category (frequency). The layout of the graph can be horizontal or vertical.

Purpose of Pictograph:

To simply and clearly illustrate a mathematical relation.

No attempt is made to show data points or errors on such a graph.

Here, we have two types of graphs:
Concrete Object Graph
Symbolic Graph

Bar Graphs:

A bar graph is a pictorial representation of the frequency distribution of ungrouped data by a number of bars (rectangles) of uniform width, erected either vertically or horizontally with equal spacing between them.

For example:

Notice that not all values fall exactly on a multiple of 20; some bars end between two grid lines.

Bar graphs are useful for getting an overall idea of trends in responses.

Activity           Number (#)
Visit w/ friends   175
Talk on phone      168
Play sports        120
Earn money         120
Use computers      65

Example 1: The number of trees planted by Paro College of Education students in different years on June 2 is given below:

Years               1997  1998  1999  2000  2001  2002  Total
# of trees planted  400   450   700   750   900   1500  4700
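As a rough illustration of how the Example 1 table turns into bars, here is a minimal text-only sketch in Python. The scale of one `#` per 50 trees and the names `trees_planted`, `TREES_PER_SYMBOL` and `bar` are assumptions made for this sketch, not part of the original slides.

```python
# Tree-planting counts from Example 1 (year -> number of trees).
trees_planted = {
    1997: 400, 1998: 450, 1999: 700,
    2000: 750, 2001: 900, 2002: 1500,
}

TREES_PER_SYMBOL = 50  # assumed scale: one '#' stands for 50 trees


def bar(count, per_symbol=TREES_PER_SYMBOL):
    """Return a horizontal bar whose length is proportional to count."""
    return "#" * round(count / per_symbol)


# One bar per year: uniform width (one text row each), equal spacing.
for year, count in trees_planted.items():
    print(f"{year} | {bar(count)} {count}")

total = sum(trees_planted.values())
print(f"Total: {total}")
```

Because every bar uses the same scale, the lengths can be compared directly, which is the property a hand-drawn bar graph also relies on.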

For the class to do:

Problem 1:

The data below shows the number of students present in different classes on a particular day:

Represent/draw the above data as a bar graph.

Solution:

Homework Question 1:

The data regarding causes of accidents in factories are given below:

Draw a bar graph to represent the data given above.

Interpretation/Reading of bar graphs:

Referring to Example 1: Read the bar graph and answer the following questions:

In which year was the maximum number of trees planted by the Paro College of Education students?

What trend does the number of trees planted show?

In which years did the number of trees planted differ by only 50?
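The same readings can be cross-checked against the Example 1 figures. The following is only a sketch of one way to do that in Python; the variable names are made up for illustration.

```python
from itertools import combinations

# Figures from Example 1.
years = [1997, 1998, 1999, 2000, 2001, 2002]
trees = [400, 450, 700, 750, 900, 1500]

# Year with the tallest bar.
max_year = years[trees.index(max(trees))]

# Trend: does every year exceed the one before it?
always_increasing = all(later > earlier for earlier, later in zip(trees, trees[1:]))

# Every pair of years whose counts differ by exactly 50.
pairs_diff_50 = [
    (y1, y2)
    for (y1, t1), (y2, t2) in combinations(zip(years, trees), 2)
    if abs(t1 - t2) == 50
]

print(max_year, always_increasing, pairs_diff_50)
```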

Homework questions on reading/interpretation of bar graph:

Referring to homework question 1: Answer the following questions:

Which cause is responsible for the maximum number of accidents in factories? Which cause is responsible for the minimum?

Can you think of one of the “other” causes?

What percentage of accidents could have been avoided by timely action?

Histogram and frequency polygon:

A histogram is a graphical representation of a continuous frequency distribution (i.e. a grouped frequency distribution) with no space between the rectangles/bars. Traditionally, class intervals are taken along the horizontal axis while the respective class frequencies are taken along the vertical axis.

Note: The areas of the rectangles are proportional to the frequencies.

A frequency polygon is formed by joining the mid-points of the tops of the adjoining rectangles in a histogram.

For example:

Consider the following frequency distribution of the weights of 30 third-year math-physics/IT students.

C.I. (in kgs)  45-50  50-55  55-60  60-65  65-70  Total
Frequency      3      7      12     5      3      30

Draw a histogram and a frequency polygon based on the above data.
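The points needed for the frequency polygon can be worked out before drawing anything. The sketch below computes the class mid-points and the rectangle areas for the weights data. Closing the polygon on the axis one class width beyond each end is a common textbook convention assumed here, not something stated in the slides.

```python
# Class intervals (kg) and frequencies from the weights example.
class_intervals = [(45, 50), (50, 55), (55, 60), (60, 65), (65, 70)]
frequencies = [3, 7, 12, 5, 3]

# Mid-point of each class interval = (lower limit + upper limit) / 2.
midpoints = [(low + high) / 2 for low, high in class_intervals]

# Frequency polygon vertices: one (mid-point, frequency) pair per class.
polygon = list(zip(midpoints, frequencies))

# Assumed convention: bring the polygon down to zero frequency
# one class width before the first and after the last mid-point.
width = class_intervals[0][1] - class_intervals[0][0]
closed_polygon = (
    [(midpoints[0] - width, 0)] + polygon + [(midpoints[-1] + width, 0)]
)

# Histogram rectangle areas = class width x frequency.
areas = [(high - low) * f for (low, high), f in zip(class_intervals, frequencies)]

print(closed_polygon)
print(areas)
```

With equal class widths, as here, the areas are simply the frequencies scaled by a constant, which is why the heights alone can be read as frequencies.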

Example of constructing the frequency polygon without the help of histogram:

Suppose we want to draw a frequency polygon for the pocket allowance that third-year math-physics/IT students get (remember, the figures are arbitrary), given the following data:

Pocket money   0-50  50-100  100-150  150-200  200-250  250-300
# of students  16    25      13       26       15       5

For the class to do:

Problem 2: The daily earnings of 100 shopkeepers in Paro Valley are given below:

Daily earning (in Nu.)  200-300  300-400  400-500  500-600  600-700  700-800  800-900
# of shops              3        12       15       30       25       12       3

Draw a histogram and a frequency polygon to represent the above data.

Solution:

Stem and Leaf Plot:

A stem and leaf plot is a graphical data analysis technique for summarizing the distributional information of a variable. It is similar to a histogram, but it preserves the original numeric values in the data. As such, it is an effective alternative to the histogram for small to moderate size data sets. However, it is not recommended for large data sets.

In a stem-and-leaf plot each data value is split into a "stem" and a "leaf".  The "leaf" is usually the last digit of the number and the other digits to the left of the "leaf" form the "stem".  The number 123 would be split as:

Stem | Leaf
12   | 3

Constructing a stem and leaf plot:

The Math test scores out of 50 marks are as follows:  35, 36, 38, 40, 42, 42, 44, 45, 45, 47, 48, 49, 50, 50, 50.

Solution: The stem and leaf plot should look like this:

Math Test Scores (out of 50 pts)

Stem | Leaf
3    | 5 6 8
4    | 0 2 2 4 5 5 7 8 9
5    | 0 0 0

A stem-and-leaf plot shows the shape and distribution of data.  It can be clearly seen in the diagram above that the data clusters around the row with a stem of 4.
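The splitting rule described above (last digit as the leaf, remaining digits as the stem) can be sketched in a few lines of Python. This is an illustrative sketch for non-negative whole numbers only; the function name `stem_and_leaf` is an assumption of this example.

```python
from collections import defaultdict

# Math test scores from the example above.
scores = [35, 36, 38, 40, 42, 42, 44, 45, 45, 47, 48, 49, 50, 50, 50]


def stem_and_leaf(values):
    """Group values by stem (all digits but the last), leaves in order."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 123 -> stem 12, leaf 3
        rows[stem].append(leaf)
    return dict(rows)


plot = stem_and_leaf(scores)
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
```

The longest printed row belongs to stem 4, which is the clustering the plot is meant to reveal.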

Points to remember:

The leaf is the digit in the rightmost place of the number, and the stem is the digit, or digits, that remain when the leaf is dropped.

To show a one-digit number (such as 7) using a stem-and-leaf plot, use a stem of 0 and a leaf of 7.

To find the median in a stem-and-leaf plot, count off half the total number of leaves.

For comparing two sets of data:

We use a back-to-back stem-and-leaf plot.

For example: The numbers 40, 42, and 43 are from Data Set A and the numbers 41, 45, 46, and 47 are from Data Set B. Construct a back-to-back stem-and-leaf plot.

Solution:

Data Set A |      | Data Set B
Leaf       | Stem | Leaf
3 2 0      | 4    | 1 5 6 7
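A back-to-back plot can be assembled the same way as a single plot, with the left-hand leaves written in reverse so that they grow outward from the shared stem. The following Python sketch uses the two data sets from the example; the helper name `leaves_by_stem` is hypothetical.

```python
set_a = [40, 42, 43]
set_b = [41, 45, 46, 47]


def leaves_by_stem(values):
    """Map each stem (tens and above) to its ordered list of leaves (units)."""
    rows = {}
    for v in sorted(values):
        rows.setdefault(v // 10, []).append(v % 10)
    return rows


a_rows, b_rows = leaves_by_stem(set_a), leaves_by_stem(set_b)

rows = []
for stem in sorted(set(a_rows) | set(b_rows)):
    # Left-hand leaves are reversed so the smallest leaf sits next to the stem.
    left = " ".join(str(d) for d in reversed(a_rows.get(stem, [])))
    right = " ".join(str(d) for d in b_rows.get(stem, []))
    rows.append((left, stem, right))
    print(f"{left:>12} | {stem} | {right}")
```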

Advantage of stem and leaf plot:

The advantage of the stem-and-leaf plot over the histogram is that it displays not only the frequency for each interval, but also all of the individual values within that interval.

Moreover, the median and mode are easily readable.

Home-Work on stem and leaf plot:

Construct a stem and leaf plot, find the median and mode of the data using the plot created.

Special Case: (when one of the stems has no leaves)

For example, take the following data set:

10, 11, 20, 21, 24, 27, 27, 27, 28, 28, 29, 29, 29, 31, 33, 33, 33, 33, 33, 39, 53

(Notice that the 40s are missing here.) The stem and leaf plot would then be:

1|01
2|01477788999
3|1333339
4|
5|3

Even though the peak corresponds to the 20s cohort, the most frequently occurring value is clearly 33, and hence the mode is 33.
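To keep the empty row in the plot, the stems have to be listed from the smallest to the largest rather than only where data exist. The sketch below does that for the data set above and confirms the mode with `collections.Counter`; the variable names are illustrative assumptions.

```python
from collections import Counter

data = [10, 11, 20, 21, 24, 27, 27, 27, 28, 28, 29, 29, 29,
        31, 33, 33, 33, 33, 33, 39, 53]

# Collect the leaves (units digit) under each stem (tens digit).
leaves = {}
for v in sorted(data):
    leaves.setdefault(v // 10, []).append(v % 10)

# Walk through every stem in the range so an empty stem still gets a row.
rows = []
for stem in range(min(leaves), max(leaves) + 1):
    row = f"{stem}|{''.join(str(d) for d in leaves.get(stem, []))}"
    rows.append(row)
    print(row)

# The mode is the single most frequent value, not the busiest stem.
mode_value, mode_count = Counter(data).most_common(1)[0]
print(mode_value, mode_count)
```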

BOX-AND-WHISKER PLOT / 5 NUMBER SUMMARY:

Box-and-whisker plots allow people to explore data and to draw informal conclusions when two or more variables are present. A box-and-whisker plot shows only certain statistics rather than all the data. The five-number summary is another name for the visual representation of the box-and-whisker plot. It consists of the median, the quartiles, and the smallest and greatest values in the distribution. The immediate visuals of a box-and-whisker plot are the center, the spread, and the overall range of the distribution.

There are two types of box and whisker plot:
Traditional box and whisker plot
Modified version of the box and whisker plot

Traditional box and whisker/ The 5 Number Summary

The five-number summary consists of:
The median (2nd quartile)
The 1st quartile
The 3rd quartile
The maximum value in a data set
The minimum value in a data set

Review on The Median

The median is the middle value of a set of data once the data has been ordered.

Example 1. Ugyen hits 11 balls at T/phu driving range. The recorded distances of his drives, measured in yards, are given below. Find the median distance for his drives.

85, 125, 130, 65, 100, 70, 75, 50, 140, 95, 70

Ordered data: 50, 65, 70, 70, 75, 85, 95, 100, 125, 130, 140

Single middle value. Median drive = 85 yards

Example 2. Rinzin hit 12 balls at T/phu driving range. The recorded distances of his drives, measured in yards, are given below. Find the median distance for his drives.

85, 125, 130, 65, 100, 70, 75, 50, 140, 135, 95, 70

Ordered data: 50, 65, 70, 70, 75, 85, 95, 100, 125, 130, 135, 140

Two middle values (85 and 95), so take the mean. Median drive = 90 yards
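Both cases of the median rule (one middle value, or the mean of two) can be written as a short function. The sketch below applies it to the two driving-range examples and compares the result with Python's standard `statistics.median`; the function name `median_by_hand` is made up for this illustration.

```python
from statistics import median

ugyen = [85, 125, 130, 65, 100, 70, 75, 50, 140, 95, 70]        # 11 drives
rinzin = [85, 125, 130, 65, 100, 70, 75, 50, 140, 135, 95, 70]  # 12 drives


def median_by_hand(values):
    """Order the data, then take the middle value (or the mean of two)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # two middle values


print(median_by_hand(ugyen), median(ugyen))
print(median_by_hand(rinzin), median(rinzin))
```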

Finding the median, quartiles and inter-quartile range.

Example 3: Find the median and quartiles for the data below.

12, 6, 4, 9, 8, 4, 9, 8, 5, 9, 8, 10

Order the data: 4, 4, 5, 6, 8, 8, 8, 9, 9, 9, 10, 12

Lower Quartile (Q1) = 5½
Median (Q2) = 8
Upper Quartile (Q3) = 9

Inter-Quartile Range = 9 - 5½ = 3½

Example 4: Find the median and quartiles for the data below.

6, 3, 9, 8, 4, 10, 8, 4, 15, 8, 10

Order the data: 3, 4, 4, 6, 8, 8, 8, 9, 10, 10, 15

Lower Quartile (Q1) = 4
Median (Q2) = 8
Upper Quartile (Q3) = 10

Inter-Quartile Range = 10 - 4 = 6
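The quartile values in Examples 3 and 4 are consistent with taking the median of each half of the ordered data, leaving the overall median out of both halves when the number of values is odd. The sketch below assumes that convention; other quartile conventions exist and can give slightly different answers. The function name `quartiles` is an assumption of this example.

```python
from statistics import median

example_3 = [12, 6, 4, 9, 8, 4, 9, 8, 5, 9, 8, 10]
example_4 = [6, 3, 9, 8, 4, 10, 8, 4, 15, 8, 10]


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value itself.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


for data in (example_3, example_4):
    q1, q2, q3 = quartiles(data)
    print(f"Q1={q1}  Q2={q2}  Q3={q3}  IQR={q3 - q1}")
```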

Anatomy of a Box and Whisker Diagram.

[Figure: a box and whisker diagram drawn above a number line from 4 to 12, with the Lowest Value, Lower Quartile, Median, Upper Quartile and Highest Value labelled, and the Box and the two Whiskers marked.]

Note: The box portion, drawn from the lower quartile to the upper quartile, shows the range of the middle 50% of the members, with the median marking the mid-point of the data.

Drawing a Box Plot.

Example 5: Draw a box plot for the data below.

4, 4, 5, 6, 8, 8, 8, 9, 9, 9, 10, 12

Lower Quartile (Q1) = 5½
Median (Q2) = 8
Upper Quartile (Q3) = 9

[Figure: box plot drawn above a number line from 4 to 12.]

Example 6: Draw a box plot for the data below.

3, 4, 4, 6, 8, 8, 8, 9, 10, 10, 15

Lower Quartile (Q1) = 4
Median (Q2) = 8
Upper Quartile (Q3) = 10

[Figure: box plot drawn above a number line from 3 to 15.]

Drawing a Box Plot.

Question: Sonam recorded the heights in cm of boys in his class as shown below. Draw a box plot for this data.

137, 148, 155, 158, 165, 166, 166, 171, 171, 173, 175, 180, 184, 186, 186

Lower Quartile (QL) = 158
Median (Q2) = 171
Upper Quartile (Qu) = 180

[Figure: box plot drawn above a number line from 130 to 190 cm.]

Question: Tashi recorded the heights in cm of girls in the same class and constructed a box plot from the data. The box plots for both boys and girls are shown below. Use the box plots to choose some correct statements comparing heights of boys and girls in the class. Justify your answers.

[Figure: box plots for Boys and Girls drawn above a common number line from 130 to 190 cm.]

1. The girls are taller on average.
2. The boys are taller on average.
3. The girls show less variability in height.
4. The boys show less variability in height.
5. The smallest person is a girl.
6. The tallest person is a boy.

Problem for the class to do:

Suppose you caught 13 fish along the riverside in the aftermath of the Paro flood, and you measured their lengths (in cm) to be:

12, 13, 5, 8, 9, 20, 16, 14, 14, 6, 9, 12, 12

Draw a box and whisker plot based on medians.

Solution:

Step 1: Rewrite the data in order, from smallest length to largest:

5, 6, 8, 9, 9, 12, 12, 12, 13, 14, 14, 16, 20

Step 2: Now find the median of all the numbers. Since there are 13 numbers, the middle one will be the seventh number, i.e. 12. This must be the median (middle number) because there are six numbers on each side.

Step 3: Find the lower quartile. This is the middle of the lower six numbers. The exact centre is half-way between 8 and 9, which is 8.5.

Step 4: Now find the upper quartile. This is the middle of the upper six numbers. The exact centre is half-way between 14 and 14, which must be 14.

Now we are ready to start drawing the actual box and whisker diagram.

Step 5: Draw an ordinary number line that extends far enough in both directions to include all the numbers in your data: locate the 5 number…

[Figure: number line marked 5, 10, 15, 20.]

Final box and whisker plot:
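As a rough sketch of Steps 1 to 5, the Python below computes the five-number summary for the fish lengths and draws a crude text version of the plot. The drawing scale of two characters per centimetre and the function names `five_number_summary` and `text_box_plot` are assumptions made for this illustration only.

```python
from statistics import median

fish = [12, 13, 5, 8, 9, 20, 16, 14, 14, 6, 9, 12, 12]  # lengths in cm


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


def text_box_plot(lo, q1, med, q3, hi, chars_per_unit=2):
    """Whiskers are '-', the box is '=', and the median is '|'."""
    def col(x):
        return round((x - lo) * chars_per_unit)

    line = ["-"] * (col(hi) + 1)          # start with one long whisker
    for i in range(col(q1), col(q3) + 1):
        line[i] = "="                     # overwrite the box section
    line[col(med)] = "|"                  # mark the median inside the box
    return "".join(line)


lo, q1, med, q3, hi = five_number_summary(fish)
picture = text_box_plot(lo, q1, med, q3, hi)
print(lo, q1, med, q3, hi)
print(picture)
```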

Modified version of the box and whisker plot

They do not typically contain the median and the quartiles, though they do show the range of the data.

It is easier to compute the maximum, minimum, mean and standard deviation of the data than it is to bin the data to compute the other variables, especially when the data has a probability density function (PDF) similar to that of the normal distribution.

The diagram normally includes the range, the mean, and the values one standard deviation about the mean. For near-normal data, the range of the box shows the location of about 68% of the values. Recall that in the traditional box and whisker diagram the box shows the middle 50% of the data about the median.
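The quantities used by the modified plot can be computed directly. The sketch below reuses the 13 fish lengths from the earlier class problem purely as sample data (that choice, and the use of the sample standard deviation, are assumptions of this example) and counts how many values fall inside the box of mean plus or minus one standard deviation.

```python
from statistics import mean, stdev

fish = [12, 13, 5, 8, 9, 20, 16, 14, 14, 6, 9, 12, 12]  # lengths in cm

m = mean(fish)
s = stdev(fish)  # sample standard deviation (assumed; pstdev is the population form)

# Whiskers span the range; the box spans one standard deviation about the mean.
whisker_low, whisker_high = min(fish), max(fish)
box_low, box_high = m - s, m + s

inside = [x for x in fish if box_low <= x <= box_high]
share_inside = len(inside) / len(fish)

print(f"range: {whisker_low} to {whisker_high}")
print(f"box:   {box_low:.2f} to {box_high:.2f} (mean {m:.2f})")
print(f"share of values inside the box: {share_inside:.0%}")
```

For this small sample the share inside the box happens to be close to the figure expected for normal data, but that is not guaranteed for skewed or bimodal data.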

Short-comings of this plot:

This method of display fails to show when the data does not have a near-normal PDF.

For example: Highly skewed and bimodal data are more difficult to discern using this data display method. The median and the traditional box-and-whisker diagram are often more representative when the data is bimodal.

Homework:

Investigate the Modified Box and whisker plot

Find out the difference between the two types of box and whisker plot

When and where should we use the traditional and the modified box and whisker plot?

The End (Tashi Delek & Have a Great Day)
