Graphical representation of data Compiled by Pedup Dukpa (For third year Paro College of Education students)

My PPT on graphical representation of data

Nov 15, 2014


Pedup

This PPT was prepared for 3MPh, 3MPA, 3MPB, and 3MIT.
Transcript
Page 1: My PPT on graphical representation of data

Graphical representation of data

Compiled by Pedup Dukpa (for third-year Paro College of Education students)

Page 2: My PPT on graphical representation of data

Now, let's get started

What do we mean by graphs?
Who invented this versatile device called graphs?
Why are graphs important?
When do we use/apply graphs?
How do we know what kinds of graphs to use?
Where do we usually see graphs?

Page 3: My PPT on graphical representation of data

My personal definition of graphs:

I believe graphs are a universal language: a visual representation of data that speaks volumes and can be easily read and understood by everyone.

Page 4: My PPT on graphical representation of data

Inventor of graphs:

William Playfair (1759-1823), a Scottish engineer and political economist, is regarded as the principal inventor of statistical graphs.

In 1786, he published “The Commercial and Political Atlas”, which contained 44 charts.

Page 5: My PPT on graphical representation of data

He invented three of the four basic forms of graph:

The statistical line graph
The bar graph
The pie graph

Page 6: My PPT on graphical representation of data
Page 7: My PPT on graphical representation of data

Different forms/types of graphical representation of data:

Pictograph
Line graph
Bar graph
Histogram
Dots
Scatter diagram
Pie chart
Dendrogram

Page 8: My PPT on graphical representation of data

Contd.

Frequency polygon
Ogive
Stem and leaf
Box and whisker
Directed and undirected graphs
Polar coordinate graphs
Three-dimensional graphs, etc.

Page 9: My PPT on graphical representation of data

Choosing the right graphs

As discussed in the previous class…

Skills and techniques required… are….????

Page 10: My PPT on graphical representation of data

Pictograph/ Pictorial graph:

A pictograph or pictorial graph involves categories and counts of the number of people or things in a category (frequency). The layout of the graph can be horizontal or vertical.

Page 11: My PPT on graphical representation of data

Purpose of Pictograph:

To simply and clearly illustrate a mathematical relation.

No attempt is made to show data points or errors on such a graph.

Here, we have two types of graphs:

Concrete Object Graph
Symbolic Graph

Page 12: My PPT on graphical representation of data

Bar Graphs:

A bar graph is a pictorial representation of the frequency distribution of ungrouped data by a number of bars (rectangles) of uniform width, erected either vertically or horizontally with equal spacing between them.

For example:

Activity            #
Visit W/Friends     175
Talk on Phone       168
Play Sports         120
Earn Money          120
Use Computers       65

Notice that not all of the data fall exactly on a multiple of 20; in those cases the top of the bar lies between two grid lines.

Bar graphs are useful for getting an overall idea of trends in responses.

Page 13: My PPT on graphical representation of data

Example 1: The number of trees planted by Paro College of Education students on June 2 in different years is given below:

Year                  1997  1998  1999  2000  2001  2002  Total
# of trees planted     400   450   700   750   900  1500   4700
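As a rough illustration of how the Example 1 table turns into bars, here is a minimal Python sketch that prints a horizontal text bar chart. The function name `text_bar_chart` and the scale of one `#` per 50 trees are assumptions made for this sketch, not part of the original slides.

```python
# Tree-planting data from Example 1 (year -> number of trees planted).
trees_planted = {1997: 400, 1998: 450, 1999: 700, 2000: 750, 2001: 900, 2002: 1500}


def text_bar_chart(data, unit=50):
    """Return one text line per category; each '#' stands for `unit` items (assumed scale)."""
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / unit)
        lines.append(f"{label} | {bar} {value}")
    return lines


for line in text_bar_chart(trees_planted):
    print(line)

# The "Total" column of the table can be checked by summing the values.
total = sum(trees_planted.values())
print("Total:", total)
```

Because every bar uses the same scale and the same spacing, the relative lengths can be compared directly, which is the point of a bar graph.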

Page 14: My PPT on graphical representation of data
Page 15: My PPT on graphical representation of data

For the class to do:

Problem 1:

The data below shows the number of students present in different classes on a particular day:

Represent the above data as a bar graph.

Page 16: My PPT on graphical representation of data

Solution:

Page 17: My PPT on graphical representation of data

Homework Question 1:

The data regarding causes of accidents in factories are given below:

Draw a bar graph to represent the data given above.

Page 18: My PPT on graphical representation of data

Interpretation/Reading of bar graphs:

Referring to Example 1: Read the bar graph and answer the following questions:

In which year was the maximum number of trees planted by the Paro College of Education students?

What trend does the number of trees planted show?

In which years does the number of trees planted differ by only 50?

Page 19: My PPT on graphical representation of data
Page 20: My PPT on graphical representation of data

Homework questions on reading/interpretation of bar graph:

Referring to homework question 1: Answer the following questions:

Which cause is responsible for the maximum number of accidents in factories? Which cause is the minimum?

Can you think of one of the “other” causes?

What percentage of accidents could have been avoided by timely action?

Page 21: My PPT on graphical representation of data

Histogram and frequency polygon:

A histogram is a graphical representation of a continuous frequency distribution (i.e. a grouped frequency distribution) with no space between the rectangles/bars. Traditionally, class intervals are taken along the horizontal axis while the respective class frequencies are taken along the vertical axis.

Note: The areas of the rectangles are proportional to the frequencies.

A frequency polygon is formed by joining the mid-points of the tops of the adjoining rectangles in a histogram.

Page 22: My PPT on graphical representation of data

For example:

Consider the following frequency distribution of the weights of 30 third-year math-physics/IT students.

C.I. (in kg)   45-50  50-55  55-60  60-65  65-70  Total
Frequency          3      7     12      5      3     30

Draw a histogram and a frequency polygon based on the above data.

Page 23: My PPT on graphical representation of data
Page 24: My PPT on graphical representation of data
Page 25: My PPT on graphical representation of data

Example of constructing the frequency polygon without the help of a histogram:

Suppose we were to draw a frequency polygon for the amount of pocket allowance that third-year math-physics/IT students get (remember, these figures are arbitrary), given the following data:

Pocket money    0-50  50-100  100-150  150-200  200-250  250-300
# of students     16      25       13       26       15        5
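Without a histogram, the polygon is plotted at the class marks (the mid-point of each class interval) against the class frequencies. The Python sketch below computes those points for the pocket-money data. The name `polygon_points` is hypothetical, and closing the polygon on the axis one class width beyond each end is a common convention assumed here; the slides do not state it.

```python
# Pocket-money classes from the example: (lower limit, upper limit, frequency).
classes = [(0, 50, 16), (50, 100, 25), (100, 150, 13),
           (150, 200, 26), (200, 250, 15), (250, 300, 5)]


def polygon_points(classes):
    """Return (class mark, frequency) pairs, closed with a zero-frequency point at each end."""
    width = classes[0][1] - classes[0][0]  # assumes all classes have equal width
    points = [((lo + hi) / 2, freq) for lo, hi, freq in classes]
    first_mark, last_mark = points[0][0], points[-1][0]
    # Assumed convention: anchor the polygon on the axis one class width beyond each end.
    return [(first_mark - width, 0)] + points + [(last_mark + width, 0)]


for mark, freq in polygon_points(classes):
    print(mark, freq)
```

Joining these points in order with straight lines gives the frequency polygon.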

Page 26: My PPT on graphical representation of data
Page 27: My PPT on graphical representation of data
Page 28: My PPT on graphical representation of data

For the class to do:

Problem 2: The daily earnings of 100 shopkeepers in Paro Valley are given below:

Daily earning (in Nu.)  200-300  300-400  400-500  500-600  600-700  700-800  800-900
# of shops                    3       12       15       30       25       12        3

Draw a histogram and a frequency polygon to represent the above data.

Page 29: My PPT on graphical representation of data

Solution:

Page 30: My PPT on graphical representation of data

Stem and Leaf Plot:

A stem and leaf plot is a graphical data analysis technique for summarizing the distributional information of a variable. It is similar to a histogram, but it preserves the original numeric values in the data. As such, it is an effective alternative to the histogram for small to moderate size data sets. However, it is not recommended for large data sets.

In a stem-and-leaf plot each data value is split into a "stem" and a "leaf". The "leaf" is usually the last digit of the number and the other digits to the left of the "leaf" form the "stem". The number 123 would be split as:

Stem | Leaf
  12 | 3

Page 31: My PPT on graphical representation of data

Constructing a stem and leaf plot:

The Math test scores out of 50 marks are as follows: 35, 36, 38, 40, 42, 42, 44, 45, 45, 47, 48, 49, 50, 50, 50.

Solution: The stem and leaf plot should look like this:

Math Test Scores (out of 50 pts)

Stem | Leaf
   3 | 5 6 8
   4 | 0 2 2 4 5 5 7 8 9
   5 | 0 0 0

A stem-and-leaf plot shows the shape and distribution of data. It can be clearly seen in the diagram above that the data clusters around the row with a stem of 4.
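The construction can be sketched in a few lines of Python. This is only an illustration: the function name `stem_and_leaf` is hypothetical, and it assumes non-negative whole numbers with the last digit taken as the leaf.

```python
from collections import defaultdict

# Math test scores from the example above.
scores = [35, 36, 38, 40, 42, 42, 44, 45, 45, 47, 48, 49, 50, 50, 50]


def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted list of leaves (last digit)."""
    plot = defaultdict(list)
    for value in sorted(values):
        plot[value // 10].append(value % 10)
    return dict(plot)


for stem, leaves in stem_and_leaf(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Sorting first means the leaves in each row come out in increasing order, so the original values can still be read back from the plot.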

Page 32: My PPT on graphical representation of data

Points to remember:

The leaf is the digit in the rightmost place of the number, and the stem is the digit, or digits, that remain when the leaf is dropped.

To show a one-digit number (such as 7) using a stem-and-leaf plot, use a stem of 0 and a leaf of 7.

To find the median in a stem-and-leaf plot, count off half the total number of leaves, working through the rows in order.

Page 33: My PPT on graphical representation of data

For comparing two sets of data:

We use a back-to-back stem-and-leaf plot.

For example: The numbers 40, 42, and 43 are from Data Set A and the numbers 41, 45, 46, and 47 are from Data Set B. Construct a back-to-back stem-and-leaf plot.

Solution:

Data Set A |      | Data Set B
      Leaf | Stem | Leaf
     3 2 0 |   4  | 1 5 6 7
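A back-to-back plot shares one column of stems, with the leaves of the first data set written outwards to the left and those of the second to the right. A minimal Python sketch, assuming two-digit whole numbers and using the hypothetical name `back_to_back`:

```python
# Data sets from the example above.
set_a = [40, 42, 43]
set_b = [41, 45, 46, 47]


def back_to_back(a, b):
    """Return one 'left leaves | stem | right leaves' line per stem found in either set."""
    stems = sorted({value // 10 for value in a + b})
    lines = []
    for stem in stems:
        # Left-hand leaves are listed largest first so they grow outwards from the stem.
        left = " ".join(str(v % 10) for v in sorted(a, reverse=True) if v // 10 == stem)
        right = " ".join(str(v % 10) for v in sorted(b) if v // 10 == stem)
        lines.append(f"{left} | {stem} | {right}")
    return lines


for line in back_to_back(set_a, set_b):
    print(line)
```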

Page 34: My PPT on graphical representation of data

Advantage of stem and leaf plot:

The advantage of the stem-and-leaf plot over the histogram is that it displays not only the frequency for each interval, but also all of the individual values within that interval.

Moreover, the median and mode are easily readable.

Page 35: My PPT on graphical representation of data

Home-Work on stem and leaf plot:

Construct a stem and leaf plot, find the median and mode of the data using the plot created.

Page 36: My PPT on graphical representation of data

Special Case: (when one of the stems has no leaves)

For example, take the following data set:

10, 11, 20, 21, 24, 27, 27, 27, 28, 28, 29, 29, 29, 31, 33, 33, 33, 33, 33, 39, 53

(Notice that there are no values in the 40s.) The stem and leaf plot would then be:

1|01
2|01477788999
3|1333339
4|
5|3

Even though the peak corresponds to the 20s row, the most frequently occurring value is clearly 33, and hence the mode is 33.
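The two points of this special case (keep the empty stem, and read the mode from repeated leaves rather than from the longest row) can be checked with a short Python sketch. The function name `stem_rows` is hypothetical, and the code assumes non-negative whole numbers.

```python
from collections import Counter

data = [10, 11, 20, 21, 24, 27, 27, 27, 28, 28, 29, 29, 29,
        31, 33, 33, 33, 33, 33, 39, 53]


def stem_rows(values):
    """Return 'stem|leaves' rows for every stem from smallest to largest, even empty ones."""
    stems = [value // 10 for value in values]
    rows = []
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(v % 10) for v in sorted(values) if v // 10 == stem)
        rows.append(f"{stem}|{leaves}")
    return rows


for row in stem_rows(data):
    print(row)

# The mode is the single most common value, not the row with the most leaves.
mode, count = Counter(data).most_common(1)[0]
print("mode:", mode, "occurs", count, "times")
```

Looping over every stem in the range, rather than only the stems that occur, is what keeps the empty 4 row in the plot.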

Page 37: My PPT on graphical representation of data

BOX-AND-WHISKER PLOT / 5 NUMBER SUMMARY:

Box-and-whisker plots allow people to explore data and to draw informal conclusions when two or more variables are present. A box-and-whisker plot shows only certain statistics rather than all the data: it is the visual representation of the five-number summary, which consists of the median, the quartiles, and the smallest and greatest values in the distribution. What the plot shows at a glance are the center, the spread, and the overall range of the distribution.

There are two types of box and whisker plot:

Traditional box and whisker plot
Modified version of the box and whisker plot

Page 38: My PPT on graphical representation of data

Traditional box and whisker/ The 5 Number Summary

The traditional box and whisker plot is the visual representation of the five-number summary.

The five-number summary consists of:

The median (2nd quartile)
The 1st quartile
The 3rd quartile
The maximum value in a data set
The minimum value in a data set

Page 39: My PPT on graphical representation of data

Review on The Median

The median is the middle value of a set of data once the data has been ordered.

Example 1. Ugyen hits 11 balls at T/phu driving range. The recorded distances of his drives, measured in yards, are given below. Find the median distance for his drives.

85, 125, 130, 65, 100, 70, 75, 50, 140, 95, 70

Ordered data:

50, 65, 70, 70, 75, 85, 95, 100, 125, 130, 140

There is a single middle value.

Median drive = 85 yards

Page 40: My PPT on graphical representation of data

Review on The Median

The median is the middle value of a set of data once the data has been ordered.

Example 2. Rinzin hit 12 balls at T/phu driving range. The recorded distances of his drives, measured in yards, are given below. Find the median distance for his drives.

85, 125, 130, 65, 100, 70, 75, 50, 140, 135, 95, 70

Ordered data:

50, 65, 70, 70, 75, 85, 95, 100, 125, 130, 135, 140

There are two middle values (85 and 95), so take the mean.

Median drive = 90 yards
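Both cases of the median review (one middle value for an odd count, the mean of two middle values for an even count) fit in one small function. The sketch below is written out by hand for illustration. Python's standard `statistics.median` follows the same rule.

```python
def median(values):
    """Middle value of the ordered data; the mean of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


ugyen = [85, 125, 130, 65, 100, 70, 75, 50, 140, 95, 70]        # 11 drives
rinzin = [85, 125, 130, 65, 100, 70, 75, 50, 140, 135, 95, 70]  # 12 drives

print(median(ugyen))
print(median(rinzin))
```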

Page 41: My PPT on graphical representation of data

Finding the median, quartiles and inter-quartile range.

Example 3: Find the median and quartiles for the data below.

12, 6, 4, 9, 8, 4, 9, 8, 5, 9, 8, 10

Order the data:

4, 4, 5, 6, 8, 8, 8, 9, 9, 9, 10, 12

Lower Quartile (Q1) = 5½
Median (Q2) = 8
Upper Quartile (Q3) = 9

Inter-Quartile Range = 9 - 5½ = 3½

Page 42: My PPT on graphical representation of data

Finding the median, quartiles and inter-quartile range.

Example 4: Find the median and quartiles for the data below.

6, 3, 9, 8, 4, 10, 8, 4, 15, 8, 10

Order the data:

3, 4, 4, 6, 8, 8, 8, 9, 10, 10, 15

Lower Quartile (Q1) = 4
Median (Q2) = 8
Upper Quartile (Q3) = 10

Inter-Quartile Range = 10 - 4 = 6
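The quartiles in Examples 3 and 4 are consistent with the "median of each half" method: split the ordered data at the median, leave the median itself out when the count is odd, and take the median of each half. The Python sketch below follows that reading. The function names are hypothetical, and other quartile conventions (including those used by some software) can give slightly different values.

```python
def median(ordered):
    """Median of an already ordered list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) by the median-of-halves method (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # values below the median position
    upper = ordered[half + (n % 2):]    # skips the median itself when n is odd
    return median(lower), median(ordered), median(upper)


example_3 = [12, 6, 4, 9, 8, 4, 9, 8, 5, 9, 8, 10]
example_4 = [6, 3, 9, 8, 4, 10, 8, 4, 15, 8, 10]

for data in (example_3, example_4):
    q1, q2, q3 = quartiles(data)
    print("Q1 =", q1, "Q2 =", q2, "Q3 =", q3, "IQR =", q3 - q1)
```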

Page 43: My PPT on graphical representation of data

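Anatomy of a Box and Whisker Diagram.

[Figure: a box and whisker diagram drawn above a number line from 4 to 12, with labels for the lowest value, lower quartile, median, upper quartile and highest value, and for the box and the two whiskers.]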

Note: the box portion, drawn from the lower quartile to the upper quartile, shows the range of the middle 50% of the data, with the median marking the middle value inside it.

Page 44: My PPT on graphical representation of data

Drawing a Box Plot.

Example 5: Draw a box plot for the data below.

4, 4, 5, 6, 8, 8, 8, 9, 9, 9, 10, 12

Lower Quartile (Q1) = 5½
Median (Q2) = 8
Upper Quartile (Q3) = 9

[Figure: box plot drawn above a number line from 4 to 12.]

Page 45: My PPT on graphical representation of data

Drawing a Box Plot.

Example 6: Draw a box plot for the data below.

3, 4, 4, 6, 8, 8, 8, 9, 10, 10, 15

Lower Quartile (Q1) = 4
Median (Q2) = 8
Upper Quartile (Q3) = 10

[Figure: box plot drawn above a number line from 3 to 15.]

Page 46: My PPT on graphical representation of data

Drawing a Box Plot.

Question: Sonam recorded the heights in cm of boys in his class as shown below. Draw a box plot for this data.

137, 148, 155, 158, 165, 166, 166, 171, 171, 173, 175, 180, 184, 186, 186

Lower Quartile (QL) = 158
Median (Q2) = 171
Upper Quartile (QU) = 180

[Figure: box plot drawn above a number line from 130 to 190 cm.]

Page 47: My PPT on graphical representation of data

Drawing a Box Plot.

Question: Tashi recorded the heights in cm of girls in the same class and constructed a box plot from the data. The box plots for both boys and girls are shown below. Use the box plots to choose some correct statements comparing heights of boys and girls in the class. Justify your answers.

[Figure: box plots for Boys and Girls drawn above a common number line from 130 to 190 cm.]

1. The girls are taller on average.

2. The boys are taller on average.

3. The girls show less variability in height.

4. The boys show less variability in height.

5. The smallest person is a girl.

6. The tallest person is a boy.

Page 48: My PPT on graphical representation of data

Problem for the class to do:

Suppose you caught 13 fish along the riverside in the aftermath of the Paro flood, and you measured the lengths of the fish (in cm) to be:

12, 13, 5, 8, 9, 20, 16, 14, 14, 6, 9, 12, 12

Draw a box and whisker plot based on medians.

Solution:

Step 1: Rewrite the data in order, from smallest length to largest:

5, 6, 8, 9, 9, 12, 12, 12, 13, 14, 14, 16, 20

Step 2: Now find the median of all the numbers. Notice that since there are 13 numbers, the middle one will be the seventh number, i.e. 12.

This must be the median (middle number) because there are six numbers on each side.

Page 49: My PPT on graphical representation of data

Step 3: Find the lower quartile. This is the middle of the lower six numbers. The exact centre is half-way between 8 and 9, which is 8.5.

Step 4: Now find the upper quartile. This is the middle of the upper six numbers. The exact centre is half-way between 14 and 14, which must be 14.

Now we are ready to start drawing the actual box and whisker diagram.

Step 5: Draw an ordinary number line that extends far enough in both directions to include all the numbers in your data: locate the 5 number…

[Figure: number line marked at 5, 10, 15 and 20.]
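To get a feel for the finished diagram before drawing it by hand, the five values found in the steps above (5, 8.5, 12, 14, 20) can be laid out as a one-line text sketch. The function `text_box_plot`, its symbols and its scale of two characters per cm are all assumptions made for this illustration.

```python
def text_box_plot(minimum, q1, median, q3, maximum, scale=2):
    """Render a five-number summary as one text line, `scale` characters per unit.

    '|' marks the lowest and highest values, '[' and ']' the quartiles,
    'M' the median, '-' the whiskers and '=' the inside of the box.
    If two values share a column, the marker written last replaces the earlier one.
    """
    def col(x):
        return round((x - minimum) * scale)

    row = [" "] * (col(maximum) + 1)
    for i in range(col(minimum), col(q1)):       # left whisker
        row[i] = "-"
    for i in range(col(q3), col(maximum) + 1):   # right whisker
        row[i] = "-"
    for i in range(col(q1), col(q3) + 1):        # box from Q1 to Q3
        row[i] = "="
    row[col(minimum)] = "|"
    row[col(maximum)] = "|"
    row[col(q1)] = "["
    row[col(q3)] = "]"
    row[col(median)] = "M"
    return "".join(row)


# Five-number summary of the fish lengths: 5, 8.5, 12, 14, 20.
print(text_box_plot(5, 8.5, 12, 14, 20))
```

The long right-hand whisker reflects the single 20 cm fish sitting well above the upper quartile of 14.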

Page 50: My PPT on graphical representation of data

Final box and whisker plot:

Page 51: My PPT on graphical representation of data

Modified version of the box and whisker plot

Modified box and whisker plots do not typically contain the median and the quartiles, though they do show the range of the data.

It is easier to compute the maximum, minimum, mean and standard deviation of the data than it is to bin the data to compute the other statistics, especially when the data has a probability density function (PDF) similar to that of the normal distribution.

The diagram normally includes the range, the mean, and the values one standard deviation above and below the mean. For near-normal data, the box then shows the location of about 68% of the values. Recall that in the traditional box and whisker diagram the box shows the middle 50% of the data about the median.
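The quantities this version of the plot needs can be computed directly. The sketch below reuses the boys' heights from the earlier box plot question. The function name `mean_sd_summary` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) rather than the sample one is an assumption, since the slides do not say which is intended.

```python
from statistics import mean, pstdev

# Boys' heights (cm) from the earlier box plot question.
heights = [137, 148, 155, 158, 165, 166, 166, 171, 171, 173, 175, 180, 184, 186, 186]


def mean_sd_summary(values):
    """Return (minimum, mean - sd, mean, mean + sd, maximum) using the population SD."""
    centre = mean(values)
    sd = pstdev(values)
    return min(values), centre - sd, centre, centre + sd, max(values)


low, box_low, centre, box_high, high = mean_sd_summary(heights)

# Share of the data that actually falls inside the mean +/- one SD box.
inside = sum(box_low <= h <= box_high for h in heights)
share_inside = inside / len(heights)

print("whiskers:", low, "to", high)
print("box:", round(box_low, 1), "to", round(box_high, 1), "mean:", round(centre, 1))
print("share of values inside the box:", round(share_inside, 2))
```

Comparing `share_inside` with the roughly 68% expected for normal data gives a quick hint of how well the near-normal assumption holds for a given data set.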

Page 52: My PPT on graphical representation of data

Short-comings of this plot:

This method of display does not reveal when the data does not have a near-normal PDF.

For example: highly skewed and bimodal data are more difficult to discern using this data display method. The median and the traditional box-and-whisker diagram are often more representative when the data is bimodal.

Page 53: My PPT on graphical representation of data

Homework:

Investigate the Modified Box and whisker plot

Find out the difference between the two types of box and whisker plot

When and where should we use the traditional and the modified box and whisker plots?

Page 54: My PPT on graphical representation of data

The End (Tashi Delek & Have a Great Day)