BIOSTAT Chapter2

Jul 06, 2018

denpaspasan
  • 8/17/2019 BIOSTAT Chapter2

    1/57

    Chapter 2: Frequency Distributions

    Frequency Distributions

    After collecting data, the first task for a researcher is to organize and simplify the data so that it is possible to get a general overview of the results. This is the goal of descriptive statistical techniques.

    One method for simplifying and organizing data is to construct a frequency distribution.

    Frequency Distributions (cont.)

    A frequency distribution is an organized tabulation showing exactly how many individuals are located in each category on the scale of measurement.

    A frequency distribution presents an organized picture of the entire set of scores, and it shows where each individual is located relative to others in the distribution.

    Frequency Distributions (cont.)

    A table that organizes data values into classes or intervals, along with the number of values that fall in each class (frequency, f).

    1. Ungrouped Frequency Distribution: for data sets with few different values. Each value is in its own class.

    2. Grouped Frequency Distribution: for data sets with many different values, which are grouped together in classes.

    Grouped and Ungrouped Frequency Distributions

    Ungrouped

    Courses Taken    Frequency, f
    1                25
    2                38
    3                217
    4                1462
    5                932
    6                15

    Grouped

    Age of Voters    Frequency, f
    18-30            22
    31-42            58
    43-54            62
    55-66            413
    67-78            158
    78-90            32

    Ungrouped Frequency Distributions

    Number of Peas in a Pea Pod, Sample Size: 50

    5 5 4 6 4

    3 7 6 3 5

    6 5 4 5 5

    6 2 3 5 5

    5 5 7 4 3

    4 5 4 5 6

    5 1 6 2 6

    6 6 6 6 4

    4 5 4 5 3

    5 5 7 6 5

    Peas per pod    Freq, f
    1               1
    2               2
    3               5
    4               9
    5               18
    6               12
    7               3
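    The tally for the pea-pod sample can be reproduced with a short script. This is only a sketch: the list name `peas_per_pod` is an illustrative choice, and the values are copied row by row from the 50 observations listed above.

```python
from collections import Counter

# Pea counts for the 50 sampled pods, copied row by row from the data above.
peas_per_pod = [
    5, 5, 4, 6, 4,
    3, 7, 6, 3, 5,
    6, 5, 4, 5, 5,
    6, 2, 3, 5, 5,
    5, 5, 7, 4, 3,
    4, 5, 4, 5, 6,
    5, 1, 6, 2, 6,
    6, 6, 6, 6, 4,
    4, 5, 4, 5, 3,
    5, 5, 7, 6, 5,
]

# Counter maps each distinct value to how often it occurs (its frequency f).
frequency = Counter(peas_per_pod)

for value in sorted(frequency):
    print(f"{value:>2}  {frequency[value]:>2}")

# The frequencies must add up to the sample size n.
print("n =", sum(frequency.values()))
```

    Each value forms its own class, which is what makes this an ungrouped distribution.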

    Frequency Distribution Tables

    A frequency distribution table consists of at least two columns: one listing categories on the scale of measurement (X) and another for frequency (f).

    In the X column, values are listed from the highest to lowest, without skipping any.

    For the frequency column, tallies are determined for each value (how often each X value occurs in the data set). These tallies are the frequencies for each X value.

    The sum of the frequencies should equal n.

    Grouped Frequency Distribution

    Sometimes, however, a set of scores covers a wide range of values. In these situations, a list of all the X values would be quite long, too long to be a simple presentation of the data.

    To remedy this situation, a grouped frequency distribution table is used.

    Grouped Frequency Distribution (cont.)

    In a grouped table, the X column lists groups of scores, called class intervals, rather than individual values.

    These intervals all have the same width, usually a simple number such as 2, 5, 10, and so on.

    Each interval begins with a value that is a multiple of the interval width. The interval width is selected so that the table will have approximately ten intervals.

    Key Concepts:

    Data in its original form and structure are called raw data or ungrouped data.

    Example: Consider the following data on 40 women's testosterone serum levels (scores) measured in g/dl.

    79 83 84 62 62 43 72 48 46 59
    93 64 59 32 54 45 55 45 76 72
    40 51 51 72 83 49 62 85 74 40
    49 65 38 55 77 63 38 43 63 69

    To construct a frequency distribution of the given raw data, we first find the highest serum level and prepare a column of these levels, beginning from the highest value and ending at the lowest one. Since the highest serum level is 93 and the lowest is 32, we have:

    Serum Level    f        Serum Level    f
    93             I        62             III
    85             I        59             II
    84             I        55             I
    83             II       54             I
    79             I        51             II
    77             I        49             II
    76             I        48             I
    74             I        46             I
    72             III      45             II
    69             I        43             II
    65             I        40             II
    64             I        38             II
    63             II       32             I

    When these data are placed into a system wherein they are organized, they take on the nature of grouped data. This procedure of organizing data into groups produces a frequency distribution table (FDT).

    Example:

    The following presents a frequency distribution table of the grouped data of the urine amylase (scores) of 15 patients in amylase units/hour.

    SCORES    FREQUENCY
    10-19     5
    20-29     4
    30-39     3
    40-49     2
    50-59     1
    Total     15

    Components of a Frequency Table

    Class Interval: these are the numbers defining the class; it consists of the end numbers called the class limits, namely the upper limit and the lower limit.

    Class frequency: shows the number of observations falling in the class.

    Class Boundaries: these are the so-called true class limits, classified as:

      Lower Class Boundary (LCB): defined as the middle value between the lower class limit of the class and the upper class limit of the preceding class.

      Upper Class Boundary (UCB): defined as the middle value between the upper class limit of the class and the lower limit of the next class.

    Class size: the difference between two consecutive upper limits or two consecutive lower limits.

    Class mark
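    As a rough illustration of these components, the sketch below computes the boundaries, class size, and class mark for one class of the urine amylase table (classes 10-19, 20-29, and so on). The function name `describe` is hypothetical, and the class mark is taken here as the midpoint of the two class limits, which is the usual definition but is an assumption because the definition is cut off in the text above.

```python
# Class limits (lower, upper) of a grouped table with whole-number scores.
classes = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]

def describe(classes, i):
    """Return boundaries, size and mark of class i (hypothetical helper)."""
    lower, upper = classes[i]
    # Class size: difference between two consecutive lower limits.
    size = classes[1][0] - classes[0][0]
    # Gap between the upper limit of one class and the lower limit of the next.
    # The same gap is assumed at the two ends of the table.
    gap = classes[1][0] - classes[0][1]
    lcb = lower - gap / 2          # lower class boundary
    ucb = upper + gap / 2          # upper class boundary
    mark = (lower + upper) / 2     # class mark, assumed to be the midpoint
    return {"LCB": lcb, "UCB": ucb, "size": size, "mark": mark}

print(describe(classes, 1))
```

    For the class 20-29 this gives boundaries of 19.5 and 29.5, a class size of 10, and a class mark of 24.5.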

    Steps in Constructing an FDT:

    Step 1) Determine the number of classes. For a first approximation, it is suggested to use the STURGES APPROXIMATION FORMULA.

    K = 1 + 3.322 log n

    where K = approximate number of classes
          n = number of cases (observations)
          log = base-10 logarithm

    Example: We now construct the FDT of the testosterone serum levels of 40 women as shown in the raw data.

    With n = 40, K = 1 + 3.322 log 40 ≈ 6.32, so we use K = 6 classes as a first approximation.

    Step 2) Determine the range R:

    R = maximum value - minimum value

    R = max - min
    R = 93 - 32
    R = 61

    Step 3) Determine the approximate class size C using the formula

    C = R/K

    where R = range and K = number of classes from the Sturges approximation formula.

    Note: It is usually convenient to round off C to the nearest whole number.

    C = R/K
    C = 61/6
    C = 10.167
    C = 10
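    Steps 1 to 3 can be checked numerically. The sketch below assumes the 40 serum levels from the example and rounds K to the nearest whole number, which matches the division by 6 used above; the variable names are illustrative only.

```python
import math

# The 40 testosterone serum levels from the raw-data example.
serum = [
    79, 83, 84, 62, 62, 43, 72, 48, 46, 59,
    93, 64, 59, 32, 54, 45, 55, 45, 76, 72,
    40, 51, 51, 72, 83, 49, 62, 85, 74, 40,
    49, 65, 38, 55, 77, 63, 38, 43, 63, 69,
]

n = len(serum)

# Step 1: Sturges approximation, K = 1 + 3.322 log10(n).
k_exact = 1 + 3.322 * math.log10(n)
k = round(k_exact)  # assumption: round K to the nearest whole number

# Step 2: range = maximum value - minimum value.
data_range = max(serum) - min(serum)

# Step 3: class size C = R / K, rounded to the nearest whole number.
c_exact = data_range / k
c = round(c_exact)

print(f"n = {n}, K = {k_exact:.3f} -> {k}")
print(f"R = {max(serum)} - {min(serum)} = {data_range}")
print(f"C = {data_range}/{k} = {c_exact:.3f} -> {c}")
```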

    Step 4) Determine the lowest class interval (or the first class). This class should include the minimum value in the data set. For uniformity, let us agree that, for our purposes, the lower limit of the lowest class interval should start at the minimum value.

    Let us decide to start at the minimum value. Thus the lowest class is the class 32-41.

    Step 5) Determine all class limits by adding the class size C to the limits of the previous class.

    The classes are constructed by adding 10 to each class limit. Thus we have:

    32-41
    42-51
    52-61
    62-71
    72-81
    82-91
    92-101
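    The repeated addition in Step 5 is easy to automate. The following is a minimal sketch assuming whole-number scores, so that each upper limit is the lower limit plus C - 1; `class_limits` is a hypothetical helper name.

```python
def class_limits(minimum, maximum, size):
    """Return (lower, upper) class limits, starting at the minimum value."""
    limits = []
    lower = minimum
    # Keep adding classes until the maximum value is covered.
    while lower <= maximum:
        limits.append((lower, lower + size - 1))  # whole-number scores assumed
        lower += size
    return limits

for lower, upper in class_limits(32, 93, 10):
    print(f"{lower}-{upper}")
```

    Starting at 32 with C = 10, seven classes are needed to reach the maximum of 93, one more than the first approximation K = 6.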

    Step 6: Tally the scores or observations falling in each class.

    Classes    Tally        Frequency
    32-41      IIII         5
    42-51      IIII-IIII    9
    52-61      IIII         4
    62-71      IIII-III     8
    72-81      IIII-III     8
    82-91      IIII         5
    92-101     I            1
                            N = 40

    The following table presents the complete frequency distribution table indicating the class boundaries, the class mark ...

    Frequency Distribution Graphs

    In a frequency distribution graph, the score categories (X values) are listed on the X axis and the frequencies are listed on the Y axis.

    When the score categories consist of numerical scores from an interval or ratio scale, the graph should be either a histogram or a polygon.

    Histograms

    In a histogram, a bar is centered above each score (or class interval) so that the height of the bar corresponds to the frequency and the width extends to the real limits, so that adjacent bars touch.

    Polygons

    In a polygon, a dot is centered above each score so that the height of the dot corresponds to the frequency. The dots are then connected by straight lines. An additional line is drawn at each end to bring the graph back to a zero frequency.
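    The points of a frequency polygon can be listed before any drawing is done. This sketch uses the urine amylase table as input and assumes that, for grouped data, each dot sits above the midpoint of its class; `polygon_points` is a hypothetical helper name.

```python
# Grouped table: (lower limit, upper limit, frequency) for each class.
table = [(10, 19, 5), (20, 29, 4), (30, 39, 3), (40, 49, 2), (50, 59, 1)]

def polygon_points(table):
    """Return the (x, y) vertices of the frequency polygon."""
    width = table[1][0] - table[0][0]                # class size
    marks = [(lo + hi) / 2 for lo, hi, _ in table]   # midpoint of each class
    freqs = [f for _, _, f in table]
    # One extra point at each end brings the graph back to zero frequency.
    xs = [marks[0] - width] + marks + [marks[-1] + width]
    ys = [0] + freqs + [0]
    return list(zip(xs, ys))

for x, y in polygon_points(table):
    print(x, y)
```

    Connecting these points in order with straight lines gives the polygon; the first and last points are the added zero-frequency ends.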

    Bar graphs

    When the score categories (X values) are measurements from a nominal or an ordinal scale, the graph should be a bar graph.

    A bar graph is just like a histogram except that gaps or spaces are left between adjacent bars.

    Relative frequency

    Many populations are so large that it is impossible to know the exact number of individuals (frequency) for any specific category.

    In these situations, population distributions can be shown using relative frequency instead of the absolute number of individuals for each category.

    Smooth curve

    If the scores in the population are measured on an interval or ratio scale, it is customary to present the distribution as a smooth curve rather than a jagged histogram or polygon.

    The smooth curve emphasizes the fact that the distribution is not showing the exact frequency for each category.

    Frequency distribution graphs

    Frequency distribution graphs are useful because they show the entire set of scores.

    At a glance, you can determine the highest score, the lowest score, and where the scores are centered.

    The graph also shows whether the scores are clustered together or scattered over a wide range.

    Shape

    A graph shows the shape of the distribution.

    A distribution is symmetrical if the left side of the graph is (roughly) a mirror image of the right side.

    One example of a symmetrical distribution is the bell-shaped normal distribution.

    On the other hand, distributions are ...

    Positively and Negatively Skewed Distributions

    In a positively ...

    Percentiles, Percentile Ranks, and Interpolation

    The relative location of individual scores within a distribution can be described by percentiles and percentile ranks.

    The percentile rank ...

    Percentiles, Percentile Ranks, and Interpolation (cont.)

    To find percentiles and percentile ranks, two new columns are placed in the frequency distribution table: one is for cumulative frequency (cf) and the other is for cumulative percentage (c%).

    Each cumulative percentage identifies the percentile rank for the upper real limit of the corresponding score or class interval.

    Interpolation

    When scores or percentages do not correspond to upper real limits or cumulative percentages, you must use interpolation to determine the corresponding ranks and percentiles.

    Interpolation is a mathematical process based on the assumption that the scores and the percentages change in a regular, linear fashion as you move through an interval from one end to the other.
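    A small sketch may make the cumulative columns and the interpolation step concrete. It uses the urine amylase table as input and assumes whole-number scores, so real limits lie 0.5 below and above the class limits; `percentile_rank` is a hypothetical helper name.

```python
# (lower limit, upper limit, f) for each class, lowest class first.
table = [(10, 19, 5), (20, 29, 4), (30, 39, 3), (40, 49, 2), (50, 59, 1)]
n = sum(f for _, _, f in table)

# Build rows of (lower real limit, upper real limit, cf, c%).
rows = []
running = 0
for lo, hi, f in table:
    running += f
    rows.append((lo - 0.5, hi + 0.5, running, 100 * running / n))

def percentile_rank(score):
    """Linearly interpolate the cumulative percentage at `score`."""
    prev_pct = 0.0
    for lower_real, upper_real, _, pct in rows:
        if lower_real <= score <= upper_real:
            # Fraction of the way through the interval, applied to c%.
            fraction = (score - lower_real) / (upper_real - lower_real)
            return prev_pct + fraction * (pct - prev_pct)
        prev_pct = pct
    raise ValueError("score lies outside the table")

for lower_real, upper_real, cum_f, pct in rows:
    print(f"{lower_real:>5}-{upper_real:<5} cf={cum_f:>2}  c%={pct:6.2f}")
print(round(percentile_rank(25), 2))
```

    A score of 25 lies 55% of the way through the interval 19.5 to 29.5, so its percentile rank is 55% of the way from 33.33% to 60%, which is 48%.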

    Stem-and-Leaf Displays

    A stem-and-leaf display provides a very efficient method for obtaining and displaying a frequency distribution.

    Each score is divided into a stem consisting of the first digit or digits, and a leaf consisting of the final digit.

    Finally, you go through the list of scores, one at a time, and write the leaf for each score beside its stem.

    The resulting display provides an organized picture of the entire distribution. The number of leaves beside each stem corresponds to the frequency, and the individual leaves identify the individual scores.

    Descriptive Statistics

    Class A--IQs of 13 Students

    102  115
    128  109
    131  89
    98   106
    140  119
    93   97
    110

    Class B--IQs of 13 Students

    127  162
    131  103
    96   111
    80   109
    93   87
    120  105
    109

    Sample Illustration: Which Group is Smarter?

    Each individual may be different. If you try to understand a group by remembering the qualities of each member, you become overwhelmed and fail to understand the group.

    Descriptive Statistics

    Which group is smarter now?

    Class A--Average IQ    Class B--Average IQ
    110.54                 110.23

    They're roughly the same!

    With a summary descriptive statistic, it is much easier to answer our question.
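    The two averages can be recomputed from the 13 IQ scores listed for each class. This is a minimal sketch; the list names `class_a` and `class_b` are illustrative.

```python
from statistics import mean

# IQ scores of the 13 students in each class, as listed above.
class_a = [102, 115, 128, 109, 131, 89, 98, 106, 140, 119, 93, 97, 110]
class_b = [127, 162, 131, 103, 96, 111, 80, 109, 93, 87, 120, 105, 109]

# A single summary statistic per group replaces 13 separate numbers.
print(f"Class A: {mean(class_a):.2f}")
print(f"Class B: {mean(class_b):.2f}")
```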

    Other Graphs

    Besides histograms, there are other methods of graphing quantitative data:

    Stem and Leaf Plots
    Dot Plots
    Time Series

    Stem and Leaf Plots

    Represents data by separating each data value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit).

    Larson/Farber 4th ed.

    Constructing Stem and Leaf Plots

    Split each data value at the same place value to form the stem and a leaf. (Want 5-20 stems.)

    Arrange all possible stems vertically so there are no missing stems.

    Write each leaf to the right of its stem, in order.

    Create a key to recreate the data.

    Variations of stem plots:
    1. Split stems
    2. Back to back stem plots

    Constructing a Stem-and-Leaf Plot
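    The construction steps can be sketched in a few lines. The example below applies them to the 40 testosterone serum levels from the earlier example, splitting each value at the tens place; `stem_and_leaf` is a hypothetical helper name, not part of any library.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: sorted leaves}, using the final digit as the leaf."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)   # split at the tens place
    # List every stem between the smallest and largest, even if it is empty.
    return {s: stems.get(s, []) for s in range(min(stems), max(stems) + 1)}

serum = [
    79, 83, 84, 62, 62, 43, 72, 48, 46, 59,
    93, 64, 59, 32, 54, 45, 55, 45, 76, 72,
    40, 51, 51, 72, 83, 49, 62, 85, 74, 40,
    49, 65, 38, 55, 77, 63, 38, 43, 63, 69,
]

plot = stem_and_leaf(serum)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 3 | 2 = 32")
```

    The seven stems (3 through 9) fall within the suggested range of 5-20 stems, and the number of leaves on each stem is the frequency of that decade.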

    Dot Plots

    Dot plot: consists of a graph in which each data value is plotted as a point along a scale of values.

    [Figure: dot plot]

    Time Series (Paired data)

    Time Series: the data set is composed of quantitative entries taken at regular intervals over a period of time.

    e.g., The amount of precipitation measured each day for one month.

    Use a time series chart to graph.

    [Figure: Time-Series Graph; horizontal axis: time, vertical axis: quantitative data]

    Time-Series Graph

    [Figure: Number of Screens at Drive-In Movie Theaters]

    Graphing Qualitative Data Sets

    Pie Chart: a circle is divided into sectors that represent categories.

    Pareto Chart: a vertical bar graph in which the height of each bar represents frequency or relative frequency.

    [Figure: Pareto chart; horizontal axis: Categories, vertical axis: Frequency]

    Constructing a Pie Chart

    Find the total sample size.

    Convert the frequencies to relative frequencies (percent).

    Marital Status    Frequency, f (in millions)    Relative frequency
    Never Married     55.3                          55.3/219.7 ≈ 0.25 or 25%
    Married           127.7                         127.7/219.7 ≈ 0.58 or 58%
    Widowed           13.9                          13.9/219.7 ≈ 0.06 or 6%
    Divorced          22.8                          22.8/219.7 ≈ 0.10 or 10%
    Total:            219.7
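    The conversion from frequencies to relative frequencies can be checked with a short script. This is a sketch using the marital status frequencies (in millions); the dictionary name `marital_status` is illustrative.

```python
# Frequencies in millions for each marital status category.
marital_status = {
    "Never Married": 55.3,
    "Married": 127.7,
    "Widowed": 13.9,
    "Divorced": 22.8,
}

# Step 1: total sample size.
total = sum(marital_status.values())

# Step 2: relative frequency = category frequency / total.
relative = {status: f / total for status, f in marital_status.items()}

for status, rel in relative.items():
    print(f"{status:<14}{marital_status[status]:>7.1f}{rel:>7.2f}{rel:>6.0%}")
print(f"{'Total':<14}{total:>7.1f}")
```

    Each relative frequency is the share of the circle given to that category's sector.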

    Constructing Pareto Charts

    Create a bar for each category, where the height of the bar can represent frequency or relative frequency.

    The bars are often positioned in order of decreasing height, with the tallest bar positioned at the left.
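    The ordering rule for a Pareto chart amounts to sorting the categories by frequency. The sketch below reuses the marital status frequencies purely as an illustration (the text does not say which data the Pareto chart is built from) and prints rows of `#` characters as a text stand-in for the bars.

```python
# Frequencies in millions, reused here only as illustrative input.
frequencies = {
    "Never Married": 55.3,
    "Married": 127.7,
    "Widowed": 13.9,
    "Divorced": 22.8,
}

# Sort categories from the largest frequency to the smallest.
pareto_order = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

for category, f in pareto_order:
    bar = "#" * round(f / 10)   # one '#' per 10 million, text stand-in for a bar
    print(f"{category:<14} {bar} {f}")
```

    In a drawn chart the first category printed here would be the leftmost, tallest bar.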