7/28/2019 Stats Lec1
and Finance
2nd and 3rd year course
Introduction to statistical principles and techniques
Core text: Barrow
Emphasis on practical applications
Assessment: project
Key definitions
Population: the entire set of observations/census
Sample: a subgroup of the population
Parameter: a characteristic of the population, denoted by Greek characters, e.g. $\mu$ and $\sigma^2$
Statistic: an estimate of the parameter calculated using the sample, denoted by normal characters, e.g. $\bar{x}$ and $s^2$
Descriptive Statistics
Descriptive statistics summarize a mass of information
We may use graphical and/or numerical methods
Examples of graphical methods are the bar chart, XY chart, pie chart and histogram (see seminar 1 and workshop 1 for practice)
Examples of numerical methods are averages and standard deviations
Numerical Techniques
We examine measures of:
Location
Dispersion
Measures of Location
Mean: strictly the arithmetic mean, the well-known average
Median: e.g. the income of the person in the middle of the distribution
Mode: the observation with the highest frequency
These different measures can give different answers
The Mean of the Income Distribution
Person           1    2    3    4    5    6    7    8    9   10
Income (£000)   15   15   20   25   45   55   70   85  125  250
$\bar{x} = \frac{\sum x}{n} = \frac{705}{10} = 70.5$
Mean income is therefore £70,500 per year
NB: use $\mu$ if the data is the whole population, or $\bar{x}$ if the data is a sample (same formula).
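The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the variable names (`incomes`, `mean_income`) are illustrative choices, and the values are the ten incomes from the slide, in thousands per year.

```python
from statistics import mean

# Incomes of the ten people, in thousands per year (values from the slide).
incomes = [15, 15, 20, 25, 45, 55, 70, 85, 125, 250]

total = sum(incomes)       # sum of the observations
n = len(incomes)           # number of observations
mean_income = total / n    # arithmetic mean = sum / n

# The standard library function should agree with the hand calculation.
assert mean_income == mean(incomes)
print(f"sum = {total}, n = {n}, mean = {mean_income}")
```

The same calculation applies whether the ten people are treated as a population or as a sample; only the symbol changes.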
The Median
The income of the middle person, i.e. the one located halfway through the distribution
[Figure: people ranked from poorest to richest; the middle person's income is the median]
Calculating the Median
We have 10 observations in the sample, so the person 5.5 in rank order has the median wealth. This person is somewhere between persons 5 and 6.
Person           1    2    3    4    5    6    7    8    9   10
Income (£000)   15   15   20   25   45   55   70   85  125  250
Hence the median income is £50,000 per year
Q: what happens to the median if the richest person's income ...?
Q: what happens to the mean?
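One way to explore the two questions is to change the top income and recompute both measures. The sketch below assumes, purely for illustration, that the richest person's income doubles; that particular change is a hypothetical choice, not something stated on the slide.

```python
from statistics import mean, median

# Incomes in thousands per year, already in rank order.
incomes = [15, 15, 20, 25, 45, 55, 70, 85, 125, 250]

# With an even number of observations, the median is the midpoint of the
# two central values (persons 5 and 6: 45 and 55).
median_before = median(incomes)
mean_before = mean(incomes)

# Hypothetical change (an assumption): the richest income doubles.
changed = incomes[:-1] + [2 * incomes[-1]]
median_after = median(changed)
mean_after = mean(changed)

print(f"median: {median_before} -> {median_after}")
print(f"mean:   {mean_before} -> {mean_after}")
```

The median depends only on the middle of the ranking, so it is unaffected by this change, while the mean uses every value and moves.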
The Mode
The mode is the observation with the highest frequency
It is possible to have a sample or population with no mode, or more than 1 mode
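The three cases (one mode, several modes, no mode) can be illustrated with a small helper. This is a sketch: the function name `modes` is hypothetical, and it adopts the convention that a data set in which no value repeats has no mode.

```python
from collections import Counter

def modes(data):
    """Return the most frequent values, or [] if no value repeats."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # Convention assumed here: every value occurs once, so there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == top)

incomes = [15, 15, 20, 25, 45, 55, 70, 85, 125, 250]
one_mode = modes(incomes)            # a single mode
two_modes = modes([1, 2, 2, 3, 3])   # two values tie for the highest frequency
no_mode = modes([1, 2, 3])           # nothing repeats

print(one_mode, two_modes, no_mode)
```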
Measures of Dispersion
Range: the difference between the smallest and largest observation. Not very informative for most purposes
Variance: based on all observations in the population or sample
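The Variance
The variance is the average of all squared deviations from the mean:
$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$
NB: use $\sigma^2$ for the population variance; for the sample variance use $s^2$ and $\bar{x}$, dividing by $n - 1$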
The Variance (cont.)
[Figure: frequency distributions with a small variance and a larger variance]
Calculating the Sample Variance
$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1}$
x       15    15    20    25     45     55     70     85     125     250
x^2    225   225   400   625   2025   3025   4900   7225   15625   62500
$\sum x^2 = 225 + 225 + \dots + 62500 = 96775$; $n\bar{x}^2 = 10 \times 70.5^2 = 49702.5$
$s^2 = \frac{96775 - 49702.5}{9} \approx 5230.28$
NB: Variance is in £², so we use the square root, known as the standard deviation, s. s ≈ 72.321, i.e. about £72,321.
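The shortcut formula can be checked numerically against a library routine. The sketch below uses illustrative variable names and the ten incomes from the slide (in thousands); `statistics.variance` divides by n - 1, so it should agree with the hand formula.

```python
from math import sqrt
from statistics import variance

incomes = [15, 15, 20, 25, 45, 55, 70, 85, 125, 250]  # thousands per year

n = len(incomes)
x_bar = sum(incomes) / n                     # sample mean
sum_sq = sum(x * x for x in incomes)         # sum of squared observations

# Shortcut formula: (sum of x^2 - n * x_bar^2) / (n - 1)
s2 = (sum_sq - n * x_bar ** 2) / (n - 1)
s = sqrt(s2)                                 # sample standard deviation

# Cross-check against the standard library's sample variance.
assert abs(s2 - variance(incomes)) < 1e-6
print(f"sum of squares = {sum_sq}, s^2 = {s2:.2f}, s = {s:.3f}")
```

Carrying the unrounded value of n times the squared mean through the calculation avoids small rounding differences in the final digits.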
Standard Deviation
Useful to help us estimate:
a) The % of obs. that lie within a given number of standard deviations
b) Where a particular observation lies relative to the mean
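Chebyshev's Rule
At least $100(1 - 1/k^2)\%$ of observations lie within $k$ standard deviations above and below the mean (for $k > 1$).
e.g. $100 \times [1 - 1/2^2]\% = 75\%$ of obs. lie within 2 s.d.s either side of the mean
[Figure: distribution with at least 75% of observations between -2s and +2s of the mean]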
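The rule gives a lower bound, so the share actually observed in a data set can be higher. The sketch below uses a hypothetical helper, `chebyshev_bound`, and applies it to the ten incomes used earlier in the lecture; using the sample standard deviation here is an assumption made for illustration.

```python
from statistics import mean, stdev

def chebyshev_bound(k):
    """Minimum share of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

incomes = [15, 15, 20, 25, 45, 55, 70, 85, 125, 250]  # thousands per year
x_bar, s = mean(incomes), stdev(incomes)

k = 2
bound = chebyshev_bound(k)
# Share of the ten incomes that actually lie within k standard deviations.
share_within = sum(abs(x - x_bar) <= k * s for x in incomes) / len(incomes)

print(f"bound = {bound:.0%}, observed = {share_within:.0%}")
```

Only the largest income lies more than two standard deviations from the mean, so the observed share exceeds the guaranteed minimum.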
If the underlying distribution is Normal (more next week), then approximately
68% of observations lie within ±1 st. devs
95% of observations lie within ±2 st. devs
99.7% of observations lie within ±3 st. devs
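These percentages can be reproduced from the Normal cumulative distribution function. The sketch below relies on `statistics.NormalDist`, which is part of the Python standard library from version 3.8; the dictionary name `shares` is an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Share of a Normal distribution lying within k standard deviations of the mean.
shares = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} st. dev(s): {share:.1%}")
```

Because the result does not depend on the mean or standard deviation chosen, the standard Normal is enough for the check.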
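z scores tell us how many standard deviations an observation lies above or below the mean
$z = \frac{x - \mu}{\sigma}$
z > 0 means that the observation lies above the mean
z < 0 means that the observation lies below the mean
e.g. with $x = 65$, $\mu = 55$, $\sigma = 10$: $z = \frac{65 - 55}{10} = 1$
Thus 65 is exactly 1 st. dev. above the mean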
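The z score is a one-line calculation. In the sketch below the function name `z_score` is hypothetical, and the example values (observation 65, mean 55, standard deviation 10) are the ones used on the slide.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mu) / sigma

above = z_score(65, mu=55, sigma=10)   # observation above the mean
below = z_score(45, mu=55, sigma=10)   # observation below the mean

print(above, below)
```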
Summary
We can use graphical and numerical measures to summarise data
The aim is to simplify without distorting the message
Measures of location [mean, median, mode] and dispersion [variance, standard deviation, z scores] provide a good description of the data
Statistics when the data is grouped
Data on Wealth in the UK
[Table: the distribution of wealth]
The mean of the Wealth Distribution
Range        Mid-point, x (£000)   Frequency, f            fx
0                   5.0                3,417          17,085.0
10,000             17.5                1,303          22,802.5
25,000             32.5                1,240          40,300.0
40,000             45.0                  714          32,130.0
50,000             55.0                  642          35,310.0
60,000             70.0                1,361          95,270.0
80,000             90.0                1,270         114,300.0
100,000           125.0                2,708         338,500.0
150,000           175.0                1,633         285,775.0
200,000           250.0                1,242         310,500.0
300,000           400.0                  870         348,000.0
500,000           750.0                  367         275,250.0
1,000,000        1500.0                  125         187,500.0
2,000,000        3000.0                   41         123,000.0
Total                                 16,933       2,225,722.5
$\mu = \frac{\sum fx}{\sum f} = \frac{2,225,722.5}{16,933} = 131.443$
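The grouped mean is a frequency-weighted average of the class mid-points. The sketch below uses illustrative list names (`mid`, `freq`); the mid-points are in thousands and the frequencies are the ones in the lecture's wealth tables, including the 80,000 and 300,000 classes as they appear in the later variance table.

```python
# Class mid-points (thousands) and frequencies from the wealth tables.
mid = [5.0, 17.5, 32.5, 45.0, 55.0, 70.0, 90.0,
       125.0, 175.0, 250.0, 400.0, 750.0, 1500.0, 3000.0]
freq = [3417, 1303, 1240, 714, 642, 1361, 1270,
        2708, 1633, 1242, 870, 367, 125, 41]

total_f = sum(freq)                                # sum of f
total_fx = sum(f * x for x, f in zip(mid, freq))   # sum of f * x
grouped_mean = total_fx / total_f                  # weighted mean of the mid-points

print(f"sum f = {total_f}, sum fx = {total_fx}, mean = {grouped_mean:.3f}")
```

The result is only as good as the mid-points, which stand in for the unknown individual values inside each class.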
Calculating the Median
16,933 observations, hence person 8,466.5 in rank order has the median wealth
Range      Frequency   Cumulative frequency
0              3,417                  3,417
10,000         1,303                  4,720
25,000         1,240                  5,960
40,000           714                  6,674
50,000           642                  7,316   <- Number with wealth less than 60k
60,000         1,361                  8,677   <- Number with wealth less than 80k
80,000         1,270                  9,947
:                  :                      :
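Calculating the Median (cont.)
To find the precise median, use
$x_L + (x_U - x_L) \times \frac{n/2 - F}{f}$
where $x_L$ and $x_U$ are the lower and upper limits of the median class, $F$ is the cumulative frequency below that class and $f$ is the frequency of that class
$= 60 + (80 - 60) \times \frac{16,933/2 - 7,316}{1,361} = 76.907$
Median wealth is £76,907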
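The interpolation can be written as a short function that first locates the median class from the cumulative frequencies. This is a sketch under stated assumptions: the function name `grouped_median` is hypothetical, class limits are in thousands, each class's upper limit is taken to be the next class's lower limit, and the target rank is n/2, which matches the slide's arithmetic.

```python
def grouped_median(lower, freq):
    """Interpolated median for grouped data given class lower limits and frequencies."""
    n = sum(freq)
    if n == 0:
        raise ValueError("empty frequency table")
    target = n / 2                     # rank of the median observation
    cum_below = 0                      # cumulative frequency below the current class
    for i, f in enumerate(freq):
        if cum_below + f >= target:
            if i + 1 >= len(lower):
                raise ValueError("median falls in the open-ended top class")
            x_l, x_u = lower[i], lower[i + 1]
            # Assume observations are spread evenly within the class.
            return x_l + (x_u - x_l) * (target - cum_below) / f
        cum_below += f

# Lower class limits (thousands) and frequencies from the wealth tables.
lower = [0, 10, 25, 40, 50, 60, 80, 100, 150, 200, 300, 500, 1000, 2000]
freq = [3417, 1303, 1240, 714, 642, 1361, 1270,
        2708, 1633, 1242, 870, 367, 125, 41]

median_wealth = grouped_median(lower, freq)
print(f"median = {median_wealth:.3f} (thousands)")
```

The even-spread assumption inside the median class is what justifies the linear interpolation.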
The Mode (cont.)
For grouped data, the mode corresponds to the interval with greatest frequency density
Range      Frequency   Class width   Frequency density
0              3,417        10,000              0.3417   <- Modal class
10,000         1,303        15,000              0.0869
25,000         1,240        15,000              0.0827
40,000           714        10,000              0.0714
50,000           642        10,000              0.0642
:                  :             :                   :
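Frequency density is the class frequency divided by the class width, which makes classes of unequal width comparable. The sketch below uses illustrative names and takes each class's upper limit to be the next class's lower limit; the open-ended top class has no width, so it is left out of the comparison (an assumption of this sketch).

```python
# Lower class limits and frequencies from the wealth tables.
lower = [0, 10_000, 25_000, 40_000, 50_000, 60_000, 80_000, 100_000,
         150_000, 200_000, 300_000, 500_000, 1_000_000, 2_000_000]
freq = [3417, 1303, 1240, 714, 642, 1361, 1270,
        2708, 1633, 1242, 870, 367, 125, 41]

# Width of each closed class; the last (open-ended) class is excluded.
widths = [hi - lo for lo, hi in zip(lower, lower[1:])]
# zip stops at the shorter list, so the open-ended class is skipped here too.
densities = [f / w for f, w in zip(freq, widths)]

# The modal class is the one with the greatest frequency density.
modal_index = max(range(len(densities)), key=densities.__getitem__)
modal_class = (lower[modal_index], lower[modal_index + 1])

print(f"modal class: {modal_class}, density = {densities[modal_index]:.4f}")
```

Comparing raw frequencies instead of densities would favour wide classes simply because they cover more of the range.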
The Variance
The variance is the average of all squared deviations from the mean:
$\sigma^2 = \frac{\sum f(x - \mu)^2}{\sum f}$
The larger this value, the greater the dispersion of the observations
Range        Mid-point, x (£000)   Frequency, f   Deviation (x - μ)       (x - μ)^2         f(x - μ)^2
0                   5.0                3,417            -126.4             15,987.81      54,630,329.97
10,000             17.5                1,303            -113.9             12,982.98      16,916,826.55
25,000             32.5                1,240             -98.9              9,789.70      12,139,223.03
40,000             45.0                  714             -86.4              7,472.37       5,335,274.81
50,000             55.0                  642             -76.4              5,843.52       3,751,537.16
60,000             70.0                1,361             -61.4              3,775.23       5,138,086.73
80,000             90.0                1,270             -41.4              1,717.51       2,181,241.95
100,000           125.0                2,708              -6.4                 41.51         112,411.42
150,000           175.0                1,633              43.6              1,897.22       3,098,162.88
200,000           250.0                1,242             118.6             14,055.79      17,457,288.35
300,000           400.0                  870             268.6             72,122.92      62,746,940.35
500,000           750.0                  367             618.6            382,612.90     140,418,932.52
1,000,000        1500.0                  125           1,368.6          1,872,948.56     234,118,569.53
2,000,000        3000.0                   41           2,868.6          8,228,619.88     337,373,415.02
Total                                 16,933                                             895,418,240.28
$\sigma^2 = \frac{\sum f(x - \mu)^2}{\sum f} = \frac{895,418,240.28}{16,933} = 52,880.07$
The Standard Deviation
The variance is measured in "squared £s" (because we used squared deviations)
Taking the square root gives the standard deviation:
$\sigma = \sqrt{52,880.07} = 229.957$, i.e. £229,957
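The grouped variance and standard deviation can be reproduced from the mid-points and frequencies. This is a minimal sketch with illustrative names; values are in thousands, so the variance is in squared thousands.

```python
from math import sqrt

# Class mid-points (thousands) and frequencies from the variance table.
mid = [5.0, 17.5, 32.5, 45.0, 55.0, 70.0, 90.0,
       125.0, 175.0, 250.0, 400.0, 750.0, 1500.0, 3000.0]
freq = [3417, 1303, 1240, 714, 642, 1361, 1270,
        2708, 1633, 1242, 870, 367, 125, 41]

n = sum(freq)
mu = sum(f * x for x, f in zip(mid, freq)) / n                   # grouped mean
sum_sq_dev = sum(f * (x - mu) ** 2 for x, f in zip(mid, freq))   # sum of f(x - mu)^2

sigma2 = sum_sq_dev / n     # population variance
sigma = sqrt(sigma2)        # population standard deviation

print(f"sum f(x-mu)^2 = {sum_sq_dev:,.2f}")
print(f"variance = {sigma2:,.2f}, st. dev. = {sigma:.3f}")
```

Using the unrounded mean in the deviations is what reproduces the table's total; deviations rounded to one decimal place would give a slightly different sum.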
Sample Measures
For sample data, use
$s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1}$
This gives an unbiased estimate of the population variance
Take the square root of this for the sample standard deviation
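The only change from the population formula is the divisor. The sketch below treats the grouped wealth data as if it were a sample, purely to show the size of the difference; the function name `grouped_variance` and its `sample` flag are hypothetical.

```python
from math import sqrt

def grouped_variance(mid, freq, sample=False):
    """Variance of grouped data; divides by n - 1 when sample=True, else by n."""
    n = sum(freq)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(f * x for x, f in zip(mid, freq)) / n
    sum_sq_dev = sum(f * (x - mean) ** 2 for x, f in zip(mid, freq))
    return sum_sq_dev / (n - 1 if sample else n)

# Class mid-points (thousands) and frequencies from the wealth tables.
mid = [5.0, 17.5, 32.5, 45.0, 55.0, 70.0, 90.0,
       125.0, 175.0, 250.0, 400.0, 750.0, 1500.0, 3000.0]
freq = [3417, 1303, 1240, 714, 642, 1361, 1270,
        2708, 1633, 1242, 870, 367, 125, 41]

sigma2 = grouped_variance(mid, freq)              # divide by n
s2 = grouped_variance(mid, freq, sample=True)     # divide by n - 1
s = sqrt(s2)                                      # sample standard deviation

print(f"population variance = {sigma2:,.2f}, sample variance = {s2:,.2f}, s = {s:.3f}")
```

With almost 17,000 observations the two divisors give nearly the same answer; the correction matters most in small samples.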