Stat 31, Section 1, Last Time
• Distributions (how are data “spread out”?)
• Visual Display: Histograms
• Binwidth is critical
• Bivariate display: scatterplot
• Course Organization & Website
https://www.unc.edu/%7Emarron/UNCstat31-2005/Stat31sec1Home.html
• Increasing Variation (appears proportional to trend)
• “Seasonal Effect” - 12 Month Cycle (Peak in summer, less in winter)
Airline Passengers Example
Interesting variation: log transformation
• Stabilizes variation
• Since log of product is sum
• Shows changing variation was proportional to trend
• Log10 is “most interpretable”
(log10(1000) = 3, …)
• Generally useful trick (there are others)
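The variance-stabilizing effect of the log can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the series below is synthetic (a hypothetical 2% monthly growth trend multiplied by a 12-month seasonal factor), not the airline passenger data, and the helper name `yearly_ranges` is made up for this sketch. Because the log of a product is a sum, the seasonal swing becomes a constant additive term after taking log10.

```python
import math

# Hypothetical monthly series: trend TIMES a 12-month seasonal factor.
# (Synthetic numbers for illustration, not the airline passenger data.)
months = range(48)
trend = [100 * 1.02 ** t for t in months]
seasonal = [1 + 0.2 * math.sin(2 * math.pi * t / 12) for t in months]
series = [tr * s for tr, s in zip(trend, seasonal)]

# log10(trend * seasonal) = log10(trend) + log10(seasonal)
log_series = [math.log10(y) for y in series]

def yearly_ranges(values):
    """Max minus min within each consecutive 12-month block."""
    return [max(values[i:i + 12]) - min(values[i:i + 12])
            for i in range(0, len(values), 12)]

raw_ranges = yearly_ranges(series)      # grows from year to year
log_ranges = yearly_ranges(log_series)  # essentially the same every year

print("raw scale, yearly range:  ", [round(r, 1) for r in raw_ranges])
print("log10 scale, yearly range:", [round(r, 3) for r in log_ranges])
```

On the raw scale the within-year range grows with the trend; on the log10 scale it stays essentially constant, which is the sense in which the transformation “stabilizes variation”.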
Airline Passengers Example
A look under the hood
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg5Raw.xls
• Use Chart Wizard
• Chart Type: Line (or could do XY)
• Use subtype for points & lines
• Use menu for first log10
• Although could just type it in
• Drag down to repeat for whole column
Time Series HW
HW: 1.36, 1.37
• Use EXCEL
Exploratory Data Analysis 5
Numerical Summaries of Quant. Variables:
Idea: Summarize distributional information
(“center”, “spread”, “skewed”)
In Text, Sec. 1.2
for data $x_1, x_2, \ldots, x_n$
(subscripts allow “indexing numbers” in list)
Numerical Summaries
A. “Centers” (note there are several)
1. “Mean” = Average = $\bar{x} = \frac{x_1 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$
• Greek letter “Sigma”, for “sum”
In EXCEL, use “AVERAGE” function
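As a minimal sketch of the same formula outside EXCEL, the mean can be computed in Python by summing the list and dividing by its length. The function name `mean` and the sample list are choices made for this illustration (the list is the small five-number example used for the median in these notes).

```python
def mean(xs):
    """Average: (1/n) * (x_1 + ... + x_n)."""
    n = len(xs)
    return sum(xs) / n

data = [3, 1, 27, 2, 0]
print(mean(data))  # (3 + 1 + 27 + 2 + 0) / 5 = 33 / 5
```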
Numerical Summaries of Center
2. “Median” = Value in middle (of sorted list)
Unsorted E.g.:                 Sorted E.g.:
3                              0
1                              1
27   “in middle”? (no)         2   better “middle”!
2                              3
0                              27
EXCEL: use function “MEDIAN”
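The “sort first, then take the middle” idea can be sketched in Python as follows. The function name `median` is chosen for this illustration, and the handling of an even-length list (averaging the two middle values) is a common convention assumed here; the slide itself only shows an odd-length list.

```python
def median(xs):
    """Middle value of the sorted list."""
    s = sorted(xs)          # sorting is the essential first step
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]       # odd n: the single middle value
    # even n (assumed convention): average the two middle values
    return (s[mid - 1] + s[mid]) / 2

unsorted = [3, 1, 27, 2, 0]
print(unsorted[len(unsorted) // 2])  # 27: middle of the UNSORTED list, not the median
print(median(unsorted))              # 2: middle of the sorted list 0, 1, 2, 3, 27
```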
Difference Betw’n Mean & Median
Symmetric Distribution: Essentially no difference
Right Skewed:
[Figure: right-skewed density; the median M splits it into 50% area | 50% area]
$\bar{x}$ is bigger than M since it “feels tails more strongly”
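A small numerical check of this comparison, using Python's standard `statistics` module. Both lists are made-up illustrative data, not taken from the course examples: one is symmetric, the other has a long right tail.

```python
import statistics

# Made-up data for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 4, 6, 9, 17]   # long tail to the right

for name, xs in [("symmetric", symmetric), ("right skewed", right_skewed)]:
    print(name, "mean =", statistics.mean(xs), "median =", statistics.median(xs))
```

For the symmetric list the mean and median agree; for the right-skewed list the few large values pull the mean above the median.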
Difference Betw’n Mean & Median
Outliers (unusual values):
Nice Web Example:
http://www.stat.sc.edu/~west/applets/box.html
• Mean feels outliers much more strongly
• Leaves “range of most of data”
• Good notion of “center”? (perhaps not)
• Median affected very minimally
• Robustness Terminology: