Ch 3 – Examining Relationships
YMS – 3.1
Scatterplots
Some Vocabulary
Response Variable
– Measures an outcome of a study
– AKA dependent variable
Explanatory Variable
– Attempts to explain the observed outcomes
– AKA independent variable
Scatterplot
– Shows the relationship between two quantitative variables measured on the same individuals
Scatterplots
Examining
– Look for overall pattern and any deviations
– Describe pattern with form, strength, and direction
Drawing
– Uniformly scale the vertical and horizontal axes
– Label both axes
– Adopt a scale that uses the entire available grid
Categorical Variables
– Use a different color/shape to distinguish the categories of a categorical variable
Classwork: p125 #3.7, 3.10-3.11
Homework: #3.16, 3.22 and 3.2 Blueprint
YMS – 3.2
Correlation
Correlation
Measures the direction and strength of the linear relationship between two quantitative variables
Facts About Correlation
Makes no distinction between explanatory and response variables
Requires both variables be quantitative
Does not change when we change units of measurement
Sign of r indicates positive or negative association
r always lies between -1 and 1, inclusive
Only measures strength of linear relationships
Is not resistant (a few outlying points can change r substantially)
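The facts above can be checked numerically. The sketch below assumes the usual textbook definition of r (the sum of products of standardized x and y values, divided by n - 1), which the slides do not spell out. The height and weight numbers are made up for illustration.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    # Sample standard deviation (divides by n - 1).
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    # r = (1 / (n - 1)) * sum of (z-score of x) * (z-score of y)
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    z_products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(z_products) / (len(xs) - 1)

# Made-up illustration data: heights in inches, weights in pounds.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [110, 120, 140, 155, 165, 180]

r = correlation(heights_in, weights_lb)

# No distinction between explanatory and response: swapping x and y gives the same r.
r_swapped = correlation(weights_lb, heights_in)

# Changing units (inches to cm, pounds to kg) does not change r.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]
r_metric = correlation(heights_cm, weights_kg)

# Not resistant: one point far from the pattern pulls r down.
r_with_outlier = correlation(heights_in + [74], weights_lb + [100])

print(round(r, 4), round(r_swapped, 4), round(r_metric, 4), round(r_with_outlier, 4))
```

The first three printed values agree, and the fourth is noticeably smaller because of the single added point.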
In Class Exercises
p146 #3.28, 3.34 and 3.37
Correlation Guessing Game
Homework
3.3 Blueprint
YMS – 3.3
Least-Squares Regression
Regression
Regression Line
– Describes how a response variable y changes as an explanatory variable x changes
LSRL of y on x
– Makes the sum of the squares of the vertical distances of the data points from the line as small as possible
Line should be as close as possible to the points in the vertical direction
– Error = Observed (Actual) - Predicted
LSRL
Equation of the LSRL: ŷ = a + bx
Slope: b = r(s_y / s_x)
Intercept: a = ȳ - b·x̄
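A minimal sketch of computing the LSRL, assuming the standard textbook formulas: slope b = r(s_y / s_x) and intercept a = ȳ - b·x̄. The function names and the small data set are hypothetical, chosen only for illustration. The last lines compare the sum of squared errors against two slightly different lines to show that the LSRL gives the smallest value.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    z_products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(z_products) / (len(xs) - 1)

def lsrl(xs, ys):
    """Return (a, b) for the line y-hat = a + b*x."""
    b = correlation(xs, ys) * sample_sd(ys) / sample_sd(xs)  # slope
    a = mean(ys) - b * mean(xs)                              # intercept
    return a, b

def sum_squared_errors(xs, ys, a, b):
    # Error = observed - predicted, measured in the vertical direction.
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustration data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = lsrl(xs, ys)
sse_best = sum_squared_errors(xs, ys, a, b)

# Any other line has a larger sum of squared vertical errors.
sse_steeper = sum_squared_errors(xs, ys, a, b + 0.1)
sse_shifted = sum_squared_errors(xs, ys, a + 0.1, b)

print(f"y-hat = {a:.2f} + {b:.2f}x, SSE = {sse_best:.2f}")
print(f"SSE with steeper slope = {sse_steeper:.2f}, with shifted intercept = {sse_shifted:.2f}")
```

For this data the line works out to ŷ = 2.2 + 0.6x.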
Coefficient of determination – r²
The fraction of the variation in the values of y that is explained by the least-squares regression of y on x
Measures the contribution of x in predicting y
If x is a poor predictor of y, then the sum of the squares of the deviations about the mean (SST) and the sum of the squares of deviations about the regression line (SSE) would be approximately the same, so r² = (SST - SSE) / SST would be close to 0.
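The SST and SSE comparison can be sketched in a few lines. This assumes the standard identity r² = (SST - SSE) / SST and computes the slope as Sxy / Sxx, which is algebraically the same as r(s_y / s_x). Both data sets are made up: in the first, x helps predict y; in the second, x is a poor predictor.

```python
def mean(values):
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x, using b = Sxy / Sxx."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

def r_squared(xs, ys):
    """Return (SST, SSE, r^2) for the least-squares regression of y on x."""
    a, b = least_squares_line(xs, ys)
    y_bar = mean(ys)
    sst = sum((y - y_bar) ** 2 for y in ys)                     # about the mean
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))   # about the line
    return sst, sse, (sst - sse) / sst

xs = [1, 2, 3, 4, 5]

# x is a useful predictor: SSE is well below SST.
sst_good, sse_good, r2_good = r_squared(xs, [2, 4, 5, 4, 5])

# x is a poor predictor: SSE is about the same as SST, so r^2 is near 0.
sst_poor, sse_poor, r2_poor = r_squared(xs, [3, 5, 4, 5, 3])

print(f"useful x: SST = {sst_good:.2f}, SSE = {sse_good:.2f}, r^2 = {r2_good:.2f}")
print(f"poor x:   SST = {sst_poor:.2f}, SSE = {sse_poor:.2f}, r^2 = {r2_poor:.2f}")
```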
Understanding r-squared: A single point simplification
Al Coons, Buckingham Browne & Nichols School, Cambridge
[Figure: a single data point compared with the mean model and the y-hat model, marking the error w.r.t. the mean model and the error eliminated by the y-hat model]
Proportion of error eliminated by y-hat model = (error eliminated by y-hat model) / (error w.r.t. mean model)
r² = proportion of variability accounted for by the given model (w.r.t. the mean model)
Facts about Least-Squares Regression
Distinction between explanatory and response variables is essential
A change of one standard deviation in x corresponds to a change of r standard deviations in y
LSRL always passes through the point (x̄, ȳ)
The square of the correlation is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x
Classwork: Transformations and LSRL WS
Homework: #3.39 and ABS Matching to Plots Extension Question (we’ll finish the others in class)
Residuals
residual = observed y - predicted y, or y - ŷ
Positive values show that the data point lies above the LSRL
The mean of the least-squares residuals is always zero
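A short sketch of the residual facts above, using made-up data and a hypothetical helper `least_squares_line`. It computes each residual as observed y minus predicted y, flags which points lie above the line, and checks that the residuals average to zero (up to floating-point rounding).

```python
def mean(values):
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

# Made-up illustration data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
predicted = [a + b * x for x in xs]

# residual = observed y - predicted y
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

# A positive residual means the point lies above the LSRL.
above_line = [res > 0 for res in residuals]

mean_residual = mean(residuals)

for x, y, y_hat, res in zip(xs, ys, predicted, residuals):
    side = "above" if res > 0 else "below"
    print(f"x = {x}: observed {y}, predicted {y_hat:.1f}, residual {res:+.1f} ({side} the line)")
print(f"mean residual = {mean_residual:.10f}")
```

Pairing each residual with its x value gives exactly the points that would be drawn in a residual plot.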
Residual Plots
A scatterplot of the regression residuals against the explanatory variable
Helps us assess the fit of a regression line
Want a random pattern
Watch for individual points with large residuals or that are extreme in the x direction
Outliers vs. Influential Observations
Outlier
– An observation that lies outside the overall pattern of the other observations
Influential observation
– Removing this point would markedly change the result of the calculation
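The difference between the two ideas can be illustrated by fitting the line with and without a single added point. This is a sketch on made-up data: one added point is far from the pattern in y but sits at the mean of x, and the other is extreme in the x direction. The helper name `least_squares_line` is hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

# Made-up base data with a roughly linear pattern.
base_xs = [1, 2, 3, 4, 5, 6]
base_ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
a_base, slope_base = least_squares_line(base_xs, base_ys)

# Outlier in y, placed at x = 3.5 (the mean of base_xs).
outlier = (3.5, 9.0)
a_out, slope_out = least_squares_line(base_xs + [outlier[0]], base_ys + [outlier[1]])

# Point that is extreme in the x direction.
influential = (15.0, 3.0)
a_inf, slope_inf = least_squares_line(base_xs + [influential[0]], base_ys + [influential[1]])

# Residual of each added point relative to the line fitted with it included.
resid_out = outlier[1] - (a_out + slope_out * outlier[0])
resid_inf = influential[1] - (a_inf + slope_inf * influential[0])

print(f"base slope:             {slope_base:.3f}")
print(f"with outlier in y:      {slope_out:.3f} (residual {resid_out:+.2f})")
print(f"with extreme-x point:   {slope_inf:.3f} (residual {resid_inf:+.2f})")
```

In this example the outlier in y has the larger residual but leaves the slope unchanged, while the extreme-x point changes the slope markedly even though its own residual is smaller. That is why the slides say to watch for both kinds of points.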
Classwork: Residual Plots WS
Homework: p177 #3.52 and 3.61
Doctors for the Poor
This will be graded for accuracy!
Ch 3 Review
p176 #3.50-3.51, 3.56, 3.59, 3.69, 3.76-3.77