Correlation & Regression - Relationship Analysis

Oct 30, 2014

Siddharth Nath

Transcript
Page 1: Correlation & Regression - Relationship Analysis

PPT 18-1

Correlation/Regression

RELATIONSHIP ANALYSIS

Page 2: Correlation & Regression - Relationship Analysis

PPT 18-2

Learning Objectives

• The meanings and uses of regression and correlation analyses

• Calculate regressions and correlations

• Basics of multivariate statistical analysis techniques

Page 3: Correlation & Regression - Relationship Analysis

PPT 18-3

Statistics Not Always Black and White

• How does the story relate to marketing research?

• Explain the meaning of this statement from the story: “Statistical fallacies by themselves might create a certain amount of random mischief. But the big problem is that statistics which seem to confirm the dogmas of the intelligentsia are seized upon and trumpeted throughout academia and the media, with little or no concern for ‘multicollinearity’ or any of the other pitfalls.”

• How can the Internet be used to help you understand multicollinearity, correlation, and other statistical concepts?

Page 4: Correlation & Regression - Relationship Analysis

PPT 18-4

Relationship Analysis

The examination of the association between two or more variables. In marketing, some of the more apparent relationships include associations between advertising and sales, company size and advertising budget, supply and demand for products, and customer satisfaction and customer loyalty.

Page 5: Correlation & Regression - Relationship Analysis

PPT 18-5

Scatter Diagrams

• Two related variables, called bivariate data, plotted as points on a graph.

• Each point on the diagram represents a pair of values, one based on the X scale (independent variable) and the other based on the Y scale (dependent variable).

• Making a scatter diagram usually is the initial step in investigating the relationship between two variables, because the diagram shows visually the shape and degree of closeness of the relationship.

• A scatter diagram also indicates whether the relationship between the two variables is positive or negative.

Page 6: Correlation & Regression - Relationship Analysis

PPT 18-6

Simple Regression Analysis

Refers to statistical techniques for measuring the relationship between a dependent variable and one or more independent variables. The relationship between two variables is characterized by how they vary together. Given pairs of X and Y values, regression analysis measures the direction (positive or negative) and rate of change (slope) in Y as X changes, or vice versa. Using the values of the independent variable, it attempts to predict the values of an interval- or ratio-scaled dependent variable.

Page 7: Correlation & Regression - Relationship Analysis

PPT 18-7

Regression Analysis Requires Two Operations

• Derive an equation, called the regression equation, and a line representing the equation to describe the shape of the relationship between the variables. The regression line is the line drawn through a scatter diagram that “best fits” the data points and accurately describes the relationship between the two variables. The equation and its line may be linear or curvilinear.

• Estimate the dependent variable (Y) from the independent variable (X), based on the relationship described by the regression equation.

Page 8: Correlation & Regression - Relationship Analysis

PPT 18-8

Correlation Analysis

• Statistical techniques for measuring the closeness of the relationship between variables.

• It measures the degree to which changes in one variable are associated with changes in another.

• It can only indicate the degree of association or covariance between variables. Covariance is a measure of the extent to which two variables are related.
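
The two measures named on this slide can be sketched in a few lines of plain Python. This is only an illustration: the advertising and sales figures are hypothetical, and dividing by n - 1 (the usual sample convention) is an assumption, since the slides give no formula.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance of two equally long sequences (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: the covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical data: advertising spend (X) and sales (Y).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

cov_xy = covariance(advertising, sales)
r = correlation(advertising, sales)
print(f"covariance = {cov_xy:.2f}, correlation = {r:.4f}")
```

Covariance depends on the units of X and Y, while the correlation is unit-free, which is why the correlation is easier to compare across studies.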

Page 9: Correlation & Regression - Relationship Analysis

PPT 18-9

Correlation Analysis - continued

• Regression and correlation analysis may be either simple or multiple. Simple analysis uses only two variables, one dependent and one independent. Multiple analysis deals with three or more variables, one dependent and two or more independent.

Page 10: Correlation & Regression - Relationship Analysis

PPT 18-10

Regression Equation and Line

Researchers estimate the regression line using the following equation:

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

$\beta_0$ = the Y intercept when X equals zero

$\beta_1$ = the slope of the regression line, which is the increase or decrease in Y for each change of one unit of X

$X_i$ = a given value of the independent variable

$i$ = the observation number

$\varepsilon_i$ = the error term associated with the ith observation

Page 11: Correlation & Regression - Relationship Analysis

PPT 18-11

Regression Equation and Line - continued

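The model involves parameters that are unknown ($\beta_0$ and $\beta_1$) but can be estimated from sample data. The error term, $\varepsilon_i$, referred to as “epsilon,” is also unobservable, but can be estimated from sample data through the residuals (the differences between the observed and the fitted Y values).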

Page 12: Correlation & Regression - Relationship Analysis

PPT 18-12

The Lack Of Precision Can Be Due To

• Complexity of most marketing and other business problems

• The functional form of the relationship between the dependent and independent variables may differ from the one selected

• Measurement of the variables may be imperfect

• Data are typically available only at an aggregate level

• Data are based on human behavior, so the error term in the model may account for a “random” component in behavior

Page 13: Correlation & Regression - Relationship Analysis

PPT 18-13

Least-Squares Method

• A statistical technique that fits a straight line to a scatter diagram by minimizing the sum of the squared vertical distances of all the points from the straight line.

• The equation derived by this method will yield a regression line that best fits the data.

Page 14: Correlation & Regression - Relationship Analysis

• Regression coefficients are the values that represent the effect of the individual independent variables on the dependent variable.
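
A minimal sketch of the least-squares calculation for one independent variable follows. It uses the standard closed-form result (slope = sum of cross-deviations divided by sum of squared X deviations); the function name and the advertising and sales figures are hypothetical and not taken from the slides.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the means
    return intercept, slope


# Hypothetical data: advertising spend (X) and sales (Y).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

b0, b1 = least_squares(advertising, sales)
predicted_sales = b0 + b1 * 6  # estimate Y for a new X value of 6
print(f"Y = {b0:.2f} + {b1:.2f} X; predicted Y at X = 6 is {predicted_sales:.2f}")
```

The slope is the regression coefficient for the single independent variable: here each extra unit of advertising is associated with 0.6 more units of sales.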

Page 15: Correlation & Regression - Relationship Analysis

PPT 18-14

Standard Deviation of Regression

The standard deviation of the Y values from the regression line (Yc). This is also called the standard error of estimate, since it can be used to measure the error of the estimates of individual Y values based on the regression line.
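
As a rough sketch, the standard error of estimate can be computed from the residuals around the fitted line. Dividing by n - 2 is the usual convention for simple regression and is an assumption here, because the slides do not show the formula; the data are hypothetical.

```python
from math import sqrt


def standard_error_of_estimate(xs, ys):
    """Standard deviation of the Y values around the fitted line (Yc)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals: observed Y minus fitted Yc.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sse / (n - 2))  # n - 2: two parameters were estimated


# Hypothetical data: advertising spend (X) and sales (Y).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

se = standard_error_of_estimate(advertising, sales)
print(f"standard error of estimate = {se:.4f}")
```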

Page 16: Correlation & Regression - Relationship Analysis

PPT 18-15

Total Deviation

Total deviation = Unexplained deviation + Explained deviation

The terms “explained” and “unexplained” are used here to indicate whether or not a portion of the total deviation is reduced by the introduction of the X values in computing Yc values. When these deviations are squared individually and then summed, they estimate the explained and unexplained variation of Y.
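
The decomposition can be checked numerically. The sketch below fits a least-squares line to hypothetical data, then squares and sums the three kinds of deviation; the variable names are illustrative only.

```python
# Hypothetical data: advertising spend (X) and sales (Y).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * x for x in xs]  # the Yc values

# Each deviation is squared individually, then the squares are summed.
total_variation = sum((y - mean_y) ** 2 for y in ys)
explained_variation = sum((yc - mean_y) ** 2 for yc in fitted)
unexplained_variation = sum((y - yc) ** 2 for y, yc in zip(ys, fitted))

print(f"total = {total_variation:.2f}")
print(f"explained = {explained_variation:.2f}, unexplained = {unexplained_variation:.2f}")
```

For a least-squares line the explained and unexplained variation add up to the total variation, apart from floating-point rounding.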

Page 17: Correlation & Regression - Relationship Analysis

PPT 18-16

Coefficient of Determination ($r^2$)

• The strength of association or degree of closeness of the relationship between two variables measured by a relative value. It demonstrates how well the regression line fits the scattered points.

• It indicates the amount of variation in the dependent variable that is explained by the variation in the independent variable and vice versa.

• It is defined as the ratio of the explained variation to the total variation.

Page 18: Correlation & Regression - Relationship Analysis

PPT 18-17

Coefficient of Determination ($r^2$) - continued

• When $r^2$ is close to 1, the Y values are very close to the regression line. When $r^2$ is close to 0, the Y values are not close to the regression line.

• $r^2$ is never negative (it ranges from 0 to 1), so it cannot tell whether the relationship between the two variables is positive or negative.

Page 19: Correlation & Regression - Relationship Analysis

PPT 18-18

Correlation Coefficient

• The correlation coefficient (r), the square root of $r^2$, is frequently computed to indicate the direction of the relationship in addition to indicating the degree of the relationship.

• It is the correlation between the observed and predicted values of the dependent variable.

Page 20: Correlation & Regression - Relationship Analysis

• Since the range of $r^2$ is from 0 to 1, the coefficient of correlation r will vary within the range from -1 to +1.

• A + sign of r means a positive correlation, and a - sign means a negative correlation. The sign of r is the same as the sign of b (the slope) in the regression equation.
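
The link between the coefficient of determination, r, and the slope b can be illustrated with a short sketch. It computes the coefficient of determination as the ratio of explained to total variation and then attaches the sign of the slope to its square root. Both data sets are hypothetical; the second is built to slope downward.

```python
from math import copysign, sqrt


def slope_and_r(xs, ys):
    """Return (slope b, r squared, r) for a simple regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)    # explained variation / total variation
    r = copysign(sqrt(r_squared), slope)  # r takes the sign of the slope
    return slope, r_squared, r


# Hypothetical data: one rising and one falling relationship.
x_values = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]
falling = [5, 4, 5, 4, 2]

b_up, r2_up, r_up = slope_and_r(x_values, rising)
b_down, r2_down, r_down = slope_and_r(x_values, falling)
print(f"rising:  b = {b_up:+.2f}, r^2 = {r2_up:.2f}, r = {r_up:+.4f}")
print(f"falling: b = {b_down:+.2f}, r^2 = {r2_down:.2f}, r = {r_down:+.4f}")
```

Both data sets give the same coefficient of determination, so only r (or the slope) reveals that one relationship is positive and the other negative.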

Page 21: Correlation & Regression - Relationship Analysis

PPT 18-19

Calculating Regressions Using Computers

• To run the calculations using SPSS
– Click on “Statistics”
– Then click on “Regression” and “Linear”
– These commands designate the statistical test to be run

• To run calculations using Excel
– Click on “Tools” and “Data Analysis”
– Then click on “Regression.”

Page 22: Correlation & Regression - Relationship Analysis

PPT 18-20

Multiple Regression Analysis

• This test will determine the association or relationship between dependent and independent variables.

• In multiple regression analysis, more than two variables are included in the examination. While the dependent variable is still represented by Y, the independent variables are represented by X1, X2, X3, and so on.

Page 23: Correlation & Regression - Relationship Analysis

• Since with multiple regression we are dealing with more than one independent variable, we refer to the association between the dependent and independent variables as the coefficient of multiple determination, denoted by $R^2$.
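
For readers without SPSS or Excel at hand, a multiple regression with two independent variables can be sketched in plain Python. The approach below solves the least-squares normal equations by Gaussian elimination, which is one common way to do it and not necessarily what those packages do internally. The helper names and the advertising, price, and sales figures are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def multiple_regression(columns, ys):
    """Fit Y = b0 + b1*X1 + b2*X2 + ... and return (coefficients, R squared)."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]  # leading 1 = intercept
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(rows, ys)) for i in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, row)) for row in rows]
    mean_y = sum(ys) / len(ys)
    unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    total = sum((y - mean_y) ** 2 for y in ys)
    return coefs, 1 - unexplained / total


# Hypothetical data: sales (Y) explained by advertising (X1) and price (X2).
advertising = [1, 2, 3, 4, 5, 6]
price = [9, 8, 8, 6, 7, 5]
sales = [3, 5, 6, 9, 8, 12]

coefficients, r_squared = multiple_regression([advertising, price], sales)
print("b0, b1, b2 =", [round(c, 3) for c in coefficients])
print(f"coefficient of multiple determination = {r_squared:.3f}")
```

Adding an independent variable never lowers the coefficient of multiple determination, so a higher value alone does not show that the extra variable is useful.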

Page 24: Correlation & Regression - Relationship Analysis

PPT 18-21

Calculating Multiple Regression Using Computers

• To perform the computations using SPSS for Windows
– Click on “Statistics”
– Then click on “Regression” and “Linear”
– These commands designate the statistical test to be run

• To run the calculations using Excel
– Click on “Tools” and “Data Analysis”
– Then click on “Regression.”

Page 25: Correlation & Regression - Relationship Analysis

PPT 18-22

Forecasting Using Time Series Analysis

• Numerical variables that are calculated, measured, or observed sequentially on a regular chronological basis are called time series.

• An organization’s time series is the result of interactions of many changing forces.

• The forces can be business, economic, political, and social influences as well as the forces of nature.

Page 26: Correlation & Regression - Relationship Analysis

PPT 18-23

Time Series Patterns Or Components

• Secular trends - direction of a time series movement over a long period of time usually represented by a straight line or a smooth curve.

• Seasonal variation - repeating periodic movement of a time series

Page 27: Correlation & Regression - Relationship Analysis

• Cyclical fluctuations or “business cycles” - expansions (ups) and contractions (downs) of business activities around the normal value

• Irregular movements - erratic movements, including all types of time series movements other than secular, seasonal, or cyclical

Page 28: Correlation & Regression - Relationship Analysis

PPT 18-24

Two Popular Forecasting Techniques

• Trend Analysis - Used when historical data is plotted or extrapolated to project some outcome in the future.

• Exponential Smoothing - Type of weighted average forecasting technique that assigns heavier weights to recent data and lighter weights to less recent data. When forecasting, the more recent data are more likely to be better predictors of the near future than are earlier periods.
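
Both techniques can be sketched briefly. The trend forecast fits a least-squares line against the period number and extrapolates it; the smoothing forecast uses simple exponential smoothing, where each new forecast is alpha times the latest actual value plus (1 - alpha) times the previous forecast. Seeding the first forecast with the first observation, the value of alpha, and the sales figures are all assumptions made for illustration.

```python
def linear_trend_forecast(series, periods_ahead=1):
    """Fit a straight-line trend against period numbers 1..n and extrapolate."""
    n = len(series)
    times = range(1, n + 1)
    mean_t = (n + 1) / 2
    mean_y = sum(series) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, series))
             / sum((t - mean_t) ** 2 for t in times))
    intercept = mean_y - slope * mean_t
    return intercept + slope * (n + periods_ahead)


def exponential_smoothing(series, alpha):
    """Return forecasts for periods 1..n+1 (first forecast = first observation)."""
    forecasts = [series[0]]
    for actual in series:
        # Recent data gets weight alpha; older history shares the remainder.
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical quarterly sales figures.
quarterly_sales = [100, 110, 105, 120]

trend_forecast = linear_trend_forecast(quarterly_sales)
smoothed = exponential_smoothing(quarterly_sales, alpha=0.5)
print(f"trend forecast for period 5: {trend_forecast:.1f}")
print(f"smoothed forecast for period 5: {smoothed[-1]:.1f}")
```

A larger alpha makes the forecast react faster to recent changes, while a smaller alpha gives a smoother series that responds more slowly.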

Page 29: Correlation & Regression - Relationship Analysis

PPT 18-25

Multivariate Statistical Analysis

Any simultaneous analysis of more than two variables.

Many times, multivariate techniques are a means of performing in one analysis what used to take multiple analyses using univariate techniques (analysis of single-variable distributions).

Common multivariate techniques: multiple discriminant analysis, multidimensional scaling, factor analysis, cluster analysis and conjoint analysis.

Page 30: Correlation & Regression - Relationship Analysis

PPT 18-26

Multiple Discriminant Analysis (MDA)

• Appropriate tool for testing the hypothesis that the group means of a set of independent variables for two or more groups are equal.

Used if the dependent variable is categorical (either dichotomous or multichotomous) and the independent variables are either interval or ratio data.

When two classifications are being examined, it is referred to as a two-group discriminant analysis. When three or more classifications are identified, then multiple discriminant analysis is used.

Page 31: Correlation & Regression - Relationship Analysis

PPT 18-27

Multiple Discriminant Analysis (MDA) - continued

The intent of this technique is two-fold:

(1) to understand group differences

(2) to predict the likelihood that a case will belong to a particular group, based on several independent variables.

The linear combination of the independent variables is known as the discriminant function.

An important function of discriminant analysis is to create a classification matrix, which shows the number of correctly and incorrectly classified cases.
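
A classification matrix is simply a cross-tabulation of actual against predicted group membership. The sketch below builds one from two label lists; it does not perform the discriminant analysis itself, and the "buyer" and "non-buyer" labels and predictions are hypothetical stand-ins for the output of a fitted discriminant function.

```python
from collections import Counter


def classification_matrix(actual, predicted):
    """Count cases for every (actual group, predicted group) pair."""
    return Counter(zip(actual, predicted))


def hit_ratio(actual, predicted):
    """Share of cases whose predicted group matches the actual group."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical group memberships and predictions for eight respondents.
actual = ["buyer", "buyer", "buyer", "non-buyer",
          "non-buyer", "non-buyer", "buyer", "non-buyer"]
predicted = ["buyer", "buyer", "non-buyer", "non-buyer",
             "non-buyer", "buyer", "buyer", "non-buyer"]

matrix = classification_matrix(actual, predicted)
groups = sorted(set(actual))
for a in groups:
    row = ", ".join(f"predicted {p}: {matrix[(a, p)]}" for p in groups)
    print(f"actual {a} -> {row}")
print(f"hit ratio = {hit_ratio(actual, predicted):.2f}")
```

The diagonal cells (actual equals predicted) hold the correctly classified cases, and the off-diagonal cells hold the misclassified ones.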

Page 32: Correlation & Regression - Relationship Analysis

PPT 18-28

Factor Analysis

• Groups attributes that are alike.

• Used to examine interrelationships among many variables and to explain these variables in terms of their common underlying and unobservable dimensions (called “factors”).

• Factor analysis can be used to reduce the information contained in several original variables into a smaller, more manageable, set of variables while losing as little information as possible.

• Data must be gathered from interval scales.

Page 33: Correlation & Regression - Relationship Analysis

PPT 18-29

Cluster Analysis

• Grouping data into “clusters” such that elements in the same group are similar to each other, and elements in different groups are as different as possible.

• Partitions a sample into homogeneous classes.

• Used to identify market segments (groups of consumers with relatively similar needs).

• Seeks to identify constructs that underlie objects.

• Interval scales must have been used during data gathering.

• Creates the groups itself rather than requiring previous knowledge of the group membership for each item included.
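
The slides do not name a clustering algorithm. As one possible illustration, the sketch below uses k-means, a common partitioning method that alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. Seeding the centroids with the first k points, the choice of k = 2, and the two interval-scaled ratings per consumer are all assumptions.

```python
from math import dist


def k_means(points, k, max_iterations=100):
    """Cluster points with k-means; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(coord) / len(members)
                                     for coord in zip(*members))
    return labels, centroids


# Hypothetical survey data: two interval-scaled ratings per consumer.
consumers = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 8.5), (8.5, 9)]

labels, centroids = k_means(consumers, k=2)
print("segment labels:", labels)
print("segment centroids:", centroids)
```

No group labels are supplied in advance; the segments emerge from the distances between consumers, which is what distinguishes clustering from discriminant analysis.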

Page 34: Correlation & Regression - Relationship Analysis

PPT 18-30

Multidimensional Scaling

• Also referred to as perceptual mapping.

• Used to identify important dimensions underlying respondents’ evaluations of test objects.

• Converts consumer judgments of similarity or preference into distances represented in multidimensional space.

• Multidimensional scaling techniques are used to identify important dimensions underlying customer evaluations of products, services, or companies.

Page 35: Correlation & Regression - Relationship Analysis

PPT 18-31

Conjoint Analysis

Provides information about the relative importance respondents place on individual attributes when choosing from multiple brands.

Built on the assumption that consumers make complex decisions based not on one factor at a time but on several factors “jointly” (thus the term “conjoint”).
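
One common way to turn conjoint results into relative importance is to compare the range of part-worth utilities within each attribute: an attribute's importance is its range divided by the sum of all ranges. The sketch below assumes the part-worths have already been estimated; the attributes, levels, and utility values are hypothetical.

```python
def relative_importance(part_worths):
    """Map each attribute to its share of the total part-worth range."""
    ranges = {
        attribute: max(levels.values()) - min(levels.values())
        for attribute, levels in part_worths.items()
    }
    total_range = sum(ranges.values())
    return {attribute: r / total_range for attribute, r in ranges.items()}


# Hypothetical part-worth utilities, assumed to come from a prior estimation.
part_worths = {
    "brand": {"A": 0.6, "B": 0.1, "C": -0.7},
    "price": {"low": 1.0, "high": -1.0},
    "packaging": {"box": 0.35, "bag": -0.35},
}

importance = relative_importance(part_worths)
for attribute, share in importance.items():
    print(f"{attribute}: {share:.1%}")
```

In this made-up example price accounts for half of the total range, so it would be read as the attribute respondents weigh most heavily when choosing among brands.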

Page 36: Correlation & Regression - Relationship Analysis

PPT 18-32

Net Impact

• The Internet
– Will not help researchers with statistical analyses.
– Will lend qualitative support for the research findings obtained from the quantitative analyses.
– Can inform researchers about advancements made in statistical analyses through published manuscripts, discussion groups, and chat groups.
– Researchers also use electronic mail extensively to share their research findings.

Page 37: Correlation & Regression - Relationship Analysis

PPT 18-33

Decision Time!

If correlation analysis is a popular and informative statistical method, why should researchers bother using the somewhat intimidating multivariate statistical techniques? Do you feel there is really much to gain from these methods?

http://www.swlearning.com/marketing/shao/powerpoint/CH18_7.ppt#5