MULTIVARIATE ANALYSIS
SPSS OPERATION AND
APPLICATION
STUDENT NAME: DENIZ YILMAZ
STUDENT NUMBER: M0987107
CONTENTS
1. INTRODUCTION
2. RELIABILITY ANALYSIS
3. CORRELATIONS
4. COMPARE MEANS
5. GENERAL LINEAR MODEL
6. FACTOR ANALYSIS
7. REGRESSION ANALYSIS
1. INTRODUCTION
In this study the data set consists of 70 soccer players described by 12 variables: name, nationality, income, marital status, weight, height, performance, goals, age, red cards, yellow cards, and disabilities.
SPSS allows for a great deal of flexibility in the data format.
It provides the user with a comprehensive set of procedures for data transformation and file manipulation.
It offers the researcher a large number of statistical analysis procedures commonly used in the social sciences.
2. RELIABILITY ANALYSIS
Reliability is the correlation of an item, scale, or instrument with a hypothetical one that truly measures what it is supposed to. Since the true instrument is not available, reliability is estimated in one of four ways:
Internal consistency: Cronbach's alpha
Split-half reliability: the Spearman-Brown coefficient
Test-retest reliability: the Spearman-Brown coefficient
Inter-rater reliability: intraclass correlation, of which there are six types
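As a cross-check on SPSS's Reliability Analysis output, the following is a minimal Python sketch of Cronbach's alpha computed from its definition. The DataFrame `items` and its values are hypothetical, made up purely for illustration.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Made-up responses from five respondents on three scale items.
items = pd.DataFrame({"item1": [4, 3, 5, 2, 4],
                      "item2": [4, 2, 5, 3, 4],
                      "item3": [5, 3, 4, 2, 5]})
print(round(cronbach_alpha(items), 3))
```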
3. CORRELATIONS
Correlations and Nonparametric Correlations
There are two types of correlations: bivariate and partial. A bivariate correlation is a correlation between two variables. A partial correlation looks at the relationship between two variables while controlling for the effect of one or more additional variables. Pearson's product-moment correlation coefficient and Spearman's rho are examples of bivariate correlation coefficients.
Pearson's Correlation Coefficient
Pearson correlation requires only that the data are interval for it to be an accurate measure of the linear relationship between two variables.
Partial Correlation
Partial correlation is used to examine the relationship between two variables while controlling for the effects of one or more additional variables. The sign of the coefficient tells us the direction of the relationship, and the size of the coefficient describes the strength of the relationship.
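For readers who want to reproduce these coefficients outside SPSS, here is a small Python sketch using SciPy. The `age`, `income`, and `goals` arrays are simulated stand-ins for the study variables, and the partial correlation is computed from regression residuals.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
age = rng.uniform(20, 40, 70)                 # hypothetical ages of 70 players
income = 2 * age + rng.normal(0, 5, 70)       # income positively related to age
goals = 60 - age + rng.normal(0, 5, 70)       # goals negatively related to age

# Bivariate correlations: Pearson's r and Spearman's rho.
r, p = stats.pearsonr(income, goals)
rho, p_rho = stats.spearmanr(income, goals)

# Partial correlation of income and goals controlling for age, computed as the
# Pearson correlation between the two sets of regression residuals.
def residuals(y, x):
    slope, intercept = np.polyfit(x, y, 1)
    return y - (intercept + slope * x)

r_partial, p_partial = stats.pearsonr(residuals(income, age), residuals(goals, age))
print(round(r, 3), round(rho, 3), round(r_partial, 3))
```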
CORRELATIONS
Pearson correlation requires only that the data are interval for it to be an accurate measure of the linear relationship between two variables. The figure provides a matrix of the correlation coefficients for the three variables. Income is negatively related to goals. The output also shows that age is positively related to income. Finally, goals appear to be negatively related to age.
4. COMPARE MEANS
One-Sample T-Test
A one-sample t-test is a statistical procedure used to test the difference between a sample mean and a known population mean. In a one-sample t-test, we know the population mean. We draw a random sample from the population, compare the sample mean with the population mean, and make a statistical decision as to whether or not the sample mean is different from the population mean.
In our table the population mean of AGE is 1.87. If the significance value is less than the predetermined significance level, we reject the null hypothesis and conclude that the population mean and the sample mean are statistically different. If the significance value is greater than the predetermined significance level, we retain the null hypothesis and conclude that the population mean and the sample mean are not statistically different.
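A minimal sketch of the same decision rule in Python, assuming a hypothetical `age` sample and the test value 1.87 mentioned above; the numbers are invented for illustration, not the study's data.

```python
import numpy as np
from scipy import stats

# Hypothetical coded AGE values for a small sample of players.
age = np.array([1, 2, 2, 1, 3, 2, 1, 2, 3, 2])

# Test whether the sample mean differs from the hypothesised population mean of 1.87.
t_stat, p_value = stats.ttest_1samp(age, popmean=1.87)

alpha = 0.05
decision = "reject" if p_value < alpha else "fail to reject"
print(f"t = {t_stat:.3f}, p = {p_value:.3f}: {decision} the null hypothesis")
```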
COMPARE MEANS
Independent T-Test
The independent-samples t-test compares the mean scores of two groups on a given variable.
Hypotheses:
Null: The means of the two groups are not significantly different.
Alternative: The means of the two groups are significantly different.
We take a closer look at GOALS: here I want to compare the mean goals of soccer players aged between 30 and 40 with those under 30 years old.
In the independent-samples test output we see Levene's Test for Equality of Variances. This tells us whether we have met our second assumption (the two groups have approximately equal variance on the dependent variable). If Levene's test is significant (the value under "Sig." is less than .05), the two variances are significantly different. If it is not significant (Sig. is greater than .05), the two variances are not significantly different; that is, the two variances are approximately equal. Here, we see that the significance is .985, which is greater than .05, so we can assume that the variances are approximately equal. As the t value of -0.595 with 47 degrees of freedom is not significant (p = 0.555, greater than our 0.05 significance level), we fail to reject the null hypothesis:
t(47) = -0.595; p = 0.555, NS
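The same two-step logic (Levene's test, then the t-test) can be sketched in Python as follows, with made-up goal counts standing in for the two age groups.

```python
import numpy as np
from scipy import stats

# Hypothetical goal counts for two age groups (not the study's data).
goals_under_30 = np.array([112, 98, 105, 120, 101, 95, 118, 107])
goals_30_to_40 = np.array([108, 115, 99, 111, 104, 121, 97, 110])

# Levene's test for equality of variances (the assumption check described above).
lev_stat, lev_p = stats.levene(goals_under_30, goals_30_to_40)
equal_var = lev_p > 0.05

# Independent-samples t-test; use Welch's correction if the variances differ.
t_stat, p_value = stats.ttest_ind(goals_under_30, goals_30_to_40, equal_var=equal_var)
print(f"Levene p = {lev_p:.3f}, t = {t_stat:.3f}, p = {p_value:.3f}")
```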
COMPARE MEANS
Paired T-Test
The paired-sample t-test is a statistical technique used to compare two population means in the case of two samples that are correlated.
Hypotheses:
Null hypothesis: the mean difference between the paired observations is zero.
Alternative hypothesis: the mean difference between the paired observations is not zero.
The level of significance: in the paired-sample t-test, after stating the hypotheses, we choose the level of significance. In most cases the significance level is 5%.
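For completeness, a short Python sketch of a paired-sample t-test at the 5% level, using invented before/after scores for the same players.

```python
import numpy as np
from scipy import stats

# Hypothetical performance scores for the same players before and after training.
before = np.array([6.1, 5.8, 7.0, 6.4, 5.9, 6.8, 7.2, 6.0])
after = np.array([6.5, 6.0, 7.3, 6.2, 6.4, 7.1, 7.5, 6.3])

# Paired-sample t-test at the conventional 5% significance level.
t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, reject H0: {p_value < 0.05}")
```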
COMPARE MEANS
One-Way ANOVA
The one-way ANOVA compares the means of two or more groups based on one independent variable (or factor).
Assumptions: the groups have approximately equal variance on the dependent variable. We can check this by looking at Levene's test.
Hypotheses:
Null: There are no significant differences between the groups' mean scores.
Alternative: There is a significant difference between the groups' mean scores.
In a one-way ANOVA:
First, we look at the descriptive statistics.
Next, we see the results of Levene's Test of Homogeneity of Variance.
Lastly, we see the results of our one-way ANOVA.
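A compact Python sketch of the same sequence (Levene's test, then the one-way ANOVA) with three made-up groups of goal counts.

```python
import numpy as np
from scipy import stats

# Hypothetical goals for three groups of players (not the study's data).
group1 = np.array([95, 102, 110, 98, 105])
group2 = np.array([120, 115, 125, 118, 122])
group3 = np.array([101, 99, 108, 104, 100])

# Levene's test for homogeneity of variance, then the one-way ANOVA.
lev_stat, lev_p = stats.levene(group1, group2, group3)
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"Levene p = {lev_p:.3f}, F = {f_stat:.3f}, p = {p_value:.4f}")
```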
COMPARE MEANS
One-Way ANOVA
Post-hoc comparisons: we can look at the results of the post-hoc comparisons to see exactly which pairs of groups are significantly different. There are three parts in the post-hoc tests: Tukey's test, Scheffé, and LSD results.
Homogeneous subsets (the Tukey range test): gives information similar to the post-hoc tests, but in a different format. The important point is whether Sig. is greater than 0.05 or less than 0.05.
Mean plots: used to see whether the mean varies between different groups of the data.
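As a sketch of post-hoc testing outside SPSS, statsmodels' Tukey HSD routine can be used; the `goals` values and group labels below are invented for illustration.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical goals and group labels for three groups of players.
goals = np.array([95, 102, 110, 98, 105,
                  120, 115, 125, 118, 122,
                  101, 99, 108, 104, 100])
groups = np.array(["A"] * 5 + ["B"] * 5 + ["C"] * 5)

# Tukey's HSD post-hoc comparisons: which pairs of groups differ significantly?
result = pairwise_tukeyhsd(endog=goals, groups=groups, alpha=0.05)
print(result.summary())
```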
5. GENERAL LINEAR MODEL (GLM)
General Linear Model
The general linear model can be seen as an extension of linear multiple regression for a single dependent variable, and understanding the multiple regression model is fundamental to understanding the general linear model. The general purpose of multiple regression (the term was first used by Pearson, 1908) is to quantify the relationship between several independent or predictor variables and a dependent or criterion variable.
The General Linear Model menu includes:
Univariate GLM
Multivariate GLM
Repeated Measures
Variance Components
GENERAL LINEAR MODEL (GLM)
Univariate GLM
Univariate GLM is the general linear model, now often used to implement such long-established statistical procedures as regression and members of the ANOVA family.
The Between-Subjects Factors information table in the figure is an example of GLM output. This table displays any value labels defined for the levels of the between-subjects factors. In this table, we see that GOALS = 1, 2, and 3 correspond to under 100, between 100 and 150, and over 150 goals, respectively.
GENERAL LINEAR MODEL (GLM)
Univariate GLM
Tests of Between-Subjects Effects: the Type III sums of squares in the figure show how the sums of squares and other statistics differ across the effects. The ANOVA table in the figure demonstrates that the PERFORMANCE by GOALS interaction effect is not significant at p = 0.815.
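A rough statsmodels analogue of this univariate GLM, assuming a hypothetical `players` table with a numeric outcome `income` and two categorical factors; Type III sums of squares are requested to mirror SPSS's default.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical balanced data: a numeric outcome and two categorical factors.
players = pd.DataFrame({
    "income": [55, 60, 48, 72, 66, 59, 80, 75, 70, 52, 63, 68],
    "performance": ["low", "low", "low", "high", "high", "high"] * 2,
    "goals_group": ["under100"] * 6 + ["over100"] * 6,
})

# Fit the linear model with main effects and the PERFORMANCE x GOALS interaction;
# sum-to-zero contrasts keep the Type III sums of squares meaningful.
model = smf.ols("income ~ C(performance, Sum) * C(goals_group, Sum)", data=players).fit()
print(anova_lm(model, typ=3))  # analogous to SPSS's Tests of Between-Subjects Effects
```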
GENERAL LINEAR MODEL (GLM)
Multivariate GLM
Multivariate GLM is often used to implement two long-established statistical procedures: MANOVA and MANCOVA.
Tests of Between-Subjects Effects (test of overall model significance): the overall F test appears in the "Corrected Model" row, illustrated below, and answers the question, "Is the model significant for each dependent variable?" There will be an F significance level for each dependent variable. That is, the F test tests the null hypothesis that there is no difference in the means of each dependent variable across the groups formed by categories of the independent variables. For the example below, the multivariate GLM is found to be not significant for all three dependent variables.
GENERAL LINEAR MODEL (GLM)
Multivariate GLM
Between-Subjects SSCP Matrix: contains the sums of squares attributable to the model effects. These values are used in estimates of effect size.
Multivariate Tests (test of individual effects overall): in contrast to the overall F test, these answer the question, "Is each effect significant for at least one of the dependent variables?" That is, where the F test focuses on the dependents, the multivariate tests focus on the independents and their interactions.
Test statistics for the individual effects:
Hotelling's T-square
Wilks' lambda
Pillai-Bartlett trace
Roy's greatest characteristic root (GCR)
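A minimal multivariate GLM sketch with statsmodels, using an invented data frame with three dependent variables and one factor; the printout reports the same family of multivariate statistics (Wilks' lambda, Pillai's trace, Hotelling-Lawley trace, Roy's greatest root).

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Hypothetical data: three dependent variables and one grouping factor.
df = pd.DataFrame({
    "income": [55, 60, 48, 72, 66, 59, 80, 75, 70, 52, 63, 68],
    "goals":  [90, 95, 85, 120, 110, 105, 130, 125, 118, 88, 97, 102],
    "weight": [74, 78, 70, 82, 80, 76, 85, 83, 81, 72, 75, 79],
    "performance": ["low", "low", "low", "mid", "mid", "mid",
                    "high", "high", "high", "low", "mid", "high"],
})

# Multivariate GLM (MANOVA) of the three dependents on the factor "performance".
manova = MANOVA.from_formula("income + goals + weight ~ performance", data=df)
print(manova.mv_test())
```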
GENERAL LINEAR MODEL (GLM)
Multivariate GLM
Box's M tests MANOVA's assumption of homogeneity of covariance matrices using the F distribution. If p(M) < .05, the covariance matrices are significantly different and the assumption is violated. Thus we want M not to be significant, so that we retain the null hypothesis that the covariance matrices are homogeneous. In the SPSS output below, Box's M shows that the assumption of equality of covariances among the set of dependent variables is violated with respect to the groups formed by the categorical independent factor "performance".
Levene's test: SPSS also outputs Levene's test as part of MANOVA. If Levene's test is significant, the data fail the assumption of equal group error variances. In the figure, Levene's test shows that the assumption of homogeneity of error variances among the groups of "performance" is violated for two of the three dependent variables listed.
6. FACTOR ANALYSIS
Factor analysis attempts to identify underlying variables, or factors, that explain the pattern of correlations within a set of observed variables. Factor analysis is often used in data reduction to identify a small number of factors that explain most of the variance observed in a much larger number of manifest variables. Factor analysis requires that you have data in the form of correlations, so all of the assumptions that apply to correlations are relevant.
Types of factor analysis:
Principal component analysis
Common factor analysis
FACTOR ANALYSIS
Correlation Matrix: we can use the correlation matrix to check the pattern of relationships. First scan the significance values and look for any variable for which the majority of values are greater than 0.05.
All we want to see in this figure is that the determinant is not 0. If the determinant is 0, there will be computational problems with the factor analysis. In the figure, the determinant = .835, which is greater than the necessary value of 0.00001, so we can say there is no problem with these data.
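A short Python sketch of this determinant check, using simulated item responses in place of the real variables.

```python
import numpy as np
import pandas as pd

# Hypothetical item responses; `data` stands in for the variables
# entered into the factor analysis.
rng = np.random.default_rng(1)
data = pd.DataFrame(rng.normal(size=(70, 4)), columns=["q1", "q2", "q3", "q4"])

# Correlation matrix (R-matrix) and its determinant; values well above 0.00001
# suggest no serious multicollinearity problem for factor analysis.
R = data.corr()
print(R.round(3))
print("Determinant:", round(np.linalg.det(R.values), 5))
```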
FACTOR ANALYSIS
KMO and Bartlett's Test: two important parts of the output.
a. Kaiser-Meyer-Olkin Measure of Sampling Adequacy: this measure varies between 0 and 1, and values closer to 1 are better. Kaiser recommends accepting values greater than 0.5 (a minimum of 0.6 is also often suggested). For these data (figure) the KMO value is 0.472, below the suggested minimum, which means factor analysis is not appropriate for these data.
b. Bartlett's Test of Sphericity: tests the null hypothesis that the original correlation matrix is an identity matrix. For factor analysis to work we need some relationships between the variables, and if the R-matrix were an identity matrix then all correlation coefficients would be zero. Therefore we want this test to be significant. Here Bartlett's test is not significant (p = 0.061, above the 0.05 level), so again factor analysis is not appropriate.
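Bartlett's test can be reproduced from its textbook chi-square formula; the sketch below uses simulated data and is only an illustration of the computation SPSS performs.

```python
import numpy as np
import pandas as pd
from scipy import stats

def bartlett_sphericity(data: pd.DataFrame):
    """Chi-square test of H0: the correlation matrix is an identity matrix."""
    n, p = data.shape
    det_r = np.linalg.det(data.corr().values)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(det_r)
    df = p * (p - 1) / 2
    return chi2, df, stats.chi2.sf(chi2, df)

# Simulated stand-in data for the items entered into the factor analysis.
rng = np.random.default_rng(2)
data = pd.DataFrame(rng.normal(size=(70, 4)), columns=["q1", "q2", "q3", "q4"])
chi2, df, p_value = bartlett_sphericity(data)
print(f"chi2({df:.0f}) = {chi2:.3f}, p = {p_value:.3f}")  # significant => proceed with factor analysis
```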
FACTOR ANALYSIS
Total Variance Explained: the figure lists the eigenvalues associated with each linear component (factor) before extraction, after extraction, and after rotation. Before extraction, SPSS has identified 4 linear components within the data set. The eigenvalue associated with each factor represents the variance explained by that particular linear component, and SPSS also displays the eigenvalue in terms of the percentage of variance explained (so factor 1 explains 35.360% of the total variance). It should be clear that the first two factors explain a relatively large amount of variance (especially factor 1), whereas subsequent factors explain only small amounts of variance.
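The eigenvalue and percent-of-variance figures in this table can be recomputed from the correlation matrix, as in this small sketch with simulated data.

```python
import numpy as np
import pandas as pd

# Hypothetical data; in practice these would be the observed items.
rng = np.random.default_rng(3)
data = pd.DataFrame(rng.normal(size=(70, 4)), columns=["q1", "q2", "q3", "q4"])

# Eigenvalues of the correlation matrix correspond to the "Total Variance Explained"
# table: each eigenvalue is the variance captured by one linear component.
eigenvalues = np.linalg.eigvalsh(data.corr().values)[::-1]   # sort descending
pct_variance = 100 * eigenvalues / eigenvalues.sum()
for i, (ev, pct) in enumerate(zip(eigenvalues, pct_variance), start=1):
    print(f"Component {i}: eigenvalue = {ev:.3f}, % of variance = {pct:.2f}")
```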
FACTOR ANALYSIS
Communalities & Component Matrix: the figure shows the table of communalities before and after extraction. Principal component analysis works on the initial assumption that all variance is common; therefore, before extraction the communalities are all 1. The communalities in the column labelled Extraction reflect the common variance in the data structure. So, we can say that 69.7% of the variance associated with question 1 is common. This output also shows the component matrix before rotation. This matrix contains the loadings of each variable onto each factor. At this stage SPSS has extracted two factors.
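A sketch of how the loadings and the extraction communalities relate to each other, computed from the correlation matrix of simulated data with two components retained; the data and column names are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
data = pd.DataFrame(rng.normal(size=(70, 4)), columns=["q1", "q2", "q3", "q4"])

# Principal-component loadings: eigenvectors of the correlation matrix scaled
# by the square roots of their eigenvalues. Keep the first two components.
eigenvalues, eigenvectors = np.linalg.eigh(data.corr().values)
order = np.argsort(eigenvalues)[::-1][:2]
loadings = eigenvectors[:, order] * np.sqrt(eigenvalues[order])

# A variable's communality is the sum of its squared loadings on the extracted
# components (the "Extraction" column of the communalities table).
communalities = (loadings ** 2).sum(axis=1)
print(pd.DataFrame(loadings, index=data.columns, columns=["PC1", "PC2"]).round(3))
print(pd.Series(communalities, index=data.columns, name="communality").round(3))
```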
7. REGRESSION ANALYSIS
Regression analysis includes any techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables.
In its simplest form, regression analysis involves finding the best straight-line relationship to explain how the variation in an outcome (or dependent) variable, Y, depends on the variation in a predictor (or independent, or explanatory) variable, X. Once the relationship has been estimated we are able to use the equation:
Y = β0 + β1X
The basic technique for determining the coefficients β0 and β1 is Ordinary Least Squares (OLS): values for β0 and β1 are chosen so as to minimize the sum of the squared residuals (SSR). The SSR may be written as:
SSR = Σ (Yi − β0 − β1Xi)²
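A minimal OLS sketch with SciPy's `linregress`, using simulated X and Y values; it also evaluates the SSR that the fitted coefficients minimize.

```python
import numpy as np
from scipy import stats

# Hypothetical predictor and outcome; the OLS line minimizes
# SSR = sum((Y - b0 - b1*X)^2).
rng = np.random.default_rng(5)
x = rng.uniform(20, 40, 70)                 # e.g. age
y = 5 + 1.5 * x + rng.normal(0, 3, 70)      # e.g. an outcome linearly related to age

result = stats.linregress(x, y)             # simple OLS fit of Y on X
print(f"b0 = {result.intercept:.3f}, b1 = {result.slope:.3f}")

ssr = np.sum((y - (result.intercept + result.slope * x)) ** 2)
print(f"Sum of squared residuals: {ssr:.2f}")
```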
REGRESSION ANALYSIS
Model Summary: from the model summary table we can see how well the model fits the data. This table displays R, R squared, adjusted R squared, and the standard error. R is the correlation between the observed and predicted values of the dependent variable; its values range from -1 to 1. R squared is the proportion of variation in the dependent variable explained by the regression model; its values range from 0 to 1. Adjusted R squared corrects for the fact that, when one has a large number of independent variables, R squared can become artificially high simply because more predictors have been added.
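The adjustment can be written out explicitly: adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the sample size and p the number of predictors. The sketch below computes both quantities for a simulated simple regression.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.uniform(20, 40, 70)
y = 5 + 1.5 * x + rng.normal(0, 3, 70)

result = stats.linregress(x, y)
n, p = len(y), 1                               # sample size and number of predictors
r_squared = result.rvalue ** 2                 # proportion of variance explained
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
print(f"R = {result.rvalue:.3f}, R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}")
```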
REGRESSION ANALYSIS
ANOVA: besides R squared, we can use the ANOVA (analysis of variance) table to check how well the model fits the data.
The F statistic is the regression mean square (MSR) divided by the residual mean square (MSE). If the significance value of the F statistic is small (smaller than 0.05), the independent variables do a good job of explaining the variation in the dependent variable. If the significance value of F is larger than 0.05, the independent variables do not explain the variation in the dependent variable, and the null hypothesis that all the population regression coefficients are 0 cannot be rejected.
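A sketch of the F statistic assembled from MSR and MSE for a simulated simple regression, matching the ratio reported in the regression ANOVA table.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.uniform(20, 40, 70)
y = 5 + 1.5 * x + rng.normal(0, 3, 70)

fit = stats.linregress(x, y)
y_hat = fit.intercept + fit.slope * x
n, p = len(y), 1                                # observations and predictors

# F = MSR / MSE, the ratio reported in the regression ANOVA table.
msr = np.sum((y_hat - y.mean()) ** 2) / p       # regression mean square
mse = np.sum((y - y_hat) ** 2) / (n - p - 1)    # residual mean square
f_stat = msr / mse
p_value = stats.f.sf(f_stat, p, n - p - 1)
print(f"F({p}, {n - p - 1}) = {f_stat:.2f}, p = {p_value:.4g}")
```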
REGRESSION ANALYSIS
The Collinearity Diagnostics table in SPSS is an alternative method of assessing whether there is too much multicollinearity in the model. High eigenvalues indicate dimensions (factors) that account for a lot of the variance in the crossproduct matrix. Eigenvalues close to 0 indicate dimensions that explain little variance. Multiple eigenvalues close to 0 indicate an ill-conditioned crossproduct matrix, meaning there may be a problem with multicollinearity.
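SPSS's table is not reproduced directly here, but a rough Python analogue computes variance inflation factors and eigenvalue-based condition indices for simulated predictors; the near-collinear `x2` is deliberate, and the names and data are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical predictors, with x2 deliberately close to a multiple of x1.
rng = np.random.default_rng(8)
x1 = rng.normal(size=70)
x2 = 2 * x1 + rng.normal(scale=0.1, size=70)      # nearly collinear with x1
x3 = rng.normal(size=70)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# Variance inflation factors: values well above 10 usually signal multicollinearity.
for i, name in enumerate(X.columns):
    print(name, round(variance_inflation_factor(X.values, i), 2))

# A rough analogue of the condition indices: the square root of the largest
# eigenvalue of X'X divided by each eigenvalue; small eigenvalues produce large
# indices and indicate an ill-conditioned crossproduct matrix.
eigvals = np.linalg.eigvalsh(X.values.T @ X.values)
print(np.sqrt(eigvals.max() / eigvals).round(2))
```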