Page 1: Factor

KURSUS PENINGKATAN PROFESIONALISME INSTITUT PERGURUAN GAYA

24 – 26 JULAI 2006

Factor Analysis in Educational Research

Dr. Soon Seng Thah

Instructional Objectives: After studying this module, you will be able to:

1. Run factor analysis using SPSS;
2. Understand conceptual underpinnings of the factor analytic approach in educational research; and
3. Interpret SPSS factor analysis output and make inferences from the output generated.


Factor Analysis in Educational Research

Soon Seng Thah, Ph.D

1. Introduction

Due to the complexity of human interactions in education, educational researchers tend to use numerous variables in their studies. When this happens, the reporting of research findings becomes cluttered with information that may not necessarily yield the desired results. There is therefore a need to "summarise" the variables and portray what is necessary without substantial loss of meaning. This is possible through factor analysis. Basically, factor analysis can be used to examine the underlying patterns or relationships among a large number of variables and to determine whether the information can be condensed or summarised in a smaller set of factors or components with minimal loss of information. By providing an empirical estimate of the structure of the variables, factor analysis becomes an objective basis for creating summated scales. A good understanding of factor analysis is imperative for educational researchers who intend to venture into the more complex realm of variable relationships.

2. What is factor analysis?

Factor analysis is an inter-dependence technique: the variables cannot be classified as either dependent or independent; instead, all the variables are analysed simultaneously in an effort to find an underlying structure to the entire set of variables or subjects. Factor analysis involves defining sets of highly inter-related variables known as factors. Basically, there are two types of factor analysis – exploratory, as used in the SPSS Base module, and confirmatory, as in SPSS Amos. Confirmatory factor analysis is closely associated with structural equation modeling (SEM). Assume that you have designed a questionnaire with 100 items outlining separate characteristics of effective management in schools. With so many items, it would be difficult to evaluate the effect of these variables separately as they are too specific.
You may want some general evaluative dimensions rather than just those specific items. However, to ascertain these general evaluative dimensions, you must still obtain information about the 100 specific items. Items which correlate highly are assumed to belong to a broader dimension. These dimensions become composites of the specific variables, which in turn allow the dimensions to be interpreted and described. Thus, a factor analysis might identify dimensions such as motivation, climate, structure and interactions as the broader evaluative dimensions prevalent among the respondents. Each of these dimensions contains specific items that together constitute a broader evaluative dimension. School administrators could then use these dimensions or factors to define what constitutes leadership effectiveness and work towards a plan of action which will lead to the attainment of the desired objectives.


3. How do you go about running factor analysis using SPSS?

Because of the advanced statistical concepts involved, it is very tedious to perform factor analysis computations manually. However, a number of statistical software packages can do this chore systematically – one of which is SPSS. The following is a step-by-step approach to running factor analysis using SPSS.

Step 1: From the menus, select Analyze → Data Reduction → Factor (see Figure 1).

Figure 1: Factor analysis menu

Step 2: The factor analysis dialog box appears (see Figure 2):

Figure 2: Factor analysis dialog box


Select the variables for factor analysis by transferring the variables to the “Variables” box (click ► (right arrow) to transfer the variables). Step 3: Select Descriptives from the tab at the bottom of the factor analysis dialog box and the Descriptives dialog box appears (see Figure 3).

Figure 3: Descriptives dialog box

Tick the following: under Statistics – Univariate descriptives and Initial solution; under Correlation Matrix – Coefficients, Significance levels, Anti-image, and KMO and Bartlett's test of sphericity. Click Continue.

Step 4: Select Extraction from the tab at the bottom of the factor analysis dialog box and the Factor Analysis: Extraction dialog box appears (see Figure 4).

Figure 4: Extraction dialog box

Tick the following: Under Method – Principal components, under Analyze – Correlation matrix, under Display – Unrotated factor solution and Scree plot, under


Extract – Eigenvalues over: 1, and accept the default value of 25 for Maximum Iterations for Convergence. Click Continue. Step 5: Select Rotation from the tab at the bottom of the factor analysis dialog box and the Factor Analysis: Rotation dialog box appears (see Figure 5).

Figure 5: Rotation dialog box

Tick the following: Under Method – Varimax, under Display – Rotated solution and Loading plot(s). Accept the default value of 25 for Maximum Iterations for Convergence. Click Continue. Step 6: Select Options from the tab at the bottom of the factor analysis dialog box and the Factor Analysis: Options dialog box appears (see Figure 6).

Figure 6: Options dialog box

Select the following: under Missing Values – Exclude cases listwise; under Coefficient Display Format – Sorted by size.


Click Continue.

Step 7: Select OK to run the factor analysis and display the output.

Step 8: Interpret the factor analysis output. When interpreting the factor analysis output generated by SPSS, you need to know and understand a number of fundamental principles associated with this type of analysis. Let's discuss these principles systematically while doing away with the technical jargon. Running a factor analysis using SPSS and interpreting its output entails understanding seven principal stages in the factor analysis procedure. These stages are as follows:

8.1 Stage 1: Objectives of factor analysis

a. Knowing the type of factor analysis You must consider which type of factor analysis you need. Basically, as far as exploratory factor analysis is concerned, there are two types: i. R factor analysis; and ii. Q factor analysis. R factor analysis is the most common; it analyses a set of variables to identify the dimensions that are latent (not easily observed). Q factor analysis, on the other hand, combines or condenses a large number of people (cases) into distinctly different groups within a larger population. Due to computational difficulties, Q factor analysis is not frequently used; instead, cluster analysis is used to group individual respondents. With the two types of exploratory factor analysis in mind, you then select either variables, as in R factor analysis, or cases (respondents), as in Q factor analysis. Our concern in this write-up is R factor analysis.

b. Data summarisation versus data reduction Factor analysis yields two outcomes – data summarisation and data reduction. When summarising data, factor analysis describes the data in a smaller number of concepts than the original individual variables. The fundamental concept involved in data summarisation is the definition of structure. Using this structure, you can view the set of variables at various levels of generalisation, from the most detailed level of individual variables to the more generalised level where individual variables are grouped and viewed not for what they represent individually but for what they represent collectively in expressing a concept. Data reduction extends this process by deriving an empirical value, called a factor score, for each dimension or factor and then substituting this value for the original values.
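Data reduction via factor scores can be illustrated with a small sketch. The items, responses and weights below are hypothetical; in practice SPSS computes the factor score coefficients for you:

```python
from statistics import mean, pstdev

def factor_score(items, weights):
    """Hypothetical factor score: a weighted sum of standardized item values.

    items   -- dict of item name -> list of raw responses (one per respondent)
    weights -- dict of item name -> factor score coefficient
    Returns one score per respondent.
    """
    n = len(next(iter(items.values())))
    # Standardize each item to mean 0, SD 1 (z-scores).
    z = {}
    for name, values in items.items():
        m, sd = mean(values), pstdev(values)
        z[name] = [(v - m) / sd for v in values]
    # Each respondent's factor score is the weighted sum of their z-scores.
    return [sum(weights[name] * z[name][i] for name in items) for i in range(n)]

# Hypothetical 5-point Likert responses on three items of one dimension.
items = {"b1": [5, 4, 2, 1, 3], "b2": [4, 5, 2, 2, 3], "b3": [5, 5, 1, 2, 3]}
weights = {"b1": 0.40, "b2": 0.35, "b3": 0.45}
scores = factor_score(items, weights)
```

The derived scores replace the original item values in subsequent analyses, which is exactly the "substitution" described above.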

c. Variable selection Variables for factor analysis must be selected with care. You must specify variables with the potential dimensions in mind, since dimensions can be identified only through the character and nature of the variables submitted to the factor analysis. For example, if you would like to "extract" a factor on leadership styles in educational


administration, make sure that items pertaining to this aspect are included; otherwise factor analysis will not be able to identify this dimension. The phenomenon of "garbage in, garbage out" is very real in factor analysis! If you indiscriminately include a large number of variables and hope that factor analysis will "figure it out", the possibility of poor results is high. The quality and meaning of the derived factors reflect the conceptual underpinnings of the variables included in the analysis. As the researcher, you must know the elements and their relationships in order to make optimal use of factor analysis in educational research.

8.2 Stage 2: Designing a factor analysis

When designing a factor analysis, the researcher has to bear in mind three basic elements: a. correlations among variables or respondents; b. variable selection and measurement; and c. sample size.

a. Correlations among variables or respondents Factor analysis uses a correlation matrix which shows relationships among the variables as the basic data input. An example of this correlation is reflected in Table 1: Correlations among variables. Variables b1 – b5 shown as columns correlate with variables b1 – b5 shown as rows in a correlation matrix. This table shows the existence of strong correlations among the variables with values greater than .7 – this reflects one of the ideal situations for running factor analysis.

Table 1: Correlations among variables
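The correlation matrix that factor analysis takes as its basic data input can be computed directly; a minimal pure-Python sketch (the variable names and data below are made up for illustration):

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two metric variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # assumes neither variable is constant

def correlation_matrix(data):
    """data: dict of variable name -> list of observations (equal lengths)."""
    names = list(data)
    return {(a, b): pearson(data[a], data[b]) for a in names for b in names}

# Illustrative data for two of the b1-b5 variables mentioned above.
data = {"b1": [1, 2, 3, 4, 5], "b2": [2, 2, 4, 4, 6]}
R = correlation_matrix(data)
```

The matrix is symmetric with 1.0 on the diagonal, which is exactly the form shown in Table 1.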

b. Variable selection and measurement When running a factor analysis, you have to consider two issues: i. the type of variables to be used; and ii. the number of variables to be included.

i. Type of variables The primary requirement is that a correlation can be calculated among all variables. Thus, metric variables – those measured at the interval or ratio level – are suitable. Non-metric variables are problematic because the correlation measures used for metric variables do not apply to them. Although it is technically possible to define them as dummy variables (coded 0 and 1), it is advisable to avoid them; note that SPSS does not perform this type of dummy-variable computation in factor analysis. Hair, Black, Babin, Anderson, & Tatham (2006) say that "…the researcher should be sure to include several variables (five or more) that may represent each


proposed factor" (p. 112). As the strength of factor analysis lies in finding patterns among groups of variables, it is of little use in identifying factors composed of only a single variable. When designing a study based on factor analysis, you should identify several key variables (also referred to as key indicants or marker variables) that closely reflect the hypothesised underlying factors.

c. Sample size According to Hair et al. (2006), the researcher should generally not run a factor analysis on a sample of fewer than 50 observations; preferably, the sample size should be 100 or larger. As a general rule, the minimum is at least five times as many observations as the number of variables to be analysed, and a more acceptable ratio is 10:1. Some writers even propose 20 cases per variable.

8.3 Stage 3: Assumptions in factor analysis

When using factor analysis, a number of assumptions must be taken into account; these can be divided into: a. conceptual; and b. statistical.

a. Conceptual The conceptual assumptions relate to the set of variables and the sample selected. The set of variables selected must possess some underlying structure. You should ensure that the observed patterns are conceptually valid and appropriate for factor analysis, because the technique has no means of determining appropriateness other than the correlations among variables. For example, mixing dependent and independent variables in a single factor analysis and then using the derived factors to support a dependence relationship is inappropriate. You must also ensure that the sample is homogeneous with respect to the underlying factor structure. For example, it would be inappropriate to apply factor analysis to a combined sample of male and female respondents for a set of items known to differ by gender. When the two sub-samples are combined, the resulting correlations and factor structure will be a poor representation of the unique structure of each group. In this case, separate factor analyses should be performed and the results compared.

b. Statistical From the statistical point of view, normality in distribution is necessary only if a statistical test is applied to the significance of the factors but these tests are seldom used. In factor analysis, some degree of multi-collinearity may, in fact, be desirable because the objective is to identify inter-related sets of variables. Multi-collinearity refers to the extent to which a variable can be explained by the other variables in the analysis. You have to ensure that the variables are sufficiently inter-correlated to produce representative factors. The degree of inter-correlatedness can be ascertained through viewing the overall measures of inter-correlation.


You must also ensure that the data matrix has sufficient correlations. To do this, inspect the data matrix visually; if it does not reveal a substantial number of correlations greater than .30, then factor analysis is probably inappropriate. The correlations among variables can also be analysed by computing the partial correlations among variables. A partial correlation is the correlation that remains after the effects of the other variables are taken into account. If "true" factors exist in the data, the partial correlations should be small, because each variable can be explained by the variables loading on the factors. If partial correlations are high (above .7), there are no underlying factors and factor analysis is inappropriate. Partial correlations can be obtained from the anti-image correlation matrix under Descriptives in the SPSS factor analysis procedure. Thus, large partial or anti-image correlations indicate a data matrix that is perhaps not suited for factor analysis. Table 2 shows the anti-image correlations derived from an SPSS analysis – note that the anti-image correlations are small, indicating suitability for factor analysis.

Table 2: Anti-image correlations among variables b1–b5
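The ".30 rule of thumb" above can be checked mechanically. A rough screen, not a formal test; the matrix values below are illustrative:

```python
def enough_correlations(R, names, threshold=0.30, min_prop=0.5):
    """Return the proportion of off-diagonal correlations whose absolute
    value exceeds the threshold, and whether that proportion is substantial."""
    offdiag = [abs(R[(a, b)]) for a in names for b in names if a != b]
    prop = sum(1 for r in offdiag if r > threshold) / len(offdiag)
    return prop, prop >= min_prop

# Illustrative symmetric correlation matrix for three variables.
names = ["b1", "b2", "b3"]
R = {("b1", "b2"): .72, ("b2", "b1"): .72,
     ("b1", "b3"): .65, ("b3", "b1"): .65,
     ("b2", "b3"): .15, ("b3", "b2"): .15}
prop, ok = enough_correlations(R, names)
```

Here four of the six off-diagonal entries exceed .30, so the screen passes; with mostly weak correlations it would flag the matrix as probably unsuitable.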

Another method of determining the appropriateness of factor analysis is the Bartlett test of sphericity – a statistical test for the presence of correlations among the variables (see Figure 7). This test provides the statistical significance that the correlation matrix has significant correlations among at least some of the variables. It must be pointed out that increasing the sample size makes the Bartlett test more sensitive in detecting correlations among variables. This procedure can be found under Descriptives in the SPSS factor analysis procedure. Figure 7 shows the Bartlett's test of sphericity with a significance level of .000, which indicates that the variables are appropriate for factor analysis.

Figure 7: Bartlett's test of sphericity

A third measure (also available in SPSS) is the measure of sampling adequacy (MSA). MSA quantifies the degree of inter-correlation among the variables and the appropriateness of factor analysis. The MSA index ranges from 0 to 1, reaching 1 when each variable is perfectly predicted without error by the other variables. MSA can be interpreted with the following guidelines:


.80 or above – meritorious
.70 or above – middling
.60 or above – mediocre
.50 or above – miserable
Below .50 – unacceptable

In Figure 8, the MSA value is very high (.948), indicating that factor analysis is appropriate.
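The MSA (KMO) statistic compares squared correlations with squared partial (anti-image) correlations over all variable pairs: KMO = Σr² / (Σr² + Σp²). A sketch with illustrative values (SPSS reports the actual statistic):

```python
def kmo(corrs, partials):
    """Overall measure of sampling adequacy from the off-diagonal
    correlations and the corresponding partial correlations."""
    sr = sum(r * r for r in corrs)
    sp = sum(p * p for p in partials)
    return sr / (sr + sp)

def msa_label(value):
    """Descriptive labels for MSA values (Hair et al., 2006)."""
    if value >= .80: return "meritorious"
    if value >= .70: return "middling"
    if value >= .60: return "mediocre"
    if value >= .50: return "miserable"
    return "unacceptable"

corrs = [.72, .65, .70]      # illustrative bivariate correlations
partials = [.10, .12, .08]   # illustrative anti-image (partial) correlations
value = kmo(corrs, partials)
```

Large correlations and small partial correlations drive the ratio toward 1, which is why small anti-image values signal suitability.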

Figure 8: Measures of Sampling Adequacy (MSA)

Note that MSA increases as: i. the sample size increases; ii. the average correlations increase; iii. the number of variables increases; or iv. the number of factors decreases. You must always have an overall MSA value of .50 or above before proceeding with factor analysis.

8.4 Stage 4: Deriving factors and assessing overall fit

After you have computed a correlation matrix for the variables selected for factor analysis, you can then identify the underlying structure of relationships. There are two things you will have to consider: a. select the factor extraction method; and b. select the criteria for the number of factors to extract.

a. Selecting the factor extraction method When selecting the factor extraction method, you need to understand the partitioning of the variance of a variable. The variance is a value (the square of the standard deviation) that represents the total amount of dispersion of values for a single variable about its mean. When a variable is correlated with another variable, we say that it shares variance with that variable, and the amount of sharing between just two variables is simply the squared correlation. For example, if two variables have a correlation of .50, then each variable shares 25% (i.e. .50²) of its variance with the other variable. In factor analysis, we need to know how much of a variable's variance is shared with the other variables in a factor. The total variance of any variable can be divided (partitioned) into three types of variance:

i. Common variance: Common variance is defined as that variance in a variable that is shared with all other variables in the analysis. This variance is accounted for (shared) based on a variable's correlation with all other variables in the analysis. A variable's communality is the estimate of its shared, or common, variance among the variables as represented by the derived factors.

ii. Specific variance (also called unique variance): This is the variance associated with only a specific variable. This variance cannot be explained by the correlations to the other variables but is still associated uniquely with a single variable.

iii. Error variance: This is the variance that cannot be explained by correlations with other variables but is due to unreliability in the data-gathering process, measurement error or a random component in the measured phenomenon in question.
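The arithmetic behind shared variance and communality is straightforward: the variance two variables share is the squared correlation, and a variable's communality is the sum of its squared loadings across the extracted factors. A sketch (the loadings are illustrative):

```python
def shared_variance(r):
    """Proportion of variance two variables share: the squared correlation."""
    return r * r

def communality(loadings):
    """Common variance of one variable: the sum of its squared loadings
    across all extracted factors."""
    return sum(l * l for l in loadings)

shared = shared_variance(0.50)   # two variables correlated at .50 share 25%
h2 = communality([0.80, 0.30])   # illustrative loadings on two factors
```

So a variable loading .80 and .30 on two factors has a communality of .73: 73% of its variance is common, and the remaining 27% is specific plus error variance.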

When a variable is more highly correlated with one or more variables, the common variance (communality) increases. On the other hand, if unreliable measures or other sources of extraneous error variance are introduced, then the amount of possible common variance and the ability to relate the variable to any other variable are reduced. Two methods of factor extraction are prevalent in factor analysis, viz. i. common factor analysis; and ii. component analysis (known as principal component analysis in SPSS). When choosing between them, you need to know what "variance", as described above, means. The selection of one method over the other is based on two criteria: i. the objectives of the factor analysis; and ii. the amount of prior knowledge about the variance in the variables. Component analysis is used when the objective is to summarise most of the original information (variance) in a minimum number of factors for prediction purposes. Common factor analysis, on the other hand, is used primarily to identify underlying factors or dimensions that reflect what the variables share in common. The most direct comparison between the two methods is in their use of explained and unexplained variance. Principal component analysis considers the total variance and derives factors that contain small proportions of unique variance and, in some instances, error variance. In this analysis, unities (values of 1.0) are inserted in the diagonal of the correlation matrix, so that the full variance is brought into the factor matrix. Common factor analysis, by contrast, considers only the common or shared variance, assuming that both the unique and error variance are not of interest in defining the structure of the variables. Thus, common factor analysis excludes a portion of the variance included in a principal component analysis; because of this, it is more restrictive and suffers from factor indeterminacy.
Because of this, most statistical analysis software (including SPSS) uses the component analysis method as the default – bear in mind that you can change this default if necessary. Component analysis is most appropriate in the following situations:

i. data reduction is the primary concern, focusing on the minimum number of factors needed to account for the maximum portion of the total variance represented by the original set of variables; and

ii. prior knowledge suggests that specific and error variance represent a relatively small proportion of the total variance.

After running your factor analysis, you can then extract the initial unrotated factors. By examining the unrotated factors, you can have a rough idea of the potential for


data reduction and obtain a preliminary estimate of the number of factors to extract. Final determination of the number of factors can only be made after the results of the rotated factors are interpreted. Table 3 shows the unrotated factors derived from a factor analysis. Note that the components or factors in an unrotated factor matrix are not easily distinguished beyond the first component or factor – that is precisely why you need a rotated factor solution.

Table 3: Unrotated factors of the Component Matrix

b. Criteria for the number of factors to extract How do we decide on the number of factors to extract? To answer this question, you must look at your conceptual basis and also the empirical evidence. You begin with some predetermined criteria, such as the general number of factors plus some general thresholds of practical relevance. A peek at the current literature shows that there is no exact quantitative basis for deciding the number of factors to extract. However, the following criteria for the number of factors to extract are recommended:

i. Latent root criterion: This is the most commonly used technique. With component analysis, each variable contributes a value of 1 to the total eigenvalue (eigen means characteristic). Thus, only factors having latent roots or eigenvalues greater than 1 are considered significant and all factors with latent roots less than 1 are considered insignificant and


disregarded (see Figure 9). Using an eigenvalue of 1 as the cut-off point is most reliable when the number of variables is between 20 and 50.

ii. A priori criterion: When using this, you should already know how many factors to extract before running factor analysis. You can instruct SPSS to stop the analysis when the desired number of factors has been extracted. Thus, this approach is good when you want to test a hypothesis about the number of factors to be extracted, and it can be used to replicate another researcher's work by extracting the same number of factors.

iii. Percentage of variance criterion: This criterion is based on achieving a specified cumulative percentage of total variance extracted by successive factors. The purpose is to ensure practical significance for the derived factors by ensuring that they explain at least a specified amount of variance. There is no absolute threshold for all applications. In the social sciences, including educational research, where information is less precise than in the natural sciences, it is quite common to consider a solution that accounts for 60% of the total variance as satisfactory; in some instances, even less. Table 4 shows how this approach is undertaken.

Table 4: Percentage of variance criterion

In Table 4, a total variance of 60.18% is achieved at Component 4 (see the Cumulative % column). This means that the first 4 factors explain 60.18% of the total variance – Factor 1 constituting 36.93% of the total variance, Factor 2 – 10.30%, Factor 3 – 7.57% and Factor 4 – 5.38%.


iv. Scree test criterion: The scree test is used to identify the optimum number of factors that can be extracted before the amount of unique variance begins to dominate the common variance structure. The scree test is derived by plotting the latent roots against the number of factors in their order of extraction, and the shape of the resulting curve is used to evaluate the cut-off point (see Figure 9). In Figure 9, the curve slopes downward and becomes approximately horizontal from Factor 12 onwards. The point at which the curve first begins to "straighten out" is considered to indicate the maximum number of factors to extract; in this case, the first 11 factors qualify.

Figure 9: Test Criteria for Latent Root and Scree

In practice, you seldom use only a single criterion in determining how many factors to extract. When writing a research report using factor analysis, you may describe the selection criteria in general and then focus on the best criterion. If too few factors are retained, the correct structure is not revealed and important dimensions may be omitted; if too many factors are retained, interpretation becomes more difficult when the results are rotated. You should compare and contrast solutions to arrive at the best representation of the data.
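The latent root and percentage-of-variance criteria are simple enough to sketch. The percentages below are taken from Table 4 in the text; the eigenvalues are illustrative:

```python
def latent_root_count(eigenvalues, cutoff=1.0):
    """Latent root criterion: keep factors whose eigenvalue exceeds 1."""
    return sum(1 for e in eigenvalues if e > cutoff)

def factors_for_variance(pct_variance, target=60.0):
    """Percentage-of-variance criterion: smallest number of factors whose
    cumulative explained variance reaches the target percentage."""
    total = 0.0
    for k, pct in enumerate(pct_variance, start=1):
        total += pct
        if total >= target:
            return k, total
    return len(pct_variance), total

eigenvalues = [7.39, 2.06, 1.51, 1.08, 0.91, 0.84]  # illustrative
pcts = [36.93, 10.30, 7.57, 5.38, 4.90]             # from Table 4
k, cum = factors_for_variance(pcts)
```

Both criteria agree here: four eigenvalues exceed 1, and the first four factors cumulate to 60.18% of the total variance, matching the Table 4 discussion.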


8.5 Stage 5: Interpreting the factors When interpreting the factors from your SPSS output, you need a strong conceptual basis in the study you have undertaken and factor analysed, such as its theoretical paradigms and accepted principles. Interpreting factors is circular in nature, i.e. it involves making initial evaluations, viewing the factors and then refining the results – and may take several iterations to obtain the final solution. Basically, two steps are involved when interpreting the factors, viz. a. viewing the unrotated factor matrix; and b. interpreting the rotated factor matrix.

a. Viewing the unrotated factor matrix

An unrotated factor solution achieves your objective of data reduction, but you may not be able to interpret the variables adequately. Therefore, the solution lies in a rotated factor solution – a method to achieve simpler and theoretically more meaningful factor solutions, thereby improving the interpretation of the factors by reducing some of the ambiguities resulting from unrotated factor solutions (see Table 3 above for the unrotated factors of the component matrix).

b. Interpreting the rotated factor matrix After taking a cursory look at the unrotated solution, you then focus on the rotated solution. You look at the rotated factor loadings for each variable in order to determine the variable's role and contribution to the factor structure. When evaluating this, you may want to respecify your factor model due to: i. deletion of a variable(s) from the analysis; ii. specification of a different rotation method; iii. the need to extract a different number of factors; or iv. the desire to change from one extraction method to another.

i. Rotation of factors When running a factor analysis and interpreting its output, you need to understand factor rotation. What is factor rotation? Factor rotation is the process of adjusting the factor axes to achieve a simpler and pragmatically more meaningful factor solution. To put it simply, you rotate to "bring out" the factor loadings prominently so that you can view the factors easily. An unrotated solution extracts factors in the order of the variance they extract. The first factor tends to be a general factor on which almost every variable loads significantly, and it accounts for the largest amount of variance (see Table 4 above). The second and subsequent factors are then based on the residual variance, each accounting for successively smaller portions. The ultimate effect of rotating the factor matrix is to redistribute the variance from earlier factors to later ones and achieve a simpler and theoretically more meaningful pattern. Basically, there are two types of rotation: i. orthogonal factor rotation (involving right angles); and ii. oblique factor rotation. Orthogonal factor rotations are used more widely because popular software packages such as SPSS offer them. Hair et al. (2006) are of the opinion that the oblique rotation method is still subject to some controversy. Figure 10 below shows how an orthogonal rotation works.


Figure 10: Orthogonal factor rotation

In Figure 10, five variables are depicted in a two-dimensional factor diagram illustrating factor rotation. The vertical axis represents Unrotated Factor II and the horizontal axis represents Unrotated Factor I. The axes are labeled with 0 at the origin and extend outward to +1.0 or -1.0; the numbers on the axes represent the factor loadings. The five variables are labeled V1, V2, V3, V4, and V5. The factor loading for variable 2 (V2) on Unrotated Factor II is determined by drawing a dashed line horizontally from the data point to the vertical axis for Factor II. Similarly, a vertical line is drawn from V2 to the horizontal axis of Unrotated Factor I to determine its loading on Factor I. A similar procedure is followed for the remaining variables, which determines the factor loadings for the unrotated and rotated solutions, as displayed in Table 5 for comparison purposes. On the unrotated first factor, variables 1 and 2 are very high in the positive direction. Variable 5 is moderately high in the negative direction, and variables 3 and 4 have considerably lower loadings in the negative direction.


Extracted from: Hair, J.F., Black, W.C., Babin, B.J., Anderson, R.E., & Tatham, R.L. (2006). Multivariate data analysis. 6th ed. Upper Saddle River, NJ: Pearson Prentice Hall. p. 124


From a visual inspection of Figure 10, two clusters of variables are obvious – variables 1 and 2 go together, as do variables 3, 4, and 5. However, such patterning of variables is not so obvious from the unrotated factor loadings. By rotating the original axes clockwise, as shown in Figure 10 (see the thick dotted axes), you obtain a completely different factor loading pattern. Note that when rotating the factors, the axes are maintained at a right angle (90°). This shows that the factors are mathematically independent and that the rotation has been orthogonal. After rotating the factor axes, variables 3, 4, and 5 load high on Factor I and variables 1 and 2 load high on Factor II. Thus, the clustering of these variables into two groups is more obvious after the rotation, even though the relative position of the variables remains unchanged.
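Orthogonal rotation is just a rigid rotation of the factor axes, so rotating the loading matrix by any angle leaves each variable's communality unchanged. A two-factor sketch (the 40° angle is arbitrary, chosen only for illustration):

```python
from math import cos, sin, radians

def rotate_loadings(loadings, angle_deg):
    """Orthogonally rotate a two-factor loading matrix by the given angle."""
    t = radians(angle_deg)
    return [(l1 * cos(t) + l2 * sin(t), -l1 * sin(t) + l2 * cos(t))
            for l1, l2 in loadings]

def communality(row):
    """Sum of squared loadings for one variable."""
    return sum(l * l for l in row)

# Unrotated loadings for three of the variables in Table 5.
loadings = [(0.50, 0.80), (0.60, 0.70), (0.90, -0.25)]
rotated = rotate_loadings(loadings, 40)  # arbitrary rotation angle
```

The individual loadings change completely, yet each row's communality is preserved: rotation only redistributes variance among the factors, exactly as described above.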

Table 5: Comparison between rotated and unrotated factor loadings

Variables    Unrotated Factor Loadings    Rotated Factor Loadings
                 I        II                  I        II
V1              .50      .80                 .03      .94
V2              .60      .70                 .16      .90
V3              .90     -.25                 .95      .24
V4              .80     -.30                 .84      .15
V5              .60     -.50                 .76     -.13
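Table 5's point can be verified mechanically: after rotation, each variable's primary loading stands much further above its secondary loading. A sketch using the table's values:

```python
def primary_factor(row):
    """Return the index of the factor with the largest absolute loading,
    plus the gap between the primary and secondary absolute loadings."""
    mags = [abs(l) for l in row]
    return mags.index(max(mags)), max(mags) - min(mags)

# Loadings from Table 5 (factor I, factor II) for V1..V5.
unrotated = [(.50, .80), (.60, .70), (.90, -.25), (.80, -.30), (.60, -.50)]
rotated   = [(.03, .94), (.16, .90), (.95, .24), (.84, .15), (.76, -.13)]

gaps_before = [primary_factor(r)[1] for r in unrotated]
gaps_after  = [primary_factor(r)[1] for r in rotated]
```

For every variable the primary/secondary gap widens after rotation (e.g. V2 goes from a .10 gap to a .74 gap), which is why the two clusters are so much easier to see in the rotated solution.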

The SPSS statistical package has a number of approaches under the orthogonal factor rotation: a. Quartimax; b. Varimax; and c. Equimax. Basically, orthogonal rotation aims to simplify the rows and columns of the factor matrix to facilitate interpretation. In a factor matrix, columns represent factors with each row representing a variable’s loading across the factors. By simplifying the rows, you are making as many values in each row as close to zero as possible, i.e. maximising a variable’s loading on a single factor. On the other hand, by simplifying the columns, you are making as many values in each column as close to zero as possible, i.e. making the number of high loadings as few as possible.

a. Quartimax rotation
Quartimax rotation rotates the initial factors so that a variable loads high on one factor and as low as possible on all other factors; it therefore centres on simplifying the rows. Quartimax has not proven successful in producing simpler structures and tends to produce a general first factor on which most, if not all, of the variables load. Because of this, quartimax rotation is often at odds with the goals of rotation.

b. Varimax rotation
Varimax rotation centres on simplifying the columns of the factor matrix. With this type of rotation, the maximum possible simplification is reached when a column contains only 1s and 0s. The method maximises the sum of the variances of the squared loadings in each column of the factor matrix. With varimax rotation, some high loadings (i.e. close to +1 or -1) are likely in each column of the matrix, as are some loadings near 0. The logic is that interpretation is easiest when the variable-factor correlations are: i. close to either +1 or -1, indicating a clear positive or negative association between the variable and the factor; or ii. close to 0, indicating a clear lack of association. Varimax rotation thus tends to give a clearer separation of factors; it is the most popular rotation method and is generally considered superior to the others. Use this rotation method when running factor analysis in SPSS.
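SPSS performs varimax rotation internally, but the criterion itself is easy to sketch. The function below is an illustrative NumPy implementation of Kaiser's varimax procedure in its SVD form (the function name and defaults are my own, not part of SPSS); applied to the unrotated loadings of Table 5 it recovers approximately the rotated pattern.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonally rotate a loading matrix by the varimax criterion.

    Illustrative sketch of Kaiser's algorithm: gamma=1.0 gives varimax;
    gamma=0.0 would give quartimax.
    """
    p, k = loadings.shape
    R = np.eye(k)                      # rotation matrix, kept orthogonal
    d = 0.0
    for _ in range(max_iter):
        rotated = loadings @ R
        # Gradient of the varimax criterion with respect to R.
        grad = loadings.T @ (rotated ** 3
                             - (gamma / p) * rotated
                             @ np.diag((rotated ** 2).sum(axis=0)))
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt                     # project back onto orthogonal matrices
        d_new = s.sum()
        if d_new < d * (1 + tol):      # criterion stopped improving
            break
        d = d_new
    return loadings @ R

# Unrotated loadings from Table 5.
L = np.array([[.50, .80], [.60, .70], [.90, -.25], [.80, -.30], [.60, -.50]])
print(np.round(varimax(L), 2))
```

Up to column order and sign, the output shows the same simple structure as Table 5: V1 and V2 dominate one factor, V3 to V5 the other.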

c. Equimax rotation
The equimax approach is a compromise between the quartimax and varimax approaches. Rather than simplifying only the rows or only the columns, it tries to accomplish some of each. This type of rotation has not gained widespread acceptance and is infrequently used.

ii. Judging the significance of factor loadings
When using factor analysis, you must decide which factors are worth paying attention to by examining the factor matrix displaying the factor loadings. Because a factor loading is the correlation between the variable and the factor, the squared loading is the amount of the variable's total variance accounted for by the factor. Thus, a .30 loading translates to 9% (.30 x .30 = .09) of the variance explained, and a .50 loading denotes that 25% (.50 x .50 = .25) of the variance is accounted for by the factor. The loading must exceed .70 before the factor accounts for 50% of the variance of a variable. Therefore, the larger the absolute size of the factor loading, the more important the loading is in interpreting the factor matrix. On this basis, loadings can be assessed as follows (Hair et al., 2006):

i. Factor loadings in the range of ±.30 to ±.40 are considered to meet the minimal level for interpretation of structure;

ii. Loadings of ±.50 or greater are considered practically significant; and

iii. Loadings exceeding ±.70 are considered indicative of well-defined structure and are the goal of any factor analysis.

The above guidelines apply when the sample size is 100 or larger and where the emphasis is on practical rather than statistical significance. Table 6 shows a factor matrix with rotated components. The columns show components/factors while the rows show variables.


Table 6: Rotated Factor Matrix

          Component
           1      2      3      4      5      6      7
b6_3    .848   .153   .121   .031   .192   .033  -.004
b4      .847   .161   .204   .040   .011   .064  -.020
b5      .837   .166   .194   .051   .052   .074  -.011
b3      .830   .139   .190   .052  -.001   .112  -.026
b7      .821   .167   .244   .070   .128   .054  -.025
b6_2    .814   .212   .110   .064   .184   .002   .024
b9      .805   .175   .264   .100   .097   .059  -.035
b6_1    .804   .169   .103   .082   .180  -.028   .004
b8      .799   .183   .279   .038   .118   .045  -.021
b2      .773   .172   .308   .159  -.082   .083  -.006
b1      .768   .141   .274   .112  -.063   .104   .028
b11     .663   .161   .307   .266  -.001  -.073  -.018
b10     .545   .196   .218   .319  -.086  -.154   .159
d7      .206   .807  -.005  -.020   .118   .103  -.035
d6      .182   .791   .030   .038   .104   .064   .048
d11     .172   .790   .199   .092   .079   .051  -.024
d5      .183   .788   .092   .030   .086   .020   .061
d8      .196   .781   .059  -.032   .125   .146  -.074
d2      .117   .771   .141   .105  -.049   .024  -.037
d10     .178   .769   .205   .028   .075   .050  -.002
d1      .133   .766   .107   .100   .018   .085  -.011
d9      .184   .758   .172   .046   .048  -.048  -.049
d4      .134   .739   .185   .061   .015  -.012   .018
d3      .039   .708   .225   .167  -.076   .073   .026
c3      .224   .155   .806   .162   .055  -.062   .059
c2      .312   .182   .799   .151   .043  -.085   .016
c5      .347   .228   .779   .099   .046   .053   .037
c4      .315   .194   .776   .082   .056   .086  -.055
c1      .334   .194   .760   .142   .061  -.044   .008
c8      .449   .252   .689   .067   .148   .157  -.109
c6      .420   .236   .679   .069   .117   .220  -.113
c7      .471   .208   .653   .053   .153   .152  -.143
e12     .084   .102   .054   .766  -.029  -.105   .007
e11     .165   .075   .031   .745  -.023   .133   .007
e9      .110   .073   .117   .641   .315   .203  -.096
e10     .097   .014   .203   .611   .286   .206  -.034
e8      .159   .104   .177   .490   .388   .294  -.150
e7      .066   .054   .233   .423   .415   .330  -.007
e13     .095   .079   .133   .421   .239   .222   .204
e6      .136   .144   .169   .298   .664   .148  -.104
e5      .239   .155   .051   .182   .627  -.121   .086
e3      .068   .168   .073   .232  -.018   .795   .023
e2      .106   .149  -.007   .199   .119   .740   .090
e1     -.005  -.066  -.093  -.038  -.177  -.020   .790
e4     -.027   .034   .018   .013   .412   .190   .611


Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. Rotation converged in 11 iterations.

iii. Interpreting the factor matrix
For easier interpretation of the factor matrix, it is best to sort the factor loadings; SPSS has an option for this. You can then find the distinctive variables for each factor and look for a conceptual basis on which to assess practical significance. The 5-step procedure below provides a guideline.

Step 1: Examine the factor loading matrix
Look at the rotated factor loadings. The factors are arranged as columns, so each column of numbers represents the loadings of a single factor (see Table 6).

Step 2: Identify the significant loadings for each variable
Interpretation starts with the first variable: move horizontally from left to right across its row, looking for its highest loading on any factor. When the highest loading (largest absolute factor loading) is identified, underline or mark it if it is significant. Attention then moves to the second variable, again scanning from left to right for its highest loading and underlining or marking it. Review all remaining variables using the same technique. Sometimes a variable has more than one significant loading; this is called a cross-loading. Cross-loadings can pose problems because the variable shares variance with more than one factor, so cross-loading variables may be considered for deletion. In Table 7, Factor 1 has high loadings for the circled variables (beginning with b6_3 and ending with b10), while the loadings for Factor 2 begin with variable d7 and end with variable d3.
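The row-wise scan described in Step 2 is easy to automate. The sketch below is illustrative (the function and variable names are my own, not SPSS output): it reports each variable's highest loading and flags cross-loadings, using the ±.50 practical-significance threshold quoted earlier from Hair et al.

```python
import numpy as np

def mark_loadings(loadings, names, threshold=0.50):
    """For each variable, report its highest loading and flag cross-loadings.

    A cross-loading is any variable with more than one absolute loading
    at or above the practical-significance threshold (here .50).
    """
    report = {}
    for name, row in zip(names, np.asarray(loadings, dtype=float)):
        dominant = int(np.abs(row).argmax())           # highest absolute loading
        significant = np.flatnonzero(np.abs(row) >= threshold)
        report[name] = {
            "dominant_factor": dominant + 1,           # 1-based, as in SPSS output
            "dominant_loading": float(row[dominant]),
            "cross_loading": len(significant) > 1,
        }
    return report

# Toy example (made-up values): v3 cross-loads on Factors 1 and 2.
loadings = [[.84, .15, .03],
            [.12, .79, .08],
            [.55, .52, .10]]
report = mark_loadings(loadings, ["v1", "v2", "v3"])
for name, info in report.items():
    print(name, info)
```

In this toy matrix, v1 and v2 each load cleanly on one factor, while v3 exceeds the .50 threshold on two factors and would be a candidate for deletion under Step 4.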


Table 7: Varimax rotated factor loading matrix

Step 3: Assess the communalities of the variables
Once you have identified the significant loadings, look for any variables that are not adequately accounted for by the factor solution by examining each variable's communality, which shows whether the variable meets an acceptable level of explanation. Variables with communalities less than .50 are regarded as not having sufficient explanation. Table 8 shows the communalities extracted from the factor analysis. Note that the communalities for b1 to b11 all exceed .50 and thus show a sufficient level of explanation.
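A communality is simply the sum of a variable's squared loadings across all retained factors. The sketch below uses made-up loadings (not the values behind Table 8) to compute communalities and flag those below the .50 cut-off.

```python
import numpy as np

# Illustrative rotated loadings (rows = variables, columns = factors);
# these values are invented for demonstration, not taken from Table 8.
loadings = np.array([
    [.84, .15, .03],
    [.12, .79, .08],
    [.30, .35, .25],
])
names = ["v1", "v2", "v3"]

# Communality = sum of squared loadings across the retained factors.
communalities = (loadings ** 2).sum(axis=1)

for name, h2 in zip(names, communalities):
    status = "OK" if h2 >= 0.50 else "insufficient explanation (< .50)"
    print(f"{name}: communality = {h2:.3f} -> {status}")
```

Here v1 and v2 clear the .50 level, while v3 (communality about .28) would be evaluated for deletion or model respecification under Step 4.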


Table 8: Communalities of variables

Step 4: Respecify the factor model if needed
After you have identified the significant loadings and communalities, you may discover that: i. a variable has no significant loadings; ii. even with a significant loading, a variable's communality is deemed too low; or iii. a variable has a cross-loading. If this situation arises, you have the following remedies:

i. Ignore the problematic variables and interpret the solution as it is; this is appropriate if the objective is solely data reduction;

ii. Evaluate each of those variables for possible deletion, depending on the variable's overall contribution to the research as well as its communality index. If the variable is of minor importance to the study or has an unacceptable communality value, it may be eliminated and the model respecified;

iii. Use an alternative rotation method;

iv. Decrease the number of factors retained; and

v. Modify the type of factor model used (try others besides principal component analysis).

Step 5: Label the factors
When an acceptable factor solution has been obtained in which all variables have a significant loading on a factor, you can then assign some meaning to the pattern of factor loadings. Variables with higher loadings are considered more important and have greater influence on the name or label selected to represent a factor. The name or label assigned to a factor must accurately reflect the variable loadings on that factor. Note that the signs (negative (-) or positive (+)) are interpreted just as with any other correlation coefficients. On each factor, like signs mean the variables are


positively related and opposite signs mean the variables are negatively related. In factor analysis, the signs of factor loadings relate only to the factor on which they appear, not to other factors in the solution.

8.6 Stage 6: Validation of factor analysis
Once you have run a factor analysis and interpreted its results, you may want to assess how well the results generalise to the population and the potential influence of individual cases or respondents on the overall results. Perhaps the most direct approach to validation is to move to a confirmatory procedure, i.e. confirmatory factor analysis through structural equation modeling. SPSS offers a package called AMOS which does this rather easily; AMOS is bundled with SPSS 14.0.

9. Conclusion
In conclusion, factor analysis serves two basic purposes: first, to explore variable domains in order to identify the factors underlying the variables and, second, to test hypotheses about the relations among variables. While the first purpose is well known and well accepted, the second is less so. From another perspective, however, factor analysis via the confirmatory approach, through the use of structural equation modeling (SEM), offers considerable analytical power. Users should venture into SEM as it addresses some of the weaknesses of the exploratory approach.

Bibliography

Coakes, S.J. & Steed, L.G. (2003). SPSS: Analysis without anguish, version 11.0 for Windows. Milton, Queensland: John Wiley & Sons Australia.

George, D. & Mallery, P. (2003). SPSS for Windows step by step: A simple guide and reference, 11.0 update. 4th ed. Boston: Allyn & Bacon.

Hair, J.F., Black, W.C., Babin, B.J., Anderson, R.E., & Tatham, R.L. (2006). Multivariate data analysis. 6th ed. Upper Saddle River, NJ: Pearson Prentice Hall.

Kerlinger, F.N. & Lee, H.B. (2000). Foundations of behavioral research. 4th ed. Orlando, FL: Harcourt College Publishers.

SPSS Inc. (2003). SPSS brief guide. Chicago, IL: SPSS Inc.

SPSS Inc. (2003). SPSS Base 12.0 user's guide. Chicago, IL: SPSS Inc.

Velicer, W.F. & Jackson, D.N. (1990). Component analysis versus common factor analysis: Some issues in selecting an appropriate procedure. Multivariate Behavioral Research, 25, 1-28.
