Quantitative Tools for Qualitative Data Richard Bell University of Melbourne.

Post on 28-Mar-2015


Quantitative Tools for Qualitative Data

Richard Bell

University of Melbourne

For copies of this presentation-130 slides-(about 500kb in a zipped file)

email: rcb@unimelb.edu.au

What kind of Qualitative Data can be Analysed?

• Not raw continuous text data

• Discrete text units that are replicated

• Any kind of coding that has been made

What does the data have to look like?

• It must be able to be represented by a table

• not necessarily a two-way table

• for example
– Giegler & Klein coding of personal advertisements
– a four-way table: magazine, sex, concept, category

Categorization

Magazine Sex Concept Fitness Compassion Figure Values Erotic

Z F Self 44 99 50 11 101

Z F Seeking 41 12 9 11 85

Z F Relationship 6 0 12 5 3

Z M Self 67 97 67 18 207

Z M Seeking 80 9 11 9 37

Z M Relationship 1 0 3 4 1

WN F Self 8 14 17 18 107

WN F Seeking 19 1 4 38 59

WN F Relationship 20 0 0 3 0

WN M Self 9 7 4 3 42

WN M Seeking 11 2 6 3 19

WN M Relationship 1 0 1 0 0

Giegler & Klein data as a four-way table

Here the data are stored as a table: the first four columns (magazine, sex, concept, category) define the cells, and the last column gives the frequency in each cell.

To analyse these data at a case level, use the SPSS WEIGHT BY command, ie

WEIGHT BY FREQ.
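The cell-plus-frequency layout described above can be sketched outside SPSS as well. The following is a minimal illustration in Python, not part of the original presentation: it takes the first three rows of the Giegler & Klein table and rewrites them as one record per cell, which is the shape WEIGHT BY works on. The names `to_cell_records`, `wide_rows` and `CATEGORIES` are hypothetical.

```python
# Hypothetical helper: turn a wide table into one record per cell, in the
# (magazine, sex, concept, category, freq) layout described on the slide.
CATEGORIES = ["Fitness", "Compassion", "Figure", "Values", "Erotic"]

# First three rows of the four-way table (magazine Z, sex F).
wide_rows = [
    ("Z", "F", "Self", [44, 99, 50, 11, 101]),
    ("Z", "F", "Seeking", [41, 12, 9, 11, 85]),
    ("Z", "F", "Relationship", [6, 0, 12, 5, 3]),
]


def to_cell_records(rows, categories):
    """Return one (magazine, sex, concept, category, freq) tuple per cell."""
    records = []
    for magazine, sex, concept, counts in rows:
        for category, freq in zip(categories, counts):
            records.append((magazine, sex, concept, category, freq))
    return records


cell_records = to_cell_records(wide_rows, CATEGORIES)
# Weighting by freq means each record stands for `freq` codings.
total_codings = sum(freq for *_, freq in cell_records)

for record in cell_records[:3]:
    print(*record)
print("total codings:", total_codings)
```

Each printed record corresponds to one line that could be typed between `begin data` and `end data`.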

Kinds of tables

• Rows are participants, columns are categories

• Rows are categories, columns are participants

• Rows are one set of categories, columns are another set of categories

Data in cells of table

• Indicator of presence/absence of a relationship between rows and columns

• Frequencies or counts of indicators

• Values of categories

CATEGORY (rows) by PARTICIPANT (columns 1 to 8). Each • marks one participant; the column positions are not preserved in this transcript.

Neuropsychological experience
Problems with concentration: • • • • • • •
Takes longer to learn new information/ get something in my mind: • •
Gaps in long term memory/ difficulty accessing memories: • •
Thinking more affected by emotions: • • •
Less capacity to express oneself creatively
Occasional difficulty finding a word or name in conversation: • •
Vague complaint of something not being quite right, not being able to think too well

Category 2: Other experiences
Affects everything: • • •
Ability to do things: • • • • • • • •
Problems with medications: • • • • •
Independence: •
Relationships and social life: • • • • • • • •
Speech and communication: • • • • • • • •
Quality of sleep: • • • •
Work/employment: • • • • • •

Category 3: amplifying factors for neuropsychological experiences (as well as other experiences)
Personality: • •
Stress or feeling under pressure/ emotional reaction: • • • • •
Work/activity schedule: • • • • •
Previous cognitive ability: •
Social interaction where there are several conversations to follow: • •
Noisy environment: •

Category 4: general strategies towards difficulties, which also help neuropsychological experiences
Reviewing things, planning & prioritising, limiting activities: • • • • •
Succeeding with activities, sense of achievement helps/ confronting the problem: • • • • •
Writing things down: • • • • •

Indicator of presence/absence of relationship between rows and columns

Data from Huber (1997)

Site  A     B     C     D     E     F     G     H     I
SIN   0.11  0.07  0.09  0.13  0.15  0.16  0.15  0.12  0.12
SGR   0.07  0.05  0.09  0.13  0.18  0.16  0.2   0.18  0.2
TCO   0.5   0.46  0.43  0.34  0.33  0.28  0.26  0.25  0.2
TIN   0.17  0.21  0.19  0.09  0.13  0.2   0.13  0.16  0.16
TGR   0.07  0.09  0.11  0.14  0.1   0.07  0.13  0.16  0.13
TOP   0.09  0.12  0.09  0.17  0.11  0.13  0.12  0.14  0.18

SIN: Students learn as self-regulated individuals
SGR: Students learn in autonomous groups
TCO: Teacher is in control
TIN: Teacher dominates, but allows some individual autonomy
TGR: Teacher dominates, but allows some small group autonomy
TOP: Teacher dominates, but is open to students' initiatives

Proportions of Activities by Site (Frequencies)

Group      Interaction Intensity  Interaction Frequency  Feeling of Belonging  Physical Proximity  Relationship Formality
Crowd      slight                 slight                 none                  close               formal
Audience   low                    nonrecurring           slight                close               formal
Public     slight                 slight                 slight                distant             no relationship
Mob        high                   nonrecurring           high                  close               informal
Family     high                   frequent               high                  close               informal
Relatives  moderate               infrequent             variable              distant             formal
Community  low                    infrequent             variable              close               formal

Values of categories

Getting data into statistical packages such as SPSS

• Transfer data directly from qualitative packages such as Nvivo

• Use SPSS text-import wizard (best with precoded data, ie numbers)

• Enter data by hand

Transfer data from qualitative packages

• Need to be able to export tables.

• Should only be done for tables where rows (or columns) are units of analysis (ie documents or respondents)

• should be saved as a text file (ie has the extension .txt as in table1.txt)

Transfer data from printed table

• Type into SPSS

• Transfer table from Word document
– Word document to Excel spreadsheet
– Excel spreadsheet to SPSS spreadsheet

The Table in Word

1. Remove the heading row

1. Remove the headings

Move subheadings into a column

Insert a new column into the table

Copy subheading into empty column cells that subheading applied to

Shorten text and insert headings in columns that will become SPSS variable names (ie < 9 characters, no spaces)

Select table and copy to clipboard

Open Excel

Paste from clipboard

Save spreadsheet

Open SPSS

Under the File pull down to Open new Data

Change the file type to Excel files [.xls]

And open the saved Excel spreadsheet

If you have names as column headings in the first row of the Excel spreadsheet, SPSS can read them as its variable names

SPSS opens the file (the variable view)

The data view

Notice there are dud lines in this file; they need to be edited out

The file fixed up

Now we need to change our
a) repeated phrases (variable ‘type’)
b) symbols (variables p1 to p8)
into numbers

Do this through Automatic Recode under the ‘Transform’ tab

Need to create a numeric variable into which the values of the alphanumeric variable are transformed (the alphanumeric values are saved as labels)
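The recoding step can be pictured with a short Python sketch. It is an illustration only, not SPSS's own implementation: it assumes, as SPSS's Automatic Recode does by default, that the distinct text values are numbered consecutively in ascending alphabetical order, and it keeps the original text as labels. The function `auto_recode` and the example column contents are hypothetical.

```python
def auto_recode(values):
    """Map each distinct string to 1..k in sorted order; return (codes, labels)."""
    labels = {}
    for code, text in enumerate(sorted(set(values)), start=1):
        labels[code] = text  # the original text is kept as the value label
    lookup = {text: code for code, text in labels.items()}
    return [lookup[value] for value in values], labels


# Hypothetical column contents: a repeated-phrase variable ('type') and a
# symbol variable (one of p1 to p8) where a bullet means "present".
type_column = ["Other experiences", "Neuropsychological", "Other experiences", "Strategies"]
p1_column = ["•", "", "•", "•"]

type_codes, type_labels = auto_recode(type_column)
# For the symbol columns a plain 0/1 indicator is usually what is wanted.
p1_indicator = [1 if cell.strip() else 0 for cell in p1_column]

print(type_codes, type_labels)
print(p1_indicator)
```

Keeping the labels alongside the codes is what allows later output to show the original phrases rather than bare numbers.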

Transferring Cross-Category tables into SPSS

[where Rows are one set of categories, columns are another set of categories]

• Three types of table:
– Cells of the table contain frequencies
– Cells of the table contain other data
– Cells of the table contain a binary indicator (yes/no, true/false, present/absent etc)

Transferring Frequency Tables: 1

If only two dimensions to table (rows are categories of one variable, columns are categories of another)

– can feed table straight in as table
  • easy but won’t have labelled output

– feed table in cell by cell (as for more complex tables)
  • more complex but allows for labelled output and other possibilities

Feeding table in as table

• Only have cells of table as data

• Can only run one procedure (correspondence analysis) via syntax.

Feeding table in cell by cell

• Have to use syntax (the DATA LIST command), naming one variable for each value on a data line

data list free
 / slice row column frequency.
begin data.
1 1 1 287
1 1 2 143
1 2 1 94
1 2 2 23
end data.

Data list FREE / EMS PMS GENDER MARSTAT FREQ.
Weight by freq.
Begin data.
1 1 1 1 17
1 1 1 2 4
1 1 2 1 28
1 1 2 2 11
1 2 1 1 36
1 2 1 2 4
1 2 2 1 17
1 2 2 2 4
2 1 1 1 54
2 1 1 2 25
2 1 2 1 60
2 1 2 2 42
2 2 1 1 214
2 2 1 2 322
2 2 2 1 68
2 2 2 2 130
end data.
Var labels EMS, 'Extramarital Sex' / PMS, 'Premarital Sex' / GENDER, 'Gender' / MARSTAT, 'Marital Status'.
Value labels EMS, PMS, 1 'Yes' 2 'No' / GENDER, 1 'Women' 2 'Men' /
 MARSTAT, 1 'Divorced' 2 'Still Married'.
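To see what feeding a table in cell by cell buys you, the sketch below (Python, an illustration rather than anything from the presentation) takes the sixteen EMS/PMS/GENDER/MARSTAT records from the syntax and collapses them, weighted by FREQ, into any two-way table you ask for. The helper `crosstab` is a hypothetical name.

```python
from collections import defaultdict

# (EMS, PMS, GENDER, MARSTAT, FREQ) records, copied from the SPSS syntax.
CELLS = [
    (1, 1, 1, 1, 17), (1, 1, 1, 2, 4), (1, 1, 2, 1, 28), (1, 1, 2, 2, 11),
    (1, 2, 1, 1, 36), (1, 2, 1, 2, 4), (1, 2, 2, 1, 17), (1, 2, 2, 2, 4),
    (2, 1, 1, 1, 54), (2, 1, 1, 2, 25), (2, 1, 2, 1, 60), (2, 1, 2, 2, 42),
    (2, 2, 1, 1, 214), (2, 2, 1, 2, 322), (2, 2, 2, 1, 68), (2, 2, 2, 2, 130),
]
YES_NO = {1: "Yes", 2: "No"}


def crosstab(cells, row_pos, col_pos):
    """Sum the frequency (last field) over every cell sharing a row/column code."""
    table = defaultdict(int)
    for cell in cells:
        table[(cell[row_pos], cell[col_pos])] += cell[-1]
    return dict(table)


ems_by_pms = crosstab(CELLS, 0, 1)  # positions 0 and 1 are EMS and PMS
n_cases = sum(ems_by_pms.values())

for (ems, pms), freq in sorted(ems_by_pms.items()):
    print(f"EMS={YES_NO[ems]:<3} PMS={YES_NO[pms]:<3} {freq}")
print("cases:", n_cases)
```

Because every cell carries its own codes, the same records can be re-tabulated by gender or marital status without re-entering anything, which is the "other possibilities" advantage of the cell-by-cell layout.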

‘Traditional’ Quantitative Methods for Qualitative Data

• Miles & Huberman (1994)
– hierarchical cluster analysis

• Giegler & Klein (1994)
– correspondence analysis

• Bazeley (2002)
– cluster analysis
– correspondence analysis

Cluster Analysis

Figure 9.11 (p.203) from Graham Gibbs (2002) Qualitative Data Analysis: Explorations with NVivo, as an SPSS data file

Cluster Analysis: Solution I

Dendrogram using Average Linkage (Between Groups): Chi-square measure

[Dendrogram: Rescaled Distance Cluster Combine, scale 0 to 25. Cases in plotted order, with case number:]
Worklink 10
Youth Training 11
Adult training 1
Redundancy Counselli 6
Start Up Business un 7
Training Access Poin 8
Workers Coops 9
Business Access Sche 3
Careers & Education 4
BCETA 2
Careers Information 5

Cluster Analysis: Solution II

Dendrogram using Average Linkage (Between Groups): Anderberg’s D Measure

[Dendrogram: Rescaled Distance Cluster Combine, scale 0 to 25. Cases in plotted order, with case number:]
Careers & Education 4
Training Access Poin 8
BCETA 2
Start Up Business un 7
Careers Information 5
Adult training 1
Worklink 10
Youth Training 11
Workers Coops 9
Redundancy Counselli 6
Business Access Sche 3

Cluster Analysis: Solution III

Dendrogram using Single Linkage

[Dendrogram: Rescaled Distance Cluster Combine, scale 0 to 25. Cases in plotted order, with case number:]
Careers & Education 4
Training Access Poin 8
Worklink 10
Youth Training 11
Workers Coops 9
Business Access Sche 3
BCETA 2
Start Up Business un 7
Careers Information 5
Adult training 1
Redundancy Counselli 6

Cluster Analysis

• Varies according to the coefficient chosen as the measure of association between rows (or columns)

• Varies according to method of clustering

• Use with extreme caution
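The dependence on the chosen coefficient is easy to demonstrate on a toy example. The Python sketch below is an illustration only: the four binary rows are invented, and simple matching and Jaccard are stand-ins for whichever coefficients a package offers (they are not the chi-square or Anderberg's D measures used in the dendrograms). It shows that the very first pair to be merged changes with the coefficient.

```python
from itertools import combinations

# Invented presence/absence codings for four cases across six categories.
ROWS = {
    "A": [1, 1, 0, 0, 0, 0],
    "B": [1, 0, 0, 0, 0, 0],
    "C": [1, 1, 1, 1, 1, 0],
    "D": [1, 1, 1, 1, 0, 1],
}


def simple_matching(x, y):
    """Share of categories on which two cases agree (joint absences count)."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def jaccard(x, y):
    """Share of jointly present categories among those present in either case."""
    both = sum(a and b for a, b in zip(x, y))
    either = sum(a or b for a, b in zip(x, y))
    return both / either if either else 0.0


def first_merge(rows, similarity):
    """Return the most similar pair, i.e. the first agglomerative merge."""
    pairs = combinations(sorted(rows), 2)
    return max(pairs, key=lambda pair: similarity(rows[pair[0]], rows[pair[1]]))


print("simple matching merges first:", first_merge(ROWS, simple_matching))
print("Jaccard merges first:", first_merge(ROWS, jaccard))
```

Simple matching rewards shared absences, so the two sparsely coded cases join first; Jaccard ignores shared absences, so the two richly coded cases join first. With sparse qualitative codings this choice can reshape the whole tree.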

Other Quantitative Methods

• Find weights for categories of variable that maximize relationships between variables

• correspondence analysis
– finds weights for categories of row and categories of column

• also traditional least-squares procedures
– eg regression, principal components & others

Correspondence Analysis

• Similar to principal components

• Originally derived for tables of frequencies
– [for statistics to apply need one respondent per cell, but can be used with multiple responses across cells]

• but can be used with indicator data

• Can produce separate maps of relationships between categories of rows or columns

• Can produce a joint map of categories of rows or columns

Giegler & Klein

• Examined personal advertisements in a number of German magazines

• eg: Young man, 35 y, 176cm, slim with car, good income, looks for a lovely high-bosomed and well-developed partner for a common future.

Categories of the content analysis.

1. CI cultural interests: Bach, Picasso, Rilke, opera
2. IM intellectual mobility: brain, quick-witted
3. AP academic professions: doctor, lawyer, dentist
4. HEC high economic status: villa, yacht, car
5. FB fitness of body: sport, strong, robust
6. CLC compassion, life crises: poverty, lonely
7. SEX sex: tolerant, hot, leather, lover, horny
8. BA body attributes: attractive, slim, pretty, well developed
9. HIP high image profession: business man, head-clerk
10. IV inner values: honest, sincere, frank, heart
11. SBE social behav. erotic: flirt, candle light, romantic, necking
12. SB social behaviour: friend, life long, emancipated, eager to help
13. FO family orientation: mother, father, children, marriage, family
14. HT holiday, travel: abroad, mountains, nature, south, journeys
15. NAT nationality: African, Turk, foreigner
16. 30Y 17-30 years
17. 45Y 31-45 years
18. 60Y 46-60 years
19. OLD over 60 years
20. PO pleasure orientation: hedonistic, gourmet, wine, good meal
21. SDN smoking, drinking negative: abstinent, non drinker, cigarette free
22. SOC social activities: parties, friends, pub, cinema
23. SIN single
24. DW divorced, widowed

Ad    CI IM AP HEC FB CLC SEX BA HIP IV SBE SB FO HT NAT 30Y 45Y 60Y OLD PO
1001  2  2  1  0   1  3   0   1  0   1  2   0  1  1  0   1   0   0   0   2
1002  2  1  0  0   1  0   0   0  0   1  1   1  0  1  1   0   0   1   0   1

Data:
One row per ad
Each column contains the number of instances for each coding category

ie each ad can contribute several times to the cells of any table, so the total frequency of the table is the number of codings, not the number of ads

Cut-down version of Giegler & Klein example

Category (rows) by Magazine (columns): Z WN WAZ TIP EXP H&W

High SES 234 39 27 14 29 27

Fitness 239 68 58 44 55 44

Compassion 217 24 43 19 35 13

Sex 49 9 52 54 71 127

Figure 152 32 57 49 85 46

Image 43 41 25 30 36 90

Values 58 65 16 14 6 45

Erotic 434 227 303 313 268 374

Friend 303 104 132 182 197 224

Family 515 197 291 282 344 353

Travel 260 149 111 98 90 130

30yo 208 57 135 116 143 283

45yo 132 20 58 52 54 116

60yo 37 10 11 32 9 31

Old 36 108 85 68 97 44

Hedonist 165 124 187 146 156 127

Wowser 70 10 99 134 113 160

Social 14 1 32 45 29 148

Single 56 13 23 9 12 18

Separated 54 15 26 22 16 81

Correspondence Analysis

• In SPSS, Correspondence Analysis is one of the data reduction options (like factor analysis) [can be run as syntax or point-and-click]

• also a syntax-only option called ANACOR which is more limited but can analyse a table directly when the only data in the SPSS spreadsheet is the table frequencies.

ANACOR syntax example: Huberman proportions table shown earlier

data list free / A B C D E F G H J.
begin data.
0.11 0.07 0.09 0.13 0.15 0.16 0.15 0.12 0.12
0.07 0.05 0.09 0.13 0.18 0.16 0.2 0.18 0.2
0.5 0.46 0.43 0.34 0.33 0.28 0.26 0.25 0.2
0.17 0.21 0.19 0.09 0.13 0.2 0.13 0.16 0.16
0.07 0.09 0.11 0.14 0.1 0.07 0.13 0.16 0.13
0.09 0.12 0.09 0.17 0.11 0.13 0.12 0.14 0.18
end data.
do repeat xs = A to J.
compute xs = xs * 100.
end repeat.
ANACOR TABLE = ALL (6,9).

Notes on the syntax:
• "data list free" indicates data values separated by spaces
• "A B C D E F G H J" identifies the columns
• the "do repeat ... end repeat" block changes data values from proportions to percentages
• "ANACOR TABLE = ALL (6,9)" is the simplest ANACOR syntax (just identifies numbers of rows & columns)

Correspondence Analysis
The point-and-click way

                                                         Proportion of Inertia      Confidence Singular Value
Dimension  Singular Value  Inertia  Chi Square  Sig.     Accounted for  Cumulative  Standard Deviation  Correlation 2
1          .299            .089                          .576           .576        .008                .161
2          .198            .039                          .252           .828        .009
3          .141            .020                          .129           .956
4          .062            .004                          .024           .981
5          .055            .003                          .019           1.000
Total                      .155     1948.580    .000(a)  1.000          1.000

a. Five possible dimensions
b. Singular value – square root of eigenvalue
c. Inertia – eigenvalues (variance)
d. Chi-square – could be partitioned between dimensions (only valid if cells in table are independent)

- How many dimensions?
- Fit of Solution
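The relationships in notes b to d can be checked against the printed figures. The Python sketch below is an illustration, not part of the original output: it squares the reported singular values to recover the inertias, turns them into proportions, and uses the standard correspondence-analysis identity that total inertia equals chi-square divided by the table total to back out the approximate number of codings. All inputs are the three-decimal values from the table, so agreement is only to rounding.

```python
# Values as printed in the SPSS summary table (rounded to three decimals).
singular_values = [0.299, 0.198, 0.141, 0.062, 0.055]
reported_inertia = [0.089, 0.039, 0.020, 0.004, 0.003]
chi_square = 1948.580

# Note b/c: inertia (eigenvalue) of a dimension is its singular value squared.
inertia = [s ** 2 for s in singular_values]
total_inertia = sum(inertia)

# "Proportion of Inertia Accounted for" is each inertia over the total.
proportion = [value / total_inertia for value in inertia]
cumulative_two = proportion[0] + proportion[1]

# Total inertia = chi-square / N, so N can be backed out approximately.
implied_n = chi_square / total_inertia

for dim, (calc, shown) in enumerate(zip(inertia, reported_inertia), start=1):
    print(f"dimension {dim}: {calc:.4f} (reported {shown:.3f})")
print(f"total inertia {total_inertia:.3f}, first two dimensions {cumulative_two:.3f}")
print(f"implied table total: about {implied_n:.0f} codings")
```

Read this way, a two-dimensional map keeps a little over four fifths of the association in the table, which bears on the "How many dimensions?" question.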

[Figure: Row and Column Points, Symmetrical Normalization. Joint map, with Dimension 1 on the horizontal axis, of the Content categories (High SES, Fitness, Compassion, Sex, Figure, Image, Values, Erotic, Family, Travel, 30yo, 45yo, 60yo, Old, Hedonist, Wowser, Social, Single, Separated) and the Magazines (Z, WN, WAZ, TIP, H&W).]

Details for Magazines

Columns: Magazine; Mass; Score in Dimension 1, 2; Inertia; Contribution of Point to Inertia of Dimension 1, 2; Contribution of Dimension to Inertia of Point 1, 2, Total

Z .264 -.779 .380 .056 .537 .193 .863 .136 .999
WN .106 -.349 -.939 .030 .043 .473 .130 .625 .755
WAZ .143 .137 -.276 .007 .009 .055 .123 .331 .454
TIP .138 .391 -.166 .011 .071 .019 .559 .067 .625
EXP .149 .213 -.218 .010 .023 .036 .198 .137 .335
H&W .199 .691 .472 .042 .318 .224 .676 .208 .884
Active Total: Mass 1.000; Inertia .155; Contribution of Point to Inertia of Dimension 1.000, 1.000

• Scores in Dimension: location in spatial representation
• Contributions: different ways of describing fit of each magazine

Columns: Content; Mass; Score in Dimension 1, 2; Inertia; Contribution of Point to Inertia of Dimension 1, 2; Contribution of Dimension to Inertia of Point 1, 2, Total

High SES .029 -1.463 .670 .022 .211 .067 .873 .121 .995

Fitness .040 -.939 .125 .011 .119 .003 .985 .011 .996

Compassion .028 -1.407 .626 .019 .185 .055 .859 .113 .971

Sex .029 .830 .438 .007 .066 .028 .801 .147 .949

Figure .034 -.418 .085 .004 .020 .001 .487 .013 .501

Image .021 .470 .012 .004 .016 .000 .360 .000 .360

Values .016 -.456 -.639 .010 .011 .034 .105 .137 .242

Erotic .153 .109 -.172 .002 .006 .023 .262 .437 .699

. .091 .040 .061 .001 .000 .002 .034 .050 .084

Family .158 -.004 -.063 .001 .000 .003 .001 .109 .110

Travel .067 -.367 -.278 .005 .030 .026 .493 .188 .681

30yo .075 .384 .384 .006 .037 .056 .548 .363 .911

45yo .034 .079 .582 .002 .001 .059 .026 .944 .970

60yo .010 .130 .350 .002 .001 .006 .030 .145 .175

Old .035 .181 -1.417 .015 .004 .354 .023 .955 .978

Hedonist .072 .118 -.578 .006 .003 .122 .048 .756 .804

Wowser .047 .814 .160 .012 .103 .006 .767 .020 .786

Social .021 1.482 .970 .020 .157 .102 .721 .204 .926

Single .010 -.676 .275 .002 .016 .004 .742 .081 .823

Separated .017 .379 .717 .004 .008 .044 .192 .455 .647

Active Total 1.000 .155 1.000 1.000

Similar Fit information for ad categorizations

More Complex versions…

• Sometimes known as Multiple Correspondence Analysis

• HOMALS HOMogeneity analysis by Alternating Least Squares

• For example

• The complete data structure of Giegler & Klein

Categorization

Magazine Sex Concept Fitness Compassion Figure Values Erotic

Z F Self 44 99 50 11 101

Z F Seeking 41 12 9 11 85

Z F Relationship 6 0 12 5 3

Z M Self 67 97 67 18 207

Z M Seeking 80 9 11 9 37

Z M Relationship 1 0 3 4 1

WN F Self 8 14 17 18 107

WN F Seeking 19 1 4 38 59

WN F Relationship 20 0 0 3 0

WN M Self 9 7 4 3 42

WN M Seeking 11 2 6 3 19

WN M Relationship 1 0 1 0 0

Giegler & Klein data as a four-way table

Multiple Correspondence Analysis

[HOMALS]

[Figure: HOMALS category plot, Dimension 1 by Dimension 2. Points are labelled by variable: Category (Erotic, Values, Figure, Compassion, Fitness), Concept (Relationship, Seeking, Self), Sex (M, F) and Magazine (WN, Z).]

Some other questions

• How well could we predict magazine usage from the other factors?

• Could use
– multinomial regression if cells independent (and sample size very large)
– categorical regression if just want to look at effects

A new issue: The kind of transformation to be chosen

Kinds of transformations

• Depends on what we want to assume

• Not inherent in the data

• Basic Kinds
– Nominal - Categorical (unordered categories)
– Ordinal (Assumes data are ordered)
– Numeric - Interval (Assumes data on a scale with equal intervals)

• Recent advance
– Spline (smoothes ordinal & nominal transformations)

Model Summary

Multiple R  R Square  Adjusted R Square
.338        .115      .113

Dependent Variable: MAGAZINE
Predictors: SEX CONCEPT CATEGORY

          Standardized Coefficients
          Beta   Std. Error  df  F-ratio   Prob
SEX       -.195  .008        2   535.707   .000
CONCEPT   -.034  .008        3   16.377    .000
CATEGORY  .273   .008        20  1052.267  .000

[Figure: Transformation plot for MAGAZINE. Optimal Scaling Level: Nominal. Quantifications (vertical axis) plotted against the categories Z, WN, WAZ, TIP, EXP, H&W.]

Another example: How do characteristics distinguish among groups?

• Famous example

• (Not real)

GROUP      Interaction Intensity  Interaction Frequency  Feeling of Belonging  Physical Proximity  Relationship Formality
Crowd      slight                 slight                 none                  close               formal
Audience   low                    nonrecurring           slight                close               formal
Public     slight                 slight                 slight                distant             no relationship
Mob        high                   nonrecurring           high                  close               informal
Family     high                   frequent               high                  close               informal
Relatives  moderate               infrequent             variable              distant             formal
Community  low                    infrequent             variable              close               formal

Summary of a qualitative analysis of the characteristics of groups as postulated by Gutman from Bell & Sirjamaki (1962)

[Figure: Object Scores Labeled by GROUP (cases weighted by number of objects). Dimension 1 by Dimension 2; points for Community, Relatives, Family, Mob, Public, Audience and Crowd.]

[Figure: Quantifications. Dimension 1 by Dimension 2; category points for Relationship Formality (informal, formal, no relationship), Physical Proximity (close, distant), Feeling of Belonging (high, variable, slight, none), Interaction Frequency (frequent, infrequent, nonrecurring, slight) and Interaction Intensity (high, moderate, low, slight).]

Category Quantifications

• Here the data were all treated as nominal

• Dimensions were quantification values

• Different quantifications for different dimensions

• Only possible for nominal data

• Other (ordinal, numeric) must have same quantification on each dimension. Nominal can also be similarly restricted.

For example: Using regression

• Make the group the dependent variable

• Other nominal variables cannot be multiple-nominal because regression coefficients are unidimensional

• Use other variables to predict group
– Artificial example: few cases and relatively many variables will give perfect prediction
– Can still compare prediction & evaluate categories

Predictors of Group: Standardized Coefficients (Beta)

Interaction Intensity -1.084

Interaction Frequency .689

Feeling of Belonging 1.219

Physical Proximity -.209

Relationship Formality .060

[Figure: Transformation plot for GROUP. Optimal Scaling Level: Nominal. Quantifications plotted against the categories Crowd, Audience, Public, Mob, Family, Relatives, Community.]

[Figure: Transformation plot for Interaction Frequency. Optimal Scaling Level: Nominal. Quantifications plotted against the categories slight, nonrecurring, infrequent, frequent.]

[Figure: Transformation plot for Feeling of Belonging. Optimal Scaling Level: Spline Nominal (degree 2, interior knots 2). Quantifications plotted against the categories none, slight, variable, high.]

[Figure: Transformation plot for Relationship Formality. Optimal Scaling Level: Nominal. Quantifications plotted against the categories no relationship, formal, informal.]

Principal Components: Demographics

• Age Group [treat as ordinal]

• Education Level [treat as ordinal]

• Marital Status [ nominal ]

• Work Status [ nominal – allow different quantifications for different dimensions ]

[Figure: Transformation plot for Age Group. Optimal Scaling Level: Spline Ordinal (degree 2, interior knots 2). Variable Principal Normalization. Quantifications plotted against the categories 18-19yrs, 20-24yrs, 25-29yrs, 30-34yrs, 35-39yrs, 40-44yrs, 45-49yrs, 50-54yrs, 55-59yrs, 60-64yrs.]

[Figure: Transformation plot for Work Status, Dimension 1. Optimal Scaling Level: Multiple Nominal. Variable Principal Normalization. Quantifications plotted against the categories Full-time, Part-time, Don't work, Unemployed, Other.]

[Figure: Transformation plot for Work Status, Dimension 2. Optimal Scaling Level: Multiple Nominal. Variable Principal Normalization. Quantifications plotted against the categories Full-time, Part-time, Don't work, Unemployed, Other.]

[Figure: Transformation plot for Marital Status. Optimal Scaling Level: Nominal. Variable Principal Normalization. Quantifications plotted against the categories 1st marriage, 2nd marriage, living together, Divorced-Separated, Widowed, Single.]

[Figure: Component Loadings and Centroids. Variable Principal Normalization. Dimension 1 by Dimension 2; component loadings for Marital Status, Education Level and Age Group, and centroids for the Work Status categories Full-time, Part-time, Don't work, Unemployed, Other.]

Combining Qualitative & Quantitative Data

• The availability of numeric and other transformations makes the combining of quantitative & qualitative data simple

Combining Qualitative & Quantitative Data

• Use Categorical Regression setting measurement levels appropriately

• Use Categorical Principal Components setting measurement levels appropriately

• Save transformed variables and use ordinary regression or factor analysis for better options (eg hierarchical regression or factor rotation)

Combining Qualitative & Quantitative Data

• Preserve independence of sets of data

• Generalized (more than two sets) non-linear canonical variate analysis

• OVERALS

OVERALS

• A tool for relating sets of variables

• A variant that is a common statistical model is canonical variate analysis (producing a canonical correlation between two sets of variables)

• OVERALS
– Allows for more than two sets
– Allows variables to be numeric, categorical or ordinal

A current data set

• PhD project by Simone Pica

• People with psychosis featuring social withdrawal
– 19 young people suffering from psychosis with symptoms of social withdrawal
– Unstructured interviews
– Standard psychiatric measures also completed

Data

• Interviews transcribed, categories formed from content, coding made

• Diagnosis (DSM III-R)

• Scores on quantitative measures
– Premorbid Adjustment Scale (PAS)
– Scale for the Assessment of Negative Symptoms (SANS)

Raw material

Um, when I got home I thought it was probably a good thing I didn’t go because um, it sort of relates to motivation as well, I wasn’t really that motivated to go out and deal with people and stuff. If more of my friends were there, I’d probably would have gone, if it was a party and all my friends were there I would have thought cool you know, I’d have to go even if I only had a few dollars, that’s cool, I can go without drinks, cigarettes, I’d just want to be there you know but probably because there would have been only a couple of people I would have known there and the rest of them I wouldn’t have known. I sort of thought no, I wouldn’t have a good time because if I wanted to meet people, I like meeting people, but when I meet people I always have to talk about my psychosis, and whenever I have to talk about my psychosis, its like everyone is listening you know, and they all just stop what they are doing and they listen, “psychosis, what is that?” and then I have to explain everything about it and they are all listening type of thing, honing in type of thing.

Classified material

• 3. EXPERIENCED DIFFICULTY COMMUNICATING

• He couldn’t talk because he became jumbled, he couldn’t focus on one thing he kept thinking about whether his ex-friend was going to mention the letter to other people there

• He stayed in small groups of people throughout the evening in order to avoid saying something inappropriate that would draw attention to him

• When he felt comfortable he found it easier to talk

• He found that the comfortable feeling didn’t last, it wore off when the ‘wall’ came and he found it difficult to think of things to talk about

• When he was with the group of people he didn’t know what to talk to people about so he remained silent

• He didn’t know what to talk about because he couldn’t think of anything intelligent to say

• When he was with people and he didn’t know what to talk about his mind was blank, he didn’t think anything

   felt different  stressed uncomfortable  difficulty communicating  concern about others views of them
1  Absent          Present                 Absent                    Absent
2  Absent          Present                 Present                   Present
3  Absent          Absent                  Absent                    Present
4  Present         Absent                  Absent                    Present
5  Present         Present                 Present                   Present
6  Present         Absent                  Present                   Present

Qualitative Data: eg Presence of categories in interview transcripts

DSM-IIIR diagnosis Frequency Percent Cumulative Percent

Schizophrenic 11 55.0 57.9

Schizophreniform 3 15.0 73.7

Schizoaffective 2 10.0 84.2

Delusional 2 10.0 94.7

Bipolar 1 5.0 100.0

Qualitative measures: eg DSM diagnosis

   PAS Child  PAS Adolesc  PAS Adult
1  4          6            9
2  1          6            8
3  5          8            11
4  6          5            4
5  4          5            5
6  4          6            7
7  4          4            5

Quantitative Measures: eg Premorbid Adjustment Scales

[Figure: OVERALS plot, DIM1 by DIM2, with points labelled by SET (Diagnosis, Interview, SANS, PAS). Diagnosis: DSM-IIIR Dimension 1, DSM-IIIR Dimension 2. Interview: self boring, want to be alone, shy/inferior, stigma judged reject, concern others views, stressed incommunica. SANS: Attention, Anhedonia, Avolition, Alogia, Affect. PAS: Adult, Adolesc, Child.]

[Figure: Dimension 1 Transformation Plot for DSM-IIIR. Category Quantifications for DSM-IIIR plotted against the categories Szphrenic, Szphreniform, Szoaffective, Delusional, Bipolar.]

[Figure: Dimension 2 Transformation Plot for DSM-IIIR. Category Quantifications for DSM-IIIR plotted against the categories Szphrenic, Szphreniform, Szoaffective, Delusional, Bipolar.]

Fit of Solution

Summary of Analysis (as output)

                   Dimension
                   1      2      Sum
Loss  Set 1        .220   .545   .764
      Set 2        .359   .267   .626
      Set 3        .284   .302   .585
      Set 4        .119   .326   .445
      Mean         .245   .360   .605
Eigenvalue         .755   .640
Fit                              1.395

Summary of Analysis (sets named, with shares)

                   Dimension
                   1      2      Sum
Loss  PAS          .220   .545   .764   31.6%
      SANS         .359   .267   .626   25.9%
      Text         .284   .302   .585   24.1%
      DSM          .119   .326   .445   18.4%
      Mean         .245   .360   .605   (Loss) 30%
Eigenvalue         .755   .640   1.395  (Fit) 70%
Total              1.000  1.000  2.000  100%
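The arithmetic behind the fit summary can be retraced in a few lines. The Python sketch below is an illustration built only from the printed three-decimal values, so it matches the slide to rounding rather than exactly. It assumes, as the table itself indicates, that each eigenvalue is one minus the mean loss on that dimension and that the fit is the sum of the eigenvalues.

```python
# Loss per set on dimensions 1 and 2, as printed in the summary table.
loss = {
    "PAS": (0.220, 0.545),
    "SANS": (0.359, 0.267),
    "Text": (0.284, 0.302),
    "DSM": (0.119, 0.326),
}
n_dims = 2

# Mean loss per dimension, and eigenvalue = 1 - mean loss.
mean_loss = [sum(v[d] for v in loss.values()) / len(loss) for d in range(n_dims)]
eigenvalues = [1 - m for m in mean_loss]

# Fit is the sum of the eigenvalues; its maximum is the number of dimensions.
fit = sum(eigenvalues)
fit_share = fit / n_dims

# Each set's share of the total loss (the percentage column on the slide).
set_totals = {name: sum(v) for name, v in loss.items()}
grand_total = sum(set_totals.values())
loss_share = {name: total / grand_total for name, total in set_totals.items()}

print("eigenvalues:", [round(e, 3) for e in eigenvalues])
print(f"fit {fit:.3f} of {n_dims}, i.e. about {fit_share:.0%}")
for name, share in loss_share.items():
    print(f"{name}: {share:.1%} of total loss")
```

On these figures the premorbid adjustment scores are the set least well represented by the two dimensions and the diagnosis the best, though with so few cases the shares should be read as descriptive only.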

[Figure: Object Scores Labeled by ID (cases weighted by number of objects). Dimension 1 by Dimension 2; points labelled with case IDs 1 to 20.]

Some pointers for Optimal Scaling

• for SPSS optimal scaling
– CATREG & CATPCA have most sophisticated options
– CATREG produces standard regression output
– Both CATREG & CATPCA can
  • save transformed variables (for repeating analysis in ordinary mode, eg for rotating components)
  • eliminate need to specify range (unlike HOMALS & OVERALS, which must have range 1 to n specified)

Some pointers for Optimal Scaling

• Cautions

• In general category quantifications only hold for the set of variables in the analysis

• (Incredibly) there is little published experience with these techniques

• Remember to use in exploratory mode
– Change transformations and see what happens
– Delete outlying variables/categories

the end

for more information, email

rcb@unimelb.edu.au
