Evaluation of tools used to measure critical thinking development in nursing and midwifery undergraduate students: A systematic review Author Carter, Amanda G, Creedy, Debra K, Sidebotham, Mary Published 2015 Journal Title Nurse Education Today Version Accepted Manuscript (AM) DOI https://doi.org/10.1016/j.nedt.2015.02.023 Copyright Statement © 2015 Published by Elsevier Ltd. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited. Downloaded from http://hdl.handle.net/10072/161703 Griffith Research Online https://research-repository.griffith.edu.au
Evaluation of tools used to measure critical thinking development in nursing and midwifery undergraduate students: A systematic review

Author

Carter, Amanda G, Creedy, Debra K, Sidebotham, Mary

Published

2015

Journal Title

Nurse Education Today

Version

Accepted Manuscript (AM)

DOI

https://doi.org/10.1016/j.nedt.2015.02.023

Copyright Statement

© 2015 Published by Elsevier Ltd. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited.

Downloaded from

http://hdl.handle.net/10072/161703

Griffith Research Online

https://research-repository.griffith.edu.au

TITLE: Evaluation of tools used to measure critical thinking development in nursing and midwifery undergraduate students: A systematic review.

Word Count 4494 (without references and tables)

Authors:

Amanda G. Carter RM BHealthSc MMid

School of Nursing and Midwifery

Griffith University, Brisbane, Australia.

Debra K. Creedy RN PhD

Professor, Centre for Health Practice Innovation

Griffith Health Institute

Griffith University, Brisbane, Australia.

Mary Sidebotham RM PhD

School of Nursing and Midwifery

Griffith University, Brisbane, Australia.

Corresponding Author

Amanda G. Carter

School of Nursing and Midwifery

Griffith University

University Drive

Meadowbrook. Queensland 4131, Australia

Ph +61 7 33821535

[email protected]

Abstract

Background: Well-developed critical thinking skills are essential for nursing and midwifery practice. The development of students’ higher-order cognitive abilities, such as critical thinking, is also well recognised in nursing and midwifery education. Measurement of critical thinking development is important to demonstrate change over time and the effectiveness of teaching strategies.

Objective: To evaluate tools designed to measure critical thinking in nursing and midwifery undergraduate students.

Data sources: Six databases (CINAHL, Ovid Medline, ERIC, Informit, PsycINFO and Scopus) were searched, resulting in the retrieval of 1,191 papers.

Review methods: After screening for inclusion, each paper was evaluated using the Critical Appraisal Skills Programme Tool. Thirty-four studies met the inclusion criteria and quality appraisal. Sixteen different tools that measure critical thinking were reviewed for reliability and validity and for the extent to which the domains of critical thinking were evident.

Results: Sixty percent of studies utilised one of four standardised, commercially available measures of critical thinking. Reliability and validity were not consistently reported, and there was variation in reliability across studies that used the same measure. Of the remaining studies using different tools, there was also limited reporting of reliability, making it difficult to assess internal consistency and the potential applicability of measures across settings.

Conclusions: Discipline-specific instruments to measure critical thinking in nursing and midwifery are required, specifically tools that measure the application of critical thinking to practice. Given that critical thinking development occurs over an extended period, measurement needs to be repeated and multiple methods of measurement used over time.

Key words: critical thinking, nursing, midwifery, measures, scales, evaluation

Introduction

The development of critical thinking (CT) skills has long been recognised as a priority in tertiary education. The landmark Delphi study by the American Philosophical Association (APA) produced an international expert consensus definition of critical thinking. Critical thinking is described as purposeful, self-regulatory judgment which results in interpretation, analysis, evaluation, and inference (Facione, 1990). Critical thinkers consider events or issues in a controlled, purposeful, focussed and conscious way (Mong-Chue, 2000).

Critical thinking is a crucial skill for nurses and midwives who, like other healthcare clinicians, need to effectively manage complex care situations in fast-paced environments that demand increasing accountability (Mong-Chue, 2000; Muoni, 2012; Pucer, Trobec, & Žvanut, 2014). The processes of clinical decision-making and problem-solving require advanced CT skills (Muoni, 2012). CT is also essential for clinicians to critique and apply evidence, especially in situations where ‘best practice’ remains unclear (Scholes et al., 2012).

Although the development of students’ higher-order cognitive abilities is recognised as important in nursing and midwifery education, the measurement of these vital skills is inconsistent or neglected (Walsh & Seldomridge, 2006). The measurement of CT is important to identify deficits and developments in students’ cognitive capacities as well as to demonstrate the effectiveness of teaching strategies. The purpose of this systematic review was to evaluate tools used to measure CT development in nursing and midwifery undergraduate students.

Search Strategies Utilised

A search of the major databases CINAHL, Ovid Medline, ERIC, Informit, PsycINFO and Scopus was conducted in September 2014. The search was limited to English language articles published in peer-reviewed journals during 2001-2014. This period was chosen because the results of a Delphi study to define CT in nursing were published in 2000 (Scheffer & Rubenfeld, 2000), and scholarly work about CT in nursing would have developed further since that publication.

The inclusion criteria were original research studies that utilised experimental designs to assess CT development in undergraduate nursing and/or midwifery students. Papers were excluded if CT was not specifically measured on more than one occasion, the sample was post-graduate students, the full text was not available in English, the paper was a discussion paper that did not involve original research, or the study did not use an experimental design.

Five search terms were entered into the databases, with the article title, abstract and body all searched. The search terms used were:

1. “critical thinking” AND midwife*

2. “critical thinking” AND midwife* AND measure*

3. “critical thinking” AND midwife* AND evaluat*

4. “critical thinking” AND students, nursing AND measure*

5. “critical thinking” AND students, nursing AND evaluat*
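
For readers who wish to re-run the search, the five search strings listed above can be generated programmatically. The following is a minimal sketch only: the function name `build_queries` is hypothetical, straight quotation marks are assumed in place of the typographic ones, and the exact syntax accepted by each database may differ.

```python
# Illustrative sketch: rebuilds the five Boolean search strings listed above.
# The helper name is hypothetical and not part of the published method.
CORE = '"critical thinking"'


def build_queries():
    """Return the five search strings in the order given in the text."""
    queries = [f"{CORE} AND midwife*"]
    for population in ("midwife*", "students, nursing"):
        for focus in ("measure*", "evaluat*"):
            queries.append(f"{CORE} AND {population} AND {focus}")
    return queries


if __name__ == "__main__":
    for number, query in enumerate(build_queries(), start=1):
        print(f"{number}. {query}")
```

Generating the strings from their components makes it easier to keep the wording identical across the six databases.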

The search was conducted sequentially using the search engines and search terms. An initial search, filtering for date, language and source of publication, identified 1,191 papers. Once duplicates were excluded, each identified citation was reviewed against the inclusion and exclusion criteria and filtered through three screening levels: (i) title screening; (ii) title and abstract screening; and (iii) full-text screening. Articles that were not relevant or did not meet the inclusion criteria were discarded. Finally, 35 papers were included. No papers involving midwifery undergraduate students met the inclusion criteria; hence the samples in all of the papers are undergraduate nursing students.

Overview of Tools

Twenty-one (60%) of the 34 studies reviewed utilised one of four standardised, commercially available measures of critical thinking. These were the California CT Disposition Inventory (10 studies), the California CT Skills Test (5 studies), the Watson-Glaser CT Appraisal (3 studies) and the Health Sciences Reasoning Test (3 studies). Two studies used both the California CT Skills Test and the California CT Disposition Inventory. All of these tools have reported psychometric reliability and validity, allowing comparison across settings, disciplines, and time. Relatively few of the included studies (9 out of 21) undertook a reliability analysis of the tool for their current context. Twelve other measurement tools were utilised in the studies reviewed. See Table 1 for a comparison of the tools employed in the studies reviewed.
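
The proportions quoted in this overview can be checked with a short tally. This is a minimal sketch using only the counts stated in the paragraph above; the variable and function names are illustrative. The four per-tool counts sum to 21, matching the text, but how the two studies that used both California instruments enter that total is not stated, so the sketch simply sums the four counts. The computed share is a little under 62%, which the text reports as 60%.

```python
# Counts taken from the overview paragraph; names are illustrative only.
TOOL_COUNTS = {
    "California CT Disposition Inventory": 10,
    "California CT Skills Test": 5,
    "Watson-Glaser CT Appraisal": 3,
    "Health Sciences Reasoning Test": 3,
}
TOTAL_STUDIES = 34
STUDIES_WITH_RELIABILITY_ANALYSIS = 9


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


standardised = sum(TOOL_COUNTS.values())
pct_standardised = share(standardised, TOTAL_STUDIES)
pct_reliability = share(STUDIES_WITH_RELIABILITY_ANALYSIS, standardised)

if __name__ == "__main__":
    print(f"{standardised} of {TOTAL_STUDIES} studies: {pct_standardised:.1f}%")
    print(f"Reliability analysed in {pct_reliability:.1f}% of those studies")
```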

Table 1: Description of Tools/Methods to measure critical thinking

Name of instrument / Author / Year developed: The California Critical Thinking Disposition Inventory (CCTDI) / Facione & Facione / 1992
Aim of tool: Measure the extent to which an individual possesses the attitudes of a critical thinker. Designed for use by the general adult population.
Number of items / format: 75 Likert items, “agree-disagree” scale, student self-report.
Psychometric testing: Cronbach’s alpha .90 for the overall instrument and .71 to .80 for the seven subscales.
Scores: Maximum score of 60 in each domain. Negative disposition is a score below 30. The total maximum score is 420 points. Scores > 350 indicate a high CT disposition. Scores < 280 indicate a paucity of CT.
Time to complete: 20-30 mins
Factor domains measured: Open-mindedness, analyticity, cognitive maturity, truth-seeking, systematicity, inquisitiveness, and self-confidence.

Name of instrument / Author / Year developed: California Critical Thinking Skills Test (CCTST) / Facione & Facione / 1992
Aim of tool: Designed for assessment of entry or exit level CT skills of various groups of college students and for evaluation of learning outcomes of various curricular programs.
Number of items / format: 34 multiple choice items. Uses a generic scenario requiring an accurate and complete interpretation of the question.
Psychometric testing: The Kuder-Richardson (KR-20) estimate of internal consistency of the CCTST is reported in the test manual to be r = .70.
Scores: The maximum total score is 34. A score of ≥ 24 indicates very strong CT skills. A score of 13-23 indicates a mid-range skill level. Scores of ≤ 12 indicate fundamental weaknesses in CT skills.
Time to complete: 45-50 mins
Factor domains measured: Analysis, inference, evaluation, deductive and inductive reasoning.

Name of instrument / Author / Year developed: Health Sciences Reasoning Test (HSRT) / Facione, Facione, & Winterhalter / 2010
Aim of tool: Adaptation of the CCTST specifically designed for use by health sciences students and professionals to assess their CT and clinical reasoning skills.
Number of items / format: 33 multiple choice questions. Uses a health-related scenario requiring an accurate and complete interpretation of the question.
Psychometric testing: Internal consistency .77 to .84. Overall internal consistency value of .81 with Kuder-Richardson formula 20, and an overall .81 reliability coefficient.
Scores: Total score reflects overall CT skills. Maximum score is 33. Scores of 25 or above represent strong CT skills, scores from 15 to 24 are considered mid-range and represent competence in CT skills in most situations, and scores of 14 or below represent fundamental weaknesses in CT skills.
Time to complete: 30-50 mins
Factor domains measured: Analysis, inference, evaluation, inductive reasoning and deductive reasoning.

Name of instrument / Author / Year developed: The Watson-Glaser Critical Thinking Appraisal (WGCTA) / Watson & Glaser / originally developed in 1925, most recent revision 2012
Aim of tool: Measures both logical and creative components of CT and assesses CT ability in individuals with at least a ninth grade education.
Number of items / format: 40 multiple choice items answering scenario-based questions.
Psychometric testing: Reliability reported to be > .8. Using the Spearman-Brown formula, reliability for the total score of the WGCTA was established at .77. This is consistent with the split-half reliability coefficients, ranging from .76 to .85.
Scores: Maximum score is 80.
Time to complete: 40-50 mins
Factor domains measured: Inference, recognition of assumptions, deduction, interpretation and evaluation of arguments.

Name of instrument / Author / Year developed: Think aloud analytic framework / Daly / 2001
Aim of tool: Analyse qualitative data to synthesise conception of CT.
Number of items / format: A scale of argument/epistemological complexity is used to assess videotaped client simulation.
Psychometric testing: Not stated
Scores: Scores range from 1-4.
Time to complete: No time commitment by student. Uses learning activities integrated into the course.
Factor domains measured: Structural components of differentiation and integration in reasoning, situation modelling and argument and evidential structure.

Name of instrument / Author / Year developed: Critical Thinking Ability Scale (CTAS) for College Students / Park / 1999
Aim of tool: Assess dimensions of CT of college students.
Number of items / format: 20 items measured using a Likert scale, 1 = absolutely do not agree to 5 = absolutely agree.
Psychometric testing: Cronbach's alpha was found to be .74 (Park, 1999).
Scores: Total scores have a possible range from 5 to 100, with a higher score indicating stronger CT ability.
Time to complete: Not stated
Factor domains measured: Intellectual curiosity, healthy skepticism, intellectual integrity, prudence, and objectivity.

Name of instrument / Author / Year developed: Critical Thinking Disposition Scale for Nursing Students (CTDS) / Park & Kim / 2009 (Korean version only)
Aim of tool: Assess CT disposition in Korean nurses.
Number of items / format: 35 items assessed on a 5-point Likert scale. Student self-report.
Psychometric testing: Cronbach’s alpha = .78 (Park & Kim, 2009).
Scores: The total score ranges from 35 to 175, with a higher score indicating a higher level of critical thinking disposition.
Time to complete: Not stated
Factor domains measured: Intellectual integrity, creativity, challenge, open-mindedness, prudence, objectivity, truth seeking, inquisitiveness.

Name of instrument / Author / Year developed: Critical Thinking Process Test (CTPT) / Educational Resources Inc. / 1999
Aim of tool: Developed specifically for nursing students. Focus on critical thinking process skills within a nursing environment, not level of nursing content knowledge.
Number of items / format: 50 item multiple choice.
Psychometric testing: The average reliability coefficient was .93 with demonstrated evidence of content and diagnostic validity (Anderson et al., 2000).
Scores: Not stated
Time to complete: 60 minutes
Factor domains measured: Assesses 4 aspects of the critical thinking process: listening, writing, speaking, and reading, and 5 levels of abstract thinking: prioritizing, inferential reasoning, goal setting, application of knowledge, and evaluation of predicted outcomes.

Name of instrument / Author / Year developed: Think aloud protocol / Morey / 2002
Aim of tool: Provide a valid source of qualitative data on thinking and thought processes.
Number of items / format: A rating tool and rubric using a 4-point Likert scale for eight cognitive processes, level of critical thinking, and for accuracy of nursing diagnosis, conclusions, and evaluation.
Psychometric testing: Two faculty rated the think-aloud scenario responses with 97.9 to 100 percent rater agreement.
Scores: Not provided
Time to complete: No time commitment by student as uses learning activities integrated into the course.
Factor domains measured: Collect, review, relate, interpret, infer, diagnosis, act, and evaluate.

Name of instrument / Author / Year developed: N3 case report accreditation form / Taiwan Nurses Association / no date available
Aim of tool: Not stated
Number of items / format: 45 criteria (including 36 strengths and 9 weaknesses).
Psychometric testing: Inter-rater reliability = .893, internal consistency of KR-20 = .79 and test-retest reliability of .32 (p < 0.01).
Scores: Total scores ranged from 0-45.
Time to complete: No time commitment by student. Uses learning activities integrated into the course.
Factor domains measured: Constructed on the basis of the nursing process. Critical inquiry points are listed under each step of the nursing process.

Name of instrument / Author / Year developed: Discussion board analysis / Pucer, Trobec & Žvanut / 2014
Aim of tool: Analyse discussion board posts for evidence of CT.
Number of items / format: Discussion posts examined against six elements of critical thinking.
Psychometric testing: Not stated
Scores: Not stated
Time to complete: 60 minutes
Factor domains measured: Analysis, inference, interpretation, explanation, evaluation, and self-regulation.

Name of instrument / Author / Year developed: Critical Thinking Scale (CTS) / Cheng, Wang, Wu, & Hwang / 1996
Aim of tool: Not provided
Number of items / format: 60 item multiple choice questions. Participants choose one correct answer from either one-in-five or dichotomous response sets according to the item situations.
Psychometric testing: CTS demonstrated adequate reliability (internal consistency as well as split-half reliability) and convergent as well as known group validity.
Scores: Higher scores indicate better CT skills.
Time to complete: Not provided
Factor domains measured: Inference, recognition of assumptions, deduction, interpretation, and evaluation of argument.

Name of instrument / Author / Year developed: Critical Thinking Assessment (CTA) / Assessment Technologies Institute / 2001
Aim of tool: Determine students’ overall performance on specified CT skills determined to be necessary for success in an academic program for nursing study.
Number of items / format: 40 generic multiple choice questions.
Psychometric testing: CTA has a global alpha of .69 and a standardized item alpha of .70 for all 40 items in first-time examinees (ATI, 2001).
Scores: Maximum score of 40.
Time to complete: Not provided
Factor domains measured: Interpretation, analysis, evaluation, inference, explanation, self-regulation.

Name of instrument / Author / Year developed: Bloom's Taxonomy / Jones / 2008
Aim of tool: Assess students’ developed nursing care plans for evidence of critical thinking.
Number of items / format: Using nursing care plans.
Psychometric testing: Not provided
Scores: Not provided
Time to complete: No time commitment by student as uses learning activities integrated into the course.
Factor domains measured: Knowledge, comprehension, application, analysis, synthesis, evaluation.

Name of instrument / Author / Year developed: Concept map scoring / Daley, Shaw, Balistrieri, Glasenapp, Piacentine / 1999
Aim of tool: Assess students’ ability to develop concept maps that reflect CT used in the nursing process.
Number of items / format: Using concept maps.
Psychometric testing: Inter-rater reliability was performed with two assessors in the pilot study and the percentage of agreement of the independent scores was 85%. Content validity established by Daley et al. (1999).
Scores: Not provided
Time to complete: No time commitment by student as uses learning activities integrated into the course.
Factor domains measured: Meaningful, valid and significant.

Name of instrument / Author / Year developed: Critical Thinking Scale (CTSM) / McMaster University / 2002
Aim of tool: Not provided
Number of items / format: 10 items. Each item is scored on a six-point Likert scale of 1 to 6, with 1 corresponding to “never” and 6 to “always”.
Psychometric testing: Cronbach’s alpha coefficient was .93 and the two-week test-retest reliability coefficient was .92.
Scores: Total scores range from 10 to 60, with higher scores indicating a higher level of CT competency.
Time to complete: Not provided
Factor domains measured: Not stated
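
Several of the reliability statistics reported in Table 1 (Cronbach’s alpha, KR-20 and the Spearman-Brown formula) follow standard textbook definitions. The sketch below illustrates those definitions on a small hypothetical response matrix; the data and function names are assumptions for illustration and do not come from any of the reviewed studies. Population variances are used throughout so that KR-20 and Cronbach’s alpha coincide for items scored 0/1.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent."""
    k = len(rows[0])
    item_vars = [pvariance([float(r[i]) for r in rows]) for i in range(k)]
    total_var = pvariance([float(sum(r)) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr20(rows):
    """KR-20 for items scored 0 (incorrect) or 1 (correct)."""
    k = len(rows[0])
    n = len(rows)
    p = [sum(r[i] for r in rows) / n for i in range(k)]  # proportion correct
    sum_pq = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([float(sum(r)) for r in rows])
    return (k / (k - 1)) * (1 - sum_pq / total_var)


def spearman_brown(r_half):
    """Step a split-half correlation up to full-test reliability."""
    return 2 * r_half / (1 + r_half)


if __name__ == "__main__":
    # Hypothetical data: 5 respondents answering 4 dichotomous items.
    responses = [
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
    ]
    print(f"KR-20: {kr20(responses):.2f}")
    print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
    print(f"Spearman-Brown for r_half = .50: {spearman_brown(0.5):.2f}")
```

Because these coefficients depend on the sample as well as the instrument, a value reported in a test manual does not guarantee the same reliability in a new cohort, which is why study-specific reliability reporting matters.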

Included studies were listed in a summary table (Table 2) during the search. The studies are presented in groups according to the tool utilised. After the initial search, all articles identified in subsequent searches were checked against articles in the summary table and duplicates excluded. Each article was also entered into a reference management database (Endnote), including the search term and engine used to locate each article. A quality appraisal process was performed using the Critical Appraisal Skills Programme (CASP) tool (CASP, 2013), and one article of poor quality was excluded. The excluded study is identified in the summary table. Following the quality appraisal process, 34 papers were selected for review.

Table 2: Articles that met inclusion and quality criteria

California Critical Thinking Disposition Inventory (CCTDI)

Author, year and location: Atay & Karabaca (2012). Turkey
Design/Intervention: Pre-post-test control group design testing effects of using concept plans.
Participants: 80 freshman and sophomore nursing students.
Results: Statistically significant increase in CT scores for experimental group.
Reliability and validity assessment: Cronbach’s alpha for the CCTDI was .88.
Quality appraisal using CASP: Include

Author, year and location: Shin, Lee, Ha, & Kim (2006). Korea
Design/Intervention: Longitudinal study using CCTDI each year for 4 years.
Participants: 60 nursing students commenced on study, 32 completed all four surveys.
Results: Statistically significant improvement in CT disposition.
Reliability and validity assessment: Cronbach’s alpha for the CCTDI was .59 in Yr 1, .53 for Yr 2, .66 for Yr 3, and .73 for Yr 4. Significantly lower than overall median alpha coefficient of .90 reported by Facione (1994).
Quality appraisal using CASP: Include

Author, year and location: Tiwari, Avery, & Lai (2006). Hong Kong
Design/Intervention: Experimental design, pre-post test testing the effects of PBL. 4 time points tested.
Participants: 79 1st year nursing students.
Results: Significantly greater improvement in CT scores for experimental group.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Author, year and location: Evans & Bendel (2004). United States
Design/Intervention: Quasi-experimental, non-equivalent control group design testing narrative pedagogy.
Participants: 114 undergraduate nursing students.
Results: No significant differences in CT scores between control and experimental groups.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Author, year and location: Wood & Toronto (2012). USA
Design/Intervention: Experimental study testing the effects of human patient simulation.
Participants: 85 2nd year nursing students.
Results: Higher mean post-test total scores compared with pre-test total scores in experimental group students.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Author, year and location: Stewart & Dempsey (2005). USA
Design/Intervention: Longitudinal study, at 5 time-points, testing effects of whole program.
Participants: 55 nursing students recruited, 34 students completed all surveys.
Results: Subscale and total scores did not significantly increase throughout the program.
Reliability and validity assessment: Cronbach’s alpha for the CCTDI was calculated at each phase: Sophomore semester 2 = .71, Junior semester 1 = .77, Junior semester 2 = .76, Senior semester 1 = .67, Senior semester 2 = .75.
Quality appraisal using CASP: Include

Author, year and location: Yeh & Chen (2005). Taiwan
Design/Intervention: A pre- and post-test quasi-experimental research design testing the effects of a CT lecture and interactive videodisc system.
Participants: 126 RN-BN students.
Results: Statistically significant differences between pre and post-test overall scores.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Author, year and location: Yu, Zhang, Xu, Wu & Wang (2012). China
Design/Intervention: Crossover experimental study testing the effects of PBL.
Participants: 76 2nd year nursing students.
Results: Statistical improvement in overall CTDI scores following PBL.
Reliability and validity assessment: For this study the overall Cronbach’s alpha was .8999.
Quality appraisal using CASP: Include

Author, year and location: Dehkordi & Heydarnejad (2008). USA
Design/Intervention: Quasi-experimental design testing the effects of PBL.
Participants: 40 2nd year nursing students participated.
Results: Statistical improvement in CTDI scores following PBL.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Author, year and location: Zadeh, Khajeali, Khalkhali, & Mohammadpour (2014). Iran
Design/Intervention: Quasi-experimental study testing the effects of an evidence-based nursing course.
Participants: 48 3rd year nursing students.
Results: CCTDI scores were significantly higher following the intervention.
Reliability and validity assessment: No reporting of reliability of CCTDI.
Quality appraisal using CASP: Include

California Critical Thinking Skills Test (CCTST)

Author, year and location: Chau, et al. (2001). Hong Kong
Design/Intervention: Pre-test/post-test design testing the effects of 4 vignettes.
Participants: 101 1st and 2nd year nursing students recruited, of whom 83 completed both pre- and post-tests.
Results: No statistical difference in pre- and post-test scores.
Reliability and validity assessment: KR-20 of the CCTST was .74 and subscales ranged from .30 to .61.
Quality appraisal using CASP: Include

Author, year and location: Beckie, Lowry, & Barnett (2001). United States
Design/Intervention: A pre-post test, non-equivalent control group design. Experimental group experienced new curriculum.
Participants: 183 BN students, consisting of 3 cohorts of students: 1 control cohort and 2 cohorts that experienced the new curriculum.
Results: Cohort 1 received the new curriculum and achieved significantly higher CT scores than controls. Cohort 3, the 2nd class to experience the revised curriculum, failed to demonstrate improved CT scores and reported some decreases.
Reliability and validity assessment: Cronbach alpha on CCTST ranged from .55 to .83. Internal consistency of tool low and varied across tests.
Quality appraisal using CASP: Include

Author, year and location: Spelic, et al. (2001). United States
Design/Intervention: Longitudinal study testing effects of different pathways.
Participants: 136 students in 3 undergraduate pathways: traditional, accelerated and RN-BSN.
Results: Statistically significant increase in CT scores for all pathways.
Reliability and validity assessment: The CCTST has 34 items. No demonstrated variance (all students scored the same) on some items; α level therefore computed on fewer than 30 items.
Quality appraisal using CASP: Include

Author, year and location: Wheeler & Collins (2003). United States
Design/Intervention: Quasi-experimental design. Testing the effects of concept mapping compared to traditional nursing care plans.
Participants: A convenience sample (n = 76).
Results: Significant difference between pre- and post-test scores for both groups. No difference found between experimental and control groups.
Reliability and validity assessment: No reporting of reliability of CCTST for this study.
Quality appraisal using CASP: Include

Author, year and location: Yuan, Kunaviktikul, Klunklin, & Williams (2008). China
Design/Intervention: A quasi-experimental, two-group pre-post test design testing the effects of PBL.
Participants: All 46 Year 2 nursing students.
Results: PBL students had significantly greater improvements on overall CCTST.
Reliability and validity assessment: KR-20 for the CCTST-A was .80 for the total scale and between .60 and .78 for subscales.
Quality appraisal using CASP: Include

California Critical Thinking Skills Test (CCTST) & California Critical Thinking Disposition Inventory (CCTDI)

Author, year and location: Ravert (2008). United States
Design/Intervention: Pre-post test design testing effects of human patient simulation.
Participants: 30 1st year students.
Results: No differences in CT scores.
Reliability and validity assessment: No reporting of reliability of the CCTST or CCTDI for this study.
Quality appraisal using CASP: Include

Author, year and location: Naber & Wyatt (2014). United States
Design/Intervention: Experimental, pre-post test design testing effects of reflective writing.
Participants: 70 4th semester nursing students.
Results: The experimental group's total CCT

ST

and

CC

TD

I sco

res d

id n

ot

incre

ase

sig

nific

an

tly fo

llow

ing

the

in

terv

en

tio

n.

No

re

po

rtin

g o

f re

liab

ility

of

CC

TS

T o

r C

CT

DI sca

le f

or

this

stu

dy.

Inclu

de

Page 16: TITLE: Evaluation of tools used to measure critical ...

16

He

alt

h S

cie

nc

es R

ea

so

nin

g T

est

(HS

RT

)

Su

llivan

-M

an

n,

Pe

rro

n,

&

Fe

llne

r (2

009

).

Un

ited

S

tate

s

Mix

ed

-mo

de

l e

xpe

rim

en

tal de

sig

n,

testing

eff

ects

of

mu

ltip

le

sim

ula

tion

53

nu

rsin

g s

tud

en

ts

fro

m t

he

me

dic

al-

su

rgic

al co

urs

e

Sta

tistica

lly s

ign

ific

an

t in

cre

ase

in C

T s

co

res fo

r e

xpe

rim

en

tal g

roup

.

Re

liab

ility

of

the

HR

ST

not

rep

ort

ed f

or

this

stu

dy.

Inclu

de

Sh

inn

ick,. &

W

oo,

(20

13

).

Un

ited

S

tate

s

On

e-g

roup

, q

ua

si-

expe

rim

en

tal, p

re-p

ost

test d

esig

n.

Teste

d th

e

eff

ects

on o

ne

hu

ma

n

pa

tie

nt sim

ula

tio

n

A c

on

ven

ien

ce s

am

ple

of

154

, 3

rd o

r 4

th y

ea

r n

urs

ing

stu

de

nts

Fo

llow

ing H

PS

th

ere

we

re n

o

sta

tistica

lly s

ign

ific

an

t g

ain

s

in C

T,

with

so

me

de

cre

ase

in

sco

res (

no

t sta

tistica

lly

sig

nific

an

t).

No

re

po

rtin

g o

f re

liab

ility

of

HS

RT

fo

r th

is s

tud

y.

Inclu

de

Go

od

sto

ne

e

t a

l,

(20

13

).

US

A

A t

wo

-gro

up

qua

si-

expe

rim

en

tal p

re-p

ort

te

st d

esig

n t

esting

th

e

eff

ects

of

hig

h f

ide

lity

pa

tie

nt sim

ula

tio

n

(HF

PS

) co

mpa

red

to

ca

se

stu

dy

42

1st s

em

este

r a

sso

cia

te d

eg

ree

n

urs

ing

stu

de

nts

. A

lloca

ted to

tw

o

gro

up

s,

HF

PS

, an

d

ca

se

stu

dy g

roup

,

Th

ere

wa

s a

sig

nific

an

t in

cre

ase

in t

he

HS

RT

sco

res

for

the

ca

se

stu

dy g

rou

p

(p=

0.0

03

) bu

t n

ot fo

r th

e

HF

PS

gro

up

.

.No

rep

ort

ing

of

relia

bili

ty o

f H

SR

T f

or

this

stu

dy

Inclu

de

Wa

tso

n-G

lase

r C

riti

ca

l T

hin

kin

g A

pp

rais

al (W

GC

TA

)

L'E

pla

tten

ier

(20

01

).

Un

ited

S

tate

s

Lon

gitu

din

al stu

dy

testing

4 t

imes o

ve

r 3

ye

ar

und

erg

radu

ate

p

rog

ram

83

nu

rsin

g s

tud

en

ts

N

o c

ha

ng

e in

CT

sco

res a

s

stu

de

nt

pro

gre

ssed

th

rough

th

e p

rog

ram

.

No

re

po

rtin

g o

f re

liab

ility

of

WG

CT

A f

or

this

stu

dy.

Inclu

de

Bro

wn

, A

lve

rson

, &

P

epa

(2

001

).

Un

ited

S

tate

s

Lon

gitu

din

al stu

dy,

testing

at th

e b

eg

inn

ing

a

nd

en

d o

f de

gre

e.

Te

sting

diffe

ren

t p

ath

wa

ys a

nd

le

ng

th o

f p

rog

ram

Co

nve

nie

nce s

am

ple

(n

=

12

3)

of

thre

e g

roup

s

of

ba

cca

lau

reate

n

urs

ing

stu

de

nts

: tr

ad

itio

na

l, R

N-B

SN

, a

nd

acce

lera

ted

.

A s

ign

ific

an

ce d

iffe

ren

ce

foun

d b

etw

ee

n p

re-

and

po

st

WG

CT

A s

co

res fo

r tr

ad

itio

na

l stu

de

nts

(p=

0.0

07

) an

d R

N-

BS

N (

p=

0.0

29

), w

ith

no

d

iffe

ren

ce

fo

r a

cce

lera

ted

stu

de

nts

.

Re

liab

ility

fo

r th

e t

ota

l sco

re

of

the W

GC

TA

wa

s

esta

blis

he

d a

t .7

7 (

usin

g

Sp

ea

rmen

-Bro

wn f

orm

ula

).

Co

nsis

ten

t w

ith t

he

sp

lit-h

alf

relia

bili

ty c

oeff

icie

nts

(.6

9 to

.8

5),

re

po

rted

by W

ats

on

a

nd

Gla

se

r

Inclu

de

Wa

tso

n-G

lase

r C

riti

ca

l T

hin

kin

g A

pp

rais

al (W

GC

TA

) a

nd

Th

ink

Alo

ud

An

aly

tic

al F

ram

ew

ork

Da

ly,

A lon

gitu

din

al m

ulti-

43

nu

rsin

g s

tud

en

ts

No

sta

tistica

l d

iffe

ren

ce in

N

o r

epo

rtin

g o

f re

liab

ility

of

Inclu

de

Page 17: TITLE: Evaluation of tools used to measure critical ...

17

(20

01

).

Un

ited

K

ingd

om

me

tho

d d

esig

n w

ith

tr

ian

gu

latio

n.

co

mp

lete

d W

GC

TA

. 12

stu

de

nts

co

mp

lete

d

thin

k a

lou

d a

na

lytica

l fr

am

ew

ork

WG

CT

A s

co

res. L

ittle

e

vid

en

ce

of

CT

de

mon

str

ate

d

in t

hin

k a

loud

ana

lytica

l fr

am

ew

ork

WG

CT

A f

or

this

stu

dy.

No

d

iscu

ssio

n o

f re

liab

ility

or

va

lidity o

f th

ink a

loud

a

na

lytica

l fr

am

ew

ork

. N

ot

cle

ar

whe

the

r th

e th

ink

alo

ud

to

ol w

as v

alid

ate

d o

r re

vie

wed

by e

xp

ert

s a

nd

inte

r-ra

ter

relia

bili

ty w

as n

ot

dis

cu

sse

d

Cri

tic

al T

hin

kin

g A

bilit

y S

ca

le (

CT

AS

) fo

r C

olle

ge

Stu

de

nts

Ch

oi,

Lin

dq

uis

t, &

S

ong

, (2

014

).

Ko

rea

No

n-e

qu

iva

lent

co

ntr

ol

gro

up

pre

–p

ost

test

de

sig

n te

stin

g e

ffe

cts

of

PB

L.

90

1st y

ea

r nu

rsin

g

stu

de

nts

No

sig

nific

an

t d

iffe

ren

ce

s in

C

T s

co

res b

etw

ee

n c

on

tro

l a

nd

exp

erim

en

tal g

roup

s

Cro

nba

ch's

alp

ha

was .7

1

wh

ich

is c

on

sis

ten

t w

ith

th

e

rep

ort

ed .

74

by P

ark

(19

99

).

No

t ava

ilab

le in

En

glis

h

Inclu

de

Cri

tic

al T

hin

kin

g D

isp

osit

ion

Sc

ale

fo

r N

urs

ing

Stu

de

nts

(C

TD

S)

Ju

n, Le

e,

Pa

rk,

Ch

ang

&

Kim

(2

01

3).

S

outh

K

ore

a

Qu

asi-e

xpe

rim

en

tal

stu

dy te

sting

eff

ects

of

5E

le

arn

ing

cycle

mo

de

l w

ith

PB

L

161

1st y

ea

r n

urs

ing

stu

de

nts

Sta

tistica

lly s

ign

ific

an

t in

cre

ase

in C

T s

co

res fo

r e

xpe

rim

en

tal g

roup

.

Cro

nba

ch

’s a

lph

a w

as .81

. C

TD

S n

ot a

va

ilab

le in

E

ng

lish

, 2

0 p

oin

t se

lf r

epo

rt

Lik

ert

sca

le m

ea

su

res

dis

positio

n a

s a

pro

xy f

or

CT

skill

s.

Inclu

de

Cri

tic

al T

hin

kin

g P

roce

ss T

est

(CT

PT

)

De

Sim

on

e,

(20

06

).

Un

ited

S

tate

s

Expe

rim

en

tal de

sig

n

testing

eff

ects

of

acce

lera

ted

pro

gra

m

38

nu

rsin

g s

tud

en

ts

und

ert

akin

g a

n

acce

lera

ted

pro

gra

m

(12

mon

ths in le

ng

th)

Incre

ase in

CT

sco

res n

ot

sig

nific

an

tly d

iffe

ren

t A

ve

rag

e r

elia

bili

ty

co

eff

icie

nt

wa

s .

93

. In

clu

de

Cri

tic

al T

hin

kin

g P

roce

ss T

est

+ T

hin

k A

lou

d P

roto

co

l

Mo

rey,

(20

12

).

Un

ited

S

tate

s

An

expe

rim

enta

l d

esig

n

testing

an o

nlin

e

an

ima

ted p

eda

go

gic

al

age

nt.

45

associa

te d

eg

ree

nu

rsin

g s

tude

nts

in

the

ir f

ina

l se

me

ste

r

No

diffe

ren

ces in

CT

fo

r e

ithe

r to

ol

No

re

po

rtin

g o

f re

liab

ility

of

CT

PT

. T

wo f

acu

lty r

ate

d the

th

ink-a

loud

sce

na

rio

re

sp

on

ses w

ith 9

7.9

to 1

00

Inclu

de

Page 18: TITLE: Evaluation of tools used to measure critical ...

18

pe

rcen

t ra

ter

ag

ree

me

nt.

Lim

ite

d info

rma

tion

p

rovid

ed

re

ga

rdin

g th

e t

hin

k

alo

ud

pro

toco

l.

N3

Ca

se

Re

po

rt A

cc

red

ita

tio

n F

orm

Ch

en,

&

Lin

, (2

00

1)

Ta

iwa

n

Qu

asi- e

xpe

rim

enta

l d

esig

n w

ith

pre

-po

st

test

testing

eff

ects

of

a

rese

arc

h c

ou

rse

168

1st

ye

ar

nu

rsin

g

stu

de

nts

.

Expe

rim

en

tal g

roup

rep

ort

ed

sig

nific

an

tly h

igh

er

CT

sco

res

than

co

ntr

ol g

roup

No

re

po

rtin

g o

f re

liab

ility

of

the N

3 c

ase r

epo

rt fo

rm.

Un

cle

ar

wh

eth

er

too

l m

ea

su

red

stu

den

ts’ a

bili

ty

to c

ritiqu

e a

n a

rtic

le r

ath

er

than

CT

ab

ilitie

s..

Inclu

de

Dis

cu

ssio

n B

oa

rd A

na

lysis

Pu

ce

r,

Tro

be

c,

&

Žva

nu

t,

(20

14

)

Slo

ven

ia

Qu

asi-e

xpe

rim

en

t stu

dy

testing

the

eff

ects

of

an

IC

T p

rog

ram

wh

ich

p

resen

ted

scen

ario

s t

ha

t m

irro

r clin

ica

l situa

tion

s.

45

1st y

ea

r nu

rsin

g

stu

de

nts

Qu

alit

ative a

na

lysis

of

the

d

iscu

ssio

n b

oa

rds s

ho

wed

a

sig

nific

an

t im

pro

ve

me

nt

in %

of

po

sts

fo

r w

hic

h t

he

o

pin

ion

s a

nd

co

nclu

sio

ns o

f th

e p

art

icip

an

ts w

ere

ju

stifie

d

with

va

lid a

rgu

men

ts.

No

re

po

rtin

g o

f to

ol

relia

bili

ty.

No

dis

cu

ssio

n

reg

ard

ing d

eve

lop

men

t of

too

ls, e

xp

ert

re

vie

w p

roce

ss

or

psych

om

etr

ic te

sting

of

the t

oo

l

Inclu

de

Cri

tic

al T

hin

kin

g S

ca

le (

CT

S)

Lee

et

al

(20

13

)

Ta

iwa

n

Lon

gitu

din

al stu

dy,

me

asu

rin

g a

t 4

tim

e-

po

ints

te

sting

the

eff

ects

of

co

nce

pt

map

pin

g

A c

on

ven

ien

ce s

am

ple

of

95

stu

den

ts,

Bo

th c

on

tro

l an

d

expe

rim

en

tal g

roup

s h

ad

h

ighe

r in

itia

l C

T s

co

res th

at

tend

ed

to

de

cre

ase

ove

r tim

e.

No

re

po

rtin

g o

f re

liab

ility

of

CT

sca

le fo

r th

is s

tud

y.

Inclu

de

Cri

tic

al T

hin

kin

g A

sse

ss

me

nt

(CT

A)

Ma

nn

, (2

012

).

US

A

Expe

rim

en

tal, p

re-p

ost-

test, m

ixe

d m

eth

od

d

esig

n te

stin

g the

eff

ects

of

gra

nd

rou

nd

s

21

2nd y

ea

r nu

rsin

g

stu

de

nts

.

No

sig

nific

an

t d

iffe

ren

ce

b

etw

een

CT

sco

res fo

r th

e

two

gro

up

s.

In the

co

ntr

ol

gro

up

, stu

den

ts' s

co

res

ind

icate

d a

de

cre

ase

CT

sco

res.

No

re

po

rtin

g o

f re

liab

ility

of

CT

A f

or

this

stu

dy.

Inclu

de

Blo

om

s T

ax

on

om

y

Page 19: TITLE: Evaluation of tools used to measure critical ...

19

Jo

ne

s,(

20

0

8).

US

A

A q

ua

si-e

xpe

rim

en

tal,

pre

-po

st

test stu

dy

testing

the

eff

ects

of

PB

L

60

2nd

ye

ar

nu

rsin

g

stu

de

nts

.

Inte

rve

ntion

gro

up

d

em

on

str

ate

d a

hig

he

r sig

nific

an

t in

cre

ase in

CT

co

mp

are

d to t

he

co

ntr

ol

gro

up

.

No

re

po

rtin

g o

f re

liab

ility

. U

ncle

ar

wh

eth

er

the

to

ol

wa

s v

alid

ate

d o

r re

vie

wed

b

y e

xp

ert

s. B

loo

ms

taxon

om

y u

se

d to d

eve

lop

th

e t

oo

l, b

ut

no

atte

mp

t to

re

late

th

is t

o t

he

recog

nis

ed

d

efin

itio

ns o

f C

T

Inclu

de

Co

nc

ep

t M

ap

Sc

ori

ng

Ab

el. &

F

ree

ze,

(20

06

) U

SA

Lon

gitu

din

al stu

dy

me

asu

rem

en

t o

ve

r 4

tim

ep

oin

ts te

stin

g t

he

eff

ects

of

co

ncep

t m

ap

pin

g

28

associa

te d

eg

ree

nu

rsin

g s

tude

nts

T

he

re w

as a

sig

nific

an

t in

cre

ase

in m

ea

n s

co

res o

f th

e f

irst

co

nce

pt

map

to

th

e

ave

rag

e m

ea

n s

co

re o

f th

e

last

two

map

s (

p=

0.0

5).

No

re

po

rtin

g o

f re

liab

ility

of

too

l. L

imited

info

rmatio

n

abo

ut sco

ring

crite

ria

, n

ee

de

d m

ore

info

rmation

h

ow

th

is s

co

re r

ela

tes to

critica

l th

inkin

g

Inclu

de

Cri

tic

al T

hin

kin

g L

ike

rt S

ca

le (

CT

LS

)

Ste

ve

ns,

Bre

nn

er

&

Bre

nn

er

(20

09

) U

SA

Pre

-post

test

expe

rim

en

tal de

sig

n

testing

the

PA

LS

le

arn

ing

app

roa

ch

15

nu

rsin

g s

tud

en

ts

Incre

ase in

sco

res o

n C

TLS

b

ut no

sta

tistica

l an

aly

sis

p

erf

orm

ed

.

No

re

po

rtin

g o

f re

liab

ility

of

CT

LS

fo

r th

is s

tud

y o

r p

revio

usly

.

Exclu

de

d

ue

to

la

ck o

f sta

tistica

l a

na

lysis

a

nd

re

po

rtin

g

of

resu

lts.

Cri

tic

al T

hin

kin

g S

ca

le (

CT

SM

)

Tse

ng

, e

t a

l (2

011

).

Ta

iwa

n

A q

ua

si-e

xpe

rim

en

tal

de

sig

n m

ea

su

rem

en

t o

ve

r 3

tim

e-p

oin

ts te

stin

g

the e

ffe

cts

of

PB

L.

120

RN

stu

de

nts

.

Th

e C

TS

sco

res w

ere

sig

nific

an

tly h

igh

er

in th

e

expe

rim

en

tal g

roup

Cro

nba

ch

’s a

lph

a

co

eff

icie

nt

of

the C

TS

wa

s

.94.

Lim

ite

d info

rma

tion

re

ga

rdin

g t

he

CT

S to

ol a

nd

h

ow

it m

ea

su

red

CT

.

Inclu

de

Page 20: TITLE: Evaluation of tools used to measure critical ...

20

Results

All 34 studies measured CT skill development or change, either following completion of a

specific educational intervention or an undergraduate nursing program. Most studies were

conducted in Western countries, namely the USA (n=20) and the United Kingdom (n=1); others were

conducted in Taiwan (n=4), Korea (n=3), China (n=2), Iran (n=1), Hong Kong (n=2), Turkey

(n=1), and Slovenia (n=1).

Reliability, Validity and Factor Domains of the Tools

Reliability, validity and factor domains of the tools were examined. This included examination

of previous and current reliability and validity testing. With respect to reliability, Facione and

Facione (1992b) noted that a Kuder-Richardson (KR-20) range of .65 to .75 for this type of

instrument is acceptable. Kaplan and Saccuzzo (1997) similarly reported that reliability

estimates in the range of .70 to .80 are acceptable.
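
For readers less familiar with the coefficient, the KR-20 statistic referred to above can be sketched in a few lines of Python. This is an illustrative sketch only and is not taken from any of the reviewed studies: the function name `kr20` and the small response matrix are hypothetical, and the sketch assumes dichotomously scored (0/1) items and population (divide-by-n) variances.

```python
def kr20(responses):
    """Kuder-Richardson 20 for dichotomous (0/1) item scores.

    responses: list of rows, one row per examinee, one 0/1 entry per item.
    """
    n_people = len(responses)
    k = len(responses[0])  # number of items
    # Proportion of examinees answering each item correctly.
    p = [sum(row[j] for row in responses) / n_people for j in range(k)]
    # Sum of item variances; for a 0/1 item the variance is p * (1 - p).
    sum_pq = sum(pj * (1 - pj) for pj in p)
    # Population variance of the total scores.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: five examinees, four items.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(sample), 2))
```

A value computed this way would then be compared against the acceptability ranges cited above (.65 to .75, or .70 to .80).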

Factor Domains

In addition to developing a definition of CT, the APA also concluded that critical thinking

comprised two dimensions: cognitive skills and disposition (Facione, 1990). Within the

cognitive skills dimension, four sub-skills were defined: interpretation, analysis, evaluation,

and inference. The disposition dimension was defined as truth-seeking, open-mindedness,

analyticity, systematicity, self-confidence, inquisitiveness, and maturity of judgment (Facione

& Facione, 1992a). Some scholars have questioned the applicability of the universal definition of

CT to the discipline of nursing. Scheffer and Rubenfeld (2000) conducted a Delphi study to

develop a consensus definition of CT in nursing. A set of 17 consensus CT skills and habits

of the mind were developed, many of which reflected Facione’s (1990) earlier work with the

addition of creativity, intuition and transforming knowledge (Scheffer & Rubenfeld, 2000).

There has not been any published work on a definition of critical thinking for midwifery. The

construct validity of the tools was assessed according to the dimensions and sub-skills of CT

as outlined in the previous work of Facione (1990) and Scheffer and Rubenfeld (2000).

The California Critical Thinking Disposition Inventory (CCTDI) uses the APA consensus

definition of critical thinking as the theoretical basis to measure the extent to which an

individual possesses the attitudes of a critical thinker (Facione & Facione, 1992a). The

domains assessed are open-mindedness, analyticity, cognitive maturity, truth-seeking,

systematicity, inquisitiveness, and self-confidence.

The CCTDI has a reported overall median alpha coefficient of .90 (Facione, 1994),

demonstrating good reliability. Of the twelve studies that utilised the CCTDI, only four

(Atay & Karabacak, 2012; Shin et al., 2006; Stewart & Dempsey, 2005; Yu et al., 2012)

tested reliability of the CCTDI. Two of the studies (Atay & Karabacak, 2012; Yu et al., 2012) reported

reliability levels of .88 and .89, similar to that reported by Facione (1994). However, Stewart

and Dempsey (2005) reported only marginal reliability with an alpha coefficient between .67

and .75. Shin et al. (2006) reported a much lower alpha coefficient of .53. These inconsistent

results cast some doubt on the reliability of this tool in different nursing education contexts.
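
The alpha coefficients compared above are Cronbach's alpha values. As a minimal sketch of how such a coefficient is obtained from Likert-type item scores, the following Python function may be helpful. The function name `cronbach_alpha` and the example scores are hypothetical and are not drawn from the CCTDI or from any reviewed study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one row per respondent, one numeric entry per item.
    """
    k = len(scores[0])  # number of items
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of each respondent's total score.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, three items that rank the
# respondents identically, so the items are perfectly consistent.
likert = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
]
print(round(cronbach_alpha(likert), 2))
```

Population variances are used for both the items and the totals; using sample variances throughout would give the same ratio and therefore the same alpha.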

The California Critical Thinking Skills Test (CCTST) was designed to measure critical

thinking in college students (Facione, 1992b). The CCTST measures the ability of

participants to draw conclusions in the areas of analysis, inference, evaluation, and deductive

and inductive reasoning (Facione & Facione, 1998). These skills relate to the APA

consensus definition of critical thinking (Facione, 1990). The KR-20 estimate of internal

consistency of the CCTST was r = .70 (Facione & Facione, 1998). Four of the seven studies

that utilised the CCTST reported on reliability. Two studies reported low alpha coefficients of

.62 (Beckie et al., 2001) and between .55 and .83 (Spelic et al., 2001). The CCTST was used

to track development of CT in students undertaking different study pathways (Spelic et al.,

2001). Some concerns were expressed with the internal consistency of the CCTST across

the different cohorts. The total score α for the RN-BSN group was very low (alpha = .31)

compared to the traditional and accelerated pathways cohorts (alpha = .66). Spelic et al.

(2001) suggested that the reliability of tools with few items and involving a timed test

administration is low. The CCTST comprises 34 items, and Spelic et al. (2001) found that on

several items all students scored the same. When these items were removed the α level for

30 items was .62. This limitation highlights the value of using multiple measures in the

assessment of CT.
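
The item-removal step described above (dropping items on which every student scored the same before computing α) can be illustrated with a short Python sketch. The data, the helper `drop_constant_items`, and the resulting values are hypothetical and do not reproduce the analysis of Spelic et al. (2001); the sketch only shows that an item with no variance changes the item count k and therefore the coefficient.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def drop_constant_items(scores):
    """Remove items on which all respondents obtained the same score."""
    k = len(scores[0])
    keep = [j for j in range(k) if len({row[j] for row in scores}) > 1]
    return [[row[j] for j in keep] for row in scores]


# Hypothetical 0/1 scores: the fifth item was answered correctly by everyone.
scores = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
alpha_all = cronbach_alpha(scores)
alpha_reduced = cronbach_alpha(drop_constant_items(scores))
print(round(alpha_all, 2), round(alpha_reduced, 2))
```

In this toy example the constant item adds nothing to either the item variances or the total-score variance, so the two coefficients differ only through the k / (k - 1) factor.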

The second study using the CCTST demonstrated inconsistent results (Beckie et al., 2001).

Two cohorts of nursing students in a new curriculum focussing on CT skills completed the

CCTST over three time-points. The first group experienced significantly improved CT scores

from baseline but scores of the second group revealed decreased CT scores. This variation

in results across the two cohorts undertaking the same curriculum casts doubt on the

reliability of this tool.

The other two studies that tested the reliability of the CCTST (Chau et al., 2001; Yuan et al.,

2008) reported similar results to Facione and Facione (1998). The differences in findings

between these four studies may indicate that the CCTST does not consistently measure CT

in nursing practice across different settings.

The HSRT is a commercially available, recent adaptation of the CCTST specifically designed

for health sciences students and professionals to assess their CT and clinical reasoning

skills (Goodstone et al., 2013). Similar to the CCTST, the HSRT uses the sub-skills identified

within the APA consensus definition of critical thinking. The HSRT is considered a reliable

and valid measure of critical thinking for entry level nursing students with a KR-20 of .81

(Facione, Facione & Winterhalter, 2010). The three studies that used this tool all tested the

effects of simulation on CT but none reported reliability (Sullivan-Mann et al., 2009; Shinnick

& Woo, 2013; Goodstone et al., 2013). One study (Sullivan-Mann et al., 2009) reported an

increase in students’ CT skills following simulation but the other two studies (Goodstone et

al., 2013; Shinnick & Woo, 2013) reported no statistical increase, with a decrease in scores in

one study. These inconsistent results could indicate that the HSRT is not a reliable tool across

diverse settings and populations.

The WGCTA, originally developed in the 1920s, measures both logical and creative

components of CT and assesses CT ability in individuals with at least a ninth-grade

education (Watson & Glaser, 1980). The test comprises 80 proposed arguments related to

25 statements that include problems, arguments, and interpretations. On completion a total

score is produced based on the assessment of five critical thinking skills: inference,

recognition of assumptions, deduction, interpretation and evaluation of arguments, which

align to the CT sub-skills defined by Facione (1990). The WGCTA measures the underlying

constructs of classical logic and general reasoning skills rather than application of CT skills

(Walsh & Seldomridge, 2006). Only the study by Brown et al. (2001) reported an alpha

coefficient of .77. This is consistent with the split-half reliability coefficients of .69 to .85

reported by Watson and Glaser (1980). The three studies that used the WGCTA were all

conducted in the USA and used a longitudinal design to detect change in CT across different

undergraduate nursing degrees (L’Eplattenier, 2001; Brown et al., 2001; Daly, 2001). Two of

the studies (L’Eplattenier, 2001; Daly, 2001) found no change in CT scores, whereas Brown

et al. (2001) reported increases in CT scores of students undertaking traditional and RN-BSN

pathways but no change for students in the accelerated pathway. These inconsistencies in

findings may support claims that the constructs within the WGCTA are not suited to measure

CT skills in the nursing discipline (Walsh & Seldomridge, 2006).
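
The split-half coefficients cited above are conventionally stepped up to full test length with the Spearman-Brown formula, r_full = 2r / (1 + r), where r is the correlation between the two half-test scores. The Python sketch below illustrates the calculation under stated assumptions: the odd/even item split, the function names, and the response data are hypothetical and are not taken from the WGCTA manual or from Brown et al. (2001).

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """Split items into odd- and even-numbered halves, then correct."""
    first_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(first_half, second_half))


# Hypothetical 0/1 scores: five examinees, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(split_half_reliability(responses), 2))
```

Other ways of splitting the items (for example, first half versus second half) can give different coefficients, which is one reason split-half estimates are usually reported as a range.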

Of the twelve non-standardised tools utilised to measure critical thinking in this review, only

four tested reliability. The Critical Thinking Ability Scale (CTAS) for College Students has a

reported Cronbach’s alpha of .74 (Park, 1999). The CTAS was used by Choi et al. (2014) to

measure the effect of problem-based learning (PBL) on CT and had a reported Cronbach’s

alpha of .71. Although the aim was to measure changes in students’ CT abilities, the CTAS

is a self-report tool that assesses the domains of intellectual curiosity, healthy skepticism,

intellectual integrity, prudence, and objectivity, which relate more to CT disposition than

to CT skills.

The Critical Thinking Disposition Scale (CTDS) for Nursing Students developed by Park &

Kim (2009) has a reported Cronbach’s alpha of .78. Jun et al. (2013) used the CTDS to

measure critical thinking development in 161 nursing students, and reported a Cronbach’s

alpha of .81. The CTDS uses the concepts of intellectual integrity, creativity, challenge,

open-mindedness, prudence, objectivity, truth seeking, and inquisitiveness, which relate directly

to dispositional characteristics identified by both Facione (1990) and Scheffer & Rubenfeld

(2000). This tool is not available in English, which limits its use in other settings. Similar to the

CCTDI, the CTDS measures only CT disposition, not the application of CT skills in

practice.

The N3 case report accreditation form developed by the Taiwan Nurses Association was

used to assess students’ CT abilities in the critique of case study reports (Chen & Lin, 2003).

Testing of this tool resulted in good inter-rater reliability (Pearson r = .89) and internal

consistency (KR-20 = .79), but low test-retest reliability (.32) after a 16-week interval.

However, the construct validity of this tool is questionable. The criteria of the tool do not

reflect any of the CT constructs. Instead the tool was constructed on the basis of the nursing

process with critical inquiry points listed under each step of the nursing process (Chen & Lin,

2003). The study tested the effects of a research course, and found significantly higher CT

scores in students who undertook the course. However, it was unclear whether the tool

measured students’ abilities to critique an article rather than their CT abilities.

The Critical Thinking Process Test (CTPT), a commercial tool developed by Educational

Resources, has a reported reliability coefficient of .93 (Anderson et al., 2000). The CTPT

measured CT development in two studies but neither reported on reliability (DeSimone,

2006; Morey, 2012). The CTPT assesses four aspects of the critical thinking process:

listening, writing, speaking, and reading, and five levels of abstract thinking: prioritizing,

inferential reasoning, goal setting, application of knowledge, and evaluation of predicted

outcomes. Several concepts partially relate to elements of the recognised definition of CT.

This tool is expensive to administer and not widely used (Fountain, 2011).

The Critical Thinking Scale (CTSM) developed by McMaster University assesses the effects

of PBL and concept mapping on CT (Tseng et al., 2011). The reported Cronbach’s alpha

coefficient of .94 (Tseng et al., 2011) was closely matched in another study, which reported .93

(Chou, Jian, Tseng & Ko, 2014). The concepts of inference, recognition of assumptions,

deduction, interpretation, and evaluation of argument reflect the critical thinking sub-skills

identified by Facione (1990). The CTSM is a student self-report test and may not measure

CT in practice.

Validated concept map scoring criteria were used to measure CT development over a

one-year period (Abel & Freeze, 2006). Inter-rater reliability testing with two assessors found an 85%

level of agreement (Abel & Freeze, 2006). The authors stated that content validity had

previously been established, and no further testing of internal consistency was performed.

The scoring criteria were: 1) meaningful relationships between two concepts indicated by a

connecting line; 2) hierarchy shows a general to specific approach; 3) cross-links show

meaningful connections between one segment of the hierarchy; and 4) examples describe

specific instances of a concept (Lawson, 2012). It was unclear how the scoring criteria

related to the dimensions of CT. The study demonstrated increases in students’ concept

map scores as they progressed through the curriculum, but it is uncertain whether this

increase was representative of increases in critical thinking or simply improved competence

in concept mapping.
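
The 85% level of agreement reported above for the concept map scoring criteria is a simple percent agreement between two assessors. A minimal Python sketch of that calculation follows. The function name `percent_agreement` and the two sets of ratings are hypothetical, and the source does not state how Abel and Freeze (2006) computed their figure.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of cases on which two raters gave the same score."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same set of cases.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


# Hypothetical scores for 20 concept maps; the raters disagree on three.
rater_a = [3, 2, 4, 1, 3, 2, 4, 4, 1, 2, 3, 3, 2, 1, 4, 2, 3, 1, 2, 4]
rater_b = [3, 2, 4, 1, 3, 2, 4, 4, 1, 2, 3, 3, 2, 1, 4, 2, 3, 2, 3, 3]
print(percent_agreement(rater_a, rater_b))
```

Percent agreement does not correct for agreement expected by chance, so it is often reported alongside a chance-corrected statistic.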

The Critical Thinking Scale (CTS) assesses CT through the concepts of inference,

recognition of assumptions, deduction, interpretation, and evaluation of argument (Lee et al.,

2013). These concepts match those suggested within the two recognised definitions of CT.

In a study examining the effects of concept mapping on CT skills, Lee et al. (2013) reported

that reliability testing, as well as testing of convergent and known-group validity, had previously

been conducted by the developers of the tool (Cheng et al., 1996). No further testing of the reliability of the tool

was conducted by Lee et al. Using a longitudinal design, the CT scores of students

exposed to one semester of teaching on concept mapping were compared with those of a

control group (Lee et al., 2013). Initial increases in CT scores were found in both groups, but scores

decreased over time. These findings imply that the teaching methodologies were not effective, but

may also indicate that the CTS is not reliable in measuring changes in CT over time.

The Critical Thinking Assessment (CTA) tool was used to evaluate the effects of a grand

round education strategy on CT (Mann, 2012). The CTA has a reported alpha of .69 and a

standardized item alpha of .70 in first-time examinees (Assessment Technologies Institute,

2001). No reliability testing was performed by Mann (2012). The CTA uses 40 multiple

choice questions based on the domains of interpretation, analysis, evaluation, inference,

explanation and self-regulation (ATI, 2003). Four of these domains (interpretation, analysis,

evaluation and inference) directly relate to the recognised domains of CT. There were no

significant differences in CT scores between the control and experimental groups, although scores decreased

in the control group (Mann, 2012). The unexpected decrease in CT scores could be due to

the very small sample size of 21, with only 4 students in the control group.

Four of the twelve non-standardised tools were newly developed with the specific purpose of

measuring critical thinking in action (Daly, 2001; Jones, 2008; Morey, 2012; Pucer et al.,

2014). The studies utilised practice-based teaching, learning, and assessment activities to

measure CT, which not only presents opportunities to evaluate the application of CT but also

reduces survey and response burden as the activities are embedded in student learning.

However, none of these studies reported reliability of these newly developed tools.

Pucer et al. (2014) used a discussion board tool to analyse students’ postings according to

core elements of critical thinking (as defined by Facione, 1990). A significant

improvement in the percentage of posts where the opinions and conclusions of participants

were justified with valid arguments was reported (Pucer et al., 2014). However, limited

information was presented on the development of the tool, process of expert review and

validation, or inter-rater reliability.

The effect of PBL on students’ CT development was measured by grading nursing care

plans over a semester (Jones, 2008). The grading system was based on the six levels of

Bloom’s taxonomy of cognitive learning, with grading domains described as comprehending information,

organising ideas, and evaluating information and actions. Students who experienced the

PBL educational intervention demonstrated higher CT scores. It was not clear, however, whether

the tool was validated or reviewed by experts. Although Bloom’s taxonomy was used as the

basis of the tool, there did not seem to be any attempt to relate the grading domains to the

recognised definitional elements of CT (Facione, 1990; Scheffer & Rubenfeld, 2000).

In an attempt to establish concurrent validity Morey (2012) used both a newly developed

qualitative tool based on a ‘think aloud protocol’, and a standardised tool (CTPT) to measure

the effects of an animated pedagogical agent on critical thinking. The think aloud protocol

used elements of the nursing process to assess students’ thinking in solving a clinical

scenario (Morey, 2012). The elements of collect, review, relate, interpret, infer, diagnosis,

act, and evaluate did not align directly with the recognised definitions of CT. Both groups

displayed significant improvements in CT levels and correct conclusions from baseline to

post-intervention on the think-aloud protocol, but only the pedagogical agent group had a

significant result for “evaluation”. These mixed results may indicate the difficulty in

measuring CT development in a standardised exam format. Reliability testing and construct

validity of the think-aloud protocol were not reported; therefore, results must be viewed with caution.

Daly (2001) also compared the use of a newly developed think-aloud analytic framework and

a standardised tool (WGCTA) to measure CT development over an 18-month period. No

statistically significant improvement in the WGCTA scores was found. The think-aloud qualitative

assessment demonstrated consistent evidence of reasoning that reflected an “enduring

absolutist epistemology” but portrayed little evidence of CT (Daly, 2001). The author

explained that reasoning of this nature usually involves a single-theory structured argument,

which is contradictory to the principles of CT (Daly, 2001). Although both tools indicated

similar results, no reliability testing was conducted. The constructs of this new tool were

described as differentiation and integration in reasoning, situation modelling and argument

and evidential structure (Daly, 2001), which do not incorporate the recognised definitional

elements of CT (Facione, 1990; Scheffer & Rubenfeld, 2000).

Discussion

This review included studies from 9 different countries using 16 different tools. This section

discusses the findings in relation to the reliability, validity and factor domains of the

standardised tools and then examines the non-standardised tools.

The reliability of tools used to measure CT in nursing practice was not reported consistently

and varied considerably. Only two authors of new tools reported on stability using

test-retest, and at best, split-half reliability for internal consistency was reported. The review

included four commercially available tools, and their cost may limit their use for routine

evaluation of classroom teaching effectiveness. The CCTDI and the CCTST had reported

reliability ranging from .31 to .89 and some authors using these tools did not test reliability

for their specific context. The CCTDI measures students’ self-reported CT disposition and does

not measure the development of CT skills. Student self-report may be affected by

recall bias and a socially desirable response set (Tiwari et al., 2006). The act of critical

thinking involves both skills and habits of the mind (Scheffer & Rubenfeld, 2000). The CCTDI

only measures the habits of the mind. For a complete assessment of students’ critical

thinking both skills and disposition need to be measured, and the CCTST should be used in

conjunction with the CCTDI (Insight Assessment, 2013).
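The two reliability estimates named in this paragraph can be sketched in a few lines. Test-retest reliability is commonly taken as the Pearson correlation between two administrations of the same tool, and split-half reliability as the correlation between two halves of the items, stepped up with the Spearman-Brown formula. The function names, the odd/even split, and the scores below are assumptions made for illustration; they do not reproduce the procedure of any reviewed study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


def split_half_reliability(item_scores):
    """Odd/even split-half reliability with the Spearman-Brown correction.

    item_scores holds one list of item scores per student.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r = pearson_r(odd_half, even_half)
    if r <= -1.0:
        raise ValueError("Spearman-Brown correction is undefined for r = -1")
    # Step the half-test correlation up to the full test length.
    return 2 * r / (1 + r)


# Hypothetical data: five students answering four items scored 0 or 1.
items = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
print(split_half_reliability(items))

# Hypothetical total scores for the same students at two time points.
time_1 = [14, 9, 6, 3, 11]
time_2 = [13, 10, 5, 4, 12]
print(pearson_r(time_1, time_2))  # test-retest estimate
```

Both estimates depend on the sample that produced them, which is consistent with the point above that reliability should be re-tested in each specific context rather than assumed from the test manual.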

A lack of congruence between items in the CCTST and the CCTDI could account for

inconsistencies in reliability. Although the cognitive skills underlying the framework for the

CCTST and the CCTDI were identified as important to the practice of nursing (Stone,

Davidson, Evans, & Hansen, 2001), the same study found less agreement on whether the

items reflected CT skills required of nurses. Inconsistent results across studies have

prompted questions related to the reliability of the CCTDI to measure dispositional attitudes

(Walsh & Seldomridge, 2006), and the lack of stability of the instrument (Walsh & Hardy,

1997; Kakai, 2003).

Limited reporting of tool reliability makes it difficult to assess their applicability in the nursing

and midwifery contexts. Concern could also be justified over the focus of existing tools

(especially standardised tools) on the measurement of formal logic and general thinking

skills, rather than the application of CT in practice (Seldomridge & Walsh, 2006).

Four new tools that measure the application of CT skills in nursing practice were reviewed.

However, none of these new tools were tested for reliability. When the domains were

compared to the recognised definition of CT, construct validity was only established for one

tool (Jones, 2008). None of the studies conducted a factor analysis to establish validity. In

the development of the new tools, items were drawn from concepts thought to be useful but

no testing was conducted to confirm this. Therefore, further research with large samples,

factor analysis, and testing of different forms of reliability and validity is required before

implementing these tools into practice.

CT is also considered to be a multidimensional concept, and a single test in a multiple

choice format may be inadequate to accurately detect change in development. There is a

need to ensure that measures of CT development address the complexity of practice and

are adaptive to the nursing and midwifery environments (Rubenfeld & Scheffer, 2006). A

mixed-methods approach and triangulation of findings may provide greater validity, reliability,

and insight into CT development.

Conclusion

There was limited reporting of the reliability of tools in the included studies. Overall there

was relatively little emphasis placed on validity of newly developed tools. Inconsistent results

were found in studies using standardised tools, casting doubt on the reliability of these tools

in the nursing context. On examination of the domain concepts, construct validity was

questionable for several of the non-standardised tools used.

Nursing and midwifery education needs to prepare graduates to work effectively in complex,

fast-paced and uncertain environments. Continued collection of data using measures of

generalised CT is unlikely to help improve curricula, teaching methods, or preparation of

students for professional practice. There is a need to develop discipline-specific instruments

to measure CT in nursing and midwifery, and more specifically tools that measure the

application of CT to practice. Considering the complexity of critical thinking in nursing and

midwifery practice, and that CT development occurs over a long time, measurement requires

a long-term, multi-method approach.

References

Anderson, N., Booth, L., Catalano, J., Gaines, L., Horner, M., & McCormick, S. (2000). Critical thinking process test: Development and technical report. Stillwell, KS: Educational Resource, Inc.

Assessment Technologies Institute, LLC. (2001). CT assessment: Developmental and statistical report. Overland Park, KS: Author.

Abel, W. M., & Freeze, M. (2006). Evaluation of concept mapping in an associate degree nursing program. Journal of Nursing Education, 45(9), 356-364.

Atay, S., & Karabacak, Ü. (2012). Care plans using concept maps and their effects on the critical thinking dispositions of nursing students. International Journal of Nursing Practice, 18(3), 233-239. doi: 10.1111/j.1440-172X.2012.02034.x

Beckie, T. M., Lowry, L. W., & Barnett, S. (2001). Assessing critical thinking in baccalaureate nursing students: a longitudinal study. Holistic Nursing Practice, 15(3), 18-26.

Brown, J. M., Alverson, E. M., & Pepa, C. A. (2001). The influence of a baccalaureate program on traditional, RN-BSN, and accelerated students' critical thinking abilities. Holistic Nursing Practice, 15(3), 4-8.

Brunt, B. A. (2005). Models, measurement, and strategies in developing critical-thinking skills. Journal of Continuing Education in Nursing, 36(6), 255-262.

CASP (2013). Critical Thinking Appraisal Skills Programme: CASP Checklists. UK: CASP. Retrieved from http://www.casp-uk.net/#!casp-tools-checklists/c18f8

Chau, J. P. C., Chang, A. M., Lee, I. F. K., Ip, W. Y., Lee, D. T. F., & Wootton, Y. (2001). Effects of using videotaped vignettes on enhancing students' critical thinking ability in a baccalaureate nursing programme. Journal of Advanced Nursing, 36(1), 112-119. doi: 10.1046/j.1365-2648.2001.01948.x

Chen, F., & Lin, M. (2003). Effects of a nursing literature reading course on promoting critical thinking in two-year nursing program students. Journal of Nursing Research (Taiwan Nurses Association), 11(2), 137-147.

Cheng, Y. Y., Wang, W. C., Wu, J. J., & Hwang, C. K. (1996). A preliminary report on the construction of the critical thinking scale (in Chinese). Psychological Testing, 43, 213-226.

Choi, E., Lindquist, R., & Song, Y. (2014). Effects of problem-based learning vs. traditional lecture on Korean nursing students' critical thinking, problem-solving, and self-directed learning. Nurse Education Today, 34(1), 52-56. doi: 10.1016/j.nedt.2013.02.012

Chou, F. H., Jian, S. Y., Tseng, H. C., & Ko, H. G. (2004). The evaluation of students performance in applying problem-based learning to a nursing course. The Research Outcome of National Science Council, Taiwan.

Daley, B. J., Shaw, C. R., Balistrieri, T., Glasenapp, K., & Piacentine, L. (1999). Concept maps: A strategy to teach and evaluate critical thinking. Journal of Nursing Education, 38, 42-47.

Daly, W. M. (2001). The development of an alternative method in the assessment of critical thinking as an outcome of nursing education. Journal of Advanced Nursing, 36(1), 120-130. doi: 10.1046/j.1365-2648.2001.01949.x

Dehkordi, A. H., & Heydarnejad, M. S. (2008). The effects of problem-based learning and lecturing on the development of Iranian nursing students' critical thinking. Pakistan Journal of Medical Sciences, 24(5), 740-743.

DeSimone, B. B. (2006). Curriculum design to promote the critical thinking of accelerated bachelor's degree nursing students. Nurse Educator, 31(5), 213-217.

Evans, B. C., & Bendel, R. (2004). Cognitive and ethical maturity in baccalaureate nursing students: did a class using narrative pedagogy make a difference? Nursing Education Perspectives, 25(4), 188-195.

Facione, P. A. (1990). Critical Thinking: A Statement of Expert Consensus for Purposes of Educational Assessment and Instruction, Executive Summary: “The Delphi Report”. CA: The Californian Academic Press. Retrieved from http://assessment.aas.duke.edu/documents/Delphi_Report.pdf

Facione, P. A., & Facione, N. C. (1992a). The California Critical Thinking Dispositions Inventory (CCTDI); and the CCTDI Test manual. Millbrae, CA: California Academic Press.

Facione, P. A., & Facione, N. C. (1992b). The California Critical Thinking Skills Test: Test Manual. Millbrae, CA: California Academic Press.

Facione, P., & Facione, N. (1994). The California Critical Thinking Disposition Inventory (CCTDI): Test Manual. Millbrae, CA: California Academic Press.

Facione, N. C., & Facione, P. A. (1996). Assessment design issues for evaluating critical thinking in nursing. Holistic Nursing Practice, 10(3), 41-53.

Facione, P. A., & Facione, N. C. (1998). The California Critical Thinking Skills Test: CCTST test manual. Millbrae, CA: California Academic Press.

Facione, P. A., Facione, N. C., & Winterhalter, K. (2010). The Health Sciences Reasoning Test: Test manual. Millbrae, CA: California Academic Press.

Fountain, L. (2011). Thinking Like a 21st Century Nurse: Theory, Instruments, and Methodologies for Measuring Clinical Thinking. Paper presented at the Annual Meeting of the American Educational Research Association New Orleans, University of Maryland.

Goodstone, L., Goodstone, M. S., Cino, K., Glaser, C. A., Kupferman, K., & Dember-Neal, T. (2013). Effect of simulation on the development of critical thinking in associate degree nursing students. Nursing Education Perspectives, 34(3), 159-162.

Insight Assessment (2013). California Critical Thinking Disposition Inventory (CCTDI). San Jose, CA: The Californian Academic Press. Retrieved from http://www.insightassessment.com/Products/Products-Summary/Critical-Thinking-Attributes-Tests/California-Critical-Thinking-Disposition-Inventory-CCTDI#sthash.HBvLVC1c.dpbs

Jones, M. (2008). Developing clinically savvy nursing students: an evaluation of problem-based learning in an associate degree program. Nursing Education Perspectives, 29(5), 278-283.

Jun, W. H., Lee, E. J., Park, H. J., Chang, A. K., & Kim, M. J. (2013). Use of the 5E learning cycle model combined with problem-based learning for a fundamentals of nursing course. Journal of Nursing Education, 52(12), 681-689.

Kakai, H. (2003). Re-examining the factor structure of the California Critical Thinking Disposition Inventory. Perceptual and Motor Skills, 96(2), 435-438.

Kaplan, R. M., & Saccuzzo, D. P. (1997). Psychological Testing: Principles, Applications and Issues (4th ed.). Pacific Grove, CA: Brooks/Cole.

Lawson, S. B. (2012). The Effectiveness of Concept Mapping as an Educational Tool to Enhance Critical Thinking Skills in Undergraduate Nursing Students. Unpublished Thesis. Indiana: Ball State University.

L'Eplattenier, N. (2001). Tracing the development of critical thinking in baccalaureate nursing students. Journal of the New York State Nurses Association, 32(2), 27-32.

Lee, W., Chiang, C. H., Liao, I. C., Lee, M. L., Chen, S.L., & Liang, T. (2013). The longitudinal effect of concept map teaching on critical thinking of nursing students. Nurse Education Today, 33(10), 1219-1223. doi: 10.1016/j.nedt.2012.06.010

Mann, J. (2012). Critical Thinking and Clinical Judgment Skill Development in Baccalaureate Nursing Students. Kansas Nurse, 87(1), 26-31.

Mong-Chue, C. (2000). Professional issues. The challenges of midwifery practice for critical thinking. British Journal of Midwifery, 8(3), 179-183.

Morey, D. J. (2012). Development and Evaluation of Web-Based Animated Pedagogical Agents for Facilitating Critical Thinking in Nursing. Nursing Education Perspectives, 33(2), 116-120. doi: 10.5480/1536-5026-33.2.116

Muoni, T. (2012). Decision-making, intuition, and the midwife: Understanding heuristics. British Journal of Midwifery, 20(1), 52-56.

Naber, J., & Wyatt, T. H. (2014). The effect of reflective writing interventions on the critical thinking skills and dispositions of baccalaureate nursing students. Nurse Education Today, 34(1), 67-72. doi: 10.1016/j.nedt.2013.04.002

Park, S. H. (1999). The effects of the program for the improvement of college students' critical thinking ability [Korean]. Journal of Educational Psychology, 13(4), 93-112.

Park, J. A., & Kim, B. J. (2009). Critical thinking disposition and clinical competence in general hospital nurses [Korean]. Journal of Korean Academy of Nursing, 39, 840-850.

Paul, R. W. (1993). Critical Thinking. Santa Rosa, CA: Foundation for Critical Thinking.

Pucer, P., Trobec, I., & Žvanut, B. (2014). An information communication technology based approach for the acquisition of critical thinking skills. Nurse Education Today, 34(6), 964-970. doi: 10.1016/j.nedt.2014.01.011

Ravert, P. (2008). Patient simulator sessions and critical thinking. Journal of Nursing Education, 47(12), 557-562. doi: 10.3928/01484834-20081201-06

Rubenfeld, M. G., & Scheffer, B. K. (2006). Critical thinking TACTICS for nurses: tracking, assessing, and cultivating thinking to improve competency-based strategies. Sudbury, Mass: Jones and Bartlett.

Scheffer, B. K., & Rubenfeld, M. G. (2000). A consensus statement on critical thinking in nursing. Journal of Nursing Education, 39(8), 352-359.

Scholes, J., Endacott, R., Biro, M., Bulle, B., Cooper, S., Miles, M., Gilmour, C., Buykx, P., Kinsman, L., Boland, R., Jones, J., & Zaidi, F. (2012). Clinical decision-making: midwifery students’ recognition of, and response to, post partum haemorrhage in the simulation environment. BMC Pregnancy and Childbirth, 12, 19.

Seldomridge, L. A., & Walsh, C. M. (2006). Measuring critical thinking in graduate education: what do we know? Nurse Educator, 31(3), 132-137.

Shin, K. R., Lee, J. H., Ha, J. Y., & Kim, K. H. (2006). Critical thinking dispositions in baccalaureate nursing students. Journal of Advanced Nursing, 56(2), 182-189. doi: 10.1111/j.1365-2648.2006.03995.x

Shinnick, M. A., & Woo, M. A. (2013). The effect of human patient simulation on critical thinking and its predictors in prelicensure nursing students. Nurse Education Today, 33(9), 1062-1067. doi: 10.1016/j.nedt.2012.04.004

Spelic, S. S., Parsons, M., Hercinger, M., Andrews, A., Parks, J., & Norris, J. (2001). Evaluation of critical thinking outcomes of a BSN program. Holistic Nursing Practice, 15(3), 27-34.

Stevens, J., Brenner, C., & Brenner, Z. R. (2009). The peer active learning approach for clinical education: a pilot study. Journal of Theory Construction & Testing, 13(2), 51-56.

Stewart, S., & Dempsey, L. F. (2005). A longitudinal study of baccalaureate nursing students' critical thinking dispositions. Journal of Nursing Education, 44(2), 81-84.

Stone, C. A., Davidson, L. J., Evans, J. L., & Hansen, M. A. (2001). Validity evidence for using a general critical thinking test to measure nursing students' critical thinking. Holistic Nursing Practice, 15(4), 65-74.

Sullivan-Mann, J., Perron, C. A., & Fellner, A. N. (2009). The effects of simulation on nursing students' critical thinking scores: a quantitative study. Newborn and Infant Nursing Reviews, 9(2), 111-116.

Tiwari, A., Lai, P., So, M., & Yuen, K. (2006). A comparison of the effects of problem-based learning and lecturing on the development of students' critical thinking. Medical Education, 40(6), 547-554.

Tseng, H. C., Chou, F. H., Wang, H. H., Ko, H. K., Jian, S. Y., & Weng, W. C. (2011). The effectiveness of problem-based learning and concept mapping among Taiwanese registered nursing students. Nurse Education Today, 31(8), e41-46. doi: 10.1016/j.nedt.2010.11.020

Walsh, C. M., & Hardy, R. C. (1997). Factor structure stability of the California Critical Thinking Disposition Inventory across gender and various student majors. Perceptual and Motor Skills, 85, 1211-1228.

Walsh, C. M., & Seldomridge, L. A. (2006). Measuring critical thinking: one step forward, one step back. Nurse Educator, 31(4), 159-162.

Watson, G., & Glaser, E. M. (1980). Watson-Glaser Critical Thinking Appraisal. San Antonio, TX: Psychological Corp.

Wheeler, L. A., & Collins, S. K. R. (2003). The influence of concept mapping on critical thinking in baccalaureate nursing students. Journal of Professional Nursing, 19(6), 339-346.

Wood, R. Y., & Toronto, C. E. (2012). Measuring Critical Thinking Dispositions of Novice Nursing Students Using Human Patient Simulators. Journal of Nursing Education, 51(6), 349-352. doi: 10.3928/01484834-20120427-05

Yeh, M., & Chen, H. (2005). Effects of an educational program with interactive videodisc systems in improving critical thinking dispositions for RN-BSN students in Taiwan. International Journal of Nursing Studies, 42(3), 333-340.

Yu, D., Zhang, Y., Xu, Y., Wu, J., & Wang, C. (2012). Improvement in critical thinking dispositions of undergraduate nursing students through problem-based learning: a crossover-experimental study. Journal of Nursing Education, 52(10), 574-581.

Yuan, H., Kunaviktikul, W., Klunklin, A., & Williams, B. A. (2008). Improvement of nursing students' critical thinking skills through problem-based learning in the People's Republic of China: a quasi-experimental study. Nursing & Health Sciences, 10(1), 70-76.

Zadeh, H. H., Khajeali, N., Khalkhali, H., & Mohammadpour, Y. (2014). Effect of evidence-based nursing on critical thinking disposition among nursing students. Life Science Journal, 11 (9 Spec. Issue), 487-491.