Evaluation of tools used to measure critical thinking development in nursing and midwifery undergraduate students: A systematic review
Author
Carter, Amanda G, Creedy, Debra K, Sidebotham, Mary
Published
2015
Journal Title
Nurse Education Today
Version
Accepted Manuscript (AM)
DOI
https://doi.org/10.1016/j.nedt.2015.02.023
Copyright Statement
© 2015 Published by Elsevier Ltd. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, providing that the work is properly cited.
Downloaded from
http://hdl.handle.net/10072/161703
Griffith Research Online
https://research-repository.griffith.edu.au
TITLE: Evaluation of tools used to measure critical thinking development in nursing
and midwifery undergraduate students: A systematic review.
Word Count 4494 (without references and tables)
Authors:
Amanda G. Carter RM BHealthSc MMid
School of Nursing and Midwifery
Griffith University, Brisbane, Australia.
Debra K. Creedy RN PhD
Professor, Centre for Health Practice Innovation
Griffith Health Institute
Griffith University, Brisbane, Australia.
Mary Sidebotham RM PhD
School of Nursing and Midwifery
Griffith University, Brisbane, Australia.
Corresponding Author
Amanda G. Carter
School of Nursing and Midwifery
Griffith University
University Drive
Meadowbrook. Queensland 4131, Australia
Ph +61 7 33821535
[email protected]
Abstract
Background: Well-developed critical thinking skills are essential for nursing and midwifery
practice. The development of students’ higher-order cognitive abilities, such as critical
thinking, is also well recognised in nursing and midwifery education. Measurement of critical
thinking development is important to demonstrate change over time and effectiveness of
teaching strategies.
Objective: To evaluate tools designed to measure critical thinking in nursing and midwifery
undergraduate students.
Data sources: Six databases (CINAHL, Ovid Medline, ERIC, Informit, PsycINFO and Scopus)
were searched, retrieving 1,191 papers.
Review methods: After screening for inclusion, each paper was evaluated using the Critical
Appraisal Skills Programme Tool. Thirty-four studies met the inclusion criteria and passed quality
appraisal. Sixteen different tools that measure critical thinking were reviewed for reliability
and validity and the extent to which the domains of critical thinking were evident.
Results: Sixty percent of studies utilised one of four standardised commercially available
measures of critical thinking. Reliability and validity were not consistently reported and there
was variation in reliability across studies that used the same measure. Of the remaining
studies using different tools, there was also limited reporting of reliability, making it difficult to
assess internal consistency and potential applicability of measures across settings.
Conclusions: Discipline-specific instruments to measure critical thinking in nursing and
midwifery are required, specifically tools that measure the application of critical thinking to
practice. Given that critical thinking development occurs over an extended period,
measurement needs to be repeated and multiple methods of measurement used over time.
Key words: critical thinking, nursing, midwifery, measures, scales, evaluation
Introduction
The development of critical thinking (CT) skills has long been recognised as a priority in
tertiary education. The landmark Delphi study by the American Philosophical Association
(APA) produced an international expert consensus definition of critical thinking. Critical
thinking is described as purposeful, self-regulatory judgment which results in interpretation,
analysis, evaluation, and inference (Facione, 1990). Critical thinkers consider events or
issues in a controlled, purposeful, focussed and conscious way (Mong-Chue, 2000).
Critical thinking is a crucial skill for nurses and midwives who, like other healthcare
clinicians, need to effectively manage complex care situations in fast-paced environments
that demand increasing accountability (Mong-Chue, 2000; Muoni, 2012; Pucer, Trobec, &
Žvanut, 2014). The processes of clinical decision-making and problem-solving require
advanced CT skills (Muoni, 2012). CT is also essential for clinicians to critique and apply
evidence, especially in situations where ‘best practice’ remains unclear
(Scholes et al, 2012).
Although the development of students’ higher-order cognitive abilities is recognised as
important in nursing and midwifery education, the measurement of these vital skills is
inconsistent or neglected (Walsh & Seldomridge, 2006). The measurement of CT is
important to identify deficits and developments in students’ cognitive capacities as well as
demonstrate the effectiveness of teaching strategies. The purpose of this systematic review
was to evaluate tools used to measure CT development in nursing and midwifery
undergraduate students.
Search Strategies Utilised
A search of six major databases (CINAHL, Ovid Medline, ERIC, Informit, PsycINFO and Scopus)
was conducted in September 2014. The search was limited to English language articles
published in peer reviewed journals during 2001-2014. This period was chosen because the
results of a Delphi study to define CT in nursing were published in 2000 (Scheffer &
Rubenfeld, 2000), and scholarly work about CT in nursing would have developed further since
that publication.
The inclusion criteria were original research studies that utilised experimental designs to
assess CT development in undergraduate nursing and/or midwifery students. Papers were
excluded if CT was not specifically measured on more than one occasion, the sample
comprised postgraduate students, the full text was not available in English, the paper was a
discussion paper that did not involve original research, or the study did not use an
experimental design.
Five search terms were entered into the databases with the article title, abstract and body all
searched. The search terms used were:
1. “critical thinking” AND midwife*
2. “critical thinking” AND midwife* AND measure*
3. “critical thinking” AND midwife* AND evaluat*
4. “critical thinking” AND students, nursing AND measure*
5. “critical thinking” AND students, nursing AND evaluat*
The search was conducted sequentially using the search engines and search terms. An
initial search, filtering for date, language and source of publication, identified 1,191 papers.
Once duplicates were excluded, each identified citation was reviewed using the inclusion
and exclusion criteria and filtered through three screening levels i.e., (i) title screening; (ii)
title and abstract screening; and (iii) full-text screening. Articles that were not relevant or did
not meet inclusion criteria were discarded. Finally, 35 papers were included. No papers
involving midwifery undergraduate students met the inclusion criteria and hence the samples
in all of the papers are undergraduate nursing students.
Overview of Tools
Twenty-one (60%) of the 34 studies reviewed utilised one of four standardised commercially
available measures of critical thinking. These were the California CT Disposition Inventory
(10 studies), the California CT Skills Test (5 studies), the Watson-Glaser CT Appraisal (3
studies) and the Health Sciences Reasoning Test (3 studies). Two studies used both the
California CT Skills Test and California CT Disposition Inventory. All of these tools have
reported psychometric reliability and validity allowing comparison across settings,
disciplines, and time. Relatively few of the included studies (9 out of 21) undertook a
reliability analysis of the tool for their current context. There were twelve other measurement
tools utilised in the studies reviewed. See Table 1 for a comparison of tools employed in the
studies reviewed.
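
Where a study does analyse reliability for its own sample, the coefficients named in this review (Cronbach's alpha for Likert-type inventories such as the CCTDI, and KR-20 for dichotomously scored tests such as the CCTST) can be computed directly from item-level responses. The Python sketch below is illustrative only: the function names and the toy response matrices are assumptions, not data or code from any included study, and it applies the standard textbook formulas rather than any test publisher's scoring procedure.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(responses[0])  # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item (column) and of the respondents' total scores.
    item_var_sum = sum(pvariance(col) for col in zip(*responses))
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def kr20(responses):
    """Kuder-Richardson formula 20 for rows of dichotomous (0/1) item scores."""
    if any(score not in (0, 1) for row in responses for score in row):
        raise ValueError("KR-20 expects dichotomous 0/1 item scores")
    k = len(responses[0])
    n = len(responses)
    if k < 2 or n < 2:
        raise ValueError("need at least two items and two respondents")
    # p is the proportion answering each item correctly; p * (1 - p) is its variance.
    p = [sum(col) / n for col in zip(*responses)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores do not vary; KR-20 is undefined")
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical toy data: 4 respondents x 3 items.
likert_responses = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 4, 5]]
binary_responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]

print(round(cronbach_alpha(likert_responses), 2))
print(round(kr20(binary_responses), 2))
```

On 0/1 data the two functions agree, because KR-20 is the special case of Cronbach's alpha for dichotomous items. Recomputing the coefficient per sample, as only 9 of the 21 studies did, is what allows the variation across settings noted in this review to be seen.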
Table 1: Description of Tools/Methods to measure critical thinking

Columns: Name of Instrument / Author / Year Developed; Aim of tool; Number of Items / format; Psychometric Testing; Scores; Time to Complete; Factor Domains Measured

The California Critical Thinking Disposition Inventory (CCTDI) / Facione & Facione / 1992
Aim of tool: Measure the extent to which an individual possesses the attitudes of a critical thinker. Designed for use by the general adult population.
Number of items / format: 75 Likert items, “agree-disagree” scale, student’s self report.
Psychometric testing: Cronbach’s alpha .90 for the overall instrument and .71 to .80 for the seven subscales.
Scores: Maximum score of 60 in each domain. Negative disposition is a score below 30. The total maximum score is 420 points. Scores >350 indicate a high CT disposition. Scores <280 indicate paucity of CT.
Time to complete: 20-30 mins
Factor domains measured: Open-mindedness, analyticity, cognitive maturity, truth-seeking, systematicity, inquisitiveness, and self-confidence.

California Critical Thinking Skills Test (CCTST) / Facione & Facione / 1992
Aim of tool: Designed for assessment of entry or exit level CT skills of various groups of college students and for evaluation of learning outcomes of various curricular programs.
Number of items / format: 34 multiple choice items. Uses a generic scenario requiring an accurate and complete interpretation of the question.
Psychometric testing: The Kuder-Richardson (KR-20) estimate of internal consistency of the CCTST is reported in the test manual to be r = .70.
Scores: The maximum total score is 34. A score of ≥24 indicates very strong CT skills. A score of 13-23 indicates a mid-range skill level. Scores of ≤12 indicate fundamental weaknesses in CT skills.
Time to complete: 45-50 mins
Factor domains measured: Analysis, inference, evaluation, deductive and inductive reasoning.

Health Sciences Reasoning Test (HSRT) / Facione, Facione, & Winterhalter / 2010
Aim of tool: Adaptation of the CCTST specifically designed for use by health sciences students and professionals to assess their CT and clinical reasoning skills.
Number of items / format: 33 multiple choice questions. Uses a health related scenario requiring an accurate and complete interpretation of the question.
Psychometric testing: Internal consistency .77 to .84. Overall internal consistency value of .81 with Kuder-Richardson formula 20, and an overall .81 reliability coefficient.
Scores: Total score reflects overall CT skills. Maximum score is 33. Scores of 25 or above represent strong CT skills, scores from 15 to 24 are considered mid-range and represent competence in CT skills in most situations, and scores of 14 or below represent fundamental weaknesses in CT skills.
Time to complete: 30-50 mins
Factor domains measured: Analysis, inference, evaluation, inductive reasoning and deductive reasoning.

The Watson-Glaser Critical Thinking Appraisal (WGCTA) / Watson & Glaser / originally developed in 1925, most recent revision 2012
Aim of tool: Measures both logical and creative components of CT and assesses CT ability in individuals with at least a ninth grade education.
Number of items / format: 40 multiple choice items answering scenario based questions.
Psychometric testing: Reliability reported to be > .8. Using the Spearman-Brown formula, reliability for the total score of the WGCTA was established at .77. This is consistent with the split-half reliability coefficients, ranging from .76 to .85.
Scores: Maximum score is 80.
Time to complete: 40-50 mins
Factor domains measured: Inference, recognition of assumptions, deduction, interpretation and evaluation of arguments.

Think aloud analytic framework / Daly / 2001
Aim of tool: Analyse qualitative data to synthesise conception of CT.
Number of items / format: A scale of argument/epistemological complexity is used to assess videotaped client simulation.
Psychometric testing: Not stated
Scores: Scores range from 1-4.
Time to complete: No time commitment by student. Uses learning activities integrated into the course.
Factor domains measured: Structural components of differentiation and integration in reasoning, situation modelling and argument and evidential structure.

Critical Thinking Ability Scale (CTAS) for College Students / Park / 1999
Aim of tool: Assess dimensions of CT of college students.
Number of items / format: 20 items measured using a Likert scale from 1 = absolutely do not agree to 5 = absolutely agree.
Psychometric testing: Cronbach's alpha was found to be .74 (Park, 1999).
Scores: Total scores have a possible range from 5 to 100, with higher score indicating stronger CT ability.
Time to complete: Not stated
Factor domains measured: Intellectual curiosity, healthy skepticism, intellectual integrity, prudence, and objectivity.

Critical Thinking Disposition Scale for Nursing Students (CTDS) / Park & Kim / 2009 (Korean version only)
Aim of tool: Assess CT disposition in Korean nurses.
Number of items / format: 35 items assessed on a 5-point Likert scale. Student self-report.
Psychometric testing: Cronbach’s alpha = .78 (Park & Kim, 2009).
Scores: The total score ranges from 35 to 175, with a higher score indicating a higher level of critical thinking disposition.
Time to complete: Not stated
Factor domains measured: Intellectual integrity, creativity, challenge, open-mindedness, prudence, objectivity, truth seeking, inquisitiveness,

Critical Thinking Process Test (CTPT) / Educational Resources Inc. / 1999
Aim of tool: Developed specifically for nursing students. Focus on critical thinking process skills within a nursing environment, not level of nursing content knowledge.
Number of items / format: 50 item multiple choice.
Psychometric testing: The average reliability coefficient was .93 with demonstrated evidence of content and diagnostic validity (Anderson et al, 2000).
Scores: Not stated
Time to complete: 60 minutes
Factor domains measured: Assesses 4 aspects of the critical thinking process: listening, writing, speaking, and reading, and 5 levels of abstract thinking: prioritizing, inferential reasoning, goal setting, application of knowledge, and evaluation of predicted outcomes.

Think aloud protocol / Morey, 2002
Aim of tool: Provide a valid source of qualitative data on thinking and thought processes.
Number of items / format: A rating tool and rubric using a 4 point Likert scale for eight cognitive processes, level of critical thinking, and for accuracy of nursing diagnosis, conclusions, and evaluation.
Psychometric testing: Two faculty rated the think-aloud scenario responses with 97.9 to 100 percent rater agreement.
Scores: Not provided
Time to complete: No time commitment by student as uses learning activities integrated into the course.
Factor domains measured: Collect, review, relate, interpret, infer, diagnosis, act, and evaluate.

N3 case report accreditation form / Taiwan Nurses Association / no date available
Aim of tool: Not stated
Number of items / format: 45 criteria (including 36 strengths and 9 weaknesses).
Psychometric testing: Inter-rater reliability = .893, internal consistency of KR-20 = .79 and test-retest reliability of .32 (p<0.01).
Scores: Total scores ranged from 0-45.
Time to complete: No time commitment by student. Uses learning activities integrated into the course.
Factor domains measured: Constructed on the basis of the nursing process. Critical inquiry points are listed under each step of the nursing process.

Discussion board analysis / Pucer, Trobec & Žvanut / 2014
Aim of tool: Analyse discussion board posts for evidence of CT.
Number of items / format: Discussion posts examined against six elements of critical thinking.
Psychometric testing: Not stated
Scores: Not stated
Time to complete: 60 minutes
Factor domains measured: Analysis, inference, interpretation, explanation, evaluation, and self-regulation.

Critical Thinking Scale (CTS) / Cheng, Wang, Wu, & Hwang / 1996
Aim of tool: Not provided
Number of items / format: 60 item multiple choice questions. Participants choose one correct answer from either one in five or dichotomous response sets according to the item situations.
Psychometric testing: CTS demonstrated adequate reliability (internal consistency as well as split half reliability) and convergent as well as known group validity.
Scores: The higher scores indicate better CT skills.
Time to complete: Not provided
Factor domains measured: Inference, recognition of assumptions, deduction, interpretation, and evaluation of argument.

Critical Thinking Assessment (CTA) / Assessment Technologies Institute / 2001
Aim of tool: Determine students’ overall performance on specified CT skills determined to be necessary for success in an academic program for nursing study.
Number of items / format: 40 generic multiple choice questions.
Psychometric testing: CTA has a global alpha of .69 and a standardized item alpha of .70 for all 40 items in first-time examinees (ATI, 2001).
Scores: Maximum score of 40.
Time to complete: Not provided
Factor domains measured: Interpretation, analysis, evaluation, inference, explanation, self-regulation.

Bloom's Taxonomy / Jones, 2008
Aim of tool: Assess student’s developed nursing care plans for evidence of critical thinking.
Number of items / format: Using nursing care plans.
Psychometric testing: Not provided
Scores: Not provided
Time to complete: No time commitment by student as uses learning activities integrated into the course.
Factor domains measured: Knowledge, comprehension, application, analysis, synthesis, evaluation.

Concept map scoring / Daley, Shaw, Balistrieri, Glasenapp, Piacentine / 1999
Aim of tool: Assess student’s ability to develop concept maps that reflect CT used in the nursing process.
Number of items / format: Using concept maps.
Psychometric testing: Inter-rater reliability was performed with two assessors in the pilot study and the percentage of agreement of the independent scores was 85%. Content validity established by Daley et al (1999).
Scores: Not provided
Time to complete: No time commitment by student as uses learning activities integrated into the course.
Factor domains measured: Meaningful, valid and significant.

Critical Thinking Scale (CTSM) / McMaster University / 2002
Aim of tool: Not provided
Number of items / format: 10 items. Each item is scored on a six-point Likert scale of 1 to 6, with 1 corresponding to “never” and 6 to “always”.
Psychometric testing: Cronbach’s alpha coefficient was .93 and two-week test-retest reliability coefficient was .92.
Scores: Total scores range from 10 to 60 with higher scores indicating higher level of CT competency.
Time to complete: Not provided
Factor domains measured: Not stated

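
Two pieces of arithmetic recur in Table 1: the Spearman-Brown step-up cited for the WGCTA split-half reliability, and the cut-off bands used to interpret CCTST and HSRT total scores. A minimal Python sketch of both follows. The function and variable names are hypothetical, the Spearman-Brown formula is the standard two-half form (the table does not state the half-test correlation behind the reported .77), and the bands simply encode the cut-offs listed in the table.

```python
def spearman_brown(r_half):
    """Step up a split-half correlation to a full-length reliability estimate."""
    if not -1 < r_half <= 1:
        raise ValueError("r_half must lie in the interval (-1, 1]")
    return 2 * r_half / (1 + r_half)


# Cut-offs as listed in Table 1: (lowest score in band, label), highest band first.
SCORE_BANDS = {
    "CCTST": {
        "max": 34,
        "cutoffs": [(24, "very strong"), (13, "mid-range"),
                    (0, "fundamental weaknesses")],
    },
    "HSRT": {
        "max": 33,
        "cutoffs": [(25, "strong"), (15, "mid-range"),
                    (0, "fundamental weaknesses")],
    },
}


def score_band(tool, total_score):
    """Return the Table 1 interpretation band for a CCTST or HSRT total score."""
    spec = SCORE_BANDS[tool]  # raises KeyError for tools without listed bands
    if not 0 <= total_score <= spec["max"]:
        raise ValueError(f"{tool} total score must be between 0 and {spec['max']}")
    for lowest, label in spec["cutoffs"]:
        if total_score >= lowest:
            return label


print(score_band("CCTST", 24))
print(score_band("HSRT", 14))
print(round(spearman_brown(0.5), 3))
```

Only the CCTST and HSRT are encoded because they are the tools for which the table gives a complete set of bands. The CCTDI row lists thresholds only for high (>350) and low (<280) totals, so the range in between is left unlabelled here rather than guessed.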
Included studies were listed in a summary table (Table 2) during the search. The studies are
presented in groups according to the tool utilised. After the initial search all articles identified
in subsequent searches were checked against articles in the summary table and duplicates
excluded. Each article was also entered into a reference management database (Endnote)
including the search term and engine used to locate each article. A quality appraisal process
was performed using the Critical Appraisal Skills Programme (CASP) tool (CASP, 2013) and
one article of poor quality was excluded. The excluded study is identified in the summary
table. Following the quality appraisal process 34 papers were selected for review.
Table 2: Articles that met inclusion and quality criteria

Columns: Author, year and location; Design/Intervention; Participants; Results; Reliability and validity assessment; Quality Appraisal using CASP

California Critical Thinking Disposition Inventory (CCTDI)

Atay, & Karabaca (2012). Turkey
Design/Intervention: Pre- post-test control group design testing effects of using concept plans.
Participants: 80 freshman and sophomore nursing students.
Results: Statistically significant increase in CT scores for experimental group.
Reliability and validity assessment: Cronbach’s alpha for the CCTDI was .88.
Quality appraisal using CASP: Include

Shin, Lee, Ha, & Kim (2006). Korea
Design/Intervention: Longitudinal study using CCTDI each year for 4 years.
Participants: 60 nursing students commenced on study, 32 completed all four surveys.
Results: Statistically significant improvement in CT disposition.
Reliability and validity assessment: Cronbach’s alpha for the CCTDI was .59 in Yr 1, .53 for Yr 2, .66 for Yr 3, and .73 for Yr 4. Significantly lower than overall median alpha coefficient of .90 reported by Facione (1994).
Quality appraisal using CASP: Include

Tiwari, Avery, & Lai (2006). Hong Kong
Design/Intervention: Experimental design, pre-post test testing the effects of PBL. 4 time points tested.
Participants: 79 1st year nursing students.
Results: Significantly greater improvement in CT scores for experimental group.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Evans & Bendel (2004). United States
Design/Intervention: Quasi-experimental, non-equivalent control group design testing narrative pedagogy.
Participants: 114 undergraduate nursing students.
Results: No significant differences in CT scores between control and experimental groups.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Wood & Toronto (2012). USA
Design/Intervention: Experimental study testing the effects of human patient simulation.
Participants: 85 2nd year nursing students.
Results: Higher mean post-test total scores compared with pre-test total scores in experimental group students.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Stewart & Dempsey (2005). USA
Design/Intervention: Longitudinal study, at 5 time-points testing effects of whole program.
Participants: 55 nursing students recruited, 34 students completed all surveys.
Results: Subscale and total scores did not significantly increase throughout the program.
Reliability and validity assessment: Cronbach’s alpha for the CCTDI was calculated at each phase: Sophomore semester 2 = .71, Junior semester 1 = .77, Junior semester 2 = .76, Senior semester 1 = .67, Senior semester 2 = .75.
Quality appraisal using CASP: Include

Yeh & Chen (2005). Taiwan
Design/Intervention: A pre- and post-test quasi-experimental research design testing the effects of a CT lecture and interactive videodisc system.
Participants: 126 RN-BN students.
Results: Statistically significant differences between pre and post-test overall scores.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Yu, Zhang, Xu, Wu & Wang (2012). China
Design/Intervention: Crossover experimental study testing the effects of PBL.
Participants: 76 2nd year nursing students.
Results: Statistical improvement in overall CTDI scores following PBL.
Reliability and validity assessment: For this study the overall Cronbach’s alpha was .8999.
Quality appraisal using CASP: Include

Dehkordi, & Heydarnejad (2008). USA
Design/Intervention: Quasi-experimental design testing the effects of PBL.
Participants: 40 2nd year nursing students participated.
Results: Statistical improvement in CTDI scores following PBL.
Reliability and validity assessment: No reporting of reliability of CCTDI for this study.
Quality appraisal using CASP: Include

Zadeh, Khajeali, Khalkhali, & Mohammadpour (2014). Iran
Design/Intervention: Quasi-experimental study testing the effects of an evidence based nursing course.
Participants: 48 3rd year nursing students.
Results: CCTDI scores were significantly higher following the intervention.
Reliability and validity assessment: No reporting of reliability of CCTDI.
Quality appraisal using CASP: Include

California Critical Thinking Skills Test (CCTST)

Chau, et al (2001). Hong Kong
Design/Intervention: Pre-test/post-test design testing the effects of 4 vignettes.
Participants: 101 1st and 2nd year nursing students recruited, of whom 83 completed both pre and post-tests.
Results: No statistical difference in pre and post test scores.
Reliability and validity assessment: KR-20 of the CCTST was .74 and subscales ranged from .30 to .61.
Quality appraisal using CASP: Include

Beckie, Lowry, & Barnett (2001). United States
Design/Intervention: A pre-post test, non-equivalent control group design. Experimental group experienced new curriculum.
Participants: 183 BN students, consisting of 3 cohorts of students: 1 control cohort and 2 cohorts that experienced the new curriculum.
Results: Cohort 1 received the new curriculum and achieved significantly higher CT scores than controls. Cohort 3, the 2nd class to experience the revised curriculum, failed to demonstrate improved CT scores and reported some decreases.
Reliability and validity assessment: Cronbach alpha on CCTST ranged from .55 to .83. Internal consistency of tool low and varied across tests.
Quality appraisal using CASP: Include

Spelic, et al. (2001). United States
Design/Intervention: Longitudinal study testing effects of different pathways.
Participants: 136 students in 3 undergraduate pathways: traditional, accelerated and RN-BSN.
Results: Statistically significant increase in CT scores for all pathways.
Reliability and validity assessment: The CCTST has 34 items. No demonstrated variance (all students scored the same) on some items, α level therefore computed on less than 30 items.
Quality appraisal using CASP: Include

Wheeler, & Collins (2003). United States
Design/Intervention: Quasi-experimental design. Testing the effects of concept mapping compared to traditional nursing care plans.
Participants: A convenience sample (n = 76).
Results: Significant difference between pre-post test scores for both groups. No difference found between experimental and control groups.
Reliability and validity assessment: No reporting of reliability of CCTST for this study.
Quality appraisal using CASP: Include

Yuan, Kunaviktikul, Klunklin, & Williams (2008). China
Design/Intervention: A quasi-experimental, two-group pre-post test design testing the effects of PBL.
Participants: All 46 Year 2 nursing students.
Results: PBL students had significantly greater improvements on overall CCTST.
Reliability and validity assessment: KR-20 for the CCTST-A was .80 for the total scale and between .60-.78 for subscales.
Quality appraisal using CASP: Include

California Critical Thinking Skills Test (CCTST) & California Critical Thinking Disposition Inventory (CCTDI)

Ravert (2008). United States
Design/Intervention: Pre-post test design testing effects of human patient simulation.
Participants: 30 1st year students.
Results: No differences in CT scores.
Reliability and validity assessment: No reporting of reliability of the CCTST or CCTDI for this study.
Quality appraisal using CASP: Include

Naber & Wyatt (2014). United States
Design/Intervention: Experimental, pre-post test design testing effects of reflective writing.
Participants: 70 4th semester nursing students.
Results: The experimental group's total CCTST and CCTDI scores did not increase significantly following the intervention.
Reliability and validity assessment: No reporting of reliability of CCTST or CCTDI scale for this study.
Quality appraisal using CASP: Include

Health Sciences Reasoning Test (HSRT)

Sullivan-Mann, Perron, & Fellner (2009). United States
Design/Intervention: Mixed-model experimental design, testing effects of multiple simulation.
Participants: 53 nursing students from the medical-surgical course.
Results: Statistically significant increase in CT scores for experimental group.
Reliability and validity assessment: Reliability of the HSRT not reported for this study.
Quality appraisal using CASP: Include

Shinnick, & Woo (2013). United States
Design/Intervention: One-group, quasi-experimental, pre-post test design. Tested the effects of one human patient simulation.
Participants: A convenience sample of 154 3rd or 4th year nursing students.
Results: Following HPS there were no statistically significant gains in CT, with some decrease in scores (not statistically significant).
Reliability and validity assessment: No reporting of reliability of HSRT for this study.
Quality appraisal using CASP: Include

Goodstone et al (2013). USA
Design/Intervention: A two-group quasi-experimental pre-post test design testing the effects of high fidelity patient simulation (HFPS) compared to case study.
Participants: 42 1st semester associate degree nursing students. Allocated to two groups, HFPS and case study group.
Results: There was a significant increase in the HSRT scores for the case study group (p=0.003) but not for the HFPS group.
Reliability and validity assessment: No reporting of reliability of HSRT for this study.
Quality appraisal using CASP: Include

Watson-Glaser Critical Thinking Appraisal (WGCTA)

L'Eplattenier (2001). United States
Design/Intervention: Longitudinal study testing 4 times over 3 year undergraduate program.
Participants: 83 nursing students.
Results: No change in CT scores as student progressed through the program.
Reliability and validity assessment: No reporting of reliability of WGCTA for this study.
Quality appraisal using CASP: Include

Brown, Alverson, & Pepa (2001). United States
Design/Intervention: Longitudinal study, testing at the beginning and end of degree. Testing different pathways and length of program.
Participants: Convenience sample (n = 123) of three groups of baccalaureate nursing students: traditional, RN-BSN, and accelerated.
Results: A significant difference was found between pre- and post WGCTA scores for traditional students (p=0.007) and RN-BSN (p=0.029), with no difference for accelerated students.
Reliability and validity assessment: Reliability for the total score of the WGCTA was established at .77 (using Spearman-Brown formula). Consistent with the split-half reliability coefficients (.69 to .85) reported by Watson and Glaser.
Quality appraisal using CASP: Include

Watson-Glaser Critical Thinking Appraisal (WGCTA) and Think Aloud Analytical Framework

Daly (2001). United Kingdom
Design/Intervention: A longitudinal multi-method design with triangulation.
Participants: 43 nursing students completed WGCTA. 12 students completed think aloud analytical framework.
Results: No statistical difference in WGCTA scores. Little evidence of CT demonstrated in think aloud analytical framework.
Reliability and validity assessment: No reporting of reliability of WGCTA for this study. No discussion of reliability or validity of think aloud analytical framework. Not clear whether the think aloud tool was validated or reviewed by experts and inter-rater reliability was not discussed.
Quality appraisal using CASP: Include

Critical Thinking Ability Scale (CTAS) for College Students

Choi, Lindquist, & Song (2014). Korea
Design/Intervention: Non-equivalent control group pre-post test design testing effects of PBL.
Participants: 90 1st year nursing students.
Results: No significant differences in CT scores between control and experimental groups.
Reliability and validity assessment: Cronbach's alpha was .71 which is consistent with the reported .74 by Park (1999). Not available in English.
Quality appraisal using CASP: Include

Critical Thinking Disposition Scale for Nursing Students (CTDS)

Jun, Lee, Park, Chang & Kim (2013). South Korea
Design/Intervention: Quasi-experimental study testing effects of 5E learning cycle model with PBL.
Participants: 161 1st year nursing students.
Results: Statistically significant increase in CT scores for experimental group.
Reliability and validity assessment: Cronbach’s alpha was .81. CTDS not available in English, 20 point self report Likert scale measures disposition as a proxy for CT skills.
Quality appraisal using CASP: Include

Critical Thinking Process Test (CTPT)

DeSimone (2006). United States
Design/Intervention: Experimental design testing effects of accelerated program.
Participants: 38 nursing students undertaking an accelerated program (12 months in length).
Results: Increase in CT scores not significantly different.
Reliability and validity assessment: Average reliability coefficient was .93.
Quality appraisal using CASP: Include

Cri
tic
al T
hin
kin
g P
roce
ss T
est
+ T
hin
k A
lou
d P
roto
co
l
Mo
rey,
(20
12
).
Un
ited
S
tate
s
An
expe
rim
enta
l d
esig
n
testing
an o
nlin
e
an
ima
ted p
eda
go
gic
al
age
nt.
45
associa
te d
eg
ree
nu
rsin
g s
tude
nts
in
the
ir f
ina
l se
me
ste
r
No
diffe
ren
ces in
CT
fo
r e
ithe
r to
ol
No
re
po
rtin
g o
f re
liab
ility
of
CT
PT
. T
wo f
acu
lty r
ate
d the
th
ink-a
loud
sce
na
rio
re
sp
on
ses w
ith 9
7.9
to 1
00
Inclu
de
Page 18
18
pe
rcen
t ra
ter
ag
ree
me
nt.
Lim
ite
d info
rma
tion
p
rovid
ed
re
ga
rdin
g th
e t
hin
k
alo
ud
pro
toco
l.
N3
Ca
se
Re
po
rt A
cc
red
ita
tio
n F
orm
Ch
en,
&
Lin
, (2
00
1)
Ta
iwa
n
Qu
asi- e
xpe
rim
enta
l d
esig
n w
ith
pre
-po
st
test
testing
eff
ects
of
a
rese
arc
h c
ou
rse
168
1st
ye
ar
nu
rsin
g
stu
de
nts
.
Expe
rim
en
tal g
roup
rep
ort
ed
sig
nific
an
tly h
igh
er
CT
sco
res
than
co
ntr
ol g
roup
No
re
po
rtin
g o
f re
liab
ility
of
the N
3 c
ase r
epo
rt fo
rm.
Un
cle
ar
wh
eth
er
too
l m
ea
su
red
stu
den
ts’ a
bili
ty
to c
ritiqu
e a
n a
rtic
le r
ath
er
than
CT
ab
ilitie
s..
Inclu
de
Dis
cu
ssio
n B
oa
rd A
na
lysis
Pu
ce
r,
Tro
be
c,
&
Žva
nu
t,
(20
14
)
Slo
ven
ia
Qu
asi-e
xpe
rim
en
t stu
dy
testing
the
eff
ects
of
an
IC
T p
rog
ram
wh
ich
p
resen
ted
scen
ario
s t
ha
t m
irro
r clin
ica
l situa
tion
s.
45
1st y
ea
r nu
rsin
g
stu
de
nts
Qu
alit
ative a
na
lysis
of
the
d
iscu
ssio
n b
oa
rds s
ho
wed
a
sig
nific
an
t im
pro
ve
me
nt
in %
of
po
sts
fo
r w
hic
h t
he
o
pin
ion
s a
nd
co
nclu
sio
ns o
f th
e p
art
icip
an
ts w
ere
ju
stifie
d
with
va
lid a
rgu
men
ts.
No
re
po
rtin
g o
f to
ol
relia
bili
ty.
No
dis
cu
ssio
n
reg
ard
ing d
eve
lop
men
t of
too
ls, e
xp
ert
re
vie
w p
roce
ss
or
psych
om
etr
ic te
sting
of
the t
oo
l
Inclu
de
Cri
tic
al T
hin
kin
g S
ca
le (
CT
S)
Lee
et
al
(20
13
)
Ta
iwa
n
Lon
gitu
din
al stu
dy,
me
asu
rin
g a
t 4
tim
e-
po
ints
te
sting
the
eff
ects
of
co
nce
pt
map
pin
g
A c
on
ven
ien
ce s
am
ple
of
95
stu
den
ts,
Bo
th c
on
tro
l an
d
expe
rim
en
tal g
roup
s h
ad
h
ighe
r in
itia
l C
T s
co
res th
at
tend
ed
to
de
cre
ase
ove
r tim
e.
No
re
po
rtin
g o
f re
liab
ility
of
CT
sca
le fo
r th
is s
tud
y.
Inclu
de
Cri
tic
al T
hin
kin
g A
sse
ss
me
nt
(CT
A)
Ma
nn
, (2
012
).
US
A
Expe
rim
en
tal, p
re-p
ost-
test, m
ixe
d m
eth
od
d
esig
n te
stin
g the
eff
ects
of
gra
nd
rou
nd
s
21
2nd y
ea
r nu
rsin
g
stu
de
nts
.
No
sig
nific
an
t d
iffe
ren
ce
b
etw
een
CT
sco
res fo
r th
e
two
gro
up
s.
In the
co
ntr
ol
gro
up
, stu
den
ts' s
co
res
ind
icate
d a
de
cre
ase
CT
sco
res.
No
re
po
rtin
g o
f re
liab
ility
of
CT
A f
or
this
stu
dy.
Inclu
de
Blo
om
s T
ax
on
om
y
Page 19
19
Jo
ne
s,(
20
0
8).
US
A
A q
ua
si-e
xpe
rim
en
tal,
pre
-po
st
test stu
dy
testing
the
eff
ects
of
PB
L
60
2nd
ye
ar
nu
rsin
g
stu
de
nts
.
Inte
rve
ntion
gro
up
d
em
on
str
ate
d a
hig
he
r sig
nific
an
t in
cre
ase in
CT
co
mp
are
d to t
he
co
ntr
ol
gro
up
.
No
re
po
rtin
g o
f re
liab
ility
. U
ncle
ar
wh
eth
er
the
to
ol
wa
s v
alid
ate
d o
r re
vie
wed
b
y e
xp
ert
s. B
loo
ms
taxon
om
y u
se
d to d
eve
lop
th
e t
oo
l, b
ut
no
atte
mp
t to
re
late
th
is t
o t
he
recog
nis
ed
d
efin
itio
ns o
f C
T
Inclu
de
Co
nc
ep
t M
ap
Sc
ori
ng
Ab
el. &
F
ree
ze,
(20
06
) U
SA
Lon
gitu
din
al stu
dy
me
asu
rem
en
t o
ve
r 4
tim
ep
oin
ts te
stin
g t
he
eff
ects
of
co
ncep
t m
ap
pin
g
28
associa
te d
eg
ree
nu
rsin
g s
tude
nts
T
he
re w
as a
sig
nific
an
t in
cre
ase
in m
ea
n s
co
res o
f th
e f
irst
co
nce
pt
map
to
th
e
ave
rag
e m
ea
n s
co
re o
f th
e
last
two
map
s (
p=
0.0
5).
No
re
po
rtin
g o
f re
liab
ility
of
too
l. L
imited
info
rmatio
n
abo
ut sco
ring
crite
ria
, n
ee
de
d m
ore
info
rmation
h
ow
th
is s
co
re r
ela
tes to
critica
l th
inkin
g
Inclu
de
Cri
tic
al T
hin
kin
g L
ike
rt S
ca
le (
CT
LS
)
Ste
ve
ns,
Bre
nn
er
&
Bre
nn
er
(20
09
) U
SA
Pre
-post
test
expe
rim
en
tal de
sig
n
testing
the
PA
LS
le
arn
ing
app
roa
ch
15
nu
rsin
g s
tud
en
ts
Incre
ase in
sco
res o
n C
TLS
b
ut no
sta
tistica
l an
aly
sis
p
erf
orm
ed
.
No
re
po
rtin
g o
f re
liab
ility
of
CT
LS
fo
r th
is s
tud
y o
r p
revio
usly
.
Exclu
de
d
ue
to
la
ck o
f sta
tistica
l a
na
lysis
a
nd
re
po
rtin
g
of
resu
lts.
Cri
tic
al T
hin
kin
g S
ca
le (
CT
SM
)
Tse
ng
, e
t a
l (2
011
).
Ta
iwa
n
A q
ua
si-e
xpe
rim
en
tal
de
sig
n m
ea
su
rem
en
t o
ve
r 3
tim
e-p
oin
ts te
stin
g
the e
ffe
cts
of
PB
L.
120
RN
stu
de
nts
.
Th
e C
TS
sco
res w
ere
sig
nific
an
tly h
igh
er
in th
e
expe
rim
en
tal g
roup
Cro
nba
ch
’s a
lph
a
co
eff
icie
nt
of
the C
TS
wa
s
.94.
Lim
ite
d info
rma
tion
re
ga
rdin
g t
he
CT
S to
ol a
nd
h
ow
it m
ea
su
red
CT
.
Inclu
de
Page 20
20
Results
All 34 studies measured CT skill development or change, either following completion of a
specific educational intervention or an undergraduate nursing program. Most studies were
conducted in Western countries, namely the USA (n=20) and the United Kingdom (n=1); the others were
conducted in Taiwan (n=4), Korea (n=3), China (n=2), Iran (n=1), Hong Kong (n=2), Turkey
(n=1), and Slovenia (n=1).
Reliability, Validity and Factor Domains of the Tools
Reliability, validity and factor domains of the tools were examined. This included examination
of previous and current reliability and validity testing. With respect to reliability, Facione and
Facione (1992b) noted that a Kuder-Richardson (KR-20) range of .65 to .75 for this type of
instrument is acceptable. Kaplan and Sacuzzo (1997) similarly reported that reliability
estimates in the range of .70 to .80 are acceptable.
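
As a minimal sketch of how the internal-consistency coefficients discussed in this section are obtained, the Python code below computes Cronbach's alpha and KR-20 from a respondents-by-items score matrix. The function names and the toy data are hypothetical and are not drawn from any reviewed study. Population variances are assumed throughout; under that assumption KR-20 is the special case of alpha for items scored 0 or 1.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (one row per respondent, one column per item)."""
    k = len(scores[0])  # number of items
    item_variances = [pvariance(col) for col in zip(*scores)]
    total_variance = pvariance([sum(row) for row in scores])
    # Undefined (division by zero) if every respondent has the same total score.
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def kr20(scores):
    """KR-20 for dichotomous items scored 0 (incorrect) or 1 (correct)."""
    k = len(scores[0])
    n = len(scores)
    # p is the proportion answering an item correctly; p * (1 - p) is that item's variance.
    pq_sum = sum((sum(col) / n) * (1 - sum(col) / n) for col in zip(*scores))
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - pq_sum / total_variance)


# Hypothetical data: 6 respondents answering 4 right/wrong items.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

if __name__ == "__main__":
    print(f"KR-20: {kr20(toy_scores):.2f}")
    print(f"Cronbach's alpha: {cronbach_alpha(toy_scores):.2f}")
```

For these toy data both functions return approximately .83, which would sit above the acceptable ranges cited above.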
Factor Domains
In addition to developing a definition of CT, the APA also concluded that critical thinking
comprised two dimensions: cognitive skills and disposition (Facione, 1990). Within the
cognitive skills dimension, four sub-skills were defined: interpretation, analysis, evaluation,
and inference. The disposition dimension was defined as truth-seeking, open-mindedness,
analyticity, systematicity, self-confidence, inquisitiveness, and maturity of judgment (Facione
& Facione, 1992a). Some scholars have questioned the applicability of the universal definition of
CT to the discipline of nursing. Scheffer and Rubenfeld (2000) conducted a Delphi study to
develop a consensus definition of CT in nursing. A set of 17 consensus CT skills and habits
of the mind was developed, many of which reflected Facione’s (1990) earlier work, with the
addition of creativity, intuition and transforming knowledge (Scheffer & Rubenfeld, 2000).
There has not been any published work on a definition of critical thinking for midwifery. The
construct validity of the tools was assessed according to the dimensions and sub-skills of CT
as outlined in the previous work of Facione (1990) and Scheffer and Rubenfeld (2000).
The California Critical Thinking Disposition Inventory (CCTDI) uses the APA consensus
definition of critical thinking as the theoretical basis to measure the extent to which an
individual possesses the attitudes of a critical thinker (Facione & Facione, 1992a). The
domains assessed are: open-mindedness, analyticity, cognitive maturity, truth-seeking,
systematicity, inquisitiveness, and self-confidence.
The CCTDI has a reported overall median alpha coefficient of .90 (Facione, 1994),
demonstrating good reliability. Of the twelve studies that utilised the CCTDI, only four
(Atay & Karabacak, 2012; Shin et al., 2006; Stewart & Dempsey, 2005; Yu et al., 2012)
tested reliability of the CCTDI. Two of the studies (Atay & Karabacak, 2012; Yu et al., 2012) reported
alpha coefficients of .88 and .89, similar to the value reported by Facione (1994). However, Stewart
and Dempsey (2005) reported only marginal reliability, with alpha coefficients between .67
and .75. Shin et al. (2006) reported a much lower alpha coefficient of .53. These inconsistent
results cast some doubt on the reliability of this tool in different nursing education contexts.
The California Critical Thinking Skills Test (CCTST) was designed to measure critical
thinking in college students (Facione & Facione, 1992b). The CCTST measures the ability of
participants to draw conclusions in the areas of analysis, inference, evaluation, and deductive
and inductive reasoning (Facione & Facione, 1998). These skills relate to the APA
consensus definition of critical thinking (Facione, 1990). The KR-20 estimate of internal
consistency of the CCTST was r = .70 (Facione & Facione, 1998). Four of the seven studies
that utilised the CCTST reported on reliability. Two studies reported low alpha coefficients of
.62 (Beckie et al., 2001) and between .55 and .83 (Spelic et al., 2001). The CCTST was used
to track development of CT in students undertaking different study pathways (Spelic et al.,
2001). Some concerns were expressed about the internal consistency of the CCTST across
the different cohorts. The total score alpha for the RN-BSN group was very low (α = .31)
compared to the traditional and accelerated pathways cohorts (α = .66). Spelic et al.
(2001) suggested that the reliability of tools with few items and involving a timed test
administration is low. The CCTST comprises 34 items, and Spelic et al. (2001) found that on
several items all students scored the same. When these items were removed, the α level for
30 items was .62. This limitation highlights the value of using multiple measures in the
assessment of CT.
The second study using the CCTST demonstrated inconsistent results (Beckie et al., 2001).
Two cohorts of nursing students in a new curriculum focussing on CT skills completed the
CCTST over three time-points. The first group showed significantly improved CT scores
from baseline, but the second group showed decreased CT scores. This variation
in results across the two cohorts undertaking the same curriculum casts doubt on the
reliability of this tool.
The other two studies that tested the reliability of the CCTST (Chau et al., 2001; Yuan et al.,
2008) reported similar results to Facione and Facione (1998). The differences in findings
between these four studies may indicate that the CCTST does not consistently measure CT
in nursing practice across different settings.
The Health Sciences Reasoning Test (HSRT) is a commercially available, recent adaptation of the CCTST specifically designed
for health sciences students and professionals to assess their CT and clinical reasoning
skills (Goodstone et al., 2013). Similar to the CCTST, the HSRT uses the sub-skills identified
within the APA consensus definition of critical thinking. The HSRT is considered a reliable
and valid measure of critical thinking for entry level nursing students, with a KR-20 of .81
(Facione, Facione & Winterhalter, 2010). The three studies that used this tool all tested the
effects of simulation on CT but none reported reliability (Sullivan-Mann et al., 2009; Shinnick
& Woo, 2013; Goodstone et al., 2013). One study (Sullivan-Mann et al., 2009) reported an
increase in students’ CT skills following simulation, but the other two studies (Goodstone et
al., 2013; Shinnick & Woo, 2013) reported no statistically significant increase, with a decrease in scores in
one study. These inconsistent results could indicate the HSRT is not a reliable tool across
diverse settings and populations.
The WGCTA, originally developed in the 1920s, measures both logical and creative
components of CT and assesses CT ability in individuals with at least a ninth grade
education (Watson & Glaser, 1980). The test comprises 80 proposed arguments related to
25 statements that include problems, arguments, and interpretations. On completion a total
score is produced based on the assessment of five critical thinking skills: inference,
recognition of assumptions, deduction, interpretation and evaluation of arguments, which
align to the CT sub-skills defined by Facione (1990). The WGCTA measures the underlying
constructs of classical logic and general reasoning skills rather than application of CT skills
(Walsh & Seldomridge, 2006). Only Brown et al. (2001) reported reliability, with an alpha
coefficient of .77. This is consistent with the split-half reliability coefficients of .69 to .85
reported by Watson and Glaser (1980). The three studies that used the WGCTA all used a
longitudinal design to detect change in CT across different undergraduate nursing degrees;
two were conducted in the USA (L'Eplattenier, 2001; Brown et al., 2001) and one in the
United Kingdom (Daly, 2001). Two of the studies (L’Eplattenier, 2001; Daly, 2001) found no
change in CT scores, whereas Brown et al. (2001) reported increases in CT scores of
students undertaking traditional and RN-BSN pathways but no change for students in the
accelerated pathway. These inconsistencies in
findings may support claims that the constructs within the WGCTA are not suited to measure
CT skills in the nursing discipline (Walsh & Seldomridge, 2006).
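
A split-half coefficient of the kind reported for the WGCTA can be sketched as follows. This is an illustration under stated assumptions, not the procedure used by Watson and Glaser (1980): it assumes an odd-even split of the items and the Spearman-Brown correction to full test length, and the function names and toy data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_reliability(scores):
    """Odd-even split-half reliability with the Spearman-Brown correction (an assumed choice)."""
    odd_half = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r_halves = pearson_r(odd_half, even_half)
    # Spearman-Brown: estimate full-length reliability from the half-test correlation.
    return 2 * r_halves / (1 + r_halves)


# Hypothetical data: 6 respondents answering 4 right/wrong items.
toy_item_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

if __name__ == "__main__":
    print(f"Split-half reliability: {split_half_reliability(toy_item_scores):.2f}")
```

Different ways of splitting the items generally give different coefficients, which is one reason a range rather than a single value may be reported.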
Of the twelve non-standardised tools used to measure critical thinking in this review, reliability
was tested for only four. The Critical Thinking Ability Scale (CTAS) for College Students has a
reported Cronbach's alpha of .74 (Park, 1999). The CTAS was used by Choi et al. (2014) to
measure the effect of problem-based learning (PBL) on CT and had a reported Cronbach’s
alpha of .71. Although the aim was to measure changes in students’ CT abilities, the CTAS
is a self-report tool that assesses the domains of intellectual curiosity, healthy skepticism,
intellectual integrity, prudence, and objectivity, which relate more to CT disposition than
to skills.
The Critical Thinking Disposition Scale (CTDS) for Nursing Students developed by Park &
Kim (2009) has a reported Cronbach’s alpha of .78. Jun et al. (2013) used the CTDS to
measure critical thinking development in 161 nursing students, and reported a Cronbach’s
alpha of .81. The CTDS uses the concepts of intellectual integrity, creativity, challenge,
open-mindedness, prudence, objectivity, truth seeking, and inquisitiveness, which directly relate
to dispositional characteristics identified by both Facione (1990) and Scheffer and Rubenfeld
(2000). This tool is not available in English, which limits its use in other settings. Similar to the
CCTDI, the CTDS measures only CT disposition, not the application of CT skills in
practice.
The N3 case report accreditation form developed by the Taiwan Nurses Association was
used to assess students’ CT abilities in the critique of case study reports (Chen & Lin, 2003).
Testing of this tool showed good inter-rater reliability (Pearson r = .89) and internal
consistency (KR-20 = .79), but low test-retest reliability (.32) after a 16-week interval.
However, the construct validity of this tool is questionable. The criteria of the tool do not
reflect any of the CT constructs. Instead the tool was constructed on the basis of the nursing
process with critical inquiry points listed under each step of the nursing process (Chen & Lin,
2003). The study tested the effects of a research course, and found significantly higher CT
scores in students who undertook the course. However, it was unclear whether the tool
measured students’ abilities to critique an article rather than their CT abilities.
The Critical Thinking Process Test (CTPT), a commercial tool developed by Educational
Resources, has a reported reliability coefficient of .93 (Anderson et al., 2000). The CTPT
measured CT development in two studies but neither reported on reliability (DeSimone,
2006; Morey, 2012). The CTPT assesses four aspects of the critical thinking process
(listening, writing, speaking, and reading) and five levels of abstract thinking (prioritizing,
inferential reasoning, goal setting, application of knowledge, and evaluation of predicted
outcomes). Several concepts partially relate to elements of the recognised definition of CT.
This tool is expensive to administer and not widely used (Fountain, 2011).
The Critical Thinking Scale (CTSM), developed by McMaster University, was used to assess the effects
of PBL and concept mapping on CT (Tseng et al., 2011). The reported Cronbach’s
alpha coefficient of .94 (Tseng et al., 2011) was consistent with the .93 reported in another study
(Chou, Jian, Tseng & Ko, 2004). The concepts of inference, recognition of assumptions,
deduction, interpretation, and evaluation of argument reflect the critical thinking sub-skills
identified by Facione (1990). The CTSM is a student self-report test and may not measure
CT in practice.
Validated concept map scoring criteria were used to measure CT development over a one-year
period (Abel & Freeze, 2006). Inter-rater reliability testing with two assessors found an 85%
level of agreement (Abel & Freeze, 2006). The authors stated that content validity had
previously been established, and no further testing of internal consistency was performed.
The scoring criteria were: 1) meaningful relationships between two concepts indicated by a
connecting line; 2) hierarchy shows a general to specific approach; 3) cross-links show
meaningful connections between one segment of the hierarchy and another; and 4) examples describe
specific instances of a concept (Lawson, 2012). It was unclear how the scoring criteria
related to the dimensions of CT. The study demonstrated increases in students’ concept
map scores as they progressed through the curriculum, but it is uncertain whether this
increase represented increases in critical thinking or simply improved competence
in concept mapping.
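
A level of agreement between two assessors, such as the 85% noted above, can be illustrated with a simple percent-agreement calculation. The sketch below assumes agreement means both assessors gave a concept map exactly the same score; the source does not state how agreement was defined, and the function name and ratings are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of cases on which two raters assigned exactly the same score."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same set of cases.")
    if not rater_a:
        raise ValueError("At least one rated case is required.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical scores given by two assessors to the same 20 concept maps.
rater_a = [3, 2, 4, 1, 3, 2, 4, 4, 1, 2, 3, 3, 2, 1, 4, 2, 3, 1, 4, 2]
rater_b = [3, 2, 4, 1, 3, 2, 4, 3, 1, 2, 3, 3, 2, 2, 4, 2, 3, 1, 4, 1]

if __name__ == "__main__":
    print(f"Percent agreement: {percent_agreement(rater_a, rater_b):.1f}%")
```

Percent agreement does not correct for agreement expected by chance, so it tends to give a more favourable picture than chance-corrected indices.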
The Critical Thinking Scale (CTS) assesses CT through the concepts of inference,
recognition of assumptions, deduction, interpretation, and evaluation of argument (Lee et al.,
2013). These concepts match those suggested within the two recognised definitions of CT.
In a study examining the effects of concept mapping on CT skills, Lee et al. (2013) reported
that reliability testing, as well as convergent and known-group validity testing, had previously been conducted by
the developers of the tool (Cheng et al., 1996). No further testing of the reliability of the tool
was conducted by Lee et al. Using a longitudinal design, students’ CT scores were
compared between those exposed to one semester of teaching on concept mapping and a
control group (Lee et al., 2013). Initial increases in CT scores were found in both groups, but scores
decreased over time. These findings suggest the teaching methodologies were not effective but
may also indicate the CTS is not reliable in measuring changes in CT over time.
The Critical Thinking Assessment (CTA) tool was used to evaluate the effects of a grand
round education strategy on CT (Mann, 2012). The CTA has a reported alpha of .69 and a
standardized item alpha of .70 in first-time examinees (Assessment Technologies Institute,
2001). No reliability testing was performed by Mann (2012). The CTA uses 40 multiple
choice questions based on the domains of interpretation, analysis, evaluation, inference,
explanation and self-regulation (ATI, 2003). Four of these domains (interpretation, analysis,
evaluation and inference) directly relate to the recognised domains of CT. There were no
significant differences in CT scores between the control and experimental groups, although scores decreased
in the control group (Mann, 2012). The unexpected decrease in CT scores could be due to
the very small sample size of 21, with only 4 students in the control group.
Four of the twelve non-standardised tools were newly developed with the specific purpose of
measuring critical thinking in action (Daly, 2001; Jones, 2008; Morey, 2012; Pucer et al.,
2014). The studies utilised practice-based teaching, learning, and assessment activities to
measure CT, which not only presents opportunities to evaluate the application of CT but also
reduces survey and response burden, as the activities are embedded in student learning.
However, none of these studies reported reliability of these newly developed tools.
Pucer et al. (2014) used a discussion board tool to analyse students’ postings according to
identified core elements of critical thinking (as defined by Facione, 1990). A significant
improvement in the percentage of posts where the opinions and conclusions of participants
were justified with valid arguments was reported (Pucer et al., 2014). However, limited
information was presented on the development of the tool, process of expert review and
validation, or inter-rater reliability.
The effect of PBL on students’ CT development was measured by grading nursing care
plans over a semester (Jones, 2008). The grading system was based on the six levels of
Bloom’s taxonomy of cognitive learning, and the grading domains were described as comprehending information,
organising ideas, and evaluating information and actions. Students who experienced the
PBL educational intervention achieved higher CT scores. It was not clear, however, whether
the tool was validated or reviewed by experts. Although Bloom’s taxonomy was used as the
basis of the tool, there did not seem to be any attempt to relate the grading domains to the
recognised definitional elements of CT (Facione, 1990; Scheffer & Rubenfeld, 2000).
In an attempt to establish concurrent validity Morey (2012) used both a newly developed
qualitative tool based on a ‘think aloud protocol’, and a standardised tool (CTPT) to measure
the effects of an animated pedagogical agent on critical thinking. The think aloud protocol
used elements of the nursing process to assess students’ thinking in solving a clinical
scenario (Morey, 2012). The elements of collect, review, relate, interpret, infer, diagnosis,
act, and evaluate did not align directly with the recognised definitions of CT. Both groups
displayed significant improvements in CT levels and correct conclusions from baseline to
post-intervention on the think-aloud protocol, but only the pedagogical agent group had a
significant result for “evaluation”. These mixed results may indicate the difficulty in
measuring CT development in a standardised exam format. Reliability testing and construct
validity of the think aloud protocol were not reported; results must therefore be viewed with caution.
Daly (2001) also compared the use of a newly developed think-aloud analytic framework and
a standardised tool (WGCTA) to measure CT development over an 18-month period. No
statistically significant improvement in the WGCTA scores was found. The think aloud qualitative
assessment demonstrated consistent evidence of reasoning that reflected an “enduring
absolutist epistemology” but portrayed little evidence of CT (Daly, 2001). The author
explained that reasoning of this nature usually involves a single-theory structured argument,
which is contradictory to the principles of CT (Daly, 2001). Although both tools indicated
similar results, no reliability testing was conducted. The constructs of this new tool were
described as differentiation and integration in reasoning, situation modelling and argument
and evidential structure (Daly, 2001), which do not incorporate the recognised definitional
elements of CT (Facione, 1990; Scheffer & Rubenfeld, 2000).
Discussion
This review included studies from 9 different countries using 16 different tools. This section
discusses the findings in relation to the reliability, validity and factor domains of the
standardised tools and then examines the non-standardised tools.
The reliability of tools used to measure CT in nursing practice was not reported consistently
and varied considerably. Only two authors of new tools reported on stability using a
test-retest procedure and, at best, split-half reliability was reported for internal consistency. The review
included four commercially available tools, and their cost may limit their use for routine
evaluation of classroom teaching effectiveness. The CCTDI and the CCTST had reported
reliability ranging from .31 to .89, and some authors using these tools did not test reliability
for their specific context. The CCTDI measures students’ self-reported CT disposition and does
not measure the development of CT skills. Student self-report may be affected by
recall bias and a socially desirable response set (Tiwari et al., 2006). The act of critical
thinking involves both skills and habits of the mind (Scheffer & Rubenfeld, 2000). The CCTDI
measures only the habits of the mind. For a complete assessment of students’ critical
thinking, both skills and disposition need to be measured, and the CCTST should be used in
conjunction with the CCTDI (Insight Assessment, 2013).
A lack of congruence between items in the CCTST and the CCTDI could account for
inconsistencies in reliability. Although the cognitive skills underlying the framework for the
CCTST and the CCTDI were identified as important to the practice of nursing (Stone,
Davidson, Evans, & Hansen, 2001), the same study found less agreement on whether the
items reflected CT skills required of nurses. Inconsistent results across studies have
prompted questions related to the reliability of the CCTDI to measure dispositional attitudes
(Walsh & Seldomridge, 2006), and the lack of stability of the instrument (Walsh & Hardy,
1997; Kakai, 2003).
Limited reporting of tool reliability makes it difficult to assess their applicability in the nursing
and midwifery contexts. Concern could also be justified over the focus of existing tools
(especially standardised tools) on the measurement of formal logic and general thinking
skills, rather than the application of CT in practice (Seldomridge & Walsh, 2006).
Four new tools that measure the application of CT skills in nursing practice were reviewed.
However, none of these new tools were tested for reliability. When the domains were
compared to the recognised definition of CT, construct validity was only established for one
tool (Jones, 2008). None of the studies conducted a factor analysis to establish validity. In
the development of the new tools, items were drawn from concepts thought to be useful but
no testing was conducted to confirm this. Therefore, further research with large samples,
factor analysis, and testing of different forms of reliability and validity, are required before
implementing these tools into practice.
CT is also considered to be a multidimensional concept, and a single test in a multiple
choice format may be inadequate to accurately detect change in development. There is a
need to ensure that measures of CT development address the complexity of practice and
are adaptive to the nursing and midwifery environments (Rubenfeld & Scheffer, 2006). A
mixed method approach and triangulation of findings may provide greater validity, reliability,
and insight into CT development.
Conclusion
There was limited reporting of the reliability of tools in the included studies. Overall there
was relatively little emphasis placed on the validity of newly developed tools. Inconsistent results
were found in studies using standardised tools, casting doubt on the reliability of these tools
in the nursing context. On examination of the domain concepts, construct validity was
questionable for several of the non-standardised tools used.
Nursing and midwifery education needs to prepare graduates to work effectively in complex,
fast-paced and uncertain environments. Continued collection of data using measures of
generalised CT is unlikely to help improve curricula, teaching methods, or preparation of
students for professional practice. There is a need to develop discipline-specific instruments
to measure CT in nursing and midwifery, and more specifically tools that measure the
application of CT to practice. Considering the complexity of critical thinking in nursing and
midwifery practice, and that CT development occurs over a long time, measurement requires
a long-term, multi-method approach.
References
Anderson, N., Booth, L., Catalano, J., Gaines, L., Horner, M., & McCormick, S. (2000). Critical thinking process test: Development and technical report. Stillwell, KS: Educational Resource, Inc.
Assessment Technologies Institute, LLC. (2001). CT assessment: Developmental and statistical report. Overland Park, KS: Author.
Abel, W. M., & Freeze, M. (2006). Evaluation of concept mapping in an associate degree nursing program. Journal of Nursing Education, 45(9), 356-364.
Atay, S., & Karabacak, Ü. (2012). Care plans using concept maps and their effects on the critical thinking dispositions of nursing students. International Journal of Nursing Practice, 18(3), 233-239. doi: 10.1111/j.1440-172X.2012.02034.x
Beckie, T. M., Lowry, L. W., & Barnett, S. (2001). Assessing critical thinking in baccalaureate nursing students: a longitudinal study. Holistic Nursing Practice, 15(3), 18-26.
Brown, J. M., Alverson, E. M., & Pepa, C. A. (2001). The influence of a baccalaureate program on traditional, RN-BSN, and accelerated students' critical thinking abilities. Holistic Nursing Practice, 15(3), 4-8.
Brunt, B. A. (2005). Models, measurement, and strategies in developing critical-thinking skills. Journal of Continuing Education in Nursing, 36(6), 255-262.
CASP (2013). Critical Thinking Appraisal Skills Programme: CASP Checklists. UK: CASP. Retrieved from http://www.casp-uk.net/#!casp-tools-checklists/c18f8
Chau, J. P. C., Chang, A. M., Lee, I. F. K., Ip, W. Y., Lee, D. T. F., & Wootton, Y. (2001). Effects of using videotaped vignettes on enhancing students' critical thinking ability in a baccalaureate nursing programme. Journal of Advanced Nursing, 36(1), 112-119. doi: 10.1046/j.1365-2648.2001.01948.x
Chen, F., & Lin, M. (2003). Effects of a nursing literature reading course on promoting critical thinking in two-year nursing program students. Journal of Nursing Research (Taiwan Nurses Association), 11(2), 137-147.
Cheng, Y. Y., Wang, W. C., Wu, J. J., & Hwang, C. K. (1996). A preliminary report on the construction of the critical thinking scale (in Chinese). Psychological Testing, 43, 213-226.
Choi, E., Lindquist, R., & Song, Y. (2014). Effects of problem-based learning vs. traditional lecture on Korean nursing students' critical thinking, problem-solving, and self-directed learning. Nurse Education Today, 34(1), 52-56. doi: 10.1016/j.nedt.2013.02.012.
Chou, F.H., Jian, S.Y., Tseng, H.C., Ko, H.G., (2004). The evaluation of students performance in applying problem-based learning to a nursing course. The Research Outcome of National Science Council, Taiwan.
Daley, B.J., Shaw, C.R., Balistrieri, T., Glasenapp, K., & Piacentine, L. (1999). Concept maps: A strategy to teach and evaluate critical thinking. Journal of Nursing Education, 38, 42-47.
Daly, W. M. (2001). The development of an alternative method in the assessment of critical thinking as an outcome of nursing education. Journal of Advanced Nursing, 36(1), 120-130. doi: 10.1046/j.1365-2648.2001.01949.x
Dehkordi, A. H., & Heydarnejad, M. S. (2008). The effects of problem-based learning and lecturing on the development of Iranian nursing students' critical thinking. Pakistan Journal of Medical Sciences, 24(5), 740-743.
DeSimone, B. B. (2006). Curriculum design to promote the critical thinking of accelerated bachelor's degree nursing students. Nurse Educator, 31(5), 213-217.
Evans, B. C., & Bendel, R. (2004). Cognitive and ethical maturity in baccalaureate nursing students: did a class using narrative pedagogy make a difference? Nursing Education Perspectives, 25(4), 188-195.
Facione, P. A. (1990). Critical Thinking: A Statement of Expert Consensus for Purposes of Educational Assessment and Instruction, Executive Summary: “The Delphi Report”. CA: The Californian Academic Press. Retrieved from http://assessment.aas.duke.edu/documents/Delphi_Report.pdf
Facione, P. A. & Facione, N. C. (1992a). The California Critical Thinking Dispositions Inventory (CCTDI); and the CCTDI Test manual. Millbrae, CA: California Academic Press.
Facione, P. A., & Facione N. C. (1992b). The California Critical Thinking Skills Test: Test Manual. Millbrae, CA: California Academic Press.
Facione, P., & Facione, N. (1994). The California Critical Thinking Disposition Inventory (CCTDI): Test Manual. Millbrae, CA: California Academic Press.
Facione, N. C, & Facione, P. A. (1996). Assessment design issues for evaluating critical thinking in nursing. Holistic Nursing Practice, 10(3), 41-53.
Facione, P. A., & Facione, N. C. (1998). The California Critical Thinking Skills Test: CCTST test manual. Millbrae, CA: California Academic Press.
Facione, P. A., Facione, N. C., & Winterhalter, K. (2010). The Health Sciences Reasoning Test: Test manual. Millbrae, CA: California Academic Press.
Fountain, L. (2011). Thinking Like a 21st Century Nurse: Theory, Instruments, and Methodologies for Measuring Clinical Thinking. Paper presented at the Annual Meeting of the American Educational Research Association New Orleans, University of Maryland.
Goodstone, L., Goodstone, M. S., Cino, K., Glaser, C. A., Kupferman, K., & Dember-Neal, T. (2013). Effect of simulation on the development of critical thinking in associate degree nursing students. Nursing Education Perspectives, 34(3), 159-162.
Insight Assessment (2013). California Critical Thinking Disposition Inventory (CCTDI). San Jose, CA: The California Academic Press. Retrieved from http://www.insightassessment.com/Products/Products-Summary/Critical-Thinking-Attributes-Tests/California-Critical-Thinking-Disposition-Inventory-CCTDI#sthash.HBvLVC1c.dpbs
Jones, M. (2008). Developing clinically savvy nursing students: an evaluation of problem-based learning in an associate degree program. Nursing Education Perspectives, 29(5), 278-283.
Jun, W. H., Lee, E. J., Park, H. J., Chang, A. K., & Kim, M. J. (2013). Use of the 5E learning cycle model combined with problem-based learning for a fundamentals of nursing course. Journal of Nursing Education, 52(12), 681-689.
Kakai, H. (2003). Re-examining the factor structure of the California Critical Thinking Disposition Inventory. Perceptual and Motor Skills, 96(2), 435-438.
Kaplan, R. M., & Saccuzzo, D. P. (1997). Psychological Testing: Principles, Applications and Issues (4th ed.). Pacific Grove, CA: Brooks/Cole.
Lawson, S. B. (2012). The Effectiveness of Concept Mapping as an Educational Tool to Enhance Critical Thinking Skills in Undergraduate Nursing Students. Unpublished Thesis. Indiana: Ball State University.
L'Eplattenier, N. (2001). Tracing the development of critical thinking in baccalaureate nursing students. Journal of the New York State Nurses Association, 32(2), 27-32.
Lee, W., Chiang, C. H., Liao, I. C., Lee, M. L., Chen, S.L., & Liang, T. (2013). The longitudinal effect of concept map teaching on critical thinking of nursing students. Nurse Education Today, 33(10), 1219-1223. doi: 10.1016/j.nedt.2012.06.010
Mann, J. (2012). Critical Thinking and Clinical Judgment Skill Development in Baccalaureate Nursing Students. Kansas Nurse, 87(1), 26-31.
Mong-Chue, C. (2000). Professional issues. The challenges of midwifery practice for critical thinking. British Journal of Midwifery, 8(3), 179-183.
Morey, D. J. (2012). Development and Evaluation of Web-Based Animated Pedagogical Agents for Facilitating Critical Thinking in Nursing. Nursing Education Perspectives, 33(2), 116-120. doi: 10.5480/1536-5026-33.2.116
Muoni, T. (2012). Decision-making, intuition, and the midwife: Understanding heuristics. British Journal of Midwifery, 20(1), 52-56.
Naber, J., & Wyatt, T. H. (2014). The effect of reflective writing interventions on the critical thinking skills and dispositions of baccalaureate nursing students. Nurse Education Today, 34(1), 67-72. doi: 10.1016/j.nedt.2013.04.002
Park, S.H. (1999). The effects of the program for the improvement of college students' critical thinking ability [Korean]. Journal of Educational Psychology, 13(4), 93-112.
Park, J.A., & Kim, B.J. (2009). Critical thinking disposition and clinical competence in general hospital nurses [Korean]. Journal of Korean Academy of Nursing, 39, 840-850.
Paul, R. W. (1993). Critical Thinking. Santa Rosa, CA: Foundation for Critical Thinking.
Pucer, P., Trobec, I., & Žvanut, B. (2014). An information communication technology based approach for the acquisition of critical thinking skills. Nurse Education Today, 34(6), 964-970. doi: 10.1016/j.nedt.2014.01.011
Ravert, P. (2008). Patient simulator sessions and critical thinking. Journal of Nursing Education, 47(12), 557-562. doi: 10.3928/01484834-20081201-06
Rubenfeld, M. G., & Scheffer, B. K. (2006). Critical thinking TACTICS for nurses: tracking, assessing, and cultivating thinking to improve competency-based strategies. Sudbury, MA: Jones and Bartlett.
Scheffer, B. K., & Rubenfeld, M. G. (2000). A consensus statement on critical thinking in nursing. Journal of Nursing Education, 39(8), 352-359.
Scholes, J., Endacott, R., Biro, M., Bulle, B., Cooper, S., Miles, M., Gilmour, C., Buykx, P., Kinsman, L., Boland, R., Jones, J., & Zaidi, F. (2012). Clinical decision-making: midwifery students’ recognition of, and response to, post partum haemorrhage in the simulation environment. BMC Pregnancy and Childbirth, 12, 19.
Seldomridge, L. A., & Walsh, C. M. (2006). Measuring critical thinking in graduate education: what do we know? Nurse Educator, 31(3), 132-137.
Shin, K. R., Lee, J. H., Ha, J. Y., & Kim, K. H. (2006). Critical thinking dispositions in baccalaureate nursing students. Journal of Advanced Nursing, 56(2), 182-189. doi: 10.1111/j.1365-2648.2006.03995.x
Shinnick, M. A., & Woo, M. A. (2013). The effect of human patient simulation on critical thinking and its predictors in prelicensure nursing students. Nurse Education Today, 33(9), 1062-1067. doi: 10.1016/j.nedt.2012.04.004
Spelic, S. S., Parsons, M., Hercinger, M., Andrews, A., Parks, J., & Norris, J. (2001). Evaluation of critical thinking outcomes of a BSN program. Holistic Nursing Practice, 15(3), 27-34.
Stevens, J., Brenner, C., & Brenner, Z. R. (2009). The peer active learning approach for clinical education: a pilot study. Journal of Theory Construction & Testing, 13(2), 51-56.
Stewart, S., & Dempsey, L. F. (2005). A longitudinal study of baccalaureate nursing students' critical thinking dispositions. Journal of Nursing Education, 44(2), 81-84.
Stone, C. A., Davidson, L. J., Evans, J. L., & Hansen, M. A. (2001). Validity evidence for using a general critical thinking test to measure nursing students' critical thinking. Holistic Nursing Practice, 15(4), 65-74.
Sullivan-Mann, J., Perron, C. A., & Fellner, A. N. (2009). The effects of simulation on nursing students' critical thinking scores: a quantitative study. Newborn and Infant Nursing Reviews, 9(2), 111-116.
Tiwari, A., Lai, P., So, M., & Yuen, K. (2006). A comparison of the effects of problem-based learning and lecturing on the development of students' critical thinking. Medical Education, 40(6), 547-554.
Tseng, H. C., Chou, F. H., Wang, H.H., Ko, H.K., Jian, S.Y., & Weng, W.C. (2011). The effectiveness of problem-based learning and concept mapping among Taiwanese registered nursing students. Nurse Education Today, 31(8), e41-46. doi: 10.1016/j.nedt.2010.11.020
Walsh, C. M., & Hardy, R. C. (1997). Factor structure stability of the California Critical Thinking Disposition Inventory across gender and various student majors. Perceptual and Motor Skills, 85, 1211-1228.
Walsh, C. M., & Seldomridge, L. A. (2006). Measuring critical thinking: one step forward, one step back. Nurse Educator, 31(4), 159-162.
Watson, G., & Glaser, E. M. (1980). Watson-Glaser Critical Thinking Appraisal. San Antonio, TX: Psychological Corp.
Wheeler, L. A., & Collins, S. K. R. (2003). The influence of concept mapping on critical thinking in baccalaureate nursing students. Journal of Professional Nursing, 19(6), 339-346.
Wood, R. Y., & Toronto, C. E. (2012). Measuring Critical Thinking Dispositions of Novice Nursing Students Using Human Patient Simulators. Journal of Nursing Education, 51(6), 349-352. doi: 10.3928/01484834-20120427-05
Yeh, M., & Chen, H. (2005). Effects of an educational program with interactive videodisc systems in improving critical thinking dispositions for RN-BSN students in Taiwan. International Journal of Nursing Studies, 42(3), 333-340.
Yu, D., Zhang, Y., Xu, Y., Wu, J., & Wang, C. (2012). Improvement in critical thinking dispositions of undergraduate nursing students through problem-based learning: a crossover-experimental study. Journal of Nursing Education, 52(10), 574-581.
Yuan, H., Kunaviktikul, W., Klunklin, A., & Williams, B. A. (2008). Improvement of nursing students' critical thinking skills through problem-based learning in the People's Republic of China: a quasi-experimental study. Nursing & Health Sciences, 10(1), 70-76.
Zadeh, H. H., Khajeali, N., Khalkhali, H., & Mohammadpour, Y. (2014). Effect of evidence-based nursing on critical thinking disposition among nursing students. Life Science Journal, 11(9 Spec. Issue), 487-491.