Running Head: VIRTUAL REALITY TESTING IN TBI PATIENTS
Assessing the Utility of a Virtual Reality Test of Executive Dysfunction on Traumatic
Brain Injury Patients
Matthew R. J. Vandermeer
University of Toronto Scarborough
996234445
Aug 09, 2015
Abstract
The present study examines the ability of an ecologically valid “Virtual Reality (VR)
Office Task” test of executive functioning (alongside a battery of traditional tests of
executive functioning) to discriminate between a group of 30 non-injured control subjects
and 5 TBI patients. Statistical comparison of the means demonstrated that the two groups
performed significantly differently on the novel VR test but failed to reach significant
differences on the majority of traditional tests of executive functioning. Analysis of the
magnitude of the difference between the two groups’ performances using Cohen’s d
effect sizes showed that the greatest magnitude difference was found among the three
scores of the VR tests. This VR Office Task may prove to have clinical utility in
assessing ongoing cognitive dysfunction in TBI patients.
Introduction
Neuropsychological tests occupy an extremely important role in examining a patient’s
cognitive deficits incurred as a result of traumatic brain injury (TBI). Clinical diagnosis is
based on a patient’s level of performance on these tests (Lezak, Howieson, & Loring,
2004). The clinician’s diagnosis consequently has a wide range of implications for the
life of the patient, the most important of which is predicting the degree to which the
patient can expect to function with regard to tasks in their everyday life (Chaytor &
Schmitter-Edgecombe, 2003). Therefore, it is of utmost importance that these tests
accurately predict the degree to which an individual with a TBI can expect to suffer from
deficits in their daily functioning.
Mild traumatic brain injury
TBI can be classified as mild, complicated mild, moderate, or severe.
Classification is based on testing at the time of injury, Glasgow Coma Scale (GCS;
Teasdale & Jennett, 1974), post-injury characteristics, duration of Loss of Consciousness
(LOC; Lezak et al., 2004), and Post Traumatic Amnesia (PTA; Bond, 1990). Mild TBI
(mTBI), specifically, is defined by a head trauma resulting in a GCS score of 13 to 15,
PTA of less than 24 hours (commonly much less), and a very brief LOC (if any) of
seconds to minutes (Alexander, 1995). Increasingly severe TBI are defined by decreasing
scores on the GCS, and increasingly longer periods of LOC and PTA.
Recovery from mTBI
Expected recovery. While there is consensus regarding ongoing cognitive
dysfunction among moderate and severe TBI patients, the outcome for mTBI patients is
much less clear (Dikmen et al., 2009; Ponsford et al., 2000). Neuropsychological
sequelae post-mTBI include: reduced visuo-motor speed (Levin et al., 1987), deficits in
attention (Chan, 2005; Ziino & Ponsford, 2006), reduced expressive fluency (Henry &
Crawford, 2004), and slowed information processing (Johansson, Berglund, & Ronnback,
2009; Mathias et al., 2004). In addition, there is considerable evidence that executive
functioning is impaired following TBI (Hartikainen et al., 2010; Nolin, 2006; Ord, Greve,
Bianchini, & Aguerrevere, 2010). Most research has demonstrated that,
following mild cognitive impairment in the immediate post-traumatic period (Maddocks
& Saling, 1996), the vast majority of mTBI patients recover within 1-3 months,
showing no long-term neuropsychological deficits (Alexander, 1995; Ponsford et al.,
2000; Voller et al., 1999).
Persistent difficulties in mTBI patients. Despite evidence for mTBI recovery
within 1-3 months, many patients report prolonged difficulties in aspects of everyday
living related to executive dysfunction (Conboy, Barth, & Boll, 1986). Konrad et al.
(2011) demonstrated that even 6 years following an mTBI, patients showed persistent
cognitive dysfunction along with significantly higher self-ratings of impairment in daily
life. Alves, Macciocchi, and Barth (1993) found that a small but significant number of
mTBI patients continued to self-report impairment up to 1 year post-trauma. Additionally,
several studies have shown persistent difficulties following an mTBI after the
typical recovery phase, as demonstrated by a failure to return to work (Ruffolo, Friedland,
Dawson, Colantonio, & Lindsay, 1999; Hanlon, Demery, Martinovich, & Kelly, 1999).
These two contradictory findings, that most mTBI patients recover to premorbid
levels within 3 months of injury and that some mTBI patients persist in self-reporting
difficulties in everyday functioning, present a problem. If neuropsychological testing
reflects a full recovery, why is it that some individuals claim that their difficulties persist?
Neuropsychological tests and ecological validity
Over the past several decades a number of tests have been developed to assist in
the assessment of executive dysfunction. These include the Wisconsin Card Sort Test
(WCST; Heaton, Chelune, Talley, Kay, & Curtiss, 1993), the Ruff Figural Fluency Test
(RFFT; Ruff, 1996), and the Tower of London Revised (TOL-DX; Culbertson & Zillmer,
2001). A frequently argued limitation to the use of traditional neuropsychological tests
such as these is their lack of ecological validity. Ecological validity refers to how well a
test predicts how one performs in the real world, outside of the testing environment. The
concept of ecological validity in testing can best be explained along two dimensions,
veridicality and verisimilitude (Chaytor & Schmitter-Edgecombe, 2003). Veridicality is
defined as the degree to which neuropsychological tests are statistically related to one’s
performance in the real world (Franzen & Wilhelm, 1996). Verisimilitude is defined as
the degree to which a neuropsychological test resembles a task found in everyday life
(Franzen & Wilhelm, 1996). The main idea behind the design of tests that are verisimilar
is that performance on these tests will reflect individual abilities in performing everyday
tasks, without taking into account the specific cause of deficits (Chaytor &
Schmitter-Edgecombe, 2003). That is to say, it is the prediction of future, non-testing
behaviours that is of interest, not necessarily an explanation of the behaviour itself or the
cause of the behaviour.
The concept of ecological validity in testing neuropsychological function is one of
vital importance. The assumption has often been made in the past that diminished
functioning in a patient’s everyday behaviour could be inferred from poor performance
on neuropsychological tests (Chaytor, Schmitter-Edgecombe, & Burr, 2006). This is an
important assumption to examine, and perhaps reevaluate, in light of the finding that
many neuropsychological tests do not predict the level of real-life functioning (Wilson,
1993).
Several explanations have been put forth as to why traditional neuropsychological
testing has lacked ecological validity and, more specifically, has shown limited verisimilitude.
First, traditional neuropsychological testing has taken place in a heavily controlled and
artificial testing environment. The test environment is often quiet, distraction free, with
strictly determined rules, and all behaviours prompted by the clinician (Manchester,
Priestley, & Jackson, 2004). This removes many of the environmental challenges that an
individual with executive dysfunction may face in real life. It is
very possible that the testing environment is so different from the real world that the
cognitive factors being assessed in the testing environment are independent of those used
in everyday life (Burgess, Alderman, Evans, Emslie, & Wilson, 1998; Norris & Tate,
2000).
A second factor contributing to limited ecological validity in neuropsychological
testing is that many of the tests now being used in a clinical setting were originally
developed for use in research (Manchester et al., 2004). The requirements for assessment
tools in research and clinical settings are fundamentally different (Burgess et al., 2006).
Tasks derived and used in a research setting usually aim to explain some relationship
between the human brain and behaviour; this is not necessarily what the clinician is
interested in. Rather, it is the functional outcome, or what the patient is capable of in the
real world, that is of primary clinical interest (Burgess et al., 2006). In other words, tasks
that may be useful in delineating relationships between biology and behaviour in a
research setting do not necessarily describe functionality in the real world (the primary
concern of the clinician).
Virtual reality
The use of Virtual Reality (VR) in a clinical capacity is increasingly common,
including in psychiatric treatment (Rothbaum et al., 1995), neurocognitive rehabilitation
(Trepagnier, 1999; Wilson, Foreman, & Stanton, 1997), and surgical training (Seymour et
al., 2002). The application of VR to increase ecological validity in neuropsychological
testing is also becoming increasingly prevalent (Campbell et al., 2009; Kang et al., 2008;
Christiansen et al., 1998).
The use of VR offers several advantages over traditional neuropsychological tests.
One of the most obvious of these is immersion of the patient in a realistic Virtual
Environment (VE; Schultheis, Himelstein, & Rizzo, 2002). The use of VEs provides an
increased approximation of the environment and situations that a patient is likely to
encounter in the real world, and therefore an increase in testing’s verisimilitude and
ecological validity. In the most general sense, VR allows the clinician or researcher to
directly observe an individual’s functionality in a real world situation, while placing strict
controls on the environment (Schultheis et al., 2002).
Present study
While traditional neuropsychological testing routinely shows that mTBI patients
recover by 3 months post-injury (Alexander, 1995; Ponsford et al., 2000; Voller et al.,
1999), many patients report persistent challenges with cognitive dysfunction. There is a
disconnect between what the neuropsychological tests indicate and what many mTBI
patients report. Accordingly, the purpose of the present study was to examine an
ecologically valid VR measure of executive function (VR Office Task) in order to
determine if it can be useful in discerning executive dysfunction in TBI patients after the
expected recovery period.
For the purposes of this study several hypotheses have been developed.
Hypothesis 1 states that due to the low ecological validity of traditional
neuropsychological tests of executive functioning (i.e. the Wisconsin Card Sort Test,
Tower of London, and Ruff Figural Fluency Test), no significant differences will be
found between the performance of non-injured control subjects and TBI patients.
Hypothesis 2 states that due to the high level of ecological validity afforded by the virtual
environment in the VR Office Task, there will be a significant difference between the
performance of non-injured control subjects and TBI patients. Finally, hypothesis 3 states
that the greatest magnitude of difference in performance between the two groups will be
found in the VR Office Task (as measured by effect sizes).
Methods
Participants
Control subjects were healthy individuals, with no history of psychological
disturbance or head injury. They were primarily drawn from an introductory psychology
class at University of Toronto Scarborough, along with a number of non-student
members of the community. Student participation was compensated by the addition of
1% onto their final grade, while community members were not compensated. The final
control sample consisted of 30 healthy individuals. The control sample had a mean age of
20.1 (SD = 5.67, range = 18 – 49), was 63.3% female, and had a mean education of 12.9
years (SD = 1.41, range = 11 – 16) (see Table A1).
Clinical participants were drawn from a population of individuals diagnosed with
TBI. The final clinical sample consisted of 5 patients, had a mean age of 40.0 (SD =
26.5, range = 20 – 83), was 100% male, and had a mean education of 15.6 years (SD =
4.34, range = 12 – 22). Medical files indicated a mean Glasgow Coma Scale score of 11.4 (SD
= 4.9, range = 3 – 15), and an average of 409 days (SD = 77.5, range = 278 – 477)
between date of loss and date of assessment.
Any subject with a history of psychological diagnosis, or who failed to score >45
on the Test of Memory Malingering (used here as an assessment of effort [O’Bryant,
Engel, Kleiner, Vasterling, & Black, 2007]), or whose scores on the Personality
Assessment Inventory (PAI) indicated either invalid results or some psychological
disorder, was excluded from the study. This resulted in 6 control subjects being removed
from the final analysis.
Materials
In order to assess the contribution of personality on executive functioning tasks,
subjects were administered the Personality Assessment Inventory (PAI). The Test of
Memory Malingering (TOMM) was administered to subjects as a measure of effort level.
The Wide Range Achievement Test 4th Edition (WRAT) Word Reading subtest was
administered in order to evaluate participant reading level. Finally, executive functioning
was assessed using the Tower of London-Drexel University Second Edition (TOL), the
64-card Wisconsin Card Sort Test (WCST-64), Ruff Figural Fluency Test (RFFT), and a
novel VR Office Task (VROT).
Personality Assessment Inventory. The PAI (Morey, 1991) is a self-report,
multiple-scale measure designed to assess both psychopathology and personality. The
344-item scale consists of 22 orthogonal scales: 4 validity scales, 11 clinical scales, 5
treatment scales, and 2 interpersonal scales. Each item is rated by participants on a 4-
point Likert scale from “false, not at all true” to “very true”. The PAI was primarily used
to screen for the presence of psychopathology in participants.
Wide Range Achievement Test – 4. The WRAT (Wilkinson & Robertson,
2006) is a brief assessment of academic skills, including word reading, spelling, and basic
math. The WRAT reading subtest was used to quickly assess reading level among
subjects.
Test of Memory Malingering. The TOMM was originally designed as a test to
detect symptom validity in neuropsychological testing (Tombaugh, 1996), and as such
has become the most frequently administered symptom validity test among clinical
neuropsychologists (Slick, Tan, Strauss, & Hultsch, 2004). The TOMM trial 1 was
administered to participants according to standard procedures as outlined in Tombaugh
(1996). Participants first completed a practice trial consisting of 2 stimulus pictures they
had to remember, and then identify, using a forced choice paradigm with two options.
Following the practice trial, participants completed TOMM trial 1, which is identical to
the practice trial, except that there are 50 stimuli instead of 2.
Following the O’Bryant et al. (2008) finding that little additional information
could be obtained by administering the full TOMM to subjects who score >45 on TOMM
trial 1, a discontinue rule for subjects who scored >45 on TOMM trial 1 was followed.
Additionally, TOMM trial 1 has emerged as an efficient tool to quickly screen for
sufficient effort in participants (O’Bryant et al., 2007). This study used TOMM to screen
against insufficient effort in participants. Participants who failed to score >45 on any
given trial of the TOMM were excluded from the study.
Tower of London-Drexel University Second Edition. The TOL was designed to
assess a subject’s executive functioning, specifically planning behaviour (Shallice, 1982).
TOL consists of 12 trials (two practice trials and ten test trials). Participants are required
to rearrange 3 coloured rings from their initial position on 2 of 3 upright pegs of varying
length, to a new predetermined position. As the participant rearranges the rings, they
must follow two rules: the participant cannot attempt to place more rings on a given peg
than it can support due to its length (Rule I), and the participant may only move one ring
at a time (Rule II). Participants are marked as correct for a given trial if they correctly
rearrange the rings within the minimum number of moves required (ranging from 2 to 7),
within the allotted time limit (2 minutes per trial). Scores were calculated based on total
move score (sum of total moves minus the minimum number of moves required), total
time (sum of times required for each trial), total initiation time (sum of time taken to
begin each trial), and the total correct score (number of trials completed within the
minimum move count).
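The four TOL summary scores described above can be illustrated with a minimal Python sketch. It is not the published scoring procedure: the `TolTrial` record, its field names, and the `score_tol` helper are hypothetical names introduced here, and the sketch assumes that only the test trials (not the practice trials) are passed in and that a trial's total time runs from presentation to completion.

```python
from dataclasses import dataclass
from typing import Dict, List

TIME_LIMIT_SECONDS = 120.0  # 2 minutes per trial, as described in the text


@dataclass
class TolTrial:
    """One test trial (hypothetical record layout, for illustration only)."""
    moves: int                 # moves the participant actually made
    min_moves: int             # minimum moves required to solve the trial
    total_seconds: float       # assumed: time from presentation to completion
    initiation_seconds: float  # time taken to begin the trial


def score_tol(trials: List[TolTrial]) -> Dict[str, float]:
    """Summarise TOL performance over the scored test trials."""
    return {
        # Moves made beyond the minimum required, summed over trials.
        "total_move_score": sum(t.moves - t.min_moves for t in trials),
        "total_time": sum(t.total_seconds for t in trials),
        "total_initiation_time": sum(t.initiation_seconds for t in trials),
        # Trials solved in the minimum number of moves within the time limit.
        "total_correct": sum(
            1 for t in trials
            if t.moves == t.min_moves and t.total_seconds <= TIME_LIMIT_SECONDS
        ),
    }
```

For example, a participant who solves one trial in the minimum number of moves, takes two extra moves on a second, and exceeds the time limit on a third would receive a total move score of 2 and a total correct score of 1.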
Wisconsin Card Sort Test-64. The Wisconsin Card Sort Test (WCST) is
designed to test the subject’s executive functioning, specifically formation of abstract
thoughts, and cognitive flexibility (Heaton et al., 1993). Rabin, Barr, and Burton (2005)
reported that the WCST is the most commonly used tool to assess executive functioning;
it is therefore a necessary inclusion in any battery assessing the ecological validity of a
novel test of executive functioning. The WCST-64 is an abbreviated form of the full
128-card Wisconsin Card Sort Test (WCST), following the same administration instructions
as laid out in the full WCST manual (Heaton et al., 1993). Vayalakkara, Devaraju-
Backhaus, Bradley, Simco, and Golden (2000) demonstrated the high validity of the
WCST-64 in predicting scores on the full WCST, while simultaneously reducing
administration time. For these reasons the WCST-64 was used.
Subjects were presented with four stimulus cards, varying on three patterned
dimensions: colour of items, form (shape) of items, and number of items. Subjects were
then instructed to sort response cards underneath each of the four stimulus cards, according
to their own judgment of the correct sorting pattern. Simple feedback as to whether they
were correctly sorting response cards was given to participants, according to a
predetermined set of rules of which the subject was not aware. After 10 correct responses, the
sorting rule was changed (sorting by colour, form, or number) without the subject’s
awareness. The test was complete when the subject had finished six sets of 10 correct
trials or the entire 64-card deck had been used. Scores were calculated based on number of
trials to reach first category, number of errors made, number of sets completed, and failure
to maintain set errors (incorrect card placement after 5 consecutive correct placements)
(Heaton et al., 1993).
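The rule-switching and stopping logic described above can be sketched in Python as follows. This is a simplified illustration, not the scoring procedure of the WCST manual: the function name `score_wcst64` is hypothetical, each response is assumed to be attributable to a single sorting dimension (real cards can match on several dimensions at once), and the colour, form, number cycling order is an assumption.

```python
from typing import Dict, List, Optional

RULE_ORDER = ["colour", "form", "number"]  # assumed cycling order of rules


def score_wcst64(sorted_by: List[str], deck_size: int = 64,
                 run_length: int = 10, max_sets: int = 6) -> Dict[str, Optional[int]]:
    """Score a sequence of responses, each labelled with the dimension used."""
    rule_index = 0
    streak = 0  # consecutive correct placements under the active rule
    sets_completed = 0
    errors = 0
    failures_to_maintain_set = 0
    trials_to_first_category: Optional[int] = None

    for trial_number, dimension in enumerate(sorted_by[:deck_size], start=1):
        active_rule = RULE_ORDER[rule_index % len(RULE_ORDER)]
        if dimension == active_rule:
            streak += 1
            if streak == run_length:
                # A set is complete: record it and silently switch the rule.
                sets_completed += 1
                if trials_to_first_category is None:
                    trials_to_first_category = trial_number
                rule_index += 1
                streak = 0
                if sets_completed == max_sets:
                    break
        else:
            errors += 1
            # An error after 5 or more consecutive correct placements.
            if streak >= 5:
                failures_to_maintain_set += 1
            streak = 0

    return {
        "trials_to_first_category": trials_to_first_category,
        "errors": errors,
        "sets_completed": sets_completed,
        "failures_to_maintain_set": failures_to_maintain_set,
    }
```

Under these assumptions, a subject who sorts by colour for 11 cards and then by form for 10 completes two sets with one error, because the eleventh card is sorted by the old rule after the unannounced switch.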
Ruff Figural Fluency Test. The RFFT (Ruff, Light, & Evans, 1987) is a figural
test of a person’s ability to shift cognitive set, use planning strategies, and their overall
executive functioning in coordinating these activities (Psychological Assessment
Resources [PAR], 2012). The RFFT consists of 5 different trials, each consisting of a
piece of paper with 40 squares, each containing an identical arrangement of 5 dots. The
first three trials have the dots arranged identically, with trials 2 and 3 containing
interference patterns (lines or other shapes). Trials 4 and 5 each contain different
arrangements of 5 dots with no interference patterns. Subjects were instructed to create as
many different “patterns” or “figures” as they could by connecting 2 or more of the dots
in each of the squares. They were then allowed to complete a practice trial before each of
the test trials, consisting of 3 squares identical to the 40 squares in the testing trial.
Subjects were corrected following the sample trial if any errors were made (i.e. two or
more identical patterns). They then completed as many different patterns or figures as
possible in one minute for each of the trials. Scores were calculated based on the number
of different patterns provided, and the number of perseverations (or repeated patterns).
Virtual Reality Office Task. The VROT is a brief VR task based on the WCST.
Participants were instructed to imagine that they were working for a courier company and
it was their job to deliver packages to the correct rooms inside an office building.
Participants were given a game pad in order to navigate the virtual office building. They
were instructed to approach a shipping cart filled with packages to pick up a new
package, and then approach the door they believed the package should be delivered to,
pressing the appropriate button on the game pad to deliver the virtual package. For every
package delivered the computer prompted “CORRECT DOOR” or “INCORRECT
DOOR”. Regardless of their result subjects were to then return to the shipping cart to
pick up a new package and try to determine how to correctly deliver each of the rest of
the packages.
There are 4 doors that packages can be delivered to, all labeled with different
signs identifying the business inside. From left to right the rooms are: 401-410 (The
Doctor’s Office), 411-420 (Riverdale Florists), 421-430 (Shutterbugs Photography), and
431-440 (A Touch of Class Catering). Packages are labeled with either: the appropriate
room number, printed paraphernalia associated with one of the 4 rooms, or the exact
sign/logo as is shown on the doors. There is no time limit for this task; participants must
continue until the simulation is complete (45-50 packages per simulation).
Scores were calculated based on the total number of packages delivered, the
number of correct deliveries, number of incorrect deliveries, perseveration errors (subject
making more than 2 consecutive incorrect deliveries), and failure to maintain set (subject
making an incorrect delivery following 3 consecutive correct deliveries).
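The VROT scores defined above can be illustrated with a short Python sketch that works from the sequence of delivery outcomes. The function name `score_vrot` is hypothetical, and two counting conventions are assumptions, since the definitions above do not fix them: a perseveration error is counted for every incorrect delivery that is the third or later in an unbroken run of incorrect deliveries, and a failure to maintain set is counted for every incorrect delivery preceded by at least 3 consecutive correct deliveries.

```python
from typing import Dict, List


def score_vrot(deliveries: List[bool]) -> Dict[str, int]:
    """Score a sequence of delivery outcomes (True = correct door)."""
    correct_run = 0    # current unbroken run of correct deliveries
    incorrect_run = 0  # current unbroken run of incorrect deliveries
    perseverations = 0
    failures_to_maintain_set = 0

    for was_correct in deliveries:
        if was_correct:
            correct_run += 1
            incorrect_run = 0
        else:
            # Assumed rule: error after at least 3 consecutive correct deliveries.
            if correct_run >= 3:
                failures_to_maintain_set += 1
            incorrect_run += 1
            correct_run = 0
            # Assumed rule: third or later error in a row counts once each.
            if incorrect_run > 2:
                perseverations += 1

    n_correct = sum(1 for d in deliveries if d)
    return {
        "total_deliveries": len(deliveries),
        "correct_deliveries": n_correct,
        "incorrect_deliveries": len(deliveries) - n_correct,
        "perseverations": perseverations,
        "failures_to_maintain_set": failures_to_maintain_set,
    }
```

With these conventions, three correct deliveries followed by four incorrect ones and a final correct one would yield 4 incorrect deliveries, 2 perseveration errors, and 1 failure to maintain set.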
It is believed that the more immersive VR environment in the VR Office Task
will provide increased verisimilitude (similarity to everyday tasks) and be more
ecologically valid than traditional neuropsychological tests. As such, it is expected that
this improved ecological validity will translate to better detection of the subtle executive
dysfunctions that many mTBI patients complain about following their expected recovery
phase.
Procedure
Demographic information was collected at the onset of testing for all participants.
Injury characteristics for TBI patients were derived from patient clinical reports. Subjects
were then administered a neuropsychological battery containing the tests discussed
above. Following completion of the traditional tests subjects were familiarized with the
VR controller and instructed to complete the VR Office Task.
Statistical Analysis
All test scores were assessed for normality using the Shapiro-Wilk test, separately for
injured patients and non-injured controls. Those scores that were found to have normal
distributions in both injured and non-injured groups were compared using two-tailed
t-tests. Those scores that were found to have non-normal distributions were compared using
the Mann-Whitney U test. Effect sizes were computed using Cohen’s d for all test scores.
All analyses were performed using SPSS 20.0.0.
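The analyses themselves were run in SPSS, so the following Python sketch is only an illustration of how the Mann-Whitney U statistic is defined, not a reproduction of the procedure used here. The function name `mann_whitney_u` is hypothetical; the sketch counts, over all patient-control pairs, how often one group's score exceeds the other's (ties count one half) and returns the smaller of the two resulting U values, which is a commonly reported convention. It does not compute a p value.

```python
from typing import Sequence


def mann_whitney_u(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Return the smaller of the two U statistics for two independent samples."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0   # group_a score beats group_b score
            elif a == b:
                u_a += 0.5   # ties contribute half a point to each group
    # The two U statistics always sum to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)
```

Because tied scores contribute one half, U can take non-integer values, and a U of 0 indicates that every score in one group lies below every score in the other.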
Results & Discussion
Table A2 compares the TBI patients’ and normal controls’ normed and raw scores for all
tests of executive functioning, using two-tailed t-tests and Mann-Whitney U tests. The
scores indicate the majority of traditional neuropsychological tests were not able to
significantly differentiate TBI patients from normal controls. This is not surprising, given
that our patient group was assessed a mean of 409 days (SD = 77.5, range = 278 – 477)
after injury, well past the 3-month period of typical recovery seen in mild TBI patients
(Rohling et al., 2011). At this point in recovery, mild TBI patients should be cognitively
indistinguishable from non-injured control subjects. Any further subtle executive
dysfunctions would likely go unnoticed by these traditional tests. However, many patients
persist in their complaints of diminished abilities in everyday life well past this recovery
period (Konrad et al., 2011; Alves et al., 1993; Ruffolo et al., 1999; Hanlon et al., 1999).
The lack of ecological
validity in traditional neuropsychological tests likely accounts for this disparity between
the testing environment, where it appears the patient is fully recovered, and the patients’
everyday life, where they are faced with ongoing cognitive problems. These findings
agree with past research demonstrating that traditional tests of executive function poorly
distinguish normal control subjects from mild TBI patients (Cockburn, 1995; Levin,
Goldstein, Williams, & Eisenberg, as cited in Lezak et al., 2004, p. 618; Ord et al., 2010).
These findings (that performance on the majority of neuropsychological tests of
executive functioning showed no significant difference between control subjects and TBI
patients) provide partial support for hypothesis 1. Of the traditional tests of executive
functioning used, only the difference between the z scores of TBI patients (M = -1.10,
SD = 0.83) and control subjects (M = -0.02, SD = 1.27) for TOL execution time reached
significance, U = 19.5, p = 0.030. The TOL is notable for its focus on assessing
prospection, the ability to plan ahead to reach a specified goal and all steps in between.
Unterrainer et al. (2004) found that increased execution time in the TOL was associated
with decreased planning ability. It may be that one of the main subtle deficits that TBI
patients face is decreased prospection, or preplanning, leading to a significantly longer
execution time during the TOL even after the expected recovery time.
In contrast to the majority of traditional executive function tests, the novel VR
Office Task implemented was able to significantly distinguish TBI patients from normal
control subjects (“incorrect deliveries”: U = 11.0, p = 0.022; “failure to maintain set”: U
= 7.50, p = 0.009; “perseverations”: U = 30.0, p = 0.002). It seems that the virtual
environment in the VR task approximates a hypothetical real life environment and
situation, thereby increasing the ecological validity of the test (verisimilitude).
Essentially, the delivery of differently labeled packages is a more likely and realistic
scenario of a patient’s real life than sorting various cards by some vague rule set (as in
the WCST). This improved ecological validity is elucidating the subtle cognitive
dysfunctions of the TBI patient group that were missed by a battery of traditional
neuropsychological tests. The finding that the virtual environment produced significant
differences between control subject and TBI patient performance on the VR Office Task
provides support for hypothesis 2.
One of the main limitations of this study is the extremely low sample size for the
TBI patient group (n=5). Looking more closely, one can see that the TBI patient sample
size is never greater than n=4 for any given test of executive dysfunction. As such, it is
pertinent to employ a statistic that is not as dependent on sample size as the Student’s t and
Mann-Whitney U tests used above, namely Cohen’s d effect size. Cohen’s d also provides
the added benefit of considering the magnitude of difference between patient and control
group performances, rather than simply investigating the significance of their differences
(Zakzanis, 2001).
Table A3 presents effect sizes and percent overlap in scores (Zakzanis, 2001)
between the TBI group and the normal control group. Percent overlap refers to the
fraction of the sample that overlaps in terms of performance on any given test – the lower
the percent overlap, the greater the magnitude of difference between the two groups.
Using Cohen’s (1988) heuristic benchmarks for interpreting the magnitude of effect sizes
(i.e. effect sizes of 0.2 are taken as small magnitude, 0.5 as medium, and 0.8 and up as
large) we see that several tests can be considered to have large effect sizes. Of the
traditional neuropsychological tests employed, these include the z score for TOL execution
time (d = 1.52, OL% = 28.8) and the z score for TOL number of time violations (d = 0.89,
OL% = 48.8). All scores collected from the VR Office Task were found to have large
effect sizes, including: incorrect deliveries (d = -2.33, OL% = 13.9), failure to maintain
set (d = -2.54, OL% = 11.4), and perseverations (d = -2.25, OL% = 15.0). TBI patients
tended to underperform compared to normal control subjects on the VR Office Task. The
rest of the executive tests had effect sizes below Cohen’s benchmark of a “large effect”.
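The relationship between effect size and percent overlap can be made concrete with a minimal Python sketch. Two assumptions are made, neither of which is stated explicitly in the text: Cohen’s d is computed with the pooled standard deviation (the common textbook definition), and percent overlap is taken as the complement of Cohen’s (1988) U1 non-overlap statistic for two equal-variance normal distributions. The second assumption appears to be consistent with the values reported above; for example, d = 1.52 gives an overlap of approximately 28.8%. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d using the pooled standard deviation (assumed definition)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def percent_overlap(d: float) -> float:
    """Percent overlap, assumed to be 100 * (1 - U1) from Cohen (1988)."""
    # U2: proportion of one distribution lying beyond the midpoint between
    # the two means; U1: proportion of the combined area that is not shared.
    u2 = NormalDist().cdf(abs(d) / 2)
    u1 = (2 * u2 - 1) / u2
    return 100 * (1 - u1)
```

Under these assumptions an effect size of zero corresponds to 100% overlap, and overlap shrinks as the magnitude of d grows, which is why the VR Office Task scores, with the largest effect sizes, also show the smallest overlap.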
As Zakzanis (2001) explained, lack of statistical significance does not necessarily
equate to lack of effect. The argument can be made that the reverse is also true:
statistical significance does not necessarily equate to the presence of an effect. These
findings are important as they demonstrate that not only are the differences between
control subject and TBI patient performances on the VR Office Task statistically
significant, but they are also clinically significant. With an overlap between the two
groups’ performance on the VR task ranging between 11.4% and 15.0%, the VR Office
Task offers a potential clinical tool for discriminating between TBI patients and control
subjects following the expected recovery time. These findings provide support for
hypothesis 3.
There are several limitations to the current study. As was already mentioned
above, the TBI sample size was extremely low (n=5), and not every patient could be
tested on all tests of executive dysfunction, resulting in a range of sample sizes from n=3
to n=4. This extremely limited sample size throws the results of any comparison-of-means
tests (i.e. t-test, Mann-Whitney U test) into question. Further studies should aim to
acquire a larger sample size. Furthermore, mainly due to our limited pool of TBI patients,
not all TBI patients studied were suffering from a mild TBI; we therefore pooled all TBI
patients under the umbrella “TBI” rather than accounting for severity. It would be
essential in the future to analyze a larger sample of TBI patients, stratifying analysis by
TBI severity. The TBI group in the current study also had a noticeably higher average
age, a greater proportion of males, and a greater average level of education than the
counterpart control subject group. In the future the disparity between the two groups’
demographics should be reduced. Finally, in an effort to better evaluate the ecological
validity of the VR Office Task, TBI patient performance on it should be compared to
their level of functioning in the real world (e.g., ability to return to work, cook, or drive).
Conclusions
The VR Office Task is an ecologically valid tool that has the clinical potential to
assess the presence of ongoing executive dysfunction in TBI patients, otherwise
undetected by traditional neuropsychological tests. Further study is needed in order to
increase sample size, study the relationship between performance on the VR Office Task
and severity of TBI, and better assess its ecological validity in terms of patient ability to
function in the real world.
References
Alexander, M. P. (1995). Mild traumatic brain injury: Pathophysiology, natural history,
and clinical management. Neurology, 45, 1253-1260.
Alves, W., Macciocchi, S. N., & Barth, J. T. (1993). Postconcussive symptoms after
uncomplicated mild head injury. Journal of Head Trauma Rehabilitation, 8(3),
48-59.
Bond, M. R. (1990). Standardized methods of assessing and predicting outcome. In
Rosenthal, M., Bond, M. R., Griffith, E. R., & Miller, J. D. (Eds.), Rehabilitation
of the adult and child with traumatic brain injury (2nd ed.). Philadelphia: Davis.
Burgess, P. W., Alderman, N., Evans, J., Emslie, H., & Wilson, B. A. (1998). The
ecological validity of tests of executive function. Journal of the International
Neuropsychological Society, 4, 547-558.
Burgess, P. W., Alderman, N., Forbes, C., Costello, A., Coates, L. M., Dawson, D. R.,
…, & Channon, S. (2006). The case for the development and use of “ecologically
valid” measures of executive function in experimental and clinical
neuropsychology. Journal of the International Neuropsychological Society, 12,
194-209.
Campbell, Z., Zakzanis, K. K., Jovanovski, D., Joordens, S., Mraz, R., & Graham, S. J.
(2009). Utilizing virtual reality to improve the ecological validity of clinical
neuropsychology: An fMRI case study elucidating the neural basis of planning by
comparing the Tower of London with a three-dimensional navigation task.
Applied Neuropsychology, 4, 295-306.
Chan, R. C. K. (2005). Sustained attention in patients with traumatic brain injury.
Clinical Rehabilitation, 19, 188–193.
Chaytor, N., & Schmitter-Edgecombe, M. (2003). The ecological validity of
neuropsychological tests: A review of the literature on everyday cognitive skills.
Neuropsychology Review, 13(4), 181-197.
Chaytor, N., Schmitter-Edgecombe, M., & Burr, R. (2006). Improving the ecological
validity of executive functioning assessment. Archives of Clinical
Neuropsychology, 21, 217-227.
Christiansen, C., Abreu, B., Ottenbacher, K., Huffman, K., Masel, B., & Culpepper, R.
(1998). Task performance in virtual environments used for cognitive
rehabilitation after traumatic brain injury. Archives of Physical Medicine and
Rehabilitation, 79, 888-892.
Cockburn, J. (1995). Performance on the Tower of London test after severe head injury.
Journal of the International Neuropsychological Society, 1, 537-544.
Conboy, T. J., Barth, J., & Boll, T. J. (1986). Treatment and rehabilitation of mild and
moderate head trauma. Rehabilitation Psychology, 31(4), 203-215.
Culbertson, W. C., & Zillmer, E. A. (2001). Tower of London: Drexel University (TOL-
DX): Test manual. Toronto, Canada: Multi Health Systems.
Dikmen, S. S., Corrigan, J. D., Levin, H. S., Machamer, J., Stiers, W., & Weisskopf, M.
G. (2009). Cognitive outcome following traumatic brain injury. Journal of Head
Trauma Rehabilitation, 24(6), 430-438.
Franzen, M. D., & Wilhelm, K. L. (1996). Conceptual foundations of ecological validity
in neuropsychology. In R. J. Sbordone & C. J. Long (Eds.), Ecological validity
of neuropsychological testing (pp. 91–112). Delray Beach, FL: GR Press/St. Lucie
Press.
Hanlon, R. E., Demery, J. A., Martinovich, Z., & Kelly, J. P. (1999). Effects of acute
injury characteristics on neuropsychological status and vocational outcome
following mild traumatic brain injury. Brain Injury, 13(11), 873-887.
Hartikainen, K. M., Waljas, M., Isoviita, T., Dastidar, P., Liimatainen, S., Solbakk, A. K.,
…, Ohman, J. (2010). Persistent symptoms in mild to moderate traumatic brain
injury associated with executive dysfunction. Journal of Clinical and
Experimental Neuropsychology, 32(7), 767–774.
Heaton, R. K., Chelune, G. J., Talley, J. L., Kay, G. G., & Curtiss, G. (1993). The
Wisconsin Card Sorting Test Manual Revised and Expanded. Odessa, FL:
Psychological Assessment Resources, Inc.
Henry, J. D., & Crawford, J. R. (2004a). A meta-analytic review of verbal fluency
performance following focal cortical lesions. Neuropsychology, 18(2), 284–295.
Henry, J. D., & Crawford, J. R. (2004b). A meta-analytic review of verbal fluency
performance in patients with traumatic brain injury. Neuropsychology, 18(4),
621–628.
Johansson, B., Berglund, P., & Ronnback, L. (2009). Mental fatigue and impaired
information processing after mild and moderate traumatic brain injury. Brain
Injury, 23, 1027–1040.
Kang, Y. J., Ku, J., Han, K., Kim, S. I., Yu, T. W., Lee, J. H., & Park, C. I. (2008).
Development and clinical trial of virtual reality-based cognitive assessment in
people with stroke: Preliminary Study. CyberPsychology & Behavior, 11(3), 329-
339.
Konrad, C., Geburek, A. J., Rist, F., Blumenroth, H., Fischer, B., Husstedt, I., Arolt, V.,
…, & Lohmann, H. (2011). Long-term cognitive and emotional consequences of
mild traumatic brain injury. Psychological Medicine, 41, 1197-1211.
Levin, H. S., Goldstein, F. C., Williams, D. H., & Eisenberg, H. M. (1991). The
contribution of frontal lobe lesions to the neurobehavioral outcome of closed head
injury. In H.S. Levin et al. (Eds.), Frontal lobe function and dysfunction. New
York: Oxford University Press.
Levin, H. S., Mattis, S., Ruff, R. M., Eisenberg, H. M., Marshall, L. F., Tabaddor, K., et
al. (1987). Neurobehavioral outcome following minor head injury: A three-center
study. Journal of Neurosurgery, 66(2), 234–243.
Lezak, M. D., Howieson, D. B., & Loring, D. W. (2004). Neuropsychological assessment
(4th ed.). Oxford: Oxford University Press.
Maddocks, D., & Saling, M. (1996). Neuropsychological deficits following concussion.
Brain Injury, 2, 99-103.
Manchester, D., Priestley, N., & Jackson, H. (2004). The assessment of executive
functions: Coming out of the office. Brain Injury, 18(11), 1067-1081.
Mathias, J. L., Bigler, E. D., Jones, N. R., Bowden, S. C., Barrett-Woodbridge, M., &
Brown, G. C. (2004). Neuropsychological and Informational processing
performance and its relationship to white matter changes following moderate and
severe traumatic brain injury: A preliminary study. Applied Neuropsychology, 11,
134–152.
Morey, L. C. (1991). The Personality Assessment Inventory. Odessa, FL: Psychological
Assessment Resources.
Nolin, P. (2006). Executive memory dysfunctions following mild traumatic brain injury.
Journal of Head Trauma Rehabilitation, 21(1), 68–75.
Norris, G., & Tate, R. L. (2000). The Behavioural Assessment of the Dysexecutive
Syndrome (BADS): Ecological, concurrent, and construct validity.
Neuropsychological Rehabilitation, 10(1), 33-45.
O’Bryant, S. E., Engel, L. R., Kleiner, J. S., Vasterling, J. J., & Black, F. W. (2007). Test
of Memory Malingering (TOMM) trial 1 as a screening measure for insufficient
effort. The Clinical Neuropsychologist, 21(3), 511-521.
O’Bryant, S. E., Gavett, B. E., McCaffrey, R. J., O’Jile, J. R., Huerkamp, J. K.,
Smitherman, T. A., & Humphreys, J. D. (2008). Clinical utility of trial 1 of the
Test of Memory Malingering (TOMM). Applied Neuropsychology: Adult, 15(2),
113-116.
Ord, J. S., Greve, K. W., Bianchini, K. J., & Aguerrevere, L. E. (2010). Executive
dysfunction in traumatic brain injury: The effects of injury severity and effort on
the Wisconsin Card Sorting Test. Journal of Clinical and Experimental
Neuropsychology, 32(2), 132–140.
Ponsford, J., Willmott, C., Rothwell, A., Cameron, P., Kelly, A., Nelms, R., …, Ng, K.
(2000). Factors influencing outcome following mild traumatic brain injury in
adults. Journal of the International Neuropsychological Society, 6, 568-579.
Psychological Assessment Resources, Inc. (2012). RFFT (Ruff Figural Fluency Test).
Retrieved from http://www4.parinc.com/Products/Product.aspx?ProductID=RFFT
Rabin, L. A., Barr, W. B., & Burton, L. A. (2005). Assessment practices of clinical
neuropsychologists in the United States and Canada: A survey of INS, NAN, and
APA division 40 members. Archives of Clinical Neuropsychology, 20, 33-65.
Rohling, M. L., Binder, L. M., Demakis, G. J., Larrabee, G. J., Ploetz, D. M., &
Langhinrichsen-Rohling, J. (2011). A meta-analysis of neuropsychological
outcome after mild traumatic brain injury: Re-analysis and reconsiderations of
Binder et al. (1997), Frenchman et al. (2005), and Pertab et al. (2009). The
Clinical Neuropsychologist, 25(4), 608-623.
Rothbaum, B., Hodges, L. F., Kooper, R., Opdyke, D., Williford, J. S., & North, M. (1995).
Effectiveness of computer-generated (virtual reality) graded exposure in the
treatment of acrophobia. American Journal of Psychiatry, 152, 626–628.
Ruff, R. M. (1996). Ruff Figural Fluency Test: Professional manual. Lutz: Psychological
Assessment Resources, Inc.
Ruff, R. M., Light, R. H., & Evans, R. W. (1987). The Ruff Figural Fluency Test: A
normative study with adults. Developmental Neuropsychology, 3(1), 37-51.
Ruffolo, C. F., Friedland, J. F., Dawson, D. R., Colantonio, A., & Lindsay, P. H. (1999).
Mild traumatic brain injury from motor vehicle accidents: Factors associated with
return to work. Archives of Physical Medicine and Rehabilitation, 80, 392-398.
Schultheis, M. T., Himelstein, J., & Rizzo, A. A. (2002). Virtual reality and
neuropsychology: Upgrading the current tools. Journal of Head Trauma
Rehabilitation, 17(5), 378-394.
Seymour, N. E., Gallagher, A. G., Roman, S. A., O’Brien, M. K., Bansal, V. K., Andersen,
D. K., & Satava, R. M. (2002). Virtual reality training improves operating room
performance: Results of a randomized, double-blinded study. Annals of Surgery,
236, 458–463.
Shallice, T. (1982). Specific impairments of planning. Philosophical Transactions of the
Royal Society of London: Biological Sciences, 298, 199-209.
Slick, D. J., Tan, J. E., Strauss, E. H., & Hultsch, D. F. (2004). Detecting malingering: a
survey of experts’ practices. Archives of Clinical Neuropsychology, 19, 465-473.
Teasdale, G., & Jennett, B. (1974). Assessment of coma and impaired consciousness: A
practical scale. The Lancet, ii, 81-84.
Tombaugh, T. N. (1996). Test of Memory Malingering (TOMM). North Tonawanda, NY:
Multi Health Systems.
Trepagnier, C. G. (1999). Virtual environments for the investigation and rehabilitation of
cognitive and perceptual impairments. Neurorehabilitation, 12, 63–72.
Unterrainer, J. M., Rahm, B., Kaller, C. P., Leonhart, R., Quiske, K., Hoppe-Seyler, K.,
. . . Halsband, U. (2004). Planning abilities and the Tower of London: Is this task
measuring a discrete cognitive function? Journal of Clinical and Experimental
Neuropsychology, 26(6), 846-856.
Vayalakkara, J., Devaraju-Backhaus, S., Bradley, J. D. D., Simco, E. D., & Golden, C. J.
(2000). Abbreviated form of the Wisconsin Card Sort Test. International Journal
of Neuroscience, 103, 131-137.
Voller, B., Benke, T., Benedetto, K., Schnider, P., Auff, E., & Aichner, F. (1999).
Neuropsychological, MRI and EEG findings after very mild traumatic brain
injury. Brain Injury, 13(10), 821-827.
Wilson, B. A. (1993). Ecological validity of neuropsychological assessment: Do
neuropsychological indexes predict performance in everyday activities? Applied
& Preventive Psychology, 2, 209-215.
Wilson, P. N., Foreman, N., & Stanton, D. (1997). Virtual reality, disability and
rehabilitation. Disability and Rehabilitation, 19, 213–220.
Zakzanis, K. K. (2001). Statistics to tell the truth, the whole truth, and nothing but the
truth: Formulae, illustrative numerical examples, and heuristic interpretation of
effect size analyses for neuropsychological researchers. Archives of Clinical
Neuropsychology, 16, 654-667.
Ziino, C., & Ponsford, J. (2006). Vigilance and fatigue following traumatic brain injury.
Journal of the International Neuropsychological Society, 12, 100–110.
Appendix A
Table A1. Demographic Information
Characteristic      n        Mean    Standard Deviation    Minimum    Maximum
Age
  Control           30       20.1    5.67                  18         49
  TBI               5        40.0    26.5                  20         83
Education
  Control           30       12.9    1.41                  11         16
  TBI               5        15.6    4.34                  12         22
Gender
  Control M/F       15/21    N/A     N/A                   N/A        N/A
  TBI M/F           5/0      N/A     N/A                   N/A        N/A
Note. M, Male; F, Female
Table A2. Comparison of Mean Performance on Tests
                                           No TBI               TBI
Test                                       n     M(SD)          n     M(SD)          p
WCST
  Number of Errors Z-score *               29    0.37(0.98)     4     0.28(0.43)     N.S.
  Categories Completed ‡                   29    3.69(1.27)     4     4.25(2.22)     N.S.
  Trials to First Category ‡               29    13.52(6.28)    4     14.00(4.08)    N.S.
  Failure to Maintain Set ‡                29    0.34(0.77)     4     0.50(0.58)     N.S.
RFFT
  Unique Designs Z-score ‡                 30    -1.23(1.20)    4     -1.87(0.43)    N.S.
  Error Ratio Z-score ‡                    30    -0.26(0.91)    4     -0.48(1.04)    N.S.
TOL
  Total Move Z-score ‡                     30    -0.12(1.25)    4     -0.73(0.78)    N.S.
  Total Correct Z-score ‡                  30    -0.02(1.27)    4     -0.58(0.73)    N.S.
  Total Initiation Time Z-score ‡          30    0.50(0.83)     4     -0.03(0.52)    N.S.
  Total Execution Time Z-score ‡           30    -0.02(0.69)    4     -1.10(0.83)    <0.05
  Total Time Z-score ‡                     30    -0.25(0.68)    4     -0.78(0.50)    N.S.
  Number of Time Violations Z-score ‡      30    0.00(0.74)     4     -0.63(0.19)    N.S.
  Type 2 Violations Z-score ‡              30    0.13(0.43)     4     0.25(0.50)     N.S.
VR Office Task
  Total Incorrect ‡                        30    0.73(0.94)     3     3.33(2.52)     <0.05
  Failure to Maintain Set ‡                30    0.50(0.68)     3     2.33(1.16)     <0.01
  Perseverations ‡                         30    0.00(0.00)     3     0.33(0.58)     <0.01
Note. N.S., not significant. * two-tailed t-test; ‡ two-tailed Mann-Whitney U test
Table A3. Effect Sizes and Percent Overlap With Regards to Test Performance.
Test                                     Minimum    Maximum    d        Overlap %
WCST
  Number of Errors Z-score               -2.10      2.30       0.10     92.3
  Categories Completed                   1          6          0.41     72.0
  Trials to First Category               10         37         -0.08    93.8
  Failure to Maintain Set                0          3          -0.21    84.6
RFFT
  Unique Designs Z-score                 -3.38      0.78       0.56     63.7
  Error Ratio Z-score                    -1.76      1.36       0.24     82.7
TOL
  Total Move Z-score                     -2.40      1.73       0.50     66.6
  Total Correct Z-score                  -1.87      2.13       0.45     69.6
  Total Initiation Time Z-score          -0.40      3.20       0.65     59.4
  Total Execution Time Z-score           -2.00      1.33       1.52     28.8
  Total Time Z-score                     -2.00      0.80       0.79     53.0
  Number of Time Violations Z-score      -2.13      0.53       0.89     48.8
  Type 2 Violations Z-score              0          2          -0.27    80.7
VR Office Task
  Incorrect Deliveries                   0          6          -2.33    13.9
  Failure to Maintain Set                0          3          -2.54    11.4
  Perseverations                         0          1          -2.25    15.0
Note. d, Cohen’s d effect size
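
As an illustrative sketch only (it is not the analysis script used in the study), the d and Overlap % columns of Table A3 can be approximated from the group sizes, means, and standard deviations reported in Table A2. The sketch rests on two assumptions that the tables do not state: that d is the mean difference divided by the sample-size-weighted pooled standard deviation, and that percent overlap is 100 × (1 − U1), where U1 is Cohen's non-overlap statistic for two normal distributions with equal variance. The function names are hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d for two groups (assumed formula: weighted pooled SD)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def percent_overlap(d):
    """Percent overlap of two equal-variance normal distributions.

    Assumed definition: 100 * (1 - U1), with Cohen's
    U1 = (2 * Phi(|d| / 2) - 1) / Phi(|d| / 2).
    """
    # Standard normal CDF evaluated at |d| / 2, built from math.erf.
    phi = 0.5 * (1.0 + math.erf((abs(d) / 2.0) / math.sqrt(2.0)))
    u1 = (2.0 * phi - 1.0) / phi
    return 100.0 * (1.0 - u1)


# VR Office Task, Total Incorrect (Table A2): control 0.73(0.94), n = 30;
# TBI 3.33(2.52), n = 3.
d_vr = cohens_d(0.73, 0.94, 30, 3.33, 2.52, 3)
overlap_vr = percent_overlap(d_vr)
print(round(d_vr, 2), round(overlap_vr, 1))
```

Under these assumptions the VR Office Task example gives d of about -2.34, close to the tabled -2.33, with the small difference attributable to rounding of the reported means and standard deviations. The sign of d depends on which group mean is subtracted from the other, so only the magnitude should be compared with Table A3.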