Predictive Factors for Outcome in Patients having …...Table 5.1: Characteristics of Patients with Cervical Spondylotic Myelopathy Table 5.2: Performance of the mJOA in CSM sample
Predictive Factors for Outcome in Patients having Surgery
FOR CERVICAL SPONDYLOTIC MYELOPATHY.
By
Alina Karpova, BSc
A thesis submitted in conformity with the requirements for the degree of Master of Science
Department of the Institute of Medical Sciences University of Toronto
Predictive factors for outcome in patients having surgery for cervical spondylotic myelopathy.
Master’s of Science, 2011
Alina Karpova Institute of Medical Sciences
University of Toronto
ABSTRACT
PURPOSE: The objective was to determine if particular magnetic resonance,
clinical and demographic findings were associated with functional status prior to surgery
and predictive of functional outcomes at follow-up.
RESULTS: The study included 65 consecutive CSM patients. The modified
Japanese Orthopaedic Association Scale (mJOA) was used as the primary outcome
measure. Higher baseline mJOA scores were associated with younger age, shorter
duration of symptoms, fewer compressed segments and less severe cord compression.
Better post-operative mJOA scores were associated with younger age, shorter duration of
symptoms and higher baseline scores. Using multivariate analysis, baseline and follow-up
mJOA scores adjusted for baseline mJOA score were best predicted by age.
CONCLUSION: Age and clinical severity scores at admission can both provide
valuable information. However, MR imaging features of the spinal cord before surgery
cannot accurately predict the functional prognosis for patients with CSM and hence
alternative imaging approaches may be required.
ACKNOWLEDGMENTS
I would like to acknowledge and thank my mentor, Dr. Michael Fehlings
(Supervisor), and my Program Advisory Committee members, Dr. Aileen Davis, Dr.
Abhaya Kulkarni, and Dr. David Mikulis. I am deeply grateful for the academic
enrichment they were able to provide, as well as their guidance and unwavering support
throughout this project.
I am thankful to have received funding from the Ontario Neurotrauma
Foundation.
I would also like to thank my friends and colleagues who have helped me shape
this project: Dr. David Cadotte, who screened the articles as a second reviewer for
the systematic review and gave me an opportunity to write a review paper on CSM in the
elderly population; Dr. Yuriy Petrenko for technical support; Amy Lem; Dr. E Massicotte,
Dr. SJ Lewis and Dr. YR Rampersaud, neurosurgeons at the Toronto Western Hospital,
Ontario, for allowing us to study their patients; Dr. Zvonimir Lubina for the imaging data
he provided for this study; and Branko Kopjar for his statistical advice.
I would like to dedicate the thesis to my ‘family’, Alexandra, Olga, Nataly, Ira
and Yakov. I am extremely grateful to Roman for his patience, love and unwavering
support over the years. It is because of them that I have had the strength to see the project
through to the end.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGMENTS
CHAPTER 1: Introduction
1.1. Problems of predicting outcomes in CSM patients after surgery
1.2. Importance of investigating predictors of outcomes after surgery
1.3. Magnetic Resonance Imaging (MRI) in CSM population
CHAPTER 2: Background and literature review
Functional outcomes after surgery and their important predictors: Current State of knowledge
2.1. Cervical spondylotic myelopathy (CSM): Definition and clinical presentation
2.2. Epidemiology of CSM
2.3. CSM treatment
2.4. Functional outcome assessments
2.5. Predictors of functional outcomes following surgery
2.5.1 Age
2.5.2 Gender
2.5.3 Duration of symptoms
2.5.4 Baseline severity score
2.5.5. MR imaging findings
2.6. Theoretical framework and definition of the concept
CHAPTER 3: Systematic review
Currently available MR imaging based measurements for the explanation of variations among CSM patients: review and critical appraisal
3.1. LITERATURE SEARCH
3.1.1 Objective
3.1.2 Inclusion criteria
3.1.3 Identification of studies and assessment of methodological quality
3.1.4 Data extraction
3.1.4.1 Severity of myelopathy: functional score and recovery percentage
3.1.4.2 MRI predictive factors
3.2. RESULTS
3.2.1 Compression of spinal canal and cord
3.2.2 T2 signal changes on MRIs of the spinal cord
3.3. OVERALL SUMMARY OF THE SYSTEMATIC LITERATURE REVIEW
3.4. RATIONALE FOR STUDYING CLINICAL AND IMAGING PREDICTORS OF OUTCOME IN CSM
3.5. HYPOTHESIS AND STUDY OBJECTIVES
CHAPTER 4: Material and methods
4. 1. STUDY OBJECTIVES
4. 2. STUDY DESIGN
4. 3. TARGET POPULATION
4. 4. DEFINITION OF PRIMARY OUTCOME
4. 5. PRIMARY EXPOSURE (INDEPENDENT VARIABLES)
4.5.1 Strategies to improve accuracy and easy use of exposure variables
4.5.2 Definition of primary exposure and clinimetric properties (validity) of the independent variables
4.5.2. 1. Age
4.5.2. 2. Gender
4.5.2. 3. Baseline mJOA
4.5.2. 4. Duration of symptoms
4.5.2. e. Degree of compression (Anteroposterior diameter and Transverse Area)
4.5.2. f. Signal intensity changes
4.5.2. g. Number of affected stenotic levels
MR imaging variables:
Transverse area: continuous (mm²)
Anteroposterior diameter: continuous (mm)
Signal intensity changes: categorical (0/1/2); Type 0 = normal T1WI/normal T2WI, Type 1 = normal T1WI/high T2WI, Type 2 = low T1WI/high T2WI
Number of affected stenotic levels: categorical (0/1/2); 0 = 1 compressed segment, 1 = 2 compressed segments, 2 = ≥ 3 compressed segments
4. 6. CONFOUNDING VARIABLES
It has been shown that some baseline characteristics such as pre-existing or
concomitant medical conditions (hypertension, diabetes mellitus, coronary insufficiency,
cardiomyopathy, pulmonary problems, previous cerebral infarction and gastrointestinal
ulcers) may slow the functional recovery in patients with CSM [14]. Given the
established inhibitory effects of smoking on spine fusion [62, 63], smoking may slow the
functional recovery. In addition, functional deterioration in the postoperative period may
also result from aggravation of diabetes mellitus [14]. Since this type of information was
collected at baseline examination, it was statistically tested for its significance. The
type of surgical intervention (anterior cervical decompression and fusion,
laminoplasty, and laminoplasty and laminectomy and fusion) was not included in the
predictive model because of the limited size of the sample drawn from a single centre.
4. 7. SAMPLE SIZE
General guidelines exist for the minimum number of subjects per
variable required in multivariable analysis. It is generally suggested that a minimum of
ten subjects per variable analyzed (for a continuous outcome) is required to prevent
over-fitting [64]. Given the 61 patients available for analysis, we included no
more than 6 of the 10 candidate preoperative variables in the theoretical framework. This
number ensures an adequate sample size for future predictive models.
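The rule of thumb above reduces to a single integer division. The following is a minimal Python sketch of that arithmetic; the function name `max_predictors` is hypothetical and is not part of the original analysis, which was run in SAS.

```python
def max_predictors(n_subjects: int, subjects_per_variable: int = 10) -> int:
    """Largest number of predictors allowed by a subjects-per-variable rule of thumb."""
    if subjects_per_variable <= 0:
        raise ValueError("subjects_per_variable must be positive")
    return n_subjects // subjects_per_variable


# 61 patients and at least ten subjects per variable allow at most 6 predictors.
print(max_predictors(61))
```

With 61 patients the rule allows 6 predictors, which matches the limit stated in the text.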
4.8. DATA ANALYSIS
4.8.1. Exploratory analysis
All data analyses were performed by using SAS, version 9.2 Software. Data
analysis followed standard procedures for a prediction study. Summary descriptive
statistics were computed on all variables. Categorical variables were summarized as
frequencies and percentages, and continuous variables as means and standard deviations.
Categorical variables were compared using the chi-square test for independent
proportions, and Student's t-test was used to compare continuous variables.
Exploratory correlation coefficient analyses were performed to identify
associations between the ten individual independent variables and final mJOA scores and
associations or multicollinearity between variables. More specifically, Spearman
correlation analysis was used when both variables were continuous, t tests were used
when one variable was continuous and the other dichotomous, and continuity-adjusted
chi squares were calculated when both variables were categorical. The Mann Whitney U
test was used for analysis of the association between dichotomous variables and final
mJOA scores, because these scores did not follow a Gaussian distribution. A correlation
coefficient of r > 0.90 was used as the criterion for excessive correlation between
variables. For the chi-square tests, a p-value below 0.05 was considered statistically significant.
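As an illustration of the collinearity screen described above, the sketch below computes a Spearman rank correlation with only the Python standard library and flags a pair of variables when the coefficient exceeds 0.90. The function names and the sample values are hypothetical, and applying the threshold to the absolute value of the coefficient is an assumption; the original analysis was performed in SAS.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def excessively_correlated(x, y, threshold=0.90):
    """Assumption: the r > 0.90 criterion is applied to the absolute value."""
    return abs(spearman(x, y)) > threshold


# Made-up measurements for illustration only (not study data).
transverse_area = [40.0, 55.0, 62.0, 48.0, 70.0, 35.0]
ap_diameter = [4.1, 5.6, 6.0, 5.0, 7.2, 3.8]
print(excessively_correlated(transverse_area, ap_diameter))
```

Two variables flagged in this way would not be entered into the same model together.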
Histograms were plotted to assess the normality of the distributions of the primary
outcome measure and the other variables. A logarithmic transformation was applied
when the distribution of follow-up mJOA scores was negatively skewed
[65].
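The text does not specify the form of the logarithmic transformation. A common choice for negatively skewed scores is to reflect them about their maximum before taking the logarithm, and the sketch below assumes that form. The function names and the example scores are hypothetical and are not the study data.

```python
import math
from statistics import mean


def sample_skewness(values):
    """Moment-based skewness m3 / m2**1.5 (negative means a longer left tail)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def reflected_log(values):
    """Assumed transformation: reflect about (max + 1), then take the natural log.

    The reflection reverses the ordering, so high original scores become
    low transformed values.
    """
    top = max(values) + 1
    return [math.log(top - v) for v in values]


# Made-up follow-up scores clustered near the top of the 0-18 mJOA range.
scores = [18, 18, 17, 17, 17, 16, 16, 15, 13, 10]
print(sample_skewness(scores))
print(sample_skewness(reflected_log(scores)))
```

Because the reflection reverses the direction of the scale, regression coefficients estimated on the transformed scores change sign relative to the original scale.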
4.8.1.1. Univariable (unadjusted) analysis
Univariable data analyses, yielding unadjusted regression coefficients (beta
estimates) and p-values, were carried out for all variables under evaluation.
Initially, continuous variables (age, duration of symptoms, baseline mJOA scores,
transverse area and anteroposterior diameter of spinal cord) were analyzed individually
for a linear relationship with post-operative functional scores. Then, age and duration of
symptoms variables were dichotomized for convenience in clinical practice and ease of
interpretation of findings. In addition, three MR imaging variables (three patterns of
spinal cord signal intensity changes on T1- and T2-weighted sequences, transverse area
of the spinal cord and number of compressed segments), were analyzed. Table 5.5
summarizes the statistical details of the unadjusted analysis. All candidate variables were
examined using linear regression.
4. 8. 2. Model development
As the outcome of interest is continuous (functional score calculated using mJOA
score from 0-18), multivariable linear regression modeling techniques were used to
determine the relationship between each independent variable and the functional
outcomes.
Unadjusted (univariable) data analyses were carried out initially to estimate the
effect of each potential predictive variable individually, followed by the adjusted
(multivariable) analysis.
Efforts were made to maximize predictive performance using all-variables
regression for model building (no selection methods, e.g. stepwise selection, were
applied). A variable remained in the final model if it met the following three
criteria: 1) a significance level of p of 0.1 or less; 2) the R² statistic for the model
increased by at least 10%; and 3) the beta coefficient did not change by more than 10%
with the addition of other variables into the model [66, 67]. Baseline scores were
included in the model to adjust for the effect of baseline differences on final scores [68].
This analysis was conducted using the PROC GLM procedure in SAS, version 9.2.
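The model-building logic above can be sketched in Python as follows. This is not the PROC GLM analysis itself: the least-squares solver, the function names and the data are illustrative assumptions. Reading criterion 2 as an absolute gain in R² of 0.10 is also an assumption, since the text does not say whether the 10% is absolute or relative. The p-value is supplied by the caller because the standard library has no t-distribution.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, ...] for predictors given as columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def r_squared(columns, y, beta):
    """Coefficient of determination of the fitted model."""
    n = len(y)
    fitted = [beta[0] + sum(b * col[i] for b, col in zip(beta[1:], columns))
              for i in range(n)]
    ybar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def relative_change(before, after):
    """Relative change in a coefficient after other variables are added."""
    return abs(after - before) / abs(before)


def meets_retention_criteria(p_value, r2_without, r2_with, beta_before, beta_after):
    """Apply the three retention criteria; the R² gain is read as absolute (assumption)."""
    significant = p_value <= 0.10
    r2_gain = (r2_with - r2_without) >= 0.10
    stable = relative_change(beta_before, beta_after) <= 0.10
    return significant and r2_gain and stable


# Made-up illustration data (not study data): older = 1 if aged 65 or over.
older = [0, 0, 0, 0, 1, 1, 1, 1]
baseline = [14, 12, 15, 11, 13, 10, 12, 9]
final = [17, 15, 18, 15, 15, 12, 14, 11]

beta_age_only = ols([older], final)
beta_adjusted = ols([older, baseline], final)
print("age coefficient, unadjusted:", round(beta_age_only[1], 2))
print("age coefficient, adjusted:", round(beta_adjusted[1], 2))
print("R2, adjusted model:", round(r_squared([older, baseline], final, beta_adjusted), 2))
```

Comparing the unadjusted and adjusted age coefficients shows how much of the apparent age effect is carried by the baseline score, which is what criterion 3 is meant to detect.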
4.8.3 Data sources and management
Source of clinical data: Source data included all information in original records,
observations, or other activities necessary for the reconstruction of missing data and
verification of outliers. More specifically, it included surgery, imaging and laboratory
reports, medical history information and demographics. The study database was a secured
electronic database system known as OPVerdi.
Several strategies were implemented for the reconstruction of missing data and
verification of outliers. For continuous data, we plotted each variable and checked for
any outliers beyond 3 standard deviations. The same approach was applied to
categorical data by plotting a boxplot. The spotted outliers were checked against data
collection forms and were corrected. For age, gender, and duration of symptoms
variables, the data was 100% complete. After calculating the frequency of missing
values, the following was found: 3 (5%) for transverse area of spinal cord measurements,
0 (0%) for anteroposterior diameter measurements, 2 (3%) for signal intensity changes,
and 4 (7%) for number of compressed segments. The mJOA scores at 12 months for 4 out
of 65 patients (6%) were found to be missing due to loss of follow up. The subjects were
removed and as a result, 61 subjects were analyzed in statistical modelling.
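The two data-quality checks described above, counting missing values and flagging observations more than 3 standard deviations from the mean, can be sketched with the Python standard library. The function names and the coding of missing values as `None` are assumptions made for illustration. The reported percentages for the imaging variables (5%, 3%, 7%) are consistent with a denominator of 61 and the follow-up figure (6%) with a denominator of 65, but that reading is an inference from the rounding rather than something the text states.

```python
from statistics import mean, stdev


def outliers_beyond(values, n_sd=3.0):
    """Return observed values lying more than n_sd sample SDs from the mean."""
    observed = [v for v in values if v is not None]
    m, s = mean(observed), stdev(observed)
    return [v for v in observed if abs(v - m) > n_sd * s]


def missing_summary(values):
    """Return (count, rounded percentage) of entries coded as None."""
    n_missing = sum(v is None for v in values)
    return n_missing, round(100 * n_missing / len(values))


# Made-up column: 3 missing transverse-area values among 61 records.
transverse_area_column = [None] * 3 + [50.0] * 58
print(missing_summary(transverse_area_column))
```

Flagged values would then be checked against the data collection forms, as the text describes, rather than being dropped automatically.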
MR imaging data: Issa [proprietary name] was used as an integrated system for
archiving patient data and examination data including images.
4.8.4 Ethics
The research protocol was approved by the University Health Network Research
Ethics Board.
CHAPTER 5
RESULTS
Chapter 5 presents the findings for the two research questions related to Specific Aims I and II.
OVERALL STUDY OBJECTIVE: To develop a predictive model of functional score incorporating key demographic, clinical and MR imaging assessments in patients with cervical spondylotic myelopathy undergoing surgical treatment.
Hypothesis: Key demographic parameters, clinical factors and MR imaging features of the site of cervical cord compression are independently associated with baseline scores and predictive of functional outcome scores at 12 months follow-up in patients with CSM undergoing surgical treatment.
Each specific aim contributes to the overall objective:
Specific Aim I: Reliability assessment of MR imaging to assess cord compression in CSM
Objective: To investigate the inter-rater reliability of two published methods (transverse area and anteroposterior diameter) of examining cord stenosis on axial MR images.
Question: Are the ICC values of the TA and AP diameter methods free of systematic errors (bias)?
Findings: Two-way analysis of variance indicated that the inter-rater agreement ICCs for transverse area (TA) of the spinal cord were 0.68, 0.69, 0.73 and 0.76, and those for anteroposterior diameter (AP) were 0.86, 0.72, 0.68 and 0.52, on the 1st to 4th sessions, respectively. These coefficients were calculated using Shrout-Fleiss models for random effects (Model 2). Of note, the TA and AP methods showed wider variability in cases of severe cord compression (presence of systematic error), and the variability of image interpretation depended on raters' individual differences. The TA and AP measurement techniques demonstrated moderate to good inter-rater reliability, with more consistent agreement noted in the assessment of transverse area of the spinal cord. This is the first study to examine the interobserver reliability of quantifiable methods to assess spinal cord stenosis in the setting of CSM. Based on our data, we recommend that the TA method be used to assess the extent of compression on axial T2 images.
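For readers unfamiliar with the Shrout-Fleiss Model 2 coefficient mentioned in the findings, the sketch below computes the single-rater form, ICC(2,1), from the mean squares of a two-way layout of subjects by raters. The function name and the example measurements are hypothetical. The thesis does not state whether the single-rater or the averaged form was reported, so the choice of ICC(2,1) is an assumption.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows (one per subject), each holding k rater scores.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                   # residual
    bms = ss_rows / (n - 1)
    jms = ss_cols / (k - 1)
    ems = ss_err / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Made-up transverse-area readings (mm²): 5 patients, each measured by 4 raters.
ta_readings = [
    [42.0, 45.0, 40.0, 44.0],
    [61.0, 58.0, 63.0, 60.0],
    [35.0, 39.0, 33.0, 37.0],
    [70.0, 66.0, 71.0, 69.0],
    [52.0, 55.0, 50.0, 49.0],
]
print(round(icc_2_1(ta_readings), 2))
```

Because the between-rater mean square appears in the denominator, a consistent offset between raters (systematic error) lowers this coefficient, which is why it suits the question of rater bias posed above.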
Specific Aim II: Development of a predictive model of outcome in patients with CSM undergoing surgical treatment
Objective: To address the limitations of the current literature by prospectively evaluating whether demographic, clinical and radiological factors in patients with CSM are predictive of functional outcomes before and after surgery.
Question: After controlling for age, gender and duration of symptoms, are MRI findings independently associated with functional status at baseline and predictive of functional outcomes at 12 months follow-up?
Findings: Higher baseline mJOA scores were associated with younger age (p=0.0002), shorter duration of symptoms (p=0.03), fewer compressed segments (p=0.04) and less severe cord compression (p=0.02). Moreover, better post-operative mJOA scores were associated with younger age (p<0.0001), shorter duration of symptoms (p=0.09) and higher baseline mJOA score (p<0.0001). Using multivariate analysis, baseline and follow-up mJOA scores were best predicted by age. These data suggest that, first, it is important to diagnose and treat CSM at an early stage and that age is a key predictor of functional improvement on the mJOA scale; and second, ischemic changes, degree of spinal cord deformity and multiplicity of stenosis could not predict post-operative functional status as measured by the mJOA scale, after controlling for age and baseline mJOA score.
5. 1. DESCRIPTIVE STATISTICS
The final dataset included information on 61 CSM patients who underwent spine
surgery at Toronto Western Hospital between February 2006 and November 2007. Of the
original sample, 6% were lost to follow-up and were excluded from model
development. All 61 patients had complete data. The general characteristics of the patients with
cervical spondylotic myelopathy are shown in Table 5.1.
5. 2. MODEL DEVELOPMENT
5. 2. 1. Improving the validity of the predictive model
Two of the potential predictor variables, transverse area
[TA] and anteroposterior diameter [AP] of the spinal cord, provide similar information
about the degree of spinal cord compression. Efforts were therefore made to establish the
reliability (inter-rater and test-retest) of each variable (please refer to
Appendix 3).
Based on three-way ANOVA, the observed differences between AP
measurements consist of true score variance, random error (imprecision) and systematic
error (bias) caused by raters' specialty training and their interpretations of MRI based on
stage of CSM severity (Table F.5. - F.8).
In addition to the sources of systematic error mentioned above, the TA method
had time (learning or fatigue) as a source of variability. The time effect was
statistically significant in the TA method of spinal stenosis assessment, based on
three-way ANOVA with Bonferroni post-hoc analysis [TA, p= 0.01]; specifically, the
agreement among four raters consistently increased from Session 1 to Session 4 (Table
F. 4). The graphical representation shows the time differences as normal fluctuations
(i.e. random error) (Figure F.3).
The TA and AP measurement techniques demonstrated a moderate level of inter-rater
reliability (TA: 0.68, 0.69, 0.73, 0.76; AP: 0.86, 0.72, 0.68, 0.52), with more consistent
agreement noted in the assessment of transverse area of spinal cord. As a result,
transverse area was chosen over anteroposterior diameter of spinal cord method. The
variable was selected based on clinical, practical, statistical and reliability criteria
described in Table 4.7.
Transverse area and anteroposterior diameter of spinal cord are statistically collinear,
and choosing TA for a predictive model avoids this collinearity. Collinearity is a
statistical phenomenon in which two predictor variables in a multiple regression model
are highly correlated. As a result, the coefficient estimates of individual predictor
variables may change erratically in response to small changes in the model or the data.
AP of spinal cord has the disadvantage of being less applicable in cases of compression
sites off midline of spinal cord. TA adjusts for asymmetrical compression of spinal cord;
thus it is a less biased measure.
5. 2. 2. Univariable (unadjusted) analysis
5. 2. 2. 1. mJOA Scores at baseline
Higher baseline mJOA scores were associated with younger age (p=0.0002, β(r) =
-2.83), shorter duration of symptoms (p=0.03, β(r) = -1.55), less compression of the
transverse area of the spinal cord (p=0.02, β(r) = 0.06) and fewer compressed
segments (p=0.04).
Analysis of all variables revealed that the MR imaging features (three patterns of
spinal cord signal intensity changes on T1- and T2-weighted sequences and number of
compressed segments), and gender variables were not significantly associated with the
functional score at follow-up (p-value > 0.2). Therefore, these non-significant variables
were excluded (Table 5.4).
5. 2. 3. Multivariate (adjusted) analysis
5. 2. 3. 1. mJOA Scores at baseline
The final statistical model includes age (Table 5.5), which explains 20% of the
total variability of the baseline mJOA scores. The average baseline score of CSM
patients older than 65 years of age was 13.5. The baseline mJOA scores in younger patients
were on average 2.83 points higher.
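The baseline model can be read as a two-group mean: 13.5 for the older group, plus 2.83 for the younger group. The Python sketch below only restates that arithmetic. The function name is hypothetical, and assigning a patient aged exactly 65 to the older group is an assumption, because the text does not say on which side of the cut-off that age falls.

```python
def expected_baseline_mjoa(age_years, older_mean=13.5, younger_offset=2.83, cutoff=65):
    """Expected baseline mJOA from the age-only model described in the text.

    Assumption: age >= cutoff is treated as the older group.
    """
    return older_mean + (younger_offset if age_years < cutoff else 0.0)


print(round(expected_baseline_mjoa(50), 2))  # younger group: 13.5 + 2.83
print(expected_baseline_mjoa(70))            # older group
```

Since age explains only 20% of the variability in baseline scores, these values are group averages and not precise predictions for an individual patient.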
5. 2. 3. 2. mJOA Scores at follow-up
The final model includes the baseline mJOA score and age (Table 5.5), and
explains 36% of the total variability of the final mJOA scores. This model indicates that,
for example, if baseline scores were identical, a patient less than 65 years of age scores on
average 1.04 points higher than an older patient. Moreover, if age was identical, a patient
with moderate myelopathy may benefit from surgical treatment more than a
patient with severe myelopathy, who scores approximately 1.01 points lower on average.
CHAPTER 6
DISCUSSION AND CONCLUSION
6.1. Summary of findings
The studies described herein have led to several major conclusions: 1) Age and
baseline severity score are good predictors of functional score after surgery. 2) Duration
of symptoms is not a good predictor of functional scores after surgery. 3) Measurements
of the transverse area and anteroposterior diameter of the spinal cord have shown good to
moderate inter-rater reliability. 4) No definite conclusions can yet be drawn on whether
transverse area of spinal cord, combined patterns of signal intensity changes on T1/T2WI,
and the number of compressed levels are predictors of functional score.
Age & Baseline severity score
Based on in-depth examination of the impact of predictors on outcome using beta
coefficient values and reliability assessments, our study confirms that age and baseline
severity score are two preoperative variables that can predict functional outcomes after
surgery (post-operative mean mJOA score). The most prominent patient information was
the age at the time of admission, which was shown to be associated with baseline
functional score and predictive of follow up functional score in the setting of CSM.
Based on the magnitude of the beta estimate, these data suggest that there might be
more opportunity for greater improvement when performing surgery on a younger
population. However, more research is needed to confirm these findings. In contrast,
Yamazaki et al showed no differences based on age in post-operative functional scores
after surgery in a retrospective study [15]. However, these results must be cautiously
interpreted because the study did not control for baseline severity score, the number of
patients in each subgroup was small, and patient characteristics were too poorly described
to understand the differences between two samples. Finally, baseline CSM severity score
was a strong independent predictor of functional score following surgery. Patients with
less severe functional disability may benefit from surgical treatment more than those with
a more severe disability. The greater benefit from surgery in patients with less functional
disability could be due to milder neuropathologic alterations in the spinal cord that reflect
greater recuperative potential [19]. These findings suggest the possibility that patients
may experience poorer outcome if surgery is delayed until the patient is more severely
affected. In contrast, Singh et al. reported that patients with a lower starting point in function
make the most gains after surgery [24]. We suspect that higher functional scores in the
more severe CSM group in this study could be due to other differences in patient
characteristics (age and duration of symptoms), which were not comparable at admission.
Duration of symptoms
In our study, duration of symptoms was mildly associated with functional scores
at admission and at 12 months follow-up. However, after adjustment for age and baseline
severity score, the association between duration of symptoms and functional score at
admission and follow-up was not significant. Whether CSM
patients with indications for surgery should be offered operative interventions
irrespective of duration of symptoms is still unclear. Our findings are inconsistent with
some other studies in the literature that support the notion of long-standing mechanical
compression causing additional circulatory impairment of the spinal cord [15, 19, 21, 46].
We suspect that these differences may be due to the interpretation of the onset of CSM.
Heterogeneity of samples, non-consecutive methods of recruitment and insufficient
descriptions of patients associated with retrospective design in previous studies could
also have contributed to the observed differences in functional scores. Although
Mastronardi et al prospectively analyzed CSM patients, these results must be cautiously
interpreted because baseline severity score and age were not similar between groups and
the number of patients in each subgroup was small [21].
MR imaging features
Based on the findings of our systematic review, transverse area of spinal cord,
combined patterns of signal intensity changes on T1/T2WI, and number of compressed
segments were found to be associated with functional scores at long term follow up
before and after adjustment for age, duration of symptoms, and baseline severity score.
The data obtained for this thesis did not support the findings of previous studies. We can
speculate that several factors may have contributed to these results. Firstly, the
inconsistencies in findings could be due to heterogeneity of the patients in this sample
population. The findings vary with etiology: ossification of the posterior
longitudinal ligament (OPLL) versus cervical spondylotic myelopathy (CSM) versus herniated disc (HD)
[27, 31]. Differences could also be due to inter-institutional variations in MRI protocols.
For example, previous studies used T1-weighted axial imaging to measure spinal cord
deformity. At our institution (Toronto Western Hospital), MRI protocols for the cervical
spine include axial T2 slices. Differences among clinicians are another source of
variation. Based on our observations from reliability testing, the measurement of
transverse area of spinal cord is subjective; clinicians had different approaches to
interpret the exact location and boundaries of the most compressed site of the spinal cord,
especially in multisegmental CSM. Based on the findings of the intra- and inter-rater
reliability project (Appendix 3), the interpretations of MR images varied depending on
the specialty and years of practice. In our study, we found that the percentage of
agreement was 68% to 76% and overall correlation was moderate to good. We
recommend that this measurement technique be applied in a larger sample.
In our study, we established that the variations in functional outcomes defined by
mJOA score after surgery cannot be further explained using MR modality, in addition to
age and baseline mJOA score. Our findings suggest that assessments of T1-/T2 signal
intensity changes, degree of spinal cord compression and number of levels involved in
compression have no statistically significant effect on post-operative functional status as
measured using the mJOA scale, and provide no additional clinically important
information in predicting function after surgery. We suspect that the study does not
support the use of MR imaging features as predictors because of the ceiling effect present
in mJOA measurements at follow-up, which leads to poor discrimination and thus
low responsiveness. All of the subjects retained substantial functional ability, and
no subject scored below 10 on the mJOA scoring system. The majority
of patients scored on the mild side of the spectrum at follow-up. Therefore, one of the
limitations of the study was the limited variability among CSM individuals at
follow-up. A future study will require a better outcome measure than the mJOA, one with the
capacity to differentiate subjects more precisely across all severity groups at follow-up.
Similar to these findings, Singh et al reported low levels of sensitivity to change in JOA
score (r=0.21) compared to SF-36 (r=0.32), Nurick score (r=0.42) and MDI (r=0.52),
indicating that the scale is possibly less sensitive when differentiating milder levels of
severity [23]. Predicting a perfect correlation between the clinical scores with poor
sensitivity and the findings seen on MR images of spinal cord remains a challenge. In
addition, MRI provides a quantitative measure as opposed to qualitatively subjective
report of observers to differentiate severity of CSM. More research is needed to
investigate in greater detail the psychometric properties (reliability and validity) of
the modified version of the JOA (mJOA) scale.
The availability of these predictors enables spine surgeons and referring
physicians to provide more information to patients in consulting sessions prior to surgery,
and in guiding their therapeutic decision making. It provides better allocation of services
and becomes useful in designing clinical trials to test the effect of surgical interventions
on outcomes.
6.2. Implications of the findings
The most significant finding of this study is that there are now known reliable
measures (transverse area and anteroposterior diameter of spinal cord) to assess the
degree of spinal cord compression using digitized/magnified images and a standardized
written protocol. In the past, there was a lack of concordance in the literature on the
optimal techniques to quantitatively assess MRIs in patients with CSM. It has not been
possible to replicate previously published results due to lack of availability of information
on MRI protocol details and its measures.
The findings also enhance knowledge which lends insight into how MR imaging
should be approached and analyzed with this population. Perhaps some studies should
include assessments using T2-weighted images as opposed to T1-weighted and have a
consistent approach in the selection of the most compressed site, especially in CSM cases
with multilevel involvements due to degenerative changes of spine.
In summary, the predictive model provides a detailed profile of patient
characteristics and their variability, enabling a clinician to counsel patients on an individual
basis. We also report details about variability, age and baseline severity score.
Furthermore, the data was collected in a prospective fashion, which fills a void currently
existing in the literature. This study provides a detailed exploratory analysis, providing
new insights on discriminative abilities of mJOA scale in the area of CSM research.
6.3. Limitations
The present study has several limitations. The first is the absence of confirmed
reliability, validity and responsiveness of the modified version of JOA scoring system
and some MR imaging based predictive variables (number of compressed segments) used
in the baseline examinations. In addition, the modified version of JOA scale, which has
limited usefulness in detecting the precise benefit of surgery for mild CSM patients due
to a ceiling effect, was used in our study. The majority of patients scored on the mild side
of spectrum at follow up. A future study will require a better outcome measure than the
mJOA scale that would have a capacity to differentiate subjects more precisely from all
severity groups at follow-up. Second, a study with one single recruitment centre might
potentially systematically under- or overestimate measurement errors due to particular
characteristics of patients. Multicentre trial data in and outside of Toronto may help to
establish more representative estimates of CSM parameters. Finally, although the
findings were based on the secondary analysis of prospectively collected data, there was
a restriction in the types of MR imaging features collected.
6.4. Future directions
Based on the results from this study it appears that age and baseline severity score
at admission can both provide valuable information and can be part of a new
multidimensional scoring system for clinicians to counsel patients with CSM.
Because the cumulative effects of age, gender, duration of symptoms, baseline
severity score and MR imaging predictors on the functional score assessed by the mJOA
scale following surgery were estimated from a single centre and investigated for the
first time, a similar analysis must be conducted on data collected from larger North
American and international CSM clinical trial databases to determine whether our results
can be reproduced in other geographical regions with similar estimates of the magnitude
of all associations (beta values).
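The kind of adjusted analysis whose beta values would need to be reproduced (follow-up mJOA regressed on age while controlling for baseline mJOA) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the patient values, the coefficients in TRUE_BETAS and the helper names fit_ols and solve_linear_system are hypothetical and are not taken from the study data. The outcomes are generated without noise from known coefficients purely so that the recovered beta values can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without mutating the inputs.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical patients (NOT study data): age in years, baseline mJOA score.
ages = [38, 45, 52, 57, 61, 66, 70, 74]
baseline = [15, 12, 14, 11, 13, 10, 12, 9]

# Assumed "true" model: intercept, beta for baseline mJOA, beta for age.
TRUE_BETAS = (8.0, 0.7, -0.04)
follow_up = [
    TRUE_BETAS[0] + TRUE_BETAS[1] * b + TRUE_BETAS[2] * a
    for a, b in zip(ages, baseline)
]

# Design matrix columns: intercept, baseline mJOA, age.
design = [[1.0, float(b), float(a)] for a, b in zip(ages, baseline)]
betas = fit_ols(design, follow_up)
print("intercept, beta_baseline, beta_age:", [round(v, 3) for v in betas])
```

In a replication on another database, the estimated beta for age after adjustment for baseline score would be the quantity to compare against the single-centre estimate; with real, noisy data a statistical package that also reports confidence intervals would be the more appropriate tool.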
To establish the predictive value of MR imaging features for functional outcomes
after adjusting for other important predictors, the mJOA scale requires improvements
to its existing measurements and should potentially add some new ones. For example,
some studies have shown that the JOA score underestimates the initial handicap in the
hands, often among the first of patients’ complaints [69]. Similarly, recovery of manual
dexterity is poorly judged by this score. The domain covering upper limb function may
need to be reconsidered to make it a quantitative rather than the qualitative estimate
that it currently is. Gait dysfunction is the most important issue in CSM patients with
regard to surgical outcome and clinical deficits. Measurements of ambulation have shown
relative advantages over previous clinical assessment scales in determining clinical
severity and, particularly, in detecting change following surgery [70]. In the study by
Singh et al. (2001), walking-related parameters were shown to correlate well, and to
show validity, with other functional and impairment scales such as the myelopathy
disability index (MDI), the Nurick Scale and the short form health survey (SF-36) in the
CSM setting [24]. Adding a new domain with a walking component may enable more
accurate prediction of patients’ functioning after treatment. Further work on the mJOA
scale is necessary to confirm its psychometric properties, including reliability,
construct validity and discriminative ability (responsiveness).
Alternatively, MR imaging with T2 weighting has been reported to have a
sensitivity ranging from 15% to 65% [71], but low specificity, for the visualization of
intramedullary pathology. The development of more advanced spinal imaging
techniques, such as diffusion tensor imaging (DTI) with fractional anisotropy,
diffusion-weighted imaging (DWI), functional magnetic resonance imaging (fMRI) and
apparent diffusion coefficient (ADC) mapping, may enable more accurate correlations
between imaging and clinical presentation.
In addition, given that spin-echo MR imaging has limited pathophysiologic
usefulness in detecting myelopathy, DTI and DWI may be more useful in identifying
additional shearing injuries that are not visible on conventional MR images. In general,
DTI analyzes the movement of water in association with white matter fibers, providing
three-dimensional reconstruction of fiber tracts, and can help quantify the severity of
injury to individual white matter tracts [72]. Budzik et al. found that diffusion-tensor
MR imaging correlated better with clinical scores than T2-weighted imaging in cervical
spondylotic myelopathy [73]. Similarly, Sagiuchi et al. showed that DWI has higher
sensitivity than standard MRI for detecting acute spinal cord abnormalities [74].
fMRI analysis of the spinal cord provides physiological readouts of neuronal
activity and neuronal plasticity, in a non-invasive manner. A number of studies have
demonstrated the utility of advanced MRI techniques in the setting of spinal cord injuries
with reliable results and good sensitivity to changes in neuronal activity.
ADC and fractional anisotropy may be beneficial in assessing the correlation
between imaging and clinical presentation. Demir et al. found that ADC values
were a more sensitive indicator of spinal cord injury than T2-weighted images. When
combined with electrophysiological examination, ADC mapping showed a sensitivity of
92% and a negative predictive value of 75%, compared with 53% sensitivity and a 50%
negative predictive value for T2-weighted images [71]. Facon et al. performed a similar
study in six cervical spondylosis patients and determined that fractional anisotropy
values had significantly higher sensitivity and specificity than T2-weighted images in the
detection of spinal cord abnormalities [72].
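For readers less familiar with the diagnostic accuracy measures quoted above, the following minimal sketch shows how sensitivity, specificity and predictive values are derived from a two-by-two table of imaging result against reference standard. The counts are hypothetical, chosen only to make the arithmetic easy to follow; they are not taken from any of the cited studies, and the function name diagnostic_metrics is an illustrative assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return accuracy measures from 2x2 counts.

    tp: imaging positive, disease present   fp: imaging positive, disease absent
    fn: imaging negative, disease present   tn: imaging negative, disease absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases correctly detected
        "specificity": tn / (tn + fp),  # non-diseased cases correctly cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only (NOT data from the cited studies).
metrics = diagnostic_metrics(tp=18, fp=5, fn=2, tn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Note that sensitivity depends only on the patients with disease, whereas the negative predictive value also depends on how common the disease is in the sample, which is one reason the two can differ markedly within a single study.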
In conclusion, imaging indexes based on pathophysiologic models may enable
more accurate prediction of CSM and thereby facilitate better assessment of the prognosis
and better application of treatment strategies.
Conclusion
A predictive model was developed to predict the functional outcome of patients
undergoing surgery according to their age and baseline severity score, though changes
on MR imaging were not independently predictive of outcome. In addition to validating
reports in the existing literature, our study results suggest that MRI is a reliable tool
yielding reproducible, stable measurements. Some work on the responsiveness of the
current mJOA scale is needed to establish the ability of MRI to predict the functional
outcomes of CSM patients.
This study has shed some light on the need for a more responsive functional scale
than the mJOA, one that could detect more clinically important changes in functional
outcomes. More specifically, the main issue with the mJOA scale, explained above, is the
presence of a ceiling effect, with a lack of discrimination of functional deficits in
patients with milder CSM. This is preliminary work that provides a first step in developing a
multidimensional scoring system for prediction of functional outcomes in CSM using
demographic, clinical and MR imaging domains.
Moreover, the proportion of variance in follow-up functional scores explained by
age and baseline score is low, suggesting that this field has a long way to go before
achieving equipoise in refusing someone surgery on the basis of unfavourable baseline
characteristics.
CHAPTER 7
REFERENCE LIST
1. Emery, S., Cervical spondylotic myelopathy: diagnosis and treatment. Journal of the American Academy of Orthopaedic Surgeons, 2001. 9(6): p. 376-385.
2. Cadotte, D.W., Karpova, A.V., Fehlings, M.G., Cervical spondylotic myelopathy: surgical outcomes in the elderly. Int. J. Clin. Rheumatol, 2010. 5(3): p. 327-337.
3. Montgomery, D.M. and R.S. Brower, Cervical spondylotic myelopathy. Clinical syndrome and natural history. [Review] [54 refs]. Orthopedic Clinics of North America. 23(3):487-93, 1992 Jul., 1992.
4. Adams, C.B., Logue, V., Some functional effects of operations for cervical spondylotic myelopathy. Brain, 1971. 94: p. 587-594.
5. Law, M.D., Jr., Bernhardt, M., White, A.A., Evaluation and management of cervical spondylotic myelopathy. Instr Course Lect 1995. 44: p. 99-110.
6. Young, W.F., Cervical spondylotic myelopathy: a common cause of spinal cord dysfunction in older persons. Am Fam Physician, 2000. 62: p. 1064-1070, 1073.
7. Matz, P.G., et al., The natural history of cervical spondylotic myelopathy. J Neurosurg Spine, 2009. 11(2): p. 104-11.
8. Houten, J.K. and P.R. Cooper, Laminectomy and posterior cervical plating for multilevel cervical spondylotic myelopathy and ossification of the posterior longitudinal ligament: effects on cervical alignment, spinal cord compression, and neurological outcome. Neurosurgery. 52(5):1081-7; discussion 1087-8, 2003 May., 2003.
9. Hirabayashi, K., Miyakawa, J., Satomi, K., Maruyama, T., Wakano, K., Operative results and postoperative progression of ossification among patients with ossification of cervical posterior longitudinal ligaments. Spine, 1981. 6(4): p. 354-364.
10. Chen, C.J., et al., Intramedullary high signal intensity on T2-weighted MR images in cervical spondylotic myelopathy: prediction of prognosis with type of intensity. Radiology. 221(3):789-94, 2001 Dec., 2001.
11. Benzel, E.C., et al., Cervical laminectomy and dentate ligament section for cervical spondylotic myelopathy. Journal of Spinal Disorders. 4(3):286-95, 1991 Sep., 1991.
12. Park, Y.S., et al., Predictors of outcome of surgery for cervical compressive myelopathy: retrospective analysis and prospective study. Neurologia Medico-Chirurgica. 46(5):231-8; discussion 238-9, 2006 May., 2006.
13. Handa, Y., et al., Evaluation of prognostic factors and clinical outcome in elderly patients in whom expansive laminoplasty is performed for cervical myelopathy due to multisegmental spondylotic canal stenosis. A retrospective comparison with younger patients. Journal of Neurosurgery. 96(2 Suppl):173-9, 2002 Mar., 2002.
14. Matsuda, Y., et al., Outcomes of surgical treatment for cervical myelopathy in patients more than 75 years of age. Spine. 24(6):529-34, 1999 Mar 15., 1999.
15. Yamazaki, T., et al., Cervical spondylotic myelopathy: surgical results and factors affecting outcome with special reference to age differences. Neurosurgery. 52(1):122-6; discussion 126, 2003 Jan., 2003.
16. Hasegawa K, H.T., Chiba Y, Hirano T, Watanabe K, Yamazaki A., Effects of surgical treatment for cervical spondylotic myelopathy in patients > or = 70 years of age: a retrospective comparative study. J Spinal Disord Tech., 2002. 15: p. 458-460.
17. Kohno K, K.Y., Oka Y, Matsui S, Ohue S, Sakaki S., Evaluation of prognostic factors following expansive laminoplasty for cervical spinal stenotic myelopathy. Surg Neurol., 1997. 48: p. 237–245.
18. Nagata, K., et al., Cervical myelopathy in elderly patients: clinical results and MRI findings before and after decompression surgery. Spinal Cord. 34(4):220-6, 1996 Apr., 1996.
19. Morio, Y., et al., Correlation between operative outcomes of cervical compression myelopathy and mri of the spinal cord. Spine. 26(11):1238-45, 2001 Jun 1., 2001.
20. Yagi M, N.K., Kihara M, Horiuchi Y, Long-term surgical outcome and risk factors in patients with cervical myelopathy and a change in signal intensity of intramedullary spinal cord on magnetic resonance imaging. J Neurosurg Spine, 2010. 12: p. 59–65.
21. Mastronardi, L., et al., Prognostic relevance of the postoperative evolution of intramedullary spinal cord changes in signal intensity on magnetic resonance imaging after anterior decompression for cervical spondylotic myelopathy. Journal of Neurosurgery Spine. 7(6):615-22, 2007 Dec., 2007.
22. Fukushima, T., et al., Magnetic resonance imaging study on spinal cord plasticity in patients with cervical compression myelopathy. Spine. 16(10 Suppl):S534-8, 1991 Oct., 1991.
23. Singh A, C.H., Comparison of seven different scales used to quantify severity of cervical spondylotic myelopathy and post-operative improvement. Journal of Outcome Measures, 2001. 5(1): p. 798-818.
24. Singh, A., et al., Clinical and radiological correlates of severity and surgery-related outcome in cervical spondylosis. Journal of Neurosurgery. 94(2 Suppl):189-98, 2001 Apr., 2001.
25. Yukawa, Y., et al., MR T2 Image Classification in Cervical Compression Myelopathy. Spine, 2007. 32(15): p. 1675–1678.
26. Alafifi, T., Kern, R., Fehlings, M., Clinical and MRI Predictors of Outcome After Surgical Intervention for Cervical Spondylotic Myelopathy. Journal of Neuroimaging, 2006. 17(4): p. 315-322.
27. Okada, Y., et al., Magnetic resonance imaging study on the results of surgery for cervical compression myelopathy. Spine. 18(14):2024-9, 1993 Oct 15., 1993.
28. Chung, S., Chung, KH., Factors affecting the surgical results of expansive laminoplasty for cervical spondylotic myelopathy. Int Orthop, 2002. 26(6): p. 334-338.
29. Wada, E., M. Ohmura, and K. Yonenobu, Intramedullary changes of the spinal cord in cervical spondylotic myelopathy. Spine. 20(20):2226-32, 1995 Oct 15., 1995.
30. Uchida, K., Nakajima, H., Sato, R., Kokubo, Y., Yayama, T., Kobayashi, S., Baba, H., Multivariate analysis of the neurological outcome of surgery for cervical compressive myelopathy. Journal of Orthopaedic Science, 2005. 10: p. 564–573.
31. Yone, K., et al., Preoperative and postoperative magnetic resonance image evaluations of the spinal cord in cervical myelopathy. Spine. 17(10 Suppl):S388-92, 1992 Oct., 1992.
32. Fernandez de Rota, J.J., et al., Cervical spondylotic myelopathy due to chronic compression: the role of signal intensity changes in magnetic resonance images. Journal of Neurosurgery Spine. 6(1):17-22, 2007 Jan., 2007.
33. Nagata, K., Kiyonaga, K., Ohashi, MS., Miyazaki, S., Inoue, A. , Clinical value of magnetic resonance imaging for cervical myelopathy. Spine 1990. 15(11): p. 1089-1096.
34. Matsuyama, Y., N. Kawakami, and K. Mimatsu, Spinal cord expansion after decompression in cervical myelopathy. Investigation by computed tomography myelography and ultrasonography. Spine. 20(15):1657-63, 1995 Aug 1., 1995.
35. Ramanauskas WL, W.H., Metes JJ, Lazo A, Kelly JK., MR imaging of compressive myelomalacia. J Comput Assist Tomogr. , 1989. 13(3): p. 300-404.
36. Takahashi M, S.Y., Miyawaki M, Bussaka H., Increased MR signal intensity secondary to chronic cervical cord compression. Neuroradiology., 1987. 29(6): p. 550-556.
37. Morio Y, Y.K., Kuranobu K, Murata M, Tuda K., Does increased signal intensity of the spinal cord on MR images due to cervical myelopathy predict prognosis? Arch Orthop Trauma Surg. , 1994. 113(5): p. 254-259.
38. Al-Mefty O, H.L., Middleton TH, Smith RR, Fox JL., Myelopathic cervical spondylotic lesions demonstrated by magnetic resonance imaging. J Neurosurg. , 1988. 68(2): p. 217-222.
39. Mehalic TF, P.R., Applebaum BI., Magnetic resonance imaging and cervical spondylotic myelopathy. Neurosurgery, 1990. 26(2): p. 226-227.
40. Serizawa Y, O.K., Tanaka K, Tamaki S, Matsuura K, Uchihara T., Spontaneous resolution of an acute spontaneous spinal epidural hematoma without neurological deficits. Intern Med. , 1995. 34(10): p. 992-994.
41. Mihara, H., et al., Cervical myelopathy caused by C3-C4 spondylosis in elderly patients: a radiographic analysis of pathogenesis. Spine. 25(7):796-800, 2000 Apr 1., 2000.
42. Tuszynski MH, S.J., Fawcett JW, Lammertse D, Kalichman M, Rask C, Curt A, Ditunno JF, Fehlings MG, Guest JD, Ellaway PH, Kleitman N, Bartlett PF, Blight AR, Dietz V, Dobkin BH, Grossman R, Privat A; , Guidelines for the conduct of clinical trials for spinal cord injury as developed by the ICCP Panel: clinical trial inclusion/exclusion criteria and ethics. Spinal Cord, 2007. 45(3): p. 222-231.
43. Peolsson A, H.R., Vavruch L, Prediction of fusion and importance of radiological variables for the outcome of anterior cervical decompression and fusion. Eur Spine J, 2004. 13: p. 229–234.
44. McCormack, B.M. and P.R. Weinstein, Cervical spondylosis. An update. [Review] [116 refs]. Western Journal of Medicine. 165(1-2):43-51, 1996 Jul-Aug., 1996.
45. Lee, T.T., G.R. Manzano, and B.A. Green, Modified open-door cervical expansive laminoplasty for spondylotic myelopathy: operative technique, outcome, and predictors for gait improvement. Journal of Neurosurgery. 86(1):64-8, 1997 Jan., 1997.
46. Hashizume, Y., Iijima, S., Kishimoto, H., Yanagi, T., Pathology of Spinal Cord Lesions caused by Ossification of the Posterior Longitudinal Ligament. Acta Neuropathologica, 1984. 63: p. 123-130.
47. Uchida K, N.H., Sato R, Kokubo Y, Yayama T, Kobayashi S, Baba H., Multivariate analysis of the neurological outcome of surgery for cervical compressive myelopathy. J Orthop Sci., 2005. 10(6): p. 564-573.
48. Shinomiya K, M.N., Furuya K., Study of experimental cervical spondylotic myelopathy. Spine (Phila Pa 1976). 1992. 7(10 Suppl): p. S383-387.
49. Ryan R, H.S., Broclain D, Horey D, Oliver S, Prictor M, Cochrane consumers & communication review group: study quality guide. 2007: p. 1-50.
50. Nurick, S., The pathogenesis of the spinal cord disorder associated with cervical spondylosis. Brain. 95(1):87-100, 1972., 1972.
51. Chung SS, L.C., Chung KH., Factors affecting the surgical results of expansive laminoplasty for cervical spondylotic myelopathy. Int Orthop., 2002. 26(6): p. 334-338.
52. Kasai Y, U.A., New evaluation method using preoperative magnetic resonance imaging for cervical spondylotic myelopathy. Arch Orthop Trauma Surg., 2001. 121(9): p. 508-510.
53. Nagata K, K.K., Ohashi T, Sagara M, Miyazaki S, Inoue A., Clinical value of magnetic resonance imaging for cervical myelopathy. Spine (Phila Pa 1976). , 1990. 15(11): p. 1088-1096.
54. Matsuyama Y, K.N., Yanase M, Yoshihara H, Ishiguro N, Kameyama T, Hashizume Y., Cervical myelopathy due to OPLL: clinical evaluation by MRI and intraoperative spinal sonography. J Spinal Disord Tech. , 2004. 17(5): p. 401-404.
55. Mizuno J, N.H., Inoue T, Hashizume Y., Clinicopathological study of "snake-eye appearance" in compressive myelopathy of the cervical spinal cord. J Neurosurg., 2003. 99(2 Suppl)(162-168).
56. Yukawa Y, K.F., Yoshihara H, Yanase M, Ito K., MR T2 image classification in cervical compression myelopathy: predictor of surgical outcomes. Spine (Phila Pa 1976). , 2007. 32(15): p. 1675-1678.
57. Papadopoulos CA, K.P., Papagelopoulos PJ, Karampekios S, Hadjipavlou AG., Surgical decompression for cervical spondylotic myelopathy: correlation between operative outcomes and MRI of the spinal cord. Orthopedics., 2004. 27(10): p. 1087-1091.
58. Wada, E., et al., Can intramedullary signal change on magnetic resonance imaging predict surgical outcome in cervical spondylotic myelopathy? Spine. 24(5):455-61; discussion 462, 1999 Mar 1., 1999.
59. Tanaka J, S.N., Tokimura F, Doi K, Inoue S., Operative results of canal-expansive laminoplasty for cervical spondylotic myelopathy in elderly patients. Spine, 1999. 24: p. 2308-2312.
60. Tani T, Y.H., Kimura J., Cervical spondylotic myelopathy in elderly people: a high incidence of conduction block at C3-4 or C4-5. J Neurol Neurosurg Psychiatry, 1999. 66: p. 456–464.
61. Suri A, C.R., Mehta VS, Gaikwad S, Pandey RM., Effect of intramedullary signal changes on the surgical outcome of patients with cervical spondylotic myelopathy. Spine J., 2003. 3(1): p. 33-45.
62. Andersen T, C.F., Laursen M, Hoy K, Hansen ES, Bunger C, Smoking as a predictor of negative outcome in lumbar spinal fusion. Spine, 2001. 26: p. 2623–2628.
63. Glassman SD, A.S., Parker A, Burke D, Johnson JR, Dimar JR, The effect of cigarette smoking and smoking cessation on spinal fusion. Spine, 2000. 25: p. 2608–2615.
64. Concato J, F.A., Holford TR. , The risk of determining risk with multivariable models. Annals of Internal Medicine, 1993. 118: p. 201-210.
65. Norman, G.R. and D.L. Streiner, Biostatistics: The Bare Essentials. 2008, Shelton: People's Medical Publishing House.
66. Feinstein, A., Multivariate analysis: an introduction. 1996, London: Yale Univ Pr.
67. Rothman KJ, G.S., Modern epidemiology. 1998, Philadelphia: Lippincott-Raven.
68. Vickers AJ, A.D., Analysing controlled trials with baseline and follow up measurements. BMJ, 2001. 323: p. 1123-1126.
69. Pascal-Moussellard H, D.L.-R., Olindo S, Rouvillain J-L, Catonné Y, Neurological recovery after cervical cord decompression for canal stenosis myelopathy. Elsevier Masson SAS, 2006. 91: p. 607-614.
70. Singh A, C.H., Quantitative assessment of cervical spondylotic myelopathy by a simple walking test. Lancet 1999. 354: p. 370–373.
71. Demir A, R.M., Moonen CT, Vital JM, Dehais J, Arne P, Caillé JM, Dousset V., Diffusion-weighted MR imaging with apparent diffusion coefficient and apparent diffusion tensor maps in cervical spondylotic myelopathy. Radiology, 2003. 229(1): p. 37-43.
72. Facon D, O.A., Fillard P, Lepeintre JF, Tournoux-Facon C, Ducreux D., MR diffusion tensor imaging and fiber tracking in spinal cord compression. AJNR Am J Neuroradiol, 2005. 26(6): p. 1587-1594.
73. Budzik JF, B.V., Le Thuc V, Duhamel A, Assaker R, Cotten A., Diffusion tensor imaging and fibre tracking in cervical spondylotic myelopathy. Eur Radiol., 2010.
74. Sagiuchi T, T.S., Endo M, Hayakawa K., Diffusion-weighted MRI of the Cervical Cord in Acute Spinal Cord Injury With Type II Odontoid Fracture. J Comput Assist Tomogr. , 2002. 26(4): p. 654-656.
TABLES
CHAPTER 3: Systematic review
Table 3.1: Criteria in the modified version of the quality assessment checklist (each criterion is scored Yes or No, with comments)

Domain: Representative sample
Source description: Was the source of participants adequately described?
Referral pattern: Was the recruitment method adequately described? e.g. Representative sample: participants were selected as consecutive or random cases.
Patient characteristics: Was the population of interest adequately described for key characteristics: severity, co-morbidity, inclusion/exclusion criteria, age and sex? Yes, if all characteristics are reported. No, if the description is limited to age and sex characteristics, or none.
Sample size: Was the sample size large enough? The rule of thumb: at least 10 cases per independent variable are required at a power of 80% and a 5% significance level (e.g. the author runs a comparison for age, sex, symptom duration, pre-/post-operative neurological scores, etc.).

Domain: Blinding
Blinded assessor: Were MRI assessors involved in the study blinded to clinical data? e.g. Blinded outcome assessment: assessor was unaware of prognostic factors at the time of outcome assessment.

Domain: Baseline comparability
Compared baseline performance of clinical status: Is baseline performance of clinical status measured? If yes, is the absolute difference between the groups less than 10%? If yes, score the quality criterion as YES. If no, did the analysis take into consideration the baseline imbalance (for example, analysis of covariance or analysis by change scores between groups)? e.g. Statistical adjustment: multivariate analyses conducted with adjustment for potentially confounding factors. If yes, score the quality criterion as YES. If no, score the quality criterion as NO. Otherwise, if no comparison is completed, then NA.
Compared baseline performance of other predictive variables: Is baseline performance of age, sex and symptom duration measured? If yes, is the absolute difference between the groups less than 10%? If yes, score the quality criterion as YES. If no, did the analysis take into consideration the baseline imbalance (for example, analysis of covariance or analysis by change scores between groups)? e.g. Statistical adjustment: multivariate analyses conducted with adjustment for potentially confounding factors. If yes, score the quality criterion as YES. If no, score the quality criterion as NO. Otherwise, if no comparison is completed, then NA.

Domain: Follow-up
Complete: Was follow-up reported? If yes, was follow-up complete? Follow-up >80%: outcome data were available for at least 80% of participants at one follow-up point. If not, then score the quality criterion as NO.
Comparison of drop-outs with those remaining: Were those followed up comparable to those who dropped out?
Reasons for drop-outs: Were reasons for loss to follow-up provided?

Domain: Validation of outcome measurement
Valid: Were outcome measures adequately valid? Yes, if the prognostic study tested the validity of measurements used or referred to other studies which had established validity. Otherwise, no.
Reliable: Were outcome measures adequately reliable? Yes, if the prognostic study tested the reliability of measurements used or referred to other studies which had established reliability. Otherwise, no.

Domain: Validation of predictive factor measurement
Defined: Were definitions or descriptions of the MRI predictor adequately provided? Yes, if there is clear indication of the measurement method, such as a detailed description of the MRI protocol including planes (axial/sagittal) and thickness of slices. Otherwise, no.
Reliable: Were predictive factor measures adequately reliable? Yes, if inter/intra-observer reliability tests with/without coefficient value are reported (e.g. Cronbach alpha or Kappa coefficients). Otherwise, no.
Table 3.2: Summary of methodological limitations, in the format of the modified quality assessment checklist designed by the Cochrane Consumers and Communication Review Group (2007) [No = 0, Yes = 1].
Representative sample Blinding Baseline comparability Follow up
Table 3.3: Study design, sample size, type of outcome measures and level of evidence
Citation | Study design | Sample N | Outcome measure scale | Level of evidence* | Notes
Nagata et al. 1990 | Prospective cohort | 300 | JOA | IV | Follow-up: no; inception point: no
Fukushima et al. 1991 | Prospective cohort | 55 | JOA | I | Follow-up: yes; inception point: yes (onset)
Yukawa et al. 2007 | Prospective cohort | 142 | JOA | IV | Follow-up: no; inception point: no
Yone et al. 1992 | Prospective cohort | 140 | JOA | IV | Follow-up: no; inception point: no
Okada et al. 1993 | Prospective cohort | 74 | JOA | IV | Follow-up: yes; inception point: no (symptom duration?)
Papadopoulos et al. 2004 | Prospective cohort | 42 | JOA | IV | Follow-up: yes; inception point: no (symptom duration?)
Singh et al. 2001 | Prospective cohort | 69 | Walking Test | I | Follow-up: yes; inception point: yes (surgery)
Chen et al. 2001 | Prospective cohort | 64 | mJOA | IV | Follow-up: yes; inception point: no
Mastronardi et al. 2007 | Prospective cohort | 42 | mJOA | I | Follow-up: yes; inception point: yes (onset of symptoms)
Fernandez et al. 2007 | Prospective cohort | 67 | mJOA | I | Follow-up: yes; inception point: yes (3 months before surgery)
Nagata et al. 1996 | Retrospective cohort | 173 | JOA | IIc |
Uchida et al. 2005 | Retrospective cohort | 135 | JOA | IIc |
Kasai et al. 2001 | Retrospective cohort | 128 | JOA | IIc |
Chung et al. 2002 | Retrospective cohort | 113 | JOA | IIc |
Wada et al. 1999 | Retrospective cohort | 85 | JOA | IIc |
Morio et al. 2001 | Retrospective cohort | 73 | JOA | IIc |
Yamazaki et al. 2002 | Retrospective cohort | 64 | JOA | IIc |
Wada et al. 1995 | Retrospective cohort | 31 | JOA | IIc |
Houten et al. 2002 | Retrospective cohort | 38 | mJOA | IIc |
Park et al. 2006 | Retrospective cohort | 80 | NCSS | IIc |
Mizuno et al. 2003 | Case series study | 134 | JOA | IV |
Matsuyama et al. 2004 | Case series study | 44 | JOA | IV |
Matsuda et al. 1991 | Case series study | 29 | JOA | IV |
* http://www.eboncall.org/content/levels.html: NHS R&D Centre for Evidence-Based Medicine (Bob Phillips, Chris Ball, Dave Sackett, Brian Haynes, Sharon Straus and Finlay McAlister) (2002)
Table 3.4: Data extracted for groups of MRI features (signal intensity, spinal cord compression and spinal canal compromise)
Table 3.4 (I): Descriptions of increased signal intensity (ISI) of the spinal cord in T2-/T1-weighted MRI
Predictive variable Author Method assessments:
Matsuda et al. 1991 1.5-tesla superconductive magnet* and a surface coil was used. The slices were from 3 to 5 mm thick.
Papadopolous et al. 2004 No description
Absence/presence of T2 signal intensity changes on sagittal view
Yukawa et al. 2007 1.5-T A surface coil was used. The slice width was 4 mm Absence/presence of T2 signal intensity changes on axial views
Mizuno et al. 2003
Snake-eye appearance was defined as one left- and one right-sided small round or elliptical high signal intensity lesion in the central gray matter near the ventrolateral posterior column
Absence/presence of T2 signal intensity changes (type of plane is not mentioned)
Type 0 no SI on T2 Type 1 (>50%) faint and fuzzy border Type 3 (>50%) intense and well-defined border
Three patterns of axial T1/sagittal T2 –weighted sequences
Morio et al. 2001 Alafifi et al. 2007 Mastronardi et al.2007
(A) normal intensity on both T1- and T2-weighted images (B) normal intensity on T1- weighted and high signal intensity on T2-weighted images (C) low signal intensity on T1-weighted and high signal intensity on T2-weighted images
Signal-intensity ratio on sagittal T2-WI
Okada et al. 1993
The intensity of the intramedullary, sagittal T2-weighted MRI cord signal at maximally compressed levels divided by comparable readings at contiguous noncompressed sites
Table 3.4 (II): Descriptions of degree of spinal cord compression and/or canal compromise for cervical spondylotic myelopathy by magnetic resonance imaging (MRI) finding
Predictive variable Author Method assessments:
Yone et al. 1992
No description Slice thickness: 5 mm
Anteroposterior diameter on sagittal T1WI
Kasai et al. 2001 A 1.5-T MRI device The slice width was set at 5 mm and the number of slices at 7. Sagittal view of T1-/T2-weighted images MRI cumulative score: 6 degrees of spinal stenosis captured on T1/T2-weighted sagittal imaging: Grade 0: normal image; Grade 1: either the anterior or posterior subarachnoid space is not maintained; Grade 2: both the anterior and posterior subarachnoid spaces are not maintained; Grade 3: either anterior or posterior spinal cord deformity, but the posterior or anterior subarachnoid space is maintained; Grade 4: either anterior or posterior spinal cord deformity is observed, and the posterior or anterior subarachnoid space is not maintained; Grade 5: spinal cord deformity is observed both anteriorly and posteriorly
Degrees of spinal cord on sagittal T1WI
Nagata et al.1996 None (0) Mild (1; flattening or concavity of the anterior surface only) Moderate (2; <50% reduction in maximal sagittal diameter) Severe (3; >50% reduction in sagittal diameter)
Okada et al. 1993 The transverse area at the site of maximal cord compression was measured with a digitizer linked to a computer
Transverse area on axial T1WI
Fukushima et al. 1991 MRI axial views perpendicular to the spinal cord were obtained with a 0.5 tesla superconducting MRI system Critical value of transverse area is 0.45 cm2
Chung et al. 2002 Thickness of slices was not reported Pre-operative T1-weighted axial imaging with a Signa 1.5-tesla Compression ratio = a/b, where a is the smallest sagittal diameter of the spinal cord and b is the broadest transverse diameter of the cord at the same level
Chen et al. 2001
Cord compression ratio = sagittal diameter/transverse diameter The imagers were superconducting 1.5-T MR systems Section thickness was 4 mm with 1-mm gap on both sagittal and transverse images.
Compression ratio on axial T1-weighted
Okada et al. 1993
(Sagittal diameter/transverse diameter)*100% MRI examinations were performed with a 0.5 Tesla Slice thickness = 10 mm
Degree of diameter on sagittal view
Houten et al.2002 Thickness is not reported Grade 0: 360 degree cushion of CSF around SC Grade 1: loss of CSF cushion without indentation of SC. May have slight anterior cord flattening Grade 2: mild cord compression Grade 3: Severe spinal cord compression
Table 3.4 (III): Area of high T2-signal change for cervical spondylotic myelopathy by magnetic resonance imaging (MRI) finding
Predictive variable Author Method assessments:
Wada et al. 1999 1.5-T with surface coil. Slice thickness =3-5 mm Mastronardi et al.2007 1.5-T with surface coil. Slice thickness =5 mm
Focal/ multisegmental high MRI intensity areas Fernandez et al. 2007
No thickness of slices was reported Type 0 no intramedullary high-signal intensity on T2-weighted images Type 1 high-signal intensity involved only one segment Type 2 high signal intensity extended over two segments
Table 3.5: Potential predictors, with statistical significance reported for univariate analyses and strength of association where available, at short-term (less than 6 months) and long-term (greater than 6 months) follow-up.
Table 3.5 (I): Signal intensity changes as potential predictors
Prognostic factors Author Outcome Length of follow-up Statistical significance Strength of association
Yukawa et al 2007 JOA Long term p=0.033 p=0.0012
NA * NA **
Yone et al 1992 JOA Unknown p>0.05 NA * Papadopolous et al 2004 JOA Long term p>0.05
p<0.001 NA * NA **
Absence/presence of T2 signal intensity changes on sagittal view
Matsuda et al 1991 JOA Short term p<0.05 p<0.05
NA * NA **
Wada et al 1995 JOA Short term p>0.05 p>0.05
NA * NA **
Yamazaki et al 2002 JOA Long term p>0.05 NA *
Absence/presence of T2 signal intensity changes on axial/sagittal views
Chung et al 2002 JOA Long term p>0.05 NA * Absence/presence of T2 signal intensity changes on axial views
Mizuno et al 2003
JOA Short term p<0.001 NA *
Yukawa et al 2007 JOA Long term p=0.020 NA * Chen et al 2001 JOA Long term p=-0.018 NA *
Degree of intensity on sagittal T2WI
Uchida et al 2005 JOA Long term p>0.05 NA * Three patterns of axial T1/sagittal T2 –weighted sequences
Morio et al 2001 JOA Long term p = 0.0259 NA *
Signal-intensity ratio on sagittal T2-WI Okada et al 1993 JOA Unknown p<0.001 r=0.537 OPLL * r=0.426 CSM *
Fernandez et al 2007 mJOA Long term p>0.05 NA ** Absence/presence of T2 signal intensity changes on sagittal view Houten et al 2003 mJOA Short term p>0.05 NA ** Three patterns of axial T1/sagittal T2 –weighted sequences
Mastronardi et al 2007 mJOA Long term p=0.001 NA **
Absence/presence of T2 signal intensity changes (type of plane is not mentioned)
Singh et al 2001
Nurick Walking
Short term
p=0.03 p=0.0011
r=0.26 ** NA **
Area of signal intensity changes Wada et al 1995 JOA Short term p>0.05 NA *
p>0.05 NA ** Wada et al 1999 JOA Long term p<0.05
p<0.05 NA * NA **
Mastronardi et al 2007 mJOA Long term p=0.001 p<0.05
NA * NA **
Fernandez et al 2007 mJOA Long term p=0.001 NA * Table 3.5 (II): Severity of spinal cord compression as potential prognostic indicator Prognostic factors Author Outcome Length of
follow-up Statistical significance
Strength of association
Yone et al 1992
JOA Unknown p>0.05 p>0.05
NA * NA **
Anteroposterior diameter on sagittal T1WI
Kasai et al 2001 JOA Long term p<0.01 r=-0.436 * Degrees of spinal cord on sagittal T1WI Nagata et al 1996 JOA Long term p<0.05 NA **
Okada et al 1993 JOA Unknown p<0.01 (CSM/OPLL)
r=0.678/0.586 *
Fukushima et al 1991 JOA Long term p<0.05 r=0.295**
Transverse area on axial T1WI
Morio et al 2001 JOA Long term p=0.0517 p=0.0015
r=0.243 * r=0.398 **
Okada et al 1993 JOA Unknown p>0.05 NA * Chen et al 2001 JOA Long term p=0.836 r=0.026 *
Compression ratio on axial T1-weighted
Chung et al 2002 JOA Long term p<0.05 NA * Uchida et al 2005 JOA Long term p<0.05 in OPLL
p>0.05 in CSM NA ** NA **
Rate of flattening of the cord
Nagata et al 1990 JOA Long term p>0.05 NA * Grade 0 360 degree cushion of CSF around SC on…..
Houten et al 2003
mJOA Short term NA NA **
Degree of diameter on sagittal view Singh et al 2001 Nurick Short term p=0.60 r=0.07 ** Cord deformity on axial T1-weighted MRI Matsuyama et al 2004 JOA Short term NA
NA NA * NA **
* recovery rate; ** post-operative functional score
Table 3.6: RESULTS - PREVIOUS PREDICTIVE MODELS
Study Name Year
Population Number
Fashion of selection
Range of years
Data collection
Statistics Outcome Measure
Recovery percentage
& Mean post-operative
score
Explained variation
(r2)
Variables in final model
Park 2006
80 Non-consecutive CSM cases 2000-2003 3 months after surgery
Patients charts
Stepwise, multivariate regression
NCSS Recovery (%) Maximum score 14
62.2% 25.2% Duration of symptoms Number of high intensity segments
Chen 2001
64 consecutive CSM cases, 1999-2000 6 months after surgery
Clinical database
ANCOVA mJOA Recovery (%) Maximum score 21
79.3%
47.9% Age Degree of intrinsic signal changes
Morio 2001
1998-1999 Non-consecutive CSM cases, Mean 3.4 years, range, 0.5–10 years after surgery
Clinical database
Stepwise, multivariate regression
JOA Recovery (%)
& Mean post score Maximum score 17
180% 14.5
29.7% 70.3%
Recovery percentage: Age Duration of symptoms Signal patterns Post-JOA: Age Duration Signal patterns
Baseline score Okada 1993
74 non-consecutive CSM cases No follow-up time was provided
Clinical database
Multiple regression analysis
JOA Recovery (%) Maximum score 17
(OPLL) 54.7% (CSM) 52.2% (CDH) 12.7%
71.8% 70.2%
Transverse area Signal Intensity ratio Duration of symptoms
Uchida 2005
1988-2001 Non-consecutive OPLL and CSM cases Mean 8.3 years, range, 1.0–12.8 years
CSM group: Anterior Surgery Preoperative JOA score Crandall and Batzdorff’s type Radiographic abnormality Level of compression Rate of flattening of the cord Increased transverse area of the cord Grade of SCEP Laminoplasty Surgery Preoperative JOA score Crandall and Batzdorff’s type
Level of compression Rate of flattening of the cord Increased transverse area of the cord Grade of SCEP OPLL group Anterior Surgery & Laminoplasty Preoperative JOA score Crandall and Batzdorff’s type Spinal canal narrowing (preoperative CT) Type of OPLL Level of compression Rate of flattening of the cord Increased transverse area of the cord Grade of SCEP
CHAPTER 4: Material and Methods
Table 4.1: The mJOA scale for functional assessment for CSM*
Score  Description
I. Motor dysfunction of the upper extremities
0  Inability to move hands
1  Inability to eat with a spoon but able to move hands
2  Inability to button shirt but able to eat with a spoon
3  Able to button shirt with great difficulty
4  Able to button shirt with slight difficulty
5  No dysfunction
II. Motor dysfunction of the lower extremities
0  Complete loss of motor and sensory function
1  Sensory preservation without ability to move legs
2  Able to move legs but unable to walk
3  Able to walk on flat floor with a walking aid (such as a cane or crutch)
4  Able to walk up and/or down stairs with hand rail
5  Moderate to significant lack of stability but able to walk up and/or down stairs without hand rail
6  Mild lack of stability but walks unaided with smooth reciprocation
7  No dysfunction
III. Sensation
0  Complete loss of hand sensation
1  Severe sensory loss of pain
2  Mild sensory loss
3  No sensory loss
IV. Sphincter dysfunction
0  Inability to micturate voluntarily
1  Marked difficulty with micturition
2  Mild to moderate difficulty with micturition
3  Normal micturition
*From Benzel and colleagues, 1991.
Table 4.3: Standard parameters for cervical spine T1- and T2-weighted Magnetic Resonance Image (MRI) used in our study PROTOCOL: C-Spine 1.5T - start w/ 3-pl Loc & Asset Cal Series # 3 4 5 Scan Pl. / Mode Sag T2 Sag T1 Ax 3D T2 Pulse Sequence FrFSE FrFSE 3D FrFSE PSD File NPW, EDR NPW FC Name & Fast Imaging FR Options TR* / R-R#** 3200-6887 467-2616 2000-2500 TE1 / TE2* 110-119 10.1 97-106 ETL (Echo Train Length) 24-33 1--6 39 FOV (Field of View) 24-26 24-26 18-24 Slice Thickness 3 3 2.5 Spacing*** 3.3-3.5 3.3-3.5 2.5 # of Slices 13-18 13-18 24-80 Matrix 512X224 512X224 320X224 Phase FOV (Field of View) Frequency Direction A/P A/P R/L Number of excitation 2--4 1--2 1 Shim on Spatial Sat I,S,a,p I,S,a,p a Scan Time 0:26-17:22 0:55-23:43 4:35-5:44
Table 5.1: Characteristics of Patients with Cervical Spondylotic Myelopathy
Characteristics (n=61)                          % (No. of Patients)
Mean duration of symptoms ± SD (months)         21.1±18.2
Mean age ± SD (y)                               56.2±11.9
Mean age (years)**
  <=65 years old                                75% (46)
  >65 years old                                 25% (15)
Gender
  Female                                        31% (19)
  Male                                          69% (42)
Severity of CSM***
  Mild (mJOA>=15)                               32% (19)
  Moderate (mJOA 12-14)                         34% (21)
  Severe (mJOA<12)                              34% (21)
Anatomical level of stenosis
  C3/C4                                         9% (6)
  C4/C5                                         13% (8)
  C5/C6                                         25% (15)
  C6/C7                                         49% (30)
  Unknown                                       3% (2)
Number of stenotic levels**
  One                                           45% (26)
  Two                                           23% (13)
  Three and more                                32% (18)
  Unknown                                       6% (4)
Signal intensity changes
  Normal T1/Norm T2                             34% (20)
  Normal T1/High T2                             47% (28)
  Low T1/High T2                                19% (11)
Surgical approach
  Anterior approach                             67% (42)
  Posterior approach                            30% (18)
  Anterior & posterior approach                 3% (1)
Etiologies of myelopathy
  One etiology
    OPLL                                        6% (4)
    Spondylosis                                 37% (24)
    Disk                                        17% (11)
    Hypertrophic ligament flavum                2% (1)
    Subluxation                                 2% (1)
  Two etiologies                                29% (19)
  Three etiologies                              5% (3)
  Unknown                                       3% (2)

Table 5.2: Values of the mJOA in CSM sample
                        Baseline     12 months    Change score   95% CI for change score
mJOA functional scale   12.9+/-2.7   15.8+/-2.3   2.93+/-2.4     2.32-3.55
NOTE. Values are mean +/- SD. Abbreviation: CI, confidence interval.
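The 95% confidence interval for the change score in Table 5.2 can be approximately reproduced from the reported mean change, its SD and the sample size. The sketch below is illustrative only: it assumes n=61 (the sample size given in Table 5.1) and a two-sided t critical value of about 2.000 for 60 degrees of freedom. The function name `mean_ci` is hypothetical, and the method actually used to compute the interval is not stated in this section.

```python
import math


def mean_ci(mean, sd, n, t_crit):
    """Return (lower, upper) bounds of a confidence interval for a mean."""
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = t_crit * se        # half-width of the interval
    return mean - margin, mean + margin


# Mean change and SD from Table 5.2; n=61 and t_crit=2.000 are assumptions.
lower, upper = mean_ci(mean=2.93, sd=2.4, n=61, t_crit=2.000)
print(f"95% CI for change score: {lower:.2f} to {upper:.2f}")
```

Under these assumptions the result agrees with the reported interval (2.32-3.55) to within rounding of the published inputs.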
Table 5.3: Correlation matrix and coefficients between functional outcomes and independent variables
Variable                              (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)
(1) Age                               1.00
(2) Gender                            0.03   1.00
(3) Duration of symptoms              0.27   -0.12  1.00
(4) Baseline score                    0.44   0.05   0.26   1.00
(5) Signal intensity changes          0.24   0.12   0.13   0.13   1.00
(6) Transverse area                   0.27   0.13   0.08   0.29   0.39   1.00
(7) Anteroposterior diameter          0.21   0.12   0.03   0.19   0.41   0.62   1.00
(8) Number of compressed segments     0.20   0.12   0.20   0.32   0.24   0.35   0.23   1.00
Table 5.4: Unadjusted beta value estimates for independent variables (univariable analysis)
Variable                                                      Coefficient   95% CI            P Value   R2
Baseline mJOA
Age as dichotomized* (<=65 years old; >65 years old)          -2.83         -1.42, -4.24      0.0002    0.20
Age as continuous                                             -0.08         -0.13, -0.03      0.0051    0.12
Gender*                                                       0.30          -1.15, 1.75       0.68      0.003
Duration of symptoms as dichotomized* (<=12; >12 months)      -1.55         -2.97, -0.15      0.03      0.07
Duration of symptoms as continuous                            0.00          -0.04, 0.04       0.97      0.00
TA as dichotomized                                            0.96          -0.4, 2.32        0.17      0.03
TA as continuous*                                             0.06          0.02, 0.10        0.02      0.08
AP diameter                                                   0.43          -0.13, 0.99       0.14      0.03
Intensity signal changes*                                                                     0.61      0.02
  Low T1/high T2 vs. Normal T1/High T2                        0.75          -0.16, 2.64
  Low T1/high T2 vs. Normal T1/Norm T2                        0.99          -0.96, 2.98
Number of compressed segments*                                                                0.04      0.10
  ≥ 3 vs. 2 compressed segments                               2.35          0.57, 4.13
  ≥ 3 vs. 1 compressed segment                                1.06          -1.01, 3.13
Final mJOA
Baseline mJOA*                                                1.014         (not given)       <.0001    0.30
Age as continuous                                             -1.002        -3.005, 1.001     0.01      0.11
Age as dichotomized* (<=65 years old; >65 years old)          -1.072        -3.110, 0.966     <.0001    0.22
Gender*                                                       -1.018        -3.057, 1.022     0.33      0.06
Duration of symptoms as continuous                            1.0           -1.003, 3.003     0.72      0.002
Duration of symptoms as dichotomized* (<=12; >12 months)      -1.034        -3.075, 1.007     0.09      0.05
TA as continuous*                                             1.0           -1.005, 3.005     0.24      0.02
TA as dichotomized                                            1.01          -1.030, 3.050     0.56      0.01
AP diameter as continuous                                     1.005         -1.012, 3.022     0.49      0.01
Intensity signal changes*                                                                     0.33      0.04
  Low T1/High T2 vs. Normal T1/High T2                        1.016         -1.037, 3.069
  Low T1/High T2 vs. Normal T1/Norm T2                        1.038         -1.017, 3.093
Number of compressed segments*                                                                0.78      0.01
  ≥ 3 vs. 2 compressed segments                               -1.00         -3.055, 1.055
  ≥ 3 vs. 1 compressed segment                                1.03          -1.017, 3.077
* Chosen exposure variables for multivariable analysis
Table 5.5: Statistical details of full models (multivariable analysis)
Dependent Variable                                        Independent Variables   Coefficient   95% CI           MSE for the Model   P Value for the Model   Adjusted R2 for the Model
Baseline mJOA score                                       Age                     -2.83         -1.420, -4.240   2.44                p=0.0002                20%
Follow-up mJOA score adjusted for baseline mJOA score     Age                     -1.04         -3.081, 1.001    0.06                p<0.0001                36%
FIGURES
Figure D.1: Measurement of the anteroposterior diameter (AP) (A) and transverse area (TA) (B) of the spinal cord on a T2-weighted MR image.
Figure D.2: T1-weighted image of the sagittal view revealing hypointensity in the spinal cord (C) and T2-weighted image of the sagittal view showing hyperintensity in the spinal cord (D) before surgery (arrow).
Figure D.3: (E) Focal compression (F) Multiple level of compression.
Figure D. 4: Distribution of baseline mJOA scores.
Figure D. 5: Distribution of post-operative mJOA scores at 12 months.
CHAPTER 8
APPENDICES
Appendix 1 Search strategy (results: November 28, 2008)
Database: Ovid MEDLINE(R)
Searches:
1. Magnetic Resonance Imaging/
24. course*.tw.
25. diagnosed.tw.
26. cohort*.tw.
27. death.tw.
28. exp case-control studies/
29. disease-free survival.mp.
30. medical: futil:.mp.
31. treatment outcome:.mp.
32. treatment failure:.mp.
33. exp disease progression/
34. (disease adj1 progress:).mp.
35. fatal outcome:.mp.
36. hospital mortality:.mp.
37. exp survival analysis/
38. natural histor:.mp.
39. or/16-38
40. exp Spinal Cord Compression/
41. cervical spondylotic myelopath:.mp.
42. cervical spond: myelopath:.mp.
43. (cervical adj2 myelopath:).mp.
44. spinal canal compromis:.mp.
45. spin: cord compress:.mp.
46. central cord syndrome/
47. medulla: compress:.mp.
48. (spinal cord: adj2 pinch:).mp.
49. conus medullaris syndrome.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
50. conus medullaris syndromes.mp. [mp=title, original title, abstract, name of substance word, subject heading word]
51. or/40-50
52. 39 and 51 and 15
53. exp animals/
54. exp human/
55. 53 not (53 and 54)
56. 52 not 55
Appendix 3: RELIABILITY
A comparison of four quantitative methods to assess spine stenosis and
spinal cord compression on magnetic resonance imaging in patients with
cervical spine myelopathy
F.1. INTRODUCTION AND OVERVIEW
F.2. STUDY OBJECTIVE
F.3. HYPOTHESIS
F.4. STUDY DESIGN
F.5. TARGET POPULATION
F.6. DEFINITION OF MR IMAGING PARAMETERS
F.6.1 Strategies to improve reliability of MR imaging parameters (variation due to)
F.6.1.a. Clinicians
F.6.1.b. Patients
F.6.1.c. MR Imaging protocol
F.6.1.d. Measurement errors
F.7. SAMPLE SIZE
F.8. DATA ANALYSIS
INTRODUCTION AND OVERVIEW
Lack of standardized approaches to assess the severity of cord compression in the
setting of CSM may contribute to variability in interpretations of MRI-based features.
The process of developing a radiological measure to assess severity of CSM requires
selection of the most suitable imaging modality, appraisal of reliability and determination of
validity. A valid measurement instrument needs to be reliable, that is, reproducible, even
though reliability is not a sufficient condition for validity. Reliability measures the degree
of consistency across repeated assessments of the same patients by the same rater
(intrarater reliability) or agreement across different raters for the same patient (interrater
reliability). Estimating reliability is important because: 1) reliability represents the
minimal requirement for a valid clinical measure, and 2) the efficiency of clinical trials relies
on reliable measurements.
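Interrater reliability of this kind is commonly summarised with an intraclass correlation coefficient (ICC) in the sense of Shrout and Fleiss [13]. The sketch below computes ICC(2,1) (two-way random effects, single rater, absolute agreement) from a subjects-by-raters table. It is an illustration under assumptions: the ICC form used in this study is not stated in this section, the function name `icc_2_1` is hypothetical, and the ratings are illustrative values, not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, single rater, absolute agreement.

    ratings: list of rows, one row per subject, one score per rater.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                 # between-subjects mean square
    rms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_err / ((n - 1) * (k - 1))      # residual (error) mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (rms - ems) / n)


# Illustrative ratings only: 6 subjects (rows) scored by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example)
print(f"ICC(2,1) = {icc:.2f}")
```

Because ICC(2,1) penalises systematic differences between raters, a large between-raters mean square lowers the coefficient even when raters rank the subjects similarly.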
STUDY OBJECTIVE
The objective is to investigate the intra- and inter-rater reliability of four published
methods of examining canal stenosis and cord compression on axial (transverse area
[TA], anteroposterior diameter [AP]) and sagittal (maximum spinal cord compression
[MSCC], maximum canal compromise [MCC]) MR imaging planes.
HYPOTHESIS
We hypothesize that, using a systematic approach to evaluate cervical canal
stenosis and spinal cord compression with magnified, software-based tools, written
instructions and consistent interpretations, TA, AP, MSCC and MCC would be
reproducible imaging assessments of the severity of cord compression in CSM patients,
irrespective of the clinician’s background/experience, learning and CSM severity.
STUDY DESIGN Subjects:
The patients were randomly selected from a prospectively accrued database of
CSM patients who were referred for surgical treatment at our unit, a large tertiary
care, university-based spine center.
Procedures:
Seventeen digital cervical spine MR images were evaluated in a blinded fashion, on four
separate occasions, by four spine specialists (two from neurosurgery, two from orthopaedic
surgery) from North America (n=2), Europe (n=1) and Asia (n=1).
TARGET POPULATION
The patients had a clinical diagnosis of myelopathy confirmed by evidence of
cord compression on MRI. This project is based on analysis of data from a single centre (n=65)
that is part of the larger multicentre AOSpine North America CSM Trial (n=283 cases).
DEFINITION OF MR IMAGING PARAMETERS (EXPOSURE VARIABLES)
Defined Radiological Parameters
Based on three dimensions from MRI, the maximum cord compression using T2-
weighted MRI and canal compromise using T1-weighted MRI were calculated using the
following formulas [9].
Maximum spinal cord compression (%):
where Di is the anteroposterior spinal cord diameter at the level of maximum spinal cord
compression, Da is the anteroposterior spinal cord diameter at the normal levels
immediately above, and Db is the anteroposterior spinal cord diameter at the normal
levels immediately below the level of injury (Figure F.1.).
Maximum canal compromise (%):
where Di is the anteroposterior spinal canal diameter at the level of maximum spinal cord
compression, Da is the anteroposterior spinal canal diameter at the normal levels
immediately above, and Db is the anteroposterior spinal canal diameter at the normal
levels immediately below the level of injury. Measurements of the normal canal
anteroposterior diameter should be taken at midvertebral body level.
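A minimal sketch of both percentages follows, assuming the form given in the cited source [9], in which the diameter at the compressed level is compared with the mean of the normal levels above and below: (1 - Di / ((Da + Db) / 2)) x 100. The function name `percent_compromise` and the example diameters are hypothetical and serve only to illustrate the arithmetic.

```python
def percent_compromise(d_i, d_a, d_b):
    """Percent reduction of the diameter at the compressed level (d_i)
    relative to the mean of the normal levels above (d_a) and below (d_b)."""
    if d_a <= 0 or d_b <= 0:
        raise ValueError("reference diameters must be positive")
    reference = (d_a + d_b) / 2.0
    return (1.0 - d_i / reference) * 100.0


# Hypothetical diameters in mm, for illustration only.
mscc = percent_compromise(d_i=4.0, d_a=7.0, d_b=9.0)     # cord diameters (T2-weighted)
mcc = percent_compromise(d_i=6.0, d_a=12.0, d_b=12.0)    # canal diameters (T1-weighted)
print(f"MSCC = {mscc:.1f}%, MCC = {mcc:.1f}%")
```

Expressing each measurement relative to the patient's own normal levels makes the values comparable across patients with different baseline cord and canal sizes.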
Transverse area: measured at the site of greatest compression on the T2-weighted axial view of
the spinal cord [3] (Figure F.2.).
Anteroposterior diameter: identified as the smallest sagittal diameter of the spinal cord
[Yone et al 1992] (Figure F.2.).
Strategies to improve reliability of MR imaging parameters (variation due to)
Clinicians
First, raters were blinded to clinical and neurologic data. Second, raters assessed
the same patients on four occasions (or rounds), three days apart from each other, to guard
against memory recall. Third, the scans were read individually and in random order. Fourth, for
validity of the experiment, the raters were given the same images on all four occasions.
Fifth, the teaching session prior to the first round of measurements was conducted in one
meeting to ensure consistency of image interpretation.
Patients
To ensure a range of symptom severity for reliability testing, the modified
version of the Japanese Orthopaedic Association Scale (mJOA) (Table C.1) was used to
classify CSM into mild (mJOA score >=15), moderate (mJOA score 12-14) and severe
(mJOA score <12) degrees of functional disability. Of the seventeen subjects in this study, six
individuals had mild CSM (mJOA score >=15), five individuals had moderate CSM
(mJOA score 12-14) and six individuals had severe CSM (mJOA score <12). As
described in Table F.1., the cases had varying numbers of levels of cord compression due
to a variety of different pathologies, which are commonly seen in clinical practice
including spondylosis, disc herniation, ossification of the posterior longitudinal ligament,
hypertrophy of the ligamentum flavum, degenerative subluxation and congenital stenosis.
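The severity grouping described above can be written as a small function. This is a sketch: the cut-offs are taken from the text, the 0-18 range is the sum of the subscale maxima in the mJOA scale table (5 + 7 + 3 + 3), and the function name `classify_csm_severity` is hypothetical.

```python
def classify_csm_severity(mjoa_score):
    """Map a total mJOA score (0-18) to the severity groups used in the text."""
    if not 0 <= mjoa_score <= 18:
        raise ValueError("mJOA total score must be between 0 and 18")
    if mjoa_score >= 15:
        return "mild"
    if mjoa_score >= 12:
        return "moderate"
    return "severe"


for score in (16, 13, 9):
    print(score, classify_csm_severity(score))
```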
MR Imaging protocol
The preoperative mid-sagittal T1-weighted, axial and midsagittal T2-weighted
MRI series of all patients were included in a CD-ROM with eFilm Lite (2003) and
0.01]). This observation is also supported by consistently increased level of agreement
among four raters from Session 1 to Session 4 (Table F.4). However, the time differences
are illustrated as normal fluctuations (i.e. random error) (Figure F.3), indicating that
there is no systematic error in the data.
The data in Tables F.5-F.8 show that the effect of rater is
statistically significant in all four methods of spine and canal stenosis
assessment based on three-way ANOVA with Bonferroni post-hoc analysis ([MSCC,
p<0.0001], [MCC, p<0.0001], [AP, p=0.0008], [TA, p<0.0001]).
Table F.5. Analysis of Variance summary table for maximum spinal cord compression
(MSCC) measurements data set
Source of variation    Df   MS             F       Sig
Between subjects       16   68.58 (BMS)    15.30   <0.0001
Within subjects
  Between raters       3    53.37 (RMS)    11.90   <0.0001
  Between times        3    5.76 (TMS)     1.28    0.2813
  Rater*time           9    3.41 (RTMS)    0.76    0.6520
  Rater*subject        48   9.05 (RSMS)    2.20    0.0004
Error                       4.48 (EMS)
Table F.6. Analysis of Variance summary table for maximum canal compromise (MCC)
measurements data set
Source of variation    Df   MS              F       Sig
Between subjects       16   75.21 (BMS)     15.20   <0.0001
Within subjects
  Between raters       3    193.49 (RMS)    39.09   <0.0001
  Between times        3    5.46 (TMS)      1.10    0.3489
  Rater*time           9    5.75 (RTMS)     1.16    0.3212
  Rater*subject        48   20.76 (RSMS)    4.20    <0.0001
Error                       4.95 (EMS)
Table F.7. Analysis of Variance summary table for transverse area of spinal cord (TA)
measurements data set
Source of variation    Df   MS               F       Sig
Between subjects       16   3696.59 (BMS)    41.17   <0.0001
Within subjects
  Between raters       3    5094.48 (RMS)    56.74   <0.0001
  Between times        3    343.369 (TMS)    3.82    0.0108
  Rater*time           9    157.22 (RTMS)    1.75    0.08
  Rater*subject        48   700.00 (RSMS)    7.80    <0.0001
Error                       89.78 (EMS)
Table F.8. Analysis of Variance summary table for anteroposterior diameter (AP) of
spinal cord measurements data set
Source of variation    Df   MS               F       Sig
Between subjects       16   0.047 (BMS)      18.92   <0.0001
Within subjects
  Between raters       3    0.0148 (RMS)     5.85    0.0008
  Between times        3    0.005 (TMS)      1.99    0.1175
  Rater*time           9    0.0033 (RTMS)    1.33    0.2243
  Rater*subject        48   0.0068 (RSMS)    2.71    <0.0001
Error                       0.0025 (EMS)
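As a reading aid for the ANOVA summary tables, the sketch below recomputes F statistics from the mean squares of Table F.6 (MCC). It assumes each F is the ratio of the effect mean square to the error mean square (EMS), which the tabulated MCC values appear consistent with to within rounding. The variable names are hypothetical, and the model actually fitted may have used other error terms.

```python
# Mean squares transcribed from Table F.6 (MCC measurements).
mean_squares = {
    "Between subjects": 75.21,
    "Between raters": 193.49,
    "Between times": 5.46,
    "Rater*time": 5.75,
    "Rater*subject": 20.76,
}
error_ms = 4.95  # EMS from the same table

# Assumed relationship: F = effect mean square / error mean square.
f_ratios = {source: ms / error_ms for source, ms in mean_squares.items()}

for source, f_value in f_ratios.items():
    print(f"{source:18s} F = {f_value:.2f}")
```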
DISCUSSION AND CONCLUSION
This project enhances the understanding of the challenges of MRI interpretation in the
CSM population. First, the advantage of T2-weighted (T2W) imaging is that the bright CSF
provides visual contrast against the spinal cord. In contrast, T1-weighted (T1W) imaging shows
the anatomy of the bony canal and spinal cord indistinctly, as is typical in the CSM population.
This is likely why MSCC provides more reliable measurements than MCC, which is measured on
T1W images (Tables F.3-F.4). However, both measurement methods express the degree of
compression relative to the patient's own normal values. Second, the
software used to measure the transverse area and anteroposterior diameter of the spinal
cord is not sufficiently developed to give accurate estimates of spinal cord deformity.
For example, the software used for the anteroposterior diameter
measurements appeared to retain only one digit. We suspect that this repeated rounding
to one digit could cause a systematic build-up of error in the calculation of the ICC. Further
research should use more rigorous mathematical procedures.
In contrast to previously published studies (reference TA and AP), we
refined two published MR imaging techniques, the TA and AP diameter
methods, by improving the written instructions. Furlan et al. 2007
supported the hypothesis that magnified digitized images enhance the interrater and
intrarater reliability of MR imaging assessment techniques and thereby
reduce the procedural variability of the measurements. In our study, the MR scans were
consistently magnified across all cases. The lack of published, quantified intra- and
inter-rater reliability estimates for the measurement methods listed above limits further comparison.
Based on the findings of our study, variation in the severity of disease in the study
population, in clinicians’ experience and in individual approaches to reading MR images
appears to influence the procedural variability of measurements. Therefore, future studies should
report these details in their descriptions of study design and in their discussions. First, for all
four methods, the raters’ individual interpretations appeared to vary significantly
with CSM severity. Second, specialty training seems to influence the variability of
measurements. After reviewing the circumstances of the third rater’s consistently
higher ratings, it seems reasonable to speculate that the differences between raters could
have been influenced to some extent by specialty training (Table F.3). While all raters
were fellowship-trained spine surgeons, Rater 2 had an orthopaedic rather than a
neurosurgical residency training background. Third, raters employed some individual
approaches that were not apparent when the protocol was designed but that are crucial
for future studies. For one, a clinician may have an internal subjective standard as to what they
believe to be the anatomical midline of the spine on MR imaging. For another, this
internal subjective standard may fluctuate in the selection of the most compressed site,
partly because degenerative changes of the spine in CSM tend to involve
multiple levels.
Limitations
One limitation associated with the statistical analysis of reliability is the averaging of
ratings. When more than one measurement is performed, the mean of several trials is
usually used to estimate reliability. Averaging data can increase the reliability coefficient
by minimizing the magnitude of differences between measurements. In our study,
reliability is reported for the mean of all trials. Yet practitioners typically administer a
single trial when taking a measurement.
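The effect of averaging on a reliability coefficient can be illustrated with the Spearman-Brown formula, which relates the reliability of a single measurement to that of the mean of k measurements. This is a sketch under assumptions: the formula holds exactly only for certain ICC forms (for example the one-way and consistency forms), the single-trial value of 0.60 is hypothetical, and no such calculation is reported in this study.

```python
def spearman_brown(single_reliability, k):
    """Reliability of the mean of k measurements, given the
    reliability of a single measurement."""
    if k < 1:
        raise ValueError("k must be at least 1")
    r = single_reliability
    return k * r / (1 + (k - 1) * r)


# Hypothetical single-trial reliability of 0.60, averaged over 4 trials.
single = 0.60
averaged = spearman_brown(single, k=4)
print(f"single trial: {single:.2f}, mean of 4 trials: {averaged:.2f}")
```

The averaged coefficient is higher than the single-trial one, which is why reliability reported for the mean of all trials may overstate what a practitioner obtains from one measurement.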
There are some limitations of our study design that are potential sources
of increased inter-observer variation and, therefore, reduced reliability. First, a study
with a single recruitment centre might systematically under- or
overestimate measurement errors because of the particular characteristics of its patients. Second, the
position of patients during MR scanning might affect the results. When the
positioning is changed slightly from flexion to extension, the cross-sectional
area of the dural sac diminishes. Despite careful selection of images, at least one instance of abnormal
positioning was recognized. Third, variation due to non-standardized features of the
imaging protocol, such as different slice thicknesses of MRI scans, might affect the results.
Although not all MR images had the same slice thickness, which might have
introduced some bias, the majority of scans (11/17) had a slice thickness of 2.50 mm and the rest
had a thickness of 3 mm. Nevertheless, the methods used for the scans in this study
reflected the typical protocols available during the study period. Fourth, the clinicians’ differing
areas of expertise and training at different institutions are another potential limitation. However, we
anticipate that these limitations are relatively minor and reflect real-world issues.
References:
1. Montgomery, D.M. and R.S. Brower, Cervical spondylotic myelopathy. Clinical syndrome and natural history. Orthopedic Clinics of North America, 1992. 23(3): p. 487-93.
2. Chen, C.J., et al., Intramedullary high signal intensity on T2-weighted MR images in cervical spondylotic myelopathy: prediction of prognosis with type of intensity. Radiology, 2001. 221(3): p. 789-94.
3. Okada, Y., et al., Magnetic resonance imaging study on the results of surgery for cervical compression myelopathy. Spine, 1993. 18(14): p. 2024-9.
4. Morio, Y., et al., Correlation between operative outcomes of cervical compression myelopathy and MRI of the spinal cord. Spine, 2001. 26(11): p. 1238-45.
5. Fukushima, T., et al., Magnetic resonance imaging study on spinal cord plasticity in patients with cervical compression myelopathy. Spine, 1991. 16(10 Suppl): p. S534-8.
6. Feinstein, A., Clinical biostatistics: XLI. Hard science, soft data, and the challenges of choosing clinical variables in research. Clinical Pharmacology & Therapeutics, 1977. 22(0): p. 485–498.
7. Henrica C.W. de Vet, C.B.T., Dirk L. Knol, Lex M. Bouter, When to use agreement versus reliability measures. Journal of Clinical Epidemiology, 2006. 59: p. 1033–1039.
8. Wright J.G., F.A.R., Improving the reliability of orthopaedic measurements. The Journal of Bone and Joint Surgery, 1992. 74B(2): p. 287-291.
9. Fehlings MG, F.J., Massicotte EM, et al., Interobserver and intraobserver reliability of maximum canal compromise and spinal cord compression for evaluation of acute traumatic cervical spinal cord injury. Spine, 2006. 31: p. 1719–1725.
10. Bednarik, J., Kadanka, Z., Dusek, L., Kerkovsky, M., Vohanka, S., Novotny, O., Urbanek, I., Kratochvilova, D., Presymptomatic spondylotic cervical myelopathy: an updated predictive model. European Spine Journal, 2008. 17: p. 421–431.
11. Kraemer HC, K.A., Statistical alternatives in assessing reliability, consistency and individual differences for quantitative measures: application to behavioral measures of neonates. Psychol Bull, 1976. 83: p. 914–921.
12. Walter S.D., E., M., Donner, A., Sample size and optimal designs for reliability studies. Statistics in Medicine, 1998. 17: p. 101-110.
13. Shrout, P.E., Fleiss, J.L., Intraclass Correlations: Uses in Assessing Rater Reliability. Psychological Bulletin, 1979. 86(2): p. 420-428.
14. Fleiss JL, C.J., The equivalence of weighted kappa and intraclass correlation coefficient as measures of reliability. Educ Psychol Meas, 1973. 2: p. 113–117.
15. Burdock EIF, H.A., A new view of interobserver agreement. Perspect Psychol, 1963. 16: p. 373–384.
16. Morris, R., ed. Assessing the reliability of clinical measurement. 1st ed. 1997, Oxford: Butterworth-Heinemann. p. 1-18.
17. Weir, J.P., Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. Journal of Strength and Conditioning Research, 2005. 19(1): p. 231–240.
18. Furlan, J.C., Fehlings, M.G., Massicotte, E.M., Aarabi, B., Vaccaro, A.R., Bono, C.R., Madrazo, I., Villanueva, C., Grauer, J.N., Mikulis, M., A quantitative and reproducible method to assess cord compression and canal stenosis after cervical spine trauma. Spine, 2007. 32: p. 2083–2091.
19. Singh, A., et al., Clinical and radiological correlates of severity and surgery-related outcome in cervical spondylosis. Journal of Neurosurgery, 2001. 94(2 Suppl): p. 189-98.
20. Boutin RD, S.L., Finnesey K., MR imaging of degenerative diseases in the cervical spine. Magn Reson Imaging Clin N Am, 2000. 8: p. 471-490.
21. Emery, S., Cervical spondylotic myelopathy: diagnosis and treatment. J Am Acad Orthop Surg, 2001. 9: p. 376-88.
Figure F.1: Measurements for the maximum spinal cord compression (MSCC) using T2-weighted MRI [Da,Dx,Db] and maximum canal compromise (MCC) using T1-weighted MRI [da,dx,db].
Figure F. 2: Measurements for the anteroposterior diameter (AP) and drawing of the transverse area (TA) of spinal cord using axial T2-weighted MRI.
Figure F. 3: These graphs illustrate that there was not a time dependency
(learning/fatigue) of the MCC, MSCC, AP and TA measurements for spine and canal
stenosis assessments.
Appendix 4 Grade of recommendation: Levels of Evidence Table (2002).
Grade of recommendation
Level of Evidence
Therapy: Whether a treatment is efficacious/ effective/harmful
Therapy: Whether a drug is superior to another drug in its same class
Prognosis Diagnosis Differential diagnosis/symptom prevalence study Economic and decision analysis
1a
SR (with homogeneity*) of RCTs
SR (with homogeneity**) of head-to-head RCTs
SR (with homogeneity*) of inception cohort studies;CDR† validated in different populations
SR (with homogeneity*) of Level 1 diagnostic studies;CDR† with 1b studies from different clinical centres
SR (with homogeneity*) of prospective cohort studies
SR (with homogeneity*) of Level 1 economic studies
1b
Individual RCT (with narrow Confidence Interval‡)
Within a head-to-head RCT with clinically important outcomes
Individual inception cohort study with > 80% follow-up; CDR† validated in a single population
Validating** cohort study with good††† reference standards; or CDR† tested within one clinical centre
Prospective cohort study with good follow-up****
Analysis based on clinically sensible costs or alternatives; systematic review(s) of the evidence; and including multi-way sensitivity analyses
A
1c All or none§ All or none case-series Absolute SpPins and SnNouts†† All or none case-series Absolute better-value or worse-value analyses‡‡
2a SR (with homogeneity*) of cohort studies
Within a head-to-head RCT with validated surrogate outcomes‡‡‡
SR (with homogeneity*) of either retrospective cohort studies or untreated control groups in RCTs
SR (with homogeneity*) of Level >2 diagnostic studies
SR (with homogeneity*) of 2b and better studies
SR (with homogeneity*) of Level >2 economic studies
2b
Individual cohort study (including low quality RCT; e.g., <80% follow-up)
Across RCTs of different drugs v. placebo in similar or different patients with clinically important or validated surrogate outcomes
Retrospective cohort study or follow-up of untreated control patients in an RCT; Derivation of CDR† or validated on split-sample§§§ only
Exploratory** cohort study with good††† reference standards; CDR† after derivation, or validated only on split-sample§§§ or databases
Retrospective cohort study, or poor follow-up
Analysis based on clinically sensible costs or alternatives; limited review(s) of the evidence, or single studies; and including multi-way sensitivity analyses
2c "Outcomes" Research; Ecological studies
"Outcomes" Research Ecological studies Audit or outcomes research
3a
SR (with homogeneity*) of case-control studies
Across subgroup analyses from RCTs of different drugs v. placebo in similar or different patients, with clinically important or validated surrogate outcome
SR (with homogeneity*) of 3b and better studies
SR (with homogeneity*) of 3b and better studies
SR (with homogeneity*) of 3b and better studies
B
3b
Individual Case-Control Study
Across RCTs of different drugs v. placebo in similar or different patients but with unvalidated surrogate outcomes
Non-consecutive study; or without consistently applied reference standards
Non-consecutive cohort study, or very limited population
Analysis based on limited alternatives or costs, poor quality estimates of data, but including sensitivity analyses incorporating clinically sensible variations.
C 4
Case-series (and poor quality cohort and case-control studies§§)
Between non-randomised studies (observational studies and administrative database research) with clinically important outcomes
Source: Sackett DL, Straus SE, Richardson WS, Rosenberg WM, Haynes RB (2000) Evidence-based medicine: how to practice and teach EBM. Toronto: Churchill Livingstone.
1. These levels were generated in a series of iterations among members of the NHS R&D Centre for Evidence-Based Medicine (Bob Phillips, Chris Ball, Dave Sackett, Brian Haynes, Sharon Straus and Finlay McAlister).
2. Users can add a minus-sign "-" to denote a level of evidence that fails to provide a conclusive answer because of EITHER a single result with a wide Confidence Interval (such that, for example, an ARR in an RCT is not statistically significant but whose confidence intervals fail to exclude clinically important benefit or harm) OR a Systematic Review with troublesome (and statistically significant) heterogeneity.
3. Grades of recommendation are shown as linked directly to a level of evidence. However, levels speak only of the validity of a study, not its clinical applicability. Other factors need to be taken into account (such as cost, ease of implementation, importance of the disease) before determining a grade. Grades that are currently in the guides link closely to the validity of the evidence - these will change over time to reflect better concerns that we highlight in the text of the guide or related CATs.
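As an illustration of comment 2 only, the sketch below computes an absolute risk reduction (ARR) with an approximate 95% confidence interval and checks whether the result would attract a minus-sign. The Wald (normal approximation) interval, the function names, the example counts and the threshold for a "clinically important" difference are all assumptions made for this example; none of them come from the table itself.

```python
import math


def arr_with_ci(events_control, n_control, events_treated, n_treated, z=1.96):
    """Return (ARR, lower, upper) using a Wald interval (an assumed method)."""
    cer = events_control / n_control    # control event rate
    eer = events_treated / n_treated    # experimental event rate
    arr = cer - eer                     # absolute risk reduction
    se = math.sqrt(cer * (1 - cer) / n_control + eer * (1 - eer) / n_treated)
    return arr, arr - z * se, arr + z * se


def needs_minus_sign(ci_low, ci_high, important_difference):
    """True when the interval includes zero (not statistically significant)
    but still reaches a clinically important benefit or harm.

    important_difference is a hypothetical, user-chosen threshold.
    """
    not_significant = ci_low <= 0.0 <= ci_high
    cannot_exclude_benefit = ci_high >= important_difference
    cannot_exclude_harm = ci_low <= -important_difference
    return not_significant and (cannot_exclude_benefit or cannot_exclude_harm)


# Hypothetical small trial: 12/40 events on control, 8/40 on treatment.
arr, low, high = arr_with_ci(12, 40, 8, 40)
print(round(arr, 3), round(low, 3), round(high, 3))
print(needs_minus_sign(low, high, important_difference=0.05))
```

With the same event rates in a much larger hypothetical trial, the interval narrows enough to exclude zero, so the minus-sign would no longer apply under this sketch.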
Notes
* By homogeneity we mean a systematic review that is free of worrisome variations (heterogeneity) in the directions and degrees of results between individual studies. Not all systematic reviews with statistically significant heterogeneity need be worrisome, and not all worrisome heterogeneity need be statistically significant. As noted above, studies displaying worrisome heterogeneity should be tagged with a "-" at the end of their designated level.
† Clinical Decision Rule. (These are algorithms or scoring systems which lead to a prognostic estimation or a diagnostic category)
‡ See comment #2 for advice on how to understand, rate and use trials or other studies with wide confidence intervals.
§ Met when all patients died before the Rx became available, but some now survive on it; or when some patients died before the Rx became available, but none now die on it.
§§ By poor quality cohort study we mean one that failed to clearly define comparison groups and/or failed to measure exposures and outcomes in the same (preferably blinded), objective way in both exposed and non-exposed individuals and/or failed to identify or appropriately control known confounders and/or failed to carry out a sufficiently long and complete follow-up of patients. By poor quality case-control study we mean one that failed to clearly define comparison groups and/or failed to measure exposures and outcomes in the same (preferably blinded), objective way in both cases and controls and/or failed to identify or appropriately control known confounders.
§§§ Split-sample validation is achieved by collecting all the information in a single tranche, then artificially dividing this into "derivation" and "validation" samples.
†† An "Absolute SpPin" is a diagnostic finding whose Specificity is so high that a Positive result rules-in the diagnosis. An "Absolute SnNout" is a diagnostic finding whose Sensitivity is so high that a Negative result rules-out the diagnosis.
‡‡ Better-value treatments are clearly as good but cheaper, or better at the same or reduced cost. Worse-value treatments are as good and more expensive, or worse and equally or more expensive.
††† Good reference standards are independent of the test, and applied blindly or objectively to all patients. Poor reference standards are haphazardly applied, but still independent of the test. Use of a non-independent reference standard (where the 'test' is included in the 'reference', or where the 'testing' affects the 'reference') implies a level 4 study.
** Validating studies test the quality of a specific diagnostic test, based on prior evidence. An exploratory study collects information and trawls the data (e.g. using a regression analysis) to find which factors are 'significant'.
*** By poor quality prognostic cohort study we mean one in which sampling was biased in favour of patients who already had the target outcome, or the measurement of outcomes was accomplished in <80% of study patients, or outcomes were determined in an unblinded, non-objective way, or there was no correction for confounding factors.
**** Good follow-up in a differential diagnosis study is >80%, with adequate time for alternative diagnoses to emerge (e.g. 1-6 months acute, 1-5 years chronic).
‡‡‡ Surrogate outcomes are considered validated only when the relationship between the surrogate outcome and the clinically important outcomes has been established in long-term RCTs.
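The split-sample validation described in note §§§ can be sketched as follows. This is a minimal illustration, assuming a simple random 50/50 division; the function name, the fixed seed and the split fraction are hypothetical choices, not part of the source definition.

```python
import random


def split_sample(records, derivation_fraction=0.5, seed=0):
    """Divide one tranche of records into derivation and validation samples.

    The random shuffle and the default 50/50 split are assumptions;
    the note only requires that a single data set be divided artificially.
    """
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = list(records)        # copy, leaving the caller's data untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * derivation_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical patient identifiers collected in a single tranche.
patients = list(range(100))
derivation, validation = split_sample(patients)
print(len(derivation), len(validation))
```

Because both samples come from the same tranche, a rule validated this way has not been tested in an independent population, which is consistent with the table placing split-sample validation at level 2b rather than 1b.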