The Neglected “R” in the Risk-Needs-Responsivity Model: A New Approach for Assessing Responsivity to Correctional Interventions

Authors

Grant Duwe, Ph.D.
Research Director
1450 Energy Park Drive, Suite 200
St. Paul, MN 55108-5219
Email: [email protected]

KiDeuk Kim, Ph.D.
Senior Fellow, Urban Institute
2100 M Street NW
Washington, DC 20037
Email: [email protected]

1450 Energy Park Drive, Suite 200
St. Paul, Minnesota 55108-5219
651/361-7200
TTY 800/627-3529
www.doc.state.mn.us
January 2019

This information will be made available in alternative format upon request.
Printed on recycled paper with at least 10 percent post-consumer waste
The Neglected “R” in the Risk-Needs-Responsivity Model: A New
Approach for Assessing Responsivity to Correctional Interventions
In this study, we introduce a more rigorous, actuarial approach for assessing both
general and specific responsivity. In particular, we assess responsivity by estimating the
likelihood that an individual’s participation in an intervention will result in desistance. If
an individual participated in, say, CD treatment, what is the probability it would lead to
desistance? As we demonstrate later, this approach to responsivity assessment accounts not
only for the efficacy of an intervention but also for the varying effects an intervention
has on individuals. We propose that, by improving the process by which individuals are
assigned to interventions, an actuarial approach to responsivity assessment can help
achieve better recidivism outcomes.
Chemical Dependency Treatment in MnDOC
Shortly after their admission to prison in Minnesota, prisoners with at least six
months to serve in prison undergo a brief (20-40 minutes) chemical dependency (CD)
assessment conducted by a licensed assessor. CD assessors use DSM-IV criteria for
substance abuse in their diagnoses, which are based on both self-report and collateral
information. The criteria for abuse include problems at work or school, not taking care of
personal responsibilities, financial problems, engaging in dangerous behavior while
intoxicated, legal problems, problems at home or in relationships, and continued use
despite experiencing problems. The criteria for dependence, on the other hand, include
increased tolerance; withdrawal symptoms; greater use than intended over a relatively
long period of time; inability to cut down or quit; a lot of time spent acquiring, using, or
recovering from use; missing important family, work, or social activities; and knowledge
that continued use would exacerbate a serious medical or psychological condition. After
completing the assessment, CD assessors assign prisoners a rating of no need, moderate
need, or high need for CD treatment.
Because most newly admitted offenders are considered to be chemically
abusive or dependent, the number of prisoners directed to CD treatment greatly exceeds
the number of CD treatment beds available. In fact, among prisoners who receive a CD
assessment, roughly one-fourth enter CD treatment during their confinement. As a result,
the Minnesota Department of Corrections (MnDOC) has used a relatively simple,
summative algorithm to prioritize prisoners for CD treatment.
The algorithm produces a score that ranges from a low of 0 points to a high of 40
points. Of the 40 possible points, 10 are based on the assessed need for CD treatment.
More specifically, prisoners are given 0 points for no need, 5 points for moderate need,
and 10 points for high need. Likewise, 10 of the 40 points are based on assessed
recidivism risk. As noted below, our sample contains prisoners released from Minnesota
prisons between 2003 and 2011. During this time, the MnDOC used the Level of Service
Inventory-Revised (LSI-R) to assess recidivism risk. Depending on their LSI-R score,
prisoners were given either 10 points (very high risk), 7 points (high risk), 4 points
(medium risk), or 0 points (low risk) for recidivism risk.
While risk and needs make up half of the 40 points in the algorithm, the offense
for which offenders are imprisoned accounts for a total of 10 points. In particular, offenders
in prison for a felony DWI are given 10 points while those in prison for other offenses
receive 0 points. The final 10 points cover items related to factors such as mental illness,
traumatic brain injury, and a history of assaultive behavior. Based on the score (ranging
from 0 to 40) from the algorithm, prisoners are then given a CD treatment priority level
of 1 (score of 20 or higher), 2 (score between 14 and 19), or 3 (score of 13 or lower).
Priority level 1 prisoners are most likely to receive a CD treatment offer, followed by
those in priority level 2 and priority level 3.
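
The summative scoring just described can be sketched in a few lines of Python. This is an illustration of the point values stated in the text, not the MnDOC's actual implementation. The function names and category labels are hypothetical, and because the text does not say how the final 10 points are divided among mental illness, traumatic brain injury, and assaultive history, the sketch takes that component as a single number supplied by the caller.

```python
def cd_priority_score(need, risk, felony_dwi, other_points):
    """Sum the four 10-point components described in the text (0 to 40)."""
    need_points = {"none": 0, "moderate": 5, "high": 10}[need]
    risk_points = {"low": 0, "medium": 4, "high": 7, "very high": 10}[risk]
    offense_points = 10 if felony_dwi else 0
    # Assumption: the remaining component is passed in as one 0-10 value,
    # since its internal breakdown is not given in the text.
    if not 0 <= other_points <= 10:
        raise ValueError("other_points must be between 0 and 10")
    return need_points + risk_points + offense_points + other_points


def cd_priority_level(score):
    """Map a 0-40 score to CD treatment priority level 1, 2, or 3."""
    if score >= 20:
        return 1
    if score >= 14:
        return 2
    return 3


# Hypothetical prisoner: high need, medium risk, no felony DWI, 3 other points.
score = cd_priority_score("high", "medium", False, 3)
print(score, cd_priority_level(score))  # 17 2
```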
A prior evaluation of the MnDOC's CD treatment showed it is effective in
reducing recidivism. Using propensity score matching to pair 926 treated offenders
released in 2005 with 926 untreated inmates, Duwe (2010) found that
treatment decreased the risk of recidivism by 17 percent for rearrest, 21 percent for
reconviction, and 25 percent for reimprisonment for a new felony offense. Moreover,
consistent with earlier research (Wexler et al., 1990), the results showed that increased
treatment time appeared to lower the risk of recidivism, but only up to a point. While
short-term (90 days) and medium-term (180 days) programs had a statistically significant
impact on all three recidivism measures, no significant effects were found for long-term
(365 days) programming.
Data and Method
Our overall sample consists of 23,034 offenders released from Minnesota prisons
between 2003 and 2011 who had been assessed for chemical dependency. Within this
sample, there were 2,314 females and 20,720 males. Each of the 23,034 prisoners was
given a treatment need level from one of three categories (high need for treatment,
moderate need, or no need) as well as a CD treatment priority level. Of the 23,034
prisoners, a total of 5,414 (24 percent) participated in CD treatment during their
confinement prior to their release.
While the treatment need level (high, moderate or no need) provides us with the
assessed CD treatment needs for the prisoners in our sample, we developed predictive
models for assessments of recidivism risk and responsivity. More specifically, because
there are important gender differences with respect to risk and needs, we initially
separated our overall sample into males (N = 20,720) and females (N = 2,314). Next, we
separated these samples into three sets by the year prisoners were released from prison.
Our first set, the training set, consisted of individuals (10,517 males and 1,250
females) released from Minnesota prisons between 2003 and 2007. Our second set, the
test set, contained individuals (4,876 males and 555 females) released from prison
between 2008 and 2009. Our final set, the validation set, consisted of individuals (5,327
males and 509 females) released from prison in either 2010 or 2011.
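
The release-year split described above amounts to a simple lookup. The sketch below is illustrative only; the function name is hypothetical and the release years in the usage lines are made up.

```python
def assign_split(release_year):
    """Return the data set a prisoner falls into, based on release year."""
    if 2003 <= release_year <= 2007:
        return "training"
    if release_year in (2008, 2009):
        return "test"
    if release_year in (2010, 2011):
        return "validation"
    raise ValueError(f"release year {release_year} is outside 2003-2011")


# Hypothetical release years, for illustration only.
for year in (2003, 2007, 2008, 2011):
    print(year, assign_split(year))
```

Splitting by time rather than at random means the models are built on earlier release cohorts and evaluated on later ones.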
Focusing first on the assessment of recidivism risk, we developed predictive
models on the training set data. As shown in Tables 1 and 2, our dataset contained a total
of 36 predictors that are available when a MnDOC prisoner goes through intake at the
time of admission to prison. These predictors encompass items commonly found to be
predictive of recidivism, such as criminal history, age at release, gang affiliation, and
marital status. We also included items such as prison admission type and offense type. Our
measure of recidivism is reconviction for a misdemeanor, gross misdemeanor, or felony
within three years of release from prison. We obtained reconviction data on all 23,034
prisoners from the Minnesota Bureau of Criminal Apprehension. As shown in Tables 1
and 2, females had lower recidivism rates than males.

Table 1. Descriptive Statistics for Male Prisoner Sample

Predictor | Description | Training Mean | Training SD | Test Mean | Test SD | Validation Mean | Validation SD
Static/Criminal History
Total Convictions | Total # of convictions (any offense level) | 11.05 | 8.47 | 13.05 | 9.49 | 14.15 | 10.46
Felony Convictions | Total # of felony convictions | 1.70 | 1.68 | 2.29 | 2.05 | 2.82 | 2.38
Felony Specialization/Diversity | Degree of specialization/diversity in felony offenses | 0.86 | 0.26 | 0.87 | 0.24 | 0.85 | 0.25
Violent Convictions | Total # of violent offense convictions | 1.46 | 1.81 | 1.69 | 2.04 | 1.87 | 2.09
Violent Specialization/Diversity | Degree of specialization/diversity in violent offenses | 0.91 | 0.21 | 0.92 | 0.19 | 0.92 | 0.19
Total Assault Convictions | Total # of assault offense convictions | 0.93 | 1.51 | 1.11 | 1.70 | 1.24 | 1.77
Total Robbery Convictions | Total # of robbery convictions | 0.17 | 0.57 | 0.20 | 0.68 | 0.20 | 0.67
VOFP Convictions | Total VOFP, stalking and harassment convictions | 0.14 | 0.54 | 0.22 | 0.73 | 0.32 | 0.89
Disorderly Conduct Convictions | Total # of disorderly conduct convictions | 0.09 | 0.34 | 0.15 | 0.46 | 0.24 | 0.63
Prostitution Convictions | Total # of prostitution offense convictions | 0.01 | 0.12 | 0.01 | 0.14 | 0.01 | 0.15
Drug Offense Convictions | Total # of drug offense convictions | 0.99 | 1.34 | 1.11 | 1.49 | 1.12 | 1.60
Drug Offense Specialization/Diversity | Degree of specialization/diversity in drug offenses | 0.93 | 0.18 | 0.95 | 0.15 | 0.96 | 0.14
False Information to Police Cons. | Total # of false information to police convictions | 0.44 | 0.90 | 0.52 | 0.99 | 0.55 | 1.03
Flee/Escape Convictions | Total # of flee/escape police convictions | 0.23 | 0.61 | 0.27 | 0.66 | 0.31 | 0.71
Weapons Offense Convictions | Total # of weapons offense convictions | 0.08 | 0.32 | 0.10 | 0.36 | 0.12 | 0.40
Total Property Convictions | Total # of property offense convictions | 2.91 | 4.09 | 3.03 | 4.38 | 3.29 | 4.77
Property Offense Specialization/Diversity | Degree of specialization/diversity in prop. offenses | 0.89 | 0.19 | 0.92 | 0.15 | 0.92 | 0.15
Driving While Intoxicated (DWI) Convictions | Total # of DWI convictions | 0.29 | 0.67 | 0.59 | 1.03 | 0.66 | 1.12
Failure to Register (FTR) Convictions | Total # of FTR convictions | 0.05 | 0.28 | 0.08 | 0.36 | 0.09 | 0.40
Total Supervision Failures | Total # of revocations on probation and parole | 1.08 | 1.36 | 1.17 | 1.50 | 1.30 | 1.63
Intake
Metro County of Commitment | Commit from Twin Cities metro-area county | 0.51 | 0.50 | 0.51 | 0.50 | 0.51 | 0.50
Length of Stay in Prison (Months) | Difference in months between admission and release | 18.72 | 16.15 | 21.14 | 20.83 | 22.03 | 22.90
New Court Commitment | Admitted to prison directly from court | 0.60 | 0.49 | 0.65 | 0.48 | 0.68 | 0.47
Probation Violator | Admitted to prison for probation violation | 0.36 | 0.48 | 0.34 | 0.47 | 0.31 | 0.46
Release Violator | Admitted to prison for parole violation | 0.04 | 0.21 | 0.01 | 0.12 | 0.01 | 0.11
Person Offense | Most serious index offense is person offense = 1 | 0.21 | 0.41 | 0.22 | 0.41 | 0.23 | 0.42
Sex Offense | Most serious index offense is sex offense = 1 | 0.08 | 0.26 | 0.06 | 0.24 | 0.06 | 0.24
Drug Offense | Most serious index offense is drugs = 1 | 0.30 | 0.46 | 0.27 | 0.45 | 0.24 | 0.43
Property Offense | Most serious index offense is property = 1 | 0.24 | 0.42 | 0.19 | 0.39 | 0.17 | 0.38
DWI Offense | Most serious index offense is DWI = 1 | 0.05 | 0.21 | 0.12 | 0.32 | 0.09 | 0.29
Other Offense | Most serious index offense is “Other” offense = 1 | 0.13 | 0.34 | 0.15 | 0.35 | 0.20 | 0.40
Suicidal History | Suicidal history = 1; no history = 0 | 0.11 | 0.32 | 0.16 | 0.36 | 0.17 | 0.37
Security Threat Group (STG) | Total # of STG criteria (0-10) | 0.92 | 1.69 | 0.88 | 1.66 | 0.94 | 1.69
Marital Status | Married = 1; unmarried = 0 | 0.11 | 0.32 | 0.11 | 0.31 | 0.11 | 0.31
Age at Release | Age in years at time of release | 33.15 | 9.36 | 34.51 | 9.85 | 34.60 | 10.02
Unsupervised Release | Released to no correctional supervision | 0.02 | 0.15 | 0.01 | 0.12 | 0.01 | 0.10
Recidivism within 3 Years
General Recidivism | Reconviction for misd., gross misd., or felony | 0.68 | 0.47 | 0.62 | 0.49 | 0.63 | 0.48
N | | 10,517 | | 4,876 | | 5,327 |

For both males and females, we used two different classification methods, logistic
regression and random forests, to develop predictive models on the training set.
Over the last few decades, regression modeling has been increasingly used to develop
prediction tools in the criminal justice field (Brennan and Oliver, 2000; Duwe, 2012;
Duwe, 2014; Duwe and Freske, 2012; Lowenkamp and Whetzel, 2009), while the use of
machine learning algorithms such as random forests has been more recent (Barnes and
Hyatt, 2012). Created by Breiman (2001), random forests is an ensemble method that
involves growing a forest of many trees, each of which is grown on an independent
bootstrap sample from the training data. At each node of a tree, only a random subset of
the predictor variables is made available, and the algorithm finds the best split among the
selected predictor variables. The trees are grown to a maximum depth, and a consensus
prediction is obtained by a vote across the trees.
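
The randomization and voting steps in this description can be sketched with the Python standard library alone. This is a minimal illustration of the ingredients, not the software used in the study: the function names and the number of candidate predictors `k` are hypothetical, and the tree-growing step itself is omitted.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement; each tree gets its own sample."""
    return [rng.choice(rows) for _ in rows]


def candidate_predictors(predictors, k, rng):
    """Pick k predictors at random (without replacement) for one node split."""
    return rng.sample(predictors, k)


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the illustration is repeatable
rows = list(range(10))  # stand-in for training records
predictors = ["total_convictions", "age_at_release", "stg", "marital_status"]

print(bootstrap_sample(rows, rng))
print(candidate_predictors(predictors, 2, rng))
print(majority_vote([1, 0, 1, 1, 0]))  # 1
```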
Recent research has advocated testing multiple classification methods when
developing a predictive model (Duwe and Kim, 2016; Ridgeway, 2013), given that there
is no single best algorithm that yields the best performance in every situation (Caruana
and Niculescu-Mizil, 2006; Wolpert, 1996).1 To identify the best predictive models, we
evaluated performance on the test set data. After doing so, we then applied the best-
performing models to the validation set data. In the validation sets, each prisoner received
a predicted probability that reflects his or her likelihood of recidivating within three years
of release from prison.

1 Prior research has provided mixed evidence on the performance of machine learning algorithms, such as
random forests, versus older, more traditional approaches like logistic regression. Some studies have found
little or no difference between these two sets of classification methods (Hamilton et al., 2015; Liu et al.,
2011; Tollenaar and van der Heijden, 2013), whereas others have observed a performance advantage for
machine learning approaches (Berk and Bleich, 2013; Caruana et al., 2006; Duwe and Kim, 2015, 2016;
Hess and Turner, 2013). The evidence seems to be clearer that statistical and machine learning algorithms
outperform simplistic, Burgess-style methods (Duwe and Kim, 2016). Given the fact there is no single best
algorithm that performs the best in every situation, research has advocated testing multiple algorithms
(Duwe and Kim, 2015, 2016; Ridgeway, 2013), which is the approach we have followed here.

Table 2. Descriptive Statistics for Female Prisoner Sample

Predictor | Description | Training Mean | Training SD | Test Mean | Test SD | Validation Mean | Validation SD
Static/Criminal History
Total Convictions | Total # of convictions (any offense level) | 9.24 | 7.67 | 10.75 | 9.38 | 11.86 | 10.23
Felony Convictions | Total # of felony convictions | 1.58 | 1.58 | 2.20 | 2.07 | 2.66 | 2.47
Felony Specialization/Diversity | Degree of specialization/diversity in felony offenses | 0.84 | 0.29 | 0.83 | 0.27 | 0.81 | 0.28
Violent Convictions | Total # of violent offense convictions | 0.86 | 1.89 | 0.80 | 1.49 | 0.98 | 1.98
Violent Specialization/Diversity | Degree of specialization/diversity in violent offenses | 0.94 | 0.19 | 0.96 | 0.13 | 0.94 | 0.18
Total Assault Convictions | Total # of assault offense convictions | 0.37 | 0.91 | 0.41 | 0.99 | 0.53 | 1.35
Total Robbery Convictions | Total # of robbery convictions | 0.07 | 0.32 | 0.07 | 0.30 | 0.10 | 0.48
VOFP Convictions | Total VOFP, stalking and harassment convictions | 0.03 | 0.23 | 0.05 | 0.31 | 0.08 | 0.39
Disorderly Conduct Convictions | Total # of disorderly conduct convictions | 0.04 | 0.22 | 0.11 | 0.41 | 0.19 | 0.54
Prostitution Convictions | Total # of prostitution offense convictions | 0.34 | 1.46 | 0.18 | 0.88 | 0.24 | 1.17
Drug Offense Convictions | Total # of drug offense convictions | 1.11 | 1.35 | 1.21 | 1.61 | 1.31 | 1.74
Drug Offense Specialization/Diversity | Degree of specialization/diversity in drug offenses | 0.89 | 0.25 | 0.91 | 0.21 | 0.92 | 0.19
False Information to Police Cons. | Total # of false information to police convictions | 0.49 | 0.97 | 0.64 | 1.29 | 0.56 | 1.05
Flee/Escape Convictions | Total # of flee/escape police convictions | 0.10 | 0.36 | 0.10 | 0.38 | 0.08 | 0.30
Weapons Offense Convictions | Total # of weapons offense convictions | 0.01 | 0.09 | 0.01 | 0.15 | 0.02 | 0.15
Total Property Convictions | Total # of property offense convictions | 3.38 | 4.87 | 3.58 | 5.61 | 3.78 | 5.67
Property Offense Specialization/Diversity | Degree of specialization/diversity in prop. offenses | 0.83 | 0.24 | 0.86 | 0.22 | 0.86 | 0.21
Driving While Intoxicated (DWI) Convictions | Total # of DWI convictions | 0.20 | 0.59 | 0.49 | 0.90 | 0.58 | 1.04
Failure to Register (FTR) Convictions | Total # of FTR convictions | 0.00 | 0.03 | 0.01 | 0.11 | 0.01 | 0.13
Total Supervision Failures | Total # of revocations on probation and parole | 0.85 | 0.92 | 1.04 | 1.06 | 0.97 | 1.12
Intake
Metro County of Commitment | Commit from Twin Cities metro-area county | 0.50 | 0.50 | 0.42 | 0.49 | 0.43 | 0.50
Length of Stay in Prison (Months) | Difference in months between admission and release | 11.93 | 11.61 | 14.30 | 14.26 | 17.11 | 15.79
New Court Commitment | Admitted to prison directly from court | 0.49 | 0.50 | 0.44 | 0.50 | 0.51 | 0.50
Probation Violator | Admitted to prison for probation violation | 0.47 | 0.50 | 0.50 | 0.50 | 0.48 | 0.50
Release Violator | Admitted to prison for parole violation | 0.04 | 0.20 | 0.06 | 0.24 | 0.01 | 0.10
Person Offense | Most serious index offense is person offense = 1 | 0.13 | 0.34 | 0.13 | 0.34 | 0.15 | 0.36
Sex Offense | Most serious index offense is sex offense = 1 | 0.01 | 0.08 | 0.01 | 0.08 | 0.01 | 0.10
Drug Offense | Most serious index offense is drugs = 1 | 0.44 | 0.50 | 0.44 | 0.50 | 0.44 | 0.50
Property Offense | Most serious index offense is property = 1 | 0.33 | 0.47 | 0.28 | 0.45 | 0.24 | 0.42
DWI Offense | Most serious index offense is DWI = 1 | 0.03 | 0.17 | 0.07 | 0.26 | 0.11 | 0.31
Other Offense | Most serious index offense is “Other” offense = 1 | 0.06 | 0.23 | 0.07 | 0.26 | 0.06 | 0.24
Suicidal History | Suicidal history = 1; no history = 0 | 0.21 | 0.41 | 0.32 | 0.47 | 0.36 | 0.48
Security Threat Group (STG) | Total # of STG criteria (0-10) | 0.13 | 0.51 | 0.16 | 0.60 | 0.12 | 0.53
Marital Status | Married = 1; unmarried = 0 | 0.10 | 0.30 | 0.09 | 0.29 | 0.12 | 0.33
Age at Release | Age in years at time of release | 34.61 | 8.45 | 35.70 | 9.36 | 35.88 | 9.33
Unsupervised Release | Released to no correctional supervision | 0.03 | 0.18 | 0.06 | 0.23 | 0.00 | 0.00
Recidivism
General | Reconviction for misd., gross misd., or felony | 0.59 | 0.49 | 0.57 | 0.50 | 0.50 | 0.50
N | | 1,250 | | 555 | | 509 |

To assess responsivity, we also developed predictive models on the training set
data. The main difference between the recidivism risk and responsivity assessments had
to do with the outcome being predicted. With the recidivism risk assessment, the
predicted outcome was whether individuals recidivated within three years. With
responsivity assessment, the predicted outcome was whether individuals had 1)
participated in CD treatment and 2) desisted within three years of release from prison.
Therefore, for the entire dataset, we created a variable, CD treatment desistance, that
assigned a value of “1” to desistors who participated in CD treatment and a value of “0”
to all other offenders. As a result, offenders who participated in CD treatment but
recidivated were given a value of “0”. Likewise, offenders who desisted but did not
participate in CD treatment were assigned a value of “0” for this item.
After developing responsivity assessment models on the training set data for both
males and females, we evaluated predictive performance on the test sets. We then applied
the best-performing models to the validation sets for males and females. In the validation
sets, each offender received a predicted probability that reflects his or her likelihood of
desisting as a result of CD treatment.
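
The outcome variable used for the responsivity models, CD treatment desistance, is a simple conjunction of two conditions. A minimal sketch follows; the function and argument names are hypothetical.

```python
def cd_treatment_desistance(treated, recidivated):
    """Return 1 only for treated prisoners who were not reconvicted, else 0."""
    return 1 if treated and not recidivated else 0


# All four combinations of treatment participation and three-year recidivism.
for treated in (True, False):
    for recidivated in (True, False):
        print(treated, recidivated, cd_treatment_desistance(treated, recidivated))
```

Only one of the four combinations, treated and not reconvicted, is coded "1".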
Predictive Performance Metrics
To measure predictive performance, we used six metrics to capture the three main
areas of predictive validity—accuracy, discrimination, and calibration. To evaluate
predictive accuracy, which assesses how well a model makes correct classification
decisions, we used accuracy (ACC). For predictive discrimination, which measures the
degree to which the model separates the recidivists from the desistors, we used three
separate metrics: the area under the receiver operating characteristic curve (AUC), the
H-measure developed by Hand (2009), and the precision-recall curve (PRC). The AUC has
been one of the most widely used predictive
performance metrics, and it is relatively robust across different recidivism base rates and
selection ratios (Smith, 1996). Still, the AUC can provide overly optimistic estimates of
predictive discrimination for imbalanced datasets (Davis and Goadrich, 2006), and it can
provide misleading results if receiver operating characteristic (ROC) curves cross (Hand,
2009). As a result, we also used Hand’s H-measure, which uses a common cost
distribution for all classifiers (Hand, 2009), and the PRC, which
assesses discrimination with the precision and recall values. Precision measures the
percentage of positive predictions that were correct (based on the 50 percent threshold),
whereas recall reflects the percentage of positives (i.e., recidivists) that were captured.
Compared to the AUC, the PRC has been found to be a better metric for highly
imbalanced datasets (i.e., making predictions for an infrequently occurring outcome)
(Davis and Goadrich, 2006).
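
To make the accuracy and discrimination metrics concrete, the sketch below computes ACC, precision, and recall at the 50 percent threshold, plus a pairwise-ranking version of the AUC. It is an illustration under stated assumptions rather than the code used in the study: the function names and the five example cases are hypothetical, a probability of exactly 0.5 is treated as a positive prediction, and only a single point on the precision-recall curve is shown (the H-measure and the full PRC are not computed).

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a probability threshold."""
    tp = fp = tn = fn = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0  # assumption: ties count as positive
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all classification decisions that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Share of positive predictions that were correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual positives that were captured."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(y_true, y_prob):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = recidivist) and predicted probabilities.
y_true = [1, 1, 0, 0, 1]
y_prob = [0.9, 0.4, 0.6, 0.2, 0.7]
tp, fp, tn, fn = confusion_counts(y_true, y_prob)
print(accuracy(tp, fp, tn, fn), precision(tp, fp), recall(tp, fn), auc(y_true, y_prob))
```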
Calibration assesses how well the predicted probabilities from a model correspond
with the observed outcome being predicted. For our calibration metric, we used root
mean square error (RMSE), which measures the square root of the average squared
difference between observed recidivism and predicted probabilities. The sixth metric we
used is the SAR (squared error, accuracy, and ROC area) statistic developed by Caruana,
Niculescu-Mizil, Crew, and Ksikes (2004). SAR is a combined measure of
discrimination, accuracy, and calibration, and the formula for SAR is:

SAR = (ACC + AUC + (1 - RMSE)) / 3
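
The calibration metric and the combined SAR statistic follow directly from the definitions above. A minimal sketch, with hypothetical function names and made-up input values:

```python
import math


def rmse(y_true, y_prob):
    """Square root of the mean squared gap between outcomes and probabilities."""
    squared_errors = [(y - p) ** 2 for y, p in zip(y_true, y_prob)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def sar(acc, auc, rmse_value):
    """Combine accuracy, AUC, and calibration: (ACC + AUC + (1 - RMSE)) / 3."""
    return (acc + auc + (1 - rmse_value)) / 3


# Hypothetical metric values, for illustration only.
print(rmse([1, 0], [0.5, 0.5]))  # 0.5
print(sar(0.70, 0.80, 0.40))
```

Because a lower RMSE indicates better calibration, the formula uses 1 - RMSE so that higher values are better for all three components.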
Prioritizing Prisoners for CD Treatment
In the validation sets for the male and female prisoners, each offender had been
assessed for risk, needs, and responsivity. Put another way, each of the 5,327 males and
509 females in the validation sets had values for 1) recidivism risk probability, 2) CD
treatment need, and 3) responsivity probability. The values for both recidivism risk and
responsivity ranged from a low of 0 percent to a high of 100 percent. A higher predicted
probability for recidivism signifies a higher risk for recidivism. On the other hand, a
higher predicted probability for responsivity denotes a greater likelihood that an
individual will desist after participating in CD treatment. The values for CD treatment
need consisted of “1” for no need, “2” for moderate need, and “3” for high need.
Using these values from the risk, needs, and responsivity assessments, we
examined several different ways of prioritizing prisoners for CD treatment. In particular,
we prioritized offenders on the basis of 1) risk and needs, 2) risk and responsivity, 3)
needs and responsivity, and 4) risk, needs, and responsivity. For example, in prioritizing
prisoners by risk and needs, we added the values from the risk and needs assessments to
form a total risk-needs score. Likewise, to prioritize prisoners by risk, needs, and
responsivity, we added the values from the risk, needs, and responsivity assessments to
form a total risk-needs-responsivity score. Therefore, individuals with the highest scores
are presumably those with the highest risk, needs, and responsivity to CD treatment.
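
The additive prioritization schemes can be sketched as follows. The text does not say whether or how the three values were rescaled before being added, so this sketch simply adds probabilities on a 0 to 1 scale to the 1 to 3 need value as they are; that choice, the function names, and the four example prisoners are all assumptions made for illustration.

```python
def total_score(person, components):
    """Add the selected assessment values to form one prioritization score."""
    return sum(person[name] for name in components)


def top_quarter(people, components):
    """Rank by total score (highest first) and keep the top one-fourth."""
    ranked = sorted(people, key=lambda p: total_score(p, components), reverse=True)
    return ranked[: len(ranked) // 4]


# Hypothetical prisoners; risk and responsivity are probabilities, need is 1-3.
people = [
    {"id": "A", "risk": 0.8, "need": 3, "responsivity": 0.10},
    {"id": "B", "risk": 0.6, "need": 3, "responsivity": 0.40},
    {"id": "C", "risk": 0.9, "need": 1, "responsivity": 0.05},
    {"id": "D", "risk": 0.3, "need": 2, "responsivity": 0.20},
]

print([p["id"] for p in top_quarter(people, ("risk", "need"))])
print([p["id"] for p in top_quarter(people, ("risk", "need", "responsivity"))])
```

In this made-up example, adding responsivity changes who is ranked first, which is the kind of shift the four schemes are meant to compare.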
A little more than one-fourth of the prisoners in the male and female validation
sets entered CD treatment. For example, 1,377 (26%) of the 5,327 male offenders in the
validation set participated in CD treatment. The recidivism rate for the treated offenders
was 49 percent versus 68 percent for those who were untreated. For females, 145 (28%)
of the 509 offenders participated in CD treatment. The recidivism rate for the treated
offenders was 35 percent compared to 56 percent for those who were untreated.
To determine how each prioritization scheme might perform in assigning
individuals to CD treatment, we organized the validation sets into quartiles and then
analyzed recidivism outcomes by CD treatment participation. To illustrate with the 5,327
male offenders in the validation set, the recidivism rate was 49 percent for the 1,377
(26%) who entered CD treatment and 68 percent for the 3,950 (74%) who did not. The
rate was therefore 27 percent lower for the treated offenders. With a recidivism rate of 49
percent among the 1,377 treated offenders, there were still 677 who were recidivists. Yet,
if we assumed that none of the 1,377 were able to enter treatment and the recidivism rate
for untreated offenders is 68 percent, then 932 would have been recidivists. Delivering
CD treatment to the 1,377 offenders is thus associated with a reduction of 255 recidivists
(932 minus 677).
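
The arithmetic in this illustration can be reproduced from the counts given in the text. The sketch below takes the 677 observed and 932 expected recidivists as inputs rather than recomputing them from the rounded percentages, since 68 percent of 1,377 would round to a slightly different figure; the function name is hypothetical.

```python
def treatment_summary(observed_recidivists, expected_recidivists):
    """Return prevented recidivists and the relative reduction they represent."""
    prevented = expected_recidivists - observed_recidivists
    relative_reduction = prevented / expected_recidivists
    return prevented, relative_reduction


# Counts reported in the text for the 1,377 treated males in the validation set.
prevented, reduction = treatment_summary(677, 932)
print(prevented)               # 255
print(round(reduction * 100))  # 27
```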
If we prioritized the top one-fourth of offenders (i.e., CD treatment capacity) on
the basis of risk-needs, risk-responsivity, needs-responsivity, or risk-needs-responsivity,
would we still see a 27 percent reduction? Similarly, if we prioritized the top one-fourth
on the basis of these four prioritization schemes, would we still observe 255 prevented
recidivists? Would the treatment effect sizes and number of prevented recidivists be
smaller, larger, or about the same? To answer these questions, we present the findings in
the following section.
Results
In Table 3, we present the predictive performance results from the recidivism and
responsivity assessments for males and females. As noted above, we used two types of
classification methods—logistic regression and random forests. The results in Table 3
indicate the recidivism risk models for both classification methods predicted recidivism
relatively well for both males and females. For male offenders, the logistic regression
model slightly outperformed the random forests model across each of the six predictive
performance metrics in both the test and validation sets. For female offenders, the
random forests model slightly outperformed logistic regression.