
Exploring mental well-being from prisoner case notes using text mining - Civil Service · 2017. 11. 23.

  • Exploring mental well-being from prisoner case notes using text mining

    Jo Lee – Data Scientist at the Ministry of Justice

    [email protected], @jo_noms on Slack

  • Mental well-being in prisoners

    https://www.theguardian.com/society/2017/may/02/ministers-should-have-legal-duty-to-combat-rise-in-prison-suicides

    https://www.theguardian.com/society/2017/mar/11/prison-psychiatrists-warm-mental-health-care-breaking-point

    https://www.theguardian.com/healthcare-network/2017/may/10/prison-mental-health-crisis

    Unpublished, exploratory analysis & not for wider distribution

  • Mental well-being in prisoners

    The most recent Adult Psychiatric Morbidity Survey of prisoners (Singleton et al 1998) found that over 90% of prisoners had one or more of the five psychiatric disorders studied (psychosis, neurosis, personality disorder, hazardous drinking and drug dependence).

    A recent NICE publication indicated:

    “There is low quality evidence for a range of systems for the delivery and coordination of care in the criminal justice system (for example, drug or mental health courts, and case management).”

    “There is clear evidence of poor engagement, uptake and retention in treatment for people with mental health problems in contact with the criminal justice system.”

    https://www.nice.org.uk/guidance/ng66/resources/mental-health-of-adults-in-contact-with-the-criminal-justice-system-pdf-1837577120965

    Surveys are expensive, so I am investigating the reliability of using administrative data to gain understanding of prisoner well-being.

  • Aims for project

    Use text mining techniques to determine mental well-being:

    • Can the case notes be used to correctly identify prisoners with mental health issues?

    • How does mental well-being correlate with external factors (i.e. drug/alcohol issues, vulnerability/bullying, debt or smoking)?

    • Can the analysis feed into self-harm/violence predictive models?

    • Are the case notes an effective way to determine/investigate/monitor mental health issues?

  • The case notes

    The case notes are a rich source of information, detailing the journey of a prisoner through the prison system and providing key information about their well-being.

    • there is no obligation to make case notes detailed

    • the data set is not a complete representation of events in prisons

    • the notes give insight into how prison officers document events and implemented procedures.

    Mental health in the case notes

    • Recording is idiosyncratic to each prison officer, but the notes tend to describe the events that lead prison officers to worry about well-being

    “X stated he wanted to be dead and became very emotional and started to cry when asked about his visit with his mum earlier in the week he stated he missed his children and wanted to see them but was told this would not be possible at the present time.”

  • Overview of methodology

    Use text mining techniques to determine mental well-being:

    Pre-processing the free text case notes.

    Curating a mental well-being dictionary.

    Exploratory analysis of the dictionary.

    Use classification techniques to determine the polarity of the case notes.


  • Overview of methodology

    Flow diagram of the methodology:

    • Free text case notes

    • Convert to UTF-8

    • Remove all numbers from case notes (case notes have date & time stamp)

    • Replace misspellings and abbreviations

    • Change ‘;’ to ‘,’ in text to use ‘;’ as separator

    • Remove punctuation, except separator ‘;’

    • Place cleaned notes in PostgreSQL DB

    • Develop a Mental Well-Being Dictionary

    • Search for dictionary terms

    • Categorise the case notes

    • Analyse term/category occurrence

  • Pre-processing the free text case notes

    The protocol was implemented at the command line as a bash script.

    • This was because the 19 million case notes were too large to process in RAM-greedy languages (R or Python).

    The first step was to ensure the file encoding was correct, as the data was provided in CSV format from the pNOMIS database.

    # run from the directory holding the case note dump
    for file in *.csv; do
        echo "Converting $file"
        iconv -f ISO-8859-1 -t UTF-8 "$file" > "${file%.csv}.utf8.converted"
    done

    # create one file to process
    for file in *.utf8.converted; do
        cat "$file" >> utf8.tmp
        rm "$file"  # delete the temporary file after UTF-8 conversion
        echo "$file added to utf8.tmp"
    done

  • Pre-processing the free text case notes

    Tidying the case notes into a format for text mining: remove duplicate lines, runs of repeated punctuation marks, and any unprintable characters.

    cat utf8.tmp | awk '!seen[$0]++' | tr -s '[:punct:]' | sed 's/[^[:print:]\r\t]/ /g' > tidy_casenote.tmp
    echo "tidied tmp"
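    A small, self-contained sketch of the same tidy-up on sample input may make the three stages easier to follow. The function name `tidy` and the sample lines are illustrative assumptions, and the `sed` stage here uses `[:space:]` in place of the `\r\t` escapes, which are not portable across `sed` implementations.

```shell
# Stage 1 (awk): keep only the first copy of each line.
# Stage 2 (tr):  squeeze runs of a repeated punctuation mark to a single mark.
# Stage 3 (sed): replace unprintable characters with a space.
tidy() {
    awk '!seen[$0]++' |
        tr -s '[:punct:]' |
        sed 's/[^[:print:][:space:]]/ /g'
}

# Sample input: a duplicated line, doubled punctuation, and a control character.
printf 'a,,b!!\na,,b!!\nx\001y\n' | tidy
```

    With this input the duplicate line is dropped, `,,` and `!!` are squeezed to single marks, and the control character becomes a space.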

  • Pre-processing the free text case notes

    AWK was invaluable for coding line-by-line pre-processing

    hadn’t -> had not

    OASYSS -> OASYS

    recieved -> received

    BEGIN { FS = ","
    while (getline
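    The AWK listing above is incomplete in this transcript. As a minimal sketch only, and not the author's script, a lookup-table replacement of this kind could look as follows. The file name `corrections.csv`, its `wrong,right` layout and the function name `fix_spelling` are assumptions made for illustration.

```shell
# corrections.csv (hypothetical): one "wrong,right" pair per line
cat > corrections.csv <<'EOF'
hadn't,had not
OASYSS,OASYS
recieved,received
EOF

fix_spelling() {
    awk -v map="corrections.csv" '
        BEGIN {
            # load the lookup table once, before any case note is read
            while ((getline line < map) > 0) {
                split(line, pair, ",")
                fix[pair[1]] = pair[2]
            }
            close(map)
        }
        {
            # replace each whitespace-separated word that is in the table
            n = split($0, word, " ")
            out = ""
            for (i = 1; i <= n; i++) {
                w = (word[i] in fix) ? fix[word[i]] : word[i]
                out = out (i > 1 ? " " : "") w
            }
            print out
        }
    '
}

printf '%s\n' "He hadn't recieved his OASYSS report" | fix_spelling
```

    This sketch only matches whole whitespace-separated words, so a misspelling with punctuation attached is left unchanged, and runs of whitespace are collapsed to single spaces.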

  • Pre-processing the free text case notes

    Remove punctuation from the case notes

    Split off the first two columns (ID and date) and replace their CSV delimiter ‘,’ with ‘;’. In the case note text itself, replace any ‘;’ with ‘,’ so that ‘;’ can be used as the column separator.

    # take the first 2 columns, and change the delimiter
    cut -d',' -f1-2 misspellings.tmp | sed 's/,/;/g' > iddate.tmp
    # take the case note and remove occurrences of the delimiter from there
    cut -d',' -f3- misspellings.tmp | sed 's/;/,/g' > casenote.tmp

    pr -mts';' iddate.tmp casenote.tmp > casenote_forDB
    echo "made final file for DB"
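    To see the effect of the delimiter swap on one record, here is a hedged worked example. The function name `to_db_format`, the file `sample.csv` and its contents are invented for illustration, and `paste -d';'` is swapped in for `pr -mts';'` as a simpler way to join the two files line by line.

```shell
to_db_format() {
    # $1: CSV file with columns id,date,note (the note may contain commas)
    tmpdir=$(mktemp -d)
    # first two columns: switch the delimiter from ',' to ';'
    cut -d',' -f1-2 "$1" | sed 's/,/;/g' > "$tmpdir/iddate"
    # remaining text: replace any ';' so it cannot be mistaken for a separator
    cut -d',' -f3-  "$1" | sed 's/;/,/g' > "$tmpdir/note"
    # join the two halves line by line with ';'
    paste -d';' "$tmpdir/iddate" "$tmpdir/note"
    rm -r "$tmpdir"
}

# Hypothetical record: id, date, free-text note containing both ',' and ';'
printf '%s\n' '101,2016-09-01,Spoke with X; no issues, settled' > sample.csv
to_db_format sample.csv
```

    The record comes out with exactly two ‘;’ separators, and commas inside the note no longer conflict with the column delimiter.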

  • Curating a mental well-being dictionary

    The dictionary was curated with input from policy colleagues and DH colleagues.

    Clinical History

    • Mental illness diagnosis (e.g. depression, bipolar disorder, schizophrenia)

    • Personality disorder diagnosis (e.g. borderline personality disorder)

    Psychological and Psychosocial Factors

    • Desperate

    • Angry

    • Sad

    • Ashamed

    • Hopeless

    • Worthless

    • Lonely

    • Disconnected

    • Powerless

    Current ‘context’

    • Recent suicide/self-harm thoughts/actions

    • Violence, intimidation or fear of these

    • Parole refusal or other knock-back

    • Longer sentence than expected

    • Alcohol/drug misuse

    • Irrational behaviour, out of touch with reality

    • Recklessness

    • Hostile rejection of help

  • Curating a mental well-being dictionary

    ACCT: Assessment, Care in Custody and Teamwork

    Any prisoner believed to be at risk of self-harm is placed under an ACCT review.

    The individual is given an assessment, and the mental health in-reach team is informed.

    The individual is given the opportunity to talk to a Listener and/or a Samaritan.

    The day/night manager is informed of the risk.

  • Curating a mental well-being dictionary

    Figure: the most commonly occurring mental health dictionary terms in the case notes. The size of the word represents occurrence.
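    Counting how many case notes mention each dictionary term is the kind of tally that sits behind a term-frequency plot. The following is a minimal sketch under stated assumptions: the function name `count_terms`, the files `terms.txt` and `notes.txt`, and their contents are all hypothetical, and matching is a plain case-insensitive substring test.

```shell
count_terms() {
    # $1: file with one dictionary term per line; case notes on stdin, one per line
    awk -v terms="$1" '
        BEGIN {
            # load the dictionary, lower-cased, with a zero count for each term
            while ((getline t < terms) > 0) if (t != "") term[tolower(t)] = 0
            close(terms)
        }
        {
            # count a note once for every term it contains
            note = tolower($0)
            for (t in term) if (index(note, t) > 0) term[t]++
        }
        END { for (t in term) print term[t], t }
    ' | sort -k1,1nr -k2,2
}

# Hypothetical dictionary and case notes
printf '%s\n' 'hopeless' 'lonely' 'angry' > terms.txt
printf '%s\n' 'X said he felt hopeless and lonely' \
              'X appeared hopeless after the visit' \
              'No issues raised' > notes.txt
count_terms terms.txt < notes.txt
```

    Because this is a substring match, a short term such as "sad" would also count notes containing "sadly"; a production version would need word-boundary matching.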

  • Exploratory analysis of the case notes

    The case notes can be categorised according to their topic, into the topics on the plot below.

    • Since ACCT reviews can be open for long periods of time, it is not expected that every case note will relate to mental well-being.

    • The most common categories are relations, IEP (Incentives and Earned Privileges), chaplaincy, and ACCT (if the individual is on ACCT).

    Figure: the categories of case notes for individuals on ACCT and not on ACCT review. Axis: percentage of case notes. Series: not on ACCT, on ACCT. Categories: ACCT, Bullying, Canteen, Chaplaincy, Constant Supervision, Debt, Drug/Alcohol, Education, FNC, Gym, IEP, Medical, Relation, SID, Work.

    FNC = first night in custody, which is when prisoners are perceived to be at their most vulnerable

    SID = individuals who went on to have a self-inflicted death

  • Exploratory analysis of the dictionary

    Heat map showing the probability of co-occurrence of two terms in the case notes for Sept - Nov 2016.

    The probability of both terms occurring is shown in each square of the heat map. The most commonly co-occurring terms are ‘ACCT’ with ‘suicidal’, ‘stress’ and ‘anxiety’ with ‘depression’.
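    One way to compute a single cell of such a heat map is sketched below. It assumes that "co-occurrence" means both terms appearing in the same case note, and that the probability is the number of such notes divided by the total number of notes; the slides do not spell this out. The function name `cooccurrence` and the sample notes are hypothetical.

```shell
cooccurrence() {
    # $1, $2: the two terms; case notes on stdin, one per line
    awk -v a="$1" -v b="$2" '
        BEGIN { a = tolower(a); b = tolower(b) }
        {
            # count the note if it contains both terms (case-insensitive substring)
            note = tolower($0)
            if (index(note, a) && index(note, b)) both++
        }
        # assumed definition: notes containing both terms / all notes
        END { if (NR > 0) printf "%.2f\n", both / NR }
    '
}

# Hypothetical case notes: one of the four mentions both terms
printf '%s\n' 'ACCT opened, X says he feels suicidal' \
              'ACCT review held' \
              'X reports stress about debt' \
              'No issues raised' | cooccurrence acct suicidal
```

    Running the function for every pair of dictionary terms would fill in the full grid.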

  • Use classification techniques to determine the polarity of the case notes

    The starting point for this iterative process is the R package ETEA. This contains a list of words that assign polarity (positive or negative sense) to the case notes. It contains its own categorical tags (specific for clinical notes), which will be refined to fit the case notes and mental health.

    https://github.com/chriskirkhub/etea

    How the ETEA algorithm works

    1. Cleans the case note, pre-processing before running the algorithm.

    2. Classifies the case note by binning it by the presence of the categorical tags.

    3. Assesses the polarity of the case note (positive or negative).
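    The idea of word-list polarity scoring can be illustrated with a toy example. This is not ETEA's implementation or API: the word lists, the scoring rule (one point per positive word, minus one per negative word) and the function name `polarity` are all assumptions made purely for illustration.

```shell
polarity() {
    # case notes on stdin, one per line; prints "<score> <label>" for each note
    awk '
        BEGIN {
            # toy lexicon, invented for illustration only
            split("pleased well settled", p, " ")
            split("dead cry frustrated", n, " ")
            for (i in p) weight[p[i]] = 1
            for (i in n) weight[n[i]] = -1
        }
        {
            score = 0
            line = tolower($0)
            gsub(/[^a-z ]/, " ", line)   # crude cleaning: keep letters and spaces
            m = split(line, w, " ")
            for (i = 1; i <= m; i++) if (w[i] in weight) score += weight[w[i]]
            label = (score > 0) ? "positive" : ((score < 0) ? "negative" : "neutral")
            print score, label
        }
    '
}

printf '%s\n' 'X stated he wanted to be dead and started to cry' \
              'X is pleased and doing well' \
              'No issues raised' | polarity
```

    A real lexicon would be far larger and would need to handle negation ("not pleased"), which this sketch ignores.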

  • Use classification techniques to determine the polarity of the case notes

    “X stated he wanted to be dead and became very emotional and started to cry when asked about his visit with his mum earlier in the week he stated he missed his children and wanted to see them but was told this would not be possible at the present time.”

  • Evaluation of output

    Number of ETEA assignments that match manual assignments

    The confusion table shows how well the algorithm works compared with a manual assignment: 65% agreement (73 of 113 case notes).

    Rows: manual assignment. Columns: algorithm assignment.

                Negative  Neutral  Positive
    Negative       14        11        7
    Neutral         4        33        5
    Positive        2        11       26
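    The agreement figure can be checked directly from the table: it is the sum of the diagonal (cases where algorithm and manual assignment match) divided by the total number of case notes. A short sketch, with the function name `agreement` chosen for illustration:

```shell
agreement() {
    # stdin: square confusion matrix, rows = manual, columns = algorithm
    awk '
        {
            for (j = 1; j <= NF; j++) {
                total += $j
                if (j == NR) diag += $j   # diagonal cell: both assignments agree
            }
        }
        END { printf "%d/%d = %.1f%%\n", diag, total, 100 * diag / total }
    '
}

printf '%s\n' '14 11 7' '4 33 5' '2 11 26' | agreement
```

    This prints `73/113 = 64.6%`, which rounds to the 65% quoted on the slide.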

  • Longitudinal study

    Figure: ETEA algorithm assignment of polarity against the manual assignment of polarity (indicated by the colour). x-axis: case notes over time. y-axis: score given to case note by ETEA. Colour legend (manual assignment): strong negative, weak negative, neutral, weak positive, strong positive. Two case notes are marked 1 and 2.

    Case note examples from the prisoner’s timeline in plot

    1.“Personal officer Spoke with X who states he is very pleased to have been awarded his enhanced status. He states he is doing well with starting computers and art classes. He has finished his induction and is aware he will not be assessed for at least 9 months. He gets on with most prisoners on the unit and will help out with additional cleaning if so asked by staff. No issues raised.”

    2.“X has told me that he would like to help in the barbershop as he has qulaifications in hair cutting. He has missed several music classes as a result. I would like X to decide which activity he would like to concentrate on and to adhere to his timetable accordingly and not pick and choose when he wants to come to education. When he does comes he is demanding of time and attention and always wants what he wants straight away and gets frustrated when things don’t happen as quickly as he would like. I am also wondering if part of his behaviour may be located somewhere on the autistic spectrum. I wonder if he has been tested for this. If not perhaps this might be a good idea to help him deal with his impatience and demanding nature.”

  • Self-Inflicted Death (SID) study

    Does polarity change over time, in the weeks running up to a SID?

    Figure: the distribution of polarity scores for 4-week periods, counting back in time from the self-inflicted death event. x-axis: polarity score. y-axis: density.

  • Self-Inflicted Death (SID) study

    Does polarity change over time, in the weeks running up to a SID?

    Figure: the positive (green) and negative (red) contributions to an individual's polarity over time. x-axis: date (March to May). y-axis: polarity.

  • Violence Predictor (VIPER)

    • The violence in prisons estimator (Viper) is a measure to be used to assess the risk of an individual being a perpetrator of violence in prison.

    • Viper is a framework rather than a single score, which can be applied in several situations to provide the most appropriate score/measure in that specific scenario.

    • The framework is made up of several random forest models. Each model represents the probability of an individual being the perpetrator of violence in a given month.

    • The case note categorisation and the polarity have been used as variables in the machine learning.

  • Conclusions and further work

    Using the case notes in predictive modelling

    • The use of categorisation and polarity in the case notes is being tested in the violence predictive model.

    • There are further applications for the case notes in self-harm predictive models.

    Operational use of the case notes

    • The case note analysis can be implemented in a Shiny application, which can be made available to prison staff. This will permit prison staff to search the case notes by prison or prisoner, by time, or by content of the case note.

    Further analytical work

    • A more detailed exploration of the case notes:

        • Comparison of polarity between ACCT and non-ACCT review individuals. Can this be used to explain where individuals with mental health issues are being missed?

        • Can the magnitude of polarity be used as a useful metric, especially in a longitudinal study of individuals?

    • Expansion of these methods to other free text fields. For example, using natural language processing / machine learning to triage the ~500,000 intelligence reports received per year, to ensure that analysts' time is only spent on the most important ones, and applications in relation to sentence plans and pre-sentence reports.

  • Thank you for your attention

    [email protected], @jo_noms on Slack

    Acknowledgements

    Chris Kirk – data scientist who developed ETEA

    Offender Insight Team, DaSH, MoJ

    Maria Angulo, Offender Health Analysis Team, MoJ
