
Language Features for End-to-End Evaluation of Cognitive ...

May 25, 2022


ag  agenda
fb  feedback
un  understanding
ip  interpersonal effectiveness
co  collaboration
pt  pacing and efficient use of time
gd  guided discovery
cb  focusing on key cognitions and behaviors
sc  strategy for change
at  application of cognitive-behavioral techniques
hw  homework

NF is supported by the USC Annenberg Fellowship. VRM is partially supported by the Mexican Council of Science and Technology (CONACyT).

• Cognitive Behavior Therapy (CBT) is a psychotherapy treatment that uses cognitive change strategies to address mental health problems [1].

• Quality assessment is traditionally performed by human raters, who evaluate recorded sessions along 11 behavioral codes defined by the Cognitive Therapy Rating Scale (CTRS). Manual rating is
  • cost-prohibitive
  • time-consuming

• We examine how linguistic features can be used to build an automatic competency rating tool for CBT.

• Experiments are conducted on both manual transcripts and automatically derived ones, thus introducing an end-to-end approach.

✦ adout set: 386 adult-outpatient sessions from 131 therapists
✦ trans set: manually transcribed subset of adout, 92 sessions from 70 therapists

• CTRS codes are manually scored on a 7-point Likert scale (0-6) [7].

• We binarize the scores, labeling as negative (‘bad’) the sessions with a code score below 3. For the total CTRS score, the threshold is 40.
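The binarization rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `binarize_session` and the dictionary layout are hypothetical, the total CTRS score is assumed to be the sum of the 11 code scores, and a total below 40 is assumed to be the negative class, mirroring the per-code rule.

```python
from typing import Dict

# The 11 CTRS code abbreviations used in the results table.
CODES = ["ag", "fb", "un", "ip", "co", "pt", "gd", "cb", "sc", "at", "hw"]

CODE_THRESHOLD = 3    # per-code scores below 3 are labeled negative ('bad')
TOTAL_THRESHOLD = 40  # threshold applied to the total CTRS score


def binarize_session(scores: Dict[str, int]) -> Dict[str, int]:
    """Map 0-6 code scores to binary labels (1 = positive, 0 = negative).

    Assumptions (not stated in the source): the total is the sum of the
    11 code scores, and a total below 40 is the negative class.
    """
    for code in CODES:
        if not 0 <= scores[code] <= 6:
            raise ValueError(f"score for {code!r} is outside the 0-6 scale")

    labels = {code: int(scores[code] >= CODE_THRESHOLD) for code in CODES}
    total = sum(scores[code] for code in CODES)
    labels["tot"] = int(total >= TOTAL_THRESHOLD)
    return labels


if __name__ == "__main__":
    example = {code: 4 for code in CODES}
    example["hw"] = 2  # a weak homework score
    print(binarize_session(example))
```

In this example the session is labeled negative for `hw` only, while the total (42) stays at or above the threshold.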

Results for the fusion techniques are reported only if better than the best independent classifier.

• Unigrams with tf-idf weighting [2]
• GloVe embeddings [3]
• Linguistic Inquiry and Word Count (liwc) features [4]
• Psycholinguistic Norm Features (pnf) [5]
• Counts of Dialogue Acts (da) [6]
• Counts of non-lexical (cnt-nlex) tags in transcripts
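As a rough illustration of the first feature type, the sketch below computes tf-idf weights for unigrams with the standard library only. The exact tf-idf variant, tokenization, and vocabulary pruning used in the experiments are not stated here, so the formula (relative term frequency times log(N / document frequency)) and the function name `tfidf_unigrams` are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_unigrams(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one sparse tf-idf vector (term -> weight) per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df: Counter = Counter()
    for tokens in docs:
        df.update(set(tokens))

    idf = {term: math.log(n_docs / count) for term, count in df.items()}

    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        length = len(tokens)
        vectors.append(
            {term: (count / length) * idf[term] for term, count in counts.items()}
        )
    return vectors


if __name__ == "__main__":
    # Toy "sessions", already tokenized into unigrams.
    sessions = [
        ["agenda", "today"],
        ["agenda", "homework", "homework"],
    ]
    for vector in tfidf_unigrams(sessions):
        print(vector)
```

A term that occurs in every document (here `agenda`) gets weight zero under this variant, which is why tf-idf emphasizes words that distinguish one session from another.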

Section titles: Motivation & Background, CTRS codes, Method, Features, Datasets & Preprocessing, Results

Language Features for End-to-End Evaluation of Cognitive Behavior Psychotherapy Sessions

Nikolaos Flemotomos¹, Victor R. Martinez², David C. Atkins³, Torrey A. Creed⁴, Shrikanth Narayanan¹,²

¹ Department of Electrical Engineering and ² Department of Computer Science, University of Southern California, Los Angeles, CA, USA
³ Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, WA, USA
⁴ Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, USA

Conclusions

• tf-idf_T almost always yields the best results among the independent features, except for the highly skewed codes.
• The best performance is usually achieved through a fusion method.
• For adout, combining T’s and C’s features seems beneficial, but not for trans. Diarization and role-matching errors are a possible explanation.

Future efforts will focus on
• regression (‘how good is a session?’)
• interpretability of the results: feedback for improvement

References

[1] A.T. Beck, Cognitive Therapy of Depression, Guilford Press, 1979
[2] G. Salton et al., Introduction to Modern Information Retrieval, McGraw-Hill Inc., 1986
[3] J. Pennington et al., “Glove: Global vectors for word representation,” in Proc. EMNLP, 2014
[4] J.W. Pennebaker et al., “The development and psychometric properties of LIWC2015,” Tech. Rep., University of Texas at Austin, 2015
[5] N. Malandrakis et al., “Therapy language analysis using automatically generated psycholinguistic norms,” in Proc. Interspeech, 2015
[6] D. Can et al., “A dialog act tagging approach to behavioral coding: A case study of addiction counseling conversations,” in Proc. Interspeech, 2015
[7] T.A. Creed et al., “Implementation of transdiagnostic cognitive therapy in community behavioral health: The Beck Community Initiative,” Journal of Consulting and Clinical Psychology, vol. 84, no. 12, 2016

Code  trans, independent     trans, fusion          adout, independent     adout, fusion
      best feature  UAR (%)  best method   UAR (%)  best feature  UAR (%)  best method   UAR (%)
ag    tf-idf_T      86.08    vt-T          90.15    tf-idf_T      67.39    vt-T          69.74
fb    tf-idf_T      88.51    vt-T          91.41    tf-idf_T      63.96    tf-idf_(T+C)  66.07
un    pnf_T         58.54    ft-all        60.43    glove_T       54.24    st-all        54.53
ip    cnt-nlex      65.76    -             -        glove_T       54.00    -             -
co    glove_C       69.32    st-all        72.82    glove_T       52.28    -             -
pt    tf-idf_T      82.82    vt-T          83.94    tf-idf_T      66.17    -             -
gd    tf-idf_T      82.35    vt-T          85.65    tf-idf_T      64.88    tf-idf+da_T   67.78
cb    tf-idf_T      85.06    vt-T          88.50    tf-idf_T      65.95    tf-idf+da_T   67.35
sc    tf-idf_T      86.88    vt-T          89.30    da_T          65.11    vt-all        65.96
at    tf-idf_T      86.69    vt-T          89.86    tf-idf_T      63.96    tf-idf+da_T   66.63
hw    tf-idf_T      83.82    -             -        tf-idf_T      65.95    vt-T          67.90
tot   tf-idf_T      88.28    vt-T          93.06    glove_T       61.65    st-all        63.31
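
The results are reported as UAR, which is not expanded in the transcript; assuming it stands for unweighted average recall (the usual reading), it is the mean of the per-class recalls. The sketch below shows that computation and why it suits skewed codes better than plain accuracy. The function name `unweighted_average_recall` is illustrative, not taken from the source.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def unweighted_average_recall(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> float:
    """Mean of per-class recalls, with every class weighted equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # A skewed toy example: four positive sessions, one negative session.
    truth = [1, 1, 1, 1, 0]
    always_positive = [1, 1, 1, 1, 1]
    print(unweighted_average_recall(truth, always_positive))
```

A classifier that always predicts the majority class reaches 80% accuracy on this toy example but only 50% UAR, since its recall on the minority class is zero.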