
BabyTalk: Generating English Summaries of Clinical Data
Dr. Ehud Reiter, Computing Science (CS Dept), University of Aberdeen

Jan 14, 2016

Structure

Background: data-to-text
Babytalk project
Results of first evaluation
Current work

What is data-to-text

Goal: generate English summaries of non-linguistic data
  » Numerical weather predictions
  » Medical records
  » Statistics
  » Etc

Simple Example: Weather Forecasts

Input: numerical weather predictions
  » From supercomputer running a numerical weather simulation
Output: textual weather forecast
We’ve developed several systems
  » Two used commercially (oil rig, road gritting)
    – Users prefer some gen texts to human texts!
  » Demo of pollen system on our webpage
So have others (FoG, MultiMeteo, …)

Pollen forecasts

Grass pollen levels for Tuesday have decreased from the high levels of yesterday with values of around 4 to 5 across most parts of the country. However, in South Eastern areas, pollen levels will be high with values of 6.

Other data-text apps

Medical: to be discussed
Assistive technology: help blind people access statistical data
Financial: summarise stock-market data
Education: summarise assessment results, help write stories
Engineering: summarise gas-turbine data
Etc

Why is data-to-text useful

The world is drowning in data
  » NLP researchers talk about problems of too much text, but data problems are worse
    – Texts are at least read by someone (writer)
    – Most data is automatically collected and never looked at by a human

Data overload

Sensor recording 2 bytes/second
  » 170KB/day
  » 63MB/year
  » Millions of sensors in hospitals, jet engines, …
Simulations
  » Weather: 30MB for one day in one UK county, from one model
  » Climate models: petabytes of data
Too much data, need better tools for utilising!
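
The per-sensor figures above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes decimal units (1 KB = 1,000 bytes), and the variable names are illustrative.

```python
# Back-of-the-envelope check of the per-sensor figures (decimal KB/MB assumed).
BYTES_PER_SECOND = 2
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_day = BYTES_PER_SECOND * SECONDS_PER_DAY   # 172,800 bytes
bytes_per_year = bytes_per_day * 365                 # 63,072,000 bytes

kb_per_day = bytes_per_day / 1_000        # roughly the 170KB/day on the slide
mb_per_year = bytes_per_year / 1_000_000  # roughly the 63MB/year on the slide

print(f"{kb_per_day:.1f} KB/day, {mb_per_year:.1f} MB/year")
```

The slide values are the same numbers rounded down to two significant figures.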

Decision Support

Data often used for decision support
  » Medical: help doctors make decisions
  » Weather: helps staff on offshore oil rigs plan their operations
  » Engineering: help plan maintenance
  » Etc
Often under time pressure
  » Make a decision in 3 min, here is 30MB of data to help you

Using data for decision support

Alarming
  » Trigger alarm if value exceeds threshold
    – Or other such simple rule
  » Works, doesn’t get full value from data
Visualisation
  » Show data to experts visually
    – People like this, unclear how much it helps, especially when massive amount of data
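
A threshold alarm of the kind described under "Alarming" might look like the sketch below. The function name, the sample values and the limits are made up for illustration and are not clinical thresholds.

```python
def threshold_alarms(samples, low, high):
    """Return (index, value) pairs for samples outside the range [low, high]."""
    return [(i, v) for i, v in enumerate(samples) if v < low or v > high]

# Made-up heart-rate samples (beats/min); limits are illustrative only.
heart_rate = [142, 150, 148, 95, 60, 110, 145]
alarms = threshold_alarms(heart_rate, low=100, high=180)
print(alarms)
```

The rule fires on each out-of-range sample but says nothing about why the value changed or what else happened at the same time, which is the sense in which it does not get full value from the data.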

Using data for decision support

Knowledge-based systems
  » Feed data into an expert system which makes recommendations based on it
  » Can work in some contexts, but problems
    – Domain experts dislike being told what to do
    – Often key data not available to KBS
    – Can be brittle, fragile

Data-text for decision support

Idea: use KBS, NLP tech to generate a short text summary of a data set
Intermediate between KBS and visualisation
  » Use domain reasoning to highlight key info, infer causal links, add background knowledge
  » But stick to describing data, don’t tell experts what to do!

Data-text for decision support

vs alarms: deeper info
vs visualisation
  » Just key facts, not everything
  » Supplemented with causal links, etc
vs KBS
  » More acceptable to users
  » More robust, since not useless if missing some key data or knowledge

Data-text for decision support

Above is still somewhat speculative
But people in many domains are interested in exploring the concept to see if it works
  » Esp since current situation is so bad!
Of course other uses of data-to-text
  » Assistive technology, education

Language and World

How does language relate to the world?
Data-to-text is a great way of exploring this
  » The real reason I got into this…

BabyTalk

Goal: Summarise clinical data about premature babies in neonatal ICU

Input: sensor data; records of actions/observations by medical staff

Output: multi-para texts, summarise
  » BT45: 45 mins data, for doctors (completed)
  » BT-Nurse: 12 hrs data, for nurses
  » BT-Family: 24 hrs data, for parents
  » BT-Clan: 24 hrs data, for other friends, family
  » BT-Doc: several hrs data, for doctors

Neonatal ICU

Baby Monitoring

SpO2 (SO, HS)
ECG (HR)
Core Temperature (TC)
Arterial Line (Blood Pressure)
Peripheral Temperature (TP)
Transcutaneous Probe (CO, OX)

Input: Sensor Data

Input: Action Records

FullDescriptor                              Time
SETTING;VENTILATOR;FiO2 (36%)               10.30
MEDICATION;Morphine                         10.44
ACTION;CARE;TURN/CHANGE POSITION;SUPINE     10.46-10.47
ACTION;RESPIRATION;HAND-BAG BABY            10.47-10.51
SETTING;VENTILATOR;FiO2 (60%)               10.47
ACTION;RESPIRATION;INTUBATE                 10.51-10.52
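
As a rough illustration of how records like these could be read into a program, the sketch below splits the semicolon-separated descriptor and the optional time range. The `ActionRecord` class and `parse_record` function are hypothetical names, not part of BabyTalk, and the field meanings are inferred from the table alone.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ActionRecord:
    category: str          # first field, e.g. SETTING, MEDICATION, ACTION
    details: List[str]     # remaining ';'-separated fields
    start: str             # "hh.mm" exactly as written in the record
    end: Optional[str]     # None when only a single time is given

def parse_record(descriptor: str, time: str) -> ActionRecord:
    fields = descriptor.split(";")
    # Only the time column is split on '-'; descriptors may contain hyphens.
    start, _, end = time.partition("-")
    return ActionRecord(fields[0], fields[1:], start, end or None)

bagging = parse_record("ACTION;RESPIRATION;HAND-BAG BABY", "10.47-10.51")
morphine = parse_record("MEDICATION;Morphine", "10.44")
print(bagging)
print(morphine)
```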

BT45 texts

Human corpus text
At 1046 the baby is turned for re-intubation and re-intubation is complete by 1100 the baby being bagged with 60% oxygen between tubes. During the re-intubation there have been some significant bradycardias down to 60/min, but the sats have remained OK. The mean BP has varied between 23 and 56, but has now settled at 30. The central temperature has fallen to 36.1°C and the peripheral temperature to 33.7°C. The baby has needed up to 80% oxygen to keep the sats up.

Computer-generated text
By 11:00 the baby had been hand-bagged a number of times causing 2 successive bradycardias. She was successfully re-intubated after 2 attempts. The baby was sucked out twice. At 11:02 FIO2 was raised to 79%.

Babytalk architecture

Signal analysis: patterns, trends
Data interpretation: based on medical knowledge (like expert sys)
Doc planning: select and structure events to be mentioned
Microplanning: choose words, syntactic structures, referring exp
Realisation: generate actual text
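
The five stages can be pictured as a simple pipeline in which each stage consumes the previous stage's output. The sketch below shows only that shape. Every function body is a placeholder invented for illustration, and none of the names or data structures come from the BabyTalk code.

```python
from functools import reduce

def signal_analysis(data):       # patterns, trends
    return {**data, "patterns": ["downward spike in HR"]}

def data_interpretation(data):   # medical knowledge (placeholder result)
    return {**data, "events": ["bradycardia"]}

def document_planning(data):     # select and structure events
    return {**data, "plan": list(data["events"])}

def microplanning(data):         # words, syntax, referring expressions
    return {**data, "clauses": [f"there was a {e}" for e in data["plan"]]}

def realisation(data):           # actual text
    return " ".join(c.capitalize() + "." for c in data["clauses"])

STAGES = [signal_analysis, data_interpretation, document_planning,
          microplanning, realisation]

def generate(raw):
    """Run the stages in order, feeding each one the previous result."""
    return reduce(lambda acc, stage: stage(acc), STAGES, raw)

text = generate({"sensor": [], "actions": []})
print(text)
```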

Signal Analysis

Detect trends, patterns, events, etc
  » Blood oxygen levels increasing
  » Downward spike in heart rate
Detect artefacts
  » Changes due to sensor problems
Plenty of algorithms exist for this
Will not further discuss here
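
The talk does not say which algorithms BabyTalk uses. As one simple illustration of detecting a downward spike, the sketch below flags samples that fall well below the median of their neighbours. The function name, window size and drop threshold are arbitrary assumptions.

```python
from statistics import median

def downward_spikes(values, window=2, min_drop=30):
    """Indices of samples at least min_drop below the median of their neighbours."""
    spikes = []
    for i, v in enumerate(values):
        neighbours = values[max(0, i - window):i] + values[i + 1:i + 1 + window]
        if neighbours and median(neighbours) - v >= min_drop:
            spikes.append(i)
    return spikes

# Made-up heart-rate samples with one obvious dip.
hr = [150, 148, 152, 90, 149, 151, 150]
spike_indices = downward_spikes(hr)
print(spike_indices)
```

A real system would also need to separate genuine physiological events from sensor artefacts, which this sketch makes no attempt to do.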

Data Abstraction

Detect higher-level events in the data
  » Sequence of bradycardias (downward spikes in HR)
Determine medical importance
  » Bradycardia more important if simultaneous desaturation (downward spike in SO)
Medical KBS
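
The importance rule above (a bradycardia matters more when a desaturation happens at the same time) could be expressed along these lines. The `Event` class, the interval-overlap test and the numeric scores are assumptions made for illustration, not the actual medical knowledge base.

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str      # e.g. "bradycardia", "desaturation"
    start: float   # minutes since start of the period
    end: float

def overlaps(a, b):
    """True if the two events share any moment in time."""
    return a.start <= b.end and b.start <= a.end

def importance(event, all_events, base=1, boost=2):
    """Hypothetical score: a bradycardia gains `boost` if a desaturation overlaps it."""
    if event.kind == "bradycardia" and any(
        e.kind == "desaturation" and overlaps(event, e) for e in all_events
    ):
        return base + boost
    return base

events = [Event("bradycardia", 10, 11), Event("desaturation", 10.5, 12),
          Event("bradycardia", 30, 31)]
scores = [importance(e, events) for e in events]
print(scores)
```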

Data Abs: Links Between Events

Infer links between events
  » Blood O2 falls, therefore O2 level in incubator is increased
  » HR up because baby is being handled
  » Morphine given as part of the intubation procedure
Very important: much of the value added by the text
  » Helps readers build good mental model of what is happening to the baby

Document Planning

First NLP stage
Decide what events to mention
Decide how these are ordered and organised

Content Determination

First approach: Include most medically important events
  » Also include moderately important events which are linked to very important events
Doesn’t always work
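
The first approach can be sketched as a two-step selection: take every very important event, then add moderately important events that have a link to one of them. The scoring scale, the thresholds and the event names below are made up for illustration.

```python
def select_content(importance, links, high=3, moderate=2):
    """importance: {event: score}; links: set of frozenset({a, b}) pairs."""
    very_important = {e for e, s in importance.items() if s >= high}
    linked = {
        e for e, s in importance.items()
        if moderate <= s < high
        and any(frozenset((e, v)) in links for v in very_important)
    }
    return very_important | linked

# Made-up scores and a single inferred link.
importance = {"intubation": 3, "morphine": 2, "position change": 2, "temp drift": 1}
links = {frozenset(("morphine", "intubation"))}
chosen = select_content(importance, links)
print(sorted(chosen))
```

Here "position change" is dropped even though it scores the same as "morphine", because nothing links it to a very important event.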

Problem: Continuity

Omitting intermediate events confuses readers
  » Example: TcPO2 suddenly decreased to 8.1. SaO2 increased to 92. TcPO2 suddenly decreased to 9.3
  » There is a gradual rise in TcPO2 between the sudden falls
    – This is less important medically
    – But important for reader’s comprehension
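
One possible repair, shown here only as a sketch and not as what BabyTalk does, is to add back any event that lies between two selected events on the same channel. The tuple layout, the function name and the timestamps are illustrative assumptions.

```python
def add_bridging_events(all_events, selected):
    """all_events: list of (time, channel, description), sorted by time.
    selected: set of indices already chosen for the text.
    Returns the indices plus any event lying between two selected
    events on the same channel."""
    result = set(selected)
    by_channel = {}
    for i in sorted(selected):
        by_channel.setdefault(all_events[i][1], []).append(i)
    for channel, chosen in by_channel.items():
        for i in range(chosen[0], chosen[-1] + 1):
            if all_events[i][1] == channel:
                result.add(i)
    return result

events = [
    (1, "TcPO2", "sudden decrease to 8.1"),
    (2, "SaO2", "increase to 92"),
    (3, "TcPO2", "gradual rise"),
    (4, "TcPO2", "sudden decrease to 9.3"),
]
kept = add_bridging_events(events, {0, 1, 3})
print(sorted(kept))
```

With the gradual rise restored, the second sudden decrease in TcPO2 no longer appears to contradict the first.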

Document Structure

How do we order/group events
  » By time
  » By medical importance
  » By body subsystem (eg, respiration)
Initially focused on time, but users want more emphasis on subsystem
  » Eg, first a “scene” about respiration, then a “scene” about thermoregulation
    – Not constant shifting between two
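
Grouping by subsystem while keeping time order inside each "scene" could be sketched as follows. The tuple layout and the example events are assumptions, and ordering scenes by their earliest event is just one plausible choice.

```python
def scenes(events):
    """events: list of (time, subsystem, description).
    Returns [(subsystem, [events in time order])], with scenes ordered
    by the time of their first event."""
    grouped = {}
    for event in sorted(events, key=lambda e: e[0]):
        grouped.setdefault(event[1], []).append(event)
    return list(grouped.items())

# Made-up events that alternate between two subsystems.
events = [
    (1, "respiration", "FiO2 raised"),
    (2, "thermoregulation", "core temperature falls"),
    (3, "respiration", "baby re-intubated"),
    (4, "thermoregulation", "incubator temperature raised"),
]
plan = scenes(events)
for subsystem, items in plan:
    print(subsystem, [description for _, _, description in items])
```

A purely time-ordered text would switch subsystem at every sentence for this input. The grouped version yields two scenes instead.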

Doc Planning: Narrative

High-level analysis: need to do a better job of generating a “story” from the data
  » Link events together
  » Include events needed for story progression even if not important
  » “Scene” structure
Qualitative observation by users

Microplanning

Second NLP stage
Choose words and syntactic structure to express information
Aggregation
Reference

Challenge: Time

Need to communicate temporal info
  » Enough so that readers can interpret the data
  » Not too much, text becomes unreadable
    – Imagine story with “At 10.14 John left home. At 10.28 he met Mary in the pub. At 10.39…”

Tenses

Use Reichenbach model
  » Speech time: time of report being read
  » Event time: time of event being described
  » Reference time: determined using a salience model
    – Similar to resolving anaphoric reference
Usually worked, sometimes failed
  » Need better model for reference time

What does event time mean?

Sometimes explicit time given for event
  » Supposed to be start time of event, sometimes misinterpreted
Ex: “After three attempts, at 13.53 a peripheral venous line was inserted successfully.”
  » 13.53 refers to time of first (failed) attempt
    – Start of LINE-INSERT-ATTEMPTS event
  » Readers interpret as time of final (successful) attempt
Need better linguistic model of time
  » Linguistic temporal ontology (Moens & Steedman)?

Lexical Choice

Need mechanism to map domain events (instances in a Protégé ontology) to linguistic structures
Use JESS rules
  » Lexical info from Verbnet, NIH lexicon
Engineering challenge
  » Relate to Sheffield work on NLG/ontologies

Vague language

Human texts are full of vague language
  » Ex: There is a momentary bradycardia
  » What does “momentary” mean?
Our models of this are very crude, need to be improved!

Realisation

Last NLG stage
Generate actual text, once choices made
Use Aberdeen simplenlg package
Will not further discuss here

BT45 Evaluation

Showed 35 medical professionals 24 scenarios in 3 conditions (8 of each)
  » Visualisation of medical data
  » Textual summary (manually written)
  » Textual summary (from BT45)
Asked to make a treatment decision
  » Limited to 3 minutes
  » Measured correctness (against gold standard)
Off-ward, using historical data
  » So no other knowledge about baby

Free-text comments

Comments were not solicited, but were recorded if made

Most important were
  » Better layout (eg, bullet lists)
  » Continuity (as mentioned before)

Decision-Support results

No sig difference in time taken
Avg decision-quality (scale -1 to 1)
  » Human texts: 0.39
  » Computer texts: 0.34
  » Visualisation: 0.33
Human sig better than comp, visual
No sig diff comp, visual

Results by subject type

Analysis by type of subjects
  » Human texts especially good for junior nurses (ie, least experienced subjects)

Results by scenario

Each scenario had a main target action
  » 8 different ones
Computer texts as good as human texts for five of these; worse for three
  » No action, manage temperature, monitor equipment
  » These relate to specific problems in the system, which can be fixed

Target Actions with Poor Perf

No action: Needs high-level summary, not blow-by-blow event description

Manage Temperature: Two temp channels, need to describe together

Monitor equipment: Need to mention (not ignore) sensor artefacts

Summary

Good performance with human texts shows textual presentation is effective
  » Also seen in previous study
Babytalk as good as visualisation, could make better by addressing above issues
  » Even now giving users BabyTalk text as supplement to visualisations could help

Current Work

BT-Nurse: shift summaries for nurses
  » Use live data from current babies
  » Evaluate on ward, using babies that subjects (nurses) are actually looking after
  » Focus on info relevant to nurse shift planning, not real-time decision support
  » Longer time period (12 hrs)
    – Need more sensor abstraction
  » Longer texts (multi-page)

Current Work

BT-Family: information for parents
  » Estimate how stressed parents are, use this to control content, phrasing
    – High stress means less content
    – Relate to Sheffield work on personality??
  » Express information in language which parents can understand, not medicalese

Current Work

BT-Clan: Information for friends, family
  » Social networking perspective: encourage useful support, minimise hassle of dealing with numerous inquiries
    – Parents decide what to tell people
    – Intentional deceit: if granny is frail, don’t tell her bad news
  » Info about parents as well as baby

Research agenda

Detecting complex events in the data
Integration with medical guidelines
Better use of vague language
Better stories
Role of text in interactive multimodal information presentation system
Try in domain of assisted living