Healthcare Innovations at Kno.e.sis. Presentation to the Boonshoft School of Medicine Executive Committee, July 10, 2014. Amit Sheth, Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University, USA

Health Innovations at Kno.e.sis (July 2014)

Jan 30, 2023

Transcript
Page 1: Health Innovations at Kno.e.sis (July 2014)

Healthcare Innovations at Kno.e.sis


Presentation to the Boonshoft School of Medicine Executive Committee, July 10, 2014

Amit Sheth

Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)

Wright State University, USA

Page 2: Health Innovations at Kno.e.sis (July 2014)

• Among top universities in the world in World Wide Web (cf: 10-yr impact, Microsoft Academic Search: among top 10 in June 2014)

• Largest academic group in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications

• Exceptional student success: internships and jobs at top salary (IBM Watson/Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research universities, NLM, startups)

• 100 researchers including 15 World Class faculty (>3K citations/faculty) and ~45 PhD students, practically all funded

• Extensive research for largely multidisciplinary projects; world class resources; industry sponsorships/collaborations (Google, IBM, …)

2

Page 3: Health Innovations at Kno.e.sis (July 2014)

Amit Sheth’s PhD students

Ashutosh Jadhav, Hemant Purohit, Vinh Nguyen, Lu Chen, Pavan Kapanipathi, Pramod Anantharam, Sujan Perera, Alan Smith, Swapnil Soni, Maryam Panahiazar, Sarasi Lalithsena, Shreyansh Batt, Kalpa Gunaratna, Delroy Cameron, Sanjaya Wijeratne, Wenbo Wang

Kno.e.sis in 2014 = ~100 researchers (15 faculty, ~50 PhD students)

3

Special thanks: This presentation covers some of the work of these researchers.

Page 4: Health Innovations at Kno.e.sis (July 2014)

• “80% of doctors will eventually become obsolete”: Vinod Khosla, VC and co-founder of Sun Microsystems

• “The Doctor is (Always) In: Reinventing the Doctor-Patient Relationship for the 21st Century” [Dr. J. Shlain]. More data is generated under patient control and outside the clinical system. Patient empowerment, reimbursement changes and AHA.

• #dHealth and #IoT are the two hottest hashtags at CES and SXSW

4

Healthcare is changing way too fast

Page 6: Health Innovations at Kno.e.sis (July 2014)

6

Collaborators

Page 7: Health Innovations at Kno.e.sis (July 2014)

7

Healthcare Innovation at Kno.e.sis

(with subset of applications)

Page 8: Health Innovations at Kno.e.sis (July 2014)

8

kHealth: Knowledge-empowered personalized digital mHealth

With applications to: ADHF, GI, Asthma, [Geriatrics]

Contact: Prof. Amit Sheth

Page 10: Health Innovations at Kno.e.sis (July 2014)

10

Providing actionable information in a timely manner is crucial to avoid information overload or fatigue

Sleep data, Community data, Personal schedule, Activity data, Personal health records

Data Overload for Patients/health aficionados

Page 11: Health Innovations at Kno.e.sis (July 2014)

Weather Application

11

Detection of events, such as wheezing sound, indoor temperature, humidity, dust, and CO2 level

Weather Application

Asthma Healthcare Application

Action in the Physical World

Close the window at home during the day to avoid CO2 inflow and asthma attacks at night

Public Health

Personal

Population Level

‘FOR human’: Improving Human Experience

Page 12: Health Innovations at Kno.e.sis (July 2014)

12

Making sense of sensor data with

Page 13: Health Innovations at Kno.e.sis (July 2014)

Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions and provide actionable information

Canary in a coal mine

knowledge-enabled healthcare

13

kHealth

Page 14: Health Innovations at Kno.e.sis (July 2014)

14

kHealth to Manage ADHF (Acute Decompensated Heart Failure)

Page 15: Health Innovations at Kno.e.sis (July 2014)

15

25 million people in the U.S. are diagnosed with asthma (7 million are children) [1]

300 million people suffer from asthma worldwide [2]

$50 billion spent on asthma alone in a year [2]

155,000 hospital admissions in 2006 [3]

593,000 emergency department visits in 2006 [3]

[1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/

[2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html

[3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.

Asthma

Page 16: Health Innovations at Kno.e.sis (July 2014)

Asthma is a multifactorial disease with health signals spanning personal, public health, and population levels.

16

Real-time health signals from personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and population level (e.g., pollen level, CO2) arriving continuously in fine grained samples potentially with missing information and uneven sampling frequencies.

Variety, Volume, Veracity, Velocity, Value

Can we detect the asthma severity level?
Can we characterize asthma control level?
What risk factors influence asthma control?
What is the contribution of each risk factor?

semantics

Understanding relationships between health signals and asthma attacks for providing actionable information

WHY Big Data to Smart Data? Healthcare example

Page 17: Health Innovations at Kno.e.sis (July 2014)

ICS= inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA= inhaled short-acting beta2-agonist ; *consider referral to specialist

Asthma Control and Actionable Information

Sensors and their observations for understanding asthma

17

Personal, Public Health, and Population Level Signals for

Monitoring Asthma

Page 18: Health Innovations at Kno.e.sis (July 2014)

18

At Discharge: Health Score

Non-compliance; Poor economic status; No living assistance

Health Score: Vulnerability Score

Well Controlled: Low

Well Controlled: Very low

Not Well Controlled: High

Not Well Controlled: Medium

Poor Controlled: Very High

Poor Controlled: High

Estimation of readmission vulnerability based on the personal health score

Personal Health Score and Vulnerability Score

Page 19: Health Innovations at Kno.e.sis (July 2014)

19

Population Level

Personal

Wheeze – Yes
Do you have tightness of chest? – Yes

Observations; Physical-Cyber-Social System; Health Signal Extraction; Health Signal Understanding

<Wheezing=Yes, time, location>
<ChestTightness=Yes, time, location>
<PollenLevel=Medium, time, location>
<Pollution=Yes, time, location>
<Activity=High, time, location>

Features: Wheezing, ChestTightness, PollenLevel, Pollution, Activity, RiskCategory

<PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
...

Expert Knowledge

Background Knowledge

tweet reporting pollution level and asthma attacks

Acceleration readings from on-phone sensors

Sensor and personal observations

Signals from personal, personal spaces, and community spaces

Risk Category assigned by doctors

Qualify

Quantify

Enrich

Outdoor pollen and pollution

Public Health

Well Controlled – continue
Not Well Controlled – contact nurse
Poor Controlled – contact doctor

Health Signal Extraction to Understanding
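The qualify and quantify steps on this slide can be sketched in a few lines of Python. This is only an illustration: the numeric encoding (No=0, Yes=1, Low=1, Medium=2, High=3) is an assumption chosen so that the slide's example observations reproduce its <2, 1, 1, 3, 1> tuple, and the function names are hypothetical.

```python
from typing import Dict, List

# Assumed ordinal encoding; the slide only implies Yes=1, Medium=2, High=3.
LEVELS = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}

# Feature order follows the tuple header shown on the slide.
FEATURES = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]

# Recommended action per control level, as listed on the slide.
ACTIONS = {
    "Well Controlled": "continue",
    "Not Well Controlled": "contact nurse",
    "Poor Controlled": "contact doctor",
}


def quantify(observations: Dict[str, str]) -> List[int]:
    """Turn qualified signals (e.g. Wheezing=Yes) into a numeric feature vector."""
    return [LEVELS[observations[name]] for name in FEATURES]


def recommend(control_level: str) -> str:
    """Look up the action associated with an asthma control level."""
    return ACTIONS[control_level]


observations = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}
vector = quantify(observations)
print(vector)                            # [2, 1, 1, 3, 1]
print(recommend("Not Well Controlled"))  # contact nurse
```

In the slide's pipeline the RiskCategory label comes from doctors; the vectors above would be the inputs paired with those labels.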

Page 20: Health Innovations at Kno.e.sis (July 2014)

20

Social streams have been used to extract many near real-time events

Twitter provides access to rich signals but is noisy, informal, redundant, inconsistently capitalized, and lacking in context

We formalize the event extraction from tweets as a sequence labeling problem

How do we know the event phrases, and who creates the training set? (manual creation is ruled out)

Now you know why you’re miserable! Very High Alert for B-ALLERGEN Ragweed I-ALLERGEN pollen. B-FACILITY Oklahoma I-FACILITY Allergy I-FACILITY Clinic says it’s an extreme exposure situation

Idea: Background knowledge is used to create the training set, e.g., typing information becomes the label for a concept

Health Signal Extraction Challenges
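A minimal sketch of the idea above, assuming the background knowledge is available as a small phrase-to-type dictionary. The dictionary contents and the function name `bio_label` are hypothetical; the slide does not say how the knowledge base is stored or matched. Tokens that match a known phrase receive B-/I- tags from the phrase's type, and everything else is tagged O, which yields training data for a sequence labeler without manual annotation.

```python
from typing import Dict, List, Tuple

# Hypothetical background knowledge: lower-cased phrase tokens -> concept type.
GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("ragweed", "pollen"): "ALLERGEN",
    ("oklahoma", "allergy", "clinic"): "FACILITY",
}


def bio_label(tokens: List[str],
              gazetteer: Dict[Tuple[str, ...], str] = GAZETTEER) -> List[str]:
    """Assign BIO tags by longest-match lookup of token spans in the gazetteer."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    max_len = max((len(p) for p in gazetteer), default=0)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-word phrases win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = gazetteer.get(tuple(lowered[i:i + n]))
            if label is not None:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


tokens = "Very High Alert for Ragweed pollen . Oklahoma Allergy Clinic says".split()
tags = bio_label(tokens)
for token, tag in zip(tokens, tags):
    print(token, tag)
```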

Page 21: Health Innovations at Kno.e.sis (July 2014)

intelligence at the edge

Approach 1: Send all sensor observations to the cloud for processing

Approach 2: downscale semantic processing so that each device is capable of machine perception

Henson et al. "An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices", ISWC 2012.

21

Page 22: Health Innovations at Kno.e.sis (July 2014)

Use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning

0101100011010011110010101100011011011010110001101001111001010110001101011000110100111

22

Efficient execution of machine perception
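To make the bit vector idea concrete, here is a simplified sketch and not the algorithm from the cited paper: each candidate condition is encoded as an integer whose bits mark the observable properties it can account for, and "which conditions explain the observations?" becomes a bitwise AND. The property names and the toy knowledge base are hypothetical.

```python
# Hypothetical observable properties; each one gets a bit position.
PROPERTIES = ["wheezing", "chest_tightness", "high_pollen", "fever"]
BIT = {name: 1 << i for i, name in enumerate(PROPERTIES)}


def encode(props):
    """Pack a collection of property names into one integer bit vector."""
    vec = 0
    for p in props:
        vec |= BIT[p]
    return vec


# Toy prior knowledge: condition -> bit vector of properties it can explain.
KNOWLEDGE = {
    "asthma_attack": encode(["wheezing", "chest_tightness", "high_pollen"]),
    "common_cold": encode(["fever"]),
    "bronchitis": encode(["wheezing", "fever"]),
}


def explain(observed):
    """Return conditions whose vector covers every observed property."""
    obs = encode(observed)
    return sorted(c for c, vec in KNOWLEDGE.items() if (vec & obs) == obs)


print(explain(["wheezing"]))                     # ['asthma_attack', 'bronchitis']
print(explain(["wheezing", "chest_tightness"]))  # ['asthma_attack']
```

Each check costs one AND and one comparison per condition, which is why this style of encoding suits resource-constrained devices.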

Page 23: Health Innovations at Kno.e.sis (July 2014)

O(n^3) < x < O(n^4) reduced to O(n)

Efficiency Improvement

• Problem size increased from 10’s to 1000’s of nodes
• Time reduced from minutes to milliseconds
• Complexity growth reduced from polynomial to linear

23

Evaluation on a mobile device

Page 24: Health Innovations at Kno.e.sis (July 2014)

1 Translate low-level data to high-level knowledge: Machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making

2 Prior knowledge is the key to perception: Using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web

3 Intelligence at the edge: By downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices

24

Semantic Perception for smarter analysis: 3 ideas to take away

Page 25: Health Innovations at Kno.e.sis (July 2014)

25

PREDOSE: Social media analysis driven epidemiology

Application: Prescription drug abuse and beyond

Contact: Delroy Cameron

Page 26: Health Innovations at Kno.e.sis (July 2014)

26

D. Cameron, G. A. Smith, R. Daniulaityte, A. P. Sheth, D. Dave, L. Chen, G. Anand, R. Carlson, K. Z. Watkins, R. Falck. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. July 2013 (in press)

Kno.e.sis - Ohio Center of Excellence in Knowledge-enabled Computing

CITAR - Center for Interventions Treatment and Addictions Research

http://wiki.knoesis.org/index.php/PREDOSE

Bridging the gap between researchers and policy makers

Early identification of emerging patterns and trends in abuse

PREDOSE: Prescription Drug abuse Online Surveillance and Epidemiology

Page 27: Health Innovations at Kno.e.sis (July 2014)

In 2008, there were 14,800 prescription painkiller deaths*

*http://www.cdc.gov/homeandrecreationalsafety/rxbrief/

• Drug Overdose Problem in US
• 100 people die every day from drug overdoses
• 36,000 drug overdose deaths in 2008
• Close to half were due to prescription drugs

Gil Kerlikowske, Director, ONDCP

Launched May 2011

PREDOSE: Prescription Drug abuse Online Surveillance and Epidemiology

27

Page 28: Health Innovations at Kno.e.sis (July 2014)

Early Identification and Detection of

Trends

Access hard-to-reach Populations

Large Data Sample Sizes

Group Therapy: http://www.thefix.com/content/treatment-options-prison90683

Interviews

Online Surveys

Automatic Data Collection

Not Scalable

Manual Effort

Sample Biases

Epidemiologist

Qualitative Coding

Problems

Computer Scientist

Automate Information Extraction & Content

Analysis

PREDOSE: Bringing Epidemiologists and Computer Scientists together

28

Page 29: Health Innovations at Kno.e.sis (July 2014)

29

Page 30: Health Innovations at Kno.e.sis (July 2014)

I was sent home with 5 x 2 mg Suboxones. I also got a bunch of phenobarbital (I took all 180 mg and it didn't do shit except make me a walking zombie for 2 days). I waited 24 hours after my last 2 mg dose of Suboxone and tried injecting 4 mg of the bupe. It gave me a bad headache, for hours, and I almost vomited. I could feel the bupe working but overall the experience sucked.

Of course, junkie that I am, I decided to repeat the experiment. Today, after waiting 48 hours after my last bunk 4 mg injection, I injected 2 mg. There wasn't really any rush to speak of, but after 5 minutes I started to feel pretty damn good. So I injected another 1 mg. That was about half an hour ago. I feel great now.

Codes and Triples (subject-predicate-object)

Code: Suboxone used by injection, negative experience. Triple: Suboxone injection-causes-Cephalalgia

Code: Suboxone used by injection, amount. Triple: Suboxone injection-dosage amount-2mg

Code: Suboxone used by injection, positive experience. Triple: Suboxone injection-has_side_effect-Euphoria

Sentiment Extraction (+ve / -ve): experience sucked; feel pretty damn good; didn’t do shit; feel great; bad headache

DIVERSE DATA TYPES: ENTITIES, TRIPLES, DOSAGE, PRONOUN, INTERVAL, Route of Admin., RELATIONSHIPS, SENTIMENTS


Entity Identification

Suboxone subClassOf Buprenorphine
Subutex subClassOf Buprenorphine
Buprenorphine has_slang_term bupe
Buprenorphine has_slang_term bupey

Drug Abuse Ontology (DAO): 83 Classes, 37 Properties

33:1 Buprenorphine

24:1 Loperamide
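The entity identification and triple examples on this slide could be represented roughly as follows. This is a toy sketch: the dictionaries hold only the terms visible on the slide, the direction of the subClassOf and has_slang_term relations is inferred from the diagram, and `normalize` and `is_a` are hypothetical helper names rather than part of PREDOSE.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Toy excerpt of the ontology (assumed direction: slang term -> class).
SLANG_TO_CLASS = {"bupe": "Buprenorphine", "bupey": "Buprenorphine"}
SUBCLASS_OF = {"Suboxone": "Buprenorphine", "Subutex": "Buprenorphine"}
KNOWN_CLASSES = set(SUBCLASS_OF) | set(SUBCLASS_OF.values())


def normalize(mention: str) -> str:
    """Map a slang term or differently cased name to its ontology class."""
    key = mention.strip().lower()
    if key in SLANG_TO_CLASS:
        return SLANG_TO_CLASS[key]
    for cls in KNOWN_CLASSES:
        if cls.lower() == key:
            return cls
    return mention.strip()  # unknown mentions are returned unchanged


def is_a(cls: str, ancestor: str) -> bool:
    """Follow subClassOf links upward until the ancestor is found or the chain ends."""
    current: Optional[str] = cls
    while current is not None:
        if current == ancestor:
            return True
        current = SUBCLASS_OF.get(current)
    return False


# Triples taken from the slide's coded example.
triples = [
    Triple("Suboxone injection", "causes", "Cephalalgia"),
    Triple("Suboxone injection", "dosage amount", "2mg"),
    Triple("Suboxone injection", "has_side_effect", "Euphoria"),
]

print(normalize("bupe"))                 # Buprenorphine
print(is_a("Suboxone", "Buprenorphine")) # True
```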

30

Page 31: Health Innovations at Kno.e.sis (July 2014)

Ontology Lexicon Lexico-ontology

Rule-based Grammar

ENTITIES: Suboxone, Kratom, Heroin

TRIPLES: Suboxone-CAUSE-Cephalalgia

EMOTION: disgusted, amazed, irritated

INTENSITY: more than, a, few of

PRONOUN: I, me, mine, my

SENTIMENT: Im glad, turn out bad, weird

DRUG-FORM: ointment, tablet, pill, film

ROUTE OF ADM: smoke, inject, snort, sniff

SIDEEFFECT: Itching, blisters, flushing, shaking hands, difficulty breathing

DOSAGE: <AMT><UNIT> (e.g. 5mg, 2-3 tabs)

FREQUENCY: <AMT><FREQ_IND><PERIOD> (e.g. 5 times a week)

INTERVAL: <PERIOD_IND><PERIOD> (e.g. several years)

PREDOSE: Smarter Data through Shared Context and Data Integration
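The three rule-based grammar patterns on this slide (DOSAGE, FREQUENCY, INTERVAL) map naturally onto regular expressions. The sketch below is one possible rendering, not the PREDOSE grammar itself: the vocabularies for units, frequency indicators, periods and period indicators are assumptions filled in only far enough to cover the slide's examples.

```python
import re

# Assumed vocabularies; the slide gives only the pattern shapes and examples.
AMT = r"\d+(?:\.\d+)?(?:\s*-\s*\d+(?:\.\d+)?)?"   # 5, 2.5, 2-3
UNIT = r"(?:mg|g|ml|tabs?|pills?)"
FREQ_IND = r"times?\s+(?:a|per|each)"
PERIOD = r"(?:hour|day|week|month|year)s?"
PERIOD_IND = r"(?:several|few|couple\s+of|many)"

PATTERNS = {
    "DOSAGE": re.compile(rf"\b{AMT}\s*{UNIT}\b", re.IGNORECASE),
    "FREQUENCY": re.compile(rf"\b{AMT}\s+{FREQ_IND}\s+{PERIOD}\b", re.IGNORECASE),
    "INTERVAL": re.compile(rf"\b{PERIOD_IND}\s+{PERIOD}\b", re.IGNORECASE),
}


def extract(text: str) -> dict:
    """Return every matched span per pattern name."""
    return {
        name: [m.group(0) for m in pattern.finditer(text)]
        for name, pattern in PATTERNS.items()
    }


result = extract("took 2-3 tabs 5 times a week for several years")
print(result)
```

Applied to the forum post on the previous slide, the DOSAGE pattern would pick up spans such as "180 mg" and "2 mg"; slang quantities would need the lexicon rather than a regular expression.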

31

Page 32: Health Innovations at Kno.e.sis (July 2014)

Data Type; Semantic Web Technique; Limitations of Other Approaches (ML/NLP, IR)

Entity: Ontology-driven Identification & Normalization. ML/NLP: Requires labeled data. IR: Unpredictable term frequencies

Triple: Schema-driven. ML/NLP: Difficult to develop language model. IR: Requires entity disambiguation

Sentiment: Ontology-assisted Target Entity Resolution. ML/NLP: Inconsistent data for parse trees or rules. IR: Diverse simple & complex slang terms & phrases

PREDOSE: Role of Semantic Web and Ontologies

32

Page 33: Health Innovations at Kno.e.sis (July 2014)

33

Loperamide is used to self-medicate for opioid withdrawal symptoms

Loperamide-Withdrawal Discovery

Page 34: Health Innovations at Kno.e.sis (July 2014)

34

EMR and clinical text analysis:

Intelligence from clinical data

Contact: Sujan Perera

Page 35: Health Innovations at Kno.e.sis (July 2014)

• Active Semantic EMR: high quality, low error, faster completion of patient records

• Predicting patient outcomes and advising discharge decisions based on both structured (billing) data and clinical text (unstructured data)

• Deep understanding of clinical text for Computer Assisted Coding for ICD9 and ICD10 and Computerized Document Improvement (commercial products from ezDI)

35

Page 36: Health Innovations at Kno.e.sis (July 2014)

Hypothesis Generation

Hypothesis Filtering

Hypothesis with High Confidence

Explanation Module

Explained? Yes / No

Patient Notes

UMLS

Semantic Driven Approach for Knowledge Acquisition from EMRs

Page 37: Health Innovations at Kno.e.sis (July 2014)

Deep clinical text analysis using semantics enhanced NLP has enabled our industry partner ezDI to develop exciting commercial products: ezCDI (Computerized Document Improvement) and ezCAC (Computer Assisted ICD9/ICD10 Coding)

See: http://ezdi.us

Semantics enhanced NLP

37

Page 38: Health Innovations at Kno.e.sis (July 2014)

• Typical NLP algorithms misclassify linguistic nuances

• Document 1:
– Coronary artery disease listed in the current diagnosis list
– “Send for carotid duplex to rule out carotid artery stenosis given his risk factors and underlying coronary artery disease.” (NLP output says patient does not have coronary artery disease)

• Document 2:
– “Extremities: Warm and dry. No clubbing or cyanosis. No lower extremity edema.”
– “I have advised the patient on the side effect of potential lower extremity edema.” (NLP output says patient has lower extremity edema)

• Document 3:
– “He is not having any symptoms of chest pain or exertional syncope or dizziness.”
– “I advised him that if he experiences chest pain, shortness of breath with exertion or dizziness or syncopal episodes to let us know and we can do appropriate workup.” (NLP output says patient has chest pain, shortness of breath, dizziness, syncopal)

Green - correctly identified entities Red – misclassified entities

Semantics enhanced NLP

38

Page 39: Health Innovations at Kno.e.sis (July 2014)

Semantics enhanced NLP

• Domain knowledge can be used to resolve misclassifications

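Syncope (Symptom) Is_symptom_of Atrial Fibrillation

Warfarin (Medication) Is_medication_for Atrial Fibrillation

Atenolol (Medication) Is_medication_for Atrial Fibrillation

Aspirin (Medication) Is_medication_for Atrial Fibrillation

• There is strong evidence to suggest that the patient has Atrial Fibrillation.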
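One way to picture how domain knowledge supports such a conclusion is to collect, for each disorder, the mentions in a note that are linked to it in the knowledge base. The sketch below uses only the relations shown on this slide; the function name and the notion of counting supporting mentions as "evidence" are illustrative assumptions, not the scoring used in the actual system.

```python
# Relations taken from the slide: (entity, relation) -> related disorders.
KB = {
    ("Syncope", "is_symptom_of"): {"Atrial Fibrillation"},
    ("Warfarin", "is_medication_for"): {"Atrial Fibrillation"},
    ("Atenolol", "is_medication_for"): {"Atrial Fibrillation"},
    ("Aspirin", "is_medication_for"): {"Atrial Fibrillation"},
}


def supporting_evidence(mentions):
    """Group the mentions found in a note by the disorder they point to."""
    evidence = {}
    for (entity, relation), disorders in KB.items():
        if entity in mentions:
            for disorder in disorders:
                evidence.setdefault(disorder, []).append((entity, relation))
    return evidence


# Hypothetical set of entities extracted from one patient's notes.
found = {"Syncope", "Warfarin", "Aspirin", "Hypertension"}
evidence = supporting_evidence(found)
for disorder, support in evidence.items():
    print(disorder, len(support), support)
```

A disorder supported by several independent mentions (a symptom plus multiple medications) is a stronger hypothesis than one supported by a single ambiguous mention, which is how such knowledge can override a misclassified negation.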

39

Page 40: Health Innovations at Kno.e.sis (July 2014)

Raw Text to Knowledge

Raw Text: He is off both Diovan and Lotrel. I am unsure if it is due to underlying renal insufficiency. He has actually been on atenolol alone for his hypertension.

Concepts: diovan, lotrel, renal insufficiency, atenolol, hypertension

Knowledge (relations: is a, is used to treat; types: drug, disorder)

drug: diovan, valtuna, valsartan, antihypertensive agent, atenolol, tenomin, atenix

disorder: kidney failure, renal insufficiency, kidney disease, blood pressure disorder, hypertension, systolic hypertension, pulmonary hypertension

Inference

Patient taking atenolol for hypertension

Patient has kidney disease

Patient is on antihypertensive drugs

40

Page 41: Health Innovations at Kno.e.sis (July 2014)

cTAKES, ezNLP, ezKB

<problem value="Asthma" cui="C0004096"/>
<med value="Losartan" code="52175:RXNORM" />
<med value="Spiriva" code="274535:RXNORM" />
<procedure value="EKG" cui="C1623258" />

ezFIND, ezMeasure, ezCDI, ezCAC

www.ezdi.us

ezHealth Platform
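The annotation fragments shown on this slide are plain XML elements, so a downstream consumer could read them with a standard parser. The sketch below assumes the fragments are wrapped in a single root element (the name `annotations` is made up here, since the slide shows no enclosing element) and simply lists each annotation's type, surface value and identifier.

```python
import xml.etree.ElementTree as ET

# The four elements exactly as they appear on the slide.
fragment = """
<problem value="Asthma" cui="C0004096"/>
<med value="Losartan" code="52175:RXNORM" />
<med value="Spiriva" code="274535:RXNORM" />
<procedure value="EKG" cui="C1623258" />
"""

# Wrap in a hypothetical root so the fragment is well-formed XML.
root = ET.fromstring(f"<annotations>{fragment}</annotations>")

# Each record: (annotation type, surface value, concept identifier).
records = [
    (el.tag, el.get("value"), el.get("cui") or el.get("code"))
    for el in root
]
for record in records:
    print(record)
```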

41

Page 42: Health Innovations at Kno.e.sis (July 2014)

42

Online Health Information Seeking

Contact: Ashutosh Jadhav

Page 43: Health Innovations at Kno.e.sis (July 2014)

Internet Users in the World

http://www.internetlivestats.com/internet-users/

Around 3 billion (40%) of the world population

Around 300 million (87%) of the US population

43

Page 44: Health Innovations at Kno.e.sis (July 2014)

• Online health resources
– Easily accessible
– Help to obtain medical information quickly and conveniently
– Can help non-experts to make more informed decisions
– Play a vital role in improving health literacy

Online Health Information Seeking

44

Page 45: Health Innovations at Kno.e.sis (July 2014)

• With the growing availability of online health resources, consumers are increasingly using the Internet to seek health related information

According to a 2013 Pew Survey*, one in three American adults has gone online to find information about a specific medical condition.

*Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013

Online Health Information Seeking

45

Page 46: Health Innovations at Kno.e.sis (July 2014)

• One of the most common ways to seek online health Information is via Web search engines such as Google, Yahoo! and Bing

According to the Pew Survey, approximately 8 in 10 online health inquiries initiate from a search engine.

Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013

Online Health Information Seeking

46

Page 47: Health Innovations at Kno.e.sis (July 2014)

• Analyzing health search logs
– Helps to understand population-level health information needs
– Shows how users formulate search queries (“expression of information need”)
– Offers potentially larger cohorts of real users and their behaviors, e.g. querying behaviors

• Such knowledge can be applied
– to improve the health search experience
– to develop next-generation knowledge and content delivery systems

Motivation

47

Page 48: Health Innovations at Kno.e.sis (July 2014)

Online Health Information Seeking

Smart Devices

Personal Computers

vs.

Jadhav A et al. “Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal” Journal of Medical Internet Research 2014;16(7):e160 (Impact factor 3.8)

Page 49: Health Innovations at Kno.e.sis (July 2014)

Desktop vs. Mobile: mobile usage takes over

Motivation

Page 50: Health Innovations at Kno.e.sis (July 2014)

• With the recent exponential increase in usage of smart devices, the percentage of people using smart devices to search for health information is also growing rapidly

Motivation

Page 51: Health Innovations at Kno.e.sis (July 2014)

• Experience of online information searching varies depending on the device used
– Smart devices (SDs): mobile, tablets
– Personal computers (PCs): desktop, laptop

• PCs and SDs have distinct characteristics
– Readability, user experience, accessibility, etc.

Motivation

Page 52: Health Innovations at Kno.e.sis (July 2014)

• In order to improve the health information searching process and to be prepared for the technology shift, it is necessary
– to understand how device choice influences online health information seeking

Study Objective

Page 53: Health Innovations at Kno.e.sis (July 2014)

• Data:
– Health search queries
– launched from PCs and SDs
– submitted from Web search engines
– and directed users to Mayo Clinic’s consumer health information portal (MayoClinic.com)

• Data timeframe:
– June 2011 to May 2013

• Data collection tool:
– IBM NetInsight On Demand (Web Analytics tool)

• Dataset size:
– More than 100 million health search queries for both PCs and SDs

Dataset Creation

Page 54: Health Innovations at Kno.e.sis (July 2014)

• For PCs and SDs, we analyzed and compared
– Frequently searched health categories
– Types of search queries (keyword-based, Wh-questions, Yes/No questions)
– Structural properties of the queries
  • Length of the search queries
  • Usage of the search query operators
  • Usage of special characters
– Misspellings in the health search queries
– Linguistic characteristics of the queries

Comparative Data Analysis

Page 55: Health Innovations at Kno.e.sis (July 2014)

The most-searched health categories are ‘Symptoms’ (1 in 3 search queries), ‘Causes’ and ‘Treatments & Drugs’

One of the least-searched health categories is ‘Prevention’

The distribution of search queries across health categories differs with the device used for search

Search queries from both PCs and SDs follow a similar pattern of distribution across health categories

Intent Mining for Health Information Seeking

Page 56: Health Innovations at Kno.e.sis (July 2014)

Health queries are predominantly formulated using keywords (~85%), followed by Wh- and Yes/No questions

Users ask more health questions from SDs than from PCs

In the health search queries, users ask more
– “what” and “how” questions => descriptive information need
– “can”, “is” and “does” questions => factual information need

Intent Expression: Search Query Type
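A first-word heuristic is one simple way to separate the three query types named on this slide. The sketch below is an assumption about how such a classifier might look, not the method used in the study; the word lists extend the slide's examples ("what", "how", "can", "is", "does") with a few other common question openers.

```python
# Assumed question openers; only some of these appear on the slide.
WH_WORDS = {"what", "how", "why", "when", "where", "which", "who"}
YES_NO_WORDS = {"can", "is", "does", "do", "are", "should", "will"}


def query_type(query: str) -> str:
    """Classify a search query by its first word."""
    words = query.lower().split()
    if not words:
        return "keyword"
    if words[0] in WH_WORDS:
        return "wh-question"
    if words[0] in YES_NO_WORDS:
        return "yes/no question"
    return "keyword"


for q in ["what causes high blood pressure",
          "can tylenol raise blood pressure",
          "heart palpitations with headache"]:
    print(q, "->", query_type(q))
```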

Page 57: Health Innovations at Kno.e.sis (July 2014)

Average length of the queries from SDs (3.29 words and 18.86 characters) is a bit longer than that of PCs (2.9 words and 17.61 characters)

Health queries tend to be longer than general search queries, indicating users’ interest in more specific information

Intent Expression: Search Query Length
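For reference, the two length measures reported above can be computed as follows. This is a minimal sketch: whether the study counted spaces toward the character length is not stated on the slide, so counting them here is an assumption, and the sample queries are borrowed from a later slide purely for illustration.

```python
def average_length(queries):
    """Return (mean words per query, mean characters per query, spaces included)."""
    n = len(queries)
    words = sum(len(q.split()) for q in queries) / n
    chars = sum(len(q) for q in queries) / n
    return words, chars


sample = ["bypass surgery", "red wine heart disease"]
avg_words, avg_chars = average_length(sample)
print(avg_words, avg_chars)  # 3.0 18.0
```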

Page 58: Health Innovations at Kno.e.sis (July 2014)

Online Health Information Seeking for Cardiovascular Diseases

Jadhav A et al. “What Information about Cardiovascular Diseases do People Search Online?”, 25th European Medical Informatics Conference (MIE 2014), Istanbul, Turkey, August 31 - Sept 3, 2014.

Jadhav A et al. "Online Information Searching for Cardiovascular Diseases: An Analysis of Mayo Clinic Search Query Logs” AMIA 2014 Annual Symposium, Washington DC, Nov 15-19, 2014

Page 59: Health Innovations at Kno.e.sis (July 2014)

• According to CDC, in the United States
– CVD is one of the most common chronic diseases
– CVD is the leading cause of death (1 in every 4 deaths)

• CVD is common across all socioeconomic groups and demographics

• Most of the CVDs require lifelong care and the patient is in charge of managing the disease through self-care

• Online health resources are “significant information supplement” for the patients with chronic conditions

Motivation

59

Page 60: Health Innovations at Kno.e.sis (July 2014)

• Although chronic diseases affect a large population, very few prior studies have investigated online health information searching exclusively for chronic diseases, and especially for CVD.

• In this study, we address this knowledge gap in the community
– by performing population-level intention mining for online health information seeking

Motivation

60

Page 61: Health Innovations at Kno.e.sis (July 2014)

• Data:
– CVD related search queries
– submitted from Web search engines
– and directed users to Mayo Clinic’s consumer health information portal (MayoClinic.com)

• Data timeframe:
– September 2011 to August 2013

• Data collection tool:
– IBM NetInsight On Demand (Web Analytics tool)

• Dataset size:
– 10 million CVD related search queries, which is a significantly large dataset for a single class of diseases.

Dataset Creation

61

Page 62: Health Innovations at Kno.e.sis (July 2014)

• Identification of users intent for health information seeking

• For example (Search Query: Health Category)

Heart palpitations with headache: Symptoms

Tylenol raise blood pressure: Medication, Vital sign

Pump for pulmonary hypertension: Medical device, Disease

Red wine heart disease: Food, Disease

Bypass surgery: Treatment

Research Problem

62

Page 63: Health Innovations at Kno.e.sis (July 2014)

• Using background knowledge to develop a rule-based classification approach
– Using UMLS MetaMap and based on UMLS concepts and semantic types
– To categorize CVD search queries into 14 “consumer oriented” health categories
– Precision: 88.42%, Recall: 86.07% and F-Score: 0.8723
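The reported F-Score is consistent with the harmonic mean of the reported precision and recall, as this quick check shows (the function name is just for illustration):

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


value = f_score(0.8842, 0.8607)
print(round(value, 4))  # 0.8723
```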

Intent Mining for Online Health Information Seeking
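A rule-based categorizer of this kind can be pictured as a lookup from UMLS semantic types to consumer-oriented categories. The sketch below is hypothetical throughout: it assumes the semantic types for a query have already been obtained (for instance from MetaMap output, which is not called here), and the six rules are illustrative guesses rather than the study's actual rule set, which covers 14 categories.

```python
# Hypothetical rules: UMLS semantic type abbreviation -> health category.
RULES = {
    "sosy": "Symptoms",        # Sign or Symptom
    "dsyn": "Disease",         # Disease or Syndrome
    "phsu": "Medication",      # Pharmacologic Substance
    "medd": "Medical device",  # Medical Device
    "food": "Food",            # Food
    "topp": "Treatment",       # Therapeutic or Preventive Procedure
}


def categorize(semantic_types):
    """Map a query's semantic types to zero, one or more health categories."""
    categories = []
    for semantic_type in semantic_types:
        category = RULES.get(semantic_type)
        if category is not None and category not in categories:
            categories.append(category)
    return categories


# Semantic types assumed for "red wine heart disease" and "bypass surgery".
print(categorize(["food", "dsyn"]))  # ['Food', 'Disease']
print(categorize(["topp"]))          # ['Treatment']
```

Because a query may match several rules or none, the output is a list, which matches the later observation that a query can fall into zero, one or more categories.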

63

Page 64: Health Innovations at Kno.e.sis (July 2014)

Methods Overview

64

Page 65: Health Innovations at Kno.e.sis (July 2014)

Intent Mining for Health Information Seeking:Association Rules for

Categorization

Page 66: Health Innovations at Kno.e.sis (July 2014)

• One in every two searches is related to either ‘Diseases and Conditions’ or ‘Vital signs’.

• Other popular health categories that users search for include ‘Symptoms’, ‘Living with’, ‘Treatments’, ‘Food and Diet’ and ‘Causes’.

• Although CVD can be prevented with some lifestyle and diet changes, interestingly very few online health information seekers (OHISs) search for CVD ‘Prevention’.

Intent Mining for Health Information Seeking:

Categorization Results

Page 67: Health Innovations at Kno.e.sis (July 2014)

• A search query can be categorized into zero, one or more health categories

• Using our categorization approach, we categorized 92% of the 10 million CVD related queries into at least one health category

• Most of the queries (around 88%) are categorized into either one or two categories

• Very few CVD queries (4.28%) are categorized into 3 or more categories.

Intent Mining for Health Information Seeking:

Categorization Results

Page 68: Health Innovations at Kno.e.sis (July 2014)

• Most of the top search queries are related to major CVD diseases and conditions.

• At the same time, queries about blood pressure (high/low) and heart rate are also searched frequently

Top CVD Search Queries

Page 69: Health Innovations at Kno.e.sis (July 2014)

• Average search query length for CVD is 3.88 words and 22.22 characters

• Around 80% of the CVD search queries have 3 or more words.

• The analysis implies that CVD search queries are longer than previously reported non-medical as well as medical queries

• Longer search queries also denote users’ interest in more specific information about the disease; subsequently users use more words to narrow down to a particular health topic.

Intent Expression: Search Query Length

Page 70: Health Innovations at Kno.e.sis (July 2014)

• Users predominantly formulate search queries using keywords (80%), though queries with Wh-questions are also significant

• Few queries (2.5%) are formulated as Yes/No type questions

• In Wh-questions, OHISs mostly use “How” and “What” in the search queries, and both of them generally signify that more descriptive information is needed

• Yes/No questions are usually used to check some factual information. In Yes/No questions, OHISs more often start the search queries with “does”, “can” and “is”

Intent Expression: Search Query Types

Page 71: Health Innovations at Kno.e.sis (July 2014)

Comparative Analysis of Online Health Information

Seeking for Chronic Diseases

Cardiovascular Diseases

Arthritis

Cancer Diabetes

Page 72: Health Innovations at Kno.e.sis (July 2014)

Analyzing Temporal Patterns in Online Health

Information Seeking

Page 73: Health Innovations at Kno.e.sis (July 2014)

Analyzing online information seeking for “Food and Diet” in the context of “Health”

Page 74: Health Innovations at Kno.e.sis (July 2014)

74

Social Health Signals

Contact: Ashutosh Jadhav

Page 75: Health Innovations at Kno.e.sis (July 2014)

• Every day, millions of health-related tweets are shared

• Most of these tweets are highly personal and contextual

• Only around 12% of posts are informative*

• Keyword-based search doesn't help

• User has to manually identify informative tweets

How to automate the identification of informative content?

75

Problem: Identifying Signals from Noise

Page 76: Health Innovations at Kno.e.sis (July 2014)

Present high quality, reliable and informative health related information shared over social media by understanding

76

Who: who shared the information? (social network user) – People Analysis

Share what: what content is shared? (social media post) – Content Analysis

When: when is the post generated? – Temporal Analysis

In what context: what is the topic of the message? – Semantic Analysis

On which channel: to which website is the social media post pointing? – Reliability Analysis

With what social effect: how many retweets, Facebook likes/shares, and comments does the post have? – Popularity Analysis

Social Health Signals
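
The six questions above can be pictured as fields on a single record per post. The sketch below is an assumption-laden illustration, not the Social Health Signals implementation: the slides do not say how the analyses are combined, so the `HealthPost` class, the `is_candidate_signal` rule, the trusted-domain set, and the popularity threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Set
from urllib.parse import urlparse


@dataclass
class HealthPost:
    author: str                 # who (People Analysis)
    text: str                   # share what (Content Analysis)
    created_at: datetime        # when (Temporal Analysis)
    topic: Optional[str]        # in what context (Semantic Analysis)
    link: Optional[str] = None  # on which channel (Reliability Analysis)
    retweets: int = 0           # with what social effect (Popularity Analysis)
    likes: int = 0
    comments: int = 0

    def channel(self) -> Optional[str]:
        """Host name of the linked website, or None if there is no link."""
        if not self.link:
            return None
        return urlparse(self.link).netloc.lower() or None

    def popularity(self) -> int:
        """Simple sum of social reactions (a hypothetical measure)."""
        return self.retweets + self.likes + self.comments


def is_candidate_signal(post: HealthPost,
                        trusted_domains: Set[str],
                        min_popularity: int = 1) -> bool:
    """Hypothetical rule: keep posts that have a health topic, link to a
    trusted website, and received at least some social reaction."""
    return (post.topic is not None
            and post.channel() in trusted_domains
            and post.popularity() >= min_popularity)


if __name__ == "__main__":
    post = HealthPost(author="example_user",
                      text="New guidance on managing type 2 diabetes",
                      created_at=datetime(2014, 7, 10),
                      topic="diabetes",
                      link="https://www.example.org/news/1",
                      retweets=3)
    print(is_candidate_signal(post, {"www.example.org"}))
```

Keeping each dimension as its own field leaves room to swap in richer analyses, for example a topic classifier in place of the hand-assigned `topic` string.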

Page 77: Health Innovations at Kno.e.sis (July 2014)

Search and Explore

Top health news

Faceted search (by health topics)

Social Health Signals

Page 78: Health Innovations at Kno.e.sis (July 2014)

Ongoing projects

Page 79: Health Innovations at Kno.e.sis (July 2014)

• Stress, obesity/lifestyle disease, chronic diseases

• Food and diet in the health context

• Keeping the elderly at home as long as possible

• Clinical research – developing blood test for esophageal cancer detection

On the drawing board

Page 80: Health Innovations at Kno.e.sis (July 2014)

• Kno.e.sis is a truly multidisciplinary, pan-University Center of Excellence where world-class technology and computing expertise comes together with clinical research and applications in health, fitness & wellbeing

• Major theme: personalized digital health, patient empowerment, informed patients, epidemiology

• More is covered in my talk on Semantic Data enabling Personalized Digital Health

Take Away

Page 81: Health Innovations at Kno.e.sis (July 2014)

Thank you, and please visit us at

http://knoesis.org

http://knoesis.org/vision

http://knoesis.org/amit/hcls

Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing

Wright State University, Dayton, Ohio, USA

Page 82: Health Innovations at Kno.e.sis (July 2014)

1. Henson C, Thirunarayan K, Sheth A. An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices. 11th International Semantic Web Conference (ISWC 2012), Boston, Massachusetts, USA, November 11-15, 2012.

2. Henson C, Sheth A, Thirunarayan K. Semantic Perception: Converting Sensory Observations to Abstractions. IEEE Internet Computing, vol. 16, no. 2, pp. 26-34, Mar./Apr. 2012, doi:10.1109/MIC.2012.20

3. Henson C, Thirunarayan K, Sheth A. An Ontological Approach to Focusing Attention and Enhancing Machine Perception on the Web. Applied Ontology, vol. 6(4), pp. 345-376, 2011.

4. Perera S, Sheth A, Thirunarayan K, Nair S and Shah N. Challenges in Understanding Clinical Notes: Why NLP Engines Fall Short and Where Background Knowledge Can Help. International Workshop on Data management & Analytics for healthcaRE (DARE) at ACM Conference of Information and Knowledge Management (CIKM), pp. 21-26, Burlingame, USA, Nov 1, 2013.

5. Perera S, Henson C, Thirunarayan K, Sheth A, Nair S. Semantics Driven Approach for Knowledge Acquisition From EMRs. IEEE Journal of Biomedical and Health Informatics, vol. 18, no. 2, pp. 515-524, March 2014, doi:10.1109/JBHI.2013.2282125, PMID: 24058038

Selected References

Page 83: Health Innovations at Kno.e.sis (July 2014)

6. Cameron D, Smith GA, Daniulaityte R, Sheth A et al. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. 46(6): 985-997, 2013. PMID: 23892295

7. Cameron D, Bodenreider O, Yalamanchili H, Danh T et al. A Graph-Based Recovery and Decomposition of Swanson's Hypothesis using Semantic Predications. Journal of Biomedical Informatics 46(2): 238-251, 2013.

8. Jadhav A, Sheth A, Pathak J. Analysis of Online Information Searching for Cardiovascular Diseases on a Consumer Health Information Portal. American Medical Informatics Association (AMIA) Annual Symposium 2014, Washington DC, November 15-19, 2014.

9. Jadhav A, Andrews D, Fiksdal A, Kumbamu A, McCormick JB, et al. Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal. J Med Internet Res 2014;16(7):e160, PMID: 25000537

10. Fiksdal A, Kumbamu A, Jadhav A, Nelsen L, Pathak J, McCormick JB. Evaluating the Process of Online Health Information Searching: A Qualitative Approach to Exploring Consumer Perspectives. In press at J Med Internet Res, 2014.

11. Jadhav A, Wu S, Sheth A, Pathak J. Online Information Seeking for Cardiovascular Diseases: A Case Study from Mayo Clinic. 25th European Medical Informatics Conference (MIE 2014), Istanbul, Turkey, August 31 - Sept 3, 2014.

Selected References