Healthcare Innovations at Kno.e.sis
Presentation to the Boonshoft School of Medicine Executive Committee, July 10, 2014
Amit Sheth
Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)
Wright State University, USA
• Among the top universities in the world in World Wide Web (cf. 10-yr impact, Microsoft Academic Search: among top 10 in June 2014)
• Largest academic group in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications
• Exceptional student success: internships and jobs at top salary (IBM Watson/Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research universities, NLM, startups )
• 100 researchers, including 15 world-class faculty (>3K citations/faculty) and ~45 PhD students, practically all funded
• Extensive research for largely multidisciplinary projects; world class resources; industry sponsorships/collaborations (Google, IBM, …)
Amit Sheth’s PhD students
Ashutosh Jadhav
Hemant Purohit
Vinh Nguyen
Lu Chen
Pavan Kapanipathi
Pramod Anantharam
Sujan Perera
Alan Smith
Swapnil Soni
Maryam Panahiazar
Sarasi Lalithsena
Shreyansh Batt
Kalpa Gunaratna
Delroy Cameron
Sanjaya Wijeratne
Wenbo Wang
Kno.e.sis in 2014 = ~100 researchers (15 faculty, ~50 PhD students)
Special thanks: This presentation covers some of the work of these researchers.
• 80% of doctors will eventually become obsolete: Vinod Khosla, VC and co-founder of Sun Microsystems
• “The Doctor is (Always) In: Reinventing the Doctor-Patient Relationship for the 21st Century” [Dr. J. Shlain]. More data is generated under patient control and outside the clinical system. Patient empowerment, reimbursement changes and AHA.
• #dHealth and #IoT are the two hottest hashtags at CES and SXSW
Healthcare is changing way too fast
The Patient of the Future, MIT Technology Review, 2012
http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/
kHealth: Knowledge-empowered personalized digital mHealth
With applications to: ADHF, GI, Asthma, [Geriatrics]
Contact: Prof. Amit Sheth
Providing actionable information in a timely manner is crucial to avoid information overload or fatigue
Sleep data
Community data
Personal schedule
Activity data
Personal health records
Data Overload for Patients/health aficionados
Weather Application
Detection of events, such as wheezing sound, indoor temperature, humidity, dust, and CO2 level
Weather Application
Asthma Healthcare Application
Action in the Physical World
Close the window at home during the day to avoid CO2 inflow and thereby avoid asthma attacks at night
Public Health
Personal
Population Level
‘FOR human’: Improving Human Experience
Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions and provide actionable information: a canary in a coal mine
knowledge-enabled healthcare
kHealth
• 25 million people in the U.S. are diagnosed with asthma (7 million are children) [1]
• 300 million people suffer from asthma worldwide [2]
• $50 billion spent on asthma alone in a year [2]
• 155,000 hospital admissions in 2006 [3]
• 593,000 emergency department visits in 2006 [3]
[1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/
[2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html
[3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.
Asthma
Asthma is a multifactorial disease with health signals spanning personal, public health, and population levels.
Real-time health signals from personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and population level (e.g., pollen level, CO2) arriving continuously in fine grained samples potentially with missing information and uneven sampling frequencies.
Variety, Volume, Veracity, Velocity, Value
• Can we detect the asthma severity level?
• Can we characterize asthma control level?
• What risk factors influence asthma control?
• What is the contribution of each risk factor?
Semantics
Understanding relationships between health signals and asthma attacks for providing actionable information
Why Big Data to Smart Data? Healthcare example
ICS= inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA= inhaled short-acting beta2-agonist ; *consider referral to specialist
Asthma Control and Actionable Information
Sensors and their observations for understanding asthma
Personal, Public Health, and Population Level Signals for Monitoring Asthma
At Discharge
Non-compliance; Poor economic status; No living assistance
Health Score: Vulnerability Score
• Well Controlled: Low
• Well Controlled: Very low
• Not Well Controlled: High
• Not Well Controlled: Medium
• Poor Controlled: Very High
• Poor Controlled: High
Estimation of readmission vulnerability based on the personal health score
Personal Health Score and Vulnerability Score
Population Level
Personal
Wheeze – Yes
Do you have tightness of chest? – Yes
Observations; Physical-Cyber-Social System; Health Signal Extraction; Health Signal Understanding
<Wheezing=Yes, time, location>
<ChestTightness=Yes, time, location>
<PollenLevel=Medium, time, location>
<Pollution=Yes, time, location>
<Activity=High, time, location>
Features: Wheezing, ChestTightness, PollenLevel, Pollution, Activity, RiskCategory
<PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
...
Expert Knowledge
Background Knowledge
tweet reporting pollution level and asthma attacks
Acceleration readings from on-phone sensors
Sensor and personal observations
Signals from personal, personal spaces, and community spaces
Risk Category assigned by doctors
Qualify
Quantify
Enrich
Outdoor pollen and pollution
Public Health
Well Controlled – continue
Not Well Controlled – contact nurse
Poor Controlled – contact doctor
Health Signal Extraction to Understanding
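The step from labeled observations to a numeric feature tuple and then to an action can be sketched as follows. This is a minimal illustration, not the kHealth implementation: the ordinal codes are assumptions (only Yes=1, Medium=2 and High=3 are implied by the <2, 1, 1, 3, 1> example on the slide), and the names `encode` and `action_for` are hypothetical.

```python
from typing import Dict, List

# Feature order follows the tuple shown on the slide.
FEATURE_ORDER = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]

# Assumed ordinal codes. Only Yes=1, Medium=2 and High=3 are implied by the
# slide's <2, 1, 1, 3, 1> example; No=0 and Low=1 are guesses for illustration.
CODES = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}

# Actions per asthma control level, as listed on the slide.
ACTIONS = {
    "Well Controlled": "continue",
    "Not Well Controlled": "contact nurse",
    "Poor Controlled": "contact doctor",
}


def encode(observations: Dict[str, str]) -> List[int]:
    """Turn labeled observations into a numeric vector in FEATURE_ORDER."""
    return [CODES[observations[name]] for name in FEATURE_ORDER]


def action_for(control_level: str) -> str:
    """Look up the recommended action for a control level."""
    return ACTIONS[control_level]


if __name__ == "__main__":
    obs = {
        "Wheezing": "Yes",
        "ChestTightness": "Yes",
        "PollenLevel": "Medium",
        "Pollution": "Yes",
        "Activity": "High",
    }
    print(encode(obs))
    print(action_for("Not Well Controlled"))
```

The RiskCategory itself is assigned by doctors in the slide's workflow, so the sketch deliberately leaves that label as an input rather than computing it.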
Social streams have been used to extract many near real-time events
Twitter provides access to rich signals, but tweets are noisy, informal, redundant, inconsistently capitalized, and lacking in context
We formalize event extraction from tweets as a sequence labeling problem
How do we know the event phrases, and who creates the training set? (manual creation is ruled out)
Now you know why you’re miserable! Very High Alert for B-ALLERGEN Ragweed I-ALLERGEN pollen. B-FACILITY Oklahoma I-FACILITY Allergy I-FACILITY Clinic says it’s an extreme exposure situation
Idea: Background knowledge is used to create the training set, e.g., typing information becomes the label for a concept
Health Signal Extraction Challenges
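The idea of deriving B-/I- sequence labels from background knowledge can be sketched as a dictionary lookup. This is a simplified illustration under stated assumptions: `BACKGROUND_KNOWLEDGE` is a hypothetical two-entry stand-in for a real knowledge base, and the exact-phrase matching is an assumption rather than the method used in the work presented.

```python
from typing import Dict, List, Tuple

# Hypothetical background knowledge: phrase -> type. The entries mirror the
# slide's example; a real system would draw them from a knowledge base.
BACKGROUND_KNOWLEDGE: Dict[Tuple[str, ...], str] = {
    ("ragweed", "pollen"): "ALLERGEN",
    ("oklahoma", "allergy", "clinic"): "FACILITY",
}


def bio_label(tokens: List[str]) -> List[str]:
    """Assign B-/I-/O labels by matching known phrases against the tokens."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower().strip(".,!?") for t in tokens]
    # Try longer phrases first so they win over shorter overlapping ones.
    phrases = sorted(BACKGROUND_KNOWLEDGE, key=len, reverse=True)
    i = 0
    while i < len(tokens):
        for phrase in phrases:
            n = len(phrase)
            if tuple(lowered[i:i + n]) == phrase:
                entity_type = BACKGROUND_KNOWLEDGE[phrase]
                labels[i] = f"B-{entity_type}"
                for j in range(i + 1, i + n):
                    labels[j] = f"I-{entity_type}"
                i += n
                break
        else:
            # No phrase starts at this token; move on.
            i += 1
    return labels


if __name__ == "__main__":
    tweet = "Very High Alert for Ragweed pollen. Oklahoma Allergy Clinic says it's an extreme exposure situation"
    for token, label in zip(tweet.split(), bio_label(tweet.split())):
        print(token, label)
```

Labels produced this way are noisy, but they remove the need for a manually created training set, which is the point the slide makes.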
intelligence at the edge
Approach 1: Send all sensor observations to the cloud for processing
Approach 2: downscale semantic processing so that each device is capable of machine perception
Henson et al. “An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices”, ISWC 2012.
Use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning
0101100011010011110010101100011011011010110001101001111001010110001101011000110100111
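To make the bit vector idea concrete, here is a toy sketch in which prior knowledge is stored as one integer bit mask per observation, and finding candidate explanations is a bitwise AND. It is loosely inspired by the approach named above and is not the algorithm from Henson et al.; the explanation names and the masks are hypothetical values for illustration, not clinical knowledge.

```python
from typing import Dict, List

# Hypothetical candidate explanations; bit i of a mask refers to EXPLANATIONS[i].
EXPLANATIONS: List[str] = ["asthma_flare", "pollen_allergy", "exertion"]

# Toy prior knowledge: bit i is set when EXPLANATIONS[i] can account for the
# observation. These values are illustrative only.
EXPLAINS: Dict[str, int] = {
    "wheezing": 0b011,         # asthma_flare, pollen_allergy
    "chest_tightness": 0b101,  # asthma_flare, exertion
    "high_pollen": 0b010,      # pollen_allergy
}


def explain(observations: List[str]) -> List[str]:
    """Return the explanations that account for every given observation."""
    candidates = (1 << len(EXPLANATIONS)) - 1  # start with all bits set
    for obs in observations:
        candidates &= EXPLAINS[obs]  # one AND per observation
    return [name for i, name in enumerate(EXPLANATIONS) if (candidates >> i) & 1]


if __name__ == "__main__":
    print(explain(["wheezing", "chest_tightness"]))
    print(explain(["wheezing", "high_pollen"]))
```

Each observation costs a single AND over a fixed-width word, which is why this style of encoding suits resource-constrained devices.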
Efficient execution of machine perception
Complexity: from O(n^3) < x < O(n^4) to O(n)
Efficiency Improvement
• Problem size increased from 10s to 1000s of nodes
• Time reduced from minutes to milliseconds
• Complexity growth reduced from polynomial to linear
Evaluation on a mobile device
1 Translate low-level data to high-level knowledge: Machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making
2 Prior knowledge is the key to perception: Using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web
3 Intelligence at the edge: By downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices
Semantic Perception for smarter analysis: 3 ideas to take away
PREDOSE: Social media analysis driven epidemiology
Application: Prescription drug abuse and beyond
Contact: Delroy Cameron
D. Cameron, G. A. Smith, R. Daniulaityte, A. P. Sheth, D. Dave, L. Chen, G. Anand, R. Carlson, K. Z. Watkins, R. Falck. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. July 2013 (in press)
Kno.e.sis - Ohio Center of Excellence in Knowledge-enabled Computing
CITAR - Center for Interventions Treatment and Addictions Research
http://wiki.knoesis.org/index.php/PREDOSE
Bridging the gap between researcher and policy makers
Early identification of emerging patterns and trends in abuse
PREDOSE: Prescription Drug abuse Online Surveillance and Epidemiology
In 2008, there were 14,800 prescription painkiller deaths*
*http://www.cdc.gov/homeandrecreationalsafety/rxbrief/
• Drug Overdose Problem in US
• 100 people die every day from drug overdoses
• 36,000 drug overdose deaths in 2008
• Close to half were due to prescription drugs
Gil Kerlikowske, Director, ONDCP
Launched May 2011
PREDOSE: Prescription Drug abuse Online Surveillance and Epidemiology
Early Identification and Detection of Trends
Access hard-to-reach Populations
Large Data Sample Sizes
Group Therapy: http://www.thefix.com/content/treatment-options-prison90683
Interviews
Online Surveys
Automatic Data Collection
Not Scalable
Manual Effort
Sample Biases
Epidemiologist
Qualitative Coding
Problems
Computer Scientist
Automate Information Extraction & Content Analysis
PREDOSE: Bringing Epidemiologists and Computer Scientists together
I was sent home with 5 x 2 mg Suboxones. I also got a bunch of phenobarbital (I took all 180 mg and it didn't do shit except make me a walking zombie for 2 days). I waited 24 hours after my last 2 mg dose of Suboxone and tried injecting 4 mg of the bupe. It gave me a bad headache, for hours, and I almost vomited. I could feel the bupe working but overall the experience sucked.
Of course, junkie that I am, I decided to repeat the experiment. Today, after waiting 48 hours after my last bunk 4 mg injection, I injected 2 mg. There wasn't really any rush to speak of, but after 5 minutes I started to feel pretty damn good. So I injected another 1 mg. That was about half an hour ago. I feel great now.
Codes and Triples (subject-predicate-object)
• Code: Suboxone used by injection, negative experience. Triple: Suboxone injection-causes-Cephalalgia
• Code: Suboxone used by injection, amount. Triple: Suboxone injection-dosage amount-2mg
• Code: Suboxone used by injection, positive experience. Triple: Suboxone injection-has_side_effect-Euphoria
Sentiment Extraction
• +ve: feel pretty damn good, feel great
• -ve: experience sucked, didn’t do shit, bad headache
Diverse data types: Entities, Triples, Dosage, Pronoun, Interval, Route of Admin., Relationships, Sentiments
Entity Identification
Drug Abuse Ontology (DAO): 83 classes, 37 properties
Classes: Buprenorphine, Suboxone, Subutex (related by subClassOf)
Slang terms: bupe, bupey (linked by has_slang_term)
33:1 Buprenorphine, 24:1 Loperamide
Ontology, Lexicon, Lexico-ontology, Rule-based Grammar
ENTITIES: Suboxone, Kratom, Heroin
TRIPLES: Suboxone-CAUSE-Cephalalgia
EMOTION: disgusted, amazed, irritated
INTENSITY: more than, a, few of
PRONOUN: I, me, mine, my
SENTIMENT: Im glad, turn out, bad, weird
DRUG-FORM: ointment, tablet, pill, film
ROUTE OF ADM: smoke, inject, snort, sniff
SIDE EFFECT: Itching, blisters, flushing, shaking hands, difficulty breathing
DOSAGE: <AMT><UNIT> (e.g. 5mg, 2-3 tabs)
FREQ: <AMT><FREQ_IND><PERIOD> (e.g. 5 times a week)
INTERVAL: <PERIOD_IND><PERIOD> (e.g. several years)
PREDOSE: Smarter Data through Shared Context and Data Integration
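The DOSAGE, FREQ and INTERVAL patterns of the rule-based grammar can be approximated with regular expressions. This is a rough sketch, not the PREDOSE grammar: the unit, period and indicator word lists below are assumptions chosen to cover the slide's own examples, and the function name `extract` is hypothetical.

```python
import re
from typing import Dict, List

# Building blocks. The word lists are assumed for illustration; the actual
# PREDOSE lexicon is not given on the slide.
AMT = r"\d+(?:\.\d+)?(?:\s*-\s*\d+(?:\.\d+)?)?"   # 5, 2.5, 2-3
UNIT = r"(?:mg|ml|g|tabs?|pills?)"
PERIOD = r"(?:hours?|days?|weeks?|months?|years?)"
FREQ_IND = r"(?:times?\s+(?:a|per|every))"
PERIOD_IND = r"(?:several|few|many|couple of)"

PATTERNS = {
    # DOSAGE: <AMT><UNIT>
    "DOSAGE": re.compile(rf"\b{AMT}\s*{UNIT}\b", re.IGNORECASE),
    # FREQ: <AMT><FREQ_IND><PERIOD>
    "FREQ": re.compile(rf"\b{AMT}\s+{FREQ_IND}\s+{PERIOD}\b", re.IGNORECASE),
    # INTERVAL: <PERIOD_IND><PERIOD>
    "INTERVAL": re.compile(rf"\b{PERIOD_IND}\s+{PERIOD}\b", re.IGNORECASE),
}


def extract(text: str) -> Dict[str, List[str]]:
    """Return every match of each pattern, keyed by pattern name."""
    return {
        label: [m.group(0) for m in pattern.finditer(text)]
        for label, pattern in PATTERNS.items()
    }


if __name__ == "__main__":
    print(extract("I took 5mg, then 2-3 tabs, 5 times a week for several years"))
```

Regular expressions handle the regular parts of the grammar; slang and drug names still need the ontology and lexicon, which is why the slide lists them side by side.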
Data Type: Entity
– Semantic Web Technique: Ontology-driven Identification & Normalization
– Limitations of Other Approaches (ML/NLP, IR): Requires labeled data; unpredictable term frequencies
Data Type: Triple
– Semantic Web Technique: Schema-driven
– Limitations of Other Approaches (ML/NLP, IR): Difficult to develop language model; requires entity disambiguation
Data Type: Sentiment
– Semantic Web Technique: Ontology-assisted Target Entity Resolution
– Limitations of Other Approaches (ML/NLP, IR): Inconsistent data for parse trees or rules; diverse simple & complex slang terms & phrases
PREDOSE: Role of Semantic Web and Ontologies
Loperamide is used to self-medicate for opioid withdrawal symptoms
Loperamide-Withdrawal Discovery
EMR and clinical text analysis:
Intelligence from clinical data
Contact: Sujan Perera
• Active Semantic EMR: high quality, low error, faster completion of patient records
• Predicting patient outcomes and advising discharge decisions based on both structured (billing) data and clinical text (unstructured data)
• Deep understanding of clinical text for Computer Assisted Coding for ICD9 and ICD10 and Computerized Document Improvement (commercial products from ezDI)
[Figure labels: Patient Notes; UMLS; Hypothesis Generation; Hypothesis Filtering; Explanation Module; Explained? (Yes/No); Hypothesis with High Confidence]
Semantic Driven Approach for Knowledge Acquisition from EMRs
Deep clinical text analysis using semantics enhanced NLP has enabled our industry partner ezDI to develop exciting commercial products: ezCDI (Computerized Document Improvement) and ezCAC (Computer Assisted ICD9/ICD10 Coding)
See: http://ezdi.us
Semantics enhanced NLP
• Typical NLP algorithms misclassify linguistic nuances
• Document 1:
– Coronary artery disease listed in the current diagnosis list
– “Send for carotid duplex to rule out carotid artery stenosis given his risk factors and underlying coronary artery disease.” (NLP output says patient does not have coronary artery disease)
• Document 2:
– “Extremities: Warm and dry. No clubbing or cyanosis. No lower extremity edema.”
– “I have advised the patient on the side effect of potential lower extremity edema.” (NLP output says patient has lower extremity edema)
• Document 3:
– “He is not having any symptoms of chest pain or exertional syncope or dizziness.”
– “I advised him that if he experiences chest pain, shortness of breath with exertion or dizziness or syncopal episodes to let us know and we can do appropriate workup.” (NLP output says patient has chest pain, shortness of breath, dizziness, syncopal)
Green – correctly identified entities; Red – misclassified entities
Semantics enhanced NLP
Semantics enhanced NLP
• Domain knowledge can be used to resolve misclassifications
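Symptom: Syncope Is_symptom_of Atrial Fibrillation
Medication: Warfarin Is_medication_for Atrial Fibrillation
Medication: Atenolol Is_medication_for Atrial Fibrillation
Medication: Aspirin Is_medication_for Atrial Fibrillation
• There is strong evidence to suggest that the patient has Atrial Fibrillation.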
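One way to picture how domain knowledge supports or questions an NLP output is to count the mentions in a note that the knowledge base links to a candidate disorder. The sketch below is a simplified illustration, not the ezDI or Kno.e.sis implementation: the triples are the four relationships shown on the slide, while the function names and the `min_evidence` threshold are assumptions.

```python
from typing import List, Tuple

# Relationships shown on the slide, stored as (subject, predicate, object).
KNOWLEDGE: List[Tuple[str, str, str]] = [
    ("Syncope", "is_symptom_of", "Atrial Fibrillation"),
    ("Warfarin", "is_medication_for", "Atrial Fibrillation"),
    ("Atenolol", "is_medication_for", "Atrial Fibrillation"),
    ("Aspirin", "is_medication_for", "Atrial Fibrillation"),
]


def supporting_evidence(mentions: List[str], disorder: str) -> List[Tuple[str, str]]:
    """List (mention, relationship) pairs that link a mention to the disorder."""
    found = set(mentions)
    return [(s, p) for s, p, o in KNOWLEDGE if o == disorder and s in found]


def is_supported(mentions: List[str], disorder: str, min_evidence: int = 2) -> bool:
    """Assumed rule: a disorder is supported when enough linked mentions co-occur."""
    return len(supporting_evidence(mentions, disorder)) >= min_evidence


if __name__ == "__main__":
    note_mentions = ["Warfarin", "Syncope", "Lisinopril"]
    print(supporting_evidence(note_mentions, "Atrial Fibrillation"))
    print(is_supported(note_mentions, "Atrial Fibrillation"))
```

A check like this could flag a negated or hypothetical NLP result for review when the rest of the note contains strong evidence for the same disorder.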
Raw Text to Knowledge
Raw Text: He is off both Diovan and Lotrel. I am unsure if it is due to underlying renal insufficiency. He has actually been on atenolol alone for his hypertension.
Concepts: diovan, lotrel, renal insufficiency, atenolol, hypertension
Knowledge:
– drug: diovan, valtuna, valsartan, antihypertensive agent, atenolol, tenomin, atenix
– disorder: kidney failure, renal insufficiency, kidney disease, disorder, blood pressure disorder, hypertension, systolic hypertension, pulmonary hypertension
– relationships: is used to treat, is a
Inference:
– Patient taking atenolol for hypertension
– Patient has kidney disease
– Patient is on antihypertensive drugs
cTAKES, ezNLP
ezKB
<problem value="Asthma" cui="C0004096"/>
<med value="Losartan" code="52175:RXNORM" />
<med value="Spiriva" code="274535:RXNORM" />
<procedure value="EKG" cui="C1623258" />
ezFIND, ezMeasure, ezCDI, ezCAC
www.ezdi.us
ezHealth Platform
Online Health Information Seeking
Contact: Ashutosh Jadhav
Internet Users in the World
http://www.internetlivestats.com/internet-users/
Around 3 billion (40%) of the world population
Around 300 million (87%) of the US population
• Online health resources
– Easily accessible
– Help to obtain medical information quickly and conveniently
– Can help non-experts to make more informed decisions
– Play a vital role in improving health literacy
Online Health Information Seeking
• With the growing availability of online health resources, consumers are increasingly using the Internet to seek health related information
According to a 2013 Pew Survey*, one in three American adults has gone online to find information about a specific medical condition.
*Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013
Online Health Information Seeking
• One of the most common ways to seek online health information is via Web search engines such as Google, Yahoo! and Bing
According to the Pew Survey*, approximately 8 in 10 online health inquiries start from a search engine.
*Fox S, Duggan M. Pew Internet & American Life Project. 2013. Health online 2013
Online Health Information Seeking
• Analyzing health search logs
– Helps to understand population-level health information needs
– Shows how users formulate search queries (“expression of information need”)
– Offers potentially larger cohorts of real users and their behaviors, e.g. querying behaviors
• Such knowledge can be applied
– to improve the health search experience
– to develop next-generation knowledge and content delivery systems
Motivation
Online Health Information Seeking
Smart Devices
Personal Computers
vs.
Jadhav A et al. “Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal” Journal of Medical Internet Research 2014;16(7):e160 (Impact factor 3.8)
• With the recent exponential increase in usage of smart devices, the percentage of people using smart devices to search for health information is also growing rapidly
Motivation
• Experience of online information searching varies depending on the device used
– Smart devices (SDs): mobile, tablets
– Personal computers (PCs): desktop, laptop
• PCs and SDs have distinct characteristics
– Readability, user experience, accessibility, etc.
Motivation
• In order to improve the health information searching process and to be prepared for the technology shift, it is necessary
– to understand how device choice influences online health information seeking
Study Objective
• Data:
– Health search queries
– launched from PCs and SDs
– submitted from Web search engines
– and directed users to Mayo Clinic’s consumer health information portal (MayoClinic.com)
• Data timeframe:
– June 2011 to May 2013
• Data collection tool:
– IBM NetInsight On Demand (Web Analytics tool)
• Dataset size:
– More than 100 million health search queries for both PCs and SDs
Dataset Creation
• For PCs and SDs, we analyzed and compared
– Frequently searched health categories
– Types of search queries (keyword-based, Wh-questions, Yes/No questions)
– Structural properties of the queries: length of the search queries, usage of the search query operators, usage of special characters
– Misspellings in the health search queries
– Linguistic characteristics of the queries
Comparative Data Analysis
The most-searched health categories are ‘Symptoms’ (1 in 3 search queries), ‘Causes’ and ‘Treatments & Drugs’
One of the least-searched health categories is ‘Prevention’
The distribution of search queries across health categories differs with the device used for search
Search queries from both PCs and SDs follow a similar pattern of distribution between health categories
Intent Mining for Health Information Seeking
Health queries are predominantly formulated using keywords (~85%), followed by Wh- and Yes/No questions
Users ask more health questions from SDs than from PCs
In the health search queries, users ask more
– “what” and “how” questions => descriptive information need
– “can”, “is” and “does” questions => factual information need
Intent Expression: Search Query Type
Average length of the queries from SDs (3.29 words and 18.86 characters) is a bit longer than that of queries from PCs (2.9 words and 17.61 characters)
Health queries tend to be longer than general search queries, indicating users’ interest in more specific information
Intent Expression: Search Query Length
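The query type and query length measures discussed above can be sketched with two small functions. This is an illustrative approximation, not the procedure used in the study: classifying by the first word only, the contents of the two word lists, and counting spaces as characters are all assumptions.

```python
from typing import Tuple

# Assumed word lists; the study's exact lists are not given on the slides.
WH_WORDS = {"what", "how", "why", "when", "where", "who", "which", "whom", "whose"}
YES_NO_STARTERS = {"can", "is", "does", "do", "are", "will", "should", "could"}


def query_type(query: str) -> str:
    """Classify a query as keyword-based, a Wh-question, or a Yes/No question."""
    words = query.lower().split()
    if not words:
        return "keyword"
    if words[0] in WH_WORDS:
        return "wh-question"
    if words[0] in YES_NO_STARTERS:
        return "yes/no-question"
    return "keyword"


def query_length(query: str) -> Tuple[int, int]:
    """Return (number of words, number of characters including spaces)."""
    return len(query.split()), len(query)


if __name__ == "__main__":
    for q in ["how to lower blood pressure",
              "can tylenol raise blood pressure",
              "heart palpitations with headache"]:
        print(q, "|", query_type(q), "|", query_length(q))
```

Averaging `query_length` over the queries from each device group would give figures comparable in kind to the word and character averages reported on the slide.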
Online Health Information Seeking for Cardiovascular Diseases
Jadhav A et al. “What Information about Cardiovascular Diseases do People Search Online?”, 25th European Medical Informatics Conference (MIE 2014), Istanbul, Turkey, August 31 - Sept 3, 2014.
Jadhav A et al. “Online Information Searching for Cardiovascular Diseases: An Analysis of Mayo Clinic Search Query Logs”, AMIA 2014 Annual Symposium, Washington DC, Nov 15-19, 2014
• According to CDC, in the United States
– CVD is one of the most common chronic diseases
– CVD is the leading cause of death (1 in every 4 deaths)
• CVD is common across all socioeconomic groups and demographics
• Most of the CVDs require lifelong care and the patient is in charge of managing the disease through self-care
• Online health resources are “significant information supplement” for the patients with chronic conditions
Motivation
• Although chronic diseases affect a large population, very few prior studies have investigated online health information searching exclusively for chronic diseases, and especially for CVD.
• In this study, we address this knowledge gap in the community
– by performing population-level intention mining for online health information seeking
Motivation
• Data:
– CVD related search queries
– submitted from Web search engines
– and directed users to Mayo Clinic’s consumer health information portal (MayoClinic.com)
• Data timeframe:
– September 2011 to August 2013
• Data collection tool:
– IBM NetInsight On Demand (Web Analytics tool)
• Dataset size:
– 10 million CVD related search queries, which is a significantly large dataset for a single class of diseases.
Dataset Creation
• Identification of users’ intent for health information seeking
• For example (Search Query: Health Category)
– Heart palpitations with headache: Symptoms
– Tylenol raise blood pressure: Medication, Vital sign
– Pump for pulmonary hypertension: Medical device, Disease
– Red wine heart disease: Food, Disease
– Bypass surgery: Treatment
Research Problem
• Using background knowledge to develop a rule-based classification approach
– Using UMLS MetaMap and based on UMLS concepts and semantic types
– To categorize CVD search queries into 14 “consumer oriented” health categories
– Precision: 88.42%, Recall: 86.07% and F-Score: 0.8723
Intent Mining for Online Health Information Seeking
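A rule-based categorizer of this kind can be pictured as two lookups: concept to semantic type, then semantic type to health category. The sketch below is a toy stand-in, not the study's rule set and not the MetaMap API: `TOY_CONCEPTS` replaces MetaMap with a hypothetical phrase table, and the type-to-category mapping is assumed for illustration. The `f_score` function shows how the reported F-score follows from the reported precision and recall.

```python
from typing import Dict, List

# Hypothetical stand-in for MetaMap output: phrase -> UMLS semantic type.
TOY_CONCEPTS: Dict[str, str] = {
    "heart palpitations": "Sign or Symptom",
    "headache": "Sign or Symptom",
    "tylenol": "Pharmacologic Substance",
    "bypass surgery": "Therapeutic or Preventive Procedure",
    "red wine": "Food",
    "heart disease": "Disease or Syndrome",
}

# Assumed rules: semantic type -> consumer-oriented health category.
SEMANTIC_TYPE_TO_CATEGORY: Dict[str, str] = {
    "Sign or Symptom": "Symptoms",
    "Disease or Syndrome": "Disease",
    "Pharmacologic Substance": "Medication",
    "Therapeutic or Preventive Procedure": "Treatment",
    "Food": "Food",
}


def categorize(query: str) -> List[str]:
    """Return the sorted health categories whose concepts appear in the query."""
    q = query.lower()
    return sorted({
        SEMANTIC_TYPE_TO_CATEGORY[sem_type]
        for phrase, sem_type in TOY_CONCEPTS.items()
        if phrase in q
    })


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(categorize("Red wine heart disease"))
    print(categorize("Heart palpitations with headache"))
    print(round(f_score(0.8842, 0.8607), 4))
```

Because a query can mention several concepts, the function returns a list, which matches the slide's examples where one query maps to more than one category.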
• One in every two searches is related to either ‘Diseases and Conditions’ or ‘Vital signs’.
• Other popular health categories that users search for include ‘Symptoms’, ‘Living with’, ‘Treatments’, ‘Food and Diet’ and ‘Causes’.
• Although CVD can be prevented with some lifestyle and diet changes, interestingly very few online health information seekers (OHISs) search for CVD ‘Prevention’.
Intent Mining for Health Information Seeking: Categorization Results
• A search query can be categorized into zero, one or more health categories
• Using our categorization approach, we categorized 92% of the 10 million CVD related queries into at least one health category
• Most of the queries (around 88%) are categorized into either one or two categories
• Very few CVD queries (4.28%) are categorized into 3 or more categories.
Intent Mining for Health Information Seeking: Categorization Results
• Most of the top search queries are related to major CVD diseases and conditions.
• At the same time, queries about blood pressure (high/low) and heart rate are also searched frequently
Top CVD Search Queries
• Average search query length for CVD is 3.88 words and 22.22 characters
• Around 80% of the CVD search queries have 3 or more words.
• The analysis implies that CVD search queries are longer than previously reported non-medical as well as medical queries
• Longer search queries also denote users’ interest in more specific information about the disease; subsequently users use more words to narrow down to a particular health topic.
Intent Expression: Search Query Length
• Users predominantly formulate search queries using keywords (80%), though queries with Wh-questions are also significant
• Few queries (2.5%) are formulated as Yes/No type questions
• In Wh-questions, OHISs mostly use “How” and “What” in the search queries, and both of them generally signify that more descriptive information is needed
• Yes/No questions are usually used to check some factual information. In Yes/No questions, OHISs more often start the search queries with “does”, “can” and “is”
Intent Expression: Search Query Types
Comparative Analysis of Online Health Information Seeking for Chronic Diseases
Cardiovascular Diseases
Arthritis
Cancer
Diabetes
• Every day, millions of health-related tweets are shared
• Most of these tweets are highly personal and contextual
• Only around 12% of posts are informative*
• Keyword-based search doesn't help
• User has to manually identify informative tweets
How to automate the identification of informative content?
Problem: Identifying Signals from Noise
Present high-quality, reliable and informative health-related information shared over social media by understanding:
• Who: who shared the information? (social network user): People Analysis
• share what: what content is shared? (social media post): Content Analysis
• when: when was the post generated?: Temporal Analysis
• in what context: what is the topic of the message?: Semantic Analysis
• on which channel: to which website is the social media post pointing?: Reliability Analysis
• with what social effect: how many retweets, Facebook likes/shares and comments does the post have?: Popularity Analysis
Social Health Signals
• Stress, obesity/lifestyle disease, chronic diseases
• Food and diet in the health context
• Keeping elderly at home as long as possible
• Clinical research – developing blood test for esophageal cancer detection
On the drawing board
• Kno.e.sis is a truly multidisciplinary, pan-University Center of Excellence where world-class technology/computing expertise comes together with clinical research and applications in health, fitness & wellbeing
• Major theme: personalized digital health, patient empowerment, informed patients, epidemiology
• More is covered in my talk on Semantic Data enabling Personalized Digital Health
Take Away
Thank you, and please visit us at
http://knoesis.org
http://knoesis.org/vision
http://knoesis.org/amit/hcls
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, Ohio, USA
1. Henson C, Thirunarayan K, Sheth A. An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices. 11th International Semantic Web Conference (ISWC 2012), Boston, Massachusetts, USA, November 11-15, 2012.
2. Henson C, Sheth A, Thirunarayan K. Semantic Perception: Converting Sensory Observations to Abstractions. IEEE Internet Computing, vol. 16, no. 2, pp. 26-34, Mar./Apr. 2012, doi:10.1109/MIC.2012.20
3. Henson C, Thirunarayan K, Sheth A. An Ontological Approach to Focusing Attention and Enhancing Machine Perception on the Web. Applied Ontology, vol. 6(4), pp. 345-376, 2011.
4. Perera S, Sheth A, Thirunarayan K, Nair S and Shah N. Challenges in Understanding Clinical Notes: Why NLP Engines Fall Short and Where Background Knowledge Can Help. International Workshop on Data management & Analytics for healthcaRE (DARE) at ACM Conference of Information and Knowledge Management (CIKM), pp. 21-26, Burlingame, USA, Nov 1, 2013.
5. Perera S, Henson C, Thirunarayan K, Sheth A, Nair S. Semantics Driven Approach for Knowledge Acquisition From EMRs. IEEE Journal of Biomedical and Health Informatics, vol. 18, no. 2, pp. 515-524, March 2014, doi: 10.1109/JBHI.2013.2282125, PMID: 24058038
Selected References
6. Cameron D, Smith GA, Daniulaityte R, Sheth A et al. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics. 46(6): 985-997, 2013. PMID: 23892295
7. Cameron D, Bodenreider O, Yalamanchili H, Danh T et al. A Graph-Based Recovery and Decomposition of Swanson's Hypothesis using Semantic Predications. Journal of Biomedical Informatics 46(2): 238-251, 2013.
8. Jadhav A, Sheth A, Pathak J. Analysis of Online Information Searching for Cardiovascular Diseases on a Consumer Health Information Portal. American Medical Informatics Association (AMIA) Annual Symposium 2014, Washington DC, November 15-19, 2014
9. Jadhav A, Andrews D, Fiksdal A, Kumbamu A, McCormick JB, et al. Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal. J Med Internet Res 2014;16(7):e160, PMID: 25000537
10. Fiksdal A, Kumbamu A, Jadhav A, Nelsen L, Pathak J, McCormick JB. Evaluating the Process of Online Health Information Searching: A Qualitative Approach to Exploring Consumer Perspectives. In press at J Med Internet Res 2014
11. Jadhav A, Wu S, Sheth A, Pathak J. Online Information Seeking for Cardiovascular Diseases: A Case Study from Mayo Clinic. 25th European Medical Informatics Conference (MIE 2014), Istanbul, Turkey, August 31 - Sept 3, 2014