T. K. Prasad (Krishnaprasad Thirunarayan), Professor of Computer Science and Engineering
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, OH-45435
Big Data and Smart Healthcare
Honors Institute Symposium on Visions of the Future
With the rapid proliferation of mobile phones, social media, and sensors, it is critical to collect the big data they generate and convert it into actionable information that is relevant for decision making. In this session, we explore challenges and approaches for synthesizing relevant background knowledge and inferences that can enable smart healthcare and ultimately benefit the community at large.
Big Data Processing and Smart Healthcare
Krishnaprasad Thirunarayan (T. K. Prasad)
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, OH-45435
Prasad 3
Outline
• Extent and Economics of Healthcare Problem
• Nature of Health-related Big Data
• Cognitive Computing Goals
• Five V’s of Big Data Research
• Our Research
– Semantic Perception for Scalability
– Lightweight Semantics to Manage Heterogeneity
– Hybrid Knowledge Representation and Reasoning
Healthcare Related Big Data for Potential Exploitation: Assorted Examples
• Semi-structured: The electronic medical records (EMR) market was valued at $20 billion in 2012.
• User-generated content / informal text: Social media posts / microblogs discussing depression, drug abuse/liberalization policies, side-effects, etc.
• Sensor data: The Michael J. Fox Foundation Parkinson's disease challenge dataset, which tracked 16 people (9 patients + 7 controls) with 7 mobile phone sensors over 8 weeks, is 12 GB.
• Other Applications: The healthcare industry loses roughly $250 billion per year to fraud.
03/20/2014
Structured vs Unstructured Data
Patient    Disorder                 ICD-9 Code
Patient1   Hypertension             401
Patient2   Atrial fibrillation      427.31
Patient1   Pulmonary hypertension   416
Patient3   Edema                    782.3
Patient4   Hyperthyroidism          242.9
Coronary artery disease, status post four-vessel coronary artery bypass graft surgery on , by Dr. X with a left internal mammary artery to the left anterior descending artery, sequential vein graft to the ramus and first diagonal, and a vein graft to the posterior descending artery. He had normal left ventricular function. He is having some symptoms that are unclear if they are angina or not. I am therefore going to get him scheduled for an exercise Cardiolite stress test.
[Figure: Patient data distribution, structured data vs. unstructured data. Nature of processing: search, mining, decision support, knowledge discovery, and prediction, enabled by NLP + semantics.]
An Example
He is off both Diovan and Lotrel. I am unsure if it is due to underlying renal insufficiency. He has actually been on atenolol alone for his hypertension.
– Use domain models, yet to be created, to monitor asthma patients and their surroundings
• Ultimately, recommend prevention, treatment, and control options …[EVIDENCE-BASED APPROACH]
Volume: (2) Exploiting Embarrassing Parallelism
• Cloud Computing
– Hardware: Networked stock PCs
– Middleware: Replicated storage and restarted computations for fault tolerance
  • E.g., Hadoop file system, Google file system
– Application Programming: Models / languages for distributed computation
  • E.g., Map-Reduce, PIG, HIVE
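The Map-Reduce model named above can be illustrated with a minimal, single-process sketch. It assumes toy records of (patient, list of ICD-9 codes) that loosely echo the earlier table; the function names and the data are hypothetical, and a real deployment would run the map and reduce phases on separate nodes of a framework such as Hadoop rather than in one loop.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit one (code, 1) pair per diagnosis code in a record."""
    _patient, codes = record
    return [(code, 1) for code in codes]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key (here, a simple sum)."""
    return key, sum(values)

def map_reduce(partitions):
    pairs = []
    # Each partition is independent, so in a cluster these loops
    # could run in parallel on different machines.
    for partition in partitions:
        for record in partition:
            pairs.extend(map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Toy data (made up for illustration), split into two partitions.
partitions = [
    [("Patient1", ["401", "416"]), ("Patient2", ["427.31"])],
    [("Patient3", ["782.3"]), ("Patient4", ["242.9", "401"])],
]
counts = map_reduce(partitions)
print(counts)
```

Because no map call depends on any other, the work is "embarrassingly parallel": adding machines shortens the map phase without changing the program.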
Volume with a Twist
Resource-constrained reasoning on mobile-devices
Goal: Boolean encodings to ensure feasibility, efficiency, and economy
Cory Henson’s Thesis Statement
Machine perception can be formalized using semantic web technologies to derive abstractions from sensor data using background knowledge on the Web, and efficiently executed on resource-constrained devices.
Perception Cycle* that exploits background knowledge / domain models
[Figure: a cycle between "Observe Property" and "Perceive Feature" with two steps]
1. Explanation: Abstracting raw data for human comprehension
2. Discrimination: Focus generation for disambiguation and action (incl. human in the loop)
* Based on Neisser’s cognitive model of perception
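One way to picture the two steps of the perception cycle, together with the Boolean-encoding goal from the previous slide, is the sketch below. It is not the algorithm from the thesis: the background knowledge, the feature and property names, and the functions `explain` and `discriminate` are all hypothetical, and the medical associations are illustrative only. Property sets are encoded as integer bit masks so that both steps reduce to cheap bitwise operations, which is the kind of economy a resource-constrained device needs.

```python
# Hypothetical background knowledge: feature -> properties it can exhibit.
KB = {
    "flu": {"fever", "cough", "body_ache"},
    "common_cold": {"cough", "sneezing"},
    "asthma": {"cough", "wheezing"},
}

# Assign one bit per property so a set of properties becomes one integer.
PROPERTIES = sorted({p for props in KB.values() for p in props})
BIT = {p: 1 << i for i, p in enumerate(PROPERTIES)}

def encode(props):
    """Encode a set of property names as a bit mask."""
    mask = 0
    for p in props:
        mask |= BIT[p]
    return mask

KB_MASKS = {feature: encode(props) for feature, props in KB.items()}

def explain(observed):
    """Explanation: features whose properties cover every observed property."""
    obs = encode(observed)
    return sorted(f for f, m in KB_MASKS.items() if m & obs == obs)

def discriminate(observed, candidates):
    """Discrimination: unobserved properties that hold for some, but not
    all, candidate features, so observing them would narrow the set."""
    obs = encode(observed)
    union, common = 0, ~0
    for f in candidates:
        union |= KB_MASKS[f]
        common &= KB_MASKS[f]
    useful = union & ~common & ~obs
    return [p for p in PROPERTIES if BIT[p] & useful]

candidates = explain({"cough"})
next_checks = discriminate({"cough"}, candidates)
refined = explain({"cough", "wheezing"})
print(candidates, next_checks, refined)
```

Here a single observation (cough) leaves three candidate explanations; discrimination proposes what to observe next, and one further observation (wheezing) narrows the candidates to one.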
logical– Refine structure to better estimate parameters
E.g., Medical Data Analytics using probabilistic graphical models (PGMs) + knowledge bases (KBs)
Veracity
Scalable and Agile Big Data Analytics cannot deliver value unless we have confidence and trust in our data.
Open Problem: Develop expressive frameworks for trust to make explicit all aspects that go into trust formation and inferences.
Veracity: Confession of sorts!
Trust is well-known, but is not well-understood.
The utility of a notion testifies not to its clarity but rather to the philosophical importance of clarifying it.
-- Nelson Goodman (Fact, Fiction and Forecast, 1955)
(More on) Value
Discovering gaps and enriching domain models using data
E.g., Semantics Driven Approach for Knowledge Acquisition from EMRs
Idea: Use known associations between diseases, symptoms, and medications that are implicit in real-world scenarios (EMRs) to acquire unknown associations and bridge the gaps in the knowledge base
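The idea can be sketched as follows, under strong simplifying assumptions: each EMR is reduced to a set of disorders and a set of symptoms, and the knowledge base maps a disorder to the symptoms it is known to explain. The knowledge base, the records, and the function names are hypothetical toy values, not data or code from the actual project. A symptom that no recorded disorder explains signals a possible gap, and each co-occurring disorder becomes a candidate association to be reviewed by a domain expert.

```python
from collections import Counter

# Hypothetical, deliberately incomplete knowledge base.
KB = {
    "hypertension": {"headache"},
    "atrial fibrillation": {"palpitations", "fatigue"},
}

def unexplained_symptoms(emr, kb):
    """Symptoms in a record that no recorded disorder explains per the KB."""
    explained = set()
    for disorder in emr["disorders"]:
        explained |= kb.get(disorder, set())
    return emr["symptoms"] - explained

def candidate_associations(emrs, kb):
    """Count (disorder, symptom) pairs that would close a gap in the KB."""
    counts = Counter()
    for emr in emrs:
        for symptom in unexplained_symptoms(emr, kb):
            for disorder in emr["disorders"]:
                counts[(disorder, symptom)] += 1
    return counts

# Toy records, made up for illustration.
emrs = [
    {"disorders": {"hypertension"},
     "symptoms": {"headache", "edema"}},
    {"disorders": {"hypertension", "atrial fibrillation"},
     "symptoms": {"edema", "fatigue"}},
]
candidates = candidate_associations(emrs, KB)
print(candidates.most_common())
```

Pairs that recur across many records rank highest, which gives a reviewer a prioritized list instead of raw text to read.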
(More on) Value
Discovering drug-drug interaction by analyzing search query logs
• E.g., The antidepressant paroxetine and the cholesterol-lowering drug pravastatin were shown to interact, causing high blood sugar, through searches for the two drugs that correlated with searches for “hyperglycemia”, “high blood sugar”, or “blurry vision”.
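A rough sketch of this kind of analysis compares how often symptom terms appear among users who searched for both drugs against users who searched for only one. Everything below is an assumption made for illustration: the log is synthetic, the per-user term sets and the `symptom_rate` function are hypothetical, and the original study's method and numbers are not reproduced here.

```python
SYMPTOMS = {"hyperglycemia", "high blood sugar", "blurry vision"}

def symptom_rate(logs, include, exclude, symptoms):
    """Fraction of users who searched every term in `include`, none in
    `exclude`, and at least one symptom term."""
    cohort = [terms for terms in logs.values()
              if include <= terms and not (exclude & terms)]
    if not cohort:
        return 0.0
    hits = sum(1 for terms in cohort if terms & symptoms)
    return hits / len(cohort)

# Synthetic log: user id -> set of search terms (made up for illustration).
logs = {
    "u1": {"paroxetine", "pravastatin", "hyperglycemia"},
    "u2": {"paroxetine", "pravastatin", "high blood sugar"},
    "u3": {"paroxetine", "pravastatin"},
    "u4": {"paroxetine"},
    "u5": {"paroxetine", "blurry vision"},
    "u6": {"pravastatin"},
    "u7": {"pravastatin"},
    "u8": {"paroxetine"},
}

both = symptom_rate(logs, {"paroxetine", "pravastatin"}, set(), SYMPTOMS)
paroxetine_only = symptom_rate(logs, {"paroxetine"}, {"pravastatin"}, SYMPTOMS)
pravastatin_only = symptom_rate(logs, {"pravastatin"}, {"paroxetine"}, SYMPTOMS)
print(both, paroxetine_only, pravastatin_only)
```

A noticeably higher rate in the "both" cohort than in either single-drug cohort is only a signal worth investigating; establishing an interaction would require proper statistical testing and clinical confirmation.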
Conclusions
• Glimpse of our research organized around the 5 V’s of Big Data
• Discussed role in harnessing Value
– Semantic Perception (Volume)
– Continuum of Semantic models to manage