Transcript
Page 1: Extracting Knowledge from Pydata London 2015

[email protected]

Jointly embedding text and knowledge graphs for information extraction

Armando Vieira

Data Scientist @dataAI and @Stratified Medical

Page 2

Summary

Why do machines struggle to “understand” text?

The challenges of discovering new knowledge in text

Deep Learning to the rescue

Words as distributed vectors

Combining text with knowledge graphs

Page 3

Wouldn't it be great if...

We could extract “knowledge” expressed in text into a machine-readable format?

Page 4

Or if...

We could transform all biomedical information into an automated drug discovery process?

Page 5

NLP: the traditional way

Page 7

Why is understanding text so hard for a machine?

The nightmare of verbs

Nested structures

Syntax is doable; semantics is hard

Other challenges (negations, …)

Long-range interactions

Page 8

Deep learning to the rescue

Page 9

How distributed representations solve the curse of dimensionality problem
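The contrast can be made concrete with a toy example (all words and vectors below are made up): one-hot vectors need one dimension per vocabulary word and treat every pair of distinct words as equally unrelated, while low-dimensional dense vectors expose graded similarity.

```python
# One-hot vs. distributed (dense) representations: a toy illustration.
import math

vocab = ["king", "queen", "apple", "banana"]

def one_hot(word):
    """|V|-dimensional vector with a single 1 -- no notion of similarity."""
    return [1.0 if w == word else 0.0 for w in vocab]

# Hand-crafted 3-d dense vectors: similar words get similar coordinates.
dense = {
    "king":   [0.9, 0.8, 0.1],
    "queen":  [0.8, 0.9, 0.1],
    "apple":  [0.1, 0.1, 0.9],
    "banana": [0.1, 0.2, 0.8],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Any two distinct one-hot vectors are orthogonal: similarity is always 0.
print(cosine(one_hot("king"), one_hot("queen")))   # 0.0
# Dense vectors capture graded similarity at far lower dimensionality.
print(cosine(dense["king"], dense["queen"]) > cosine(dense["king"], dense["apple"]))  # True
```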

Page 12

Distributed representations are powerful

Page 14

The Skip-gram algorithm

IDEA: words that occur together are semantically related (Mikolov et al., 2013)
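The skip-gram setup can be sketched in a few lines: each word predicts its neighbors within a window, so the training data is just (center, context) pairs. A minimal pair generator (toy corpus, illustrative window size):

```python
# Generating skip-gram training pairs: for each center word, emit the
# words in a symmetric window around it.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

tokens = "the cat sat on the mat".split()
print(skipgram_pairs(tokens, window=1))
# e.g. first pairs: ('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ...
```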

Page 15

But it's not the end of the story

The nightmare of verbs

Nested relation structures

Syntax is doable; semantics is hard

Other challenges (negations, …)

Long-range correlations

Page 16

Neural Embeddings

Credit: Omer Levy

Page 17

Mikolov et al. (2013)

Page 20

What does each similarity term mean?

Observe the joint features with explicit representations!

(Example joint features from the slide: uncrowned, Elizabeth, majesty, Katherine, second, impregnate, …)

Page 21

Words as vector operations
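For instance, the classic vec(king) − vec(man) + vec(woman) ≈ vec(queen) analogy can be sketched with hand-made 3-d vectors (real embeddings are learned and have hundreds of dimensions):

```python
# Analogies as vector arithmetic on toy embeddings.
# Dimensions are roughly (royalty, femaleness, concreteness); all values are made up.
import math

vecs = {
    "king":  [0.9, 0.1, 1.0],
    "queen": [0.9, 0.9, 1.0],
    "man":   [0.1, 0.1, 1.0],
    "woman": [0.1, 0.9, 1.0],
    "apple": [0.2, 0.5, 0.1],  # distractor
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(a, b, c):
    """Return the word whose vector is closest to vec(b) - vec(a) + vec(c)."""
    target = [vb - va + vc for va, vb, vc in zip(vecs[a], vecs[b], vecs[c])]
    candidates = (w for w in vecs if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vecs[w], target))

print(analogy("man", "king", "woman"))  # -> 'queen'
```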

Page 22

Gensim implementation in Python

Page 24

How to train the embedding?
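One standard answer is skip-gram with negative sampling (SGNS): push a context word's vector toward its center word and push a few randomly sampled "negative" words away. A toy sketch of a single SGD update follows, not the optimized word2vec implementation; dimensions and learning rate are illustrative:

```python
# One SGD step of skip-gram with negative sampling (SGNS), in plain Python.
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sgns_step(center_vec, context_vec, negative_vecs, lr=0.1):
    """Move the context vector toward the center word and the negative
    samples away from it; returns updated (center, context, negatives)."""
    grad_center = [0.0] * len(center_vec)
    # Positive pair: maximize log sigmoid(center . context).
    g = 1.0 - sigmoid(sum(c * o for c, o in zip(center_vec, context_vec)))
    new_context = [o + lr * g * c for o, c in zip(context_vec, center_vec)]
    grad_center = [gc + g * o for gc, o in zip(grad_center, context_vec)]
    # Negative samples: minimize sigmoid(center . negative).
    new_negs = []
    for neg in negative_vecs:
        gn = -sigmoid(sum(c * n for c, n in zip(center_vec, neg)))
        new_negs.append([n + lr * gn * c for n, c in zip(neg, center_vec)])
        grad_center = [gc + gn * n for gc, n in zip(grad_center, neg)]
    new_center = [c + lr * gc for c, gc in zip(center_vec, grad_center)]
    return new_center, new_context, new_negs

random.seed(0)
center = [random.uniform(-0.5, 0.5) for _ in range(5)]
context = [random.uniform(-0.5, 0.5) for _ in range(5)]
negs = [[random.uniform(-0.5, 0.5) for _ in range(5)] for _ in range(2)]
c2, o2, n2 = sgns_step(center, context, negs)
```

After the step, the context vector's dot product with the center word has increased and each negative's has decreased, which is exactly the SGNS objective.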

Page 25

Advantages

• Efficient coding of words and relations

• Capture both local and global semantics

• Easy to parallelize

• Completely unsupervised

• Can easily handle ambiguity

Page 26

Limitations of word embeddings

• They are (bi)linear models

• Perform poorly on infrequent words

• Cannot incorporate external knowledge

Page 27

Knowledge graphs

Page 32

Why is it hard to expand knowledge?

• Sparsely connected

• Highest-degree nodes are sometimes irrelevant

• Some relation types are too vague

• Hard to integrate local and global (contextual) information

Page 33

Combining text and graphs

Page 34

What’s inside a knowledge graph?

Page 35

Idea: combine KG and text corpus

Page 36

The algorithm

Chang Xu et al
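The details are in the paper, but the flavor of the relational term can be illustrated with a TransE-style translation score, a common way to encode a fact (head, relation, tail) as vec(head) + vec(relation) ≈ vec(tail). All entities and vectors below are toys, not the authors' exact model:

```python
# A TransE-style triple score, the kind of relational term used when
# regularizing word embeddings with a knowledge graph.
import math

def transe_score(h, r, t):
    """Lower score = more plausible triple (L2 distance of h + r from t)."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Toy 2-d embeddings chosen so that london + capital_of lands on uk.
entity = {"london": [0.1, 0.9], "uk": [0.4, 1.0], "paris": [0.9, 0.2]}
relation = {"capital_of": [0.3, 0.1]}

good = transe_score(entity["london"], relation["capital_of"], entity["uk"])
bad = transe_score(entity["paris"], relation["capital_of"], entity["uk"])
print(good < bad)  # True: (london, capital_of, uk) scores better
```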

Page 37

Data

Wikipedia 2014
• 3.5 billion word tokens
• Vocabulary size: 2 million

Freebase
• 44 million topics
• 2.4 billion facts
• > 1,500 relation types

Page 38

Results

Page 40

Beating humans in IQ test?

Analogy 1 Isotherm is to temperature as isobar is to: A) atmosphere; B) wind; C) pressure; D) latitude; E) current.

Analogy 2 Identify two words (one from each set of brackets) that form a connection (analogy) when paired with the words in capitals: CHAPTER (book, verse, read), ACT (stage, audience, play).

Classification Which is the odd one out? (i) calm, (ii) quiet, (iii) relaxed, (iv) serene, (v) unruffled.

Synonym Which word is closest to IRRATIONAL? (i) intransigent, (ii) irredeemable, (iii) unsafe, (iv) lost, (v) nonsensical.

Antonym Which word is most opposite to MUSICAL? (i) discordant, (ii) loud, (iii) lyrical, (iv) verbal, (v) euphonious.

Page 41

On average, yes!

Huang et al., June 2015

Page 42

Resources

http://technology.stitchfix.com/blog/2015/03/11/word-is-worth-a-thousand-vectors/ Chris Moody

https://levyomer.wordpress.com Omer Levy

Page 43

How about biomedical data?

Relatively little data (25 million documents)

Complex interactions between entities

Fat-tailed distributions

Incorporate constraints from Physics, Chemistry & Biology

Non-linearities: complex manifold

Page 44

From here… Neuroinflammation is the local reaction of the brain to infection, trauma, toxic molecules or protein aggregates. The brain resident macrophages, microglia, are able to trigger an appropriate response involving secretion of cytokines and chemokines, resulting in the activation of astrocytes and recruitment of peripheral immune cells. IL-1β plays an important role in this response; yet its production and mode of action in the brain are not fully understood and its precise implication in neurodegenerative diseases needs further characterization. Our results indicate that the capacity to form a functional NLRP3 inflammasome and secretion of IL-1β is limited to the microglial compartment in the mouse brain. We were not able to observe IL-1β secretion from astrocytes, nor do they express all NLRP3 inflammasome components. Microglia were able to produce IL-1β in response to different classical inflammasome activators, such as ATP, nigericin or alum. Similarly, microglia secreted IL-18 and IL-1α, two other inflammasome-linked pro-inflammatory factors. Cell stimulation with α-synuclein, a neurodegenerative disease-related peptide, did not result in the release of active IL-1β by microglia, despite a weak pro-inflammatory effect. Amyloid-β peptides were able to activate the NLRP3 inflammasome in microglia and IL-1β secretion occurred in a P2X7 receptor-independent manner. Thus, microglia-dependent inflammasome activation can play an important role in the brain and especially in neuroinflammatory conditions.

Page 45

To here

If protein A interacts with gene G in cell type C, what other proteins related to A may interact with gene G in cell type C1?

If chemical Q attaches to target T at protein P, what chemicals may attach to target T1 at protein P1?
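Both questions are link-prediction queries over the graph: given (head, relation, ?), rank candidate tails by how well the embeddings complete the triple. A toy sketch using a TransE-style distance (all entity names, relations, and vectors below are hypothetical):

```python
# Answering a link-prediction query (head, relation, ?) by ranking
# candidate tails with a translation-based score over toy embeddings.
import math

def score(h, r, t):
    """TransE-style plausibility: smaller ||h + r - t|| = better."""
    return math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(h, r, t)))

entities = {
    "protein_A": [0.2, 0.5],
    "gene_G":    [0.7, 0.8],
    "gene_H":    [0.1, 0.1],
}
relations = {"interacts_with": [0.5, 0.3]}

def complete(head, rel, candidates):
    """Rank candidate tails for the query (head, rel, ?), best first."""
    h, r = entities[head], relations[rel]
    return sorted(candidates, key=lambda t: score(h, r, entities[t]))

print(complete("protein_A", "interacts_with", ["gene_G", "gene_H"]))
# -> ['gene_G', 'gene_H']
```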

Page 46

Looking for new knowledge

We are not really trying to understand language

Rather

To extract and “validate” novel knowledge.