Extracting Semantic Predication from Medline Citations for Pharmacogenomics. C.B. Ahlers 1, M. Fiszman 2, D.D. Fushman 1, F.M. Lang 1 and T.C. Rindflesch 1. 1 National Center for Biomedical Communications, National Library of Medicine; 2 University of Tennessee, USA (PSB 2007 12:209-220)
Extracting Semantic Predication from Medline Citations for Pharmacogenomics
C.B. Ahlers1, M. Fiszman2, D.D. Fushman1, F.M. Lang1 and T.C. Rindflesch1
1National Center for Biomedical Communications, National Library of Medicine
2University of Tennessee, USA
(PSB 2007 12:209-220)
2/26
Abstract
This paper describes an NLP system (Enhanced SemRep) that identifies core assertions on pharmacogenomics in Medline.
The system was developed by adapting an existing system and depends on the UMLS.
Preliminary evaluation: 55% recall and 73% precision.
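The two reported scores can be combined into a single balanced F-measure for quick comparison. The snippet below is only an illustrative calculation: the F-measure is not reported on the slide, and the function name `f_measure` is our own.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta)."""
    if precision <= 0 or recall <= 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures taken from the preliminary evaluation above.
f1 = f_measure(precision=0.73, recall=0.55)
print(f"F1 = {f1:.3f}")  # F1 = 0.627
```

The harmonic mean sits closer to the lower of the two scores, so the 55% recall dominates the combined figure.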
3/26
1. Introduction (1/3)
Core research in pharmacogenomics investigates the interaction of genes/proteins with therapeutic substances, e.g. in oncology treatment.
Current NLP for pharmacogenomics concentrates on co-occurrence information without specifying exact relations.
Enhanced SemRep complements that approach by representing assertions in text as semantic predications.
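The contrast between co-occurrence and semantic predication can be made concrete with a small data-structure sketch. This is not Enhanced SemRep's internal representation; the class `Predication` and the helper `cooccurrence` are hypothetical names, and the example predication is one that appears in the error analysis later in this talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predication:
    """A directed, typed assertion: subject PREDICATE object."""
    subject: str
    predicate: str
    obj: str

    def __str__(self) -> str:
        return f"{self.subject} {self.predicate} {self.obj}"


def cooccurrence(term_a: str, term_b: str) -> frozenset:
    """Unordered pair: records that two terms appear together, nothing more."""
    return frozenset((term_a, term_b))


pair = cooccurrence("Fluorouracil", "DPD gene")
pred = Predication("Fluorouracil", "INTERACTS_WITH", "DPD gene")
print(sorted(pair))
print(pred)
```

A co-occurrence pair is symmetric and carries no relation name, whereas a predication fixes both the direction of the arguments and the relation between them.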
4/26
1. Introduction (2/3)
Example
These findings therefore demonstrate that dexamethasone (a corticosteroid) is a potent inducer of multidrug resistance-associated protein expression in rat hepatocytes (liver cells) through a mechanism that seems not to involve the classical glucocorticoid receptor pathway.
1. Dexamethasone STIMULATES Multidrug Resistance-Ass
SemGen: identifies semantic predications on the genetic etiology of disease.
Gene and protein names are identified with ABGene.
Since the UMLS Semantic Network does not cover molecular genetics, semantic relations were created:
Gene-disease interactions: ASSOCIATE_WITH, PREDISPOSE, and CAUSE
Gene-gene interactions: INHIBIT, STIMULATE, and INTERACTS_WITH
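The relation inventory above can be read as a lookup table from argument-type pairs to permitted predicates. The sketch below only illustrates that reading; the coarse type labels `"gene"` and `"disease"` and the function `is_licensed` are assumptions, not SemGen's actual data structures.

```python
# Relations listed on the slide, keyed by (subject type, object type).
SEMGEN_RELATIONS = {
    ("gene", "disease"): frozenset({"ASSOCIATE_WITH", "PREDISPOSE", "CAUSE"}),
    ("gene", "gene"): frozenset({"INHIBIT", "STIMULATE", "INTERACTS_WITH"}),
}


def is_licensed(subject_type: str, predicate: str, object_type: str) -> bool:
    """Return True if the predicate is allowed between the two argument types."""
    allowed = SEMGEN_RELATIONS.get((subject_type, object_type), frozenset())
    return predicate in allowed


print(is_licensed("gene", "CAUSE", "disease"))  # True
print(is_licensed("gene", "CAUSE", "gene"))     # False
```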
12/26
3. Methods (1/2)
The pharmacogenomics literature was scrutinized to identify relevant predications not identified by either SemRep or SemGen.
1000 Medline citations containing drug and gene names were retrieved.
400 sentences were selected, covering genetic (gene-disease), genomic (gene-gene), and pharmacogenomic (drug-gene, drug-genome) relations. Relations between genes and population groups, between diseases and population groups, and pharmacological relations (drug-disease, drug-pharmacological effect, drug-drug) were also scrutinized.
13/26
3. Methods (2/2)
After these 400 sentences were processed with SemRep, errors were analyzed and categorized by etiology.
The majority of errors involved the Semantic Network, argument identification (due to “empty” heads), and gene name identification.
Extensive modifications were made for Enhanced SemRep.
Gene name identification was addressed by adding ABGene to the machinery.
14/26
3.1 Modification of Semantic Network for Enhanced SemRep (1/4)
Grouping semantic types: Five broader semantic groups (Substance, Anatomy, Living Being, Process, and Pathology) were defined to permit predications relevant to pharmacogenomics.
Substance: ‘Amino Acid, Peptide, or Protein’, ‘Antib
Word sense ambiguity (28%)
Ticlopidine (a platelet inhibitor) inhibition of phenytoin metabolism mediated by potent inhibition of CYP2C19 (a gene).
“Inhibition” was wrongly mapped to ‘Psychological Inhibition’, yielding:
CYP2C19 AFFECTS Psychological Inhibition.
22/26
5.1 Discussion: Error Analysis (2/2)
Processing coordinate structures (35%)
The cytotoxic activities of mercaptopurine (a drug) and fluorouracil (an antineoplastic antimetabolite) are regulated by thiopurine methyltransferase (TPMT) and dihydropyrimidine dehydrogenase (DPD), respectively.
Fluorouracil INTERACTS_WITH DPD gene. (○)
Mercaptopurine INTERACTS_WITH thiopurine methyltransferase. (X)
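The reading that “respectively” calls for is a position-wise pairing of the two coordinated lists. The sketch below shows only that pairing step, assuming the conjuncts have already been identified. It is not how Enhanced SemRep processes coordination, and `pair_respectively` is a hypothetical helper.

```python
def pair_respectively(subjects, objects, predicate):
    """Pair the i-th subject with the i-th object, as 'respectively' requires."""
    if len(subjects) != len(objects):
        raise ValueError("'respectively' needs coordinated lists of equal length")
    return [(s, predicate, o) for s, o in zip(subjects, objects)]


drugs = ["mercaptopurine", "fluorouracil"]
enzymes = ["thiopurine methyltransferase", "dihydropyrimidine dehydrogenase"]

for triple in pair_respectively(drugs, enzymes, "INTERACTS_WITH"):
    print(" ".join(triple))
```

Pairing by position yields one predication per drug, where a naive reading would produce either all four drug-enzyme combinations or only the closest pair.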
23/26
5.2 Process Medline Citations on CYP2D6 (1/3)
2849 Medline citations contain variant forms of CYP2D6.
5219 predications containing CYP2D6 as an argument were analyzed according to two predication categories (Genetic Etiology and Substance Relations).
The predications were compared with the relations listed for this gene on the PharmGKB (PharmacoGenetic Knowledge Base) Web site.
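Selecting the predications that have a given gene as an argument and tallying them by category can be sketched as follows. The predicate-to-category table is a hypothetical assignment made for illustration, the input triples are toy values and not data from the study, and matching is by exact string, so variant forms of the gene name are not handled.

```python
from collections import Counter

# Hypothetical predicate-to-category assignment, for illustration only.
CATEGORY_OF = {
    "CAUSE": "Genetic Etiology",
    "PREDISPOSE": "Genetic Etiology",
    "ASSOCIATE_WITH": "Genetic Etiology",
    "INTERACTS_WITH": "Substance Relations",
    "INHIBIT": "Substance Relations",
    "STIMULATE": "Substance Relations",
}


def tally_by_category(predications, gene):
    """Count predications with `gene` as subject or object, per category."""
    counts = Counter()
    for subj, pred, obj in predications:
        if gene in (subj, obj):
            counts[CATEGORY_OF.get(pred, "Other")] += 1
    return counts


# Toy input, not results from the study.
toy = [
    ("CYP2D6", "ASSOCIATE_WITH", "Parkinson's disease"),
    ("Fluorouracil", "INTERACTS_WITH", "DPD gene"),
    ("Drug X", "INHIBIT", "CYP2D6"),
]
counts = tally_by_category(toy, "CYP2D6")
print(dict(counts))
```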
24/26
5.2 Process Medline Citations on CYP2D6 (2/3)
Genetic Etiology
267 total predications represented CYP2D6 as an etiologic agent for a disease.
Parkinson’s disease (35), carcinoma