The pharmacogenetics and pharmacogenomics knowledge base: accentuating the knowledge

Published online 21 November 2007 Nucleic Acids Research, 2008, Vol. 36, Database issue D913–D918doi:10.1093/nar/gkm1009

The pharmacogenetics and pharmacogenomicsknowledge base: accentuating the knowledgeTina Hernandez-Boussard, Michelle Whirl-Carrillo, Joan M. Hebert, Li Gong,

Ryan Owen, Mei Gong, Winston Gor, Feng Liu, Chuong Truong, Ryan Whaley,

Mark Woon, Tina Zhou, Russ B. Altman and Teri E. Klein*

Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA

Received September 10, 2007; Revised and Accepted October 24, 2007

ABSTRACT

PharmGKB is a knowledge base that captures therelationships between drugs, diseases/phenotypesand genes involved in pharmacokinetics (PK) andpharmacodynamics (PD). This information includesliterature annotations, primary data sets, PK and PDpathways, and expert-generated summaries ofPK/PD relationships between drugs, diseases/phenotypes and genes. PharmGKB’s website isdesigned to effectively disseminate knowledge tomeet the needs of our users. PharmGKB currentlyhas literature annotations documenting the relation-ship of over 500 drugs, 450 diseases and 600 variantgenes. In order to meet the needs of whole genomestudies, PharmGKB has added new functionalities,including browsing the variant display by chromo-some and cytogenetic locations, allowing theuser to view variants not located within a gene.We have developed new infrastructure forhandling whole genome data, including increasedmethods for quality control and tools for compar-ison across other data sources, such as dbSNP,JSNP and HapMap data. PharmGKB has also addedfunctionality to accept, store, display and query highthroughput SNP array data. These changes allow usto capture more structured information on pheno-types for better cataloging and comparison of data.PharmGKB is available at www.pharmgkb.org

INTRODUCTION

The Pharmacogenetics and Pharmacogenomics Knowl-edge Base (PharmGKB; www.pharmgkb.org) is a publicresource that promotes research into the relationshipsbetween human genotypes, phenotypes and clinicaloutcomes by linking and annotating primary data setsfrom ongoing research and established data from the

literature (1,2). In addition to gene–drug relationships, thePharmGKB also contains data on gene variation,genomics, gene–disease relationships, drug action andpathways. The PharmGKB has developed highly curatedpathways documenting the genes involved in pharmaco-dynamics and pharmacokinetics of a selection of drugs.We have developed an XML format for defining

genotype data, a relational database schema for datastorage and a flexible mechanism for submitting pheno-type data (3). We are also participating in the PML effortto define an XML standard for genotype/phenotype dataexchange. PharmGKB first came online in April 2000, andmajor updates to the user interface appeared in 2002, 2004and 2006. PharmGKB data and knowledge are updatedon a continuous basis. Access is free but requires users toregister for a username and password for viewingindividual subject data.

DISTINGUISHING PHARMGKB USER GROUPS

PharmGKB supplies a wide variety of pharmacogenomicknowledge to a broad range of users, from gene-basedusers to clinical scientists. In order to better serve our usergroups, PharmGKB has redesigned its website homepageas well as the resource and submission tabs to moreeffectively disseminate knowledge in a form that matchesuser needs. PharmGKB has identified four main usergroups of the database: gene-oriented users, drug-orientedusers, bioinformaticians and clinical/disease-orientedinvestigators. PharmGKB organizes knowledge pertinentto each group into user-based views and resources.Figure 1 depicts the PharmGKB homepage that providesdirect links to different user-based knowledge, such asdrug pages or gene pages and more specifically pharma-cokinetic and pharmacodynamic data.

INTEGRATING GENOME-SCALE DATA

High-throughput functional genomic technologies haveresulted in the rapid accumulation of genome-scale

*To whom correspondence should be addressed. Tel: +1 650 725 7013; Fax: +1 650 736 0077; Email: [email protected]

� 2007 The Author(s)

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/

by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

http://creativecommons.org/licenses/

data sets. There is a renewed emphasis on geneticvariation, partly due to the haplotype mapping projectundertaken by HapMap(4). The interest lies in the notionthat a single nucleotide polymorphism (SNP) can con-tribute directly to disease predisposition by modifying agene’s function, or that SNPs can be used as geneticmarkers to tag near neighbor disease causing mutationsthrough association studies and linkage analysis. Geneticvariation is now measured on a genomic scale using SNParrays. The successful analysis of such data sets dependson rapid access to the most current annotation of theSNPs being studied in conjunction with phenotypic data,linkage disequilibrium information and other genomicdata (5). Accordingly, PharmGKB has added function-ality to integrate, aggregate and annotate data fromgenome-wide studies. These data can be queried andviewed by chromosome browsing, gene pages or

individual submission pages. Figure 2 shows the ABCB1gene page that has been populated with an additional 15variants from an Illumina 317K SNP array, assayed bothfrom traditional and high-throughput methods.

VIP GENE AND VARIANT PAGES

Very Important Pharmacogenes, provide annotated infor-mation about genes, variants, haplotypes and splicevariants of particular relevance for pharmacogeneticsand pharmacogenomics. VIP gene pages highlight thekey variants, haplotypes, drugs, diseases and phenotypesassociated with the pharmacogene (Figure 3). A VIP geneis defined as a gene that has well-documented informationabout its involvement in the pharmacodynamics orpharmacokinetics of a drug. There are a total of about200 well-documented VIP genes that were selected by

Figure 1. PharmGKB disseminates highly curated pharmacogenomic knowledge in a form that matches user needs.

D914 Nucleic Acids Research, 2008, Vol. 36, Database issue

Figure 2. PharmGKB VariantPage. CYP2D6 variant page contains information on the variants within the gene boundaries. SNP array data areintegrated with traditional PharmGKB primary data and other external variant data, such as dbSNP and JSNP.

Figure 3. PharmGKB very important pharmacogenomic (VIP) genes: PharmGKB gathers knowledge on key pharmacogenomic genes, whichhighlight important variants, haplotypes, drugs, diseases and phenotypes associated with the gene.

Nucleic Acids Research, 2008, Vol. 36, Database issue D915

pharmacogenomics experts for PharmGKB to annotate.VIP pages are hand-curated and contain informationabout the mapping of the variant to allow crosscomparison with other resources, frequencies and drugand disease associations from key studies and the links tothe literature that document them. VIP variants link fromthe variant page via a flag in the row for that golden pathposition.

PHARMGKB PATHWAYS

Historically, many pharmacogenetic studies have focusedon single genes involved in drug side-effects. There is now

a growing interest in how pathways of interacting genescan affect both drug metabolism and drug response.PharmGKB pathways are drug-centered, depicting candi-date genes for pharmacogenetics and pharmacogenomicsstudies and they provide the means to connect separatedata sets to represent the current knowledge as a cohesivesnapshot. Pathways are created based on communityinterest and involvement/contributions. The diagramshave information content in the shape and color of theicons that represent whether the component is a gene,drug, metabolic intermediate, etc. (Figure 4). The path-ways are interactive: clicking on a gene takes the user tothe gene page from which available genotype andphenotype data and literature citations can be found.

Copyright 2004

Cell Membrane

Cancer Cell

SN-38G

SN-38Irinotecan

UGT1A3UGT1A1

UGT1A8UGT1A9

CES1

CES2

BCHE

Cell Death

ADPRT

XRCC1TDP1

CDC45L

NFKB1

TOP1

SN-38

ABCC2

UGT1A6

Irinotecan CES2 CES1BCHE

Neutropenia

ABCC2ABCB1

ABCC2ABCB1ABCG2

UGT1A10

UGT1A4

Figure 4. PharmGKB Irinotecan Pathway. View of a model human liver cell showing blood, bile, and intestinal compartments, indicating tissue-specific involvement of genes in the irinotecan pathway. Drugs are depicted by purple boxes, transporter genes by turquoise ovals and genes codingfor metabolic enzymes by blue ovals. Available at: http://www.pharmgkb.org/search/pathway/irinotecan/liver.jsp.


http://www.pharmgkb.org/search/pathway/irinotecan/liver.jsp

Drugs and metabolites are represented by rectangles.Clicking on a drug takes the user to the drug page fromwhich available phenotype data and literature citationscan be found. Clicking on a golden arrow presents the userwith phenotype data that support a relationship. Inaddition to the pathway diagram, a summary is providedto describe the content of the graphic. The pathways aregenerated by collaboration of investigators and representa consensus of the opinions of the authors. Currently,these pathways are constructed by hand as graphic imagesand updated by the scientific community every 2 years.Dates of pathway release and updates are posted on thepathway pages.

LITERATURE ANNOTATIONS

PharmGKB has a collection of pharmaco-related litera-ture annotations that are generated and enhancedthrough hand-curation. We allow reference addition toPharmGKB either as a Publication entry or as aLiterature Annotation, which includes additional hand-curated details about the reference and the pharmaco-genomic relationships described in the article. To aidusers, PharmGKB knowledge generation is achieved by

combining highly searchable controlled vocabulary classi-fications of references with brief, detailed free-textdescriptions of the primary research findings. Thisprovides users with the most flexible access to theknowledge we generate about the references, dependingon their different needs and preferences.

PHENOTYPE DATA SETS

PharmGKB houses a variety of phenotype data sets. Allphenotype submissions are accepted and PharmGKB hasexpanded submission methods to include Microsoft Officedocuments, or alternatively, a URL to another establishedarchival public database (e.g. GEO). High-impact pheno-type data sets are curated by hand, while others receiveminimal oversight at PharmGKB. High-impact phenotypedata sets correspond to genotype data submitted toPharmGKB and are typically published in peer-reviewjournals. These data are featured on the PharmGKBwebsite with an interactive display, curated annotationsand downloadable Excel files. All phenotype files onPharmGKB are fully text-searchable. PharmGKB alsooffers links to web sites with controlled vocabularies in

Figure 5. PharmGKB phenotype data set. PharmGKB displays phenotype data. In the figure, there are five tabs: (i) Overview presents key indexingterms and a summary of the file, (ii) Publications points to key publication summaries, (iii) Column Headers defines the data present in the file and(iv) Individualized Data provides a data browser to look at the primary data. (v) External Data Links point to external data sources relevant to thedata set, such as exon arrays stored in other databases.

Nucleic Acids Research, 2008, Vol. 36, Database issue D917

order to encourage investigators to optimize documenta-tion of their deposits.The phenotype data set knowledge is available to the

general public via an integrated tab-display (Figure 5).For all data sets in PharmGKB, genes, drugs, diseases andcategories of pharmacogenetic evidence are tagged andindexed for querying. Curated phenotype data sets arereviewed and include meaningful phenotypic annotationsrelated to pharmacogenomic research, from clinical-metabolite data to protein constructs. Genotype and/orphenotype data can be downloaded by clicking on arrowsin the upper right corner of the webpage.

FUTURE DIRECTIONS

PharmGKB catalyzes the generation of new knowledge inpharmacogenomics through the development, implemen-tation and dissemination of a public resource focused onboth data and knowledge. In the short-term, this resourcewill facilitate basic research. In the long-term, it willimpact how medicine is delivered. Our future work focuseson detailed annotation of individual human polymorph-isms (or haplotypes) that are important for drug responsephenotypes. We also are active in creating consortia ofinvestigators interested in pooling pharmacogenomic data

sets in order to improve population coverage andstatistical power.

ACKNOWLEDGEMENTS

PharmGKB is supported by NIH/NIGMS Grantno. UO1GM61374. Funding to pay the OpenAccess publication charges for this article was providedby NIH/NIGMS grant no. UO1GM61374.

Conflict of interest statement. None declared.

REFERENCES

1. Thorn,C.F., Klein,T.E. and Altman,R.B. (2005) PharmGKB: thepharmacogenetics and pharmacogenomics knowledge base. MethodsMol. Biol., 311, 179–191.

2. Hewett,M., Oliver,D.E., Rubin,D.L., Easton,K.L., Stuart,J.M.,Altman,R.B. and Klein,T.E. (2002) PharmGKB: the pharmaco-genetics knowledge base. Nucleic Acids Res., 30, 163–165.

3. Whirl-Carrillo,M.M.W., Thorn,C.F., Klein,T.E. and Altman,R.B.(2007) The PharmGKB XML Schema: an interchange format forgenotype-phenotype databases. Human Mutation. In press.

4. Consortium,I.H. (2005) A haplotype map of the Human genome.Nature, 437, 1299–1320.

5. Hernandez-Boussard,T., Klein,T.E. and Altman,R.B. (2006)Pharmacogenomics: the relevance of emerging genotypingtechnologies. MLO Med. Lab. Obs., 38, 24, 26–30.


https://www.researchgate.net/publication/11605605_PharmGKB_the_Pharmacogenetics_Knowledge_Base?el=1_x_8&enrichId=rgreq-160fa31e-5862-4714-b108-ef0fd372f4c8&enrichSource=Y292ZXJQYWdlOzU4MTU5MDM7QVM6OTkwMDE5NDg5NjY5MjVAMTQwMDYxNTMxMjc4Nw==



https://www.researchgate.net/publication/7170639_Pharmacogenomics_The_relevance_of_emerging_genotyping_technologies?el=1_x_8&enrichId=rgreq-160fa31e-5862-4714-b108-ef0fd372f4c8&enrichSource=Y292ZXJQYWdlOzU4MTU5MDM7QVM6OTkwMDE5NDg5NjY5MjVAMTQwMDYxNTMxMjc4Nw==



https://www.researchgate.net/publication/279185370_A_Haplotype_Map_of_the_Human_Genome?el=1_x_8&enrichId=rgreq-160fa31e-5862-4714-b108-ef0fd372f4c8&enrichSource=Y292ZXJQYWdlOzU4MTU5MDM7QVM6OTkwMDE5NDg5NjY5MjVAMTQwMDYxNTMxMjc4Nw==

https://www.researchgate.net/publication/279185370_A_Haplotype_Map_of_the_Human_Genome?el=1_x_8&enrichId=rgreq-160fa31e-5862-4714-b108-ef0fd372f4c8&enrichSource=Y292ZXJQYWdlOzU4MTU5MDM7QVM6OTkwMDE5NDg5NjY5MjVAMTQwMDYxNTMxMjc4Nw==

https://www.researchgate.net/publication/6127437_The_Education_Potential_of_the_Pharmacogenetics_and_Pharmacogenomics_Knowledge_Base_PharmGKB?el=1_x_8&enrichId=rgreq-160fa31e-5862-4714-b108-ef0fd372f4c8&enrichSource=Y292ZXJQYWdlOzU4MTU5MDM7QVM6OTkwMDE5NDg5NjY5MjVAMTQwMDYxNTMxMjc4Nw==



The pharmacogenetics and pharmacogenomics knowledge base: accentuating the knowledge

Documents