Part I: Tips and techniques from curators Kate Dreher TAIR, AraCyc, PMN Carnegie Institution for Science

Mar 27, 2015

Claire Reyes
Transcript
Slide 1

Part I: Tips and techniques from curators
Kate Dreher
TAIR, AraCyc, PMN
Carnegie Institution for Science

Slide 2
Sometimes, one gene isn't enough...
Scientists often want to work with more than one gene or protein related by a common feature.
TAIR (and the PMN) offer some basic tools:
- to create customized data sets (e.g. lists of genes or proteins)
- to add more information to data sets
- to analyze data sets

Slide 3
Creating customized data sets using TAIR and the PMN
Data sets can be based on many different criteria:
- Overall sequence alignment (DNA or protein)
- Sequence motifs (DNA or protein)
- Protein domains and biochemical properties
- Gene/Protein function
- Subcellular location
- Molecular function
- Biological process
- Expression pattern
- Biochemical pathway
- Mapping region
- Phenotype
- Gene families
How do you generate these data sets?

Slide 4
Creating customized data sets: TAIR
Data sets can be obtained using several strategies at TAIR:
- Advanced search pages
- Data-mining tools

Slide 5
Creating customized data sets: PMN
Data sets can be obtained using several strategies at the PMN.
What is the PMN?
It is the home of AraCyc, the Arabidopsis metabolic pathway database.
The Plant Metabolic Network (PMN):
- Maintains a set of metabolic pathway databases for Arabidopsis and other plants
- Provides tools to analyze metabolic data
- Generates new metabolic pathway databases for crops and other important plants
www.plantcyc.org

Slide 6
Metabolic pathway data in AraCyc at the PMN
AraCyc Pathway pages contain several types of data:
- Pathway
- Enzyme
- Gene
- Reaction
- Compound
- Evidence Code

Slide 7
Metabolic pathway data in AraCyc at the PMN
Pathway pages contain curated comments and useful links.

Slide 8
Creating customized data sets: PMN
Data sets can be obtained using several strategies at the PMN:
- Advanced search page
- Data-mining tools (*coming soon*)
- Metabolic pathway pages

Slide 9
Enhancing customized data sets
Additional information can be obtained for your data set:
- Bulk data retrieval tool
- FTP files

Slide 10
Customized data sets: case studies
You have mapped a mutation that disrupts flower development to a region of Chromosome 1.
What are some good candidates in the mapping interval?
- Get a list of all the genes in the mapping interval and find candidates involved in flower development
- Find all the associated gene function (GO) and expression (PO) annotations for the candidate genes
- Obtain gene confidence scores for all associated gene models to choose sequence for complementation

Slide 11
Customized data sets: Flower development
Get a list of all the genes in a mapping interval involved in flower development
PVV4.1NCC1

Slide 12
Customized data sets: Flower development
AT1G09000: MAP kinase kinase kinase activity, cellular component unknown, embryo, flower, flower development, kinase activity, leaf, petal differentiation and expansion stage, response to oxidative stress, root, seed, shoot apex, whole plant, D bilateral stage, E expanded cotyledon stage, F mature embryo stage
Choose gene models to express for complementation experiments...
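
The first step of the flower-development case study (Slide 10), listing the genes in a mapping interval and keeping those annotated to flower development, can be sketched in Python. This is only an illustration of the filtering logic, not how the TAIR search pages work internally. The `Gene` record, both helper functions, all coordinates, and the `GENE_B` and `GENE_C` entries are hypothetical; only the locus AT1G09000 and its two annotation terms come from Slide 12.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    locus: str                      # locus identifier, e.g. "AT1G09000"
    chromosome: int
    start: int                      # coordinates here are invented for illustration
    end: int
    annotations: set = field(default_factory=set)


def genes_in_interval(genes, chromosome, left, right):
    """Return genes on `chromosome` that overlap the interval [left, right]."""
    return [
        g for g in genes
        if g.chromosome == chromosome and g.start <= right and g.end >= left
    ]


def with_annotation(genes, term):
    """Keep only genes that carry the given annotation term."""
    return [g for g in genes if term in g.annotations]


# Hypothetical records: the positions are NOT real genome coordinates.
genes = [
    Gene("AT1G09000", 1, 2_890_000, 2_895_000,
         {"flower development", "kinase activity"}),
    Gene("GENE_B", 1, 2_900_000, 2_903_000, {"response to oxidative stress"}),
    Gene("GENE_C", 1, 9_000_000, 9_002_000, {"flower development"}),
]

in_interval = genes_in_interval(genes, chromosome=1, left=2_800_000, right=3_000_000)
candidates = with_annotation(in_interval, "flower development")
print([g.locus for g in candidates])  # ['AT1G09000']
```

Filtering by position first and by annotation second mirrors the order of the steps on the slide; real annotation and position data would come from the TAIR search pages or bulk download files.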
Slide 13
Customized data sets: Flower development
Obtain gene confidence scores for all associated gene models

Slide 14
Creating customized data sets
You work on a transcription factor that affects jasmonic acid biosynthesis.
Do JA biosynthetic genes share common sequences in their promoters?
- Obtain a list of all the genes involved in JA biosynthesis
- Get upstream promoter sequences
- Search for over-represented DNA sequences in promoters

Slide 15
Customized data sets: JA biosynthesis
jasmonic acid

Slide 16
Customized data sets: JA biosynthesis
Take this gene list to TAIR... to get upstream sequences

Slide 17
Customized data sets: JA biosynthesis
Get upstream promoter sequences

Slide 18
Customized data sets: JA biosynthesis
Search for over-represented or prevalent DNA sequences in promoters
Use the Motif Analyzer in TAIR to identify common 6-mers
AT1G69490 AT1G48270 AT1G11870 AT1G12820

Slide 19
Creating customized data sets
You are studying a protein with an exciting new domain: Thr-x-Ala-x-Ile-x-Arg
Are there other TxAxIxR proteins? Do they share additional domains?
- Find all of the proteins that have the TxAxIxR domain
- Identify all of the other domains found in those proteins

Slide 20
Customized data sets: TxAxIxR proteins
Find all of the proteins that have the TxAxIxR domain

Slide 21
Customized data sets: TxAxIxR proteins
Identify all of the other domains found in those proteins

Slide 22
Analyzing data sets
Sometimes you want to analyze data sets.
We have a few analysis tools:
- Analyze = DISPLAY data in a visual manner with a few statistics
- Data must be pre-cleaned
If you want to display quantitative metabolic data on genes, enzymes or compounds: OMICS viewer
If you want to look for over-represented annotations for a list of genes or proteins: GO categorization tool
- All the genes up-regulated in a mutant
- All of the proteins found in the ovule

Slide 23
GO categorization
Classify your list of genes/proteins using GO annotations

Slide 24
GO categorization

Slide 25
...
or use a tool at AmiGO (on hand-out)

Slide 26
Putting TAIR and the PMN to work for you
Use TAIR to find detailed information for specific genes or proteins
- Locus page, gene model page, protein page: many sections, many data types, many external links
- GBrowse: many tracks
- New gene confidence scores as part of TAIR9 release
Use TAIR and the PMN to generate and work with customized data sets
- Create and add data to lists of proteins and genes
- Specific and Advanced Search pages
- Motif analysis tools
- FTP files with large data sets
Visualize and analyze data
- OMICs viewer (PMN)
- GO categorization (TAIR)
If you're having trouble getting any information you want...

Slide 27
We are here to help!
www.arabidopsis.org [email protected]
www.plantcyc.org [email protected]
Please visit us and ask questions at the Curation Booth!
Workshop Part II: Practice sets and individual help
www.arabidopsis/portals/education/presentations/2009/ICAR/ICAR_workshop_2009.jsp

Slide 28
Acknowledgements
TAIR, AraCyc, and the PMN
Current Curators:
- Tanya Berardini (lead curator, functional annotation)
- David Swarbreck (lead curator, structural annotation)
- Peifen Zhang (Director and lead curator, metabolism)
- A. S. Karthikeyan (curator)
- Philippe Lamesch (curator)
- Donghui Li (curator)
- Rajkumar Sasidharan (curator)
Recent Past Contributors:
- Debbie Alexander (curator)
- Christophe Tissier (curator)
- Hartmut Foerster (curator)
Tech Team Members:
- Bob Muller (Manager)
- Larry Ploetz (Sys. Administrator)
- Raymond Chetty
- Anjo Chi
- Vanessa Kirkup
- Cynthia Lee
- Tom Meyer
- Shanker Singh
- Chris Wilks
Metabolic Pathway Software:
- Peter Karp and SRI group
Eva Huala (Director and Co-PI)
Sue Rhee (PI and Co-PI)

Slide 29
Part I: Tips and techniques from curators
Bonus slides...
Kate Dreher
TAIR, AraCyc, PMN
Carnegie Institution for Science

Slide 30
Customized data sets: Flower development
Find all the associated GO terms and PO terms and get evidence codes

Slide 31
Customized data sets: JA biosynthesis
Obtain a list of all the genes involved in JA biosynthesis

Slide 32

Slide 33
Another option
Use pathway page

Slide 34
OMICs Viewer

Slide 35

Slide 36
Customized data sets: JA biosynthesis
Experimental results provide a more detailed sequence: (A or T)C(A or C or G)TCGGT(G or T)A
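
The promoter consensus above and the TxAxIxR protein motif from Slide 19 can both be written as regular expressions. The following is a minimal Python sketch, assuming the sequences are already available as plain strings. It is not the TAIR Motif Analyzer or pattern-matching tool. The helper `find_sites` and both example sequences are made up for illustration.

```python
import re

# Promoter consensus from Slide 36: (A or T)C(A or C or G)TCGGT(G or T)A
PROMOTER_CONSENSUS = re.compile(r"[AT]C[ACG]TCGGT[GT]A")

# Protein motif from Slide 19: Thr-x-Ala-x-Ile-x-Arg, where x is any residue
TXAXIXR = re.compile(r"T.A.I.R")


def find_sites(pattern, sequence):
    """Return (0-based start, matched text) for each non-overlapping match.

    Only the strand given is scanned; a DNA search would normally also
    check the reverse complement.
    """
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


# Invented sequences, NOT real Arabidopsis data.
promoter = "GGATCATCGGTGACC"
protein = "MKTGALISRPPTQAWIERL"

print(find_sites(PROMOTER_CONSENSUS, promoter))  # [(3, 'TCATCGGTGA')]
print(find_sites(TXAXIXR, protein))              # [(2, 'TGALISR'), (11, 'TQAWIER')]
```

Each bracketed character class corresponds to one "X or Y" position in the consensus, and each `.` to one x in the protein motif.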