An Introduction to Apollo A webinar for the Phascolarctos cinereus research community Monica Munoz-Torres, PhD | @monimunozto Berkeley Bioinformatics Open-Source Projects (BBOP) Lawrence Berkeley National Laboratory Joint Genome Institute | University of California Berkeley | U.S. Department of Energy 04 August, 2015
APOLLO DEVELOPMENT
APOLLO DEVELOPERS
http://GenomeArchitect.org/
Nathan Dunn
Eric Yao (JBrowse, UC Berkeley)
Deepak Unni, Colin Diesh (Elsik Lab, University of Missouri)
Suzi Lewis, Principal Investigator (BBOP)
Moni Munoz-Torres
Stephen Ficklin (GenSAS, Washington State University)
ANNOTATION PLAN
Introduction
[Figure: annotation plan workflow. Steps shown: Assembly freeze; Automated annotation; Manual annotation (using Web Apollo); Curation freeze; Merge: automated + manual; Genome-wide & gene-specific comparative analyses; Synthesis & dissemination. QC checkpoints appear twice along the workflow.]
OUTLINE
Web Apollo: Collaborative Curation and Interactive Analysis of Genomes
• THE GENE MODEL: prediction, annotation, curation
• APOLLO: empowering collaborative curation
• APOLLO on THE WEB: becoming acquainted
• EXAMPLE: demonstrations
BY THE END OF THIS TALK you will
• Better understand genome curation in the context of annotation: assembled genome → automated annotation → manual annotation.
• Become familiar with the environment and functionality of the Apollo genome annotation editing tool.
• Learn to identify homologs of known genes of interest in a newly sequenced genome.
• Learn about corroborating and modifying automatically annotated gene models using available evidence in Apollo.
FOOD FOR THOUGHT: review on your own for manual annotation
To remember: biological concepts to better understand manual annotation.
• GLOSSARY: from contig to splice site
• CENTRAL DOGMA: in molecular biology
• WHAT IS A GENE? Defining your goal
• TRANSCRIPTION: mRNA in detail
• TRANSLATION: and other definitions
• GENOME CURATION: steps involved
CURATING GENOMES
What is a gene?
• The definition of a gene paints a very complex picture of molecular activity, and it is a continuously evolving concept.
• From the Sequence Ontology (SO): “A gene is a locatable region of genomic sequence, corresponding to a unit of inheritance, which is associated with regulatory regions, transcribed regions and/or other functional sequence regions”. “Evolving Concept” at http://goo.gl/LpsajQ
• In our lifetime, the Encyclopedia of DNA Elements (ENCODE) project updated this concept yet again. Long transcripts & dispersed regulation!
“A gene is a DNA segment that contributes phenotype/function. In the absence of demonstrated function, a gene may be characterized by sequence, transcription or homology.”
https://www.encodeproject.org/
What is a gene? Considerations
• Consider:
  - A gene is a genomic sequence (DNA or RNA) directly encoding functional product molecules, either RNA or protein.
  - If several functional products share overlapping regions, we take the union of all overlapping genomic sequences coding for them.
  - This union must be coherent (i.e., processed separately for final protein and RNA products) but does not require that all products necessarily share a common subsequence.
• “The gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products.”
Gerstein et al., 2007. Genome Res.
TRANSLATION: reading frame
• A reading frame is a manner of dividing the sequence of nucleotides in mRNA (or DNA) into a set of consecutive, non-overlapping triplets (codons).
• Three frames can be read in the 5’ → 3’ direction. Given that DNA has two anti-parallel strands, an additional three frames can be read on the anti-sense strand. Six total possible reading frames exist.
• In eukaryotes, only one reading frame per section of DNA is biologically relevant at a time: it has the potential to be transcribed into RNA and translated into protein. This is called the OPEN READING FRAME (ORF).
  - ORF = Start signal + coding sequence (divisible by 3) + Stop signal
• The sections of the mature mRNA transcribed with the coding sequence but not translated are called UnTranslated Regions (UTR); one at each end.
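To make the six reading frames concrete, here is a minimal Python sketch. It is illustrative only and not part of Apollo; the function names and the short example sequence are assumptions made for this example.

```python
# Illustrative sketch: split a DNA sequence into codons in all six reading frames.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the anti-sense strand, read 5' -> 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def codons(dna, offset):
    """Consecutive, non-overlapping triplets starting at the given offset."""
    return [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]

def six_frames(dna):
    """Map frame names (+1, +2, +3, -1, -2, -3) to their codon lists."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = codons(seq, offset)
    return frames

frames = six_frames("ATGGCCTAAGC")  # made-up example sequence
for name, triplets in frames.items():
    print(name, " ".join(triplets))
```

Frame +1 of this toy sequence reads ATG GCC TAA, which has the shape of an ORF: a Start signal, a coding sequence, and a Stop signal.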
TRANSLATION: reading frame, splice sites
• The spliceosome catalyzes the removal of introns and the ligation of flanking exons.
  - introns: spaces inside the gene, not part of the coding sequence
  - exons: expression units (of the coding sequence)
• Splicing “signals” (from the point of view of an intron):
  - There is a 5’ end splice “signal” (site): usually GT (less common: GC)
  - And a 3’ end splice site: usually AG
  - …]5’-GT/AG-3’[…
• It is possible to produce more than one protein (polypeptide) sequence from the same genic region by alternatively bringing exons together: alternative splicing. For example, the gene Dscam (Drosophila) has 38,000 alternatively spliced mRNAs (isoforms).
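The splice signals above can be checked mechanically. The following Python sketch is a hypothetical helper, not an Apollo function: it labels an intron sequence (read 5’ → 3’ on the strand of the gene) by its first and last two bases.

```python
# Illustrative sketch: classify an intron by its donor (5') and acceptor (3') dinucleotides.
def classify_intron(intron):
    """Return a label for the splice signals at the ends of an intron."""
    donor, acceptor = intron[:2].upper(), intron[-2:].upper()
    if donor == "GT" and acceptor == "AG":
        return "canonical GT-AG"
    if donor == "GC" and acceptor == "AG":
        return "less common GC-AG"
    return f"non-canonical {donor}-{acceptor}"

# Made-up intron sequences for demonstration.
print(classify_intron("GTAAGTTTTCAG"))
print(classify_intron("GCAAGTCAG"))
print(classify_intron("ATAAAC"))
```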
TRANSLATION: now in your mind
[Figure: "Gene structure" by Daycd, Wikimedia Commons]
TRANSLATION: reading frame, phase
• Introns can interrupt the reading frame of a gene by inserting a sequence between two consecutive codons (phase 0),
• between the first and second nucleotide of a codon (phase 1),
• or between the second and third nucleotide of a codon (phase 2).
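The position of an intron relative to the codons is usually called its phase: 0 when it falls between two codons, 1 after the first nucleotide of a codon, and 2 after the second. A minimal sketch, assuming we are given the lengths of the coding portions of each exon in transcript order (the function name and example lengths are hypothetical):

```python
# Illustrative sketch: intron phase from the coding lengths of the exons.
def intron_phases(cds_exon_lengths):
    """Phase of each intron: coding bases upstream of the intron, modulo 3."""
    phases = []
    coding_bases = 0
    for length in cds_exon_lengths[:-1]:  # there is no intron after the last exon
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases

# Three coding exons of 90, 100, and 62 bases give two introns.
print(intron_phases([90, 100, 62]))
```

Here the first intron falls between codons (phase 0) and the second interrupts a codon after its first nucleotide (phase 1).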
"Exon and Intron classes”. Licensed under Fair use via Wikipedia
CURATING GENOMES: overview
1. Prediction of gene models
2. Annotation of gene models
3. Manual annotation
GENE PREDICTION
• The identification of structural features of the genome:
  - Primarily focused on protein-coding genes.
  - Also predicts transfer RNAs (tRNA), ribosomal RNAs (rRNA), regulatory motifs, long and small non-coding RNAs (ncRNA), repetitive elements (masked), etc.
  - Two methods for identification.
  - Some are self-trained and some must be trained.
GENE PREDICTION: methods for discovery
1) Ab initio:
  - based on DNA composition,
  - deals strictly with genomic sequences,
  - makes use of statistical approaches to search for coding regions and typical gene signals.
  - E.g., Augustus, GENSCAN
2) Homology-based:
  - evidence-based,
  - finds genes using either similarity searches in the main databases or experimental data including RNAseq, expressed sequence tags (ESTs), full-length complementary DNAs (cDNAs), etc.
  - E.g., fgenesh++, Just Annotate My genome (JAMg), SGP2
GENE ANNOTATION
Integration of data from computational & experimental evidence with data from prediction tools, to generate a reliable set of structural annotations. Involves:
1) ab initio predictions
2) assessment of biological evidence to drive the gene prediction process
3) synthesis of these results to produce a set of consensus gene models
Gene models may be organized into “sets” using:
• automatic integration of predicted sets (combiners); e.g., GLEAN, EvidenceModeler, or
• tools packaged into pipelines; e.g., MAKER, PASA, Gnomon, Ensembl, etc.
In some cases, algorithms and metrics used to generate consensus sets may actually reduce the accuracy of the gene’s representation.
ANNOTATION IS NOT PERFECT: automated annotation remains an imperfect art
No one is perfect, least of all automated annotation.
Unlike the more highly polished genomes of earlier projects, today’s genomes usually have:
• more frequent assembly errors, which lead to annotation of genes across multiple scaffolds
• lower coverage
Image: www.BroadInstitute.org
MANUAL ANNOTATION: working concept
Manual annotation to the rescue.
Precise elucidation of biological features encoded in the genome requires careful examination and review.
Schiex et al. Nucleic Acids Res. 2003, 31(13): 3738-3741
Automated predictions + experimental evidence (cDNAs, HMM domain searches, RNAseq, genes from other species).
The manual annotator evaluates all available evidence and corroborates or modifies genome element predictions.
BUT, MANUAL CURATION does not always scale
Too many sequences and not enough hands.
1. Museum: a small group of highly trained experts (e.g., GO).
2. Jamboree: a few very good biologists and a few very good bioinformaticians camping together for intense but short periods of time.
3. Cottage: researchers on their own; may or may not publicize results; may be a dead end with very few people ever aware of these results.
Elsik et al. 2006. Genome Res. 16(11):1329-33.
MANUAL ANNOTATION: objectives
1. Identifies elements that best represent the underlying biology and eliminates elements that reflect systemic errors of automated analyses.
2. Assigns function through comparative analysis of similar genome elements from closely related species using literature, databases, and experimental data.
http://GeneOntology.org
GENOME ANNOTATION: an inherently collaborative task
Researchers often turn to colleagues for second opinions and insight from those with expertise in particular areas (e.g., domains, families).
APOLLO
We need annotation editing tools to modify and refine the precise location and structure of the genome elements that …
• Web based, integrated with JBrowse.
• Supports real-time collaboration!
• Automatic generation of ready-made computable data.
• Supports annotation of genes, pseudogenes, tRNAs, snRNAs, snoRNAs, ncRNAs, miRNAs, TEs, and repeats.
• Intuitive annotation, gestures, and pull-down menus to create and edit transcripts and exon structures, insert comments (CV, freeform text), associate GO terms, etc.
http://GenomeArchitect.org
APOLLO ARCHITECTURE: simpler, more flexible
Web-based client + annotation-editing engine + server-side data service
[Figure: Apollo v2.0 architecture. Labels shown:]
- Annotators: Google Web Toolkit (GWT) / Bootstrap; JBrowse (DOJO / jQuery)
- REST / JSON, Websockets
- Annotation Engine (Server): Annotations, Security, Preferences, Organisms, Tracks
- Security: Shiro, LDAP, OAuth
- Single Data Store: PostgreSQL, MySQL, MongoDB, ElasticSearch
- JBrowse Data (Organism 1, Organism 2): BAM, BED, VCF, GFF3, BigWig; load genomic evidence for selected organism
LESSONS LEARNED
We train and support hundreds of geographically dispersed scientists from diverse research communities to conduct manual annotations, to recover coding sequences in agreement with all available biological evidence using Web Apollo.
• Gate keeping and monitoring.
• Tutorials, training workshops, and “geneborees”.
What we have learned:
• Collaborative work distills invaluable knowledge
• We must enforce strict rules and formats
• We must evolve with the data
• A little training goes a long way
• NGS poses additional challenges
Becoming Acquainted with Web Apollo
GENERAL PROCESS OF CURATION: steps to remember
1. Select a chromosomal region of interest, e.g., a scaffold.
2. Select appropriate evidence tracks to review the gene model.
3. Determine whether a feature in an existing evidence track will provide a reasonable gene model to start working.
  - Select and drag the feature to the ‘User-created Annotations’ area, creating an initial gene model. If necessary, use editing functions to adjust the gene model.
4. Check your edited gene model for integrity and accuracy by comparing it with available homologs.
Always remember: when annotating gene models using Apollo, you are looking at a ‘frozen’ version of the genome assembly and you will not be able to modify the assembly itself.
Apollo user guide: http://genomearchitect.org/web_apollo_user_guide
APOLLO: annotation editing environment
BECOMING ACQUAINTED WITH APOLLO
[Figure: Apollo main window. Callouts:]
• Login.
• Color by CDS frame, toggle strands, set color scheme and highlights.
• Get coordinates and “rubber band” selection for zooming.
• User-created annotations.
• Annotator panel.
• Evidence tracks: stage and cell-type specific transcription data.
HIGHLIGHTED IMPROVEMENTS
REMOVABLE SIDE DOCK with customizable tabs: Annotations, Organism, Users, Groups, Admin, Tracks, Reference Sequence
EDITS & EXPORTS: annotation details, exon boundaries, data export
[Figure callouts: (1) and (2) Annotations tab; (3) Reference Sequences tab, with export as FASTA or GFF3]
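Annotations exported as GFF3 are plain tab-delimited text with nine columns, so they are easy to inspect downstream. The sketch below reads a small, made-up GFF3 fragment; the feature IDs, coordinates, and helper name are assumptions for illustration and do not come from an actual Apollo export.

```python
# Illustrative sketch: read GFF3 text into a list of feature dictionaries.
from collections import Counter

# Hypothetical example data in GFF3's nine-column, tab-delimited layout.
EXAMPLE_GFF3 = """##gff-version 3
scaffold1\t.\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=example
scaffold1\t.\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1
scaffold1\t.\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=mrna1
scaffold1\t.\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=mrna1
"""

def parse_gff3(text):
    """Parse GFF3 feature lines, skipping blank lines and '#' directives/comments."""
    features = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        seqid, source, ftype, start, end, score, strand, phase, attrs = line.split("\t")
        attributes = dict(pair.split("=", 1) for pair in attrs.split(";") if pair)
        features.append({
            "seqid": seqid, "type": ftype,
            "start": int(start), "end": int(end),
            "strand": strand, "attributes": attributes,
        })
    return features

features = parse_gff3(EXAMPLE_GFF3)
type_counts = Counter(feature["type"] for feature in features)
print(type_counts)
```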
USER NAVIGATION
• Annotator panel: choose appropriate evidence tracks from the list on the annotator panel.
• Select & drag elements from an evidence track into the ‘User-created Annotations’ area.
• Edge-matching.
• Hovering over an annotation in progress brings up an information pop-up.
• Annotation right-click menu.
• Annotations, annotation edits, and History are stored in a centralized database.
The Annotation Information Editor
• DBXRefs are database cross-references: if you have reason to believe that this gene is linked to a gene in a public database (including your own), then add it here.
• Add PubMed IDs.
• Include GO terms as appropriate from any of the three ontologies.
• Write comments stating how you have validated each model.
• Zoom in/out with the keyboard: Shift + up/down arrow keys.
• The ‘Zoom to base level’ option reveals the DNA Track.
• Color exons by CDS from the ‘View’ menu.
• Toggle reference DNA sequence and translation frames in either direction.
ANNOTATING SIMPLE CASES
“Simple case”:
- the predicted gene model is correct or nearly correct, and
- this model is supported by evidence that completely or mostly agrees with the prediction.
- evidence that extends beyond the predicted model is assumed to be non-coding sequence.
The following are simple modifications.
ADDING EXONS
• A confirmation box will warn you if the receiving transcript is not on the same strand as the feature where the new exon originated.
• Check ‘Start’ and ‘Stop’ signals after each edit.
ADDING UTRs
If transcript alignment data are available and extend beyond your original annotation, you may extend or add UTRs.
1. Right click at the exon edge and ‘Zoom to base level’.
2. Place the cursor over the edge of the exon until it becomes a black arrow, then click and drag the edge of the exon to the new coordinate position that includes the UTR.
To add a new spliced UTR to an existing annotation, follow the procedure for adding an exon.
CHECK EXON INTEGRITY
1. Zoom in to clearly resolve each exon as a distinct rectangle.
2. Two exons from different tracks sharing the same start and/or end coordinates will display a red bar to indicate matching edges.
3. Selecting the whole annotation or one exon at a time, use this ‘edge-matching’ function and scroll along the length of the annotation, verifying exon boundaries against available data. Use square brackets [ ] to scroll from exon to exon.
4. Check if cDNA / RNAseq reads lack one or more of the annotated exons or include additional exons.
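The idea behind edge-matching can be sketched in a few lines: an exon edge "matches" when some evidence exon shares the same start or end coordinate. This is a simplified illustration with hypothetical names and coordinates, not Apollo's implementation.

```python
# Illustrative sketch: which annotated exon edges coincide with evidence exon edges?
def matching_edges(annotation_exons, evidence_exons):
    """For each annotated exon (start, end), report whether its start and end match evidence."""
    evidence_starts = {start for start, _ in evidence_exons}
    evidence_ends = {end for _, end in evidence_exons}
    return [
        ((start, end), start in evidence_starts, end in evidence_ends)
        for start, end in annotation_exons
    ]

# Made-up coordinates: the second exon's end disagrees with the evidence.
annotation = [(100, 200), (300, 400)]
evidence = [(100, 200), (300, 410)]
for exon, start_ok, end_ok in matching_edges(annotation, evidence):
    print(exon, "start matches:", start_ok, "end matches:", end_ok)
```

An edge that does not match is a cue to zoom in and inspect that boundary against the evidence.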
To modify an exon boundary and match data in the evidence tracks: select both the offending exon and the feature with the expected boundary, then right click on the annotation to select ‘Set 3’ end’ or ‘Set 5’ end’ as appropriate.
In some cases all the data may disagree with the annotation; in other cases some data support the annotation and some of the data support one or more alternative transcripts. Try to annotate as many alternative transcripts as are well supported by the data.
SPLICE SITES
Non-canonical splices are indicated by an orange circle with a white exclamation point inside, placed over the edge of the offending exon. Most insects have a valid non-canonical site, GC-AG. Other non-canonical splice sites are unverified. Web Apollo flags GC splice donors as non-canonical.
Canonical splice sites:
forward strand: 5’-…exon]GT / AG[exon…-3’
reverse strand, not reverse-complemented: 3’-…exon]GA / TG[exon…-5’
Zoom to review non-canonical splice site warnings. Although these may not always have to be corrected (e.g., GC donor), they should be flagged with the appropriate comment.
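The reverse-strand pattern can look surprising at first. For a gene on the reverse strand, an intron still reads GT…AG in the gene's own 5’ → 3’ direction; when that strand is displayed left to right in the 3’ → 5’ direction (reversed but not complemented), the same intron reads GA…TG. A small sketch with a made-up intron sequence:

```python
# Illustrative sketch: how a reverse-strand GT...AG intron appears in each display.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reversed_only(seq):
    """Reverse strand as displayed left to right (3' -> 5'), not complemented."""
    return seq[::-1]

def forward_strand_view(seq):
    """The same region read 5' -> 3' on the forward strand (reverse complement)."""
    return seq.translate(COMPLEMENT)[::-1]

intron = "GTAAGTCCTTTCAG"  # hypothetical intron, 5' -> 3' on the reverse strand
print(reversed_only(intron))        # begins with GA and ends with TG
print(forward_strand_view(intron))  # begins with CT and ends with AC
```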
‘START’ AND ‘STOP’ SITES
Web Apollo calculates the longest possible open reading frame (ORF) that includes canonical ‘Start’ and ‘Stop’ signals within the predicted exons.
If ‘Start’ appears to be incorrect, modify it by selecting an in-frame ‘Start’ codon further upstream or downstream, depending on evidence (protein database, additional evidence tracks).
It may be present outside the predicted gene model, within a region supported by another evidence track.
In very rare cases, the actual ‘Start’ codon may be non-canonical (non-ATG).
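As a rough illustration of what "longest possible ORF" means, the sketch below scans a spliced transcript sequence for the longest stretch that begins with ATG and ends at the first in-frame stop codon. It is a naive approach written for this example, assuming only canonical ATG starts; it is not Apollo's own algorithm.

```python
# Illustrative sketch: naive longest-ORF search on a spliced transcript sequence.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf(transcript):
    """Return (start, end) of the longest ATG...stop ORF (0-based, end-exclusive), or None."""
    seq = transcript.upper()
    best = None
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                break
    return best

transcript = "GGATGAAACCCTAGTT"  # made-up sequence
orf = longest_orf(transcript)
print(orf, transcript[orf[0]:orf[1]])
```

The bases before and after the reported interval correspond to the untranslated regions of this toy transcript.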
COMPLEX CASES: merge two gene predictions on the same scaffold
Evidence may support joining two or more different gene models. Warning: protein alignments may have incorrect splice sites and lack non-conserved regions!
1. In the ‘User-created Annotations’ area, shift-click to select an intron from each gene model and right click to select the ‘Merge’ option from the menu.
2. Drag supporting evidence tracks over the candidate models to corroborate overlap, or review edge matching and coverage across models.
3. Check the resulting translation by querying a protein database, e.g., UniProt. Add comments to record that this annotation is the result of a merge.
Red lines around exons: ‘edge-matching’ allows annotators to confirm whether the evidence is in agreement without examining each exon at the base level.
COMPLEX CASES: split a gene prediction
One or more splits may be recommended when:
- different segments of the predicted protein align to two or more different gene families
- the predicted protein doesn’t align to known proteins over its entire length
Transcript data may support a split, but first verify whether they are alternative transcripts.
COMPLEX CASES: correcting frameshifts, single-base errors, and selenocysteines
[Figure labels: DNA Track; ‘User-created Annotations’ Track]
1. Web Apollo allows annotators to make single base modifications or frameshifts that are reflected in the sequence and structure of any transcripts overlapping the modification. Note that these manipulations do NOT change the underlying genomic sequence.
2. If you determine that you need to make one of these changes, zoom in to the nucleotide level and right click over a single nucleotide on the genomic sequence to access a menu that provides options for creating insertions, deletions, or substitutions.
3. The ‘Create Genomic Insertion’ feature will require you to enter the necessary string of nucleotide residues that will be inserted to the right of the cursor’s current location. The ‘Create Genomic Deletion’ option will require you to enter the length of the deletion, starting with the nucleotide where the cursor is positioned. The ‘Create Genomic Substitution’ feature asks for the string of nucleotide residues that will replace the ones on the DNA track.
4. Once you have entered the modifications, Web Apollo will recalculate the corrected transcript and protein sequences, which will appear when you use the right-click menu ‘Get Sequence’ option. Since the modification is reflected in all annotations that include the modified region, you should alert the curators of your organism’s database, using the ‘Comments’ section to report the CDS edits.
5. In special cases such as selenocysteine-containing proteins (read-throughs), right-click over the offending/premature ‘Stop’ signal and choose the ‘Set readthrough stop codon’ option from the menu.
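The three kinds of sequence alteration described above can be mimicked on a copy of a sequence, leaving the original untouched, much as the frozen assembly is left untouched. This is a minimal sketch with 0-based positions; the function name and arguments are hypothetical and do not reflect Apollo's internals.

```python
# Illustrative sketch: apply an insertion, deletion, or substitution to a copy of a sequence.
def apply_alteration(sequence, kind, position, value):
    """Return an altered copy of sequence; position is 0-based.

    insertion:    value is the string inserted to the right of position
    deletion:     value is the number of bases removed, starting at position
    substitution: value is the string that replaces bases starting at position
    """
    if kind == "insertion":
        return sequence[:position + 1] + value + sequence[position + 1:]
    if kind == "deletion":
        return sequence[:position] + sequence[position + value:]
    if kind == "substitution":
        return sequence[:position] + value + sequence[position + len(value):]
    raise ValueError(f"unknown alteration kind: {kind}")

genome = "ATGAAACCC"  # made-up sequence; never modified below
print(apply_alteration(genome, "insertion", 2, "T"))
print(apply_alteration(genome, "deletion", 3, 2))
print(apply_alteration(genome, "substitution", 3, "G"))
print(genome)
```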
COMPLETING THE ANNOTATION
Follow the checklist until you are happy with the annotation!
And…
- Comment to validate your annotation, even if you made no changes to an existing model. Think of comments as your vote of confidence.
- Or add a comment to inform the community of unresolved issues you think this model may have.
Always remember: Web Apollo curation is a community effort, so please use comments to communicate the reasons for your annotation (your comments will be visible to everyone).
MANUAL ANNOTATION CHECKLIST: for accuracy and integrity
1. Can you add UTRs (e.g., via RNA-Seq)?
2. Check exon structures.
3. Check splice sites: most splice sites display these residues: …]5’-GT/AG-3’[…
4. Check ‘Start’ and ‘Stop’ sites.
5. Check the predicted protein product(s):
  - Align it against relevant genes/gene family.
  - blastp against NCBI’s RefSeq or nr.
6. If the protein product still does not look correct, then check:
  - Are there gaps in the genome?
  - Merge of 2 gene predictions on the same scaffold
  - Merge of 2 gene predictions from different scaffolds
  - Split a gene prediction
  - Frameshifts
  - Error in the genome assembly?
  - Selenocysteines, single-base errors, etc.
7. Finalize annotation by adding:
  - Important project information in the form of comments.
  - IDs from public databases, e.g., GenBank (via DBXRef), gene symbol(s), common name(s), synonyms, top BLAST hits, orthologs with species names, and everything else you can think of, because you are the expert.
  - Whether your model replaces one or more models from the official gene set (so they can be deleted).
  - The kinds of changes you made to the gene model of interest, if any.
  - Any appropriate functional assignments of interest to the community (e.g., via BLAST, RNA-Seq data, literature searches, etc.).
Example
A public Apollo Demo using the Honey Bee genome is available at http://genomearchitect.org/WebApolloDemo
- Demonstration using the Hyalella azteca genome (amphipod crustacean).
What do we know about this genome?
• Currently publicly available data at NCBI:
  - >37,000 nucleotide seqs → scaffolds, mitochondrial genes
  - 300 amino acid seqs → mitochondrion
  - 53 ESTs
  - 0 conserved domains identified
  - 0 “gene” entries submitted
• Data at i5K Workspace@NAL (annotation hosted at USDA):
  - 10,832 scaffolds; 23,288 transcripts; 12,906 proteins
PubMed Search: what’s new?
“Ten populations (3 cultures, 7 from California water bodies) differed by at least 550-fold in sensitivity to pyrethroids.”
“By sequencing the primary pyrethroid target site, the voltage-gated sodium channel (vgsc), we show that point mutations and their spread in natural populations were responsible for differences in pyrethroid sensitivity.”
“The finding that a non-target aquatic species has acquired resistance to pesticides used only on terrestrial pests is troubling evidence of the impact of chronic pesticide transport from land-based applications into aquatic systems.”
Customizations: high-scoring segment pairs (hsp) in “BLAST+ Results” track
Available Tracks
Creating a new gene model: drag and drop
• Apollo automatically calculates the ORF. In this case, the ORF includes the high-scoring segment pairs (hsp).
Get Sequence
http://blast.ncbi.nlm.nih.gov/Blast.cgi
Also, flanking sequences (other gene models) vs. NCBI nr
In this case, two gene models upstream, at 5’ end.
BLAST hsps
Review alignments
HaztTmpM006234
HaztTmpM006233
HaztTmpM006232
Hypothesis for vgsc gene model
Editing: merge the three models
Merge by dropping an exon or gene model onto another.
or…
Merge by selecting two exons (holding down “Shift”) and using the right-click menu.
Editing: correct boundaries, delete exons
Modify exon / intron boundary:
- Drag the end of the exon to the nearest canonical splice site.
- Use the right-click menu.
Delete first exon from M006233
Editing: set translation start
Editing: modify boundaries
Modify intron / exon boundary also at coord. 78,999.
Finished model
Corroborate integrity and accuracy of the model:
- Start and Stop
- Exon structure and splice sites …]5’-GT/AG-3’[…
- Check the predicted protein product vs. NCBI nr
Information Editor
• DBXRefs: e.g. NP_001128389.1, N. vitripennis, RefSeq
DEMO
Apollo demo video available at: https://youtu.be/VgPtAP_fvxY
• Berkeley Bioinformatics Open-source Projects (BBOP), Berkeley Lab: Web Apollo and Gene Ontology teams. Suzanna E. Lewis (PI).
• Christine G. Elsik (PI). University of Missouri.
• Ian Holmes (PI). University of California Berkeley.
• Arthropod genomics community: i5K Steering Committee (esp. Sue Brown (Kansas State)), Alexie Papanicolaou (UWS), and the Honey Bee Genome Sequencing Consortium.
• Apollo is supported by NIH grants 5R01GM080203 from NIGMS, and 5R01HG004483 from NHGRI; by Contract No. 60-8260-4-005 from the National Agricultural Library (NAL) at the United States Department of Agriculture (USDA); and by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
• Insect images used with permission: http://AlexanderWild.com