1. Analysis: Discovery of possible regulatory motifs

What follows is a simulation of the proposed graphical interface. As
you go through the simulation, please consider what capabilities you
would want to serve your research and annotation interests. A
narrative to help you go through the simulation appears in a
red-bordered box, such as the one below.

To begin: (1) click on Slide Show (on the upper toolbar), (2) click
View Show, (3) click the Continue button.

Scenario 5
2. You've decided you want to know what regulates the expression of
nif genes, encoding the machinery for nitrogen fixation. Here's your
strategy:
- Collect nif genes from Anabaena PCC 7120 into set
- Include in set orthologs of the Anabaena genes
- Extract 5' sequences from all genes in set
- Analyze set of 5' sequences for motifs
- (Search for other genes with same motifs)
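
The third step of the strategy, extracting the 5' (upstream) sequence
of each gene, can be sketched as follows. This is only an illustration
under stated assumptions: the `Orf` record, the `upstream` function,
the "+"/"-" strand labels, and the fixed window length are all
hypothetical, and the proposal does not say how far upstream the
interface would reach.

```python
from dataclasses import dataclass

# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Orf:
    """Hypothetical ORF record; coordinates are 1-based and inclusive."""
    orf_id: str
    start: int   # leftmost coordinate on the forward strand
    end: int     # rightmost coordinate on the forward strand
    strand: str  # "+" (forward) or "-" (reverse)


def upstream(genome: str, orf: Orf, length: int) -> str:
    """Return up to `length` bases 5' of the ORF, read 5'->3' on the
    ORF's own strand. The window is clipped at the sequence ends."""
    if orf.strand == "+":
        lo = max(1, orf.start - length)
        return genome[lo - 1 : orf.start - 1]
    # Reverse strand: the 5' side lies to the right of `end`.
    hi = min(len(genome), orf.end + length)
    return reverse_complement(genome[orf.end : hi])
```

Applied to every gene in a set, a function like this would turn a set
of orfs into a set of sequences, which is the input the motif analysis
step needs.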
3. Toolbar: Build set | Display set | Modify set | Set operation

Click on Build set to begin finding orfs with the desired
specifications.

4. Query so far: All items in [choose set type]
Choose set type: All open reading frames of, All amino acid sequences
of, All intergenic regions of, Human-annotated orfs of, Private set,
Public set

The first goal is to find all open reading frames within Anabaena PCC
7120 annotated as nif genes, so click on All open reading frames of.

5. Query so far: All items in All open reading frames of [choose
database]
Choose database: Arthrobacter platensis, Gloeobacter violaceus,
Microcystis aeruginosa, Nostoc punctiforme, Nostoc PCC 7120,
Prochlorococcus MED4, Prochlorococcus MIT9313, Prochlorococcus S120,
Synechococcus PCC6301, Synechococcus PCC7942, Synechococcus WH,
Synechocystis PCC 6803, Thermosynechococcus, Trichodesmium,
Unicellular, Filamentous, All, Anabaena PCC 7120

Click on Anabaena PCC 7120.

6.
Query so far: All items in All open reading frames of Anabaena PCC
7120 such that: [condition]
Condition menu: Variable, Data, Operation, Function, Done

You want to compare the description of each orf with nif. To get a
tool to extract the description, click on Function.

7. Query so far: All items in All open reading frames of Anabaena PCC
7120 such that: [choose function] (item)
Choose function: Closest ortholog of, Protein product of, Upstream
region of, Downstream region of, Description of, Category of,
Annotation level of

Click on Description of.

8.
Query so far: All items in All open reading frames of Anabaena PCC
7120 such that: Description of (item) [choose operation]
Operations offered: =, includes, excludes

You want to find orfs whose description includes the word nif. Click
on includes.

9. Query so far: All items in All open reading frames of Anabaena PCC
7120 such that: Description of (item) includes [type description
term(s)]

You can type in any characters to search for. For this simulation,
the term nif is provided. Press the Enter key.

10. Query so far: All items in All open reading frames of Anabaena
PCC 7120 such that: Description of (item) includes nif

No more specifications. Press the Done button.
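
What the completed specification computes can be sketched in a few
lines. The interface does not exist, so the matching rule here is an
assumption: `includes` is taken to be a case-insensitive substring
test. The record layout and the names `ORFS`, `includes`, and
`build_set` are hypothetical, the first three descriptions are taken
from entries shown in the simulation, and the last record is made up.
The stricter pattern is an alternative added for comparison, not a
feature of the proposal.

```python
import re

# (orf ID, description) pairs; the last one is a hypothetical record.
ORFS = [
    ("Anab7120:all0688", "hupS [NiFe] uptake hydrogenase small subunit"),
    ("Anab7120:alr0874", "nifH2 dinitrogenase reductase"),
    ("Anab7120:alr1407", "nifV1 homocitrate synthase"),
    ("example:orf0001", "photosystem I subunit"),
]


def includes(description: str, term: str) -> bool:
    """Assumed meaning of 'includes': case-insensitive substring test."""
    return term.lower() in description.lower()


def build_set(orfs, predicate):
    """Collect the IDs of all orfs whose description satisfies predicate."""
    return [orf_id for orf_id, description in orfs if predicate(description)]


# "Description of (item) includes nif"
loose = build_set(ORFS, lambda d: includes(d, "nif"))

# Stricter alternative: a word starting with lowercase "nif" followed by
# an uppercase letter, as in gene names such as nifH2 or nifV1.
strict = build_set(ORFS, lambda d: re.search(r"\bnif[A-Z]", d) is not None)
```

With a plain substring test, any description that merely contains the
letters n, i, f in sequence is collected too, so a set built this way
usually needs manual review afterwards.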
11. Query: All items in All open reading frames of Anabaena PCC 7120
such that: Description of (item) includes nif
After Done: Save results and script, Save only results

If this were a complicated search, you might want to save the
specifications as a script. In this case, just save the results by
clicking on Save only results.

12. Type name of set: 7120 nif genes

All orfs of Anabaena whose descriptions include nif will be collected
into a set. You can name the set anything you want. For this
simulation, a name is provided. Press the Enter key.

13.
Set: 7120 nif genes

Anab7120:all0687  hupL   [NiFe] uptake hydrogenase large subunit, C terminus
Anab7120:all0687  hupL   [NiFe] uptake hydrogenase large subunit, N terminus
Anab7120:all0688  hupS   [NiFe] uptake hydrogenase small subunit
Anab7120:alr0692         similar to nifU
Anab7120:alr0874  nifH2  dinitrogenase reductase
Anab7120:asr1309         similar to nifU
Anab7120:alr1407  nifV1  homocitrate synthase
Anab7120:asr1408  nifZ   iron-sulfur cofactor synthesis
Anab7120:asr1409  nifT

This is the result of the search. The set is displayed both as a list
of orfs and a graphical representation of the genetic neighborhood of
each orf. You can find out more about an orf by clicking its name or
its arrow. For now, just press Continue.

14. Set: 7120 nif genes (same list as on the previous slide)

This search, like most, is only a beginning. It brought up some
unintended hits (nif found NiFe). More seriously, it brought up many
genes probably in the middle of operons and unlikely to be preceded
by regulatory motifs. The genetic neighborhood gives clues as to
operon structure. Select the two most likely orfs to begin operons by
clicking on the circles next to alr0874 and alr1407.

15.
Set: 7120 nif genes (same list as on the previous slide)

Let's suppose you proceed in a like fashion through the rest of the
list. Press Done.

16. Set: 7120 nif genes

Anab7120:alr0874  nifH2  dinitrogenase reductase
Anab7120:alr1407  nifV1  homocitrate synthase
Anab7120:all1438  nifE   nitrogenase Fe/Mo cofactor
Anab7120:all1455  nifH   dinitrogenase reductase
Anab7120:all1517  nifB   nitrogen fixation protein
Anab7120:alr2968  nifV2  homocitrate synthase

The set now consists of the six Anabaena nif genes that you judged
most likely to be preceded by transcriptional signals. It might be
interesting to see where this set is located on the genome. To do
this, click Display set, then make some room by clicking on Show
graphic.

Display set options: Show orf ID, Show gene name, Show description,
Show coordinates, Show graphic, Show neighbors: +/- 1, Show map

17.
Set: 7120 nif genes (the same six orfs, shown with descriptions)

Replace the space-consuming description with coordinates by clicking
on Show description, and then click Show coordinates and finally Show
map.

18. Set: 7120 nif genes

Anab7120:alr0874  nifH2
Anab7120:alr1407  nifV1
Anab7120:all1438  nifE
Anab7120:all1455  nifH
Anab7120:all1517  nifB
Anab7120:alr2968  nifV2

19. Set: 7120 nif genes

Anab7120:alr0874  nifH2  1008496->1009389
Anab7120:alr1407  nifV1  1671878->1673011
Anab7120:all1438  nifE   1696389

The resulting set consists of sequences, not orfs, and so the
elements are defined by coordinates. Clicking on a coordinate brings
up the sequence display (see Scenario 6). Clicking on a graph of an
orf brings up the orf's annotation page. Click Continue.

30.
Set: all nif genes 5'

Anab7120.C:1006982-1008496d
Anab7120.C:1671462-1671878d
Anab7120.C:1697832-1698138c
Anab7120.C:1713264-1713395c
Anab7120.C:1778098-1779034c
Anab7120.C:3609273-3609624d
NostPunc.637:37288-37376d
NostPunc.510:15955-16325d
NostPunc.651:60311-60584c
NostPunc.510:5239-6338c

The final step in this procedure is to analyze the set of upstream
sequences of nif genes, hoping to find a common motif. Click on Set
operation, then Analysis tools. Tools based on Position-Specific
Scoring Matrices (PSSMs) are most often used for the task. Click on
one of these: Meme.

Set operation menu: Maintenance, Set operations, Analysis tools,
Discovery tools, Transformations
Analysis tools menu: Align, PSSM: Gibbs sampler, PSSM: Meme, Make HMM

31. PSSM: Meme
of ( [choose set] )
Choose set type: Public set, Private set

Click Private set and then all nif genes 5' to give Meme the set of
5' sequences.

32. PSSM: Meme of ( [choose set] )
Private sets: 7120 IS895 seqs, 7120 nif genes, 7120 STTR7 regions,
all nif genes, all nif genes 5', Npun STTR7 regions

33. PSSM: Meme of ( all nif genes 5' )
Type name of results: PSSM: all nif 5'

Give the results a name, press Enter, and the task is accomplished.
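
To make the PSSM idea behind this last step concrete, here is a
minimal sketch of how a position-specific scoring matrix can be built
from sites that are already aligned and then used to scan a sequence.
It is not Meme's algorithm: Meme has to discover the sites itself,
which it does by expectation maximization. The function names, the
pseudocount, the uniform background, and the toy sites are all
assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length, already aligned sites.

    Each column is a dict mapping a base to log2(frequency / background),
    with a pseudocount added so unseen bases get a finite score.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pssm = []
    for col in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        pssm.append(
            {base: math.log2(counts[base] / total / background)
             for base in ALPHABET}
        )
    return pssm


def score(pssm, window):
    """Sum the per-column log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pssm, window))


def best_hit(pssm, sequence):
    """Return (score, 0-based position) of the best-scoring window."""
    width = len(pssm)
    return max(
        (score(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Toy example with made-up sites.
toy_sites = ["TGTA", "TGTA", "TGTC", "TGTA"]
toy_pssm = build_pssm(toy_sites)
hit_score, hit_position = best_hit(toy_pssm, "CCCTGTACCC")
```

The sketch expects uppercase A, C, G, T only; real upstream sequences
would need ambiguity codes and both strands handled as well.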
34. Analysis: Discovery of possible regulatory motifs

Summary
- The interface facilitates operations on sets of genes and
sequences
- The interface puts at your disposal powerful tools (that already
exist), without the need to figure out a different computer
environment
- Taken together, these capabilities make it possible for those not
particularly adept at computer programming to focus on the function
of noncoding sequences
But don't be fooled: the interface does not yet exist. That's the
point of the proposal!