Welcome to Introduction to Bioinformatics*
Wednesday, 8 February
Genome Sequencing/Assembly
(Didn’t have time to do this in class)
Discussion of Study Question 14 from notes of Feb 6, which focuses on Table 1 and other sequence assembly parameters presented by
Myers EW et al (2000). A whole-genome assembly of Drosophila. Science 287:2196-2204.
* http://www.people.vcu.edu/~elhaij/
We’re using this article to get an idea of how one can progressively make sense out of a genome sequence.
The article presents a lot of quantitative information, particularly on the second page of the article, and especially in Table 1.
One of you asked:
"I am having trouble understanding the meaning of the Requested and Received columns."
An explanation of special terms in a table should be found either in a
footnote to the table or similar, and…
…so it is!
Don’t concern yourself much with the Requested column. It’s just what
Myers et al figured they would need, as judged by simulation experiments.
The Received column is much more important, giving the
actual values of their sequencing.
SQ14. From figures given in the text and in Table 1, check the accuracy of each of the following statements:
a. "We produced 3.156 million reads that yielded 1.76 Gbp of sequence. . ."
b. ". . .trillions of overlaps between reads are examined."
c. ". . .to produce 654,000 of the 2-kbp mates and 497,000 of the 10-kbp mates."
From Tour of Myers et al (2000)
Here’s the study question we’re considering.
Many in the class were confused as to how to go about checking these statements. You should interpret “checking” to mean finding consistent evidence elsewhere in the article
that confirms that the numbers are not misprints.
Focus on Part a of the question. Here’s a sample comment:
"I understand what the table represents but cannot make any links as to how they obtained the 3.156 million reads and 1.76 Gbp of sequence."
First confirm that the statement does indeed come from the article. It does, but nothing else in the statement seems to shed much light on the numbers.
a. "We produced 3.156 million reads that yielded 1.76 Gbp of sequence. . ."
What do these quantities mean?
Take the first one. What do you need to know to calculate the average read length?
Not sure? What would you need to know to calculate the average length of a book in the VCU library?
Is that analogous to information you have with regard to the fly sequence?
(yes)
Well, that’s how the game is played.
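The book analogy can be turned into a quick calculation. The sketch below divides the total amount of sequence by the number of reads, using only the two figures quoted in statement a. The function name average_length is just an illustrative choice, not anything from the article.

```python
# Figures quoted in statement (a) of SQ14
reads = 3.156e6     # 3.156 million reads
total_bp = 1.76e9   # 1.76 Gbp of sequence


def average_length(total, count):
    """Average size of one item = total size / number of items.

    Works the same for pages per book in a library
    as for base pairs per read in a sequencing project.
    """
    return total / count


avg_read_len = average_length(total_bp, reads)
print(f"Average read length: {avg_read_len:.0f} bp")
```

The result is roughly 558 bp per read. The check is then to see whether that agrees with what the article and Table 1 say about read length.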
Try using a similar approach to figure out the other two parts.
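For Part b, one possible line of attack (an assumption about what "overlaps examined" means, not something stated in the article) is to count how many distinct pairs of reads exist, since in principle any two reads might overlap. The sketch below uses only the read count from statement a.

```python
from math import comb

n_reads = 3_156_000  # 3.156 million reads, from statement (a)

# Number of distinct pairs of reads: n choose 2 = n * (n - 1) / 2.
# Assumption: every pair is a candidate overlap. A real assembler
# may examine more or fewer comparisons than this.
read_pairs = comb(n_reads, 2)

print(f"Possible read pairs: {read_pairs:.3e}")
print(f"In trillions: {read_pairs / 1e12:.2f}")
```

This gives about 5 trillion possible pairs, which is at least the right order of magnitude for "trillions". Part c needs figures from Table 1 itself, so it is left to you.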