NGS Data and Sequence Alignment
Manpreet S. Katari
Aug 11, 2016

[Diagram: Personal Computer/Local Terminal and Server/Remote. Labels: IGV, Window, Data files, DB, Applications and Servers, Compute, Web App Service, SSH, SCP, WEB]

Outline
• NGS Data
  • FastA
  • FastQ
  • SAM
  • BAM
  • GFF
• Sequence Alignment
  • Global vs Local
  • Dynamic Programming
  • Burrows-Wheeler Algorithm

Important file types
Sequence files: FASTA, FASTQ
Alignment files: SAM, BAM
Annotation files: GFF

Important file types: FASTA
A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. The word following the ">" symbol is the identifier of the sequence, and the rest of the line is the description (both are optional). There should be no space between the ">" and the first letter of the identifier. It is recommended that all lines of text be shorter than 80 characters. A sequence ends when another line starting with ">" appears; that line marks the start of the next sequence.
Important file types: FASTA
>chrI
CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACC
CACACACACACATCCTAACACTACCCTAACACAGCCCTAATCTAACCCTG
GCCAACCTGTCTCTCAACTTACCCTCCATTACCCTGCCTCCACTCGTTAC
CCTGTCCCATTCAACCATACCACTCCGAACCACCATCCATCCCTCTACTT
ACTACCACTCACCCACCGTTACCCTCCAATTACCCATATCCAACCCACTG
CCACTTACCCTACCATTACCCTACCATCCACCATGACCTACTCACCATAC
TGTTCTTCTACCCACCATATTGAAACGCTAACAAATGATCGTAAATAACA
CACACGTGCTTACCCTACCACTTTATACCACCACCACATGCCATACTCAC
CCTCACTTGTATACTGATTTTACGTACGCACACGGATGCTACAGTATATA
CCATCTCAAACTTACCCTACTCTCAGATTCCACTTCACTCCATGGCCCAT
CTCTCACTGAATCAGTACCAAATGCACTCACATCATTATGCACGGCACTT
GCCTCAGCGGTCTATACCCTGTGCCATTTACCCATAACGCCCATCATTAT
CCACATTTTGATATCTATATCTCATTCGGCGGTCCCAAATATTGTATAAC
TGCCCTTAATACATACGTTATACCACTTTTGCACCATATACTTACCACTC
CATTTATATACACTTATGTCAATATTACAGAAAAATCCCCACAAAAATCA
CCTAAACATAAAAATATTCTACTTTTCAACAATAATACATAAACATATTG
GCTTGTGGTAGCAACACTATCATGGTATCACTAACGTAAAAGTTCCTCAA
TATTGCAATTTGCTTGAACGGATGCTATTTCAGAATATTTCGTACTTACA
CAGGCCATACATTAGAATAATATGTCACATCACTGTCGTAACACTCTTTA
GCGT
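The FASTA layout described above (a ">" header line carrying an identifier and an optional description, followed by sequence lines until the next ">") can be read with a few lines of Python. This is only a minimal sketch, not a replacement for an established parser: the function name parse_fasta and the two short records in the example are hypothetical, chosen for illustration, and the code assumes plain uncompressed text.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, description, sequence) for each FASTA record.

    Hypothetical helper for illustration; assumes plain-text input.
    """
    header = None            # text of the current ">" line, without the ">"
    chunks: List[str] = []   # sequence lines collected for the current record

    def make_record(head: str, parts: List[str]) -> Tuple[str, str, str]:
        # The first word after ">" is the identifier, the rest is the description.
        fields = head.split(None, 1)
        identifier = fields[0] if fields else ""
        description = fields[1] if len(fields) > 1 else ""
        return identifier, description, "".join(parts)

    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if header is not None:        # a new ">" closes the previous record
                yield make_record(header, chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)           # sequence data may span many lines

    if header is not None:                # emit the final record
        yield make_record(header, chunks)


# Made-up two-record input, for illustration only.
example = """>chrI example description
CCACACCACA
CCCACACACC
>chrII
ACGT
"""

records = list(parse_fasta(example.splitlines()))
for identifier, description, sequence in records:
    print(identifier, len(sequence))
```

Joining the sequence lines means the record's length no longer depends on how the file was wrapped, which is why the recommended line length is a readability convention and not part of the data.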
Source: hpc.ilri.cgiar.org/.../content/NGSDataAlignments.pdf
Burrows-Wheeler Matrix

Rotations of Text:
c t g a a a c t g g t $
t g a a a c t g g t $ c
g a a a c t g g t $ c t
a a a c t g g t $ c t g
a a c t g g t $ c t g a
a c t g g t $ c t g a a
c t g g t $ c t g a a a
t g g t $ c t g a a a c
g g t $ c t g a a a c t
g t $ c t g a a a c t g
t $ c t g a a a c t g g
$ c t g a a a c t g g t

Sorted rotations:
$ c t g a a a c t g g t
a a a c t g g t $ c t g
a a c t g g t $ c t g a
a c t g g t $ c t g a a
c t g a a a c t g g t $
c t g g t $ c t g a a a
g a a a c t g g t $ c t
g g t $ c t g a a a c t
g t $ c t g a a a c t g
t $ c t g a a a c t g g
t g a a a c t g g t $ c
t g g t $ c t g a a a c

BWT(Text) = t g a a $ a t t g g c c
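The construction on this slide (write out every rotation of the text, sort the rotations, read off the last column) can be sketched directly in Python. This is a naive version meant only to mirror the matrix: the function names are hypothetical, appending "$" when it is missing is an assumption of this sketch, and real aligners use suffix-array-based methods because the full matrix takes quadratic memory.

```python
from typing import List


def bwt_matrix(text: str) -> List[str]:
    """Return the sorted rotations of text (the Burrows-Wheeler matrix).

    Hypothetical helper; assumes "$" is the end marker and appends it if absent.
    """
    if not text.endswith("$"):
        text += "$"
    # Rotation i moves the first i characters to the end of the string.
    rotations = [text[i:] + text[:i] for i in range(len(text))]
    # "$" sorts before the letters a, c, g, t in ASCII order.
    return sorted(rotations)


def bwt(text: str) -> str:
    """Return the Burrows-Wheeler transform: the last column of the matrix."""
    return "".join(row[-1] for row in bwt_matrix(text))


matrix = bwt_matrix("ctgaaactggt$")
for row in matrix:
    print(" ".join(row))

result = bwt("ctgaaactggt$")
print("BWT(Text) =", " ".join(result))
```

The first column of the sorted matrix is just the characters of the text in sorted order, so only the last column needs to be stored; here it groups identical characters into runs such as "aa", "tt", "gg", and "cc".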