BNFO 602 Lecture 1 Usman Roshan
BNFO 602 Lecture 1 Usman Roshan. Bio background DNA Transcription and translation Proteins: folding and structure SNPs SNP genotyping, sequencing.

Dec 20, 2015

BNFO 602 Lecture 1

Usman Roshan

Bio background

• DNA

• Transcription and translation

• Proteins: folding and structure

• SNPs

• SNP genotyping, sequencing

Representing DNA in a format manipulatable by computers

• DNA is a double-helix molecule made up of four nucleotides, distinguished by their bases:
– Adenine (A)
– Cytosine (C)
– Thymine (T)
– Guanine (G)

• Since A (adenine) always pairs with T (thymine) and C (cytosine) always pairs with G (guanine), knowing only one side of the ladder is enough to reconstruct the other.

• We represent DNA as a sequence of letters where each letter is A, C, G, or T.

• For example, we would represent the helix shown here as CAGT.
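The pairing rule above means the second strand can be computed from the first. A minimal Python sketch, assuming plain uppercase or lowercase A/C/G/T strings; the function names `complement` and `reverse_complement` are illustrative and not from the lecture:

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA string (same left-to-right order)."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in the opposite direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("CAGT"))          # GTCA
    print(reverse_complement("CAGT"))  # ACTG
```

Any character outside A, C, G, T raises a `KeyError` in this sketch; real data often contains ambiguity codes such as N, which would need extra handling.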

Transcription and translation
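As a rough illustration of this slide's topic, the sketch below transcribes a DNA coding strand into mRNA (T becomes U) and translates it codon by codon. The input sequence is made up for the example, and the codon table is deliberately partial: it lists only the standard-genetic-code entries this toy input needs. Function names are hypothetical.

```python
# Partial codon table (standard genetic code). Only the codons needed for
# the toy input below are listed; a real table has 64 entries.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the usual start codon
    "UUU": "F",
    "CUG": "L",
    "GUG": "V",
    "GCU": "A",
    "UGC": "C",
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_strand: str) -> str:
    """Turn a DNA coding strand into mRNA by replacing T with U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA from position 0, three bases at a time, until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # X = codon not in this partial table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGTTTCTGGTGGCTTGCTAA"  # made-up example sequence
    mrna = transcribe(dna)
    print(mrna)             # AUGUUUCUGGUGGCUUGCUAA
    print(translate(mrna))  # MFLVAC
```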

Amino acids

Proteins are chains of amino acids. There are twenty different amino acids that chain in different ways to form different proteins.

For example, FLLVALCCRFGH (this is how we could store it in a file).

This sequence of amino acids folds to form a 3-D structure.
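The slide only says the sequence can be stored in a file. One common choice is the FASTA format (a header line starting with `>` followed by the sequence), which is an assumption here and not something the lecture specifies. The sketch below checks that a string uses only the 20 standard one-letter amino acid codes, then writes and reads a single-record FASTA file; the record name `toy_protein` and the helper names are hypothetical.

```python
import os
import tempfile

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def is_valid_protein(seq: str) -> bool:
    """True if seq is non-empty and uses only the 20 standard letters."""
    return bool(seq) and set(seq.upper()) <= AMINO_ACIDS


def write_fasta(path: str, name: str, seq: str, width: int = 60) -> None:
    """Write one record: a '>' header line, then the sequence wrapped at `width`."""
    with open(path, "w") as handle:
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


def read_fasta(path: str):
    """Read a single-record FASTA file and return (name, sequence)."""
    name, parts = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
            elif line:
                parts.append(line)
    return name, "".join(parts)


if __name__ == "__main__":
    protein = "FLLVALCCRFGH"
    print(is_valid_protein(protein))  # True
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.fasta")
        write_fasta(path, "toy_protein", protein)
        print(read_fasta(path))  # ('toy_protein', 'FLLVALCCRFGH')
```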

Protein folding

• The protein folding problem is to determine the 3-D protein structure from the sequence.

• Experimental techniques are very expensive.

• Computational approaches are cheap, but the problem is difficult to solve.

• By comparing sequences we can deduce the evolutionarily conserved portions, which are also functional (most of the time).
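The last point, comparing sequences to find conserved portions, can be sketched very simply if the sequences are already aligned to the same length. The code below reports the columns where every sequence has the same residue. The three sequences are invented variants of the earlier FLLVALCCRFGH example, and `conserved_columns` is a hypothetical helper; real analyses first need an alignment step, which is not shown.

```python
def conserved_columns(aligned):
    """Return 0-based column indices where all aligned sequences agree.

    Assumes the sequences are already aligned (equal length, '-' for gaps).
    Columns consisting of gaps are not reported as conserved.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in aligned}
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


if __name__ == "__main__":
    # Invented toy alignment: three variants of the same short protein.
    alignment = [
        "FLLVALCCRFGH",
        "FLIVALCCKFGH",
        "FLLVGLCCRFGH",
    ]
    print(conserved_columns(alignment))  # [0, 1, 3, 5, 6, 7, 9, 10, 11]
```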

Protein structure

• Primary structure: sequence of amino acids.

• Secondary structure: parts of the chain organize themselves into alpha helices, beta sheets, and coils. Helices and sheets are usually evolutionarily conserved and can aid sequence alignment.

• Tertiary structure: 3-D structure of the entire chain.

• Quaternary structure: complex of several chains.

Key points

• DNA can be represented as strings over four letters: A, C, G, and T. These strings can be very long, e.g. thousands or even millions of letters.

• Proteins are also represented as strings, over an alphabet of 20 letters (each letter is an amino acid). Their 3-D structure determines their function to a large extent.

SNPs

• DNA sequence variations that occur when a single nucleotide is altered.

• Must be present in at least 1% of the population to be a SNP.

• Occur every 100 to 300 bases along the 3 billion-base human genome.

• Many have no effect on cell function but some could affect disease risk and drug response.
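Two of the numbers above lend themselves to a quick calculation: one SNP every 100 to 300 bases over 3 billion bases suggests roughly 10 to 30 million SNPs, and the 1% rule can be checked from observed alleles. The sketch below assumes the input is a flat list with one allele per sampled chromosome; the helper names and the sample counts are made up for illustration.

```python
from collections import Counter

GENOME_LENGTH = 3_000_000_000  # bases, figure quoted on the slide


def expected_snp_count(spacing: int) -> int:
    """Rough SNP count if one SNP occurs every `spacing` bases."""
    return GENOME_LENGTH // spacing


def minor_allele_frequency(alleles) -> float:
    """Frequency of the second most common allele (0.0 if there is no variation)."""
    alleles = list(alleles)
    counts = Counter(alleles).most_common()
    if len(counts) < 2:
        return 0.0
    return counts[1][1] / len(alleles)


def passes_one_percent_rule(alleles, threshold: float = 0.01) -> bool:
    """Apply the slide's criterion: the variant is seen in at least 1% of the sample."""
    return minor_allele_frequency(alleles) >= threshold


if __name__ == "__main__":
    print(expected_snp_count(300))  # 10000000
    print(expected_snp_count(100))  # 30000000

    # Made-up sample of 200 chromosomes at one position.
    common_enough = ["A"] * 197 + ["G"] * 3  # G at 1.5%
    too_rare = ["A"] * 199 + ["G"] * 1       # G at 0.5%
    print(passes_one_percent_rule(common_enough))  # True
    print(passes_one_percent_rule(too_rare))       # False
```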

Toy example

SNPs on the chromosome

[Figure: diagram with the labels SNP, Chromosome, and Gene]

Bi-allelic SNPs

• Most SNPs have one of two nucleotides at a given position

• For example:
– A/G denotes the varying nucleotide as either A or G. We call each of these an allele.
– Most SNPs have two alleles (bi-allelic).
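To make the A/G notation concrete, the sketch below scans a set of equal-length sequences (one per sampled chromosome) and reports each position where more than one nucleotide is seen, written in the same slash notation. The four sequences are invented and `variable_sites` is a hypothetical helper; it ignores the 1% frequency requirement and simply lists observed variation.

```python
def variable_sites(sequences):
    """Map 0-based position -> alleles seen there, e.g. {2: "A/G"}.

    Assumes all sequences have the same length and are already aligned.
    """
    if len({len(seq) for seq in sequences}) > 1:
        raise ValueError("sequences must all have the same length")
    sites = {}
    for position, column in enumerate(zip(*sequences)):
        alleles = sorted(set(column))
        if len(alleles) > 1:
            sites[position] = "/".join(alleles)
    return sites


def is_biallelic(site: str) -> bool:
    """True if exactly two alleles are listed, as in "A/G"."""
    return len(site.split("/")) == 2


if __name__ == "__main__":
    # Invented toy data: the same short region in four chromosomes.
    sample = ["ACGTA", "ACATA", "ACGTA", "TCATA"]
    sites = variable_sites(sample)
    print(sites)  # {0: 'A/T', 2: 'A/G'}
    print(all(is_biallelic(s) for s in sites.values()))  # True
```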

SNP genotype

• We inherit two copies of each chromosome (one from each parent)

• For a given SNP the genotype defines the type of alleles we carry

• Example: for the SNP A/G one’s genotype may be
– AA if both copies of the chromosome have A
– GG if both copies of the chromosome have G
– AG or GA if one copy has A and the other has G
– The first two cases (AA and GG) are called homozygous and the latter two (AG and GA) are heterozygous
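The genotype cases above map directly onto a small function. The sketch below labels a two-letter genotype as homozygous or heterozygous and also counts copies of a chosen allele (0, 1, or 2). That numeric encoding is a common convention in genotype data, but it is an assumption here and not something the slides define. Function names are illustrative.

```python
def classify_genotype(genotype: str) -> str:
    """Label a two-letter genotype such as "AA" or "AG"."""
    first, second = genotype.upper()  # raises ValueError unless exactly two letters
    return "homozygous" if first == second else "heterozygous"


def allele_count(genotype: str, allele: str) -> int:
    """Number of copies (0, 1, or 2) of `allele` in a two-letter genotype."""
    return genotype.upper().count(allele.upper())


if __name__ == "__main__":
    for genotype in ("AA", "GG", "AG", "GA"):
        print(genotype, classify_genotype(genotype), allele_count(genotype, "G"))
    # AA homozygous 0
    # GG homozygous 2
    # AG heterozygous 1
    # GA heterozygous 1
```

Under this encoding AG and GA both map to 1, which matches the slide's point that the two orderings describe the same heterozygous state.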

SNP genotyping

Real SNPs

• SNP consortium: snp.cshl.org

• SNPedia: www.snpedia.com