COT 6930 HPC and Bioinformatics Introduction to Molecular Biology Xingquan Zhu Dept. of Computer Science and Engineering


Jan 20, 2016

Page 1: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

COT 6930 HPC and Bioinformatics

Introduction to Molecular Biology

Xingquan Zhu, Dept. of Computer Science and Engineering

Page 2: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Outline

Cell DNA

DNA Structure DNA Sequencing

RNA (DNA-> RNA) Protein

Protein structure Protein synthesis

Page 3: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Central Dogma of Biology: DNA, RNA, and the Flow of Information

Replication (DNA to DNA), Transcription (DNA to RNA), Translation (RNA to protein)

Page 4: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Protein

A sequence built from 20 amino acids, e.g. Lys Lys Gly Gly Leu Val Ala His

Adopts a stable 3D structure that can be measured experimentally

Structure renderings: ribbon, space filling, cartoon, surface

Atom color key: oxygen, nitrogen, carbon, sulfur

Page 5: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

X-ray Crystallography

Page 6: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

X-ray Crystallography

Page 7: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

X-ray Crystallography

Page 8: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

The 20 amino acids

• Each amino acid contains an amino group (NH2; NH3+ when protonated) and a carboxyl group (COOH) (shown in black in the diagram).
• The amino acids vary in their side chains (indicated in blue in the diagram).

Page 9: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Protein Structure

• Primary structure (amino acid sequence)
• Secondary structure (local folding)
• Tertiary structure (global folding)
• Quaternary structure (multiple-chain)

Protein Structure Animation: https://mywebspace.wisc.edu/jonovic/web/proteins.html

Page 10: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Primary Structure

Primary structure is described by the sequence of amino acids in the chain.

Page 11: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Polypeptide

One end of every polypeptide, called the amino terminal or N-terminal, has a free amino group. The other end, with its free carboxyl group, is called the carboxyl terminal or C-terminal.

• Peptide: 50 amino acids or fewer
• Polypeptide: 50-100 amino acids
• Protein: over 100 amino acids

Figure labels: N-terminal, C-terminal

Page 12: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Polypeptide

The amino acids are linked covalently by peptide bonds. The image shows how three amino acids are linked by peptide bonds into a tripeptide.

Page 13: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Secondary Structure

• Secondary structure describes the way the chain folds
• Local structure of consecutive amino acids
• Common regular secondary structures: α helix, β sheet, β turn

Page 14: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Secondary Structure

• Alpha helix
• Beta strand / pleated sheet
• Coil

Page 15: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Tertiary Structure of protein

Tertiary structure describes the shapes that form when the secondary-structure elements of the protein chain (such as helices) fold up further on themselves.

Page 16: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Quaternary structure (multi-chain structures)

Quaternary structure describes how folded chains are assembled into the final, active molecule. For example, pairs of chains may bind together, or inorganic substances may be incorporated into the molecule.

Page 17: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Protein Structure Space

http://www.nigms.nih.gov/psi/

Protein folding taxonomy:

• all alpha
• all beta
• alpha/beta
• alpha+beta
• others

Page 18: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Geometry of Protein Structure

Figure labels: rotatable, rotatable (the two rotatable backbone bonds marked in the diagram)

Total number of degrees of freedom is 2*(n-1), where n is the length of the protein

Page 19: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

The Levinthal Paradox

• Given a small protein (100 aa), assume 3 possible conformations per peptide bond
• $3^{100} \approx 5 \times 10^{47}$ conformations
• Fastest motions take $10^{-15}$ sec, so sampling all conformations would take $5 \times 10^{32}$ sec
• $60 \times 60 \times 24 \times 365 = 31536000$ seconds in a year
• Sampling all conformations would take $1.6 \times 10^{25}$ years
• Proteins have no problem folding; we do! This is the Levinthal paradox
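The arithmetic behind the estimate can be checked with a few lines of Python. This is only a sketch of the slide's back-of-the-envelope calculation; the variable names are illustrative, and the inputs (100 residues, 3 conformations per bond, one sample per femtosecond) are the slide's stated assumptions rather than measured values.

```python
# Levinthal-style estimate using the slide's assumptions.
residues = 100                # small protein of 100 amino acids
states_per_bond = 3           # assumed conformations per peptide bond
seconds_per_sample = 1e-15    # fastest molecular motions, about 1 femtosecond

conformations = states_per_bond ** residues          # exact integer, 3^100
total_seconds = conformations * seconds_per_sample   # time to sample every one
seconds_per_year = 60 * 60 * 24 * 365                # 31,536,000
total_years = total_seconds / seconds_per_year

print(f"conformations: {conformations:.1e}")
print(f"seconds:       {total_seconds:.1e}")
print(f"years:         {total_years:.1e}")
```

The exact value of 3^100 is slightly above 5 × 10^47, so the printed figures round a little differently from the slide but agree in order of magnitude.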

Page 20: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Outline

Cell DNA

DNA Structure DNA Sequencing

RNA (DNA-> RNA) Protein

Protein structure Protein synthesis

Page 21: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

RNA

3 types of RNA: messenger RNA (mRNA), ribosomal RNA (rRNA), transfer RNA (tRNA)

Page 22: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Messenger RNA

DNA:  TAC CAT GAG ACT … ATC
mRNA: AUG GUA CUC UGA … UAG
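The DNA-to-mRNA pairing in this example can be reproduced with a small lookup table. The following is a minimal sketch: the function names `transcribe` and `codons` are illustrative, and, like the slide, it reads both strands left to right without tracking 5'/3' orientation. Only the four codons written out on the slide are used; the elided middle of the sequence is not filled in.

```python
# Base-pairing rules for copying a DNA template strand into mRNA.
TEMPLATE_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_dna: str) -> str:
    """Return the mRNA complementary to a DNA template strand."""
    return "".join(TEMPLATE_TO_MRNA[base] for base in template_dna.upper())

def codons(sequence: str) -> list[str]:
    """Split a sequence into consecutive three-base codons (extra bases dropped)."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]

template = "TACCATGAGACT"           # first four codons from the slide
mrna = transcribe(template)
print(" ".join(codons(mrna)))       # AUG GUA CUC UGA
```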

Page 23: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Ribosomal RNA and ribosomes

Page 24: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Transfer RNA

Page 25: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Overview of protein synthesis

Transcription: same language

Translation: different language

Page 26: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Overview of protein synthesis

Page 27: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

A. Transcription

RNA has no thymine; it has uracil instead.

Page 28: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

2. Translation, the final steps

Page 29: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Rules (the secret of life)

Transcription:
A → U
T → A
G → C
C → G

Translation:
AUG: Methionine (Met)
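Translation maps each mRNA codon to an amino acid. The sketch below builds the standard genetic code as a lookup table (bases ordered U, C, A, G; one-letter amino acid codes; `*` marks a stop codon) and translates until the first stop. The table layout and the `translate` helper are illustrative choices, not something given on the slides.

```python
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":          # stop codon: release the chain
            break
        protein.append(residue)
    return "".join(protein)

print(translate("AUGGUACUCUGA"))    # MVL (Met, Val, Leu, then stop at UGA)
```

For simplicity the sketch starts translating at the first base; a fuller version would first search for the AUG start codon.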

Page 30: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Codons and anticodons

DNA:  TAC CAT GAG ACT … ATC
mRNA: AUG GUA CUC UGA … UAG
tRNA: UAC CAU GAG ACU … AUC
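The codon/anticodon relationship in this table is a base-by-base RNA complement. The sketch below follows the slide's convention of writing the anticodon position by position under the codon (rather than in the 5'→3' order often used elsewhere); the `anticodon` helper name is illustrative.

```python
# RNA base pairing between an mRNA codon and a tRNA anticodon.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon(codon: str) -> str:
    """Return the anticodon paired position-by-position with an mRNA codon."""
    return "".join(RNA_PAIR[base] for base in codon)

mrna_codons = ["AUG", "GUA", "CUC", "UGA"]   # codons shown on the slide
trna_anticodons = [anticodon(c) for c in mrna_codons]
print(" ".join(trna_anticodons))             # UAC CAU GAG ACU
```

Written this way, each anticodon matches the DNA template codon with T replaced by U, which is the pattern visible in the three rows of the slide.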

Page 31: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Figure: flow from DNA to RNA (transcription) to protein (translation) to phenotype, annotated with associated resources: genomic DNA databases; cDNA, ESTs, UniGene; gene expression database; protein sequence databases; protein structure databases.

Page 32: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

List of Amino Acids (1)

Page 33: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

List of Amino Acids (2)

Page 34: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Transcription & Open Reading Frame (ORF)

Open Reading Frame (ORF)
• Where to start reading codons (ATG)
• 6 possible reading frames (3 forward, 3 backward)
• Gene is usually longest ORF found

Forward reading frame example
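The six reading frames and a longest-ORF search can be sketched as follows. This is a simplified illustration, not a gene finder: it assumes a coding-strand DNA string, treats an ORF as the stretch from an ATG to the next in-frame stop codon (TAA, TAG, TGA), and uses a short made-up sequence. All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read in its own forward direction."""
    return dna.translate(COMPLEMENT)[::-1]

def reading_frames(dna: str) -> list[str]:
    """Return six frames: offsets 0-2 forward, then offsets 0-2 on the reverse strand."""
    return [strand[offset:]
            for strand in (dna, reverse_complement(dna))
            for offset in range(3)]

def longest_orf(dna: str) -> str:
    """Return the longest ATG...stop stretch over all six frames ('' if none)."""
    best = ""
    for frame in reading_frames(dna):
        start = None
        for i in range(0, len(frame) - 2, 3):
            codon = frame[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                orf = frame[start:i + 3]       # include the stop codon
                if len(orf) > len(best):
                    best = orf
                start = None
    return best

dna = "GGATGGTACTCTGACC"                       # hypothetical example sequence
print(longest_orf(dna))                        # ATGGTACTCTGA (forward frame, offset 2)
```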

Page 35: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Complication – Non-coding Regions

Non-coding regions
• Very little genomic DNA produces proteins
• Exon: DNA expressed in protein (2–3% of human genome)
• Intron: DNA transcribed into mRNA but later removed
• Untranslated region (UTR): DNA not expressed as protein
• UTRs may affect gene regulation & expression

Biological processes
• Remove introns from mRNA, splice exons together
• Transition between intron / exon = splice site

Splicing can be inconsistent
• Some exons may be skipped
• Result = splice-variant gene / isoform
• Estimated 30% of human proteins from splice-variant genes
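Splicing and exon skipping can be illustrated with plain string slicing. In this sketch the pre-mRNA sequence and exon coordinates are invented for illustration (uppercase marks exons, lowercase marks introns), and the `splice` helper is hypothetical; real splice sites are recognized by sequence signals, not by given coordinates.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]], skip=()) -> str:
    """Join exon slices of a pre-mRNA in order, leaving out any skipped exons.

    exons: (start, end) half-open index pairs, in transcript order
    skip:  positions in the exons list to omit (models exon skipping)
    """
    return "".join(
        pre_mrna[start:end]
        for index, (start, end) in enumerate(exons)
        if index not in skip
    )

# Hypothetical pre-mRNA: exon 1, intron, exon 2, intron, exon 3.
pre_mrna = "AUGGCC" + "guaaag" + "UUCGAA" + "guccag" + "GGAUAA"
exons = [(0, 6), (12, 18), (24, 30)]

full_mrna = splice(pre_mrna, exons)              # all three exons joined
variant = splice(pre_mrna, exons, skip={1})      # exon 2 skipped: a splice variant

print(full_mrna)   # AUGGCCUUCGAAGGAUAA
print(variant)     # AUGGCCGGAUAA
```

The two outputs come from the same gene sequence, which is the sense in which one gene can yield more than one isoform.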

Page 36: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Non-coding regions

Page 37: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Transcription

The process of making RNA from DNA

Needs a promoter region to begin transcription.

Figure labels: control regions, exons, introns; steps shown: transcription, splicing

Page 38: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Alternative Splicing

A single gene can produce different forms of a protein. A single gene can contain numerous exons and introns, and the exons can be spliced together in different ways.

Page 39: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Complication: Mutations

Mutations
• Modifications during DNA replication

Possible changes
• Point mutation / single nucleotide polymorphism (SNP)
    5’ A T A C G T A …
    5’ A T G C G T A …
  SNPs occur every 100 to 300 bases along the 3-billion-base human genome
• Duplicate sequence
• Inverted sequence
• Insert / delete sequence (indel)
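A point mutation such as the one in the example can be located by comparing two equal-length sequences position by position. This is a minimal sketch with a hypothetical helper name; it uses 0-based positions and does not handle indels, which would require sequence alignment.

```python
def point_mutations(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """List (position, reference_base, variant_base) where two sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences differ in length; indels need alignment")
    return [
        (i, ref_base, var_base)
        for i, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]

# The two sequences from the slide: A at position 2 is replaced by G.
print(point_mutations("ATACGTA", "ATGCGTA"))   # [(2, 'A', 'G')]
```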

Page 40: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Mutations

Page 41: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Mutations

Page 42: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Outline

Cell DNA

DNA Structure DNA Sequencing

RNA (DNA-> RNA) Protein

Protein structure Protein synthesis

Page 43: COT 6930 HPC and Bioinformatics Introduction to Molecular Biology

Excellent Animation

Cell: http://www.youtube.com/watch?v=UB6G9GD2KFk

Central Dogma: http://www.youtube.com/watch?v=GkdRdik73kU