Introduction to Molecular Biology for Computer Scientists Dr. Suzanne Gollery – Sierra Nevada College Martin Gollery – Active Motif

Mar 27, 2015

Zachary Pearson
Page 1

Introduction to Molecular Biology for Computer Scientists

Dr. Suzanne Gollery – Sierra Nevada College

Martin Gollery – Active Motif

Page 2

Who are we?

Suzanne – Assistant Professor, Sierra Nevada College

Formerly at Baylor College of Medicine, UC Berkeley

Marty – Senior Scientist, Active Motif, Inc.

Formerly at University of Nevada, Reno, TimeLogic, etc.

Page 3

Why do bioinformaticists need it?

Avoiding mistakes

Understanding the purpose

Appreciating the difficulties

We will look at some of the programs applicable to these concepts

Page 4

Introduction to Molecular Biology for Computer Scientists

I. Protein structure and function

A. Protein structure

B. Protein function

II. Nucleic acid structure and function

A. The central dogma

B. Nucleic acid structure

C. Genetic code

D. Control of gene expression

E. Mutation

Page 5

Introduction to Molecular Biology for Computer Scientists

III. The centrality of evolution by natural selection in biology

A. Universal genetic code

B. Types of mutations and their effects

C. Genomes

D. Natural selection acts on random variation to produce evolution

Page 6

Introduction to Molecular Biology for Computer Scientists

I. Protein structure and function

A. Protein structure

B. Protein function

Page 7

Protein structure – Amino acids

An amino acid has four groups attached to a central carbon atom:

An amino group

A carboxyl group

A hydrogen atom

A variable side chain (R group)

Proteins use L isomers of amino acids

Page 8

Protein structure – 20 amino acids are used in proteins

Polar amino acid side chains have partial negative or positive electrostatic charges

O and N atoms “hog” electrons and have partial negative charges

Atoms attached to N and O have partial positive charges

Page 9

Protein structure – 20 amino acids are used in proteins

Non-polar amino acid side chains have no electrostatic charge

Page 10

Protein structure – 20 amino acids are used in proteins

Charged amino acids are acidic or basic, and donate or accept an H+ in cells

Some charged amino acids have a positive electrostatic charge

Other charged amino acids have a negative electrostatic charge

Page 11

Protein structure – 20 amino acids are used in proteins

Aromatic amino acids have a bulky carbon ring structure in their side chains

Page 12

Protein structure – Analysis

X-ray crystallography: interference patterns

Nuclear Magnetic Resonance (NMR)

~31,000 structures in the Protein Data Bank (PDB)

PDB is organized by other databases

Tends to emphasize certain types of proteins

Page 13

Protein structure - Polypeptides

Amino acids are joined to produce polypeptides

A water molecule is removed as the amino group of an amino acid reacts with the carboxyl group at the end of a polypeptide

The peptide bond (yellow) is a planar structure

Page 14

Protein structure – Protein folding

Chemical interactions among amino acids determine the final 3D shape (conformation) of a protein

Page 15

Protein structure – Protein folding

Page 16

Protein structure

Primary structure: the linear order of amino acids in a polypeptide

Secondary structure: regions of α-helix or β-sheet

Tertiary structure: globular folded polypeptide

Quaternary structure: multiple folded polypeptides form a complete protein – hemoglobin in this figure

Page 17

Protein Structure- Primary

Sequence yields structure/function

Homology searching paradigm: similar primary structures yield similar functions

Needleman-Wunsch, Smith-Waterman, FASTA, BLAST

Similarity scoring: amino acids with similar properties can replace each other without breaking structure
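
A minimal sketch of the similarity-scoring idea, using a Needleman-Wunsch style global alignment score. The property groups and the score values (2, 1, -1, gap -2) are assumptions picked for illustration; real programs use substitution matrices such as BLOSUM or PAM.

```python
# Hypothetical residue property groups, chosen only for illustration.
GROUPS = ["AVLIMC", "FWY", "STNQ", "KRH", "DE", "G", "P"]


def residue_score(a, b):
    """Identical residues score highest; residues in the same group still score positively."""
    if a == b:
        return 2
    if any(a in group and b in group for group in GROUPS):
        return 1
    return -1


def global_alignment_score(s, t, gap=-2):
    """Best score for aligning all of s against all of t (linear gap penalty)."""
    # prev[j] holds the best score for s[:i-1] aligned with t[:j].
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]
        for j in range(1, len(t) + 1):
            curr.append(max(
                prev[j - 1] + residue_score(s[i - 1], t[j - 1]),  # align the pair
                prev[j] + gap,       # residue of s against a gap
                curr[j - 1] + gap,   # residue of t against a gap
            ))
        prev = curr
    return prev[-1]


print(global_alignment_score("KDGII", "KDGVI"))
print(global_alignment_score("KDGII", "KEDCV"))
```

Under these assumed scores, the I/V replacement costs little because both residues sit in the same group, which is the point of similarity scoring.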

Page 18

Secondary structure

Prediction of secondary structure from primary sequence is tractable

Programs: Coils, PHD, Predator, JPRED

Secondary structure may be used to improve alignments

Page 19

Protein Folding- Prediction

Very computationally intensive

A short protein (100 amino acids) would take ~20 days straight on a petaflop computer

Force field approximation programs: CHARMM, AMBER

Accelerated versions based on Field Programmable Gate Array (FPGA) technology

Page 20

Protein structural Motifs

Page 21

Structural Motifs

Repeated or combined motifs form functional domains

Domains predict protein function

NAD(P)-binding domain of proteins that bind to NAD, an electron carrier

β-sandwich domain of cell surface recognition proteins (Ig, MHC, CD4)

Page 22

Functional motifs

eMATRIX/eMOTIF

MEME/MetaMEME

FingerPrintScan

PHI-BLAST

Page 23

Hidden Markov Models

Represent domains, motifs or proteins

Major programs include:

HMMer (hmmer.wustl.edu)

SAM (www.cse.ucsc.edu/research/compbio/sam)

Wise tools (www.ebi.ac.uk/Wise2/)

Meta-MEME (metameme.sdsc.edu/)

PSI-BLAST (www.ncbi.nlm.nih.gov/blast)

DeCypherHMM (www.timelogic.com)

Page 24

What are HMMs, anyway?

Statistical description of a protein family's consensus sequence

Conserved regions receive highest scores

Can be seen as a Finite State Machine

Page 25

Hidden Markov Models

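Aligned sequences:

yciH     KDGII
ZyciH    KDGVI
VCA0570  KDGDI
HI1225   KNGII
sll0546  KEDCV

Residue frequency at each alignment position:

Position  C    D    E    G    I    K    N    V
1         -    -    -    -    -    1.0  -    -
2         -    0.6  0.2  -    -    -    0.2  -
3         -    0.2  -    0.8  -    -    -    -
4         0.2  0.2  -    -    0.4  -    -    0.2
5         -    -    -    -    0.8  -    -    0.2

Contrast with regular-expression (RE) type motif, K[DEN][DG][CDIV][IV]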
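
The frequency table on this slide can be recomputed from the five aligned sequences, and the contrast with the RE motif made concrete. This is a simplified sketch, not how HMMer or SAM work internally: a real profile HMM also has insert and delete states and uses pseudocounts, all of which are omitted here. The function names are hypothetical.

```python
import re
from collections import Counter

# The five aligned sequences from the slide.
ALIGNMENT = ["KDGII", "KDGVI", "KDGDI", "KNGII", "KEDCV"]


def column_frequencies(seqs):
    """Return one {residue: frequency} dict per alignment column."""
    n = len(seqs)
    return [{aa: count / n for aa, count in Counter(column).items()}
            for column in zip(*seqs)]


def profile_probability(seq, profile):
    """Product of per-column frequencies; 0.0 if a residue was never observed."""
    p = 1.0
    for aa, column in zip(seq, profile):
        p *= column.get(aa, 0.0)
    return p


PROFILE = column_frequencies(ALIGNMENT)
PATTERN = re.compile(r"K[DEN][DG][CDIV][IV]")

# The RE gives a yes/no answer; the profile ranks the candidates.
for candidate in ["KDGII", "KEDCV", "KAGII"]:
    matches = PATTERN.fullmatch(candidate) is not None
    print(candidate, matches, profile_probability(candidate, PROFILE))
```

Both KDGII and KEDCV match the regular expression, but the profile scores the consensus-like KDGII far higher than the outlier KEDCV.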

Page 26

HMM databases

Pfam

TIGRfam

Superfamily

SMART

COG

KinFam

PirSF

Panther

KOG

…etc

Page 27

Protein structure – Modifications after protein synthesis

Prosthetic groups associate (Heme of cytochrome c)

Polypeptides are trimmed or cut

Sugars are attached

Other chemical groups are attached

Chaperone proteins assist in polypeptide folding

Page 28

Protein function – Binding to ligand

Ligand (antigen peptide) fits into a cleft on a protein (MHC) like two puzzle pieces fit together

Page 29

Protein function – Binding to ligand

SitesBase – information on known ligand binding sites from the PDB

LigBase adds related sequences and structures

Page 30

Protein function – Chymotrypsin binding to ligand: complementarity of electrostatic charge and 3D shape

Page 31

Protein function – proteins bind to other proteins

Myosin head binds to actin during muscle contraction

Protein shapes are complementary like puzzle pieces

Binding is reversible

Goodness of fit (shape, charge, hydrophobicity) determines affinity of protein/ligand binding

Page 32

Protein function – changes in shape are essential to protein function

Proteins are dynamic machines

Two or a few protein conformations may be of similar stability

Ligand binding can act as a switch to change protein conformation

Induced fit: binding of glucose (red) to hexokinase changes enzyme conformation

Page 33

Protein function – changes in shape are essential to protein function

Hemoglobin switches between a T (taut) deoxygenated and an R (relaxed) oxygenated state

Page 34

Protein function – changes in shape are essential to protein function

Binding of lactose to the lactose transport protein changes the shape of the protein.

Lactose (red) binds to the protein on one side of a cell membrane and is released on the other side

Movement across the membrane is reversible

Page 35

Protein function – changes in shape are essential to protein function

The Na+-K+ pump switches between two conformations when a phosphate group is added or removed

Attaching phosphate to the Na+-K+ pump switches the protein’s shape, moving Na+ outside the cell

Removing phosphate switches the protein’s shape again, moving K+ into the cell

Page 36

Protein function – changes in shape are essential to protein function

Myosin heads cycle between multiple conformations during muscle contraction

Binding and release of ATP, ADP, and phosphate (Pi) trigger changes in myosin head conformation

Myosin heads pull on actin to make muscle fibers shorter

Page 37

Protein function – changes in shape are essential to protein function

Intrinsically Disordered Proteins have roles in signalling, etc.

Some take shape only when interacting

Tend to form hubs in interaction networks

Predict with PONDR, Spritz, Wiggle, FoldUnfold

DisProt database of ID proteins

Page 38

Protein function - Phosphorylation

Phosphorylation changes a protein’s shape

Phosphorylation may turn a protein on or off

Kinases: enzymes that attach phosphates to proteins

Phosphatases: enzymes that remove phosphates

Other charged molecules and ions (cAMP, Ca++…) also bind to proteins to switch them on or off

Page 39

Protein structure – Modifications after protein synthesis

Post-Translational Modifications (PTM)

Phosphorylation takes place on S, T or Y, but only in certain situations

NetPhosK uses Artificial Neural Networks

KinasePhos uses HMMs to predict phosphorylation sites
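
A small sketch of the first half of that statement: listing every S, T or Y in a sequence as a candidate site. It is not a predictor. Deciding which candidates are actually phosphorylated ("only in certain situations") is the job of tools like NetPhosK and KinasePhos. The function name and the example peptide are made up for illustration.

```python
def candidate_phosphosites(protein):
    """Return (1-based position, residue) for every S, T or Y in the sequence."""
    return [(i, aa) for i, aa in enumerate(protein, start=1) if aa in "STY"]


# Hypothetical peptide, used only to show the output format.
print(candidate_phosphosites("MKSPTAYL"))
```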

Page 40

Protein Interaction Networks

Page 41

Introduction to Molecular Biology for Computer Scientists

II. Nucleic acid structure and function

A. The central dogma

B. Nucleic acid structure

C. Genetic code

D. Control of gene expression

E. Mutation

Page 42

The central dogma of molecular biology

DNA contains instructions for making RNA and protein

DNA is transcribed (copied) to make messenger RNA (mRNA)

mRNA is translated (instructions read) to make proteins

Page 43

Central dogma - Transcription

Transcription occurs in the nucleus

mRNAs are modified before transport to the cytoplasm

Other RNAs are also transcribed:

o tRNAs: “read” genetic code in mRNA

o rRNAs: backbone of ribosomes

o small RNAs: enzymes and regulators of gene function

Page 44

Central dogma - Translation

Translation occurs in the cytoplasm on ribosomes

tRNAs match the correct amino acids to codons on the mRNA

Ribosomal enzymes join amino acids to the growing polypeptide

The emerging polypeptide folds

Page 45

Nucleic acid structure - Nucleotides

Nucleotides are built from a sugar, base, and phosphate

Carbon atoms in the sugar are numbered

DNA has 2’ deoxyribose; RNA has ribose

DNA has G, A, T, and C

RNA has G, A, U, and C

Page 46

Nucleic acid structure - Polynucleotides

The 5’ phosphate of one nucleotide is joined to the 3’ hydroxyl group of the preceding nucleotide

The beginning of a nucleic acid has a free 5’ phosphate group

The end of a nucleic acid has a free 3’ hydroxyl group

Page 47

Nucleic acid structure - DNA

Double stranded DNA forms a helix

DNA strands are joined by hydrogen bonds between complementary bases

A always pairs with T; two hydrogen bonds

G always pairs with C; three hydrogen bonds
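
The pairing rules above translate directly into code. A minimal sketch, assuming strands are written 5' to 3' as uppercase strings of A, C, G and T; the function names are hypothetical.

```python
# Watson-Crick partners and hydrogen bonds per base pair, as on the slide.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand):
    """Sequence of the opposite strand, also read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def duplex_hydrogen_bonds(strand):
    """Total hydrogen bonds holding this strand to its complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand)


print(reverse_complement("ATGC"))
print(duplex_hydrogen_bonds("ATGC"))
```

Because G-C pairs contribute three bonds each, GC-rich duplexes are held together by more hydrogen bonds than AT-rich ones of the same length.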

Page 48

Nucleic acid structure - DNA

Each DNA strand serves as a template for replication or repair of the other strand, and for RNA synthesis

Page 49

Nucleic acid structure - RNA

RNA is single stranded

RNAs fold to form regions of internal double helix with complementary base pairs

Many functional RNAs have globular shapes (green: sugar-phosphate backbone; gray: paired bases)

Page 50

Genetic code – Linear code

One DNA strand serves as a template for mRNA synthesis

The linear order of nucleotides in the mRNA corresponds to the amino acid sequence of the polypeptide

Triplet codons specify insertion of amino acids

The DNA coding strand is complementary to the template strand, so its sequence is comparable to the mRNA (with T instead of U)
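
The relationship between template strand, coding strand and mRNA can be checked with a short sketch. It assumes the template strand is written 3' to 5', so that pairing base by base gives the mRNA and coding strand 5' to 3'. The example sequence and function names are made up for illustration.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """mRNA (5' to 3') synthesized from a template strand written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def coding_strand(template_3_to_5):
    """DNA strand complementary to the template, written 5' to 3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5)


template = "TACGGC"  # hypothetical template strand, 3' to 5'
print(transcribe(template))
print(coding_strand(template))
```

Replacing T with U in the coding strand reproduces the mRNA, which is why sequence databases usually report the coding strand.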

Page 51

Genetic code – triplet codons

tRNA “reads” genetic code: anticodon is complementary to the mRNA codon

Redundant genetic code: some amino acids are specified by multiple codons

First codon is AUG (Met)

Three stop codons specify termination of polypeptide synthesis
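
A minimal translation sketch under the rules on this slide: start at the first AUG, read triplets, stop at a stop codon. The codon table is the standard genetic code written as a 64-letter string in U, C, A, G order ("*" marks a stop). The example mRNA is made up, and real translation initiation is more selective than "first AUG".

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the polypeptide
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mRNA: 5' leader, AUG AAA GAU GGU AUC AUU, stop codon UAA.
print(translate("GGAUGAAAGAUGGUAUCAUUUAAGG"))

# Redundancy: count how many codons specify leucine.
print(sum(1 for aa in CODON_TABLE.values() if aa == "L"))
```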

Page 52

Control of Gene Expression – Gene Structure

RNA polymerase binds to DNA at the promoter to initiate transcription

Other proteins bind to sequences near the exons to regulate transcription

The information in genes (exons) is interrupted by non-coding sequences (introns) that are removed from RNA by splicing

5’ cap and poly(A) tail are added for mRNA stability

Page 53

Control of gene expression – DNA binding proteins can activate or inhibit transcription

Multiple proteins must bind to DNA to initiate transcription

Some proteins bind near the promoter, while others bind farther away at enhancers

DNA bends so that enhancer-binding proteins help RNA polymerase assemble on the promoter

Some proteins bind to DNA or to activator proteins to block transcription initiation

Page 54

Control of gene expression – how proteins bind to specific DNA sequences

DNA binding proteins insert an α-helix into the major groove of DNA

Helix-turn-helix domain

Page 55

Control of gene expression – how proteins bind to specific DNA sequences

The protein stalls on the DNA where amino acids form the maximum number of hydrogen bonds with nucleotide bases in the major groove

Page 56

Control of gene expression – DNA binding protein domains

Homeodomain

Zinc finger

Leucine zipper

Page 57

Control of gene expression – other points of control

Whether or not a gene is “expressed” can be controlled at any step that affects protein concentration and function

Alternative splicing (post-transcriptional processing) produces multiple proteins from one gene

Control of translation (siRNAs block translation)

Control of protein activity (phosphorylation switches proteins on or off)

mRNA or protein longevity (how quickly it is degraded)

Page 58

Control of gene expression – Alternative splicing

Page 59

Analysis of gene expression

Microarrays, GeneChips

Three classes of software:

Reading the images

Clustering the data, building associations: GeneSpring, GeneSifter, Bioconductor

Warehousing the data: GEO, SMD, YMD, others

Meta-analysis is difficult due to variability

Page 60

Mutations

Although DNA is replicated and repaired accurately, rare mistakes are made, which alter the nucleotide sequence

Exposure to some chemicals and radiation damages DNA, increasing the likelihood of mutation

Although rare, a few nucleotide sequence changes occur with each generation

Mutations introduce genetic variability in a population of individuals

Page 61

Introduction to Molecular Biology for Computer Scientists

III. The centrality of evolution by natural selection in biology

A. Universal genetic code

B. Types of mutations and their effects

C. Genomes

D. Natural selection acts on random variation to produce evolution

Page 62

Universal genetic code

All living organisms use essentially the same genetic code: CCC encodes proline in all cells

All organisms are descended from a common ancestor – all life on earth evolved from a common point of origin

The impressive variation in living organisms arose through random changes in nucleotide sequence (mutation) acted upon by natural selection over billions of years

Page 63

Mutation – nucleotide substitutions

Mutations are random changes in nucleotide sequence

Mutations in non-coding sequences and codon third position are often silent

Some nucleotide substitutions change the amino acid

Nonsense mutations introduce stop codons and truncate a polypeptide
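
The three outcomes of a nucleotide substitution can be sketched by comparing the amino acid encoded before and after the change. This assumes the standard genetic code and a DNA coding-strand codon that is not itself a stop codon; the function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code for coding-strand DNA codons, ordered TTT, TTC, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(codon, position, new_base):
    """Classify a single-base change at position 0, 1 or 2 of a sense codon."""
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if after == before:
        return "silent"
    if after == "*":
        return "nonsense"  # a new stop codon truncates the polypeptide
    return "missense"      # a different amino acid is inserted


print(classify_substitution("CCC", 2, "A"))  # third position of a proline codon
print(classify_substitution("CCC", 1, "A"))  # second position
print(classify_substitution("TAC", 2, "A"))  # tyrosine codon changed to TAA
```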

Page 64

Mutations - Frameshifts

Insertion or deletion of a nucleotide shifts the reading frame and drastically alters amino acid sequence
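
A short sketch of that effect: deleting one nucleotide from a made-up coding sequence and translating both versions. It assumes the standard genetic code and translation starting at the first base; the sequence and function name are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_dna(dna):
    """Translate a coding strand from its first base until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


original = "ATGAAAGATGGTATCATTTAA"        # ATG AAA GAT GGT ATC ATT TAA
mutant = original[:4] + original[5:]      # delete one A from the second codon

print(translate_dna(original))
print(translate_dna(mutant))
```

Every codon downstream of the deletion is read in a different frame, so the amino acid sequence changes from that point on, and here the original stop codon is no longer read in frame.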

Page 65

Mutations - Frameshifts

Frameshift tolerant matching programs add additional states corresponding to transitions to other reading frames

FrameSearch, Wise2, BLAST with OOF option

Page 66

Mutations – Chromosomal mutations

Large duplications generate multiple copies of genes

Large deletions remove genes from the genome

Duplications, deletions, inversions, and translocations are preserved in future generations, so can be used to trace evolutionary history

Page 67

Mutations – effects on organisms

Random mutation produces genetic variability that is acted upon by natural selection

Most mutations are deleterious: mutations occur randomly, and are more likely to disrupt protein function than alter it in a positive way

Deleterious mutations are eliminated by natural selection

Rare mutations that introduce altered, even beneficial functions are positively selected

Page 68

Genomes – Eukaryotic cells

Animals, Plants, Fungi, and some other organisms (Protists) have eukaryotic cells

Eukaryotic organisms have existed on earth for more than a billion years

Eukaryotic genomes are in linear pieces of DNA called chromosomes

Eukaryotic genes usually have introns

Eukaryotic genes are separated by lots of spacer DNA that does not encode proteins

Many repetitive sequences are present:

o Tandem repeats of short sequences, like GAC

o Transposons (for example, LINEs and SINEs)
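
Tandem repeats of a short unit such as GAC can be located with a regular-expression backreference. A minimal sketch; the unit length, the minimum copy number and the example sequence are assumptions for illustration, and dedicated repeat finders also tolerate imperfect copies.

```python
import re


def find_tandem_repeats(dna, unit_len=3, min_copies=3):
    """Return (start index, repeat unit, copy count) for each perfect tandem repeat."""
    # Capture one unit, then require it to repeat at least min_copies - 1 more times.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    return [(m.start(), m.group(1), len(m.group(0)) // unit_len)
            for m in pattern.finditer(dna)]


# Hypothetical sequence containing four copies of GAC.
print(find_tandem_repeats("TTGACGACGACGACTT"))
```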

Page 69

Genomes – Prokaryotic cells

Bacteria have prokaryotic cells

Prokaryotic organisms have existed on earth for billions of years

Most prokaryotic genomes consist of one circular DNA molecule

Most prokaryotic genes lack introns

Prokaryotic genomes have little spacer DNA; most DNA encodes known RNAs or proteins

There is much less repetitive DNA than in eukaryotic genomes

Plasmids, tiny circular DNA molecules separate from the main genome, are used in recombinant DNA technology to introduce genes into bacteria

Page 70

Genomes – evolution through chromosomal mutation

Gene duplications result in multigene families

After duplication, one gene can provide the original function while the other may evolve (through mutation and selection) a different function

The globin gene family evolved through duplication, mutation, and selection

Page 71

Genomes – evolution through chromosomal mutation

α- and β-globin genes vary in amino acid sequence, yet share the same conformation

Sequence differences are fairly conservative

Page 72

Multiple Sequence Alignment

Page 73

Multiple Sequence Alignment

Page 74

Genomes – evolution through chromosomal mutation

As species diverge, chromosomal mutations shuffle genome content

The more recently two species diverged, the more similar their genome organization

Both chromosomal and single nucleotide mutations can be used to trace the evolutionary history of species
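
For single nucleotide mutations, one simple way to compare species is the fraction of aligned positions that differ (often called the p-distance). The sketch below assumes the sequences are already aligned and of equal length; the three sequences are invented placeholders, not real data, and real phylogenetic methods correct for multiple changes at the same site.

```python
def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical aligned sequences from three species.
species_a = "ACGTACGT"
species_b = "ACGTACGA"
species_c = "ATGTTCGA"

print(p_distance(species_a, species_b))
print(p_distance(species_a, species_c))
```

Under the slide's reasoning, the pair with the smaller distance would be inferred to have diverged more recently.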

Genetic information in human chromosomes (blue) compared to dog chromosomes

Page 75

Credits

Many figures were borrowed from three sources:

o Nelson and Cox, Lehninger Principles of Biochemistry, 4e, WH Freeman and Co., 2005 ISBN: 0-7167-4339-6

o Raven, Johnson, Losos, Mason, and Singer, Biology, 8e, McGraw-Hill, 2008 ISBN: 978-0-07-296581-0

o Klug, Cummings, and Spencer, Essentials of Genetics, 6e, Pearson/Prentice Hall, 2007 ISBN: 0-13-224127-7

These texts and the accompanying on-line materials are excellent resources for learning more molecular biology

Page 76

Thank You!

Marty is at [email protected]

Suzanne is at [email protected]