1. INTRODUCTION TO BIOLOGY AND BIOINFORMATICS BIOINFORMATICS COURSE MTAT.03.239 11.09.2013

Apr 30, 2020


dariahiddleston

1. INTRODUCTION TO BIOLOGY AND BIOINFORMATICS

BIOINFORMATICS COURSE

MTAT.03.239

11.09.2013


2 "Introduction to Bioinformatics"

Bioinformatics Course

LIFE

Is a characteristic that distinguishes objects that have signaling and self-sustaining processes (i.e. living organisms) from those that do not

Is a state of living characterized by the capacity for metabolism, growth, reaction to stimuli, and reproduction

A diversity of life forms is found on Earth, e.g. plants, animals, fungi, protists, archaea and bacteria


WHAT IS BIOLOGY ?


http://www.tagxedo.com/app.html

BIOLOGY

Is the study of life and living organisms

It brings together the structure, function, growth, origin, distribution, adaptation, interactions, taxonomy and evolution of living organisms

AEROBIOLOGY, AGRICULTURE, ANATOMY, ASTROBIOLOGY, BIOCHEMISTRY, BIOENGINEERING, BIOINFORMATICS, BIOMATHEMATICS OR MATHEMATICAL BIOLOGY, BIOMECHANICS, BIOMEDICAL RESEARCH, BIOPHYSICS, BIOTECHNOLOGY, BUILDING BIOLOGY, BOTANY, CELL BIOLOGY, CONSERVATION BIOLOGY, CRYOBIOLOGY, DEVELOPMENTAL BIOLOGY, ECOLOGY, EMBRYOLOGY, ENTOMOLOGY, ENVIRONMENTAL BIOLOGY, EPIDEMIOLOGY, ETHOLOGY, EVOLUTIONARY BIOLOGY, GENETICS, HERPETOLOGY, HISTOLOGY, ICHTHYOLOGY, INTEGRATIVE BIOLOGY, LIMNOLOGY, MAMMALOGY, MARINE BIOLOGY, MICROBIOLOGY, MOLECULAR BIOLOGY, MYCOLOGY, NEUROBIOLOGY, OCEANOGRAPHY, ONCOLOGY, ORNITHOLOGY, POPULATION BIOLOGY, POPULATION ECOLOGY, POPULATION GENETICS, PALEONTOLOGY, PATHOBIOLOGY OR PATHOLOGY, PARASITOLOGY, PHARMACOLOGY, PHYSIOLOGY, PHYTOPATHOLOGY, PSYCHOBIOLOGY, SOCIOBIOLOGY, STRUCTURAL BIOLOGY, VIROLOGY

BIOLOGY COMPRISES AREAS OF STUDY THAT FOCUS ON LIFE AT A VARIETY OF LEVELS AND FROM A DIVERSITY OF PERSPECTIVES

LIVING SYSTEMS

Domain - Eukaryota
Kingdom - Animalia
Phylum - Chordata
Subphylum - Vertebrata
Class - Mammalia
Order - Primates
Suborder - Anthropoidea
Superfamily - Hominoidea
Family - Hominidae
Genus - Homo
Species - sapiens

HUMANS

Lineage (full): root; cellular organisms; Eukaryota; Opisthokonta; Metazoa; Eumetazoa; Bilateria; Coelomata; Deuterostomia; Chordata; Craniata; Vertebrata; Gnathostomata; Teleostomi; Euteleostomi; Sarcopterygii; Tetrapoda; Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; Primates; Haplorrhini; Simiiformes; Catarrhini; Hominoidea; Hominidae; Homininae; Homo; Homo sapiens

http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606
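The full lineage above is a semicolon-separated path from the root of the taxonomy down to the species, so it maps naturally onto a list. A minimal Python sketch follows; the names `parse_lineage` and `shared_ancestors` are illustrative helpers, not part of any NCBI tool.

```python
# Lineage string as shown on the slide (NCBI Taxonomy, taxon id 9606).
LINEAGE = (
    "root; cellular organisms; Eukaryota; Opisthokonta; Metazoa; Eumetazoa; "
    "Bilateria; Coelomata; Deuterostomia; Chordata; Craniata; Vertebrata; "
    "Gnathostomata; Teleostomi; Euteleostomi; Sarcopterygii; Tetrapoda; "
    "Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; Primates; "
    "Haplorrhini; Simiiformes; Catarrhini; Hominoidea; Hominidae; Homininae; "
    "Homo; Homo sapiens"
)


def parse_lineage(text):
    """Split a semicolon-separated lineage into a list of taxon names."""
    return [name.strip() for name in text.split(";") if name.strip()]


def shared_ancestors(lineage_a, lineage_b):
    """Return the common prefix of two lineages, root first."""
    common = []
    for a, b in zip(lineage_a, lineage_b):
        if a != b:
            break
        common.append(a)
    return common


human = parse_lineage(LINEAGE)
print(len(human), "levels; most specific:", human[-1])
print("Is a mammal:", "Mammalia" in human)
```

Comparing two such lists with `shared_ancestors` gives the taxa that two organisms have in common, which is one simple way to see how closely they are related.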


SPECIES

Defined as a group of living organisms consisting of similar individuals capable of exchanging genes or interbreeding

http://www.nature.com/news/2011/110823/full/news.2011.498.html


NO. OF SPECIES

http://www.iucnredlist.org/documents/summarystatistics/2010_1RL_Stats_Table_1.pdf


LEVELS OF ORGANISATION

http://www.nature.com/scitable/topicpage/biological-complexity-and-integrative-levels-of-organization-468


BIOLOGICAL QUESTIONS

How are all life-forms related?
What was the first cell like?
How do species adapt to their environment?
Which part of our genome is evolving the fastest?
Are we descendants of Neanderthals?
What genes are responsible for major human diseases?
Why do we need new flu vaccines every year?

Introduction to Computational Genomics, Nello Cristianini and Matthew W. Hahn


BIOINFORMATICS ?

COMPUTER SCIENCE [CS]

STUDIES COMPUTABLE PROCESSES AND STRUCTURES (WITH THE AID OF COMPUTERS)

BIOINFORMATICS AND COMPUTATIONAL BIOLOGY

The boundaries between the two disciplines are not well defined, but they can be distinguished by the problems they solve

BIOINFORMATICS – the application of statistics and computer science to the field of molecular biology

COMPUTATIONAL BIOLOGY – the actual process of analyzing and interpreting data

DEFINITION OF BIOINFORMATICS

The term bioinformatics was coined in 1978

Bioinformatics is the application of information technology and computer science to the field of molecular biology

The science of using / developing computer software and algorithms to record, analyze and merge biologically related data

Using computer technology to manage large amounts of biological data

Bioinformatics involves the use of techniques including applied mathematics, informatics, statistics, computer science, artificial intelligence, chemistry, and biochemistry to solve biological problems, usually on the molecular level

The collection, organization, storage, analysis, and integration of large amounts of biological data using networks of computers and databases

Bioinformatics involves the integration of computers, software tools, and databases in an effort to address biological questions

In summary, the use of computer science to solve biological problems

http://www.google.com/search?q=define%3ABioinformatics


BIOINFORMATIC FOCUS

http://www.nature.com/scitable/topicpage/biological-complexity-and-integrative-levels-of-organization-468

ORGANISM

ORGAN

TISSUE

CELL

MOLECULES


ANALYSIS AND INTERPRETATION OF VARIOUS TYPES OF BIOLOGICAL DATA INCLUDING: NUCLEOTIDE AND AMINO ACID SEQUENCES, PROTEIN DOMAINS, AND PROTEIN STRUCTURES.

Development of new algorithms and statistics with which to assess biological information, such as relationships among members of large data sets.

http://www.nature.com/msb/journal/v3/n1/images/msb4100163-f4b.jpg

Development and implementation of tools that enable efficient access and management of different types of information, such as various databases and integrated mapping information

http://www.jofwidata.com/images/database-design-development.jpg
http://wolfson.huji.ac.il/expression/detective.jpg

UNITS OF INFORMATION IN BIOINFORMATICS

DNA Sequence Pathways

RNA Structure Interactions

Protein Evolution Mutations

UNITS OF INFORMATION IN COMPUTER SCIENCE

File storage capacity in bits and bytes. Each unit is 1024 times the previous one, e.g. 1024*8 = 8,192 bits in a kilobyte and 1024*8,192 = 8,388,608 bits in a megabyte.

Unit | Bits | Bytes | KB | MB | GB | TB | PB
Bit | 1
Byte | 8 | 1
Kilobyte | 8,192 | 1,024 | 1
Megabyte | 8,388,608 | 1,048,576 | 1,024 | 1
Gigabyte | 8,589,934,592 | 1,073,741,824 | 1,048,576 | 1,024 | 1
Terabyte | 8,796,093,022,208 | 1,099,511,627,776 | 1,073,741,824 | 1,048,576 | 1,024 | 1
Petabyte | 9,007,199,254,740,992 | 1,125,899,906,842,624 | 1,099,511,627,776 | 1,073,741,824 | 1,048,576 | 1,024 | 1

Unit | Bits | Bytes | KB | MB | GB | TB | PB | EB | ZB | YB
Exabyte | 9,223,372,036,854,775,808 | 1,152,921,504,606,846,976 | 1,125,899,906,842,624 | 1,099,511,627,776 | 1,073,741,824 | 1,048,576 | 1,024 | 1
Zettabyte | 9,444,732,965,739,290,427,392 | 1,180,591,620,717,411,303,424 | 1,152,921,504,606,846,976 | 1,125,899,906,842,624 | 1,099,511,627,776 | 1,073,741,824 | 1,048,576 | 1,024 | 1
Yottabyte | 9,671,406,556,917,033,397,649,408 | 1,208,925,819,614,629,174,706,176 | 1,180,591,620,717,411,303,424 | 1,152,921,504,606,846,976 | 1,125,899,906,842,624 | 1,099,511,627,776 | 1,073,741,824 | 1,048,576 | 1,024 | 1
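The storage units above all follow from repeated multiplication by 1024, with 8 bits per byte. A small Python sketch that regenerates the values; the names `UNITS`, `bytes_in` and `bits_in` are illustrative. Python integers have arbitrary precision, so even the yottabyte values come out exact.

```python
# Unit names in increasing order; each is 1024 times the previous one.
UNITS = ["byte", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def bytes_in(unit):
    """Number of bytes in one `unit` (binary convention, step of 1024)."""
    return 1024 ** UNITS.index(unit)


def bits_in(unit):
    """Number of bits in one `unit` (8 bits per byte)."""
    return 8 * bytes_in(unit)


for unit in UNITS:
    print(f"1 {unit:>4} = {bytes_in(unit):,} bytes = {bits_in(unit):,} bits")
```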

CELL SIZES

http://learn.genetics.utah.edu/content/begin/cells/scale/


HUMAN CELL

http://bhavanajagat.files.wordpress.com/2012/02/cell-structure-and-functions.jpg

EXAMPLES OF BIOLOGICAL DATA

GENOME – DNA
TRANSCRIPTOME – RNA
PROTEOME – Proteins

The biological information contained in a genome is encoded in deoxyribonucleic acid (DNA) or, for many types of virus, in ribonucleic acid (RNA)

NAME THE NUMBERS

[Figure: diagram with parts numbered 1 to 5, to be matched to the labels NUCLEUS, DNA, GENES, CHROMOSOME, CELL]


CENTRAL DOGMA OF MOLECULAR BIOLOGY

http://compbio.pbworks.com/f/central_dogma.jpg

DNA is transcribed into RNA and RNA is translated into proteins


http://www.uic.edu/classes/bios/bios100/lectures/centraldogma.jpg

GENOME

Is the entirety of an organism’s hereditary information

The genome includes both the genes and the non-coding sequences of DNA/RNA

In July 1995, Haemophilus influenzae became the first living organism to have its genome sequenced

1 830 140 base pairs of DNA in a single circular chromosome that contains 1740 protein-coding genes, 58 transfer RNA genes and 18 other RNA genes

http://www.sciencemag.org/content/269/5223/local/front-matter.pdf
http://en.wikipedia.org/wiki/File:Haemophilus_influenzae_01.jpg

WHOLE GENOMES

GENOME SIZES

Introduction to Computational Genomics, Nello Cristianini and Matthew W. Hahn

Japanese flower Paris japonica: about 150 billion base pairs, roughly 50 times the human genome

COMPLETELY SEQUENCED GENOMES

HUMAN GENOME

Human body

• 10^14 cells (100 trillion)

One cell

• 23 pairs of chromosomes

DNA

• ≈21,000 to 23,000 genes

RNA

• 3 billion pairs of DNA bases

Protein

• ≈100 000 different proteins
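To connect the 3 billion base pairs above with the units of information used in computer science, here is a rough estimate of how much storage one copy of the human genome sequence needs. This is a sketch under stated assumptions: each base is packed into 2 bits (A, C, G, T only, no ambiguity codes or annotation), and the function names are illustrative.

```python
HUMAN_GENOME_BP = 3_000_000_000  # approximate figure quoted on the slide

MB = 1024 ** 2
GB = 1024 ** 3


def packed_size_bytes(base_pairs, bits_per_base=2):
    """Bytes needed when every base is packed into `bits_per_base` bits."""
    total_bits = base_pairs * bits_per_base
    return (total_bits + 7) // 8  # round up to whole bytes


def text_size_bytes(base_pairs):
    """Bytes needed when every base is stored as one ASCII character."""
    return base_pairs


packed = packed_size_bytes(HUMAN_GENOME_BP)
plain = text_size_bytes(HUMAN_GENOME_BP)
print(f"2-bit packed: {packed / MB:.0f} MB")
print(f"plain text:   {plain / GB:.2f} GB")
```

Real sequence files are larger because they also carry headers, line breaks and, for raw sequencing reads, quality scores.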

Relative proportions (%) of bases in DNA

CURRENT SCIENCE, VOL. 85, NO. 11, 10 DECEMBER 2003

DNA

DNA with high GC-content is more stable than DNA with low GC-content: each G-C base pair is held together by 3 hydrogen bonds, each A-T base pair by only 2
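GC-content is simple to compute from a sequence string. The sketch below counts the fraction of G and C bases and, assuming standard Watson-Crick pairing (3 hydrogen bonds per G-C pair, 2 per A-T pair), the number of hydrogen bonds in the corresponding double strand. The function names are illustrative.

```python
def gc_content(sequence):
    """Fraction of bases in `sequence` that are G or C (0.0 to 1.0)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def hydrogen_bonds(sequence):
    """Hydrogen bonds formed when `sequence` pairs with its complement."""
    seq = sequence.upper()
    strong = seq.count("G") + seq.count("C")  # G-C pairs: 3 bonds each
    weak = seq.count("A") + seq.count("T")    # A-T pairs: 2 bonds each
    return 3 * strong + 2 * weak


fragment = "ATGTGGCCACTGCTCACC"  # first 18 bases of an example transcript
print(f"GC-content: {gc_content(fragment):.2f}")
print("Hydrogen bonds:", hydrogen_bonds(fragment))
```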

DNA vs RNA

DNA – deoxyribonucleic acid
Sugar is deoxyribose
DNA is a polymer of deoxyribonucleotides
Bases are adenine (A), guanine (G), cytosine (C) and thymine (T)

RNA – ribonucleic acid
Sugar is ribose
RNA is a polymer of ribonucleotides
Bases are adenine (A), guanine (G), cytosine (C) and uracil (U)

http://www2.chemistry.msu.edu/faculty/reusch/VirtTxtJml/Images3/dna_rna1.gif
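At the sequence level, the difference between the two alphabets is thymine (T) in DNA versus uracil (U) in RNA. A minimal sketch that checks which alphabet a string uses and writes a coding-strand DNA sequence as RNA; the helper names are illustrative, and the sketch ignores ambiguity codes such as N.

```python
DNA_BASES = set("ACGT")
RNA_BASES = set("ACGU")


def is_dna(sequence):
    """True if the sequence uses only the DNA alphabet A, C, G, T."""
    return bool(sequence) and set(sequence.upper()) <= DNA_BASES


def is_rna(sequence):
    """True if the sequence uses only the RNA alphabet A, C, G, U."""
    return bool(sequence) and set(sequence.upper()) <= RNA_BASES


def transcribe(coding_dna):
    """Write the coding strand as RNA by replacing every T with U."""
    if not is_dna(coding_dna):
        raise ValueError("not a DNA sequence")
    return coding_dna.upper().replace("T", "U")


print(transcribe("ATGTGGCCACTG"))
```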

DNA SEQUENCE

Raw DNA sequence
Coding or non-coding
Parses into genes
4 nucleotide bases: A, T, G, C

>ENST00000539570 cdna:known chromosome:GRCh37:15:63889592:63893885:1 gene:ENSG00000259662 gene_biotype:protein_coding transcript_biotype:protein_coding
ATGTGGCCACTGCTCACCATGCACATAACCCAGCTCAACCGGGAGTGCCTGCTGCACCTCTTCTCCTTCCTAGACAAGGACAGCAGGAAGAGCCTTGCCAGGACCTGCTCCCAGCTCCACGACGTGTTTGAGGACCCCGCACTCTGGTCCCTGCTGCACTTCCGTTCCCTCACTGAACTCCAGAAGGACAACTTCCTCCTGGGCCCGGCACTCCGCAGCCTCTCCATCTGCTGGCACTCCAGCCGCGTGCAGGTGTGCAGCATTGAGGACTGGCTCAAGAGTGCCTTCCAGAGAAGCATCTGCAGCCGGCACGAGAGCCTGGTCAATGATTTCCTCCTCCGGGTGTGCGACAGGCTTTCTGCTGTGCGCTCCCCACGGAGGCGGGAGGCGCCTGCACCGTCCTCGGGGACTCCGATCGCCGTTGGACCGAAATCACCTCGGTGGGGAGGACCTGACCACTCGGAGTTCGCCGACTTGCGCTCGGGGGTGACGGGGGCCAGGGCTGCCGCGCGCAGGGGTCTGGGGAGCCTCCGGGCGGAGCGACCCAGCGAGACCCCGCCGGCTCCCGGAGTGTCCTGGGGACCGCCACCTCCAGGAGCCCCGGTGGTGATCTCGGTGAAGCAGGAGGAGGGGAAGCAGGGGCGCACGGGCAGAAGGAGCCACCGAGCCGCTCCTCCTTGCGGTTTTGCCCGCACGCGCGTCTGCCCGCCCACCTTTCCTGGGGCGGATGCGTTCCCGCAGTGA
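The record above is in FASTA format: a header line starting with ">" followed by the sequence. A minimal parser sketch follows, assuming the header layout shown on the slide (an identifier followed by space-separated `key:value` fields). `parse_fasta` and `parse_header` are illustrative helpers rather than a library API, and the example uses only the first 60 bases of the slide's sequence.

```python
EXAMPLE = """>ENST00000539570 cdna:known chromosome:GRCh37:15:63889592:63893885:1 gene:ENSG00000259662 gene_biotype:protein_coding transcript_biotype:protein_coding
ATGTGGCCACTGCTCACCATGCACATAACC
CAGCTCAACCGGGAGTGCCTGCTGCACCTC
"""


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def parse_header(header):
    """Split 'ID key:value key:value ...' into a dictionary."""
    identifier, *fields = header.split()
    info = {"id": identifier}
    for field in fields:
        key, _, value = field.partition(":")  # split on the first colon only
        info[key] = value
    return info


header, sequence = parse_fasta(EXAMPLE)[0]
info = parse_header(header)
print(info["id"], info["gene"], len(sequence), "bases")
```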

A GENE

http://www.down-syndrome.org/updates/2054/updates-2054-figure1-400w.png

GENE EXPRESSION REGULATORS - EPIGENETICS

http://scienceblogs.com/pharyngula/2008/07/22/epigenetics/

EXAMPLES OF BIOLOGICAL DATA

GENOME – DNA
TRANSCRIPTOME – RNA
PROTEOME – Proteins

The transcriptome is the set of all RNA molecules, including mRNA, rRNA, tRNA and other non-coding RNA, produced in one cell or a population of cells

http://www.bio.miami.edu/~cmallery/150/gene/c7.17.7b.transcription.jpg

TRANSCRIPTION

http://www.youtube.com/watch?v=ztPkv7wc3yU


ALTERNATIVE SPLICING

http://www.nature.com/scitable/content/a-schematic-representation-of-alternative-splicing-95777

TYPES OF RNA

http://csls-text.c.u-tokyo.ac.jp/images/fig/fig03_4.gif

mRNA – messenger RNA: encodes the amino acid sequence of a polypeptide

tRNA – transfer RNA: brings amino acids to ribosomes during translation

rRNA – ribosomal RNA: together with ribosomal proteins makes up the ribosomes, the organelles that translate the mRNA

snRNA – small nuclear RNA: forms complexes with proteins that are used in RNA processing in eukaryotes


http://finchtalk.geospiza.com/2009_05_01_archive.html

EXAMPLES OF BIOLOGICAL DATA

GENOME – DNA
TRANSCRIPTOME – RNA
PROTEOME – Proteins

The proteome is the entire set of proteins expressed by a genome, cell, tissue or organism.

http://artavanis-tsakonas.med.harvard.edu/research_images/figure_harsha_proteome.jpg

FROM TRANSCRIPTION TO TRANSLATION

http://www1.cs.columbia.edu/~cleslie/cs4761/microarray/central-dogma.png

TRANSLATION

http://0.tqn.com/d/chemistry/1/0/G/m/mrnatranslation.jpg

TRANSLATION INITIATION

http://bioap.wikispaces.com/Ch+17+Collaboration

TRANSLATION TERMINATION

http://kvhs.nbed.nb.ca/gallant/biology/translation_termination.html

UNIVERSAL GENETIC CODE

http://www.biogem.org/codon.jpg
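The genetic code maps each of the 64 three-base codons to an amino acid or a stop signal. The sketch below builds the standard code from its usual compact listing (bases in the order T, C, A, G; "*" marks a stop codon) and translates a coding sequence from its first base until the first stop. It assumes the standard code and reading frame 1; `CODON_TABLE` and `translate` are illustrative names.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for codons TTT, TTC, TTA, TTG, TCT, ... GGG (standard code).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence):
    """Translate a DNA or RNA coding sequence until the first stop codon."""
    dna = sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGTGGCCACTGCTCACCATGCACATAACC"))
```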


AMINO ACIDS


PROTEIN

Proteins consist of long chains of amino acids
20-letter alphabet (IUPAC nomenclature)

IUPAC amino acid code | Three letter code | Amino acid
A | Ala | Alanine
C | Cys | Cysteine
D | Asp | Aspartic Acid
E | Glu | Glutamic Acid
F | Phe | Phenylalanine
G | Gly | Glycine
H | His | Histidine
I | Ile | Isoleucine
K | Lys | Lysine
L | Leu | Leucine
M | Met | Methionine
N | Asn | Asparagine
P | Pro | Proline
Q | Gln | Glutamine
R | Arg | Arginine
S | Ser | Serine
T | Thr | Threonine
V | Val | Valine
W | Trp | Tryptophan
Y | Tyr | Tyrosine
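The one-letter and three-letter amino acid codes above translate directly into a lookup table. A short sketch that converts a one-letter protein string into three-letter codes; the names `THREE_LETTER` and `to_three_letter` are illustrative.

```python
# One-letter IUPAC code -> three-letter code for the 20 standard amino acids.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def to_three_letter(protein, separator="-"):
    """Convert a one-letter protein sequence to three-letter codes."""
    try:
        return separator.join(THREE_LETTER[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"not a standard amino acid code: {err.args[0]}")


print(to_three_letter("MYNMME"))
```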

PROTEIN SEQUENCE

>sp|P48431|SOX2_HUMAN Transcription factor SOX-2 OS=Homo sapiens GN=SOX2 PE=1 SV=1
MYNMMETELKPPGPQQTSGGGGGNSTAAAAGGNQKNSPDRVKRPMNAFMVWSRGQRRKMA
QENPKMHNSEISKRLGAEWKLLSETEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTLM
KKDKYTLPGGLLAPGGNSMASGVGVGAGLGAGVNQRMDSYAHMNGWSNGSYSMMQDQLGY
PQHPGLNAHGAAQMQPMHRYDVSALQYNSMTSSQTYMNGSPTYSMSYSQQGTPGMALGSM
GSVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMISMYLPGAEVPEPAAPSRLHMSQHYQS
GPVPGTAINGTLPLSHM
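The header line of the protein record above packs several fields into one string: database, accession, entry name, a free-text description, and `KEY=value` fields such as OS (organism) and GN (gene name). A sketch of how it could be split, assuming the layout seen in this one example; `parse_uniprot_header` is an illustrative helper, not an official parser.

```python
import re

HEADER = ">sp|P48431|SOX2_HUMAN Transcription factor SOX-2 OS=Homo sapiens GN=SOX2 PE=1 SV=1"


def parse_uniprot_header(header):
    """Split a 'db|accession|entry description KEY=value ...' header line."""
    ids, _, rest = header.lstrip(">").partition(" ")
    database, accession, entry_name = ids.split("|")
    # Split before each two-letter KEY= field so values may contain spaces.
    parts = re.split(r"\s(?=(?:OS|OX|GN|PE|SV)=)", rest)
    info = {
        "database": database,
        "accession": accession,
        "entry_name": entry_name,
        "description": parts[0],
    }
    for part in parts[1:]:
        key, _, value = part.partition("=")
        info[key] = value
    return info


info = parse_uniprot_header(HEADER)
print(info["accession"], info["GN"], info["OS"])
```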

PROTEIN SIZE

http://www.quora.com/Protein-nutrition-1/Whats-the-average-size-of-a-human-protein-in-kDa

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1150220/


PROTEIN STRUCTURE

http://upload.wikimedia.org/wikipedia/commons/thumb/0/05/Protein_structure.png/1024px-Protein_structure.png


PROTEIN DOMAINS


Proteins are divided into domains

DNA BINDING DOMAIN

http://www.uniprot.org/


GENE TRANSCRIPTION, TRANSLATION AND PROTEIN SYNTHESIS

http://compbio.pbworks.com/f/central_dogma.jpg


CENTRAL DOGMA

BIOINFORMATIC APPLICATIONS

The integrative approaches are useful and applied in:

Agricultural
• Higher yield in crops or fruits
• Disease- or drought-resistant crops

Medical
• To understand processes in healthy and diseased individuals
• Genetic diseases

Pharmaceutical
• To find or develop new and better drugs
• Gene-based drugs
• Structure-based drug design

BIOINFORMATIC QUESTIONS 1

To identify an unknown gene of interest

Sequence matching

Is there a match to known sequence in the database

Which protein family does it match to

How to identify more family members

I have a similar structure, how to identify its potential ligands

How to identify whether my gene/protein is also present in other species

How can I identify genes that are inherited together in a specific region

BIOINFORMATIC QUESTIONS 2

I have to construct an artificial gene: how do I design the primers, and how do I check that I have the right sequence?

To know the structure of a poorly expressed RNA sequence

To identify the structure and function of a protein sequence

To cluster protein sequences into families of related sequences and develop models

To generate phylogenetic trees to identify the evolutionary relationships using similar proteins/DNA

To identify which other proteins interact with the sequence of interest

BIOINFORMATIC QUESTIONS 3

Find genes that have similar expression in specific conditions

Find transcription factors that regulate specific genes

Visualise different gene and protein networks

Describe the regulation of genes
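As one illustration of the first question in this list, genes with similar expression can be found by correlating their expression profiles across conditions. The sketch below uses the Pearson correlation coefficient, which is one common choice and an assumption here rather than a method named in the slides. The expression values and gene names are invented toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("constant profile has no defined correlation")
    return cov / (sd_x * sd_y)


def most_similar(target, profiles):
    """Rank the other genes by correlation with `target`, highest first."""
    scores = [
        (gene, pearson(profiles[target], values))
        for gene, values in profiles.items()
        if gene != target
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy expression levels in four conditions (hypothetical values).
PROFILES = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

for gene, score in most_similar("geneA", PROFILES):
    print(f"{gene}: r = {score:+.2f}")
```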