
Information Technology As A CATALYST in Basic Biological Research

Feb 03, 2016

Page 1: Information Technology As A CATALYST in Basic Biological Research

Information Technology As A CATALYST in Basic Biological Research

Sudha Bhattacharya

J.N.U.

New Delhi

Page 2: Information Technology As A CATALYST in Basic Biological Research

Mining of gene Sequence Data

Pattern finding in DNA

Page 3: Information Technology As A CATALYST in Basic Biological Research

Specific Example

The Retrotransposons in the Entamoeba histolytica genome

Page 4: Information Technology As A CATALYST in Basic Biological Research

Retrotransposons

Mobile DNA elements

Some insert in a sequence-specific manner

Others are widely distributed

Can disrupt the function of genes, resulting in diseases

Page 5: Information Technology As A CATALYST in Basic Biological Research

What Information can Bioinformatics provide?

I. Defining the element.

II. Where is the element located in the genome.

III. Pattern Finding in preinsertion sites.

Page 6: Information Technology As A CATALYST in Basic Biological Research

I. Defining the Element

Its size

Copy number in the genome

Are all copies full length?

Are all copies functional?

To which group does this element belong (DNA transposon, LTR retrotransposon, non-LTR retrotransposon)?

Page 7: Information Technology As A CATALYST in Basic Biological Research

Empty site

Post insertion

Defining the end points of the element by sequence alignment

Constructing a consensus sequence with no mutations

Type of element: deduced by BLAST search, using the sequence of the reconstructed element

(could be truncated)

Reconstructed consensus element
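
The consensus step on this slide can be illustrated with a minimal sketch. It assumes the copies have already been aligned to equal length (gaps written as "-") and takes a simple per-column majority vote; the function name `consensus` and the voting rule are illustrative assumptions, not the procedure actually used for the EhLINEs/SINEs.

```python
from collections import Counter

def consensus(aligned_copies, gap="-"):
    """Majority-vote consensus over equal-length, pre-aligned sequences.

    Hypothetical helper: gaps take part in the vote, and a column whose
    most common symbol is a gap is left out of the consensus.
    """
    if not aligned_copies:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_copies[0])
    if any(len(s) != length for s in aligned_copies):
        raise ValueError("aligned sequences must all have the same length")
    out = []
    for column in zip(*aligned_copies):
        counts = Counter(base.upper() for base in column)
        # Ties are resolved in favour of the symbol seen first in the column.
        base, _ = counts.most_common(1)[0]
        if base != gap:
            out.append(base)
    return "".join(out)
```

Because gaps vote like any other symbol, truncated copies padded with gaps can outvote real bases near the ends; a production pipeline would likely weight or exclude such padding.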

Page 8: Information Technology As A CATALYST in Basic Biological Research

Consensus structures of EhLINEs/SINEs

Bakre Abhijeet

Page 9: Information Technology As A CATALYST in Basic Biological Research

Genomic abundance of full-length and truncated copies of EhLINEs and EhSINEs.

Page 10: Information Technology As A CATALYST in Basic Biological Research

II. Where is the element located in the genome.

Element Analyzer (ELAN): a tool that searches the genome and locates all the elements.
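
As a rough illustration of what locating copies in a genome involves, the sketch below searches for short probes taken from the two ends of an element and pairs them. It is not ELAN's algorithm: the functions `find_all` and `locate_copies`, the exact-match search, the forward-strand-only scan and the `max_len` cutoff are all assumptions made for this example.

```python
def find_all(text, probe):
    """Return every start index of probe in text, including overlapping hits."""
    hits, i = [], text.find(probe)
    while i != -1:
        hits.append(i)
        i = text.find(probe, i + 1)
    return hits

def locate_copies(genome, five_prime, three_prime, max_len):
    """Pair each 3' end hit with the nearest upstream 5' end hit, if any.

    Returns (start, end, label) tuples; start is None when no 5' end probe
    lies within max_len of the 3' end, which is labelled "truncated" here.
    """
    starts = find_all(genome, five_prime)
    copies = []
    for end_hit in find_all(genome, three_prime):
        end = end_hit + len(three_prime)
        candidates = [s for s in starts if s < end_hit and end - s <= max_len]
        if candidates:
            copies.append((max(candidates), end, "full-length"))
        else:
            copies.append((None, end, "truncated"))
    return copies
```

A real search would also scan the reverse strand and tolerate mismatches, since genomic copies accumulate mutations.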

Page 11: Information Technology As A CATALYST in Basic Biological Research

ELAN

Page 12: Information Technology As A CATALYST in Basic Biological Research

Occurrence of genes and other elements near EhLINEs/SINEs

Page 13: Information Technology As A CATALYST in Basic Biological Research

GENES LOCATED UPSTREAM OF EhLINE 1

GENE NAME                                                                 Distance (kb)
E. histolytica Superoxide Dismutase                                       1.917
E. histolytica cysteine proteinase                                        2.86
homolog of Ribosomal protein L7                                           0.7
homolog of unknown protein [A. thaliana]                                  1.3
homolog of coatomer complex                                               1.16
E. histolytica heat shock protein Hsp70                                   1.219
E. histolytica gene for amebapore                                         2.63
homolog of AAA10008879                                                    ~1.9
homolog of putative protein kinase [A. thaliana]                          ~1.7
homolog of NP_660094                                                      1.38
homologous to AB006697                                                    1.1
SSE58 repeat region                                                       0.3-0.8
homologous to T31094                                                      0.3
homologous to Df gene product of D. melanogaster                          0.9
homologous to LRP 16 [M. musculus]                                        ~1.6
homologous to AP003806 [Oryza sativa]                                     1.8
homologous to mitochondrial energy transfer protein [Solanum tuberosum]   ~1.8
Phosphomannose isomerase homolog [A. thaliana]                            0.7
homologous to CAAX prenyl protease [A. thaliana]                          0.25

Average distance at which a gene is located: 1.3 kb

Percentage of hits where an ORF was found: 64%

Page 14: Information Technology As A CATALYST in Basic Biological Research

GENES DOWNSTREAM OF EhRLE                                                                                   DISTANCE FROM THE 3’ END OF ELEMENT (bp)
NP_691986 hypothetical conserved protein [Oceanobacillus iheyensis] gi|23098520|ref|NP_691986.1|[23098520]  500
NP_473455 T-cell activation Rho GTPase-activating protein isoform b [Homo sapiens] gi|21314774|             952
BAC04765 unnamed protein product [Homo sapiens] gi|21755816|                                                2236
NP_562800 hypothetical protein [Clostridium perfringens] gi|18310866|                                       2718
NP_345179 metallo-beta-lactamase superfamily protein [Streptococcus pneumoniae TIGR4]                       397
NP_070339 long-chain-fatty-acid--CoA ligase (fadD-6) [Archaeoglobus fulgidus]                               781
EAA14243 agCP8299 [Anopheles gambiae str. PEST]                                                             2833
AAM43731 Prestalk protein precursor. [Dictyostelium discoideum]                                             2772
AAM34385 hypothetical protein [Dictyostelium discoideum]                                                    749
XP_124364 similar to RIKEN cDNA 2410043F08 [Mus musculus]                                                   795
NP_473326 predicted using hexExon; MAL3P7.21 (PFC0960c),                                                    1199

Genes located downstream of EhLINE 1

Analysis of the genes both upstream and downstream of the element shows that EhLINE 1 has invaded the genome widely.

Page 15: Information Technology As A CATALYST in Basic Biological Research

III. Pattern Finding

Although the element inserts in many locations, it has some preferences. What are these?

Page 16: Information Technology As A CATALYST in Basic Biological Research

Preferred sites

The sites that are preferred by the endonuclease for nicking (GCATT)

Amongst these, the sites that have a preferred structure

GCATT ? ?
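
Finding every candidate nicking site is a plain motif search. The sketch below, a minimal example under assumed names (`find_motif_sites`, `reverse_complement`), reports each GCATT occurrence on either strand; positions are 0-based forward-strand coordinates of the leftmost base of the match.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motif_sites(seq, motif="GCATT"):
    """Return (position, strand) for every motif occurrence on either strand."""
    seq = seq.upper()
    rc = reverse_complement(motif)  # "AATGC" for the default motif
    sites = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            sites.append((i, "+"))
        if window == rc:
            sites.append((i, "-"))
    return sites
```

The structural question raised on this slide (which of these sites also have a preferred DNA structure) needs a second filter on top of this list.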

Page 17: Information Technology As A CATALYST in Basic Biological Research

DNA structure criteria tested based on dinucleotide frequencies

Thymine Excess

Bendability

Propeller Twist

Stacking Energy

Free Energy

DNA Denaturation Energy

Protein-induced deformability

Nucleosome positioning
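
Criteria of this kind are usually scored by looking up a value for each dinucleotide step and averaging over a sliding window. The sketch below shows that mechanic only. The function `dinucleotide_profile`, the window size, and the `AT_FRACTION` scale are assumptions for illustration; `AT_FRACTION` is a stand-in computed from base composition, not one of the published scales (propeller twist, stacking energy, and so on), which would be substituted as a 16-entry dictionary.

```python
from itertools import product

def dinucleotide_profile(seq, scale, window=10):
    """Average a per-dinucleotide scale over a sliding window of steps.

    seq    : DNA string containing only A, C, G, T (other symbols raise KeyError)
    scale  : dict mapping each of the 16 dinucleotides to a number
    window : number of consecutive dinucleotide steps per average
    """
    seq = seq.upper()
    steps = [scale[seq[i:i + 2]] for i in range(len(seq) - 1)]
    if len(steps) < window:
        return []
    total = sum(steps[:window])
    profile = [total / window]
    for i in range(window, len(steps)):
        total += steps[i] - steps[i - window]  # slide the window by one step
        profile.append(total / window)
    return profile

# Stand-in scale (assumption): fraction of A/T bases in each dinucleotide.
AT_FRACTION = {a + b: ((a in "AT") + (b in "AT")) / 2
               for a, b in product("ACGT", repeat=2)}
```

Averaging such profiles over many aligned preinsertion sequences, position by position, gives curves of the type plotted on the next slide.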

Page 18: Information Technology As A CATALYST in Basic Biological Research

[Figure: Computational analysis of preinsertion loci. Four panels, (a) to (d), plot propeller twist, stacking energy, duplex stability (free energy) and DNA denaturation energy against position (x-axis from -42 to 38).]

Page 19: Information Technology As A CATALYST in Basic Biological Research

Conclusion

EhLINEs/SINEs insert in a rigid region that can melt easily and is 10-35 nucleotides upstream of the preferred EN sequence (GCATT)
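
To make the geometry of this conclusion concrete, the sketch below extracts the stretch lying 10 to 35 nucleotides upstream of each GCATT on the forward strand. The function `upstream_windows`, the inclusive way distances are counted from the G of GCATT, and the forward-strand-only scan are assumptions for illustration; this is not the DNA SCANNER implementation.

```python
def upstream_windows(seq, motif="GCATT", near=10, far=35):
    """Return (site, window) pairs for each forward-strand motif hit.

    window holds the bases located between `near` and `far` nucleotides
    (inclusive) upstream of the first base of the motif. Sites closer
    than `far` bases to the start of seq are skipped.
    """
    seq = seq.upper()
    windows = []
    start = seq.find(motif)
    while start != -1:
        if start >= far:
            windows.append((start, seq[start - far:start - near + 1]))
        start = seq.find(motif, start + 1)
    return windows
```

Each returned window could then be scored for rigidity and ease of melting with dinucleotide-based scales.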

Page 20: Information Technology As A CATALYST in Basic Biological Research

DNA SCANNER

Page 21: Information Technology As A CATALYST in Basic Biological Research

Identification of insertion hot spots for non-LTR retrotransposons: computational and biochemical application to Entamoeba histolytica

Prabhat K. Mandal3, Kamal Rawal1, Ram Ramaswamy1,2, Alok Bhattacharya1,3 and Sudha Bhattacharya*

Nucleic Acids Research, 2006, Vol. 00, No. 00, 1–12. doi:10.1093/nar/gkl710

School of Environmental Sciences, Jawaharlal Nehru University, New Mehrauli Road, New Delhi 110 067, India, 1School of Information Technology, Jawaharlal Nehru University, New Delhi 110 067, India, 2School of Physical Sciences, Jawaharlal Nehru University, New Delhi 110 067, India and 3School of Life Sciences, Jawaharlal Nehru University, New Delhi 110 067, India

Received June 26, 2006; Revised August 22, 2006; Accepted September 14, 2006

Page 22: Information Technology As A CATALYST in Basic Biological Research

THANKS!