In-silico comparative structural modeling of carbonic anhydrase of the marine diatom Thalassiosira pseudonana

Journal of Research in Bioinformatics (JRBI), An International Open Access Online Research Journal. Original Research. 2012, Vol 1, No 1, 009-015. © Ficus Publishers.

Authors: Debashree Kakati (1), Saurov Mahanta (2) and Bhaben Tanti (1).

Institution: 1. Cytogenetics and Plant Breeding Laboratory, Department of Botany, Gauhati University, Guwahati-781 014, Assam, India. 2. National Institute of Electronics & Information Technology, Guwahati, Assam, India.

Corresponding author: Bhaben Tanti. Email: [email protected]. Web Address: http://ficuspublishers.com/Documents/BI0002.pdf.

Dates: Received: 30 Nov 2011 / Accepted: 16 Dec 2011 / Published: 22 Feb 2012

Article Citation: Debashree Kakati, Saurov Mahanta and Bhaben Tanti. In-silico comparative structural modeling of carbonic anhydrase of the marine diatom Thalassiosira pseudonana. Journal of Research in Bioinformatics (2012) 1: 009-015

This Open Access article is governed by the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which gives permission for unrestricted use, non-commercial, distribution, and reproduction in all medium, provided the original work is properly cited.

Keywords: Carbonic anhydrase, Thalassiosira pseudonana, Comparative modeling, Protein Model Database.

ABSTRACT: Carbonic anhydrase is an important zinc-containing enzyme, found in organisms from all kingdoms, that catalyses the reversible hydration of carbon dioxide and is used for inorganic carbon acquisition by phytoplankton. In the oceans, where zinc is nearly depleted, diatoms use cadmium as the catalytic metal atom in cadmium carbonic anhydrase (CDCA). Here we report in silico structural modeling, comprising 3D model prediction and sequence analysis, of the carbonic anhydrase of Thalassiosira pseudonana, a representative centric marine diatom. The predicted 3D structure was found to be statistically significant by the structure verification programs; it was deposited in the Protein Model Database and assigned the PMDB ID code PM0075791.

INTRODUCTION

The carbonic anhydrases (CA; EC 4.2.1.1) form a family of enzymes that catalyze the rapid conversion of carbon dioxide to bicarbonate and protons, a reaction that occurs rather slowly in the absence of a catalyst (Aizawa and Miyachi, 1986). Because the enzyme is essential, the catalytic capacity to hydrate carbon dioxide and dehydrate bicarbonate has evolved several times (Moroney et al., 2001). There are three recognised classes of carbonic anhydrase, α, β and γ, which share no significant sequence identity and have structurally distinct overall folds. Yet, despite their structural differences, the active sites of all three classes function with a single Zn atom that is essential for catalysis. These enzymes are of ancient origin and appear to have evolved independently of one another, thereby providing an excellent example of convergent evolution. The three classes are distributed differently among organisms: in mammals, all the isozymes so far discovered belong to the α-class; plants produce mainly the β-class; prokaryotes encode all three classes, with the β and γ classes predominating (Lane et al., 2005).

Diatoms are unicellular microalgae, widespread in aquatic environments, and marine species are considered to be some of the most important CO2 fixers in the hydrosphere (Apt et al., 1996). Nevertheless, the function of the internal form of CA in marine algae has not been studied extensively. Only one diatom CA has been isolated, that from the marine diatom Thalassiosira weissflogii, and the structure of its Zn coordination site was determined by X-ray absorption spectrometry (Roberts et al., 1997; Cox et al., 2000). However, it is not clear whether the distinct structure of the T. weissflogii CA is common to the CAs of other diatom species.

The prevalence in diatoms of CAs that presumably contain Cd at their active site probably reflects the very low concentration of Zn in the marine environment and the difficulty of acquiring inorganic carbon for photosynthesis. It is well established that the surface waters of the oceans, in which microalgae such as diatoms flourish, are extremely low in zinc, between 2 and 50 picomolar. T. weissflogii contains genes for two discrete carbonic anhydrases. This, together with the observation that adding cadmium allows the diatom to grow, prompted a search for a specific group of carbonic anhydrases. Detailed molecular studies of CA from more marine diatom species are certainly needed, because it is one of the critical enzymes for carbon acquisition in marine microalgae. To identify such proteins, we analyzed the carbonic anhydrase of the marine diatom Thalassiosira pseudonana, the first diatom with a sequenced genome. The determination of the complete genome sequence of T. pseudonana offers an unprecedented opportunity to examine complex cellular processes using the tools of genomics and proteomics (Armbrust et al., 2004). This study undertakes in silico comparative modelling of the carbonic anhydrase of T. pseudonana, an enzyme central to carbon acquisition, for which the Protein Data Bank has no X-ray crystallographic or NMR structure available. Prediction of the structure of this carbonic anhydrase should help researchers carry out more advanced studies related to the function and activity of the enzyme in T. pseudonana and in related organisms.

MATERIALS AND METHODS

The amino acid sequence of the target protein (carbonic anhydrase, 237 amino acid residues) of Thalassiosira pseudonana CCMP1335 was retrieved from the NCBI RefSeq database (Accession No. XP_002295227). It is a predicted protein sequence derived by conceptual translation. To obtain suitable templates, NCBI BLASTp and WU-BLAST2 searches were performed independently against the PDB. The target-template alignment was carried out using Clustal W and the BioEdit program. Comparative (homology) modeling was conducted manually using the MODELLER program. The final 3D structure of the target protein, with all coordinates, was obtained by optimization of the molecular probability density function (pdf) of MODELLER (Eswar et al., 2006). The structure was evaluated with the ProCheck (v.3.5.4) and WHAT_CHECK (v.19991018-1516) packages (Laskowski et al., 2003; Hooft et al., 1996) and further evaluated with the ERRAT2 package. After verification, the 3D coordinate file was deposited in the PMDB. All graphical presentations of the 3D structure were prepared using UCSF Chimera and RasMol (Pettersen et al., 2004).

RESULTS AND DISCUSSION

Retrieval and analysis of sequence

Based on the BLAST results, three suitable templates were selected for the target carbonic anhydrase of Thalassiosira pseudonana CCMP1335. The resulting 3D structure was based on the coordinates of three PDB entries: 3BOH_A (carbonic anhydrase from the marine diatom Thalassiosira weissflogii, cadmium bound domain 2; chain A, length 213 aa), with 78% residue identity to the target protein sequence; 3BOJ_A (carbonic anhydrase from T. weissflogii, cadmium bound domain 1 without bound metal; chain A, length 213 aa), with 78% residue identity to the target protein sequence; and 3BOB_A (carbonic anhydrase from T. weissflogii, cadmium bound domain 2; chain A, length 210 aa), with 76% residue identity to the target protein sequence (Figure 1, A-C). The target sequence of T. pseudonana was designated 1pca for convenience. Finally, the coordinate files of all three templates and the corresponding FASTA sequence files were downloaded from the RCSB PDB database.

The alignment result for Thalassiosira pseudonana CCMP1335 was as follows:

>P1;3BOH_A
SHMSLTPDQIVAALQERGWQAEIVTEFSLLNEMVDVDP
QGILKCVDGRGSDNTQFCGPKMPGGIYAIAHNRGVTTL
EGLKQITKEVASKGHVPSVHGDHSSDMLGCGFFKLWVT
GRFDDMGYPRPQFDADQGAKAVENAGGVIEMHHGSH
AEKVVYINLVENKTLEPDEDDQRFIVDGWAAGKFGLDV
PKFLIAAAATVEMLGGPKKAKIVIP*

>P1;3BOJ_A
SHMSLTPDQIVAALQERGWQAEIVTEFSLLNEMVDVDP
QGILKCVDGRGSDNTQFCGPKMPGGIYAIAHNRGVTTL
EGLKQITKEVASKGHVPSVHGDHSSDMLGCGFFKLWVT
GRFDDMGYPRPQFDADQGAKAVENAGGVIEMHHGSH
AEKVVYINLVENKTLEPDEDDQRFIVDGWAAGKFGLDV
PKFLIAAAATVEMLGGPKKAKIVIP*

>P1;3BOB_A
ISPAQIAEALQGRGWDAEIVTDASMAGQLVDVRPEGIL
KCVDGRGDNTRMGGPKMPGGIYAIAHNRGVTSIEGLK
QITKEVASKGHLPSVHGDHSSDMLGCGFFKLWVTGRF
DDMGYPRPQFDADQGANAVKDAGGIIEMHHGSHTEKV
VYINLLANKTLEPNENDQRFIVDGWAADKFGLDVPKFLI
AAAATVEMLGGPKNAKIVVP*

>P1;1pca
LTPKDIVAALQSRGWEAEIISASSISQDMVEVDPAGILK
CVDGRGSDNTRMAGPKMPGGIYAIAHNRGTTSVDGLK
EITKEVASKGHVPSVHGDHSADMLGCGFFRLWVTGEF
DSMGYPRPEFDADQGAAAVKESGGVIEMHHGSHTEKV
VYINLVENKTLEPDENDQRFIVDGWAAIKFNLDVVKFLV
AAAATVEMLGGPRIAKIVVA*

Figure 1 (A-C): The selected templates from Thalassiosira weissflogii: (A) 3BOB, (B) 3BOH, (C) 3BOJ.
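The residue identities quoted for the templates come from the BLAST searches. As a rough cross-check, a pairwise identity can be recomputed directly from the sequences listed above. The following is a minimal sketch, assuming a plain Needleman-Wunsch global alignment with an illustrative match/mismatch/gap score rather than the substitution-matrix scoring used by BLAST or Clustal W, so the percentage it reports will generally differ somewhat from the published values. The function names, the score values and the choice of the first sequence line of each entry as input are assumptions made for illustration only.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of the alignment length."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # First sequence line of the 3BOH_A and 1pca entries listed above.
    template = "SHMSLTPDQIVAALQERGWQAEIVTEFSLLNEMVDVDP"
    target = "LTPKDIVAALQSRGWEAEIISASSISQDMVEVDPAGILK"
    aln_t, aln_q = global_align(template, target)
    print(aln_t)
    print(aln_q)
    print(f"identity over alignment: {percent_identity(aln_t, aln_q):.1f}%")
```

Passing the full concatenated sequences instead of the fragments gives an identity over the whole modeled region; the dynamic-programming table for sequences of about 210 residues is small enough for pure Python.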

Predicted 3D structure of the carbonic anhydrase of T. pseudonana CCMP1335

The homology alignment of the carbonic anhydrase of T. pseudonana CCMP1335 showed a gap region 28 residues long at the beginning of the sequence, which formed a loop region in the model. As no suitable template was found for that region, the first 28 residues of the target sequence were deleted before the alignment was carried out.
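The truncation step can be sketched in a few lines. Removing 28 residues from the 237-residue target leaves 209 residues, which agrees with the total number of residues in the ProCheck statistics. The example below is only an illustration: the helper names are hypothetical, the placeholder sequence stands in for the real RefSeq entry (which is not reproduced in full in this article), and the second header line follows the PIR layout that MODELLER alignment files are generally described as using, which is an assumption about the file actually prepared by the authors.

```python
def trim_n_terminus(sequence, n_residues):
    """Return the sequence without its first n_residues residues."""
    if not 0 <= n_residues <= len(sequence):
        raise ValueError("n_residues must lie between 0 and len(sequence)")
    return sequence[n_residues:]


def to_pir(code, sequence, width=60):
    """Format a sequence as a PIR-style entry terminated by '*'."""
    header = f">P1;{code}"
    # Assumed MODELLER-style description line for a sequence without structure.
    description = f"sequence:{code}:::::::0.00: 0.00"
    text = sequence + "*"
    lines = [text[i:i + width] for i in range(0, len(text), width)]
    return "\n".join([header, description] + lines)


if __name__ == "__main__":
    # Placeholder only: 237 residues of unknown type, NOT the real sequence.
    placeholder = "M" + "X" * 236
    trimmed = trim_n_terminus(placeholder, 28)
    print(len(placeholder), "->", len(trimmed))
    print(to_pir("1pca", trimmed))
```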

The predicted 3D structure was found to be statistically significant by the structure verification programs. It was deposited in the Protein Model Database and assigned the PMDB ID code PM0075791 for the coordinate entry (Figure 2). WHAT_CHECK verification showed that the residue nomenclature in the coordinate file was correct and that the distribution of residue types over the inside and the outside of the protein was normal. ProCheck verification indicated that the model is of good quality, as judged by the Ramachandran plot (Figure 3), with 95.5% of residues in the most favored regions and none in disallowed regions. Verification with the ERRAT2 program gave an overall quality factor of 95.522.

Comparative (homology) modeling is the most successful structure prediction method; it uses structural templates derived from known structures to build an all-atom model of a target. The predicted 3D structure of the carbonic anhydrase of T. pseudonana CCMP1335 should help in studying the molecular mechanism of its function and in guiding further in vitro analyses (Moult et al., 2003; Peitsch, 2002).

Three-dimensional analysis of protein structures is proving to be one of the most fruitful modes of biological and medical discovery in the early 21st century, providing fundamental insight into many biochemical functions. Fully realizing such insight, however, would require the analysis of too many distinct proteins for thorough laboratory analysis of all of them to be feasible; any method capable of accurate, efficient in silico structure prediction should therefore prove highly expeditious. Comparative modeling, the technique generally acknowledged to provide the most accurate protein structure predictions, has thus attracted substantial attention and is the focus of these findings.

Fig. 2 (A-C): Different views of the predicted structure of carbonic anhydrase (screen snapshots of the 3D molecular visualization in the structure visualization software UCSF Chimera) (PMDB ID: PM0075791).

Figure 3: Ramachandran analysis of the backbone dihedral angles Psi (ψ) and Phi (φ) for the final structure of carbonic anhydrase of T. pseudonana CCMP1335, validated with the ProCheck program. Red = most favored region, yellow = allowed region, light yellow = generously allowed region, white = disallowed region.

Figure 3: Ramachandran Plot Statistics

Residues in most favored regions [A, B, L]: 168 (95.5%)
Residues in additional allowed regions [a, b, l, p]: 6 (3.4%)
Residues in generously allowed regions [~a, ~b, ~l, ~p]: 2 (1.1%)
Residues in disallowed regions [XX]: 0 (0.0%)
Number of non-glycine and non-proline residues: 176 (100.0%)
Number of end-residues (excl. Gly and Pro): 2
Number of glycine residues: 22
Number of proline residues: 9
Total number of residues: 209
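The Ramachandran plot statistics reported for the model are internally consistent, which can be checked with a short calculation. The sketch below only re-derives the percentages and the residue total from the counts given in the ProCheck statistics; the dictionary keys and function name are illustrative, and nothing here reproduces ProCheck itself.

```python
# Counts taken from the Ramachandran plot statistics reported for the model.
REGION_COUNTS = {
    "most favored": 168,
    "additional allowed": 6,
    "generously allowed": 2,
    "disallowed": 0,
}
END_RESIDUES = 2   # excluding Gly and Pro
GLYCINES = 22
PROLINES = 9


def region_percentages(counts):
    """Percentage of non-Gly, non-Pro residues in each Ramachandran region."""
    total = sum(counts.values())
    return {region: round(100.0 * n / total, 1) for region, n in counts.items()}


if __name__ == "__main__":
    non_gly_pro = sum(REGION_COUNTS.values())
    for region, pct in region_percentages(REGION_COUNTS).items():
        print(f"{region:>20}: {pct:5.1f}%")
    print("non-Gly/non-Pro residues:", non_gly_pro)
    print("total residues:", non_gly_pro + END_RESIDUES + GLYCINES + PROLINES)
```

The total of 209 residues equals the 237-residue target minus the 28 N-terminal residues removed before modeling.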

Prediction of general properties of carbonic anhydrase of T. pseudonana using ProtParam

Predicting the properties of the protein helps in analyzing the predicted structure, so that the relationship between structure and function becomes easier to establish.

The general properties of the carbonic anhydrase of T. pseudonana were predicted with the online tool ProtParam, which can be accessed from the ExPASy site (www.expasy.org). The results obtained were as follows:

Number of amino acids: 237

Molecular weight: 25303.8

Amino acid composition: Ala (A) - 10.1%; Arg (R) - 3.4%; Asn (N) - 2.5%; Asp (D) - 7.6%; Cys (C) - 1.3%; Gln (Q) - 2.1%; Glu (E) - 5.9%; Gly (G) - 9.7%; His (H) - 3.4%; Ile (I) - 6.3%; Leu (L) - 5.9%; Lys (K) - 5.9%; Met (M) - 4.2%; Phe (F) - 3.0%; Pro (P) - 4.6%; Ser (S) - 7.2%; Thr (T) - 4.6%; Trp (W) - 1.3%; Tyr (Y) - 1.3%; Val (V) - 9.7%; Pyl (O) - 0.0%; Sec (U) - 0.0%; (B) - 0.0%; (Z) - 0.0%; (X) - 0.0%.

Total number of negatively charged residues (Asp + Glu): 32

Total number of positively charged residues (Arg + Lys): 22

Atomic composition: Carbon - 1110; Hydrogen - 1765; Nitrogen - 305; Oxygen - 344; Sulfur - 13.

Formula: C1110H1765N305O344S13

Total number of atoms: 3537

Extinction coefficients (in units of M-1 cm-1, at 280 nm, measured in water):
Ext. coefficient 21095; Abs 0.1% (=1 g/l) 0.834, assuming ALL Cys residues appear as half cystines
Ext. coefficient 20970; Abs 0.1% (=1 g/l) 0.829, assuming NO Cys residues appear as half cystines

Estimated half-life: The N-terminal residue of the sequence considered is M (Met). The estimated half-life is 30 hours (mammalian reticulocytes, in vitro), >20 hours (yeast, in vivo) and >10 hours (Escherichia coli, in vivo).

Instability index: The instability index (II) is computed to be 34.79. This classifies the protein as stable.

Aliphatic index: 85.99

Grand average of hydropathicity (GRAVY): -0.062
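Several of the ProtParam values above follow from simple formulas over the amino acid composition, so they can be re-derived as a consistency check. The sketch below is hedged in two ways. First, the residue counts are an assumption: they were reconstructed by multiplying the reported percentages by the 237 residues and rounding, not read from the sequence. Second, the formulas are the standard published ones (280 nm extinction from Tyr, Trp and cystine contributions of 1490, 5500 and 125; the aliphatic index with coefficients 2.9 and 3.9; GRAVY from the Kyte-Doolittle scale), written out here rather than obtained by calling ProtParam.

```python
# ASSUMPTION: counts reconstructed from the reported percentages x 237 residues.
COUNTS = {"A": 24, "R": 8, "N": 6, "D": 18, "C": 3, "Q": 5, "E": 14, "G": 23,
          "H": 8, "I": 15, "L": 14, "K": 14, "M": 10, "F": 7, "P": 11, "S": 17,
          "T": 11, "W": 3, "Y": 3, "V": 23}

# Kyte-Doolittle hydropathy values.
KYTE_DOOLITTLE = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
                  "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
                  "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
                  "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}

MOLECULAR_WEIGHT = 25303.8  # as reported above


def extinction_coefficient(counts, all_cys_paired):
    """Molar extinction coefficient at 280 nm (M-1 cm-1)."""
    cystines = counts["C"] // 2 if all_cys_paired else 0
    return counts["Y"] * 1490 + counts["W"] * 5500 + cystines * 125


def aliphatic_index(counts):
    """Relative volume of aliphatic side chains (Ala, Val, Ile, Leu)."""
    n = sum(counts.values())
    return 100.0 * (counts["A"] + 2.9 * counts["V"]
                    + 3.9 * (counts["I"] + counts["L"])) / n


def gravy(counts):
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    n = sum(counts.values())
    return sum(KYTE_DOOLITTLE[aa] * k for aa, k in counts.items()) / n


if __name__ == "__main__":
    for paired in (True, False):
        eps = extinction_coefficient(COUNTS, paired)
        print(paired, eps, round(eps / MOLECULAR_WEIGHT, 3))
    print("aliphatic index:", round(aliphatic_index(COUNTS), 2))
    print("GRAVY:", round(gravy(COUNTS), 3))
    print("Asp + Glu:", COUNTS["D"] + COUNTS["E"],
          "Arg + Lys:", COUNTS["R"] + COUNTS["K"])
```

With these reconstructed counts the sketch reproduces the reported extinction coefficients, absorbances, aliphatic index, GRAVY and charged-residue totals, which suggests the rounded percentages and the derived indices are mutually consistent.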

Isotopic calculation of carbonic anhydrase of T. pseudonana

The isotopic calculation for the carbonic anhydrase of T. pseudonana was done with the online tool IsotopIdent, which can be accessed from the ExPASy site. The isotopic distribution is presented graphically in Figure 4. The calculation gave an exact mass of 25300.674 and a monoisotopic mass of 25287.636 for the target protein. Further, the probability of combination was 0.724%, and the most likely combination of the target protein accounted for 9.73% of those masses rounding to 25288 amu.
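The monoisotopic mass can be cross-checked from the molecular formula reported by ProtParam, C1110H1765N305O344S13. The following is a minimal sketch, assuming standard monoisotopic and average atomic masses entered by hand (these constants and the helper names are not taken from the article, and this is not how IsotopIdent computes its full isotopic distribution). It sums the mass of the lightest isotope of each element for the monoisotopic value and the standard atomic weights for the average value.

```python
import re

# Masses of the most abundant (lightest) isotope of each element, in Da.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048,
                "O": 15.99491461956, "S": 31.97207100}
# Standard (abundance-weighted) atomic weights, in Da.
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula):
    """Turn 'C1110H1765...' into {'C': 1110, 'H': 1765, ...}."""
    return {el: int(n) for el, n in re.findall(r"([A-Z][a-z]?)(\d+)", formula)}


def formula_mass(composition, table):
    """Sum element masses from the given mass table."""
    return sum(table[el] * n for el, n in composition.items())


if __name__ == "__main__":
    comp = parse_formula("C1110H1765N305O344S13")
    print("atoms:", sum(comp.values()))
    print("monoisotopic mass: %.3f" % formula_mass(comp, MONOISOTOPIC))
    print("average mass: %.1f" % formula_mass(comp, AVERAGE))
```

Under these assumptions the monoisotopic mass agrees with the reported 25287.636 to within rounding, and the average mass lies close to the ProtParam molecular weight of 25303.8.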

Physico-chemical property analyses of carbonic anhydrase of T. pseudonana CCMP1335

The physico-chemical properties of the enzyme were analyzed with the software ANTHEPROT 2000 V6.0 release 1.1.54. The titration curve of the target protein is presented in Figure 5. The pH was found to be 5.055 at a net charge of 0.066, that is, close to the isoelectric point; the specific volume was 0.733 cm³/g, the protein molecular epsilon at 280 nm was 21345 l/mol/cm, and the minimal radius of the equivalent sphere of the unhydrated molecule was 1.94 × 10⁻⁷ cm.
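A titration curve of this kind can be approximated with the Henderson-Hasselbalch equation applied to each ionizable group. The sketch below is an illustration under explicit assumptions and does not reproduce ANTHEPROT: the counts of ionizable residues are reconstructed from the ProtParam percentages (not read from the sequence), all Cys residues are treated as free thiols, and the pKa values are one assumed textbook-style set; other pKa sets shift the result by a few tenths of a pH unit.

```python
# ASSUMPTIONS: (pKa, count) pairs; counts reconstructed from the composition,
# pKa values are an illustrative set, not those used by ANTHEPROT.
POSITIVE = {"N-terminus": (8.6, 1), "Lys": (10.8, 14),
            "Arg": (12.5, 8), "His": (6.5, 8)}
NEGATIVE = {"C-terminus": (3.6, 1), "Asp": (3.9, 18), "Glu": (4.1, 14),
            "Cys": (8.5, 3), "Tyr": (10.1, 3)}


def net_charge(ph):
    """Net charge at a given pH from Henderson-Hasselbalch fractions."""
    positive = sum(n / (1.0 + 10 ** (ph - pka)) for pka, n in POSITIVE.values())
    negative = sum(n / (1.0 + 10 ** (pka - ph)) for pka, n in NEGATIVE.values())
    return positive - negative


def isoelectric_point(low=0.0, high=14.0, tolerance=1e-4):
    """Bisection on pH; valid because net charge decreases as pH rises."""
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


if __name__ == "__main__":
    for ph in range(2, 13, 2):
        print(f"pH {ph:2d}: net charge {net_charge(ph):+7.2f}")
    print(f"estimated pI: {isoelectric_point():.2f}")
```

The estimate is expected to fall in the same acidic range as the reported value of about 5.06, consistent with the excess of Asp and Glu over Arg and Lys, but exact agreement should not be expected because the pKa set is assumed.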

CONCLUSIONS

The model presented here can serve as a guide for assigning the amino acid residues involved in each fold, which is important for further investigations of the active site and the molecular mechanism of function. Sequence analysis and 3D structure prediction of the carbonic anhydrase of Thalassiosira pseudonana CCMP1335 were performed by comparative (homology) modeling, which was possible because of the high level of sequence identity with structures previously solved and deposited in the Protein Data Bank. A series of molecular modeling and computational methods were combined in order to gain insight into the 3D structure. In silico characterization of the enzyme with bioinformatics tools and techniques thus provides a basis that should help in the in vitro characterization of this important enzyme at the molecular level.

Figure 4: Graphical representation of isotopic distribution (intensity vs. m/z).

Figure 5: Titration curve (charge vs. pH).

REFERENCES

Aizawa K and Miyachi S. 1986. Carbonic anhydrase and CO2 concentrating mechanisms in microalgae and cyanobacteria. FEMS Microbiology Reviews 39:215-233.

Apt KE, Kroth-Pancic PG and Grossman AR. 1996. Stable nuclear transformation of the diatom Phaeodactylum tricornutum. Molecular and General Genetics 252:572-579.

Armbrust E, Berges J, Bowler C, Green B, Martinez D, Putnam N, Zhou S, Allen A, Apt K, Bechner M, Brzezinski M, Chaal B, Chiovitti A, Davis A, Demarest M, Detter J, Glavina T, Goodstein D, Hadi M, Hellsten U, Hildebrand M, Jenkins B, Jurka J, Kapitonov V, Kröger N, Lau W, Lane T, Larimer F, Lippmeier J, Lucas S, Medina M, Montsant A, Obornik M, Parker M, Palenik B, Pazour G, Richardson P, Rynearson T, Saito M, Schwartz D, Thamatrakoln K, Valentin K, Vardi A, Wilkerson F and Rokhsar D. 2004. The genome of the diatom Thalassiosira pseudonana: Ecology, evolution, and metabolism. Science 306:79-86.

Cox EH, McLendon GL, Morel FMM, Lane TW, Prince RC, Pickering IJ and George GN. 2000. The active site structure of Thalassiosira weissflogii carbonic anhydrase 1. Biochemistry 39:12128-12130.

Eswar N, Eramian D, Webb B, Shen M and Sali A. 2006. Protein Structure Modeling with MODELLER. Current Protocols in Bioinformatics 5:164-169.

Hooft RWW, Vriend G, Sander C and Abola EE. 1996. Errors in protein structures. Nature 381:272-272.

Lane T, Saito MA, George GN, Pickering IJ, Prince RC and Morel FMM. 2005. Isolation and Preliminary Characterization of Cadmium Carbonic Anhydrase from a Marine Diatom. Nature 471:435-442.

Laskowski RA, Watson JD and Thornton JM. 2003. From protein structure to biochemical function? Journal of Structural and Functional Genomics 4:167-177.

Moroney JV, Bartlett SG and Samuelsson G. 2001. Carbonic anhydrases in plants and algae. Plant, Cell and Environment 24:141-153.

Moult J, Fidelis K, Zemla A and Hubbard T. 2003. Critical assessment of methods of protein structure prediction (CASP) - round V. Proteins 53 (Suppl 6):334-339.

Peitsch MC. 2002. About the use of protein models. Bioinformatics 18:934-938.

Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC and Ferrin TE. 2004. UCSF Chimera - a visualization system for exploratory research and analysis. Journal of Computational Chemistry 25:1605-1612.

Roberts SB, Lane TW and Morel FMM. 1997. Carbonic anhydrase in the marine diatom Thalassiosira weissflogii (Bacillariophyceae). Journal of Phycology 33:845-850.
