Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily
Gaurav Narale

Jan 13, 2016

Transcript
Page 1: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Gaurav Narale

Page 2: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Major Facilitator Superfamily (MFS)

• MEMBRANE TRANSPORT

• Largest secondary transporter protein family known so far with more than 1000 members identified.1

• Use a solute gradient to drive the translocation of substrates such as ions, sugars, amino acids, peptides and other hydrophilic solutes.2

• Typically 400-600 amino acids long.

• 12 transmembrane α-helices, with both the N- and C-termini in the cytosol.3

– Two six-helix halves connected by a central loop.

• Found in all three kingdoms of living organisms.

Page 3: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Identifying Templates and Targets

• TEMPLATES - Two known structures:

– Lactose Permease (LacY), E. coli

– Glycerol-3-Phosphate Transporter (GlpT), E. coli

• Sequence identity between the two is negligible (~9%).

• Structural alignment with the CE algorithm indicates that the two structures superimpose over most of their chain length (RMSD ~3.7 Å).

• 1st GOAL: To find a eukaryotic member of the MFS that shows enough sequence identity with one of the known structures to allow a reasonable alignment.
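
The RMSD figure quoted for the CE superposition is a root-mean-square distance over matched atom pairs. A minimal sketch follows, assuming the two coordinate lists are already superimposed and paired one to one (CE itself also finds the pairing and the superposition, which this sketch does not attempt). The coordinates are toy values, not taken from LacY or GlpT.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the points are already superimposed and paired in order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy coordinates (hypothetical): squared distances are 9 and 16.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
print(round(rmsd(a, b), 3))  # 3.536
```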

Page 4: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Function and Mechanism of LacY and GlpT

Both use a solute gradient to drive translocation of substrate:

- LacY mediates the coupled transport of lactose and H+

- GlpT catalyzes the exchange of glycerol-3-phosphate for phosphate

Alternating-Access Model

- Outward-facing conformation exposed to the extracellular side.

- Inward-facing conformation exposed to the cytoplasm.

Ribbon Representation

- Amino-terminal domain (blue).

- Carboxyl-terminal domain (green).

- Bends and other irregularities in the α-helices are indicated by deviations from an ideally straight and continuous helical ribbon.

Page 5: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Identifying Templates and Targets

• Lactose Permease (LacY)

– Obtained the protein PDB file from the Protein Data Bank (1PV6) and extracted the amino acid sequence in FASTA format. www.rcsb.org/pdb

– Searched for a TARGET with high sequence identity using NCBI BLAST. www.ncbi.nlm.nih.gov

1. General search against all organisms: 2 iterations, threshold 0.005. Hits were mainly bacterial proteins.

2. Saved the results as a profile (PSSM).

3. More sensitive search using the original sequence as well as the saved profile as input, limited to eukaryotic sequences: 2 iterations, threshold 0.01.

– Unable to identify a suitable target.
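
The slide does not say how the sequence was pulled out of the PDB file. One common route is to read the SEQRES records, sketched below. The function name `seqres_to_fasta` and the toy record are illustrative assumptions; the toy record is not the real 1PV6 header. The sketch also assumes a non-blank chain identifier, so that whitespace splitting finds the fields.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def seqres_to_fasta(pdb_text, chain_id, header):
    """Build a FASTA record from the SEQRES lines of one chain."""
    residues = []
    for line in pdb_text.splitlines():
        fields = line.split()
        # Fields: SEQRES, serial number, chain ID, residue count, residues...
        if len(fields) > 4 and fields[0] == "SEQRES" and fields[2] == chain_id:
            residues.extend(fields[4:])
    # Unknown or modified residues are written as X.
    sequence = "".join(THREE_TO_ONE.get(name, "X") for name in residues)
    wrapped = "\n".join(sequence[i:i + 60] for i in range(0, len(sequence), 60))
    return f">{header}\n{wrapped}"

# Toy record (hypothetical), laid out like a PDB SEQRES line.
toy_pdb = "SEQRES   1 A    5  PHE LYS PRO ALA PRO"
print(seqres_to_fasta(toy_pdb, "A", "toy_chain_A"))
```

SEQRES lists the full construct, including residues that are missing from the coordinates, so a sequence built from ATOM records instead can be shorter.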

Page 6: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Identifying Templates and Targets

• Glycerol-3-Phosphate Transporter (GlpT)

– Obtained the protein PDB file from the Protein Data Bank (1PW4) and extracted the amino acid sequence in FASTA format. www.rcsb.org/pdb

– Searched for a TARGET with high sequence identity using NCBI BLAST. www.ncbi.nlm.nih.gov

1. General search against all organisms: 2 iterations, threshold 0.005.

2. Obtained a suitable TARGET: Glucose-6-Phosphate Translocase, Homo sapiens.

3. Utilized BLink to identify several eukaryotic “close targets” for use in multiple sequence alignments.

Page 7: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Multiple sequence alignment

• Only template and target - initial review

• Both templates, target and close targets

– 15 proteins similar to the target selected from different species to get a better alignment

– Only template and target extracted

• Around 30% similarity between template and target

• Well-distributed alignment
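
A figure such as the one quoted above can be checked from any pair of aligned rows. The sketch below computes percent identity only (exact matches), which is stricter than similarity, and it assumes the denominator is the number of columns where neither row has a gap. Other conventions divide by the shorter sequence length or by the full alignment length, so treat the choice as an assumption. The two example rows are the first block of the template/target alignment in this deck.

```python
def percent_identity(row_a, row_b):
    """Percent of gap-free aligned columns where both residues match."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either row
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# First block of the GlpT (1PW4) / human G6PT alignment from the deck.
template = "FKPAPHKARLPAAEIDPTYRRLRWQIFLGIFFGYAAYYLVRKNFALAMPYLVEQG-FSRG"
target   = "-------------MAAQGYGYYRTVIFSAMFGGYSLYYFNRKTFSFVMPSLVEEIPLDKD"
print(f"{percent_identity(template, target):.1f}% identity over this block")
```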

Page 8: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Alignment using FUGUE

                         10        20        30        40        50
hs1pw4a (   5 )  fkpaphkarlpaaeidptYrrlrwqIflGIffGyaAYylVRkNFALAMpy
QUERY g6pt       -------------MAAQGYGYYRTVIFSAMFGGYSLYYFNRKTFSFVMPS
                 aaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaa

                         60        70        80        90       100
hs1pw4a (  55 )  L-veqgfsrgDLGfALSGISiAygfSkfimgsvSdrsnPrvfLPaGLilA
QUERY g6pt       LVEEIPLDKDDLGFITSSQSAAYAISKFVSGVLSDQMSARWLFSSGLLLV
                 aaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaa

                        110       120       130       140       150
hs1pw4a ( 104 )  AavMlfMGfvpwATssiavMfvlLflCGwfQGmGwpPCgrTmvhwwsqke
QUERY g6pt       GLVNIFFAWSSTV----PVFAALWFLNGLAQGLGWPPCGKVLRKWFEPSQ
                 aaaaaaaaa aaaa aaaaaaaaaaaaaaa aaaaaaaaa a

                        160       170       180       190       200
hs1pw4a ( 154 )  rggivsVwncAhNvggGiPPllFllGmawfndwhAALYmPAfcAilvAlf
QUERY g6pt       FGTWWAILSTSMNLAGGLGPILATILAQSY-SWRSTLALSGALCVVVSFL
                 aaaaaaaaaaaaaaaa aaaaaaaaaaa aaaaaaaaaaaaa

                        210       220       230       240       250
hs1pw4a ( 204 )  AfamMrdTpqsCglppiee-----ykndtakqifmqyVlpnklLwyIAiA
QUERY g6pt       CLLLIHNEPADVGLRNLDPMPSEGKKGSLKEESTLQELLLSPYLWVLSTG
                 aaaa aaaaaa aaaaaaaaa

                        260       270       280       290       300
hs1pw4a ( 262 )  NvfVyLLRYGiLDwSPtylkevKhfaldkSSwAYflYEyagipGTllCgw
QUERY g6pt       YLVVFGVKTCCTDWGQFFLIQEKGQSALVGSSYMSALEVGGLVGSIAAGY
                 aaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaa

                        310       320       330       340       350
hs1pw4a ( 312 )  msdkv----------frgnrGaTGvfFMtlVtiaTivywmnpagNptvdm
QUERY g6pt       LSDRAMAKAGLSNYGNPRHGLLLFMMAGMTVSMYLFRVTVTSDSPKLWIL
                 aaaa aaaaaaaaaaaaaaaaaa aaaaa

                        360       370       380       390       400
hs1pw4a ( 352 )  iCmivIGflIyGPvmLIglHAleLApkkAagtAagfTglfGylgGSvaAs
QUERY g6pt       VLGAVFGFSSYGPIALFGVIANESAPPNLCGTSHAIVGLMANVGGFL-AG
                 aaaaaaaaaa aaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa

                        410       420       430       440       450
hs1pw4a ( 402 )  aiVGytvdffgwdgGfmvMigGSilAvilLivVmigekrrheqllqelvp
QUERY g6pt       LPFSTIAKHYSWSTAFWVAEVICAASTAAFFLLRNIRTKMGRVSKKAE--
                 aaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa33333

Page 9: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

MPSA - only template and target

P_1P4W         FKPAPHKARLPAAEIDPTYRRLRWQIFLGIFFGYAAYYLVRKNFALAMPYLVEQG-FSRG
GLUCOSE6HUMAN  -------------MAAQGYGYYRTVIFSAMFGGYSLYYFNRKTFSFVMPSLVEEIPLDKD
               * * ** .:* **: **: **.*::.** ***: :.:.

P_1P4W         DLGFALSGISIAYGFSKFIMGSVSDRSNPRVFLPAGLILAAAVMLFMGFVPWATSSIAVM
GLUCOSE6HUMAN  DLGFITSSQSAAYAISKFVSGVLSDQMSARWLFSSGLLLVGLVNIFFAWS----STVPVF
               **** *. * **.:***: * :**: ..* ::.:**:*.. * :*:.: *::.*:

P_1P4W         FVLLFLCGWFQGMGWPPCGRTMVHWWSQKERGGIVSVWNCAHNVGGGIPPLLFLLGMAWF
GLUCOSE6HUMAN  AALWFLNGLAQGLGWPPCGKVLRKWFEPSQFGTWWAILSTSMNLAGGLGPILATI-LAQS
               .* ** * **:******:.: :*:. .: * :: . : *:.**: *:* : :*

P_1P4W         NDWHAALYMPAFCAILVALFAFAMMRDTPQSCGLP-----PIEEYKNDTAKQIFMQYVLP
GLUCOSE6HUMAN  YSWRSTLALSGALCVVVSFLCLLLIHNEPADVGLRNLDPMPSEGKKGSLKEESTLQELLL
               .*:::* :.. .::*:::.: :::: * . ** * * *.. :: :* :*

P_1P4W         NKLLWYIAIANVFVYLLRYGILDWSPTYLKEVKHFALDKSSWAYFLYEYAGIPGTLLCGW
GLUCOSE6HUMAN  SPYLWVLSTGYLVVFGVKTCCTDWGQFFLIQEKGQSALVGSSYMSALEVGGLVGSIAAGY
               . ** :: . :.*: :: **. :* : * : .* * .*: *:: .*:

P_1P4W         MSDKVFRGN--------RGATGVFFMTLVTIATIVYWMNPAGN--PTVDMICMIVIGFLI
GLUCOSE6HUMAN  LSDRAMAKAGLSNYGNPRHGLLLFMMAGMTVSMYLFRVTVTSDSPKLWILVLGAVFGFSS
               :**:.: * . :*:*: :*:: :: :. :.: :: *:**

P_1P4W         YGPVMLIGLHALELAPKKAAGTAAGFTGLFGYLGGSVAASAIVGYTVDFFGWDGGFMVMI
GLUCOSE6HUMAN  YGPIALFGVIANESAPPNLCGTSHAIVGLMANVGGFLAGLPFSTIAKHYSWSTAFWVAEV
               ***: *:*: * * ** : .**: .:.**:. :** :*. .: : .: . ::. :

P_1P4W         GGSILAVILLIVVMIGEKRRHEQLLQELVP
GLUCOSE6HUMAN  ICAASTAAFFLLRNIRTKMGRVSKKAE---
               : :. :::: * * : . *
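
The conservation line under each block uses the Clustal-style symbols `*` (identical), `:` (strongly similar) and `.` (weakly similar). Because column spacing in such lines is easily lost in a transcript, it can be useful to regenerate them from the sequence rows. The sketch below does this for two rows. The residue groups are the ClustalW strong and weak groups as I understand them; treat them as an assumption if a different tool or version produced the slide.

```python
# ClustalW-style residue groups (assumed to match the tool used for the slide).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]

def consensus_symbol(a, b):
    """Conservation symbol for one aligned column of two residues."""
    if a == "-" or b == "-":
        return " "
    if a == b:
        return "*"
    if any(a in group and b in group for group in STRONG_GROUPS):
        return ":"
    if any(a in group and b in group for group in WEAK_GROUPS):
        return "."
    return " "

def clustal_consensus(row_a, row_b):
    """Conservation line for two aligned rows of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return "".join(consensus_symbol(a, b)
                   for a, b in zip(row_a.upper(), row_b.upper()))

# Columns 26-35 of the first block (template above, target below).
print("[" + clustal_consensus("IFLGIFFGYA", "IFSAMFGGYS") + "]")  # [** .:* **:]
```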

Page 10: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Extracted template-target

P_1PW4        -------FKPAPHKARLPAAEIDPTYRRLRWQIFLGIFFGYAAYYLVRKNFALAMPYLVE
gi|2765461|e  --------------------MAAQGYGYYRTVIFSAMFGGYSLYYFNRKTFSFVMPSLVE
              .

P_1PW4        QGFS---RGDLGFALSGISIAYGFSKFIMGSVSDRSNPRVFLPAGLILAAAVMLFMGFVP
gi|2765461|e  EIPLD--KDDLGFITSSQSAAYAISKFVSGVLSDQMSARWLFSSGLLLVGLVNIFFAWSS
              : . : . *. :

P_1PW4        WATSS--IAVMFVLLFLCGWFQGMGWPPCGRTMVHWWSQKERGGIVSVWNCAHN--VGGG
gi|2765461|e  TVP------VFAALWFLNGLAQGLGWPPCGKVLRKWFEPSQFGTWWAILSTSMN--LAGG
              . : : . ..

P_1PW4        IPP-------LLFLLGMAWFN-----------DWHAALYMPAFCAILVALFAFAMMRDTP
gi|2765461|e  LGP-------ILATILAQSYS------------WRSTLALSGALCVVVSFLCLLLIHNEP
              : . .. :

P_1PW4        QSCGLPPIEEYKNDT-------------------AKQIFMQYVLPNKLLWYIAIANVFVY
gi|2765461|e  ADVGLRNLDPMPSEG--------------KKGSLKEESTLQELLLSPYLWVLSTGYLVVF
              :. . : .

P_1PW4        LLRYGILDWSPTYLKEVKHFALDK-SSWAYFLYEYAGIPGTLLCGWMSDKVFR-------
gi|2765461|e  GVKTCCTDWGQFFLIQEKGQSALV-GSSYMSALEVGGLVGSIAAGYLSDRAMAKAGLSNY
              . .

P_1PW4        -GNRGATGVFFMTLVTIATIVYWMNPAG---------------NPTVDMICMIVIGFLIY
gi|2765461|e  GNPRHGLLLFMMAGMTVSMYLFRVTVTSD-----------S--PKLWILVLGAVFGFSSY

P_1PW4        GP-VMLIGLHALELAPKKAAGTAAGFTGLFGYLGGSVAASAIVGYTVDF-FGWDGGFMVM
gi|2765461|e  GP-IALFGVIANESAPPNLCGTSHAIVGLMANVG-GFLAGLPFSTIAKH-YSWSTAFWVA
              :

P_1PW4        IGGSILAVILLIVVMIGEKRRHEQLLQELVP-----------------------------
gi|2765461|e  EVICAASTAAFFLLRNIRTKMGRVSKKAE-------------------------------

Page 11: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Checking alignment in MODELER

Using chk_align.top script

_aln.pos 210 220 230 240 250 260 270

1PW4 MRDTPQSCGLPPIEEYKND/T-----AKQIFMQYVLPNKLLWYIAIANVFVYLLRYGILDWSPTYLKE

G6PT IHNEPADVGLRNLDPMPSE-GKKGSLKEESTLQELLLSPYLWVLSTGYLVVFGVKTCCTDWGQFFLIQ

_consrvd * ** * * ** * ** *

Problem near chain break

_aln.pos 210 220 230 240 250 260 270

1PW4 MRDTPQSCGLPPIEEYKND/----TAKQIFMQYVLPNKLLWYIAIANVFVYLLRYGILDWSPTYLKEV

G6PT IHNEPADVGLRNLDPMPSEGKKGSLKEESTLQELLLSPYLWVLSTGYLVVFGVKTCCTDWGQFFLIQE

_consrvd * ** * * ** * ** *

Page 12: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Modeler Runs

• Using extracted template and target alignment

• Sequence for template extracted from structure using Insight

• Missing residues in structure appear as chain breaks

• Parameters:

– OUTPUT_CONTROL = 1 1 1 1 1

– STARTING_MODEL = 1

– ENDING_MODEL = 5

– LIBRARY_SCHEDULE = 4

– MD_LEVEL = 'refine_1'
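
Because missing residues show up as chain breaks, the template row in the alignment needs a `/` wherever the residue numbering of the structure jumps. A minimal sketch of that bookkeeping follows, assuming the residue numbers have already been read from the structure (for example from the CA atoms). The residue numbers below are hypothetical and only illustrate a gap; they are not taken from 1PW4.

```python
def find_chain_breaks(residue_numbers):
    """Return (last_before, first_after) pairs where numbering skips residues."""
    breaks = []
    for previous, current in zip(residue_numbers, residue_numbers[1:]):
        if current - previous > 1:
            breaks.append((previous, current))
    return breaks

def mark_breaks(sequence, residue_numbers):
    """Insert '/' into the sequence at every gap in the residue numbering."""
    if len(sequence) != len(residue_numbers):
        raise ValueError("one residue number is needed per residue")
    marked = []
    for index, (residue, number) in enumerate(zip(sequence, residue_numbers)):
        if index > 0 and number - residue_numbers[index - 1] > 1:
            marked.append("/")
        marked.append(residue)
    return "".join(marked)

# Hypothetical numbering with residues 229-236 missing from the structure.
numbers = [226, 227, 228, 237, 238, 239]
print(find_chain_breaks(numbers))      # [(228, 237)]
print(mark_breaks("KNDTAK", numbers))  # KND/TAK
```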

Page 13: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

PROSA 2 runs

• Used to evaluate models

• Models with best scores from MODELER were compared using PROSA

• Z value used for initial comparison

• Graph used to identify location of major violations

Page 14: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Model Selection Criteria

• MODELER log file

– Minimum energy

– Number of violations

– Number of really bad violations

– Location of violations with respect to alignment and structure

• PROSA 2 log file

– Z score closest to template

– Peaks and troughs in graph relative to template

Page 15: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Adjusting the alignment

• Comparison of structures obtained from MODELER in Insight

• Alignment violations clearly visible

• Criteria for modifying alignment:

– Unequal number of residues in loop

– Unsatisfied structural similarity constraints

– Residues violating constraints as generated by MODELER

Page 16: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily
Page 17: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

1st run - adjustment in Insight

Page 18: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily
Page 19: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Loop Modeling

• Modeler Run 2

• Loop Modeling Run 1

Page 20: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Loop modeling

• Generate models based on adjusted alignment

• 25 models obtained

• Models selected based on minimum energy and constraint violations

• Parameters:

– OUTPUT_CONTROL = 1 1 1 1 1

– STARTING_MODEL = 1

– ENDING_MODEL = 5

– LIBRARY_SCHEDULE = 2

– MD_LEVEL = 'refine_3'

– DO_LOOPS = 1

– LOOP_ENDING_MODEL = 5

– LOOP_MD_LEVEL = 'refine_3'
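
The count of 25 models is consistent with 5 base models (STARTING_MODEL 1 to ENDING_MODEL 5), each refined into 5 loop models (up to LOOP_ENDING_MODEL 5). The short sketch below enumerates the pairs under two assumptions that the slides do not state outright: that loop models are numbered from 1, and that ID1 and ID2 in the result tables are the base-model and loop-model indices.

```python
from itertools import product

starting_model = 1
ending_model = 5
loop_ending_model = 5  # loop models assumed to start at 1

# Every (ID1, ID2) pair: base model index, then loop model index.
model_ids = list(product(range(starting_model, ending_model + 1),
                         range(1, loop_ending_model + 1)))

print(len(model_ids))  # 25
print(model_ids[:3])   # [(1, 1), (1, 2), (1, 3)]
```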

Page 21: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Loop Modeling Run 1: Best 4 Models Picked

(Z score of template : -7.3)

ID1, ID2 : 1 5; Current energy : 192; PROSA Z score : -6.60

ID1, ID2 : 3 2; Current energy : 387; PROSA Z score : -6.57

ID1, ID2 : 4 2; Current energy : 363; PROSA Z score : -6.76

ID1, ID2 : 5 4; Current energy : 242; PROSA Z score : -6.3
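
The two numeric selection criteria (minimum MODELER energy, PROSA Z score closest to the template) can disagree, as the run 1 figures show. The sketch below ranks the four models from this slide by each criterion separately. It is only an illustration of the comparison; the deck also weighs the number and location of violations, which are not modeled here.

```python
TEMPLATE_Z = -7.3  # PROSA Z score of the template, from the slide

# (ID1, ID2), MODELER energy and PROSA Z score as listed for run 1.
models = [
    {"id": (1, 5), "energy": 192.0, "z": -6.60},
    {"id": (3, 2), "energy": 387.0, "z": -6.57},
    {"id": (4, 2), "energy": 363.0, "z": -6.76},
    {"id": (5, 4), "energy": 242.0, "z": -6.30},
]

# Criterion 1: lowest energy reported in the MODELER log.
best_by_energy = min(models, key=lambda m: m["energy"])

# Criterion 2: Z score closest to the template's Z score.
best_by_z = min(models, key=lambda m: abs(m["z"] - TEMPLATE_Z))

print(best_by_energy["id"], best_by_z["id"])  # (1, 5) (4, 2)
```

Model 1 5 has the lowest energy while model 4 2 sits closest to the template Z score, which is why the remaining criteria are needed to break the tie.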

Page 22: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Violations - MODELER log file

ID1, ID2 : 1 5
Current energy : 192.1849

 # RESTRAINT_GROUP                         NUM  NUMVI  NUMVP    RMS_1    RMS_2  MOL.PDF    S_i
-------------------------------------------------------------------------------------------------
25 Phi/Psi pair of dihedral restraints:     64     44     11   36.170  140.638   79.036  1.000

-------------------------------------------------------------------------------------------------

Feature 25 : Phi/Psi pair of dihedral restraints
List of the RVIOL violations larger than : 6.5000

  #  ICSR  RESNO1/2   ATM1/2  INDATM1/2     FEAT    restr     viol    rviol    RESTR     VIOL    RVIOL

  7  1360   45D  46K  C   N    368  370   -68.99   -70.20    30.80     2.20   -62.90   150.55    19.23
  7         46K  46K  N   CA   370  371   109.62   140.40                     -40.80
  8  1361   46K  47D  C   N    377  379   173.18    54.50   123.21    12.43   -63.30   132.44    18.20
  8         47D  47D  N   CA   379  380     7.79    40.90                     -40.00
  9  1362   47D  48D  C   N    385  387  -138.58   -63.30    76.02    11.52   -63.30    76.02    11.52
  9         48D  48D  N   CA   387  388   -29.45   -40.00                     -40.00

 12  1369  103F 104A  C   N    811  813   -69.81   -68.20    21.24     1.77   -62.50   165.18    26.73
 12        104A 104A  N   CA   813  814   124.12   145.30                     -40.90
 13  1370  104A 105A  C   N    816  818  -169.75   -62.50   107.58    21.02   -62.50   107.58    21.02
 13        105A 105A  N   CA   818  819   -49.29   -40.90                     -40.90
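
Each Phi/Psi violation in the log spans two rows: the first row carries the phi values and the second the psi values. The numbers appear consistent with the violation being the Euclidean combination of the phi and psi deviations, in degrees. The sketch below checks that reading against values from the log. The column interpretation (FEAT as the model value, `restr` as the nearest optimum, `RESTR` as the most probable optimum) is inferred from the figures, not taken from the MODELER documentation.

```python
import math

def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def phi_psi_violation(phi, psi, phi_ref, psi_ref):
    """Combined deviation of a (phi, psi) pair from a reference pair."""
    return math.hypot(angle_diff(phi, phi_ref), angle_diff(psi, psi_ref))

# Violation 7 (residue 46): model phi/psi, then the two reference pairs.
phi, psi = -68.99, 109.62
print(round(phi_psi_violation(phi, psi, -70.20, 140.40), 2))  # log viol: 30.80
print(round(phi_psi_violation(phi, psi, -62.90, -40.80), 2))  # log VIOL: 150.55

# Violation 8 (residue 47).
print(round(phi_psi_violation(173.18, 7.79, 54.50, 40.90), 2))  # log viol: 123.21
```

Small differences in the last digit are expected because the log values are themselves rounded.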

Page 23: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

1st loop model - violations in Insight

Residue 46 Residue 104

Page 24: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Loop Model Run 1 - adjustment

Page 25: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Loop Modeling 2

• Refinement of Loop Model 1

• Loop Modeling 2

• Modeler Run 3

Page 26: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Loop Modeling Run 2: Best 5 Models

ID1, ID2 : 5 1; Current energy : 237.4322; PROSA Z score : -5.82

ID1, ID2 : 3 1; Current energy : 222.2522; PROSA Z score : -6.27

ID1, ID2 : 1 1; Current energy : 195.7286; PROSA Z score : -6.32

ID1, ID2 : 2 4; Current energy : 226.8002; PROSA Z score : -6.09

ID1, ID2 : 2 2; Current energy : 198.0359; PROSA Z score : -6.15

Page 27: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Violations - MODELER log file

ID1, ID2 : 1 1
Current energy : 195.7286

 # RESTRAINT_GROUP                         NUM  NUMVI  NUMVP    RMS_1    RMS_2  MOL.PDF    S_i
-------------------------------------------------------------------------------------------------
 4 Stereochemical improper torsion pot:    156      1      2    1.943    1.943   16.723  1.000
25 Phi/Psi pair of dihedral restraints:     67     40     11   34.260  132.074   73.358  1.000

-------------------------------------------------------------------------------------------------

Feature 25 : Phi/Psi pair of dihedral restraints
List of the RVIOL violations larger than : 6.5000

  #  ICSR  RESNO1/2   ATM1/2  INDATM1/2     FEAT    restr     viol    rviol    RESTR     VIOL    RVIOL

  3  1430   45D  46K  C   N    368  370  -103.79  -118.00    33.92     1.76   -62.90   154.80    22.53
  3         46K  46K  N   CA   370  371   169.89   139.10                     -40.80
  4  1431   46K  47D  C   N    377  379   -95.02   -70.90    59.16     2.00   -63.30   119.95    16.85
  4         47D  47D  N   CA   379  380  -155.68   150.30                     -40.00
  5  1432   47D  48D  C   N    385  387   -63.33   -70.90    31.08     1.19   -63.30   160.16    19.77
  5         48D  48D  N   CA   387  388   120.16   150.30                     -40.00

  9  1441  103F 104A  C   N    811  813  -122.41  -134.00    20.39     1.24   -62.50   166.47    30.50
  9        104A 104A  N   CA   813  814   163.78   147.00                     -40.90
 10  1442  104A 105A  C   N    816  818   -64.90   -68.20    29.69     2.28   -62.50   156.71    25.57
 10        105A 105A  N   CA   818  819   115.80   145.30                     -40.90

Page 28: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Loop Model Violation Sites

Page 29: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily
Page 30: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily
Page 31: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily
Page 32: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily
Page 33: Structure Prediction and Modeling of a Eukaryotic Member of the Major Facilitator Superfamily

Refinements in Final Model

• Some regions can be realigned and refined further, taking their energy violations into consideration.

• Other tools, such as PROCHECK, could be used in addition to MODELER and PROSA to gain further insight into the energy details.

• Structural alignment of model with other known transport protein structures might be of some help.