Electronic Supplementary Information (rsc.org)
Electronic Supplementary Information for
In proteins, the structural responses of a position to mutation rely on the Goldilocks principle: not too many links, not too few
Rodrigo Dorantes-Gilardi, Laëtitia Bourgeat, Lorenza Pacini, Laurent Vuillon, Claire Lesieur
Claire Lesieur
Email: [email protected]
Supplementary Methods
Figs. S1 to S4
Tables S1
References
Supplementary Methods

Boxplot. A boxplot divides the data into four equal parts. The first quartile Q1 is the value below which the first 25% of the data lie, the second quartile Q2 is the median (50% of the data), and the third quartile Q3 is the value below which 75% of the data lie. Above Q3 are the values between Q3 and the maximum, and below Q1 are the values between the minimum and Q1.

The degrees and weights probably overestimate the amino acid packing and the atomic packing of amino acids because the van der Waals radius of the atoms is ignored. Nevertheless, amino acids are composed of the same atoms, carbon, hydrogen (not included here), oxygen, nitrogen and sulfur (Met and Cys), and in the dataset the same residue types have identical numbers of atoms, so the overestimation is alike for every amino acid, making the approximation (ignoring the van der Waals volume) reasonable.

Torus. In order to compare the number of amino acids on the surface of a protein and the number of amino acids inside the protein (called buried amino acids), we made a theoretical model. As the proteins in the dataset are oligomers, their topology is a torus (a doughnut-shaped object); they cannot be modelled by a sphere, as monomeric proteins are. In order to define a torus, we need two quantities: the whole diameter $2R$ of the doughnut (between the two most opposite outside points) and the diameter $2r$ of the 'tube' of the doughnut (from an outside point to its closest opposite point on the tube). The area (that is, the contact surface of the doughnut) is calculated with the usual formula for a torus, namely $4\pi^2 R r \times 0.9$, where 0.9 is the density of spherical packing on the plane, because as a first approximation an amino acid is a sphere on the surface. The volume is computed with the usual volume of a torus, namely $2\pi^2 R r^2 \times 0.74$, where 0.74 is the spherical packing in space. With this computation, the ratio of the number of amino acids of the protein and the number of amino acids on the surface of the protein is between 0.2 and 2 when $r$ varies from 3 to 8 nm. This means that the doughnut-shaped model gives a large range of possibilities: from a number of amino acids twice as large as at the surface to a number of amino acids 5 times bigger on the inside of the protein (Fig. S1).

Accessibility Surface Area (ASA). ASA was calculated using the program available at http://cib.cf.ocha.ac.jp/bitool/ASA. This program is based on a method previously described in (1).

Link weight perturbation networks. The perturbation networks are built as follows:
1. The amino acid networks of the reference $G_{ref} = (V_{ref}, E_{ref})$ and the mutant $G_{mut} = (V_{mut}, E_{mut})$ are generated. G, V and E stand for Graph, Vertex (node) and Edge (link), respectively. The reference is CtxB5.
2. Initially, the perturbation network $G_p = (V_p, E_p)$ contains all the nodes that appear in $G_{ref}$ and $G_{mut}$:
$$V_p = V_{ref} \cup V_{mut} \tag{1}$$
3. $E_p$ contains all the links whose weight difference between the two networks is higher than 4. If a link is contained in only one network, it is considered as having null weight [$w(i,j) = 0$] in the other:
$$E_p = \{(i,j) \in E_{ref} \cup E_{mut} \ \text{s.t.}\ |w_{mut}(i,j) - w_{ref}(i,j)| > 4\} \tag{2}$$
4. The link weights $w_p(i,j)$ in the perturbation network are given by the absolute value of the difference in link weights between the two networks:
$$w_p(i,j) = |\Delta w(i,j)| = |w_{mut}(i,j) - w_{ref}(i,j)| \tag{3}$$
5. A link color is assigned based on the sign of $\Delta w(i,j)$:
$$\mathrm{color}(i,j) = \begin{cases} \text{red} & \text{if } w_{mut}(i,j) - w_{ref}(i,j) < 0 \\ \text{green} & \text{if } w_{mut}(i,j) - w_{ref}(i,j) > 0 \end{cases} \tag{4}$$
6. All nodes of degree zero are removed from the perturbation network (i.e., nodes for which there is no difference in link weights between the two networks).

Sphere of influence. The induced perturbation network from a source node $v^*$, referred to as the sphere of influence of the position $v^*$, is built as follows:
1. The perturbation tree is built by applying the breadth-first search algorithm, in the rooted case, to the perturbation network, using $v^*$ as root (https://en.wikipedia.org/wiki/Breadth-first_search). The perturbation tree contains all the nodes that can be reached starting at the source $v^*$ following simple paths in the perturbation network. If $v^*$ is a mutated position, then the perturbation tree contains all the nodes that are affected by the mutation.
2. All links of the perturbation network whose end-points are in the perturbation tree are added to the perturbation tree, if not present yet. In this way, the rescue mechanisms appear as cycles in the induced perturbation network.

Perturbation networks and induced perturbation networks are classified as 1D, 2D, 3D, 4D, 3-4D etc. if they contain links representing 1D, 2D, 3D, 4D, 3D and 4D contacts, respectively. A 1D relation means that the two nodes are first neighbors in the amino acid sequence; a 2D relation means that the two nodes belong to the same secondary structure (the same α-helix, the same β-sheet or the same loop); a 3D relation means that the two nodes do not belong to the same secondary structure but belong to the same chain; a 4D relation means that the two nodes belong to different chains.

Jaccard similarity measure. We made an algorithm to compare the environment (neighborhood) of every amino acid of the two toxins CtxB5 and hLTB5. We start with vectors of 20 counters associated with the 20 amino acid types, and we initialize each counter with a value equal to 0. Given an amino acid -i-, the vector gives the number of amino acids of each type in the environment of -i-, e.g. if Val appears 3 times in the environment of -i-, then the entry corresponding to Val in the vector is 3. To compare two environments, we calculate the Jaccard similarity measure on the pair of vectors. The Jaccard similarity is computed using the environment vectors as follows:
the intersection of each entry of the vector, that is the number of amino acids in common in the two proteins for each amino acid type, e.g. if there are 5 Val in the environment of amino acid -i- in protein 1 and 3 in protein 2, then the intersection of the entry Val in the vectors is equal to the minimal value, that is 3;
the union of each entry of the vector, that is the maximal number of amino acids in the two proteins for each amino acid type, e.g. if there are 5 Val in the environment of amino acid -i- in protein 1 and 3 in protein 2, then the union of the entry Val in the vectors is equal to the maximal value, that is 5.
There is one intersection value per amino acid type, and the sum of the twenty intersection values is noted inter(-i-). Likewise, we compute the union of each entry in the two vectors (the maximal value), and the sum of the twenty union values is noted union(-i-). The Jaccard measure for amino acid -i- is the ratio of inter(-i-) to union(-i-). Note that the Jaccard measure is a value in the interval [0,1] because inter(-i-) is lower than or equal to union(-i-). If Jaccard(-i-) equals 0, this means that inter(-i-) equals 0 and the environments of -i- in the two proteins are either empty or do not share any amino acid type. If, on the other hand, Jaccard(-i-) equals 1, then the two environments are identical.

References
1. Samanta U, Bahadur RP, Chakrabarti P (2002) Quantifying the accessible surface area of protein residues in their local environment. Protein Eng 15(8):659–667.
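As an illustration of the Jaccard similarity measure described in the Supplementary Methods, the following minimal Python sketch computes inter(-i-), union(-i-) and their ratio from two lists of neighboring residue types. The names `AMINO_ACIDS`, `environment_vector` and `jaccard` are hypothetical and chosen for this example only, and returning 0 when both environments are empty is an assumption made here to avoid a division by zero.

```python
# The 20 amino acid types, in three-letter code.
AMINO_ACIDS = [
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
]


def environment_vector(neighbors):
    """Count how many times each amino acid type occurs among the neighbors."""
    counts = dict.fromkeys(AMINO_ACIDS, 0)  # 20 counters initialized to 0
    for residue_type in neighbors:
        counts[residue_type] += 1
    return counts


def jaccard(env_1, env_2):
    """Ratio of summed per-type minima to summed per-type maxima."""
    inter = sum(min(env_1[aa], env_2[aa]) for aa in AMINO_ACIDS)
    union = sum(max(env_1[aa], env_2[aa]) for aa in AMINO_ACIDS)
    if union == 0:
        # Assumption: two empty environments are given a similarity of 0.
        return 0.0
    return inter / union


# Toy environments of the same position -i- in two proteins (made-up data).
env_protein_1 = environment_vector(["Val"] * 5 + ["Lys", "Gly"])
env_protein_2 = environment_vector(["Val"] * 3 + ["Lys", "Ser"])
similarity = jaccard(env_protein_1, env_protein_2)
print(similarity)
```

In this toy case the intersection is 3 (Val) + 1 (Lys) = 4 and the union is 5 (Val) + 1 (Lys) + 1 (Gly) + 1 (Ser) = 8, so the measure is 0.5.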
Figure S2A-F. Amino acid capacity of interactions. Upper panels: Weight versus degree of the amino acids. The continuous lines show the area (envelope) covered by the set of degrees and weights adopted by the amino acids. Middle panels: X-ray local structures of the amino acids (atomic packing representation) for a min (left), a mode, i.e. most frequent (middle), and a max (right) degree. The whole protein is shown for the min degrees but only the local structures are shown for the mode and max degrees. The residue -i- is indicated in cyan and the neighbors -jk- in CPK. The amino acids -i- and -jk- are shown in spacefill. The figure is generated with sPDBviewer. The PDB code, the chain, the position of the residue along the sequence and the degree are indicated. Lower panels: Local networks of the local structures (amino acid packing representation) as in the middle panels. The residue -i- is indicated in cyan and the neighbors -jk- in pink. The nodes (circles) are the residues, and the links between amino acid pairs (lines) are based on the two residues having at least one atom each within a 5 Å distance.
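The link rule stated in the caption above (two residues are linked when they have at least one atom each within 5 Å) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `contact_links`, the input layout (a dictionary mapping a residue label to a list of atom coordinates in Å) and the use of "less than or equal to" for the cutoff are assumptions. The sketch also records the number of atom pairs within the cutoff for each link; treating that count as a link weight is an assumption of this example, since the caption only defines when a link exists.

```python
from itertools import combinations
from math import dist


def contact_links(residues, cutoff=5.0):
    """Return {frozenset({res_i, res_j}): number of atom pairs within cutoff}."""
    links = {}
    for (res_i, atoms_i), (res_j, atoms_j) in combinations(residues.items(), 2):
        n_pairs = sum(
            1 for a in atoms_i for b in atoms_j if dist(a, b) <= cutoff
        )
        if n_pairs > 0:  # at least one atom pair within the cutoff
            links[frozenset((res_i, res_j))] = n_pairs
    return links


def degrees(residues, links):
    """Number of linked neighbors of each residue."""
    degree = dict.fromkeys(residues, 0)
    for link in links:
        for residue in link:
            degree[residue] += 1
    return degree


# Toy coordinates in Å (made-up data, not taken from a PDB file).
toy_residues = {
    "A1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "B2": [(4.0, 0.0, 0.0)],
    "C3": [(20.0, 0.0, 0.0)],
}
toy_links = contact_links(toy_residues)
toy_degrees = degrees(toy_residues, toy_links)
print(toy_links, toy_degrees)
```

Here A1 and B2 are linked because both atoms of A1 lie within 5 Å of the atom of B2, while C3 has no neighbor and a degree of 0.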
Fig. S4. Sphere of influence of the position 80. A. Sphere of influence of position 80. Nodes are amino acids, with K84:E for Lysine at position 84 in chain E. Mutated positions are noted as A-T80:E for A (Ala) in CtxB5 and T (Thr) in LTB5. The yellow node is the source of the perturbation. Green and red links are for lower and higher link weights in CtxB5, respectively. Link thickness is proportional to $\Delta w_{ij}$.
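The construction of a link weight perturbation network and of a sphere of influence, as described in the Supplementary Methods, can be sketched in Python as below. This is a minimal illustration under stated assumptions rather than the authors' code: networks are represented as plain dictionaries mapping a link to its weight, and the names `link`, `perturbation_network` and `sphere_of_influence` are hypothetical. Nodes of degree zero are dropped implicitly because only links are stored.

```python
from collections import deque


def link(u, v):
    """Undirected link between two node labels (hypothetical helper)."""
    return frozenset((u, v))


def perturbation_network(w_ref, w_mut, threshold=4):
    """Keep links with |w_mut - w_ref| > threshold; absent links weigh 0."""
    network = {}
    for edge in set(w_ref) | set(w_mut):
        delta = w_mut.get(edge, 0) - w_ref.get(edge, 0)
        if abs(delta) > threshold:
            color = "green" if delta > 0 else "red"
            network[edge] = (abs(delta), color)
    return network


def sphere_of_influence(network, source):
    """Links of the perturbation network induced by the nodes reachable from source."""
    adjacency = {}
    for edge in network:
        u, v = tuple(edge)
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    if source not in adjacency:
        return {}
    # Step 1: breadth-first search from the source node.
    reached = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in reached:
                reached.add(neighbor)
                queue.append(neighbor)
    # Step 2: keep every link whose two end-points were reached.
    return {edge: attrs for edge, attrs in network.items() if edge <= reached}


# Toy link weights (made-up data).
w_ref = {link("A", "B"): 10, link("B", "C"): 7, link("D", "E"): 3}
w_mut = {link("A", "B"): 3, link("B", "C"): 12, link("D", "E"): 9,
         link("A", "C"): 5, link("C", "F"): 2}
network = perturbation_network(w_ref, w_mut)
sphere = sphere_of_influence(network, "A")
print(sorted(sorted(edge) for edge in sphere))
```

In this toy case the link C-F is discarded (its weight difference of 2 is not higher than 4), the link D-E belongs to the perturbation network but not to the sphere of influence of A, and the three links A-B, B-C and A-C form a cycle in the induced network.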