Genotype-by-environment interactions inferred from genetic effects on phenotypic variability in the UK Biobank

Post on 26-Jun-2020


Huanwei Wang¹, Futao Zhang¹, Jian Zeng¹, Yang Wu¹, Kathryn E. Kemper¹, Angli Xue¹, Min Zhang¹, Joseph E. Powell¹,²,³, Michael E. Goddard⁴,⁵, Naomi R. Wray¹,⁶, Peter M. Visscher¹,⁶, Allan F. McRae¹, Jian Yang¹,⁶,⁷

1 Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
2 Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute for Medical Research, Sydney, NSW 2010, Australia
3 Faculty of Medicine, University of New South Wales, Sydney, NSW 2052, Australia
4 Faculty of Veterinary and Agricultural Science, University of Melbourne, Parkville, Victoria, Australia
5 Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, Bundoora, Victoria, Australia
6 Queensland Brain Institute, The University of Queensland, Brisbane, Queensland 4072, Australia
7 Institute for Advanced Research, Wenzhou Medical University, Wenzhou, Zhejiang 325027, China

Correspondence: Jian Yang (jian.yang@uq.edu.au)

Abstract

Genotype-by-environment interaction (GEI) is a fundamental component in understanding complex trait variation. However, it remains challenging to identify genetic variants with GEI effects in humans largely because of the small effect sizes and the difficulty of monitoring environmental fluctuations. Here, we demonstrate that GEI can be inferred from genetic variants associated with phenotypic variability in a large sample without the need of measuring environmental factors. We performed a genome-wide variance quantitative trait locus (vQTL) analysis of ~5.6 million variants on 348,501 unrelated individuals of European ancestry for 13 quantitative traits in the UK Biobank, and identified 75 significant vQTLs with P < 2.0 × 10⁻⁹ for 9 traits, especially for those related to obesity. Direct GEI analysis with five environmental factors showed that the vQTLs were strongly enriched with GEI effects. Our results indicate pervasive GEI effects for obesity-related traits and demonstrate the detection of GEI without environmental data.


Introduction

Most human traits are complex because they are affected by many genetic and environmental factors as well as potential interactions between them [1,2]. Despite the long history of effort [3-5], there has been limited success in identifying genotype-by-environment interaction (GEI) effects in humans [5-8]. This is likely because many environmental exposures are unknown or difficult to record during the life course, and because the effect sizes of GEI are small given the polygenic nature of most human traits [9-11], so that the sample sizes of most previous studies are not large enough to detect the small GEI effects.

The GEI effect of a genetic variant on a quantitative trait could lead to differences in variance of the trait among groups of individuals with different variant genotypes (Figure 1a-b). GEI effects can therefore be inferred from a variance quantitative trait locus (vQTL) analysis [12]. Unlike the classical quantitative trait locus (QTL) analysis that tests the allelic substitution effect of a variant on the mean of a phenotype (Figure 1c), vQTL analysis tests the allelic substitution effect on the trait variance (Figure 1b or 1d). In comparison to the analyses that perform direct GEI tests, vQTL analysis could be a more powerful approach to identify GEI because it does not require measures of environmental factors and thus can be performed in data with very large sample sizes [13]. Although there had been empirical evidence for the genetic control of phenotypic variance in livestock for decades [14,15], it was not until recent years that genome-wide vQTL analysis was applied in humans [12,16,17], and only a handful of vQTLs have been identified for a limited number of traits (e.g. the FTO locus for body mass index (BMI) [17]) owing to the small effect sizes of the vQTLs. The availability of data from large biobank-based genome-wide association studies (GWAS) [18,19] provides an opportunity to interrogate the genome for vQTLs for a range of phenotypes in cohorts with unprecedented sample size.

On the other hand, the statistical methods for vQTL analysis are not entirely mature [13]. There have been a series of classical non-parametric methods [20], originally developed to detect violation of the homogeneous variance assumption in linear regression models, which can be used to detect vQTLs, including the Bartlett's test [21], the Levene's test [22,23] and the Fligner-Killeen test [24]. Recently, more flexible parametric models have been proposed, including the double generalized linear model (DGLM) [25-27] and the likelihood ratio test [28]. In addition, it has been suggested that the transformation of phenotype that alters phenotype distribution also has an influence on the power and/or false positive rate (FPR) of a vQTL analysis [16,29].

In this study, we calibrated the most commonly used statistical methods for vQTL analysis by extensive simulations. We then used the best performing method to conduct a genome-wide vQTL analysis for 13 quantitative traits in 348,501 unrelated individuals using the full release of the UK Biobank (UKB) data [18]. We further investigated whether the detected vQTLs are enriched for GEI by conducting a direct GEI test for the vQTLs with five environmental factors.

Results

Evaluation of the vQTL methods by simulation

We used simulations to quantify the FPR and power (i.e., true positive rate) for the vQTL methods and phenotype processing strategies (Methods). We first simulated a quantitative trait based on a simulated single nucleotide polymorphism (SNP), i.e., a single-SNP model, under a number of different scenarios, namely: 1) five different distributions for the random error term (i.e., individual-specific environment effect); 2) four different types of SNP with or without the effect on mean or variance (Methods). We used the simulated data to compare the four most widely used vQTL methods, namely the Bartlett's test [21], the Levene's test [22,23], the Fligner-Killeen (FK) test [24] and the DGLM [25-27]. We observed no inflation in FPR for the Levene's test under the null (i.e., no vQTL effect) regardless of the skewness or kurtosis of the phenotype distribution or the presence or absence of the SNP effect on mean (Supplementary Figure 1a). These findings are in line with the results from previous studies [16,20,30] that demonstrate the Levene's test is robust to the distribution of phenotype. The FPR of the Bartlett's test or DGLM was inflated if the phenotype distribution was skewed or heavy-tailed (Supplementary Figure 1a). The FK test seemed to be robust to kurtosis but vulnerable to skewness of the phenotype distribution (Supplementary Figure 1a). We also observed that logarithm or rank-based inverse-normal transformation (RINT) could result in inflated test statistics in the presence of QTL effect (i.e., SNP effect on mean; Supplementary Figure 1b).

To simulate more complex scenarios, we used a multiple-SNP model with two covariates (age and sex) and different numbers of SNPs (Figure 2). The results were similar to those observed above, although the power of the Levene's test decreased with an increase of the number of causal SNPs (Figure 2a). Again, logarithm transformation or RINT gave rise to an inflated FPR in the presence of SNP effect on mean, and RINT led to a further loss of power (Figure 2b). These results also suggested that pre-adjusting the phenotype by covariates slightly increased the power of vQTL detection (Figure 2b). We therefore used the Levene's test for real data analysis with the phenotypes pre-adjusted for covariates without logarithm transformation or RINT.

Genome-wide vQTL analysis for 13 UKB traits

We performed a genome-wide vQTL analysis using the Levene's test with 5,554,549 genotyped or imputed common variants on 348,501 unrelated individuals of European ancestry for 13 quantitative traits in the UKB [18] (Methods, Supplementary Table 1 and Supplementary Figure 2). For each trait, we pre-adjusted the phenotype for age and the first 10 principal components (PCs, derived from SNP data) and standardised the residuals to z-scores in each gender group (Methods). This process removed not only the effects of age and the first 10 PCs on the phenotype but also the differences in mean and variance between the two genders. We excluded individuals with adjusted phenotypes more than 5 standard deviations (SD) from the mean and removed SNPs with minor allele frequency (MAF) smaller than 0.05 to avoid potential false positive associations due to the coincidence of a low-frequency variant with an outlier phenotype (see Supplementary Figure 3 for an example). We acknowledge that this process could potentially result in a loss of power, but this can be compensated for by the use of a very large sample (n ~ 350,000).

With an experiment-wise significance threshold of 2.0 × 10⁻⁹ (i.e., 1 × 10⁻⁸/5.03, with 1 × 10⁻⁸ being a more stringent genome-wide significance threshold recommended by recent studies [31,32] and 5.03 being the effective number of independent traits (Supplementary Note 3)), we identified 75 vQTLs for 9 traits (Figure 3, Table 1 and Supplementary Table 2). There was no vQTL for height, consistent with the observation in a previous study [17]. We identified more than 15 vQTLs for each of the three obesity-related traits, i.e., BMI, waist circumference (WC), and hip circumference (HC) (Table 1). The 75 vQTLs were located at 40 near-independent loci after excluding one of each pair of top vQTL SNPs (i.e., the SNP with lowest vQTL p-value at each vQTL association peak) with linkage disequilibrium (LD) r² > 0.01, suggesting that some of the loci were associated with the phenotypic variance of multiple traits. For example, the FTO locus was associated with the phenotypic variance of WC, HC, BMI, body fat percentage (BFP) and basal metabolic rate (BMR) (Figure 4). For the lung-function-related traits, there was no significant vQTL for forced expiratory volume in one second (FEV1) and forced vital capacity (FVC) but there were 3 vQTLs for FEV1/FVC ratio (FFR).

The Levene's test assesses the difference in variance among three genotype groups free of the assumption about additivity (i.e., the vQTL effect of carrying two copies of the effect allele is not assumed to be twice that of carrying one copy). We found two vQTLs (i.e., rs141783576 and rs10456362) potentially showing non-additive genetic effect on the variance of HC and BMR, respectively (Supplementary Table 2).

GWAS analysis for the 13 UKB traits

To investigate whether the SNPs with effects on variance also have effects on mean, we performed GWAS (or genome-wide QTL) analyses for the 13 UKB traits described above. We identified 3,803 QTLs at an experiment-wise significance level (i.e., P_QTL < 2.0 × 10⁻⁹) for the 13 traits in total, a much larger number than that of the vQTLs (Table 1 and Figure 5). Among the 75 vQTLs, the top vQTL SNPs at 9 loci did not pass the experiment-wise significance level in the QTL analysis (Supplementary Table 2). For example, the CCDC92 locus showed a significant vQTL effect but no significant QTL effect on WC (Supplementary Table 2 and Figure 6a), whereas the FTO locus showed both significant QTL and vQTL effects on WC (Figure 6b). For the 66 vQTLs with both QTL and vQTL effects, the vQTL effects were all in the same directions as the QTL effects, meaning that for any of these SNPs the genotype group with larger phenotypic mean also tends to have larger phenotypic variance than the other groups. For the 9 loci with vQTL effects only, it is equivalent to a scenario where a QTL has a GEI effect with no (or a substantially reduced) effect on average across different levels of an environmental factor (Figure 1b).

vQTL and GEI

To further investigate whether the associations between vQTLs and phenotypic variance can be explained by GEI, we performed a direct GEI test based on an additive genetic model with an interaction term between a top vQTL SNP and one of five environmental/covariate factors in the UKB data (Methods). The five environmental factors are sex, age, physical activity (PA), sedentary behaviour (SB), and ever smoking (Supplementary Note 4, Supplementary Figure 4 and Supplementary Table 3). We observed 16 vQTLs showing a significant GEI effect with at least one of five environmental factors after correcting for multiple tests (p < 1.3 × 10⁻⁴ = 0.05/(75 × 5); Figure 7a and Supplementary Table 4).

To test whether the GEI effects are enriched among vQTLs in comparison with the same number of QTLs, we performed the GEI test for 75 top GWAS SNPs randomly selected from all the QTLs and repeated the analysis 1000 times. Of the 75 top SNPs with QTL effects, the number of SNPs with significant GEI effects was 1.39 averaged from the 1000 repeated samplings with a SD of 1.15 (Figure 7b), significantly lower than the number (16) observed for the vQTLs (the difference is larger than 12 SDs, equivalent to p = 6.6 × 10⁻³⁷). This result shows that SNPs with vQTL effects are much more enriched with GEI effects compared to those with QTL effects. To exclude the possibility that the GEI signals were driven by phenotype processing (e.g., the adjustment of phenotype for sex and age), we repeated the GEI analyses using raw phenotype data without covariates adjustment; the results remain largely unchanged (Supplementary Figure 5).

Discussion


In this study, we leveraged the genetic effects associated with phenotypic variability to infer GEI. We calibrated the most commonly used vQTL methods by simulation. We found that the FPR of the Levene's test was well-calibrated across all simulation scenarios whereas the other methods showed an inflated FPR if the phenotype distribution was skewed or heavy-tailed under the null hypothesis (i.e., no vQTL effect), although the Levene's test appeared to be less powerful than the other methods under the alternative hypothesis, in particular when the per-variant vQTL effect was small (Figure 2 and Supplementary Figure 1). Parametric bootstrap or permutation procedures have been proposed to reduce the inflation in the test-statistics of DGLM and likelihood ratio test (LRT)-based methods, both of which are expected to be more powerful than the Levene's test [28,30], but bootstrap and permutation are computationally inefficient and thus not practically applicable to biobank data such as the UKB. In addition, we observed inflated FPR for the Levene's test in the absence of vQTL effects but in the presence of QTL effects if the phenotype was transformed by logarithm transformation or RINT. We therefore recommend the use of the Levene's test in practice without logarithm transformation or RINT of the phenotype. In addition, a very recent study by Young et al. [33] developed an efficient algorithm to perform a DGLM analysis and proposed a method (called dispersion effect test (DET)) to remove the confounding in vQTL associations (identified by DGLM) due to the QTL effects. We showed by simulation that when the number of simulated causal variants was relatively large (note that the DET test is not applicable to oligogenic traits), the Young et al. method (DGLM followed by DET) performed similarly to the Levene's test with differences depending on how the phenotype was processed (Supplementary Figure 6).

We identified 75 genetic variants with vQTL effects for 9 quantitative traits in the UKB at a stringent significance level and observed strong enrichment of GEI effects among the genetic variants with vQTL effects compared to those with QTL effects. There are several vQTLs for which the GEI effect has been reported in previous studies. The first example is the interaction effect of the CHRNA5-A3-B4 locus (rs56077333) with smoking for lung function (as measured by FFR, i.e., FEV1/FVC), P_vQTL = 1.1 × 10⁻¹⁴ and P_GEI(smoking) = 4.6 × 10⁻²⁵ (Supplementary Tables 2 and 4). The CHRNA5-A3-B4 gene cluster is known to be associated with smoking and nicotine dependence [34-36]. However, results from recent GWAS studies [37-39] do not support the association of this locus with lung function. We hypothesize that the effect of the CHRNA5-A3-B4 locus on lung function depends on smoking [40] (Supplementary Table 5). The vQTL signal at this locus remained (P_vQTL = 5.2 × 10⁻¹²) after adjusting the phenotype for array effect, which was reported to affect the QTL association signal at this locus [18]. The second example is the interaction of the WNT16-CPED1 locus with age for bone mineral density (BMD) (rs10254825: P_vQTL = 2.0 × 10⁻⁴⁵ and P_GEI(age) = 1.2 × 10⁻⁷). The WNT16-CPED1 locus is one of the strongest BMD-associated loci identified from GWAS [41,42]. We observed a genotype-by-age interaction effect at this locus for BMD (Supplementary Table 6), in line with the results from previous studies that the effect of the top SNP at WNT16-CPED1 on BMD in humans [43] and the knock-out effect of Wnt16 on bone mass in mice [44] are age-dependent. The third example is the interaction of the FTO locus with physical activity and sedentary behaviour for obesity-related traits (P_vQTL < 1 × 10⁻¹⁰ for BMI, WC, HC, BFP and BMR; P_GEI(PA) = 1.3 × 10⁻¹⁰ for BMI, 1.4 × 10⁻⁷ for WC, 5.3 × 10⁻⁷ for HC and 2.6 × 10⁻⁷ for BMR). The FTO locus was one of the first loci identified by the GWAS of obesity-related traits [45], although subsequent studies [46,47] show that IRX3 and IRX5 (rather than FTO) are the functional genes responsible for the GWAS association. The top associated SNP at the FTO locus is not associated with physical activity but its effect on BMI decreases with the increase of physical activity level [48,49], consistent with the interaction effects of the FTO locus with physical activity or sedentary behaviour for obesity-related traits identified in this study (Supplementary Tables 7 and 8). In addition, 5 of the 22 BMI vQTLs were in LD (r² > 0.5) with the variants (identified by a recently developed multiple-environment GEI test) showing significant interaction effects at FDR < 5% (corresponding to p < 1.16 × 10⁻³) with at least one of 64 environmental factors for BMI in the UKB [50].

Apart from GEI, there are other possible interpretations of an observed vQTL signal, including "phantom vQTLs" [28,51] and epistasis (genotype-by-genotype interaction). If the underlying causal QTL is not well imputed or not well tagged by a genotyped/imputed variant, the untagged variation at the causal QTL will inflate the vQTL test-statistic, potentially leading to a spurious vQTL association, i.e., the so-called phantom vQTL. We showed by theoretical derivations that the Levene's test-statistic due to the phantom vQTL effect was a function of sample size, effect size of the causal QTL, allele frequency of the causal QTL, allele frequency of the phantom vQTL, and LD between the causal QTL and the phantom vQTL (Supplementary Note 5 and Supplementary Figure 7). From our derivations, we computed the numerical distribution of the expected phantom vQTL F-statistics given a number of parameters including the sample size (n = 350,000), variance explained by the causal QTL (q² = 0.005, 0.01 or 0.02), and MAFs of the causal QTL and the phantom vQTL (MAF = 0.05–0.5). The result showed that for a causal QTL with q² < 0.005 and MAF > 0.05, the largest possible phantom vQTL F-statistic was smaller than 2.69 (corresponding to a p-value of 6.8 × 10⁻²; Supplementary Figure 8). This explains why there were thousands of genome-wide significant QTLs but no significant vQTL for height (Table 1 and Figure 3). This result also suggests that the vQTLs detected in this study are very unlikely to be phantom vQTLs because the estimated variances explained by their QTL effects were all smaller than 0.005 except for rs10254825 at the WNT16 locus on BMD (q² = 0.014) (Supplementary Figure 9). However, our numerical calculation also indicated that for a QTL with MAF > 0.3 and q² < 0.02, the largest possible phantom vQTL F-statistic was smaller than 5.64 (corresponding to a p-value of 3.6 × 10⁻³), suggesting rs10254825 is also unlikely to be a phantom vQTL. Note that we used the variance explained estimated at the top GWAS SNP to approximate q² of the causal QTL so that q² was likely to be underestimated because of imperfect tagging. However, considering the extremely high imputation accuracy for common variants [52], the strong LD between the causal QTLs and the GWAS top SNPs observed in a previous simulation study based on whole-genome-sequence data [31], and the overestimation of variance explained by the GWAS top SNPs because of winner's curse, the underestimation in causal QTL q² is likely to be small. In addition, we re-ran the vQTL analysis with the phenotype adjusted for the top GWAS variants within 10 Mb distance of the top vQTL SNP; the vQTL signals after this adjustment were highly concordant with those without adjustment (Supplementary Figure 10). We further showed that there was no evidence for epistatic interactions between the top vQTL SNPs and any other SNP in more than 10 Mb distance or on a different chromosome (Supplementary Figure 11).

In conclusion, we systematically quantified the FPR and the power of four commonly used vQTL methods by extensive simulations and demonstrated the robustness of the Levene's test. We also showed that in the presence of QTL effects the Levene's test statistic could be inflated if the phenotype was transformed by logarithm transformation or RINT. We implemented the Levene's test as part of the OSCA software package [53] (URLs) for efficient genome-wide vQTL analysis, and applied OSCA-vQTL to 13 quantitative traits in the UKB and identified 75 vQTLs (at 40 independent loci) associated with 9 traits; 9 of these vQTLs did not show a significant QTL effect. As a proof-of-principle, we performed GEI analyses in the UKB with 5 environmental factors, and demonstrated the enrichment of GEI effects among the detected vQTLs. We further derived the theory to compute the expected "phantom vQTL" test-statistic due to untagged causal QTL effect, and showed by numerical calculation that our observed vQTLs were very unlikely to be driven by imperfectly tagged QTL effects. Our theory is also consistent with the observation of pervasive phantom vQTLs for molecular traits with large-effect QTLs (e.g., DNA methylation [51]). However, the conclusions from this study may be only applicable to quantitative traits of polygenic architecture. We advise caution when applying vQTL analysis to binary or categorical traits, or molecular traits (e.g., gene expression or DNA methylation), for which the methods need further investigation.
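The central idea discussed above, that an interaction with an unmeasured environment appears as unequal trait variance across genotype groups (Figure 1a-b), can be illustrated with a small simulation. The sketch below is not the study's pipeline: the sample size, allele frequency and effect sizes are hypothetical, and it uses `scipy.stats.levene` with `center="median"` rather than OSCA. The environment `env` is generated but never passed to the test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 20000
x = rng.binomial(2, 0.3, size=n)   # genotype coded 0/1/2 (hypothetical MAF 0.3)
env = rng.normal(size=n)           # environmental factor, treated as unobserved
noise = rng.normal(size=n)

# Pure interaction: the SNP only changes the slope of the phenotype on env.
y_gei = (1.0 + 0.3 * x) * env + noise
# Control: the SNP shifts the mean but does not interact with env.
y_mean = 0.3 * x + env + noise

results = {}
for label, y in [("GEI", y_gei), ("mean only", y_mean)]:
    groups = [y[x == g] for g in (0, 1, 2)]
    variances = [grp.var(ddof=1) for grp in groups]
    res = stats.levene(*groups, center="median")  # median-based Levene's test
    results[label] = (variances, res.pvalue)
    print(label, np.round(variances, 2), res.pvalue)
```

Under these assumptions the per-genotype variance in the interaction scenario is expected to grow with allele count (roughly (1 + 0.3x)² + 1), which is what the variance test picks up, whereas the mean-only scenario has equal variance in every group.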


Methods

Simulation study

We used a DGLM [25-27] to simulate the phenotype based on two models with simulated SNP data in a sample of 10,000 individuals, i.e., a single-SNP model and a multiple-SNP model with two covariates (i.e. age and sex). The single-SNP model can be written as

$$y = w\beta_g + e \quad \text{with} \quad \log(\sigma_e^2) = w\phi_g + \log(\sigma^2)$$

and the multiple-SNP model can be expressed as

$$y = \sum_{j=1}^{2} c_j \beta_{c_j} + \sum_{k=1}^{m} w_k \beta_{g_k} + e \quad \text{with} \quad \log(\sigma_e^2) = \sum_{j=1}^{2} c_j \phi_{c_j} + \sum_{k=1}^{m} w_k \phi_{g_k} + \log(\sigma^2),$$

where $y$ is a simulated phenotype; $w$ or $w_k$ is a standardized SNP genotype, i.e., $w = (x - 2f)/\sqrt{2f(1-f)}$ with $x$ being the genotype indicator variable coded as 0, 1 or 2, generated from binomial(2, $f$) and $f$ being the MAF generated from uniform(0.01, 0.5); $m$ is the number of causal SNPs; $c_j$ is a standardized covariate with $c_1$ (sex) generated from binomial(1, 0.5) and $c_2$ (age) generated from uniform(20, 60); $e$ is an error term normally distributed with mean 0 and variance $\sigma_e^2$. To simulate the error term with different levels of skewness and kurtosis, we generated $e$ from five different distributions, including the normal distribution, the t-distribution with degrees of freedom (df) = 10 or 3 and the $\chi^2$ distribution with df = 15 or 1. $\beta$ ($\phi$) is the effect on mean (variance) generated from N(0,1) if it exists, 0 otherwise. $\log(\sigma^2)$ is the intercept of the second linear model, which was set to 0. We re-scaled the different components to control the variance explained, i.e., 0.1 and 0.9 for the genotype component and error term, respectively, for the single-SNP model, and 0.2, 0.4 and 0.4 for the covariate component, genotype component and error term, respectively, for the multiple-SNP model. We simulated the SNP effects in four different scenarios: 1) effect on neither mean nor variance (nei), 2) effect on mean only (mean), 3) effect on variance only (var), or 4) effect on both mean and variance (both). We simulated only one causal SNP in the single-SNP model and 4, 40 or 80 causal SNPs in the multiple-SNP model.

We performed vQTL analyses using the simulated phenotype and SNP data to compare four vQTL methods, including the Bartlett's test [21], the Levene's test [23], the Fligner-Killeen test [24] and the DGLM (Supplementary Note 1). We also performed the Levene's test with four phenotype processing strategies, including raw phenotype (raw), raw phenotype adjusted for covariates (adj), RINT after covariate adjustment (rint), and logarithm transformation after covariate adjustment (log) (Supplementary Note 2). We repeated the simulation 1,000 times and calculated the FPR and power at p < 0.05 at a single SNP level.

The UK Biobank data


The full release of the UKB data comprised genotype and phenotype data for ~500,000 participants across the UK [18]. The genotype data were cleaned and imputed to the Haplotype Reference Consortium (HRC) [52] and UK10K [54] reference panels by the UKB team. Genotype probabilities from imputation were converted to hard-call genotypes using PLINK2 [55] (--hard-call 0.1). We excluded genetic variants with MAF < 0.05, Hardy-Weinberg equilibrium test p value < 1 × 10⁻⁵, missing genotype rate > 0.05 or imputation INFO score < 0.3, and retained 5,554,549 variants for analysis.

We identified a subset of individuals of European ancestry (n = 456,422) by projecting the UKB PCs onto those of the 1000 Genomes Project (1KGP) [56]. Furthermore, we removed one of each pair of individuals with SNP-derived (based on HapMap3 SNPs) genomic relatedness > 0.05 using GCTA-GRM [57] and retained 348,501 unrelated European individuals for further analysis.

We selected 13 quantitative traits for our analysis (Supplementary Table 1 and Supplementary Figure 2). The raw phenotype values were adjusted for age and the first 10 PCs in each gender group. We excluded from the analysis phenotype values that were more than 5 SD from the mean. The phenotypes were then standardized to z-scores with mean 0 and variance 1.

Genome-wide vQTL analysis

The genome-wide vQTL analysis was conducted using the Levene's test implemented in the software tool OSCA [53] (URLs). The Levene's test used in the study (also known as the median-based Levene's test or the Brown-Forsythe test [23]) is a modified version of the original Levene's test [22] developed in 1960 that is essentially a one-way analysis of variance (ANOVA) of the variable $z_{ij} = |y_{ij} - \tilde{y}_i|$, where $y_{ij}$ is the phenotype of the j-th individual in the i-th group and $\tilde{y}_i$ is the median of the i-th group. The Levene's test statistic

$$\frac{(n-k)}{(k-1)} \, \frac{\sum_{i=1}^{k} n_i (z_{i\cdot} - z_{\cdot\cdot})^2}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (z_{ij} - z_{i\cdot})^2}$$

follows an F distribution with $k - 1$ and $n - k$ degrees of freedom, where $n$ is the total sample size, $k$ is the number of groups ($k = 3$ in vQTL analysis), $n_i$ is the sample size of the i-th group, i.e. $n = \sum_{i=1}^{k} n_i$, $z_{ij} = |y_{ij} - \tilde{y}_i|$, $z_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} z_{ij}$, and $z_{\cdot\cdot} = \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} z_{ij}$.
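The statistic defined above can be computed in a few lines. The following is a minimal sketch, not the OSCA implementation: the function name `levene_median` is hypothetical, and the toy phenotype is generated from a single-SNP model in the spirit of the Simulation study (standardized genotype w, log error variance linear in w) with an illustrative variance effect of 0.2. The result is compared against `scipy.stats.levene` with `center="median"`, which implements the same Brown-Forsythe variant.

```python
import numpy as np
from scipy import stats


def levene_median(y, genotype):
    """Median-based Levene (Brown-Forsythe) F-statistic and p-value."""
    groups = [y[genotype == g] for g in np.unique(genotype)]
    k = len(groups)                       # number of genotype groups
    n = sum(len(grp) for grp in groups)   # total sample size
    # z_ij = |y_ij - median of group i|
    z = [np.abs(grp - np.median(grp)) for grp in groups]
    n_i = np.array([len(zi) for zi in z])
    z_i = np.array([zi.mean() for zi in z])   # group means z_i.
    z_all = np.concatenate(z).mean()          # grand mean z_..
    between = np.sum(n_i * (z_i - z_all) ** 2)
    within = sum(np.sum((zi - m) ** 2) for zi, m in zip(z, z_i))
    f_stat = (n - k) / (k - 1) * between / within
    p_value = stats.f.sf(f_stat, k - 1, n - k)
    return f_stat, p_value


# Toy single-SNP data (all parameter values are illustrative assumptions).
rng = np.random.default_rng(1)
f = 0.3                                    # allele frequency
x = rng.binomial(2, f, size=5000)          # genotype coded 0/1/2
w = (x - 2 * f) / np.sqrt(2 * f * (1 - f)) # standardized genotype
phi = 0.2                                  # effect on log error variance
y = rng.normal(0.0, np.exp(0.5 * phi * w)) # SD = sqrt(exp(phi * w))

f_stat, p_value = levene_median(y, x)
ref = stats.levene(y[x == 0], y[x == 1], y[x == 2], center="median")
print(f_stat, ref.statistic, p_value)
```

Because the three genotype groups are formed without any ordering, the test makes no additivity assumption, in line with the description in the Results.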

The experiment-wise significance level was set to 2.0 × 10⁻⁹, which is the genome-wide significance level (i.e. 1 × 10⁻⁸) [31,32] divided by the effective number of independent traits (i.e. 5.03 for 13 traits). The effective number of independent traits was estimated based on the phenotypic correlation matrix [58] (Supplementary Note 3). To determine the number of independent vQTLs, we performed an LD clumping analysis for each trait using PLINK2 [55] (--clump option with parameters --clump-p1 2.0e-9 --clump-p2 2.0e-9 --clump-r2 0.01 and --clump-kb 5000). To visualize the results, we generated the Manhattan and regional association plots using the ggplot2 package in R.

GWAS analysis

The GWAS (or genome-wide QTL) analysis was conducted using PLINK2 [55] (--assoc option) using the same data as used in the vQTL analysis (note that the phenotype had been pre-adjusted for covariates and PCs). The other analyses, including LD clumping and visualization, were performed using the same pipelines as those for the genome-wide vQTL analysis described above.

GEI analysis

Five environmental/covariate factors (i.e., sex, age, PA, SB and smoking) were used for the direct GEI tests. Sex was coded as 0 or 1 for female or male. Age was an integer number ranging from 40 to 74. PA was assessed by a three-level categorical score (i.e., low, intermediate and high) based on the short form of the International Physical Activity Questionnaire (IPAQ) guideline [59]. SB was an integer number defined as the combined time (hours) spent driving, on non-work-related computer use or watching TV. The smoking factor "ever smoked" was coded as 0 or 1 for never or ever smoker. More details about the definition and derivation of the environmental factors PA, SB and smoking can be found in Supplementary Note 4, Supplementary Figure 4 and Supplementary Table 3.

We performed a GEI analysis to test the interaction effect between the top vQTL SNP and one of the five environmental factors based on the following model

$$y = \mu + \beta_g x_g + \beta_E x_E + \beta_{gE} x_{gE} + e,$$

where $y$ is the phenotype, $\mu$ is the mean term, $x_g$ is the mean-centred SNP genotype indicator, $x_E$ is the mean-centred environmental factor, and $x_{gE} = x_g x_E$. We used a standard ANOVA analysis to test for $\beta_{gE}$ and applied a stringent Bonferroni-corrected threshold of 1.33 × 10⁻⁴ (i.e. 0.05/(75 × 5)) to claim a significant GEI effect.
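The interaction model above can be sketched as a comparison of two nested least-squares fits. This is an illustration under stated assumptions rather than the authors' code: the function name `gei_test`, the simulated data and all effect sizes are hypothetical, and the ANOVA is implemented here as a 1-df F-test on the drop in residual sum of squares when the interaction column is added.

```python
import numpy as np
from scipy import stats


def gei_test(y, snp, env):
    """F-test of b_gE in y = mu + b_g*x_g + b_E*x_E + b_gE*x_g*x_E + e."""
    x_g = snp - snp.mean()   # mean-centred genotype
    x_e = env - env.mean()   # mean-centred environmental factor
    ones = np.ones_like(y)
    reduced = np.column_stack([ones, x_g, x_e])
    full = np.column_stack([ones, x_g, x_e, x_g * x_e])

    def fit(design):
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return resid @ resid, beta   # residual sum of squares, coefficients

    rss_reduced, _ = fit(reduced)
    rss_full, beta_full = fit(full)
    df_resid = len(y) - full.shape[1]
    f_stat = (rss_reduced - rss_full) / (rss_full / df_resid)
    p_value = stats.f.sf(f_stat, 1, df_resid)
    return beta_full[3], f_stat, p_value


# Bonferroni threshold quoted in the text: 75 vQTLs x 5 environmental factors.
bonferroni = 0.05 / (75 * 5)

# Toy data with a true interaction effect of 0.2 (illustrative values only).
rng = np.random.default_rng(7)
n = 20000
snp = rng.binomial(2, 0.4, size=n).astype(float)
env = rng.binomial(1, 0.5, size=n).astype(float)  # a 0/1 factor such as sex
y = 0.05 * snp + 0.2 * env + 0.2 * snp * env + rng.normal(size=n)

b_ge, f_stat, p_value = gei_test(y, snp, env)
print(b_ge, p_value, p_value < bonferroni)
```

Mean-centring the genotype and the environmental factor leaves the interaction coefficient unchanged; it only alters the interpretation of the two main effects.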


URLs

OSCA, http://cnsgenomics.com/software/osca
PLINK2, http://www.cog-genomics.org/plink2
GCTA, http://cnsgenomics.com/software/gcta
UCSC Genome Browser, https://genome.ucsc.edu/
UKB, http://www.ukbiobank.ac.uk/

Acknowledgements

This research was supported by the Australian Research Council (DP160101343 and DP160101056), the Australian National Health and Medical Research Council (1078037, 1078901, 1113400, 1107258 and 1083656), and the Sylvia & Charles Viertel Charitable Foundation. This study makes use of data from the UK Biobank (project ID: 12514). A full list of acknowledgments of this dataset can be found in Supplementary Note 6.

Author contributions

J.Y. and A.F.M. conceived the study. J.Y., H.W. and A.F.M. designed the experiment. F.Z. developed the software tool. H.W. performed simulations and data analyses under the assistance or guidance from J.Y., J.Z., Y.W., K.K., A.X. and M.Z. J.E.P., M.E.G., N.R.W. and P.M.V. provided critical advice that significantly improved the experimental design and/or interpretation of the results. P.M.V., N.R.W. and J.Y. contributed resources and funding. H.W. and J.Y. wrote the manuscript with the participation of all authors.

Competing interests

The authors declare no competing interests.

References

1. Falconer DS, Mackay TFC. Introduction to quantitative genetics. 4th ed: Longman, Harlow; 1996.
2. Lynch M, Walsh B. Genetics and analysis of quantitative traits. Sinauer Associates, Sunderland, Ma; 1998.
3. Garrod A. The incidence of alkaptonuria: a study in chemical individuality. The Lancet. 1902;160(4137):1616-1620.
4. Haldane J. Heredity and politics. WW Norton & Co., NY; 1938.
5. Kraft P, Hunter D. Integrating epidemiology and genetic association: the challenge of gene–environment interaction. Philosophical Transactions of the Royal Society B: Biological Sciences. 2005;360(1460):1609-1616.

6. Thomas D. Gene–environment-wide association studies: emerging approaches. Nature Reviews Genetics. 2010;11(4):259.
7. Aschard H, Lutz S, Maus B, et al. Challenges and opportunities in genome-wide environmental interaction (GWEI) studies. Hum Genet. 2012;131(10):1591-1613.
8. McAllister K, Mechanic LE, Amos C, et al. Current Challenges and New Opportunities for Gene-Environment Interaction Studies of Complex Diseases. Am J Epidemiol. 2017;186(7):753-761.
9. Yang J, Lee T, Kim J, et al. Ubiquitous polygenicity of human complex traits: genome-wide analysis of 49 traits in Koreans. PLoS genetics. 2013;9(3):e1003355.
10. Shi H, Kichaev G, Pasaniuc B. Contrasting the genetic architecture of 30 complex traits from summary association data. The American Journal of Human Genetics. 2016;99(1):139-153.
11. Maier RM, Visscher PM, Robinson MR, Wray NR. Embracing polygenicity: a review of methods and tools for psychiatric genetics research. Psychol Med. 2017:1-19.
12. Pare G, Cook NR, Ridker PM, Chasman DI. On the use of variance per genotype as a tool to identify quantitative trait interaction effects: a report from the Women's Genome Health Study. PLoS Genet. 2010;6(6):e1000981.
13. Rönnegård L, Valdar W. Recent developments in statistical methods for detecting genetic loci affecting phenotypic variability. BMC genetics. 2012;13(1):63.
14. Van Vleck LD. Variation of milk records within paternal-sib groups. Journal of Dairy Science. 1968;51(9):1465-1470.
15. Hill WG, Mulder HA. Genetic analysis of environmental variation. Genetics Research. 2010;92(5-6):381-395.
16. Struchalin MV, Dehghan A, Witteman JC, van Duijn C, Aulchenko YS. Variance heterogeneity analysis for detection of potentially interacting genetic loci: method and its limitations. BMC Genet. 2010;11:92.
17. Yang J, Loos RJ, Powell JE, et al. FTO genotype is associated with phenotypic variability of body mass index. Nature. 2012;490(7419):267-272.
18. Bycroft C, Freeman C, Petkova D, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562(7726):203-209.
19. Collins FS, Varmus H. A new initiative on precision medicine. New England Journal of Medicine. 2015;372(9):793-795.
20. Conover WJ, Johnson ME, Johnson MM. A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. Technometrics. 1981;23(4):351-361.
21. Bartlett MS. Properties of sufficiency and statistical tests. Paper presented at: Proc. R. Soc. Lond. A 1937.
22. Levene H. Robust Tests for Equality of Variances. In Ingram Olkin; Harold Hotelling; et al. Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling. Stanford University Press, Stanford. 1960:278–292.
23. Brown MB, Forsythe AB. Robust tests for the equality of variances. Journal of the American Statistical Association. 1974;69(346):364-367.
24. Fligner MA, Killeen TJ. Distribution-free two-sample tests for scale. Journal of the American Statistical Association. 1976;71(353):210-213.
25. Ronnegard L, Felleki M, Fikse F, Mulder HA, Strandberg E. Genetic heterogeneity of residual variance - estimation of variance components using double hierarchical generalized linear models. Genet Sel Evol. 2010;42:8.
26. Ronnegard L, Valdar W. Detecting major genetic loci controlling phenotypic variability in experimental crosses. Genetics. 2011;188(2):435-447.
27. Smyth GK. Generalized linear models with varying dispersion. Journal of the Royal Statistical Society Series B (Methodological). 1989:47-60.
28. Cao Y, Wei P, Bailey M, Kauwe JSK, Maxwell TJ. A versatile omnibus test for detecting mean and variance heterogeneity. Genet Epidemiol. 2014;38(1):51-59.
29. Sun X, Elston R, Morris N, Zhu X. What is the significance of difference in phenotypic variability across SNP genotypes? Am J Hum Genet. 2013;93(2):390-397.
30. Corty RW, Valdar W. Mean-Variance QTL Mapping on a Background of Variance Heterogeneity. bioRxiv. 2018:276980.


31. Wu Y, Zheng Z, Visscher PM, Yang J. Quantifying the mapping precision of genome-wide association studies using whole-genome sequencing data. Genome biology. 2017;18(1):86.

32. Pulit SL, de With SA, de Bakker PI. Resetting the bar: Statistical significance in whole-genome sequencing-based association studies of global populations. Genetic epidemiology. 2017;41(2):145-151.

33. Young AI, Wauthier FL, Donnelly P. Identifying loci affecting trait variability and detecting interactions in genome-wide association studies. Nature Publishing Group; 2018. 1546-1718.

34. Saccone SF, Hinrichs AL, Saccone NL, et al. Cholinergic nicotinic receptor genes implicated in a nicotine dependence association study targeting 348 candidate genes with 3713 SNPs. Human molecular genetics. 2006;16(1):36-49.

35. Thorgeirsson TE, Geller F, Sulem P, et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008;452(7187):638.

36. Fowler CD, Lu Q, Johnson PM, Marks MJ, Kenny PJ. Habenular α5 nicotinic receptor subunit signalling controls nicotine intake. Nature. 2011;471(7340):597.

37. Repapi E, Sayers I, Wain LV, et al. Genome-wide association study identifies five loci associated with lung function. Nature genetics. 2010;42(1):36.

38. Hancock DB, Eijgelsheim M, Wilk JB, et al. Meta-analyses of genome-wide association studies identify multiple loci associated with pulmonary function. Nature genetics. 2010;42(1):45.

39. Wain LV, Shrine N, Artigas MS, et al. Genome-wide association analyses for lung function and chronic obstructive pulmonary disease identify new loci and potential druggable targets. Nature genetics. 2017;49(3):416.

40. Kaur-Knudsen D, Nordestgaard BG, Bojesen SE. CHRNA3 genotype, nicotine dependence, lung function and disease in the general population. European Respiratory Journal. 2012;40(6):1538-1544.

41. Estrada K, Styrkarsdottir U, Evangelou E, et al. Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture. Nature genetics. 2012;44(5):491.

42. Kemp JP, Morris JA, Medina-Gomez C, et al. Identification of 153 new loci associated with heel bone mineral density and functional involvement of GPC6 in osteoporosis. Nature genetics. 2017;49(10):1468.

43. Medina-Gomez C, Kemp JP, Estrada K, et al. Meta-analysis of genome-wide scans for total body BMD in children and adults reveals allelic heterogeneity and age-specific effects at the WNT16 locus. PLoS genetics. 2012;8(7):e1002718.

44. Movérare-Skrtic S, Henning P, Liu X, et al. Osteoblast-derived WNT16 represses osteoclastogenesis and prevents cortical bone fragility fractures. Nature medicine. 2014;20(11):1279.

45. Frayling TM, Timpson NJ, Weedon MN, et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007;316(5826):889-894.

46. Smemo S, Tena JJ, Kim K-H, et al. Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature. 2014;507(7492):371.

47. Claussnitzer M, Dankel SN, Kim K-H, et al. FTO obesity variant circuitry and adipocyte browning in humans. New England Journal of Medicine. 2015;373(10):895-907.

48. Kilpeläinen TO, Qi L, Brage S, et al. Physical activity attenuates the influence of FTO variants on obesity risk: a meta-analysis of 218,166 adults and 19,268 children. PLoS medicine. 2011;8(11):e1001116.

49. Loos RJ, Yeo GS. The bigger picture of FTO: the first GWAS-identified obesity gene. Nature Reviews Endocrinology. 2014;10(1):51.

50. Moore R, Casale FP, Jan Bonder M, et al. A linear mixed-model approach to study multivariate gene–environment interactions. Nature Genetics. 2018.


51. Ek WE, Rask-Andersen M, Karlsson T, Enroth S, Gyllensten U, Johansson A. Genetic variants influencing phenotypic variance heterogeneity. Hum Mol Genet. 2018;27(5):799-810.

52. McCarthy S, Das S, Kretzschmar W, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet. 2016;48(10):1279-1283.

53. Zhang F, Chen W, Zhu Z, et al. OSCA: a tool for omic-data-based complex trait analysis. bioRxiv. 2018:445163.

54. The UK10K Consortium. The UK10K project identifies rare variants in health and disease. Nature. 2015;526(7571):82-90.

55. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015;4:7.

56. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010;467(7319):1061.

57. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet. 2011;88(1):76-82.

58. Bretherton CS, Widmann M, Dymnikov VP, Wallace JM, Bladé I. The effective number of spatial degrees of freedom of a time-varying field. Journal of climate. 1999;12(7):1990-2009.

59. IPAQ Research Committee. Guidelines for data processing and analysis of the International Physical Activity Questionnaire (IPAQ) - short and long forms. www.ipaq.ki.se. 2005.

Figures

Figure 1. Schematic of the differences in mean or variance among genotype groups in the presence of GEI, QTL and vQTL effect. The phenotypes of 1,000 individuals were simulated based on a genetic variant (MAF = 0.3) with (a) both QTL and GEI effects, (b) GEI effect only (no QTL effect), (c) QTL effect only (no GEI or vQTL effect), or (d) vQTL only (no QTL effect).
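The four scenarios in Figure 1 can be illustrated with a minimal simulation sketch. The model below, y = b_qtl*g + b_gei*g*E + (1 + sd_slope*g)*e, along with the effect sizes and the names `simulate`, `summarise` and `SCENARIOS`, is an assumption chosen for illustration and is not taken from the authors' code. It shows that a GEI effect with an unobserved environment E changes the phenotypic variance across genotype groups even when the group means are equal.

```python
import random
import statistics

def simulate(n, maf, b_qtl=0.0, b_gei=0.0, sd_slope=0.0, seed=0):
    """Return {genotype: [phenotypes]} for one simulated variant.

    Hypothetical model: y = b_qtl*g + b_gei*g*E + (1 + sd_slope*g)*e,
    with E and e independent standard normal draws.
    """
    rng = random.Random(seed)
    by_genotype = {0: [], 1: [], 2: []}
    for _ in range(n):
        # Two independent alleles give Hardy-Weinberg genotypes coded 0/1/2.
        g = sum(rng.random() < maf for _ in range(2))
        env = rng.gauss(0.0, 1.0)  # environmental exposure, not observed in a vQTL scan
        err = rng.gauss(0.0, 1.0)  # residual
        y = b_qtl * g + b_gei * g * env + (1.0 + sd_slope * g) * err
        by_genotype[g].append(y)
    return by_genotype

def summarise(by_genotype):
    """Mean and sample variance of the phenotype within each genotype group."""
    return {g: (statistics.fmean(ys), statistics.variance(ys))
            for g, ys in by_genotype.items()}

# Illustrative effect sizes (assumptions), one entry per panel of Figure 1.
SCENARIOS = {
    "a: QTL + GEI": dict(b_qtl=0.5, b_gei=1.0),
    "b: GEI only": dict(b_gei=1.0),
    "c: QTL only": dict(b_qtl=0.5),
    "d: vQTL only": dict(sd_slope=0.5),
}

for name, params in SCENARIOS.items():
    stats = summarise(simulate(n=1000, maf=0.3, seed=1, **params))
    print(name, {g: (round(m, 2), round(v, 2)) for g, (m, v) in stats.items()})
```

Under this model the within-genotype variance is (b_gei*g)^2 + (1 + sd_slope*g)^2, so scenarios a, b and d differ in variance across genotypes while scenario c differs only in mean.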


Figure 2. Evaluation of the statistical methods and phenotype processing strategies for vQTL analysis by simulation. Phenotypes of 10,000 individuals were simulated based on different numbers of SNPs (i.e., 4, 40 or 80), two covariates (i.e., sex and age) and one error term in a multiple-SNP model (Methods). The SNP effects were simulated under four scenarios: 1) effect on neither mean nor variance (nei), 2) effect on mean only (mean), 3) effect on variance only (var), or 4) effect on both mean and variance (both). The error term was generated from five different distributions: normal distribution, t-distribution with df = 10 or 3, or χ² distribution with df = 15 or 1. In panel a, four statistical test methods, i.e., the Bartlett's test (Bart), the Levene's test (Lev), the Fligner-Killeen test (FK) and the DGLM, were used to detect vQTLs. In panel b, the Levene's test was used to analyse phenotypes processed using four strategies, i.e., raw phenotype (raw), raw phenotype adjusted for covariates (adj), rank-based inverse-normal transformation after covariate adjustment (rint), and logarithm transformation after covariate adjustment (log). The FPR or power was calculated as the number of vQTLs with p < 0.05 divided by the total number of tests across 1,000 simulations. The red horizontal line represents an FPR of 0.05.

[Figure 2 graphic. Panels a and b each have three columns (4 SNPs model, 40 SNPs model, 80 SNPs model) and four rows (nei, mean, var, both). The y-axis is FPR for the nei and mean rows and Power for the var and both rows, on a scale from 0.00 to 1.00. The x-axis is the test method in panel a (Bart, Lev, FK, DGLM) and the phenotype processing strategy in panel b (raw, adj, rint, log). Legend, distribution of the residuals: normal distribution; t-distribution (df=10); t-distribution (df=3); chi-squared distribution (df=15); chi-squared distribution (df=1).]
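The FPR and power calculation described in the Figure 2 caption can be sketched for Levene's test using only the Python standard library. The median-centred (Brown-Forsythe) variant, the genotype group sizes, the error SDs and all function names are assumptions made for illustration, not the authors' simulation settings. The closed-form P value uses the fact that the upper tail of an F(2, m) distribution is (1 + 2W/m)^(-m/2), so it applies only when there are exactly three genotype groups.

```python
import random
import statistics

def levene_w(groups):
    """Median-centred Levene (Brown-Forsythe) W statistic."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each observation from its group median.
    z = []
    for g in groups:
        med = statistics.median(g)
        z.append([abs(y - med) for y in g])
    z_means = [statistics.fmean(zi) for zi in z]
    z_grand = sum(map(sum, z)) / n_total
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

def levene_p_three_groups(groups):
    """P value for exactly three groups, from the closed-form F(2, m) tail."""
    if len(groups) != 3:
        raise ValueError("the closed-form tail used here holds only for 3 groups")
    m = sum(len(g) for g in groups) - 3
    return (1.0 + 2.0 * levene_w(groups) / m) ** (-m / 2.0)

def rejection_rate(sd_by_genotype, counts=(98, 84, 18), n_sim=400,
                   alpha=0.05, seed=1):
    """Share of simulations with P < alpha (FPR under equal SDs, else power).

    `counts` are assumed group sizes, roughly Hardy-Weinberg at MAF = 0.3.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        groups = [[rng.gauss(0.0, sd) for _ in range(c)]
                  for sd, c in zip(sd_by_genotype, counts)]
        hits += levene_p_three_groups(groups) < alpha
    return hits / n_sim

fpr = rejection_rate((1.0, 1.0, 1.0))    # no variance effect
power = rejection_rate((1.0, 1.3, 1.6))  # SD increases with allele count
print(f"FPR ~ {fpr:.3f}, power ~ {power:.3f}")
```

Where SciPy is available, `scipy.stats.levene(*groups, center="median")` computes the same statistic and returns a P value for any number of groups.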


Figure 3. Manhattan plots of genome-wide vQTL analysis for 13 traits in the UKB. For each of the 13 traits (see Table 1 for full names of the traits), test statistics (-log10(PvQTL)) of all common (MAF ≥ 0.05) SNPs from the vQTL analysis are plotted against their physical positions. The dashed line represents the genome-wide significance level 1.0×10⁻⁸ and the solid line represents the experiment-wise significance level 2.0×10⁻⁹. For graphical clarity, SNPs with PvQTL < 1×10⁻²⁵ are omitted, SNPs with PvQTL < 2.0×10⁻⁹ are colour-coded in orange, the top vQTL SNP is represented by a diamond, and the remaining SNPs are colour-coded in grey or blue for odd or even chromosome.
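The plotting rules in the Figure 3 caption can be written down as a small helper. This is a sketch only: the function name `manhattan_point` and the colour strings are hypothetical, and the diamond marker for the top vQTL SNP is left out. The thresholds are the ones stated in the caption.

```python
import math

GENOME_WIDE = 1.0e-8      # dashed line in Figure 3
EXPERIMENT_WISE = 2.0e-9  # solid line in Figure 3
OMIT_BELOW = 1.0e-25      # SNPs with smaller P values are left off the plot

def manhattan_point(chrom, p_vqtl):
    """Return (-log10(P), colour) for one SNP, or None if it is omitted."""
    if p_vqtl < OMIT_BELOW:
        return None
    if p_vqtl < EXPERIMENT_WISE:
        colour = "orange"  # experiment-wise significant
    else:
        colour = "grey" if chrom % 2 == 1 else "blue"  # odd vs even chromosome
    return (-math.log10(p_vqtl), colour)

# Heights of the two horizontal reference lines on the -log10(P) axis.
print(-math.log10(GENOME_WIDE), round(-math.log10(EXPERIMENT_WISE), 2))
```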


Figure 4. Regional plots of the FTO locus associated with the phenotypic variability of 5 traits. For each of these 5 traits for which the phenotypic variance is significantly associated with the FTO locus, vQTL test statistics (-log10(PvQTL)) are plotted against SNP positions surrounding the top vQTL SNP (represented by a purple diamond) at the FTO locus. SNPs in different levels of LD with the top vQTL SNP are shown in different colours. The RefSeq genes in the top panel are extracted from the UCSC Genome Browser (URLs).


Figure 5. Manhattan plots of genome-wide vQTL or QTL analysis for waist circumference in the UKB. Test statistics (-log10(PvQTL) or -log10(PQTL)) of all common SNPs from vQTL (a) or QTL (b) analysis are plotted against their physical positions. The dashed line represents the genome-wide significance level 1×10⁻⁸ and the solid line represents the experiment-wise significance level 2.0×10⁻⁹. For graphical clarity, SNPs with PvQTL < 1×10⁻²⁵ or PQTL < 1×10⁻⁷⁵ are omitted, SNPs with P < 2.0×10⁻⁹ are colour-coded in orange, the top vQTL or QTL SNP is represented by a diamond, and the remaining SNPs are colour-coded in grey or blue for vQTL analysis (a) or grey or pink for QTL analysis (b) for odd or even chromosomes.


Figure 6. QTL and vQTL regional plots of the CCDC92 or FTO locus for waist circumference. The QTL and vQTL test statistics (i.e., -log10(P values)) for waist circumference are plotted against SNP positions surrounding the top vQTL SNP at the CCDC92 (panel a) or FTO locus (panel b). The top vQTL SNP is represented by a purple diamond. SNPs in different levels of LD with the top vQTL SNP are shown in different colours. The RefSeq genes in the top panel are extracted from the UCSC Genome Browser (URLs).


Figure 7. Enrichment of GEI effects among the 75 vQTLs in comparison with a random set of QTLs. Five environmental factors, i.e., sex, age, physical activity (PA), sedentary behaviour (SB), and smoking, were used in the GEI analysis. (a) The heat map plot of GEI test statistics (-log10(PGEI)) for the 75 top vQTL SNPs. "*" denotes significant GEI effects after Bonferroni correction (PGEI < 1.33×10⁻⁴ = 0.05/(75×5)). (b) The distribution of the number of significant GEI effects for 75 top QTL SNPs randomly selected from all the top QTL SNPs with 1,000 repeats (mean 1.39 and SD 1.15). The red line represents the number of significant GEI effects for the 75 top vQTL SNPs (i.e., 16).

[Figure 7 graphic. Panel a: heat map of -log10(PGxE) (colour scale 1, 2, 3, 4, >=5) with one row per top vQTL SNP, rows grouped by trait (WC, HC, BMD, BW, BMI, BFP, BMR, WHR, FFR), and one column per environmental factor (Sex, Age, PA, SB, Smoking); "*" marks significant cells. Panel b: histogram with x-axis "The number of top SNPs" (0 to 20) and y-axis "Counts" (0 to 300).

Row labels of panel a (top vQTL SNP and nearest gene), in the order extracted: rs476828 (MC4R), rs1421085 (FTO), rs78719460 (GOLGA3), rs80083564 (BDNF), rs10456362 (ZKSCAN4), rs62033406 (FTO), rs2523625 (HLA-B), rs900399 (CCNL1), rs1128249 (GRB14), rs2820468 (LYPLAL1), rs459193 (C5orf67), rs2238691 (GIPR), rs11152213 (MC4R), rs1421085 (FTO), rs34898535 (STX1B), rs8056890 (ATP2A1), rs10846580 (CCDC92), rs17789506 (KLF14), rs141783576 (RSPO3), rs72891717 (TFAP2B), rs3132947 (GPSM3), rs34158769 (BTN3A2), rs10200566 (ADCY3), rs6751993 (TMEM18), rs62104180 (FAM150B), rs2605098 (LYPLAL1), rs6685593 (OPTC), rs1800437 (GIPR), rs11152213 (MC4R), rs1421085 (FTO), rs34898535 (STX1B), rs8056890 (ATP2A1), rs7133378 (CCDC92), rs12667251 (KLF14), rs987237 (TFAP2B), rs4472337 (UHRF1BP1), rs1062070 (RNF5), rs13198716 (ABT1), rs12507026 (GNPDA2), rs7649970 (PPARG), rs13412194 (TMEM18), rs62104180 (FAM150B), rs10913469 (SEC16B), rs2238691 (GIPR), rs10871777 (MC4R), rs11642015 (FTO), rs12716979 (STX1B), rs4072402 (RABEP2), rs11057413 (ZNF664-FAM101A), rs7132908 (BCDIN3D), rs2049045 (BDNF), rs4132670 (TCF7L2), rs17150703 (MSRA), rs987237 (TFAP2B), rs3132947 (GPSM3), rs34817112 (PRSS16), rs12507026 (GNPDA2), rs10016841 (SLIT2), rs1225053 (CPNE4), rs1641155 (FANCL), rs10203386 (ADCY3), rs6751993 (TMEM18), rs62104180 (FAM150B), rs6689335 (LYPLAL1), rs545608 (SEC16B), rs13322435 (CCNL1), rs603140 (TMEM135), rs10254825 (WNT16), rs4576334 (STARD3NL), rs3020332 (ESR1), rs9371221 (CCDC170), rs1414660 (GREM2), rs56077333 (CHRNA3), rs12374521 (FBXO38), rs6537292 (HHIP).]
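The numbers quoted in the Figure 7 caption can be checked with a few lines of arithmetic. The Bonferroni threshold, the observed count and the null mean and SD come from the caption. The fold enrichment and the z-score are additional illustrative summaries that the caption does not report, and the z-score is only a rough guide because the null distribution of a count is not normal.

```python
# Values stated in the Figure 7 caption.
n_top_snps = 75
n_env_factors = 5
alpha = 0.05
observed = 16                     # significant GEI effects among the 75 top vQTL SNPs
null_mean, null_sd = 1.39, 1.15   # from 1,000 random draws of 75 top QTL SNPs

# Bonferroni threshold over all SNP-by-environment tests.
threshold = alpha / (n_top_snps * n_env_factors)

# Illustrative summaries (not reported in the caption).
fold_enrichment = observed / null_mean
z_score = (observed - null_mean) / null_sd

print(f"threshold = {threshold:.3g}")
print(f"fold enrichment = {fold_enrichment:.1f}, z = {z_score:.1f}")
```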

Table 1. The number of experiment-wise significant vQTLs or QTLs for the 13 UKB traits.

Trait       Description                                     Number of vQTLs   Number of QTLs
HT          Standing height                                 0                 1,063
FVC         Forced vital capacity                           0                 325
FEV1        Forced expiratory volume in 1-second            0                 266
FFR         FEV1 and FVC ratio                              3                 17
BMD         Heel bone mineral density T-score, automated    6                 267
BW          Birth weight                                    1                 57
BMI         Body mass index                                 22                271
WC          Waist circumference                             16                196
HC          Hip circumference                               16                249
WHR         Waist to Hip Ratio                              1                 157
WHRadjBMI   WHR adjusted for BMI                            0                 221
BFP         Body fat percentage                             5                 249
BMR         Basal metabolic rate                            5                 465
Total                                                       75                3,803
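As a consistency check on Table 1, the per-trait counts can be summed and compared with the reported totals. The dictionary below simply transcribes the table; the variable names are hypothetical.

```python
# (number of vQTLs, number of QTLs) per trait, transcribed from Table 1.
TABLE1 = {
    "HT": (0, 1063), "FVC": (0, 325), "FEV1": (0, 266), "FFR": (3, 17),
    "BMD": (6, 267), "BW": (1, 57), "BMI": (22, 271), "WC": (16, 196),
    "HC": (16, 249), "WHR": (1, 157), "WHRadjBMI": (0, 221),
    "BFP": (5, 249), "BMR": (5, 465),
}

total_vqtl = sum(v for v, _ in TABLE1.values())
total_qtl = sum(q for _, q in TABLE1.values())
traits_with_vqtl = [t for t, (v, _) in TABLE1.items() if v > 0]

print(total_vqtl, total_qtl, len(traits_with_vqtl))
```

The sums agree with the Total row (75 vQTLs and 3,803 QTLs), and the number of traits with at least one vQTL matches the 9 traits mentioned in the abstract.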
