GYNECOLOGY
A genome-wide association study of polycystic ovary syndrome identified from electronic health records
Yanfei Zhang, MD, PhD1; Kevin Ho, MD1; Jacob M. Keaton, PhD; Dustin N. Hartzel, BS; Felix Day, PhD; Anne E. Justice, PhD; Navya S. Josyula, MS; Sarah A. Pendergrass, PhD; Ky'Era Actkins, BS; Lea K. Davis, PhD; Digna R. Velez Edwards, PhD; Brody Holohan, PhD; Andrea Ramirez, MD, MS; Ian B. Stanaway, PhD; David R. Crosslin, PhD; Gail P. Jarvik, MD, PhD; Patrick Sleiman, PhD; Hakon Hakonarson, MD, PhD; Marc S. Williams, MD; Ming Ta Michael Lee, PhD
BACKGROUND: Polycystic ovary syndrome is the most common endocrine disorder affecting women of reproductive age. A number of criteria have been developed for clinical diagnosis of polycystic ovary syndrome, with the Rotterdam criteria being the most inclusive. Evidence suggests that polycystic ovary syndrome is significantly heritable, and previous studies have identified genetic variants associated with polycystic ovary syndrome diagnosed using different criteria. The widely adopted electronic health record system provides an opportunity to identify patients with polycystic ovary syndrome using the Rotterdam criteria for genetic studies.
OBJECTIVE: To identify novel associated genetic variants under the same phenotype definition, we extracted polycystic ovary syndrome cases and unaffected controls based on the Rotterdam criteria from the electronic health records and performed a discovery-validation genome-wide association study.
STUDY DESIGN: We developed a polycystic ovary syndrome phenotyping algorithm on the basis of the Rotterdam criteria and applied it to 3 electronic health record-linked biobanks to identify cases and controls for genetic study. In the discovery phase, we performed individual genome-wide association studies in the Geisinger MyCode and the Electronic Medical Records and Genomics cohorts, which were then meta-analyzed. We attempted validation of the significant association loci (P<1×10⁻⁶) in the BioVU cohort. All association analyses used logistic regression, assuming an additive genetic model, and adjusted for principal components to control for population stratification. An inverse-variance fixed-effect model was adopted for meta-analysis. In addition, we examined the top variants to evaluate their associations with each criterion in the phenotyping algorithm. We used the STRING database to characterize the protein-protein interaction network.
RESULTS: Using the same algorithm based on the Rotterdam criteria, we identified 2995 patients with polycystic ovary syndrome and 53,599 population controls in total (2742 cases and 51,438 controls from the discovery phase; 253 cases and 2161 controls in the validation phase). We identified 1 novel genome-wide significant variant, rs17186366 (odds ratio [OR]=1.37 [1.23, 1.54], P=2.8×10⁻⁸), located near SOD2. In addition, 2 loci with suggestive association were identified: rs113168128 (OR=1.72 [1.42, 2.10], P=5.2×10⁻⁸), an intronic variant of ERBB4 that is independent from the previously published variants, and rs144248326 (OR=2.13 [1.52, 2.86], P=8.45×10⁻⁷), a novel intronic variant in WWTR1. In the further association tests of the top 3 single-nucleotide polymorphisms with each criterion in the polycystic ovary syndrome algorithm, we found that rs17186366 (SOD2) was associated with polycystic ovaries and hyperandrogenism, whereas rs113168128 (ERBB4) and rs144248326 (WWTR1) were mainly associated with oligomenorrhea or infertility. We also validated the previously reported association with DENND1A1. Using the STRING database to characterize protein-protein interactions, we found that both ERBB4 and WWTR1 can interact with YAP1, which has been previously associated with polycystic ovary syndrome.
CONCLUSION: Through a discovery-validation genome-wide association study on polycystic ovary syndrome identified from electronic health records using an algorithm based on the Rotterdam criteria, we identified and validated a novel genome-wide significant association with a variant near SOD2. We also identified a novel independent variant within ERBB4 and a suggestive association with WWTR1. With the previously identified polycystic ovary syndrome gene YAP1, the ERBB4-YAP1-WWTR1 network suggests involvement of the epidermal growth factor receptor and the Hippo pathway in the multifactorial etiology of polycystic ovary syndrome.
Key words: EGFR pathway, electronic health record, ERBB4, Hippo pathway, polycystic ovary syndrome, SOD2, WWTR1
Cite this article as: Zhang Y, Ho K, Keaton JM, et al. A genome-wide association study of polycystic ovary syndrome identified from electronic health records. Am J Obstet Gynecol 2020;223:559.e1-21.
0002-9378 © 2020 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). https://doi.org/10.1016/j.ajog.2020.04.004
OCTOBER 2020 American Journal of Obstetrics & Gynecology 559.e1 Original Research ajog.org
Introduction
Polycystic ovary syndrome (PCOS) is the most common endocrine disorder that affects women of reproductive age.1 PCOS is characterized by the dysregulation of the menstrual cycle (oligomenorrhea or amenorrhea), elevated levels of androgenic hormones (hyperandrogenism), and multiple cysts of the ovaries (polycystic ovaries). Other features include hirsutism in a "male" pattern, acne, increased skin pigment sometimes associated with skin tags, and weight gain. The following 3 sets of criteria to identify women with PCOS have been proposed: the National Institutes of Health (NIH) criteria, the Rotterdam criteria, and the Androgen Excess and PCOS Society (AE-PCOS) criteria. The NIH criteria require both hyperandrogenism and oligoovulation or chronic anovulation2; the Rotterdam criteria require at least 2 of 3 phenotypes: hyperandrogenism, oligoovulation or chronic anovulation, and polycystic ovaries3; and the AE-PCOS criteria require hyperandrogenism and oligoovulation or chronic anovulation and/or polycystic ovaries.4 The Rotterdam criteria are more inclusive than the other 2, which increases their sensitivity; thus, the prevalence of PCOS estimated by the
Rotterdam criteria is 15%–20% compared with the 7%–12% generated by the other 2 criteria.5,6
AJOG at a Glance
Why was this study conducted?
As many as half or more of women with polycystic ovary syndrome (PCOS) do not carry a clinical diagnosis of PCOS, thereby leading to a healthcare gap associated with a lack of focused risk screening and disease management. Therefore, algorithms for PCOS phenotyping using electronic records are warranted. Applying the same phenotyping criteria across different systems and/or studies is essential for genomic discovery research. This study aimed to identify novel associated genetic variants under the same phenotype definition.
Key findings
We developed and applied a phenotyping algorithm for PCOS based on the Rotterdam criteria to identify cases and controls from 3 biobank cohorts. We validated the association of ERBB4 and identified novel associations for WWTR1 and SOD2 with PCOS.
What does this add to what is known?
The study demonstrated the application of a phenotyping algorithm developed on the basis of consensus diagnostic criteria in identifying cases and controls for discovery-validation genome-wide association study. The results of this study implicate roles for the epidermal growth factor receptor and the Hippo pathways in the multifactorial etiology of PCOS.
Heritability estimates for PCOS derived from twin studies range between 38% and 71%,7 with a polygenic genetic architecture and complex inheritance pattern.8,9 Earlier large-scale genome-wide association studies (GWASs) have identified 19 loci associated with PCOS in women with European or East Asian ancestries, including ERBB4, YAP1, and DENND1A, which were replicated in both ancestries, providing additional evidence for the polygenic architecture of PCOS.10–16 These studies used different phenotype definitions, including different diagnostic criteria and self-reported information. Shared genetic architecture was identified between different diagnostic criteria.13,16
The healthcare system-based biobanks with genetic data linked to the electronic health record (EHR) data enable new opportunities for genomic discovery research.17 Examples include the Geisinger MyCode Community Health Initiative (MyCode),18 BioVU at Vanderbilt University,19,20 and the Electronic Medical Records and Genomics (eMERGE) Network, a consortium of multiple medical institutions in the United States that link DNA biobanks to EHR data.21 These multidimensional data are important resources for the development of phenotype algorithms, genetic discoveries, and implementation of clinical decision support.22–24 To identify cases and controls for various diseases from the EHR, phenotyping algorithms have been developed.25 Furthermore, such approaches are critical for genetic studies as they integrate data from different EHR systems using the same phenotype definition to reduce case selection bias and heterogeneity among different studies.
In this study, we developed an EHR algorithm for PCOS based on the Rotterdam criteria to identify PCOS cases and unaffected controls in multiple cohorts and performed GWASs to identify genetic variants associated with PCOS.
Materials and Methods
Cohorts
The discovery cohorts were identified from the Geisinger MyCode phases I and II and the eMERGE phase III. All MyCode participants provided written consent allowing their clinical and genomic data to be used for health-related research.18,26 The eMERGE phase III cohort includes 83,717 individuals recruited from 12 study sites with demographics, diagnosis information based on the International Classification of Diseases (ICD) codes, and genotyping data.24 The replication cohort was selected from BioVU, Vanderbilt University's EHR-linked biorepository.19,20 This study was designated as exempt by institutional review boards (IRBs) based on the use of deidentified EHR and genetic data from all sites. We received approval from the Geisinger MyCode Governing Board and IRB, the eMERGE coordinating center, and the BioVU Review Committee and IRB to conduct this genetic study. Because both Geisinger and Vanderbilt are eMERGE sites, participants in MyCode and BioVU who were included in the eMERGE data were excluded from the site-specific analysis to avoid double counting.
Polycystic ovary syndrome electronic health record algorithm based on the Rotterdam criteria
Figure 1 illustrates the sample selection and analytic strategy. The PCOS EHR algorithm was developed on the basis of the 2003 Rotterdam criteria and includes 3 criteria that represent different aspects of PCOS:
1. Polycystic (C1): Having diagnostic codes of PCOS and/or polycystic ovaries
2. Hyperandrogenic (C2): Having diagnostic codes for hyperandrogenism or hyperandrogenism-related clinical signs or hyperandrogenemia determined by calculated free testosterone levels
3. Reproductive (C3): Having diagnosis codes for oligomenorrhea, amenorrhea, infertility, and oligoovulation or anovulation
Supplemental Table 1 provides the detailed inclusion and exclusion ICD codes and laboratory tests for each criterion. PCOS cases were patients who met at least 2 of the 3 criteria with an index age between 18 and 45 years. Controls were those who did not have any components of the 3 criteria and whose current age was older than the median diagnosis age of the cases (38 years in this study). This algorithm was then applied to the MyCode and eMERGE cohorts for the discovery GWASs.
FIGURE 1 Flowchart for study design
The PCOS algorithm was first developed to identify cases and controls from the EHR based on the Rotterdam criteria and then applied to Geisinger patients and the eMERGE cohort. Case-control GWASs were then conducted for the 3 cohorts with genetics data, followed by inverse-variance fixed-effect meta-analyses. Variants with P<1×10⁻⁶ were then validated in BioVU samples using the same phenotype algorithm and genetics analysis protocol for meta-analysis and were queried in summary statistics from the PCOS consortium. Association with each criterion in the PCOS algorithm was further tested for these variants. EHR, electronic health record; GWAS, genome-wide association study; PCOS, polycystic ovary syndrome.
Zhang et al. A genome-wide association study of PCOS identified from EHRs. Am J Obstet Gynecol 2020.
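The case and control rules described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Patient` record, its field names, and the `classify` helper are hypothetical, each criterion is reduced to a single precomputed flag, and the ICD code lists and exclusion codes from Supplemental Table 1 are not modeled.

```python
from dataclasses import dataclass
from typing import Optional

MIN_INDEX_AGE = 18              # lower bound on index age for cases (from the text)
MAX_INDEX_AGE = 45              # upper bound on index age for cases (from the text)
MEDIAN_CASE_DIAGNOSIS_AGE = 38  # median diagnosis age of cases reported in the text


@dataclass
class Patient:
    """Hypothetical per-patient summary; field names are illustrative only."""
    polycystic: bool            # C1 flag
    hyperandrogenic: bool       # C2 flag
    reproductive: bool          # C3 flag
    index_age: Optional[float]  # assumed: age at the qualifying record, None if no criterion met
    current_age: float


def classify(p: Patient) -> str:
    """Return 'case', 'control', or 'excluded' following the 2-of-3 rule."""
    n_met = sum([p.polycystic, p.hyperandrogenic, p.reproductive])
    if (
        n_met >= 2
        and p.index_age is not None
        and MIN_INDEX_AGE <= p.index_age <= MAX_INDEX_AGE
    ):
        return "case"
    # Controls have none of the 3 components and are older than the
    # median diagnosis age of the cases.
    if n_met == 0 and p.current_age > MEDIAN_CASE_DIAGNOSIS_AGE:
        return "control"
    return "excluded"
```

Patients who meet exactly 1 criterion, or who meet none but are not older than the median case diagnosis age, fall into neither group in this sketch, which mirrors the conservative control definition in the text.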
Discovery genome-wide association studies and meta-analyses
MyCode phase I and II samples were genotyped and imputed to HRC.r1-1 EUR reference genome (GRCh37 build) separately using the Michigan Imputation Server as previously described.27 Variants with imputation info score >0.7 were included for the analyses. eMERGE samples were genotyped at each study site and imputed to HRC.r1-1 EUR reference genome in multiple batches using the Michigan Imputation Server. Data were processed centrally and harmonized as previously described.24 Variants with average info score >0.3 were included. Samples with a genotyping rate below 95% were excluded. Single-nucleotide polymorphisms (SNPs) with a <99% call rate, a minor allele frequency (MAF) of <1%, or a significant deviation from the Hardy-Weinberg equilibrium (P<1×10⁻⁷) were removed from analyses. One individual from each pair with first- or second-degree relatedness was removed. As a result, there were 7,595,111 SNPs, 6,747,339 SNPs, and 5,648,769 SNPs from MyCode phase I (1141 cases and 18,788 controls), MyCode phase II (594 cases and 9024 controls), and eMERGE III (1007 cases and 23,626 controls), respectively, included for GWASs.
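As a rough illustration of the variant-level filters listed above, the following Python sketch applies the stated thresholds to precomputed summary statistics. The `VariantStats` record and `passes_qc` function are hypothetical names introduced here; the study itself used PLINK and centrally processed data, not this code.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Hypothetical per-variant summary; not a PLINK data structure."""
    call_rate: float  # fraction of samples with a genotype call
    maf: float        # minor allele frequency
    hwe_p: float      # Hardy-Weinberg equilibrium test P value
    info: float       # imputation info score


def passes_qc(v: VariantStats, min_info: float) -> bool:
    """Keep a variant only if it clears every threshold given in the text.

    min_info is cohort specific: 0.7 for MyCode, 0.3 (average info) for eMERGE.
    """
    return (
        v.info > min_info
        and v.call_rate >= 0.99  # remove SNPs with <99% call rate
        and v.maf >= 0.01        # remove SNPs with MAF <1%
        and v.hwe_p >= 1e-7      # remove SNPs with HWE P<1e-7
    )
```

For example, a variant with an info score of 0.5 would be dropped under the MyCode threshold but retained under the eMERGE threshold, provided it passes the other filters.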
We used fixed-effects logistic regression, assuming an additive genetic model, adjusted for index age and the first 6 principal components (PCs) to account for population stratification for the MyCode phase I and II cohorts; in addition, we adjusted for study site in the eMERGE III cohort. EasyQC (University of Regensburg, Regensburg, Germany)28 was employed to harmonize the alleles and data format for GWAS summary statistics from discovery studies before performing an inverse-variance-weighted fixed-effect meta-analysis using the METAL (University of Michigan, Ann Arbor, MI) tool.29 PLINK 1.9 (https://www.cog-genomics.org/plink2)30 was used to calculate PCs, to estimate the relatedness, and to perform GWAS. P<5×10⁻⁸ was defined as genome-wide significant.
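For readers unfamiliar with the inverse-variance fixed-effect model, the sketch below shows the standard textbook estimator for a single variant: each study's log odds ratio is weighted by the inverse of its squared standard error. It is not METAL's implementation, and the function name `fixed_effect_meta` and its inputs are assumptions made for illustration.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study log odds ratios with inverse-variance weights.

    betas: per-study effect estimates (log OR) for one variant
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, z score, two-sided P value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided P value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Made-up inputs for 2 equally precise studies, not data from the paper.
beta, se, z, p = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
pooled_or = math.exp(beta)  # back-transform to the odds ratio scale
```

With equal standard errors the pooled estimate is the simple average of the two effects, and the pooled standard error shrinks by a factor of the square root of 2.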
Replication for the top variants
Top associated variants with P<1×10⁻⁶ from the discovery meta-analysis were further evaluated in an independent PCOS cohort identified on the basis of the same algorithm from BioVU. We identified 253 cases and 2161 controls. Genotypes were generated using the Illumina Infinium Expanded Multi-Ethnic Genotyping Array. The same imputation, quality control measures, and association protocols were applied for the replication study. We also queried the summary statistics of the meta-analyses from the PCOS consortium (without the 23andMe data) for the associations of these top variants.16 The significance level for the replication study was defined at P<.0125 (0.05/4), with the same direction of effect as in the discovery GWAS.
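The replication rule stated above combines a Bonferroni-style threshold of P<.0125 with a direction-of-effect check. A minimal sketch follows; the `replicates` function and its parameter names are hypothetical, and the divisor of 4 is taken from the text without assuming what the 4 tests were.

```python
def replicates(beta_discovery, beta_replication, p_replication,
               alpha=0.05, n_tests=4):
    """Return True if a variant meets the replication criteria in the text.

    A variant replicates when its replication P value is below alpha / n_tests
    (0.05 / 4 = 0.0125) and its effect has the same sign as in discovery.
    """
    same_direction = beta_discovery * beta_replication > 0
    return same_direction and p_replication < alpha / n_tests
```

A variant with a very small replication P value but an opposite-sign effect would therefore not count as replicated.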
Power calculation
We evaluated the power for our study conservatively, assuming a significance level of P<5×10⁻⁸ for GWAS and a PCOS prevalence of 8%. The existing PCOS case number of 2995 gives 80% power to identify an associated variant with a MAF of 1% and an odds ratio (OR) of >2.01, a MAF of 2% and an OR of >1.68, or a MAF of >8% and an OR of >1.34.
Functional genomics exploration
The Variant Effect Predictor was used for variant annotation.31 The Functional Mapping and Annotation was used in the default setting to generate independent loci and associated pathways.32 In addition, Open Targets Genetics, an online portal, was used to query the associated genes, phenome-wide association studies (PheWASs), and the expression quantitative trait loci (eQTLs) of the top associated variants.33 The PheWAS data in Open Targets Genetics included the results from the UK Biobank GWAS and the GWAS catalog. The STRING database was used to characterize the protein-protein interaction (PPI) network.
Results
Identification and characterization of polycystic ovary syndrome cases and controls from electronic health record data
Supplemental Figure 1 illustrates the details of sample ascertainment using the Rotterdam criteria-based algorithm in the MyCode and eMERGE cohorts. Only non-Hispanic whites were included in the MyCode and BioVU samples; all races were included in the eMERGE samples, wherein 75% were of European American, 17% were of African American, and 8% were other race and/or ethnicity (Supplemental Table 2). Table 1 summarizes the numbers and characteristics of the identified cases and controls. The proportion of patients with polycystic ovaries was approximately 40% of the PCOS cases identified in the eMERGE and BioVU data, whereas this number was more than 88% in the MyCode phase I and II data. The MyCode cohorts have lower hyperandrogenic features than those of eMERGE and BioVU cohorts. More than 90% of the patients had reproductive issues in all 3 cohorts. Cases showed higher body mass index than controls in the MyCode and BioVU cohorts but not in the eMERGE cohort.
TABLE 1
Characteristic | MyCode phase I, case | MyCode phase I, control | MyCode phase II, case | MyCode phase II, control | eMERGE phase III, case | eMERGE phase III, control | BioVU, case | BioVU, control
Number | 1141 | 18,788 | 594 | 9024 | 1007 | 23,626 | 253 | 2161
Polycystic (C1) | 1011 (88.6%) | — | 528 (88.9%) | — | 390 (38.7%) | — | 107 (42.1%) | —
Hyperandrogenic (C2) | 773 (67.8%) | — | 385 (64.8%) | — | 841 (83.5%) | — | 209 (82.3%) | —
Reproductive (C3) | 1120 (98.2%) | — | 583 (98.1%) | — | 945 (93.8%) | — | 240 (94.5%) | —
Age, mean (SD) | 39.5 (8.4) | 64.8 (13.5) | 37.7 (8.7) | 60.9 (12.4) | 33.1 (7.0) | 70.7 (16.1) | 35.4 (7.2) | 41.4 (2.3)
BMI, mean (SD) | 35.0 (9.8) | 31.1 (7.9) | 34.3 (9.5) | 31.0 (7.6) | 30.4 (8.4) | 30.0 (7.1) | 29.5 (8.8) | 28.2 (8.1)
BMI, body mass index; PCOS, polycystic ovary syndrome.
Discovery and replication of the genetic variants associated with the risk of polycystic ovary syndrome
Twenty independent loci were identified with P<1×10⁻⁵ in the discovery meta-analysis of the MyCode and eMERGE cohorts (Supplemental Table 3). Manhattan plots for the meta-analysis and the 3 discovery studies are shown in Figure 2, A, and Supplemental Figure 2. Thirteen variants with P<1×10⁻⁶ were then examined in the BioVU cohort using the same algorithm. Eleven variants passed the quality control and were included in the analysis. The results are shown in Supplemental Table 4. Figure 2, B, lists the association of the top 3 independent SNPs in the discovery and replication cohorts. rs17186366, a novel association located in the promoter flanking region near SOD2 and LOC101929142, reached genome-wide significance (OR=1.37 [1.23, 1.54], P=2.8×10⁻⁸) in the combined meta-analyses of discovery and replication. We also identified an intronic variant of ERBB4, rs113168128, with near-genome-wide significance (OR=1.72 [1.42, 2.10], P=5.2×10⁻⁸). This SNP is independent from the previously reported ERBB4 variant rs2178575 (r²=0.001).16 A low-frequency intronic variant of WWTR1, rs144248326 (MAF=1.01%), was identified and approached genome-wide significance (OR=2.13 [1.52, 2.86], P=8.45×10⁻⁷). The regional association plots for these 3 loci are shown in Supplemental Figure 3. We also examined the top associations for African Americans in the eMERGE datasets. Only rs113168128 in ERBB4 passed the standard quality check. In this group, the variant has a higher MAF and a slightly smaller OR, with nominal significance (MAF=0.063, OR=1.64, P=.0106).
We did not observe significant associations of these variants in the PCOS consortium meta-analyses.16 We also examined the associations of previously reported PCOS loci in our meta-analyses result. Only variants in DENND1A1 (rs9696009, rs10818854, rs10986105) were replicated with the same direction and similar effect size (P<.05; Supplemental Table 5).
The functional genomics exploration found rs17186366 near SOD2 associated with menarche (P=6.6×10⁻⁵) and rs113168128 in ERBB4 associated with depressed affect (P=2.1×10⁻⁵)34 (Supplemental Table 6). All the top 3 SNPs were found to be associated with phenotypes related to the nervous system or to mental or behavioral disorders (Supplemental Table 6). None of these SNPs were found to be eQTLs in any tissue. The PPI network shows that both ERBB4 and WWTR1 interact with YAP1 (Figure 2, C), which is also associated with PCOS in both European and Han Chinese populations.12,13,16 It further illustrates that YAP1, TAZ, LATS1, TEAD1, and TEAD4 interact within a network, highlighting the Hippo pathway.
Association of the top 3 single-nucleotide polymorphisms with each polycystic ovary syndrome criterion
Table 2 summarizes the association results in the discovery cohorts for the top 3 variants with each of the 3 criteria in the PCOS algorithm, which represent the different aspects of PCOS based on the Rotterdam criteria. rs17186366 was strongly associated with the polycystic and hyperandrogenic traits, whereas the other 2 SNPs in ERBB4 and WWTR1 were mainly associated with the reproductive trait, as indicated by the more significant P values and larger effect sizes observed for these variants with the corresponding traits.
Discussion
Principal findings
In this study, we developed an EHR algorithm based on the Rotterdam criteria for PCOS and identified cases and controls from 3 biobank cohorts. Through a discovery-validation GWAS, we identified a novel genome-wide significant association with rs17186366 near SOD2. We validated the association of previously reported genes ERBB4 and DENND1A1, with rs113168128 being…