ARTICLE

Development and evaluation of polygenic risk scores for prediction of endometrial cancer risk in European women

Cemsel Bafligil1,2, Deborah J. Thompson3, Artitaya Lophatananon4, Neil A.J. Ryan1,5, Miriam J. Smith2, Joe Dennis3, Krisztina Mekli4, Tracy A. O’Mara6, D. Gareth Evans2,7, Emma J. Crosbie1,5,*

ARTICLE INFO

Article history: Received 9 December 2021; Received in revised form 24 May 2022; Accepted 25 May 2022; Available online 15 June 2022

Keywords: Endometrial cancer; Genetic predisposition; Polygenic risk score; Prevention; Single-nucleotide variations (SNVs)

ABSTRACT

Purpose: Single-nucleotide variations (SNVs) (formerly single-nucleotide polymorphism [SNV]) influence genetic predisposition to endometrial cancer. We hypothesized that a polygenic risk score (PRS) comprising multiple SNVs may improve endometrial cancer risk prediction for targeted screening and prevention.

Methods: We developed PRSs from SNVs identified from a systematic review of published studies and suggestive SNVs from the Endometrial Cancer Association Consortium. These were tested in an independent study of 555 surgically-confirmed endometrial cancer cases and 1202 geographically-matched controls from Manchester, United Kingdom and validated in 1676 cases and 116,960 controls from the UK Biobank (UKBB).

Results: Age and body mass index predicted endometrial cancer in both data sets (Manchester: area under the receiver operator curve [AUC] = 0.77, 95% CI = 0.74-0.80; UKBB: AUC = 0.74, 95% CI = 0.73-0.75). The AUC for PRS19, PRS24, and PRS72 were 0.58, 0.55, and 0.57 in the Manchester study and 0.56, 0.54, and 0.54 in UKBB, respectively. For PRS19, women in the third tertile had a 2.1-fold increased risk of endometrial cancer compared with those in the first tertile of the Manchester study (odds ratio = 2.08, 95% CI = 1.61-2.68, Ptrend = 5.75E–9). Combining PRS19 with age and body mass index improved discriminatory power (Manchester study: AUC = 0.79, 95% CI = 0.76-0.82; UKBB: AUC = 0.75, 95% CI = 0.73-0.76).

Conclusion: An endometrial cancer risk prediction model incorporating a PRS derived from multiple SNVs may help stratify women for screening and prevention strategies.

© 2022 The Authors. Published by Elsevier Inc. on behalf of American College of Medical Genetics and Genomics. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

D. Gareth Evans and Emma J. Crosbie contributed equally.
*Correspondence and requests for materials should be addressed to Emma J. Crosbie, Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, 5th Floor – Research, St Mary’s Hospital, Oxford Road, Manchester M13 9WL, United Kingdom. E-mail address: [email protected]
Affiliations are at the end of the document.

Genetics in Medicine (2022) 24, 1847-1856
www.journals.elsevier.com/genetics-in-medicine
doi: https://doi.org/10.1016/j.gim.2022.05.014
1098-3600/© 2022 The Authors. Published by Elsevier Inc. on behalf of American College of Medical Genetics and Genomics. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Introduction
Endometrial cancer is the most common gynecological malignancy in the United Kingdom, with incidence and death rates rising steadily.1 Over the past 30 years, the incidence of endometrial cancer has increased by >50%.2 Despite an overall 5-year survival rate approaching 80%, there is a marked discrepancy in survival between women diagnosed with early-stage (>90%) and late-stage (16%) disease.3,4 Surgical treatment may be hazardous, particularly in elderly and obese women, and removes the opportunity for childbearing in younger women. There is, therefore, an urgent need for effective early detection and prevention strategies to tackle this growing disease burden.5,6
Identifying women at greatest risk of endometrial cancer will maximize the benefits and minimize the harms of targeted screening and prevention interventions. Although excess adiposity is a major risk factor for endometrial cancer,7 body mass index (BMI) alone lacks the necessary precision for accurate risk prediction; not all tumors are obesity-driven8 and women with class III obesity (BMI > 40 kg/m²) have a relatively modest lifetime risk of endometrial cancer of 10% to 15%,9 compared with a population average in the United Kingdom of 3%.10 It is clear that endometrial cancer risk is determined by both environmental and genetic influences.11 The strongest genetic factors are pathogenic variants affecting the mismatch repair genes, which cause Lynch syndrome and a 40% to 60% lifetime risk of endometrial cancer.12,13 Rare pathogenic variants in other DNA repair-related genes, including PTEN, POLE, and possibly BRCA1, are also associated with increased risk.14,15 Even when no specific genetic variant is identified, women with a first-degree relative with endometrial cancer have a 2-fold higher risk of the disease.16 Indeed, it has been estimated that approximately 28% of the familial relative risk of endometrial cancer is attributable to common single-nucleotide variations (SNVs).17 Genome-wide association studies (GWAS), including those by members of our group, have identified 16 common endometrial cancer susceptibility regions.17-19 Although the risk alleles for each SNV influence endometrial cancer risk by a small amount (9%-40% per allele), a polygenic risk score (PRS) (ie, the total number of risk alleles harbored by an individual across all SNVs, weighted by each SNV’s risk estimate) could be used to distinguish women at highest and lowest genetic risk of endometrial cancer. The potential of a PRS-based risk prediction model to rationalize screening algorithms and eligibility for chemoprevention strategies is already established in breast cancer.20
The aim of this study was to develop and test a PRS for endometrial cancer risk prediction in European-descent populations, using SNVs identified from the Endometrial Cancer Association Consortium (ECAC) GWAS,17 and a systematic review of literature.18
Materials and Methods
Study populations
Women treated for endometrial cancer at Manchester University NHS Foundation Trust, who donated clinicopathological data and a blood sample for future research, were the Manchester cases.21 Baseline clinical data included age, ethnicity, BMI (kg/m²), histological subtype (endometrioid, serous, clear cell, carcinosarcoma), FIGO (2009) stage, and grade. All pathology specimens were reviewed by at least 2 specialist gynecological pathologists using confirmatory immunohistochemistry as necessary. We excluded cases in which final pathology review indicated cancer of nonendometrial origin. Women of European descent participating in a local general population breast cancer screening study with (1) no personal history of endometrial cancer, (2) no endometrial cancer diagnosis during follow-up (median 8.9 years, interquartile range = 8.1-9.6; median age at censor 68 years, interquartile range = 62-73), and (3) an intact uterus were the Manchester controls.22 Ethics protocols were sponsored by the University of Manchester, approved by research ethics committees (Supplemental Table 1), and conducted in accordance with Good Clinical Practice guidelines.

Details of the UK Biobank (UKBB) study were published previously.23 In brief, the UKBB is an openly accessible prospective study that holds extensive genetic and phenotypic data on nearly half a million participants recruited from across the United Kingdom.23 Endometrial cancer cases were selected using International Classification of Diseases 10 (C55 and C54), International Classification of Diseases 9 (179 and 182), and self-reported (1040) codes. This included 2478 endometrial cancer cases, most of whom were confirmed by the cancer registry. Of 217,422 female controls, we included the 179,414 with an intact uterus and no prior cancer diagnosis (except skin) who were alive at the time of study censor. The UKBB study was approved by the North West Multi-centre Research Ethics Committee (11/NW/0382). Details of sample collection are described elsewhere.23 For both Manchester and UKBB studies, we used age and BMI recorded at enrolment.
Genotyping
Genomic DNA was extracted from peripheral blood for the Manchester cases and from saliva for the Manchester controls. DNA from case specimens was extracted using either the Nucleon extraction kit (catalog number SL8502, Gen-Probe Life Sciences Ltd) or the Gentra Puregene Blood Kit (catalog number 158389, Qiagen). DNA extracts were resuspended in standard tris(hydroxymethyl)aminomethane-ethylenediamine tetraacetic acid buffer and quantified using a NanoDrop Spectrophotometer (ThermoFisher). Samples with very high or low concentrations were treated with Genomic DNA Clean and Concentrator-10 (catalog number D4011, Zymo Research) according to the manufacturer’s protocol to normalize the concentrations to approximately 100 ng/μl. DNA from the controls was collected and extracted from saliva using the Oragene kit (DNA Genotek Inc) according to the manufacturer’s protocols. All samples were stored at –80 °C until genotyping. A volume of 10 μl of DNA per sample was then dispensed into full-skirted Abgene 96-well SuperPlates (catalog number AB-2800, Thermo Scientific) before genotyping. The genotyping platforms used in this study are summarized in Supplemental Table 2. Manchester samples were genotyped using OncoArray 534K, custom designed by the OncoArray Consortium to include approximately 250,000 GWAS backbone SNVs and approximately 250,000 SNVs with previously known associations to 5 common cancers (breast, ovarian, prostate, colorectal, and lung).24 The genotyping process for UKBB was published previously. In brief, UKBB samples were genotyped using the Affymetrix UK BiLEVE Axiom array and the UK Biobank Axiom Array (see Supplemental Materials and Methods).
Manchester study genotyping and quality control
Genotype calling was performed at the University of Cambridge for both Manchester cases and controls. Genotype analyses were carried out using the GenABEL package in R version 3.6.0 according to OncoArray Consortium guidelines.24 SNV-wise quality control (QC) was conducted to exclude SNVs with a call rate <95% and those deviating from Hardy-Weinberg equilibrium (HWE) (P < 10E–7 in controls and P < 10E–12 in cases). Sample-wise QC was conducted to exclude samples with a call rate <95%; samples with low or high heterozygosity; and genetically XO, XXY, and XY individuals. One sample from each pair of duplicates, monozygotic twins, or first-degree relatives was excluded using genomic kinship matrices. Only women of European ancestry were included because of the small number of participants who were not of European ancestry. All remaining samples were imputed using v3 of the 1000 Genomes Project reference panel.25 Samples were phased using SHAPEITv2 and genotypes were imputed using IMPUTEv2 for nonoverlapping 5-megabase intervals.
SNVs with linkage disequilibrium (LD) r² ≥ 0.2 were excluded from the PRS analyses. Three suggestively significant SNVs (rs9460655, rs113945442, and rs2305252) failed QC in the UKBB data set and were excluded from the PRSs in both studies (Supplemental Materials and Methods).
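The SNV-wise QC filters described in this section (call rate and HWE) can be illustrated with a minimal Python sketch. It is not the GenABEL workflow used in the study: the function names are hypothetical, genotypes are assumed to be coded as 0, 1, or 2 copies of an allele with None for a missing call, HWE is assessed with a simple 1-degree-of-freedom chi-square test rather than whichever test GenABEL applies, and the default P value threshold is one assumed reading of the "10E–7" quoted for controls.

```python
import math

def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call (None = missing)."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test against Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

def passes_snv_qc(genotypes, min_call_rate=0.95, hwe_p_min=1e-7):
    """Keep an SNV only if it meets the call-rate and HWE thresholds.

    hwe_p_min is an assumed numeric reading of the control threshold in the text.
    """
    if call_rate(genotypes) < min_call_rate:
        return False
    called = [g for g in genotypes if g is not None]
    counts = [called.count(k) for k in (0, 1, 2)]
    return hwe_chi2_pvalue(*counts) >= hwe_p_min
```

In practice the HWE filter would be applied separately to controls and cases with their respective thresholds; the sketch applies a single threshold to whatever genotype list it is given.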
Development of the extended PRS
SNVs used in the PRS were identified through 2 approaches: (1) a systematic review of the literature18 and (2) independent suggestive SNVs from the ECAC GWAS (excluding UKBB-derived ECAC samples). A suggestive SNV was defined as any SNV with P < 1E–5 that did not reach genome-wide significance (P < 5E–8). Per-allele log odds ratios (ORs) (betas) were obtained from the discovery studies and from the ECAC GWAS (excluding UKBB samples) (Supplemental Table 3). Dosages for the whole SNV panel were obtained from imputation output files.
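As an illustration of the selection rule and effect-size scale described above, the following Python sketch classifies an SNV by its association P value and converts a per-allele OR to a log OR (beta). The function names and category labels are hypothetical; only the two thresholds come from the text.

```python
import math

GENOME_WIDE_P = 5e-8   # genome-wide significance threshold quoted in the text
SUGGESTIVE_P = 1e-5    # suggestive threshold quoted in the text

def classify_snv(p_value):
    """Label an SNV by its GWAS P value (labels are illustrative only)."""
    if p_value < GENOME_WIDE_P:
        return "genome-wide significant"
    if p_value < SUGGESTIVE_P:
        return "suggestive"
    return "not selected"

def per_allele_beta(odds_ratio):
    """Per-allele weight used in a PRS: the natural log of the per-allele OR."""
    return math.log(odds_ratio)
```

For example, a per-allele OR of 1.40, the upper end of the range mentioned in the Introduction, corresponds to a beta of about 0.34.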
The PRS was derived using the following formula:

$$\mathrm{PRS} = \beta_1 \chi_1 + \beta_2 \chi_2 + \dots + \beta_k \chi_k$$

in which $\beta_i$ is the per-allele log OR and $\chi_i$ is the allele dosage for SNV $i$ of $k$ independent SNVs. The PRS was standardized to have a mean of 0 and SD of 1. The PRS was fitted as a continuous variable in a logistic regression to calculate the area under the receiver operator curve (AUC) for assessment of the goodness of fit. We then divided participants into tertiles on the basis of the PRS distribution in controls and calculated ORs for the second (33.3 < x < 66.6) and third (x > 66.6) tertiles in comparison with the first tertile (x < 33.3), where x is the PRS percentile. We further investigated the OR comparing women at the 99th and 95th percentiles with women at or below the 50th percentile of the PRS. For this, percentile cutoff points were defined by obtaining the PRS values at the specified thresholds in control women, and all participants were assigned to a group on the basis of their PRS. In total, 3 PRSs were developed, 2 of which were based on our systematic review. PRS19 comprised 19 genome-wide significant SNVs (including the AKT1 variant) from the systematic review,18 18 of which were identified by the latest ECAC GWAS;17 PRS24 comprised all 24 SNVs identified by the systematic review; and PRS72 included PRS24 and 48 suggestive SNVs from the ECAC GWAS.
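The scoring steps described above (weighted sum, standardization, and tertile assignment from the control distribution) can be sketched in Python as follows. This is an illustrative sketch rather than the study code, which was written in R. The function names are hypothetical, the sample SD is assumed for standardization (the text does not state which set of participants or which SD estimator was used), and the tertile cutoffs rely on the default interpolation of `statistics.quantiles`.

```python
import statistics

def raw_prs(betas, dosages):
    """Weighted sum of allele dosages: beta_1*x_1 + ... + beta_k*x_k."""
    return sum(b * x for b, x in zip(betas, dosages))

def standardize(scores):
    """Rescale scores to mean 0 and SD 1 (sample SD; an assumption)."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [(s - mean) / sd for s in scores]

def tertile_cutoffs(control_scores):
    """Cut points at the 33.3rd and 66.6th percentiles of the control PRS."""
    lower, upper = statistics.quantiles(control_scores, n=3)
    return lower, upper

def assign_tertile(score, cutoffs):
    """Assign any participant (case or control) to tertile 1, 2, or 3."""
    lower, upper = cutoffs
    if score < lower:
        return 1
    if score < upper:
        return 2
    return 3
```

Deriving the cutoffs from controls only, and then applying them to everyone, keeps the reference distribution close to that of the general population, which is the design described in the text.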
Descriptive statistics were calculated within and between the Manchester and UKBB studies (Supplemental Results). The effect of BMI on PRS was assessed by comparing the means of controls in different BMI groups using 1-way analysis of variance and Tukey tests. In both studies, the effect of PRS in addition to BMI and/or age was assessed using logistic regression analyses. A linear regression model with PRS and BMI was fitted to control subjects at 99th and 50th percentiles to assess the gradient risk explained by these factors in the general population. The statistical analyses were conducted in R (3.6.0). All tests were 2-sided and P < .05 was accepted as statistically significant.
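For readers less familiar with the AUC, the following Python sketch computes it directly as the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. It is a simplified stand-in for the study's analysis, which derived AUCs from logistic regression models in R; for a single predictor with a positive coefficient the two agree, because the logistic transformation preserves the ranking of scores. The function name is hypothetical.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve by pairwise comparison of cases and controls."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0        # case correctly ranked above control
            elif case == control:
                wins += 0.5        # ties contribute one half
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, so a rank-based implementation would be preferable for a data set the size of UKBB.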
Results
Characteristics of the study population
In total, 21 women of non-European ancestry were excluded from the Manchester data set, all of whom had endometrial cancer. The final Manchester study data set included 555 cases and 1202 female controls, with a median age and BMI of 64 years and 31 kg/m² for cases and 59 years and 25 kg/m² for controls (Table 1). Most cases had low grade (64.5%), early stage (84.3%) endometrial cancer of endometrioid histological subtype (75.3%), consistent with national figures.26 The UKBB validation data set comprised 1676 cases and 116,960 controls. The corresponding values for median age and BMI for UKBB were 61 years and 28 kg/m² and 56 years and 26 kg/m² for cases and controls, respectively.

Table 1 Baseline characteristics for the women included in the Manchester and UK Biobank studies

Characteristic  Category         Manchester cases (n)  Manchester controls (n)  UKBB cases (n)  UKBB controls (n)
BMI, kg/m²      <18.5            6                     11                       7               897
                18.5 to <25      100                   489                      425             48537
                25 to <30        120                   405                      558             42047
                30 to <35        116                   157                      354             16906
                35 to <40        68                    56                       179             5728
                ≥40              91                    24                       142             2543
                Unknown          54                    60                       11              302
Age, y          Mean             63.23                 59                       61              55
                <50              66                    85                       89              32239
                50 to <60        128                   518                      451             41607
                60 to <70        160                   530                      1118            42751
                ≥70              180                   69                       18              363
                Unknown          21                    0                        0               0
Tumor grade     1                211
                2                109
                3                176
                Unknown          59
Stage           I                364
                II               55
                III              69
                IV               9
                Unknown          58
Subtype         Endometrioid     415
                Nonendometrioid  136

BMI, body mass index; UKBB, UK Biobank.
PRS
A PRS was developed in the Manchester data set using genome-wide significant SNVs (PRS19) as the base score and then expanded by adding 5 additional SNVs identified from the literature through our systematic review (PRS24) and 48 suggestive SNVs from the ECAC GWAS (PRS72) (Supplemental Table 3). The base PRS (PRS19)18 achieved an AUC of 0.58 (95% CI = 0.56-0.61), whereas the full systematic review SNV panel (PRS24) achieved an AUC of 0.55 (95% CI = 0.52-0.58) in the Manchester data set (Figure 1). The PRS calculated using all 72 SNVs (PRS72) achieved an AUC of 0.57 (95% CI = 0.54-0.60).

The discriminatory performance of the 3 SNV panels was then tested in the UKBB data set (Table 2, Figure 2). The AUCs obtained for PRS19 and PRS24 in UKBB were 0.56 (95% CI = 0.54-0.57) and 0.53 (95% CI = 0.52-0.54), respectively. In this data set, PRS72 achieved an AUC of 0.54 (95% CI = 0.52-0.55).
In both data sets, when the PRS was divided into tertiles to assess predictive value, there was a statistically significant increase in the risk of endometrial cancer in the third vs first tertile. All PRS combinations in the Manchester study showed that women in the third tertile had a substantially increased risk of endometrial cancer compared with those in the first tertile (Table 2). This finding was replicated in women in the second tertile compared with those in the first, although to a lesser extent. In UKBB, although the discriminatory performance of the PRSs was low, a modestly increased risk was seen in women in the third vs first tertile for all PRSs, and the Ptrend for the difference in risk between tertiles was significant for all combinations (Table 2). Women in the top 1% of the PRS19 and PRS24 distributions had a 2- to 3-fold higher risk of endometrial cancer than those in the bottom 50% in both Manchester and UKBB data sets (Table 2).
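A group comparison of this kind (for example, third vs first tertile) can be illustrated with an unadjusted OR and a Wald-type 95% CI computed from a 2 × 2 table of counts. The Python sketch below is a simplification under stated assumptions: the study reports its ORs in Table 2 and the text does not say they were computed this way, the function name is hypothetical, and the counts in the usage note are invented for illustration and are not study data.

```python
import math

def odds_ratio_ci(cases_high, controls_high, cases_ref, controls_ref, z=1.96):
    """Unadjusted OR for a high-PRS group vs a reference group, with 95% CI.

    Uses the standard error of the log OR: sqrt(1/a + 1/b + 1/c + 1/d).
    """
    odds_ratio = (cases_high * controls_ref) / (controls_high * cases_ref)
    se_log_or = math.sqrt(
        1 / cases_high + 1 / controls_high + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With hypothetical counts of 20 cases and 10 controls in the high group and 10 cases and 20 controls in the reference group, `odds_ratio_ci(20, 10, 10, 20)` gives an OR of 4.0 with a CI that spans it.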
BMI and age, alone or combined, were moderately to highly effective in predicting the odds of endometrial cancer (Manchester: AUC, 95% CI for BMI: 0.71, 0.68-0.74; age: 0.64, 0.61-0.67; BMI + age: 0.77, 0.74-0.80; UKBB: AUC, 95% CI for BMI: 0.64, 0.62-0.65; age: 0.70, 0.69-0.72; BMI + age: 0.74, 0.73-0.75). Using PRS19 alongside age and BMI for endometrial cancer prediction resulted in an approximately 2% increase in AUC in both the Manchester (AUC = 0.79, 95% CI = 0.76-0.82) and UKBB data sets (AUC = 0.75, 95% CI = 0.73-0.76). Within the Manchester data set, the risk gradients for each PRS combined with BMI corresponded to ORs of 1.04, 1.07, and 1.05 when comparing control subjects at the 99th percentile with those at the 50th percentile using PRS19, PRS24, and PRS72, respectively. In UKBB, the model showed no effect, with an OR of 1.00 for each PRS tested.

Figure 1 Polygenic risk scores for prediction of endometrial cancer risk. PRS performances in Manchester study (A) and UKBB (B) cases and controls. PRS distribution of PRS19, PRS24, and PRS72 (A and B). Dashed lines indicate median PRS for cases and controls. PRS, polygenic risk score; UKBB, UK Biobank.
Finally, to investigate whether the PRS performs equally well in the 2 histological subtypes of endometrial cancer, we applied PRS19 separately to the endometrioid and nonendometrioid cases vs controls within the Manchester data set (Figure 3). The PRS performed similarly in both endometrioid (AUC = 0.59, 95% CI = 0.56-0.62) and nonendometrioid (AUC = 0.57, 95% CI = 0.52-0.62) cancers.
Discussion
In this study, we described the development and validation of an endometrial cancer PRS in cases and controls of White European descent. We showed that genetic predisposition to endometrial cancer can be calculated from multiple SNVs that are found to influence endometrial cancer susceptibility in large GWAS. Our PRS panels achieved an AUC range of 0.55 to 0.59 in the Manchester study and 0.53 to 0.56 in the UKBB. Endometrial cancer risk stratification through our PRS was independent of BMI and augmented the predictivity of BMI and age. In particular, women in the top 1% of the PRS distributions were at higher risk of developing endometrial cancer than those scoring in the bottom 50%. These data suggest that a PRS combining low-risk susceptibility variants may help identify women at greatest risk of endometrial cancer for targeted screening and prevention interventions. Our study is limited by a small number of predictive SNVs and more work is needed before a PRS can be implemented clinically. A model that incorporates environmental endometrial cancer risk factors, including obesity, insulin resistance, and reproductive factors, is likely to be more discriminatory, particularly if the contributions of genetic and environmental risk factors are independent of each other.
Endometrial cancer is strongly associated with obesity and the potential etiological contribution of genetic factors has been poorly studied to date. Even so, genetic risk has largely been attributed to rare pathogenic variants affecting high-risk genes, eg, in Lynch syndrome. Over the past decade, GWAS have been employed to examine the influence of SNVs on endometrial cancer predisposition.19 Although limited in terms of ethnicity and pathological subtypes investigated, these provide important evidence that multiple independent loci are associated with endometrial
Table 2 PRS results in the Manchester and UK Biobank studies