www.sciencemag.org/cgi/content/full/science.aaq1327/DC1 Supplementary Materials for Co-regulatory networks of human serum proteins link genetics to disease Valur Emilsson*†, Marjan Ilkov*, John R. Lamb*†, Nancy Finkel, Elias F. Gudmundsson, Rebecca Pitts, Heather Hoover, Valborg Gudmundsdottir, Shane R. Horman, Thor Aspelund, Le Shu, Vladimir Trifonov, Sigurdur Sigurdsson, Andrei Manolescu, Jun Zhu, Örn Olafsson, Johanna Jakobsdottir, Scott A. Lesley, Jeremy To, Jia Zhang, Tamara B. Harris, Lenore J. Launer, Bin Zhang, Gudny Eiriksdottir, Xia Yang, Anthony P. Orth, Lori L. Jennings‡, Vilmundur Gudnason†‡ *These authors contributed equally to this work. †Corresponding author. Email: [email protected] (V.E.); [email protected] (V.G.); [email protected] (J.R.L.) ‡These authors contributed equally to this work. Published 2 August 2018 on Science First Release DOI: 10.1126/science.aaq1327 This PDF file includes: Materials and Methods Figs. S1 to S14 Tables S2, S5, S8, S11, S12, S16, and S18 to S20 Captions for tables S1, S3, S4, S6, S7, S9, S10, S13 to S15, S17, S21, and S22 References Other Supplementary Materials for this manuscript include the following: (available at www.sciencemag.org/cgi/content/full/science.aaq1327/DC1) Tables S1, S3, S4, S6, S7, S9, S10, S13 to S15, S17, S21 and S22 (Excel)
(SLE), diabetes and venous thrombosis (table S15).
Theoretical and experimental studies suggest that network hubs are evolutionarily
conserved and robust against disturbances like deleterious mutations or hub removal (7, 17,
18, 21, 52). In other words, a removal of a hub protein in biological networks will have a
larger effect on phenotype outcome than a removal of a random protein. In fact, we have
shown that protein hubs are more strongly connected to various disease outcomes than less
well connected proteins within the serum protein network (Figs. 3 and S9). We explored whether the
proteins affected by cis pSNPs showed a differential degree of connectivity depending on either
the strength of the β-coefficient or in comparison to other proteins across the serum protein
network. First, we found that the mean connectivity was significantly lower among proteins
with a detected cis effect compared to proteins with no detected cis effect (fig. S11B). Here,
the mean kTotal was 10.8 for proteins with no cis effects vs. 7.7 for proteins with significant
cis effects (down by 28.2%, P = 3×10⁻¹⁶). Secondly, there was a significant negative
correlation between the standardized β-coefficient (absolute values) of the cis effects and the
network connectivity of corresponding cis serum proteins (r = -0.231, P = 1×10⁻¹⁵) (fig.
S11C). Thus cis pSNP-protein effects were significantly under-represented among highly
connected protein nodes which may reflect a relaxed selective constraint on proteins with low
connectivity. These results are in agreement with previous observations showing hub proteins
to be essential and evolutionarily conserved (7, 17, 18, 21, 52), and to have a greater effect on
disease outcome (17). The observed phenomenon described above is not restricted to
humans, but has been noted in other kingdoms as well including plants (21).
We tested the association of each cis-acting pSNP to all proteins screened in the
present study (table S17). Here, 16.0% of the cis pSNPs affected one or more proteins in
trans at a Bonferroni adjusted P-value <1×10⁻⁸, or 911 proteins in trans (table S17). Thus
together with the proteins regulated in cis, the cis-acting pSNPs affected levels of 1,954 serum
proteins. Of interest, 40.7% of the cis pSNPs that were trans-acting were also associated with
GWAS lead SNPs, which is an increase of 20 percentage points compared with 20.7% for all cis-acting
pSNPs (see above). This indicates that the trans effects on protein levels could be a critical
part of the mechanism(s) underlying the genetic risk at GWAS loci.
In the past 10 years, GWASs have discovered thousands of disease-associated genetic
loci providing insights into the genetic architecture of complex disease (19). Here, many
common SNPs, each SNP contributing only a small amount to the total risk, act
synergistically to influence susceptibility to a complex disease. The majority of GWAS lead
SNPs are located outside the coding regions of genes, suggesting a key role for gene
regulation in the disease aetiology. In fact, a strong enrichment of cis-acting eSNP/eQTLs
among the GWAS signals has been observed (25, 53). A recent analytical study demonstrated
that GWAS SNPs that contribute most to the heritability of a given disease are not necessarily
located near genes with disease-specific effects or found in core pathways (25). In other
words, the numerous small peripheral GWAS effects converge onto a common biological
network that integrates other signals (e.g. environmental) as well, influencing activity/levels of
core protein hub(s) which in turn can cause a disease (25). The accumulated data suggest that
cis acting pSNPs affect proteins that are located at the periphery of the network and, similar to
GWAS signals, may individually or synergistically affect activity/levels of neighboring
proteins, including protein hubs, to affect disease.
4.2 Validation of cis and trans pSNP-protein findings across different study populations and
proteomic platforms
In this section we tested the replication of previously reported cis and trans pSNP-protein
findings identified in different study populations and across different or related proteomic
profiling platforms. Given the differences in the genotyping and proteomic platforms, and the
definition of cis and trans effects between the different studies, we have used a moderate
proxy threshold of r² ≥ 0.5 between pSNPs for any comparison of pSNP-protein pairs
between studies. Generally, however, we interrogated the associations of the reported pSNP-
protein pairs directly in our dataset, at least for the large studies.
The percentage confirmation in the AGES of previous findings was only computed for
those proteins that are detected with the present multiplex aptamer-based platform. Proteins
encoded by genes on the X chromosome were excluded from the analysis as they were not
tested for cis linked association. Given that our definition of cis effects within a 300kb window
was not necessarily applied in the other studies, we have followed the study-specific
definition. Therefore, in some cases, the study-specific SNPs were not the strongest
cis or trans effects identified in the present study. For these we considered P < 1×10⁻⁴ to be a
significant replication provided the effect is directionally consistent across studies.
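The replication rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function name `is_replicated`, its argument names and the example numbers are hypothetical. Only the thresholds (proxy r² ≥ 0.5, P < 1×10⁻⁴, same direction of effect) are taken from the text.

```python
def is_replicated(beta_reported, beta_ages, p_ages, proxy_r2=1.0,
                  p_threshold=1e-4, min_proxy_r2=0.5):
    """Return True if a reported pSNP-protein effect counts as replicated.

    Hypothetical helper; all argument names are illustrative assumptions.
    """
    # The tested SNP must be the reported SNP itself (r2 = 1) or a proxy with r2 >= 0.5.
    if proxy_r2 < min_proxy_r2:
        return False
    # The association must reach P < 1e-4 in the replication dataset.
    if p_ages >= p_threshold:
        return False
    # The effect must point in the same direction in both studies.
    return beta_reported * beta_ages > 0


# Made-up effect sizes and P-values, for illustration only.
print(is_replicated(0.42, 0.31, 3e-7))                # consistent and significant
print(is_replicated(0.42, -0.31, 3e-7))               # opposite direction
print(is_replicated(0.42, 0.31, 3e-7, proxy_r2=0.4))  # proxy too weakly correlated
```

The direction check multiplies the two effect estimates, so it assumes both studies report effects for the same allele.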
Johansson et al. (54) used mass spectrometry (MS) to quantify 163 proteins in
1,060 subjects and identified cis acting effects for five proteins. These effects were all
replicated in our dataset (table S18). Kim et al. (55) screened 132 proteins in plasma of
521 subjects from the ADNI cohort using a multiplex immunoassay-based platform,
identifying 28 cis pSNP-proteins. We confirmed 73.9% of these cis effects in the AGES
(table S18). Further, Enroth et al. (56) applied a multiplex immunoassay-based platform
that quantified 92 inflammation related plasma proteins screened in 1,005 individuals,
identifying cis acting effects for 23 proteins, of which 63.5% were replicated in the AGES
(table S18). Liu et al. (57) applied a SWATH mass spectrometry technique to measure 342
unique plasma proteins in 232 samples and identified cis-acting pSNPs affecting 13 proteins.
Out of the 13 proteins, eight proteins were measured with our aptamer-based platform, of
which seven cis effects (87.5%) were replicated in the AGES (table S18). Thus on average,
we confirmed 74.6% of all pSNP-protein associations detected with non-aptamer based
technology.
Next, we tested replication of pSNP-protein findings in studies applying the aptamer-
based platform (58, 59). We note that these studies often report multiple pSNPs per locus,
thus we explored all cis and trans pSNPs detected in their studies for association to
corresponding serum protein(s) in the AGES cohort. For the cis and trans effects reported in
Suhre et al. (59), we confirmed 88.3% of all cis and 84.5% of all trans effects in the AGES
dataset (table S18). For instance, Suhre et al. reported 14 trans effects mediated by six
independent pSNPs at the ABO locus (59). We tested two of these trans acting pSNPs, rs651007
and rs8176749, proximal to the ABO locus and confirmed all of these trans effects except for
NOTCH1. For the cis and trans effects reported in Sun et al. (58), 75.7% of the cis effects and
72.8% of the trans effects were confirmed in the AGES (table S18). For instance, Sun et al.
detected 115 proteins regulated in trans by the rs704 missense variant (NP_000629.3:
p.Thr400Met) in VTN while we detected 488 trans regulated proteins at their Bonferroni
adjusted P-value < 1.5×10⁻¹¹. The overlap between the rs704 mediated trans effects of the two
studies was 81.7%. In another example, Sun et al. detected 36 proteins that were affected by
20 independent pSNPs acting in trans at the ABO locus (58). We find that 13 of these pSNPs
affected 88 proteins in trans in the AGES dataset at P < 1.5×10⁻¹¹, with an 81% overlap of the
trans regulated proteins at the ABO locus between the two studies.
Of the aptamer-based studies mentioned above (table S18), Sun et al. (58) comes
closest to the present study as regards sample size and number of proteins measured.
However, they used a smaller version of the aptamer-based platform, with 28.3% fewer proteins,
and 40% fewer study participants, who were predominantly of young age. Below we present
the reproducibility of selected examples of cis and trans effects described in Sun et al. in the
AGES dataset. Sun et al. reported a pSNP mediating a cis effect on WFIKKN2 as well as
mediating a trans effect on the myostatin protein GDF11/8 (58). We note that the cis effect
for WFIKKN2 was also reported in Suhre et al. (59). Using a window size of 300kb across
WFIKKN2, we detected a strong cis acting effect for WFIKKN2 (P = 2×10⁻⁹³) that also
mediated a trans effect on GDF11/8 serum levels (P = 2×10⁻⁹) (fig. S14A). Here, the lead
SNP, the synonymous variant rs9675120 (NP_783165:p.Ser135=) in WFIKKN2, was highly
correlated (r²=0.928) with pSNP rs11079936 (58). The common allele T for rs9675120 was
associated with lower levels of both WFIKKN2 and GDF11/8 (fig. S14A). Furthermore, we
find that the proteins WFIKKN2 and GDF11/8 were positively correlated in the AGES data
(fig. S14B). The direction of all effects is consistent with that reported in Sun et al. (58).
GDF11/8 has been implicated in muscular dystrophy (60), and experimental studies have
shown that WFIKKN2 has strong affinity for GDF11/8 (61). Interestingly, we found that both
WFIKKN2 and GDF11/8 map to the same protein module PM27 (table S7), a module
enriched for proteins involved in extracellular matrix organization and vascular disease. This
module is also enriched in fibrosis related signatures (62), where 8 out of the 16 well-established
fibrosis-related proteins are found in PM27 (Fisher exact test P-value = 6×10⁻⁷).
The second example from Sun et al. (58) is the GWAS locus for inflammatory bowel
disease (IBD) at the missense variant rs3197999 (NP_066278.3: p.Arg703Cys) in MST1.
This locus also affected five other proteins in trans including PRDM1 (aka BLIMP1) at
chromosome 6 (58). We find a strong cis acting effect on MST1 and significant trans acting
effects on 11 proteins including three of the five reported in Sun et al. (fig. S14C,D). In the
third and final example of replicated findings from the work of Sun et al., we focused on the
pQTL hotspot at the vasculitis associated missense variant rs28929474 (NP_001121179:
p.Glu366Lys) in SERPINA1 that was associated with 13 proteins (58). We find that
rs28929474 was associated with 17 proteins in trans, of which 8 were reported in Sun et al.
(58) and were directionally consistent across both studies (fig. S14E, F). Also, we
find that rs28929474 mediated a weak cis effect on SERPINA1 (T allele, β = -0.471, P = 8×10⁻⁶),
directionally consistent with that of Sun et al. (58).
In summary, this extensive validation and comparative study not only reveals the
robustness of our multiplex aptamer-based platform to confirm findings across independent
study populations and proteomics platforms, but highlights the added information the present
study can provide in terms of identifying links to new proteins and the relationship between
proteins in the context of the serum protein network. Although the study cohorts were
different in terms of subject recruitment, age range, health status and ethnic homogeneity, and
in the genotyping and proteomic platforms applied, on average 80% of all reported cis effects
and 74% of all trans effects were confirmed in our dataset. It is possible that study-specific cis
and trans effects exist that appear in a single study only. Finally, a lack of replication of cis
and trans effects may indicate false positive findings in the discovery study.
5. Assessment of tissue specificity of cis and trans proteins and protein modules
Transcript expression data for 53 different human tissues, as median RPKM by tissue, was
downloaded from GTEx (https://www.gtexportal.org) on 07/25/2017. The GTEx project
provides RNA-Seq based transcriptome data in over 40 tissues from hundreds of human
donors and since multiple tissues are collected from the same individuals, cross-tissue
analysis is feasible (63). The specificity score for a gene in a tissue was calculated by
subtracting from its RPKM value the mean value in all other tissues for that gene and dividing
by the standard deviation of those values. The top 0.5 to 2.5% (Z >9.24 to >2.75) were
declared as tissue specific and mapped to modules after removal of duplicate matches.
Similarly, the subset of cis-trans protein pairs was selected where mRNA levels for both
scored in the top 2.5% for tissue specificity (Z>2.75). Here, 158 cis-trans pairs showed the
same tissue specific expression while 2,119 pairs exhibited different tissue specific expression
(table S21).
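The specificity score described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `specificity_score` and the RPKM values are hypothetical (not GTEx data), and the sample standard deviation is assumed because the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev


def specificity_score(rpkm_by_tissue, tissue):
    """Score one gene in `tissue` against its values in all other tissues."""
    others = [v for t, v in rpkm_by_tissue.items() if t != tissue]
    # Assumption: sample standard deviation of the other tissues' values.
    return (rpkm_by_tissue[tissue] - mean(others)) / stdev(others)


# Made-up median RPKM values for a single gene.
gene = {"liver": 120.0, "muscle": 2.0, "adipose": 4.0, "lung": 3.0, "brain": 1.0}
z = specificity_score(gene, "liver")
print(round(z, 2))
print(z > 2.75)  # compared with the Z cutoff quoted for the top 2.5%
```

In the text the cutoffs (Z > 2.75 to Z > 9.24) correspond to the top 2.5% to 0.5% of scores, so in practice the threshold would be read off the distribution of scores across all genes and tissues rather than fixed in advance.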
The npSNP discovery also allowed us to assess if the serum networks resulted from
cross-tissue regulatory control. For example, the rs704 control of VTN protein levels occurred
primarily in liver (tissue specific Z >123), and this npSNP regulated proteins across several
modules including other tissue specific proteins. For example, tissue specific proteins from
five and 19 distinct tissues were regulated by VTN in the PM7 and PM10 modules,
respectively (18 non-liver tissues, table S22). These results provide evidence that, in a number
of cases, npSNPs affected serum levels of a tissue specific protein, which subsequently
affected serum levels of other proteins synthesized in distinct tissues.
Finally, we interrogated how well the protein modules agree with the gene mRNA co-
expression modules constructed in solid tissues and evaluated if similar network organization
is shared at both the protein and gene expression mRNA levels. In addition, this may help
indicate the potential tissues of origin of the serum protein network modules. The assessment
of overlaps between serum protein modules and 2,672 gene mRNA co-expression modules
constructed from whole-genome transcript information from multiple solid human tissues
(16) was based on how well two modules shared a similar set of genes encoding either mRNA
or proteins. Here, we counted the number of genes/proteins that were common between a
protein module and a gene mRNA co-expression module in a given tissue and calculated the
overlap ratio of the match (fig. S5). Next, we assessed the significance of this overlap against
random expectation using Fisher's exact test. Heatmaps of overlap ratio values were used to
show that most module pairs have a very low gene member overlap (<8%) (fig. S5). A
heatmap based on the statistical test P-values showed three protein modules with weak but
significant overlaps with the tissue (mainly liver, muscle and adipose tissue) mRNA co-
expression modules (fig. S5). The accumulated data suggest that the serum protein network
arose at least in part via systemic cross-tissue regulation.
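The module overlap assessment can be illustrated with a short sketch. The overlap ratio follows the Jaccard index as defined in the caption of fig. S5 (shared members divided by unique members across both modules). The significance test is written as the one-sided Fisher's exact test, which equals the upper tail of the hypergeometric distribution. The function names, the toy modules and the background size are hypothetical, and the sidedness and background set actually used in the study are not stated in the text.

```python
from math import comb


def jaccard(module_a, module_b):
    """Shared members divided by the number of unique members in both modules."""
    a, b = set(module_a), set(module_b)
    return len(a & b) / len(a | b)


def overlap_p_value(n_shared, size_a, size_b, n_background):
    """One-sided Fisher's exact test: P(overlap >= n_shared) by chance."""
    total = comb(n_background, size_b)
    tail = sum(comb(size_a, k) * comb(n_background - size_a, size_b - k)
               for k in range(n_shared, min(size_a, size_b) + 1))
    return tail / total


# Toy modules drawn from a made-up background of 20 genes/proteins.
protein_module = {"A", "B", "C", "D"}
gene_module = {"C", "D", "E", "F", "G"}
shared = len(protein_module & gene_module)

print(round(jaccard(protein_module, gene_module), 3))
print(round(overlap_p_value(shared, len(protein_module), len(gene_module), 20), 3))
```

With many module pairs tested, the resulting P-values would then be adjusted for multiple testing (Bonferroni correction, per the fig. S5 caption).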
Fig. S1. A general workflow of the present study. The figure demonstrates the datasets used
in the present study and the analyses of the datasets including the construction of the serum
protein network, identification of its individual protein modules and their association to
genetic variants and disease related outcomes.
Fig. S2. Cross-platform validation of protein measurements. (A) A comparison between
the SOMAmer-based technology and immunoassays measuring serum levels of C-reactive
protein (CRP), r=0.984, P<1×10⁻³⁰⁰, insulin (INS), r=0.680, P=1×10⁻²⁶⁴, and natriuretic
peptide B (NPPB), r=0.915, P<1×10⁻³⁰⁰. (B) Cross-platform validation of the correlation of
five known plasma protein biomarkers to the phenotypic measures previously observed using
immunoassays including prevalent heart failure (prev HF), metabolic syndrome (MetS), type
2 diabetes (T2D), lean (BMI<25), overweight (25≤BMI<30) or obese (BMI≥30) (see table
S5). (C) The custom-designed SOMAscan was used to confirm the association of elevated
serum levels of NPPB (red curve) and growth differentiation factor 15 (GDF15) (red curve) to
lower probability of survival post incident coronary heart disease (CHD) (highest vs. lowest
quartiles of the respective protein levels). General: controls are subjects free of the disease in
question. Data were analyzed using forward linear or logistic regression or Cox proportional
hazards regression, depending on the outcome being continuous, binary or a time to an event.
Kaplan-Meier plots were used to display survival probabilities.
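As an illustration of how the survival probabilities in such plots are obtained, the Kaplan-Meier (product-limit) estimator can be sketched in plain Python. The function name `kaplan_meier` and the follow-up data are made up for this example. In the study, one curve would be estimated per group, e.g. the highest and lowest quartiles of a protein's serum level.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time per subject
    events : 1 if the event occurred at that time, 0 if the subject was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Subjects with an event or censoring at t leave the risk set.
        at_risk -= leaving
    return curve


# Made-up follow-up times (years) and event indicators for five subjects.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print([(t, round(s, 2)) for t, s in curve])  # [(2, 0.8), (3, 0.6), (5, 0.3)]
```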
Fig. S3. A correlation matrix for selected candidate proteins. The correlation matrix
demonstrates the relationship between the candidate proteins from table S5, and includes as
well the highly connected hub proteins highlighted in Figs. 3, S9 and S10.
Fig. S4. Clustering and robustness of the serum protein network. (A) Hierarchical
clustering by applying dynamic tree cut and a power transformation of 5 (β=5) resulting in 27
protein modules each containing a minimum of 20 proteins (table S7). (B) A dynamic tree cut
using power transformation of 1 (β=1) and a minimum of 20 proteins per module, resulting
in 11 relatively large modules compared to using power transformation β=5, thus
maintaining the scale-free property of the network. (C) Comparison between connectivity
(kTotal) of proteins from the real network (blue curve) and corresponding proteins from a
network based on random protein data (cyan curve). Proteins (x-axis) were ordered by
annotation and increasing kTotal (y-axis). The mean kTotal for proteins from the real network
was 9.950, while the mean kTotal was 0.000018 for the randomized protein data.
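The effect of the power transformation on connectivity can be sketched as follows, assuming the common weighted co-expression network convention in which the adjacency between two proteins is the absolute Pearson correlation raised to the power β, and kTotal is a protein's summed adjacency to all other proteins. That definition is an assumption here, and the function names and protein levels are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def total_connectivity(levels, beta=5):
    """kTotal per protein: sum of |r| ** beta over all other proteins."""
    names = list(levels)
    return {a: sum(abs(pearson(levels[a], levels[b])) ** beta
                   for b in names if b != a)
            for a in names}


# Made-up levels of three proteins in five subjects.
levels = {
    "P1": [1, 2, 3, 4, 5],
    "P2": [2, 4, 6, 8, 10],   # perfectly correlated with P1
    "P3": [5, 1, 4, 2, 3],    # weakly correlated with both
}
for name, k in total_connectivity(levels, beta=5).items():
    print(name, round(k, 5))
```

Raising correlations to a power above 1 shrinks weak correlations much more than strong ones, which is why a randomized dataset yields a mean kTotal close to zero.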
Fig. S5. Heat plots of the overlap between modules of the serum protein network and
gene mRNA co-expression modules generated from solid tissues. Limited overlap was
found between protein modules within the serum protein networks and 2,672 gene co-
expression modules constructed from multiple solid tissues (21). Top panel is the overlap
heatmap, representing the overlap ratio between each pair of protein module (rows) vs gene
co-expression module (columns). Proteins in each protein module were assessed for overlaps
with genes in each gene co-expression module by Jaccard Index, defined as the number of
shared genes between the two modules divided by the sum of unique genes in both modules.
Jaccard index values are plotted on a color scale at the intersections of each protein module-
gene module pair. The best Jaccard index is only 8%, a very low overlap ratio. The bottom
panel is the overlap heatmap based on statistical significance of the module overlap analysis
as evaluated by Fisher's Exact Test with Bonferroni correction. -log10 (adjusted P-values)
were used in this heatmap. Similarly, protein modules are in rows and gene modules are in
columns, and the intersection between a row and a column is colored based on the
significance of the -log10 (adjusted P-values). Only three protein modules demonstrated
significant overlap with gene co-expression modules at the cutoff of Jaccard Index > 5% and
Bonferroni-corrected P-value < 0.05 (shown in the heatmap to the right). Among these, PM23
overlaps with only liver gene co-expression modules, PM27 overlaps with liver and muscle
gene-coexpression modules, whereas PM24 overlaps with adipose, hypothalamus and liver
gene co-expression modules. Hierarchical clustering was applied to the rows and columns of
both heatmaps, and dendrograms were plotted accordingly.
Fig. S6. A dendrogram showing the inter-module clustering of the different protein
modules via correlation of their eigenproteins (E(q)s). PM1 does not link to any other
modules, while the other modules form four major super-clusters reflecting the functionality
shared between modules (tables S8 and S11). The numbers at the branches of the dendrogram
refer to the number of proteins found in a given protein module. Functional categories and
tissue/cell specific signatures enriched in the different super-clusters were obtained using
annotation tools like WebGestalt, DAVID, GeneMANIA and CTen, also reported in table
S11. Modules are ordered and annotated according to their inter-module relationship here as
well as throughout the present study.
Fig. S7. The relationship of module E(q)s to disease related measures and
outcomes. (A) The modules PM7 and PM10 are members of super-cluster II. (B) Inverse
association of the modules E(PM7) and E(PM10) to prevalent heart failure (prev HF), ***P≤1×10⁻¹⁶.
(C) Reduced overall survival probability for low E(PM7) levels (cyan curve) compared to
high E(PM7) levels (red curve). (D) PM16, a 170 protein module, is a member of super-cluster
IV. (E) Positive association of E(PM16) quintiles to variation (cm²) in visceral adipose tissue
(VAT), P = 3×10⁻¹⁶, to the metabolic syndrome (MetS) and prevalent coronary heart disease
(prev CHD) and HF, ***P<1×10⁻¹¹. (F) Reduced overall survival probability for high E(PM16)
levels (red curve) compared to low E(PM16) levels (cyan curve). (G) PM26, a 390 protein
module, is a member of super-cluster V. (H) Positive association of E(PM26) to prevalent CHD
and HF as well as incident CHD (inc CHD) and HF (inc HF), ***P≤1×10⁻⁸. (I) Reduced post
CHD and overall survival probability for high E(PM26) levels (red curve) compared to low
E(PM26) levels (cyan curve). Controls are subjects free of the disease in question. Data were
analyzed using forward linear or logistic regression or Cox proportional hazards regression,
depending on the outcome being continuous, binary or a time to an event. Kaplan-Meier plots
were used to display survival probabilities. For more details see fig. S6 and tables S7 and S12.
The number of proteins per module is denoted at the branches of the dendrogram.
Fig. S8. A volcano plot of the association of global serum proteins to prevalent CHD
diagnosed at different times before sampling. The plot demonstrates the significance,
–log(Bonferroni adjusted P-value), as a function of effect sizes (log odds ratio), either when all
prevalent CHD cases (N=1,217) were included in the analysis (blue circles) or when only
CHD cases diagnosed with the disease within five years before entry in the AGES (N=700)
were included (orange circles). Two different aptamers were used to detect and measure
PCSK9. In terms of effect sizes, the associations of protein levels with prevalent disease
like CHD were not affected by restricting the analysis by the time from diagnosis to
sampling (see material and methods).
Fig. S9. The relationship between network connectivity of proteins and disease related
measures and outcomes. (A) Spring graph of PM10 highlighting the hub protein DYRK3
located in the hub region of the module. (B) Positive correlation between within module
connectivity (Ki) (x-axis) of PM10 proteins and the absolute value of the effect (β-coefficient)
size of their association to prevalent heart failure (HF) (y-axis), Pearson's r=0.782, P=1×10⁻⁷².
(C) Positive association of DYRK3 to prevalent HF, P<1×10⁻³⁰, and reduced overall survival
(all-cause mortality post entry into the AGES study cohort) associated with low serum
DYRK3 levels (cyan curve). (D) Spring graph of the PM16 showing location of the hub
HNRNPA1 within the hub region. (E) Positive correlation between Ki (x-axis) and the
association to incident coronary heart disease (inc CHD), r=0.712, P=1×10⁻²². (F) Positive
association of HNRNPA1 to incident CHD, P=1×10⁻¹⁰, and high serum levels of HNRNPA1
(red curve) predict reduced overall survival. (G) A spring graph of PM26 highlighting the
module's hub FSTL3. (H) Positive correlation between Ki (x-axis) and the association to
prevalent HF, r=0.431, P=1×10⁻¹⁶. (I) Positive association of FSTL3 to prevalent HF,
P<1×10⁻³⁰, and reduced overall survival associated with high serum FSTL3 levels (red curve).
Network visualization was performed with the igraph package in R (30). Controls are subjects
free of the disease in question. Data were analyzed using forward linear or logistic regression
or Cox proportional hazards regression, depending on the outcome being continuous, binary
or a time to an event. Kaplan-Meier plots were used to display survival probabilities.
Fig. S10. Preservation analysis of the serum protein network structure. (A) The cohort
was randomly split into two parts, 2/3 for a training set and a 1/3 for the test set, and the
summary Z score statistics plotted for each of the 27 modules presented as colored data
points. Here the summary Z score <2 (blue dotted line) indicates no preservation, 2< summary
Z score <10 (between the blue and green dotted lines) indicates moderate evidence of
preservation, while a summary Z score >10 (green dotted line) indicates strong evidence of
preservation. All the modules showed strong preservation or Z score >10. (B) Preservation of
the connectivity status for the top 10 hubs within each module (kWithin). The modules and
protein hubs highlighted are also presented in Fig. 3 and figs. S3 and S9.
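The interpretation bands for the summary Z score quoted in the caption can be written as a small helper. The function name `preservation_class` is hypothetical, and the handling of values exactly equal to 2 or 10 is an assumption, since the caption does not say which band the boundaries belong to.

```python
def preservation_class(z_summary):
    """Map a module's summary Z score to the evidence bands in the caption."""
    if z_summary > 10:
        return "strong"
    if z_summary >= 2:  # assumption: boundary values count as moderate
        return "moderate"
    return "none"


# Made-up scores, for illustration only.
for z in (1.3, 6.0, 25.4):
    print(z, preservation_class(z))
```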
Fig. S11. Highlighted examples of cis acting SNPs depending on genomic location and in
relation to network connectivity. (A) Cis-acting pSNPs may be located in intergenic
regions (rs7547965), or within genes including missense (rs1250259, NP_997647.1:
p.Gln15Leu), 5'-UTR (rs16923189), 3'-UTR (rs15881) or intronic (rs76426991). (B) Mean
total connectivity ±2CI (2× 95% Confidence Interval) for all significant cis effects (yes)
compared to proteins with no detectable cis effect (no), Student's t-test P = 3×10⁻¹⁶. (C)
Pearson correlation between the absolute value for the β-coefficient of all cis effects (x-axis)
vs. total connectivity of corresponding cis regulated proteins (y-axis), r = -0.231, P=1×10⁻¹⁵.
Fig. S12. Examples of GWAS risk loci affecting serum protein levels. (A) A box plot of
five cis regulated proteins by known GWAS loci listed in table S16. (B) Trans acting effects
at the rs1050362 GWAS locus and a corresponding boxplot of two proteins affected. (C)
Trans acting effects at the rs964184 GWAS locus and a boxplot of two proteins affected. (D)
The strong cis and trans acting effects at the CHD-associated locus rs579459 affecting 43
proteins in trans. The rs579459 mediates a strong proximal cis acting effect on serum ABO
levels as highlighted in the boxplot. Also shown are boxplots for two proteins regulated in
trans by rs579459. (E) The Venn diagram demonstrates a significant enrichment of the
rs579459 trans affected proteins within the PM27 module (Fisher Exact Test P = 2×10⁻¹⁰).
Here, 18 out of 25 proteins regulated by rs579459 map to PM27. Chromosomal ideograms
were reprinted from the NCBI chromosome Map Viewer. The genotypes and pSNPs are at the
x-axis of each box plot while the normalized levels of serum proteins are denoted at the y-
axis.
Fig. S13. Selected examples of known GWAS risk loci for CHD, T2D and/or adiposity.
(A) A box plot of the trans regulated protein PROC at the CHD locus rs867186. (B) Trans
acting effects at the rs1892094 GWAS CHD locus and a boxplot of a protein affected by the
pSNP. (C) A trans acting effect at the rs1165669, another CHD GWAS locus, and a boxplot
of a protein affected by the locus. (D) The well documented T2D locus rs7756992 at
CDKAL1 affects the protein MLN in trans as highlighted in the boxplot. (E) The T2D GWAS
locus rs3132524 exerts trans effects on five proteins including proteins in the corresponding
box plots. (F) The distribution of the ABO protein serum levels in the AGES study population
as per genotypes for the CHD lead SNP rs579459. (G) A strong cis acting effect on ABO
serum levels using a 300kb window across the ABO locus, also representing many well
established GWAS risk lead SNPs for various disease related outcome data (right panel). (H)
The distribution of the VTN protein serum levels in the AGES study population as per
genotypes for the npSNP rs704. (I) The E(PM11) representing module PM11 is strongly
associated with LDL cholesterol and triglycerides (TG) but not HDL cholesterol, using
forward linear regression analysis. Chromosomal ideograms were reprinted from the NCBI
chromosome Map Viewer. The genotypes and pSNPs are at the x-axis of each box plot while
the normalized levels of proteins are denoted at the y-axis.
Fig. S14. Examples of replicated cis and trans effects reported by others. (A) Applying a
genomic window of 300kb across WFIKKN2, we detected a strong cis acting effect for
WFIKKN2. The lead pSNP rs9675120 is also associated with GDF11/8 levels acting in trans.
The T allele represents the major allele in the AGES. The rs9675120 is highly correlated
(r²=0.928) with the rs11079936 reported in Sun et al. (58). (B) There was a significant
positive correlation between the protein levels of WFIKKN2 and GDF11/8 in the AGES
cohort, Pearson's r=0.498, P=1×10⁻²⁴¹. (C) The missense variant rs3197999 (NP_066278.3:
p.Arg703Cys) in MST1 mediated trans effects on 11 proteins in the AGES dataset. (D) Also,
the boxplot shows a strong cis effect on the proximal protein MST1 (P < 1×10⁻³⁰⁰). Two trans
effects are highlighted as well. (E) The pSNP hotspot at rs28929474 (NP_001121179:
p.Glu366Lys) in SERPINA1 affects 17 proteins in trans at P < 1×10⁻⁵. The regression values
in the table are per copy of the T allele (also called the Z allele). Subjects homozygous for the
Z allele are not found in the AGES cohort. Many of these effects were also reported in Sun et
al. (58). (F) Boxplots of three proteins affected by the pQTL hotspot rs28929474. The
genotypes and pSNPs are at the x-axis of each box plot while the normalized levels of serum
proteins are denoted at the y-axis.
Table S1. Annotation of the human proteins targeted in the present study.
Annotation of the 4,137 human protein targets detected with the custom-designed SOMAscan
platform.
(Excel table hosted online)
Table S2. Descriptive statistics of the present study cohort for relevant measures
Baseline characteristics of the AGES-Reykjavik study cohort. Numbers are mean (SD) for
continuous, N (%) for categorical, and median [IQR] for skewed variables. Abbreviations:
CHD, coronary heart disease; HF, heart failure; N/A, not applicable.
*For sex differences, obtained from a two-sided t-test for continuous variables, a χ2 test for
categorical variables, and quantile regression for skewed variables.
Characteristic | Variable | Males | Females | P-value* | Total
Demographics | Numbers | 2330 (42.7%) | 3127 (57.3%) | N/A | 5457
Demographics | Age (years) | 76.7 (5.4) | 76.5 (5.7) | 0.280 | 76.6 (5.6)
Anthropometry | BMI (kg/m2) | 26.9 (3.8) | 27.2 (4.8) | 0.004 | 27.1 (4.4)
Anthropometry | Obese (BMI>30) | 439 (18.9%) | 777 (24.9%) | <0.001 | 1216 (22.3%)
Physiological | SBP (mmHg) | 143.2 (20.4) | 142.2 (20.9) | 0.075 | 142.6 (20.7)
Physiological | DBP (mmHg) | 76.2 (9.6) | 72.2 (9.5) | <0.001 | 73.9 (9.7)
Physiological | TOT-C (mmol/L) | 5.2 (1.1) | 6.0 (1.1) | <0.001 | 5.6 (1.2)
Physiological | LDL-C (mmol/L) | 3.2 (1.0) | 3.7 (1.0) | <0.001 | 3.5 (1.0)
Physiological | TG (mmol/L) | 1.0 [0.8,1.4] | 1.1 [0.8,1.5] | <0.001 | 1.0 [0.8,1.4]
Physiological | FG (mmol/L) | 5.9 (1.2) | 5.7 (1.1) | <0.001 | 5.8 (1.2)
Physiological | VAT (cm2) | 203.0 (86.2) | 150.3 (67.2) | <0.001 | 172.8 (80.2)
Physiological | SAT (cm2) | 203.4 (86.8) | 294.9 (112.3) | <0.001 | 255.7 (111.7)
Medication | Antihypertension | 1460 (62.7%) | 2016 (64.5%) | 0.169 | 3476 (63.7%)
Medication | Lipid lowering | 656 (28.2%) | 575 (18.4%) | <0.001 | 1231 (22.6%)
Lifestyle | Smoker | 265 (11.7%) | 390 (12.8%) | 0.199 | 655 (12.3%)
Metabolic | T2D | 363 (15.6%) | 291 (9.3%) | <0.001 | 654 (12.0%)
Metabolic | MetS | 486 (20.9%) | 641 (20.5%) | 0.746 | 1127 (20.7%)
Heart disease | CHD prevalent | 777 (33.6%) | 440 (14.2%) | <0.001 | 1217 (22.5%)
Heart disease | CHD incl recurrent | 938 (40.6%) | 681 (22.0%) | <0.001 | 1619 (30.0%)
Heart disease | CHD incident | 421 (27.4%) | 451 (17.0%) | <0.001 | 872 (20.8%)
Heart disease | HF prevalent | 101 (4.4%) | 71 (2.3%) | <0.001 | 172 (3.2%)
Heart disease | HF incl recurrent | 287 (12.4%) | 242 (7.8%) | <0.001 | 529 (9.8%)
Heart disease | HF incident | 233 (10.5%) | 207 (6.9%) | <0.001 | 440 (8.4%)
Heart disease | Followup yrs CHD | 7.4 [3.2,10.1] | 9.7 [5.8,10.8] | <0.001 | 9.2 [4.4,10.6]
Heart disease | Followup yrs death | 10.5 [6.2,12.3] | 11.6 [8.1,12.8] | <0.001 | 11.3 [7.2,12.6]
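As a rough illustration of how a sex-difference P-value for a categorical variable in Table S2 can be obtained, the sketch below runs a Pearson χ2 test on a 2×2 table using only the Python standard library. It uses the obesity counts from the table and assumes, purely for illustration, that every participant has a BMI measurement (which need not hold in the actual data). The function name is hypothetical and no continuity correction is applied, so the result is only expected to agree with the reported "<0.001" in order of magnitude.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = (a, b, c, d)
    # Expected counts under independence of row and column variables.
    expected = (row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p


# Counts taken from Table S2 (assumption: denominators are the full group sizes).
males_obese, males_total = 439, 2330
females_obese, females_total = 777, 3127

stat, p_value = chi2_2x2(
    males_obese, males_total - males_obese,
    females_obese, females_total - females_obese,
)
print(f"chi2 = {stat:.1f}, P = {p_value:.1e}")
```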
Table S3. Direct assessment of aptamer specificity via DDA mass spectrometry
List of proteins with confirmation by data-dependent analysis (DDA) mass spectrometry after
SOMAmer enrichment in biological matrices. Column Biological Matrix: cell line name if
detected in lysate or conditioned media (cm), otherwise noted as blood serum, blood plasma,
or urine biofluid. Column File Name: refers to the raw data file name uploaded to PRIDE
Proteome Exchange with five dataset identifiers, PXD008819-PXD008823.
(Excel table hosted online)
Table S4. Direct assessment of aptamer specificity via MRM mass spectrometry
List of proteins with confirmation by multiple reaction monitoring (MRM) mass spectrometry
after SOMAmer enrichment in biological matrices. Cell line name if detected in lysate or
conditioned media (cm), otherwise noted as blood serum, blood plasma, or urine biofluid. The
MRM dataset has been deposited to Peptide Atlas PASSEL repository with the dataset
identifier PASS01145.
(Excel table hosted online)
Table S5. Cross-platform validation of known links of proteins to disease related traits
Confirmation, via application of the custom-designed SOMAscan platform, in the AGES
cohort, of known associations of protein biomarkers to relevant disease-related outcomes
detected with conventional immunoassays. The beta coefficients (β-coeff) were estimated
through either linear or logistic regression analysis. N/A, not applicable.
Protein | Reference | Trait | Reported levels | Prevalent disease, AGES: β-coeff | Prevalent disease, AGES: P-value | Incident disease, AGES: β-coeff | Incident disease, AGES: P-value
IL-18 | 21481392 | CHD | Elevated | 0.117 | 0.0007 | 0.094 | 0.002
CRP | 20182820 | CHD | Elevated | 0.077 | 0.02 | 0.176 | 1e-08
SAA | 20182820 | CHD | Elevated | 0.075 | 0.02 | 0.087 | 0.004
IL6 | 10769275 | CHD | Elevated | 0.066 | 0.045 | 0.111 | 0.0002
NPPB | 20182820 | CHD | Elevated | 0.656 | 1e-64 | 0.401 | 1e-26
MPO | 20182820 | CHD | Elevated | 0.277 | 9e-16 | 0.159 | 3e-07
PAPPA | 20182820 | CHD | Elevated | 0.163 | 6e-07 | 0.128 | 0.0002
GDF15 | 27811204 | CHD | Elevated | 0.327 | 3e-19 | 0.300 | 2e-18
LGALS3 | 22230397 | CHD | Elevated | 0.285 | 9e-16 | 0.179 | 2e-08
ADIPOQ | 19029992 | T2D | Reduced | 0.543 | 1e-55 | N/A | N/A
LEP | 29236298 | T2D | Elevated | 0.491 | 1e-41 | N/A | N/A
IGFBP2 | 22554827 | T2D | Reduced | -0.632 | <1e-258 | N/A | N/A
ADIPOQ | 11479627 | SAT | Reduced | -19.140 | 3e-37 | N/A | N/A
LEP | 27906690 | SAT | Elevated | 88.848 | <1e-300 | N/A | N/A
sLEPR | 12075576 | SAT | Reduced | -15.472 | 4e-29 | N/A | N/A
ADIPOQ | 11479627 | MetS | Reduced | -0.903 | <1e-300 | N/A | N/A
RBP4 | 18239568 | MetS | Elevated | 0.398 | 4e-24 | N/A | N/A
FABP4 | 17553506 | MetS | Elevated | 1.043 | <1e-300 | N/A | N/A
EDN1 | 8149524 | HF | Elevated | 0.527 | 1e-13 | 0.239 | 2e-07
NPPB | 24807464 | HF | Elevated | 1.303 | <1e-300 | 0.807 | <1e-300
UCN3 | 19961889 | HF | Elevated | 0.250 | 0.001 | 0.165 | 0.0005
LECT2 | 28278265 | VAT | Elevated | 10.670 | 2e-23 | N/A | N/A
PAI-1 | 8673927 | VAT | Elevated | 8.190 | 7e-15 | N/A | N/A
PTX3 | 21900125 | VAT | Elevated | 3.511 | 0.0008 | N/A | N/A
Table S6. The degree of validation of aptamer specificity for all human proteins measured
in the present study
A summary of direct and/or inferred validation of aptamer specificity for the 4,137 human
proteins detected in the present study.
(Excel table hosted online)
Table S7. The modules of the serum protein network and corresponding proteins
Annotation of the modules and the proteins that constitute each module of the serum protein
network together with information related to degree connectivity (kWithin, kOut, and kTotal).
(Excel table hosted online)
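To make the three connectivity measures concrete, the sketch below computes them for a toy weighted network using the usual co-expression network definitions: kTotal is a node's summed connection strength to all other nodes, kWithin is the part of that sum contributed by nodes in the same module, and kOut is the remainder. The node names, weights, and function name are hypothetical, and the definitions are assumed to match those used for Table S7 rather than taken from it.

```python
from collections import defaultdict


def connectivity(adjacency, module_of):
    """Return {node: (kWithin, kOut, kTotal)} from undirected weighted edges.

    adjacency: {(u, v): weight}, each undirected edge listed once.
    module_of: {node: module label}.
    """
    k_within = defaultdict(float)
    k_total = defaultdict(float)
    for (u, v), weight in adjacency.items():
        k_total[u] += weight
        k_total[v] += weight
        if module_of[u] == module_of[v]:
            # Edge stays inside one module, so it counts towards kWithin.
            k_within[u] += weight
            k_within[v] += weight
    return {
        node: (k_within[node], k_total[node] - k_within[node], k_total[node])
        for node in module_of
    }


# Toy example: four hypothetical proteins in two hypothetical modules.
module_of = {"A": "M1", "B": "M1", "C": "M2", "D": "M2"}
adjacency = {
    ("A", "B"): 0.8, ("A", "C"): 0.1, ("A", "D"): 0.2,
    ("B", "C"): 0.3, ("B", "D"): 0.0, ("C", "D"): 0.9,
}
k = connectivity(adjacency, module_of)
for node, (k_in, k_out, k_tot) in sorted(k.items()):
    print(f"{node}: kWithin={k_in:.2f} kOut={k_out:.2f} kTotal={k_tot:.2f}")
```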
Table S8. Enrichment of functional categories in the different modules
Functional categories and tissue/cell specific signatures enriched in the different protein
modules using annotation tools like WebGestalt, DAVID, GeneMANIA and CTen (64-67).
Modules are ordered and annotated according to their inter-module relationship. N/A, not
applicable.
Module | Size | Over-represented pathways & tissue signatures | FDR | P-value (Bonferroni adjusted) | Database
PM1 | 31 | Signal peptide | N/A | 0.0007 | DAVID
PM1 | 31 | Autoimmunity | 0.03 | N/A | WebGestalt
PM1 | 31 | Notch signaling | 0.00001 | N/A | GeneMANIA
PM1 | 31 | BDCA4+ dendritic cells | 0.01 | N/A | CTen
PM2 | 86 | Signal peptide | N/A | 2e-07 | DAVID
PM2 | 86 | Circadian rhythm | N/A | 0.006 | DAVID
PM2 | 86 | Adenocarcinoma | 0.00002 | N/A | WebGestalt
PM2 | 86 | Lymphocyte mediated immunity | 0.0002 | N/A | GeneMANIA
PM2 | 86 | Whole blood | 0.006 | N/A | CTen
PM3 | 921 | Signal peptide | N/A | 1e-78 | DAVID
PM3 | 921 | Growth factor activity | N/A | 1e-25 | DAVID
PM3 | 921 | MAPK cascade | N/A | 1e-12 | DAVID
PM3 | 921 | Zymogen | N/A | 1e-07 | DAVID
PM3 | 921 | Cytokine-cytokine receptor | 1e-30 | N/A | WebGestalt
PM3 | 921 | JAK-STAT signaling | 2e-11 | N/A | WebGestalt
PM3 | 921 | PI3K-AKT signaling | 1e-10 | N/A | WebGestalt
PM3 | 921 | Immune system diseases | 1e-06 | N/A | WebGestalt
PM3 | 921 | Hypotension | 0.0008 | N/A | WebGestalt
PM3 | 921 | Smooth muscle | 0.002 | N/A | CTen
PM3 | 921 | Pancreas | 0.028 | N/A | CTen
PM4 | 86 | Signal peptide | N/A | 0.002 | DAVID
PM4 | 86 | Pattern recognition receptor activity | 1e-06 | N/A | GeneMANIA
PM4 | 86 | Hepatitis B | 0.00005 | N/A | WebGestalt
PM4 | 86 | SIDS | 0.005 | N/A | WebGestalt
PM4 | 86 | Rheumatoid arthritis | 0.02 | N/A | WebGestalt
PM4 | 86 | Leukemia lymphoblastic | 0.002 | N/A | CTen
PM5 | 65 | Extracellular exosome | N/A | 0.002 | DAVID
PM5 | 65 | IkB / NF-kB signaling pathway | 0.0001 | N/A | GeneMANIA
PM5 | 65 | CD33+ Myeloid | 0.002 | N/A | CTen
PM5 | 65 | Skin | 0.007 | N/A | CTen
PM6 | 157 | Signal peptide | N/A | 5e-15 | DAVID
PM6 | 157 | Calcium ion transport | 0.004 | N/A | GeneMANIA
PM6 | 157 | Heart valve disease | 0.04 | N/A | WebGestalt
PM6 | 157 | Cardiac myocytes | 0.002 | N/A | CTen
PM7 | 88 | Protein binding | N/A | 0.02 | DAVID
PM8 | 84 | Signal peptide | N/A | 3e-07 | DAVID
PM8 | 84 | Four helical cytokine core | N/A | 0.01 | DAVID
PM8 | 84 | Natural killer cell activation | 2e-06 | N/A | GeneMANIA
PM8 | 84 | Intravascular coagulation | 0.03 | N/A | WebGestalt
PM9 | 286 | Signal peptide | N/A | 1e-31 | DAVID
PM9 | 286 | Growth factor binding | 0.0002 | N/A | GeneMANIA
PM9 | 286 | Complement and coagulation | 0.002 | N/A | WebGestalt
PM9 | 286 | Liver | 0.002 | N/A | CTen
PM9 | 286 | Pancreatic islets | 0.006 | N/A | CTen
PM10 | 312 | Signal peptide | N/A | 8e-28 | DAVID
PM10 | 312 | Leukocyte differentiation | 0.0004 | N/A | GeneMANIA
PM10 | 312 | Fc-epsilon receptor | 0.0005 | N/A | GeneMANIA
PM10 | 312 | Innate immune system | 0.001 | N/A | WebGestalt
PM10 | 312 | Lung diseases | 0.004 | N/A | WebGestalt
PM10 | 312 | Globus pallidus | 0.002 | N/A | CTen
PM10 | 312 | Cingulate cortex subthalamic | 0.002 | N/A | CTen
PM11 | 26 | Secreted proteins | N/A | 0.01 | DAVID
PM11 | 26 | Lipoprotein particles | 1e-12 | N/A | GeneMANIA
PM11 | 26 | Sterol homeostasis | 1e-10 | N/A | GeneMANIA
PM11 | 26 | Familial hypercholesterolemia | 0.002 | N/A | WebGestalt
PM11 | 26 | Adrenal gland | 0.02 | N/A | CTen
PM11 | 26 | Fetal liver | 0.03 | N/A | CTen
PM12 | 69 | Signal peptide | N/A | 0.00003 | DAVID
PM12 | 69 | Telomere maintenance | 0.00004 | N/A | GeneMANIA
PM12 | 69 | Ovary | 0.01 | N/A | CTen
PM12 | 69 | Atrioventricular node | 0.01 | N/A | CTen
PM13 | 318 | Signal peptide | N/A | 1e-09 | DAVID
PM13 | 318 | Biological rhythms | N/A | 0.006 | DAVID
PM13 | 318 | Epstein-Barr virus infection | 0.009 | N/A | WebGestalt
PM13 | 318 | Skeletal muscle | 0.0002 | N/A | CTen
PM13 | 318 | Uterus | 0.006 | N/A | CTen
PM14 | 81 | Signal peptide | N/A | 0.00003 | DAVID
PM14 | 81 | Bone marrow | 0.01 | N/A | CTen
PM15 | 118 | Signal peptide | N/A | 1e-08 | DAVID
PM15 | 118 | TNF mediated signaling | N/A | 0.0005 | DAVID
PM15 | 118 | Kaposi sarcoma | 0.01 | N/A | WebGestalt
PM15 | 118 | T-cell activation | 0.004 | N/A | GeneMANIA
PM16 | 170 | Poly(A) RNA binding | N/A | 1e-08 | DAVID
PM16 | 170 | Acetylation | N/A | 3e-07 | DAVID
PM16 | 170 | Ubiquitin conjugation | N/A | 8e-07 | DAVID
PM16 | 170 | Secreted proteins | N/A | 0.00001 | DAVID
PM16 | 170 | Antibiotic activity | N/A | 0.00002 | DAVID
PM16 | 170 | Neutrophil degranulation | 1e-12 | N/A | WebGestalt
PM16 | 170 | Inflammation | 1e-06 | N/A | WebGestalt
PM16 | 170 | Liver carcinoma | 0.01 | N/A | WebGestalt
PM16 | 170 | RNA spliceosome | 1e-07 | N/A | GeneMANIA
PM16 | 170 | Bone marrow | 5e-14 | N/A | CTen
PM16 | 170 | CD33+ myeloid | 4e-13 | N/A | CTen
PM17 | 53 | Acetylation | N/A | 1e-06 | DAVID
PM17 | 53 | Phosphoprotein | N/A | 3e-06 | DAVID
PM17 | 53 | Stress | 0.0009 | N/A | WebGestalt
PM17 | 53 | Vesicle mediated transport | 0.003 | N/A | WebGestalt
PM17 | 53 | SNARE complex | 0.002 | N/A | GeneMANIA
PM18 | 83 | Cytoplasm | N/A | 1e-15 | DAVID
PM18 | 83 | ERBB signaling pathway | 2e-13 | N/A | GeneMANIA
PM18 | 83 | Platelet activation | 3e-13 | N/A | GeneMANIA
PM18 | 83 | EGF / EGFR signaling pathway | 1e-12 | N/A | WebGestalt
PM18 | 83 | Drug-drug interaction | 1e-10 | N/A | WebGestalt
PM18 | 83 | CD28 costimulation | 1e-08 | N/A | WebGestalt
PM18 | 83 | Focal adhesion | 5e-07 | N/A | WebGestalt
PM18 | 83 | FCg mediated phagocytosis | 3e-06 | N/A | WebGestalt
PM18 | 83 | CCKR signaling | 0.0004 | N/A | WebGestalt
PM18 | 83 | Angiogenesis | 0.01 | N/A | WebGestalt
PM18 | 83 | CD56+ NK Cells | 0.0003 | N/A | CTen
PM18 | 83 | CD19+ B cells | 0.0028 | N/A | CTen
PM19 | 81 | Acetylation | N/A | 1e-28 | DAVID
PM19 | 81 | Extracellular exosomes | N/A | 2e-14 | DAVID
PM19 | 81 | Hereditary hemolytic anemia | N/A | 0.00001 | DAVID
PM19 | 81 | Cofactor metabolic process | 0.0005 | N/A | GeneMANIA
PM19 | 81 | Protein folding | 0.002 | N/A | GeneMANIA
PM19 | 81 | CD71+ early erythroid | 1e-07 | N/A | CTen
PM19 | 81 | CD105+ endothelial | 0.0001 | N/A | CTen
PM20 | 32 | Signal peptide | N/A | 4e-10 | DAVID
PM20 | 32 | Immunoglobulin C1 | N/A | 2e-06 | DAVID
PM20 | 32 | Lymph node | 0.01 | N/A | CTen
PM20 | 32 | Small intestine | 0.02 | N/A | CTen
PM21 | 18 | Cellular ion homeostasis | 0.004 | N/A | GeneMANIA
PM22 | 39 | Signal peptide | N/A | 3e-08 | DAVID
PM22 | 39 | Calcium ion binding | N/A | 0.00002 | DAVID
PM22 | 39 | Bronchial epithelial cells | 0.01 | N/A | CTen
PM22 | 39 | Adipocyte | 0.02 | N/A | CTen
PM23 | 35 | Extracellular exosome | N/A | 3e-08 | DAVID
PM23 | 35 | Biosynthesis of antibiotics | N/A | 1e-07 | DAVID
PM23 | 35 | NAD(P)-binding domains | N/A | 3e-06 | DAVID
PM23 | 35 | Disease mutation | N/A | 0.0001 | DAVID
PM23 | 35 | Metabolic pathways | 1e-10 | N/A | WebGestalt
PM23 | 35 | Amino acid metabolism | 5e-10 | N/A | WebGestalt
PM23 | 35 | Carbon metabolism | 0.00001 | N/A | WebGestalt
PM23 | 35 | Metabolism, inborn errors | 0.0001 | N/A | WebGestalt
PM23 | 35 | Ethanol oxidation | 1e-09 | N/A | GeneMANIA
PM23 | 35 | Oxidoreductase | 3e-07 | N/A | GeneMANIA
PM23 | 35 | Liver | 2e-14 | N/A | CTen
PM23 | 35 | Kidney | 7e-06 | N/A | CTen
PM23 | 35 | Small intestine | 0.00008 | N/A | CTen
PM23 | 35 | Adrenal gland | 0.0003 | N/A | CTen
PM24 | 37 | Secreted proteins | N/A | 1e-27 | DAVID
PM24 | 37 | Protein activation cascade | 1e-18 | N/A | GeneMANIA
PM24 | 37 | Vesicle lumen | 1e-15 | N/A | GeneMANIA
PM24 | 37 | Complement activation | 2e-10 | N/A | GeneMANIA
PM24 | 37 | Platelet degranulation | 2e-09 | N/A | WebGestalt
PM24 | 37 | Thrombosis | 3e-09 | N/A | WebGestalt
PM24 | 37 | Fetal liver | 1e-21 | N/A | CTen
PM24 | 37 | Fetal lung | 5e-10 | N/A | CTen
PM24 | 37 | Lymph node | 6e-06 | N/A | CTen
PM25 | 30 | Signal peptide | N/A | 0.003 | DAVID
PM26 | 390 | Signal peptide | N/A | 6e-101 | DAVID
PM26 | 390 | Extracellular exosome | N/A | 1e-19 | DAVID
PM26 | 390 | Ephrin receptor signaling | 2e-08 | N/A | GeneMANIA
PM26 | 390 | Inflammation | 5e-08 | N/A | WebGestalt
PM26 | 390 | Glomerular filtration rate | 0.00001 | N/A | WebGestalt
PM26 | 390 | Spontaneous abortion | 0.00002 | N/A | WebGestalt
PM26 | 390 | Axon guidance | 0.00002 | N/A | WebGestalt
PM26 | 390 | Osteoporosis | 0.04 | N/A | WebGestalt
PM26 | 390 | Prostatic neoplasms | 0.04 | N/A | WebGestalt
PM26 | 390 | Smooth muscle | 1e-06 | N/A | CTen
PM26 | 390 | Adipocyte | 4e-06 | N/A | CTen
PM26 | 390 | Lung | 6e-06 | N/A | CTen
PM27 | 378 | Signal peptide | N/A | 1e-113 | DAVID
PM27 | 378 | Extracellular exosome | N/A | 1e-20 | DAVID
PM27 | 378 | Cell adhesion (CAMs) | 1e-30 | N/A | WebGestalt
PM27 | 378 | Extracellular matrix organization | 1e-18 | N/A | WebGestalt
PM27 | 378 | Collagen diseases | 7e-07 | N/A | WebGestalt
PM27 | 378 | Vascular diseases | 1e-06 | N/A | WebGestalt
PM27 | 378 | Axon guidance | 0.00001 | N/A | WebGestalt
PM27 | 378 | Neoplasm metastasis | 0.0005 | N/A | WebGestalt
PM27 | 378 | Osteoblast signaling | 0.005 | N/A | WebGestalt
PM27 | 378 | Adipocyte | 2e-15 | N/A | CTen
PM27 | 378 | Uterus | 6e-12 | N/A | CTen
PM27 | 378 | Smooth muscle | 1e-09 | N/A | CTen
Table S9. Tissue specific expression of individual serum proteins
GTEx gene expression data (https://www.gtexportal.org) related to potential tissue of origin
of individual proteins. The Z>9.24 represents the top 0.5% of all tissue-specific Z-scores for
the proteins measured.
(Excel table hosted online)
Table S10. Tissue specific expression of serum protein modules
GTEx gene expression data (https://www.gtexportal.org) related to potential tissue of origin
of individual protein modules using a Z>2.75 cut-off, i.e. the top 2.5% of tissue specificity.
The numbers refer to percentage of all proteins in each module passing this cut-off.
(Excel table hosted online)
Table S11. Enrichment of functional categories in the different superclusters
Functional categories enriched in the five super-clusters using annotation tools like
WebGestalt, DAVID, GeneMANIA and CTen (64-67). N/A, not applicable. GEFs,
guanine nucleotide exchange factors.
Modules | Super-cluster | Over-representation of pathways & tissues | FDR | P-value (Bonferroni adjusted) | Database
PM1 | I | Signal peptide | N/A | 0.0007 | DAVID
PM1 | I | Autoimmunity | 0.03 | N/A | WebGestalt
PM1 | I | Notch signaling | 0.00001 | N/A | GeneMANIA
PM1 | I | BDCA4+ dendritic cells | 0.01 | N/A | CTen
PM2-10 | II | Signal peptide | N/A | 1e-169 | DAVID
PM2-10 | II | Immune diseases | 1e-100 | N/A | WebGestalt
PM2-10 | II | Necrosis | 1e-90 | N/A | WebGestalt
PM2-10 | II | Inflammation | 1e-90 | N/A | WebGestalt
PM2-10 | II | Cytokine | 1e-34 | N/A | WebGestalt
PM2-10 | II | Growth factor | 1e-33 | N/A | WebGestalt
PM2-10 | II | Jak STAT signaling | 3e-18 | N/A | WebGestalt
PM2-10 | II | PI3K-AKT signaling | 1e-15 | N/A | WebGestalt
PM11-15 | III | Signal peptide | N/A | 1e-31 | DAVID
PM11-15 | III | MAPK cascade | N/A | 3e-18 | DAVID
PM11-15 | III | Ras GEFs | N/A | 2e-18 | DAVID
PM11-15 | III | Extracellular matrix | 3e-30 | N/A | WebGestalt
PM16-19 | IV | Extracellular exosomes | N/A | 4e-28 | DAVID
PM16-19 | IV | Kit receptor signaling | 1e-30 | N/A | WebGestalt
PM16-19 | IV | Drug-drug interaction | 1e-20 | N/A | WebGestalt
PM16-19 | IV | Nucleotide binding | 7e-12 | N/A | WebGestalt
PM16-19 | IV | Fc epsilon RI pathway | 1e-06 | N/A | WebGestalt
PM16-19 | IV | Bone marrow | 1e-14 | N/A | CTen
PM16-19 | IV | CD33+ myeloid | 1e-12 | N/A | CTen
PM20-27 | V | Signal peptide | N/A | 1e-251 | DAVID
PM20-27 | V | Extracellular exosome | N/A | 3e-56 | DAVID
PM20-27 | V | Biological adhesion | 1e-43 | N/A | WebGestalt
PM20-27 | V | Neoplasm invasiveness | 1e-20 | N/A | WebGestalt
PM20-27 | V | Angiogenesis | 1e-09 | N/A | WebGestalt
PM20-27 | V | Axon guidance | 1e-10 | N/A | WebGestalt
PM20-27 | V | Adipocyte | 1e-20 | N/A | CTen
PM20-27 | V | Smooth muscle | 1e-15 | N/A | CTen
PM20-27 | V | Lung | 1e-13 | N/A | CTen
Table S12. Association of the modules' E(q)s to disease related phenotypic measures
Correlation of the different modules' E(q)s to various disease related outcomes in the AGES study
cohort. The significance threshold of module trait correlations to outcome data was set at a
conservative P-value < 1×10⁻⁷. N/A, not applicable; NS, not significant.
E(module) | Size | Super-cluster | Outcome* | Data | N cases, events, measurements | Direction of effect | P-value
PM1 | 31 | I | VAT | Prevalent | 5239 | Direct | 1e-65
PM1 | 31 | I | MetS | Prevalent | 1127 | Direct | 1e-55
PM1 | 31 | I | SAT | Prevalent | 5239 | Direct | 5e-27
PM1 | 31 | I | T2D | Prevalent | 654 | Direct | 2e-19
PM1 | 31 | I | Survival | Post CHD | 692 | Direct | 1e-17
PM1 | 31 | I | Survival | Overall | 2982 | Direct | <1e-14
PM1 | 31 | I | CHD | Incident | 872 | Direct | 6e-13
PM1 | 31 | I | HF | Incident | 440 | Direct | 1e-11
PM2 | 86 | II | N/A | N/A | N/A | N/A | NS
PM3 | 921 | II | N/A | N/A | N/A | N/A | NS
PM4 | 86 | II | CHD | Prevalent | 1217 | Inverse | 2e-13
PM4 | 86 | II | VAT | Prevalent | 5239 | Direct | 6e-12
PM5 | 65 | II | HF | Prevalent | 172 | Inverse | 4e-18
PM5 | 65 | II | MetS | Prevalent | 1127 | Inverse | 2e-14
PM5 | 65 | II | CHD | Prevalent | 1217 | Inverse | 8e-14
PM5 | 65 | II | HF | Incident | 440 | Inverse | 8e-09
PM6 | 157 | II | HF | Prevalent | 172 | Inverse | 1e-26
PM6 | 157 | II | CHD | Prevalent | 1217 | Inverse | 3e-14
PM6 | 157 | II | HF | Incident | 440 | Inverse | 2e-12
PM6 | 157 | II | Survival | Overall | 2982 | Inverse | 8e-08
PM7 | 88 | II | HF | Prevalent | 172 | Inverse | 3e-20
PM7 | 88 | II | Survival | Overall | 2982 | Inverse | 2e-10
PM8 | 84 | II | VAT | Prevalent | 5239 | Direct | 2e-22
PM8 | 84 | II | HF | Prevalent | 172 | Inverse | 4e-11
PM8 | 84 | II | Survival | Overall | 2982 | Inverse | 1e-09
PM9 | 286 | II | HF | Prevalent | 172 | Inverse | 2e-23
PM9 | 286 | II | VAT | Prevalent | 5239 | Direct | 1e-13
PM9 | 286 | II | CHD | Prevalent | 1217 | Inverse | 1e-09
PM9 | 286 | II | HF | Incident | 440 | Inverse | 2e-09
PM9 | 286 | II | Survival | Overall | 2982 | Inverse | 3e-09
PM10 | 312 | II | HF | Prevalent | 172 | Inverse | 7e-17
PM10 | 312 | II | Survival | Overall | 2982 | Inverse | 1e-12
PM11 | 26 | III | MetS | Prevalent | 1127 | Direct | 1e-15
PM11 | 26 | III | CHD | Prevalent | 1217 | Direct | 1e-14
PM12 | 69 | III | N/A | N/A | N/A | N/A | NS
PM13 | 318 | III | N/A | N/A | N/A | N/A | NS
PM14 | 81 | III | N/A | N/A | N/A | N/A | NS
PM15 | 118 | III | N/A | N/A | N/A | N/A | NS
PM16 | 170 | IV | CHD | Prevalent | 1217 | Direct | 1e-18
PM16 | 170 | IV | CHD | Incident | 872 | Direct | 5e-17
PM16 | 170 | IV | VAT | Prevalent | 5239 | Direct | 3e-16
PM16 | 170 | IV | MetS | Prevalent | 1127 | Direct | 1e-12
PM16 | 170 | IV | Survival | Overall | 2982 | Direct | 2e-12
PM16 | 170 | IV | HF | Prevalent | 172 | Direct | 3e-12
PM16 | 170 | IV | HF | Incident | 440 | Direct | 2e-09
PM17 | 53 | IV | HF | Prevalent | 172 | Direct | 2e-22
PM17 | 53 | IV | CHD | Prevalent | 1217 | Direct | 6e-22
PM17 | 53 | IV | HF | Incident | 440 | Direct | 1e-18
PM17 | 53 | IV | CHD | Incident | 872 | Direct | 3e-18
PM17 | 53 | IV | Survival | Overall | 2982 | Direct | <1e-16
PM17 | 53 | IV | Survival | Post CHD | 692 | Direct | 3e-11
PM17 | 53 | IV | SAT | Prevalent | 5339 | Direct | 1e-10
PM17 | 53 | IV | MetS | Prevalent | 1127 | Direct | 1e-09
PM18 | 83 | IV | N/A | N/A | N/A | N/A | NS
PM19 | 81 | IV | N/A | N/A | N/A | N/A | NS
PM20 | 32 | V | N/A | N/A | N/A | N/A | NS
PM21 | 18 | V | N/A | N/A | N/A | N/A | NS
PM22 | 39 | V | N/A | N/A | N/A | N/A | NS
PM23 | 35 | V | VAT | Prevalent | 5239 | Direct | 6e-90
PM23 | 35 | V | MetS | Prevalent | 1127 | Direct | 8e-65
PM23 | 35 | V | SAT | Prevalent | 5239 | Direct | 4e-42
PM23 | 35 | V | T2D | Prevalent | 654 | Direct | 5e-32
PM23 | 35 | V | CHD | Prevalent | 1217 | Direct | 2e-14
PM24 | 37 | V | VAT | Prevalent | 5239 | Inverse | 2e-18
PM24 | 37 | V | MetS | Prevalent | 1127 | Inverse | 3e-12
PM24 | 37 | V | SAT | Prevalent | 5239 | Inverse | 4e-10
PM25 | 30 | V | N/A | N/A | N/A | N/A | NS
PM26 | 390 | V | HF | Prevalent | 172 | Direct | 5e-20
PM26 | 390 | V | HF | Incident | 440 | Direct | 2e-18
PM26 | 390 | V | Survival | Overall | 2982 | Direct | <1e-16
PM26 | 390 | V | CHD | Prevalent | 1217 | Direct | 5e-13
PM26 | 390 | V | CHD | Incident | 872 | Direct | 2e-10
PM26 | 390 | V | Survival | Post CHD | 692 | Direct | 1e-08
PM27 | 378 | V | VAT | Prevalent | 5239 | Inverse | 6e-34
PM27 | 378 | V | MetS | Prevalent | 1127 | Inverse | 4e-11
PM27 | 378 | V | HF | Prevalent | 172 | Direct | 6e-09
PM27 | 378 | V | Survival | Overall | 2982 | Direct | 9e-08
*Survival probability was estimated either as post incident CHD or overall survival post entry into the
AGES study (see Materials and Methods). Data were analyzed using forward linear or logistic
regression or Cox proportional hazards regression, depending on the outcome being continuous, binary
or a time to an event. Abbreviations: MetS, metabolic syndrome; VAT, visceral adipose tissue via CT;
SAT, subcutaneous adipose tissue via CT; T2D, type 2 diabetes; CHD, coronary heart disease; HF,
heart failure. See table S2 for descriptive statistics of the study cohort.
Table S13. Cis-acting serum pSNP-protein pairs
All cis-acting pSNP-protein pairs detected within a 300kb window across and including a
given serum protein encoding gene. For each specific cis effect we report the single strongest
one (lead pSNP), and do not consider multiple independent cis effects per region.
(Excel table hosted online)
Table S14. Cross-referencing cis acting serum pSNP-proteins with eSNP-transcript pairs
Matching cis pSNP-proteins to expression eSNPs-transcripts pairs identified in >30 solid
tissues or cell types, using the stringent cutoffs of P < 5×10⁻⁸ for significance and r2≥0.8 for
SNP proxy.
(Excel table hosted online)
Table S15. Cross-referencing cis acting serum pSNPs with GWAS lead SNPs
Cross-referencing cis pSNPs to genome-wide significant GWAS lead SNPs, using the
stringent cutoffs of P < 5×10⁻⁸ for significance and r2≥0.8 for pSNP proxy.
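The two cutoffs used for cross-referencing (genome-wide significance and r2≥0.8 for a proxy SNP) can be sketched as below. This is an illustration only: it estimates r2 as the squared Pearson correlation of genotype dosages (0, 1, or 2 copies of an allele), which is a common approximation and not necessarily how the proxies in Tables S14 and S15 were derived. The function names and genotype vectors are hypothetical.

```python
def ld_r2(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return sxy * sxy / (sxx * syy)


def is_cross_reference(psnp, other_snp, p_other, r2_min=0.8, p_max=5e-8):
    """True if the other SNP is significant and a close proxy of the pSNP."""
    return p_other < p_max and ld_r2(psnp, other_snp) >= r2_min


# Hypothetical dosages for eight individuals.
psnp = [0, 1, 2, 1, 0, 2, 1, 0]
linked = [0, 1, 2, 1, 0, 2, 1, 0]    # identical genotypes, so r2 is 1
unlinked = [0, 0, 1, 2, 2, 1, 0, 1]  # nearly uncorrelated with psnp

print(is_cross_reference(psnp, linked, p_other=1e-12))
print(is_cross_reference(psnp, unlinked, p_other=1e-12))
print(is_cross_reference(psnp, linked, p_other=1e-5))
```

Only the first call meets both cutoffs; the second fails on r2 and the third on the P-value.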
***Known GWAS findings are reported in the PhenoScanner (20), and/or the GWAS catalogue (68).
Table S20. Effects of network associated SNP (npSNP) on individual serum proteins
The SNPs associated with module E(q)s listed in table S19 mediated cis and trans acting
effects on multiple proteins which cluster within specific protein modules. The genome-wide
significant association threshold for individual cis and trans effects mediated by the npSNPs
was set at Bonferroni adjusted P<5×10⁻⁷ (corrected for number of aptamers and npSNPs
tested). FET, Fisher exact test. N/A, not applicable.
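E(q) | Lead npSNP | Adjacent cis effect(s) | #Trans effects | Module affected | #Cis and trans effects in module | FET P-value
PM1 | rs204896 | C4B, TNXB | 78 | PM13 | 33 | 1e-19
PM1 | rs204896 | C4B, TNXB | 78 | PM15 | 18 | 6e-13
PM2 | rs704 | VTN | 698 | PM2 | 27 | 1e-06
PM2 | rs704 | VTN | 698 | PM4 | 34 | 3e-10
PM2 | rs704 | VTN | 698 | PM6 | 68 | 4e-21
PM2 | rs704 | VTN | 698 | PM7 | 87 | 1e-75
PM2 | rs704 | VTN | 698 | PM10 | 160 | 4e-54
PM3 | rs6813952 | None | 81 | PM3 | 67 | 4e-39
PM4 | rs10761731 | None | 27 | PM3 | 18 | 4e-09
PM6 | rs13026392 | None | 61 | PM6 | 22 | 1e-17
PM6 | rs13026392 | None | 61 | PM7 | 15 | 4e-13
PM7 | rs704 | VTN | 698 | PM2 | 27 | 1e-06
PM7 | rs704 | VTN | 698 | PM4 | 34 | 3e-10
PM7 | rs704 | VTN | 698 | PM6 | 68 | 4e-21
PM7 | rs704 | VTN | 698 | PM7 | 87 | 1e-75
PM7 | rs704 | VTN | 698 | PM10 | 160 | 4e-54
PM7 | rs887829 | UGT1A6 | 8 | PM1 | 7 | 1e-14
PM9 | rs1250229 | FN1 | 6 | None | N/A | N/A
PM10 | rs704 | VTN | 698 | PM2 | 27 | 1e-06
PM10 | rs704 | VTN | 698 | PM4 | 34 | 3e-10
PM10 | rs704 | VTN | 698 | PM6 | 68 | 4e-21
PM10 | rs704 | VTN | 698 | PM7 | 87 | 1e-75
PM10 | rs704 | VTN | 698 | PM10 | 160 | 4e-54
PM11 | rs445925 | APOE | 37 | PM11 | 16 | 4e-25
PM11 | rs157582 | APOE | 37 | PM11 | 19 | 1e-31
PM11 | rs6857 | None | 35 | PM11 | 19 | 5e-32
PM11 | rs1803274 | BCHE | 20 | PM11 | 9 | 3e-15
PM12 | rs17836931 | None | 27 | PM12 | 6 | 2e-06
PM12 | rs17836931 | None | 27 | PM14 | 8 | 1e-08
PM13 | rs1329424 | CFHR1, 4, 5 | 129 | PM13 | 48 | 1e-24
PM13 | rs1329424 | CFHR1, 4, 5 | 129 | PM15 | 32 | 3e-22
PM13 | rs541862 | C4A/B, CFB | 106 | PM13 | 55 | 6e-37
PM13 | rs541862 | C4A/B, CFB | 106 | PM15 | 37 | 3e-31
PM14 | rs541862 | C4A/B, CFB | 106 | PM13 | 55 | 6e-37
PM14 | rs541862 | C4A/B, CFB | 106 | PM15 | 37 | 3e-31
PM15 | rs1329424 | CFHR1, 4, 5 | 129 | PM13 | 48 | 1e-25
PM15 | rs1329424 | CFHR1, 4, 5 | 129 | PM15 | 32 | 3e-22
PM15 | rs389512 | C4A/B, CFB | 158 | PM13 | 68 | 1e-38
PM15 | rs389512 | C4A/B, CFB | 158 | PM15 | 46 | 4e-34
PM16 | rs1970793 | None | 19 | PM16 | 15 | 4e-19
PM17 | rs17080938 | None | 7 | PM17 | 5 | 3e-09
PM18 | rs2562545 | None | 9 | None | N/A | N/A
PM19 | rs17091323 | None | 11 | PM19 | 3 | 0.0006
PM19 | rs17091323 | None | 11 | PM26 | 5 | 0.0007
PM20 | rs719482 | IGHG1-4 | 68 | PM20 | 25 | 7e-34
PM20 | rs719482 | IGHG1-4 | 68 | PM25 | 23 | 6e-31
PM20 | rs2885162 | None | 10 | PM20 | 6 | 2e-11
PM21 | rs719482 | IGHG1-4 | 68 | PM20 | 25 | 7e-34
PM21 | rs719482 | IGHG1-4 | 68 | PM25 | 23 | 6e-31
PM23 | rs357707 | None | 20 | PM23 | 11 | 2e-18
PM26 | rs881029 | None | 43 | PM26 | 31 | 8e-26
PM27 | rs6683597 | None | 28 | PM27 | 15 | 1e-10
PM27 | rs6683597 | None | 28 | PM26 | 9 | 0.0001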
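A module-enrichment P-value of the kind reported in the FET column can be sketched as a one-sided Fisher exact test, which is equivalent to the upper tail of a hypergeometric distribution. The example below uses only the Python standard library and takes the PM3 row as input: 81 affected proteins, 67 of which fall in module PM3 (921 proteins). It assumes the background is all 4,137 measured proteins and that the test is one-sided; neither assumption is stated in the table, so the result is not expected to reproduce the tabulated value exactly. The function name is hypothetical.

```python
from math import comb


def enrichment_p(universe, module_size, n_affected, n_affected_in_module):
    """Upper-tail hypergeometric P-value, i.e. a one-sided Fisher exact test.

    Probability of seeing at least `n_affected_in_module` module members
    when `n_affected` proteins are drawn at random from `universe`.
    """
    upper = min(module_size, n_affected)
    favourable = sum(
        comb(module_size, k) * comb(universe - module_size, n_affected - k)
        for k in range(n_affected_in_module, upper + 1)
    )
    return favourable / comb(universe, n_affected)


# Inputs from Tables S1, S8 and S20 (background size is an assumption).
p = enrichment_p(universe=4137, module_size=921, n_affected=81,
                 n_affected_in_module=67)
print(f"one-sided FET P = {p:.1e}")
```

Exact integer arithmetic via `math.comb` avoids the underflow that a naive floating-point product of probabilities would hit for P-values this small.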
Table S21. Tissue specificity of cis-to-trans protein pairs
Tissue specific expression of transcripts encoding the cis-to-trans regulated proteins based on
53 different human tissues (median RPKM by tissue) downloaded from GTEx
(https://www.gtexportal.org) on 07/25/2017.
(Excel table hosted online)
Table S22. Tissue specificity of npSNPs
Tissue-specificity of network-associated protein SNPs (npSNPs).
(Excel table hosted online)
References and Notes
1. J. M. Schwenk, G. S. Omenn, Z. Sun, D. S. Campbell, M. S. Baker, C. M. Overall, R. Aebersold, R. L. Moritz, E. W. Deutsch, The Human Plasma Proteome Draft of 2017: Building on the Human Plasma PeptideAtlas from Mass Spectrometry and Complementary Assays. J. Proteome Res. 16, 4299–4310 (2017). doi:10.1021/acs.jproteome.7b00467 Medline
2. M. Uhlén, L. Fagerberg, B. M. Hallström, C. Lindskog, P. Oksvold, A. Mardinoglu, Å. Sivertsson, C. Kampf, E. Sjöstedt, A. Asplund, I. Olsson, K. Edlund, E. Lundberg, S. Navani, C. A.-K. Szigyarto, J. Odeberg, D. Djureinovic, J. O. Takanen, S. Hober, T. Alm, P.-H. Edqvist, H. Berling, H. Tegel, J. Mulder, J. Rockberg, P. Nilsson, J. M. Schwenk, M. Hamsten, K. von Feilitzen, M. Forsberg, L. Persson, F. Johansson, M. Zwahlen, G. von Heijne, J. Nielsen, F. Pontén, Proteomics. Tissue-based map of the human proteome. Science 347, 1260419 (2015). doi:10.1126/science.1260419 Medline
3. M. Stastna, J. E. Van Eyk, Secreted proteins as a fundamental source for biomarker discovery. Proteomics 12, 722–735 (2012). doi:10.1002/pmic.201100346 Medline
4. I. M. Conboy, M. J. Conboy, A. J. Wagers, E. R. Girma, I. L. Weissman, T. A. Rando, Rejuvenation of aged progenitor cells by exposure to a young systemic environment. Nature 433, 760–764 (2005). doi:10.1038/nature03260 Medline
5. S. A. Villeda, J. Luo, K. I. Mosher, B. Zou, M. Britschgi, G. Bieri, T. M. Stan, N. Fainberg, Z. Ding, A. Eggel, K. M. Lucin, E. Czirr, J.-S. Park, S. Couillard-Després, L. Aigner, G. Li, E. R. Peskind, J. A. Kaye, J. F. Quinn, D. R. Galasko, X. S. Xie, T. A. Rando, T. Wyss-Coray, The ageing systemic milieu negatively regulates neurogenesis and cognitive function. Nature 477, 90–94 (2011). doi:10.1038/nature10357 Medline
6. E. E. Schadt, Molecular networks as sensors and drivers of common human diseases. Nature 461, 218–223 (2009). doi:10.1038/nature08454 Medline
7. B. Zhang, C. Gaiteri, L.-G. Bodea, Z. Wang, J. McElwee, A. A. Podtelezhnikov, C. Zhang, T. Xie, L. Tran, R. Dobrin, E. Fluder, B. Clurman, S. Melquist, M. Narayanan, C. Suver, H. Shah, M. Mahajan, T. Gillis, J. Mysore, M. E. MacDonald, J. R. Lamb, D. A. Bennett, C. Molony, D. J. Stone, V. Gudnason, A. J. Myers, E. E. Schadt, H. Neumann, J. Zhu, V. Emilsson, Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer’s disease. Cell 153, 707–720 (2013). doi:10.1016/j.cell.2013.03.030 Medline
8. V. Emilsson, G. Thorleifsson, B. Zhang, A. S. Leonardson, F. Zink, J. Zhu, S. Carlson, A. Helgason, G. B. Walters, S. Gunnarsdottir, M. Mouy, V. Steinthorsdottir, G. H. Eiriksdottir, G. Bjornsdottir, I. Reynisdottir, D. Gudbjartsson, A. Helgadottir, A. Jonasdottir, A. Jonasdottir, U. Styrkarsdottir, S. Gretarsdottir, K. P. Magnusson, H. Stefansson, R. Fossdal, K. Kristjansson, H. G. Gislason, T. Stefansson, B. G. Leifsson, U. Thorsteinsdottir, J. R. Lamb, J. R. Gulcher, M. L. Reitman, A. Kong, E. E. Schadt, K. Stefansson, Genetics of gene expression and its effect on disease. Nature 452, 423–428 (2008). doi:10.1038/nature06758 Medline
9. Y. Chen, J. Zhu, P. Y. Lum, X. Yang, S. Pinto, D. J. MacNeil, C. Zhang, J. Lamb, S. Edwards, S. K. Sieberts, A. Leonardson, L. W. Castellini, S. Wang, M.-F. Champy, B. Zhang, V. Emilsson, S. Doss, A. Ghazalpour, S. Horvath, T. A. Drake, A. J. Lusis, E. E. Schadt, Variations in DNA elucidate molecular networks that cause disease. Nature 452, 429–435 (2008). doi:10.1038/nature06757 Medline
10. D. R. Davies, A. D. Gelinas, C. Zhang, J. C. Rohloff, J. D. Carter, D. O’Connell, S. M. Waugh, S. K. Wolk, W. S. Mayfield, A. B. Burgin, T. E. Edwards, L. J. Stewart, L. Gold, N. Janjic, T. C. Jarvis, Unique motifs and hydrophobic interactions shape the binding of modified DNA ligands to protein targets. Proc. Natl. Acad. Sci. U.S.A. 109, 19971–19976 (2012). doi:10.1073/pnas.1213933109 Medline
11. L. Gold, D. Ayers, J. Bertino, C. Bock, A. Bock, E. N. Brody, J. Carter, A. B. Dalby, B. E. Eaton, T. Fitzwater, D. Flather, A. Forbes, T. Foreman, C. Fowler, B. Gawande, M. Goss, M. Gunn, S. Gupta, D. Halladay, J. Heil, J. Heilig, B. Hicke, G. Husar, N. Janjic, T. Jarvis, S. Jennings, E. Katilius, T. R. Keeney, N. Kim, T. H. Koch, S. Kraemer, L. Kroiss, N. Le, D. Levine, W. Lindsey, B. Lollo, W. Mayfield, M. Mehan, R. Mehler, S. K. Nelson, M. Nelson, D. Nieuwlandt, M. Nikrad, U. Ochsner, R. M. Ostroff, M. Otis, T. Parker, S. Pietrasiewicz, D. I. Resnicow, J. Rohloff, G. Sanders, S. Sattin, D. Schneider, B. Singer, M. Stanton, A. Sterkel, A. Stewart, S. Stratford, J. D. Vaught, M. Vrkljan, J. J. Walker, M. Watrobka, S. Waugh, A. Weiss, S. K. Wilcox, A. Wolfson, S. K. Wolk, C. Zhang, D. Zichi, Aptamer-based multiplexed proteomic technology for biomarker discovery. PLOS ONE 5, e15004 (2010). doi:10.1371/journal.pone.0015004 Medline
12. T. B. Harris, L. J. Launer, G. Eiriksdottir, O. Kjartansson, P. V. Jonsson, G. Sigurdsson, G. Thorgeirsson, T. Aspelund, M. E. Garcia, M. F. Cotch, H. J. Hoffman, V. Gudnason, Age, Gene/Environment Susceptibility-Reykjavik Study: Multidisciplinary applied phenomics. Am. J. Epidemiol. 165, 1076–1087 (2007). doi:10.1093/aje/kwk115 Medline
13. A. L. Barabási, R. Albert, Emergence of scaling in random networks. Science 286, 509–512 (1999). doi:10.1126/science.286.5439.509 Medline
14. B. Zhang, S. Horvath, A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 4, e17 (2005). doi:10.2202/1544-6115.1128 Medline
15. P. Langfelder, R. Luo, M. C. Oldham, S. Horvath, Is my network module preserved and reproducible? PLOS Comput. Biol. 7, e1001057 (2011). doi:10.1371/journal.pcbi.1001057 Medline
16. L. Shu, K. H. K. Chan, G. Zhang, T. Huan, Z. Kurt, Y. Zhao, V. Codoni, D.-A. Trégouët, J. Yang, J. G. Wilson, X. Luo, D. Levy, A. J. Lusis, S. Liu, X. Yang; Cardiogenics Consortium, Shared genetic regulatory networks for cardiovascular disease and type 2 diabetes in multiple populations of diverse ethnicities in the United States. PLOS Genet. 13, e1007040 (2017). doi:10.1371/journal.pgen.1007040 Medline
17. H. Jeong, S. P. Mason, A. L. Barabási, Z. N. Oltvai, Lethality and centrality in protein networks. Nature 411, 41–42 (2001). doi:10.1038/35075138 Medline
18. A. L. Barabási, N. Gulbahce, J. Loscalzo, Network medicine: A network-based approach to human disease. Nat. Rev. Genet. 12, 56–68 (2011). doi:10.1038/nrg2918 Medline
19. M. Muñoz, R. Pong-Wong, O. Canela-Xandri, K. Rawlik, C. S. Haley, A. Tenesa, Evaluating the contribution of genetics and familial shared environment to common disease using the UK Biobank. Nat. Genet. 48, 980–983 (2016). Medline
20. J. R. Staley, J. Blackshaw, M. A. Kamat, S. Ellis, P. Surendran, B. B. Sun, D. S. Paul, D. Freitag, S. Burgess, J. Danesh, R. Young, A. S. Butterworth, PhenoScanner: A database of human genotype-phenotype associations. Bioinformatics 32, 3207–3209 (2016). doi:10.1093/bioinformatics/btw373 Medline
21. N. Mähler, J. Wang, B. K. Terebieniec, P. K. Ingvarsson, N. R. Street, T. R. Hvidsten, Gene co-expression network connectivity is an important determinant of selective constraint. PLOS Genet. 13, e1006402 (2017). doi:10.1371/journal.pgen.1006402 Medline
22. J. K. Pickrell, T. Berisa, J. Z. Liu, L. Ségurel, J. Y. Tung, D. A. Hinds, Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016). doi:10.1038/ng.3570 Medline
23. M. Franchini, G. Lippi, The intriguing relationship between the ABO blood group, cardiovascular disease, and cancer. BMC Med. 13, 7 (2015). doi:10.1186/s12916-014-0250-y Medline
24. M. Franchini, F. Capra, G. Targher, M. Montagnana, G. Lippi, Relationship between ABO blood group and von Willebrand factor levels: From biology to clinical implications. Thromb. J. 5, 14 (2007). doi:10.1186/1477-9560-5-14 Medline
25. E. A. Boyle, Y. I. Li, J. K. Pritchard, An Expanded View of Complex Traits: From Polygenic to Omnigenic. Cell 169, 1177–1186 (2017). doi:10.1016/j.cell.2017.05.038 Medline
26. D. Alfego, U. Rodeck, A. Kriete, Global mapping of transcription factor motifs in human aging. PLOS ONE 13, e0190457 (2018). doi:10.1371/journal.pone.0190457 Medline
27. J. Yang, T. Huang, F. Petralia, Q. Long, B. Zhang, C. Argmann, Y. Zhao, C. V. Mobbs, E. E. Schadt, J. Zhu, Z. Tu; GTEx Consortium, Synchronized age-related gene expression changes across multiple tissues in human and the link to complex diseases. Sci. Rep. 5, 15145 (2015). doi:10.1038/srep15145 Medline
28. J. M. Zahn, S. Poosala, A. B. Owen, D. K. Ingram, A. Lustig, A. Carter, A. T. Weeraratna, D. D. Taub, M. Gorospe, K. Mazan-Mamczarz, E. G. Lakatta, K. R. Boheler, X. Xu, M. P. Mattson, G. Falco, M. S. H. Ko, D. Schlessinger, J. Firman, S. K. Kummerfeld, W. H. Wood 3rd, A. B. Zonderman, S. K. Kim, K. G. Becker, AGEMAP: A gene expression database for aging in mice. PLOS Genet. 3, e201 (2007). doi:10.1371/journal.pgen.0030201 Medline
29. P. Langfelder, B. Zhang, S. Horvath, Defining clusters from a hierarchical cluster tree: The Dynamic Tree Cut package for R. Bioinformatics 24, 719–720 (2008). doi:10.1093/bioinformatics/btm563 Medline
30. G. Csardi, T. Nepusz, The igraph software package for complex network research. InterJournal. Complex Syst. 1695, 1 (2006).
31. American Diabetes Association, Diagnosis and classification of diabetes mellitus. Diabetes Care 36 (suppl. 1), S67–S74 (2013). doi:10.2337/dc13-S067 Medline
32. A. Agarwala, S. Virani, D. Couper, L. Chambless, E. Boerwinkle, B. C. Astor, R. C. Hoogeveen, J. Coresh, A. R. Sharrett, A. R. Folsom, T. Mosley, C. M. Ballantyne, V. Nambi, Biomarkers and degree of atherosclerosis are independently associated with incident atherosclerotic cardiovascular disease in a primary prevention cohort: The ARIC study. Atherosclerosis 253, 156–163 (2016). doi:10.1016/j.atherosclerosis.2016.08.028 Medline
33. Y. Hathout, E. Brody, P. R. Clemens, L. Cripe, R. K. DeLisle, P. Furlong, H. Gordish-Dressman, L. Hache, E. Henricson, E. P. Hoffman, Y. M. Kobayashi, A. Lorts, J. K. Mah, C. McDonald, B. Mehler, S. Nelson, M. Nikrad, B. Singer, F. Steele, D. Sterling, H. L. Sweeney, S. Williams, L. Gold, Large-scale serum protein biomarker discovery in Duchenne muscular dystrophy. Proc. Natl. Acad. Sci. U.S.A. 112, 7153–7158 (2015). doi:10.1073/pnas.1507719112 Medline
34. J. Candia, F. Cheung, Y. Kotliarov, G. Fantoni, B. Sellers, T. Griesman, J. Huang, S. Stuccio, A. Zingone, B. M. Ryan, J. S. Tsang, A. Biancotto, Assessment of Variability in the SOMAscan Assay. Sci. Rep. 7, 14248 (2017). doi:10.1038/s41598-017-14755-5 Medline
35. M. Kuhn, K. Johnson, Applied Predictive Modeling (Springer, 2013).
36. J. Barretina, G. Caponigro, N. Stransky, K. Venkatesan, A. A. Margolin, S. Kim, C. J. Wilson, J. Lehár, G. V. Kryukov, D. Sonkin, A. Reddy, M. Liu, L. Murray, M. F. Berger, J. E. Monahan, P. Morais, J. Meltzer, A. Korejwa, J. Jané-Valbuena, F. A. Mapa, J. Thibault, E. Bric-Furlong, P. Raman, A. Shipway, I. H. Engels, J. Cheng, G. K. Yu, J. Yu, P. Aspesi Jr., M. de Silva, K. Jagtap, M. D. Jones, L. Wang, C. Hatton, E. Palescandolo, S. Gupta, S. Mahan, C. Sougnez, R. C. Onofrio, T. Liefeld, L. MacConaill, W. Winckler, M. Reich, N. Li, J. P. Mesirov, S. B. Gabriel, G. Getz, K. Ardlie, V. Chan, V. E. Myer, B. L. Weber, J. Porter, M. Warmuth, P. Finan, J. L. Harris, M. Meyerson, T. R. Golub, M. P. Morrissey, W. R. Sellers, R. Schlegel, L. A. Garraway, The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012). doi:10.1038/nature11003 Medline
37. B. MacLean, D. M. Tomazela, S. E. Abbatiello, S. Zhang, J. R. Whiteaker, A. G. Paulovich, S. A. Carr, M. J. MacCoss, Effect of collision energy optimization on the measurement of peptides by selected reaction monitoring (SRM) mass spectrometry. Anal. Chem. 82, 10116–10124 (2010). doi:10.1021/ac102179j Medline
38. Y. Mohammed, D. Domański, A. M. Jackson, D. S. Smith, A. M. Deelder, M. Palmblad, C. H. Borchers, PeptidePicker: A scientific workflow with web interface for selecting appropriate peptides for targeted proteomics experiments. J. Proteomics 106, 151–161 (2014). doi:10.1016/j.jprot.2014.04.018 Medline
39. B. MacLean, D. M. Tomazela, N. Shulman, M. Chambers, G. L. Finney, B. Frewen, R. Kern, D. L. Tabb, D. C. Liebler, M. J. MacCoss, Skyline: An open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26, 966–968 (2010). doi:10.1093/bioinformatics/btq054 Medline
40. J. A. Vizcaíno, R. G. Côté, A. Csordas, J. A. Dianes, A. Fabregat, J. M. Foster, J. Griss, E. Alpi, M. Birim, J. Contell, G. O’Kelly, A. Schoenegger, D. Ovelleiro, Y. Pérez-Riverol, F. Reisinger, D. Ríos, R. Wang, H. Hermjakob, The PRoteomics IDEntifications (PRIDE) database and associated tools: Status in 2013. Nucleic Acids Res. 41, D1063–D1069 (2013). doi:10.1093/nar/gks1262 Medline
41. M. M. Chan, R. Santhanakrishnan, J. P. C. Chong, Z. Chen, B. C. Tai, O. W. Liew, T. P. Ng, L. H. Ling, D. Sim, K. T. G. Leong, P. S. D. Yeo, H.-Y. Ong, F. Jaufeerally, R. C.-C. Wong, P. Chai, A. F. Low, A. M. Richards, C. S. P. Lam, Growth differentiation factor 15 in heart failure with preserved vs. reduced ejection fraction. Eur. J. Heart Fail. 18, 81–88 (2016). doi:10.1002/ejhf.431 Medline
42. P. G. van Peet, A. J. de Craen, J. Gussekloo, W. de Ruijter, Plasma NT-proBNP as predictor of change in functional status, cardiovascular morbidity and mortality in the oldest old: The Leiden 85-plus study. Age (Dordr.) 36, 9660 (2014). doi:10.1007/s11357-014-9660-1 Medline
43. P. Langfelder, S. Horvath, WGCNA: An R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008). doi:10.1186/1471-2105-9-559 Medline
44. E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, A. L. Barabási, Hierarchical organization of modularity in metabolic networks. Science 297, 1551–1555 (2002). doi:10.1126/science.1073374 Medline
45. S. L. Carter, C. M. Brechbühler, M. Griffin, A. T. Bond, Gene co-expression network topology provides a framework for molecular characterization of cellular state. Bioinformatics 20, 2242–2250 (2004). doi:10.1093/bioinformatics/bth234 Medline
46. M. C. Oldham, S. Horvath, D. H. Geschwind, Conservation and evolution of gene coexpression networks in human and chimpanzee brains. Proc. Natl. Acad. Sci. U.S.A. 103, 17973–17978 (2006). doi:10.1073/pnas.0605938103 Medline
47. R. Albert, H. Jeong, A.-L. Barabási, Error and attack tolerance of complex networks. Nature 406, 378–382 (2000). doi:10.1038/35019019 Medline
48. R. Albert, A.-L. Barabási, Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002). doi:10.1103/RevModPhys.74.47
49. J.-D. J. Han, N. Bertin, T. Hao, D. S. Goldberg, G. F. Berriz, L. V. Zhang, D. Dupuy, A. J. M. Walhout, M. E. Cusick, F. P. Roth, M. Vidal, Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature 430, 88–93 (2004). doi:10.1038/nature02555 Medline
50. G. Chauhan, C. R. Arnold, A. Y. Chu, M. Fornage, A. Reyahi, J. C. Bis, A. S. Havulinna, M. Sargurupremraj, A. V. Smith, H. H. H. Adams, S. H. Choi, S. L. Pulit, S. Trompet, M. E. Garcia, A. Manichaikul, A. Teumer, S. Gustafsson, T. M. Bartz, C. Bellenguez, J. S. Vidal, X. Jian, O. Kjartansson, K. L. Wiggins, C. L. Satizabal, F. Xue, S. Ripatti, Y. Liu, J. Deelen, M. den Hoed, S. Bevan, J. C. Hopewell, R. Malik, S. R. Heckbert, K. Rice, N. L. Smith, C. Levi, P. Sharma, C. L. M. Sudlow, A. M. Nik, J. W. Cole, R. Schmidt, J. Meschia, V. Thijs, A. Lindgren, O. Melander, R. P. Grewal, R. L. Sacco, T. Rundek, P. M. Rothwell, D. K. Arnett, C. Jern, J. A. Johnson, O. R. Benavente, S. Wassertheil-Smoller, J.-M. Lee, Q. Wong, H. J. Aparicio, S. T. Engelter, M. Kloss, D. Leys, A. Pezzini, J. E. Buring, P. M. Ridker, C. Berr, J.-F. Dartigues, A. Hamsten, P. K. Magnusson, M. Traylor, N. L. Pedersen, L. Lannfelt, L. Lindgren, C. M. Lindgren, A. P. Morris, J. Jimenez-Conde, J. Montaner, F. Radmanesh, A. Slowik, D. Woo, A. Hofman, P. J. Koudstaal, M. L. P. Portegies, A. G. Uitterlinden, A. J. M. de Craen, I. Ford, J. W. Jukema, D. J. Stott, N. B. Allen, M. M. Sale, A. D. Johnson, D. A. Bennett, P. L. De Jager, C. C. White, H. J. Grabe, M. R. P. Markus, U. Schminke, G. B. Boncoraglio, R. Clarke, Y. Kamatani, J. Dallongeville, O. L. Lopez, J. I. Rotter, M. A. Nalls, R. F. Gottesman, M. E. Griswold, D. S. Knopman, B. G. Windham, A. Beiser, H. S. Markus, E. Vartiainen, C. R. French, M. Dichgans, T. Pastinen, M. Lathrop, V. Gudnason, T. Kurth, B. M. Psaty, T. B. Harris, S. S. Rich, A. L. deStefano, C. O. Schmidt, B. B. Worrall, J. Rosand, V. Salomaa, T. H. Mosley, E. Ingelsson, C. M. van Duijn, C. Tzourio, K. M. Rexrode, O. J. Lehmann, L. J. Launer, M. A. Ikram, P. Carlsson, D. I. Chasman, S. J. Childs, W. T. Longstreth, S. Seshadri, S. Debette; Neurology Working Group of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium, the Stroke Genetics Network (SiGN), and the International Stroke Genetics Consortium, Identification of additional risk loci for stroke and small vessel disease: A meta-analysis of genome-wide association studies. Lancet Neurol. 15, 695–707 (2016). doi:10.1016/S1474-4422(16)00102-2
51. E. J. Foss, D. Radulovic, S. A. Shaffer, D. R. Goodlett, L. Kruglyak, A. Bedalov, Genetic variation shapes protein networks mainly through non-transcriptional mechanisms. PLOS Biol. 9, e1001144 (2011). doi:10.1371/journal.pbio.1001144 Medline
52. C. Gaiteri, Y. Ding, B. French, G. C. Tseng, E. Sibille, Beyond modules and hubs: The potential of gene coexpression networks for investigating molecular mechanisms of complex brain disorders. Genes Brain Behav. 13, 13–24 (2014). doi:10.1111/gbb.12106 Medline
53. D. L. Nicolae, E. Gamazon, W. Zhang, S. Duan, M. E. Dolan, N. J. Cox, Trait-associated SNPs are more likely to be eQTLs: Annotation to enhance discovery from GWAS. PLOS Genet. 6, e1000888 (2010). doi:10.1371/journal.pgen.1000888 Medline
54. Å. Johansson, S. Enroth, M. Palmblad, A. M. Deelder, J. Bergquist, U. Gyllensten, Identification of genetic variants influencing the human plasma proteome. Proc. Natl. Acad. Sci. U.S.A. 110, 4673–4678 (2013). doi:10.1073/pnas.1217238110 Medline
55. S. Kim, S. Swaminathan, M. Inlow, S. L. Risacher, K. Nho, L. Shen, T. M. Foroud, R. C. Petersen, P. S. Aisen, H. Soares, J. B. Toledo, L. M. Shaw, J. Q. Trojanowski, M. W. Weiner, B. C. McDonald, M. R. Farlow, B. Ghetti, A. J. Saykin; Alzheimer’s Disease Neuroimaging Initiative (ADNI), Influence of genetic variation on plasma protein levels in older adults using a multi-analyte panel. PLOS ONE 8, e70269 (2013). doi:10.1371/journal.pone.0070269 Medline
56. S. Enroth, A. Johansson, S. B. Enroth, U. Gyllensten, Strong effects of genetic and lifestyle factors on biomarker variation and use of personalized cutoffs. Nat. Commun. 5, 4684 (2014). doi:10.1038/ncomms5684 Medline
57. Y. Liu, A. Buil, B. C. Collins, L. C. Gillet, L. C. Blum, L.-Y. Cheng, O. Vitek, J. Mouritsen, G. Lachance, T. D. Spector, E. T. Dermitzakis, R. Aebersold, Quantitative variability of 342 plasma proteins in a human twin population. Mol. Syst. Biol. 11, 786 (2015). doi:10.15252/msb.20145728 Medline
58. B. B. Sun, J. C. Maranville, J. E. Peters, D. Stacey, J. R. Staley, J. Blackshaw, S. Burgess, T. Jiang, E. Paige, P. Surendran, C. Oliver-Williams, M. A. Kamat, B. P. Prins, S. K. Wilcox, E. S. Zimmerman, A. Chi, N. Bansal, S. L. Spain, A. M. Wood, N. W. Morrell, J. R. Bradley, N. Janjic, D. J. Roberts, W. H. Ouwehand, J. A. Todd, N. Soranzo, K. Suhre, D. S. Paul, C. S. Fox, R. M. Plenge, J. Danesh, H. Runz, A. S. Butterworth, Genomic atlas of the human plasma proteome. Nature 558, 73–79 (2018). doi:10.1038/s41586-018-0175-2 Medline
59. K. Suhre, M. Arnold, A. M. Bhagwat, R. J. Cotton, R. Engelke, J. Raffler, H. Sarwath, G. Thareja, A. Wahl, R. K. DeLisle, L. Gold, M. Pezer, G. Lauc, M. A. El-Din Selim, D. O. Mook-Kanamori, E. K. Al-Dous, Y. A. Mohamoud, J. Malek, K. Strauch, H. Grallert, A. Peters, G. Kastenmüller, C. Gieger, J. Graumann, Connecting genetic risk to disease end points through the human blood plasma proteome. Nat. Commun. 8, 14357 (2017). doi:10.1038/ncomms14357 Medline
60. I. Bhattacharya, Z. Manukyan, P. Chan, A. Heatherington, L. Harnisch, Application of Quantitative Pharmacology Approaches in Bridging Pharmacokinetics and Pharmacodynamics of Domagrozumab From Adult Healthy Subjects to Pediatric Patients With Duchenne Muscular Disease. J. Clin. Pharmacol. 58, 314–326 (2018). Medline
61. K. Kondás, G. Szláma, M. Trexler, L. Patthy, Both WFIKKN1 and WFIKKN2 have high affinity for growth and differentiation factors 8 and 11. J. Biol. Chem. 283, 23677–23684 (2008). doi:10.1074/jbc.M803025200 Medline
62. H. Sun, Y. Zhu, H. Pan, X. Chen, J. L. Balestrini, T. T. Lam, J. E. Kanyo, A. Eichmann, M. Gulati, W. H. Fares, H. Bai, C. A. Feghali-Bostwick, Y. Gan, X. Peng, M. W. Moore, E. S. White, P. Sava, A. L. Gonzalez, Y. Cheng, L. E. Niklason, E. L. Herzog, Netrin-1 Regulates Fibrocyte Accumulation in the Decellularized Fibrotic Sclerodermatous Lung Microenvironment and in Bleomycin-Induced Pulmonary Fibrosis. Arthritis Rheumatol. 68, 1251–1261 (2016). Medline
63. GTEx Consortium, The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans. Science 348, 648–660 (2015). doi:10.1126/science.1262110 Medline
64. J. Wang, D. Duncan, Z. Shi, B. Zhang, WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): Update 2013. Nucleic Acids Res. 41, W77–W83 (2013). doi:10.1093/nar/gkt439 Medline
65. J. E. Shoemaker, T. J. S. Lopes, S. Ghosh, Y. Matsuoka, Y. Kawaoka, H. Kitano, CTen: A web-based platform for identifying enriched cell types from heterogeneous microarray data. BMC Genomics 13, 460 (2012). doi:10.1186/1471-2164-13-460 Medline
66. D. Warde-Farley, S. L. Donaldson, O. Comes, K. Zuberi, R. Badrawi, P. Chao, M. Franz, C. Grouios, F. Kazi, C. T. Lopes, A. Maitland, S. Mostafavi, J. Montojo, Q. Shao, G. Wright, G. D. Bader, Q. Morris, The GeneMANIA prediction server: Biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 38 (suppl. 2), W214–W220 (2010). doi:10.1093/nar/gkq537 Medline
67. D. W. Huang, B. T. Sherman, R. A. Lempicki, Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009). doi:10.1038/nprot.2008.211 Medline
68. D. Welter, J. MacArthur, J. Morales, T. Burdett, P. Hall, H. Junkins, A. Klemm, P. Flicek, T. Manolio, L. Hindorff, H. Parkinson, The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 42 (D1), D1001–D1006 (2014). doi:10.1093/nar/gkt1229 Medline