Methods 50 (2010) 271–276
journal homepage: www.elsevier.com/locate/ymeth

Taking qPCR to a higher level: Analysis of CNV reveals the power of high throughput qPCR to enhance quantitative resolution

Suzanne Weaver, Simant Dube, Alain Mir, Jian Qin, Gang Sun, Ramesh Ramakrishnan, Robert C. Jones, Kenneth J. Livak *

Fluidigm Corporation, 7000 Shoreline Court, Suite 100, South San Francisco, CA 94080, USA

* Corresponding author. Fax: +1 650 871 7152. E-mail address: ken.livak@fluidigm.com (K.J. Livak).

Article history: Accepted 11 January 2010; Available online 15 January 2010

Keywords: qPCR; Real-time PCR; Digital PCR; Copy number variation; CNV; Microfluidic array; High throughput qPCR

1046-2023/$ - see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.ymeth.2010.01.003

Abstract

This paper assesses the quantitative resolution of qPCR using copy number variation (CNV) as a paradigm. An error model is developed for real-time qPCR data showing how the precision of CNV determination varies with the number of replicates. Using samples with varying numbers of X chromosomes, experimental data demonstrate that real-time qPCR can readily distinguish four copies from five copies, which corresponds to a 1.25-fold difference in relative quantity. Digital PCR is considered as an alternative form of qPCR. For digital PCR, an error model is shown that relates the precision of CNV determination to the number of reaction chambers. The quantitative capability of digital PCR is illustrated with an experiment distinguishing four and five copies of the human gene MRGPRX1. For either real-time qPCR or digital PCR, practical application of these models to achieve enhanced quantitative resolution requires use of a high throughput PCR platform that can simultaneously perform thousands of reactions. Comparing the two methods, real-time qPCR has the advantage of throughput and digital PCR has the advantage of simplicity in terms of the assumptions made for data analysis.

© 2010 Elsevier Inc. All rights reserved.

1. Introduction

The exponential nature of PCR has had a revolutionary impact on the study of biology. PCR amplification simultaneously addresses both the low quantity of a single copy gene in a sample and the difficulty of specifically detecting that single copy sequence in a highly complex background. The advent of real-time qPCR has added true quantitative ability to the power of PCR. Again, the exponential nature of PCR enables accurate quantification over as many as nine orders of magnitude. Nevertheless, the blessings of exponential amplification become a curse when trying to detect small differences in copy number. Optimized PCRs achieve a doubling of template with each cycle. Correspondingly, qPCR instrument manufacturers specify that their instruments will routinely distinguish 2-fold differences in starting copy number. Because of the doubling per cycle inherent in PCR, distinguishing finer differences than 2-fold requires reliable assessment of fractional cycle differences. This helps explain why literature reports on the limit for qPCR sensitivity range from 1.5- to 2-fold [1,2]. This paper will explore how much better than 2-fold discrimination qPCR can achieve.

Copy number variation (CNV) is an attractive application for examining the resolution of relative quantification. Attempts to identify quality metrics for evaluating gene expression measurements are often frustrated by the lack of standards with known relative quantities of specific transcripts. The MicroArray Quality Control (MAQC) project [3,4] established a framework to assess whether different platforms and laboratories obtain the same answer, but it did not provide a standard for the correct answer. In addition to the lack of standards, assessment of quantification is complicated by the variability introduced due to differing RNA quality and by the added complexity associated with the reverse transcriptase step used in most RNA detection methodologies. In contrast, the nature of germline DNA copy number variation results in relative quantification that is in integer ratios. Thus, the comparison of a sample with four copies of a sequence to a sample with five copies generates exactly a 1.25-fold difference in relative quantity. Furthermore, variability due to sample processing is reduced because CNV analysis is performed directly on genomic DNA. Here, we use human copy number variants as models [5,6] to assess the quantitative resolution of qPCR.

High throughput qPCR analysis has benefited from the development of microfluidic platforms that enable thousands of reactions in a single experiment [7–9]. The ability to simultaneously perform many reactions also has direct bearing on improving the quantitative resolution of qPCR. For real-time qPCR, this paper will present data on the relationship between the number of replicates and the number of copies per genome that can be distinguished. At about the same time that Higuchi was using real-time PCR [10], quantification of DNA using limiting dilutions, or digital PCR, was also reported [11,12]. In fact, digital PCR can be thought of as endpoint qPCR. High throughput platforms now make it practical to use digital PCR to detect small differences in copy number [13–17]. For digital PCR, the relationship that will be presented is between the number of reaction chambers and the number of copies per genome that can be distinguished. In addition, this paper will compare the quantitative capabilities of real-time qPCR and digital PCR.

2. Description of method

2.1. Measurement of CNV using real-time qPCR

2.1.1. Statistical analysis

A straightforward way of determining CNV using real-time PCR is to use the $2^{-\Delta\Delta C_q}$ method [18–21]. This method uses a target assay (T) for the DNA segment being interrogated for copy number variation and a reference assay (R) for an internal control segment, which is typically a known single copy gene. For the case where target and reference assays are run in separate reaction chambers, the first step in determining relative copy number is to calculate the average Cq values for each sample for the target and reference assays. The standard deviation (σ) of the measurements and the number of measurements (n) that determine each mean are used to calculate the Standard Error of the Mean (SEM).

The best estimate for the Cq of the target assay is:

$$\text{average } C_{q,T} \pm \frac{\sigma_T}{\sqrt{n_T}} = \overline{C}_{q,T} \pm \mathrm{SEM}_T \tag{1}$$

The best estimate for the Cq of the reference assay is:

$$\text{average } C_{q,R} \pm \frac{\sigma_R}{\sqrt{n_R}} = \overline{C}_{q,R} \pm \mathrm{SEM}_R \tag{2}$$

The ΔCq value ($C_{q,T} - C_{q,R}$) is a measure of the copy number of the target segment relative to the reference segment. The use of ΔCq normalizes for differences in input concentration when comparing different samples. The uncertainty in ΔCq is the square root of the quadratic sum of the uncertainties in the individual Cq values, assuming the uncertainties are independent and random [22]. The uncertainties are independent because each set of Cq values is derived from individual, independent reactions. In addition, we observe no significant systematic bias in the qPCR replicates or platform that would indicate the assumption about being random is invalid.

$$\mathrm{SEM}_{\Delta C_q} = \sqrt{\mathrm{SEM}_T^2 + \mathrm{SEM}_R^2} \tag{3}$$

The next step in determining relative copy number is to calibrate each ΔCq value to a sample with a known copy number for the target segment. Typically, the copy number of the target segment in the calibrator sample (C) is single copy per haploid genome (two copies per diploid genome). Calibration is done by calculating ΔΔCq:

$$\Delta\Delta C_q = (\Delta C_q - \Delta C_{q,C}) \pm \sqrt{\mathrm{SEM}_{\Delta C_q}^2 + \mathrm{SEM}_{\Delta C_{q,C}}^2} = \Delta\Delta C_q \pm \mathrm{SEM}_{\Delta\Delta C_q} \tag{4}$$

Assuming the efficiencies of the target and reference assays are similar and close to 1 [18], relative copy number (RCN) is calculated from the ΔΔCq value using the formula:

$$\mathrm{RCN} = 2^{-\Delta\Delta C_q} \tag{5}$$

For the case where the calibrator has two copies of the target segment per diploid genome:

$$\text{\# of copies per diploid genome} = 2 \times 2^{-\Delta\Delta C_q} \tag{6}$$

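The $\mathrm{SEM}_{\Delta\Delta C_q}$ value can be used to calculate the 95% confidence interval for each RCN value. The 95% confidence interval for ΔΔCq is given by $\pm t \times \mathrm{SEM}_{\Delta\Delta C_q}$, where t is the appropriate critical value for a t-distribution (two-tailed test with p = 0.05). Thus, the 95% upper and lower bounds for RCN are:

$$\mathrm{RCN}_{\max} = 2^{-\Delta\Delta C_q + t \times \mathrm{SEM}_{\Delta\Delta C_q}} \tag{7}$$

$$\mathrm{RCN}_{\min} = 2^{-\Delta\Delta C_q - t \times \mathrm{SEM}_{\Delta\Delta C_q}} \tag{8}$$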

These equations assume that the ΔΔCq values are normally distributed. When a relatively large number of replicates are being used to determine ΔΔCq, this seems a reasonable assumption based on the central limit theorem. The critical t value is used to compensate for experiments with fewer replicates.
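As a minimal sketch of Eqs. (1)–(8), the following Python function turns replicate Cq values into a copy number estimate with upper and lower bounds. The function and argument names and the example Cq values are hypothetical, and the critical t value is passed in by the caller because the text does not state how the degrees of freedom are chosen.

```python
import math
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean of replicate Cq values (Eqs. 1 and 2)."""
    return stdev(values) / math.sqrt(len(values))


def copy_number(target, reference, cal_target, cal_reference,
                t_crit, calibrator_copies=2):
    """Return (estimate, lower, upper) copies per diploid genome."""
    # Delta Cq for the test sample and for the calibrator sample.
    d_cq = mean(target) - mean(reference)
    d_cq_cal = mean(cal_target) - mean(cal_reference)
    # Eq. (3): independent uncertainties add in quadrature.
    sem_d = math.hypot(sem(target), sem(reference))
    sem_d_cal = math.hypot(sem(cal_target), sem(cal_reference))
    # Eq. (4): calibrate against the sample of known copy number.
    dd_cq = d_cq - d_cq_cal
    sem_dd = math.hypot(sem_d, sem_d_cal)
    # Eqs. (5)-(8), assuming both assay efficiencies are close to 1.
    estimate = calibrator_copies * 2 ** (-dd_cq)
    lower = calibrator_copies * 2 ** (-dd_cq - t_crit * sem_dd)
    upper = calibrator_copies * 2 ** (-dd_cq + t_crit * sem_dd)
    return estimate, lower, upper


# Made-up Cq values for illustration only (three replicates per assay).
est, lo, hi = copy_number(
    target=[24.0, 24.1, 23.9], reference=[25.0, 25.1, 24.9],
    cal_target=[25.0, 25.1, 24.9], cal_reference=[25.0, 25.1, 24.9],
    t_crit=2.0,  # placeholder critical value, not taken from the paper
)
print(f"{est:.2f} copies (bounds {lo:.2f} to {hi:.2f})")
```

Because the bounds are computed in the exponent, the interval is symmetric around the estimate on a log scale rather than on a linear scale.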

As shown in Fig. 1A, these equations can be used to generate a model relating the 95% confidence limit range to the number of replicates per assay per sample. A 95% confidence interval (CI) means that there is a 5% chance that the measured value using the designated number of replicates will be outside the 95% CI. This means there is a 2.5% chance that the value will be greater than the maximum value and a 2.5% chance the value will be lower than the minimum value. For example, at the point where the 4-copy CI and 5-copy CI cross (copy number of approx. 4.47), there is a 97.5% chance that the measured value for the 4-copy sample will be less than the crossover value and a 97.5% chance the measured value for the 5-copy sample will be greater than the crossover value. Thus, the probability that both the 4-copy value is less than the crossover value and the 5-copy value is greater than the crossover value is 0.975 × 0.975 = 0.950625. Assuming a system σ of 0.16, the crossover points in Fig. 1A indicate that it should be possible to distinguish, with at least 95.06% probability, one copy from two copies with 5 replicates; two copies from three copies with 8 replicates; three copies from four copies with 12 replicates; and four copies from five copies with 18 replicates. At a system σ of 0.25, similar analysis indicates that one copy can be distinguished from two copies with 7 replicates; two copies from three copies with 14 replicates; three copies from four copies with 26 replicates; and four copies from five copies with 40 replicates.
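The crossover argument can be sketched numerically. The Python code below searches for the smallest number of replicates at which the upper 95% bound of the lower copy number no longer exceeds the lower 95% bound of the higher copy number, using SEM = 2σ/√n from the legend of Fig. 1. It is a sketch under stated assumptions: the function names are hypothetical, the critical t value is taken with n − 1 degrees of freedom (the text does not specify this), and the t quantile is obtained by simple numerical integration to keep the example free of third-party libraries.

```python
import math
from functools import lru_cache


def t_cdf(x, df, steps=400):
    """Student's t CDF for x >= 0, by Simpson integration of the density."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)

    def pdf(u):
        return coef * (1 + u * u / df) ** (-(df + 1) / 2)

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


@lru_cache(maxsize=None)
def t_crit(df, p=0.975):
    """Critical t value (two-tailed 95% by default), found by bisection."""
    lo, hi = 0.0, 20.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def min_replicates(copies_low, copies_high, sigma, max_n=200):
    """Smallest n whose 95% intervals for the two copy numbers do not overlap."""
    gap = math.log2(copies_high / copies_low)   # separation in Cq units
    for n in range(2, max_n + 1):
        sem = 2 * sigma / math.sqrt(n)          # SEM of delta-delta Cq
        half_width = t_crit(n - 1) * sem        # assumes df = n - 1
        if 2 * half_width <= gap:
            return n
    return None


needed = [min_replicates(k, k + 1, sigma=0.16) for k in range(1, 5)]
print(needed)
```

Under these assumptions the sketch gives 5, 8, 12, and 18 replicates for σ = 0.16, in line with the values quoted above. Other choices for the degrees of freedom would shift the results slightly.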

Kubista et al. [23] describe how to use the Power test to estimate the number of replicates needed in order to detect a certain difference in copy number. The analysis that is equivalent to the analysis depicted in Fig. 1A is a one-tailed t-test with 2.5% false positives (significance) and 2.5% false negatives (power). Fig. 1B shows the results of this Power test. This analysis indicates that, with a system σ of 0.16, it should be possible to distinguish one copy from two copies with 4 replicates; two copies from three copies with 6 replicates; three copies from four copies with 11 replicates; and four copies from five copies with 17 replicates. At a system σ of 0.25, the analysis indicates that one copy can be distinguished from two copies with 6 replicates; two copies from three copies with 13 replicates; three copies from four copies with 24 replicates; and four copies from five copies with 39 replicates. These numbers are slightly smaller than the number of replicates indicated by the analysis of Fig. 1A. This is because the Power test is set to satisfy the condition that the measured value of the lower copy number sample will be lower than the measured value of the higher copy number sample, which is slightly less stringent than requiring that the lower copy value is less than the crossover value and the higher copy value is greater than the crossover value. Also, the significance of the analysis of Fig. 1A is slightly higher than 95%.

Fig. 1. Error model for real-time qPCR relating 95% confidence interval for copy number determination versus number of replicates. From Eqs. (1)–(4):

$$\mathrm{SEM}_{\Delta\Delta C_q} = \sqrt{\frac{\sigma_T^2}{n_T} + \frac{\sigma_R^2}{n_R} + \frac{\sigma_{T,C}^2}{n_{T,C}} + \frac{\sigma_{R,C}^2}{n_{R,C}}} \tag{9}$$

If all standard deviations are assumed to be equal to the system standard deviation σ and the same number of replicates (n) is used for the target and reference assays across all samples including the calibrator, then:

$$\mathrm{SEM}_{\Delta\Delta C_q} = \frac{\sqrt{\sigma^2 + \sigma^2 + \sigma^2 + \sigma^2}}{\sqrt{n}} = \frac{\sqrt{4\sigma^2}}{\sqrt{n}} = \frac{2\sigma}{\sqrt{n}} \tag{10}$$

This value for $\mathrm{SEM}_{\Delta\Delta C_q}$ is plugged into Eqs. (7) and (8) and evaluated for a specific value of σ and differing values of n. (A) Plot of 2 × RCNmin and 2 × RCNmax versus number of replicates for σ = 0.16 and −ΔΔCq = −1, 0, 0.585, 1, and 1.322 (copy number per genome of 1, 2, 3, 4, and 5 assuming reference sequence is present at 2 copies per genome). (B) Plot of values generated using the ‘Exp. Design’ tab of GenEx Enterprise software (MultiD Analyses AB, Göteborg, Sweden). The settings used were: ‘Number of samples in each group’ = 2–48; ‘Power (%)’ = 97.5; ‘Significance’ = 95%, 2 tail; ‘Type of test’ = Unpaired; ‘SD Estimated’ = Yes (t-test). This is equivalent to a significance of 97.5%, 1 tail, an entry the software does not allow. For system σ = 0.16, the values entered in ‘SD (Group A)’ and ‘SD (Group B)’ were 0.32. This is because, if σ = 0.16 for Cq values, then the σ for ΔΔCq values is $\sqrt{(0.16)^2 + (0.16)^2 + (0.16)^2 + (0.16)^2} = 0.32$. For system σ = 0.25, the values entered in ‘SD (Group A)’ and ‘SD (Group B)’ were 0.5. Distinguishing one copy from two copies corresponds to a difference in ΔΔCq values of 1.0; distinguishing two from three corresponds to a difference of 0.585; distinguishing three from four corresponds to a difference of 0.415; and distinguishing four from five corresponds to a difference of 0.322. The red line indicates a difference value of 0.322, which is the difference required to distinguish four and five copies.

2.1.2. Example

Human genomic DNA samples were obtained from Coriell that come from cell lines containing one, two, three, four, or five X chromosomes. These samples were analyzed using a target assay for the YY2 gene on the X chromosome and a reference assay for the RPPH1 gene on chromosome 14. As shown in Fig. 2, these assays were checked in order to confirm that their efficiencies are similar and close to 1. Copy number qPCR analysis was performed in a Fluidigm® 96.96 Dynamic Array integrated fluidic circuit (IFC). In this device, up to 96 samples and 96 assays can be loaded into the matrix array to create 9216 individual reactions. For this analysis of copy number of an X-linked gene, each sample was pipetted into 19 Sample Inlets and each assay was pipetted into 24 Assay Inlets for a total of 19 × 24 = 456 replicates. The range of standard deviations observed for YY2 Cq measurements across the five samples was 0.035–0.066, and for RPPH1 Cq measurements was 0.070–0.096. Table 1 shows the measured copy number for the YY2 gene relative to the RPPH1 gene for the five samples. Because of the very large number of replicates, the 95% confidence intervals for these values are very small, demonstrating that qPCR can deliver very precise results.

In order to explore the applicability of the error model in Fig. 1, the experimental data set was used to see how precision of the estimate of copy number varies with the number of replicates. This was done by randomly selecting Cq values from each set of 456 values to create "samples" with a lower number of replicates. The results of one such trial are shown in Fig. 3. In this case, 4 replicates were able to distinguish one copy from two copies and two copies from three copies; 7 replicates were able to distinguish three copies from four copies; and 12 replicates were able to distinguish four copies from five copies. Fewer replicates were required than predicted by the model in Fig. 1A because the system σ is considerably less than 0.16. In order to examine the robustness of the method, the analysis was repeated but this time each set of replicates always included the minimum and maximum value out of all 456 values. The other replicates in each set were again selected randomly. The results of one trial where the minimum and maximum values were deliberately included were that one copy was distinguished from two copies with 6 replicates; two copies from three copies with 7 replicates; three copies from four copies with 16 replicates; and four copies from five copies with 24 replicates. These results indicate that the error model illustrated in Fig. 1 provides a reasonable expectation of the precision that can be obtained with real-time qPCR.
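The resampling procedure can be sketched in a few lines of Python. This is an illustration rather than the authors' code: the function name is hypothetical, and the 456 Cq values are synthetic numbers drawn from a normal distribution only so that the example runs.

```python
import random


def simulated_sample(cq_values, n, rng, force_extremes=False):
    """Draw n Cq values without replacement from the full replicate set."""
    if not force_extremes:
        return rng.sample(cq_values, n)
    if n < 2:
        raise ValueError("need n >= 2 to include both extremes")
    # Robustness variant: always keep the smallest and largest Cq,
    # then fill the remaining n - 2 slots at random.
    pool = list(cq_values)
    chosen = [min(pool), max(pool)]
    for value in chosen:
        pool.remove(value)
    return chosen + rng.sample(pool, n - 2)


rng = random.Random(0)
# Synthetic stand-in for the 456 measured Cq values of one assay and sample.
all_cq = [rng.gauss(25.0, 0.05) for _ in range(456)]

for n in (4, 12, 24):
    plain = simulated_sample(all_cq, n, rng)
    stressed = simulated_sample(all_cq, n, rng, force_extremes=True)
    print(n, len(plain), len(stressed))
```

Each simulated sample can then be passed through Eqs. (6)–(8) to obtain a copy number and its confidence interval for that number of replicates.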

The derivation of the error model focuses on the precision of qPCR, but does not directly address accuracy. The results in Table 1 show that the accuracy of copy number determination for the one-X, three-X, and four-X samples is quite good. For the five-X sample, though, the measured copy number is 4.56 versus an expected value of 5. This indicates the existence of an unidentified systematic error that is impairing accuracy for this particular sample. For example, if the five copies of YY2 in the five-X sample are not identical and if the sequence variation in one or more of these copies adversely affects the performance of the specific assay used in this study, then the measured Cq value for YY2 will be higher than the value expected if all five copies were identical. In this case, these results do clearly distinguish four copies from five copies, but do not accurately report the true degree of separation. Another possibility is that the karyotype of 5 X chromosomes is not completely stable in cell culture so that the DNA was extracted from a mixture of cells with 4 and 5 X chromosomes. If this were the case, then the results in Table 1 could very well accurately report the copy number in the experimental samples.

Fig. 2. Standard curves for the YY2 and RPPH1 assays. The 20× YY2 assay consists of primers (CAGTACGAGGATGTGGATGGC and CCTCTTGTGTCTGCAACATAAGC obtained from IDT) at 18 µM each and a hydrolysis probe (FAM-TTCCTGGTCGTGGTCGCCATAGCC-BHQ obtained from Biosearch) at 4 µM. RPPH1 is the gene encoding the RNA component H1 of RNase P. The 20× RPPH1 assay was obtained from Applied Biosystems (4316831) and consists of primers at 18 µM each and a FAM-labeled hydrolysis probe at 5 µM. Preamplification was performed in 50 µL reactions containing 4.4 µg genomic DNA, 1× TaqMan® PreAmp Master Mix (Applied Biosystems 4391128) and pooled assays at 0.05× each. Reactions were incubated for 10 min at 95 °C followed by 10 cycles of 15 s at 95 °C/4 min at 60 °C, then diluted 1:5 with H2O. Eight 2-fold dilutions of this preamplified material were then analyzed separately with each assay in a 96.96 Dynamic Array IFC (Fluidigm). Final reaction conditions consisted of 1× TaqMan Universal PCR Master Mix (Applied Biosystems 4304437), 1× YY2 or RPPH1 assay, 1× GE Sample Loading Reagent (Fluidigm 85000735), and 0.1× Assay Loading Reagent (Fluidigm 85000736). The array was analyzed in the BioMark™ real-time PCR system (Fluidigm) using a thermal protocol of 2 min at 50 °C, 30 min at 70 °C, 10 min at 25 °C, 2 min at 50 °C, 10 min at 95 °C, followed by 40 cycles of 15 s at 95 °C/1 min at 60 °C. The Cq values plotted are the average of at least 136 replicates. Efficiency was calculated using the formula

$$\text{efficiency} = 10^{-1/\text{slope}} - 1 \tag{11}$$

The slope and 1-sigma interval were estimated using weighted regression.

Fig. 3. Copy number analysis using real-time qPCR with variable numbers of replicates. The data used to generate the results in Table 1 (456 replicates for each assay for each sample) were used to create simulated samples that had variable numbers of replicates. For any particular assay and sample, the 456 Cq values were re-ordered using a random number generator. The top n values were assigned to the simulated sample with n replicates. The list of Cq values was re-ordered with a fresh set of random numbers prior to each selection. Copy number was calculated using Eq. (6). Results are shown for samples with one, two, three, four, or five copies of the X chromosome. Error bars show 95% confidence intervals calculated using Eqs. (7) and (8).

2.2. Measurement of CNV using digital PCR ("endpoint qPCR")

2.2.1. Statistical analysis

In digital PCR, a number of target reactions are performed at low template input such that some reaction chambers are positive and some chambers are negative. The absolute concentration of any target sequence (in molecules/µL) can be calculated by counting the number of positive chambers, applying a correction for Poisson distribution, and dividing by the total volume of all the chambers. The relevant parameters for characterizing digital PCR are λ, the average number of target molecules per chamber, and p, the probability that a chamber has at least one target molecule and thus gives a positive PCR. When the number of chambers is large, Sindelka et al. [24] report that λ and p are related by the equation:

$$1 - p = e^{-\lambda} \tag{12}$$

or,

$$\lambda = -\ln(1 - p) \tag{13}$$
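The Poisson correction of Eqs. (12) and (13) can be sketched as follows. The function names are hypothetical, and the chamber volume used in the example is an arbitrary illustrative value, not a device specification.

```python
import math


def molecules_per_chamber(hits, chambers):
    """Eq. (13): average number of target molecules per chamber."""
    if not 0 <= hits < chambers:
        raise ValueError("need at least one negative chamber")
    p = hits / chambers           # fraction of positive chambers
    return -math.log(1 - p)


def concentration(hits, chambers, chamber_volume_ul):
    """Target concentration in molecules per microliter.

    Total molecules (lambda * chambers) divided by total volume
    (chambers * chamber_volume_ul) reduces to lambda / chamber_volume_ul.
    """
    return molecules_per_chamber(hits, chambers) / chamber_volume_ul


lam = molecules_per_chamber(hits=385, chambers=770)
print(f"lambda = {lam:.3f} molecules per chamber")
# Hypothetical chamber volume of 0.001 microliter, for illustration only.
print(f"{concentration(385, 770, 0.001):.0f} molecules per microliter")
```

With half of the chambers positive, λ equals ln 2, which is larger than the naive value of 0.5 hits per chamber because some positive chambers contain more than one molecule.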

Table 1. Copy number of an X-linked gene determined by real-time qPCR.a

# of X chromosomes         One          Two (calibrator)  Three        Four         Five
Measured copy # of YY2     1.06         2                 3.16         3.92         4.56
95% Confidence interval    1.054–1.076  1.981–2.020       3.132–3.188  3.884–3.960  4.516–4.600

a X copy variant samples were purchased from Coriell (NA18515, NA18968, NA0623, NA01416, and NA06061 for one, two, three, four, and five copies, respectively). Preamplification and real-time qPCR analysis were performed as described in the legend to Fig. 2 except only one concentration of preamplified DNA was analyzed for each sample. Copy number was calculated using Eq. (6). 95% confidence intervals were calculated using Eqs. (7) and (8) with data from 456 replicates.

The best estimate for p is calculated by dividing the number of positive chambers (H, for hits) by the total number of chambers (C). If C is large enough, the 95% confidence limits for p and λ are given by [17]:

$$p_{\min,\max} = p \pm 1.96 \times \sqrt{\frac{p(1 - p)}{C}} \tag{14}$$

$$\lambda_{\min} = -\ln(1 - p_{\min}) \quad \text{and} \quad \lambda_{\max} = -\ln(1 - p_{\max}) \tag{15}$$

Fig. 4 shows how the 95% confidence interval $\lambda_{\max} - \lambda_{\min}$ varies with different values for λ and p. It can be seen that the least amount of relative error in determining λ occurs when there are approximately 1.5 target molecules per chamber, which corresponds to approximately 80% positive chambers.

Fig. 4. Relative error in the digital PCR determination of concentration as a function of number of molecules per chamber or fraction of positive chambers. The fraction of positive chambers, p, was evaluated for λ (molecules per chamber) values ranging from 0.1 to 4 using Eq. (12). $p_{\min}$ and $p_{\max}$ were calculated using Eq. (14) and a value of C = 770 chambers. $\lambda_{\min}$ and $\lambda_{\max}$ were calculated using Eq. (15). Relative error was determined by dividing the 95% confidence interval $\lambda_{\max} - \lambda_{\min}$ by λ. (A) Relative error plotted versus number of molecules per chamber. (B) Relative error plotted versus fraction of positive chambers.

Because quantification using digital PCR is based on an absolute measurement of the number of molecules, there is no need to calibrate the results obtained to a sample with known target copy number. Thus, relative copy number (RCN) is determined by comparing the results of the target assay directly to the results of the reference assay for each sample. For the case where the reference sequence is present at two copies per diploid genome:

$$\text{\# of target copies per diploid genome} = 2 \times \frac{\lambda_T}{\lambda_R} \tag{16}$$

The 95% confidence intervals for $\lambda_T$ and $\lambda_R$ can be determined using Eqs. (14) and (15). As shown in Fig. 5, these intervals can be used to generate a model relating the 95% confidence limit range for $\lambda_T / \lambda_R$ to the number of reaction chambers. Fig. 5 indicates that one copy can be distinguished from two copies using approximately 200 chambers; two copies from three copies using approximately 400 chambers; three copies from four copies using approximately 800 chambers; four copies from five copies using approximately 1200 chambers, and so on. The final comparison in Fig. 5 indicates that 10 copies can be distinguished from 11 copies using approximately 8000 chambers.
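The digital PCR error model can be sketched by combining Eqs. (14)–(16) with the ratio limits given in the legend of Fig. 5 (Eqs. (17) and (18)). The function names are hypothetical; following the Fig. 5 legend, the sketch assumes the reference sequence is held at 0.6 copies per chamber and works from expected fractions of positive chambers rather than measured counts.

```python
import math

Z95 = 1.96  # normal critical value used in Eq. (14)


def lambda_limits(p, chambers):
    """Eqs. (14)-(15): (lam, lam_min, lam_max) for positive fraction p.

    Requires p + half-width < 1, i.e. not too close to saturation.
    """
    half = Z95 * math.sqrt(p * (1 - p) / chambers)

    def to_lambda(q):
        return -math.log(1 - q)

    return to_lambda(p), to_lambda(p - half), to_lambda(p + half)


def ratio_limits(target, reference):
    """Eqs. (17)-(18): 95% limits for r = lam_T / lam_R."""
    lam_t, t_min, t_max = target
    lam_r, r_min, r_max = reference

    def solve(d_t, d_r, sign):
        disc = (lam_t * lam_r) ** 2 - (d_t**2 - lam_t**2) * (d_r**2 - lam_r**2)
        return (lam_t * lam_r + sign * math.sqrt(disc)) / (lam_r**2 - d_r**2)

    lower = solve(lam_t - t_min, r_max - lam_r, -1)   # Eq. (17)
    upper = solve(t_max - lam_t, lam_r - r_min, +1)   # Eq. (18)
    return lower, upper


def model_copy_interval(copies, chambers, lam_r=0.6, ref_copies=2):
    """Expected 95% interval for copies per genome at a given chamber count."""
    lam_t = lam_r * copies / ref_copies
    target = lambda_limits(1 - math.exp(-lam_t), chambers)
    reference = lambda_limits(1 - math.exp(-lam_r), chambers)
    lower, upper = ratio_limits(target, reference)
    return ref_copies * lower, ref_copies * upper   # Eq. (16)


for chambers in (500, 2000):
    four = model_copy_interval(4, chambers)
    five = model_copy_interval(5, chambers)
    print(chambers, [round(x, 2) for x in four], [round(x, 2) for x in five])
```

In this sketch the four-copy and five-copy intervals overlap at 500 chambers and are separated at 2000 chambers, which is consistent with the approximately 1200 chambers read from Fig. 5.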

2.3. Example

Hosono et al. [6] characterized copy number variation affectingthe MRGPRX1 gene on human chromosome 11. Genomic DNA sam-ples were obtained from Coriell that come from cell lines contain-ing four or five copies of MRGPRX1. These samples were analyzedby digital PCR using an assay for MRGPRX1 with a FAM-labeledprobe and an assay for RPPH1 with a VIC-labeled probe. Differen-tially labeled probes were used so that the target and reference as-says could be run mixed together. Analysis was performed in theFluidigm 48.770 Digital Array IFC. In this device, the reactionloaded into each of 48 Inlets is subdivided into a panel of 770

Fig. 5. Error model for digital PCR relating the 95% confidence interval for copy number determination to the number of reaction chambers. Let the relative copy number be $r = k_T / k_R$. As derived in Dube et al. [17], the 95% confidence limits for r are given by:

$$r_{\min} = \frac{k_T k_R - \sqrt{k_T^2 k_R^2 - \left((k_T - k_{T,\min})^2 - k_T^2\right)\left((k_{R,\max} - k_R)^2 - k_R^2\right)}}{k_R^2 - (k_{R,\max} - k_R)^2} \tag{17}$$

$$r_{\max} = \frac{k_T k_R + \sqrt{k_T^2 k_R^2 - \left((k_{T,\max} - k_T)^2 - k_T^2\right)\left((k_R - k_{R,\min})^2 - k_R^2\right)}}{k_R^2 - (k_R - k_{R,\min})^2} \tag{18}$$

In order to evaluate these expressions, it is assumed that the concentration of genomic DNA is held constant at a level that corresponds to $k_R$ = 0.6, i.e., 0.6 copies of the reference sequence per chamber. The plot shows $2 \times r_{\min}$ and $2 \times r_{\max}$ versus number of reaction chambers for r = 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5 and 5.5 (copy number per genome of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11, assuming the reference sequence is present at 2 copies per genome).
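The confidence limits in Eqs. (17) and (18) can be evaluated with a short script. This is a sketch rather than the authors' implementation: the limits on each k are assumed to come from a normal-approximation 95% interval on the fraction of positive chambers (the presumed content of Eqs. (12), (14) and (15)), and the helper names `k_limits` and `ratio_limits` are invented for this example. The reference loading of 0.6 molecules per chamber is taken from the caption.

```python
import math

def k_limits(k, chambers, z=1.96):
    """Assumed 95% limits (k_min, k_max) from a binomial normal approximation on p."""
    p = 1.0 - math.exp(-k)
    half_width = z * math.sqrt(p * (1.0 - p) / chambers)
    if p - half_width <= 0.0 or p + half_width >= 1.0:
        raise ValueError("too few chambers for this approximation")
    return -math.log(1.0 - (p - half_width)), -math.log(1.0 - (p + half_width))

def ratio_limits(k_t, k_r, chambers):
    """Return (r_min, r_max) for r = k_t / k_r following Eqs. (17) and (18)."""
    k_t_min, k_t_max = k_limits(k_t, chambers)
    k_r_min, k_r_max = k_limits(k_r, chambers)

    def limit(dt, dr, sign):
        # dt, dr: one-sided interval widths on the target and reference k
        denom = k_r**2 - dr**2
        disc = (k_t * k_r)**2 - (dt**2 - k_t**2) * (dr**2 - k_r**2)
        return (k_t * k_r + sign * math.sqrt(disc)) / denom

    r_min = limit(k_t - k_t_min, k_r_max - k_r, -1.0)   # Eq. (17)
    r_max = limit(k_t_max - k_t, k_r - k_r_min, +1.0)   # Eq. (18)
    return r_min, r_max

K_R = 0.6  # reference molecules per chamber, as assumed for Fig. 5
for chambers in (300, 1200, 3000):
    _, upper4 = ratio_limits(2.0 * K_R, K_R, chambers)   # four copies: r = 2
    lower5, _ = ratio_limits(2.5 * K_R, K_R, chambers)   # five copies: r = 2.5
    print(chambers, round(2 * upper4, 2), round(2 * lower5, 2),
          "separated" if upper4 < lower5 else "overlap")
```

Two copy numbers are treated as distinguishable here when the upper limit of the lower one falls below the lower limit of the higher one. That is one reasonable reading of the figure, not a criterion stated in the caption.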

chambers. Fig. 6 shows the copy number results when the reactions for each sample were loaded into 24 Inlets. Using the data from all the panels, the measured copy numbers for the four-copy and five-copy samples are 4.12 (95% confidence interval 3.94–4.30) and 5.00 (95% confidence interval 4.84–5.17), respectively. The applicability of the error model in Fig. 5A was examined by randomly selecting smaller numbers of panels to determine copy numbers. The results of one trial shown in Fig. 6 indicate that the data from 4 panels (4 × 770 = 3080 chambers) are required to distinguish four and five copies. This number is larger than the approximately 1200 chambers predicted by the model in Fig. 5A because the template concentrations used for the experimental data were lower than the concentration used to generate the model. The Fig. 5 model assumes a $k_R$ of 0.6 reference molecules per chamber. For the experiment presented in Fig. 6, $k_R$ for the four-copy sample was approximately 0.18 and $k_R$ for the five-copy sample was approximately 0.37.

Fig. 6. Copy number of MRGPRX1 determined by digital PCR. Genomic DNA samples containing either four copies (NA19221) or five copies (NA19205) of MRGPRX1 were purchased from Coriell. The 20× MRGPRX1 assay consists of primers (TTAAGCTTCATCAGTATCCCCCA and CAAAGTAGGAAAACATCATCACAGGA, obtained from IDT) at 18 µM each and a hydrolysis probe (FAM-ACCATCTCTAAAATCCT-MGB, obtained from Applied Biosystems) at 4 µM. The 20× RPPH1 assay was obtained from Applied Biosystems (4316844) and consists of primers at 18 µM each and a VIC-labeled hydrolysis probe at 5 µM. Preamplification was performed in 50 µL reactions containing 750 ng genomic DNA, 1× TaqMan PreAmp Master Mix and pooled assays at 0.05× each. Reactions were incubated for 10 min at 95 °C followed by 5 cycles of 15 s at 95 °C/2 min at 60 °C, then diluted 1:50 with H2O. Digital PCR analysis was performed in a 48.770 Digital Array IFC (Fluidigm) with the four-copy sample loaded in 24 reaction inlets and the five-copy sample loaded in 24 reaction inlets. Final reaction conditions consisted of 1× TaqMan Gene Expression PCR Master Mix (Applied Biosystems 4369016), 1× MRGPRX1 assay (FAM probe), 1× RPPH1 assay (VIC probe), and 1× GE Sample Loading Reagent. The array was analyzed in the BioMark real-time PCR system using a thermal protocol of 2 min at 50 °C, 10 min at 95 °C, followed by 40 cycles of 15 s at 95 °C/1 min at 60 °C. Reactions were evaluated using endpoint data. Chambers were called positive if the Rn value was greater than an empirically selected threshold. FAM and VIC results were evaluated independently. $k_{MRGPRX1}$ was calculated using Eq. (13) with p = the total number of FAM hits divided by the total number of chambers. $k_{RPPH1}$ was calculated using Eq. (13) with p = the total number of VIC hits divided by the total number of chambers. Copy number is $2 \times k_{MRGPRX1} / k_{RPPH1}$. Error bars show 95% confidence intervals calculated using Eqs. (17) and (18). Panels were selected randomly to generate the results displayed for 1, 2, 4, 8, and 16 panels.
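The point estimate described in the Fig. 6 caption reduces to a few lines of arithmetic. The sketch below assumes Eq. (13) has the usual Poisson form k = −ln(1 − p). The hit counts are hypothetical values chosen only to exercise the calculation and are not data from the experiment, and the function names are invented for this example.

```python
import math

def molecules_per_chamber(hits, chambers):
    """Assumed form of Eq. (13): k = -ln(1 - p), with p = hits / chambers."""
    if not 0 <= hits < chambers:
        raise ValueError("hits must be in [0, chambers) for a finite estimate")
    return -math.log(1.0 - hits / chambers)

def copy_number(fam_hits, vic_hits, chambers, reference_copies=2):
    """Copy number = 2 x k_target / k_reference, pooling all selected panels."""
    k_target = molecules_per_chamber(fam_hits, chambers)      # FAM channel (MRGPRX1)
    k_reference = molecules_per_chamber(vic_hits, chambers)   # VIC channel (RPPH1)
    return reference_copies * k_target / k_reference

chambers = 4 * 770  # four panels of 770 chambers each
# Hypothetical hit counts, for illustration only.
print(chambers, round(copy_number(fam_hits=950, vic_hits=520, chambers=chambers), 2))
```

Pooling the FAM and VIC hits over all selected panels before applying the logarithm mirrors the caption's description, in which p is the total number of hits divided by the total number of chambers.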

3. Concluding remarks

Although CNV was used as the example in this paper, the models for quantitative resolution apply equally well to gene expression studies. Thus, real-time qPCR can be used to routinely distinguish 1.25-fold differences in gene expression as long as one is willing to run 18 (system σ = 0.16) to 40 (system σ = 0.25) replicates. In fact, the error model in Fig. 1 predicts that a 1.1-fold difference (10 copies from 11 copies) can be distinguished by running 86 replicates if the system σ is 0.16. System σ reflects variation from sample preparation, assay performance, and instrument platform, which were not addressed in this paper. The upstream processes involved in preparing samples and assays for qPCR can result in large variation, so it is important to consider including more than just qPCR technical replicates in any study [25]. If the system σ can be verified to be less than 0.16, then fewer replicates will be required. For digital PCR with the reference template loaded at 0.6 molecules per chamber, approximately 1200 chambers are required to achieve 1.25-fold discrimination, and approximately 8000 chambers are required to achieve 1.1-fold discrimination.

The basic finding of this paper is that the quantitative resolution of qPCR can be enhanced to 1.25-fold, or even 1.1-fold, by running a large number of reactions, either as replicates for real-time qPCR or independent chambers for digital PCR. Although it is possible to achieve this improved quantitative performance using conventional tubes or plates, it is not practical because of the reagent expense involved, the cumbersome workflow engendered by the pipetting required, and the increased chance of introducing variability due to that workflow. Thus, high throughput microfluidic platforms for qPCR are what make the utmost in quantitative resolution achievable in practice.

The main advantage of using real-time qPCR over digital PCR is throughput. Consider the case where at least 18 replicates are being run in order to achieve 1.25-fold discrimination. In the Fluidigm 96.96 Dynamic Array IFC, 95 of the 96 Assay Inlets can be used to run 5 different assays with 19 replicates each. Four of these assays could be for separate targets and one assay would be a common reference assay. Thus, a single 96.96 array can be used to analyze 96 samples for 4 targets for a total of 96 × 4 = 384 determinations with 95% confidence. For digital PCR, 1.25-fold discrimination can be achieved with approximately 1200 chambers, which corresponds to two panels in a Fluidigm 48.770 Digital Array IFC. Mixing the target and reference assays together, the 48 Reaction Inlets of the 48.770 array can be used to run 24 different samples with two panels per sample. Thus, the throughput of a single 48.770 array is 24 samples for 1 target for a total of 24 determinations.
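The throughput comparison above is simple bookkeeping, reproduced here as a short check. All numbers come from the paragraph itself. The variable names are chosen for this example, and rounding 1200 chambers up to whole panels is an assumption about how "two panels" was obtained.

```python
import math

# Real-time qPCR on a 96.96 Dynamic Array IFC
replicates_per_assay = 19                      # at least 18 are needed for 1.25-fold discrimination
assays = 95 // replicates_per_assay            # 95 of the 96 assay inlets are used
targets = assays - 1                           # one assay is the common reference
real_time_determinations = 96 * targets        # 96 samples x number of targets

# Digital PCR on a 48.770 Digital Array IFC
panels_per_sample = math.ceil(1200 / 770)      # ~1200 chambers, 770 chambers per panel
samples_per_array = 48 // panels_per_sample    # 48 reaction inlets, one panel per inlet
digital_determinations = samples_per_array * 1 # one target per sample

print(assays, targets, real_time_determinations)
print(panels_per_sample, samples_per_array, digital_determinations)
```

On these figures a single 96.96 array yields sixteen times as many determinations as a single 48.770 array.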

The advantages of digital PCR over real-time qPCR are the option to run as an endpoint assay and the simplicity of analysis. This simplicity stems from the fact that quantification is based on the counting of all-or-none events. Thus, obtaining quantitative results is much less dependent on assumptions about assay efficiencies or the particulars of threshold setting. Also, there is no need to compare the results of each sample to a calibrator sample. The simplicity of analysis translates into fewer opportunities to introduce error or noise, which improves the likelihood of obtaining an accurate answer.

References

[1] B. Zimmermann, W. Holzgreve, F. Wenzel, S. Hahn, Clin. Chem. 48 (2002) 362–363.

[2] B. Bubner, K. Gase, I.T. Baldwin, BMC Biotechnol. 4 (2004) 14–23.

[3] L. Shi et al., Nat. Biotechnol. 24 (2006) 1151–1161.

[4] R.D. Canales et al., Nat. Biotechnol. 24 (2006) 1115–1122.

[5] D. Pinkel, R. Segraves, D. Sudar, S. Clark, J. Poole, D. Kowbel, C. Collins, W.-L. Kuo, C. Chen, Y. Zhai, S.H. Dairkee, B. Ljung, J.W. Gray, D.G. Albertson, Nat. Genet. 20 (1998) 207–211.

[6] N. Hosono, M. Kubo, Y. Tsuchiya, H. Sato, T. Kitamoto, S. Saito, Y. Ohnishi, Y. Nakamura, Human Mutat. 29 (2008) 182–189.

[7] T. Morrison, J. Hurley, J. Garcia, K. Yoder, A. Katz, D. Roberts, J. Cho, T. Kanigan, S.E. Ilyin, D. Horowitz, J.M. Dixon, C.J.H. Brenan, Nucleic Acids Res. 34 (2006) e123–e131.

[8] A. Dahl, M. Sultan, A. Jung, R. Schwartz, M. Lange, M. Steinwand, K.J. Livak, H. Lehrach, L. Nyarsik, Biomed. Microdevices 9 (2007) 307–314.

[9] S.L. Spurgeon, R.C. Jones, R. Ramakrishnan, PLoS ONE 3 (2008) e1662–e1668.

[10] R. Higuchi, C. Fockler, G. Dollinger, R. Watson, Nat. Biotechnol. 11 (1993) 1026–1030.

[11] P. Simmonds, P. Balfe, J.F. Peuherer, C.A. Ludlam, J.O. Bishop, A.J. Leigh Brown, J. Virol. 64 (1990) 865–872.

[12] P.J. Sykes, S.H. Neoh, M.J. Brisco, E. Hughes, J. Condon, A.A. Morley, BioTechniques 13 (1992) 444–449.

[13] B. Vogelstein, K.W. Kinzler, Proc. Natl. Acad. Sci. USA 96 (1999) 9236–9241.

[14] J. Jarvius, J. Melin, J. Göransson, J. Stenberg, S. Fredriksson, C. Gonzalez-Rey, S. Bertilsson, M. Nilsson, Nat. Methods 3 (2006) 725–727.

[15] Y.M.D. Lo, F.M.F. Lun, K.C.A. Chan, N.B.Y. Tsui, K.C. Chong, T.K. Lau, T.Y. Leung, B.C.Y. Zee, C.R. Cantor, R.W.K. Chiu, Proc. Natl. Acad. Sci. USA 104 (2007) 13116–13121.

[16] J. Qin, R.C. Jones, R. Ramakrishnan, Nucleic Acids Res. 36 (2008) e116–e123.

[17] S. Dube, J. Qin, R. Ramakrishnan, PLoS ONE 3 (2008) e2876–e2883.

[18] K.J. Livak, T.D. Schmittgen, Methods 25 (2001) 402–408.

[19] E. Schäffeler, M. Schwab, M. Eichelbaum, U.M. Zanger, Human Mutat. 22 (2003) 476–485.

[20] Y.L. Wu, S.L. Savelli, Y. Yang, B. Zhou, B.H. Rovin, D.J. Birmingham, H.N. Nagaraja, L.A. Hebert, C.Y. Yu, J. Immunol. 179 (2007) 3012–3025.

[21] J.H. Lee, J.T. Jeon, Cytogenet. Genome Res. 123 (2008) 333–342.

[22] J.R. Taylor, An Introduction to Error Analysis, University Science Books, New York, 1982.

[23] M. Kubista, J. Eliasson, M. Lennerås, S. Andersson, R. Sjöback, Eppendorf BioNews 29 (2008) 7–8.

[24] R. Sindelka, J. Jonák, R. Hands, S.A. Bustin, M. Kubista, Nucleic Acids Res. 36 (2008) 387–392.

[25] A. Tichopad, R. Kitchen, I. Riedmaier, C. Becker, A. Ståhlberg, M. Kubista, Clin. Chem. 55 (2009) 1816–1823.