A robust internal control for high-precision DNA methylation analyses by droplet digital PCR

Background Droplet digital PCR (ddPCR) allows absolute quantification of nucleic acids and has potential for improved non-invasive detection of DNA methylation. For increased precision of the methylation analysis, we aimed to develop a robust internal control for use in methylation-specific ddPCR. Methods Two control design approaches were tested: (a) targeting a genomic region shared across members of a gene family and (b) combining multiple assays targeting different pericentromeric loci on different chromosomes. Through analyses of 34 colorectal cancer cell lines, the performance of the control assay candidates was optimized and evaluated, both individually and in various combinations, using the QX200™ droplet digital PCR platform (Bio-Rad). The best-performing control was tested in combination with assays targeting methylated CDO1, SEPT9, and VIM. Results A 4Plex panel consisting of EPHA3, KBTBD4, PLEKHF1, and SYT10 was identified as the best-performing control. The use of the 4Plex for normalization reduced the variability in methylation values, corrected for differences in template amount, and diminished the effect of chromosomal aberrations. Positive Droplet Calling (PoDCall), an R-based algorithm for standardized threshold determination, was developed, ensuring consistency of the ddPCR results. Conclusion Implementation of a robust internal control, i.e., the 4Plex, and an algorithm for automated threshold determination, PoDCall, in methylation-specific ddPCR increase the precision of DNA methylation analysis. Electronic supplementary material The online version of this article (10.1186/s13148-018-0456-5) contains supplementary material, which is available to authorized users.


Background
Digital PCR (dPCR) enables absolute quantification of nucleic acids. The principle behind the method was described already in 1992 [1], but its use was for many years hampered by lack of suitable protocols and instrumentation. Technology development during the last decade has led to several commercial systems for dPCR, resulting in a rapid increase in the publication rate of dPCR studies [2,3]. With the concomitant increase in liquid biopsy analyses for cancer screening, for detection of minimal residual disease after surgery, and for monitoring cancer patients, the need for high-precision analyses of circulating tumor-derived nucleic acid molecules is obvious, but not necessarily implemented.
With the dPCR technology, a PCR mixture can be randomly divided into a large number of partitions. Individual PCRs are performed inside each partition, and based on the fraction of fluorescence-positive partitions, the absolute quantity of the target can be calculated [2]. One of the most commonly used platforms is the droplet digital PCR (ddPCR), where the partitions are represented by thousands of nanoliter-scale droplets, formed by water-in-oil emulsion [4][5][6].
The sample partitioning inherent for ddPCR considerably reduces the competition from any background DNA, allowing detection of minimal amounts of a target of interest. The sensitivity is in principle only limited by the number of droplets analyzed [4], and the method has been demonstrated to trace one mutated gene copy in the background of 200,000 wild-type molecules [7]. This makes ddPCR particularly valuable for analyses of various types of non-invasive biomarkers, such as detection of KRAS mutations in the blood of colorectal cancer patients, predicting lack of response to targeted treatment [8], screening for metastatic breast cancer by small increases in HER2 copy number in plasma samples [9], gene expression analyses to detect hepatocellular carcinoma from circulating tumor cells [10], and detection of bladder cancer among hematuria patients [11].
Although the ddPCR technology has great potential also for DNA methylation analyses, only few studies have been published so far [12][13][14][15][16]. The lack of consensus regarding how to perform standardized experiments might be a contributing factor. Generation of consistent methylation data is dependent on the use of a suitable control for normalization, as previously demonstrated for other PCR-based DNA methylation analyses [17,18]. The aims of the present study were to develop a robust internal control for ddPCR DNA methylation analyses and demonstrate its value in terms of increased precision of the normalized methylation data.

Bisulfite conversion
The EpiTect Bisulfite Kit (Qiagen) was used for bisulfite conversion of 1.3 μg DNA according to the manufacturers' standard protocol. After conversion in the MJ Mini Personal Thermal Cycler (Bio-Rad Laboratories), the samples were automatically purified and eluted in 40 μl elution buffer by the QIAcube System (Qiagen).

Design and development of candidate internal controls
With the aim of developing a control for methylationspecific ddPCR that targeted multiple non-CpG containing loci located on different chromosomes, two approaches were tested. In the first approach, "A," a common sequence shared by several members of a gene family (the Aldolase A family; ALDOA, and the Cytochrom C family; CYCS) was targeted. This approach implied introduction of only one control assay into the target gene reaction, with the rationale of reducing the chances of interference with target amplification. In the second approach, "B," multiple assays, targeting different loci in the exonic part of various genes located close to the centromeres (n = 13; ALDH1B1, ANKRD30A, EPHA3, HAO2, IGFBPL1, ITGAD, KBTBD4, MRPS5, NIPA2, PLEKHF1, SAMSN1, SYT10, and TTC5), were designed and tested in different combinations. These control gene candidates were chosen from a list of pericentromeric reference genes, previously suggested by Bio-Rad to be stable in regard to copy number variations [21]. This approach implied introduction of several control assays into the target gene reaction (see Additional file 1: Table S1 for assay sequences and their chromosomal locations). The best-performing control (VIC-labeled) was tested in combination with assays targeting methylated CDO1, SEPT9, and VIM (FAM-labeled), through ddPCR analyses of 34 colorectal cancer cell lines. Finally, the performance of the control was compared to two previously published single locus controls, ACTB [22] and C-LESS [23].

Digital droplet PCR
The QX200™ Droplet Digital™ PCR System (Bio-Rad) was used for analyses. The ddPCR reaction mixture consisted of 1x ddPCR Supermix for Probes (Bio-Rad), 900 nM of each primer, 250 nM of the probe, and approximately 30 ng bisulfite-converted DNA template, in a final volume of 22 μl. Droplets were generated by the QX200 Droplet Generator (Bio-Rad), using 20 μl of the ddPCR mixture and 70 μl droplet generation oil (Bio-Rad). Samples were transferred to a 96-well PCR plate (Bio-Rad) and sealed in the PX1 PCR Plate Sealer (Bio-Rad). The PCR was performed in a T100 Thermal Cycler (Bio-Rad; see Additional file 1: Table S2 for PCR cycling conditions). The fluorescence signals were measured by the QX200 Droplet Reader (Bio-Rad). For each experiment, the following control samples were included: two methylationpositive controls (commercially available in vitro methylated DNA (IVD); Zymo Research), one methylationnegative control (bisulfite-treated DNA from normal blood of healthy donors), one non-bisulfite-converted IVD sample, and a non-template control (NTC; water). All analyses were performed according to the digital MIQE guidelines (Additional file 2) [24].

Data analyses
Data from the QX200 Droplet Reader was analyzed in QuantaSoft version 1.7.4.0917 (Bio-Rad). The Positive Droplet Calling (PoDCall) algorithm was developed in-house to determine well-specific thresholds that discriminated positive droplets, i.e., containing the target, from negative droplets, i.e., did not contain the target. The PoDCall workflow is illustrated in Fig. 1 and summarized in the following. Amplitude values were extracted from QuantaSoft and used as input data for PoDCall. First, the multimodality of the distribution of the amplitude values was assessed using Hartigan's dip test [25]. Two strategies were applied depending on the significance of the test. If non-significant, i.e., p value > 0.05, the distribution was assumed to be unimodal and the threshold was set as the maximum amplitude value, after testing for potential outliers. Alternatively, if the test was significant, a Gaussian mixture model was fitted to the distribution of droplets, using the R package "mclust," version 5.2.3 [26]. Next, the number of mixture components in the distribution was assessed by a likelihood ratio test, whose significance was approximated by using a bootstrap approach with 700 replications. Finally, the threshold was defined as the average value between the modes of the first and second components. The resulting thresholds, representing the output data from PoD-Call, were manually entered in QuantaSoft. Based on the fraction of positive droplets, concentrations of methylated copies/μl were calculated by the software.
Normalized concentrations were generated by dividing the concentration of the target gene on the concentration of the control. These normalized values were then multiplied by a constant, i.e., the mean concentration of the control among all analyzed cell lines, in order to have them in the same range as the non-normalized concentration.

Statistics
The statistical analyses were performed using R version 3.2.2. In order to investigate how normalized concentrations were affected by chromosomal aberrations, cell lines were stratified according to the presence of deletions, gains, deletions and gains (both), or no aberration. Differences in mean among the groups were investigated using ANOVA.

Results
The 4Plex panel is the best-performing control The gene family approach for designing an internal control (approach A) provided poor results (Additional file 1: Figure S1) and was discarded from further analyses. The ALDOA assay resulted in an IVD concentration of~400 copies/microliter (Additional file 1: Figure S1A), which was lower than expected given the number of loci  Table S1). For CYCS, no positive droplet band was detected (Additional file 1: Figure S1B).
For the approach that combined single assays targeting different loci in the exonic part of various pericentromeric genes (approach B), nine (ALDH1B1, EPHA3, IGFBPL1, KBTBD4, MRPS5, PLEKHF1, SAMSN1, SYT10, and TTC5) of the 13 designed assays showed a clear separation between positive and negative droplets ( Fig. 2a and Additional file 1: Figure S2). EPHA3, KBTBD4, PLEKHF1, and SYT10 had similar amplitude value of the negative droplet cluster (around 2000; Fig. 2a), and merging these assays into a control panel resulted in clear separation between positive and negative droplets (Fig. 2b). Combinations with other assays, e.g., ALDH1B1 and SAMSN1, which had a higher amplitude value of the negative droplet cluster (~2500-2800; Additional file 1: Figure S2), resulted in reduced separation (Fig. 2c). Thus, the four-assay panel consisting of EPHA3, KBTBD4, PLEKHF1, and SYT10, termed the 4Plex, was identified as the best-performing control. Across all samples analyzed, the 4Plex provided a consistent amplification pattern, with V9P as an exception. This cell line displayed a shift in the droplet pattern, comprising a significant reduction of the negative droplet peak, and simultaneous increase of the positive droplet peak (Additional file 1: Figure S3).
The 4Plex has minor impact on amplification of the target gene The assays comprised in the 4Plex are labeled with VIC and run in the same reaction as the FAM-labeled assays that measure the methylation of CDO1, SEPT9, and VIM. To evaluate whether the presence of the 4Plex had an impact on the amplification of the target gene, non-normalized target gene concentrations (methylated copies/μl) from experiments with and without the 4Plex control were compared. The resulting non-normalized concentrations were highly consistent for both CDO1 and SEPT9 (Fig. 3). For VIM, discrepancies between the concentrations resulting from the experiments with and without the 4Plex were observed (median absolute difference of 21%). However, this was comparable with resulting discrepancies from using the alternative controls ACTB [22] (median absolute difference of 23%) and C-LESS [23] (median absolute difference of 38%; Additional file 1: Figure S4).
The 4Plex corrects for differences in template amounts and can act as a template-loading control Since the target assays in the present study (CDO1, SEPT9, and VIM) are designed to amplify methylated sequences only, the 4Plex represents an internal control for normalization for these analyses. Despite the use of the same theoretical input amount for all samples in this work (based on the input in the bisulfite conversion), the 4Plex revealed concentration differences across the cell line panel (Fig. 4a). Moreover, comparing non-normalized and 4Plex-normalized concentrations of the target genes across the cell line panel, large differences were observed for the samples with the highest and lowest 4Plex concentrations (Fig. 4b). Finally, inclusion of the 4Plex discriminates true methylation-negative samples (e.g., KM12; Fig. 5) from potential false methylation-negative samples lacking template (NTC; Fig. 5).

4Plex-normalized concentrations show less variance than non-normalized target gene concentrations
Non-normalized and 4Plex-normalized concentrations of the target genes were compared among replicates of two different samples (SW48 and SW480). For both samples, normalized concentrations of CDO1 showed lower variance than the non-normalized concentrations ( Fig. 6; 28.5 vs. 183 for SW48 and 20.3 vs. 356 for SW480). The same tendency of reduced variability after normalization was seen for SEPT9 and VIM (Additional file 1: Figure S5).

Normalization by the 4Plex diminishes the effect of chromosomal aberrations
To evaluate the potential impact of chromosomal aberrations on the 4Plex compared to the previously suggested single locus controls ACTB [22] and C-LESS [23], the normalized concentrations of the target genes were compared in groups of colorectal cancer cell lines harboring no aberrations, gain, loss, or both gain and loss in the control loci (Additional file 1: Table S3). As shown in Fig. 7, chromosomal aberrations significantly affected the ACTB-normalized concentrations (blue boxes) of CDO1, SEPT9, and VIM (P < 0.001, P < 0.001, and P = 0.016, respectively) as well as the C-LESS-normalized concentrations of the same target genes (pink boxes; P < 0.001, P < 0.001, and P = 0.012, respectively). In contrast, the 4Plex (green boxes) was found to diminish the effect of chromosomal aberrations when analyzing these three target genes (P = 0.131, P = 0.109, and P = 0.011).

Discussion
The ddPCR technology allows highly sensitive quantification of nucleic acids and has great potential for analyses of DNA methylation. In the present work, we have developed a robust internal control for methylationspecific ddPCR, the 4Plex. This control consists of four individual pericentromeric assays and is analyzed in the same reaction as the target of interest. We demonstrate that normalization using the 4Plex standardizes the results by increasing the precision of the target quantification. Such precision is especially important for the rapidly evolving field of liquid biopsies, which holds great promise for disease detection, monitoring, and emergence of drug resistance [27].
Two different strategies are typically used for robust quantification of methylated targets in ddPCR analyses. In line with standard mutation/SNP assays, primers binding to bisulfite converted DNA, independent of the methylation status, can be paired with probes with different fluorescent marks, one binding to the methylated version and the other to the unmethylated version of the target sequence. With such a design, the ratio between methylated and unmethylated DNA can be determined, and the use of an internal control would be limited to normalizing for minor technical variations, such as pipetting inaccuracies. This represents a convenient design for absolute quantification, but can be challenging for DNA methylation analyses where the number of CpGs in the target region of interest, e.g., gene promoters, is often high, and the presence of such CpG sites in the primer binding sites may bias the amplification. A commonly used alternative, often seen in qMSP/MethyLight analyses, is designing an assay where both primers and probe bind exclusively to the methylated version of the target. This type of analysis requires an internal control for normalization, preferably reflecting the total amount of amplifiable template in the reaction.
In the present study, two approaches for developing such an internal control for normalization in ddPCR analyses were evaluated. The gene family approach (A) provided poor results for both alternatives tested. Amplification was either failing or resulting in a lower than expected concentration (Additional file 1: Figure S1). The latter could potentially be explained by a degenerated base in the sense primer, causing less efficient binding to one or more of the targeted loci. In contrast, Fig. 6 The 4Plex-normalized concentrations show lower variance than non-normalized concentrations. Results are shown for CDO1 in two different cell lines (SW48 and SW480). Each analysis includes 12 replicates approach (B), consisting of combining several individual assays into a combined control panel, was successful and resulted in the 4Plex.
Using the 4Plex as an internal control in methylationspecific ddPCR has several advantages. In addition to reducing the overall variability in methylation values and increasing the reproducibility, the 4Plex can adjust for unforeseen variations in the experimental pipeline. Although equal amounts of DNA, as measured by the NanoDrop, were loaded into the bisulfite treatment and subsequent ddPCR reaction in the present study, the 4Plex revealed significant DNA concentration differences across samples (Fig. 4a). Normalization by the 4Plex thereby prevented over-and underestimation of methylation levels (Fig. 4b). This is highly relevant for analyses of clinical material, where the DNA quality and integrity is typically varying [28].
As expected, and in line with single locus references, the 4Plex served as a template-loading control that allowed distinguishing between true methylation-negative samples and template-negative samples (Fig. 5). Moreover, as the 4Plex consists of four assays and has a considerably higher concentration than the target gene, it could also be used to establish a lower threshold for allowing scoring of samples. With a very low signal from the control, it is unlikely that the reaction contains enough template to detect potential methylation. Such a lower loading threshold, revealing samples that cannot be robustly determined, will reduce the number of false negatives.
Chromosomal aberrations are common in various diseases, and cancer in particular [29,30], and will affect the normalization if present in the control locus [17,18]. The importance of using an internal control that targets multiple loci is known from qMSP/MethyLight analyses . Cell lines are grouped after their chromosomal aberration status of the respective controls (x-axis). *Significant at a 1% level. **Significant at a 5% level [17,18] and was recently also emphasized by Uehiro and colleagues for ddPCR DNA methylation analyses [13]. In MethyLight, the transposable ALU element, containing more than one million copies spread out in the human genome, represents a robust internal control. However, for ddPCR methylation analyses, the ALU element is too abundant and saturates the reaction (data not shown). The 4Plex on the other hand amplifies four loci in the genome, located on different chromosomes, without reaching saturation. When used as an internal control in the present study, the 4Plex reduced the effect of chromosomal aberrations on normalized methylation values of the target gene. In contrast, the use of the single locus controls ACTB and C-LESS caused significant deviations in methylation values (Fig. 7). An additional advantage with the 4Plex is that it only quantifies template that can be amplified by the targeted assays, i.e., bisulfite-converted DNA, in contrast to the C-LESS control that amplifies its target independent of bisulfite conversion status [23].
To evaluate whether the presence of the 4Plex had an impact on the amplification of the target gene, the target was run alone and in combination with the 4Plex. The resulting methylation concentrations were highly consistent for two of three genes tested (CDO1 and SEPT9; Fig. 3). Interestingly, for VIM, the discrepancies from using the 4Plex were smaller than the observed discrepancies from using the single locus controls ACTB and C-LESS (Additional file 1: Figure S4).
In ddPCR methylation analyses, rain, i.e., droplets that fall between the positive and negative clusters, is a known phenomenon [12,13,31] (visible in Figs. 2 and 5), making methylation concentrations sensitive to inconsistent threshold determination. To standardize the analyses, PoDCall, an algorithm for automated threshold calculations, was developed (Fig. 1). PoDCall contributed to standardization through well-specific scoring of positive and negative droplets and thus increased the consistency of the methylation data. However, it is likely that PoDCall will be applicable also for other types of ddPCR analyses where rain is observed. Finally, PoDCall was also useful in order to correct for unexpected technical artifacts in the ddPCR analyses, such as shifts in baseline fluorescence between samples (Additional file 1: Figure S6). Improved accuracy of ddPCR analyses by using automated threshold determination has also been underscored by others [32,33] but has to our knowledge received limited recognition.
The 4Plex performed well across all samples analyzed, with V9P as an exception (Additional file 1: Figure S3). This is most likely explained by a significant chromosomal amplification observed for the PLEKHF1 locus in this cell line. In contrast, a "normal" droplet distribution pattern was seen across a series of more than 100 colorectal cancer tissues (data not shown), indicating that such pattern aberrations are rare. Furthermore, the 4Plex was successfully used in recent analyses of non-to minimally invasive material from bladder cancer-and cholangiocarcinoma patients, respectively (Pharo et al., unpublished, Vedeld et al., unpublished), underscoring that this control can be applied across cancer types.

Conclusions
In conclusion, the 4Plex internal control increases the precision of methylation-specific ddPCR analyses by reducing the variability in methylation concentrations, correcting for variable input amount, and by reducing the effect of chromosomal aberrations. PoDCall, the algorithm for automated threshold determination, contributes to additional consistency of the ddPCR results. We advocate for implementation of the 4Plex and PoDCall as a standard in methylation-specific ddPCR analyses.

Additional files
Additional file 1: Table S1. Sequence information for the ddPCR assays used in the present study. Table S2. The PCR thermal cycling conditions (T100 Thermal Cycler, Bio-Rad). Table S3. Gene copy number states of ACTB, C-LESS, and the 4Plex in the 34 colorectal cancer cell lines. Figure S1. Droplet dPCR amplification of a non-CpG containing sequence shared by members of a gene family (approach A) provides poor results. Figure S2. Individual control assay candidates from approach B. Figure S3. The 4Plex shows a consistent amplification pattern across the cell line panel with V9P as an exception. Figure S4. Non-normalized VIM concentrations are lower with a control assay included in the reaction. Figure S5. A tendency of lower variation in 4Plex-normalized target gene concentrations is seen in replicates of the same sample. Figure S6