Skip to main content

Quantitative survey of multiple CpGs from 5 genes identifies CpG methylation panel discriminating between high- and low-grade cervical intraepithelial neoplasia



Studies of methylation biomarkers for cervical cancer often involved only few randomly selected CpGs per candidate gene analyzed by methylation-specific PCR-based methods, with often inconsistent results from different laboratories. We evaluated the role of different CpGs from multiple genes as methylation biomarkers for high-grade cervical intraepithelial neoplasia (CIN).


We applied a mass spectrometry-based platform to survey the quantitative methylation levels of 34 CpG units from SOX1, PAX1, NKX6-1, LMX1A, and ONECUT1 genes in 100 cervical formalin-fixed paraffin-embedded (FFPE) tissues. We then used nonparametric statistics and Random Forest algorithm to rank significant CpG methylations and support vector machine with 10-fold cross validation and 200 times bootstrap resampling to build a predictive model separating CIN II/III from CIN I/normal subjects. We found only select CpG units showed significant differences in methylation between CIN II/III and CIN I/normal groups, while mean methylation levels per gene were similar between the two groups for each gene except PAX1. An optimal classification model involving five CpG units from SOX1, PAX1, NKX6-1, and LMX1A achieved 81.2% specificity, 80.4% sensitivity, and 80.8% accuracy.


Our study suggested that during CIN development, the methylation of CpGs within CpG islands is not uniform, with varying degrees of significance as biomarkers. Our study emphasizes the importance of not only methylated marker genes but also specific CpGs for identifying high-grade CINs. The 5-CpG classification model provides a promising biomarker panel for the early detection of cervical cancer.


Cervical cancer is one of the leading causes of cancer-related mortality in women worldwide. Cervical intraepithelial neoplasia (CIN) is a premalignant transformation and abnormal growth (dysplasia) of squamous cells of the cervix. Early screening together with timely treatment of precancerous lesions can substantially improve clinical outcome, thus offering a unique opportunity to cervical cancer management. The widely used screening strategy, cytology-based Pap smear, has been associated with a significant reduction of cancer incidence rate and mortality [1].

Besides finding cervical carcinomas, cervical cancer screening aims to identify high-grade intraepithelial lesions (corresponding to histological grades CIN II and CIN III) which require surgical procedures to prevent further progression. Low-grade intraepithelial lesions (corresponding to histological grade CIN I), on the other hand, should not be over-treated for such procedures as they have high potential to spontaneously regress to normal [2]. However, the sensitivity of Pap smears for the detection of CIN II or higher grades is generally low [3,4]. On the other hand, the highly sensitive diagnostic high-risk human papillomavirus (HPV) DNA testing tends to give false positives [5-9]. A third strategy is direct colposcopy [10], which requires interpretational expertise, is not amenable to high throughput processing, and has low positive predictive values for low-grade squamous intraepithelial lesions [4,11]. Finally, even for histopathology specimen of cervical biopsies, objective CIN diagnosis can be sometimes challenging. The reproducibility of cervical histopathologic interpretations was moderate and equivalent to the reproducibility of monolayer cytologic interpretations [12]. Thus, an objective, high-throughput approach with high sensitivity and specificity is urgently needed for early diagnosis of cervical cancer.

Numerous investigations have reported that gene-specific hypermethylation occurring in pre-invasive and invasive phase of cervical cancer may be promising biomarkers for early diagnosis [13-21]. A review of the results of 51 published cervical cancer methylation studies involving 68 different genes concluded, however, that no single methylation marker from these studies was suitable as a cervical cancer biomarker [13]. Most identified biomarkers, with a few exception [22,23], lacked sufficient independent validations. Currently, therefore, it is as important to validate existing candidates as to identify additional ones. Another concern regarded inconsistent results in methylation studies. Most of these studies used methylation-specific PCR (MSP) or quantitative methylation-specific PCR (QMSP) methods [24], analyzing in each gene one or two CpGs which were selected randomly as those feasible for primer/probe design, assuming hypermethylation is uniform across CpG promoter and the analyzed CpGs are representative. The measured methylation frequencies varied widely for the same gene even between studies that used common specimen or similar assays [13]. A recent study by Lai et al. [18], on a Chinese cohort of squamous cell carcinoma (SCC), identified six novel genes (SOX1, PAX1, NKX6-1, LMX1A, ONECUT1 and WT1) as more frequently methylated in SCC tissues than in normal controls. Some of the markers were verified by the same laboratory using QMSP (MethyLight) [25,26]. However, two of these methylation markers had different performance between the two studies [18,25]. Moreover, another study by an independent laboratory using QMSP on liquid-based cytology samples from a UK cohort found that only one of these genes, SOX1, was able to discriminate between high-grade squamous intraepithelial lesions and controls [15]. Surprisingly, although such disturbing discrepancies cast considerable doubts on the validity of identified biomarker candidates, little study was undertaken to examine their potential causes.

We suspected that different CpGs assayed by different groups for the same genes may be a major factor contributing to the result variances, and decided to systematically evaluate different CpGs from multiple genes as methylation biomarkers. For early detection of cervical cancer, it is clinically more useful to find epigenetic correlates discriminating between histologically distinct CINs than between SCC and normal cervices, yet few studies have focused exclusively on CIN development. We set to evaluate the utility of methylation biomarkers in distinguishing high-grade from low-grade CIN lesions. Our aims were therefore twofold: (1) to evaluate the relative importance of different CpGs as methylation biomarkers and thus decide whether randomly selecting CpGs to assay, as practiced in most methylation biomarker studies, is justified and (2) to find an optimal panel of candidate hypermethylated CpGs with high sensitivity and specificity for precancerous CIN II or CIN III.

To these ends, we evaluated 34 CpG units from five candidate genes, using definitively diagnosed FFPE tissue specimens from an independent cohort of 100 Chinese precancerous cervical patients and normal controls, who shared a common genetic background with the subjects of the original gene-discovery study [18]. We used a matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS)-based DNA methylation quantification technology (EpiTYPER, Sequenom) [27], which is fundamentally different from the commonly used MSP or QMSP methods. Our method yields direct quantification of the percentage of DNA methylated in a CpG unit, with results highly concordant with bisulfite sequencing [28]. This technology has already been applied to evaluate methylation patterns of leukemia [29] and non-small cell lung cancer [30].

To rank CpG units with discriminating power, we used traditional nonparametric statistics as well as Random Forest, a method particularly well suited for analyzing mass spectrometry data in studies of biomarker identification for cancer classification [31]. We then used support vector machine (SVM) [32-35] with cross-validation and bootstrap resampling, which randomly partitioned the tested samples into training and validation sets, to construct an inferred model and assessed the predicative power of the model. Our results showed that choosing the right CpG unit to assay is critical, and a panel of multiple specific CpG methylation constructed by computerized algorithm allowed us to separate high-grade CIN from low-grade or healthy subjects with high accuracy, providing a candidate biomarker panel for early detection of cervical cancer development.


Survey of CpG methylation of five genes by MALDI-TOF-based EpiTYPER assay

A total of 100 FFPE cervical samples with histopathological classifications of normal (N = 16), CIN I (N = 31), and CIN II or CIN III (N = 53 including 4 CIN II and 49 CIN III) were obtained retrospectively from a cohort of ethnic Han Chinese women. The CIN samples were all tested HPV positive (data not shown), and consensus histological diagnoses were provided independently by two pathologists, with confirmation by p16 and Ki-67 immunohistochemistry staining (Figure 1). There was no significant age difference between different groups (Table 1).

Figure 1
figure 1

CINs with p16 and Ki-67 immunostaining. Immunohistochemical examination of p16 (A–C) and Ki-67 (D–F) protein expressions in histologically CIN I (A, D), CIN II (B, E), and CIN III (C, F) tissues. Magnifications × 40.

Table 1 Sample characteristics and number of samples whose CpG islands for each gene were successfully amplified for EpiTYPER analysis

To analyze the methylation status of PAX1, NKX6-1, SOX1, LMX1A, and ONECUT1, a CpG island (CGI) for each gene was chosen for amplification (Figure 2 and Table S1 in Additional file 1). Each CGI contained five to eight CpG units that can be analyzed by EpiTYPER in this study (Table 2).

Figure 2
figure 2

The positions of CpGs analyzed by EpiTYPER. Drawings are schematic and not to scale. The orientation of each gene is indicated by the arrow at the end. Boxes indicate exons or UTRs; vertical lines indicate individual CpGs in the CGI regions, and the horizontal bars indicate the regions analyzed by EpiTYPER.

Table 2 The number of CpG units for each gene

Upon DNA extraction, bisulfite treatment, and PCR amplification, 72–94% of the samples, depending on the target gene of interest, generated sufficient amplicons that were amenable to subsequent EpiTYPER analysis (Table 1), suggesting that the assay design and sample processing protocol were suitable for the archival FFPE samples. The EpiTYPER is capable of simultaneously determining all applicable CpG units within a CGI amplicon in one well. Quantitative methylation assessment of a total of 34 CpG units (Table 2) in 100 individuals was completed in two 384-well plates in a single day.

When we examined CGIs of candidate genes, we observed unexpectedly that the mean methylation levels of the CGIs of four of the five genes were not statistically different between CIN II/III and CIN I/normal groups (Figure 3A), indicating that during CIN development, the overall methylation status of the examined CGIs of NKX6-1, SOX1, LMX1A, and ONECUT1 did not change. However, when we examined individual CpG units, statistically significant difference in methylation between the CIN II/III and CIN I/normal groups emerged for eight CpG units (Figure 3B). PAX1 had the highest methylation level among the five genes, with the methylation level of four CpG units being significantly different between the two groups (P < 0.05). Although the overall methylation level of LMX1A was low, one CpG unit (L_CpG 28.29.30) was differentially methylated between the two groups (Figure 3B). NKX6-1 and SOX1 exhibited a moderate methylation level and, respectively, contained one (N_CpG 9.10) and two (S_ CpG 17.18 and S_ CpG 34.35) significant CpG units. ONECUT1 contained no significantly methylated CpG units (Figure 3B). These data suggested that during CIN development, the methylation within CGIs was not uniform.

Figure 3
figure 3

The methylation patterns of the five genes. (A) Dot plot of the methylation levels for each gene. Each dot represents a sample, with methylation level averaged over all CpG units analyzed for the gene. (B) Bar graph of the methylation levels of individual CpG units. Open and filled columns denote CIN I/normal and CIN II/III group, respectively. Each bar represents mean methylation of all samples in the group. Error bars indicate SEM; *P < 0.05 for Mann-Whitney U test.

Validation by bisulfite sequencing

To validate the EpiTYPER methylation results using independent methods, we did bisulfite sequencing for three genes on eight samples (Figure 4). As expected, the sequencing results were in accordance with quantitative EpiTYPER results. Moreover, sequencing confirmed that the DNA methylation was not uniform, as some specific CpGs tended to exhibit more frequent methylation than other CpGs (Figure 4).

Figure 4
figure 4

Bisulfite sequencing (BS) of CpGs assayed by EpiTYPER. Three genes were bisulfite-sequenced in eight cervical samples of various stages. In each panel, sample ID is shown at the top, and EpiTYPER results are shown below the gene name as the average level for all measured CpGs. BS results are summarized as filled circles representing methylated CpGs and open circles representing unmethylated CpGs. Each line is an independently sequenced clone. Each column is a CpG of the gene.

Significance ranking of CpG units by Random Forest

To evaluate the contribution of 34 CpG units to the separation of CIN II/III subjects from CIN I/normal ones, we employed the Random Forest algorithm (see “Methods”) in addition to the standard nonparametric statistical method; Figure 5 shows the mean decrease in accuracy (MDA) values of the 34 CpG units, with higher MDA indicating increasing importance of a CpG unit as predictor [36]. We tested the performance of classifiers constructed by the assembly of different features iteratively. When the selected features were PAX1_CpG12, SOX1_CpG34.35, LMX1A_ CpG28.29.30, NKX6-1_CpG9.10, and PAX1_CpG6.7.8, the classifiers achieved the optimal performance. Table 3 presents the Mann-Whitney U test result of these five CpG units.

Figure 5
figure 5

Feature importance for prediction of CIN II/III according to Random Forest algorithm. MDA: mean decrease in accuracy.

Table 3 Summary of nonparametric statistics for the selected CpG units in building classifier

Classification model built by SVM

To build optimal classification model, the methylation levels of the above five CpG units were entered into a SVM with radial basis function (RBF) kernel (see “Methods”). When C and γ were 410 and 14, the classifiers showed the optimal and robust performance. With 200 times bootstrap resampling and 10-fold cross validation, the classification model showed high predictive power with sensitivity, specificity, and accuracy of 0.804 ± 0.028, 0.812 ± 0.008, and 0.808 ± 0.014 (mean ± SD), respectively (Table 4).

Table 4 Evaluation parameters of SVM classifier model trained by the best parameter setting


CpG hypermethylation of key genes involved in cervical cancer development may be promising biomarkers for early diagnosis [13-18,23]. However, progress in the field of cervical methylation biomarker discovery has been hampered by inconsistent results that defy validations [13]. Most of the previous studies used nonquantitative MSPs [18,37] that are highly sensitive but with the drawback of being unable to distinguish tumors with substantial methylation from those with biologically insignificant methylation levels [28]. Recently developed quantitative MSP (QMSP) such as MethyLight provides a better alternative and is becoming increasingly used in methylation studies [24,38,39]. However, QMSP will only detect methylation of few CpGs (equivalent of one CpG unit in our assay, usually 2 ~ 3 CpGs each) [24]. It is difficult to design assays in CpG-rich areas without having the probe overlapping flanking CpGs, and with the probe having sufficient annealing temperature to achieve robust annealing and specificity in the highly AT-rich sequence after bisulfite conversion. This makes QMSP limited in applications and perhaps explains the sometimes variable results obtained for the same gene using different primer/probe designs [15,18,25]. In contrast, primers used in EpiTYPER assay did not involve CpGs (Table S1 in Additional file 1), resulting in more consistent results. The EpiTYPER assay showed much better correlation with the gold-standard sequencing results than the MSP-based assays [28]. The technique could analyze almost all CpGs covered within one amplicon for a gene, instead of 2 ~ 3 CpGs randomly chosen by QMSP. We used this novel quantitative platform to survey 5 ~ 8 CpG units (containing 13 ~ 35 CpGs) within a CGI of each gene for five genes. We found that randomly selecting CpGs to assay gene methylation can be problematic, as CpG methylation is not uniform during CIN development (Figure 3B). Of all genes except PAX1, the overall methylation status, defined as the averaged level of all CpGs within the CGI, was similar between CIN II/III and CIN I/normal(Figure 3A). This is in contrast to the conclusion based on a much limited number of CpGs using MSP [18,25]. Only select CpG units may be used as markers distinguishing CIN II/III from CIN I/normal samples (Figure 3B). Consistent with our observation, other reports has demonstrated that aberrant DNA methylation of only specific CpGs within the CGI are responsible for the downregulation of gene expression [40-44], and more recently, a substantial number of studies reported a specific, single CpG can function as strong prognostic or predictive indicators in various cancers [28,45-47].

Our findings highlight the importance of studying the detailed methylation pattern within a CGI, as they reveal the temporal complexity of DNA methylation during cervical cancer development, and emphasize the importance of not only methylated marker genes but also specific CpGs for identifying high-grade CINs. Therefore, choosing the right CpG unit to assay is critical, and previous inconsistencies among different labs regarding methylation status of the same genes may be due to CpG choices.

We also note that significant CpG units for CIN development can reside beyond the promoter, in exonic or intronic regions as well (Figure 2), just as CpG methylation outside of promoter region can be responsible for tumor suppressor inactivation in breast cancer [48]. Although we did not evaluate all CpGs of these marker genes, our original findings could provide diagnostically useful methylation biomarkers for high-grade CIN. Moreover, MALDI-TOF-based technology gave consistent results to assay these CpG markers in a multiplexed, high-throughput fashion suitable for clinical applications.

Diagnostic classifiers built on multigene methylation panels have shown better performance in predicting a wide variety of tumors [49]. However, such studies commonly associated with the overfitting problem [50,51]. To overcome this, we used SVM to construct the classifier model [52-54] and coupled with a procedure of 10-fold cross validation (in which our samples were partitioned into randomly assigned training and testing sets for the model to be validated 10 times) and 200 times bootstrap resampling (in which the partitioning and cross-validation was randomized and repeated 200 times). Such procedures help reduce overfitting and provide a reliable estimate of the performance of the model [55]. Compared with classification methods used in previous studies [18,25], SVM is a statistical learning method with greater accuracy in diagnostic ability [32,33,54,56] and with more consistent performance at our sample size [57].

Hypermethylated genes selected to predict invasive cervical cancer achieved a sensitivity about 90% according to previous study. However, the high-grade CIN lesions were predicted with much lower sensitivity (~70%) [20,58]. Our panel of CpG units obtained a high sensitivity and specificity of ~80%, achieving a valuable balance between sensitivity and specificity in identifying high-risk samples. The high specificity of our classifier would be particularly suitable for developing countries like China, where cervical cancer prevalence remained relatively high.

Identification of a set of reliable CIN biomarkers serves as a foundation for potential future applications such as quality assurance of histopathology classifications and noninvasive cervical cancer screening if these markers are validated in exfoliated cell samples from cervical scrapings or Pap smears. Our panel of CpG units and the EpiTYPER platform can potentially be a part of an objective, high-throughput strategy for early cervical cancer detection.


Our findings highlight the significance of studying the detailed methylation pattern within a CGI and emphasize the importance of not only methylated marker genes but also specific CpGs for identifying high-grade CINs. We demonstrated the value of the MALDI-TOF technology in methylation biomarker identification and obtained a five-CpG panel with a promising potential as a biomarker for the early detection of cervical cancer.



Formalin-fixed paraffin-embedded (FFPE) cervical biopsy samples were obtained from outpatients visiting the Beijing Aerospace Central Hospital from 2007 to 2012. All histological specimens were tested for HPV DNA (Hybrid Capture-2 kit; Qiagen, Gaithersburg, MD) and for p16 and Ki-67 immunostaining (Beijing Zhong Shan Golden Bridge Biological Technology Co., Ltd.) [59]. The specimens were reviewed independently by two expert pathologists from the Departments of Pathology at Aerospace Central Hospital and Beijing Tiantan Hospital, and only concordant, clearly unambiguous specimens were chosen for the study. Exclusion criteria included uncertain histopathological classification, pregnancy, chronic or acute systemic viral infections, presence of other cancers, skin or genital warts, and an immunocompromised state. Informed consents were obtained from all patients and controls. The study followed the ethical guidelines of the Institutional Review Board of the Aerospace Central Hospital.

DNA preparation and bisulfite treatment

Genomic DNA was extracted from archival FFPE blocks using an established protocol [60]. DNA was quantified using the NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific Inc, CA.). Only DNA samples exhibiting an A260/A280 ratio between 1.8 and 2.0 were considered for further testing.

EZ DNA Methylation Kit (Zymo Research Corporation, CA) was used to modify extracted genomic DNA according to the manufacturer’s protocol with Sequenom recommendations.

MALDI-TOF-MS-based DNA methylation analysis

MALDI-TOF-MS-based DNA methylation assay (EpiTYPER) was performed according to the manufacturer’s specification [27] (Sequenom Inc. CA). Bisulfite-modified genomic DNA was used as PCR template. Primers for PCR (Table S1 in Additional file 1), which do not contain CpGs and amplify both methylated and unmethylated sequences equally, were designed using EpiDesigner ( with the following constraints: (1) the amplicon was located in a CGI of the target gene, (2) the amplicon size is below 300 bp to increase the amplification success rate of FFPE samples, and (3) the amplicon covers as many CpGs as possible. The reverse primers included at the 5′ end a T7 promoter tag [5′-cagtaatacgactcactataggg-3′]. Only samples successfully amplified with a clear and specific PCR band at the expected size were included for further analysis. After PCR amplification, T7 RNA polymerase (Sequenom Inc.) was used to in vitro transcribe single-stranded RNA, which was then cleaved base-specifically by RNase A [27] (MassCLEAVE, Sequenom Inc.). The cleavage products, which contained either individual CpG or short stretches of adjacent CpGs, were analyzed using a MALDI-TOF mass spectrometer (Sequenom Inc). The peak areas of the mass signals derived from methylated and non-methylated template DNA were used to estimate the relative methylation level (valued from 0 to 1 or 0 to 100%). Methylation level for each CpG unit represents average of CpGs within the unit.

We used 100 and 0% methylated human DNA (EpiTect Control DNA Set, QIAGEN Inc. CA) as positive and negative controls, respectively, for the amplification and methylation determination. No-template controls were included for each amplicon to monitor PCR specificity.

Bisulfite sequencing

We cloned the EpiTYPER PCR products into pGEM-T Easy vectors (Promega, WI). For each sample, Sanger sequencing was performed on 10 random individual clones using the 3730 automatic sequencer (Applied Biosystems, CA). Sequencing results were analyzed using the QUMA online software suite (

Statistical analysis

For all statistical analysis in this study, the normal and CIN I samples were grouped into one category, so that all samples were classified as either CIN II/III or CIN I/normal. The relative methylation of each CpG unit in the dataset was analyzed as continuous variables. Nonparametric statistical analysis was performed with the two-tailed Mann-Whitney U test for unpaired comparisons (GraphPad Prism 5.01), with statistical significance set at P value <0.05.

Additionally, the significance of CpG unit was assessed using the MDA calculated by the feature selection algorithm of Random Forest [36] ( Two main parameters of Random Forest, ntree (the number of trees in the forest) and mtry (the number of variables randomly chosen at each split in a tree), were set to 5000 and 6, respectively.

We used SVM with a RBF kernel for the classifiers. The SVM parameters (penalty parameter C, kernel parameter γ) were optimized using grid-search method [50]. Besides SVM, we used 10-fold cross-validation combined with 200 times bootstrapping sampling in constructing and evaluating the classification model. Thus, the original samples were randomly partitioned into 10 equal-sized subsets, 9 of which were used as training data and the remaining set for validation testing. The process was repeated 10 times to ensure each subset was used exactly once as the testing set. The random partition and cross-validation repeat 200 times altogether. The classification performances were assessed using the sensitivity, the specificity, and the accuracy of the classification [61]. All computational experiments were carried out in the MATLAB (Version 8.1) programming environment.



Cervical intraepithelial neoplasia


Methylation-specific PCR


Human papillomavirus


Quantitative methylation-specific PCR


Squamous cell carcinoma


Matrix-assisted laser desorption ionization time-of-flight mass spectrometry


CpG island


Support vector machine


Formalin-fixed paraffin-embedded


Mean decrease in accuracy


Radial basis function


  1. Gustafsson L, Ponten J, Zack M, Adami HO. International incidence rates of invasive cervical cancer after introduction of cytological screening. Cancer Causes Control. 1997;8:755–63.

    Article  CAS  PubMed  Google Scholar 

  2. Melnikow J, Nuovo J, Willan AR, Chan BK, Howell LP. Natural history of cervical squamous intraepithelial lesions: a meta-analysis. Obstet Gynecol. 1998;92:727–35.

    Article  CAS  PubMed  Google Scholar 

  3. Renshaw AA. Measuring sensitivity in gynecologic cytology: a review. Cancer. 2002;96:210–7.

    Article  PubMed  Google Scholar 

  4. Wentzensen N, von Knebel DM. Biomarkers in cervical cancer screening. Dis Markers. 2007;23:315–30.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  5. Sherman ME, Solomon D, Schiffman M, Group ALTS. Qualification of ASCUS. A comparison of equivocal LSIL and equivocal HSIL cervical cytology in the ASCUS LSIL Triage Study. Am J Clin Pathol. 2001;116:386–94.

    Article  CAS  PubMed  Google Scholar 

  6. Layfield LJ, Qureshi MN. HPV DNA testing in the triage of atypical squamous cells of undetermined significance (ASCUS): cost comparison of two methods. Diagn Cytopathol. 2005;33:138–43.

    Article  PubMed  Google Scholar 

  7. Ferris DG, Wright Jr TC, Litaker MS, Richart RM, Lorincz AT, Sun XW, et al. Triage of women with ASCUS and LSIL on Pap smear reports: management by repeat Pap smear, HPV DNA testing, or colposcopy? J Fam Pract. 1998;46:125–34.

    CAS  PubMed  Google Scholar 

  8. Arbyn M, Buntinx F, Van Ranst M, Paraskevaidis E, Martin-Hirsch P, Dillner J. Virologic versus cytologic triage of women with equivocal Pap smears: a meta-analysis of the accuracy to detect high-grade intraepithelial neoplasia. J Natl Cancer Inst. 2004;96:280–93.

    Article  PubMed  Google Scholar 

  9. Arbyn M, Paraskevaidis E, Martin-Hirsch P, Prendiville W, Dillner J. Clinical utility of HPV-DNA detection: triage of minor cervical lesions, follow-up of women treated for high-grade CIN: an update of pooled evidence. Gynecol Oncol. 2005;99:S7–11.

    Article  CAS  PubMed  Google Scholar 

  10. Petry KU, Luyten A, Scherbring S. Accuracy of colposcopy management to detect CIN3 and invasive cancer in women with abnormal screening tests: results from a primary HPV screening project from 2006 to 2011 in Wolfsburg. Germany Gynecol Oncol. 2013;128:282–7.

    Article  Google Scholar 

  11. Trivers KF, Benard VB, Eheman CR, Royalty JE, Ekwueme DU, Lawson HW. Repeat pap testing and colposcopic biopsies in the underserved. Obstet Gynecol. 2009;114:1049–56.

    Article  PubMed  Google Scholar 

  12. Stoler MH, Schiffman M, Atypical Squamous Cells of Undetermined Significance-Low-grade Squamous Intraepithelial Lesion Triage Study G. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL Triage Study. JAMA. 2001;285:1500–5.

    Article  CAS  PubMed  Google Scholar 

  13. Wentzensen N, Sherman ME, Schiffman M, Wang SS. Utility of methylation markers in cervical cancer early detection: appraisal of the state-of-the-science. Gynecol Oncol. 2009;112:293–9.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  14. Shivapurkar N, Sherman ME, Stastny V, Echebiri C, Rader JS, Nayar R, et al. Evaluation of candidate methylation markers to detect cervical neoplasia. Gynecol Oncol. 2007;107:549–53.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  15. Apostolidou S, Hadwin R, Burnell M, Jones A, Baff D, Pyndiah N, et al. DNA methylation analysis in liquid-based cytology for cervical cancer screening. Int J Cancer. 2009;125:2995–3002.

    Article  CAS  PubMed  Google Scholar 

  16. Szalmas A, Konya J. Epigenetic alterations in cervical carcinogenesis. Semin Cancer Biol. 2009;19:144–52.

    Article  CAS  PubMed  Google Scholar 

  17. Steenbergen RD, Snijders PJ, Heideman DA, Meijer CJ. Clinical implications of (epi)genetic changes in HPV-induced cervical precancerous lesions. Nat Rev Cancer. 2014;14:395–405.

    Article  CAS  PubMed  Google Scholar 

  18. Lai HC, Lin YW, Huang TH, Yan P, Huang RL, Wang HC, et al. Identification of novel DNA methylation markers in cervical cancer. Int J Cancer. 2008;123:161–7.

    Article  CAS  PubMed  Google Scholar 

  19. Hansel A, Steinbach D, Greinke C, Schmitz M, Eiselt J, Scheungraber C, et al. A promising DNA methylation signature for the triage of high-risk human papillomavirus DNA-positive women. PLoS One. 2014;9:e91905.

    Article  PubMed Central  PubMed  Google Scholar 

  20. Lendvai A, Johannes F, Grimm C, Eijsink JJ, Wardenaar R, Volders HH, et al. Genome-wide methylation profiling identifies hypermethylated biomarkers in high-grade cervical intraepithelial neoplasia. Epigenetics. 2012;7:1268–78.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  21. Brebi P, Maldonado L, Noordhuis MG, Ili C, Leal P, Garcia P, et al. Genome-wide methylation profiling reveals Zinc finger protein 516 (ZNF516) and FK-506-binding protein 6 (FKBP6) promoters frequently methylated in cervical neoplasia, associated with HPV status and ethnicity in a Chilean population. Epigenetics. 2014;9:308–17.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  22. Hesselink AT, Heideman DA, Steenbergen RD, Coupe VM, Overmeer RM, Rijkaart D, et al. Combined promoter methylation analysis of CADM1 and MAL: an objective triage tool for high-risk human papillomavirus DNA-positive women. Clin Cancer Res. 2011;17:2459–65.

    Article  CAS  PubMed  Google Scholar 

  23. Eijsink JJ, Lendvai A, Deregowski V, Klip HG, Verpooten G, Dehaspe L, et al. A four-gene methylation marker panel as triage test in high-risk human papillomavirus positive patients. Int J Cancer. 2012;130:1861–9.

    Article  CAS  PubMed  Google Scholar 

  24. Eads CA, Danenberg KD, Kawakami K, Saltz LB, Blake C, Shibata D, et al. MethyLight: a high-throughput assay to measure DNA methylation. Nucleic Acids Res. 2000;28:E32.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  25. Lai HC, Lin YW, Huang RL, Chung MT, Wang HC, Liao YP, et al. Quantitative DNA methylation analysis detects cervical intraepithelial neoplasms type 3 and worse. Cancer. 2010;116:4266–74.

    Article  CAS  PubMed  Google Scholar 

  26. Chao TK, Ke FY, Liao YP, Wang HC, Yu CP, Lai HC. Triage of cervical cytological diagnoses of atypical squamous cells by DNA methylation of paired boxed gene 1 (PAX1). Diagn Cytopathol. 2013;41:41–6.

    Article  PubMed  Google Scholar 

  27. Ehrich M, Nelson MR, Stanssens P, Zabeau M, Liloglou T, Xinarianos G, et al. Quantitative high-throughput analysis of DNA methylation patterns by base-specific cleavage and mass spectrometry. Proc Natl Acad Sci U S A. 2005;102:15785–90.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  28. Claus R, Wilop S, Hielscher T, Sonnet M, Dahl E, Galm O, et al. A systematic comparison of quantitative high-resolution DNA methylation analysis and methylation-specific PCR. Epigenetics. 2012;7:772–80.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  29. Bullinger L, Ehrich M, Dohner K, Schlenk RF, Dohner H, Nelson MR, et al. Quantitative DNA methylation predicts survival in adult acute myeloid leukemia. Blood. 2009;115:636–42.

    Article  PubMed  Google Scholar 

  30. Ehrich M, Field JK, Liloglou T, Xinarianos G, Oeth P, Nelson MR, et al. Cytosine methylation profiles as a molecular marker in non-small cell lung cancer. Cancer Res. 2006;66:10911–8.

    Article  CAS  PubMed  Google Scholar 

  31. Wu B, Abbott T, Fishman D, McMurray W, Mor G, Stone K, et al. Comparison of statistical methods for classification of ovarian cancer using mass spectrometry data. Bioinformatics. 2003;19:1636–43.

    Article  CAS  PubMed  Google Scholar 

  32. Akay MF. Support vector machines combined with feature selection for breast cancer diagnosis. Exp Syst Appl. 2009;36:3240–7.

    Article  Google Scholar 

  33. Noble WS. What is a support vector machine? Nat Biotech. 2006;24:1565–7.

    Article  CAS  Google Scholar 

  34. Vapnik V. The nature of statistical learning theory. Berlin: Springer; 2000.

    Book  Google Scholar 

  35. Guyon I, Weston J, Barnhill S, Vapnik V. Gene selection for cancer classification using support vector machines. Mach Learn. 2002;46:389–422.

    Article  Google Scholar 

  36. Archer KJ, Kirnes RV. Empirical characterization of random forest variable importance measures. Comput Stat Data An. 2008;52:2249–60.

    Article  Google Scholar 

  37. Yang HJ, Liu VW, Wang Y, Chan KY, Tsang PC, Khoo US, et al. Detection of hypermethylated genes in tumor and plasma of cervical cancer patients. Gynecol Oncol. 2004;93:435–40.

    Article  CAS  PubMed  Google Scholar 

  38. Widschwendter A, Muller HM, Fiegl H, Ivarsson L, Wiedemair A, Muller-Holzner E, et al. DNA methylation in serum and tumors of cervical cancer patients. Clin Cancer Res. 2004;10:565–71.

    Article  CAS  PubMed  Google Scholar 

  39. Lim EH, Ng SL, Li JL, Chang AR, Ng J, Ilancheran A, et al. Cervical dysplasia: assessing methylation status (Methylight) of CCNA1, DAPK1, HS3ST2, PAX1 and TFPI2 to improve diagnostic accuracy. Gynecol Oncol. 2010;119:225–31.

    Article  CAS  PubMed  Google Scholar 

  40. Lim SP, Wong NC, Suetani RJ, Ho K, Ng JL, Neilsen PM, et al. Specific-site methylation of tumour suppressor ANKRD11 in breast cancer. Eur J Cancer. 2012;48:3300–9.

    Article  CAS  PubMed  Google Scholar 

  41. Hammons GJ, Yan-Sanders Y, Jin B, Blann E, Kadlubar FF, Lyn-Cook BD. Specific site methylation in the 5'-flanking region of CYP1A2 interindividual differences in human livers. Life Sci. 2001;69:839–45.

    Article  CAS  PubMed  Google Scholar 

  42. Song SH, Jong HS, Choi HH, Kang SH, Ryu MH, Kim NK, et al. Methylation of specific CpG sites in the promoter region could significantly down-regulate p16(INK4a) expression in gastric adenocarcinoma. Int J Cancer. 2000;87:236–40.

    Article  CAS  PubMed  Google Scholar 

  43. Nile CJ, Read RC, Akil M, Duff GW, Wilson AG. Methylation status of a single CpG site in the IL6 promoter is related to IL6 messenger RNA levels and rheumatoid arthritis. Arthritis Rheum. 2008;58:2686–93.

    Article  PubMed  Google Scholar 

  44. Roberson ED, Liu Y, Ryan C, Joyce CE, Duan S, Cao L, et al. A subset of methylated CpG sites differentiate psoriatic from normal skin. J Invest Dermatol. 2012;132:583–92.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  45. Claus R, Lucas DM, Stilgenbauer S, Ruppert AS, Yu L, Zucknick M, et al. Quantitative DNA methylation analysis identifies a single CpG dinucleotide important for ZAP-70 expression and predictive of prognosis in chronic lymphocytic leukemia. J Clin Oncol. 2012;30:2483–91.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  46. Sohn BH, Park IY, Lee JJ, Yang SJ, Jang YJ, Park KC, et al. Functional switching of TGF-beta1 signaling in liver cancer via epigenetic modulation of a single CpG site in TTP promoter. Gastroenterology. 2010;138:1898–908.

    Article  CAS  PubMed  Google Scholar 

  47. Peille AL, Brouste V, Kauffmann A, Lagarde P, Le Morvan V, Coindre JM, et al. Prognostic value of PLAGL1-specific CpG site methylation in soft-tissue sarcomas. PLoS One. 2013;8:e80741.

    Article  PubMed Central  PubMed  Google Scholar 

  48. Yuan J, Luo RZ, Fujii S, Wang L, Hu W, Andreeff M, et al. Aberrant methylation and silencing of ARHI, an imprinted tumor suppressor gene in which the function is lost in breast cancers. Cancer Res. 2003;63:4174–80.

    CAS  PubMed  Google Scholar 

  49. Enokida H, Shiina H, Urakami S, Igawa M, Ogishima T, Li LC, et al. Multigene methylation analysis for detection and staging of prostate cancer. Clin Cancer Res. 2005;11:6582–8.

    Article  CAS  PubMed  Google Scholar 

  50. Chih-Wei Hsu C-CC, Lin C-J. A practical guide to support vector classification. 2010. Accessed 10 Jan 2015.

  51. Hawkins DM. The problem of overfitting. J Chem Inf Comput Sci. 2004;44:1–12.

    Article  CAS  PubMed  Google Scholar 

  52. Brown MPS, Grundy WN, Lin D, Cristianini N, Sugnet CW, Furey TS, et al. Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci. 2000;97:262–7.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  53. Vapnik V. The nature of statistical learning theory. 1998.

    Google Scholar 

  54. Yang ZR. Biological applications of support vector machines. Brief Bioinform. 2004;5:328–38.

    Article  CAS  PubMed  Google Scholar 

  55. Efron B, Gong G. A leisurely look at the bootstrap, the jackknife, and cross-validation. Am Stat. 1983;37:36–48.

    Google Scholar 

  56. Ben-Hur A, Weston J. A user’s guide to support vector machines. Methods Mol Biol. 2010;609:223–39.

    Article  CAS  PubMed  Google Scholar 

  57. Fu WJ, Carroll RJ, Wang S. Estimating misclassification error with small samples via bootstrap cross-validation. Bioinformatics. 2005;21:1979–86.

    Article  CAS  PubMed  Google Scholar 

  58. Overmeer RM, Louwers JA, Meijer CJLM, van Kemenade FJ, Hesselink AT, Daalmeijer NF, et al. Combined CADM1 and MAL promoter methylation analysis to detect (pre-)malignant cervical lesions in high-risk HPV-positive women. Int J Cancer. 2011;129:2218–25.

    Article  CAS  PubMed  Google Scholar 

  59. Galgano MT, Castle PE, Atkins KA, Brix WK, Nassau SR, Stoler MH. Using biomarkers as objective standards in the diagnosis of cervical biopsies. Am J Surg Pathol. 2010;34:1077–87.

    Article  PubMed Central  PubMed  Google Scholar 

  60. Pikor LA, Enfield KS, Cameron H, Lam WL. DNA extraction from paraffin embedded material for genetic and epigenetic analyses. J Vis Exp. 2011; 49:e2763. doi:10.3791/2763.

  61. Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics. 2000;16:412–24.

    Article  CAS  PubMed  Google Scholar 

Download references


We are very grateful to Ms. Danli Yang for technical assistance and Dr Xuan Mu for critical reading of the manuscript.

Author information

Authors and Affiliations


Corresponding authors

Correspondence to Xiaolin Yang, Xiuru Zhang or Zhi Zheng.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

ZZ and XZ conceived the study. ZZ and XT designed the experiments; XZ and CD coordinated the subject recruitment and pathology interpretations. XT, CD, JZ, and RZ carried out the experiments. XY performed the statistical analysis, ZZ, XY, XZ, and XP analyzed and interpreted the results; ZZ, XT, and XY drafted and revised the manuscript. All authors were involved in manuscript preparation and had final approval of the submitted version.

Additional file

Additional file 1: Table S1.

Primers, respective target sizes, and locations of the CGIs analyzed by EpiTYPER. L indicates left primer; R indicates right primer.

Rights and permissions

Open Access  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit

The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Tian, X., Chen, D., Zhang, R. et al. Quantitative survey of multiple CpGs from 5 genes identifies CpG methylation panel discriminating between high- and low-grade cervical intraepithelial neoplasia. Clin Epigenet 7, 4 (2015).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: