

Methylated genomic loci encoding microRNA as a biomarker panel in tissue and saliva for head and neck squamous cell carcinoma



To identify aberrant promoter methylation of genomic loci encoding microRNA (mgmiR) in head and neck squamous cell carcinoma (HNSCC) and to evaluate a biomarker panel of mgmiRs to improve the diagnostic accuracy of HNSCC in tissues and saliva.


Methylation of promoter regions of mgmiR candidates was initially screened using HNSCC and control cell lines and further selected using HNSCC and control tissues by quantitative methylation-specific PCR (qMS-PCR). We then examined a panel of seven mgmiRs for validation in an expanded cohort including 189 HNSCC and 92 non-HNSCC controls. Saliva from 86 pre-treatment HNSCC patients and 108 non-HNSCC controls was also examined using this panel of seven mgmiRs to assess its potential for clinical use.


Among the 315 screened mgmiRs, 12 mgmiRs showed significantly increased methylation in HNSCC cell lines compared to control cell lines. Seven of the 12 mgmiRs, i.e., mgmiR9-1, mgmiR124-1, mgmiR124-2, mgmiR124-3, mgmiR129-2, mgmiR137, and mgmiR148a, were further found to be significantly increased in HNSCC tumor tissues compared to control tissues. Using multivariable logistic regression with dichotomized variables, a combination of the seven mgmiRs had sensitivity and specificity of 92.6 and 92.4% in tissues and 76.7 and 86.1% in saliva, respectively. The area under the receiver operating characteristic curve for this panel was 0.97 in tissue and 0.93 in saliva. This model was validated by independent bootstrap validation and random forest analysis.


mgmiR biomarkers represent a novel and promising screening tool, and the seven-mgmiR panel is able to robustly detect HNSCC in both patient tissue and saliva.


Head and neck squamous cell carcinoma (HNSCC) comprises approximately 90% of all head and neck cancers and 5% of all malignancies [1, 2]. The prevalence of HNSCC has also increased over the past 30 years owing to HPV infection [3]. Despite advancements in cancer therapy, the prognosis for HNSCC patients remains poor [3]. The low survival rate is in stark contrast to the increase in survival rates of many other cancers. One of the main reasons for the poor prognosis of HNSCC is that by the time of diagnosis, more than half of HNSCC patients have locoregionally advanced disease. Therefore, early detection may be key to improving survival rates in the future [4].

Current early screening methods for HNSCC in the clinic are limited to physical examination or the use of optical devices by either dentists or primary care physicians. These then lead to a referral to a specialist. At that point, medical imaging, such as MRI, CT, or laryngoscopy, is still the main method for initial clinical diagnosis, leading eventually to a biopsy or surgical procedure for confirmation [5]. These initial imaging methods are subjective, inaccurate, invasive, costly, and inconvenient. Moreover, the current system leads to significant diagnostic delays, so that by the time of diagnosis, over two thirds of HNSCC patients already have advanced-stage disease [6]. Thus, development of an objective, accurate, non-invasive, low-cost, and convenient method for early detection would be highly beneficial.

Development of cancer-specific biomarkers for detection of initial HNSCC and recurrence has been widely explored using DNA-based (loss of heterozygosity, mutation, and DNA methylation), RNA-based, and protein-based assays in both patient tissues and saliva [7,8,9,10,11,12,13,14,15,16,17]. DNA methylation usually occurs early in the process of tumorigenesis and has been widely developed as a basis for biomarkers for human cancers [18,19,20,21]. Several methylation markers have been approved by the FDA for clinical application, including MGMT in glioblastoma [22] and the ColoGuard stool-based screen for colorectal cancer patients [23]. Development of DNA methylation biomarkers for HNSCC detection has also been reported [7, 19]. However, biomarker studies are still limited to the research phase, and no biomarker-based assays for early detection have been used clinically for HNSCC patients.

MicroRNAs (miRNAs) are a class of small non-coding RNAs that negatively regulate gene expression at the post-transcriptional level [24]. A number of miRNAs have been found to be deregulated in human cancers through various mechanisms [25]. One of the mechanisms responsible for reduced or lost miRNA expression is epigenetic silencing of miRNA genes by DNA methylation at the genomic loci encoding the miRNAs [26]. Currently, more than 20 miRNAs have been reported to be silenced by DNA methylation in multiple human cancers [27, 28]. We have reported that DNA methylation of miR9 specifically occurred in a subset of human HNSCC tissue samples [29]. However, the potential of these methylated miRNA loci as a novel class of biomarker for human cancer detection has not been fully assessed.

We hypothesized that DNA methylation at the genomic loci encoding miRNA (mgmiRs) represents a novel class of modification and could be efficiently utilized for detection of human cancers including HNSCC. We tested this hypothesis by screening and selecting mgmiRs in both cell lines and tissues from HNSCC patients and normal controls. We then investigated a panel of seven mgmiRs, i.e., 9-1, 124-1, 124-2, 124-3, 129-2, 137, and 148a, in an expanded patient cohort, including 189 HNSCC tissues and 92 control tissues. The translational application of this panel of mgmiRs was further evaluated in saliva from 86 HNSCC patients and 108 control patients. Lastly, we assessed the association of individual mgmiRs with demographic and clinical pathological information in both tissue and saliva.


Patient information

This study was conducted on human HNSCC surgical samples from both the University of Colorado Anschutz Medical Campus and the Oregon Health & Science University (OHSU) under the Institutional Review Board approval protocols from each institution. Written informed consent was obtained from each subject. A total of 281 different tissue specimens were used. Among them, 189 samples were HNSCC specimens from the time of surgical resection and constituted our “tumor” group. Ninety-two samples were from non-HNSCC patients undergoing surgeries for sleep apnea or tonsillectomy with no history of malignancy and were used as our “non-HNSCC control” group (Table 1). Saliva samples were collected from 86 previously untreated HNSCC patients and 108 control patients, including subjects enrolled in a community screening study and non-HNSCC patients undergoing surgeries for sleep apnea or tonsillectomy (Table 1). Enrollment included collection of demographic information, risk factor history, and clinical pathological information. All information was registered in a Research Electronic Data Capture (REDCap) database.

Table 1 Demographic and clinicopathological information of participating patients

Preparation of samples

After harvesting, tissue was immediately taken to the laboratory where it was frozen and stored in liquid nitrogen until DNA extraction. For saliva collection, patients/volunteers were required to refrain from eating, drinking, chewing gum, and smoking for 30 min beforehand. At the time of saliva collection, patients gargled with normal saline solution two times for around 20 s each time. Patients/volunteers were then instructed to spit their saliva into the collection tube for 5 min without swallowing. Once collected, samples were immediately frozen and stored at − 80 °C until ready for use.

DNA extraction and bisulfite conversion of genomic DNA

Tissue DNA was extracted from each tissue sample using the DNeasy Blood & Tissue kit (QIAGEN, Hilden, Germany), and saliva DNA was extracted using the QiaAmp DNA mini kit (QIAGEN, Hilden, Germany) following the manufacturer’s instructions. Quantity and quality of the extracted genomic DNA were measured using the Nanovue spectrophotometer (GE Healthcare). Bisulfite conversion of 1 μg genomic DNA was performed as described in the EZ DNA Methylation-Gold kit (Zymo, Irvine, CA, USA) to create a template for qMS-PCR. The bisulfite-modified genomic DNA was resuspended in 100 μl of water and stored at − 80 °C.

Quantitative methylation-specific PCR

Bisulfite-treated DNA was then used as a template for qMS-PCR, which was performed using methylation-specific primers. For primer design, the genomic sequence of each miRNA, including 1000 bases upstream, was obtained from the UCSC genome browser website. The primers for methylation analysis were designed on the basis of this sequence using MethPrimer software. All primer sequences are available upon request.

For each individual marker, the qMS-PCR protocol was optimized prior to running the samples in order to identify the proper annealing temperature and to obtain a typical sigmoid amplification curve. Melting curves and gel electrophoresis were used together to determine the specificity of each marker. The temperature, number of cycles, and length of each cycle were adjusted as needed. Each reaction was performed in a 20-μl PCR mixture consisting of 2 μl of bisulfite-converted DNA, 5 nmol/L of forward primer, 5 nmol/L of reverse primer, and 4 μl SYBR-green supermix (Biorad, Hercules, CA, USA). qMS-PCR was run in triplicate on the CFX Connect™ real-time detection system (Biorad, Hercules, USA). Samples were first denatured at 95 °C for 5 min, followed by 40 cycles of 95 °C for 30 s and 1 min at the annealing/extension temperature optimized for the given primer pair. For standardization, DNA from UMSCC10A cells was methylated in vitro with excess SssI methyltransferase (New England Biolabs, Ipswich, MA, USA) to generate completely methylated DNA, and serial dilutions of this DNA were used to construct the calibration curve for each plate. Water and reaction mix were also included in each plate to serve as negative controls and to ensure that there was no contamination. For each sample within each marker, a relative methylation level was calculated from the difference in Ct values by the standard 2^(−ΔCt) method, in which β-actin was used as an internal reference gene. A Ct of 15 to 30 was considered a high methylation level, and a Ct of 35, a low methylation level. A Ct value of more than 40 was considered undetectable.
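The relative quantification described above can be sketched as follows. This is a minimal illustration, assuming that ΔCt is the Ct of the mgmiR minus the Ct of β-actin and that a Ct above 40 is mapped to a relative level of zero; the function name and the Ct values are hypothetical and are not study data.

```python
def relative_methylation(ct_target, ct_reference, undetectable_ct=40.0):
    """Relative methylation level by the 2^(-dCt) method.

    Assumes dCt = Ct(target mgmiR) - Ct(reference gene, e.g. beta-actin).
    A target Ct above `undetectable_ct` is treated as no signal; returning
    0.0 in that case is an assumption of this sketch.
    """
    if ct_target > undetectable_ct:
        return 0.0
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values, for illustration only
print(relative_methylation(28.0, 25.0))  # 2^-3 = 0.125
print(relative_methylation(41.0, 25.0))  # 0.0 (treated as undetectable)
```

In practice the Ct passed in would typically be the mean of the triplicate wells for that sample.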

HPV detection

An HPV signal was detected with amplification of L1 consensus sequence using primers GP5+/GP6+ as previously described [30]. Briefly, 20 ng of genomic DNA extracted from tissue samples and cell lines was used for PCR amplification by the forward primer GP5+ (5′-TTTGTTACTGTGGTAGATACTAC) and the reverse primer GP6+ (5′-GAAAAATAAACTGTAAATCATATTC). The final PCR products were analyzed by 2% agarose gel electrophoresis and ethidium bromide staining.

Statistical analysis

Descriptive statistics such as mean, standard deviation, proportion, and percentage were used to summarize demographic and clinicopathological characteristics of study subjects. Chi-square or two-group t tests, as appropriate, were used to compare HNSCC patients and control patients.

Univariate and combined analyses of the seven selected mgmiRs were carried out using logistic regression. A receiver operating characteristic (ROC) curve was computed for each analysis. For each univariate analysis, Youden’s index (J) was used to determine the optimal cutoff point for dichotomizing a continuous mgmiR based on the ROC plot. J is formally defined as J = max(sensitivity + specificity − 1) over all cutoff points on the ROC plot. The cut-point that achieves this maximum is referred to as the optimal cut-point because it optimizes the biomarker’s differentiating ability when equal weight is given to sensitivity and specificity [31]. Additionally, to indicate the diagnostic accuracy of each mgmiR, sensitivity, specificity, positive predictive values (PPVs), and negative predictive values (NPVs) were provided. The same quantities were also calculated for the combination of the seven mgmiRs.
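The cutoff selection by Youden’s index can be illustrated with a short sketch. It assumes a sample is called positive when its marker level is at or above the cutoff and that ties in J are resolved in favor of the lowest cutoff; the function name and the data are hypothetical.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J, sensitivity, specificity) for the cutoff maximizing J.

    values: continuous marker levels; labels: 1 = case, 0 = control.
    A sample is called positive when value >= cutoff (an assumption).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        # Strict ">" keeps the lowest cutoff when several give the same J
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


# Hypothetical methylation levels: four controls followed by four cases
values = [0.1, 0.2, 0.3, 0.8, 0.4, 0.9, 1.2, 1.5]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(youden_cutoff(values, labels))  # (0.4, 0.75, 1.0, 0.75)
```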

The bootstrap method was used to validate the performance of the combination of the seven mgmiRs internally: 2000 random samples were drawn with replacement, each containing the same number of observations as the original dataset. The same combined analysis of the seven selected mgmiRs was carried out on each sample, and the average AUC and its 95% confidence interval (2.5 and 97.5 percentiles) across the 2000 samples were reported. In addition, random forest, a nonparametric classification approach, was used to estimate the prediction performance of the seven mgmiRs using the caret package in R (Kuhn M. 2016. R package version 6.0-73). Random forest analyses were conducted separately for tissue and saliva samples. The dataset was split into training and testing datasets by random sampling, using a 2/3 to 1/3 split. The random forest models were trained using 5000 trees and 1000 bootstrap resamples. In addition, for each type of sample (i.e., tissue or saliva), we assessed whether including demographic information (i.e., age, gender, and smoking history) improved prediction results. The random forest models were then applied to the testing dataset, and the ROC and area under the ROC (AUC) were determined. Youden’s index was used to determine a cutoff point on the ROC, and sensitivity and specificity were determined. The median and its 95% confidence interval for the sensitivity and specificity were estimated from 2000 bootstrap resamples using the pROC package in R.
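A percentile bootstrap for the AUC can be sketched as below. This is a simplified illustration rather than the analysis described above: it resamples a fixed set of predicted scores instead of refitting the logistic regression on every resample, it redraws any resample that contains only one class, and the scores are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(scores, labels, n_boot=2000, seed=0):
    """Mean AUC and approximate 2.5th/97.5th percentiles over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC needs both classes; otherwise redraw
            aucs.append(auc([scores[i] for i in idx], ys))
    aucs.sort()
    mean = sum(aucs) / n_boot
    lower = aucs[int(0.025 * n_boot)]
    upper = aucs[int(0.975 * n_boot) - 1]
    return mean, lower, upper


# Hypothetical predicted probabilities: six controls followed by six cases
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.40, 0.55, 0.70, 0.80, 0.90, 0.95]
labels = [0] * 6 + [1] * 6
print(bootstrap_auc(scores, labels))
```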


DNA methylation at genomic loci encoding microRNAs was identified in HNSCC cell lines

We utilized the following approaches to screen candidate methylated genomic loci encoding miRNAs (mgmiRs) in HNSCC: (1) The UCSC genome browser was used to obtain 1-kb genomic sequences [GRCh38] of 5′-UTRs of 315 primary (Pri-) miRNAs in the Homo sapiens miRBase [from hsa-let-7a-1 to hsa-mir-499b in miRBase]. (2) The CpG island prediction software (MethPrimer) was then utilized to identify 26 genomic loci with CpG islands (defined as island size > 200 bp, GC content > 50%, observed/expected CpG ratio > 0.6). (3) By designing methylation-specific primers and running quantitative methylation-specific PCR (qMS-PCR), we amplified methylation signals in 22 mgmiRs and found 12 mgmiRs that had increased methylation in human HNSCC cell lines compared to normal head and neck cell lines (Additional file 1: Figures S1 and S2). As shown in Additional file 2: Table S1, we examined relative methylation levels in 12 HNSCC cell lines, including cell lines derived from patients of different ages (ranging from 22 to 70 years), from both males and females, from human papillomavirus (HPV)-positive and HPV-negative HNSCCs, from an HNSCC in a Fanconi anemia patient, and from different anatomic sites of the head and neck region. Four head and neck normal cell lines were included as controls.
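The three CpG island criteria in step (2) can be checked for a single candidate region with a short sketch. It assumes the commonly used observed/expected formula, (number of CpG × length) / (number of C × number of G), and it evaluates one region as a whole rather than scanning a sliding window as a prediction tool would; the function names and sequences are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def is_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the thresholds quoted in the text: > 200 bp, GC > 50%, obs/exp > 0.6."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) > min_len and gc_content > min_gc and obs_exp > min_obs_exp


# Hypothetical sequences, for illustration only
print(is_cpg_island("CG" * 150))  # True: GC content 1.0, obs/exp 2.0
print(is_cpg_island("AT" * 150))  # False: no C or G at all
```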

HNSCC patients and control patients were recruited for this study

We included 189 tumor tissues from HNSCC patients and 92 control tissues from non-HNSCC patients undergoing surgeries for sleep apnea or tonsillectomy (Table 1). Clinical and demographic variables were similar in cases and controls with the exception of age: HNSCC patients were older than controls (63.23 vs. 54.98 years). Primary tumor sites included oral squamous cell carcinoma (OSCC), 95 cases; oropharynx SCC (OPSCC), 44 cases; larynx SCC (LSCC), 48 cases; and two cases without location information. Among the 66 HNSCC tissue samples tested for HPV status, there were 25 (37.9%) HPV-positive cases and 41 (62.1%) HPV-negative cases. Pathologic stage at diagnosis was T1 in 46 cases, T2 in 49 cases, T3 in 41 cases, and T4 in 50 cases; and N0 in 78 cases, N1 in 37 cases, N2 in 66 cases, and N3 in five cases. Clinical staging was stage I in 34 cases, II in 20 cases, III in 59 cases, and IV in 73 cases. There were three HNSCC cases without TNM information (Table 1). We also included 86 saliva samples from pre-treatment HNSCC patients and 108 from control patients who were either undergoing surgeries for sleep apnea or tonsillectomy or attending community screening. As with the tissue samples, clinical and demographic variables were similar in cases and controls with the exception of age: HNSCC patients were older than controls (61.07 vs. 53.77 years). Primary tumor sites included OSCC, 42 cases; OPSCC, 28 cases; LSCC, 13 cases; and three cases without location information. Pathologic stage at diagnosis was T1 in 20 cases, T2 in 26 cases, T3 in 11 cases, T4 in 21 cases, and eight cases without information on tumor size; and N0 in 28 cases, N1 in six cases, N2 in 40 cases, and N3 in two cases. Clinical staging was stage I in 10 cases, II in 13 cases, III in 17 cases, and IV in 36 cases. There were ten HNSCC cases without node and stage information (Table 1).

A panel of seven mgmiR biomarkers was identified and validated in HNSCC patient tissues

To further identify mgmiRs that can distinguish HNSCC from control patient samples, we first examined the 12 mgmiRs in a small sample cohort including 30 HNSCC patients’ tissue samples and 25 age- and gender-matched normal head and neck tissues from either tonsillitis or sleep apnea patients. Based on the comparison of relative methylation levels, seven mgmiRs, i.e., 9-1, 124-1, 124-2, 124-3, 129-2, 137, and 148a, showed significant elevation in HNSCC tissue compared to non-HNSCC control tissues and were selected for further study (Additional file 1: Figure S3).

We then expanded examination of the seven mgmiRs to an additional 159 HNSCC and 67 non-HNSCC control cases. Fig. 1 and Additional file 2: Table S2 show the relative methylation levels of the seven mgmiRs in the total of 189 (30 + 159) HNSCC and 92 (25 + 67) non-HNSCC control tissues. The total methylation level in the HNSCC group was significantly higher (p < 0.0001) than that in the control group for all seven mgmiRs. All data were analyzed using both dichotomized and continuous variables. ROC curves for HNSCC detection were generated using either dichotomized variables or continuous variables (Additional file 1: Figure S4A for each mgmiR). As shown in Table 2, the univariate assessment of HNSCC diagnosis from each of the seven dichotomized mgmiR variables in tissue had sensitivities and specificities ranging from 43.9 to 77.3% and 85.9 to 100%, respectively. However, the combined assessment of the seven mgmiRs in tissue resulted in a higher sensitivity and a specificity that was within the range seen in the univariate assessment, 92.6 and 92.4%, respectively. The combined assessment of the seven mgmiRs yielded 92.5% accuracy, 96.2% PPV, and 85.9% NPV with an area under the curve (AUC) of 0.97 (Table 2). Since the age difference is a confounding factor, we also included age together with the seven mgmiRs in the combined assessment. As shown in Table 2, inclusion of age slightly enhanced specificity but did not significantly change the other parameters. Internal bootstrap validation gave an AUC of 0.97 with a 95% CI of 0.95–0.99. The results using continuous forms of the mgmiR variables as opposed to dichotomous forms were similar, with 93.6% sensitivity, 92.4% specificity, and AUC of 0.98 (Additional file 2: Table S3).
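The relationship between the reported sensitivity, specificity, accuracy, PPV, and NPV for the combined tissue panel can be checked with a small worked example. The confusion-matrix counts below are reconstructed by rounding from the reported sensitivity and specificity and the group sizes of 189 cases and 92 controls; they are an assumption of this sketch and are not taken from Table 2.

```python
def diagnostic_metrics(sensitivity, specificity, n_cases, n_controls):
    """Derive accuracy, PPV, and NPV from sensitivity, specificity, and group sizes.

    Counts are reconstructed by rounding, which is an assumption of this sketch.
    """
    tp = round(sensitivity * n_cases)      # true positives among cases
    tn = round(specificity * n_controls)   # true negatives among controls
    fn = n_cases - tp
    fp = n_controls - tn
    return {
        "accuracy": (tp + tn) / (n_cases + n_controls),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


metrics = diagnostic_metrics(0.926, 0.924, n_cases=189, n_controls=92)
print({name: round(100 * value, 1) for name, value in metrics.items()})
# {'accuracy': 92.5, 'ppv': 96.2, 'npv': 85.9}
```

Under this reconstruction (175 true positives, 14 false negatives, 85 true negatives, 7 false positives), the derived values agree with the accuracy, PPV, and NPV reported for the tissue panel.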

Fig. 1

Relative methylation levels for the seven mgmiRs (9-1, 124-1, 124-2, 124-3, 129-2, 137, 148a) in tissues from 189 HNSCC patients and 92 controls. The quantity of each methylated mgmiR is expressed as the fold change relative to the reference gene β-actin

Table 2 Univariate and multivariable logistic regression with dichotomized variables in tissues

MgmiRs were associated with HPV infection and could be detected in early cancer stage HNSCC tissues

We looked for associations between clinicopathological characteristics and the mgmiRs, either individually or in combination. For the mgmiR combination, we first estimated the probability of a tissue being positive using logistic regression with the seven mgmiRs and then assessed the association between a characteristic and the estimated status (case or control) of the tissue. In general, the associations between the mgmiRs and location of HNSCC, smoking status, tumor size, node status, and stage were stronger for the combination of the seven mgmiRs compared to their individual assessments (Table 3). Notably, mgmiR124-2 and mgmiR129-2 were found to detect significantly more HPV-positive than HPV-negative HNSCC cases (88.0 vs. 65.9%, p = 0.04; 84.0 vs. 56.1%, p = 0.02, respectively). There were no significant associations between tumor size or stage and the percentage of positive cases detected by either individual mgmiRs or the seven mgmiRs as a panel. Most importantly, however, 93.5% of T1 and 91.2% of stage I HNSCC could be detected by the seven mgmiRs as a panel (Table 3). Similarly, there were no significant associations with node status (N0 vs. N+), with the exception of mgmiR124-1, which detected more N0 cases than N+ cases (79.5 vs. 65.7%, p = 0.04, Table 3).

Table 3 Association between mgmiRs and clinicopathological characteristics in HNSCC tissues

The seven mgmiR biomarkers were validated in HNSCC patient saliva

We then examined the seven mgmiR biomarkers in saliva from 86 HNSCC patients and 108 normal controls. The relative methylation levels of the seven mgmiRs in HNSCC and control saliva are shown in Fig. 2 and Additional file 2: Table S4. The total methylation level in the HNSCC group was significantly higher than that in the control group across the seven mgmiRs. All data were analyzed using both dichotomized and continuous variables. ROC curves for HNSCC detection were generated using either dichotomized variables or continuous variables for each mgmiR (Additional file 1: Figure S4B). Using dichotomized variables, the sensitivity and specificity for HNSCC diagnosis in saliva ranged from 19.8 to 72.1% and 74.1 to 97.2%, respectively, for single mgmiRs, and were 76.7 and 86.1%, respectively, for the seven mgmiRs combined. The combined seven mgmiRs yielded 83.0% accuracy, 83.5% PPV, and 82.6% NPV with an AUC of 0.93 (Table 4). Inclusion of age together with the seven mgmiRs slightly increased specificity (Table 4). Internal bootstrap validation gave an AUC of 0.93 with a 95% CI of 0.89–0.96. The results using continuous variables were better than those using dichotomized variables, with 83.7% sensitivity, 95.4% specificity, and AUC of 0.95 (Additional file 2: Table S5).

Fig. 2

Relative methylation levels for the seven mgmiRs (9-1, 124-1, 124-2, 124-3, 129-2, 137, 148a) in saliva from 86 HNSCC patients and 108 controls. The quantity of each methylated mgmiR is expressed as the fold change relative to the reference gene β-actin

Table 4 Univariate and multivariable logistic regression with dichotomized variables in saliva

MgmiRs were detected in the saliva of early-stage HNSCC patients

Similar to the tissue sample results, the associations between the mgmiRs and location of HNSCC, smoking status, tumor size, node status, and stage were generally stronger for the combination of the seven mgmiRs compared to their individual assessments (Table 5). The combination of seven mgmiRs can detect 71.4, 82.1, and 76.9% of OSCC, OPSCC, and LSCC, respectively. However, there was no significant association between tumor location and positive cases detected by mgmiR either individually or in combination (Table 5). Importantly, 85.0% of T1 HNSCC and 80.0% of stage I HNSCCs could be detected by the panel of mgmiRs, although there were no significant associations between tumor size, node status, stage, and mgmiR-positive cases detected by either individual mgmiR or the combination (Table 5).

Table 5 Association between mgmiRs and clinicopathological characteristics in HNSCC saliva

Complementary prediction accuracy and performance assessment further validated the panel of mgmiR biomarkers in HNSCC

The mgmiR biomarkers were further evaluated for tissue and saliva samples using random forest models. As with the combined logistic regression model reported above, the random forest models included all seven mgmiR biomarkers. In addition, the influence of demographic information (age, gender, and smoking status) on prediction performance was assessed. Excluding demographic information, the randomly selected tissue training dataset had 186 subjects, 62 (33%) controls and 124 (67%) cancers, and the independent testing dataset had 92 subjects, 30 (33%) controls and 62 (67%) cancers. With the inclusion of demographic information, the tissue training dataset had 170 subjects, 60 (35%) controls and 110 (65%) cancers, and the testing dataset had 84 subjects, 30 (36%) controls and 54 (64%) cancers, owing to missing values in the demographic variables. In contrast, no demographic information was missing for the saliva samples, and therefore the training and testing datasets were the same for models with and without demographic information. The saliva training dataset had 130 subjects, 72 (55%) controls and 58 (45%) cancers, and the testing dataset had 64 subjects, 36 (56%) controls and 28 (44%) cancers. Model performance (i.e., AUC, sensitivity, and specificity) using the random forest approach was similar to the combined seven-mgmiR logistic regression results for both the tissue and saliva samples when demographic information was not included (Table 6). The mean decrease in Gini index was used to evaluate variable importance. The demographic variables were found to be the least important variables in both the tissue and saliva samples (Additional file 1: Figures S5-S6). However, including demographic information in the random forest models resulted in slightly but not significantly higher AUCs, sensitivities, and specificities (Table 6).
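The 2/3 to 1/3 split can be illustrated with a class-stratified sketch. Stratifying by class and rounding the training size up within each class are assumptions made here; they happen to reproduce the group sizes quoted above for the models without demographic information, but the partitioning procedure actually used is not described at that level of detail. The function name is hypothetical.

```python
import random


def stratified_split(labels, train_num=2, train_den=3, seed=0):
    """Split sample indices into training and testing sets within each class.

    The training size per class is ceil(train_num / train_den * class size),
    computed with integer arithmetic to avoid floating-point rounding issues.
    """
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)  # random sampling within the class
        n_train = (train_num * len(idx) + train_den - 1) // train_den
        train += idx[:n_train]
        test += idx[n_train:]
    return train, test


# Group sizes taken from the text: 92 controls (0) and 186 cancers (1) in tissue
tissue_labels = [0] * 92 + [1] * 186
train, test = stratified_split(tissue_labels)
print(len(train), len(test))  # 186 92
```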

Table 6 Random forest model results for tissue and saliva samples using either mgmiRs and demographic variables or the mgmiRs alone


Early detection of cancers in high-risk populations has increased patient survival rates, as exemplified by colonoscopy for colon cancer, Pap smear for cervical cancer, and mammography for breast cancer [32]. Recently, the FDA approved the use of a stool-DNA test as a screening tool for colon cancer. This stool-DNA test, which can be used ahead of colonoscopy, significantly improved the sensitivity and specificity of colon cancer detection, improved patient compliance, and reduced costs [23]. Nearly all current diagnostic methods for HNSCC in the clinic rely on physical examination, medical imaging, and endoscopy techniques. Molecular tests for HNSCC have been widely explored [7, 19]. However, their application in the clinic has been limited by their poor performance (i.e., low sensitivity and specificity). In this study, we show that a molecular test with high sensitivity and specificity can be achieved by examining DNA methylation at the genomic loci of microRNA (mgmiR). This is the first study to evaluate the effectiveness of mgmiR, a novel class of molecule, as a biomarker panel for HNSCC diagnosis using tissue and saliva samples from a large HNSCC population.

Our “mgmiR” technique combines several advantages of different molecular test technologies: (1) It is a genomic DNA-based technique. Analysis of RNA or protein in saliva samples relies on the stability of the mRNA or protein, which is always a challenge as saliva harbors high levels of RNases and proteases [33, 34]. In contrast, genomic DNA is more stable and less susceptible to the effects of storage and shipping than RNA or protein. (2) It is a methylation-based technique. In contrast with examining the expression level of mRNA, miRNA itself [35], or protein, which can be either positive (increased) or negative (decreased), DNA methylation converts a negative signal (reduced or lost expression) into a positive signal, which can be used to detect a cancer-specific signal in a high background of normal non-cancer cells. In contrast with mutations, which usually occur at multiple sites in individual tumors, DNA methylation usually occurs in the same region of a gene, such as the promoter, which greatly simplifies the design and interpretation of screening tests. In addition, DNA methylation occurs at an early stage of cancer development, providing the opportunity for early detection of cancer [19]. It also has tissue specificity, providing a foundation for detection of tumors of unknown primary, as exemplified by the EPICUP clinical trial study [36,37,38]. (3) Instead of detecting methylation of coding genes, mgmiR detects methylation of genomic loci encoding microRNA. Given the lower number of miRNAs compared to coding mRNAs, the mgmiR-based technique will complement, or at least provide novel biomarkers for, current DNA methylation-based HNSCC detection. (4) It is a qPCR-based technique, which is more widely accepted in clinical laboratories. It is highly sensitive, simple, easy to handle, and less costly than a mutation search using next-generation sequencing (NGS).

While the development of “liquid biopsy” is focused almost exclusively on the analysis of cell-free DNA and/or circulating tumor cells in the blood, saliva, together with sputum, urine, etc., constitutes a major resource for molecular testing using “body fluids.” Blood-based liquid biopsy has been largely applied to patients with late-stage cancers or serves as a companion biomarker for treatment. Its sensitivity and specificity may limit its applicability for detection of early-stage cancers. For example, saliva has been shown to be a more sensitive predictor than blood for detection of HNSCC [17]. Using the most sensitive recently developed NGS approach, only about 50% of colon cancer patients can be detected at stage I, in comparison with ~ 90% of patients that can be detected at stages II–IV [39]. Although we do not have data from blood to directly compare with the data from saliva, the sensitivity from saliva for stage I HNSCC patients in our study is 80%, which is higher than the results from blood reported in the literature [39]. Thus, it is likely that the saliva-based method is more sensitive and can detect cancer earlier than the blood-based method. In addition, for screening purposes, the non-invasive, self-sampling, convenient, and low-cost nature of a saliva test makes it a better screening tool, particularly for rural or community primary care settings [40].

While this report focuses on identification and validation of a panel of mgmiRs as biomarkers for HNSCC by rigorous statistical analysis in a large sample, the functions and mechanisms of individual mgmiRs and the corresponding miRNAs in HNSCC pathogenesis are not covered here. One interesting question is how the miRNAs are regulated by their mgmiRs. We have reported that miR9 expression is restored upon treatment with the DNA demethylating agent 5-aza-cytidine [29], and similar results have been obtained for miR124, miR137, miR129-2, and miR148a (manuscripts in preparation), indicating that DNA methylation of mgmiRs is a common mechanism for silencing miRNA expression. Another interesting question is how each individual miRNA functions in HNSCC pathogenesis. While detailed functional analysis, including identification of their downstream target genes, is beyond the scope of this report, we have published the tumor-suppressive role of miR9 in cell proliferation [29] and are preparing separate manuscripts characterizing the functional roles of individual miRNAs and their target genes in HNSCC pathogenesis.

Our controls include a substantial number of tonsillitis patients; however, we have shown that inflammation does not affect the ability of mgmiRs to distinguish cancer from controls, suggesting this panel of mgmiRs is cancer-specific. An interesting focus of study is why mgmiRs are methylated in cancer but not in normal tissue. We do not yet understand the underlying mechanism, but reported aberrant regulation of DNA methyltransferases (DNMTs) by either smoking or HPV could be a potential mechanism [41, 42]. We have not tested these mgmiRs in other cancers. However, there are reports of mgmiR124-2 detection in cervical cancers [43]. Interestingly, both cervical cancer and HNSCC are squamous cell carcinomas, and HPV is the common etiological factor in both cervical cancer and HPV+ HNSCC patients. Whether this marker represents a common marker for HPV-related squamous cell carcinoma is a potential area of future investigation.

The association between the mgmiR biomarkers and clinicopathological data yields several interesting findings. (1) Statistical analysis showed no significant difference in the proportion of positive cases detected by mgmiRs across tumor size, node involvement, or tumor stage. More importantly, T1 tumors were detected in 93.5% of tissues and 85.0% of saliva samples, N0 tumors were detected in 94.9% of tissues and 75% of saliva samples, and stage I HNSCC patients were detected in 91.2% of tissues and 80.0% of saliva samples. These data clearly demonstrate the clinical usefulness of the mgmiR panel for early detection of HNSCC. (2) Although the percentages of LSCC cases detected, by either individual mgmiRs or the seven mgmiRs as a panel, were lower than those for OSCC and OPSCC tumor tissues, the difference is not statistically significant. However, mgmiR148a seems to be limited in its ability to detect LSCC, and it may be possible to use the remaining six mgmiRs for LSCC detection in saliva. One question here is whether the age difference between the control group and the HNSCC group affects the methylation of mgmiRs, since aging has been shown to affect both methylation and cancer development [44, 45]. We therefore included age as a confounder alongside the seven mgmiRs and compared prediction performance with and without age (Tables 2 and 4). We also fitted a random forest model of the seven mgmiRs together with age, gender, and smoking status (Table 6). Age itself contributes only slightly to the prediction and ranks lowest relative to the seven mgmiRs. Nevertheless, it would be ideal for future clinical studies to include more age-matched controls whenever possible.

While the incidence of HPV-negative HNSCC has gradually declined owing to smoking cessation, the incidence of HPV-positive HNSCC has increased more than twofold in the past 20 years [46]. In our data, smoking status does not affect the results of the assay. However, mgmiR124-2 and mgmiR129-2 detected more HPV-positive HNSCC cases than HPV-negative HNSCC cases. Unfortunately, we do not currently have HPV data on the non-HNSCC control patients, which would help us to determine whether this association is HPV-associated or HPV-cancer-associated; this warrants a future study. One of the problems with current HPV detection for either HNSCC or cervical cancer is the high number of patients who are HPV positive without having disease. Detection of HPV alone cannot discriminate truly oncogenic infection from transient “bystander” infection. Immunostaining of the surrogate marker p16 on tissue sections frequently generates false-positive results [47]. Thus, similar to the triage strategy used in cervical cancer [48], the detection of mgmiRs in saliva samples will be useful to further separate HPV-positive patients with high-risk oncogenic infection from those with low-risk “passenger” infection, and it holds great promise as a triage tool for HPV-positive HNSCC diagnostics.


We have successfully developed a diagnostic panel of mgmiR markers that demonstrated superior sensitivity and specificity in the detection of HNSCC in both tissues and saliva. The high sensitivity of this panel in detecting T1, N0, and stage I HNSCC suggests its value in the early detection of HNSCC. Further assessment using pre-malignant lesions and prospective monitoring of high-risk patients will measure its usefulness in the clinic. The quantitative nature of the assay makes the mgmiR panel an ideal candidate for surveillance of HNSCC patients, including assessment of treatment efficacy, prognosis, and prediction of recurrence and metastasis. Evaluation of these clinical applications using this panel of mgmiR markers is under way.
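As a hedged illustration of how the sensitivity and specificity reported for the seven-mgmiR panel relate to the cohort sizes, the sketch below recomputes both measures from 2×2 counts. The counts are not reported in this text: they are back-calculated here from the published percentages and cohort sizes (189 HNSCC and 92 control tissues; 86 HNSCC and 108 control saliva samples), on the assumption that every sample entered the final model. The function name `diagnostic_accuracy` is hypothetical.

```python
def diagnostic_accuracy(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as percentages from 2x2 counts.

    tp/fn: cancer cases called positive/negative by the panel.
    tn/fp: controls called negative/positive by the panel.
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# ASSUMED counts, back-calculated from the reported percentages and
# cohort sizes; they are not taken from the paper's tables.
tissue = diagnostic_accuracy(tp=175, fn=14, tn=85, fp=7)    # 189 cases, 92 controls
saliva = diagnostic_accuracy(tp=66, fn=20, tn=93, fp=15)    # 86 cases, 108 controls

for name, (sens, spec) in (("tissue", tissue), ("saliva", saliva)):
    print(f"{name}: sensitivity {sens:.1f}%, specificity {spec:.1f}%")
```

Under these assumed counts, the rounded values agree with the reported 92.6 and 92.4% for tissue and 76.7 and 86.1% for saliva.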



AUC: Area under the ROC curve
HNSCC: Head and neck squamous cell carcinoma
HPV: Human papillomavirus
LSCC: Larynx SCC
mgmiR: Methylation of genomic loci encoding microRNA
NPV: Negative predictive value
OPSCC: Oropharynx SCC
OSCC: Oral squamous cell carcinoma
PPV: Positive predictive value
qMS-PCR: Quantitative methylation-specific PCR
ROC: Receiver operating curve


  1. Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Global cancer statistics. CA Cancer J Clin. 2011;61(2):69–90.
  2. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2017. CA Cancer J Clin. 2017;67(1):7–30.
  3. Argiris A, Karamouzis MV, Raben D, Ferris RL. Head and neck cancer. Lancet. 2008;371(9625):1695–709.
  4. Algazi AP, Grandis JR. Head and neck cancer in 2016: a watershed year for improvements in treatment? Nat Rev Clin Oncol. 2017;14(2):76–8.
  5. Adelstein D, Gillison ML, Pfister DG, Spencer S, Adkins D, Brizel DM, Burtness B, Busse PM, Caudell JJ, Cmelak AJ, et al. NCCN guidelines insights: head and neck cancers, version 2.2017. J Natl Compr Canc Netw. 2017;15(6):761–70.
  6. Vermorken JB, Specenier P. Optimal treatment for recurrent/metastatic head and neck cancer. Ann Oncol. 2010;21(Suppl 7):vii252–61.
  7. Mydlarz WK, Hennessey PT, Califano JA. Advances and perspectives in the molecular diagnosis of head and neck cancer. Expert Opin Med Diagn. 2010;4(1):53–65.
  8. Brinkman BM, Wong DT. Disease mechanism and biomarkers of oral squamous cell carcinoma. Curr Opin Oncol. 2006;18(3):228–33.
  9. Mao L. Can molecular assessment improve classification of head and neck premalignancy? Clin Cancer Res. 2000;6(2):321–2.
  10. Rosin MP, Cheng X, Poh C, Lam WL, Huang Y, Lovas J, Berean K, Epstein JB, Priddy R, Le ND, et al. Use of allelic loss to predict malignant risk for low-grade oral epithelial dysplasia. Clin Cancer Res. 2000;6(2):357–62.
  11. Carvalho AL, Jeronimo C, Kim MM, Henrique R, Zhang Z, Hoque MO, Chang S, Brait M, Nayak CS, Jiang WW, et al. Evaluation of promoter hypermethylation detection in body fluids as a screening/diagnosis tool for head and neck squamous cell carcinoma. Clin Cancer Res. 2008;14(1):97–107.
  12. Pattani KM, Zhang Z, Demokan S, Glazer C, Loyo M, Goodman S, Sidransky D, Bermudez F, Jean-Charles G, McCaffrey T, et al. Endothelin receptor type B gene promoter hypermethylation in salivary rinses is independently associated with risk of oral cavity cancer and premalignancy. Cancer Prev Res (Phila). 2010;3(9):1093–103.
  13. Zhang L, Poh CF, Williams M, Laronde DM, Berean K, Gardner PJ, Jiang H, Wu L, Lee JJ, Rosin MP. Loss of heterozygosity (LOH) profiles—validated risk predictors for progression to oral cancer. Cancer Prev Res (Phila). 2012;5(9):1081–9.
  14. Rosin MP, Lam WL, Poh C, Le ND, Li RJ, Zeng T, Priddy R, Zhang L. 3p14 and 9p21 loss is a simple tool for predicting second oral malignancy at previously treated oral cancer sites. Cancer Res. 2002;62(22):6447–50.
  15. Carvalho AL, Henrique R, Jeronimo C, Nayak CS, Reddy AN, Hoque MO, Chang S, Brait M, Jiang WW, Kim MM, et al. Detection of promoter hypermethylation in salivary rinses as a biomarker for head and neck squamous cell carcinoma surveillance. Clin Cancer Res. 2011;17(14):4782–9.
  16. Sun W, Zaboli D, Wang H, Liu Y, Arnaoutakis D, Khan T, Khan Z, Koch WM, Califano JA. Detection of TIMP3 promoter hypermethylation in salivary rinse as an independent predictor of local recurrence-free survival in head and neck cancer. Clin Cancer Res. 2012;18(4):1082–91.
  17. Wang Y, Springer S, Mulvey CL, Silliman N, Schaefer J, Sausen M, James N, Rettig EM, Guo T, Pickering CR, et al. Detection of somatic mutations and HPV in the saliva and plasma of patients with head and neck squamous cell carcinomas. Sci Transl Med. 2015;7(293):293ra104.
  18. Jones PA, Baylin SB. The epigenomics of cancer. Cell. 2007;128(4):683–92.
  19. Ha PK, Califano JA. Promoter methylation and inactivation of tumour-suppressor genes in oral squamous-cell carcinoma. Lancet Oncol. 2006;7(1):77–82.
  20. Towle R, Truong D, Hogg K, Robinson WP, Poh CF, Garnis C. Global analysis of DNA methylation changes during progression of oral cancer. Oral Oncol. 2013;49(11):1033–42.
  21. Heyn H, Esteller M. DNA methylation profiling in the clinic: applications and challenges. Nat Rev Genet. 2012;13(10):679–92.
  22. Siegal T. Clinical impact of molecular biomarkers in gliomas. J Clin Neurosci. 2015;22(3):437–44.
  23. Imperiale TF, Ransohoff DF, Itzkowitz SH, Levin TR, Lavin P, Lidgard GP, Ahlquist DA, Berger BM. Multitarget stool DNA testing for colorectal-cancer screening. N Engl J Med. 2014;370(14):1287–97.
  24. Garzon R, Marcucci G, Croce CM. Targeting microRNAs in cancer: rationale, strategies and challenges. Nat Rev Drug Discov. 2010;9(10):775–89.
  25. Lujambio A, Lowe SW. The microcosmos of cancer. Nature. 2012;482(7385):347–55.
  26. Davalos V, Esteller M. MicroRNAs and cancer epigenetics: a macrorevolution. Curr Opin Oncol. 2010;22(1):35–45.
  27. Iorio MV, Piovan C, Croce CM. Interplay between microRNAs and the epigenetic machinery: an intricate network. Biochim Biophys Acta. 2010;1799(10-12):694–701.
  28. Lopez-Serra P, Esteller M. DNA methylation-associated silencing of tumor-suppressor microRNAs in cancer. Oncogene. 2012;31(13):1609–22.
  29. Minor J, Wang X, Zhang F, Song J, Jimeno A, Wang XJ, Lu X, Gross N, Kulesz-Martin M, Wang D, et al. Methylation of microRNA-9 is a specific and sensitive biomarker for oral and oropharyngeal squamous cell carcinomas. Oral Oncol. 2012;48(1):73–8.
  30. de Roda Husman AM, Walboomers JM, van den Brule AJ, Meijer CJ, Snijders PJ. The use of general primers GP5 and GP6 elongated at their 3′ ends with adjacent highly conserved sequences improves human papillomavirus detection by PCR. J Gen Virol. 1995;76(Pt 4):1057–62.
  31. Reiser B. Measuring the effectiveness of diagnostic markers in the presence of measurement error through the use of ROC curves. Stat Med. 2000;19(16):2115–29.
  32. Schiffman JD, Fisher PG, Gibbs P. Early detection of cancer: past, present, and future. Am Soc Clin Oncol Educ Book. 2015:57–65.
  33. Palanisamy V, Wong DT. Transcriptomic analyses of saliva. Methods Mol Biol. 2010;666:43–51.
  34. Xiao H, Wong DT. Proteomics and its applications for biomarker discovery in human saliva. Bioinformation. 2010;5(7):294–6.
  35. Majem B, Rigau M, Reventos J, Wong DT. Non-coding RNAs in saliva: emerging biomarkers for molecular diagnostics. Int J Mol Sci. 2015;16(4):8676–98.
  36. Moran S, Martinez-Cardus A, Boussios S, Esteller M. Precision medicine based on epigenomics: the paradigm of carcinoma of unknown primary. Nat Rev Clin Oncol. 2017;14(11):682–94.
  37. Gangaraju VK, Lin H. MicroRNAs: key regulators of stem cells. Nat Rev Mol Cell Biol. 2009;10(2):116–25.
  38. Rosenfeld N, Aharonov R, Meiri E, Rosenwald S, Spector Y, Zepeniuk M, Benjamin H, Shabes N, Tabak S, Levy A, et al. MicroRNAs accurately identify cancer tissue origin. Nat Biotechnol. 2008;26(4):462–9.
  39. Phallen J, Sausen M, Adleff V, Leal A, Hruban C, White J, Anagnostou V, Fiksel J, Cristiano S, Papp E, et al. Direct detection of early-stage cancers using circulating tumor DNA. Sci Transl Med. 2017;9(403):eaan2415.
  40. Wang X, Kaczor-Urbanowicz KE, Wong DT. Salivary biomarkers in cancer detection. Med Oncol. 2017;34(1):7.
  41. Lin RK, Hsieh YS, Lin P, Hsu HS, Chen CY, Tang YA, Lee CF, Wang YC. The tobacco-specific carcinogen NNK induces DNA methyltransferase 1 accumulation and tumor suppressor gene hypermethylation in mice and lung cancer patients. J Clin Invest. 2010;120(2):521–32.
  42. Au Yeung CL, Tsang WP, Tsang TY, Co NN, Yau PL, Kwok TT. HPV-16 E6 upregulation of DNMT1 through repression of tumor suppressor p53. Oncol Rep. 2010;24(6):1599–604.
  43. De Strooper LMA, Verhoef VMJ, Berkhof J, Hesselink AT, de Bruin HME, van Kemenade FJ, Bosgraaf RP, Bekkers RLM, Massuger L, Melchers WJG, et al. Validation of the FAM19A4/mir124-2 DNA methylation test for both lavage- and brush-based self-samples to detect cervical (pre)cancer in HPV-positive women. Gynecol Oncol. 2016;141(2):341–7.
  44. Jung M, Pfeifer GP. Aging and DNA methylation. BMC Biol. 2015;13:7.
  45. Finkel T, Serrano M, Blasco MA. The common biology of cancer and ageing. Nature. 2007;448(7155):767–74.
  46. Chaturvedi AK, Engels EA, Pfeiffer RM, Hernandez BY, Xiao W, Kim E, Jiang B, Goodman MT, Sibug-Saber M, Cozen W, et al. Human papillomavirus and rising oropharyngeal cancer incidence in the United States. J Clin Oncol. 2011;29(32):4294–301.
  47. Ross K, Pailler E, Faugeroux V, Taylor M, Oulhen M, Auger N, Planchard D, Soria JC, Lindsay CR, Besse B, et al. The potential diagnostic power of circulating tumor cell analysis for non-small-cell lung cancer. Expert Rev Mol Diagn. 2015;15(12):1605–29.
  48. Boers A, Bosgraaf RP, van Leeuwen RW, Schuuring E, Heideman DA, Massuger LF, Verhoef VM, Bulten J, Melchers WJ, van der Zee AG, et al. DNA methylation analysis in self-sampled brush material as a triage test in hrHPV-positive women. Br J Cancer. 2014;111(6):1095–101.



We thank all participating subjects for their kind cooperation in this study.


This work was supported by the Whedon Cancer Detection Foundation, University of Colorado Cancer Center, Cancer League of Colorado, and the University of Colorado Academic Enrichment Fund (to S. Lu), the Dick Brown Head and Neck Research Fund (to J. Song), China Postdoctoral Science Foundation 2017M621178 (to Y. Cao), and the National Institutes of Health (R01 DE026125 to D. Pyeon). S.L.L. is an investigator of the THANC Foundation.

Author information

All authors made substantial contributions to the conception and design and to the acquisition, analysis, or interpretation of the data. LSL and JS designed the study. YC, KG, SQ, BM, LL, HH, NL, LG, FH, MW, YM, FJ, JX, and MW performed mgmiR experiments. TX and DP did the HPV test in clinical samples. DG and DS did the biostatistical analysis. EH, SB, and NG consented patients and provided clinical samples. All authors read and approved the final manuscript.

Correspondence to John Song or Shi-Long Lu.

Ethics declarations

Ethics approval and consent to participate

This study was conducted on human HNSCC samples from both the University of Colorado Anschutz Medical Campus and the Oregon Health & Science University (OHSU) under the Institutional Review Board approval protocols from each institution. Consent was obtained from each participant.

Consent for publication

Written informed consent was obtained from all study participants according to institutional guidelines.

Competing interests

SL Lu and J Song are listed as inventors on a PCT application “Biomarkers for head and neck cancer and methods of their use.” The other authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:

Figure S1. Flow chart of the mgmiR search for HNSCC. Figure S2. Screening of mgmiRs in HNSCC using HNSCC and control cell lines. Relative methylation level of the mgmiRs examined by qMS-PCR in 12 HNSCC cell lines (HNSCC) and 4 head and neck control cell lines (Normal). Red frames highlight mgmiRs with a significant difference between HNSCC and normal (p < 0.05). Figure S3. Selection of mgmiRs in HNSCC using HNSCC and control tissues. Relative methylation level of the mgmiRs examined by qMS-PCR in 30 HNSCC tissues (HNSCC) and 25 control tissues (Normal). Red frames highlight mgmiRs with a significant difference between HNSCC and normal (p < 0.05). Figure S4. ROC curves using continuous variables for HNSCC detection. (A) ROC curves comparing the seven mgmiRs with the largest areas under the curve for tissues. (B) ROC curves comparing the seven mgmiRs with the largest areas under the curve for saliva. Figure S5. Variable importance plot from the random forest analysis for tissue data including (A) or excluding (B) demographic information. Figure S6. Variable importance plot from the random forest analysis for saliva data including (A) or excluding (B) demographic information. (ZIP 509 kb)

Additional file 2:

Table S1. Relative methylation levels of mgmiRs in human HNSCC cell lines and normal head and neck cell lines. Table S2. Relative methylation level in tissues from 189 HNSCC patients and 92 normal controls. Table S3. Univariate and multivariable logistic regression with continuous variables in tissues. Table S4. Relative methylation level in saliva from 86 HNSCC patients and 108 normal controls. Table S5. Univariate and multivariable logistic regression with continuous variables in saliva. (DOCX 42 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Cao, Y., Green, K., Quattlebaum, S. et al. Methylated genomic loci encoding microRNA as a biomarker panel in tissue and saliva for head and neck squamous cell carcinoma. Clin Epigenet 10, 43 (2018).



  • Head and neck squamous cell carcinoma
  • microRNA
  • DNA methylation
  • Biomarker
  • Tissue
  • Saliva