Epigenetic biomarkers of ageing are predictive of mortality risk in a longitudinal clinical cohort of individuals diagnosed with oropharyngeal cancer
Clinical Epigenetics volume 14, Article number: 1 (2022)
Epigenetic clocks are biomarkers of ageing derived from DNA methylation levels at a subset of CpG sites. The difference between age predicted by these clocks and chronological age, termed “epigenetic age acceleration”, has been shown to predict age-related disease and mortality. We aimed to assess the prognostic value of epigenetic age acceleration and a DNA methylation-based mortality risk score with all-cause mortality in a prospective clinical cohort of individuals with head and neck cancer: Head and Neck 5000. We investigated two markers of intrinsic epigenetic age acceleration (IEAAHorvath and IEAAHannum), one marker of extrinsic epigenetic age acceleration (EEAA), one optimised to predict physiological dysregulation (AgeAccelPheno), one optimised to predict lifespan (AgeAccelGrim) and a DNA methylation-based predictor of mortality (ZhangScore). Cox regression models were first used to estimate adjusted hazard ratios (HR) and 95% confidence intervals (CI) for associations of epigenetic age acceleration with all-cause mortality in people with oropharyngeal cancer (n = 408; 105 deaths). The added prognostic value of epigenetic markers compared to a clinical model including age, sex, TNM stage and HPV status was then evaluated.
IEAAHannum and AgeAccelGrim were associated with mortality risk after adjustment for clinical and lifestyle factors (HRs per standard deviation [SD] increase in age acceleration = 1.30 [95% CI 1.07, 1.57; p = 0.007] and 1.40 [95% CI 1.06, 1.83; p = 0.016], respectively). There was weak evidence that the addition of AgeAccelGrim to the clinical model improved 3-year mortality prediction (area under the receiver operating characteristic curve: 0.80 vs. 0.77; p value for difference = 0.069).
In the setting of a large, clinical cohort of individuals with head and neck cancer, our study demonstrates the potential of epigenetic markers of ageing to enhance survival prediction in people with oropharyngeal cancer, beyond established prognostic factors. Our findings have potential uses in both clinical and non-clinical contexts: to aid treatment planning and improve patient stratification.
Oropharyngeal cancer (OPC), which includes cancers of the soft palate, base of tongue, uvula, palatine tonsils and tonsillar pillars, is the second most commonly diagnosed head and neck cancer (HNC) in the UK, with an age-standardised incidence rate of 2.9 per 100,000 persons. Risk factors include smoking, alcohol consumption and human papillomavirus (HPV) infection. Estimated 5-year survival rates for people with OPC vary from 35 to 83% [3, 4]. The ability to estimate survival probabilities at the time of diagnosis is therefore important for clinical decision making and for enrolment of low-risk individuals into treatment de-escalation trials.
HPV positivity, primarily HPV16, is a major determinant of OPC prognosis [6,7,8]. Compared to people with non-HPV-driven tumours, people with HPV-driven tumours have a 60% reduced risk of death at 3 years post-diagnosis. Consequently, HPV status is now included in prognostic models alongside TNM stage and comorbidity [8,9,10,11]. One such model has yielded a Harrell’s concordance statistic (C-statistic) of 0.68 (95% confidence interval [CI] 0.65, 0.71) in external validation, indicating good (but not excellent) prediction. The potential for model improvement is currently being explored and the prognostic value of lifestyle factors has been evaluated [13,14,15,16,17,18,19]. The prognostic role of epigenetic biomarkers is less well studied.
Epigenetic biomarkers of ageing (“epigenetic clocks”), which are multivariate predictors of biological age based on DNA methylation (DNAm) levels at a subset of CpGs across the genome, are demonstrating promise in predicting age-related disease and mortality [20,21,22]. However, most studies evaluating the prognostic utility of these epigenetic clocks have been conducted in general (healthy) populations [22,23,24], and there is a paucity of studies focusing on clinical populations. One study used a Cox model to estimate hazard ratios (HRs) for the association between epigenetic age acceleration (EAA), that is, the difference between age predicted by the epigenetic clocks and chronological age, and risk of death following cancer diagnosis (n = 1726 deaths). After adjusting for socio-demographic and lifestyle variables, the authors found limited evidence (HR 1.04, 95% CI 1.00–1.09) of an association with EAA based on an epigenetic clock derived from methylation at 353 CpG sites (EEAHorvath). However, mortality risk was 28% higher (HR 1.28, 95% CI 1.11–1.47) for the highest versus lowest quartile of age acceleration based on an epigenetic clock derived from methylation at 71 CpG sites (EEAHannum).
In this study, we investigated six epigenetic biomarkers in relation to survival in a cohort of individuals with OPC (n = 408). We examined both “first generation” epigenetic clocks, derived from DNAm levels at CpG sites found to be strongly associated with chronological age, and two more recently derived clocks: one optimised to predict physiological dysregulation and one optimised to predict lifespan. We also examined the association of a DNAm-based mortality risk score with survival.
In stage one of our analyses, we examined the associations of the six epigenetic biomarkers with survival using Cox regression models, with and without adjustment for factors known to influence epigenetic ageing. In the second stage, we implemented flexible parametric survival models to investigate the added prognostic value of epigenetic markers compared to a standard clinical model that included age, sex, TNM stage and HPV status.
The study population included a subset of individuals with OPC enrolled in the Head and Neck 5000 (H&N5000) study, a prospective, UK-based, clinical cohort study of people with HNC (n = 5518) [28, 29]. H&N5000 was approved by the National Research Ethics Committee (South West Frenchay Ethics Committee, 10/H0107/57) on 5th November 2010 and approved by the Research and Development departments of participating NHS Trusts.
Individuals were selected based on pre-treatment clinical coding of OPC and the availability of baseline questionnaire and clinical data-capture information. Where possible, pathology reports of individual cases were subsequently checked to verify tumour site and subtype. Overall, 5474/5518 (99%) data-capture forms were completed, and 3361/5385 (62%) individuals returned all three baseline questionnaires.
Baseline data collection
Consent was wide-ranging, including permission to: collect, store and use biological samples; carry out genetic analyses; collect information from hospital records and through self-reported questionnaires; and obtain mortality data through electronic record linkage . Baseline collection was completed pre-treatment, unless the individual’s diagnosis and treatment were the same procedure (e.g. tonsillectomy), in which case recruitment and baseline procedures were completed within a month of the diagnostic procedure. Blood samples (n = 4676, 85%) were sent to the study laboratory (https://www.bristol.ac.uk/population-health-sciences/research/groups/bblabs/) at ambient temperature for processing. They were centrifuged at 3500 rpm for 10 min and the buffy coat layer used for DNA extraction. Additional samples were frozen and stored at − 80 °C.
Assessment of HPV status
HPV serologic testing for HPV16 (E6, E7, E1, E2, E4 and L1) antibodies was conducted at the German Cancer Research Center (DKFZ) using glutathione S‐transferase multiplex assays. HPV16 E6 seropositivity (a marker of HPV‐transformed tumour cells) was indicated if HPV16 E6 median fluorescence intensity (MFI) was > 1000 units [31, 32].
DNA methylation profiling
DNA was bisulphite-converted using the Zymo EZ DNA Methylation™ kit (Zymo, Irvine, CA, USA) and genome-wide methylation data were generated using the Infinium MethylationEPIC BeadChip (EPIC array; Illumina, USA). Raw data files were pre-processed using the R package meffil (https://github.com/perishky/meffil/). Overall, 440/448 samples passed quality control and were normalised (Fig. 1). Further details are provided in the Supplementary Material (Additional file 3).
Estimation of epigenetic age
DNAm data for a subset of CpGs on the EPIC array (n = 27,523) and an annotation file containing data on chronological age, sex and tissue type were uploaded onto the DNAm Age Calculator (https://dnamage.genetics.ucla.edu/) (Additional file 3: Supplementary Methods). The following epigenetic ageing measures were obtained: intrinsic epigenetic age acceleration based on Horvath’s multi-tissue predictor (IEAA); intrinsic epigenetic age acceleration based on Hannum’s predictor (IEAAHannum); extrinsic epigenetic age acceleration (EEAA), an enhanced version based on Hannum’s method, which up-weights the contribution of blood cell composition; PhenoAge (AgeAccelPheno); and GrimAge (AgeAccelGrim). An overview of the age predictors is provided in Table 1. In each case, age acceleration was defined as the residual obtained from regressing predicted age, as estimated by the epigenetic clock, on chronological age.
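The residual definition of age acceleration can be illustrated with a minimal sketch. It assumes a simple least-squares regression of predicted age on chronological age; the function name and the ages below are hypothetical, not study data, and the online calculator's intrinsic measures are understood to adjust additionally for estimated blood cell composition, which this sketch omits.

```python
from statistics import mean


def age_acceleration(predicted_age, chronological_age):
    """Return residuals from a least-squares fit of predicted age on chronological age."""
    x_bar = mean(chronological_age)
    y_bar = mean(predicted_age)
    sxx = sum((x - x_bar) ** 2 for x in chronological_age)
    sxy = sum(
        (x - x_bar) * (y - y_bar)
        for x, y in zip(chronological_age, predicted_age)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Positive residual: the clock predicts an older age than expected.
    return [
        y - (intercept + slope * x)
        for x, y in zip(chronological_age, predicted_age)
    ]


# Hypothetical ages in years, for illustration only (not study data).
chronological = [45.0, 52.0, 58.0, 63.0, 70.0]
dnam_age = [47.5, 50.0, 61.0, 62.0, 74.5]

eaa = age_acceleration(predicted_age=dnam_age, chronological_age=chronological)
for age, resid in zip(chronological, eaa):
    print(f"age {age:.0f}: age acceleration {resid:+.2f} years")
```

By construction, these residuals sum to zero and are uncorrelated with chronological age in the sample used to fit the regression.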
Generation of the DNAm-based mortality predictor in H&N5000
The epigenetic predictor for mortality (ZhangScore) was generated using the equation in . Two of the ten CpGs included in the DNAm score were not present in the H&N5000 epigenetic data because methylation was measured using the EPIC array rather than Illumina450K array, on which the score was developed. The score was therefore generated using the remaining 8 CpGs (See Additional file 3: Supplementary Methods).
Study follow-up and survival
Regular vital status updates were received from the NHS Central Register and NHS Digital, providing notification of subsequent cancer registrations and deaths among cohort members. Recruitment finished December 2014 and follow-up information on survival status was obtained on 1 September 2018. The median duration of follow-up was 4.3 years (inter-quartile range [IQR] 3.3–5.2).
Information on age at diagnosis, sex, weight, height, marital status, highest educational attainment (school education, college or degree-level), annual household income, smoking status (defined as “current”, “former” or “never” user of tobacco) and alcohol intake (units per week) was obtained from baseline questionnaires, which are available on the study website (http://www.headandneck5000.org.uk/). We were unable to include lifetime exposure to tobacco (i.e. pack-years) in the current analysis because information relating to time since starting and time since quitting smoking was insufficient, i.e. not enough people completed these questions at baseline. Furthermore, the questionnaire did not capture information regarding periods of abstinence from tobacco use.
Clinically meaningful alcohol drinking categories (both sexes) were defined as “none”, “moderate” (≤ 14 units/week) and “hazardous-to-harmful” (> 14 units/week), based on UK guidelines  (Additional file 3: Supplementary Methods). We used categories of alcohol intake in our main analyses (rather than units consumed) because categories of drinking form the basis of clinical advice, i.e. they are more clinically relevant, and many governments and public health bodies have sought to promote public guidelines for “low risk” or “sensible” drinking based on cut-offs of intake. In addition, these alcohol exposure variables are consistent with previous publications [17, 38].
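The categorisation above can be sketched as a small function. This is an illustrative reading of the cut-offs, not the study's code: the function name is hypothetical, and treating “none” as exactly zero units per week is an assumption.

```python
def alcohol_category(units_per_week: float) -> str:
    """Map weekly alcohol units to the three categories used in the main analysis."""
    if units_per_week < 0:
        raise ValueError("units_per_week must be non-negative")
    if units_per_week == 0:
        # Assumption: "none" corresponds to zero reported units.
        return "none"
    if units_per_week <= 14:
        return "moderate"
    return "hazardous-to-harmful"


for units in (0, 6, 14, 21.5):
    print(units, "->", alcohol_category(units))
```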
Body mass index (BMI) was calculated as: weight (kg)/(height (m))². Comorbidity was defined as “none”, “mild”, “moderate” or “severe” based on the extent of functional deterioration, as measured by the ACE-27. Ethnicity was not included because only two individuals reported being non-white.
Sex, diagnosis, stage and comorbidity were recorded on the data-capture form. Diagnosis was coded using the International Classification of Diseases (ICD) version 10. Clinical staging of the tumour by T (characteristics of the tumour site), N (degree of lymph node involvement) and M (the absence or presence of metastases) was based on the American Head and Neck Society TNM staging of head and neck cancer. Comorbidity was determined using the Adult Comorbidity Evaluation-27 [ACE-27].
Stata 15.0 (StataCorp. 2017) was used for all analyses. Firstly, we examined whether EAA measures were associated with survival, after controlling for established HNC prognostic factors; secondly, we investigated whether these measures provide any additional prognostic information, over and above factors that are considered in routine clinical practice.
Step 1: examining associations of EAA measures with survival
Descriptive analyses were performed to explore the distribution of, and correlations between, EAA measures. Baseline descriptive data were stratified by survival at 3 years. The univariate associations of covariates with all-cause mortality risk were assessed using Kaplan–Meier curves and log-rank tests.
Multivariable Cox proportional hazards models were used to examine associations of EAA measures and the mortality predictor with overall survival, defined as the time in years from study enrolment to date of death from any cause or date of censorship (i.e. the last date of follow-up). Measures were standardised using z-scores to allow comparison of effect estimates. Hazard ratios (HRs) and 95% CIs for all-cause mortality were calculated for each standard deviation (SD) increase in EAA.
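As an illustration of why standardisation makes effect estimates comparable, the sketch below converts values to z-scores and rescales a per-unit log hazard ratio to a per-SD hazard ratio. The function names, the example values and the per-year hazard ratio are all hypothetical.

```python
import math
from statistics import mean, stdev


def z_scores(values):
    """Standardise values to mean 0 and sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def hr_per_sd(log_hr_per_unit, sd):
    """Rescale a per-unit log hazard ratio to a hazard ratio per SD increase."""
    return math.exp(log_hr_per_unit * sd)


# Hypothetical age acceleration values in years (not study data).
eaa_years = [-3.2, -1.0, 0.4, 1.1, 2.7]
eaa_z = z_scores(eaa_years)

# A hypothetical HR of 1.05 per year, for a marker whose SD is 5 years,
# corresponds to 1.05 ** 5 per SD.
print(round(hr_per_sd(math.log(1.05), 5.0), 3))
```

Fitting the Cox model on the z-scored marker yields the per-SD hazard ratio directly, so markers measured on different scales can be compared side by side.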
For each epigenetic ageing marker, four separate Cox models were run: (1) a minimally adjusted model that controlled for sex; (2) a model that additionally controlled for clinical factors (TNM stage, HPV status, comorbidity and BMI); (3) a model that additionally controlled for socio-demographic and economic factors (education, annual household income, marital status) and (4) a fully adjusted model that additionally controlled for lifestyle behaviours (self-reported smoking and alcohol consumption). Models were selected a priori based on the existing literature linking these covariates with survival [15, 42,43,44,45,46]. As a sensitivity analysis, we used the continuous measure of alcohol intake (units /week) rather than categories of intake in model 4.
For the DNAm-based mortality predictor (ZhangScore), the same models were run, with the exception that the minimally adjusted model also included age at time of diagnosis, since chronological age was not factored in when generating this score.
The proportional hazards assumption was checked using statistical tests and graphical diagnostics based on the Schoenfeld residuals. Missing covariate values were imputed using the ICE package for multiple chained equations in Stata (Additional file 3: Supplementary Methods). As a further sensitivity analysis, we created a complete case dataset and analysed as above.
We chose not to include chronological age as a covariate in the primary EAA survival models because, by definition, age acceleration residuals from a DNAm age predictor should be uncorrelated with chronological age (i.e. the correlation should be zero). However, since chronological age is positively correlated with mortality, we re-ran the Cox models adjusting for chronological age (imputed and complete case).
Step 2: assessing the prognostic value of EAA measures
Evidence of an association with survival is not enough to include novel biomarkers in prediction models; to aid clinicians they must provide added prognostic value to existing models. We explored whether the addition of EAA measures to existing models based on established mortality risk factors (i.e. those currently considered in clinical decision making) improved model performance.
Flexible parametric survival models were fitted using the methods of Royston and Parmar [50, 51] (Additional file 3: Supplementary Methods). Models were fitted using maximum likelihood estimation via the “stpm2” command. Nonlinear relationships with continuous predictors were considered using the multivariable fractional polynomial (MFP) algorithm  and implemented in Stata using the “mfp” command.
The following models were fit: (1) a “clinical model”, which comprised age, sex, TNM stage, HPV status and comorbidity; (2) clinical + IEAA; (3) clinical + EEAA; (4) clinical + IEAAHannum; (5) clinical + AgeAccelGrim; (6) clinical + AgeAccelPheno; (7) clinical + ZhangScore. Models were fit in a sub-sample of participants with data available for the clinical covariates included in the model (age, sex, tumour stage, comorbidity and HPV status).
The performance measures examined were the Akaike Information Criterion (AIC) and the C-statistic, an extension of the area under the receiver operating characteristic curve (AUC) to survival analysis [53, 54]. ROC curves and AUC functions were also calculated to characterise how well the models distinguished between people who were and were not alive at 3 years. Internal validation was performed using 500 bootstrap samples to adjust performance for optimism and calculate a shrinkage factor to be applied to model regression coefficients. Where there was evidence of model improvement with addition of the epigenetic markers, assessed based on the C-statistic, we also examined the complementary role of these markers in the prediction of mortality through inclusion in the same model.
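To make the C-statistic concrete, the following is a minimal sketch of Harrell's concordance for right-censored data. It is a simplified illustration, not the Stata routine used in the analysis: pairs with tied survival times are skipped, and the follow-up times, event indicators and risk scores are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance statistic for right-censored survival data.

    A pair is usable when the subject with the shorter follow-up time had an
    event. It is concordant when that subject also has the higher risk score;
    tied risk scores count as half. Tied times are skipped for simplicity.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical data: follow-up in years, 1 = died, 0 = censored.
times = [1.0, 2.5, 3.0, 4.2, 5.0]
events = [1, 1, 0, 1, 0]
risk = [2.1, 0.5, 0.3, 0.9, 0.2]

c_stat = harrell_c(times, events, risk)
print(c_stat)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking of risk against observed survival.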
In total, 408 out of 1896 participants with pathologically confirmed OPC had epigenetic data available (Fig. 1). There were 105 deaths during follow-up (median = 5.3 years, IQR 4.9–6.0). The proportion of missing data is presented in Additional file 1: Table S1.
Participants who were alive at 3 years had a mean age of 57.4 years at diagnosis (SD = 8.9) compared to 62.9 years (SD = 11.3) among those who had died (Table 2). Overall, mean EAA measures were lower in people who were alive. The mortality risk score was also lower in those individuals who were alive at 3 years. See Additional file 1: Table S2 for complete case descriptives.
Pairwise correlations between epigenetic markers
The strongest correlation was between EEAA and IEAAHannum (0.74) while the weakest was between IEAA and both AgeAccelGrim and ZhangScore (0.05) (Fig. 2).
Association of DNAm-based biological age with survival
The results of the minimally adjusted and fully adjusted Cox regression analyses on imputed data (n = 408) are illustrated in Fig. 3. An overview of all the model outputs is provided in the Supplementary Material (Additional file 1: Table S3).
In the basic model, all the EAA measures except IEAA were associated with survival (Fig. 3). The reported associations were in the expected directions, i.e. higher values of EAA were associated with higher mortality risk. HRs ranged from 1.22 (95% CI 1.00, 1.49; p = 0.048) for ZhangScore to 1.90 (95% CI 1.57, 2.29; p = 2.27 × 10⁻¹¹) for AgeAccelGrim, where HRs represent the difference in mortality risk per SD unit increase in the epigenetic marker. Associations of EEAA and ZhangScore with survival attenuated following adjustment for clinical and socioeconomic factors. In the fully adjusted model, which also adjusted for smoking and alcohol consumption, SD increases in IEAAHannum and AgeAccelGrim were associated with 30% and 40% increased mortality risks, respectively (HRs 1.30 [95% CI 1.07, 1.57; p = 0.007] and 1.40 [95% CI 1.06, 1.83; p = 0.016]) (Fig. 3).
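The relationship between a hazard ratio, its confidence interval and its p value can be checked with a short calculation. The sketch below assumes a Wald test on the log hazard ratio scale with a symmetric 95% CI, which is a standard approximation rather than a description of the exact Stata output; the function name is hypothetical. Because the published estimates are rounded, the recovered p values are approximate.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided p value from an HR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Fully adjusted estimates reported in the text.
for label, hr, lo, hi in [
    ("IEAAHannum", 1.30, 1.07, 1.57),
    ("AgeAccelGrim", 1.40, 1.06, 1.83),
]:
    se, z, p = wald_p_from_ci(hr, lo, hi)
    pct = (hr - 1) * 100  # percentage increase in hazard per SD
    print(f"{label}: +{pct:.0f}% per SD, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the recovered p values agree with the reported 0.007 and 0.016 to three decimal places.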
In the complete case analysis (n = 225; 49 deaths), the results of the minimally adjusted model were broadly comparable (Additional file 1: Table S4) but IEAAHannum was not robust to adjustment for socioeconomic factors and the association of AgeAccelGrim with survival attenuated following adjustment for smoking and alcohol intake.
Using the continuous measure of alcohol intake (rather than categories) resulted in very similar effect estimates for IEAA, AgeAccelGrim, AgeAccelPheno and ZhangScore in the imputed analysis (Additional file 1: Table S5). The strength of the evidence linking IEAAHannum with mortality risk was lower when alcohol units were used (HR 1.22 [0.99, 1.50]; p = 0.066). There was some evidence that EEAA was associated with mortality risk (HR 1.34 [1.10, 1.62]; p = 0.003). The results of the complete case analysis were comparable to those obtained when categories of alcohol exposure were used in model 4 (Additional file 1: Table S5).
The results of the sensitivity analysis, where we included chronological age as a covariate in the epigenetic age models, are presented in Additional file 1: Table S3 (imputed) and Table S4 (complete case). On adjusting for age, the associations of AgeAccelGrim and IEAAHannum with survival remained in the imputed analysis (fully adjusted HRs 1.50 [1.14, 1.97; p = 0.004] and 1.22 [1.00, 1.49; p = 0.052], respectively). The strength of the evidence associating these measures with survival was reduced in the complete case analysis, although effect estimates were similar (fully adjusted HRs 1.42 [0.94, 2.14; p = 0.095] and 1.23 [0.88, 1.72; p = 0.234], respectively).
Examination of the predictive utility of epigenetic markers at 3 years
Table 3 shows the performance measures for the fitted models. The AIC values for the clinical + IEAA, clinical + IEAAHannum and clinical + AgeAccelGrim models were lower than that of the standard clinical model. Two models are generally considered equivalent if the difference in AICs is less than two. On this basis, all three of these models had a better overall fit compared to the standard clinical model. C-statistics ranged from 0.75 (clinical model) to 0.78 (clinical + AgeAccelGrim model), but confidence intervals overlapped.
When we looked at the effect of adding two of the EAA measures to the clinical model (Additional file 1: Table S6), the clinical + IEAAHannum + AgeAccelGrim model had a lower AIC than the clinical + AgeAccelGrim model, indicating a better fit to the data. However, the C-statistic was not improved compared to the simpler model, indicating that the discriminative ability of the model was no better.
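The AIC comparison rule described above can be sketched as follows. The function names, log-likelihoods and parameter counts are hypothetical values chosen for illustration and are not taken from Table 3.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood


def compare_aic(aic_first, aic_second, threshold=2.0):
    """Apply the rule of thumb that AIC differences below `threshold` are equivalent."""
    diff = aic_first - aic_second
    if abs(diff) < threshold:
        return "equivalent"
    return "second model preferred" if diff > 0 else "first model preferred"


# Hypothetical fits: a clinical model and the same model plus one epigenetic marker.
aic_clinical = aic(log_likelihood=-250.0, n_params=6)
aic_extended = aic(log_likelihood=-246.5, n_params=7)

print(aic_clinical, aic_extended, compare_aic(aic_clinical, aic_extended))
```

Because each extra parameter adds 2 to the AIC, the extended model is preferred only when its gain in log-likelihood outweighs that penalty by more than the chosen threshold.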
Given that the clinical + AgeAccelGrim model showed the strongest association in the Cox analysis and yielded the highest discrimination, we examined whether this model provided improved prediction at 3 years (n = 72 deaths) compared to a standard clinical model including age, sex, TNM stage, HPV and comorbidity, by comparing AUC values. There was weak evidence to suggest the clinical + AgeAccelGrim model had superior predictive performance compared to the clinical model (clinical AUC: 0.77, clinical + AgeAccelGrim AUC: 0.80; p value for difference = 0.069) (Fig. 4). The bootstrap optimism corrected AUC values showed a small reduction in performance compared with the original model (optimism-adjusted AUCs of 0.74 and 0.77 for clinical and clinical + AgeAccelGrim models, respectively).
The optimism-adjusted c-slope (uniform shrinkage factor) for the clinical + AgeAccelGrim model was 0.83, indicating some overfitting. The original predictor effects were adjusted by this value  (Table 4). In the adjusted model, each SD unit increase in AgeAccelGrim was associated with a 1.5-fold increased risk of death at 3 years (optimism-adjusted HR: 1.54, 95% CI 1.2, 1.92; p ≤ 0.001).
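Applying a uniform shrinkage factor amounts to multiplying each regression coefficient by that factor. The sketch below uses the reported factor of 0.83, but the coefficient names and values are hypothetical and do not reproduce Table 4; a full workflow would usually also re-estimate the baseline hazard, which is omitted here.

```python
import math


def shrink_coefficients(betas, shrinkage):
    """Multiply each log hazard ratio by a uniform shrinkage factor."""
    return {name: shrinkage * beta for name, beta in betas.items()}


# Hypothetical unadjusted coefficients (log hazard ratios), not study estimates.
original = {"AgeAccelGrim_per_SD": 0.50, "male_sex": 0.20}
adjusted = shrink_coefficients(original, shrinkage=0.83)

for name, beta in adjusted.items():
    print(f"{name}: beta = {beta:.3f}, HR = {math.exp(beta):.2f}")
```

Shrinking pulls every hazard ratio toward 1, which counteracts the tendency of a model fitted to a small sample to overstate predictor effects.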
Smoking has been shown to be independently predictive of mortality in H&N5000. The reduced effect estimate observed between AgeAccelGrim and mortality with adjustment for smoking status suggests that the enhanced prognostic ability gained from adding AgeAccelGrim to the clinical model could be due to the inclusion of a smoking predictor. We conducted an additional sensitivity analysis (Additional file 2: Fig. S1) whereby we compared the prognostic ability of the following models: (1) clinical + AgeAccelGrim; (2) clinical + self-reported smoking; and (3) clinical + DNAmpackyears, the DNAm-based surrogate biomarker for pack-years of smoking used to derive GrimAge (n = 384 participants with smoking data available; no. deaths = 72). At 3 years, there was a suggestion that the clinical + AgeAccelGrim model had better discrimination (AUC value of 0.80 [95% CI 0.74, 0.85]) than the clinical models including both self-reported smoking (AUC = 0.77 [95% CI 0.71, 0.83]) and a DNAm surrogate for pack-years (AUC = 0.78 [0.72, 0.83]), although there was limited evidence of a difference in AUCs based on chi-squared tests (p = 0.148).
In this study of 408 OPC cases with a median of 5 years of follow-up, we demonstrate that epigenetic markers derived from blood are associated with increased risk of all-cause mortality and these associations are independent of established mortality risk factors. In particular, AgeAccelGrim, an “extrinsic” age acceleration measure which captures exogenous lifestyle factors and extracellular changes related to ageing, had the strongest effect estimate, with each SD increase in EAA resulting in a 40% increase in risk of death in the fully adjusted model (HR 1.40; 95% CI 1.06, 1.83; p = 0.016). IEAAHannum, an “intrinsic” measure of EAA, was also associated with mortality risk, but to a lesser extent. The addition of AgeAccelGrim to the clinical model showed marginal improvement in mortality risk prediction at 3 years (Clinical AUC: 0.77, Clinical + AgeAccelGrim AUC: 0.80; p = 0.069). Our findings support the literature which suggests that age acceleration as measured by GrimAge is a better predictor of mortality risk in healthy populations compared to first-generation DNAm-based predictors (i.e. Horvath and Hannum’s clocks).
It is unclear why some epigenetic ageing measures can predict mortality risk better than others in this population. The DNAm clocks used to derive these measures reflect different aspects of cellular processes and exogenous factors (i.e. lifestyle factors). Smoking has been shown to be independently predictive of mortality among HNC cases; it is therefore possible that the relatively strong association of AgeAccelGrim with mortality risk may be explained by the inclusion of the surrogate measure for smoking in the GrimAge biomarker. When we compared the prognostic performance of the clinical + AgeAccelGrim model with clinical models including both self-reported smoking and the DNAm surrogate biomarker for pack-years of smoking, clinical + AgeAccelGrim had better discrimination. While the difference in model performance was modest, it nonetheless suggests that the methylation-based measure of smoking provides a better indicator with less misclassification than self-report. Moreover, the prognostic utility of AgeAccelGrim does not appear to be solely driven by the inclusion of the DNAm-based biomarker for smoking. GrimAge is also trained on a set of proteins known to be associated with mortality. One of these, plasminogen activator inhibitor 1 (PAI-1), is overexpressed in a variety of tumours and is a strong predictor of poor clinical outcomes [57,58,59]. Another, growth differentiation factor 15 (GDF15), is involved in the pathogenesis of oral squamous cell carcinoma (OSCC) [60,61,62]. Further studies are needed to examine whether these factors may be contributing to the prognostic utility of GrimAge.
Hannum and Horvath’s clocks were built using similar regression techniques and show moderate correlation, yet, in our analysis, only IEAAHannum was associated with survival. This finding is consistent with previous work. It is possible that, because the Hannum predictor was developed and validated in blood samples (the tissue type used in our analysis), it may be better able to capture cell-intrinsic processes in blood compared with a predictor that was developed across multiple tissues, i.e. Horvath’s predictor.
Our investigation has several strengths including the relatively long follow-up period, the fact that individuals were sampled at the time of diagnosis and that DNAm was assayed in the same laboratory. We were also able to account for a range of factors which are known to influence both DNAm and HNC risk [63, 64] and missing covariate data were imputed to minimise possible biases [65, 66].
Our study has several limitations. First, the sample size for our analysis was relatively small and we were unable to identify independent prospective datasets to validate our findings. This limits the translational impact of our work. To mitigate this, we obtained estimates of a uniform shrinkage factor and multiplied this by the original β-coefficients from the fitted model to obtain optimism-adjusted coefficients. Second, various unmeasured confounders may influence the outcome of these age predictors, including genetic and environmental factors. While we found that the associations of GrimAge and IEAAHannum persisted after controlling for smoking and alcohol intake in our primary analyses, residual confounding is likely to be present. This is especially likely since we used categories of exposure which were derived via participants’ self-report, which is prone to recall bias and/or misreporting. We conducted sensitivity analyses to evaluate residual confounding by alcohol based on a continuous variable of units/week and found that the effect estimates for AgeAccelGrim were comparable to those of our primary analysis (HR 1.40 [1.05, 1.85]; p = 0.020 in the model that included units of alcohol consumed vs 1.40 [1.06, 1.83]; p = 0.016 in the model that included categories of alcohol consumption). The association of IEAAHannum with mortality remained when alcohol units were used but the HR was lower (HR 1.22 [0.99, 1.50]; p = 0.066 vs 1.30 [1.07, 1.57]; p = 0.007 for the model that included alcohol categories). While we were unable to derive a continuous measure for lifetime smoking, we utilised a DNAm-derived measure of pack-years of smoking in our sensitivity analysis for GrimAge. We found that the addition of AgeAccelGrim to a clinical model that included age, tumour stage and HPV status had slightly better discrimination (AUC = 0.80) compared to a clinical model that additionally included the DNAm surrogate marker for smoking (AUC = 0.78).
Genome-wide DNA methylation (DNAm) profiling has allowed for the development of molecular predictors for a multitude of traits and diseases, including smoking and alcohol intake. Future studies could implement the use of other methylation scores to index these variables [63, 64]. Third, there is a disparity in coverage between the Illumina 450K and EPIC platforms, meaning that 17 of the 353 CpGs (4.8%) and 6 of the 71 CpGs (8.5%) necessary to calculate epigenetic age via the Horvath and Hannum methods, respectively, were missing. Similarly, two of the CpGs included in the DNAm risk score for mortality were missing from the DNA methylation dataset for the same reason. Previous work suggests that the lack of the clock-CpGs on the EPIC array does not undermine the utility of the epigenetic age predictors. Fourth, we did not account for multiple testing, although evidence of correlation between some of the epigenetic measures suggests that correction may not have been appropriate. Finally, it was not possible to examine cancer-specific mortality.
DNAm-based estimators of ageing could provide prognostic utility in people with OPC, above established prognostic factors, though the mechanisms of association are currently unclear. That an accurate, easy-to-measure biomarker could serve as a better predictor of mortality risk is important as it could aid treatment planning and improve patient stratification in study design. These findings should be investigated in further, independent samples.
Availability of data and materials
The datasets used in this analysis are available from the Head and Neck 5000 study upon submission of a research proposal. If you would like to access this resource, please contact the Head and Neck 5000 Executive on firstname.lastname@example.org. The study website http://www.headandneck5000.org.uk/ describes the resource and the types of data available.
Abbreviations
AIC: Akaike information criterion
AUC: Area under the receiver operating characteristic curve
BIC: Bayesian information criterion
EAA: Epigenetic age acceleration
HNC: Head and neck cancer
MAR: Missing at random
Anatomy of the Head & Neck: National Cancer Network SEER training modules. https://training.seer.cancer.gov/head-neck/anatomy/.
Ferlay J, Ervik M, Lam F, Colombet M, Mery L, Piñeros M, Znaor A, Soerjomataram I, Bray F. Cancer Today. International Agency for Research on Cancer (IARC), WHO; 12 Sept 2018. http://gco.iarc.fr/today/home.
Elrefaey S, Massaro MA, Chiocca S, Chiesa F, Ansarin M. HPV in oropharyngeal cancer: the basics to know in clinical practice. Acta Otorhinolaryngol Ital. 2014;34(5):299–309.
Nichols AC, Palma DA, Dhaliwal SS, Tan S, Theuer J, Chow W, et al. The epidemic of human papillomavirus and oropharyngeal cancer in a Canadian population. Curr Oncol. 2013;20(4):212–9.
Mirghani H, Amen F, Blanchard P, Moreau F, Guigay J, Hartl DM, et al. Treatment de-escalation in HPV-positive oropharyngeal carcinoma: ongoing trials, critical issues and perspectives. Int J Cancer. 2015;136(7):1494–503.
Fakhry C, Westra WH, Li S, Cmelak A, Ridge JA, Pinto H, et al. Improved survival of patients with human papillomavirus-positive head and neck squamous cell carcinoma in a prospective clinical trial. J Natl Cancer Inst. 2008;100(4):261–9.
Dayyani F, Etzel CJ, Liu M, Ho CH, Lippman SM, Tsao AS. Meta-analysis of the impact of human papillomavirus (HPV) on cancer risk and overall survival in head and neck squamous cell carcinomas (HNSCC). Head Neck Oncol. 2010;2:15.
Ang KK, Harris J, Wheeler R, Weber R, Rosenthal DI, Nguyen-Tan PF, et al. Human papillomavirus and survival of patients with oropharyngeal cancer. N Engl J Med. 2010;363(1):24–35.
Rietbergen MM, Witte BI, Velazquez ER, Snijders PJ, Bloemena E, Speel EJ, et al. Different prognostic models for different patient populations: validation of a new prognostic model for patients with oropharyngeal cancer in Western Europe. Br J Cancer. 2015;112(11):1733–6.
Emerick KS, Leavitt ER, Michaelson JS, Diephuis B, Clark JR, Deschler DG. Initial clinical findings of a mathematical model to predict survival of head and neck cancer. Otolaryngol Head Neck Surg. 2013;149(4):572–8.
Datema FR, Ferrier MB, van der Schroeff MP, de Jong RJB. Impact of comorbidity on short-term mortality and overall survival of head and neck cancer patients. Head Neck. 2010;32(6):728–36.
Rietbergen MM, Brakenhoff RH, Bloemena E, Witte BI, Snijders PJ, Heideman DA, et al. Human papillomavirus detection and comorbidity: critical issues in selection of patients with oropharyngeal cancer for treatment de-escalation trials. Ann Oncol. 2013;24(11):2740–5.
Mayne ST, Cartmel B, Kirsh V, Goodwin WJ. Alcohol and tobacco use prediagnosis and postdiagnosis, and survival in a cohort of patients with early stage cancers of the oral cavity, pharynx, and larynx. Cancer Epidemiol Biomark Prev. 2009;18(12):3368–74.
Gillison ML, Zhang Q, Jordan R, Xiao W, Westra WH, Trotti A, et al. Tobacco smoking and increased risk of death and progression for patients with p16-positive and p16-negative oropharyngeal cancer. J Clin Oncol. 2012;30(17):2102–11.
Duffy SA, Ronis DL, McLean S, Fowler KE, Gruber SB, Wolf GT, et al. Pretreatment health behaviors predict survival among patients with head and neck squamous cell carcinoma. J Clin Oncol. 2009;27(12):1969–75.
Browman GP, Mohide EA, Willan A, Hodson I, Wong G, Grimard L, et al. Association between smoking during radiotherapy and prognosis in head and neck cancer: a follow-up study. Head Neck. 2002;24(12):1031–7.
Beynon RA, Lang S, Schimansky S, Penfold CM, Waylen A, Thomas SJ, et al. Tobacco smoking and alcohol drinking at diagnosis of head and neck cancer and all-cause mortality: results from head and neck 5000, a prospective observational cohort of people with head and neck cancer. Int J Cancer. 2018;143(5):1114–27.
Do KA, Johnson MM, Doherty DA, Lee JJ, Wu XF, Dong Q, et al. Second primary tumors in patients with upper aerodigestive tract cancers: joint effects of smoking and alcohol (United States). Cancer Causes Control. 2003;14(2):131–8.
Fortin A, Wang CS, Vigneault E. Influence of smoking and alcohol drinking behaviors on treatment outcomes of patients with squamous cell carcinomas of the head and neck. Int J Radiat Oncol Biol Phys. 2009;74(4):1062–9.
Fransquet PD, Wrigglesworth J, Woods RL, Ernst ME, Ryan J. The epigenetic clock as a predictor of disease and mortality risk: a systematic review and meta-analysis. Clin Epigenet. 2019;11(1):62.
Chen BH, Marioni RE, Colicino E, Peters MJ, Ward-Caviness CK, Tsai PC, et al. DNA methylation-based measures of biological age: meta-analysis predicting time to death. Aging (Albany NY). 2016;8(9):1844–65.
Tiina F, Katja W, Anne V, Riikka S, Miina O, Taina R, et al. Does the epigenetic clock GrimAge predict mortality independent of genetic influences: an 18 year follow-up study in older female twin pairs. Clin Epigenet. 2021;13(1):128.
Dugue PA, Bassett JK, Joo JE, Baglietto L, Jung CH, Wong EM, et al. Association of DNA methylation-based biological age with health risk factors and overall and cause-specific mortality. Am J Epidemiol. 2018;187(3):529–38.
Marioni RE, Shah S, McRae AF, Chen BH, Colicino E, Harris SE, et al. DNA methylation age of blood predicts all-cause mortality in later life. Genome Biol. 2015;16:25.
Dugue PA, Bassett JK, Joo JE, Jung CH, Ming Wong E, Moreno-Betancur M, et al. DNA methylation-based biological aging and cancer risk and survival: pooled analysis of seven prospective studies. Int J Cancer. 2018;142(8):1611–9.
Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14(10):R115.
Hannum G, Guinney J, Zhao L, Zhang L, Hughes G, Sadda S, et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol Cell. 2013;49(2):359–67.
Ness AR, Waylen A, Hurley K, Jeffreys M, Penfold C, Pring M, et al. Establishing a large prospective clinical cohort in people with head and neck cancer as a biomedical resource: head and neck 5000. BMC Cancer. 2014. https://doi.org/10.1186/1471-2407-14-973.
Ness AR, Waylen A, Hurley K, Jeffreys M, Penfold C, Pring M, et al. Recruitment, response rates and characteristics of 5511 people enrolled in a prospective clinical cohort study: head and neck 5000. Clin Otolaryngol. 2016;41(6):804–9.
Lang Kuhs KA, Kreimer AR, Trivedi S, Holzinger D, Pawlita M, Pfeiffer RM, et al. Human papillomavirus 16 E6 antibodies are sensitive for human papillomavirus-driven oropharyngeal cancer and are associated with recurrence. Cancer. 2017;123(22):4382–90.
Waterboer T, Sehr P, Michael KM, Franceschi S, Nieland JD, Joos TO, et al. Multiplex human papillomavirus serology based on in situ-purified glutathione s-transferase fusion proteins. Clin Chem. 2005;51(10):1845–53.
Kreimer AR, Johansson M, Waterboer T, Kaaks R, Chang-Claude J, Drogen D, et al. Evaluation of human papillomavirus antibodies and risk of subsequent head and neck cancer. J Clin Oncol. 2013;31(21):2708–15.
Min J, Hemani G, Davey Smith G, Relton CL, Suderman M. Meffil: efficient normalisation and analysis of very large DNA methylation samples. bioRxiv. 2017.
Levine ME, Lu AT, Quach A, Chen BH, Assimes TL, Bandinelli S, et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging (Albany NY). 2018;10(4):573–91.
Lu AT, Quach A, Wilson JG, Reiner AP, Aviv A, Raj K, et al. DNA methylation GrimAge strongly predicts lifespan and healthspan. Aging (Albany NY). 2019;11(2):303–27.
Zhang Y, Wilson R, Heiss J, Breitling LP, Saum KU, Schottker B, et al. DNA methylation signatures in peripheral blood strongly predict all-cause mortality. Nat Commun. 2017;8:14617.
NHS. Alcohol units 2018. https://www.nhs.uk/live-well/alcohol-support/calculating-alcohol-units/.
Penfold CM, Thomas SJ, Waylen A, Ness AR. Change in alcohol and tobacco consumption after a diagnosis of head and neck cancer: findings from head and neck 5000. Head Neck. 2018;40:1389–99.
Brumpton B, Sanderson E, Hartwig FP, Harrison S, Vie GÅ, Cho Y, et al. Within-family studies for Mendelian randomization: avoiding dynastic, assortative mating, and population stratification biases. bioRxiv. 2019:602516.
Deschler DG, Day T. Pocket guide to: TNM staging of head and neck cancer and neck dissection classification. Alexandria, VA: American Academy of Otolaryngology–Head and Neck Surgery Foundation, Inc; 2008. http://www.sld.cu/galerias/pdf/sitios/cirugiamaxilo/neckdissectionpart1.pdf.
Piccirillo JF, Tierney RM, Costas I, Grove L, Spitznagel EL Jr. Prognostic importance of comorbidity in a hospital-based cancer registry. JAMA. 2004;291(20):2441–7.
Gillison ML, D’Souza G, Westra W, Sugar E, Xiao WH, Begum S, et al. Distinct risk factor profiles for human papillomavirus type 16-positive and human papillomavirus type 16-negative head and neck cancers. J Natl Cancer Inst. 2008;100(6):407–20.
Piccirillo JF. Impact of comorbidity and symptoms on the prognosis of patients with oral carcinoma. Arch Otolaryngol Head Neck Surg. 2000;126(9):1086–8.
Schimansky S, Lang S, Beynon R, Penfold C, Davies A, Waylen A, et al. Association between comorbidity and survival in head and neck cancer: results from head and neck 5000. Head Neck. 2019;41(4):1053–62.
de Graeff A, de Leeuw JR, Ros WJ, Hordijk GJ, Blijham GH, Winnubst JA. Sociodemographic factors and quality of life as prognostic indicators in head and neck cancer. Eur J Cancer. 2001;37(3):332–9.
Hollander D, Kampman E, van Herpen CM. Pretreatment body mass index and head and neck cancer outcome: a review of the literature. Crit Rev Oncol Hematol. 2015;96(2):328–38.
Royston P. Multiple imputation of missing values: further update of ice, with an emphasis on categorical variables. Stata J. 2009;9:466–77.
White IR, Royston P. Imputing missing covariate values for the Cox model. Stat Med. 2009;28(15):1982–98.
Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27(2):157–72 (discussion 207-12).
Royston P. Flexible parametric alternatives to the Cox model, and more. Stata J. 2001;1:1–28.
Royston P, Parmar MK. Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat Med. 2002;21(15):2175–97.
Royston P, Sauerbrei W. Multivariable model-building: a pragmatic approach to regression analysis based on fractional polynomials for modelling continuous variables. Chichester: Wiley; 2008.
Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15(4):361–87.
Pencina MJ, D’Agostino RB. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Stat Med. 2004;23(13):2109–23.
Whittle R, Royle KL, Jordan KP, Riley RD, Mallen CD, Peat G. Prognosis research ideally should measure time-varying predictors at their intended moment of use. Diagn Progn Res. 2017;1:1.
Riley RD, Snell KI, Ensor J, Burke DL, Harrell FE Jr, Moons KG, et al. Minimum sample size for developing a multivariable prediction model: part II—binary and time-to-event outcomes. Stat Med. 2019;38(7):1276–96.
Kubala MH, Punj V, Placencio-Hickok VR, Fang H, Fernandez GE, Sposto R, et al. Plasminogen activator inhibitor-1 promotes the recruitment and polarization of macrophages in cancer. Cell Rep. 2018;25(8):2177–91.
Duffy MJ, O’Donovan N, McDermott E, Crown J. Validated biomarkers: The key to precision treatment in patients with breast cancer. Breast. 2016;29:192–201.
Mengele K, Napieralski R, Magdolen V, Reuning U, Gkazepis A, Sweep F, et al. Characteristics of the level-of-evidence-1 disease forecast cancer biomarkers uPA and its inhibitor PAI-1. Expert Rev Mol Diagn. 2010;10(7):947–62.
Schiegnitz E, Kammerer PW, Rode K, Schorn T, Brieger J, Al-Nawas B. Growth differentiation factor 15 as a radiation-induced marker in oral carcinoma increasing radiation resistance. J Oral Pathol Med. 2016;45(1):63–9.
Yang CZ, Ma J, Luo QQ, Neskey DM, Zhu DW, Liu Y, et al. Elevated level of serum growth differentiation factor 15 is associated with oral leukoplakia and oral squamous cell carcinoma. J Oral Pathol Med. 2014;43(1):28–34.
Zhang L, Yang X, Pan HY, Zhou XJ, Li J, Chen WT, et al. Expression of growth differentiation factor 15 is positively correlated with histopathological malignant grade and in vitro cell proliferation in oral squamous cell carcinoma. Oral Oncol. 2009;45(7):627–32.
Langdon R, Richmond R, Elliott HR, Dudding T, Kazmi N, Penfold C, et al. Identifying epigenetic biomarkers of established prognostic factors and survival in a clinical cohort of individuals with oropharyngeal cancer. bioRxiv. 2019.
McCartney DL, Hillary RF, Stevenson AJ, Ritchie SJ, Walker RM, Zhang Q, et al. Epigenetic prediction of complex traits and death. Genome Biol. 2018;19(1):136.
Royston P. Multiple imputation of missing values. Stata J. 2004;4:227–41.
Sterne JA, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009;338:b2393.
Dhingra R, Kwee LC, Diaz-Sanchez D, Devlin RB, Cascio W, Hauser ER, et al. Evaluating DNA methylation age on the Illumina MethylationEPIC Bead Chip. PLoS ONE. 2019;14(4):e0207834.
McEwen LM, Jones MJ, Lin DTS, Edgar RD, Husquin LT, MacIsaac JL, et al. Systematic evaluation of DNA methylation age estimation with common preprocessing methods and the Infinium MethylationEPIC BeadChip array. Clin Epigenet. 2018;10(1):123.
The authors are extremely grateful to all Head and Neck 5000 participants, the Head and Neck 5000 study coordination team and the laboratory technicians at the Bristol Bioresource Laboratories.
This work was supported by a Wellcome Trust PhD studentship (110021/Z/15/Z to RAB). The Head and Neck 5000 study was a component of independent research funded by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research scheme (RP-PG-0707-10034). RCR and RMM are supported by a Cancer Research UK (C18281/A19169) programme grant (the Integrative Cancer Epidemiology Programme) and are part of the Medical Research Council Integrative Epidemiology Unit at the University of Bristol, which is supported by the Medical Research Council (MC_UU_12013/1, MC_UU_12013/2 and MC_UU_12013/3) and the University of Bristol. RMM and AN are also supported by the NIHR Bristol Biomedical Research Centre, which is funded by the National Institute for Health Research (NIHR) and is a partnership between University Hospitals Bristol NHS Foundation Trust and the University of Bristol. RL was supported by a Cancer Research UK Research PhD studentship (C18281/A20988 to RJL). The views expressed are those of the author(s) and not necessarily those of any funding body. George Davey Smith works in a Unit supported by the Medical Research Council for the Integrative Epidemiology Unit (MC_UU_00011/1) at the University of Bristol.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1.
Table S1. Proportion of missing data, N = 408; Table S2. Baseline descriptives of participants included in the complete case analysis (n = 225); Table S3. Association of DNA methylation-based predictors of ageing with overall survival based on imputed data (n = 408); Table S4. Results of the complete case Cox regression analysis (n = 225); Table S5. Results of the sensitivity analyses, adjusting for units of alcohol consumed per week (Model 4); Table S6. The impact of including two epigenetic age acceleration measures on model fit and discrimination.
Additional file 2.
Supplementary Figure 1. A comparison of the area under the ROC curves (AUC) obtained for the models included in the sensitivity analyses (n = 384).
Additional file 3.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Beynon, R.A., Ingle, S.M., Langdon, R. et al. Epigenetic biomarkers of ageing are predictive of mortality risk in a longitudinal clinical cohort of individuals diagnosed with oropharyngeal cancer. Clin Epigenet 14, 1 (2022). https://doi.org/10.1186/s13148-021-01220-4
- Epigenetic clock
- Epigenetic ageing
- Oropharyngeal cancer
- DNA methylation