Methylation-derived inflammatory measures and lung cancer risk and survival



Examining immunity-related DNA methylation alterations in blood could help elucidate the role of the immune response in lung cancer etiology and aid in discovering factors that are key to lung cancer development and progression. In a nested, matched case–control study, we estimated the methylation-derived neutrophil-to-lymphocyte ratio (mdNLR) and quantified DNA methylation levels at loci previously linked with circulating concentrations of C-reactive protein (CRP). We examined associations between these measures and lung cancer risk and survival.


Using conditional logistic regression and further adjusting for BMI, batch effects, and a smoking-based methylation score, we observed a 47% increased risk of non-small cell lung cancer (NSCLC) for one standard deviation (SD) increase in mdNLR (n = 150 pairs; OR: 1.47, 95% CI 1.08, 2.02). Using a similar model, the estimated CRP Scores were inversely associated with risk of NSCLC (e.g., Score 1 OR: 0.57, 95% CI: 0.40, 0.81). Using Cox proportional hazards models adjusting for age, sex, smoking status, methylation-predicted pack-years, BMI, batch effect, and stage, we observed a 28% increased risk of dying from lung cancer (n = 145 deaths in 205 cases; HR: 1.28, 95% CI: 1.09, 1.50) for one SD increase in mdNLR.


Our study demonstrates that immunity status measured with DNA methylation markers is associated with lung cancer a decade or more prior to cancer diagnosis. A better understanding of immunity-associated methylation-based biomarkers in lung cancer development could provide insight into critical pathways.


Lung cancer is the leading cause of cancer death in the USA, projected to account for 21.7% of all cancer deaths in 2021 [1]. A large percentage of lung cancer patients are diagnosed at an advanced stage [2] and five-year relative survival rates for those patients are between 3 and 6% [3]. Thus, early detection remains a key strategy to improve survival. However, the currently recommended lung cancer screening strategy (low-dose computed tomography [LDCT] for persons 50 to 80 years old who have at least a 20 pack-year smoking history and currently smoke or have quit within the past 15 years) is expensive and has a high false positive rate [4, 5]. Modifying the current lung cancer screening strategy by performing risk stratification could help prioritize LDCT screening and optimize secondary prevention. We propose that immune system markers could be incorporated into such risk stratification tools to help identify persons at higher risk of lung cancer to target for screening.

While smoking is the most important risk factor for lung cancer in the population, there is growing evidence that the immune system, in response to or independent of smoking, plays an important role in lung cancer development, acting potentially through the genesis of chronic inflammation [6]. For instance, an aggregated analysis of genome-wide association studies (GWASs) of lung cancer risk found a direct causal effect of BMI on small cell lung cancer and an inverse effect on lung adenocarcinoma, suggesting the complexity of the roles that BMI and chronic inflammation play in lung cancer subtypes [7]. Furthermore, it is plausible that inflammatory profiles prior to lung cancer diagnosis are associated with lung cancer-specific survival. Markers of systemic inflammation, including elevated levels of C-reactive protein (CRP) and the peripheral blood neutrophil-to-lymphocyte ratio (NLR), also have been identified as robust markers of cancer-associated inflammation [8, 9]. Elevated CRP levels [8], elevated serum levels of pro-inflammatory cytokines [10,11,12], increased neutrophil counts and decreased lymphocyte counts [13, 14], and polymorphisms in inflammation-related genes [15,16,17,18] have been associated with increased lung cancer risk. These inflammatory measures have also been associated with poor survival of lung cancer patients in several retrospective and a few prospective studies [19,20,21]. In addition, both experimental and epidemiologic studies support a role for chronic inflammation as a hallmark of cancer development and progression [8, 22,23,24,25]. We posit that a better understanding of the role of inflammation in lung cancer etiology could be gained by examining DNA methylation alterations in blood that are associated with the systemic immune response.

In the current study, we first predicted peripheral blood leukocyte composition and a neutrophil to lymphocyte index using validated DNA methylation markers (mdNLR), then quantified DNA methylation levels at loci previously linked with circulating concentrations of CRP, and calculated methylation-derived immune cell ratios by using an expanded deconvolution library. We evaluated the associations of these potential markers with lung cancer risk and lung cancer-specific survival. To address this question, we used pre-diagnostic blood samples of cases and controls obtained from the CLUE I/II cohorts. Our analyses controlled for self-reported smoking and methylation-predicted cumulative smoking in order to better focus our examinations on the DNA methylation marks that are informative of the immune response profile [26].


Population characteristics

Characteristics of the 208 lung cancer cases and their 208 matched controls included in this analysis are presented in Table 1. Over 99% of participants were White. The median time between blood draw and lung cancer diagnosis was 14 years. The median age at blood draw in 1989 was 59 and 57 years in cases and controls, respectively. Overall, 55% of cases and controls were women and 11% were never smokers (Table 1).

Table 1 Baseline characteristics of lung cancer cases and matched controls nested in CLUE I/II

Methylation-derived mdNLR index, leukocyte proportions, and lung cancer risk

We observed a 47% increased risk of non-small cell lung cancer (NSCLC) for one standard deviation increase in mdNLR (n = 150 pairs; OR: 1.47 [1.08, 2.02]). This association was comparable for NSCLC cases diagnosed within 10 years and beyond 10 years after blood draw. However, higher mdNLR values were not statistically significantly associated with overall risk of lung cancer in our study. No stable associations could be estimated for small cell lung cancer (SCLC). After multiple comparison adjustments, monocyte/lymphocyte ratio showed a borderline significant 65% increased risk of NSCLC for each standard deviation increase (n = 150 pairs; OR: 1.65, adjusted CI: [0.99, 2.76]). In addition, immune cell ratios for CD4/CD8, NLR, B cell/lymphocyte, T cell/lymphocyte, Neu + Mono/lymphocyte, Eos/lymphocyte, CD4nv/lymphocyte, B cell/CD8, CD8/Treg, Bnv/Bmem, CD4nv/CD4mem, CD8nv/CD8mem, and Treg > 0 vs. Treg = 0 were not statistically significantly associated with lung cancer risk overall or by histologic types (Table 2).

Table 2 Association between methylation-predicted immune cell profiles and risk of total lung cancer and NSCLC risk, overall and stratified by time to diagnosis, case–control study nested in the CLUE I/II cohort

Methylation-derived CRP scores and lung cancer risk

CRP Score 1 was built using 54 CpG sites that were previously associated with inflammatory markers, while CRP Scores 2 and 3 were each built with a subset of these 54 CpGs that were putative cell-specific or cell-type invariant, respectively. Using data from a previously published pancreatic cancer dataset [27], all three scores were moderately correlated with log CRP and log IL-6 levels (Table 3). In this nested case–control study, we found all three CRP Scores inversely associated with risk of NSCLC after additionally adjusting for methylation-predicted pack-years (n = 150 pairs; Score 1 OR: 0.57 [0.40, 0.81]; Score 2 OR: 0.62 [0.45, 0.84]; Score 3 OR: 0.65 [0.44, 0.95]). We also found statistically significant inverse associations between CRP Score 1 and risk of NSCLC among cases diagnosed within 10 years and beyond 10 years, and between CRP Score 2 and risk of NSCLC among cases diagnosed within 10 years of blood draw (Table 4). CRP Scores 1, 2, and 3 were not associated with lung cancer risk when taking into account the matching factors and only adjusting for BMI and four surrogate variables for batch effects (n = 208 pairs; Score 1 OR: 0.96 [0.77, 1.21]; Score 2 OR: 0.89 [0.71, 1.11]; Score 3 OR: 1.11 [0.89, 1.40]). However, when additionally adjusting for methylation-predicted pack-years, inverse associations with total lung cancer risk were observed for Score 1 (OR: 0.76 [0.59, 0.99]) and Score 2 (OR: 0.77 [0.61, 0.98]). We also observed a 33% decreased risk of lung cancer for one standard deviation increase in CRP Score 1 (OR: 0.67 [0.47, 0.97]) among those with time to diagnosis over 10 years.

Table 3 Correlations between methylation-based CRP scores and circulating log-CRP level, log-IL6 level, peripheral blood leukocyte types, BMI, and smoking score residual among controls only
Table 4 Association between methylation-based CRP scores and risk of total lung cancer and NSCLC risk, overall and stratified by time to diagnosis, CLUE I/II cohort

Survival analysis

We examined whether the mdNLR, methylation-derived immune cell ratios, and CRP Scores were associated with risk of dying of lung cancer among lung cancer cases (Table 5, Fig. 1).

Table 5 Association between immune cell ratios and methylation-based CRP scores and lung cancer-specific mortality among lung cancer cases, CLUE I/II cohort
Fig. 1 Survival curves for lung cancer-specific mortality among lung cancer cases in the mdNLR high and low groups (above vs. at or below the 75th percentile). Plot adjusted for age, sex, smoking status, methylation-predicted pack-years smoked, BMI, stage, and batch effects

We observed a 47% increased risk of dying for one standard deviation increase in mdNLR among NSCLC cases (n = 149 cases; HR: 1.47 [1.20, 1.81]). Among the NSCLC cases whose mdNLR was measured ≤ 10 years before their diagnosis, we found a 73% increased risk of dying for a one standard deviation increase in mdNLR (HR: 1.73 [1.19, 2.51]). In comparison, the risk of dying for a one standard deviation increase in mdNLR was lower among the NSCLC cases whose mdNLR was measured 10 to 25 years prior to diagnosis (HR: 1.39 [1.05, 1.85]). Lastly, among all lung cancer cases, we observed a 28% increased risk of dying from lung cancer for one standard deviation increase in mdNLR (n = 205 cases after excluding 3 cases with person-years = 0 or > 25 years; HR: 1.28 [1.09, 1.50]).

Immune cell ratios for CD4/CD8, NLR, B cell/lymphocyte, T cell/lymphocyte, Mono/lymphocyte, Eos/lymphocyte, CD4nv/lymphocyte, B cell/CD8, CD8/Treg, Bnv/Bmem, CD4nv/CD4mem, CD8nv/CD8mem, and Treg (> 0 vs = 0) were not associated with lung cancer-specific death, except for a 48% increased risk for one standard deviation increase in Neu + Mono/lymphocyte ratio among the NSCLC cases (HR: 1.48 [1.04, 2.11]) and a borderline significant 29% increased risk of dying from lung cancer (HR: 1.29, adjusted CI: [1.00, 1.67]) for one standard deviation increase in Neu + Mono/lymphocyte ratio after multiple comparison adjustments. Furthermore, the three CRP Scores were not associated with lung cancer-specific death.


Our study prospectively assessed predicted immune cell profiles using DNA methylation markers and examined associations between previously identified DNA methylation markers of inflammation and lung cancer risk and survival. Using pre-diagnostic blood samples of lung cancer cases and controls who participated in the CLUE I/II cohorts [23], pre-diagnosis mdNLR was associated with increased risk of NSCLC, and among cases, with total lung cancer and NSCLC lung cancer-specific death. In addition, we built a series of methylation-derived CRP scores to capture individual systemic inflammatory profiles years before lung cancer diagnosis; these scores were inversely associated with risk of lung cancer, especially for NSCLC after adjusting for methylation-predicted pack-years smoked, but not with lung cancer-specific mortality.

Studies on NLR (calculated from measured WBC differentials) and lung cancer risk and survival typically measure pre-treatment NLR at diagnosis or up to 30 days prior to treatment [28,29,30]. Unlike prior studies, we were able to assess individual systemic inflammation profiles many years prior to diagnosis by using methylation markers of inflammation. Our study is not directly comparable to prior studies since we measured mdNLR using blood samples drawn a median of 14 years prior to lung cancer diagnosis. In addition, most cases in our study were diagnosed before the widespread use of immunotherapy. To our knowledge, only one other cohort, the multicenter β-Carotene and Retinol Efficacy Trial (CARET), examined pre-diagnosis mdNLR and lung cancer risk and survival using blood drawn years prior to diagnosis (median 4.7 years) [31, 32]. CARET, a study of heavy smokers, reported a 21% increased risk of lung cancer per one unit increase in mdNLR (OR: 1.21 [1.01, 1.45]), a 30% increased risk of NSCLC for one unit increase in mdNLR (OR: 1.30 [1.03, 1.63]), and no association between higher pre-diagnosis mdNLR and risk of developing SCLC (OR: 1.06 [0.77, 1.47]) [31]. As in CARET, in CLUE I/II we observed an increased risk of NSCLC with higher mdNLR (47% for a one standard deviation increase; n = 150 pairs; OR: 1.47 [1.08, 2.02]), but in contrast to CARET, we found no statistically significant association for overall lung cancer risk.

CARET researchers recently reported that pre-diagnosis mdNLR was positively associated with increased mortality for SCLC cases, but not for other case types [32]. In comparison, we observed a positive association between pre-diagnosis mdNLR and lung cancer-specific and NSCLC-specific mortality. In the case of SCLC, the number of cases was too limited for us to estimate stable associations (N = 29). Taken together, the CLUE and CARET results suggest that a systemic inflammatory profile marked by elevated NLR could indicate a lesser ability to mount a robust immune response to a developing lung cancer and/or a more favorable environment for cancer progression. Differences in findings between the two studies could stem from differences in study populations. The CARET cohort is exclusively heavy smokers, including a subgroup exposed to asbestos. In comparison, our analysis in the CLUE I/II cohorts included never, ever, and current smokers. Furthermore, our study population had a lower mdNLR in the lung cancer cases (mean 1.86 and SD 1.32) than in CARET (mdNLR mean 2.18 and SD 1.46).

Using a newly expanded deconvolution library, we were able to parse apart the granulocyte subtypes (neutrophils, eosinophils, and basophils) and investigate the balance between naïve and memory cell compartments for lung cancer. Previous research has identified the monocyte/lymphocyte (or lymphocyte/monocyte) ratio as an independent prognostic factor in NSCLC, demonstrating significant association with overall survival in patients with NSCLC [33,34,35]. In comparison, our exploratory analyses of immune cell ratios suggest that one standard deviation increase in the monocyte/lymphocyte ratio could potentially indicate increased risk of NSCLC after additionally adjusting for methylation-predicted pack-years. In addition, we found an increased risk of dying from lung cancer associated with an increase in Neu + Mono/lymphocyte ratio among the NSCLC cases after multiple comparison adjustments.

We also investigated three CRP Scores that we built from 54 CpG sites that had been strongly associated with CRP in previous studies. We found these methylation-predicted CRP Scores to be moderately correlated with log-CRP and log-IL6 in the controls of a previously published pancreatic cancer dataset [27]. CRP is a systemic marker of chronic inflammation and has been reported as a risk factor for cancer development [36]. Previous studies of pre-diagnostic circulating CRP concentration and lung cancer risk (7 cohorts [10, 11, 19, 37,38,39] and 3 nested case–control studies [8, 12, 40]) have consistently found a moderate positive association between pre-diagnostic CRP concentrations and lung cancer risk. In our study, CRP Scores were not associated with lung cancer risk when taking into account the matching factors, BMI, and batch effects. However, we observed an inverse association when additionally adjusting for methylation-predicted pack-years. Our results suggest that when strict control of smoking is applied, our CRP Score is likely capturing the unique individual immune response that is not driven by smoking.

Furthermore, these results provide preliminary evidence supporting the hypothesis that systemic inflammation not driven by smoking could have a protective effect on individuals. While smoking is by far the most important risk factor for lung cancer, our DNA methylation-based CRP Scores provide the opportunity to examine inflammatory measures not related to smoking that could play a role in modulating cancer risk years prior to diagnosis. Lastly, our experience with the CRP Scores suggests that measuring methylation-derived inflammatory responses using pre-diagnostic samples provides the opportunity to capture informative individual systemic inflammatory profiles years prior to diagnosis, potentially shedding light on risk factors key to lung cancer development and progression, e.g., underlying genetics, exposure to environmental risk factors, and behavioral risk factors.

Like other observational studies, our study included a limited number of NSCLC and SCLC cases. The relatively small sample size of SCLC cases (N = 29) impacted our ability to observe associations for this subtype (SCLC comprises about 15% of lung cancer cases in the USA). In our survival analysis, we adjusted for stage and restricted our analysis to samples whose time between blood draw and date of lung cancer diagnosis was less than 25 years; however, our survival analysis did not have access to post-diagnosis smoking status information. Our study is also limited by the lack of a replication dataset and by reduced generalizability: the study population is mainly White and includes very few cases among never smokers. The CRP Scores we built should be investigated in other populations to ensure that what we observed did not arise due to chance.


Our study suggests that elevated pre-diagnosis mdNLR and a lower non-smoking-related systemic inflammatory profile before diagnosis are associated with higher cancer risk and poorer lung cancer-specific survival. These relationships were especially evident for NSCLC. As the most common subtype of lung cancer, most NSCLC cases are diagnosed with locally advanced or metastatic disease. Our prospective results support future evaluation of whether DNA methylation-based inflammatory measures could enhance lung cancer risk stratification to improve targeted lung cancer screening.


Study Population

This nested case–control study selected cases and controls from individuals who participated and provided blood in both CLUE I and CLUE II [26]. The CLUE I cohort was developed to identify serologic precursors of cancer and was conducted in Washington County, Maryland, in the fall of 1974. A blood sample was collected from 25,620 volunteers at the time of participation [41, 42]. The CLUE II cohort was conducted from May through October 1989. During this time, 32,894 participants donated a blood sample, which was collected in tubes containing heparin and kept chilled until centrifuged, aliquoted into plasma, erythrocytes, and buffy coat, and frozen at −70 °C [43]. In CLUE II, the baseline for this study, health information was collected at the time of blood draw, including attained education, cigarette smoking status, cigarette smoking dose, cigar/pipe smoking status, and self-reported weight and height.

Incident lung cancer cases were ascertained from linkage to the Washington County cancer registry (before 1992 to the present) and the Maryland Cancer Registry (since 1992 when it began to the present). We ascertained 241 incident lung cancer cases who participated in CLUE I and were diagnosed after the day of blood draw in CLUE II through January 2018. Cases were characterized with respect to histology. We used incidence density sampling to select one control matched to each case on age, sex, smoking status and intensity (cig/day), and cigar/pipe smoking status. Death from lung cancer as the underlying cause was obtained from death certificates. The Institutional Review Board at the Johns Hopkins Bloomberg School of Public Health and the Tufts University Health Sciences Campus Institutional Review Board approved this study.

DNA methylation measurements

Extracted DNA was bisulfite-treated using the EZ DNA Methylation Kit (Zymo), and DNA methylation was measured with the 850 K Illumina Infinium MethylationEPIC BeadChip Arrays (Illumina, Inc., CA, USA). All sample processing and array experiments were performed blinded to case–control status. Details on DNA methylation measurements, data preprocessing, and quality control assessment/screening are provided in Additional file 1. The 850 K methylation microarray has been validated from a biological and technical standpoint. Reproducibility of results from the 850 K Illumina array has been previously shown to be very high (r = 0.997) [44]. DNA volume and quality were sufficient for 208 of the cases and 222 controls, totaling 208 matched pairs.

Estimation of peripheral blood leukocyte composition

Peripheral blood leukocyte subtype proportions, including myeloid lineage subtypes [neutrophils (Neu), eosinophils (Eos), basophils (Bas), and monocytes (Mono)] and lymphoid lineage subtypes [B lymphocytes naïve (Bnv), B lymphocytes memory (Bmem), T helper lymphocytes naïve (CD4nv), T helper lymphocytes memory (CD4mem), T regulatory cells (Treg), T cytotoxic lymphocytes naïve (CD8nv), T cytotoxic lymphocytes memory (CD8mem), and natural killer lymphocytes (NK)], were estimated using a newly expanded reference-based deconvolution library, EPIC IDOL-Ext [45]. This library used the IDOL methodology [46] to optimize the currently available six-cell reference library [47] in order to deconvolve the proportions of 12 leukocyte subtypes in peripheral blood. This EPIC IDOL-Ext library (Bioconductor package FlowSorted.BloodExtended.EPIC) was validated using flow cytometry gold standard data and substantiated by including publicly available data from > 100,000 samples [45].

Methylation-Derived Neutrophil Lymphocyte Ratio (mdNLR)

The peripheral blood neutrophil-to-lymphocyte ratio (NLR) is a cytological marker of both inflammation and poor outcomes in cancer patients [48,49,50,51,52]. We used a DNA methylation-derived NLR (mdNLR) index to predict the common clinical NLR parameter using a previously described approach [9]. This index is based on normal isolated leukocyte reference DNA methylation libraries and established reference-based cell mixture deconvolution algorithms [9, 53].

Inflammation-associated CpG score

We used 54 CpG sites that have been strongly associated with C-reactive protein (CRP) [54, 55] to build three CRP Scores. We selected these 54 CpGs from the 58 CpGs that Ligthart and colleagues [54] identified, using 450 K DNA methylation data, for their association with serum CRP level (listed in Table 3); the remaining 4 were not on the 850 K array that we used. Forty-five of these 58 CpG sites were validated to have the same direction of protein–methylation associations by Myte et al. [55]. These CpGs, while identified based on their CRP association, have also been shown to be associated with other inflammatory mediators [54,55,56]. To compute CRP Score 1, we multiplied the beta value at each selected CpG site by the effect size estimate reported by Ligthart et al. These estimated beta coefficients represented the change in DNA methylation per one unit increase in log CRP. In the CRP Score 1 formula, we weighted the beta coefficients estimated by Ligthart et al. by dividing each by its corresponding standard error.

$$\text{CRP Score}_{i} = \sum_{j} B_{ij} \times \frac{\Delta_{j}}{\text{SE}_{j}}$$

Here, Bij is the methylation beta value for the ith participant at the jth CpG site, ∆j is the beta coefficient reported by Ligthart et al. for the jth CpG site, and SEj is the standard error reported by Ligthart et al. for the jth CpG site. The sum runs over the selected CpG sites.
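The weighted sum defined above can be illustrated with a minimal Python sketch. It is not the study's code (the analyses were reported as run in R); the CpG names, effect sizes, and standard errors are invented placeholders, and the resulting value is not meant to match the score ranges reported in this paper.

```python
from typing import Dict, Tuple

# Hypothetical CpG identifiers with made-up (delta_j, se_j) pairs.
# In practice these would be the published effect sizes and standard errors.
WEIGHTS: Dict[str, Tuple[float, float]] = {
    "cpg_a": (-0.004, 0.001),
    "cpg_b": (-0.002, 0.0005),
    "cpg_c": (0.003, 0.001),
}


def crp_score(betas: Dict[str, float],
              weights: Dict[str, Tuple[float, float]]) -> float:
    """Return sum_j B_ij * (delta_j / SE_j) for one participant."""
    total = 0.0
    for cpg, (delta, se) in weights.items():
        total += betas[cpg] * (delta / se)
    return total


# Methylation beta values (between 0 and 1) for one invented participant.
participant = {"cpg_a": 0.80, "cpg_b": 0.60, "cpg_c": 0.10}
score = crp_score(participant, WEIGHTS)
print(round(score, 6))
```

Keeping the weights in a dictionary keyed by CpG makes it straightforward to build a score from any subset of sites by passing a smaller dictionary.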

Since most of the estimated beta coefficients are negative, CRP Score 1 ranged between − 0.059 and − 0.026 in these participants. A score closer to zero indicated higher CRP levels. Based on CRP Score 1, we computed two additional CRP Scores, one cell-specific (CRP Score 2) and one cell (leukocyte)-type invariant (CRP Score 3). Among the 54 inflammation (CRP)-associated CpGs, we identified putative cell-type invariant and cell-specific CpGs by conducting ANOVA using the dataset described in Salas and Koestler et al. [47] and publicly available on the Gene Expression Omnibus (GSE110555). The dataset used for this ANOVA consisted of EPIC methylation data profiled in purified leukocyte cell populations isolated from different healthy adults. Specifically, methylation signatures were available for CD4 + T cells, CD8 + T cells, NK cells, B cells, monocytes, and neutrophils. One-way ANOVA models were fit independently to each of the 54 CRP-associated CpGs, treating methylation as the dependent variable and cell type as the independent variable. We tested the null hypothesis that the mean methylation beta-value is the same across the cell types. The F-statistic, corresponding p value, and maximum absolute pairwise difference in the mean methylation beta value across cell types were calculated for each of the 54 CpGs. We then selected the subgroups of CpG sites with the 10 largest or the 10 smallest F-statistic values to build the two additional CRP Scores. CRP Score 2 consists of putative cell-specific CpGs with high F-statistics, i.e., those exhibiting a difference in mean methylation beta-values between at least two of the six cell types. CRP Score 3 is made of cell-type invariant CpGs with low F-statistics, i.e., CpGs for which there did not appear to be a substantial difference in mean methylation beta-values across the six normal leukocyte subtypes. Score 2 ranged between − 0.0002 and 0.0046, while Score 3 ranged between − 0.025 and − 0.016.
In the regression analyses, we used standardized versions of CRP Scores 1, 2, and 3 (mean = 0, SD = 1) to ease interpretation and to allow comparison of results across the scores.
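As a rough illustration of the CpG selection and standardization steps, the Python sketch below computes a one-way ANOVA F-statistic per CpG, ranks CpGs from most cell-specific to most invariant, and z-standardizes a set of scores. The CpG names and beta values are invented, only three cell types with three samples each are used instead of six, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev
from typing import Dict, List


def one_way_f(groups: List[List[float]]) -> float:
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_by_f(data: Dict[str, List[List[float]]]) -> List[str]:
    """CpG names ordered from largest F (cell-specific) to smallest (invariant)."""
    return sorted(data, key=lambda cpg: one_way_f(data[cpg]), reverse=True)


def standardize(scores: List[float]) -> List[float]:
    """Rescale scores to mean 0 and (sample) standard deviation 1."""
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]


# Invented methylation values: one inner list per purified cell type.
TOY = {
    "cpg_specific": [[0.10, 0.12, 0.11], [0.80, 0.82, 0.81], [0.45, 0.47, 0.46]],
    "cpg_mid": [[0.1, 0.2, 0.3], [0.2, 0.3, 0.4], [0.3, 0.4, 0.5]],
    "cpg_invariant": [[0.50, 0.52, 0.51], [0.51, 0.50, 0.52], [0.52, 0.51, 0.50]],
}

ranked = rank_by_f(TOY)
top_specific = ranked[:1]     # the study kept the 10 largest F-statistics
top_invariant = ranked[-1:]   # and the 10 smallest
z_scores = standardize([-0.05, -0.04, -0.03])
print(ranked, [round(z, 3) for z in z_scores])
```

With real data, the slices would take 10 CpGs from each end of the ranking, and the two subsets would then be passed to the same weighted-sum score used for CRP Score 1.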

Statistical analyses

All statistical analyses were performed in R (version 3.5.1). We estimated mdNLR as described above, used an independent pancreatic cancer dataset [27] to estimate the correlation between estimated values of CRP Scores 1–3 and the log CRP and log IL-6 levels, and tested a series of a priori hypotheses concerning the mdNLR and CRP Scores. In addition, we conducted exploratory analyses to generate novel hypotheses regarding the role of methylation-derived leukocyte proportions in lung cancer. Immune cell ratios (e.g., CD4/CD8, Neu/lymphocyte, B cell/lymphocyte, T cell/lymphocyte, Mono/lymphocyte, Neu + Mono/lymphocyte, Eos/lymphocyte, CD4nv/lymphocyte, B cell/CD8, CD8/Treg, Bnv/Bmem, CD4nv/CD4mem, and CD8nv/CD8mem) were calculated for each sample by taking the ratio of its predicted cell proportions described above and were tested as continuous variables. The presence of Treg was tested as a dichotomous variable. To account for multiple comparisons, we applied a Bonferroni adjustment (family-wise error rate = 0.0013) to all exploratory analyses.
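The ratio calculation and the Bonferroni threshold can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the cell proportions are invented, "lymphocyte" is assumed here to mean the sum of all eight lymphoid subtypes from the deconvolution library, and the function names are hypothetical. Only a handful of the listed ratios are shown.

```python
import math
from typing import Dict

# Assumption: the lymphocyte denominator sums all eight lymphoid subtypes.
LYMPHOID = ("Bnv", "Bmem", "CD4nv", "CD4mem", "Treg", "CD8nv", "CD8mem", "NK")


def safe_ratio(num: float, den: float) -> float:
    """Return num/den, or NaN when the denominator proportion is zero."""
    return num / den if den > 0 else math.nan


def immune_ratios(p: Dict[str, float]) -> Dict[str, float]:
    """Compute a few immune cell ratios from predicted cell proportions."""
    lymph = sum(p[c] for c in LYMPHOID)
    return {
        "Neu/lymphocyte": safe_ratio(p["Neu"], lymph),
        "Mono/lymphocyte": safe_ratio(p["Mono"], lymph),
        "Neu+Mono/lymphocyte": safe_ratio(p["Neu"] + p["Mono"], lymph),
        "Bnv/Bmem": safe_ratio(p["Bnv"], p["Bmem"]),
        "CD4nv/CD4mem": safe_ratio(p["CD4nv"], p["CD4mem"]),
        "CD8nv/CD8mem": safe_ratio(p["CD8nv"], p["CD8mem"]),
        "Treg present": 1.0 if p["Treg"] > 0 else 0.0,  # dichotomous variable
    }


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni adjustment."""
    return alpha / n_tests


# Invented proportions for one sample (they sum to 1).
sample = {
    "Neu": 0.60, "Eos": 0.02, "Bas": 0.01, "Mono": 0.07,
    "Bnv": 0.03, "Bmem": 0.02, "CD4nv": 0.06, "CD4mem": 0.07,
    "Treg": 0.01, "CD8nv": 0.03, "CD8mem": 0.04, "NK": 0.04,
}
ratios = immune_ratios(sample)
print({name: round(value, 3) for name, value in ratios.items()})
print(bonferroni_threshold(0.05, 10))
```

Returning NaN for a zero denominator keeps samples with an undetected cell type from raising an error; how such samples were handled in the study is not stated here.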

We used conditional logistic regression to examine the association between DNA methylation-based inflammatory measures (CRP Scores 1–3 and continuous mdNLR) and lung cancer risk. Models were fit with age, sex, and smoking status (never, former, current) as matching factors and were adjusted for potential confounding factors, including body mass index (BMI), batch effect, and previously described methylation-predicted pack-years smoked [57]. These analyses did not additionally adjust for methylation-derived cell proportions given how these proportions correlated with methylation-based inflammatory measures (Table 3). We repeated these analyses by lung cancer histology (NSCLC, SCLC), length of time between blood draw and diagnosis (≤ 10, > 10 years), and BMI (< 25, ≥ 25 kg/m²).
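For 1:1 matched pairs, the conditional likelihood depends only on the within-pair difference in the exposure, so a single-covariate model can be fit with a short Newton-Raphson loop. The Python sketch below illustrates that idea with invented data; it omits the covariate adjustments described above and is not the study's R code. The function names are hypothetical.

```python
import math
from typing import List, Tuple


def score_and_info(diffs: List[float], beta: float) -> Tuple[float, float]:
    """First derivative and negative second derivative of the conditional log-likelihood."""
    score, info = 0.0, 0.0
    for d in diffs:
        # Probability that the observed case is the case within its pair.
        p = 1.0 / (1.0 + math.exp(-beta * d))
        score += d * (1.0 - p)
        info += d * d * p * (1.0 - p)
    return score, info


def fit_matched_pairs(pairs: List[Tuple[float, float]],
                      tol: float = 1e-10,
                      max_iter: int = 50) -> Tuple[float, float]:
    """Fit beta for (x_case, x_control) pairs; return (beta, standard error)."""
    diffs = [x_case - x_control for x_case, x_control in pairs]
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_info(diffs, beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_info(diffs, beta)
    return beta, 1.0 / math.sqrt(info)


# Invented standardized exposure values: (case, matched control).
PAIRS = [(1.2, 0.5), (0.3, 0.9), (2.0, 1.1), (0.8, 0.2),
         (-0.1, 0.4), (1.5, 0.7), (0.6, 1.0), (1.0, -0.2)]

beta_hat, se_hat = fit_matched_pairs(PAIRS)
odds_ratio = math.exp(beta_hat)
ci_low = math.exp(beta_hat - 1.96 * se_hat)
ci_high = math.exp(beta_hat + 1.96 * se_hat)
print(round(odds_ratio, 2), round(ci_low, 2), round(ci_high, 2))
```

Because the exposure is standardized, the exponentiated coefficient reads as an odds ratio per one standard deviation increase, which is how the results in this paper are expressed.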

Among the lung cancer cases, we examined the association between these same pre-diagnostic DNA methylation-based inflammatory measures (CRP Scores 1–3 and continuous mdNLR) and risk of lung cancer-specific death using a series of multivariable Cox proportional hazards regression models adjusting for age, gender, smoking status, BMI, stage at diagnosis (three strata: stage 1 & 2, stage 3 & 4, and missing), cell proportion, batch effects, and methylation-predicted pack-years smoked. The proportional hazards assumption was checked with global tests of the correlation between the scaled Schoenfeld residuals and time for each covariate. We excluded three lung cancer cases whose date of diagnosis and date of death were the same, or whose time between blood draw and date of lung cancer diagnosis was longer than 25 years. Cases were followed until their date of death from lung cancer, death from another cause, or the end of follow-up in 2018, whichever came first.
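To make the survival model concrete, the Python sketch below fits a single-covariate Cox model by maximizing the partial likelihood with Newton-Raphson. It is an illustration only: the follow-up times, event indicators, and covariate values are invented, there are no tied event times, none of the adjustment covariates are included, and the function names are hypothetical. Cases who die of another cause or reach the end of follow-up are treated as censored (event = 0).

```python
import math
from typing import List, Tuple

# (follow-up time in years, event: 1 = lung cancer death, 0 = censored, covariate x)
Record = Tuple[float, int, float]


def score_and_info(records: List[Record], beta: float) -> Tuple[float, float]:
    """Score and observed information of the Cox partial log-likelihood."""
    score, info = 0.0, 0.0
    for t_i, event_i, x_i in records:
        if not event_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under follow-up at time t_i.
        risk = [x for t, _, x in records if t >= t_i]
        weights = [math.exp(beta * x) for x in risk]
        s0 = sum(weights)
        s1 = sum(w * x for w, x in zip(weights, risk))
        s2 = sum(w * x * x for w, x in zip(weights, risk))
        score += x_i - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return score, info


def cox_fit(records: List[Record], tol: float = 1e-10,
            max_iter: int = 50) -> Tuple[float, float]:
    """Return (beta, standard error) for a single covariate."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_info(records, beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_info(records, beta)
    return beta, 1.0 / math.sqrt(info)


# Invented data; x plays the role of a standardized marker such as mdNLR.
DATA: List[Record] = [
    (5, 1, 1.2), (8, 1, 0.3), (12, 0, 0.5), (3, 1, 0.9), (15, 1, -0.2),
    (9, 0, 1.5), (7, 1, -0.5), (20, 0, 0.1), (11, 1, 0.8), (4, 1, 0.2),
]

beta_hat, se_hat = cox_fit(DATA)
hazard_ratio = math.exp(beta_hat)
print(round(hazard_ratio, 2))
```

The exponentiated coefficient is the hazard ratio per one unit (here, one standard deviation) increase in the covariate, the form in which the HRs in this paper are reported.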

Availability of data and materials

The datasets generated during the current study are available from the corresponding author on reasonable request and will be deposited into dbGaP by publication.


  1. ACS. Cancer Facts & Figures 2021. Atlanta: American Cancer Society, Inc. 2021.

  2. Howlader N, Forjaz G, Mooradian MJ, Meza R, Kong CY, Cronin KA, et al. The effect of advances in lung-cancer treatment on population mortality. N Engl J Med. 2020;383(7):640–9.

  3. Howlader N, Krapcho M, Miller D, Brest A, Yu M, Ruhl J, et al. SEER cancer statistics review, 1975–2017, vol. 21. Bethesda, MD: National Cancer Institute; 2020. p. 12.

  4. Fabrikant MS, Wisnivesky JP, Marron T, Taioli E, Veluswamy RR. Benefits and challenges of lung cancer screening in older adults. Clin Ther. 2018;40(4):526–34.

  5. US Preventive Services Task Force. Screening for lung cancer: US Preventive Services Task Force recommendation statement. JAMA. 2021;325(10):962–70.

  6. Crusz SM, Balkwill FR. Inflammation and cancer: advances and new agents. Nat Rev Clin Oncol. 2015;12(10):584–96.

  7. Zhou W, Liu G, Hung RJ, Haycock PC, Aldrich MC, Andrew AS, et al. Causal relationships between body mass index, smoking and lung cancer: univariable and multivariable Mendelian randomization. Int J Cancer. 2021;148(5):1077–86.

  8. Chaturvedi AK, Caporaso NE, Katki HA, Wong H-L, Chatterjee N, Pine SR, et al. C-reactive protein and risk of lung cancer. J Clin Oncol. 2010;28(16):2719.

  9. Koestler DC, Usset J, Christensen BC, Marsit CJ, Karagas MR, Kelsey KT, et al. DNA methylation-derived neutrophil-to-lymphocyte ratio: an epigenetic tool to explore cancer inflammation and outcomes. Cancer Epidemiol Prev Biomark. 2017;26(3):328–38.

  10. Il’yasova D, Colbert LH, Harris TB, Newman AB, Bauer DC, Satterfield S, et al. Circulating levels of inflammatory markers and cancer risk in the health aging and body composition cohort. Cancer Epidemiol Prev Biomark. 2005;14(10):2413–8.

  11. Siemes C, Visser LE, Coebergh J-WW, Splinter TA, Witteman JC, Uitterlinden AG, et al. C-reactive protein levels, variation in the C-reactive protein gene, and cancer risk: the Rotterdam Study. J Clin Oncol. 2006;24(33):5216–22.

  12. Trichopoulos D, Psaltopoulou T, Orfanos P, Trichopoulou A, Boffetta P. Plasma C-reactive protein and risk of cancer: a prospective study from Greece. Cancer Epidemiol Prev Biomark. 2006;15(2):381–4.

  13. Guthrie GJ, Charles KA, Roxburgh CS, Horgan PG, McMillan DC, Clarke SJ. The systemic inflammation-based neutrophil–lymphocyte ratio: experience in patients with cancer. Crit Rev Oncol Hematol. 2013;88(1):218–30.

  14. Templeton AJ, McNamara MG, Šeruga B, Vera-Badillo FE, Aneja P, Ocaña A, et al. Prognostic role of neutrophil-to-lymphocyte ratio in solid tumors: a systematic review and meta-analysis. J Natl Cancer Inst. 2014;106(6):dju124.

  15. Campa D, Zienolddiny S, Maggini V, Skaug V, Haugen A, Canzian F. Association of a common polymorphism in the cyclooxygenase 2 gene with risk of non-small cell lung cancer. Carcinogenesis. 2004;25(2):229–35.

  16. Engels EA, Wu X, Gu J, Dong Q, Liu J, Spitz MR. Systematic evaluation of genetic variants in the inflammation pathway and risk of lung cancer. Cancer Res. 2007;67(13):6520–7.

  17. Seifart C, Plagens A, Dempfle A, Clostermann U, Vogelmeier C, von Wichert P, et al. TNF-α, TNF-β, IL-6, and IL-10 polymorphisms in patients with lung cancer. Dis Mark. 2005;21(3):157–65.

  18. Shih C-M, Lee Y-L, Chiou H-L, Hsu W-F, Chen W-E, Chou M-C, et al. The involvement of genetic polymorphism of IL-10 promoter in non-small cell lung cancer. Lung Cancer. 2005;50(3):291–7.

  19. Allin KH, Bojesen SE, Nordestgaard BG. Baseline C-reactive protein is associated with incident cancer and survival in patients with cancer. J Clin Oncol. 2009;27(13):2217–24.

  20. Heikkilä K, Ebrahim S, Lawlor DA. A systematic review of the association between circulating concentrations of C reactive protein and cancer. J Epidemiol Community Health. 2007;61(9):824–33.

  21. Qu Z, Sun F, Zhou J, Li L, Shapiro SD, Xiao G. Interleukin-6 prevents the initiation but enhances the progression of lung cancer. Cancer Res. 2015;75(16):3209–15.

  22. Ballaz S, Mulshine JL. The potential contributions of chronic inflammation to lung carcinogenesis. Clin Lung Cancer. 2003;5(1):46–62.

  23. Bauer AK, Dwyer-Nield LD, Keil K, Koski K, Malkinson AM. Butylated hydroxytoluene (BHT) induction of pulmonary inflammation: a role in tumor promotion. Exp Lung Res. 2001;27(3):197–216.

  24. Engels EA. Inflammation in the development of lung cancer: epidemiological evidence. Expert Rev Anticancer Ther. 2008;8(4):605–15.

  25. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144(5):646–74.

  26. Genkinger JM, Platz EA, Hoffman SC, Comstock GW, Helzlsouer KJ. Fruit, vegetable, and antioxidant intake and all-cause, cancer, and cardiovascular disease mortality in a community-dwelling population in Washington County, Maryland. Am J Epidemiol. 2004;160(12):1223–33.

  27. Michaud DS, Ruan M, Koestler DC, Alonso L, Molina-Montes E, Pei D, et al. DNA Methylation–derived immune cell profiles, CpG markers of inflammation, and pancreatic cancer risk. Cancer Epidemiol Prev Biomark. 2020;29(8):1577–85.

  28. Lohinai Z, Bonanno L, Aksarin A, Pavan A, Megyesfalvi Z, Santa B, et al. Neutrophil–lymphocyte ratio is prognostic in early stage resected small-cell lung cancer. PeerJ. 2019;7:e7232.

  29. Galvano A, Peri M, Guarini AA, Castiglia M, Grassadonia A, De Tursi M, et al. Analysis of systemic inflammatory biomarkers in neuroendocrine carcinomas of the lung: prognostic and predictive significance of NLR, LDH, ALI, and LIPI score. Ther Adv Med Oncol. 2020;12:1758835920942378.

  30. Ozyurek BA, Ozdemirel TS, Ozden SB, Erdogan Y, Kaplan B, Kaplan T. Prognostic value of the neutrophil to lymphocyte ratio (NLR) in lung cancer cases. Asian Pac J Cancer Prev. 2017;18(5):1417.

  31. Grieshober L, Graw S, Barnett MJ, Thornquist MD, Goodman GE, Chen C, et al. Methylation-derived neutrophil-to-lymphocyte ratio and lung cancer risk in heavy smokers. Cancer Prev Res. 2018;11(11):727–34.

  32. Grieshober L, Graw S, Barnett MJ, Goodman GE, Chen C, Koestler DC, et al. Pre-diagnosis neutrophil-to-lymphocyte ratio and mortality in individuals who develop lung cancer. Res Sq 2021.

  33. Mandaliya H, Jones M, Oldmeadow C, Nordman II. Prognostic biomarkers in stage IV non-small cell lung cancer (NSCLC): neutrophil to lymphocyte ratio (NLR), lymphocyte to monocyte ratio (LMR), platelet to lymphocyte ratio (PLR) and advanced lung cancer inflammation index (ALI). Transl Lung Cancer Res. 2019;8(6):886.

  34. Hu P, Shen H, Wang G, Zhang P, Liu Q, Du J. Prognostic significance of systemic inflammation-based lymphocyte-monocyte ratio in patients with lung cancer: based on a large cohort study. PLoS ONE. 2014;9(10):e108062.

  35. Chen X, Wu J, Zhang F, Ying L, Chen Y. Prognostic significance of pre-operative monocyte-to-lymphocyte ratio in lung cancer patients undergoing radical surgery. Lab Med. 2018;49(2):e29–39.

  36. Tsilidis KK, Branchini C, Guallar E, Helzlsouer KJ, Erlinger TP, Platz EA. C-reactive protein and colorectal cancer risk: a systematic review of prospective studies. Int J Cancer. 2008;123(5):1133–40.

  37. Heikkilä K, Harris R, Lowe G, Rumley A, Yarnell J, Gallacher J, et al. Associations of circulating C-reactive protein and interleukin-6 with cancer risk: findings from two prospective cohorts and a meta-analysis. Cancer Causes Control. 2009;20(1):15–26.

  38. dos Santos SI, De Stavola BL, Pizzi C, Meade TW. Circulating levels of coagulation and inflammation markers and cancer risks: individual participant analysis of data from three long-term cohorts. Int J Epidemiol. 2010;39(3):699–709.

  39. Van Hemelrijck M, Holmberg L, Garmo H, Hammar N, Walldius G, Binda E, et al. Association between levels of C-reactive protein and leukocytes and cancer: three repeated measurements in the Swedish AMORIS study. Cancer Epidemiol Prev Biomark. 2011;20(3):428–37.

  40. Suzuki K, Ito Y, Wakai K, Kawado M, Hashimoto S, Seki N, et al. Serum heat shock protein 70 levels and lung cancer risk: a case-control study nested in a large cohort study. Cancer Epidemiol Prev Biomark. 2006;15(9):1733–7.

  41. Braun MM, Helzlsouer KJ, Hollis BW, Comstock GW. Colon cancer and serum vitamin D metabolite levels 10–17 years prior to diagnosis. Am J Epidemiol. 1995;142(6):608.

  42. Comstock G, Helzlsouer KJ, Bush TL. Prediagnostic serum levels of carotenoids and vitamin E as related to subsequent cancer in Washington County, Maryland. Am J Clin Nutr. 1991;53(1):260S–264S.

  43. Kakourou A, Koutsioumpa C, Lopez DS, Hoffman-Bolton J, Bradwin G, Rifai N, et al. Interleukin-6 and risk of colorectal cancer: results from the CLUE II cohort and a meta-analysis of prospective studies. Cancer Causes Control. 2015;26(10):1449–60.

  44. Moran S, Arribas C, Esteller M. Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics. 2016;8(3):389–99.

  45. Salas LA, Zhang Z, Koestler DC, Butler RA, Hansen HM, Molinaro AM, et al. Enhanced cell deconvolution of peripheral blood using DNA methylation for high-resolution immune profiling. bioRxiv. 2021.

  46. Koestler DC, Jones MJ, Usset J, Christensen BC, Butler RA, Kobor MS, et al. Improving cell mixture deconvolution by identifying optimal DNA methylation libraries (IDOL). BMC Bioinform. 2016;17(1):120.

  47. Salas LA, Koestler DC, Butler RA, Hansen HM, Wiencke JK, Kelsey KT, et al. An optimized library for reference-based deconvolution of whole-blood biospecimens assayed using the Illumina HumanMethylationEPIC BeadArray. Genome Biol. 2018;19(1):1–14.

  48. Wang Y, Liu P, Xu Y, Zhang W, Tong L, Guo Z, et al. Preoperative neutrophil-to-lymphocyte ratio predicts response to first-line platinum-based chemotherapy and prognosis in serous ovarian cancer. Cancer Chemother Pharmacol. 2015;75(2):255–62.

  49. Salim DK, Mutlu H, Eryılmaz MK, Salim O, Musri FY, Tural D, et al. Neutrophil to lymphocyte ratio is an independent prognostic factor in patients with recurrent or metastatic head and neck squamous cell cancer. Mol Clin Oncol. 2015;3(4):839–42.

  50. Yu Y, Wang H, Yan A, Wang H, Li X, Liu J, et al. Pretreatment neutrophil to lymphocyte ratio in determining the prognosis of head and neck cancer: a meta-analysis. BMC Cancer. 2018;18(1):1–9.

  51. Ethier J-L, Desautels DN, Templeton AJ, Oza A, Amir E, Lheureux S. Is the neutrophil-to-lymphocyte ratio prognostic of survival outcomes in gynecologic cancers? A systematic review and meta-analysis. Gynecol Oncol. 2017;145(3):584–94.

  52. Diem S, Schmid S, Krapf M, Flatz L, Born D, Jochum W, et al. Neutrophil-to-Lymphocyte ratio (NLR) and Platelet-to-Lymphocyte ratio (PLR) as prognostic markers in patients with non-small cell lung cancer (NSCLC) treated with nivolumab. Lung Cancer. 2017;111:176–81.

  53. Houseman EA, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinform. 2012;13(1):1–16.

  54. Ligthart S, Marzi C, Aslibekyan S, Mendelson MM, Conneely KN, Tanaka T, et al. DNA methylation signatures of chronic low-grade inflammation are associated with complex diseases. Genome Biol. 2016;17(1):1–15.

  55. Myte R, Sundkvist A, Van Guelpen B, Harlid S. Circulating levels of inflammatory markers and DNA methylation, an analysis of repeated samples from a population based cohort. Epigenetics. 2019;14(7):649–59.

  56. Ahsan M, Ek WE, Rask-Andersen M, Karlsson T, Lind-Thomsen A, Enroth S, et al. The relative contribution of DNA methylation and genetic variants on protein biomarkers for human diseases. PLoS Genet. 2017;13(9):e1007005.

  57. Sugden K, Hannon EJ, Arseneault L, Belsky DW, Broadbent JM, Corcoran DL, et al. Establishing a generalized polyepigenetic biomarker for tobacco smoking. Transl Psychiatry. 2019;9(1):1–12.


Cancer data were provided by the Maryland Cancer Registry, Center for Cancer Prevention and Control, Maryland Department of Health, with funding from the State of Maryland and the Maryland Cigarette Restitution Fund. The collection and availability of cancer registry data are also supported by the Cooperative Agreement NU58DP006333, funded by the Centers for Disease Control and Prevention. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the Centers for Disease Control and Prevention or the Department of Health and Human Services.


This work was supported by 2018 American Association for Cancer Research (AACR)-Johnson and Johnson Lung Cancer Innovation Science (18-90-52-MICH). Note: The funders had no role in the design of the study; the collection, analysis, and interpretation of the data; the writing of the manuscript; and the decision to submit the manuscript for publication.

Author information


DSM, KTK, and EAP designed the study, obtained funding, and acquired the data. JL assisted with identifying the study samples and preparation of dataset. LAS assisted with cell deconvolution. DSM supervised all research activities. MR and NZ conducted the statistical analyses. NZ drafted the manuscript. DSM, EAP, KTK, and DCK interpreted the data and provided critical revisions of the manuscript. All authors read and approved the final version of the manuscript.

Corresponding author

Correspondence to Dominique S. Michaud.

Ethics declarations

Ethics approval and consent to participate

The Institutional Review Board at the Johns Hopkins Bloomberg School of Public Health and the Tufts University Health Sciences Campus Institutional Review Board approved this study.

Consent for publication

Not applicable.

Competing interests

Dr. Kelsey is a founder and scientific advisor for Cellintec, which had no role in this research.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Supplementary Methods.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zhao, N., Ruan, M., Koestler, D.C. et al. Methylation-derived inflammatory measures and lung cancer risk and survival. Clin Epigenet 13, 222 (2021).


  • Lung cancer
  • DNA methylation
  • Methylation-based inflammation measures
  • C-reactive protein
  • mdNLR