Distribution of smoking-associated CpG among white blood cells
One major aim of this study was to show that smoking-related methylation changes observed in whole blood or PBMC are caused by the activation or expansion of specific cell types rather than by minor methylation changes in every cell type.
We recently identified such a cell type: smoking-evoked GPR15+ T cells, which carry a cell type-specific hypomethylation at cg19859270 within the gene body of GPR15, account for the minor methylation change at cg19859270 in whole blood or PBMC samples of smokers [23]. It has been hypothesized that an up-regulation of GPR15 could explain, to some extent, the health hazard of smoking with regard to chronic inflammatory diseases [24].
Thus, we further assumed that methylation changes at single CpGs caused by tobacco smoking might be unequally distributed among the leukocyte populations of the blood. To test this, we chose several highly significant, and hence top-ranked, CpGs repeatedly highlighted in recent studies based on DNA isolated from whole blood [6] and PBMC [4]. As expected, by comparing data from these two reports with our own experimental data from separated cell populations, we could show for the first time that smoking-induced methylation changes in blood are strongly cell type-specific. As shown for cg19859270 (GPR15), even a smoking-related methylation difference of around 2 % in blood may be of strong biological relevance if it is caused by the expansion of a specific cell type, namely GPR15+ T cells, involved in inflammation and disease pathology.
Interestingly, in all cases where the main smoking-induced methylation change at a single CpG was expected in PBMC (cg19859270, GPR15; cg02657160, CPOX; cg02319016, PAK2), the greatest methylation change was found in the GPR15+ T cells of smokers. Thus, it can be supposed that, as reported for cg19859270 in the GPR15 gene [23], the smoking-induced expansion of GPR15+ T cells may be responsible for further single methylation changes identified in whole blood or PBMC samples of smokers.
In contrast to other CpG sites showing cell type-specific methylation changes in smokers, the hypomethylation of cg05575921 within the AHRR gene was found in different cell types. The most pronounced methylation change emerged in granulocytes (−55.2 % compared to non-smokers), followed by PBMC (−15.7 %) and GPR15+ T cells (−12.7 %). Since GPR15+ T cells are only a small subset of PBMC (about 6–28 %), it can be assumed that monocytes additionally account for the methylation change of about 16 % at cg05575921 in PBMC. Smoking-induced methylation changes at cg05575921 in antigen-presenting cell types such as EBV-immortalized lymphoblasts (corresponding to B cells) and alveolar macrophages may support this assumption [10]. Furthermore, the dissimilar effect sizes of smoking-induced methylation changes at cg05575921 argue against the previously discussed assumption that the methylation changes originate in stem cells.
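The mixture argument above can be made explicit with a short calculation. The following is a minimal sketch, not the analysis performed in this study: it assumes that the methylation change of a mixed sample is the fraction-weighted sum of the changes in its constituent cell types, that the reported percentages are absolute percentage points, and that cell fractions are the same in smokers and non-smokers. The function names are hypothetical; only the numeric values are taken from the text.

```python
def mixture_change(fractions, deltas):
    """Change in a mixed sample as the fraction-weighted sum of per-cell-type changes.

    Assumes simple linear mixing (an illustrative simplification).
    """
    return sum(f * d for f, d in zip(fractions, deltas))


def required_change_in_rest(observed, subset_fraction, subset_change):
    """Mean change the remaining cells must show so the mixture matches `observed`."""
    contribution = subset_fraction * subset_change
    return (observed - contribution) / (1.0 - subset_fraction)


# Values reported in the text for cg05575921 (AHRR), in percentage points.
PBMC_CHANGE = -15.7
GPR15_T_CELL_CHANGE = -12.7

# Lower and upper bound of the GPR15+ T cell share of PBMC given in the text.
for fraction in (0.06, 0.28):
    contribution = fraction * GPR15_T_CELL_CHANGE
    rest = required_change_in_rest(PBMC_CHANGE, fraction, GPR15_T_CELL_CHANGE)
    print(f"GPR15+ share {fraction:.0%}: contributes {contribution:+.2f}, "
          f"remaining PBMC must average {rest:+.2f}")
```

Under these assumptions, GPR15+ T cells contribute at most about −3.6 of the −15.7 percentage points, so the remaining PBMC would have to show an average change of roughly −15.9 to −16.9 percentage points. This is consistent with the assumption that other PBMC subsets, such as monocytes, also carry the change.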
In a comparison of smoking-induced changes in CpG methylation between whole blood and buccal samples, it was recently found that hypomethylated top-ranked CpGs (n = 19) were common to both tissues [25]. This might suggest that some CpG sites are affected by smoking exposure irrespective of tissue or cell type. However, because of (i) the absence of the expected T cell-specific cg19859270 in GPR15 [26], (ii) the presence of mainly granulocyte-specific top-ranked CpG sites (in AHRR, F2RL3, GFI1 and others), and (iii) the huge number of smoking-affected CpGs that differ between buccal and blood samples (see Additional file 6: Figure S1), this observation seems more likely to rely on the expected contamination of buccal samples with leukocytes, especially granulocytes [27], than on a common smoking-induced hypomethylation at identical CpG sites in both leukocytes and buccal epithelial cells. Additionally, because top-ranked CpGs in aryl-hydrocarbon receptor (AHR)-induced genes (CYP1A1, CYP1B1) were found in buccal samples but were not conspicuous in whole blood, the concomitant presence of hypomethylated CpGs in the AHR repressor gene (AHRR) in epithelial cells would not have been expected. A similar conclusion could also be drawn by comparing isolated PBMC with whole blood. Due to contamination of PBMC preparations with granulocytes (3 ± 2 %, according to the manufacturer’s preparation instruction), the most significant CpGs of whole blood were, as expected, also found in PBMC (i.e., cg05575921 in AHRR, cg09935388 in GFI1, and cg03636183 in F2RL3). Vice versa, the cell type-specific hypomethylation at cg19859270 (GPR15) of GPR15+ T cells (approximately 2–4 % of T cells in smokers) was found in whole blood of smokers.
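How much signal a contaminating cell type can carry into a preparation can be estimated in a simple way. The sketch below is a back-of-envelope illustration under the assumption that the carried-over change equals the contaminant fraction multiplied by the change in the contaminating cell type. The function name is hypothetical, the granulocyte value for cg05575921 and the contamination range (3 ± 2 %) are taken from the text, and the result is not a measurement from this study.

```python
# Change at cg05575921 (AHRR) in granulocytes of smokers, in percentage points (from the text).
GRANULOCYTE_CHANGE = -55.2


def carried_over_change(contaminant_fraction, contaminant_change):
    """Change contributed to a preparation by a contaminating cell type.

    Assumes linear mixing and no change in the contaminant fraction
    between smokers and non-smokers (illustrative simplification).
    """
    return contaminant_fraction * contaminant_change


# Granulocyte contamination of PBMC preparations: 3 +/- 2 %, i.e. 1 % to 5 %.
estimates = {
    fraction: carried_over_change(fraction, GRANULOCYTE_CHANGE)
    for fraction in (0.01, 0.03, 0.05)
}

for fraction, value in estimates.items():
    print(f"{fraction:.0%} granulocytes -> {value:+.2f} points in the PBMC preparation")
```

Under these assumptions, granulocyte contamination alone would carry roughly −0.6 to −2.8 percentage points into a PBMC preparation at cg05575921. This is large enough to make a strongly granulocyte-specific CpG detectable in PBMC.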
Cellular biomarker for tobacco smoking effects in blood
Based on epidemiological studies, a variety of CpG sites have been identified as potential candidates for monitoring current and life-time tobacco smoke exposure, among them several sites in F2RL3, cg05575921 in AHRR, and cg19859270 in GPR15. The CpG cg05575921 in AHRR showed the highest effect size [11, 12, 20] and has additionally become of special interest because its methylation level was found to change after only a short period of smoking. Philibert and colleagues postulated that the extent of AHRR methylation at cg05575921 in PBMC might be a potential biomarker for the initiation of smoking because it shows a significant hypomethylation in nascent smokers with a one pack-year exposure [11]. However, in that report, the methylation levels at one pack-year exposure overlapped broadly with those of non-smokers, indicating that subjects cannot be discriminated at the individual level by the cg05575921 methylation level in PBMC.
In contrast to the reported results at the population level, we aimed to identify biomarkers suitable for detecting tobacco smoke effects at the subject level. In this study, we have shown that the methylation level at cg05575921 in granulocytes as well as the amount of GPR15+ T cells may serve as highly sensitive and specific biomarkers in blood, indicating a disturbed homeostasis caused by tobacco smoking at the subject level.
Comparing both biomarkers, it was evident that the amount of GPR15+ T cells reflects the individual variation in response to smoking much better than the cg05575921 methylation level in granulocytes. It should be noted that it was not our intention to identify a biomarker correlating with the duration or the level of tobacco smoke exposure. In an era of personalized medicine, we rather focused on specific qualitative signs of disturbed homeostasis caused by tobacco smoking at the individual level. Whether GPR15+ T cells are activated and expanded as a consequence of lung inflammation, and whether the occurrence of this cell population in peripheral blood reflects the degree of already established lung inflammation, remains elusive. Further studies are necessary to validate whether the amount of GPR15+ T cells may in fact indicate lung inflammation/destruction by smoking.
Time course of methylation recovery
To exclude the possibility that the hypomethylation at cg05575921 in AHRR might be a long-lasting effect of maternal smoking during pregnancy, we investigated the time course of methylation from birth until the age of 4 years in children of mothers who smoked during pregnancy compared with children of non-smoking mothers. In general, cg05575921 was one of the most significant CpGs in cord blood affected by tobacco smoking throughout pregnancy [16, 18, 20]. However, in accordance with studies showing that DNA methylation in blood of adult smokers after cessation approaches the level of never smokers within a few years, but never completely reaches normal levels [6, 8], we have shown that the difference in methylation at cg05575921 in cord blood of newborns whose mothers had smoked during pregnancy disappeared by the age of 2 years. Because of the fast recovery during the first 2 years after birth, it seems unlikely that the hypomethylation at this CpG in adult smokers was caused by in utero exposure. In a further study including birth and the ages of 7 and 17 years, a full recovery of methylation at cg05575921 in whole blood toward the level of children not exposed to prenatal maternal smoke was not reached [21]. However, in this study, maximum methylation was reported for the 7-year-old children, followed by a slight decline up to the age of 17 years [21].
The reason for the shift in methylation at cg05575921 over time in unexposed children remains elusive. However, we think that the change in methylation over the first year of life might be a sign of physiological maturation of the immune system in general. Seemingly similar methylation changes were also found in GFI1 and CNTNAP2, showing an increase of methylation from birth to the age of 7 years [21]. Interestingly, all three genes additionally function as tumor suppressor genes (AHRR [28], GFI1 [29], CNTNAP2 [30]).
According to the time course of recovery of methylation after cessation in adults, the CpGs affected by tobacco smoking were classified by Guida and colleagues as persistent (cg19859270, GPR15; cg02657160, CPOX; cg05575921, AHRR; cg09935388, GFI1) or reversible (cg02319016, PAK2; cg13086586, PAICS) [8]. This classification relied on genome-wide methylation data. However, taking into account that smoking-induced GPR15+ T cells are mainly central or effector memory T cells [31] with an even longer life span, we would expect that all conspicuous CpGs in this cell type could be interpreted as persistent, including cg02319016 in PAK2. The mechanistic interpretation of persistent CpGs mainly found in granulocytes, such as cg09935388 in GFI1, remains unclear. In contrast to cells of the adaptive immune system, innate immune cells like granulocytes are expected to exhibit a comparable functional effector response each time the same pathogen is encountered [32]. Thus, activated immune cells might appear in peripheral blood as long as immune cell-activating tobacco combustion products, which are deposited in the lung during active smoking, are not eliminated.
Smoking-associated methylation changes in newborns
Comparing the most significant smoking-induced methylation changes in cord blood [16, 17, 19] with those of adult blood [6, 8], it is strikingly evident that in cord blood exclusively those CpG-annotated genes were highlighted which were dominantly regulated in granulocytes. In order of decreasing CpG significance, AHRR, GFI1, MYO1G, and CYP1A1 were the most conspicuous genes in cord blood with an altered methylation pattern following maternal smoking throughout pregnancy [16–18, 21]. According to our findings, the main smoking-induced methylation changes in CpGs of AHRR and GFI1 were attributed to granulocytes. MYO1G was identified in whole blood samples of adult smokers [6, 8] but not in PBMC [4], indicating granulocytes as the main source of the methylation change at this CpG. The biotransformation enzyme CYP1A1 is expressed in metabolically active granulocytes or monocytes rather than in lymphocytes [33, 34]. Additionally, it is of special interest that the smoking-induced hypomethylation at cg19859270 in GPR15+ T cells, representing adaptive immunity, was never found in cord blood. As a potential reason for the missing GPR15 activation in cord blood, the presence of a specific population of suppressor T cells protecting the fetus from an adaptive immune reaction, such as a graft-versus-host reaction in utero, has been discussed [35]. Such suppressor cells were not found in adult blood.
Missing in vitro effects of cigarette smoke extract
In a recent study, we have shown that cigarette smoke extract (CSE) failed to induce the expansion of GPR15+ T cells in vitro [23]. Thus, we hypothesized that GPR15+ T cells, rather than being directly activated by tobacco smoke exposure, are stimulated by antigen-presenting cells in the lung, such as Langerhans cells, indicating a disturbed lung epithelial homeostasis caused by tobacco smoking. In the current study, we have additionally shown that CSE failed to induce hypomethylation at cg05575921 in AHRR both in isolated granulocytes and in PBMC. This result again points to a more complex cellular mechanism leading to methylation changes in different cell types.
Limitations
Because we analyzed exclusively top-ranked CpGs in whole blood or PBMC samples, we did not adjust for cellular composition. This may appear to be a limitation of this study. However, first, it has been shown that top-ranked CpGs are not affected by the composition of the main blood cell types [16]. Second, identical smoking-induced top-ranked CpGs were repeatedly highlighted in different genome-wide association studies (GWAS) that did [3, 6] or did not [1, 2] consider the cell composition of blood. Third, we have shown that smoking-induced changes in methylation at top-ranked CpGs were present both in mixed-cell blood samples and in separated blood cell types. Altogether, this supports the plausibility of our findings even without consideration of the cell composition of blood. We have clearly shown that, to better understand the mechanism or causality of altered methylation patterns in blood following smoking exposure, the cellular origin of each significant CpG needs to be identified. This approach automatically clarifies whether a change in methylation at a given CpG relies on the cell composition of the blood. On the other hand, if adjustment had been performed for all known cell types of the blood, including the recently described smoking-induced GPR15+ T cells, then methylation changes at cg19859270 in GPR15 would not have been detected in any of the GWAS. To overcome this limitation of incomplete adjustment for cell composition and to strengthen the focus on identifying the biological cause of exposure-triggered methylation changes in blood, we generally advise against adjusting datasets obtained from whole blood for cell composition. An additional limitation was the lack of details about the smoking behavior of blood donors; thus, we could not relate the excess of GPR15+ cells among T cells to the daily frequency of smoking or to cigarette pack-years.
However, from a clinical perspective and for therapeutic purposes, the individual qualitative and quantitative endpoints are of greater interest than the tobacco smoke exposure leading to the phenotypic disturbance.