Investigating the DNA methylation profile of e-cigarette use

Background: Little evidence exists on the health effects of e-cigarette use. DNA methylation may serve as a biomarker for exposure and could be predictive of future health risk. We aimed to investigate the DNA methylation profile of e-cigarette use.
Results: Among 117 smokers, 117 non-smokers and 116 non-smoking vapers, we evaluated associations between e-cigarette use and epigenome-wide methylation from saliva. DNA methylation at 7 cytosine-phosphate-guanine sites (CpGs) was associated with e-cigarette use at p < 1 × 10⁻⁵ and none at p < 5.91 × 10⁻⁸. Thirteen CpGs were associated with smoking at p < 1 × 10⁻⁵ and one at p < 5.91 × 10⁻⁸. CpGs associated with e-cigarette use were largely distinct from those associated with smoking. There was strong enrichment of known smoking-related CpGs in the smokers but not the vapers. We also tested associations between e-cigarette use and methylation scores known to predict smoking and biological ageing. Methylation scores for smoking and biological ageing were similar between vapers and non-smokers. Higher levels of all smoking scores and a biological ageing score (GrimAge) were observed in smokers. A methylation score for e-cigarette use showed poor prediction internally (AUC 0.55, 0.41–0.69) and externally (AUC 0.57, 0.36–0.74) compared with a smoking score (AUCs 0.80) and was less able to discriminate lung squamous cell carcinoma from adjacent normal tissue (AUC 0.64, 0.52–0.76 versus AUC 0.73, 0.61–0.85).
Conclusions: The DNA methylation profile for e-cigarette use is largely distinct from that of cigarette smoking, did not replicate in independent samples, and was unable to discriminate lung cancer from normal tissue. The extent to which methylation related to long-term e-cigarette use translates into chronic effects requires further investigation.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13148-021-01174-7.


Introduction
Electronic cigarettes (e-cigarettes) have the potential to reduce smoking-related harm. Although little evidence currently exists on long-term effects, their lack of tar and very low levels of other dangerous substances [1] suggest they are considerably less harmful than cigarettes [2]. They have been shown to be an efficacious [3] and cost-effective [4] smoking cessation aid. While it will take years to fully estimate the impact of e-cigarette use on diseases including cancer, we can investigate whether it is associated with any biomarkers that may predict future health risk [5]. Recent studies have found a reduction in harmful biomarkers among e-cigarette users (vapers) compared with smokers, with some biomarkers showing levels similar to non-smokers [5][6][7]. However, only a few biomarkers have been investigated and all with relatively short half-lives [8,9], meaning their utility for predicting long-term effects of e-cigarettes may be limited.
DNA methylation is a type of epigenetic modification involving the addition of methyl groups to the DNA, which influences how the underlying sequence is interpreted and expressed. Pronounced differences in methylation have been found between cigarette smokers and non-smokers [10]. These differences have been replicated in different populations [11,12] and tissues [13], persist for several years post-cessation [10], are able to distinguish tumour from normal samples [14], and are predictive of disease and mortality [15,16]. Assessing the methylation profile of vaping could therefore inform our understanding of the potential biological impact of e-cigarette use and the relative health risks compared to cigarettes [17].
In this study, we explored whether e-cigarette use is associated with methylation in saliva and evaluated the degree of similarity between methylation profiles in vapers and cigarette smokers (compared with non-smokers). The investigation of methylation in saliva is supported by the overlap in methylation signals related to smoke exposure in blood and saliva [18], with one study demonstrating a stronger signal in buccal samples compared with matched blood samples [14]. We investigated associations between e-cigarette use and previously developed methylation scores used to predict smoking-related disease and mortality [12,16], and biological ageing [19,20]. We generated a novel methylation score for predicting e-cigarette use and assessed replication in an independent study. We also investigated whether the e-cigarette score was able to distinguish lung tumour and adjacent normal tissue to the same extent as a smoking score, in order to make inferences about the potential importance of e-cigarette-related methylation in lung cancer development.

Methods
The analysis plan was pre-registered [21] and is summarized in Additional file 1: Figure 1.

Study Design
The SEE-Cigs study (Studying the Epigenetics of E-cigarette Use) recruited vapers, smokers and non-smokers from the United Kingdom general population. It was important that vapers did not have a long previous smoking history, given the persistence of methylation marks associated with smoke exposure many years after cessation [10,22]. Vapers were therefore defined as having used e-cigarettes at least weekly for the past 6 months and having smoked < 100 times in their lifetime; smokers as having smoked at least weekly for the past 6 months and having used an e-cigarette < 100 times in their lifetime; and never-smokers as having smoked and/or used an e-cigarette < 100 times in their lifetime. We aimed to recruit 120 participants per group (vapers, smokers, non-smokers) to provide > 80% power to detect a 4.5% mean difference in methylation at p < 0.05 and > 80% power to detect an 11% mean difference at p < 1 × 10⁻⁶.
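The group definitions above can be read as a small decision rule. The following Python sketch is illustrative only and is not the study's screening procedure: the function name, argument names and the handling of dual users are assumptions, and the "and/or" in the never-smoker definition is interpreted here as requiring both lifetime counts to be below 100.

```python
from typing import Optional

LIFETIME_LIMIT = 100  # "< 100 times in their lifetime", as stated in the definitions


def classify_participant(
    weekly_vaper_6m: bool,
    weekly_smoker_6m: bool,
    lifetime_cigarettes: int,
    lifetime_vape_uses: int,
) -> Optional[str]:
    """Return 'vaper', 'smoker', 'non-smoker', or None if no definition applies.

    All argument names are hypothetical; they stand in for questionnaire items.
    """
    few_cigarettes = lifetime_cigarettes < LIFETIME_LIMIT
    few_vape_uses = lifetime_vape_uses < LIFETIME_LIMIT

    # Assumption: regular dual users fit none of the three groups.
    if weekly_vaper_6m and weekly_smoker_6m:
        return None
    if weekly_vaper_6m and few_cigarettes:
        return "vaper"
    if weekly_smoker_6m and few_vape_uses:
        return "smoker"
    # Assumption: never-smokers must be below the limit for both products.
    if not weekly_vaper_6m and not weekly_smoker_6m and few_cigarettes and few_vape_uses:
        return "non-smoker"
    return None


print(classify_participant(True, False, 10, 500))   # regular vaper, minimal smoking
print(classify_participant(True, False, 150, 500))  # vaper with a smoking history
```

Under these assumptions, a regular vaper who reports 150 lifetime cigarettes is returned as `None`, which mirrors the intent of excluding vapers with a substantial smoking history.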

Eligibility criteria
In order to maximise the chance that vapers had never been cigarette smokers, and to minimize confounding by age, we restricted eligibility to between 16 and 35 years old. Additional inclusion criteria were that the participants were in good physical and mental health (measured via self-report) and were able to give informed consent as judged by the investigator. Exclusion criteria obtained via self-report were: dependence on alcohol or drugs (other than nicotine); significant current or past illness (including cancer and type 1/type 2 diabetes); current pregnancy or breast feeding; having a related individual in the sample [21].

Recruitment
Participants were recruited via a number of mechanisms including from the student population at the University of Bristol, podcasts, blogs, posters/flyers in vape shops, and social media. Recruitment began in January 2017 and was completed in January 2019. The study protocol was originally published on the Open Science Framework on 19/01/2017. On 06/02/2018 we were granted ethics approval to relax the eligibility criteria related to age (16-35 years), and previous smoking history of the vapers and never-smokers (< 100 cigarettes in their lifetime) and vaping history of the smokers and never-smokers (vaped < 100 times in their lifetime). We stated in our original protocol that we would relax age criteria if recruitment stalled. Ethics approval for the study was granted by the Faculty of Science Human Research Ethics Committee at the University of Bristol.

Consent
All participants provided written informed consent.

Questionnaire
Participants answered questions about their smoking and vaping behavior, as well as socio-demographic and behavioural factors including age, gender, height and weight (from which body mass index (BMI) was calculated), ethnicity, educational attainment, occupation and household smoking. They were asked whether they currently used recreational drugs and if so, which recreational drugs they used (stimulants, cocaine, opiates, hallucinogens, cannabis, MDMA/ecstasy). Based on initial responses to the questionnaire, and in accordance with the eligibility criteria, participants were allocated to three participant groups (smokers, vapers and never-smokers). Participants in the 'smokers' category were asked whether they smoke cigarettes or roll-ups, whether they were daily or weekly smokers, and how many cigarettes they smoked per day/week as appropriate. They were asked at what time of day they smoke their first cigarette, for how long they had been a smoker, and details about whether they had plans to give up, or whether they had previously attempted to stop smoking. Participants in the 'vapers' category were asked about the type of device they used, the nicotine concentration they used most frequently, and whether they had changed the nicotine concentration in the past. If they reported using a refillable device, they were asked to estimate the volume of liquid used in an average day. Similar to the 'smokers' group, this group were asked the time of day that they first vape, how long they had been a vaper, and details about whether they had plans to give up vaping. Participants in the 'never-smokers' group were asked whether they had ever smoked or vaped, and how frequently, to ensure they met our inclusion/exclusion criteria.

Sample collection
After completing an online questionnaire, participants were screened for eligibility and sent an information sheet and consent form. On enrolling, participants were posted a study pack containing a saliva collection kit (DNA Genotek Oragene™) from which DNA was extracted and methylation was measured. We supplemented existing kit instructions with a simplified version to aid understanding and improve sample quality, which was posted to participants along with the kit, consent form and information sheet. We asked participants to provide 2 mL of saliva and return the kits through the post to the University of Bristol, where they were processed by the Bristol Bioresource Laboratories.

DNA methylation profiling
DNA was extracted from the saliva samples and underwent bisulphite conversion using the Zymo EZ DNA Methylation™ kit (Zymo, Irvine, CA). Genome-wide methylation status of over 850,000 cytosine-phosphate-guanine sites (CpGs) was measured using the Illumina HumanMethylationEPIC array according to standard protocol. DNA samples were loaded onto the Illumina HumanMethylationEPIC array in three batches with sampling criteria in place to ensure that all three groups were represented in each batch in order to minimise potential confounding by batch effects. In addition, during the data generation process a wide range of batch variables were recorded in a purpose-built laboratory information management system (LIMS), which also reported quality control (QC) metrics. Microarray data underwent quality control and normalization using meffil, an R package designed for pre-processing of large samples of Illumina Methylation BeadChip microarrays [23]. Sample outliers were identified and removed based on sex-chromosome methylation, methylated versus unmethylated intensity, control probes and detection p values (N = 10 exclusions in total: 4 vapers, 3 smokers and 3 non-smokers). Poor quality CpG sites, SNP/control probes and CpGs on the sex chromosomes were excluded, resulting in 846,244 CpG sites for analysis.

Estimated cell type proportions
A cell type reference for saliva was derived as part of meffil by combining a white blood cell type reference (GEO: GSE35069) and a buccal cell type reference (GEO: GSE48472). Estimated cell type proportions comprised: Buccal, CD4T, CD8T, Monocytes, B-cells, NK cells and Granulocytes.

Epigenome-wide association study (EWAS)
Multivariable linear regression was used to assess the differences in methylation at each measured CpG between (1) vapers versus non-smokers, (2) smokers versus non-smokers, and (3) smokers versus vapers, with adjustment for age, biological sex, BMI, educational attainment, household smoking, recreational drug use and 20 surrogate variables, using meffil [23]. We investigated CpGs which reached a Bonferroni-significance threshold of p < 5.91 × 10⁻⁸ (0.05/846,244 CpGs tested), as well as a less stringent threshold of p < 1 × 10⁻⁵. From these EWAS results, we identified differentially methylated regions (DMRs) using the dmrff R package [24]. DMRs were defined as regions containing at least two CpGs within 500 bp, each with EWAS meta-analysis p values < 0.05 and methylation changes in a consistent direction, and where the regional p value surpassed Bonferroni correction.
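As a worked check of the two thresholds, the short Python sketch below recomputes the Bonferroni cut-off from the number of CpGs tested and labels a p value against both thresholds. The function names and the label strings are illustrative assumptions; the numbers are those stated in the text.

```python
N_CPGS = 846_244     # CpG sites retained after quality control
ALPHA = 0.05         # family-wise error rate
SUGGESTIVE = 1e-5    # less stringent threshold used in the text


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def classify_p(p: float, n_tests: int = N_CPGS) -> str:
    """Label a p value as 'bonferroni', 'suggestive' or 'none' (labels are illustrative)."""
    if p < bonferroni_threshold(ALPHA, n_tests):
        return "bonferroni"
    if p < SUGGESTIVE:
        return "suggestive"
    return "none"


threshold = bonferroni_threshold(ALPHA, N_CPGS)
print(f"{threshold:.2e}")    # 5.91e-08
print(classify_p(6.43e-7))   # suggestive
```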
For the EWAS of vapers versus non-smokers, five additional models were run: (i) with adjustment for estimated cell type composition, (ii) with adjustment for self-reported smoking history (number of cigarettes), (iii) with adjustment for methylation at AHRR (cg05575921), an objective biomarker of smoke exposure [25], (iv) after excluding participants with < 60% salivary methylation at AHRR (cg05575921), indicative of a substantial smoking history [18], and (v) restricted to individuals of white ethnicity.
For the CpG sites identified in the EWAS of vapers versus non-smokers, and smokers versus non-smokers, we investigated whether there was evidence of a dose response in methylation levels based on the length of exposure history (6 months-1 year vs > 1 year for e-cigarette use, 6 months-5 years vs > 5 years for smoking).

Enrichment and annotation
From the EWAS results of (1) vapers versus non-smokers, and (2) smokers versus non-smokers, we investigated evidence for enrichment of associations among 2,623 and 1,501 smoking-related methylation sites identified in previous large-scale studies of blood [10] and buccal samples [14], using a Wilcoxon rank sum test.
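To illustrate the idea behind such an enrichment test, the Python sketch below compares the EWAS p values of CpGs in a known smoking-related set against all remaining CpGs using a two-sided Wilcoxon rank sum test. It is an assumption that p values are the quantity being ranked (the text does not specify this), and the implementation uses the normal approximation with tie correction and no continuity correction, which may differ from the software used in the study. The CpG identifiers and p values are made up.

```python
import math
from typing import Dict, Set, Tuple


def rank_sum_enrichment(pvals: Dict[str, float], known: Set[str]) -> Tuple[float, float]:
    """Two-sided Wilcoxon rank sum test via the normal approximation.

    Returns (z, p). A negative z means the known CpGs tend to have
    smaller EWAS p values than the remaining CpGs.
    """
    items = sorted(pvals.items(), key=lambda kv: kv[1])
    n = len(items)
    ranks: Dict[str, float] = {}
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and items[j + 1][1] == items[i][1]:
            j += 1
        midrank = (i + j) / 2 + 1          # average 1-based rank of the tied block
        t = j - i + 1
        tie_term += t**3 - t
        for k in range(i, j + 1):
            ranks[items[k][0]] = midrank
        i = j + 1

    n1 = sum(1 for cpg in pvals if cpg in known)
    n2 = n - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one CpG")

    w = sum(ranks[cpg] for cpg in pvals if cpg in known)   # rank sum of known CpGs
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w == 0:
        raise ValueError("all p values are identical")
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))                   # two-sided normal p value
    return z, p


# Toy data: 20 hypothetical CpGs, the 5 smallest p values belong to the known set.
toy_pvals = {f"cg{i:02d}": (i + 1) / 21 for i in range(20)}
toy_known = {"cg00", "cg01", "cg02", "cg03", "cg04"}
z, p = rank_sum_enrichment(toy_pvals, toy_known)
print(z < 0, p < 0.01)
```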
Although we excluded potential participants who reported drug dependence, given the widespread use of e-cigarettes for inhaling cannabinoids [26] and known impact of cannabis on DNA methylation levels [27], we assessed whether any of the 15 CpGs identified in an EWAS of cannabis [27] were associated with e-cigarette use after Bonferroni correction. We also ran an additional model for the EWAS of vapers versus non-smokers with adjustment for reported cannabis use. Further, we assessed whether any CpG sites associated with alcohol use in a previous EWAS [28] were associated with e-cigarette use after Bonferroni correction.
We also investigated whether there was any evidence for replication of 14 CpGs related to e-cigarettes in a previous EWAS [29], and assessed the extent to which the CpGs identified in our EWAS had been previously reported in relation to other traits in two publicly available repositories [30,31]. We explored the potential functions of the top 50 CpGs identified in each EWAS via GO and KEGG enrichment analysis using the missMethyl R package [32].

Methylation scores for smoking and epigenetic ageing
Methylation scores can be derived by summing methylation values at relevant CpGs previously identified in relation to a relevant exposure, weighted by the effect sizes observed in independent EWAS studies. Five methylation scores of smoke exposure [10,14,16,25,33] and four methylation scores of epigenetic ageing [33][34][35][36] were quantified.
We assessed associations between scores comprising methylation values derived from a weighted average of CpG sites found to be related to smoking in previous studies. This included scores derived from the CpG sites identified in EWAS conducted by Joehanes et al. [10] and Teschendorff et al. [14], as well as 233 and 172 CpG sites identified in McCartney et al. [16] and Lu et al. [33] respectively. The latter two studies used penalised regression models of smoking pack-years to identify CpG sites most predictive of smoke exposure. Finally, since the CpG site, cg05575921 (AHRR) contributed most weight to all of the methylation scores and has been proposed as an independent biomarker of smoking [25], we investigated this site as an additional biomarker. With the exception of AHRR, the other scores developed were linear combinations of methylation levels at the relevant CpG sites weighted by the effect sizes of sites identified in relation to smoking from the various studies [10,14,16,33].
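The weighted-sum construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the scoring code used in the study: the weights below are invented placeholders (not the published effect sizes), the second CpG identifier is hypothetical, and skipping CpGs that are missing from a sample is an assumption.

```python
from typing import Dict


def methylation_score(betas: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear combination of methylation beta values weighted by EWAS effect sizes."""
    shared = [cpg for cpg in weights if cpg in betas]
    if not shared:
        raise ValueError("no score CpGs are present in this sample")
    # Assumption: CpGs absent from the sample are skipped rather than imputed.
    return sum(weights[cpg] * betas[cpg] for cpg in shared)


# Placeholder weights for illustration only.
example_weights = {"cg05575921": -0.5, "cg_hypothetical_1": 0.2}
example_betas = {"cg05575921": 0.8, "cg_hypothetical_1": 0.5, "cg_unused": 0.3}

print(round(methylation_score(example_betas, example_weights), 3))  # -0.3
```

Because AHRR (cg05575921) is hypomethylated in smokers, a negative weight at this site means lower methylation raises the score, which is the direction one would expect for a smoking score.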
For epigenetic ageing, we assessed associations between two "first generation" epigenetic clocks derived from DNA methylation levels at CpG sites found to be strongly associated with chronological age [34,35], as well as two more recently derived clocks: one optimised to predict physiological dysregulation (PhenoAge) [36] and one optimised to predict lifespan (GrimAge) [33]. To generate the epigenetic ageing measures in SEE-Cigs, we uploaded DNA methylation data for a subset of CpG sites from the Illumina EPIC array to the online DNA Methylation Age Calculator (https://dnamage.genetics.ucla.edu/) developed by the Horvath lab. We also uploaded an annotation file, containing data on chronological age, sex and tissue type (saliva) for the samples. We were able to generate the following epigenetic ageing measures: intrinsic epigenetic age acceleration based on Horvath's multi-tissue predictor (IEAA) [34]; extrinsic epigenetic age acceleration (EEAA) based on Hannum's method, which up-weights the contribution of blood cell composition [37]; PhenoAge [36] and GrimAge [33]. Intrinsic epigenetic age acceleration (IEAA) is independent of changes in blood cell composition while extrinsic epigenetic age acceleration (EEAA) incorporates age-related changes in blood cell composition. PhenoAge and GrimAge can be considered as measures of extrinsic ageing.
Multivariable linear regression was used to assess differences in methylation scores between the three groups with adjustment for age, sex, BMI, educational attainment, household smoking and recreational drug use. Further analyses were restricted to individuals of white ethnicity only, with adjustment for methylation-derived smoking pack-years in the GrimAge model [33], and with adjustment for self-reported smoking history when evaluating methylation scores in relation to e-cigarettes.

Methylation score for e-cigarette use
We generated methylation scores for e-cigarette use and smoking within SEE-Cigs and then assessed their discriminative performance for predicting e-cigarette use and smoking within SEE-Cigs and in an independent dataset, the Avon Longitudinal Study of Parents and Children (ALSPAC) [38][39][40]. We also investigated whether the methylation score for e-cigarette use was able to discriminate tumour from normal tissue in lung to the same extent as the methylation score for smoking, using publicly available methylation data from The Cancer Genome Atlas (TCGA) [41].

SEE-Cigs
For internal validation of the methylation score of e-cigarette use, we used a training set (2/3 sample of vapers and non-smokers) and a testing set (1/3 sample of vapers and non-smokers) within the SEE-Cigs study. In the training set, we used the glmnet package in R to fit a generalized linear model (logistic regression) via penalized maximum likelihood, using three-fold cross-validation repeated 10 times to determine the lambda with the minimum average error. A methylation score was then generated based on the fitted object produced. The resulting methylation score comprised a sum of the beta-values of the included CpG sites. Its performance in predicting e-cigarette use was evaluated in the test set by generating a receiver operating characteristic (ROC) curve and evaluating the area under the curve (AUC) derived from the logistic regression model using the R package pROC (version 1.16.1). We compared the AUC obtained from this model with that from a similar model for predicting smoking. For this, we derived a methylation score for smoking using a training set (2/3 sample of smokers and non-smokers) and testing set (1/3 sample of smokers and non-smokers) within SEE-Cigs.
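The two evaluation steps described above (a 2/3 versus 1/3 split, then an AUC in the held-out third) can be sketched as follows. This Python example is illustrative only: it does not reproduce the glmnet model fitting or the pROC confidence intervals, the split is a simple unstratified shuffle with an arbitrary seed, and the AUC is computed from pairwise comparisons of cases and controls, which equals the area under the empirical ROC curve. Function names and participant identifiers are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def split_train_test(
    ids: Sequence[str], train_fraction: float = 2 / 3, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle participant IDs and split them into training and testing sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case scores higher than a random control.

    Ties count as one half. Labels are 1 for cases (e.g. vapers) and 0 for controls.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 116 vapers + 117 non-smokers, as in the SEE-Cigs analysis sample.
participant_ids = [f"id{i:03d}" for i in range(233)]
train_ids, test_ids = split_train_test(participant_ids)
print(len(train_ids), len(test_ids))          # 155 78

print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # 1.0
```

An AUC of 0.5 corresponds to chance-level discrimination, which is the reference point for interpreting the values reported for the e-cigarette score.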

Avon Longitudinal Study of Parents and Children
We next assessed external validation of methylation scores for e-cigarette use and smoking in the Avon Longitudinal Study of Parents and Children (ALSPAC). ALSPAC is a large, prospective cohort study based in the south-west of England. Pregnant women resident in Avon, UK with expected dates of delivery 1st April 1991 to 31st December 1992 were recruited and detailed information has been collected on these women and their offspring at regular intervals [38,39]. Additional offspring that were eligible to enroll in the study have been subsequently recruited at the ages of 7 and 18 years [40]. The additional enrolment provides a baseline sample of 14,901 offspring who were alive at 1 year of age. Please note that the study website contains details of all the data that are available through a fully searchable data dictionary and variable search tool (http://www.bristol.ac.uk/alspac/researchers/our-data). Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees. Consent for biological samples has been collected in accordance with the Human Tissue Act (2004). Informed consent for the use of data collected via questionnaires and clinics was obtained from participants following the recommendations of the ALSPAC Ethics and Law Committee at the time.
When the offspring were 24 years old, they were invited to attend the Focus@24+ clinic, which took place between June 2015 and October 2017. 4,026 were seen at this clinic, where fasting blood samples were taken. Blood samples from 570 of the offspring were selected for DNA methylation profiling to maximize overlap with existing DNA methylation profiles collected at younger ages as part of the Accessible Resource for Integrated Epigenomic Studies (ARIES) [42]. Following DNA extraction, samples were bisulphite-converted using the Zymo EZ DNA Methylation™ kit (Zymo, Irvine, CA). Genome-wide methylation was then measured using Illumina Infinium MethylationEPIC BeadChip arrays. The arrays were scanned using an Illumina iScan, with initial quality review using GenomeStudio. During the data generation process a wide range of batch variables were recorded in a LIMS, which also reported quality control metrics. Quality control and normalization were then carried out using the meffil R package [23]. Quality control included checks for sample swaps using genotype matching and sex prediction, methylated versus unmethylated signal outliers, dye bias, poor probe signal detection, and low bead numbers. Only one sample failed and was excluded due to evidence of being a sample swap and having multiple control probe outliers. The 569 samples that passed were normalized in meffil with functional normalization, using the top 20 control probe principal components and sample plate as a random effect.
At the same time point when the offspring were 24 years old, information on smoking and e-cigarette use was obtained from a questionnaire that was completed by 458 of the offspring with DNA methylation data. Questionnaire data were collected and managed using REDCap electronic data capture tools hosted at the University of Bristol [43]. The participants were asked whether they had ever smoked or used an e-cigarette, as well as about the frequency and duration of use. From these details, three groups of participants were determined: (1) vapers (current users of electronic cigarettes or other vaping devices, n = 14); (2) smokers (current daily or weekly smokers, n = 47); and (3) non-smokers (smoked < 100 cigarettes in their lifetime and never used an electronic cigarette or other vaping device, n = 262).
We assessed the discriminative performance of a methylation score generated in SEE-Cigs for predicting e-cigarette use (vs. non-smoking) in the ALSPAC cohort, which was compared with a methylation score for predicting smoking (vs. non-smoking). Both e-cigarette and smoking methylation scores were obtained using the same approach described above in the full sample of (i) vapers versus non-smokers and (ii) smokers versus non-smokers, respectively, in SEE-Cigs.

The Cancer Genome Atlas
We used data on 27 individuals with lung adenocarcinoma (LUAD) and 41 individuals with lung squamous cell carcinoma (LUSC) who had Illumina Infinium 450K DNA methylation measured in both tumour and adjacent normal samples as part of The Cancer Genome Atlas (TCGA). Again, a methylation score was generated on the full sample of vapers versus non-smokers in SEE-Cigs, this time restricted to CpG sites that were also present on the 450K array. This was compared with a methylation score for smoking generated on the full sample of smokers versus non-smokers, again restricted to 450K CpG sites.

Results
Figure 1 shows the participant flow for the SEE-Cigs study. The final sample consisted of 117 smokers, 117 non-smokers and 116 vapers with methylation data. Descriptive characteristics are displayed in Table 1. Compared with non-smokers, vapers were more likely to have higher BMI, be male, have lower educational attainment and be more exposed to household smoke. Smokers were more likely to be male, have lower educational attainment, be more exposed to household smoke and to use drugs recreationally. Smokers were slightly older on average than non-smokers and vapers. The majority of participants were of white ethnicity, with a slightly higher proportion of non-white individuals among the non-smokers. Smokers had smoked for a median of 1.20 (IQR = 0.38-3.15) pack-years, while both non-smokers and vapers reported a minimal smoking history. This was verified based on levels of AHRR methylation (cg05575921), for which 3 non-smokers and 5 vapers had < 60% salivary methylation, indicative of previous smoking (Fig. 2). There were no differences in cell type proportions of the saliva samples obtained from the participants. Most vapers used e-cigarettes containing nicotine and vaped daily.
Apart from associations at AHRR in the models involving smoking, there was limited overlap in the top CpGs identified in the three EWAS (Fig. 3, Additional file 1: Figure 2). Nine DMRs were found in common between at least two of the EWAS models (Fig. 3). One DMR was hypermethylated (chr20:BLCAP;NNAT) and two hypomethylated (chr20:SLC2A10 and chr3:THRB) in non-smokers compared with vapers and smokers. Two DMRs were hypermethylated (chr10 and chr3:CACNA1D) and two were hypomethylated (chr17:BRCA1;NBR2 and chr6:PRRT1;PPT2) in smokers compared with non-smokers and vapers. Two DMRs were hypermethylated in vapers compared with smokers and non-smokers (chr10:ANXA11;LINC00857 and chr17:HSPB9;KAT2A). For the 7 CpG sites associated with vaping (vs. non-smoking), the direction of effect was consistent irrespective of vaping duration. The magnitude of effect was typically larger among those participants reporting to have vaped for > 1 year compared with those vaping for 6 months-1 year, with the exception of cg10440286 (ELFN2). For the 13 CpG sites associated with smoking (vs. non-smoking), the direction of effect was consistent irrespective of smoking duration. The magnitude of effect was larger among those participants reporting to have smoked for > 5 years compared with those smoking for 6 months-5 years at some (e.g. cg05575921 (AHRR), cg21732535 (PTH2R)), but not all (e.g. cg23771956, cg13159505 (RPTOR)) sites (Additional file 1: Figure 3).
One CpG previously associated with cannabis was found to be associated with e-cigarettes after Bonferroni correction (cg04180046; p = 0.0029). This CpG has also been previously associated with smoke exposure [10], and there was no difference in methylation at this site between vapers and smokers in the present study (p = 0.865) (Additional file 2: Table 5). No CpGs previously associated with alcohol use were found to be associated with e-cigarettes after Bonferroni correction (p > 0.005) (Additional file 2: Table 6). We also found little evidence of associations between 14 CpGs previously found in relation to vaping [29] in any of the EWAS after Bonferroni correction (p > 0.01) (Additional file 2: Table 7).
Three of the seven CpGs associated with e-cigarettes have been identified in previous EWAS for smoking, Down syndrome, systemic corticosteroid, prostate cancer, gestational age and fetal versus adult liver (Additional file 2: Tables 8 and 9). We found limited enrichment for KEGG pathways or GO terms (false discovery rate, FDR p > 0.05) (Additional file 2: Tables 10-13). In relation to e-cigarette use, response to ethanol/alcohol, positive regulation of insulin secretion and GABA transport were the top GO terms. Butanoate metabolism, synaptic vesicle cycle and GABAergic synapse were the top KEGG pathways.
Fig. 3 Comparison of epigenome-wide association studies. A EWAS for e-cigarette use (vs. non-smoking) and smoking (vs. non-smoking). B EWAS for e-cigarette use (vs. non-smoking) and e-cigarette use (vs. smoking). C EWAS for smoking (vs. non-smoking) and e-cigarette use (vs. smoking)

Discussion
Among 117 smokers, 117 non-smokers and 116 vapers with a limited smoking history, we found that salivary methylation signals of e-cigarette use were weak and largely distinct from those established in relation to cigarette smoking. The top 3 CpGs for vaping were located in protein-coding genes for a ribonuclease P/MRP subunit (RPP14) (p = 6.43 × 10⁻⁷), an insulin-like growth factor receptor (IGF1R) (p = 5.47 × 10⁻⁶) and a gamma-aminobutyric acid (GABA) A receptor (GABRP) (p = 2.77 × 10⁻⁶). The top DMR was located in MUC4 encoding Mucin 4 (p = 4.13 × 10⁻¹⁸), an integral membrane glycoprotein present in mucus that is upregulated in vapers [44]. Of the DMRs found to be differentially methylated in vapers compared with both smokers and non-smokers, ANXA11 may play an important role in lung function: variation in this gene has been associated with sarcoidosis (mainly affecting the lung) [45] and chronic obstructive pulmonary disease-related biomarkers [46]. Response to ethanol/alcohol, positive regulation of insulin secretion, GABA transport and butanoate metabolism were among the most enriched pathways, reflecting biological responses to e-cigarette constituents (ethyl alcohol, ethyl butyrate and nicotine). Methylation scores for smoking and biological ageing were similar between vapers and non-smokers. Higher levels of a biological ageing score (GrimAge) were observed in smokers. Finally, a methylation score generated to index e-cigarette use poorly discriminated vapers from non-smokers in SEE-Cigs and in an independent dataset (ALSPAC), which was in contrast to a methylation score generated to index smoking. The smoking methylation score also showed better discrimination of tumour and adjacent normal tissue in lung squamous cell cases compared with the e-cigarette methylation score.
In contrast to our findings, two studies (comprising 32 and 45 participants, respectively) found associations between e-cigarettes and methylation levels which overlap with smoking-related signals [29,47]. We were also unable to replicate methylation differences for 14 CpGs previously related to e-cigarettes [29]. However, it is important to highlight that the vapers included in the previous studies were not selected for smoking history as stringently as in the current study and former smokers were likely to comprise a substantial proportion of the sample.
Two studies showing a weak methylation profile related to smokeless sources of nicotine are supportive of our results [48,49]. However, both studies only investigated peripheral blood and not tissue-specific methylation at the site of exposure (e.g. saliva).
Acceleration of a biological ageing methylation score in smokers but not vapers is of interest since such markers are predictive of age-related disease and mortality independent of chronological age [37,50,51]. Discrimination of lung tumour and adjacent samples by a salivary-based methylation score for smoking is supported by previous findings [14]. The lack of discrimination by the e-cigarette methylation score could indicate that smoking-related methylation changes may be more relevant to tumourigenesis than changes related to e-cigarettes. However, the smoking methylation score generated in the present study was lower in lung squamous cell carcinoma relative to normal tissue, the inverse of what was expected due to higher levels being observed in smokers [14]. We have previously found methylation in tumour tissue is in the opposite direction to that observed in relation to smoking for AHRR (cg05575921) [52] (Additional file 1: Figure 7), the CpG contributing most weight to the methylation score for smoking (Additional file 2: Table 19).
Major strengths relate to the design of the study, including the recruitment of individuals with a limited smoking history and the assessment of methylation levels in an easily accessible and exposure-relevant tissue for investigating epigenetic profiles of e-cigarettes. Limitations include the representativeness of our study sample, with demographic characteristics different to the general population due to the strict inclusion criteria. The young age of the study sample (mean age = 21 years) and limited smoking and vaping history could hamper the detection of methylation signals. This may explain why fewer CpGs were identified in relation to smoking than anticipated based on previous studies of oral samples [14]. Nonetheless, enrichment of smoking-related CpGs among the smokers indicates that these signals were present but more weakly associated. Furthermore, small sample size could have hindered the detection and replication of e-cigarette methylation signals, in particular in the external validation analysis in ALSPAC where methylation data were available for only 14 vapers. Similar enrichment for smoking-related CpGs was not found among the vapers, indicating that the methylation signature of e-cigarettes is distinct from that of smoking, and that vapers in the present study were accurate in reporting their limited smoking history, despite this not being biochemically verified.
While it appears that the methylation profile of vapers is less pronounced than that of smokers, the methylation changes associated with e-cigarettes may be commensurate in scale with other lifestyle exposures and replication of the signals identified in relation to e-cigarettes in larger studies is warranted. In addition, future studies may benefit from comparing saliva methylation patterns in e-cigarette users with those from other sample types, such as blood, since some markers may perform better as predictors when measured in whole blood [18].
Additional research in cohort studies is required to investigate methylation changes among ex-smokers quitting with different methods, including e-cigarettes. While findings from this study suggest that e-cigarettes may have distinct health effects from cigarettes, we cannot provide robust conclusions regarding the safety of e-cigarettes. Furthermore, although the methylation changes identified in relation to both smoking and e-cigarettes may be predictive of future disease risk, the causal consequences of these methylation changes on health outcomes are currently uncertain [52].

Conclusions
Findings from this study suggest that e-cigarette use does not impact saliva methylation in the same way as cigarette smoking. Unlike for smoking, the methylation profile for e-cigarettes did not replicate in independent samples and was not able to discriminate cancer from normal tissue. However, the short duration of e-cigarette use by the study participants and sample size may have hampered the detection of signals. Further studies are required to detect a robust methylation signature for long-term e-cigarette use. The extent to which methylation related to e-cigarette use translates into chronic effects and relevant health outcomes should also be investigated.