
Fathers’ preconception smoking and offspring DNA methylation



Experimental studies suggest that exposures may impact respiratory health across generations via epigenetic changes transmitted specifically through male germ cells. Studies in humans are, however, limited. We aim to identify epigenetic marks in offspring associated with father’s preconception smoking.


We conducted epigenome-wide association studies (EWAS) in the RHINESSA cohort (7–50 years) on father’s any preconception smoking (n = 875 offspring) and father’s pubertal onset smoking < 15 years (n = 304), using Infinium MethylationEPIC Beadchip arrays, adjusting for offspring age, own smoking and maternal smoking. EWAS of maternal and offspring personal smoking were performed for comparison. Father’s smoking-associated dmCpGs were checked in subpopulations of offspring who reported no personal smoking and no maternal smoking exposure.


Father’s smoking commencing preconception was associated with methylation of blood DNA in offspring at two cytosine-phosphate-guanine sites (CpGs) (false discovery rate (FDR) < 0.05) in PRR5 and CENPP. Father’s pubertal onset smoking was associated with 19 CpGs (FDR < 0.05) mapped to 14 genes (TLR9, DNTT, FAM53B, NCAPG2, PSTPIP2, MBIP, C2orf39, NTRK2, DNAJC14, CDO1, PRAP1, TPCN1, IRS1 and CSF1R). These differentially methylated sites were hypermethylated and associated with promoter regions capable of gene silencing. Some of these sites were associated with offspring outcomes in this cohort including ever-asthma (NTRK2), ever-wheezing (DNAJC14, TPCN1), weight (FAM53B, NTRK2) and BMI (FAM53B, NTRK2) (p < 0.05). Pathway analysis showed enrichment for gene ontology pathways including regulation of gene expression, inflammation and innate immune responses. Father’s smoking-associated sites did not overlap with dmCpGs identified in EWAS of personal and maternal smoking (FDR < 0.05), and all sites remained significant (p < 0.05) in analyses of offspring with no personal smoking and no maternal smoking exposure.


Father’s preconception smoking, particularly in puberty, is associated with offspring DNA methylation, providing evidence that epigenetic mechanisms may underlie epidemiological observations that pubertal paternal smoking increases risk of offspring asthma, low lung function and obesity.


There is growing consensus that perturbations of the epigenome through parental exposures even before their offspring are conceived may explain some of the variation in the heritability of health and disease not captured by genome-wide association studies (GWAS). The period of puberty in future parents, in particular fathers, may represent a critical window of physiological change and epigenetic reprogramming events, which may increase the individual’s susceptibility for environmental exposures to be embodied in the developing gametes [1, 2]. Animal and human studies have shown that prenatal as well as personal exposure to smoking are associated with epigenetic modifications that impact on sperm count and quality [3]. There is now growing interest in how epigenetic modifications, such as DNA methylation (DNAm), related to the parental preconception period may influence the health of the next generation [4].

Although smoking rates are generally declining, smoking commencing before the age of 15 is increasing in European countries [5, 6]. Epidemiological studies have demonstrated that father’s smoking in adolescent years may be a causal factor for poorer respiratory health in offspring. Both fathers’ smoking initiation before age 15 and smoking duration before conception have been associated with more asthma and lower lung function in offspring [7,8,9]. Father’s preconception smoking onset has also been associated with higher body fat mass in sons [10,11,12,13].

Epigenome-wide association studies (EWAS) have identified extensive methylation biomarkers associated with personal smoking [14], all-cause mortality in current and former smokers, as well as mother’s smoking during pregnancy [15,16,17]. While previous studies have identified DNA methylation signals in offspring blood [16] and cord blood [17] related to father’s smoking, they have not specifically investigated the timing of exposure, partly because detailed smoking information from fathers is rarely available [18]. Methylation markers associated with paternal preconception smoking could have an important role in elucidating long-term effects on the offspring epigenome, with the potential for developing efficient intervention programmes and improved public health.

This study aimed to investigate whether DNA methylation measured in offspring blood is associated with fathers’ smoking commencing before conception, and in particular, with fathers’ smoking starting in (pre)pubertal years (before age 15). We hypothesized that epigenetic changes involving DNA methylation may explain the molecular mechanisms underlying the association between fathers’ smoking preconception and offspring health observed in epidemiological studies. Additionally, we hypothesized that fathers’ smoking in the critical window of early puberty, as compared to smoking initiated at a later age, may have a more significant impact on the offspring epigenome. In a two-generation cohort, we sought to identify the DNA methylation changes in offspring blood (aged 7–50 years) associated with (1) father’s smoking onset preconception compared with never or later onset smoking and (2) father’s smoking onset before age 15 compared with never smoking. Finally, given the range of epidemiological studies reporting sex-specific outcomes in the offspring [10, 19], we wanted to explore whether patterns of associations between fathers’ preconception smoking and offspring DNA methylation were different for sons and daughters.


Study design and data

We used data and samples from offspring, aged 7–50 years, who participated in the RHINESSA study. Parent data, including detailed information on smoking habits, were retrieved from the population-based European Community Respiratory Health Survey (ECRHS) and/or the Respiratory Health in Northern Europe (RHINE) studies. This analysis comprised 875 offspring-parent pairs with complete information, from six study centres with available peripheral blood for offspring (Aarhus, Denmark; Albacete/Huelva, Spain; Bergen, Norway; Melbourne, Australia; Tartu, Estonia). All participants were of Caucasian ancestry. Medical research committees in each study centre approved the studies, and each participant gave written consent. The ethical approval reference numbers are listed on

Father’s smoking and age of starting/quitting were reported in interviews/questionnaires and related to offspring’s birth year, to define the categories: never smoked (n = 547), any preconception smoking (n = 328), preconception smoking with onset < 15 years (pubertal smoking) (n = 64) (cut point based on mean age of voice break 14.5 years, first nocturnal seminal emission 14.8 years). Personal smoking was classified as current, ex- or never smoking. Maternal smoking was defined by offspring’s report of mothers’ smoking during their childhood/pregnancy.
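The exposure categories described above can be sketched in code. The following is a minimal illustration only, not the authors' implementation: the function name, argument names and return labels are hypothetical, and the two-year lag before the offspring's birth year is taken from the description of preconception smoking given in the Results. Here the categories are mutually exclusive, so "any preconception smoking" corresponds to the union of the two preconception labels.

```python
from typing import Optional


def classify_father_smoking(
    father_birth_year: int,
    offspring_birth_year: int,
    age_started: Optional[int],
    conception_lag_years: int = 2,   # assumption: start before birth year minus 2
    pubertal_cutoff_age: int = 15,   # onset < 15 years counts as pubertal smoking
) -> str:
    """Return a hypothetical exposure label for one father-offspring pair."""
    if age_started is None:
        return "never"
    start_year = father_birth_year + age_started
    preconception_limit = offspring_birth_year - conception_lag_years
    if start_year >= preconception_limit:
        # Smoking began around or after conception: not a preconception exposure.
        return "later_onset"
    if age_started < pubertal_cutoff_age:
        return "pubertal_preconception"
    return "preconception"


def is_any_preconception(label: str) -> bool:
    """Any preconception smoking includes the pubertal-onset subgroup."""
    return label in {"pubertal_preconception", "preconception"}
```

In the second EWAS, only pairs labelled as pubertal onset or never smoking would be retained for comparison.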

Methylation profiling and processing

DNAm in offspring was measured in 1 µg of DNA extracted from peripheral blood, using a simple salting out procedure [20]. Bisulphite conversion was undertaken using EZ 96-DNA methylation kits (Zymo Research, Irvine, CA, USA) at the Oxford Genomics Centre (Oxford, UK) and methylation assessed using Illumina Infinium MethylationEPIC Beadchip arrays (Illumina, Inc. CA, USA) with samples randomly distributed on microarrays to control against batch effects.

Data analysis was undertaken using the statistical computing program R, version 3.6.1 [21]. Methylation intensity files were processed and quality was assessed using minfi [22] and meffil [23]. Samples were screened for outliers using density and multidimensional scaling plots, methylated vs unmethylated ratio plots, sex mismatch and sex outlier checks, control probes and bisulphite conversion efficiency. Normalization was carried out using BMIQ [24], which adjusts the intra-sample beta-values of type 2 design probes to match those of type 1 probes. To remove technical variation detected at p value 1 × 10⁻¹⁰ by the champ.SVD function within the ChAMP package [25], ComBat from SVA [25] was applied on sample batch and slide.

Probes were excluded from analysis using the following criteria: probes with a detection p value above 0.01 in one or more samples (n = 27,206 probes), probes with a beadcount < 3 in at least 5% of samples (n = 1451), non-cg probes (n = 2580), probes with SNPs as identified in Zhou et al. [26] (n = 92,403), probes with multiple hybridization locations as identified in Nordlund et al. [27] (n = 51), probes on the X or Y chromosome (n = 15,776) and cross-reactive probes on the EPIC array (n = 2368) as identified by Pidsley et al. [28]. A total of 724,292 probes were used for downstream analysis. Cell-type proportions were estimated using EpiDISH (Epigenetics Dissection of Intra-Sample Heterogeneity) [29].
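The exclusion criteria above amount to a per-probe filter. A minimal sketch follows, assuming each probe comes with its chromosome, per-sample detection p values and per-sample bead counts. The function and parameter names are hypothetical, and the list-based exclusions (SNP, multi-mapping and cross-reactive probes) are represented by a generic set of identifiers rather than the published lists.

```python
def keep_probe(
    probe_id,
    chrom,
    detection_p,             # one detection p value per sample
    beadcounts,              # one bead count per sample
    blacklist=frozenset(),   # assumption: IDs taken from published exclusion lists
    p_max=0.01,
    min_beads=3,
    max_low_bead_frac=0.05,
):
    """Return True if the probe passes all exclusion criteria."""
    if not probe_id.startswith("cg"):
        return False  # non-cg probe
    if chrom in {"X", "Y"}:
        return False  # sex-chromosome probe
    if probe_id in blacklist:
        return False  # SNP, multi-mapping or cross-reactive probe
    if any(p > p_max for p in detection_p):
        return False  # failed detection in one or more samples
    low = sum(1 for b in beadcounts if b < min_beads)
    if low / len(beadcounts) >= max_low_bead_frac:
        return False  # low bead count in at least 5% of samples
    return True
```

Applied across the array, such a filter would yield the set of probes carried forward to the EWAS.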

Statistical analysis

We ran two EWAS with father’s preconception smoking as exposure (any preconception smoking and pubertal smoking, that is, onset before age 15) and DNA methylation as outcome. To identify differentially methylated cytosine-phosphate-guanine (CpG) sites (dmCpGs), robust multiple linear regression models were applied on beta-values using limma [30], adjusting for offspring’s sex, age, personal and mother’s smoking and cell-type proportions (B cells, natural killer cells, CD4 T cells, CD8 T cells, monocytes, neutrophils), at a significance level of false discovery rate (FDR) [31] corrected p value < 0.05. Eosinophils were not included due to a very low estimate and to avoid potential multicollinearity. In additional analyses, associations between fathers’ any preconception smoking and offspring’s DNA methylation were also stratified by offspring sex.
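To make the FDR threshold concrete, the sketch below adjusts a list of p values, one per CpG. It assumes the correction is the Benjamini–Hochberg step-up procedure, which is the default adjustment method in limma's topTable; the text does not name the procedure explicitly, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_sites(site_ids, p_values, fdr=0.05):
    """Return the IDs whose adjusted p value falls below the FDR threshold."""
    adjusted = benjamini_hochberg(p_values)
    return [s for s, q in zip(site_ids, adjusted) if q < fdr]
```

A CpG would be reported as a dmCpG when its adjusted p value is below 0.05.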

Manhattan plots were generated using qqman [32] and a circos plot with the CMplot R package [33]. Inflation from systematic biases was adjusted using BACON [34]. Differentially methylated regions were detected using dmrff [35] and DMRcate [36]. Transcription factor binding site prediction was performed using eFORGE TF [37]. Gene-disease associations were identified using Open Targets [38]. Identified dmCpGs were compared against the EWAS atlas for association with known biological traits [39]. To gain biological insight regarding the dmCpGs mapped to genes, gene interactors were identified using STRING [40] and enrichment was performed using UniprotR [41] and gometh [42]. Biological interpretation of significant differentially methylated CpGs (dmCpGs) is detailed in the supplementary methods.
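For orientation, the inflation factor λ is conventionally computed as the median of the observed chi-square test statistics divided by the median of a chi-square distribution with one degree of freedom. The sketch below shows this conventional calculation from per-CpG z-scores. It is not the BACON method, which estimates bias and inflation from a Bayesian empirical null distribution, and the function name is hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549),
# obtained as the squared 75th percentile of the standard normal distribution.
CHI2_1DF_MEDIAN = statistics.NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(z_scores):
    """Conventional inflation factor: median observed chi-square over expected."""
    chi2 = [z * z for z in z_scores]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN
```

Values close to 1 indicate test statistics in line with the null expectation; larger values suggest inflation.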

To further investigate whether the identified dmCpGs were associated with respiratory outcomes and weight in the offspring, we conducted regression analysis between offspring’s DNA methylation signals and offspring’s own reports of ever-asthma, ever-wheeze, weight and BMI, while accounting for offspring sex.

We constructed two additional EWAS on offspring personal as well as maternal smoking to assess the shared count and overlap of dmCpGs (FDR < 0.05) between each EWAS and to allow for comparison with dmCpGs identified as related to father’s preconception and pubertal smoking. To address potential confounding by offspring personal smoking and maternal smoking, association of the detected dmCpGs was also checked in subpopulations of offspring who reported no personal smoking exposure and offspring with no maternal smoke exposure.

Our EWAS results were also compared with findings from meta-analyses of EPIC array DNA methylation associated with personal smoking from four population-based cohorts [43], personal smoking-methylation effects from 16 cohorts using 450K arrays [14] and the Pregnancy and Childhood Epigenetics Consortium (PACE) meta-analysis of mother’s smoking in pregnancy on offspring cord blood methylation [15].

Replication analysis

Replication was carried out in the ALSPAC (Avon Longitudinal Study of Parents and Children) cohort, using DNA methylation data from whole blood measured at age 15–17 and adjusting for predicted cell count proportions, batch effects (plate), maternal smoking during pregnancy, self-reported own smoking, age and sex. A description of the ALSPAC cohort is provided in the supplementary methods. T tests were used to compare the regression coefficients of RHINESSA’s dmCpG sites at FDR < 0.05, and of the top 100 CpG sites, with those from ALSPAC. Sign tests were used to test the direction of association.
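The direction-of-effect comparison can be illustrated with an exact binomial sign test and a Pearson correlation of effect estimates between cohorts. This is a sketch under stated assumptions: the function names are hypothetical, the test is taken to be two-sided against a 50% chance of concordance, and the inputs would be paired regression coefficients for the same CpG sites in the two cohorts.

```python
from math import comb, sqrt


def sign_test_p(n_concordant, n_total):
    """Two-sided exact binomial sign test against a concordance probability of 0.5."""
    k = max(n_concordant, n_total - n_concordant)
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2 * tail)


def count_concordant(effects_a, effects_b):
    """Number of sites whose effect estimates share the same sign in both cohorts."""
    return sum(1 for a, b in zip(effects_a, effects_b) if a * b > 0)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of effect estimates."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

A small sign-test p value indicates that more sites share a direction of effect than would be expected by chance, even when no individual site reaches genome-wide significance in the replication cohort.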

Sensitivity analyses

To assess whether fathers’ smoking-related dmCpGs were potentially confounded by the effect of social class, father’s educational level, a surrogate measure of socioeconomic background, was used as an independent variable and regressed against the identified dmCpGs. The impact of offspring’s age was also investigated more extensively in subsequent analyses, by correlating known age-related CpG markers from the RHINESSA EWAS study with the top CpGs identified as related to fathers’ smoking, as well as with the age of the offspring.


The analysis included 875 RHINESSA participants (Table 1A), 457 males and 418 females, aged 7 to 50 years. Of these 328 had a father who had ever smoked before conception (father starting smoking before the birth year of offspring minus 2 years) of which 64 had started before age 15 years; 263 had a mother who had ever smoked, and 240 had smoked themselves. Characteristics are also given for the sub-sample of 304 offspring whose father either had started smoking before age 15 years or never smoked (before or after conception of the offspring) (Table 1B).

Table 1 A and B General characteristics of study participants from the RHINESSA study with complete data on offspring DNA methylation and father’s age of onset of tobacco smoking

Epigenome-wide association analysis of preconception father’s smoking

Epigenome-wide association between father’s any preconception smoking and offspring DNA methylation identified two dmCpGs (inflation λ = 1.187); cg00870527 mapped to PRR5 and cg08541349 mapped to CENPP (Table 2A and Additional file 1: table E1). The genome-wide distribution of associated dmCpGs is shown in Fig. 1A. Figure 2A shows a comparison of methylation beta-values between the never- and ever-smoked groups for two CpG sites. In both cases, the methylation values are significantly lower in the offspring of ‘ever-smoked’ fathers; cg00870527 in PRR5 (p value  = 0.0003) and cg08541349 in CENPP (p value  = 0.0000092).

Table 2 A and B. CpG sites associated with father’s smoking at genome-wide significance (FDR < 0.05)
Fig. 1

Manhattan plot for genome-wide distribution of dmCpGs. A: for father’s any preconception smoking and B: father’s pubertal smoking starting before age 15. The red line shows genome-wide significance, the blue is the suggestive line. The y-axis represents − log10 of the p value for each dmCpG (indicated by dots) showing the strength of association. The x-axis shows the position across autosomal chromosomes. The top dmCpGs on each chromosome were annotated to the closest gene

Fig. 2

Box plots showing distribution of methylation levels (beta-values) by significant dmCpGs sites in the EWAS. A: father’s any preconception smoking and B: for father’s pubertal smoking starting before age 15. The comparison p value between never-smoking exposed and smoking exposed offspring is shown above the box plot for each dmCpG

In sex-stratified analysis, in males (n = 457) we identified four dmCpGs mapped to KCNJ1, GRAMD4, TRIM2 and MYADML2. In females (n = 418) there was one dmCpG mapped to LEPROTL1 (FDR ≤ 0.05) (Additional file 1: Table E2). All sex-specific dmCpGs were hypomethylated.

To specifically determine the signature related to father’s early onset smoking, we compared methylation differences between offspring of fathers who started to smoke < 15 years (n = 64) with offspring of never smoking fathers (n = 240). We identified 55 dmCpGs at FDR < 0.05 (λ = 1.44) showing genome-wide significance. After adjusting for inflation using BACON, 19 dmCpGs showed significant association at FDR < 0.05 with λ = 1.29 (Table 2B, Fig. 1B and Additional file 1: Table E3). These dmCpGs were mapped to 14 known genes and 5 intergenic regions. The genes include TLR9, DNTT, FAM53B, NCAPG2, MBIP, C2orf39, NTRK2, DNAJC14, CDO1, PRAP1, TPCN1, IRS1, PSTPIP2 and CSF1R. All hits were hypermethylated in the exposed group. The comparison of methylation distribution between the never-exposed and smoke-exposed groups is shown in Fig. 2B.

The dmCpGs associated with father’s preconception smoking were mainly located in open-sea genomic features and enriched for promoter regions (Table 2A). The dmCpGs associated with father’s pubertal smoking were in open-sea genomic features and CpG island shores (flanking shore regions, < 2 kb up-and downstream of CpG islands) and enriched for CpG islands and gene bodies (Table 2B).

Associations between fathers’ smoking-related dmCpGs and offspring phenotypes

Some of the identified dmCpG sites showed association with ever-asthma (cg22402007: NTRK2), ever-wheezing (cg11380624: DNAJC14, cg10981514: TPCN1), weight (cg12053348, cg03380960: FAM53B, cg22402007: NTRK2 [44]) and BMI (cg03380960: FAM53B, cg12053348, cg22402007: NTRK2) at p < 0.05 (Additional file 1: Table E4).

Father’s preconception smoking signatures as compared with signatures of personal and mother’s smoking

We identified 33 dmCpGs related to personal smoking and 14 dmCpGs associated with mother’s smoking (FDR < 0.05) (Additional file 1: Tables E5 and E6, respectively).

To illustrate the distinct and shared genome-wide effects of personal, mother’s and father’s smoking on the offspring methylome, we generated a locus-by-locus genome comparison (Fig. 3). While there was similarity between the effects of personal smoking and mother’s smoking on chromosome 5, we observed distinct signatures for father’s preconception smoking on chromosome 22 and for mother’s smoking exposure on chromosomes 7 and 15.

Fig. 3

Circos plots showing genome-wide distribution across autosomal chromosomes of dmCpGs associated with A personal smoking (in offspring), B mother’s smoking, C father’s any preconception smoking and D father’s pubertal smoking starting before age 15. Each dot represents a CpG site; the radial line shows the − log10 p value for each EWAS. Zoomed dots show CpG sites significant in at least one of the EWAS; each zoomed dot colour shows a unique CpG site specific locus

To confirm that the father’s smoking-associated dmCpGs were not confounded by offspring’s own or mother’s smoking, we carried out sensitivity analyses in subpopulations of offspring who reported no personal smoking exposure and no maternal smoking exposure. Accounting for all covariates (offspring age, sex, study centre and blood cell type proportions), all paternal smoking-associated dmCpGs remained significant at p value < 0.05 in these analyses (Additional file 1: Table E7). Despite the small sample size (n = 175), when we accounted for mothers’ sustained smoking throughout pregnancy as a covariate, 13 of the 19 identified dmCpGs remained significantly associated with paternal smoking at p value < 0.05 (Additional file 1: Table E7).

Comparing our EWAS results with previously published meta-analysis results of maternal and personal smoking showed that 16 of our 19 dmCpGs associated with father’s pubertal smoking onset had not previously been associated with maternal or personal smoking [14, 15, 43] (Fig. 4A, B). Nine of the identified CpG sites were also present on the 450K array (Additional file 1: Table E8). Two CpG sites (cg11380624 (DNAJC14), cg20728490 (DNTT)) were shared with analyses of personal smoking by Joehanes et al. [43] and two sites (cg12053348 (intergenic), cg20728490 (DNTT)) with Christiansen et al. [14]. In contrast, 10 of our 14 mother’s smoking-associated dmCpGs (11 of which are also present on the 450K array) and 25 of our personal smoking-related dmCpGs (23 of which are present on the 450K array) were also reported in the meta-analysis results [14, 15, 43] (Additional file 1: Table E8).

Fig. 4

Venn diagram showing EWAS CpG top hits for personal (offspring) smoking, mother’s smoking (FDR < 0.005), father’s any preconception smoking (top 100 dmCpGs) and father’s pubertal smoking starting before age 15 (FDR < 0.05) in the RHINESSA cohort, which are shared with top hits from meta-analysis of A PACE mother smoking (blue oval) as reported by Joubert et al. 2016 and B Personal cigarette smoking signature as reported by Christiansen et al. 2021 (blue) and by Joehanes et al. 2016 (yellow)

Enrichment of dmCpGs for related traits

We investigated whether the significant dmCpGs associated with father’s preconception smoking onset overlapped with other traits, using the repository of published EWAS literature in the EWAS atlas. The top 23 dmCpG sites for father’s any preconception smoking (those with p value ≤ 9.86 × 10⁻⁶, distinctly lower than the following sites) were enriched for traits that include Immunoglobulin E (IgE) level, muscle hypertrophy, maternal smoking and birthweight (Fig. 5A). dmCpGs (FDR < 0.05) associated with father’s pubertal smoking were enriched for traits such as autoimmune diseases, atopy, smoking and puberty (Fig. 5B). Enriched traits related to the dmCpGs detected in the EWAS of maternal and personal smoking exposure are provided in Additional file 1: Fig. 1A and 1B for comparison.

Fig. 5

Traits associated with the CpG sites that in EWAS were identified to be differentially methylated according to A father’s any preconception smoking, B father’s pubertal smoking starting before age 15

Role of dmCpGs for father’s pubertal smoking (smoking initiation < 15 years)

Given the stronger effects of father’s pubertal smoking than of father’s any preconception smoking on offspring DNA methylation, we further explored the biological relevance of these findings.

Transcription factor enrichment

We interrogated eFORGE TF for transcription factor enrichment in CD4+ cells to determine the regulatory role of our 19 significant dmCpGs (FDR < 0.05) related to father’s pubertal smoking. We found significant enrichment of 27 transcription factor binding sites that overlapped with 7 of the dmCpGs (q-value < 0.05) identified in our EWAS study (Additional file 1: Table E9).

EWAS atlas lookup

Of the 19 dmCpGs associated with father’s pubertal smoking identified in our analysis, 11 were present in the EWAS atlas, where they correlated with gene expression in a variety of tissues (Fig. 6A) and overlapped with promoters (Fig. 6B) (FDR < 0.05). These were significantly associated with 9 other traits, including atopy and fractional exhaled nitric oxide (cg23021329), smoking (cg20728490; cg16730908), BMI (cg03516318), acute lymphoblastic leukaemia (cg22402007), cancer (cg11380624) and Crohn’s disease (cg10981514) (Additional file 1: Table E10).

Fig. 6

Methylation effects on gene expression regulation across different tissue types for the CpG sites differently methylated according to father’s pubertal smoking starting before age 15 years (FDR < 0.05). [Accessed on 20 June 2021]. Size of point represents − log10 p value, colour scale shows CpG site correlation with expression; red to green represents increasing expression. In A shape shows the tissue type; in B shape shows genomic feature location

Differentially methylated region (DMR) analysis

No DMRs were significantly associated with father’s any preconception smoking using either DMRcate or dmrff. There were suggestive hits for father’s pubertal smoking, such as DNTT at FDR = 0.084. All DMRs are listed in Additional file 1: Table E11.

Pathway enrichment

To gain further insight into the functional roles of the dmCpGs, we used the 14 genes that were mapped to dmCpGs associated with father’s pubertal smoking to generate a protein–protein interaction network from the STRING database. The top 20 protein interactors were included with a high confidence score cutoff of 0.7 from protein–protein interaction data sources including experimentally validated protein physical complexes, curated databases and co-expression. The network indicated that immune response-related genes TLR9, CSF1R, NTRK2, PTPN11 and IL34 were well connected (Fig. 7A) (p value < 1.0 × 10⁻¹⁶). The molecular function enrichment analysis showed enrichment for gene expression, inflammatory response, innate immunity and cytokine binding (Fig. 7B). We also assessed enrichment of GO terms using gometh. The most significantly enriched biological process terms (FDR < 0.05) include: inactivation of MAPK activity involved in osmosensory signalling pathway (GO:0000173), negative regulation of interleukin-6 production (GO:0032715), regulation of mast cell chemotaxis (GO:0060753), regulation of neutrophil migration (GO:1902622) and insulin processing (GO:0030070) (Additional file 1: Table E12).

Fig. 7

Interactome of dmCpGs associated with father’s pubertal smoking. A String network at confidence score 0.7 and 20 top interactors (pale red nodes show dmCpGs, light green nodes show top interactors). The interactions show experimental evidence from score 0.0 (weak) to 1.0 (strong) using the colour gradient yellow (0.06) to deep purple (1.0). B Functional enrichment from UniprotKb for the top 15 biological processes, 10 molecular functions and 5 cellular components

Replication of DNA methylation signatures associated with father’s preconception smoking

The replication cohort in ALSPAC included 542 participants (female = 280, male = 262), of whom 86 had a father who started to smoke before the age of 15 and 456 had never smoking fathers. There was no overlap of dmCpG sites significantly associated with father’s smoking before age 15 between the two cohorts (FDR < 0.05). However, of the 19 significant dmCpGs identified as related to father’s pubertal smoking in RHINESSA, 11 showed nominal replication in ALSPAC (p < 0.05) with similar direction. The correlation of effects between studies was R = 0.49. The binomial sign test showed the association to be significant at p < 0.05. Expanding the comparison to the top 100 dmCpGs in RHINESSA, the correlation of effects between studies was R = 0.54 (p value = 3.04 × 10⁻⁵).

Sensitivity analyses

There was no association between fathers’ educational level and top dmCpGs identified in relation to fathers’ preconception smoking. There was only weak correlation between father’s smoking dmCpGs and offspring age (maximum |R| = 0.2, with 9 CpGs at R = 0). In contrast, as expected, the age-related CpG markers showed a strong correlation with age (|R| ≥ 0.6) (Additional file 1: Fig. 2A, B). The study power is shown in Additional file 1: Table E13.


To our knowledge, this is the first human study to investigate the potential epigenetic mechanisms behind the impact of father’s pubertal smoking on offspring. In this epigenome-wide association study, using data from two generations of study participants, we found differentially methylated CpG sites in offspring associated with father’s preconception smoking. Signatures related to father’s pubertal smoking (smoking initiation before age 15) were much more pronounced than those related to smoking starting at any time preconception. Sixteen of the 19 identified dmCpGs have not previously been reported to be associated with personal or maternal smoking. We suggest these new smoking-associated methylation biomarkers may be specific to smoking exposure of future fathers in early puberty. Several top dmCpGs were enriched for promoter regions and overlapped with significant transcription factor sites that correlated with gene expression in a variety of tissues. Besides unique sites identified for father’s preconception smoking onset, our study confirms previously reported DNA methylation sites associated with personal and mother’s smoking, demonstrating the validity of our cohort and analytical methods. The genes to which dmCpGs map are related to regulation of innate immunity and inflammatory responses.

For father’s any preconception smoking, we found two novel CpG sites that had not previously been linked with any investigated smoking phenotype. PRR5 (mapped with cg00870527) is a component of the mTOR complex 2, which is upstream of major pathways known to have a crucial role in metabolic regulation and is suggested to play a role in obesity and the pathogenesis of insulin resistance [45]. CENPP (mapped with cg08541349) has been associated with lung function, leucocyte count, BMI and type II hypersensitivity reaction in GWAS [38]. Sex-stratified EWAS analyses detected four male-specific dmCpGs that mapped to genes associated with vital capacity (KCNJ1, MYADML2), IgE levels (relevant to asthma pathogenesis) [46], as well as to genes linked with low-density lipoprotein measurement/total cholesterol (TRIM2) and BMI-related phenotypes (GRAMD4, MYADML2, KCNJ1). In female offspring, we found one dmCpG annotated to LEPROTL1, a gene with roles in lung function (FEV1/FVC ratio), growth hormone regulation and glucose homeostasis [47]. Whether male and female offspring in fact display methylation differences at various sites and genes is yet to be confirmed and needs further investigation.

For father’s pubertal smoking, two of our 19 significant CpG sites have previously been associated with personal smoking (cg20728490 in DNTT and cg16730908 in PSTPIP2), and they map to genes with important roles in innate immune responses to infections [48, 49]. Upregulation of PSTPIP2 has also been linked to neutrophilic airway inflammation and non-allergic asthma. When exploring the biological impact of other genes mapped to the dmCpGs uniquely associated with father’s pubertal smoking, several were related to genes associated with innate immunity, allergic diseases and asthma development, such as TLR9, CSF1R, DNAJC14, NTRK2 and TPCN1 [48,49,50,51,52,53]. We also identified CpGs and genes with links to obesity (NTRK2, PSTPIP2, MBIP) [38, 54, 55] and glucose and fat metabolism (IRS1). The differentially methylated CpGs were mainly located in open-sea genomic features and enriched for promoter regions, CpG islands and gene bodies. These findings suggest that the identified DNA methylation differences, even though of relatively small magnitude, have functional implications in terms of a regulatory role in specific gene expression. Pathway analysis and molecular function enrichment further found interconnection of immune response-related genes and enrichment for inflammatory response, innate immunity and cytokine binding. When seeking replication of results in an independent sample in ALSPAC, although no dmCpGs overlapped in the two population cohorts, results showed that effect estimates associated with fathers’ preconception smoking were moderately correlated and had concordant directions of effect.

Several mechanistic reports have demonstrated that the toxicogenic components in cigarette smoke impact on epigenetic germline inheritance and affect the offspring’s metabolic health [56]. However, given this is the first study that investigated DNA methylation signatures in young and adult offspring in relation to a timing-specific exposure to father’s smoking, there is limited published literature that is directly comparable to our findings. In a pilot study, we previously observed differentially methylated regions associated with father’s ever smoking, among which annotated genes were related to innate and adaptive immunity and fatty acid synthesis [16]. Preconception paternal smoking has been shown to alter sperm DNA methylation [57] and independently increase asthma risk and reduce lung function in the offspring [9], especially if the smoking started before age 15 [7, 9]. The observed association between the dmCpG sites related to father’s early onset smoking and offspring asthma, wheezing and weight suggests that epigenetic changes may lie on the causal pathway between paternal smoke exposures and offspring health outcomes.

Strikingly, the dmCpG sites we identified as related to fathers’ preconception smoking (any preconception smoking as well as pubertal smoking) were distinct from those previously reported, or found in our data, to be associated with mothers’ or personal smoking. As several of the identified CpG sites are also present on the lower coverage 450K array (485512 CpG sites), as shown in Additional file 1: Table E8, the novelty of the identified paternal smoking-associated sites cannot be accounted for by the use of the more comprehensive EPIC methylation array (866838 CpG sites). Reassuringly, our EWAS of mother’s smoking and personal smoking identified several of the dmCpG sites previously associated with these exposures in other cohorts.

The availability of data for appropriate replication of our results is a major challenge. We found moderate correlation between the RHINESSA and ALSPAC EWAS for paternal smoking before 15 years. Although the replication analysis found effect estimates to have concordant directions in several of the dmCpGs, we did not identify overlapping significant dmCpGs associated with fathers’ preconception smoking in the replication cohort. The low sample size in both cohorts for paternal smoking before 15 might contribute to the lack of shared genome-wide significance. Even within the same population, using different platforms can cause difficulties with replication [58]. The similarity in the direction of association suggests a potential biological effect of father’s smoking before age 15, but further research is warranted to verify our novel results.
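As an illustration of how agreement between a discovery and a replication EWAS can be summarised without requiring shared genome-wide significance, the following minimal Python sketch computes a Pearson correlation and the proportion of CpGs with the same direction of effect. The effect estimates, variable names and helper functions are hypothetical and are not taken from RHINESSA or ALSPAC; this is not the analysis code used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of effect estimates."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def sign_concordance(x, y):
    """Proportion of CpGs whose effect estimates share the same sign."""
    same = sum(1 for a, b in zip(x, y) if a * b > 0)
    return same / len(x)


# Hypothetical per-CpG effect estimates (illustrative values only).
discovery = [0.020, 0.015, 0.030, 0.010, 0.025]
replication = [0.010, -0.005, 0.020, 0.004, 0.012]

r = pearson(discovery, replication)
concordance = sign_concordance(discovery, replication)
print(f"correlation = {r:.2f}, sign concordance = {concordance:.0%}")
```

In this toy example four of the five sites share a direction of effect even though none would need to reach significance in the smaller cohort, which is the kind of pattern described above.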

Although we accounted for personal and mother’s smoking exposure in the analysis, we cannot disregard potential residual confounding related to maternal and personal smoking. Further, our analyses cannot fully disentangle effects of father’s early onset smoking from effects of subsequent accumulating second-hand smoke exposure. However, epidemiological analyses of various measures of father’s smoking as related to offspring phenotype in over 20,000 father-offspring pairs found that the effects of any other aspect of father’s smoking were negligible compared with that of starting smoking early [7]. We did not control for genetic variation at single nucleotide polymorphisms and cannot rule out that the differentially methylated CpG sites are affected by, or interact with, genetic variants. Our study may additionally be constrained by factors attributable to shared familial environments. Although we found no evidence in a sensitivity analysis that our top differentially methylated signals were related to fathers’ educational level, other unmeasured aspects of social class may have influenced our findings. However, a recent analysis of our study cohorts using probabilistic simulations demonstrated that unmeasured confounding had a limited impact on the effects of father’s preconception smoking on offspring asthma [8]. This suggests that the identified dmCpGs associated with father’s preconception smoking are most likely not driven by unmeasured confounding, whether by genetic factors or by lifestyle-related or environmental factors.

Self-reporting of smoking is another limitation of our study. However, based on validation studies, there is an overall consensus that self-report provides a valid and reliable tool for assessing smoking behaviour in cohort studies. Furthermore, error in father’s reporting of smoking habits is likely to be independent of DNA methylation measured in the offspring, so misclassification will tend to have attenuated the observed associations, and the true underlying associations might be stronger [59, 60].
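The attenuation argument can be made concrete with a small worked calculation. For a binary exposure whose reporting error is independent of the outcome, the expected unadjusted difference in mean outcome between reported exposure groups equals the true difference multiplied by PPV + NPV - 1, where PPV and NPV are the positive and negative predictive values of the report. The sketch below is in Python, and the prevalence, sensitivity, specificity and effect size are hypothetical values chosen for illustration, not estimates from this study.

```python
def attenuation_factor(prevalence, sensitivity, specificity):
    """Return PPV + NPV - 1 for a binary exposure reported with error.

    Assumes the reporting error is independent of the outcome
    (non-differential misclassification) and an unadjusted comparison.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv + npv - 1


# Hypothetical inputs (illustrative only, not study estimates).
true_difference = 0.02  # assumed true difference in methylation beta value
factor = attenuation_factor(prevalence=0.3, sensitivity=0.9, specificity=0.95)
observed_difference = true_difference * factor
print(f"attenuation factor = {factor:.3f}")
print(f"expected observed difference = {observed_difference:.4f}")
```

With perfect reporting the factor is 1, and any reporting error that is better than chance gives a factor between 0 and 1, so the expected observed difference is smaller than the true one.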

We suggest that the observed association between father’s preconception smoking and offspring DNA methylation marks could be caused by transmission through epigenetic imprints in sperm. Supported by previous mechanistic and epidemiological findings, we also speculate that our novel results reflect that early adolescence may constitute a period of particular vulnerability for smoking exposure to modify the offspring’s epigenome. A recent study demonstrated that preconception paternal cigarette exposure in mice from the onset of puberty until 2 days prior to mating modified the expression of miRNAs in spermatozoa and influenced the body weight of F1 progeny in early life [61]. As the prepubertal years and the onset of puberty represent periods of epigenetic reprogramming events [62], we suggest that early adolescence may be a critical time for tobacco-related exposures to interfere with germline epigenetic patterns. This is, however, challenging to study in humans, and multiple scientific approaches are needed to elucidate the molecular mechanisms underlying the current findings as well as previous epidemiological results.


We have identified dmCpG sites in offspring associated with father’s onset of smoking before conception, with the most pronounced effects when the father started to smoke in early puberty (before the age of 15). The pattern differed from those of maternal smoking in pregnancy and of personal smoking, and we suggest these may be unique methylation signatures specific to father’s early adolescent smoking. The genes to which the identified dmCpGs map are related to asthma, IgE and regulation of innate immunity and inflammatory responses. Our study provides evidence for an epigenetic mechanism underlying the epidemiological findings of high risk of asthma, obesity and low lung function following father’s early adolescent smoking. The functional links of hypermethylated genes suggest that father’s pubertal smoking in particular can have cross-generational effects on the long-term health of offspring. Smoking interventions in early adolescence may have implications for better public health, with potential benefits not only for those exposed but also for their future offspring.

Availability of data and materials

Summary statistics for the epigenome-wide association analyses reported in the manuscript are available from The underlying data are available on reasonable request. Requests for access to data can be made to the RHINESSA steering committee by PI Professor CS ( or vice PI VS ( Reuse of the data must be done in collaboration with the RHINESSA study team. Further information including issues on data security and sharing of data can be found at


  1. Skinner MK, Haque CGBM, Nilsson E, Bhandari R, McCarrey JR. Environmentally induced transgenerational epigenetic reprogramming of primordial germ cells and the subsequent germ line. PLoS ONE. 2013;8

  2. Senaldi L, Smith-Raska M. Evidence for germline non-genetic inheritance of human phenotypes and diseases. 2020;1–12

  3. Hamad MF, et al. The status of global DNA methylation in the spermatozoa of smokers and non-smokers. Reprod Biomed Online. 2018;37:581–9.

  4. Fernandez-Twinn DS, Constância M, Ozanne SE. Intergenerational epigenetic inheritance in models of developmental programming of adult disease. Semin Cell Dev Biol. 2015;43:85–95.

  5. Marcon A, et al. Trends in smoking initiation in Europe over 40 years: a retrospective cohort study. PLoS ONE. 2018;13:1–14.

  6. Pesce G, et al. Time and age trends in smoking cessation in Europe. PLoS ONE. 2019;14:2000–10.

  7. Svanes C, et al. Father’s environment before conception and asthma risk in his children: a multi-generation analysis of the Respiratory Health In Northern Europe study. Int J Epidemiol. 2017;46:235–45.

  8. Accordini S, et al. A three-generation study on the association of tobacco smoking with asthma. Int J Epidemiol. 2018;47:1106–17.

  9. Accordini S, et al. Prenatal and prepubertal exposures to tobacco smoke in men may cause lower lung function in future offspring: a three-generation study using a causal modelling approach. Eur Respir J. 2021.

  10. Northstone K, Golding J, Davey Smith G, Miller LL, Pembrey M. Prepubertal start of father’s smoking and increased body fat in his sons: further characterisation of paternal transgenerational responses. EJHG. 2014;22:1382–6.

  11. Golding J, et al. Investigating possible trans/intergenerational associations with obesity in young adults using an exposome approach. Front Genet. 2019;10:1–11.

  12. Knudsen GTM, et al. Parents’ smoking onset before conception as related to body mass index and fat mass in adult offspring: findings from the RHINESSA generation study. PLoS ONE. 2020;15:1–21.

  13. Golding J, et al. Human transgenerational observations of regular smoking before puberty on fat mass in grandchildren and great-grandchildren. Sci Rep. 2022;12:1–8.

  14. Joehanes R, et al. Epigenetic signatures of cigarette smoking. Circ Cardiovasc Genetics. 2016;9:436–47.

  15. Joubert BR, et al. DNA methylation in newborns and maternal smoking in pregnancy: genome-wide consortium meta-analysis. Am J Hum Genet. 2016;98:680–96.

  16. Mørkve Knudsen GT, et al. Epigenome-wide association of father’s smoking with offspring DNA methylation: a hypothesis-generating study. Environ Epigenet. 2019;5:1–10.

  17. Wu CC et al. Paternal tobacco smoke correlated to offspring asthma and prenatal epigenetic programming. Front Genet. 2019;10.

  18. Rutkowska J, Lagisz M, Bonduriansky R, Nakagawa S. Mapping the past, present and future research landscape of paternal effects. BMC Biol. 2020;18:1–24.

  19. Zhang B, et al. Maternal smoking during pregnancy and cord blood DNA methylation: new insight on sex differences and effect modification by maternal folate levels. Epigenetics. 2018;13:505–18.

  20. Miller SA, Dykes DD, Polesky HF. A simple salting out procedure for extracting DNA from human nucleated cells. Nucleic Acids Res. 1988;16:1215.

  21. R Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.; 2020.

  22. Aryee MJ, et al. Minfi: a flexible and comprehensive bioconductor package for the analysis of infinium DNA methylation microarrays. Bioinformatics. 2014;30:1363–9.

  23. Min JL, Hemani G, Davey Smith G, Relton C, Suderman M. Meffil: efficient normalization and analysis of very large DNA methylation datasets. Bioinformatics. 2018;44:1–7.

  24. Teschendorff AE, et al. A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics. 2013;29:189–96.

  25. Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The SVA package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28:882–3.

  26. Zhou W, Laird PW, Shen H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 2016;45:gkw967.

  27. Nordlund J, et al. Genome-wide signatures of differential DNA methylation in pediatric acute lymphoblastic leukemia. Genome Biol. 2013;14: r105.

  28. Pidsley R et al. Critical evaluation of the illumina methylationEPIC BeadChip microarray for whole-genome DNA methylation profiling. Genome Biol. 2016;17.

  29. Zheng SC, et al. EpiDISH web server: epigenetic dissection of intra-sample-heterogeneity with online GUI. Bioinformatics. 2020;36:1950–1.

  30. Ritchie ME, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43: e47.

  31. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Methodol). 1995;57:289–300.

  32. Turner SD. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. TbioRxiv. 2007;81:559–75.

  33. Yin L, et al. rMVP: a memory-efficient, visualization-enhanced, and parallel-accelerated tool for genome-wide association study. Genomics Proteom Bioinform. 2021.

  34. van Iterson M, van Zwet EW, Heijmans BT. Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution. Genome Biol. 2017;18.

  35. Suderman M, et al. Dmrff: Identifying differentially methylated regions efficiently with power and control. BioRxiv. 2018.

  36. Peters TJ, et al. De novo identification of differentially methylated regions in the human genome. Epigenetics Chromatin. 2015;8:6.

  37. Breeze CE, et al. EFORGE v.20: updated analysis of cell type-specific signal in epigenomic data. Bioinformatics. 2019;35:1–3.

  38. Carvalho-Silva D, et al. Open targets platform: new developments and updates two years on. Nucleic Acids Res. 2019;47:D1056–65.

  39. Li M, et al. EWAS Atlas: a curated knowledgebase of epigenome-wide association studies. Nucleic Acids Res. 2019;47:D983–8.

  40. Szklarczyk D, et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47:D607–13.

  41. Soudy M, et al. UniprotR: retrieving and visualizing protein sequence and functional information from Universal Protein Resource (UniProt knowledgebase). J Proteomics. 2020;213: 103613.

  42. Maksimovic J, Oshlack A, Phipson B. Gene set enrichment analysis for genome-wide DNA methylation data. BioRxiv. 2020; 2020.08.24.265702.

  43. Christiansen C, et al. Novel DNA methylation signatures of tobacco smoking with trans-ethnic effects. Clin Epigenetics. 2021;13:1–13.

  44. Metrustry SJ, et al. Variants close to NTRK2 gene are associated with birth weight in female twins. Twin Res Hum Genet. 2014;17:254–61.

  45. Yuefeng T, Martina W et al. Adipose tissue mTORC2 regulates ChREBP-driven de novo lipogenesis and hepatic glucose metabolism. Nat Commun. 2016;16.

  46. Chen W, et al. An epigenome-wide association study of total serum IgE in Hispanic children. J Aller Clin Immunol. 2017;140:571–7.

  47. Ochoa D, et al. Open targets platform: supporting systematic drug–target identification and prioritisation. Nucleic Acids Res. 2020.

  48. Haeryfar SMM, et al. Terminal deoxynucleotidyl transferase establishes and broadens anti-viral CD8+ T cell immunodominance hierarchies. J Immunol. 2009;181:649–59.

  49. Chitu V, et al. Primed innate immunity leads to autoinflammatory disease in PSTPIP2-deficient cmo mice. Blood. 2009;114:2497–505.

  50. Murakami Y, et al. TLR9–IL-2 axis exacerbates allergic asthma by preventing IL-17A hyperproduction. Sci Rep. 2020;10:1–17.

  51. Boulakirba S, et al. IL-34 and CSF-1 display an equivalent macrophage differentiation ability but a different polarization potential. Sci Rep. 2018;8:1–11.

  52. Modena BD, et al. Gene expression correlated with severe asthma characteristics reveals heterogeneous mechanisms of severe disease. Am J Respir Crit Care Med. 2017;195:1449–63.

  53. Shin EK, et al. Association between colony-stimulating factor 1 receptor gene polymorphisms and asthma risk. Hum Genet. 2010;128:293–302.

  54. Caruso M, et al. Increased interaction with insulin receptor substrate 1, a novel abnormality in insulin resistance and type 2 diabetes. Diabetes. 2014;63:1933–47.

  55. Kubota N, et al. Differential hepatic distribution of insulin receptor substrates causes selective insulin resistance in diabetes and obesity. Nat Commun. 2016;7:1–16.

  56. Ng SF, et al. Paternal high-fat diet consumption induces common changes in the transcriptomes of retroperitoneal adipose and pancreatic islet tissues in female rat offspring. FASEB J. 2014;28:1830–41.

  57. Bunney PE, Zink AN, Holm AA, Billington CJ, Kotz CM. Regulation of motivation for food by neuromedin u in the paraventricular nucleus and the dorsal raphe nucleus. Physiol Behav. 2017;176:139–48.

  58. Watkins SH, Iles-Caven Y, Pembrey M, Golding J, Suderman M. Grandmaternal smoking during pregnancy is associated with differential DNA methylation in peripheral blood of their grandchildren. Eur J Hum Genet. 2022.

  59. Chiu YL, et al. Validation of self-reported smoking with urinary cotinine levels and influence of second-hand smoke among conscripts. Sci Rep. 2017;7:1–7.

  60. Valladolid-López MDC, et al. Evaluating the validity of self-reported smoking in Mexican adolescents. BMJ Open. 2015;5:1–8.

  61. Hammer B, et al. Preconceptional smoking alters spermatozoal miRNAs of murine fathers and affects offspring’s body weight. Int J Obes. 2021.

  62. Ly L, Chan D, Trasler JM. Developmental windows of susceptibility for epigenetic inheritance through the male germline. Semin Cell Dev Biol. 2015;43:96–105.


We are extremely grateful to all the families who took part in the RHINESSA and ALSPAC studies, the clinical teams and midwives for their help in recruiting them, and the whole of the RHINESSA and ALSPAC teams, which include interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists and nurses.


Coordination of the RHINESSA study has received funding from the Research Council of Norway (Grants Nos. 274767, 214123, 228174, 230827 and 273838), ERC StG project BRuSH #804199, the European Union's Horizon 2020 research and innovation programme under grant agreement No. 633212 (the ALEC Study), the Bergen Medical Research Foundation and the Western Norwegian Regional Health Authorities (Grants Nos. 912011, 911892 and 911631). Study centres have further received local funding from the following: Bergen: the above grants for study establishment and coordination, and, in addition, World University Network (REF and Sustainability grants), Norwegian Labour Inspection and the Norwegian Asthma and Allergy Association. Albacete and Huelva: Sociedad Española de Patología Respiratoria (SEPAR) Fondo de Investigación Sanitaria (FIS PS09). Gøteborg, Umeå and Uppsala: the Swedish Heart and Lung Foundation, the Swedish Asthma and Allergy Association. Reykjavik: Iceland University. Melbourne: National Health and Medical Research Council (NHMRC) of Australia (research grants 299901 and 1021275). Tartu: the Estonian Research Council (Grant No. PUT562). Århus: The Danish Wood Foundation (Grant No. 444508795), the Danish Working Environment Authority (Grant No. 20150067134), Aarhus University (PhD scholarship). ALSPAC funding: EPIC arrays age 15–17, John Templeton Foundation (60828) and EPIC arrays age 24, MRC (MC_UU_12013/2) & CLOSER (MRC and ESRC).

Author information

Authors and Affiliations



CS, GTMK, FIR, AJ, NTK and JWH contributed to conceptualization. GTMK, BS, AJ, NTK and JWH performed data curation. NTK, CS and JWH carried out formal analysis. NTK, FIR, CS and JWH provided methodology. SW and MS carried out replication analyses. CS performed project administration. NTK, GTMK, CS and JWH performed writing—original draft. All authors contributed to writing—review, editing and final approval.

Corresponding author

Correspondence to John W. Holloway.

Ethics declarations

Ethics approval and consent to participate

Ethical permissions were obtained for each study wave from the local ethics committee in each of the participating centres. The ethical approval reference numbers are listed on All study participants provided written informed consent prior to participation. Permission to extract information about themselves and family members from national registers was obtained from each participant in the Northern European study centres. For children and adolescents participating in the additional study groups presented in Additional file 1 (Supplementary methods and data), written informed consent was given by the parents/guardians, as required by the local ethics committees.

Competing interests

The authors have no competing interests to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Supplementary methods and data.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Kitaba, N.T., Knudsen, G.T.M., Johannessen, A. et al. Fathers’ preconception smoking and offspring DNA methylation. Clin Epigenet 15, 131 (2023).
