Distinct DNA methylation signatures associated with blood lipids as exposures or outcomes among survivors of childhood cancer: a report from the St. Jude lifetime cohort
Clinical Epigenetics volume 15, Article number: 32 (2023)
DNA methylation (DNAm) plays an important role in lipid metabolism; however, no epigenome-wide association study (EWAS) of lipid levels has been conducted among childhood cancer survivors. Here, we performed EWAS analysis with longitudinally collected blood lipid data from survivors in the St. Jude lifetime cohort study.
Among 2052 childhood cancer survivors of European ancestry (EA) and 370 survivors of African ancestry (AA), four types of blood lipids, including high-density lipoprotein (HDL), low-density lipoprotein (LDL), total cholesterol (TC), and triglycerides (TG), were measured during follow-up beyond 5 years from childhood cancer diagnosis. For the exposure EWAS (i.e., lipids measured before blood draw for DNAm), the DNAm level was the outcome variable and each blood lipid level was the exposure variable; the roles were reversed for the outcome EWAS (i.e., lipids measured after blood draw for DNAm).
Among EA survivors, we identified 43 lipid-associated CpGs in the HDL (n = 7), TC (n = 3), and TG (n = 33) exposure EWAS, and 106 lipid-associated CpGs in the HDL (n = 5), LDL (n = 3), TC (n = 4), and TG (n = 94) outcome EWAS. Among AA survivors, we identified 15 lipid-associated CpGs in the TG exposure EWAS (n = 6) and in the HDL (n = 1), LDL (n = 1), TG (n = 5), and TC (n = 2) outcome EWAS with epigenome-wide significance (P < 9 × 10−8). There were no overlapping lipid-associated CpGs between exposure and outcome EWAS among EA or AA survivors, suggesting that the DNAm changes of different CpGs could be the cause or consequence of blood lipid levels. In the meta-EWAS, 12 additional CpGs reached epigenome-wide significance. Notably, 32 out of 74 lipid-associated CpGs showed substantial heterogeneity (Phet < 0.1 or I2 > 70%) between EA and AA survivors, highlighting differences in DNAm markers of blood lipids between populations with diverse genetic ancestry. Ten lipid-associated CpGs were cis-expression quantitative trait methylation sites, with DNAm levels associated with the expression of the corresponding genes; seven of these associations were negative.
We identified distinct signatures of DNAm for blood lipids as exposures or outcomes and between EA and AA survivors, revealing additional genes involved in lipid metabolism and potential novel targets for controlling blood lipids in childhood cancer survivors.
Mounting evidence suggests that epigenetics, specifically DNA methylation (DNAm), plays an important role in lipid metabolism, and epigenome-wide association studies (EWAS) of blood lipid levels have identified robust 5′-cytosine-phosphate-guanine-3′ (CpG) sites and plausible underlying genes associated with lipid metabolism and related diseases. However, an EWAS analysis of lipid levels has not been conducted among survivors of childhood cancer, who experience early onset and a substantially higher burden of chronic health conditions (CHCs) compared to community controls without a history of childhood cancer [2, 3]. These health disparities are mostly attributable to genotoxic cancer treatment exposures at a young age, with the most notable link being between cardiovascular diseases and exposures to anthracyclines and/or chest-directed radiation therapy (RT). Recognizing the high burden of CHCs among childhood cancer survivors [2, 3, 5], we have comprehensively analyzed DNAm variations among long-term survivors and conducted systematic investigations of potential causal pathways for treatment-associated CHCs. Our previous findings provide compelling evidence of a mediation effect of DNAm between abdominal-RT and dyslipidemia (triglycerides > 150 mg/dL or total cholesterol > 200 mg/dL). Dyslipidemia is highly prevalent within the broad spectrum of morbidities of childhood and adolescent cancer survivors, and is a major risk factor for cardiac events, which are the leading cause of noncancer-related premature mortality and account for approximately 26% of deaths among survivors within 45 years of diagnosis.
In the general population, African American adults have a higher prevalence of high low-density lipoprotein (LDL) and low high-density lipoprotein (HDL) levels but a lower prevalence of high triglycerides (TG) than European American adults, in both men and women [9, 10]. A study considering racial/ethnic differences among childhood cancer survivors in the St. Jude lifetime cohort study (SJLIFE) reported that childhood cancer survivors of African ancestry (AA) had a higher risk of cardiovascular diseases overall, including specific conditions such as stroke, heart attack, and heart failure, than survivors of European ancestry (EA), potentially explained by the higher prevalence of obesity, diabetes, hypertension, and dyslipidemia among AA survivors. Studies have reported notable population-specific DNAm differences in multiple physical functions (e.g., immunity and kidney development) [12, 13], suggesting that EWAS across populations is critical to the interpretation of health disparities. However, there is a lack of diversity in currently available EWAS data, with most studies conducted in individuals of EA.
To further our understanding of the underlying biological mechanisms of different blood lipid levels among childhood cancer survivors and the differences between EA and AA populations as determined by their genetic ancestry, we employed a comprehensive and agnostic EWAS approach across these two populations. Taking advantage of longitudinal clinical assessments of SJLIFE survivors, we analyzed the association of DNAm with blood lipids as exposures (i.e., blood lipids were measured before DNAm) and outcomes (i.e., blood lipids were measured after DNAm). Findings were compared between these two scenarios as well as between the two ancestral groups (i.e., EA and AA). The potential function of significant CpG sites was further demonstrated by their correlations with gene expression levels measured by RNA sequencing. We compared our findings among childhood cancer survivors with the known blood lipid-associated CpGs previously reported in non-cancer general populations. Clinically, the set of lipid-associated CpG sites (i.e., signatures) would facilitate the identification of survivors who have already experienced abnormal lipid levels or are at higher risk of abnormal lipid levels in the future.
Characteristics of the study population and EWAS analysis design
Adult survivors of childhood cancer from the SJLIFE study [15, 16] were included in this analysis (Table 1). Among 2052 EA survivors (median age at blood draw for DNAm = 32.3 years, interquartile range [IQR] = 26.5–40.1 years; 47.2% female), body mass index (BMI) ranged from 9.9 to 67.7 kg/m2 (Table 1). Among 370 AA survivors (median age at blood draw for DNAm = 29.6 years, IQR = 23.8–37.0 years; 53.2% female), BMI ranged from 12.6 to 58.9 kg/m2 (Table 1). The summary statistics of the weighted average levels of HDL, LDL, TG, and TC (including the number of survivors with multiple lipid measurements) before or after DNA sampling are shown in Table 1. Compared with survivors of EA, those of AA had a lower mean weighted average of TG as an exposure (81.2 vs. 125.3 mg/dL, P < 0.0001) and as an outcome (83.4 vs. 130.6 mg/dL, P < 0.0001), of TC as an exposure (171.3 vs. 179.2 mg/dL, P = 0.02) and as an outcome (163.8 vs. 182.3 mg/dL, P < 0.0001), and of LDL as an outcome (95.0 vs. 106.1 mg/dL, P < 0.0001). AA survivors also had a higher mean weighted average of HDL as an exposure (57.2 vs. 51.1 mg/dL, P < 0.0001). The correlations of weighted average levels of lipids before and after DNA sampling are shown in Additional file 1: Table S1. The percentage of survivors taking any lipid control medications before DNAm sampling was 8.04% in EA and 4.86% in AA. The median time (range) between DNAm sampling and the pre-DNAm lipid profiles was 1.6 (0.0–5.3) years for EA and 1.6 (0.5–5.1) years for AA survivors, and the median time (range) between DNAm sampling and the post-DNAm lipid profiles was 2.2 (0.0–5.5) years for EA and 2.3 (0.1–16.6) years for AA survivors (Table 1).
After quality control of DNAm data, a total of 689,414 CpGs were further advanced for EWAS. The associations between the DNAm level of each CpG and specific blood lipid level (HDL, LDL, TG, or TC) as an exposure or outcome were analyzed separately (Fig. 1). Quantile–quantile plots of each EWAS among survivors of EA and AA were shown in Additional file 1: Fig. S1 and Additional file 1: Fig. S2, respectively. EWAS of blood lipids among survivors of EA showed moderately low genomic inflation factors between 0.92 and 1.13 (Additional file 1: Fig. S1). EWAS of blood lipids among survivors of AA showed moderately low to high genomic inflation factors between 0.92 and 2.32 (Additional file 1: Fig. S2).
CpG sites associated with blood lipids among survivors of European ancestry
The landscape of the overall association results among survivors of EA is shown in Fig. 2. Seven, three, and 33 epigenome-wide significant blood lipid-associated CpGs were identified for HDL, TC, and TG, respectively, in the exposure EWAS (P < 9 × 10−8, Fig. 2A–C). No CpG achieved epigenome-wide significance in the LDL exposure EWAS among survivors of EA (P < 9 × 10−8, Fig. 2D). Detailed estimates for the association between each CpG and specific blood lipid level as an exposure are provided in Additional file 1: Table S2. Notably, a cluster of three CpGs (cg00574958, cg05325763, and cg17058475), mapped to the 5′UTR of the CPT1A gene, was common to the TG and TC exposure EWAS (Table 2 and Additional file 1: Fig. S3). Five, three, four, and 94 CpGs were significantly associated with HDL, LDL, TC, and TG, respectively, in the outcome EWAS (P < 9 × 10−8, Fig. 2E–H and Additional file 1: Table S3). Three CpGs were common across the HDL, LDL, and TC outcome EWAS, including ch.1.829344F mapped to the 5′UTR region of the SRPM1 gene, cg20935223 mapped to the 3′UTR region of the CYTH3 gene, and cg21750129 mapped to the 3′UTR region of the TRPM3 gene (Table 2 and Additional file 1: Fig. S3). No significant CpGs were common between blood lipid exposure and outcome EWAS (P < 9 × 10−8).
CpG sites associated with blood lipids among survivors of African ancestry
The overall landscape of CpG associations in the blood lipid EWAS among survivors of AA is shown in Additional file 1: Fig. S4. Six TG-associated CpGs were identified in the exposure EWAS (Additional file 1: Fig. S4A and Table S4), and five TG-associated CpGs, two TC-associated CpGs, one HDL-associated CpG, and one LDL-associated CpG were found in the outcome EWAS (Additional file 1: Fig. S4E–H and Table S4) (P < 9 × 10−8). No significant CpG was found in the HDL, LDL, and TC exposure EWAS. Similarly, there were no significant CpGs common to the exposure and outcome EWAS among survivors of AA (P < 9 × 10−8). In the TG exposure EWAS, there were five significant CpGs mapping to nearby genes, including cg26675329 and cg04747445 within 1500 bp upstream of the transcription start site of the IL18RAP gene and the BBX gene, respectively; cg05416955 and cg21376908 in the gene body of the CARD9 gene and the MSI2 gene, respectively; and cg16197879 in the 5′UTR region of the CLDN14 gene (P < 9 × 10−8, Additional file 1: Table S4). In the TC outcome EWAS, cg16411101 in the 5′UTR of the SSBP3 gene was significant in both the LDL and TC outcome EWAS, and the other significant CpG was cg23724016 in the 3′UTR of the CDHR5 gene (P < 9 × 10−8). In the HDL outcome EWAS, the one significant CpG, cg14558275, is mapped to the PIK3CG gene. In the TG outcome EWAS, cg05416345 and cg01111718 are in the gene body of the IFFO2 gene and the GRIA4 gene, respectively; cg04348872 and cg12686539 are in the first and second exon of the ADCY3 gene and the ZNF891 gene, respectively. We did not identify any overlapping significant CpG between survivors of EA and AA in the SJLIFE cohort in any of the exposure or outcome EWAS of blood lipids (P < 9 × 10−8).
In the meta-analysis of blood lipid EWAS among EA and AA survivors, we identified 74 significant lipid-CpG associations (70 unique CpGs, P < 9 × 10−8). Specifically, four, one, and 33 significant CpGs were associated with HDL, TC, and TG exposures, respectively; and two, one, two, and 31 were associated with HDL, LDL, TC, and TG outcomes, respectively (P < 9 × 10−8, Additional file 1: Table S5). Among these significant lipid-CpG associations, twelve did not reach epigenome-wide significance level in either EWAS among survivors of EA or AA alone, including three for HDL exposure, three for TG exposure, and six for TG outcome (Table 3). All 12 had homogeneous effects with the same direction of association in survivors of EA and AA (Phet > 0.1, Table 3). Among the remaining 62 lipid-CpG associations that were significant in EWAS among survivors of EA or AA alone (P < 9 × 10−8), twenty-two had homogeneous effects with the same direction between survivors of EA and AA (Phet > 0.1), twenty-four had opposite directions of association with significant heterogeneity between survivors of AA and EA (Phet < 0.1), and the remaining 16 significant lipid-CpG associations either had the same direction of association but with significant heterogeneity between survivors of EA and AA (Phet < 0.1) or had the opposite direction of association but with homogenous effect (Phet > 0.1) (Additional file 1: Table S5).
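The heterogeneity measures reported above (Phet and I2) can be illustrated with a minimal sketch. The text does not state which meta-analysis model was used, so this example assumes an inverse-variance fixed-effect model with Cochran's Q for exactly two groups (EA and AA); the function name and the effect sizes in the usage note are hypothetical.

```python
import math

def fixed_effect_meta_two_groups(betas, ses):
    """Inverse-variance fixed-effect meta-analysis for exactly two groups.

    Assumption: this is one common way to obtain Phet and I2; it is not
    confirmed to be the procedure used in the study.
    """
    if len(betas) != 2 or len(ses) != 2:
        raise ValueError("this sketch handles exactly two groups")
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    beta_meta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se_meta = math.sqrt(1.0 / total_w)
    # Two-sided P value of the pooled effect from a normal approximation.
    p_meta = math.erfc(abs(beta_meta / se_meta) / math.sqrt(2.0))
    # Cochran's Q; with two groups it has one degree of freedom.
    q = sum(w * (b - beta_meta) ** 2 for w, b in zip(weights, betas))
    df = 1
    # Upper tail of a chi-square with 1 df: P(X > q) = erfc(sqrt(q / 2)).
    p_het = math.erfc(math.sqrt(q / 2.0))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"beta": beta_meta, "se": se_meta, "p": p_meta,
            "q": q, "p_het": p_het, "i2": i2}
```

With hypothetical inputs such as `fixed_effect_meta_two_groups([0.10, -0.10], [0.02, 0.02])`, the opposite directions of effect give a pooled estimate near zero together with a small Phet and a large I2, which is the pattern the text describes as heterogeneous.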
Association between DNAm levels of lipid–associated CpGs and gene expression
For each of the blood lipid-associated CpGs, we estimated the linear association between DNAm levels and gene expression levels (adjusting for DNA/RNA sampling age and sex). Among the nearby genes of the significant lipid-associated CpGs among EA, there were no count data (number of reads from RNA sequencing) for 52 genes. Of the remaining 56 CpG-gene pairs, there were ten CpG-gene pairs with significant (FDR < 0.05) associations between the DNAm level of lipid-associated CpGs and gene expression of their nearby genes, including HDAC7 (cg01620154), AXIN2 (cg23475474), ECE1 (cg01758046), TRERF1 (cg07507418), TNRC6B (cg00543524), MICU1 (cg08641767), NUDCD3 (cg01507280), NLN (cg01710244), AKAP1 (cg18807499), and LRP5 (cg24040155) among EA (Table 4). Most of the estimated effects of these CpGs were negative (i.e., increased methylation associated with decreased gene expression), except for ECE1 (cg01758046), NUDCD3 (cg01507280), and NLN (cg01710244). Among the 12 annotated genes of significant lipid-associated CpGs among AA survivors (Additional file 1: Table S4), nine genes had no count data (i.e., number of reads from RNA sequencing). Of the remaining three CpG-gene pairs (PIK3CG-cg14558275, BBX-cg04747445, and IFFO2-cg05416345), there was no significant association between DNAm levels of lipid-associated CpGs and gene expression among AA.
Cross-reference with the EWAS Catalog
By comparing our findings with previously reported blood lipid-associated CpGs in the general population from the EWAS Catalog, only four overlapping CpGs were identified among EA survivors, including cg00574958 (associated with TG and TC exposure), cg09737197 (associated with TG exposure), and cg17058475 (associated with TG and TC exposure) in CPT1A, and cg03725309 (associated with TG exposure) in SARS (Additional file 1: Tables S6 and S7), and none among AA survivors. Among the remaining 136 novel lipid-associated CpGs in childhood cancer survivors of EA, 26 were mapped to 23 genes that have been previously reported as lipid-associated (Additional file 1: Table S8). Among 12 additional blood lipid-associated CpGs identified in the meta-EWAS, five CpGs were reported to be associated with other traits (e.g., sex, age, Schizophrenia, and ADHD (attention-deficit and hyperactivity disorder)) but only cg01082498 (in the 5’UTR region of the CPT1A gene) was associated with blood lipid level in the EWAS Catalog (Additional file 1: Table S9).
Genetic and epigenetic (specifically, DNAm) studies have identified numerous genetic variants or CpG sites that are associated with blood lipids in the general population; as a result, at least 2572 genes have been implicated in lipid metabolism (Additional file 1: Fig. S5) [17, 18]. We conducted the first EWAS of blood lipids among childhood cancer survivors, including EA and AA survivors from the SJLIFE cohort. Among EA survivors, we identified 149 significant associations (140 unique CpGs) with blood lipid levels; 136 of these were novel findings. Among AA survivors, we found 14 novel significant blood lipid-associated CpGs. There were no overlapping CpGs between EA and AA survivors. A majority of these findings are unique to the survivor population, which may be attributable to childhood cancer diagnoses and/or treatments. For example, two TG exposure-associated CpGs, cg24327132 and cg19120513, were associated with chest-RT and abdominal-RT.
In the meta-EWAS, twenty-four CpGs had opposite directions of association with significant heterogeneity between EA and AA survivors (Phet < 0.1), suggesting substantial disparity in lipid-associated CpGs between the two ancestral groups. Meta-EWAS yielded eight additional epigenome-wide significant CpGs with heterogeneity in effect size between EA and AA survivors (Phet < 0.1). However, future replication in EA or AA alone with an independent data set is warranted to validate such findings.
The TG outcome EWAS yielded the greatest number of significant CpGs among EA, with multiple novel lipid-associated CpGs mapped to the same nearby genes, including CDK5RAP3, FCGR2B, HSPA6, and HSPA7. CDK5RAP3 is known to play important roles in liver development and hepatic function. Previous research showed that hepatocyte-specific Cdk5rap3 knockout mice suffered post-weaning lethality because of impaired lipid metabolism and serious hypoglycemia. The FCGR2B gene encodes FcγRIIb, which has a novel role in CD11c+ cells in modulating serum cholesterol and triglyceride levels and maintaining liver cholesterol homeostasis. HSPA6 and HSPA7 are family members of the HSP70 proteins, which are abundantly present in cancer and play crucial roles in cancer development, progression, and metastasis, clinically resulting in diverse outcomes for patient survival. Moreover, cg20935223 was significantly associated with multiple lipid traits in the outcome EWAS and mapped to the CYTH3 gene, which encodes Cytohesin-3, a protein essential for insulin receptor signaling and body fat regulation via lipid excretion. Among novel genes (i.e., genes without nearby lipid-associated CpGs in the EWAS Catalog), four were reported as high-confidence genes that play a role in lipid levels, including LPIN2 (near cg07616376 associated with TG exposure among EA), SCARB1 (near cg08458758 associated with TG outcome among EA), MSI2 (near cg21376908 associated with TG exposure among AA), and SSBP3 (near cg16411101 associated with LDL and TC outcome among AA).
We integrated gene expression levels from RNA sequencing to further characterize the associations between DNAm and blood lipid levels, which strengthened this study. For example, we demonstrated that ten blood lipid-associated CpGs were associated with levels of expression of the annotated genes, in which seven were inversely associated.
However, this study has several limitations. First, although we innovatively designed both exposure and outcome EWAS based on our longitudinal follow-up study, the cross-sectional nature of the data prevented us from disentangling the complex interplay between DNAm and blood lipid levels. Nevertheless, we demonstrated potential regulation of gene expression as a plausible mechanism for DNAm alterations by performing RNA-sequencing analysis. Second, the sample size of survivors of AA was limited, which limited statistical power and makes the AA EWAS exploratory (i.e., some findings might have been identified by chance). However, differences between EA and AA populations as determined by their genetic ancestry were observed, with no overlapping blood lipid-associated CpGs between survivors of EA and those of AA. To further validate the findings, a larger sample size of AA survivors is warranted in the future. Previous methodological work suggested that more than 1,000 subjects are required to achieve 80% power for detection of differential DNAm at nominal genome-wide significance with an odds ratio of 1.15. Third, we obtained DNAm data at only one time point. In the outcome EWAS, all the blood lipid levels were measured after blood draw for DNAm, so the DNAm may be predictive of blood lipid levels. However, to better assess and interpret the changes of blood lipid levels in the exposure EWAS, longitudinal DNAm measurement (ideally, after the first blood lipid level measurement) is required to correlate changes of DNAm between two time points with changes in blood lipid levels. Fourth, the follow-up of our cohort is limited and still ongoing, so there was a large proportion of missing data in the analytic setting of bi-directional association between DNAm and lipid levels, which requires multiple clinical assessments of lipid levels. Lastly, we did not consider cell type-specific DNAm in the current work. Recent research identified that DNAm variation in diseases, such as type 1 diabetes, can be cell type-specific. Therefore, in the future, we may deconvolute bulk DNAm measured in blood leukocytes into cell type–specific quantities and analyze the DNAm associations of each specific cell type.
Our findings demonstrated distinct DNAm signatures associated with blood lipid levels in EA and AA survivors, and that an additional set of genes may be implicated in lipid metabolism in the survivor population compared to the general population. Further longitudinal studies are warranted to replicate and validate DNAm biomarkers for blood lipid levels and other CHCs to facilitate the clinical translation for improved survivorship care.
SJLIFE is a retrospectively constructed cohort with periodic evaluations of survivors beyond 5 years from childhood cancer diagnosis who were treated at St. Jude Children’s Research Hospital. The details of the SJLIFE cohort study have been previously described [15, 16, 26]. Participants complete questionnaires assessing demographic and clinical factors, and receive comprehensive medical and laboratory assessments at each visit to determine health conditions. In this study, a total of 2,052 survivors of EA and 370 survivors of AA, with genome-wide DNAm profiling data, were included. The ancestry for each survivor was determined using genotypes derived from whole-genome sequencing and population admixture analysis as previously described. Primary childhood cancer diagnoses, exposure to chemotherapeutic agents, and region-specific radiation dosimetry were obtained from medical records. All SJLIFE survivors completed at least one comprehensive clinical assessment that included a battery of laboratory tests including blood lipid measurement (HDL, LDL, TC, and TG). The blood lipid levels measured before blood sampling for DNAm were used for the exposure EWAS and the blood lipid levels measured after were used in the outcome EWAS (Fig. 1). A weighted average was calculated if there were multiple measurements, and time intervals between two consecutive measurements were used as weights. We excluded lipid measurements without fasting. Samples with only one lipid measurement (coinciding with the time point for the blood draw for DNAm) were excluded to ensure that our exposure and outcome EWAS examined the temporal association between DNAm and blood lipid levels. All participants provided written informed consent, with institutional review board approval at St. Jude Children’s Research Hospital.
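The time-weighted averaging of repeated lipid measurements described above can be sketched as follows. The text does not specify how each interval weight is attached to individual measurements, so this example assumes a trapezoidal scheme in which each interval weights the mean of its two endpoint values; the function name and the fallback rules for single or simultaneous measurements are hypothetical.

```python
def time_weighted_average(times, values):
    """Average repeated measurements, weighting by time between visits.

    Assumption (not stated in the source): each interval between two
    consecutive visits weights the mean of the two measurements that
    bound it (trapezoidal rule).
    """
    if len(times) != len(values) or not values:
        raise ValueError("times and values must be non-empty and equal length")
    pairs = sorted(zip(times, values))  # order visits chronologically
    if len(pairs) == 1:
        return float(pairs[0][1])  # a single measurement is used as-is
    weighted_sum = 0.0
    total_time = 0.0
    for (t0, v0), (t1, v1) in zip(pairs, pairs[1:]):
        interval = t1 - t0
        weighted_sum += interval * (v0 + v1) / 2.0
        total_time += interval
    if total_time == 0:
        # All visits share one time point: fall back to the plain mean.
        return sum(v for _, v in pairs) / len(pairs)
    return weighted_sum / total_time
```

For hypothetical TG values of 100, 120, and 180 mg/dL measured at years 0, 1, and 3, the call `time_weighted_average([0, 1, 3], [100, 120, 180])` weights the later, longer interval twice as heavily as the first.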
DNAm profiling and data processing
The Illumina Infinium® MethylationEPIC BeadChip array, which covers 850K CpG sites, was used to generate genome-wide DNAm profiles from DNA derived from peripheral blood mononuclear cells (PBMC) collected at each follow-up visit for SJLIFE survivors. Details about laboratory experimental processes, array scanning, and DNAm bioinformatics data analysis were previously described by Song et al.
Genotyping based on whole-genome sequencing (WGS)
Genotyping was based on whole-genome sequencing data of blood-derived DNA from 4402 SJLIFE survivors as previously described [29, 30]. Details about data processing and genotype calling, as well as additional genotype quality control criteria and procedures, were previously described in Dong et al.
Epigenome-wide association analysis
Bidirectional EWAS was conducted using a multivariable linear regression to test the association of DNAm levels at each CpG (M-value, continuous variable) with blood lipid levels (continuous variable). We performed principal components analysis of methylation levels of all CpG sites to quantify potential batch effects in the DNAm data. The top four principal components were determined by the change rate of eigenvalues and were included as covariates in the regression model. We also performed principal components analysis of genotypes derived from WGS to quantify the population substructure in EA and AA survivors. The top four principal components were determined by the change rate of eigenvalues and were included as covariates in the regression model. In the exposure EWAS, a multivariable linear regression model was used with lipid level (weighted average was calculated if there were multiple measurements, and time intervals between two consecutive measurements were used as weights) prior to DNA sampling as an independent variable and DNAm as a dependent variable, adjusting for sex, age at DNA sampling, leukocyte subtype proportions, top four significant genetic principal components, top four methylation principal components, cancer treatments, median age of lipid measurement, BMI, cigarette smoking, and lipid-lowering medication use. All these covariates were potential confounding factors for the DNAm level of each CpG, and hence were considered in the exposure EWAS. Cancer treatments included chemotherapy and radiation therapy within 5 years from primary childhood cancer diagnosis. The chemotherapy agents included classical alkylating agents, anthracyclines, corticosteroids, vinca alkaloids, asparaginase enzymes, antimetabolites, and epipodophyllotoxins. The region-specific RT included brain-RT, chest-RT, abdomen-RT, and pelvis-RT. For smoking status as a categorical variable, we included three levels (“never”, “ever”, and “unknown”) in the model. BMI was measured at the same time as DNAm sampling. The CpGassoc R package was used for the exposure EWAS analyses.
In the outcome EWAS, a multivariable linear regression model was used with DNAm (age-, sex-, cell-type-, genotype principal components-, and methylation principal components-adjusted) as an independent variable and lipid level (weighted average was calculated if there were multiple measurements, and time intervals between two consecutive measurements were used as weights) after DNA sampling as a dependent variable. For EA survivors, a base model without DNAm level but including the complete set of covariates (i.e., sex, cancer treatments, median age of lipid measurement, BMI, smoking, lipid-lowering medication use, lipid level measured at DNA sampling, age at DNA sampling, and a polygenic risk score for the specific lipid level, used in EA survivors only) was fitted. Cancer treatment exposures that were not statistically significant (P > 0.05) in the base model were subsequently excluded. In the final model, the DNAm level of each CpG was added for the EWAS analysis. For AA survivors, considering the smaller sample size and potential overfitting, a similar but slightly different variable selection approach was taken by additionally excluding BMI, smoking status, and lipid-lowering medication use if any of these was not statistically significant (P > 0.05) in the base model. The polygenic risk score for each specific lipid level was constructed by following the same approach described previously for EA survivors. Custom R code was used for the outcome EWAS analyses. A P value less than 9 × 10−8 was deemed the epigenome-wide significance level, corresponding to a 5% family-wise error rate.
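The regression models above use M-values rather than raw methylation proportions. As a minimal sketch, assuming the standard logit-base-2 definition of the M-value, the conversion from a beta value (proportion methylated, between 0 and 1) can be written as below. The clipping offset `eps` and both function names are illustrative choices, not taken from the study, and Python stands in for the R tooling the authors name.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta value (0..1) to an M-value.

    M = log2(beta / (1 - beta)). The eps clipping is an assumption that
    keeps fully methylated or unmethylated sites finite.
    """
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))

def m_to_beta(m_value):
    """Inverse transform: recover the beta value from an M-value."""
    return 2.0 ** m_value / (2.0 ** m_value + 1.0)
```

A beta value of 0.5 maps to an M-value of 0, and the transform stretches values near 0 and 1, which is the usual motivation for using M-values as a continuous variable in linear regression.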
RNA-sequence profiling and data processing
RNA was extracted from the same PBMC used for DNA methylation profiling. Details of library construction, sequencing, and data processing were described previously. Briefly, paired-end 100 cycle sequencing was performed on a NovaSeq 6000 (Illumina). After quality control procedures, raw reads from the fastq files were aligned to the GRCh38.p13 version (v31) of the reference human genome from GENCODE through the automated internal pipeline. The generated bam files were sorted and used to build an index using Samtools (version 1.9), then used as inputs for counting reads using htseq-count with the GENCODE v31 gene annotation gtf file.
A total of 165 samples of RNA-seq data (135 EA survivors and 30 AA survivors) were available for further analysis. After removing transcripts with mean read counts across all 165 samples less than 10, a total of 12,882 genes were determined to be expressed in PBMC. Transcripts per million (TPM) were calculated and transformed in the form of log2(TPM + 0.01). The function normalizeQuantiles in the limma package in R (version 3.6.1) was used for quantile normalization of the log-transformed values before further downstream analyses.
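The TPM calculation and log2(TPM + 0.01) transform described above can be sketched for a single sample as follows. The gene lengths passed in are an assumption, since the text does not say how lengths were derived from the annotation; the function names are hypothetical, and the quantile normalization step is not reproduced here.

```python
import math

def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to transcripts per million.

    lengths_bp are gene lengths in base pairs (an input assumed here;
    the source does not describe how lengths were obtained).
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have equal length")
    # Reads per kilobase corrects for gene length.
    rates = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        raise ValueError("sample has no reads")
    # Scale so that the values of one sample sum to one million.
    return [rate / total * 1e6 for rate in rates]

def log2_tpm(tpm_values, pseudocount=0.01):
    """Apply the log2(TPM + 0.01) transform stated in the text."""
    return [math.log2(t + pseudocount) for t in tpm_values]
```

The 0.01 pseudocount keeps genes with zero reads finite after the log transform; such a gene maps to log2(0.01), roughly -6.64.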
Expression quantitative trait methylation
We used the Infinium® MethylationEPIC BeadChip array annotations (v1.0 B5) provided by Illumina (https://webdata.illumina.com/downloads/productfiles/methylationEPIC/infinium-methylationepic-v-1-0-b5-manifest-file-csv.zip) to map CpGs to their annotated genes. For 135 RNA-seq samples from EA survivors (out of 165 in total), the normalized expression values of the nearest genes of each lipid-associated CpG were extracted to fit a linear regression against the DNAm levels of each CpG.
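The CpG-gene associations are reported as significant at FDR < 0.05, but the correction procedure is not named. A minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure, converts the per-pair regression P values to adjusted values; the function name and the P values in the usage note are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: the source reports FDR < 0.05 without naming a method;
    Benjamini-Hochberg is used here as a common default.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

For example, with four hypothetical P values `[0.01, 0.04, 0.03, 0.005]`, every adjusted value falls below 0.05, so all four pairs would be called significant under this assumed procedure.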
Additional statistical and bioinformatic analyses
Quantile–Quantile plots were generated from P-values in each EWAS using the R lattice package. The GenABEL R package was used to estimate the genomic inflation factor (i.e., lambda). We searched lipid-associated CpGs and nearby genes identified in our study and compared them with those previously reported in the EWAS Catalog.
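The genomic inflation factor can be illustrated with a short sketch. It uses the median-based definition (the median observed 1-df chi-square statistic divided by its expected median under the null), which is one common estimator; the GenABEL routine used by the authors may apply a different estimator, and the function name here is hypothetical.

```python
from statistics import NormalDist, median

def genomic_inflation_factor(pvalues):
    """Median-based lambda from two-sided P values in the range (0, 1].

    Assumption: median estimator; not confirmed to match the GenABEL
    settings used in the study.
    """
    normal = NormalDist()
    # A two-sided P value p corresponds to a 1-df chi-square statistic z**2,
    # where z is the standard normal quantile at p / 2.
    chi2_stats = [normal.inv_cdf(p / 2.0) ** 2 for p in pvalues]
    # Median of a chi-square with 1 df under the null (about 0.455).
    expected_median = normal.inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median
```

Uniformly distributed P values give lambda close to 1, while an excess of small P values pushes lambda above 1, as seen for some of the AA EWAS.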
Availability of data and materials
The DNA methylation data is accessible through the St. Jude Cloud (https://stjude.cloud).
CHCs: Chronic health conditions
SJLIFE: St. Jude lifetime cohort study
Mittelstraß K, Waldenberger M. DNA methylation in human lipid metabolism and related diseases. Curr Opin Lipidol. 2018;29(2):116–24.
Robison LL, Hudson MM. Survivors of childhood and adolescent cancer: life-long risks and responsibilities. Nat Rev Cancer. 2014;14(1):61–70.
Bhakta N, Liu Q, Ness KK, Baassiri M, Eissa H, Yeo F, et al. The cumulative burden of surviving childhood cancer: an initial report from the St Jude lifetime cohort study (SJLIFE). Lancet. 2017;390(10112):2569–82.
Armenian SH, Armstrong GT, Aune G, Chow EJ, Ehrhardt MJ, Ky B, et al. Cardiovascular disease in survivors of childhood cancer: insights into epidemiology, pathophysiology, and prevention. J Clin Oncol. 2018;36(21):2135–44.
Khanna A, Pequeno P, Gupta S, Thavendiranathan P, Lee DS, Abdel-Qadir H, et al. Increased risk of all cardiovascular disease subtypes among childhood cancer survivors: population-based matched cohort study. Circulation. 2019;140(12):1041–3.
Song N, Hsu CW, Pan H, Zheng Y, Hou L, Sim JA, et al. Persistent variations of blood DNA methylation associated with treatment exposures and risk for cardiometabolic outcomes in long-term survivors of childhood cancer in the St. Jude lifetime cohort. Genome Med. 2021;13(1):53.
Hudson MM, Ness KK, Gurney JG, Mulrooney DA, Chemaitilly W, Krull KR, et al. Clinical ascertainment of health outcomes among adults treated for childhood cancer. JAMA. 2013;309(22):2371–81.
Reulen RC, Winter DL, Frobisher C, Lancashire ER, Stiller CA, Jenney ME, et al. Long-term cause-specific mortality among survivors of childhood cancer. JAMA. 2010;304(2):172–9.
Frank AT, Zhao B, Jose PO, Azar KM, Fortmann SP, Palaniappan LP. Racial/ethnic differences in dyslipidemia patterns. Circulation. 2014;129(5):570–9.
Sheet SF. Older Americans and cardiovascular diseases. Dallas: American Stroke Association; 2013.
Liu Q, Leisenring WM, Ness KK, Robison LL, Armstrong GT, Yasui Y, et al. Racial/ethnic differences in adverse outcomes among childhood cancer survivors: the childhood cancer survivor study. J Clin Oncol. 2016;34(14):1634–43.
Husquin LT, Rotival M, Fagny M, Quach H, Zidane N, McEwen LM, et al. Exploring the genetic basis of human population differences in DNA methylation and their causal impact on immune gene regulation. Genome Biol. 2018;19(1):222.
Breeze CE, Batorsky A, Lee MK, Szeto MD, Xu X, McCartney DL, et al. Epigenome-wide association study of kidney function identifies trans-ethnic and ethnic-specific loci. Genome Med. 2021;13(1):74.
Breeze CE, Wong JYY, Beck S, Berndt SI, Franceschini N. Diversity in EWAS: current state, challenges, and solutions. Genome Med. 2022;14(1):71.
Hudson MM, Ness KK, Nolan VG, Armstrong GT, Green DM, Morris EB, et al. Prospective medical assessment of adults surviving childhood cancer: study design, cohort characteristics, and feasibility of the St. Jude lifetime cohort study. Pediatr Blood Cancer. 2011;56(5):825–36.
Hudson MM, Ehrhardt MJ, Bhakta N, Baassiri M, Eissa H, Chemaitilly W, et al. Approach for classification and severity grading of long-term and late-onset health events among childhood cancer survivors in the St. Jude lifetime cohort. Cancer Epidemiol Biomark Prev. 2017;26(5):666–74.
Battram T, Yousefi P, Crawford G, Prince C, Sheikhali Babaei M, Sharp G, et al. The EWAS catalog: a database of epigenome-wide association studies. Wellcome Open Res. 2022;7:41.
Buniello A, MacArthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, et al. The NHGRI-EBI GWAS catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47(D1):D1005–D1012.
Yang R, Wang H, Kang B, Chen B, Shi Y, Yang S, et al. CDK5RAP3, a UFL1 substrate adaptor, is crucial for liver development. Development. 2019;146(2):1–13.
Marvin J, Rhoads JP, Major AS. FcγRIIb on CD11c(+) cells modulates serum cholesterol and triglyceride levels and differentially affects atherosclerosis in male and female Ldlr(-/-) mice. Atherosclerosis. 2019;285:108–19.
Albakova Z, Armeev GA, Kanevskiy LM, Kovalenko EI, Sapozhnikov AM. HSP70 multi-functionality in cancer. Cells. 2020;9(3):587.
Jux B, Gosejacob D, Tolksdorf F, Mandel C, Rieck M, Namislo A, et al. Cytohesin-3 is required for full insulin receptor signaling and controls body weight via lipid excretion. Sci Rep. 2019;9(1):3442.
Kanoni S, Graham SE, Wang Y, Surakka I, Ramdas S, Zhu X, et al. Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis. Genome Biol. 2022;23(1):1–42.
Tsai PC, Bell JT. Power and sample size estimation for epigenome-wide association scans to detect differential DNA methylation. Int J Epidemiol. 2015;44(4):1429–41.
Paul DS, Teschendorff AE, Dang MA, Lowe R, Hawa MI, Ecker S, et al. Increased DNA methylation variability in type 1 diabetes across three immune effector cell types. Nat Commun. 2016;7:13555.
Howell CR, Bjornard KL, Ness KK, Alberts N, Armstrong GT, Bhakta N, et al. Cohort profile: the St. Jude lifetime cohort study (SJLIFE) for paediatric cancer survivors. Int J Epidemiol. 2021;50(1):39–49.
Song N, Sim JA, Dong Q, Zheng Y, Hou L, Li Z, et al. Blood DNA methylation signatures are associated with social determinants of health among survivors of childhood cancer. Epigenetics. 2022;17:1–15.
Wang Z, Liu Q, Wilson CL, Easton J, Mulder H, Chang T-C, et al. Polygenic determinants for subsequent breast cancer risk in survivors of childhood cancer: the St Jude lifetime cohort study (SJLIFE). Clin Cancer Res. 2018;24(24):6230–5.
Qin N, Wang Z, Liu Q, Song N, Wilson CL, Ehrhardt MJ, et al. Pathogenic germline mutations in DNA repair genes in combination with cancer treatment exposures and risk of subsequent neoplasms among long-term survivors of childhood cancer. J Clin Oncol. 2020;38(24):2728.
Wang Z, Wilson CL, Easton J, Thrasher A, Mulder H, Liu Q, et al. Genetic risk for subsequent neoplasms among long-term survivors of childhood cancer. J Clin Oncol. 2018;36(20):2078.
Dong Q, Song N, Qin N, Chen C, Li Z, Sun X, et al. Genome-wide association studies identify novel genetic loci for epigenetic age acceleration among survivors of childhood cancer. Genome Med. 2022;14(1):1–12.
Barfield RT, Kilaru V, Smith AK, Conneely KN. CpGassoc: an R function for analysis of DNA methylation microarray data. Bioinformatics. 2012;28(9):1280–1.
Mansell G, Gorrie-Stone TJ, Bao Y, Kumari M, Schalkwyk LS, Mill J, et al. Guidance for DNA methylation studies: statistical insights from the Illumina EPIC array. BMC Genom. 2019;20(1):366.
Parker M, Mohankumar KM, Punchihewa C, Weinlich R, Dalton JD, Li Y, et al. C11orf95-RELA fusions drive oncogenic NF-κB signalling in ependymoma. Nature. 2014;506(7489):451–5.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
Anders S, Pyl PT, Huber W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31(2):166–9.
Wagner GP, Kin K, Lynch VJ. Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples. Theory Biosci. 2012;131(4):281–5.
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47.
Bolstad BM, Irizarry RA, Astrand M, Speed TP. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003;19(2):185–93.
Aulchenko YS, Ripke S, Isaacs A, van Duijn CM. GenABEL: an R library for genome-wide association analysis. Bioinformatics. 2007;23(10):1294–6.
The authors thank all individuals who participated in this study.
This work was supported by funding from the V Foundation [Grant # DT2020-014], the National Institutes of Health [Grant # CA021765, CA195547], and the American Lebanese Syrian Associated Charities (ALSAC). The funders of the study had no role in the design and conduct of the study and were not involved in collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or the decision to submit the manuscript for publication.
Ethics approval and consent to participate
The SJLIFE study protocol was approved by the Institutional Review Board (IRB) at St. Jude Children’s Research Hospital.
Consent for publication
Competing interests
We declare no competing interests.
About this article
Cite this article
Dong, Q., Chen, C., Song, N. et al. Distinct DNA methylation signatures associated with blood lipids as exposures or outcomes among survivors of childhood cancer: a report from the St. Jude lifetime cohort. Clin Epigenet 15, 32 (2023). https://doi.org/10.1186/s13148-023-01447-3
- DNA methylation
- Lipid levels
- Childhood cancer survivors
- African ancestry
- European ancestry