Study population
The study population included individuals from the Swedish Adoption/Twin Study of Aging (SATSA), a sub-study of the Swedish Twin Registry (STR). Details on the study design, sample characteristics, and methods of data collection have been described in previous publications [6,7,8]. Briefly, SATSA, launched in 1984, has overall 2018 participants and includes 10 waves of repeated measurements of various aspects of aging-related information from Swedish twin pairs [6]. In each wave, the individuals were examined using a comprehensive questionnaire survey to collect participants’ information concerning their working environment and health-related behaviors. In addition, in-person testing (IPT) was performed with health examinations, structured cognitive tests, and blood sample collection in a subsample of the SATSA cohort [9]. The naming convention was “IPT” with a number (for example, first IPT was IPT1). IPT1 was performed in 1986 to 1988, and then every third year, a new IPT was conducted. After IPT8 the interval was every second year. A total of ten IPTs were conducted through 2014. IPT4 was excluded because it was implemented only through telephone survey. For the inclusion and exclusion of IPT samples, briefly, participants in IPT1 were those who responded to questionnaire 1 and were above 50 years old at the start of IPT1. From IPT2 to IPT5, participants included those who had taken part in the previous IPT and new twins who had recently turned 50 years old. From IPT6 and onwards, there were no new twins added, only longitudinal followed-up (Fig. 1).
Totally, 861 individuals have participated in at least one wave of IPT in the SATSA cohort. DNA methylation was measured in 536 individuals attending at least one of these IPTs (IPT3, IPT5, IPT6, IPT8, IPT9, IPT10). After excluding one individual’s samples used for quality control of different DNA methylation chip assays, 535 individuals corresponding to 1399 samples with up to six DNA methylation measurements were included in the present study (Fig. 1). Of 535 participants, there were 83 MZ twin pairs, 155 DZ twin pairs and 59 singletons. Singleton was defined as those individuals in the cohort who only have his/her own information but we do not have information about his/her twin. All participants provided informed consent, and ethics approval for this study was given by the Regional Ethics Board at Karolinska Institutet, Stockholm, Sweden.
Definition and measurement of lipids and statin use
Serum was extracted from whole blood and stored at − 70 °C for the estimation of lipid profiles. Total triglyceride (TG) and total cholesterol (TC) levels were measured by the enzymatic calorimetric method at all IPTs except IPT4. High-density lipoprotein cholesterol (HDL) was measured by the precipitation method at IPT1 and homogeneous assays were used in later IPTs [10]. Outliers were removed as described in a previous study [11]. Low-density lipoprotein cholesterol (LDL) was tested only at IPT7-10; therefore, we used the Friedewald equation [12] “LDL = TC—HDL—TG/5” to impute the missing values of LDL. The correlation between LDL measured and LDL calculated was 98.8%, demonstrating the high quality of imputation. In general, Friedewald equation is not valid for TG levels greater than 400 mg/dL (equal to 4.52 by multiplying by 0.0113 to convert to millimoles per liter (mmol/L)) and direct measurement of LDL is preferred in this situation. We have 1.5% (21/1399) of samples with TG higher than 400 mg/dL; however, their direct LDL measurements were missing, we hence still chose to use calculated LDL for TG levels higher than 400 mg/dL, but sensitivity analyses were further performed by excluding those 21 participants. All lipid values are expressed in mmol/L.
Information on statin use was obtained from self-reported records in IPTs and from data linked from the Swedish Prescribed Drug Register available from 2005 and onwards. In our dataset, the most commonly used statin was simvastatin, which accounted for approximately 77% of all statin use. Atorvastatin is the second commonly used statin, followed by Rosuvastatin, Pravastatin and Fluvastatin. We defined “statin users” as individuals with at least one record of taking any types of statin therapy and coded it as 1. Likewise, “non-statin users” were coded as 0.
DNA methylation measurements
Blood samples for DNA methylation testing were collected from a subsample of SATSA participants [7, 8]. DNA was extracted from peripheral blood and then treated with bisulfite using the EZ-96 DNA MagPrep methylation kit (Zymo Research, Irvine & Tustin, California, USA). Using bisulfite-converted DNA, methylation of DNA was quantified by the Infinium HumanMethylation450 BeadChip (Illumina, Inc., San Diego, California, USA) or Infinium Methylation EPIC BeadChip assays (Illumina, Inc.). Methylation data were harmonized using a preprocessing pipeline including quality control, normalization and adjustment for cell counts and batch effects, as described in previous studies [7, 8]. Finally, 255,356 CpGs from both arrays remained in the analysis, and repeated measurements of DNA methylation were distributed over IPT3, IPT5, IPT6, IPT8, IPT9, and IPT10. Beta values ranging between 0 and 1 were used as DNA methylation levels.
For the selection of statin-associated CpGs, five candidate CpGs were chosen from Ochoa-Rosales et al. [5]. The first two, cg17901584 and cg10177197, are located in the DHCR24 gene that encodes an enzyme involved in cholesterol biosynthesis, oxidative stress response, neuroprotection, anti-apoptosis and anti-inflammatory activities [13]. The other two, cg06500161 and cg27243685, are located in the ABCG1 gene that encodes a protein that plays an important role in the process of reverse cholesterol transport [14]. The last one, cg05119988, is located in sterol-C4-methyl oxidase–like gene that encodes enzyme to catalyze the demethylation of C4-methylsterols in the cholesterol synthesis pathway. However, cg06500161 and cg05119988 were excluded from our analysis because they failed to pass the quality control procedure.
Statistical analysis
To replicate the longitudinal association between blood lipids and DNA methylation in selected CpGs (aim 1), we used a linear mixed effect model. DNA methylation level was used as a dependent variable and lipid level as an independent variable. Age (repeated measurement, continuous variable), sex (binary variable, women/men coded as 1/0), and smoking (repeated measurement; non-smoker, ex-smoker and current smoker coded as 1, 2 and 3, respectively) were covariates. Individual ID nested within twin pair ID was entered as a random effect, other variables were fixed effects.
For aim 2 and 3, examining the statin effect on the longitudinal changes of blood lipids and DNA methylation, a piecewise latent linear–linear growth curve model (LGCM) was used. Details on the model are presented in Additional file 1: Methods, Figure S1, and Tables S1-S2. Briefly, the piecewise LGCM was constructed in the framework of a structural equation model (SEM), and in our study it was composed of two separate linear segments that were connected by a knot (or change point), here defined as the IPT of starting statin use. The piecewise LGCM has three latent variables: one intercept measuring the individual lipid (or methylation) level at the knot and two slopes (denoted as Pre_slope and Post_slope, respectively) measuring the changing rate of lipids (or DNA methylation) over time before and after statin therapy for each individual. The piecewise LGCM was established separately for statin users and non-users. Of note, non-statin users did not have a change point and the piecewise LGCM was reduced to one phase with only two latent variables (intercept and Pre_slope). We included sex, baseline age, and statin use as time-independent covariates in the regression equation of latent intercepts and slopes, as we assumed that these covariates would have associations with these latent variables [5, 7, 8, 15]. However, statin use only had associations with Pre_slopes.
To fulfill aim 4, to examine the co-varying dynamic patterns of blood lipids and DNA methylation levels across multiple time points in response to statin treatment, we used a bivariate autoregressive latent trajectory model with structured residuals (ALT-SR model) stratified by status of statin therapy. We used the same ALT-SR model as in our previous research [16]. More details are found in Additional file 1: Method and Figure S2. Briefly, the ALT-SR model is also established in the frame of SEM and composed of two parts: the autoregressive model (AR) and the linear latent growth curve model (LGM). The LGM was established first by the latent intercept and the latent slope, which had the same interpretation as in the piecewise LGCM. The AR model measured the co-varying pattern of DNA methylation and lipids over time and was then established based on two regression paths: the autoregressive path measuring the within-person changes of DNA methylation or lipids over time, and the cross-lagged path measuring the predicted effect of DNA methylation at one time point on the within-person lipid level at the adjacent later time point, and/or vice versa. The ALT-SR was constructed separately in statin users and non-users, and differences between the coefficients from separate ALT-SR were used to assess the statin effect on the co-varying dynamic patterns of blood lipids and DNA methylation levels over time. Sex and baseline age were included as time-independent covariates.
The Bonferroni method was applied to control for multiple comparisons based on the numbers of independent tests. In the piecewise LGCM and ALT-SR model, three indices and their thresholds were used to indicate a well-fit model: chi-square value (P value > 0.05), root-mean-square error of approximation (RMSEA) (< 0.06), and comparative fit index (CFI) (> 0.95) [17]. Sensitivity analyses for all the aims were conducted by excluding samples with TG levels greater than 400 mg/dL to assess the impact of imputation of missing measurement of LDL.
All analyses were performed with R software (4.0.5). The Lavaan package (0.6–9) was used to fit the piecewise LGCM and ALT-SR model. Onyx software [18] was used to create the diagrams in Additional file 1: Figure S1 and Figure S2.