Fig. 8 | Clinical Epigenetics

Fig. 8

From: Stability selection enhances feature selection and enables accurate prediction of gestational age using only five DNA methylation sites

Overview of sample selection and analysis flow. Datasets are highlighted in green, methods in blue, analysis output in orange, and epigenetic clocks in yellow. Two randomly sampled subsets from MoBa (dataset 1 and dataset 2) were included in the current study. Data from four individuals who were present in both datasets were excluded from dataset 2. The two datasets were then merged into a single dataset (‘combined dataset’). The samples from the combined dataset were randomly assigned to a training set and a test set. Stability selection was performed on both the combined dataset and the training set. Generalized additive model (GAM) regression was then used to model the effect of the stably selected CpGs on gestational age (GA), yielding clocks based on those CpGs. In parallel, lasso regression was performed directly on the training set to build a standard GA clock. The standard GA clock and the clocks based on the stably selected CpGs were used to predict GA in the test set.
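The sample-selection steps in the caption (excluding overlapping individuals from dataset 2, merging, and randomly assigning samples to a training and a test set) can be illustrated with a minimal Python sketch. It is not the authors' code. The field name `individual_id`, the toy sample counts, the 25% test fraction, and the random seed are all hypothetical; the caption does not state the split ratio or how individuals were identified.

```python
import random


def build_combined(dataset1, dataset2):
    """Merge two datasets, dropping dataset-2 samples whose individual is already in dataset 1."""
    ids_in_dataset1 = {sample["individual_id"] for sample in dataset1}
    dataset2_kept = [s for s in dataset2 if s["individual_id"] not in ids_in_dataset1]
    return dataset1 + dataset2_kept


def split_train_test(samples, test_fraction, seed):
    """Randomly assign samples to a training set and a test set."""
    rng = random.Random(seed)   # seeded so the split is reproducible
    shuffled = list(samples)    # copy, so the input order is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy data (hypothetical): individuals 6-9 appear in both datasets.
dataset1 = [{"individual_id": i, "source": "dataset 1"} for i in range(0, 10)]
dataset2 = [{"individual_id": i, "source": "dataset 2"} for i in range(6, 16)]

combined = build_combined(dataset1, dataset2)
# test_fraction and seed are illustrative assumptions, not values from the study.
train, test_set = split_train_test(combined, test_fraction=0.25, seed=0)
```

In this sketch the overlapping individuals are removed from dataset 2 rather than dataset 1, matching the caption. Stability selection, GAM regression, and lasso regression would then operate on `combined` and `train`, and are not shown here.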
