Identification of DNA methylation-regulated genes as potential biomarkers for coronary heart disease via machine learning in the Framingham Heart Study

Zhang, Xiaokang; Wang, Chen; He, Dingdong; Cheng, Yating; Yu, Li; Qi, Daoxi; Li, Boyu; Zheng, Fang

doi:10.1186/s13148-022-01343-2

Table 2 Performances of the models based on machine learning

Features	Algorithm	Dataset	F1	ACC	AUC (95% CI)	AP	KS	TP	FP	TN	FN	TPR	TNR	Kappa
Methylation	LightGBM	Training	0.995	0.999	1.000 (1.000–1.000)	1.000	1.000	197	0	1364	2	0.990	1.000	0.994
		Validation	0.460	0.807	0.768 (0.694–0.843)	0.538	0.540	43	77	378	24	0.642	0.831	0.353
	XGBoost	Training	1.000	1.000	1.000 (1.000–1.000)	1.000	1.000	199	0	1364	0	1.000	1.000	1.000
		Validation	0.429	0.770	0.756 (0.683–0.830)	0.391	0.525	45	98	357	22	0.672	0.785	0.308
	Random forest	Training	0.995	0.999	1.000 (1.000–1.000)	1.000	1.000	197	0	1364	2	0.990	1.000	0.994
		Validation	0.443	0.803	0.737 (0.656–0.818)	0.611	0.517	41	77	378	26	0.612	0.831	0.334
Expression	LightGBM	Training	0.992	0.998	1.000 (1.000–1.000)	1.000	1.000	196	0	1364	3	0.985	1.000	0.991
		Validation	0.447	0.801	0.709 (0.626–0.792)	0.465	0.472	42	79	376	25	0.627	0.826	0.337
	XGBoost	Training	0.997	0.999	1.000 (1.000–1.000)	1.000	1.000	198	0	1364	1	0.995	1.000	0.997
		Validation	0.426	0.784	0.706 (0.646–0.766)	0.538	0.494	42	88	367	25	0.627	0.807	0.309
	Random forest	Training	1.000	1.000	1.000 (1.000–1.000)	1.000	1.000	199	0	1364	0	1.000	1.000	1.000
		Validation	0.283	0.592	0.647 (0.563–0.731)	0.347	0.320	42	188	267	25	0.627	0.587	0.105
Combination	LightGBM	Training	1.000	1.000	1.000 (1.000–1.000)	1.000	1.000	199	0	1364	0	1.000	1.000	1.000
		Validation	0.517	0.839	0.834 (0.770–0.897)	0.616	0.615	45	62	393	22	0.672	0.864	0.427
	XGBoost	Training	1.000	1.000	1.000 (1.000–1.000)	1.000	1.000	199	0	1364	0	1.000	1.000	1.000
		Validation	0.439	0.780	0.807 (0.740–0.874)	0.460	0.566	45	93	362	22	0.672	0.796	0.322
	Random forest	Training	0.995	0.999	1.000 (1.000–1.000)	1.000	1.000	197	0	1364	2	0.990	1.000	0.994
		Validation	0.452	0.791	0.818 (0.758–0.878)	0.599	0.487	45	87	368	22	0.672	0.809	0.340
FRS	Framingham 10-year risk scale	Training	–	0.830	0.647 (0.606–0.687)	–	–	39	105	1188	147	0.210	0.919	–
		Validation	–	0.797	0.610 (0.536–0.684)	–	–	12	49	376	50	0.194	0.885	–

ACC, accuracy; AUC, area under the receiver operating characteristic curve; CI, confidence interval; AP, average precision score; KS, Kolmogorov–Smirnov statistic; TP, true positive; FP, false positive; TN, true negative; FN, false negative; TPR, true positive rate; TNR, true negative rate; FRS, Framingham risk score. Dashes indicate parameters that are not applicable to the Framingham 10-year risk scale.
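The count-based columns (F1, ACC, TPR, TNR, Kappa) can be recomputed from the TP, FP, TN and FN entries of any row. The following is a minimal sketch that assumes the standard textbook definitions of these metrics, which the table itself does not spell out; the function name `confusion_metrics` is hypothetical. AUC, AP and KS are not covered because they require per-sample prediction scores, which the table does not report.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive count-based metrics from a 2x2 confusion matrix.

    Assumes the standard definitions; this is an illustrative helper,
    not the authors' code.
    """
    n = tp + fp + tn + fn
    acc = (tp + tn) / n                 # accuracy
    tpr = tp / (tp + fn)                # true positive rate (sensitivity)
    tnr = tn / (tn + fp)                # true negative rate (specificity)
    f1 = 2 * tp / (2 * tp + fp + fn)    # harmonic mean of precision and recall
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (acc - p_chance) / (1 - p_chance)
    return {"F1": f1, "ACC": acc, "TPR": tpr, "TNR": tnr, "Kappa": kappa}


# Counts taken from the Combination / LightGBM / Validation row.
row = confusion_metrics(tp=45, fp=62, tn=393, fn=22)
print({name: round(value, 3) for name, value in row.items()})
```

Rounded to three decimals, these values agree with the F1, ACC, TPR, TNR and Kappa reported in that row (0.517, 0.839, 0.672, 0.864, 0.427).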

ISSN: 1868-7083

Clinical Epigenetics
