Table 3 Performances of binary classifiers of smoking status

From: Epigenetic modelling of former, current and never smokers

| Comparison | Data | Accuracy statistic | AHRR model (reference) | Candidate CpG LASSO model | Maas model | Agnostic LASSO model |
|---|---|---|---|---|---|---|
| Ever/never smokers | Training data | Accuracy (95% CI) | 0.721 (0.693–0.747) | 0.771 (0.744–0.795) | 0.752 (0.725–0.777) | 0.792 (0.766–0.816) |
| | | NIR (P: Acc > NIR) | 0.658 (6.3 × 10⁻⁶) | 0.658 (7.3 × 10⁻¹⁶) | 0.658 (2.1 × 10⁻¹¹) | 0.658 (< 2.2 × 10⁻¹⁶) |
| | | Kappa | 0.444 | 0.527 | 0.480 | 0.577 |
| | | Sensitivity | 0.661 | 0.743 | 0.745 | 0.744 |
| | | Specificity | 0.835 | 0.824 | 0.764 | 0.885 |
| | | PPV | 0.885 | 0.890 | 0.858 | 0.925 |
| | | NPV | 0.562 | 0.625 | 0.610 | 0.658 |
| | External validation data | Accuracy (95% CI) | 0.815 (0.784–0.842) | 0.837 (0.808–0.863) | 0.822 (0.791–0.849) | 0.822 (0.791–0.849) |
| | | NIR (P: Acc > NIR) | 0.637 (< 2.2 × 10⁻¹⁶) | 0.637 (< 2.2 × 10⁻¹⁶) | 0.637 (< 2.2 × 10⁻¹⁶) | 0.637 (< 2.2 × 10⁻¹⁶) |
| | | Kappa | 0.624 | 0.661 | 0.627 | 0.633 |
| | | Sensitivity | 0.766 | 0.818 | 0.814 | 0.792 |
| | | Specificity | 0.900 | 0.869 | 0.835 | 0.873 |
| | | PPV | 0.931 | 0.917 | 0.896 | 0.917 |
| | | NPV | 0.686 | 0.731 | 0.719 | 0.705 |
| Current/former smokers | Training data | Accuracy (95% CI) | 0.707 (0.671–0.740) | 0.512 (0.474–0.550) | 0.700 (0.664–0.733) | 0.757 (0.723–0.788) |
| | | NIR (P: Acc > NIR) | 0.522 (< 2.2 × 10⁻¹⁶) | 0.522 (0.715) | 0.522 (< 2.2 × 10⁻¹⁶) | 0.522 (< 2.2 × 10⁻¹⁶) |
| | | Kappa | 0.416 | 0.025 | 0.403 | 0.516 |
| | | Sensitivity | 0.658 | 0.504 | 0.625 | 0.701 |
| | | Specificity | 0.761 | 0.521 | 0.781 | 0.817 |
| | | PPV | 0.750 | 0.535 | 0.758 | 0.808 |
| | | NPV | 0.670 | 0.490 | 0.656 | 0.715 |
| | External validation data | Accuracy (95% CI) | 0.646 (0.600–0.689) | 0.541 (0.494–0.587) | 0.619 (0.573–0.664) | 0.674 (0.629–0.717) |
| | | NIR (P: Acc > NIR) | 0.576 (1.3 × 10⁻³) | 0.576 (0.940) | 0.576 (0.03) | 0.576 (9.9 × 10⁻⁶) |
| | | Kappa | 0.318 | 0.093 | 0.251 | 0.373 |
| | | Sensitivity | 0.825 | 0.603 | 0.706 | 0.861 |
| | | Specificity | 0.513 | 0.494 | 0.555 | 0.536 |
| | | PPV | 0.556 | 0.468 | 0.539 | 0.578 |
| | | NPV | 0.799 | 0.628 | 0.719 | 0.839 |
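
For readers who want to see how statistics of this kind relate to one another, the sketch below derives accuracy, the no-information rate (NIR), Cohen's kappa, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) from a 2 × 2 confusion matrix, together with a one-sided exact binomial P value for accuracy exceeding the NIR. It is an illustration only, not the authors' code: the function name `binary_classifier_stats` and the counts are hypothetical, the choice of "positive" class is an assumption, and the table does not state which software or test produced its values.

```python
from math import comb


def binary_classifier_stats(tp, fp, fn, tn):
    """Summary statistics for a 2 x 2 confusion matrix (hypothetical helper)."""
    n = tp + fp + fn + tn
    observed_pos, observed_neg = tp + fn, fp + tn
    predicted_pos, predicted_neg = tp + fp, fn + tn

    accuracy = (tp + tn) / n
    # No-information rate: accuracy of always predicting the larger class.
    nir = max(observed_pos, observed_neg) / n

    # Cohen's kappa: agreement beyond what the marginal totals predict.
    expected = (observed_pos * predicted_pos + observed_neg * predicted_neg) / n**2
    kappa = (accuracy - expected) / (1 - expected)

    # One-sided exact binomial test of H0: accuracy <= NIR.
    # Assumption: this is one common way to obtain "P: Acc > NIR".
    correct = tp + tn
    p_value = sum(
        comb(n, k) * nir**k * (1 - nir) ** (n - k) for k in range(correct, n + 1)
    )

    return {
        "accuracy": accuracy,
        "nir": nir,
        "p_acc_gt_nir": p_value,
        "kappa": kappa,
        "sensitivity": tp / observed_pos,
        "specificity": tn / observed_neg,
        "ppv": tp / predicted_pos,
        "npv": tn / predicted_neg,
    }


# Hypothetical counts, NOT taken from the study.
stats = binary_classifier_stats(tp=80, fp=10, fn=20, tn=90)
for name, value in stats.items():
    print(f"{name}: {value:.3g}")
```

With balanced classes, as in these made-up counts, the NIR is 0.5. In the table the NIR ranges from 0.522 to 0.658, so a model must clear a higher bar before its accuracy counts as better than guessing the majority class.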