Table 4 Performance of ternary classifiers of smoking status (current, former and never)

From: Epigenetic modelling of former, current and never smokers

| Data | Accuracy statistics | AHRR model (reference) | Candidate CpG LASSO model | Maas model | Agnostic LASSO model |
|---|---|---|---|---|---|
| Training data | Accuracy (95% CI) | 0.606 (0.576–0.635) | 0.538 (0.508–0.568) | 0.619 (0.589–0.648) | 0.695 (0.667–0.723) |
| | NIR (P: Acc > NIR) | 0.364 (< 2.2 × 10⁻¹⁶) | 0.364 (< 2.2 × 10⁻¹⁶) | 0.364 (< 2.2 × 10⁻¹⁶) | 0.364 (< 2.2 × 10⁻¹⁶) |
| | Kappa | 0.405 | 0.306 | 0.427 | 0.541 |
| | Never smokers | | | | |
| | Sensitivity | 0.835 | 0.824 | 0.764 | 0.885 |
| | Specificity | 0.661 | 0.743 | 0.745 | 0.744 |
| | PPV | 0.562 | 0.625 | 0.610 | 0.643 |
| | NPV | 0.885 | 0.890 | 0.858 | 0.925 |
| | Former smokers | | | | |
| | Sensitivity | 0.299 | 0.380 | 0.455 | 0.518 |
| | Specificity | 0.872 | 0.763 | 0.797 | 0.892 |
| | PPV | 0.518 | 0.423 | 0.507 | 0.687 |
| | NPV | 0.731 | 0.729 | 0.762 | 0.802 |
| | Current smokers | | | | |
| | Sensitivity | 0.658 | 0.397 | 0.625 | 0.669 |
| | Specificity | 0.875 | 0.802 | 0.887 | 0.905 |
| | PPV | 0.730 | 0.512 | 0.743 | 0.787 |
| | NPV | 0.830 | 0.720 | 0.819 | 0.839 |
| External validation data | Accuracy (95% CI) | 0.612 (0.576–0.648) | 0.594 (0.557–0.609) | 0.603 (0.566–0.639) | 0.637 (0.601–0.673) |
| | NIR (P: Acc > NIR) | 0.367 (< 2.2 × 10⁻¹⁶) | 0.367 (< 2.2 × 10⁻¹⁶) | 0.367 (< 2.2 × 10⁻¹⁶) | 0.367 (< 2.2 × 10⁻¹⁶) |
| | Kappa | 0.405 | 0.390 | 0.406 | 0.462 |
| | Never smokers | | | | |
| | Sensitivity | 0.900 | 0.869 | 0.835 | 0.873 |
| | Specificity | 0.766 | 0.818 | 0.814 | 0.792 |
| | PPV | 0.686 | 0.731 | 0.719 | 0.705 |
| | NPV | 0.931 | 0.917 | 0.896 | 0.917 |
| | Former smokers | | | | |
| | Sensitivity | 0.171 | 0.368 | 0.297 | 0.270 |
| | Specificity | 0.914 | 0.811 | 0.824 | 0.916 |
| | PPV | 0.536 | 0.530 | 0.494 | 0.651 |
| | NPV | 0.656 | 0.689 | 0.669 | 0.684 |
| | Current smokers | | | | |
| | Sensitivity | 0.825 | 0.531 | 0.706 | 0.820 |
| | Specificity | 0.748 | 0.767 | 0.771 | 0.757 |
| | PPV | 0.548 | 0.458 | 0.533 | 0.556 |
| | NPV | 0.920 | 0.815 | 0.876 | 0.919 |
  1. N.B. Each ternary classifier is the result of applying two binary classifiers to the DNAm data in sequence: ever versus never smoker classification first, then current versus former classification of the samples classified as ever smokers.
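
The two-stage scheme described in the note, and the per-class statistics reported in the table, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `ternary_classify` and `one_vs_rest_stats` are hypothetical, the binary classifiers are passed in as plain functions, and the per-class sensitivity, specificity, PPV and NPV are assumed to be computed one class against the other two, which is the usual convention for multi-class confusion matrices.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a binary classifier maps one sample's DNAm features to True/False.
BinaryClassifier = Callable[[Sequence[float]], bool]


def ternary_classify(
    sample: Sequence[float],
    is_ever: BinaryClassifier,
    is_current: BinaryClassifier,
) -> str:
    """Stage 1: ever vs never smoker. Stage 2 (ever smokers only): current vs former."""
    if not is_ever(sample):
        return "never"
    return "current" if is_current(sample) else "former"


def one_vs_rest_stats(
    truth: Sequence[str], predicted: Sequence[str], positive: str
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV for one class against the other two."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    def ratio(num: int, den: int) -> float:
        # Return NaN rather than raising when a class is absent from the data.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
    }
```

Because stage 2 only sees samples that stage 1 labels as ever smokers, a true former or current smoker misclassified as "never" at stage 1 cannot be recovered later; any fitted models that return a boolean can be plugged in as `is_ever` and `is_current`.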