DNA methylation at 10 autosomal CpGs was sufficient to correctly identify all known contaminated male samples, and additionally identified 13 contaminated female samples. a The 10 CpGs selected by the random forest method clearly separate cord from adult samples, discriminate non-contaminated (N) from contaminated (C) male samples, and divide unknown (U) samples into two groups. b Number of CpGs above threshold per sample (x axis); a sample was called contaminated if at least 5 of the 10 CpGs were above the threshold. All unclear males were non-contaminated, and 13 females were identified as contaminated. c A subset of 3 of the 10 CpGs can be used for pyrosequencing screening. Two thresholds are shown: one calling a sample contaminated if at least two of the three CpGs are above the threshold (yellow), and one requiring all three (red).
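
The counting rule in panels b and c can be sketched in a few lines of Python. This is only an illustration of the "at least k of n CpGs above threshold" logic. The function names, the per-CpG threshold values, and the methylation values below are hypothetical and are not taken from the study.

```python
from typing import Sequence


def count_sites_over_threshold(
    methylation: Sequence[float], thresholds: Sequence[float]
) -> int:
    """Count CpGs whose methylation value is strictly above its threshold."""
    if len(methylation) != len(thresholds):
        raise ValueError("need exactly one threshold per CpG")
    return sum(m > t for m, t in zip(methylation, thresholds))


def call_contaminated(
    methylation: Sequence[float], thresholds: Sequence[float], min_sites: int
) -> bool:
    """Call a sample contaminated if at least `min_sites` CpGs exceed threshold."""
    return count_sites_over_threshold(methylation, thresholds) >= min_sites


# Hypothetical thresholds and sample values (NOT from the study).
thresholds_10 = [0.5] * 10
sample_10 = [0.7, 0.6, 0.8, 0.2, 0.1, 0.9, 0.55, 0.3, 0.4, 0.1]

# Panel b rule: at least 5 of the 10 CpGs above threshold.
panel_b_call = call_contaminated(sample_10, thresholds_10, min_sites=5)

# Panel c rules on a hypothetical 3-CpG subset: lenient (2 of 3) vs strict (3 of 3).
thresholds_3 = [0.5, 0.5, 0.5]
sample_3 = [0.7, 0.6, 0.2]
lenient_call = call_contaminated(sample_3, thresholds_3, min_sites=2)
strict_call = call_contaminated(sample_3, thresholds_3, min_sites=3)
```

With these made-up values, the 10-CpG sample has five sites above threshold and is called contaminated. The 3-CpG sample is called contaminated under the two-of-three rule but not under the all-three rule, which shows how the stricter cut-off trades sensitivity for specificity.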

