Fig. 7 | Clinical Epigenetics

Fig. 7

From: Batch-effect detection, correction and characterisation in Illumina HumanMethylation450 and MethylationEPIC BeadChip array data

Biologically meaningful methylation clustering. Some probes with biologically meaningful methylation were erroneously corrected. Four example probes are illustrated. Each scatter plot shows the methylation β value (Y axis) across slides (X axis). The data points are from each of the 369 BeadChip arrays, sorted and coloured by slide number. The panels are ordered column-wise from left to right as original, Harman-corrected and ComBat-corrected data. After correction, the standard deviation (SD) of the data remained the same or decreased; however, the log-variance ratio (LVR) was elevated considerably above 0. The mean β shift (Shift) is the mean change in β across all 369 arrays induced by erroneous batch correction. Mapping common SNPs that fall within CpG sites can be used to identify CpG sites that should not be batch corrected. An example of this is the probe cg25465065, which has the common C/T SNP rs3768276 positioned at the cytosine; as expected, the frequencies in each cluster are consistent with Hardy–Weinberg equilibrium. In contrast, the methylation measured by probe cg15544633 on chromosome 2 clusters into two groups: intermediate methylation and no methylation. This clustering is not in Hardy–Weinberg equilibrium (p = 9.520 × 10⁻⁹), yet it is likely influenced by genetics, as the common SNP rs2516834 is immediately adjacent to the assayed CpG site. In the example with the Y chromosomal probe cg00455876, there is clearly a higher methylation state in males, and this clustering is still apparent after batch correction because gender was declared as biological variance to preserve. However, more complex gender associations may arise, for which batch-effect correction performs poorly. One of the alleles for the X chromosomal cg15410402 probe is inactivated in females, but the methylation state in males is complex, with almost half of the males having intermediate methylation and half having no methylation. This may well be due to an interaction between gender and genetics, likely through the influence of the commonly deleted sequence 5’-GGAGCTAGGCCG (rs66532084) 12 bp upstream of the measured CpG site.
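The Hardy–Weinberg comparison mentioned in the caption can be illustrated with a minimal sketch. It assumes that each methylation cluster is treated as one genotype class and that a one-degree-of-freedom chi-square goodness-of-fit test is used; the caption does not state which test the authors applied, and the function name `hwe_chi_square` and the counts shown are hypothetical, not taken from the figure.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed counts of the two homozygous classes
    and the heterozygous class (here: hypothetical cluster sizes).
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one observation is required")

    # Allele frequencies estimated from the observed genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)

    # Classes with zero expected count contribute nothing to the statistic.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts only: three clusters in exact HWE proportions,
# then a two-cluster pattern with no heterozygous class at all.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```

A small p-value from such a test indicates that the cluster sizes depart from Hardy–Weinberg proportions, as reported for cg15544633; an exact test may be preferable when any expected count is small.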
