Fig. 1 | Clinical Epigenetics

Fig. 1

From: Cell-type heterogeneity: Why we should adjust for it in epigenome and biomarker studies


The need to adjust for CTH in epigenome studies. a Comparison of the relative data variance explained by each of the top 15 principal components (PCs) (x-axis) for 3 separate epigenome studies, expressed as a fraction of the total variance accounted for by the top 15 PCs (y-axis, fVAR), with datapoints annotated by the main factor driving each PC. CTH = cell-type heterogeneity; Ethn = ethnicity; EADC = esophageal adenocarcinoma; ER = estrogen receptor status. The tissue type and number of samples in each study are given above the plots. These plots derive from Illumina DNA methylation data from the following published works: Blood [51], Saliva [49] and Breast [52]. Briefly, the blood dataset is from healthy individuals, the saliva samples are from EADC patients and matched healthy controls, and the breast tissue data are from breast cancers and normal-adjacent tissue. In the case of blood, the top PC correlates with CTH, PC-2 with ethnicity and PC-3 with age. b Sensitivity, false positive rate (FPR) and precision to detect 1000 simulated DMCs introduced in 139 monocyte samples from BLUEPRINT, with an exposure distinguishing 69 cases from 70 controls. In each panel, we display the metrics when inferring DMCs from realistic mixtures of 3 cell types (neutrophils, CD4+ T cells and monocytes) (Mix, red), when inferring DMCs from these same mixtures whilst adjusting for CTH (Mix CTH, blue), and when inferring DMCs from the purified monocyte samples (Mono, green). c For the same simulated data as in (b), the unsupervised hierarchical clustering obtained when clustering the 139 monocyte samples over the top 2 PCs correlating with the exposure (top panel), when clustering the 139 mixtures over the top 2 PCs correlating with the exposure without any adjustment for CTH (middle panel), and when clustering the 139 mixtures over the top 2 PCs correlating with the exposure after adjustment for CTH (lower panel). Note that in the second case, i.e. when clustering over the top 2 PCs derived from the mixtures without adjustment for CTH, these PCs exhibited only very marginal associations with the exposure, which is why the samples do not segregate by exposure.
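
The quantities named in panels a and b can be illustrated with a minimal sketch. It is not the authors' code: the function names `fvar` and `detection_metrics`, their arguments, and the reading of fVAR as each PC's variance divided by the summed variance of the top 15 PCs are assumptions made here for illustration, and the sensitivity, FPR and precision formulas are the standard definitions rather than anything stated in the caption.

```python
import math


def fvar(pc_variances, top=15):
    """Fraction of the variance captured by the top `top` PCs that each of them explains.

    Assumed reading of the caption: fVAR_k = var_k / sum(var_1 .. var_top).
    `pc_variances` is a hypothetical list of per-PC variances (eigenvalues).
    """
    top_vars = sorted(pc_variances, reverse=True)[:top]
    total = sum(top_vars)
    return [v / total for v in top_vars]


def detection_metrics(called, true_dmcs, n_sites):
    """Sensitivity, FPR and precision for a set of called DMCs.

    called    -- indices of sites called as differentially methylated (hypothetical input)
    true_dmcs -- indices of the simulated (true) DMCs
    n_sites   -- total number of sites tested
    """
    called, true_dmcs = set(called), set(true_dmcs)
    tp = len(called & true_dmcs)      # true DMCs that were called
    fp = len(called - true_dmcs)      # called sites that are not true DMCs
    n_null = n_sites - len(true_dmcs)  # sites carrying no simulated effect
    sensitivity = tp / len(true_dmcs)
    fpr = fp / n_null
    # Precision is undefined when nothing is called.
    precision = tp / len(called) if called else math.nan
    return sensitivity, fpr, precision
```

Under these assumptions, the fVAR values of the top 15 PCs sum to 1 by construction, so the y-axis in panel a compares PCs within a study rather than reporting the share of the total data variance.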
