- Open Access
Epigenome-wide discovery and evaluation of leukocyte DNA methylation markers for the detection of colorectal cancer in a screening setting
© The Author(s) 2017
- Received: 6 October 2016
- Accepted: 6 February 2017
- Published: 3 March 2017
Colorectal cancer (CRC) is the third most common cancer worldwide. If detected at an early stage, prognosis is good. Despite increasing evidence for the benefits of implemented screening programs, such as screening colonoscopy, compliance is rather low. Hence there is demand for non-invasive tests for the early detection of CRC with high acceptance in population-wide screening. The objective of this study was to identify and evaluate leukocyte DNA methylation patterns as a potential biomarker for early detection of CRC.
Blood samples of patients scheduled for a screening colonoscopy were collected before the procedure. Additionally, blood samples from CRC cases recruited in a clinical setting were collected. DNA was extracted from leukocytes, and DNA methylation was measured with the Infinium 450K BeadChip. In total, 46 CRC cases and 140 controls from the screening setting and 93 CRC cases from the clinical setting were measured.
An epigenome-wide discovery revealed two CpG sites in the promoter region of KIAA1549L that were significantly differentially methylated between cases and controls. A third marker in the body region of BCL2 was discovered in a candidate approach testing biomarkers reported in the literature. Logistic regression models built on these three markers yielded an optimism-corrected c-statistic of 0.69 in the screening setting and 0.73 in the clinical setting.
Although diagnostic performance of the DNA methylation signature identified in this first epigenome-wide association study of leukocyte DNA methylation with CRC in a screening setting is not competitive with established screening tests, the identified markers may contribute to multimarker panels for early detection of CRC.
- Colorectal cancer
- Early detection
- Screening setting
- DNA methylation
- Illumina Infinium 450K
- Epigenome-wide association study
- Leukocyte composition
With ∼1.4 million incident cases and almost 700,000 deaths in 2012, colorectal cancer (CRC) is the third most common cancer and the fourth most common cause of cancer death worldwide. Stage at diagnosis is the most important prognostic factor, with relative 5-year survival rates of 90, 74, and 14% when diagnosed at a localized, regional, and advanced stage, respectively. Colonoscopy is the gold standard for the detection of CRC and its precursors. CRC is diagnosed earlier by screening colonoscopy and can even be prevented by removing precursors during colonoscopy. In a recent meta-analysis, screening colonoscopy was estimated to reduce the risk of incident CRC by 69% and CRC mortality by 68%.
In Germany, a first screening colonoscopy is offered free of charge to men and women aged 55 and older; a second one is possible after 10 years. Alternatively, biennial fecal occult blood tests can be performed. Nearly 90% of the people entitled were found to be aware of this program, but despite awareness and clear benefits, less than 40% were up to date with CRC screening recommendations. A recent study among 172 asymptomatic participants recruited during regular consultations found that 109 (63%) refused to undergo screening colonoscopy, but of these, 106 (97%) accepted non-invasive screening methods, with 90 (83%) choosing a blood-based test. Although blood tests do not achieve the excellent performance of colonoscopy or immunochemical fecal occult blood tests, their value comes from their high acceptance by patients. Higher participation in screening programs could lower CRC mortality.
Only a fraction of the biomarkers for the early detection of cancer reported in the literature eventually find their way into clinical application. Promising results from retrospective case-control studies often cannot be validated in prospective studies. A great number of blood-based biomarkers for the early detection of CRC have been reported, but only one has been approved by the U.S. Food and Drug Administration so far: the Epi proColon test (Epigenomics AG, Berlin, Germany), which measures the methylation of the SEPT9 gene in circulating cell-free DNA in blood and detected CRC with a sensitivity (specificity) of 48% (92%) in a screening setting. Several studies have reported leukocyte DNA methylation (DNAm) markers for various types of cancer (Additional file 1: Table S1), but none of these studies were conducted in a screening setting. We used blood samples from BLITZ (German: Begleitende Evaluierung innovativer Testverfahren zur Darmkrebs-Früherkennung), a study among participants of screening colonoscopy, to conduct the first epigenome-wide association study (EWAS) for leukocyte DNAm markers for the early detection of CRC in a screening setting.
We analyzed blood samples collected in a screening setting (BLITZ study) and in a clinical setting (DACHS+ study). Details of both studies have been described elsewhere [10, 11]. In brief, BLITZ is conducted in cooperation with several gastroenterological practices in Southern Germany. Men and women aged 55 to 75 who are scheduled for a screening colonoscopy are eligible. They are informed about the study and invited to participate by their physicians at a preparatory visit for the colonoscopy. Participants are excluded if colonoscopy is indicated for other reasons (e.g., visible rectal bleeding or a positive test for fecal occult blood), if they had an endoscopic examination within the preceding 5 years, or if they had a previous gastrointestinal cancer diagnosis. Blood samples are taken before the colonoscopy; afterwards, medical reports are obtained from the physicians. Recruitment is ongoing; 6613 participants had been enrolled by the end of 2014. They were classified according to the most advanced findings as follows: CRC (n = 57), advanced adenomas, non-advanced adenomas, undefined polyps or other findings of the colonic mucosa (such as pseudopolyps), hyperplastic polyps (n = 643), none of these findings (n = 3856). For this analysis, we selected CRC cases and, as controls, participants in the last two categories; all other categories were excluded.
Because of the limited number of CRC cases even in such a large screening study, CRC cases recruited in a clinical setting in the DACHS+ study were additionally included. In the DACHS+ study, 819 men and women aged 55 to 75 with a first diagnosis of a gastrointestinal cancer between October 2006 and December 2014 were recruited from several clinics in Southwest Germany. Patients with a previous cancer diagnosis in the gastrointestinal tract were excluded. Blood samples were collected before surgery.
Sample selection and processing
Epigenome-wide marker discovery
450K data were normalized using the R package normalize450K. Methylation levels of each CpG site were regressed linearly on disease status and the following covariates: sex, age, leukocyte composition, and batch (the 96-well plate on which samples were run). Methylation levels were expressed as β-values because of their linear relation with cell proportions. Cell proportions of six major leukocyte types were estimated according to Houseman et al., including granulocytes, monocytes, CD8+ T-cells, CD4+ T-cells, natural killer cells, and B lymphocytes. Disease status was coded by two variables, screening and clinical, in such a way that screening cases and clinical cases were compared only to controls within the same setting. We compared this first model to a second one without the variables screening and clinical and used the likelihood-ratio test to assess their joint significance. CpG sites that were significantly associated at a false discovery rate of 10% and showed a consistent trend in the screening and clinical setting (same sign for the regression coefficients r_s and r_c of screening and clinical) were selected as markers.
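The selection rule in the preceding paragraph (significance at a false discovery rate of 10% plus the same sign for the screening and clinical coefficients) can be sketched as follows. This is a minimal illustration, not the authors' code: the text does not name the FDR procedure, so the Benjamini-Hochberg step-up procedure is an assumption here, the per-probe p-values are taken as already computed from the likelihood-ratio test, and the probe names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(p_values: List[float], fdr: float = 0.10) -> List[bool]:
    """Flag each test as rejected (True) or not, using the BH step-up rule (assumed)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k / m * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


def select_markers(
    results: Dict[str, Tuple[float, float, float]], fdr: float = 0.10
) -> List[str]:
    """results maps probe id -> (p_value, r_s, r_c); keep significant, sign-consistent probes."""
    probes = list(results)
    flags = benjamini_hochberg([results[p][0] for p in probes], fdr)
    return [
        p
        for p, significant in zip(probes, flags)
        if significant and results[p][1] * results[p][2] > 0  # same sign in both settings
    ]


# Hypothetical toy input: (p-value, screening coefficient, clinical coefficient).
toy_results = {
    "probe_a": (0.0001, 0.03, 0.03),
    "probe_b": (0.004, 0.02, -0.01),  # significant, but opposite trends
    "probe_c": (0.2, 0.01, 0.02),
    "probe_d": (0.9, -0.01, -0.02),
}
selected = select_markers(toy_results)
```

In the toy input, probe_b passes the FDR threshold but is dropped because its two coefficients point in opposite directions, which mirrors the exclusion rule described in the text.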
In addition to the hypothesis-free approach, we also deployed a targeted approach. We searched PubMed for publications from January 2011 to June 2016 reporting leukocyte DNAm markers for any kind of solid cancer. We found 18 relevant publications, which reported either probe identifiers (if they used the 27K or 450K platform) or genes, which were then mapped to probes on the 450K chip using the annotation provided by the manufacturer. Five publications reported 14 probe identifiers, and 13 publications reported 32 genes mapping to 733 probes. Association of these 747 probes in total with CRC was tested using the same approach as in the epigenome-wide discovery. Search terms and a list of all included publications are provided in Additional file 1: Table S1.
Fitting diagnostic models
β-values of the markers discovered in the previous steps were adjusted for leukocyte composition and batch effects by subtracting the related terms of the corresponding linear regression models, yielding adjusted values β′. Three different models for CRC diagnosis were trained by logistic regression: (i) a risk-factors-only model that included only the risk factors sex and age; (ii) a markers-only model that included only the β′-values of the epigenetic markers; and (iii) a full model, including both markers and risk factors. To avoid overoptimism and to provide 95% confidence intervals, we generated 1000 stratified bootstrap samples (“stratified” meaning that the number of cases and controls was the same as in the original sample). Models were fitted on the bootstrap samples and tested on the left-out subjects. This was done separately for the screening setting (plate A) and the clinical setting (plates B and C). Discrimination was measured by the c-statistic.
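The resampling scheme described above (stratified bootstrap, fitting on the bootstrap sample, testing on the left-out subjects, c-statistic as the measure) can be sketched as below. This is only an illustration under stated assumptions: the logistic regression is replaced by a trivial placeholder model that orients a single marker by the in-bag mean difference, the percentile interval is one possible way to obtain a 95% interval and is not confirmed by the text, and the marker values are hypothetical.

```python
import random
from statistics import mean
from typing import List, Sequence, Tuple


def c_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case scores higher than a random control (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def stratified_bootstrap(
    labels: Sequence[int], rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Resample cases and controls separately; return (in-bag, out-of-bag) indices."""
    in_bag: List[int] = []
    for cls in (0, 1):
        members = [i for i, y in enumerate(labels) if y == cls]
        in_bag += [rng.choice(members) for _ in members]  # same class size as original
    drawn = set(in_bag)
    out_of_bag = [i for i in range(len(labels)) if i not in drawn]
    return in_bag, out_of_bag


def out_of_bag_c_statistics(
    x: Sequence[float], labels: Sequence[int], n_boot: int = 1000, seed: int = 1
) -> List[float]:
    """Fit on each bootstrap sample and evaluate on the subjects left out of it."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        in_bag, oob = stratified_bootstrap(labels, rng)
        oob_labels = [labels[i] for i in oob]
        if len(set(oob_labels)) < 2:
            continue  # c-statistic is undefined without both classes
        # Placeholder "model" (stand-in for logistic regression): marker direction only.
        case_mean = mean(x[i] for i in in_bag if labels[i] == 1)
        ctrl_mean = mean(x[i] for i in in_bag if labels[i] == 0)
        direction = 1.0 if case_mean >= ctrl_mean else -1.0
        estimates.append(c_statistic([direction * x[i] for i in oob], oob_labels))
    return estimates


# Hypothetical adjusted marker values for 5 cases (label 1) and 5 controls (label 0).
toy_labels = [1] * 5 + [0] * 5
toy_x = [0.62, 0.58, 0.55, 0.60, 0.49, 0.50, 0.47, 0.52, 0.45, 0.51]
est = sorted(out_of_bag_c_statistics(toy_x, toy_labels))
point_estimate = mean(est)
interval = (est[int(0.025 * len(est))], est[int(0.975 * len(est)) - 1])
```

Evaluating only on left-out subjects keeps the test data separate from the data used for fitting, which is what guards against the overoptimism mentioned in the text.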
Estimating model performance in the BLITZ population
The result should now reflect the impact of the risk factors. To account for the different distribution of Z in the population than in the matched sample, subjects were weighted by inverse propensity scores (a logistic regression with the dependent variable indicating if a subject is included in the matched sample and Z as predictors), so that the weighted matched sample reflected the distribution of Z in the BLITZ population. Discrimination was measured by the weighted c-statistic. Performance of the other models in the BLITZ population was estimated analogously.
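A weighted c-statistic of the kind mentioned above can be sketched as follows. The text does not spell out the estimator, so this uses one common form in which each case-control pair is weighted by the product of the two subjects' weights; that choice is an assumption. The propensity scores, which the text obtains from a logistic regression on Z, are simply taken as given here, and all numbers are hypothetical.

```python
from typing import List, Sequence


def inverse_propensity_weights(propensities: Sequence[float]) -> List[float]:
    """Weight each subject by 1 / P(included in matched sample | Z)."""
    return [1.0 / p for p in propensities]


def weighted_c_statistic(
    scores: Sequence[float], labels: Sequence[int], weights: Sequence[float]
) -> float:
    """Weighted share of case-control pairs in which the case has the higher score."""
    concordant = 0.0
    total = 0.0
    for s_i, y_i, w_i in zip(scores, labels, weights):
        if y_i != 1:
            continue
        for s_j, y_j, w_j in zip(scores, labels, weights):
            if y_j != 0:
                continue
            pair_weight = w_i * w_j  # assumed pair weighting
            total += pair_weight
            if s_i > s_j:
                concordant += pair_weight
            elif s_i == s_j:
                concordant += 0.5 * pair_weight
    return concordant / total


# Hypothetical example: two cases, two controls, and made-up propensity scores.
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_labels = [1, 1, 0, 0]
toy_weights = inverse_propensity_weights([0.5, 0.25, 0.5, 0.5])
toy_result = weighted_c_statistic(toy_scores, toy_labels, toy_weights)
```

With equal weights this reduces to the ordinary c-statistic; up-weighting subjects who are under-represented in the matched sample shifts the estimate toward what would be expected in the source population.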
Study population characteristics
Mean age ± SD: 67 ± 7 | 62 ± 7
Screening setting (plate A)
Mean age ± SD: 67 ± 7 | 67 ± 7
54.9 ± 12.8 | 48.7 ± 14.9
7.2 ± 1.9 | 7.6 ± 2.8
Natural killer cells: 11.7 ± 5.2 | 13.3 ± 5.8
CD8+ T cells: 3.5 ± 4.1 | 4.4 ± 3.8
CD4+ T cells: 16.7 ± 7.7 | 19.1 ± 9.5
5.9 ± 2.8 | 6.8 ± 3.9
(Difference in mean LC is not significant, p-value 0.29)c
Clinical setting (plates B and C)
Mean age ± SD: 65 ± 8 | 65 ± 8
Neoadj. therapy (none/rad./chemo./comb.)b
64.9 ± 12.8 | 52.3 ± 12.6
9.3 ± 3.8 | 7.8 ± 2.7
Natural killer cells: 7.2 ± 4.3 | 11.1 ± 5.2
CD8+ T cells: 2.0 ± 2.9 | 3.5 ± 3.9
CD4+ T cells: 12.6 ± 7.1 | 19.4 ± 7.8
4.0 ± 2.6 | 5.9 ± 2.9
(Difference in mean LC is significant, p-value 3.9 × 10^−11)c
At a false discovery rate of 10%, there was a single significant hit (without adjustment for LC, there would have been 90,288): cg04036920, located 1373 basepairs upstream of the transcription start site of KIAA1549L. Methylation levels of proximal CpG sites are often correlated; therefore, we tested the remaining 25 probes on the 450K chip that are annotated to KIAA1549L, in case we had missed some of them because of the high multiple-testing burden. After Bonferroni correction (significance level α < 0.05/25), one other site was significant: cg14472551, which is located 557 basepairs from cg04036920 and 817 basepairs upstream of the transcription start site. For both markers, cases showed higher methylation levels than controls, consistently in the screening and clinical settings (cg04036920: r_s = 0.029, r_c = 0.032; cg14472551: r_s = 0.042, r_c = 0.022). Two more probes were significant at a false discovery rate of 10% when we tested only the candidates extracted from our literature search (without adjustment for LC, there would have been 170). One was excluded, as r_s and r_c showed opposite trends. The other probe, cg12459502 (r_s = 0.011, r_c = 0.025), located in the body region of BCL2, was added to the marker panel. Based on nine technical replicates of a single sample, allocated on the same plates, we estimated standard measurement errors of 0.029, 0.019, and 0.019 for cg04036920, cg14472551, and cg12459502, respectively.
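The Bonferroni step for the 25 additional KIAA1549L probes amounts to comparing each p-value with 0.05/25 = 0.002. A minimal sketch of that arithmetic follows; the probe names and p-values in the example are hypothetical and only illustrate the comparison.

```python
from typing import Dict


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def bonferroni_significant(
    p_values: Dict[str, float], n_tests: int, alpha: float = 0.05
) -> Dict[str, bool]:
    """Flag each probe whose p-value falls below alpha / n_tests."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return {probe: p < threshold for probe, p in p_values.items()}


# 25 probes tested, as in the text; the two p-values shown are made up.
threshold = bonferroni_threshold(0.05, 25)
flags = bonferroni_significant({"probe_x": 0.0004, "probe_y": 0.01}, n_tests=25)
```

A p-value of 0.01 would pass an uncorrected 5% level but not the corrected threshold, which is the point of the adjustment.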
Thirty-three cases from DACHS+ received neoadjuvant therapy before blood sampling. In a sensitivity analysis, we excluded these cases (but kept the matched controls). While cg04036920 was no longer significant after correction for multiple testing in the epigenome-wide search, it remained the marker with the smallest p-value. The targeted search again yielded cg12459502, this time as the only significant marker.
Single marker performance in matched case-control sample
Performance of diagnostic models
We analyzed blood samples of CRC cases and controls collected prospectively from participants of screening colonoscopy. Additionally, blood samples from clinical CRC cases were collected. An epigenome-wide discovery of leukocyte DNAm markers for the early detection of CRC identified two differentially methylated CpG sites located near the transcription start site of KIAA1549L. A third marker in the body region of BCL2 was identified in a targeted approach looking only at candidate markers reported in the literature. Discrimination (measured by the c-statistic) of CRC cases and controls by logistic regression models based on these three markers was 0.69 and 0.73 in the screening setting and clinical setting, respectively, and was estimated at 0.74 for the target population of the German CRC screening program. To our knowledge, this is the first epigenome-wide association study (EWAS) for leukocyte DNAm markers for the early detection of CRC in a screening setting.
Screening colonoscopy, the gold standard for the detection of CRC and precursors, suffers from low adherence. Non-invasive tests could increase participation in CRC screening programs, and interest in the identification of suitable biomarkers is growing, but only a few were validated in a screening setting: the most advanced blood test so far, based on the detection of cell-free methylated SEPT9 in plasma, achieved a sensitivity (specificity) of 48% (92%) for the detection of CRC. The combination of two other markers, CEA and anti-TP53 antibody, also evaluated in the BLITZ study (albeit not using the same samples as in the current analysis), achieved a sensitivity (specificity) of 58% (90%).
Stool tests represent another non-invasive alternative. The Cologuard test (Exact Sciences, Madison, WI, USA) combines assays to test for aberrant methylation of the promoter regions of the BMP3 and NDRG4 genes, for mutations of the KRAS gene, and for human hemoglobin. A score calculated from the combined results of these assays yielded a higher sensitivity but lower specificity, 92% (87%), compared to 72% (95%) when using only the hemoglobin component. In the BLITZ study, a standalone fecal immunochemical occult blood test (FIT) had a sensitivity (specificity) of 73% (96%). So far, FITs perform substantially better than blood-based tests.
Our study has several strengths and limitations. Most importantly, we used CRC cases and controls from a true screening setting. Blood samples in BLITZ were collected before the participants were aware of their case/control status, thereby largely eliminating the possibility of selection bias and information bias. Case-control studies in clinical settings often try to account for selection bias by matching cases and controls on a number of confounders. However, there might be unknown confounders which cannot be accounted for and which might result in false positive candidate markers. Using a screening setting ensures that cases and controls are (in expectation) comparable even for the unknown confounders. For instance, prolactin, discovered in a clinical setting as a biomarker for ovarian cancer, later failed in a screening setting and turned out to be sensitive to the way samples were collected (on the day of a planned surgery or in the days before) and might just be a symptom of stress [18, 19]. There might be many ways in which patients change their lifestyle after a cancer diagnosis. In such cases, not the presence of the disease but the awareness of it might cause differences in biomarker levels. Using a screening setting avoids these pitfalls. Of course, these considerations do not apply to the DACHS+ study. Therefore, we filtered out candidates that showed an inconsistent trend between the screening and clinical settings. Furthermore, the BLITZ population closely resembles the target population, so the AUC estimates of the diagnostic models should closely resemble those in the target population as well.
Another strength of our study is the large sample size. BLITZ is one of the largest screening studies for CRC, with more than 6600 participants. Therefore, despite the low prevalence of CRC in the BLITZ population (<1%), we had approximately 60 samples of CRC to choose from. We still might have missed potential markers due to a lack of statistical power caused by the high multiple-testing burden. We compensated for this by including samples from clinical CRC cases, but, as outlined above, they are not an equivalent substitute.
We matched cases and controls on the risk factors sex and age. Matching would not be necessary in BLITZ, as the study population closely mirrors the target population, in which cases and controls differ on these factors; it might even lead to biased estimates of sensitivity and specificity if marker levels are associated with the factors matched on [20, 21]. Matching was done here to increase the statistical power to find biomarkers that provide diagnostic value beyond these known risk factors. We used a previously published method to arrive at presumably unbiased estimates of marker performance in the BLITZ population.
Another strength of our study is the adjustment for leukocyte composition (LC). LC is often considered the most important confounder when analyzing whole blood DNAm. This turned out to be true here as well. Clinical cases and controls differed much more in LC than screening cases and controls, which may reflect the higher fraction of late stages among clinical cases and the fact that some received neoadjuvant therapy. Confounding by LC might be one of the reasons why all but one of the candidate markers from our literature search could not be validated here. Only two of the included studies adjusted for LC, and none was conducted in a screening setting (with the exception of one study looking at colorectal adenomas). Indeed, an unadjusted analysis of our data would have confirmed 170 of the 747 markers. Another reason might be that those markers are specific for the types of cancer investigated in the original studies.
We adjusted for six major leukocyte types, yet we cannot exclude the possibility that observed differences at the identified markers are still due to residual confounding, either due to inaccurate cell proportion estimates or because an even finer distinction of cell types would be necessary. On the other hand, regardless of whether the observed effects represent genuine changes of the methylation state, a biomarker must merely hold predictive value. One could use not only the marker panel but also harness the predictive value of the leukocyte composition, as has been proposed previously. Again, as seen in the differences in leukocyte composition between screening and clinical cases, such markers would need to be evaluated in a screening setting. A detailed investigation of this issue was beyond the scope of this work. However, metrics like the neutrophil/lymphocyte ratio are influenced by many factors and therefore cannot be specific for CRC.
It is unlikely that the observed differences between cases and controls are caused by batch effects. Blood samples were collected blinded to the outcome, and DNA extraction and DNAm measurements were likewise performed blinded. Batch effects are omnipresent and numerous for the 450K platform, but our careful sample allocation scheme ensured that they were not associated with the outcome or the matching factors.
It is unclear whether the identified methylation signature is specific for CRC. In contrast to cell-free DNA in blood serum, which might originate from tumor tissue, changes in leukocyte DNA methylation probably reflect a response of the immune system, which might be similar for other cancers or diseases.
The function of KIAA1549L is largely uncharacterized; therefore, we refrain from speculating about the biological plausibility of this finding.
We identified three CpG sites whose methylation levels in whole blood can be used as biomarkers for the early detection of colorectal cancer. While their performance on their own is not competitive with screening colonoscopy or fecal immunochemical tests, their combination in a multi-marker panel, similar to the multitarget stool test mentioned above, could render them useful as a screening tool. Further validation of these markers in other study populations is necessary.
We thank the microarray unit of the DKFZ Genomics and Proteomics Core Facility, especially Matthias Schick, for providing the 450K arrays and related services, and Katarina Cuk for sample preparation.
The BLITZ study was partly funded by grants from the German Research Foundation (DFG, grant No. BR1704/16-1).
Availability of data and materials
HB is the principal investigator of the BLITZ and DACHS+ studies. HB conceived the study. JAH analyzed the data and wrote the manuscript draft. Both authors have read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
The BLITZ and DACHS+ studies have been approved by the ethics committee of the Medical Faculty of the University of Heidelberg, and informed consent was collected from all participants.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Arnold M, Sierra MS, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global patterns and trends in colorectal cancer incidence and mortality. Gut. 2016. doi:10.1136/gutjnl-2015-310912.
- Howlader N, Noone A, Krapcho M, Miller D, Bishop K, Altekruse S, Kosary C, Yu M, Ruhl J, Tatalovich Z, Mariotto A, Lewis D, Chen H, Feuer E. SEER Cancer Statistics Review, 1975–2013, based on November 2015 SEER Data Submission. National Cancer Institute. 2016. http://seer.cancer.gov/csr/1975_2013/. Accessed 07 Jul 2016.
- Kubisch CH, Crispin A, Mansmann U, Göke B, Kolligs FT. Screening for colorectal cancer is associated with lower disease stage: a population-based study. Clin Gastroenterol H. 2016. doi:10.1016/j.cgh.2016.04.008.
- Brenner H, Stock C, Hoffmeister M. Effect of screening sigmoidoscopy and screening colonoscopy on colorectal cancer incidence and mortality: systematic review and meta-analysis of randomised controlled trials and observational studies. BMJ. 2014. doi:10.1136/bmj.g2467.
- Daten und Fakten: Ergebnisse der Studie “Gesundheit in Deutschland Aktuell 2010”: Robert-Koch-Institut; 2012. http://edoc.rki.de/documents/rki_fv/remDCCtjOJxI/PDF/21TgKGZEOWNCY.pdf. Accessed 07 Jul 2016.
- Stock C, Ihle P, Schubert I, Brenner H. Colonoscopy and fecal occult blood test use in Germany: results from a large insurance-based cohort. Endoscopy. 2011. doi:10.1055/s-0030-1256504.
- Adler A, Geiger S, Keil A, Bias H, Schatz P, deVos T, Dhein J, Zimmermann M, Tauber R, Wiedenmann B. Improving compliance to colorectal cancer screening using blood and stool based tests in patients refusing screening colonoscopy in Germany. BMC Gastroenterol. 2014. doi:10.1186/1471-230X-14-183.
- Shah R, Jones E, Vidart V, Kuppen PJK, Conti JA, Francis NK. Biomarkers for early detection of colorectal cancer and polyps: systematic review. Cancer Epidem Biomar. 2014. doi:10.1158/1055-9965.EPI-14-0412.
- Church TR, Wandell M, Lofton-Day C, Mongin SJ, Burger M, Payne SR, Castaños-Vélez E, Blumenstein BA, Rösch T, Osborn N, Snover D, Day RW, Ransohoff DF, for the PRESEPT Clinical Study Steering Committee Investigators and Study Team. Prospective evaluation of methylated SEPT9 in plasma for detection of asymptomatic colorectal cancer. Gut. 2014. doi:10.1136/gutjnl-2012-304149.
- Hundt S, Haug U, Brenner H. Comparative evaluation of immunochemical fecal occult blood tests for colorectal adenoma detection. Ann Intern Med. 2009. doi:10.7326/0003-4819-150-3-200902030-00005.
- Tao S, Haug U, Kuhn K, Brenner H. Comparison and combination of blood-based inflammatory markers with faecal occult blood tests for non-invasive colorectal cancer screening. Brit J Cancer. 2012. doi:10.1038/bjc.2012.104.
- Heiss JA, Brenner H. Between-array normalization for 450K data. Front Genet. 2015. doi:10.3389/fgene.2015.00092.
- Houseman E, Accomando WP, Koestler DC, Christensen BC, Marsit CJ, Nelson HH, Wiencke JK, Kelsey KT. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012. doi:10.1186/1471-2105-13-86.
- Bansal A, Pepe MS. Estimating improvement in prediction with matched case-control designs. Lifetime Data Anal. 2013. doi:10.1007/s10985-012-9237-1.
- Werner S, Krause F, Rolny V, Strobl M, Morgenstern D, Datz C, Chen H, Brenner H. Evaluation of a 5-marker blood test for colorectal cancer early detection in a colorectal cancer screening setting. Clin Cancer Res. 2016. doi:10.1158/1078-0432.CCR-15-1268.
- Imperiale TF, Ransohoff DF, Itzkowitz SH, Levin TR, Lavin P, Lidgard GP, Ahlquist DA, Berger BM. Multitarget stool DNA testing for colorectal-cancer screening. New Engl J Med. 2014. doi:10.1056/NEJMoa1311194.
- Brenner H, Tao S. Superior diagnostic performance of faecal immunochemical tests for haemoglobin in a head-to-head comparison with guaiac based faecal occult blood test among 2235 participants of screening colonoscopy. Eur J Cancer. 2013. doi:10.1016/j.ejca.2013.04.023.
- Buchen L. Cancer: missing the mark. Nature. 2011. doi:10.1038/471428a.
- Thorpe JD, Duan X, Forrest R, Lowe K, Brown L, Segal E, Nelson B, Anderson GL, McIntosh M, Urban N. Effects of blood collection conditions on ovarian cancer serum markers. PLoS ONE. 2007. doi:10.1371/journal.pone.0001281.
- Pepe MS, Fan J, Seymour CW, Li C, Huang Y, Feng Z. Biases introduced by choosing controls to match risk factors of cases in biomarker research. Clin Chem. 2012. doi:10.1373/clinchem.2012.186007.
- Brenner H, Altenhofen L, Tao S. Matching of controls may lead to biased estimates of specificity in the evaluation of cancer screening tests. J Clin Epidemiol. 2013. doi:10.1016/j.jclinepi.2012.09.008.
- Jaffe AE, Irizarry RA. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 2014. doi:10.1186/gb-2014-15-2-r31.
- Kilincalp S, Çoban Ş, Akinci H, Hamamcı M, Karaahmet F, Coşkun Y, Üstün Y, Şimşek Z, Erarslan E, Yüksel İ. Neutrophil/lymphocyte ratio, platelet/lymphocyte ratio, and mean platelet volume as potential biomarkers for early detection and monitoring of colorectal adenocarcinoma. Eur J Cancer Prev. 2015. doi:10.1097/CEJ.0000000000000092.
- Buhule OD, Minster RL, Hawley NL, Medvedovic M, Sun G, Viali S, Deka R, McGarvey ST, Weeks DE. Stratified randomization controls better for batch effects in 450K methylation analysis: a cautionary tale. Front Genet. 2014. doi:10.3389/fgene.2014.00354.
- van den Boogaart KG, Tolosana-Delgado R. Analyzing compositional data with R: Springer Berlin Heidelberg; 2013. doi:10.1007/978-3-642-36809-7.