Global DNA methylation patterns in Barrett’s esophagus, dysplastic Barrett’s, and esophageal adenocarcinoma are associated with BMI, gender, and tobacco use

Background: The risk of developing Barrett’s esophagus (BE) and/or esophageal adenocarcinoma (EAC) is associated with specific demographic and behavioral factors, including gender, obesity/elevated body mass index (BMI), and tobacco use. Alterations in DNA methylation, an epigenetic modification that can affect gene expression and that can be influenced by environmental factors, are frequently present in both BE and EAC and are believed to play a role in the formation of BE and its progression to EAC. It is currently unknown whether obesity or tobacco smoking influences the risk of developing BE/EAC via the induction of alterations in DNA methylation. To investigate this possibility, we assessed the genome-wide methylation status of 81 esophageal tissues, including BE, dysplastic BE, and EAC epithelia, using HumanMethylation450 BeadChips (Illumina).
Results: We found numerous differentially methylated loci in the esophageal tissues when comparing males to females, obese to lean individuals, and smokers to nonsmokers. Differences in DNA methylation between these groups were seen in a variety of functional genomic regions, both within and outside of CpG islands. Several cancer-related pathways were found to have differentially methylated genes between these comparison groups.
Conclusions: Our findings suggest that obesity and tobacco smoking may influence DNA methylation in the esophagus and raise the possibility that these risk factors affect the development of BE, dysplastic BE, and EAC by influencing the epigenetic status of specific loci that have a biologically plausible role in cancer formation.
Electronic supplementary material: The online version of this article (doi:10.1186/s13148-016-0273-7) contains supplementary material, which is available to authorized users.


Background
The incidence of esophageal adenocarcinoma (EAC) has been increasing in the USA for several decades for reasons that are not entirely clear but may be related to the increasing prevalence of risk factors such as obesity [1]. The precursor lesion for EAC is Barrett's esophagus (BE), a metaplastic condition where the squamous-lined esophageal mucosa is replaced by specialized intestinal mucosa. A minority of individuals with BE will develop EAC through a progression sequence in which BE transitions to BE with low-grade dysplasia (LGD), BE with high-grade dysplasia (HGD), and ultimately to EAC [2].
It is recognized that both genetic and epigenetic alterations arise in the esophagus during the development and progression of BE and EAC [3][4][5]. Epigenetic alterations, primarily in the form of hypermethylated or hypomethylated CpG dinucleotides in the DNA, have been described in BE and EAC using both candidate gene approaches and microarray-based strategies. Hypermethylation of CpGs in CpG islands in promoter regions has been associated with the repression of transcription of some genes, and hypermethylation of CpGs in gene bodies is associated with increased gene expression [6,7]. The effects of DNA methylation on the regulation of gene expression have supported the plausibility that alterations in DNA methylation can affect disease processes in people.
Aberrant DNA methylation has been shown to occur early in the BE → dysplastic BE → EAC progression sequence [8]. The aberrant methylation of numerous cancer-related genes, such as CDKN2A, as well as global alterations in DNA methylation has been observed in BE, and many of these epigenetic alterations are also found in dysplastic BE and EAC [8][9][10][11][12][13]. However, despite the near universal observation of altered DNA methylation in BE and EAC, the mechanisms driving aberrant DNA methylation in the esophagus, as in most other pre-neoplastic and neoplastic tissues, remain elusive.
The risk of developing BE and/or EAC is associated with specific demographic and behavioral factors, including obesity/elevated body mass index (BMI) and tobacco use [14,15]. Numerous mechanisms through which these factors may affect BE and/or EAC formation have been proposed [16,17]; however, no assessment of effects on the epigenome in the esophagus has been made to date. There is evidence that certain environmental, behavioral, and demographic factors can influence the epigenetic state, which suggests that the behavioral factors associated with BE and EAC may act by inducing alterations in the methylation status of DNA [18]. For example, alterations in the methylation status of CpG islands in the promoter regions of genes implicated in obesity, appetite control, and metabolism have been shown to occur in DNA isolated from blood and breast tissue of obese compared to lean individuals [19][20][21][22]. Tobacco smoking, meanwhile, has been associated with alterations in DNA methylation of multiple cancer-related genes in studies focused on single candidate genes as well as in genome-wide methylation studies of prostate cancer, the bronchial epithelium, and peripheral blood mononuclear cells [23][24][25][26].
These observations led us to use HumanMethylation450 (HM450) BeadChips to evaluate epigenome-wide patterns of DNA methylation in a collection of human esophageal tissue samples, including cases of Barrett's esophagus (BE), Barrett's with low- or high-grade dysplasia, and esophageal adenocarcinoma (EAC). We were interested in determining whether BMI, tobacco smoking, and/or gender were associated with increased or decreased DNA methylation at specific CpG dinucleotides or in particular genomic regions, which would support a possible functional role in the pathogenesis of EAC. We also examined whether epigenetic alterations linked with these demographic features were associated with particular molecular or cancer-related pathways, in order to identify possible mechanisms through which alterations in DNA methylation status may be involved in the formation of BE and/or EAC.

Results
Differences in the methylation status of genes in obesity-related pathways are associated with BMI status
Obesity has been consistently associated with an increased risk for developing both BE and EAC, yet little is known about the mechanisms involved in this elevated risk [27,28]. While it is likely that both somatic genetic and epigenetic alterations play a role in the pathogenesis of BE and EAC, there is currently very little information about the relationship among BE/EAC, obesity, and aberrant DNA methylation. From the 81 samples we analyzed on the HM450 array, body mass index (BMI) data were available for 46 cases, including 15 BE, 14 LGD, nine HGD, and eight EAC cases. We classified each of these samples as arising in the setting of either high BMI (BMI > 30) or low BMI (BMI ≤ 30). Of the seven female patients, three were in the low BMI group and four were in the high BMI group. First, we determined whether the BE samples from the high BMI group (N = 11) had global DNA methylation alterations that were more closely related to HGD and/or EAC cases compared to the BE samples from study subjects with low BMI (N = 4). We found that high and low BMI BE cases tended to cluster together and that the high BMI BE cases did not appear to be more related to HGD/EAC than the low BMI BE cases (data not shown).
Next, we assessed for differentially methylated loci (DML) that varied between the combined esophageal tissue samples (BE, LGD, HGD, EAC) from individuals with high vs. low BMI. Using criteria for DML of a p value < 0.001 and Δβ between high BMI and low BMI > 0.10, we found a total of 974 DML between the high and low BMI groups, including 226, 471, and 277 DML located in promoter, intragenic, and intergenic regions, respectively. A dendrogram depicting the DML between high and low BMI patients is shown in Fig. 1. One hundred and eighty-two (182) DML were located in CpG islands and 376 were located in CpG island shores (within 2 kb of a CpG island [29]). We also found 352 DML (36.1 % of the total 974 DML) that were cancer associated, which we defined as loci that were differentially methylated between the normal squamous (SQ; N = 12) and EAC (N = 24) cases on the HM450 array. In general, the high BMI cases showed increased methylation at the DML, with 872 out of 974 DML (89.5 %) demonstrating elevated methylation in high vs. low BMI cases. The DML with the greatest statistical significance (p < 5 × 10⁻⁶) associated with BMI are shown in Table 1.
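The DML criteria described here (p < 0.001 and Δβ > 0.10) amount to a two-condition filter applied to each probe. The following is a minimal Python sketch of that filter, not the pipeline used in this study. It assumes that per-probe mean beta values and p values have already been computed by some upstream test, and that the Δβ threshold applies to the absolute difference. The function name `call_dml`, the input layout, and the probe IDs are hypothetical.

```python
def call_dml(probes, p_cutoff=0.001, delta_cutoff=0.10):
    """Return probes meeting both DML criteria, with direction of change.

    probes: dict mapping probe ID -> (mean_beta_high, mean_beta_low, p_value).
    The p values are assumed to come from an upstream per-probe test.
    """
    dml = {}
    for probe_id, (beta_high, beta_low, p_value) in probes.items():
        delta = beta_high - beta_low
        # Both conditions must hold: significance and effect size.
        if p_value < p_cutoff and abs(delta) > delta_cutoff:
            dml[probe_id] = "hyper" if delta > 0 else "hypo"
    return dml


# Hypothetical probe IDs and values, for illustration only.
example = {
    "cg_a": (0.36, 0.11, 0.0002),  # large difference, significant
    "cg_b": (0.30, 0.25, 0.0001),  # significant, but difference too small
    "cg_c": (0.20, 0.45, 0.0005),  # lower methylation in the high BMI group
    "cg_d": (0.60, 0.30, 0.02),    # large difference, not significant
}
result = call_dml(example)
hyper_fraction = sum(v == "hyper" for v in result.values()) / len(result)
```

Requiring an effect-size threshold in addition to a p value cutoff keeps statistically significant but very small methylation differences out of the DML list.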
We also evaluated the association of BMI with tissue DNA methylation in the separate histologic types of esophageal tissues (e.g., BE, BE with LGD, BE with HGD, and EAC). We compared methylation in the high BMI (N = 4) vs. low BMI (N = 11) BE cases, the high BMI (N = 7) vs. low BMI (N = 7) LGD cases, and the high BMI (N = 9) vs. low BMI (N = 8) HGD/EAC cases. Table 2 summarizes the DML found when comparing these groups. The methylation status of the high compared to low BMI BE cases with respect to genomic regions and CpG island location is shown in Fig. 2. In general, in the BE cases, DML located in promoters and CpG islands were hypermethylated in high BMI vs. low BMI cases, whereas DML located elsewhere were hypomethylated in high BMI vs. low BMI cases. In contrast to this, DML in the HGD/EAC cases were hypermethylated at all functional regions as well as CpG island shores, shelves [30], and open seas in the high BMI vs. low BMI cases but not at CpG islands (Fig. 2).
We also looked to see if any of the DML between the high and low BMI BE cases overlapped with any of the DML when comparing BE to EAC, in order to determine if methylation alterations in obese individuals with BE might be associated with progression to HGD/EAC. We did find nine probes that overlapped between these groups, including those targeting the genes HLA-DPA1, TBR1, OSR2, TMEM63A, CD300E, and UBD/FAT10. UBD/FAT10, which we found to be hypomethylated in high BMI BE patients, is of interest as this gene has been shown to be overexpressed in hepatocellular carcinoma (HCC) and is thought to modulate the β-catenin/TCF4 pathway and drive HCC invasion and metastasis [31].
Because of the potential for DNA methylation alterations to modify gene expression, we next assessed the methylation status of CpGs located in genes associated with signaling pathways and biological mediators implicated in obesity-associated cancers [17,32,33] in the esophageal tissues from the subjects with low vs. high BMI. With regard to the insulin and insulin-like growth factor 1 (IGF-1) related pathways, we observed increased methylation of IGFBP1 (average beta = 0.11 in low BMI cases and 0.27 in high BMI cases) and IRS2 (average beta = 0.11 in low BMI cases and 0.36 in high BMI cases) in the high BMI compared to low BMI BE cases. Both genes were hypermethylated in the high BMI cases in a CpG island located within exon 1. Unlike in the BE cases, genes of the insulin or IGF-1 pathways did not show altered methylation in high vs. low BMI cases in the LGD, HGD, or EAC tissue sets. We also examined molecular pathways associated with adipose inflammation, which has been shown to mediate obesity-related cancer [32], and found the proinflammatory gene IL-1β (IL1B) to be hypermethylated in high vs. low BMI cases when we assessed the combined esophageal tissue sets. We also found hypermethylation of IL1B in the HGD/EAC cases from high BMI subjects. For the combined cases, the average beta was 0.25 (SD = 0.10, 95 % CI = 0.21-0.30) in low BMI cases and 0.35 (SD = 0.12, 95 % CI = 0.30-0.41) in high BMI cases, and for the HGD/EAC cases, the average beta was 0.20 (SD = 0.08, 95 % CI = 0.12-0.27) in low BMI cases and 0.38 (SD = 0.11, 95 % CI = 0.30-0.47) in high BMI cases. Of interest, adiponectin and leptin have also been implicated in obesity-associated cancer [34,35]; however, we did not observe any differences in the DNA methylation status of genes involved in leptin or adiponectin pathways in any of the esophageal tissue sets in the high vs. low BMI subjects.
Fig. 1 Dendrograms depicting DML when comparing high to low BMI cases. Because absolute differences in methylation (i.e., beta values) between cases were small, these heatmaps illustrate relative differences in methylation between cases instead of absolute beta values. a High vs. low BMI, all cases (BE, LGD, HGD, and EAC) combined. b High vs. low BMI, BE cases. c High vs. low BMI, HGD/EAC cases
There are numerous differentially methylated regions (DMR) between individuals with high and low BMI in esophageal tissues
The analysis described above was focused on the methylation status of individual CpG dinucleotides located in promoters, gene bodies, and intergenic regions. Because methylation changes spanning multiple adjacent CpG sites may be more functionally relevant than changes at single CpGs [36,37], we next assessed for differentially methylated regions (DMR) in the esophageal tissue samples from the low vs. high BMI subjects. Among the BE cases, there were DMR in 10 genes that differed between the high and low BMI groups (FWER < 0.10, Δβ > 0.10, and at least two contiguous CpG dinucleotides differentially methylated). Examples of two of these genes, TFAP2C and DIP2C, are shown in Fig. 3. Among the HGD/EAC cases, 31 DMR were identified using the same criteria, including regions in the genes ZNF790 and SIM2 (Fig. 3). We did not find any DMR within prominent genes in the insulin, IGF-1, TNF-α, or leptin pathways.
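The "at least two contiguous CpG dinucleotides" part of the DMR criteria can be illustrated with a simple run-finding routine. This is a hedged Python sketch and not the DMR-calling method used in this study. It covers only the Δβ and contiguity conditions and omits the FWER calculation entirely. It also assumes, as an illustration, that "contiguous" means adjacent probes in genomic order whose differences share the same direction. The function name and the example coordinates are hypothetical.

```python
def find_dmr_candidates(probes, delta_cutoff=0.10, min_cpgs=2):
    """Group probes into runs of adjacent, concordantly differential CpGs.

    probes: list of (position, delta_beta) tuples for one gene or region,
    where delta_beta is the between-group difference in mean beta.
    Returns a list of (start, end, n_cpgs) tuples.
    """
    runs, current, direction = [], [], 0
    for position, delta in sorted(probes):
        # sign is +1 (hyper), -1 (hypo), or 0 (below the effect-size cutoff).
        sign = (delta > delta_cutoff) - (delta < -delta_cutoff)
        if sign != 0 and sign == direction:
            current.append(position)
        else:
            # The run is broken: keep it if long enough, then start over.
            if len(current) >= min_cpgs:
                runs.append((current[0], current[-1], len(current)))
            current = [position] if sign != 0 else []
            direction = sign
    if len(current) >= min_cpgs:
        runs.append((current[0], current[-1], len(current)))
    return runs


# Hypothetical coordinates and differences, for illustration only.
example = [(100, 0.15), (150, 0.22), (210, 0.12), (400, 0.03),
           (520, -0.18), (560, -0.14), (900, 0.30)]
candidates = find_dmr_candidates(example)
# candidates == [(100, 210, 3), (520, 560, 2)]
```

The isolated probe at position 900 exceeds the Δβ cutoff but is not reported, because a single differential CpG does not form a region.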
A comparison of genes showing differential methylation between high vs. low BMI cases demonstrates the involvement of cancer-related pathways and gene sets
We used the NCI Pathway Interaction Database (NCI-PID), Kyoto Encyclopedia of Genes and Genomes (KEGG) database, and the list of Gene Ontology (GO) terms to identify biological processes or pathways that were over- or under-represented based on genes containing DML between the esophageal tissue sets in the subjects with either high or low BMI status. As mentioned previously, we defined "cancer-associated" probes as those that were differentially methylated between EAC and SQ cases on the array. Among the BE cases, we found one NCI-PID pathway, "direct p53 effectors", which includes the differentially methylated gene RDX from our dataset, associated with methylation differences between high and low BMI groups. There were 13 KEGG pathways (including "cell adhesion molecules") and 77 GO terms (including "response to growth hormone" and "biological adhesion") that were represented in the differentially methylated genes in the BE samples from the high vs. low BMI subjects. The list of GO terms is shown in Additional file 1: Table S2.
Fig. 2 Genomic location, relationship to CpG islands, and methylation status of DML when comparing high vs. low BMI esophageal samples. In each panel, "Hypo" refers to percentage of DML that are hypomethylated in high BMI vs. low BMI samples; "Hyper" refers to percentage of DML that are hypermethylated in high BMI vs. low BMI samples. On the Y axis, DMLs (%) refers to the percentage of the total DML that are associated with a particular genomic location (a, d) or CGI relationship (b, e). Percentages may add up to more than 100 % because some probes were classified with more than one designation. Beta values are equivalent to percent methylation. a DML when comparing high BMI to low BMI BE cases by genomic region. Non-promoter regions were enriched with hypomethylated loci (p = 0.008), whereas promoter regions were borderline-enriched with hypermethylated loci (p = 0.06).
With respect to the HGD/EAC cases, there were no NCI-PID pathways that were significantly associated with methylation differences between high and low BMI status after restricting our analysis to only cancer-related genes. There was one KEGG pathway ("Wnt signaling") and 87 GO terms (such as "tissue morphogenesis" and "response to TGF-beta") differentially methylated between HGD/EAC cases from subjects with high BMI vs. low BMI (p value <0.05) (Additional file 2: Table S3).

Gender-related differences in DNA methylation in esophageal tissues
Little is known about gender-specific variations in DNA methylation in most tissues, including the esophagus. Previous studies have shown that repetitive elements and specific CpG dinucleotides isolated from blood samples demonstrate modestly increased methylation in males compared to females [38,39]. Another study of four candidate genes in colorectal adenocarcinoma cells demonstrated that males had increased methylation of MTHFR, CALCA, and MGMT compared to females [40]. To the best of our knowledge, a genome-wide analysis of gender differences in DNA methylation in the esophagus has not been previously reported. Using HM450 array analysis of BE, HGD, and EAC esophageal samples from 118 males and 23 females, we found numerous CpG sites that were differentially methylated between the genders after excluding probes on the X and Y chromosomes and after accounting for differences in age between the men and women in our study. When we combined the BE, HGD, and EAC cases, there were 1092 DML, including 369, 421, and 402 DML located in promoter, intragenic, and intergenic regions, respectively. From this list, there were 402 DML where the mean beta value difference between males and females was > 0.10 and the p value was < 0.001. These DML were associated with CpGs in genes such as DUSP22, a regulator of estrogen receptor alpha-mediated signaling; FRG1B, which is involved in pre-mRNA splicing; and CGREF1, which mediates cell-cell adhesion in a calcium-dependent manner. Of these 402 DML, 327 (81.3 %) were more highly methylated in females. The DML with the greatest statistical significance (p < 5 × 10⁻⁶) between males and females are listed in Table 3. Of interest, half of the top DML were located in CpG islands.
Tobacco use is associated with DNA hypermethylation in the esophagus
Tobacco smoking, which is a well-known risk factor for Barrett's esophagus and EAC, has been associated with alterations in DNA methylation in peripheral blood lymphocytes [26,41]. However, little is known about the relationship between smoking and DNA methylation alterations in esophageal tissues, including BE and EAC. To investigate this further, we assessed the relationship between smoking and aberrant DNA methylation in samples from subjects for which we had data on tobacco use. We divided cases into "smokers" (which included both current and former smokers) and "nonsmokers"; we did not further segregate smokers by current smoking status, pack-years, etc. due to the relatively small number of cases available. We first compared BE nonsmokers (N = 7) to BE smokers (N = 9) using principal component analysis (PCA) to determine whether methylation patterns of BE smokers more closely resembled the patterns we observed in LGD, HGD, and/or EAC cases compared to BE nonsmokers. When we examined the 1000 loci with the most variable methylation between groups, we did not find that the BE smokers grouped more closely with LGD/HGD/EAC than the BE nonsmokers did (data not shown).
Next, we evaluated 54 esophageal samples of various histologic types (BE, LGD, HGD, and EAC) for global alterations in DNA methylation associated with tobacco smoking. After controlling for differences associated with the histological diagnosis (BE, LGD, or HGD/EAC), we found a total of 256 DML between the smokers (N = 40) and nonsmokers (N = 14) (Δβ > 0.10, p < 0.001). Heatmaps depicting the DML between smokers and nonsmokers are shown in Fig. 4.
These DML included 98, 40, and 118 loci located in promoter, intragenic, and intergenic regions, respectively. Two hundred forty-two (242) of the 256 DML (94.5 %) were more highly methylated in smokers compared to nonsmokers, and 105 of the 256 DML (41.0 %) affected cancer-associated genes, as based on the criteria described above. The DML with the greatest statistical significance (p < 1 × 10⁻⁴) associated with smoking are shown in Table 4.
We also evaluated the association of tobacco use with DNA methylation in the separate esophageal tissue types (i.e., BE, LGD, HGD, and EAC). We assessed for DML in the BE smokers (N = 9) vs. BE nonsmokers (N = 7) and in the HGD/EAC smokers (N = 19) vs. HGD/EAC nonsmokers (N = 7) while controlling for age differences. We were not able to compare the LGD cases as all samples were from smokers. Table 5 summarizes the DML we identified for these comparisons and shows the functional genomic locations of the loci when comparing these groups. The methylation status of the BE and HGD/EAC tissues from smokers compared to nonsmokers with respect to genomic regions and CpG island location is shown in Fig. 5. In both BE and HGD/EAC cases, the DML from smokers showed much higher methylation in all genomic regions analyzed (Fig. 5).
There are numerous differentially methylated regions (DMR) in esophageal tissues based on tobacco use status
As with the BMI-based comparison described above, we extended our analysis of differential DNA methylation between smokers and nonsmokers to include differentially methylated regions in addition to DML, which are single CpG sites. Among the BE cases, DMR involving 13 genes were found when comparing smokers to nonsmokers (FWER < 0.10, Δβ > 0.10, and at least two contiguous CpG dinucleotides differentially methylated). These included DMR located within the genes TNXB and HOXA4, which are notable because TNXB is a member of the tenascin family and regulates cell-extracellular matrix interactions [42,43] and HOXA4 is a transcription factor previously shown to inhibit cell motility and to be aberrantly methylated in acute myeloid leukemia [44,45] (Fig. 6). TNXB is normally more highly expressed in BE tissues compared to normal squamous esophagus (fold change = 3.39), whereas HOXA4 has not been shown to be differentially expressed in BE vs. normal esophagus (expression data obtained from www.oncomine.org). Among the HGD/EAC cases, we identified 29 DMR, including areas with altered methylation in the genes GFI1, which is a transcriptional repressor implicated in the regulation of p53 activity and Notch signaling [46,47], and CLDN11, a cell adhesion protein involved in cell migration that is commonly altered in cancer [48] (Fig. 6). Normally, both GFI1 and CLDN11 have been shown to be more highly expressed in EAC tissues vs. normal esophagus (fold changes = 1.30 and 1.11-3.39, respectively; www.oncomine.org).

Differences in esophageal DNA methylation between smokers and nonsmokers are associated with several cancer-related pathways and gene sets
We were interested to see which molecular and cancer-related pathways were associated with the epigenetic differences in the BE and EAC tissues from smokers as compared to nonsmokers. As with the BMI cases, we restricted our NCI Pathway Interaction Database (NCI-PID) analysis to only those DML that we considered to be "cancer related" to improve the likelihood these pathways would contain biologically plausible mechanisms involved in smoking-related BE and/or EAC formation. Analysis of BE cases alone did not identify any NCI-PID pathways that were differentially methylated in BE smokers vs. nonsmokers. However, there was 1 KEGG pathway ("type 1 diabetes mellitus") and 20 GO terms (including "positive regulation of mismatch repair" and "enteric smooth muscle cell differentiation") that were differentially represented between the BE samples from smokers vs. nonsmokers (p < 0.05) (Additional file 3: Table S4).
When we compared DNA methylation in the HGD/EAC tissues of smokers and nonsmokers, we found two NCI-PID pathways associated with alterations in DNA methylation and smoking (FDR ≤ 0.05), including the "neurotrophic factor-mediated Trk receptor signaling" and "SHP2 signaling" pathways. The differentially methylated genes NTRK2 and NTRK3 were notable affected members of both of these pathways. There were no KEGG pathways, but there were 217 GO terms (such as "localization of cell" and "regulation of cell migration") that were differentially represented (Additional file 4: Table S5).

Discussion
Genetic and epigenetic alterations are commonly found in BE and EAC and likely play a prominent role in driving the initiation and progression of BE to EAC. It is also well known that a variety of environmental factors associate with the risk of developing BE and/or EAC. Thus, we assessed the relationship between DNA methylation in the esophagus and known risk factors for BE and EAC using a genome-wide methylation platform.
We also sought to describe the epigenetic differences between males and females in esophageal tissues in light of the known differences in BE and EAC incidence in men vs. women. With respect to demographic and behavioral variables, we were particularly interested in the correlation of BMI and tobacco use with DNA methylation since both are well-established risk factors for BE and EAC.
We assessed the methylation status of more than 485,000 CpG sites located in 99 % of the RefSeq genes in 81 esophageal tissues representative of the stages of esophageal adenocarcinoma development (BE, BE + LGD, BE + HGD, EAC). The annotation of array probes permitted us to determine whether differentially methylated loci were located in specific types of genomic regions (promoter, gene body, or intergenic) and to determine the relationship of differentially methylated loci (DML) to CpG islands (CpG island, shore, shelf, or open sea). Our analysis of the regions outside of promoter-related CpG islands is notable because methylation alterations in areas with relatively low CpG density are increasingly recognized to be important in diseases such as cancer [49,50]. It has been shown that CpG-rich regions (i.e., CpG islands) demonstrate more stable DNA methylation across tissues and cell populations, whereas methylation is more dynamic in CpG shores (within 2 kb of a CpG island) and CpG shelves (within 4 kb of a CpG island). Furthermore, the methylation status of CpG shores and shelves appears to regulate gene expression [29,51].
Fig. 5 Genomic location, relationship to CpG islands, and methylation status of DML when comparing smokers and nonsmokers in esophageal samples. "Hypo" refers to percentage of DML that are hypomethylated in smokers vs. nonsmokers; "Hyper" refers to percentage of DML that are hypermethylated in smokers vs. nonsmokers. On the Y axis, DMLs (%) refers to the percentage of the total DML that are associated with a particular genomic location (a, d) or CGI relationship (b, e). Percentages may add up to more than 100 % because some probes were classified with more than one designation. Beta values are equivalent to percent methylation. Note: for all regions, the distribution of hypo/hypermethylated DML compared to the expected distribution (based on all array probes) was not statistically significant. a DML when comparing smoker to nonsmoker BE cases by genomic region. b Location of DML when comparing smoker to nonsmoker BE cases with respect to CpG island location. c Box and whisker plots showing distribution of DML that are hypomethylated in the smoker vs. nonsmoker BE cases (left) and hypermethylated in the smoker vs. nonsmoker BE cases (right). d DML when comparing smoker vs. nonsmoker HGD/EAC cases by genomic region. e Location of DML when comparing smoker vs. nonsmoker HGD/EAC cases with respect to CpG island location. f Box and whisker plots showing distribution of DML that are hypomethylated in the smoker vs. nonsmoker HGD/EAC cases (left) and hypermethylated in the smoker vs. nonsmoker HGD/EAC cases (right)
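The island, shore, shelf, and open sea categories used in this analysis can be expressed as a distance rule relative to the nearest CpG island. The Python sketch below is illustrative only: in practice these labels come from the array's probe annotation, and the function name, argument layout, and coordinates here are hypothetical. It assumes the distance cutoffs stated in the text (shore within 2 kb, shelf within 4 kb) and treats a shelf as the band beyond the shore.

```python
def cgi_relation(cpg_pos, islands, shore_kb=2, shelf_kb=4):
    """Classify a CpG position relative to the nearest CpG island.

    islands: list of (start, end) coordinates on the same chromosome.
    Returns "island", "shore", "shelf", or "open sea".
    """
    if not islands:
        return "open sea"
    # Distance is 0 inside an island, otherwise the gap to its nearer edge.
    distance = min(
        0 if start <= cpg_pos <= end
        else min(abs(cpg_pos - start), abs(cpg_pos - end))
        for start, end in islands
    )
    if distance == 0:
        return "island"
    if distance <= shore_kb * 1000:
        return "shore"
    if distance <= shelf_kb * 1000:
        return "shelf"
    return "open sea"


# Hypothetical island coordinates, for illustration only.
islands = [(10_000, 11_000)]
labels = [cgi_relation(pos, islands) for pos in (10_500, 12_500, 14_000, 20_000)]
# labels == ["island", "shore", "shelf", "open sea"]
```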
We initially investigated the relationship between DNA methylation and BMI in esophageal tissues. Elevated body mass index (BMI) is an established risk factor for BE and EAC, and we demonstrated that DNA isolated from individuals with BMI > 30 was differentially methylated at nearly 1000 CpG sites in combined BE, BE with low- and high-grade dysplasia, and EAC tissues when compared to samples from individuals with a low BMI status. Interestingly, nearly 90 % of the DML showed elevated methylation in the high BMI cases, and over 36 % of the total DML were cancer related. This proportion is about 20 percentage points higher than we would expect by chance alone, since just 16 % of the total probes on the array are "cancer related" by our criteria as previously described. In the BE cases, DML located in promoters and CpG islands tended to be hypermethylated in those with high BMI, which suggests a possible association between methylation and altered gene expression in those with elevated BMI, as promoter hypermethylation has been associated with gene silencing [52]; this remains speculative given that we did not have associated gene expression data. There was also evidence of altered methylation in BE and EAC samples from obese patients when we looked at differentially methylated regions (DMR), which are genomic regions that have multiple adjacent CpG sites showing concordant methylation changes. DMR are potentially more biologically important than differentially methylated individual CpG dinucleotides because they are indicative of larger scale epigenetic alterations that might be more relevant functionally [36,53].
Fig. 6 (partial caption) Smokers are shown in pink and nonsmokers in blue. a TNXB gene, BE cases. b HOXA4 gene, BE cases. c GFI1 gene, HGD/EAC cases. d CLDN11 gene, HGD/EAC cases
We were also interested in whether the high BMI BE cases displayed methylation alterations resembling the EAC cases, our rationale being these epigenetic alterations in the obese with BE might be markers for progression to dysplasia or cancer and provide some evidence of a biological role for the genes subjected to aberrant methylation. This was not the case, however, as the high BMI cases clustered more closely with the low BMI cases, not the EAC cases.
We subjected the DML to KEGG, Gene Ontology (GO), and NCI-PID analyses to determine whether particular molecular groups or pathways were associated with the methylation changes in obese individuals with BE, dysplastic BE, or EAC. Among the BE cases, we found epigenetic alterations in the direct p53 effectors pathway in individuals with elevated BMI. This included differentially methylated loci within the RDX gene, which encodes a cytoskeletal component that has been shown to inhibit metastasis in gastric cancer [54].
TP53, the gene for p53, is a well-known tumor suppressor gene that is frequently lost early in BE through mutation or loss of heterozygosity (LOH) [55]. TP53 LOH has been shown to identify a subset of BE patients who are at risk for progression to EAC [56,57]. The finding of differential methylation involving the p53 pathway in BE from subjects with high vs. low BMI suggests a relationship between obesity and DNA methylation of cancer-related genes in the esophagus. Similar results have been found in other studies comparing methylation in obese to lean individuals. In a recent study of 345 breast cancer cases, the majority (87 %) of CpG sites analyzed showed elevated methylation in obese patients, particularly in estrogen receptor-positive tumors. Obesity was associated with the aberrant methylation of cancer-related genes involved with the immune response, cell growth, and DNA repair [22]. Several prior studies have compared DNA methylation in whole blood or peripheral blood leukocytes among obese and nonobese individuals [58][59][60]. In two of these studies, the gene HIF3A was found to be hypermethylated in the blood cells and adipose tissue of obese adults, suggesting perturbation of the hypoxia inducible transcription factor pathway in those with elevated BMI.
We were also interested in determining if there were genome-wide differences in esophageal DNA methylation between males and females. Previously, when we used a genome-wide approach to compare methylation in the normal colon between males and females, we found 82 DML between the groups, with females showing increased methylation at 69.5 % of the differentially methylated CpGs [61]. In the present study, we found 402 DML after controlling for age and histology, with 81.3 % showing higher methylation in females. Other studies have shown differences in autosomal DNA methylation by gender in the brain, saliva, and blood [9,62,63]. These results suggest that DNA methylation might function in the differentiation or maintenance of sexual dimorphism. An understanding of tissue-specific gender differences is also important in terms of understanding the role of environmental, behavioral, and demographic factors on alterations in DNA methylation in order to appropriately account for potentially confounding effects of gender [63].
Tobacco smoking is another known risk factor for developing Barrett's esophagus and esophageal adenocarcinoma [64]. The mechanisms accounting for this risk are only partly understood and are believed to involve carcinogen-mediated mutations. Cigarette smoke contains multiple carcinogens which likely exert their effects via the induction of DNA adducts, aberrant DNA methylation and mutation, and chromosomal translocation [65,66]. In order to define the association between tobacco use and aberrant DNA methylation in BE/EAC, we analyzed 54 esophageal samples of various histological types for global alterations in DNA methylation associated with tobacco smoking. We found 256 DML in these tissues between smokers and nonsmokers. Ninety-five percent (95 %) of these DML showed elevated methylation in the smoker group and 41.0 % were cancer related, which is 25 % more cancer-related DML than would be expected by chance alone.
The finding of widespread and frequent hypermethylation in BE, dysplastic BE, and EAC tissues of tobacco smokers suggests that tobacco-related epigenetic alterations may be a mechanism through which tobacco affects the development of BE and EAC. After enriching the DML (smokers vs. nonsmokers) for cancer-related genes, we found the Trk and Shp2 pathways to be differentially activated between these groups; these differences were driven by hypermethylation of the NTRK2 and NTRK3 genes in smokers. The differentially methylated NTRK2 locus, located in a promoter CpG island, demonstrated an average methylation level of 36 % in the HGD/EAC samples from smokers vs. 9 % in nonsmokers. The differentially methylated NTRK3 locus, located in the gene body, showed an average methylation of 85 % in the HGD/EAC samples of smokers compared to 62 % in nonsmokers. We previously found the aberrant methylation of NTRK3 in 60 % of colon adenomas and 67 % of colon adenocarcinomas, suggesting NTRK3 is a novel conditional tumor suppressor gene that is commonly inactivated in colorectal cancer by both epigenetic and genetic mechanisms [67]. NTRK2 has also been shown to be hypermethylated in colon cancers as well as prostate cancer cell lines and cancers [68,69]. With respect to esophageal cancer, NTRK2 was found to have an altered allele frequency in a group of mainly esophageal squamous cell cancers, suggesting a role in esophageal cancer susceptibility and/or development [70]. The effect of DNA methylation on NTRK2 in BE and HGD/EAC is not clear at this time, as its expression in BE or HGD is similar to that in normal esophagus based on publicly available gene expression data, whereas NTRK3 is normally overexpressed in EAC (but not BE) vs. normal esophagus (expression level 1.03-1.79; www.oncomine.org).
We did not have mRNA expression data available for these samples to allow us to determine whether methylation alterations were associated with concordant changes in expression, which is a limitation of this study. In order to increase the likelihood that differences in methylation between the groups we studied were biologically relevant, we focused upon cancer-related pathways and pathways known to be involved in obesity and inflammation. Another potential limitation of this study is that the EAC cases presumably contained a mix of cell types, including cancer cells, stromal cells, and inflammatory cells. We aimed to reduce the effects of cell heterogeneity by including only samples with >75 % cancer cells and focusing on genes with relatively large differences in methylation.

Conclusions
In summary, we used a microarray-based approach to determine genome-wide methylation profiles of a collection of 81 esophageal specimens, including samples of BE, dysplastic BE, and EAC DNA. With respect to gender, BMI, and tobacco use we found numerous alterations in DNA methylation involving various regions of the genome. These results suggest that obesity and tobacco smoking influence DNA methylation in the esophagus and provide novel insights into the pathways linking these risk factors to the development of BE, dysplastic BE, and EAC.

Methods
Primary tissue samples and sample preparation
DNA was extracted from formalin-fixed, paraffin-embedded (FFPE) tissue cores obtained from the Department of Pathology at University Hospitals Case Medical Center using the DNeasy blood and tissue kit (Qiagen #69504) according to the manufacturer's instructions with minor modifications [71]. Protocols were approved by the institutional review board. All samples were reviewed by an expert gastrointestinal pathologist (JEW) prior to processing. The total number of samples prepared was: 21 Barrett's esophagus (BE), 18 Barrett's with low-grade dysplasia (BE + LGD), 18 Barrett's with high-grade dysplasia (BE + HGD), and 24 with esophageal adenocarcinoma (EAC) (Additional file 5: Table S1). We also analyzed 12 cases of esophageal squamous epithelia (SQ) and compared methylation of this sample group to the EAC group to generate a list of "cancer-associated" loci.
Epithelial cell layers were identified and subsequently microdissected from glass slides. For the EAC cases, at least 75 % of each sample contained cancer in order to minimize methylation differences that might be due to cellular heterogeneity. After extraction, the DNA concentration was determined using the Quant-iT PicoGreen dsDNA assay kit (Invitrogen/Life Technologies, #P7589), and DNA quality was confirmed using the Illumina FFPE QC kit (Illumina, #WG-321-1001). Next, a total of 250 ng of each sample was sodium bisulfite converted using the EZ DNA methylation kit following the manufacturer's protocol (ZymoResearch, #D5002), and then the DNA samples were treated with the Infinium HD FFPE DNA restore kit to repair any degraded DNA (Illumina, #WG-321-1002). Bisulfite-converted, restored DNA was submitted to the Genomics Core at the Fred Hutchinson Cancer Research Center (FHCRC) for processing, application, and scanning on the HumanMethylation450 (HM450) BeadChip following the manufacturer's instructions (Illumina #WG-314-1003; http://www.illumina.com).

Genome-wide methylation arrays
HM450 BeadChips were used to analyze patterns of DNA methylation in 81 of the esophageal samples listed above. We followed our previously validated protocols for data filtering, normalization, and differential methylation analysis [61,72] with the following modifications or clarifications: probes with detection p value >0.05, probes on the X chromosome, and probes containing at least one SNP with low minor allele frequency (MAF = 0) in the probe body were filtered out. After filtering, a total of 453,444 probes were available for downstream analysis. The ComBat algorithm was used to correct known batch effects across the three different microarray experiments while retaining the expected variation between the different histological tissue types [73,74]. Data were analyzed using both "β values," where 0.0 is equivalent to 0 % methylation and 1.0 is equivalent to 100 % methylation, and "M values," which are logarithmic scores similar to those used in gene expression microarrays. We performed clustering analysis using the 3000 most highly variable loci when considering all BE, LGD, HGD, and EAC cases assessed using the HM450 array. We used the limma and minfi Bioconductor packages to compute a refined F-statistic to quantify the difference in DNA methylation based on a probe's M value between sample types. We used a false discovery rate (FDR) q value to determine the significance of differentially methylated loci (DML) and considered loci to be differentially methylated if q < 1 × 10⁻⁵ [75]. Cancer-associated loci were those that showed differential methylation when comparing EAC and squamous (SQ) samples (q < 0.001).
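For readers less familiar with the two methylation scales, the relationship between β values and M values, and the selection of highly variable loci for clustering, can be illustrated with a minimal Python sketch. It is not the pipeline used in this study, which relied on the limma and minfi Bioconductor packages. The sketch assumes the commonly used logit definition M = log2(β / (1 − β)); the function names, the clamping constant `eps`, and the example probe IDs are hypothetical and chosen only for illustration.

```python
import math
from statistics import pvariance


def beta_to_m(beta, eps=1e-6):
    """Convert a beta value (0 to 1) to an M value, assuming M = log2(beta / (1 - beta)).

    eps is an illustrative clamp that keeps beta away from exactly 0 or 1,
    where the logarithm would be undefined.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Inverse transform: recover the beta value from an M value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


def top_variable_loci(beta_by_probe, n):
    """Return the n probe IDs with the largest variance across samples.

    beta_by_probe maps a probe ID to a list of beta values, one per sample.
    """
    ranked = sorted(
        beta_by_probe,
        key=lambda probe: pvariance(beta_by_probe[probe]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical toy data: three probes measured in three samples.
example = {
    "probe_a": [0.1, 0.9, 0.5],  # highly variable
    "probe_b": [0.5, 0.5, 0.5],  # constant
    "probe_c": [0.4, 0.6, 0.5],  # mildly variable
}
selected = top_variable_loci(example, 2)
```

Under this definition, a β value of 0.5 (50 % methylation) corresponds to an M value of 0, and M values grow without bound as β approaches 0 or 1. This is one reason M values are often preferred for statistical testing, while β values are easier to interpret biologically.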