Validation of an epigenetic field of susceptibility to detect significant prostate cancer from non-tumor biopsies

Background: An epigenetic field of cancer susceptibility exists for prostate cancer (PC) that gives rise to multifocal disease in the peripheral prostate. In previous work, genome-wide DNA methylation profiling identified altered regions in the normal prostate tissue of men with PC. In the current multicenter study, we examined the predictive strength of a panel of loci to detect cancer presence and grade in patients with negative biopsy tissue.
Results: Four centers contributed benign prostate biopsy tissue blocks from 129 subjects that were either tumor associated (TA, Grade Group [GG] ≥ 2, n = 77) or non-tumor associated (NTA, n = 52). Biopsies were analyzed using pyrosequencing for DNA methylation encompassing CpG loci near CAV1, EVX1, FGF1, NCR2, PLA2G16, and SPAG4, and methylation differences were detected within all gene regions (p < 0.05). A multiplex regression model for biomarker performance incorporating a gene combination discriminated TA from NTA tissues (area under the curve [AUC] 0.747, p = 0.004). A multiplex model incorporating all the above genes and clinical information (PSA, age) identified patients with GG ≥ 2 PC (AUC 0.815, p < 0.0001). In patients with cancer, increased variation in gene methylation levels occurs between biopsies across the prostate.
Conclusions: A widespread epigenetic field defect can be used to detect GG ≥ 2 PC in patients with histologically negative biopsies. These alterations in non-tumor cells display increased heterogeneity of methylation extent and are spatially distant from tumor foci. These findings have the potential to decrease the need for repeated prostate biopsy.


Background
Prostate cancer (PC) is the most frequently observed cancer in men, with approximately 1 in 6 men diagnosed in their lifetime [1]. Despite its high incidence, PC detection remains clinically challenging. Typically, prostate-specific antigen (PSA) is used to detect PC, and if it is abnormal, a 10- to 12-core biopsy is obtained under ultrasound guidance. Over 40% of patients with a negative biopsy receive a second biopsy, and many will receive additional biopsies in an effort to detect this microscopic disease [2]. Indeed, repeat biopsies account for roughly 780,000 of the 1.2 million biopsies done annually. MRI has improved the detection of larger volume cancers, but roughly 30% of significant PCs remain undetected by this approach [3]. The biology of this common, multifocal, and microscopic disease presents unique genomic opportunities to improve its detection.
The concept of a field defect, which can explain the multifocality of some cancers, including prostate, colon, and bladder [4][5][6], suggests that preneoplastic molecular alterations may exist in benign tissues [2]. The predilection of PC for the peripheral zone of the prostate and its frequent multifocality suggest a field of susceptibility.
This field change is strongly linked to epigenetic alterations, the initial finding being a loss of genomic imprinting for the insulin-like growth factor-2 (IGF2) gene [7,8]. It is also characterized by a panel of DNA methylation changes at specific loci that persist even in regions spatially remote (over 1 cm) from tumor-bearing areas [4]. Because of the widespread nature of these methylation changes in normal tissue, their use may offer increased sensitivity over diagnostic approaches using methylation associated with peritumor or "halo" alterations found in some benign tissues adjacent to cancer [2]. This field of susceptibility offers an opportunity for improved detection of the disease. The primary objective of this study was to further define these methylation patterns in tissue biopsies and validate a panel of methylated regions as a method for detecting higher risk PC in men with histologically negative biopsies.

Results
Patient characteristics
Table 1 displays clinical data for the study cohorts. Benign biopsy cores were obtained from two cohorts: (1) tumor-associated (TA) patients diagnosed with GG ≥ 2 cancer (n = 77), and (2) non-tumor-associated (NTA) patients (n = 52) who had no cancer on any biopsy core. None of the analyzed biopsy cores from either cohort contained cancer. All TA patients also had a radical prostatectomy to confirm a final pathologic Grade Group (GG ≥ 2), consistent with the study's goal of focusing on the detection of clinically significant PC. TA and NTA cohorts are matched for age and PSA density. PSA (7 vs 5.8 ng/mL; p < 0.01) and prostate size (47 g vs 36 g; p < 0.01) are increased in the NTA group compared with the TA group, demonstrating the limited potential of PSA in detecting cancer in this population.

Table 1 (fragment; only the tumor stage rows survived extraction). Stage, number of TA patients: T2a, 6; T2b, 9; T2c, 39; T3a, 18; T3b, 5.
*Some samples are missing data. TA, tumor associated; NTA, non-tumor associated; PSA, prostate-specific antigen; BMI, body mass index. All data represented as mean (range) unless otherwise specified.

Methylation assay performance for two biopsies in discriminating tumor-associated from non-tumor-associated samples
Using bisulfite-treated DNA and pyrosequencing, we observed linear results with standards across the clinically pertinent methylation ranges for each of the loci tested, validating this testing approach (Additional file 1: Table S1). We observed robust methylation differences between NTA and TA prostate biopsies across all regions associated with EVX1, CAV1, PLA2G16, and SPAG4 (hypermethylation) and FGF1 and NCR2 (hypomethylation) at all CpGs assayed, validating our previous exploratory studies [4,9]. Mean, maximum, and minimum methylation levels were compared between the two biopsies for both TA and NTA tissues (Additional file 1: Tables S2-S4). Using maximal methylation values improved the statistical significance in differentiating TA samples at hypermethylated loci, while minimal methylation values improved it at hypomethylated regions. The predictive accuracy of these genes was assessed with regression models using each gene alone (uniplex) or in combination (multiplex) (Table 2). In uniplex models, 6/6 EVX1, 2/10 CAV1, 1/5 FGF1, 1/3 NCR2, 5/6 PLA2G16, and 2/5 SPAG4 CpGs showed significant predictive accuracy (area under the curve [AUC] 0.61-0.71, p < 0.05, Table 2). As a single marker, EVX1_CG1 generated the best AUC of 0.710 (p = 0.001).
To determine whether a panel performed better than any single biomarker, we performed a multiplex analysis (Table 2). First, the collinearity of individual CpG sites was assessed using correlation matrices for every CG in each gene. Since CG sites correlated highly, only the one CG with the highest predictive value (AUC) per gene was selected to enter multivariate logistic regression models to prevent overfitting. On multivariate analysis, the genes hypermethylated in TA (Max_CAV1-10, Max_EVX1-1, Max_PLA2G16-5, and Max_SPAG4-2) and the genes hypomethylated in TA (Min_FGF1-3 and Min_NCR2-2) entered the model (Table 2 and Fig. 1). The AUC of the pan-biomarker model for discriminating TA from NTA tissues was 0.747 (p = 0.004).
The predictive accuracy of adding clinical features was assessed using regression models. Only age and logPSA are significant with an AUC of 0.631 (p = 0.005). We used logPSA to minimize the effect of extreme PSA values. A multiplex model incorporating the pan-biomarkers (6 genes) and clinical information (logPSA and age) identified patients with PC GG ≥ 2 with a high predictive accuracy (AUC 0.815, p < 0.0001, Table 3 and Fig. 1).
An alternate statistical analysis using the leave-one-out approach generated a multivariate marker model with the highest AUC of 0.679 (95% CI 0.5868-0.7726). This stepwise selection left only EVX1_CG1 in the model. When age and PSA were included in the final model with EVX1_CG1, the AUC was 0.740 (95% CI 0.6513-0.8287).
The ability of these markers to differentiate high-grade (GG ≥ 4) from low-grade (GG1) cancer was assessed in an additional cohort of histologically normal biopsy cores from patients undergoing prostatectomy (n = 53 and n = 52, respectively; Additional file 1: Table S5). NCR2 alone differentiated high- from low-grade cancer at multiple CGs (Additional file 1: Table S6).
Methylation assay at multiple biopsies reveals greater heterogeneity across histologically normal prostate tissues in tumor-associated samples than non-tumor samples
To assess the uniformity of the methylation field effect, we compared methylation differences at several loci across benign biopsy blocks. Interestingly, prostate tissues not containing cancer show less variation in methylation at the tested loci between two biopsies (i.e., a greater R correlation value) than TA samples at the majority (91%) of CGs tested (Additional file 1: Table S7). We expanded this analysis in a subset of 56 subjects with four or more biopsy blocks (28 TA and 28 NTA).
Methylation patterns of patients with ≥ 4 biopsies again show more variation in men with cancer than without (Fig. 2a-f). We calculated the coefficient of variation (CV) to quantify the variability among the four samples from each patient and compared the groups using a one-way ANOVA. The CVs of the TA group were significantly higher than those of the NTA group in EVX1, CAV1, FGF1, and PLA2G16 (Fig. 2a). Figure 2b-f lists the individual methylation value of every biopsy from each patient. These data indicate that methylation is more heterogeneous in histologically normal prostate tissues associated with tumor (TA) compared with NTA samples. Given these findings, we questioned whether examining additional biopsy blocks would increase the ability to detect the presence of cancer. In uniplex modeling, of the CpG sites tested, 6/6 EVX1, 3/10 CAV1, 4/5 FGF1, 5/6 PLA2G16, and 4/5 SPAG4 showed improved predictive accuracy (p < 0.05, AUCs > 0.6, Table 4). EVX1_CG2 alone showed the best predictive value (AUC 0.741, p = 0.001). Multiplex models using the one CG with the highest AUC per gene increased the AUC to 0.774 (p = 0.0004, Table 4).
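The per-patient coefficient of variation described above can be sketched as follows. This is a minimal illustration only: the methylation percentages and patient labels are hypothetical, the function name is ours, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used. The ANOVA comparison between groups is not reproduced here.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return sample SD divided by the mean for one patient's biopsy values."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean methylation is zero")
    return stdev(values) / m


# Hypothetical methylation percentages, four biopsy blocks per patient.
ta_patients = {"TA-01": [12.0, 25.0, 8.0, 31.0], "TA-02": [10.0, 22.0, 15.0, 6.0]}
nta_patients = {"NTA-01": [9.0, 10.0, 11.0, 10.0], "NTA-02": [8.0, 9.0, 8.5, 9.5]}

# One CV per patient, then the group mean of those CVs (as plotted in Fig. 2a).
ta_cv = {pid: coefficient_of_variation(v) for pid, v in ta_patients.items()}
nta_cv = {pid: coefficient_of_variation(v) for pid, v in nta_patients.items()}
mean_ta_cv = mean(list(ta_cv.values()))
mean_nta_cv = mean(list(nta_cv.values()))

print("mean CV, TA: ", round(mean_ta_cv, 3))
print("mean CV, NTA:", round(mean_nta_cv, 3))
```

A higher mean CV in one group indicates that methylation varies more from biopsy to biopsy within patients of that group, which is the heterogeneity measure the text refers to.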

Discussion
An epigenetic field of cancer susceptibility occurs in aging-related cancers [6,10] and is especially marked in men with PC [4,7,8]. The field effect that arises with and contributes to the development of cancer can be exploited to detect or rule out cancer. The standard approach for PC diagnosis includes histopathological examination of prostate biopsy tissue, but this approach has a high false-negative rate due to sampling errors. As a result, many men undergo repeated biopsies. Our group has discovered that epigenetic alterations exist not only in the tumor tissue, but also at a distance in the histologically benign tissue from patients with PC [4,9]. Using subjects from multiple institutions, we generated an assay that predicts the presence of PC in histologically benign biopsies. Furthermore, we find that these gene methylation patterns display more heterogeneity in men with cancer elsewhere in the gland than in men without, suggesting variations in the methylation field effect.

Fig. 2 Heterogeneity of methylation between biopsy samples from patients with associated cancer versus those without. Pyrosequencing was performed on biopsy samples as described. a Mean coefficient of variation from four samples for each patient in each cohort. The coefficient of variation (CV) was calculated to quantify the variability among the four samples from each patient, and groups were compared using a one-way ANOVA; p < 0.05 was considered significantly different between the TA and NTA groups. b-f This decreased clustering is noted when additional biopsies (4+) are compared at discrete loci. One CG with the highest predictive accuracy for each gene was selected, ten patients from each group are presented, and the error bar is shown as mean ± SE.
In the current study, EVX1, CAV1, PLA2G16, and SPAG4 were hypermethylated and FGF1 and NCR2 were hypomethylated in TA samples. Individual CGs within each gene demonstrated significant predictive capability. A combination of markers incorporating the six genes allowed for even stronger predictive accuracy (AUC = 0.747). Combining the epigenetic assay with clinicopathological features (PSA, age) further improved the predictive power of these field defect markers for PC detection (AUC = 0.815). At each point along the ROC curve, the multiplex model performed better than gene markers or clinical factors alone. Using a cutoff value (PTA of 25%) for the multiple-marker model combined with PSA and age to detect the presence of cancer yields a sensitivity of 97%, a specificity of 16%, a positive predictive value (PPV) of 63%, and a negative predictive value (NPV) of 80%.
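The relationship between the reported sensitivity, specificity, PPV, and NPV can be checked with a short calculation. The sketch below assumes that the proportion of TA patients in the cohort (77 of 129) is the relevant prevalence and uses the rounded sensitivity and specificity quoted above; the function name is ours. Because the inputs are rounded, the results should land close to, but not necessarily exactly on, the reported PPV and NPV.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Derive PPV and NPV from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumption: cohort prevalence of GG >= 2 cancer is 77 TA out of 129 subjects.
prevalence = 77 / 129
ppv, npv = predictive_values(sensitivity=0.97, specificity=0.16, prevalence=prevalence)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Note that PPV and NPV depend on prevalence, so these values describe this enriched study cohort and would differ in a population with a different proportion of GG ≥ 2 cancer.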
The current marker analyses indicate that the methylation patterns from patients with cancer are more heterogeneous across the prostate than those found in patients without cancer. This finding gives a window into the biology of the multifocal nature of this disease and proves useful in improving the ability to discriminate risk of associated cancer. Because the methylation patterns in men with associated cancer vary, using maximum or minimum metrics increased the predictive value compared with using the average metric (Additional file 1: Tables S2-S4). In determining cervical cancer risk, CpGs exhibiting heterogeneous outlier methylation profiles can improve diagnosis [11]. In breast cancer, DNA methylation outliers in normal breast tissue identify field defects that are enriched in women with cancer [12]. Recent work in prostate suggests that clonal basal stem cells migrate from periurethral ducts [13]. This may give rise to the observed variation in field methylation and contribute to heterogeneity between multifocal PCs.
The current approach employed two biopsy blocks for diagnosis, in contrast to other reports that rely on more extensive core analysis (> 5) [14]. Given the finding of this heterogeneity, we increased the number of biopsies analyzed across the prostate gland and found that it improved the assay accuracy in the subset of subjects with this information. In a uniplex model, all CGs demonstrated improved predictive accuracy and increasing AUC values with four or more biopsy blocks compared with two (Table 4). Because four or more biopsy blocks were available from only 43% of the patients, we did not perform further analyses incorporating this approach with clinical features. Of note, an additional comparison of methylation at these loci between indolent (GG = 1) and aggressive (GG ≥ 4) cancers had to be performed in an independent cohort, as all TA patients in the trial had GG ≥ 2. We observed decreased methylation levels between GG1 and GG4/5 (Additional file 1: Table S6) at multiple CGs for the hypomethylated gene NCR2. Further testing on larger cohorts containing a range of cancer grades will be required to evaluate this aspect more definitively.
One limitation of any cancer detection study, which may be reflected in the lower NPV, is that the absence of cancer is difficult to establish with certainty, even when multiple negative biopsies are required for entry. Our study required at least two negative biopsy sets (24+ cores) to be obtained for entry, and the majority of patients had a negative MRI (62%). Two negative biopsies (without imaging) decrease the risk of missed prostate cancer to less than 9% in previous work [15]. We followed the NTA group over an extended period of time (2+ years) as well. Because PSA is elevated by both cancer and prostate enlargement, and this elevation drives prostate biopsy, the NTA group demonstrates increased prostate size compared with the tumor-associated group (Table 1).
In addition, discrepancies in the way biopsies are obtained between and within institutions might affect methylation values across samples. Biopsies encompassing tissue from the central zone of the prostate, seminal vesicle, or bladder may alter methylation levels due to the inclusion of other tissue types. Heavily inflamed samples were excluded from the current study to avoid this confounding factor. The cell of origin for the methylation changes was not determined by microdissection of the sample since the goal was to evaluate the whole tissue field defect as a marker for cancer presence. Alterations in genomic imprinting of the IGF2 gene, which marks this field defect, appear in the epithelial component [8,16].

Conclusions
This field effect improves the detection of PC as demonstrated by application of a methylation assay. Additionally, these abnormalities occur in benign tissue distant from the cancer foci and vary across the normal tissue in the prostate gland. The methylation status of the above biomarkers distinguishes between TA and NTA prostate tissues, marking a field of susceptibility associated with the development of PC.

Methods
Tissue samples and histopathology
Individual medical centers obtained institutional review board approval, exemption, or waiver for the use of archived clinical samples for research purposes. Non-tumor-associated (NTA) control subjects (n = 52) had two or more consecutive negative sets of biopsies over a period of 24 months or more. Tumor-associated (TA) samples were from 77 patients diagnosed with PC who had undergone radical prostatectomy and for whom final pathology was available for grade confirmation. On final pathology, all cancer samples were Gleason Score (GS) ≥ 3 + 4 = 7 (Grade Group [GG] ≥ 2), considered clinically significant cancer. Other inclusion criteria were 10-12 total cores per biopsy (separated into distinct regional zones) collected no earlier than 2011, PSA between 3 and 15 ng/mL, and age 50-70 years. At least two biopsy blocks were requested, with each block containing 1-2 biopsy cores, and an effort was made to take the normal tissue from the contralateral side away from the detected cancer to avoid contamination. Requested data included ethnicity, family history of PC, positive or negative digital rectal exams, prior negative prostate biopsy, and body mass index. Prostate size was calculated by ultrasound. A total of 176 patients were initially collected, of which 47 (26.7%) were excluded because of the failure to undergo sextant biopsy (n = 46) or insufficient biopsy material (n = 1), leaving 129 subjects for analysis.
For all specimens, a five-micron section was cut from the non-tumor blocks provided, stained with hematoxylin and eosin (H&E), and centrally reviewed by a fellowship-trained genitourinary pathologist (Dr. Wei Huang). Samples with extensive high-grade prostatic intraepithelial neoplasia (HGPIN) or atypical small acinar proliferation (ASAP) were excluded.

Quantitative pyrosequencing
Ten-micron sections were used to extract DNA from each block. DNA isolation and sodium bisulfite modification were performed according to the manufacturer's protocol using the EpiTect Plus FFPE Bisulfite Kit (Qiagen, CA, USA). Bisulfite-modified DNA was then amplified using PCR in preparation for pyrosequencing, with either a biotinylated forward or reverse primer. All PCR and sequencing primers for pyrosequencing were designed using PyroMark Assay Design 2.0 (Qiagen) and have been previously described [4]. PCR products were captured with streptavidin sepharose beads, denatured to single strands, and annealed to the sequencing primer for the pyrosequencing assay. Human Premixed Calibration Standard with different percentages of methylation (EpigenDx, Hopkinton, MA), human white blood cell DNA, and SssI methylase-treated DNA from the human PC cell line DU145 were used as controls in each run. Methylation was quantified with the PyroMark MD Pyrosequencing System (Qiagen) within the linear range of the assay. All samples were analyzed in two independent experiments.

Statistical analyses
All samples were run in duplicate (two independent experiments) and the two methylation percentage values were averaged. For the cohorts in Table 1, since there are two biopsy tissue blocks from each patient, three metrics (mean, maximum, and minimum) were used to determine significant differences between NTA and TA cohorts. Mean values for each marker were calculated by averaging the methylation of all samples for that cohort. Maximum and minimum values for each marker were calculated by selecting the highest (or lowest) methylation percentage for each patient. At each CpG, a t test was performed to analyze the significant differences between NTA and TA groups.
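The per-patient metrics and the group comparison described above can be sketched as follows. All methylation values are hypothetical and the function names are ours. The text specifies only "a t test", so the unequal-variance (Welch) statistic used here is an assumption, and only the test statistic, not the p value, is computed.

```python
from math import sqrt
from statistics import mean, variance


def patient_metrics(biopsy_values):
    """Summarize one patient's biopsy methylation values (percent)."""
    return {
        "mean": mean(biopsy_values),
        "max": max(biopsy_values),
        "min": min(biopsy_values),
    }


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent groups (assumed test variant)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical data: two biopsy blocks per patient at one hypermethylated CpG.
ta = [[20.0, 35.0], [18.0, 40.0], [25.0, 30.0], [22.0, 45.0]]
nta = [[15.0, 17.0], [12.0, 16.0], [14.0, 15.0], [13.0, 18.0]]

# For a hypermethylated locus, compare the per-patient maximum between cohorts.
ta_max = [patient_metrics(p)["max"] for p in ta]
nta_max = [patient_metrics(p)["max"] for p in nta]
t_stat = welch_t(ta_max, nta_max)
print("Welch t statistic (max metric):", round(t_stat, 2))
```

For a hypomethylated locus the same comparison would use the `"min"` entry instead of `"max"`.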
All metrics that significantly differentiated NTA from TA (p < 0.05) were entered into a univariate logistic regression model to test their ability to predict the presence of cancer. Area under the curve (AUC) values and p values were calculated. The collinearity of individual CpG sites was also assessed using correlation matrices for each gene. Since CG sites correlated highly, only the one CG per locus with the highest univariate AUC was selected to enter multivariate logistic regression models to prevent overfitting. The univariate logistic analysis was also performed using clinical factors. Finally, multivariate logistic regression analysis was performed to assess the performance of biomarkers combined with clinical factors. A two-sided p value of < 0.05 was considered significant for all hypothesis tests.
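The step of keeping one CG per gene according to its univariate AUC can be illustrated with a small sketch. The values below are hypothetical and the function names are ours. The AUC is computed here directly from the marker values using the rank-based (Mann-Whitney) definition rather than from a fitted logistic model; this is a simplification, based on the fact that a single-predictor logistic model is a monotone transform of the marker and so yields the same ROC curve up to direction.

```python
def auc(positives, negatives):
    """Probability that a random positive exceeds a random negative (ties = 0.5)."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def best_cpg_per_gene(ta_values, nta_values):
    """Keep, for each gene, the CpG with the highest direction-agnostic AUC."""
    best = {}
    for (gene, cpg), ta in ta_values.items():
        raw = auc(ta, nta_values[(gene, cpg)])
        score = max(raw, 1.0 - raw)  # also covers hypomethylated markers
        if gene not in best or score > best[gene][1]:
            best[gene] = (cpg, score)
    return best


# Hypothetical per-patient methylation values keyed by (gene, CpG).
ta_values = {
    ("EVX1", "CG1"): [30, 35, 40, 28], ("EVX1", "CG2"): [22, 30, 25, 27],
    ("NCR2", "CG1"): [50, 48, 60, 55], ("NCR2", "CG2"): [40, 42, 38, 45],
}
nta_values = {
    ("EVX1", "CG1"): [20, 22, 25, 29], ("EVX1", "CG2"): [21, 26, 24, 28],
    ("NCR2", "CG1"): [52, 58, 49, 61], ("NCR2", "CG2"): [50, 55, 47, 52],
}

best = best_cpg_per_gene(ta_values, nta_values)
for gene, (cpg, score) in best.items():
    print(gene, cpg, round(score, 3))
```

The selected CpGs, one per gene, would then be the candidates entered into the multivariate model, which limits the number of correlated predictors.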
A separate statistical approach was used for cross validation of the performance of these biomarkers. The AUCs and the 95% confidence intervals (CI) using DeLong's method were computed using R 3.4.2 and the pROC package. The CGs with the highest AUC within each gene were selected for further evaluation in a multivariate model of markers, with only one CG per gene included at a time. Stepwise selection was utilized in SAS 9.4. AUC was calculated with 95% CI using the leave-one-out method for validation. The final model included the combination of markers yielding the highest AUC along with age and PSA (with a logarithmic transformation). Drs. Glen Leverson and Kaitlin Woo performed statistical analyses for this manuscript using SAS v.9.4 (SAS Institute, Cary, NC, USA).
Additional file 1: Table S1. R² linearity values for methylation pyrosequencing assay at each gene locus. Table S2. Mean methylation value (%) with SD for two prostate biopsies. Table S3. Maximum methylation value (%) with SD for two biopsies. Table S4. Minimum methylation value (%) with SD for two biopsies. Table S5. Clinicopathological features of Grade Group 1 and Grade Group 4/5. Table S6. Comparing the ability of the markers to differentiate GG1 vs GG4/5. Table S7. Estimated R correlation between two biopsies.