Lymph node metastasis (LNM) is an important factor for both treatment and prognosis of early gastric cancer (EGC). Current methods are insufficient to evaluate LNM in EGC due to suboptimal accuracy. Herein, we aim to identify methylation signatures for LNM of EGC, facilitate precision diagnosis, and guide treatment modalities.
For marker discovery, genome-wide methylation sequencing was performed on 47 fresh frozen (FF) tissue samples (marker discovery cohort). The identified signatures were subsequently characterized for model development by qPCR assay using formalin-fixed paraffin-embedded (FFPE) samples in a second cohort (model development cohort, n = 302; training set, n = 151; test set, n = 151). The performance of the established model was further validated using FFPE samples in a third cohort (validation cohort, n = 130) and compared with image-based diagnostics, a conventional clinicopathology-based model (conventional model), and current standard workups.
Fifty LNM-specific methylation signatures were identified de novo and technically validated. A derived 3-marker methylation model for LNM diagnosis achieved AUCs of 0.87 and 0.88, corresponding to specificities of 80.9% and 85.7%, sensitivities of 80.6% and 78.1%, and accuracies of 80.8% and 83.8% in the test set of the model development cohort and in the validation cohort, respectively. Notably, this methylation model outperformed computed tomography (CT)-based imaging with a superior AUC (0.88 vs. 0.57, p < 0.0001) and individual clinicopathological features in the validation cohort. The model integrated with clinicopathological features demonstrated a further enhanced AUC of 0.89 in the same cohort. The 3-marker methylation model and the integrated model reduced overtreatment by 39.4% and 41.5%, respectively, as compared to standard workups.
A novel 3-marker methylation model was established and validated that shows diagnostic potential to identify LNM in EGC patients and thus reduce unnecessary gastrectomy in EGC.
Early gastric cancer (EGC), with an invasion depth limited to the mucosa or submucosa, accounts for approximately 10–20% of gastric cancer [1, 2]. Lymph node metastasis (LNM) status is one of the most important clinical factors affecting the prognosis of gastric cancer; the incidence of LNM in EGC is about 8–25% [3, 4]. Endoscopic submucosal dissection (ESD) and endoscopic mucosal resection (EMR) are the mainstream treatments for EGC patients at low risk of LNM because they are minimally invasive, preserve function, allow en bloc resection, cause limited trauma, and maintain a good quality of life [5, 6]. For EGC patients at high risk of LNM, however, radical gastrectomy with lymphadenectomy is usually adopted. This procedure can lead to various post-gastrectomy complications, including anastomotic leakage, bleeding, stricture, delayed gastric emptying, reflux esophagitis, and residual food, as well as reduced quality of life postoperatively [5, 6]. Therefore, precise assessment of lymph node metastatic status in EGC plays a critical role in treatment decision making.
Currently, LNM is diagnosed mainly by imaging methods, such as endoscopic ultrasonography, computed tomography (CT), positron emission tomography with CT (PET-CT), or by evaluating clinicopathological features after endoscopic biopsy, including submucosal invasion, ulceration, undifferentiated type, and lymphovascular invasion status [7,8,9]. However, the accuracy and reliability of these methods are unsatisfactory, leading to overtreatment and unnecessary gastrectomy in a large portion of EGC patients [10,11,12]. Post-gastrectomy pathological evaluation showed that about 80% of EGC patients with negative lymph node metastasis were treated unnecessarily with radical gastrectomy [10, 11]. This suggests that the current standard of care in the clinical setting for LNM diagnosis is inadequate and it is imperative to develop novel methods to accurately determine LNM status and improve the quality of life in patients with EGC.
DNA methylation is one of the most important epigenetic modifications. A growing number of studies have shown that DNA methylation plays a prominent role in tumorigenesis and progression [13, 14]. Abnormal DNA methylation occurs before the clinical symptoms of the disease become apparent and often leads to gene misexpression. With the development of high-throughput technologies, genome-wide methylation data from cancers have been used to study potential markers of early diagnosis, prognostic assessment, progression monitoring, and chemoradiotherapy sensitivity. To accurately assess the possibility of LNM in EGC, numerous studies have reported different prediction models, which are constructed mainly based on clinicopathological features [17, 18]. To our knowledge, genome-wide DNA methylation mapping and modeling prediction using methylation markers for LNM in EGC have not yet been reported.
Our previous studies have shown that a genome-wide DNA methylation approach can be applied to the diagnosis of bladder cancer and the identification of benign and malignant pulmonary nodules [19, 20]. In this study, we performed a DNA methylation profiling of LNM in EGC patients and developed a methylation test for LNM diagnosis.
Methods and materials
Study design and patient recruitment
A three-phase strategy was designed in our study (Fig. 1), which included a marker discovery cohort (n = 47, fresh frozen (FF) tissue samples), a model development cohort (n = 302, formalin-fixed paraffin-embedded (FFPE) samples), and a validation cohort (n = 130, FFPE samples). Genome-wide methylation sequencing was applied to the FF samples to identify LNM-specific methylation markers, which were subsequently validated by a qPCR assay. The identified and validated methylation markers were further characterized in the model development cohort using FFPE samples, the sample type encountered in a practical clinical setting. The diagnostic model developed was further validated and compared to imaging diagnostics, a clinicopathology-based model (conventional model), and current standard workups in the validation cohort. An overview of the patient recruitment workflow is described in Additional file 1: Figure S1. Patients with treatment-naïve EGC were enrolled from Nanfang Hospital (n = 436, 47 FF samples and 389 FFPE samples) and Shenzhen People's Hospital (n = 189, FFPE samples) between January 2015 and November 2020. Samples that failed experimental QC (n = 146) were excluded from the study. The tissue samples from the EGC patients were surgical specimens collected before radiation or chemotherapy. A tumor content of over 30% in the FFPE samples was confirmed by pathologists. The pathology and LNM status of the samples were confirmed by at least two gastrointestinal pathologists. The clinicopathological characteristics of all patients, including gender, age, tumor size, tumor location, differentiation, invasional depth, ulceration, and lymphovascular invasion (LVI), are summarized in Table 1.
Discovery of differentially methylated markers
To identify potential markers, we gathered 47 FF samples of EGC: 23 cases of LNM+ tumor and 24 cases of LNM− tumor. A genome-wide methylation library was constructed individually from the genomic DNA of each sample using the TruSeq® Methyl Capture EPIC Library Prep Kit (Illumina, USA, Catalog No. FC-151-1002) following the manufacturer's instructions; we refer to this kit as EPIC. The detailed patient clinicopathological features in the EPIC genome-wide methylation libraries are shown in Additional file 2: Table S1. After the EPIC libraries were checked for quality with the Agilent High Sensitivity DNA Kit (Agilent, USA, Catalog No. 5067-4626), high-throughput sequencing was performed on Illumina's X-Ten platform. The sequencing data processing methods are detailed in Additional file 2: Methods.
DNA extraction, bisulfite treatment, and methylation analysis by qPCR
Genomic DNAs were extracted from the FF specimens and FFPE tissue samples using the AllPrep DNA/RNA Mini Kit (Qiagen, Germany, Catalog No. 80204) and the AllPrep DNA/RNA FFPE Kit (Qiagen, Germany, Catalog No. 80234), respectively, following the manufacturer’s instructions. Both genomic DNAs were quantified by the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, USA, Catalog No. Q32851). The quality control criteria for the EGC samples required that the DNA amount was greater than 100 ng and that the main bands on agarose gel electrophoresis were above 500 bp. Bisulfite treatment was performed on 50 ng of genomic DNA from each FFPE tissue sample with the EZ-96-DNA Methylation-Direct MagPrep Kit (Zymo Research, USA, Catalog No. D5044) according to the manufacturer’s recommendations. Subsequently, a 50-marker EGC-LNM DNA methylation panel (Additional file 2: Table S2) was designed and used to characterize the methylation patterns in EGC-LNM patients with the EGC-LNM detection kit (AnchorDx, China, Catalog No. EGME-002). The methylation analysis by the MethyLight approach was described earlier (details are in Additional file 2: Methods). The qPCR methylation analysis was performed on the QuantStudio 3 Real-Time PCR System (Thermo Fisher, USA). Then, the diagnostic model of LNM in EGC was established and validated based on methylation-specific qPCR data.
Methylation model development and validation
A total of 432 FFPE samples were randomly divided into a model development cohort (n = 302) and a validation cohort (n = 130) at a ratio of approximately 7:3. The cohort division was blinded to the methylation test results. The model development cohort (n = 302) was further randomly split into 50% training and 50% test sets with 20-fold validation. The 50 identified markers were analyzed with the least absolute shrinkage and selection operator (LASSO) algorithm to determine the minimum number of markers required and to select the top markers. The selected top markers were then used for model construction with a logistic regression algorithm by iterative marker combination analysis in the model development cohort. The validation cohort (n = 130) was used to independently test the final model. Sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were then evaluated.
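The two-level random split described above can be sketched as follows. This is a minimal illustration only: the sample identifiers, the seeds, and the `split_cohort` helper are hypothetical, and the authors' actual randomization procedure is not described in the text.

```python
import random

def split_cohort(sample_ids, first_fraction, seed):
    """Randomly split sample IDs into two disjoint groups.

    Only identifiers are used, so the split cannot depend on
    methylation results or LNM labels (i.e., it is blinded).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * first_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical sample identifiers; the real IDs are not given in the text.
ffpe_ids = [f"FFPE_{i:03d}" for i in range(432)]

# Approximately 7:3 between model development (302) and validation (130).
development, validation = split_cohort(ffpe_ids, 302 / 432, seed=0)

# 1:1 between training and test sets inside the development cohort.
training_set, test_set = split_cohort(development, 0.5, seed=1)
```

Repeating the inner split with different seeds would give the kind of repeated random train/test partitions used to check performance consistency.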
Development and evaluation of the conventional model and integrated model
Eight clinicopathological variables were included in the univariate analysis to explore their association with LNM in the model development cohort, and variables with a p value less than 0.05 were included in the multivariate analysis for the conventional model. Forward stepwise regression analysis evaluated odds ratio (OR) values with a 95% CI to identify independent predictors. The integrated model was built from the independent predictors and the 3-gene methylation signature. Tolerance and variance inflation factors were used to evaluate the multicollinearity of the multivariate models. Based on both multivariate logistic regression models, two quantitative scoring formulas were derived and the area under the receiver operating characteristic curve (AUROC) was measured. (Details are in Additional file 2: Methods.)
The Wilcoxon signed-rank test or Mann–Whitney U test was used to analyze epigenome methylation data. Student's t test was used to evaluate the distribution of risk scores among different test groups. The χ2 test or Fisher's exact test and the two-tailed t test were used to compare categorical and continuous variables, as appropriate. Logistic regression-based model construction was conducted using the R glmnet package (version 2.0.16). Other details of the statistical analyses are described in Additional file 2: Methods.
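For orientation, the general form of a logistic regression scoring formula and of the multicollinearity diagnostics named above is given here. This is the textbook formulation, not the authors' fitted model: the intercept and coefficients are not reported in this text, and the symbols (p, x_i, β_i, R_j²) are introduced only for illustration.

```latex
% p: predicted probability of LNM; x_i: predictor i (e.g., a marker signal or a
% clinicopathological feature); \beta_i: fitted coefficient (values not given here).
\[
\ln\frac{p}{1-p} = \beta_0 + \sum_{i=1}^{k} \beta_i x_i ,
\qquad
p = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \sum_{i=1}^{k} \beta_i x_i)\bigr)} ,
\qquad
\mathrm{OR}_i = e^{\beta_i}
\]
% R_j^2: coefficient of determination from regressing predictor j on the other predictors.
\[
\mathrm{Tolerance}_j = 1 - R_j^2 ,
\qquad
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\]
```

A tolerance close to 1 (VIF close to 1) indicates that a predictor is nearly independent of the others in the model.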
Genome-wide screening of DNA methylation markers to detect LNM in EGC tissue samples
A schematic workflow of the study design is shown in Fig. 1. To identify DNA methylation markers that are LNM-specific in EGC, we first performed a genome-wide methylation analysis (covering more than 3.34 million CpG sites) on 23 lymph node metastasis positive (LNM+) and 24 lymph node metastasis negative (LNM−) FF tissue samples. A total of 1366 differentially methylated CpG sites were found (Additional file 1: Figure S2, FDR < 0.05 and β-value difference ≥ 0.2). Based on these methylation sites, we further identified 60 differentially methylated regions (hereafter referred to as the “markers”) by using co-methylation region analysis as previously reported. Unsupervised hierarchical clustering showed a clear differential pattern between the LNM+ and LNM− patients (Fig. 2a). Of the 60 candidate methylation markers, 40 markers were hypomethylated in the LNM+ group including markers of LAPR5, DLEU1, FCGBP, CBLN4, GNAS, PCDHGB7, NUPR2|LOC650226, LOC646214|CXADRP2, EPS8L1, KCNS1, CCDC166, IRX6, FENDRR, SLC13A5, HOOK2, PEG3, UNC80, KIAA1211L, FOXI2, NCAM2, SLIT2, WI2-237311.2, IGFBP3|TNS3, BTBD11, MICU3, F7, MDGA2|MIR548Y, HS3ST2, LNC00982, BHLHE23, IRX2, SLC35F1, TBX18, CALN1, KRT7|KRT81, two CDH4 gene regions, and four CCDC166 gene regions. There were 20 hypermethylated markers including MEIG1|OLAH, PDTSS2, TGFB1L1, ZBTB7A, IRX1, SLCO5A1, CA6|SLC2A7, ECHDC2, COL9A3, ARPC1B, LMBR1|NOM1, CPSF1, DPP10, ZNF704|PAG1, MAT2B|LOC101927835, PRICKLE1, and four IRF2BP1 gene regions (Fig. 2a).
Our primary goal was to develop a simple methylation-specific qPCR assay for LNM status determination. The 60 markers were further validated technically using the same FF samples by a qPCR approach. Among these markers, 50 showed consistent methylation patterns between sequencing and methylation-specific qPCR analysis and significantly distinguished LNM+ from LNM− in the same samples. The remaining 10 markers were excluded because they failed technical validation, showing inconsistent methylation patterns between the two assays (Fig. 2b–d, Additional file 1: Figures S3 and S4). These results suggested that the 50 retained markers and the qPCR-based assays were reliable and could be used for large-scale cohort analysis.
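The two-part site filter quoted above (FDR < 0.05 and β-value difference ≥ 0.2) can be sketched as follows. Several assumptions are made for illustration: the text does not state which FDR procedure was used, so Benjamini-Hochberg is assumed; the β-value difference is taken as the absolute difference of group means; and the site names and numbers are toy data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

def select_sites(sites, fdr_cutoff=0.05, min_delta_beta=0.2):
    """Keep sites passing both the FDR and the beta-difference thresholds.

    sites: list of (site_id, mean_beta_lnm_pos, mean_beta_lnm_neg, p_value).
    """
    fdr = benjamini_hochberg([s[3] for s in sites])
    return [s[0] for s, q in zip(sites, fdr)
            if q < fdr_cutoff and abs(s[1] - s[2]) >= min_delta_beta]

# Toy data, not taken from the study.
toy_sites = [
    ("site_a", 0.20, 0.55, 0.0004),  # large difference, significant
    ("site_b", 0.48, 0.50, 0.0010),  # significant but tiny difference
    ("site_c", 0.70, 0.40, 0.0300),  # large difference, significant after adjustment
    ("site_d", 0.35, 0.60, 0.6000),  # large difference, not significant
]
selected = select_sites(toy_sites)
```

Requiring both thresholds removes sites that are statistically significant but differ too little in methylation level to be useful in a qPCR assay.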
Development and validation of a 3-marker methylation model for LNM diagnosis
Since the EGC samples acquired in a practical clinical setting are endoscopically sectioned FFPE samples, we further characterized the 50 methylation markers identified from FF samples by the same qPCR assays in a model development cohort consisting of 302 FFPE EGCs. To improve the diagnostic efficiency of the assay and reduce marker redundancy, the least absolute shrinkage and selection operator (LASSO) algorithm was used to determine the minimum number of markers required for maintaining stable diagnostic power and to select the corresponding top markers from the 50 candidates. Five markers were retained, and the resulting top 5 markers were carried forward for model development. Methylation models containing any 1–5 markers were iteratively constructed using a logistic regression algorithm. By comparing the performance and the performance consistency in 100 random splits of datasets with a train–test ratio of 1:1, a 3-marker methylation model was derived. The 3-marker methylation model, comprising GNAS, FCGBP, and CCDC166, achieved high AUCs of 0.84 (95% CI 0.74–0.94) and 0.87 (95% CI 0.80–0.93) in the training and test sets, respectively (Fig. 3a, b, Additional file 1: Figure S5a and S5b). The model showed consistent specificities of 78.3% and 80.9%, sensitivities of 80.6% and 80.6%, and accuracies of 78.8% and 80.8% in the training and test datasets, respectively (Fig. 3c). Notably, LNM+ patients showed significantly higher LNM risk scores, calculated from the model, than LNM− patients in both training and test sets (Fig. 3d, e, p < 0.001).
The model was further validated in an independent cohort consisting of 32 LNM+ and 98 LNM− patients. It achieved an AUC of 0.88 (95% CI 0.80–0.95), sensitivity of 78.1%, specificity of 85.7%, and accuracy of 83.8% (Fig. 3c, g, Table 2, Additional file 1: Figure S5c). Consistent with the results from the model development cohort, the model showed a significantly higher LNM risk score in the LNM+ patients as compared to LNM− patients (Fig. 3f). We then assessed whether risk scores were associated with clinical characteristics. We found that the LNM risk scores were significantly higher in patients with ulceration, undifferentiation, submucosal invasion, and lymphovascular invasion in the validation cohort (Fig. 3h), indicating that the LNM risk scores were associated with the known LNM risk factors. On the other hand, the risk score did not vary significantly among EGC patient groups of different age, gender, tumor size, and tumor location (Additional file 1: Figure S6). Taken together, the 3-gene methylation model showed accurate and robust performance in discriminating LNM in EGC.
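The search space of the iterative marker combination analysis can be sketched as follows, assuming that "any 1–5 markers" means every non-empty subset of the top five. Only GNAS, FCGBP, and CCDC166 are named in the text, so `MARKER_4` and `MARKER_5` are placeholders, and `candidate_panels` is a hypothetical helper.

```python
from itertools import combinations

# Three markers are named in the text; the other two top-5 markers are not,
# so MARKER_4 and MARKER_5 are placeholders.
top_markers = ["GNAS", "FCGBP", "CCDC166", "MARKER_4", "MARKER_5"]

def candidate_panels(markers):
    """Yield every non-empty subset of the markers, smallest panels first."""
    for size in range(1, len(markers) + 1):
        for panel in combinations(markers, size):
            yield panel

panels = list(candidate_panels(top_markers))
three_marker_panels = [p for p in panels if len(p) == 3]
```

With five markers there are 2^5 − 1 = 31 candidate panels, so fitting one logistic regression per panel on each of 100 random splits is computationally cheap.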
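The reported validation metrics follow from a 2×2 confusion table. The counts below are not stated in the text; they are back-calculated assumptions that reproduce the reported sensitivity, specificity, and accuracy for a 130-sample cohort, and `diagnostic_metrics` is a hypothetical helper.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return standard diagnostic metrics, in percent, from confusion counts."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + fn + tn + fp),
        "ppv": 100.0 * tp / (tp + fp),  # positive predictive value
        "npv": 100.0 * tn / (tn + fn),  # negative predictive value
    }

# Assumed counts (inferred, not reported): 25 of 32 LNM+ called positive,
# 84 of 98 LNM- called negative.
validation_metrics = diagnostic_metrics(tp=25, fn=7, tn=84, fp=14)
rounded = {name: round(value, 1) for name, value in validation_metrics.items()}
```

Because LNM− patients outnumber LNM+ patients roughly three to one, accuracy is driven mainly by specificity, which is why it sits closer to 85.7% than to 78.1%.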
The 3-marker methylation model outperformed CT imaging and clinicopathological features for LNM diagnosis
In standard clinical settings, CT imaging and clinicopathological factors are used routinely to diagnose LNM and to assess the clinical N stage in patients with EGC before radical treatment. It is well known that clinicopathological features including tumor size, lymphovascular invasion, invasional depth, ulceration, and differentiation type are well-established predictors of nodal metastasis in EGC [5, 23]. A univariate analysis was performed for each variable in the model development cohort. Age under 60 years (OR 2.559, 95% CI 1.447–4.525, p = 0.001), submucosal invasion (OR 3.365, 95% CI 2.008–6.578, p < 0.001), tumor size larger than 20 mm (OR 1.625, 95% CI 1.044–2.538, p = 0.032), undifferentiated type (OR 3.878, 95% CI 1.834–8.200, p < 0.001), lymphovascular invasion (OR 11.950, 95% CI 5.525–25.825, p < 0.001), and ulceration (OR 2.758, 95% CI 1.603–4.744, p < 0.001) were the risk factors significantly associated with LNM. Compared to these risk factors, the 3-marker methylation model showed a markedly higher OR (OR 16.131, 95% CI 8.289–31.392, p < 0.001) (Table 3). Accordingly, we compared the performance of the 3-marker methylation model with CT imaging and these clinicopathological features for EGC LNM diagnosis. Of interest, we found that the diagnostic performance of the 3-marker methylation model (AUC 0.85, 95% CI 0.77–0.91) was significantly higher than that of CT imaging (AUC 0.60, 95% CI 0.51–0.69; p < 0.0001), differentiation (AUC 0.62, 95% CI 0.55–0.69; p < 0.0001), invasional depth (AUC 0.65, 95% CI 0.59–0.72; p < 0.0001), lymphovascular invasion (AUC 0.66, 95% CI 0.58–0.74; p < 0.0001), ulceration (AUC 0.62, 95% CI 0.55–0.70; p < 0.0001), and tumor size (AUC 0.56, 95% CI 0.48–0.64; p < 0.0001) in the model development cohort (Fig. 4a).
In the independent validation cohort, this model also achieved a better performance (AUC of 0.88, 95% CI 0.80–0.95), as compared to CT imaging (AUC 0.57, 95% CI 0.44–0.69; p < 0.0001), differentiation (AUC 0.64, 95% CI 0.53–0.74; p < 0.0001), invasional depth (AUC 0.61, 95% CI 0.50–0.72; p < 0.0001), lymphovascular invasion (AUC 0.69, 95% CI 0.57–0.81; p < 0.0001), ulceration (AUC 0.59, 95% CI 0.45–0.70; p < 0.0001), and tumor size (AUC 0.58, 95% CI 0.45–0.70; p < 0.0001) (Fig. 4b).
Across the model development and validation cohorts, the AUC of the 3-marker methylation model (0.85 and 0.88) was significantly superior to that of diagnostic models based on CT imaging (0.60 and 0.57), tumor differentiation (0.62 and 0.64), tumor invasional depth (0.65 and 0.61), tumor lymphovascular invasion (0.66 and 0.69), ulceration (0.62 and 0.59), and tumor size (0.56 and 0.58), respectively (Fig. 4a, b). The 3-marker methylation model also showed significantly higher accuracies (79.8% and 83.8%) than diagnostic models based on CT imaging (59.6% and 53.8%), tumor differentiation (59.6% and 52.3%), tumor invasional depth (58.3% and 56.2%), tumor lymphovascular invasion (67.9% and 70.8%), ulceration (58.3% and 53.1%), and tumor size (51.3% and 47.7%) in the two cohorts, respectively (Fig. 4c, d). The sensitivity and specificity of the 3-marker methylation model were also significantly higher than those of diagnostic models based on CT imaging or individual clinicopathological features (Additional file 1: Figure S7), with approximately twofold higher sensitivities as compared to CT-based diagnostics (80.6% vs. 41.7% and 78.1% vs. 40.6% in the model development and validation cohorts, respectively) (Additional file 1: Figure S7c and S7d).
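For readers less familiar with the AUC values compared above, the quantity can be computed directly as the probability that a randomly chosen LNM+ case receives a higher score than a randomly chosen LNM− case (the Mann–Whitney formulation). The sketch below uses hypothetical risk scores; it is not the authors' analysis code.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical risk scores, not taken from the study.
lnm_pos_scores = [0.91, 0.78, 0.66, 0.40]
lnm_neg_scores = [0.70, 0.35, 0.30, 0.22, 0.10]
toy_auc = auc_from_scores(lnm_pos_scores, lnm_neg_scores)

# A binary predictor (e.g., a positive/negative imaging call) is the special
# case where scores are 0 or 1; its AUC equals (sensitivity + specificity) / 2.
binary_auc = auc_from_scores([1, 1, 0, 0], [0, 0, 0, 1])
```

The binary special case explains why a test with sensitivity near 40% can have an AUC close to 0.5 even when its specificity is moderately high.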
An integrated model combining methylation and clinicopathological features further improved the LNM diagnostic performance
To compare the performance of the 3-marker methylation model with that of the clinicopathological characteristic-based model (i.e., the conventional model [17, 18]), the risk factors identified by the preceding univariate analysis were used in multivariate analysis to select independent LNM predictors (Table 3 and Additional file 2: Table S3). These predictors, including lymphovascular invasion (OR 11.30, 95% CI 5.40–23.64, p < 0.001), submucosal invasion (OR 2.48, 95% CI 1.41–4.36, p = 0.002), ulceration (OR 2.36, 95% CI 1.39–4.00, p = 0.001), and differentiation (OR 3.85, 95% CI 1.94–7.67, p < 0.001), were further used for development of a conventional model. We developed a conventional model based on informative pathological features as reported before. However, the performance of the conventional model was inferior to the 3-marker methylation model, with lower AUCs in the model development cohort (0.77, 95% CI 0.71–0.83 vs. 0.85, 95% CI 0.80–0.91, p = 0.0805) and the validation cohort (0.79, 95% CI 0.70–0.88 vs. 0.88, 95% CI 0.80–0.95, p = 0.1250), respectively (Fig. 5a, b). Compared to the conventional model, the 3-marker methylation model achieved higher specificity (79.6% vs. 70.0% and 85.7% vs. 65.3%) and accuracy (79.8% vs. 70.9% and 83.8% vs. 67.7%) with comparable sensitivity (80.6% vs. 73.6% and 78.1% vs. 75.0%) in the model development and validation cohorts, respectively (Fig. 5c, d). Diagrams illustrating the predicted results of both the 3-marker methylation model and the conventional model as compared to pathology for the same patients in the model development and validation cohorts are shown in Fig. 5e, f. For the same patients having LNM, the 3-marker methylation model and the conventional model showed high concordance, with the 3-marker methylation model identifying additional cases. More importantly, the 3-marker methylation model helped more patients without LNM to avoid overtreatment.
To explore whether the diagnostic accuracy of the 3-marker methylation model could be enhanced by combining clinicopathological features, we built an integrated model within the model development cohort using independent predictors of LNM, which included the 3-marker methylation model (OR 17.616, 95% CI 9.144–33.937, p < 0.001), submucosal invasion (OR 2.602, 95% CI 1.345–5.037, p = 0.005), differentiation (OR 3.863, 95% CI 1.733–8.609, p = 0.001), ulceration (OR 2.692, 95% CI 1.443–5.022, p = 0.002), and lymphovascular invasion (OR 9.956, 95% CI 4.144–23.917, p < 0.001), as shown in Additional file 2: Table S4. The integrated model showed improved AUCs of 0.91 (95% CI 0.87–0.95, p < 0.0001) and 0.89 (95% CI 0.81–0.96, p = 0.0079), specificities of 82.6% and 87.8%, and accuracies of 83.4% and 86.2%, with comparable sensitivities of 86.1% and 81.3%, as compared to the methylation model and the conventional model in the model development and validation cohorts, respectively (Fig. 5c–f).
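The odds ratios and confidence intervals listed above are linked to the underlying logistic regression coefficients by exponentiation. The sketch below assumes Wald-type intervals (the interval type is not stated in the text) and uses hypothetical helper names; it recovers the coefficient and standard error implied by the reported interval for the 3-marker methylation model in the integrated model.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def odds_ratio_with_ci(beta, se):
    """Convert a logistic coefficient and its standard error to (OR, lower, upper)."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))

def coefficient_from_ci(lower, upper):
    """Recover (beta, se) from a Wald interval reported on the odds-ratio scale."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return beta, se

# Reported interval for the 3-marker methylation model: 9.144 to 33.937.
beta, se = coefficient_from_ci(9.144, 33.937)
odds_ratio, lower, upper = odds_ratio_with_ci(beta, se)
```

Under this assumption the point estimate is the geometric mean of the interval bounds, which gives a quick consistency check on any reported OR and CI.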
Both the 3-marker methylation model and the integrated model have the potential to reduce overtreatment on LNM− EGC patients
The treatment modalities of EGC depend on the status of LNM in patients. While ESD has been used as the curative procedure for EGC without LNM, surgical resection of tumors with D1/D2 lymphadenectomy is conducted in patients diagnosed with LNM. However, the identification of LNM is not sufficient under current standard workups (the Japan Gastroenterological Endoscopy Society and Japanese Gastric Cancer Association (JGCA) guidelines) [5, 23]. To test whether the 3-marker methylation model can improve LNM diagnostic accuracy and treatment precision, we compared the clinical utilities of the 3-marker methylation model and the integrated model to current standard workups in all 432 surgically resected specimens. For patients with the absolute indication of ESD in our cohorts (n = 29), the 3-marker methylation model and integrated model resulted in 79.3% and 100% diagnostic accuracy, 0.0% undertreatment, and 20.7% and 0.0% overtreatment due to false positive identification, as compared to standard workups with 100.0% accuracy, 0.0% undertreatment, and 0.0% overtreatment (Fig. 6a, b). For patients with the expanded indication of ESD in our cohorts (n = 81, 13 LNM+ and 68 LNM−), while the overtreatment rates of the 3-marker methylation model and integrated model were higher than that of standard workups (16.0% and 12.3% vs. 0.0%), the undertreatment rates of our models were significantly lower (2.5% and 4.9% vs. 16.0%) and the overall accuracies were comparable to standard workups (81.5%, 82.7% vs. 84.0%) (Fig. 6a, c). For patients with the relative indication (n = 322, 91 LNM+ and 231 LNM−), the 3-marker methylation model and integrated model showed significantly improved accuracies as compared to standard workups (81.1%, 83.2% vs. 28.3%). Additionally, the 3-marker methylation model and integrated model showed remarkably low overtreatment rates (13.0%, 13.0% vs. 71.74%) (Fig. 6a, d).
Since 74.5% of all EGC patients in our cohorts had a relative indication, the 3-marker methylation model and integrated model have the potential to significantly reduce the overtreatment rate by 39.4% and 41.5% (14.1% and 12.0% vs. 53.5%), respectively, while maintaining a comparable undertreatment rate (4.9% and 3.7% vs. 3.0%) (Fig. 6a, e). Based on our findings, we propose a way to incorporate the methylation model and the integrated model into the current clinical diagnostic setting (Additional file 1: Figure S8).
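The overall overtreatment figures can be checked with simple arithmetic. The sketch below assumes that the 71.74% overtreatment reported for standard workups in the relative-indication group corresponds to its 231 LNM− patients out of 322, and that no overtreatment occurs under standard workups in the other two groups, as reported above; variable names are hypothetical.

```python
TOTAL_PATIENTS = 432
RELATIVE_INDICATION = 322

# Assumed count: the 231 LNM- patients with a relative indication are the
# patients overtreated under standard workups (231 / 322 = 71.74%).
standard_overtreated = 231

def percent(count, total):
    """Return count / total as a percentage."""
    return 100.0 * count / total

relative_share = round(percent(RELATIVE_INDICATION, TOTAL_PATIENTS), 1)
standard_rate = round(percent(standard_overtreated, TOTAL_PATIENTS), 1)

# Overall overtreatment rates reported for the two models (percent).
reported_rates = {"3-marker methylation model": 14.1, "integrated model": 12.0}

# Reduction expressed in percentage points relative to standard workups.
reduction = {name: round(standard_rate - rate, 1)
             for name, rate in reported_rates.items()}
```

Note that the 39.4% and 41.5% reductions are absolute differences in percentage points (53.5% minus 14.1% or 12.0%), not relative reductions.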
In this study, we performed comprehensive genome-wide methylation profiling on EGC tissues and identified 60 LNM-specific methylation markers. Derived from these markers, a qPCR-based 3-marker methylation model was developed and validated in large-scale retrospective cohorts consisting of 302 and 130 tissue samples, respectively. This model was superior to the most commonly used clinicopathology-based conventional tools in diagnosing LNM, as shown in our head-to-head comparison (AUC 0.85 vs. 0.77 in the model development cohort and AUC 0.88 vs. 0.79 in the validation cohort), while the conventional model we developed using the clinicopathological information showed similar diagnostic power as compared to previous studies (0.84 in the model development cohort and 0.82 in the validation cohort). The 3-marker methylation model also showed advantageous diagnostic potential as compared to the reported gene expression-based methods, in which a 15-gene signature was used to identify LNM in early stage (T1–T2) gastric cancer with an AUC of 0.76 in the training set and 0.74 in the validation set. The results indicate the robustness of DNA methylation as a diagnostic biomarker as compared to RNA expression: DNA is a relatively stable clinical material, and DNA methylation profiles may represent a relatively stable long-term programming of the genome and underlying cellular functions, whereas transcription assays only provide a snapshot of gene expression activity at a specific time point and represent a transient signaling process.
To date, few studies have used a genome-wide methylation strategy to screen methylation markers for LNM diagnosis in EGC. Wu et al. developed a classifier of 14 LNM-related genes derived from 450 K methylation data of gastric cancer in The Cancer Genome Atlas (TCGA), which showed a median AUC of 0.78. Our study applied a more comprehensive approach to dissect the methylome associated with LNM in EGC, with more than 3.34 million CpG sites analyzed, which accounted for 97.3% of CpG islands in the genome. The de novo marker discovery effort identified some LNM-specific markers that are reported here for the first time in EGC, including the 3 markers (GNAS, FCGBP, and CCDC166) used in the methylation model.
Previous studies have shown that DNA methylation levels of imprinted domains of GNAS in primary breast cancer, lung cancer, and ovarian cancer are very different from those in normal tissues. It has been shown that GNAS promotes breast cancer cell proliferation and epithelial–mesenchymal transition (EMT) through the PI3K/Akt/Snail1/E-cadherin signaling pathway, which may be responsible for malignant progression and metastasis [26, 27]. The differentially methylated region identified in our study lies in the first exon of GNAS and is hypomethylated in LNM+ EGC, suggesting that imprinted domains in GNAS could play a role in gastric cancer metastatic development as well.
FCGBP (Fc fragment of IgG binding protein) has been identified as a metastasis-related gene in colorectal cancer; its down-regulation is an independent risk factor for overall survival and disease-free survival in patients with metastatic colorectal cancer and is significantly associated with the prognosis of those patients [28, 29]. We found that the differentially methylated region of the FCGBP gene is located in its fifth exon, which may be involved in the regulation of gene expression and affect its function in LNM in gastric cancer. CCDC166 was found to be highly mutated in signet ring cell carcinoma. The mutated region does not overlap the methylated region we found, which is located in the first exon of CCDC166 and is hypomethylated in LNM+ EGC in our study. Further studies are needed to explore the biological functions and potential regulatory network of these methylation markers in promoting LNM in EGC.
In current clinical settings, endoscopic ultrasound, CT imaging, and clinicopathological features are standard workups for determining the N staging of gastric cancer. As different N staging may lead to different operative management, it is crucial to accurately assess the N staging preoperatively. However, preoperative LNM identification is limited with current technologies. Endoscopic ultrasonography was reported to have an accuracy of 43%, while CT imaging has an accuracy of 56% [31, 32]. Clinicopathological features can be examined pathologically in endoscopically resected tissues (EMR or ESD) from EGC patients. Patients found to have at least one positive pathological feature, such as undifferentiated type, submucosal invasion, lymphatic vascular invasion, or ulceration, are usually recommended for radical surgical procedures.
While the incidence of LNM in EGC is about 8–25%, approximately 69.1% of the patients with EGC undergo radical gastrectomy with lymphadenectomy according to standard workups, indicating that the current pathological assessment-based LNM diagnosis procedures are suboptimal and result in a high rate of overtreatment and unnecessary gastrectomies. CT-positive findings, which are largely based on nodule size and/or volume, are often accompanied by high false-negative rates.
Our 3-marker methylation model demonstrated improved performance over these current conventional methods. We found the LNM risk score calculated from our model was significantly associated with the LNM status in patients but not their age, gender, tumor size, and tumor location. The 3-marker methylation model and integrated model showed significantly improved specificity and low false positive rates, resulting in a remarkable reduction of overtreatment by 39.4% and 41.5% as compared to standard workups; this result suggested a great potential of the assay to reduce unnecessary gastrectomies. However, it is worth pointing out that our study was based on samples that were surgically resected; thus, a large-scale multi-center study with preoperative endoscopic biopsies or endoscopically resected specimens is needed to confirm the robustness and performance of the assay.
In summary, we have established and validated a novel 3-marker methylation model in a large retrospective cohort with the aim of improving the accuracy of LNM diagnosis in EGC. With further development, we hope to integrate it into existing preoperative LNM diagnosis procedures to help guide treatment decisions for EGC patients.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Abbreviations
AUC: Area under curve
EGC: Early gastric cancer
LNM: Lymph node metastasis
Positive lymph node metastasis
Lymph node metastasis negative
EMR: Endoscopic mucosal resection
ESD: Endoscopic submucosal dissection
PET-CT: Positron emission tomography with CT
Indication of endoscopy resection. Ct: cycle threshold
LASSO: The least absolute shrinkage and selection operator
ROC: Receiver operating characteristic
Yao K, Uedo N, Muto M, Ishikawa H. Development of an e-learning system for teaching endoscopists how to diagnose early gastric cancer: basic principles for improving early detection. Gastric Cancer. 2017;20(Suppl 1):28–38.
Takizawa K, Ono H, Yamamoto Y, Katai H, Hori S, Yano T, et al. Incidence of lymph node metastasis in intramucosal gastric cancer measuring 30 mm or less, with ulceration; mixed, predominantly differentiated-type histology; and no lymphovascular invasion: a multicenter retrospective study. Gastric Cancer. 2016;19(4):1144–8.
Banks M, Graham D, Jansen M, Gotoda T, Coda S, di Pietro M, et al. British society of gastroenterology guidelines on the diagnosis and management of patients at risk of gastric adenocarcinoma. Gut. 2019;68(9):1545–75.
Giganti F, Orsenigo E, Arcidiacono P, Nicoletti R, Albarello L, Ambrosi A, et al. Preoperative locoregional staging of gastric cancer: is there a place for magnetic resonance imaging? Prospective comparison with EUS and multidetector computed tomography. Gastric Cancer. 2016;19(1):216–25.
Hirasawa T, Gotoda T, Miyata S, Kato Y, Shimoda T, Taniguchi H, et al. Incidence of lymph node metastasis and the feasibility of endoscopic resection for undifferentiated-type early gastric cancer. Gastric Cancer. 2009;12(3):148–52.
Suzuki H, Oda I, Abe S, Sekiguchi M, Nonaka S, Yoshinaga S, et al. Clinical outcomes of early gastric cancer patients after noncurative endoscopic submucosal dissection in a large consecutive patient series. Gastric Cancer. 2017;20(4):679–89.
Saito T, Kurokawa Y, Takiguchi S, Miyazaki Y, Takahashi T, Yamasaki M, et al. Accuracy of multidetector-row CT in diagnosing lymph node metastasis in patients with gastric cancer. Eur Radiol. 2015;25(2):368–74.
Kim SM, Min BH, Ahn JH, Jung SH, An JY, Choi MG, et al. Nomogram to predict lymph node metastasis in patients with early gastric cancer: a useful clinical tool to reduce gastrectomy after endoscopic resection. Endoscopy. 2020;52(6):435–43.
Sekiguchi M, Oda I, Taniguchi H, Suzuki H, Morita S, Fukagawa T, et al. Risk stratification and predictive risk-scoring model for lymph node metastasis in early gastric cancer. J Gastroenterol. 2016;51(10):961–70.
Izumi D, Gao F, Toden S, Sonohara F, Kanda M, Ishimoto T, et al. A genomewide transcriptomic approach identifies a novel gene expression signature for the detection of lymph node metastasis in patients with early stage gastric cancer. EBioMedicine. 2019;41:268–75.
Wu J, Xiao YW, Xia C, Yang F, Li H, Shao ZF, et al. Identification of biomarkers for predicting lymph node metastasis of stomach cancer using clinical DNA methylation data. Dis Markers. 2017;2017:5745724.
Jin X, Zhu L, Cui Z, Tang J, Xie M, Ren G. Elevated expression of GNAS promotes breast cancer cell proliferation and migration via the PI3K/AKT/Snail1/E-cadherin axis. Clin Transl Oncol. 2019;21(9):1207–19.
Yuan ZM, Zhao ZX, Hu HQ, Zhu YH, Zhang WY, Tang QC, et al. IgG fc binding protein (FCGBP) is down-regulated in metastatic lesions and predicts survival in metastatic colorectal cancer patients. Onco Targets Ther. 2021;14:967–77.
Ungureanu BS, Sacerdotianu VM, Turcu-Stiolica A, Cazacu IM, Saftoiu A. Endoscopic ultrasound vs. computed tomography for gastric cancer staging: a network meta-analysis. Diagnostics (Basel). 2021;11(1):134.
Park JW, Ahn S, Lee H, Min BH, Lee JH, Rhee PL, et al. Predictive factors for lymph node metastasis in early gastric cancer with lymphatic invasion after endoscopic resection. Surg Endosc. 2017;31(11):4419–24.
Zou X-M, Wang X-S, Li Y-L, Jin Z-X, Piao D-X, Li X-Y, et al. Analysis of clinical characteristics of gastrointestinal cancer in Heilongjiang province, China 1998 to 2007. Chin J Gastrointest Surg. 2009;12(6):577–80.
We thank the staff of Jian-Bing Fan’s Laboratory at Southern Medical University for their excellent technical assistance and the AnchorDx R&D team (AnchorDx Medical Co., Ltd) for kindly providing the custom-made 50-marker EGC-LNM DNA methylation panel and related reagent kits. We also thank Dr. Haiyan Wang and Dr. Yan Zhang (Department of Pathology, Nanfang Hospital, Southern Medical University, Guangzhou, China) and Dr. Jingwen Chen and Dr. Liming Liu (Department of Pathology, Shenzhen People's Hospital, Shenzhen, China) for consultation on pathological diagnosis in clinical practice.
This study was supported by the Science and Technology Planning Project of Guangdong Province, China (Grant No. 2017B020226005), Scheme of Guangzhou Economic and Technological Development District for Leading Talents in Innovation and Entrepreneurship (Grant No. 2017-152), Scheme of Guangzhou for Leading Talents in Innovation and Entrepreneurship (Grant No. 2016007), and Scheme of Guangzhou for Leading Team in Innovation (Grant No. 201909010010).
Shang Chen and Yanqi Yu contributed equally to this work.
Authors and Affiliations
Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, China
Shang Chen, Yingdian Yu & Wenyuan Xue
Department of Pathology, School of Basic Medical Sciences, Southern Medical University, Guangzhou, 510515, China
Yanqi Yu, Quanzhou Peng, Tianfeng Cao & Jian-Bing Fan
Department of General Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
Tao Li & Jiang Yu
AnchorDx Medical Co., Ltd, Unit 502, No. 8, 3rd Luoxuan Road, International Bio-Island, Guangzhou, 510300, China
Weimei Ruan, Jun Wang, Zhiwei Chen & Jian-Bing Fan
Department of Pathology, Shenzhen People’s Hospital, Shennan Dong Lu, Luohu District, Shenzhen, 518002, China
AnchorDx, Inc., 46305 Landing Pkwy, Fremont, CA, 94538, USA
Z.W.C., J.Y., and J.B.F. conceived, designed, and directed the study. S.C., Y.Q.Y., W.M.R., and J.W. designed the experiments and developed the methodology. S.C., Y.Q.Y., T.L., and Q.Z.P. acquired the data. S.C., W.M.R., and J.W. performed the analyses and interpretation of data. S.C., Y.Q.Y., T.L., Q.Z.P., Y.D.Y., T.F.C., and W.Y.X. acquired the patient samples and information. S.C., W.M.R., X.L., and J.B.F. wrote and critically reviewed the manuscript. All authors reviewed and approved the final manuscript.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Chen, S., Yu, Y., Li, T. et al. A novel DNA methylation signature associated with lymph node metastasis status in early gastric cancer. Clin Epigenet 14, 18 (2022). https://doi.org/10.1186/s13148-021-01219-x