Patients and tumour samples
SACs were diagnosed on the basis of criteria proposed by Mäkinen et al. (epithelial serrations, clear or eosinophilic cytoplasm, abundant cytoplasm, vesicular nuclei, absence of or less than 10% necrosis of the total surface area, mucin production and cell balls and papillary rods in mucinous areas of a tumour). hmMSI-H were diagnosed according to previously established criteria (mucinous, signet-ring cell, and medullary carcinoma, tumour infiltrating and peritumoural lymphocytes, “Crohn-like” inflammatory response, poor differentiation, tumour heterogeneity and “pushing” tumour border) (Fig. 1). Frozen samples of 21 and 18 SACs were retrieved from Santa Lucia General University Hospital (HGUSL), Cartagena, Spain, and Oulu University Hospital, Oulu, Finland, respectively. Additionally, nine matched hmMSI-H from HGUSL were included for the methylome microarray study. Validation by methylated sequences was performed on 16 Spanish SAC cases and nine hmMSI-H from the microarray subset. Validation by qPCR was performed on frozen specimens of 12 SAC and nine hmMSI-H; in addition, adjacent normal mucosa was analysed from eight SACs and six hmMSI-H. Paraffin blocks of 26 SAC and 21 matched hmMSI-H, included in previous works [10, 14], were used for immunohistochemistry (IHC) validation. The MSI-H condition was confirmed at the molecular level as described previously by our group, and none of the hmMSI-H showed serrated morphology. The cases from the DNA validation set were used to assess the correlation between gene methylation and KRAS, BRAF and MSI status. The study was approved by the Hospital Ethics Committee and was carried out in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. Written informed consent was obtained from all the patients.
A volume of approximately 10 mm³ was extracted from each frozen tissue using a disposable sterile biopsy punch. DNA was extracted following the manufacturer’s instructions (Qiagen, Hilden, Germany). Briefly, tissue was disrupted and homogenized in ATL buffer using a TissueRuptor (Qiagen) and incubated with proteinase K, and the homogenate was subjected to automated DNA extraction using the QIAcube instrument and the QIAamp DNA Mini Kit (cat. no. 51306), both provided by Qiagen.
Bisulfite treatment and DNA methylation assay
Genome-wide DNA methylation screening was performed with the HumanMethylation450K BeadChip (Illumina, Inc., San Diego, CA) using the Infinium HD Methylation assay. In brief, genomic DNA (1000 ng) from each sample was bisulfite converted with the EZ DNA Methylation Kit (Zymo Research, Orange, CA) according to the manufacturer’s recommendations. Bisulfite-treated DNA was isothermally amplified at 37 °C (20–24 h), and the DNA product was fragmented by an endpoint enzymatic process, then precipitated, resuspended, applied to an Infinium HumanMethylation450K BeadChip (Illumina, San Diego, CA, USA), and hybridized at 48 °C (16–24 h). The fluorescently stained chip was imaged with the Illumina iScan, and Illumina’s GenomeStudio program (Methylation Module) was used to analyse the BeadArray data and assign site-specific DNA methylation β-values to each CpG site. The data set supporting the results of this article is available in the GEO repository under accession GSE68060 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE68060).
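As a rough illustration of how a site-specific β-value relates to the two channel intensities, the sketch below uses the commonly cited Illumina definition β = M / (M + U + offset) with an offset of 100. The function name and the example intensities are hypothetical, and the background handling done by the vendor software is not reproduced here.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Return beta = M / (M + U + offset) for a single CpG site.

    `methylated` and `unmethylated` are signal intensities; negative
    values are clipped to zero. The offset of 100 is the commonly cited
    default and is an assumption of this sketch.
    """
    m = max(float(methylated), 0.0)
    u = max(float(unmethylated), 0.0)
    return m / (m + u + offset)


# Hypothetical intensities for one probe
example_beta = beta_value(900, 0)
```

The offset keeps the ratio stable when both intensities are close to zero, at the cost of never reaching exactly 1.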
Preprocessing of methylation data
Raw data were processed using R packages. Probes with a low detection p value (p < 0.01) in more than 95% of the samples and those measuring SNPs or mapping to the X or Y chromosomes were removed. Normalization then followed a three-step procedure. Firstly, a colour bias adjustment was applied using the methylumi R-package. Then, the wateRmelon R-package was used to perform between-sample normalization by equalization of type I and type II backgrounds followed by separate quantile normalization of methylated and unmethylated intensities. Finally, a BMIQ intra-sample normalization procedure, included in the wateRmelon R-package, was applied to correct the bias of type II probe values.
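The quantile normalization step can be sketched generically as follows. This is a minimal textbook version applied to a single intensity matrix, not the wateRmelon implementation; in the procedure described above it would be run separately on the methylated and the unmethylated intensities. The function name and the toy values are hypothetical.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length columns (one per sample).

    Each value is replaced by the mean, across samples, of the values
    holding the same rank. Ties are broken by position, which is a
    simplification of this sketch.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same number of probes")

    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)
    ]

    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Two hypothetical samples with three probes each
toy = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization every sample shares the same set of values, and only the assignment of those values to probes differs between samples.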
Differential methylation functional profiling
The analysis of differentially methylated genes was performed using the limma R-package. Data were fitted to a linear model, and differentially methylated genes were identified using the empirical Bayes method included in the package. For comparisons between paired samples, a moderated paired t test was applied. An FDR-corrected p value of 0.05 was used as the threshold to select differentially methylated genes. Functional profiling of the differentially methylated genes was performed using ClusterProfiler and the FatiScan method included in the Babelomics [33, 34] web suite. For functional annotation, the Biological Process Database from Gene Ontology (GO) (www.geneontology.org) was used. Differentially methylated GO biological processes were represented as a scatterplot using the REVIGO online package.
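For readers unfamiliar with FDR correction, the sketch below shows the Benjamini-Hochberg adjustment followed by selection at the 0.05 threshold. That Benjamini-Hochberg is the specific correction used is an assumption of this example, since the text only states that p values were FDR-corrected; the function name and the p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices ordered from smallest to largest raw p value
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four genes
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
selected = [i for i, q in enumerate(adjusted) if q < 0.05]
```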
DNA methylation percentages of five different CpG island sites included in the microarray (two in CD14, three in HLA-DOA) were analysed and quantified by pyrosequencing. Bisulfite-converted DNA was first amplified by PCR using Hot-Start GoTaq polymerase (Promega, Madison, WI) under the following conditions: 1 μl of DNA, 4 μl of 5X polymerase buffer, 0.2 mM dNTPs, 0.6 mM MgCl2, 0.3 μM of either biotin-labelled forward or reverse primers and 0.05 U/μl Hot-Start GoTaq Flexi polymerase (Promega). The PCR protocol was as follows: initial denaturation at 94 °C for 2 min; 35 cycles of 94 °C for 10 s, 64 °C (CD14) or 60 °C (HLA-DOA) for 10 s and 72 °C for 50 s; and a final extension step at 72 °C for 7 min. Details of amplicon and primer sequences are provided in Additional file 5: Table S3. PCR products were verified using the QIAxcel DNA high-resolution electrophoresis system. Pyrosequencing of methylated sites was performed using the PyroMark Q24 (Qiagen) according to the manufacturer’s protocol. The methylation level was assessed using the PyroMark Q24 2.0.6 Software (Qiagen), which calculates the methylation percentage (mC/(mC + C)) for each CpG. The results are presented as the methylation percentage (mean ± SD) for each of the CpG sites analysed, whose sequences and relative positions are also shown in Additional file 5: Table S3.
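The per-site summary described above can be illustrated with a short sketch: the methylation percentage is computed as mC/(mC + C), and the values from several cases are summarized as mean ± SD. The function names and the numbers are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption of this example.

```python
from statistics import mean, stdev


def methylation_percentage(mc_signal, c_signal):
    """Return 100 * mC / (mC + C) for one CpG site in one case."""
    total = mc_signal + c_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * mc_signal / total


def summarize_site(percentages):
    """Return (mean, sample SD) of one CpG site across cases."""
    return mean(percentages), stdev(percentages)


# Hypothetical methylation percentages of one CpG site in three cases
site_mean, site_sd = summarize_site([10.0, 20.0, 30.0])
```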
Quantitative PCR for assessing mRNA expression
RNA from 20 SACs and 22 hmMSI-H, including those from the training set, was extracted with the miRNeasy kit (ref: 217004, Qiagen) and used for validation by qPCR. Reverse transcription was performed on a total of 1 μg of DNase I-treated RNA using the DyNAmo cDNA synthesis Kit (ref: F470L) provided by Thermo Scientific (Rockford, IL). Five microlitres of 1:5 diluted cDNA was added to the qPCR reaction containing 12.5 μl of 2X QuantiTect SYBR Green PCR Kit (ref: 204145, Qiagen) and 300 nM of each primer in a total volume of 25 μl. qPCR was performed on a 7500F real-time PCR system by Applied Biosystems (Foster City, CA, USA) according to the instruction manual and following the standard protocol: 50 °C 2 min, 95 °C 10 min, 40 cycles of 95 °C 15 s, 60 °C 1 min and a melt curve stage consisting of 95 °C 15 s, 60 °C min, 95 °C 30 s and 60 °C 30 s. Primers were designed using Primer3 software; sequences and amplicon sizes are shown in Additional file 5: Table S3. Relative quantitation was done by the 2^(-ΔCt) method using β-actin as the housekeeping gene.
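The 2^(-ΔCt) calculation can be written out as a short sketch, where ΔCt is the Ct of the target gene minus the Ct of the housekeeping gene (β-actin here). The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2 ** (-delta_ct), with delta_ct = ct_target - ct_reference."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: target gene at 25 cycles, beta-actin at 20 cycles
expression = relative_expression(25.0, 20.0)
```

A target that crosses the threshold five cycles later than the reference is therefore expressed at 2^(-5), or about 3% of the reference level, assuming perfect amplification efficiency.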
The validation subset consisted of 26 SAC and 21 hmMSI-H cases matched for gender, age and location, and a representative area of each tumour was selected by one of us (JGS). Whole 2.5-μm sections were stained with CD14 and HLA-DOA rabbit antibodies. Both antibodies were processed on a BenchMark ULTRA instrument (Ventana) with antigen retrieval in CC1 buffer (basic pH, 95 °C, 56 min) and overnight incubation at room temperature. Antibody supplier and type, code (clone) and dilution were as follows: for CD14, Cell Marque, monoclonal, 760-4523 (EPR3653), 1:5; for HLA-DOA, Sigma Aldrich, polyclonal, HPA045038, 1:200. Endogenous peroxidase activity was blocked using 0.5% H2O2 for 5 min. For visualization of the antigen, the sections were immersed in 3,3′-diaminobenzidine (DAB) and counterstained with Harris’ haematoxylin for 5 min. Following the manufacturers’ recommendations, sinusoidal histiocytes and the mantle zone of a lymph node were used as positive controls for CD14 and HLA-DOA, respectively.
These markers were evaluated by scoring staining intensity in both the centre of the tumour and the invasive front (1 = none or weak staining, 2 = moderate, 3 = strong) together with a staining area score (A < one third, B = between one and two thirds, C > two thirds) in a given area. For statistical analysis, both intensity and distribution were considered.
Statistical analysis of validation data
For the quantification of methylated DNA sequences, the data correspond to a split-plot design with one between-subject factor defining six independent groups of cases (SAC, CC, hmMSI-H; tumoural and non-tumoural) and one within-subject factor (CpG sites) defining nine repeated measures for every case. Accordingly, we performed two SPF-p-q ANOVAs. The first compared the means of the tumoural vs. non-tumoural groups in each of the nine different CpG sites, and the second compared the means of the six different groups in these sites. To check the relationship between methylation percentage and binary variables, the t test for independent samples and the Mann-Whitney U test were used. Statistical significance in the immunohistochemistry study was assessed using Pearson’s χ2 or Fisher’s exact test when indicated. Descriptive statistics were computed for real-time PCR. Statistical analysis was performed using the SPSS package (version 22; Chicago, IL).
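As an illustration of one of the tests named above, the following is a minimal sketch of a two-sided Fisher’s exact test for a 2 × 2 table, computed from the hypergeometric distribution. It is not the SPSS implementation; the function name and the counts are hypothetical, and the two-sided p value is defined here as the sum of the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    if n == 0:
        raise ValueError("table must contain at least one observation")

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum all tables at most as probable as the observed one; the small
    # relative tolerance guards against floating-point ties
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Hypothetical counts: strong vs. weak staining in two tumour groups
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```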