The signature of HBV-related liver disease in peripheral blood mononuclear cell DNA methylation

Hepatitis B virus (HBV)-related liver disease induces liver damage by hepatic immune and inflammatory response. The association between aberrant peripheral blood mononuclear cell (PBMC) DNA methylation and progression of liver disease and fibrosis remains unclear. Here we applied Infinium 450 K BeadChip investigating PBMC genome-wide methylation profiling of 48 HBV-related liver disease patients including 24 chronic hepatitis B (CHB), 14 compensated liver cirrhosis (LC), and 10 decompensated liver cirrhosis (DLC). In total, there were 7888 differentially methylated CpG sites (36.06% hypermethylation, 63.94% hypomethylation) correlate with liver disease progression. LC was difficult to be diagnosed, intermediating between CHB and DLC. We used least absolute shrinkage and selection operator (LASSO)-logistic regression method to perform a LC predictive model. The predicted probability (P) of having LC was estimated by the combined model: P = 1/(1 − e−x), where X = 11.52 − 2.82 × (if AST within the normal range − 0.19 × (percent methylation of cg05650055) − 0.21 × (percent methylation of cg17149911 ). Pyrosequencing validation and confusion matrix analysis was used for internal testing, area under receiver operating characteristic curve (AUROC) of model was 0.917 (95% CI, 0.80–0.977). On the fibrosis progress, there were 1705 genes in LC compared with CHB, whose differentially methylated CpG sites loading within the “promoter” regions (including TSS1500, TSS200, 5′UTR, and the 1st exon of genes) subject into the enrichment analysis using Ingenuity Pathway Analysis (IPA). There were 113 enriched immune-related pathways indicated that HBV-related liver fibrosis progression caused epigenetic reprogramming of the immune and inflammatory response. These data support idea that development of HBV-related chronic liver disease is linked with robust and broad alteration of methylation in peripheral immune system. CpG methylation sites serve as relevant biomarker candidates to monitor and diagnose LC, providing new insight into the immune mechanisms understanding the progression of HBV-related liver fibrosis and cirrhosis.


Introduction
Hepatitis B virus (HBV) infection is a common and growing global public health problem and CHB-related liver cirrhosis and hepatocellular carcinoma (HCC) are the cause of a high rate of morbidity and mortality [1]. WHO estimated that 257 million carriers of hepatitis B surface antigen (HBsAg) around the world in 2015, among which there were more than 887,000 deaths, mostly from hepatitis B resulted liver cirrhosis and HCC (World Health Organization fact sheets are available at www.who.int) Clinically, it might be easy to distinguish decompensated liver cirrhosis (DLC) compared compensated liver cirrhosis (LC) according to presence of series of clinical symptoms form ascites, variceal bleeding, gastrointestinal bleeding, or hepatic encephalopathy, and Child-Turcotte-Pugh (CTP) score is class B or C [2]. It is relatively simple to diagnose DLC, although the treatment might be problematic. On the contrary, it is more challenging to make a distinction from CHB [3]. CT, MRI, and other imaging examinations are limited in the diagnosis of LC. It is still dependent on liver biopsy that is traditional gold standard procedure for staging of fibrosis and diagnosis of LC in Chinese guidelines for the prevention and treatment of chronic hepatitis B (version 2019) [4] and AASLD 2018 Hepatitis B Guidance [5]. However, it is invasiveness with risk of serious complications [6], sampling limitation, and interpretational variability [7]; therefore, clinicians have been trying to seek more accurate and noninvasive tools for assessing LC.
DNA methylation is stable epigenetic modification, whose variation is induced by environmental factors [8], and changes of DNA methylation are used to detect and monitor HBV-related chronic disease and HCC has been reported in recent years [9,10]. Our previous studies had demonstrated that large changes in peripheral blood mononuclear cell (PBMC) DNA methylation altered in HBV/HCV-chronic hepatitis that might be playing a role in the progression form liver disease to HCC. Nevertheless, there was a dramatic and clear differentiation in PBMC DNA methylation profiles between chronic hepatitis and HCC. Meanwhile, the study supported the hypothesis that changes in DNA methylation of PBMC reflect changes in the host immune system interaction of HCC, rather than the footprints of circulating DNA of tumors or tumor substitutes [10].
Here, we initially applied Infinium Human Methylation 450 K BeadChip arrays to examine genome-wide DNA methylation profiles in PBMC samples from 48 HBV-related liver fibrosis and cirrhosis. The chip covered about 480,000 single CpG sites from 21,231 human genes annotated by the university of California Santa Cruz genome database [11]. Genome-wide CpG sites methylation profiling of PBMC DNA could provide relevant biomarker candidates to monitor the process of HBV-related liver fibrosis and cirrhosis, which was used to examine the associations of LC specific CpG sites for establishing diagnosis model. Meanwhile, methylated variations of CpG sites occurred mostly within the promoter regions of genes to regulate their transcription [12]. CpG sites methylation correlating with liver fibrosis and cirrhosis could contribute evidence for immune functional and canonical pathway changes. Our findings also provided new insight into the immune mechanisms underlying the progression of HBV-related liver fibrosis and cirrhosis.

Result
Clinical and pathological characteristics of the diseased patients Forty-eight patients with HBV-related diseases were from Beijing area, whose clinical and pathological characteristics were described in Table 1. These three groups did not differ for age, gender, smoking and alcohol, total bilirubin (TBIL), aspartate transaminase (AST), alanine transaminase (ALT), and serological diagnosis model (APRI, FIB-4) (all P > 0.05). All patients were subjected accurate diagnosis of staging of fibrosis and cirrhosis with traditional gold standard liver biopsy.
Correlation between quantitative distribution of site specific DNA methylation levels and progression of liver disease PBMC was isolated by density gradient centrifugation from 48 patients' blood. After treatment by bisulfite conversion, PBMC genomic DNA was detected using Illumina Infinium HumanMethylation450 BeadChip arrays. Raw data was loaded and analyzed using the ChAMP Bioconductor package in R. In total, 485,577 methylation probes were in the 450 K arrays, while probes with a detection P value > 0.01 (1850 probes), with a beadcount < 3 in at least 5% of samples (6368 probes), containing single nucleotide polymorphisms (SNPs, 49,659 probes), aligning to multiple locations (7074 probes), locating on X,Y chromosome (10,073 probes) as well as NoCpG sites (3330 probes) were consequently removed, finally 407, 223 probes were used for further analysis. Following normalization and correction for batch effects, Pearson correlation analysis with Bonferroni correction for multiple testing (< 1 × 10 −7 ) was used to show linear correlation between the quantitative distribution of CpG sites methylation levels across the array and liver disease progression. The result revealed significant correlation between CpG sites methylation levels and progression of liver disease. DNA methylation levels obviously varied in the whole genome, within 7888 CpG sites (r > 0.8, r < − 0.8; P < 10 −7 , Supplementary table 1) including 2845 (36.06%) hypermethylation CpG sites and 5043 hypomethylation CpG sites most significantly changed during liver disease progression. Hierarchical clustering analysis showed 7888 differentially methylated CpG sites probes distinguish patients of CHB, LC, and DLC ( Fig. 1). Obviously, DLC patients were differed form CHB and LC ones, in addition to fibrotic process of the liver, the other symptoms of DLC might be major factors distinguishing DLC from of CHB and LC. Meanwhile, liver fibrosis S4 stages were both within DLC and LC patients who were separated into two groups, indicating that liver fibrosis was not the cause but one of outcome of the liver disease.
The PBMC DNA methylation characters of progression of liver disease remained significant after correction for potential confounders: gender, age, smoking, alcohol drinking, and cell type.
To rule out the influence of cell mixture distribution in PBMC, gender, age, smoking, and alcohol drinking on DNA methylation, these covariates were checked for correction. First of all, the cell mixture distribution for each patient was determined using the Houseman algorithm [13]. There was no significant difference in cell mixture between the groups using two-way ANOVA followed by pair-wise comparisons and false discovery rate. Then, a multivariate linear regression was performed on the normalized beta values of the 7888 CG sites to divide all cases using group (CHB, LC, and CHB chronic hepatitis B, LC compensated liver cirrhosis, DLC decompensated liver cirrhosis, TBIL total bilirubin, AST aspartate transaminase, ALT alanine transaminase, APRI, FIB-4 serological diagnosis model, HBsAg hepatitis B surface antigen, Anti-HBs hepatitis B surface antibody, HBeAg hepatitis B e antigen, Anti-HBe hepatitis B e antibody, Anti-HBc hepatitis B core antibody. a: one CHB patient data of HBV markers unavailable, b: one DLC patient data of HBV markers unavailable DLC), gender, age, smoking, alcohol drinking, and cell type as covariates. In the model including the other covariates, all CG sites remained highly significant for group covariate, even after following Bonferroni corrections, all CG sites remained highly significant for the group (Supplementary table 2). A multifactorial ANOVA analysis was performed on the beta values of the 7888 sites as dependent variables and group (CHB, LC, and DLC), gender, age, smoking, and alcohol drinking as independent variables to determine whether there were possible separate and combined effects on DNA methylation by the five independent variables. The group remained significant for all 7888 CG sites, no significant interactions between group and separate or combine other four independent variables (gender, age, smoking, and alcohol drinking) were found after Bonferroni corrections (Supplementary  table 3).

Profiles and cluster of CpG sites methylation level by STEM analysis
We aimed to address the hypothesis that changes of DNA methylation level reflected the progression of the disease. We used short time-series expression miner (STEM) analysis to perform a global temporal analysis of the detected CpG sites, in order to find the sites demonstrating a common pattern of change of methylation level along with the disease stages. Of the 50 randomly selected profiles investigated, 18 profiles (1,2,4,8,9,13,17,18,19,22,24,26,31,37,37,43,44, and 45) showed a statistically significant higher number of CpG sites than expected (shown as colored profiles in Fig. 2). Among these profiles, we focused on two profiles for further analysis. In profile 24, the methylation level of the CpG sites was stable from stage S1 to stage S3, then reached the lowest level at stage LC, afterwards returned to higher methylation level at stage DLC (but still lower than stage S3). In profile 26, the methylation level of the CpG sites was stable from stage S1 to stage S3, and then reached the highest methylation level at stage LC. At stage DLC, the methylation level was the lowest.

Selection of from PBMC DNA CpG sites for LC prediction model
From profile no. 24 and no. 26, nine CpG sites were selected and successfully distinguish LC PBMC samples from CHB and DLC samples using hierarchical clustering. Nine CpG sites candidates used to subject least absolute shrinkage and selection operator (LASSO) regularized regression method [14] implemented in the glmnet R package [15,16] to identify CpG sites panel that predicted LC. Optimal value of LASSO penalty weighting, λ was selected with 5-fold cross-validation (CV). The minimum-deviance (λ min ) plus 1 standard error (λ 1se ) was used to select CpG sites sets (Fig. 3a). Using the λ 1se = 0.051, five CpG sites cg23899408 (HOOK2), cg20332088 (no gene can be mapped), cg17040924 (OR52M1), cg17149911 (AAK1), and cg05650055 (MYEOV) were prepared for LC prediction model (Fig. 3c).

LC prediction model from CpG sites in PBMC DNA combined with AST in blood
Firstly, CpG sites methylation obtained from Illumina 450 K array data were validated using the correlation analysis. The three CpG sites (cg20332088, cg05650055, and cg17149911) were chosen for pyromarksequencing, and the primers and cycling conditions for pyrosequencing were shown in Supplementary table 4. The results showed remarkable correlation between quantitation of CpG sites methylation ratio using pyrosequencing and Illumina 450 K array data, with a correlation coefficient of 0.92, 0.90, and 0.87, respectively (Fig. 4a, b, and c).
Univariate analysis and multivariate logistic regression were performed on five CpG sites were selected by LASSO and clinical characteristics of the patients to determine the independent association of each variable with LC. In consideration of clinically relevant and given the number of events available [17], thus the final model contained three variables: cg05650055 (PyroMark value), cg17149911 (PyroMark value), and AST levels in blood. The predicted probability (P) of having LC was estimated by the combined model: P = 1/(1 − e −x ), where X = 11.52 − 2.82 × (if AST within the normal range) − 0.19 × (percent methylation of cg05650055) − 0.21 × (percent methylation of cg17149911) ( Table 2). The PBMC DNA methylation characters between LC and CHB remained significant after correction for potential confounders: gender, age, smoking, alcohol drinking, and cell type.
A multivariate linear regression was performed on the normalized beta values of the 4325 CG sites to differentiate LC from CHB cases using group (CHB versus LC), gender, age, smoking, alcohol drinking, and cell type as covariates. In this model including the other covariates, all CG sites remained highly significant for group covariate, following Bonferroni corrections, 4219 CG sites A multifactorial ANOVA analysis was performed on the beta values of the 4325 sites as dependent variables and group (CHB versus LC), gender, age, smoking, and alcohol drinking as independent variables to determine whether there were possible separate and combined effects on DNA methylation by the five independent variables. Following Bonferroni corrections, 4167 CG sites remained highly significant for the group, 158 significant interactions between group and separate or combine other four independent variables (gender, age, smoking, and alcohol drinking) were found (Supplementary table 7).

Immune functional and canonical pathway changed between LC and CHB in PBMC
To gain insight into the immune functional footprint of the differentially methylated genes between LC and CHB   in PBMC, the genes whose "promoter" regions containing differentially methylated CpG probes (Supplementary table 8) were subjected to enrichment analysis using Ingenuity Pathway Analysis (IPA). There were 113 enriched immune-related pathways (Supplementary  table 9), and the highlighted pathways (P value < 0.05) with relevance to HBV-related fibrosis progression were identified in this analysis (Fig. 5b). The top five immune functional and canonical pathways were including IL-15 production, IL-17A signaling in airway cells, CD40 signaling, IL-8 signaling, production of nitric oxide, and reactive oxygen species in macrophages.

Discussion
HBV infection does not cause directly hepatocyte lesions, interactions between virus and host immune response determine virus clearance or liver damage [18]. It has been well known of viral clearance; however, mechanisms of host immune response responsible for the liver damage caused by HBV infection are still need to elucidate. Less is known about the alterations occurring in immune cells DNA methylation in non-tissues in HBVrelated disease patients. From our analysis, 7888 CpG sites in PBMC DNA whose quantitative state of methylation showed strong correlation (r > 0.8, r < − 0.8; P < 10 −7 ) with progression from CHB, LC to DLC (Fig. 1), which remained significant even after taking into account in the regression model differences in gender, age, smoking, alcohol drinking, and cell-type distribution. These data support idea that development of HBVrelated chronic liver disease is linked with robust and broad alteration of methylation in peripheral immune system. Interestingly, the overall of differences in PBMC DNA methylation varied as HBV-related chronic disease advanced, from hypermethylation to hypomethylation in Fig. 1. Importantly, the profiles of PBMC DNA of DLC patients were clearly differentiated from the CHB and LC ones. There was a sharp boundary between DLC and CHB and LC in heatmap, even both LC and DLC were fibrotic S4. Except for fibrosis level, the clinical and pathological characteristics of DLC might be other important factors distinguishing themselves from CHB and LC. This inference was consistent with some clinical observation that there was no significant correlation between severity of necrosis or liver injury and fibrosis progression in DLC patients [19,20]. The same was true in IL-15RαKO mice [21].
The alteration of PBMC DNA methylation modification with chronic liver disease progression is important and helpful to find biomarkers of liver disease in clinical diagnosis and therapies [22]. Clinically, it might be easy to distinguish DLC according to a series of clinical symptoms. However, the LC patients with fibrosis S4 stage are difficult to make a distinction from CHB [3]. At present, LC diagnosis is still dependent on liver biopsy that is traditionally gold standard procedure for staging of fibrosis and diagnosis. LC specifically PBMC DNA-methylated CpG sites were significantly enriched by STEM analysis based on their beta value of differentiation in the process of liver fibrosis progression [23]. Based on the both profile no. 24 and no. 26 form STEM results, we used five-fold cross-validation to select penalty weighting λ min , five candidate CpG sites were selected by LASSO basing on λ 1se . In consideration of clinically relevant and available number of events, we selected cg05650055 and cg17149911 to prepare for LC prediction model. Considering together with clinical characteristics of patients, we made the LC combine model. In this study, confusion matrix analysis was used for LC model internal testing with 48 liver disease patients, further external validation will carry out to confirm detection efficiency of model.  (11,475,62.94%) were more than hypomethylated ones (6759, 37.06%) [9]. The DNA methylation profile of PBMC DNA was not the same as tissue in severe fibrosis patient. Without an overlap differentially methylated CpG site between PBMC and liver biopsies (partial available data) suggested that changes of DNA methylation seen in PBMC reflect variation in the immune system were not a footprint of disease tissue.
(See figure on previous page.) Fig. 5 a The volcano plot for differential DNA methylation CpG sites between LC and CHB. The x-axis showed the mean DNA-methylation difference (delta Beta), whereas the y-axis showed the -log10 of the adjusted P value, hypermethylated CGs were shown in red, hypomethylated CGs were shown in red blue, non-significant methylated change CGs were shown in green (Bonferroni correction P value < 0.05). b Immune functional and canonical pathway changes between LC and CHB. Ten most significant pathways identified by the IPA canonical pathways analysis ("Cellular Immune Response," "Cytokine Signaling," and "Humoral Immune Response") of genes, whose "promoter" regions containing differentially methylated CpG probes. Upper x-axis represented negative -log (P value) of the enrichment score, which calculated by IPA using Fisher's exact test, right-tailed. Lower x-axis represented the ratio values between selected genes and the total number of genes in each of these curated pathways, and the orange curve pointed out the ratio, orange vertical line represented threshold value was 1.3.
The accurate relationship between DNA methylation and gene expression was complex and unclear [24,25]. An effect of DNA methylation on transcription was highly dependent on specificity of tissue and genomic context [24]. Methylation of promoter regions (TSS1500 and TSS200) and 5′UTR had mostly negative regulatory effect on transcription compared to gene body regions [26,27]. The 1st exon was considered as an important negative factor of transcription regulation [28]. Hypermethylation of the 1st exon region blocked transcriptional initiation, or vice versa and its length was closely related with gene activity [28][29][30]. On other hand, once transcription began, methylation of downstream region was not a significant interference factor restraining extension of RNA polymerase [28]. Methylation of CpGs loaded within the body regions, 3′UTR, ExonBnd, and intergenic region was not taken into account for functional and canonical pathway analysis. Finally, we selected CpGs loaded within "promoter" regions including TSS1500, TSS200, 5′UTR, and the 1st exon of genes for immune functional and canonical pathway analysis.
IPA canonical pathways analysis showed the immune and signal pathway had changed in LC compared to CHB, and provided an overview of the functional pathways that were affected. IL-15 and its signaling played an anti-fibrotic role in two independent ways, IL-15 maintained homeostasis of NK cell whose cytolytic effect avianized fibrogenic potential of hepatic stellate cells (HSCs) [31] and IL-15Rα inhibited collagen transcription repressors in HSCs [21]. In hepatitis B virus-related cirrhosis, IL-17 could activate STAT3 signaling pathway to induce production of type I collagen in HSCs [32]. IL-17A played critical role of HSC activation for liver fibrosis, and the induction of inflammatory cytokine and neutrophils recruitment were IL-17A-dependent [33]. Activation of the IL-17 pathway was a characteristic of human alcoholic hepatitis [34]. IL-8 significantly increased in chronic liver diseases, it served as a novel role to recruit and activate hepatic macrophages via CXCR1 that enhanced hepatic inflammation [35]. The production of nitric oxide (NO) and reactive oxygen species (ROS) in macrophages suggested that except for neutrophils [36], macrophages were another important source of NO and ROS in fibrosis progression. The CXCL12/ CXCR4 pathway recruited and retained immune cells to involve into inflammation and fibrosis both in chronic HCV and HBV infection [37]. These findings provided considerable support for the point that the epigenetic altering of immune system and inflammatory response during HBV-related liver fibrosis progression.
We build a LC prediction combine model, using confusion matrix analysis to confirm the model to be highly sensitive and specific. An external verification study was a necessary for LC diagnosis. The progression of HBV-related liver disease is affected by interaction between viral load, viral genotypes, and immune system. We determine that the PBMC DNA methylation characters of progression of liver disease (7888 CGs) and difference between LC and CHB (4325 CGs) remain significant after correction for HBV markers (Supplementary table 10 and 11) and PBMC cell type (CD4+ T cell, CD8+ T cells, B Cells, natural killer cells, monocytes, granulocytes). However, further research needs to investigate the effect of HBV markers' change, viral genotype, and HBV infection of PBMC on methylation variety in host PBMC. Meanwhile, we still need to find out the specific methylation site in PBMC of HBV infection compared with cases associated with HCV infection, especially end-liver diseases.
In conclusion, we demonstrated epigenome-wide methylation of fibrosis and cirrhosis in PBMC DNA of HBV-related liver disease patients. From PBMC DNA methylome profiling, we found that clinical and pathological characteristics of DLC might be important factors distinguishing themselves from CHB and LC. That was consistent with some clinical observation. On study of HBV-related liver fibrosis progression, we compared PBMC DNA methylome changes of CHB with LC. IPA pathway data showed that fibrosis progression caused epigenetic reprogramming of the immune and inflammatory response, which providing potential targets for the treatment of liver fibrosis.

Study sample
In this study, 48 HBV-related disease patients who were admitted into Beijing You'An Hospital, Capital Medical from August 23, 2013 to March 1, 2017 were recruited. Forty-eight patients included CHB consisting of 10 patients with fibrosis S1, 9 S2, and 4 S3; 14 LC consisting of 4 S3-4 and 10 S4; and 10 DLC. All patients and their family signed informed consents, and ethical approval was granted from the respective ethics committees at the Beijing You'An Hospital, Capital Medical University (EC-B-031-A01-V9.1-2017-26).

Illumina HumanMethylation450 BeadChip analysis
After extraction, PBMC DNA was treated by bisulfite converted (ZYMO Research, Irvine, USA), and prepared for Illumina HumanMethylation 450 K BeadChip (Illumina, Inc., San Diego, USA) analysis by CapitalBio Technology according the manufactures' guide. In brief, bisulfite-converted genomic DNA was whole-genome isothermally amplified at 37°C for 23 h, and enzymatically fragmented, precipitated, denatured, and hybridized on the BeadChips for 18 h at 48°C. Then BeadChips were washed, extended with biotin modified ddCTPs, ddGTP or DNP modified ddATP, ddTTP prior to scanning with Illumina iScan system. Preprocessing of the methylation data including raw.idat file loading, filtering out probes located in chromosome X and Y, quality check, normalization, and batch effect correction was performed with Bioconductor package Chip Analysis Methylation Pipeline (ChAMP) implemented in R [38]. Raw data were filtered the probes with detection P value > 0.01, with a beadcount < 3 in at least 5% of samples, containing single nucleotide polymorphisms (SNPs) aligning to multiple locations, locating on X,Y chromosome as well as was NOCpGs.

STEM analysis
We investigated the continuous change of CpG methylation over 5 disease development stages by applying short time-series expression miner (STEM) analysis [39]. For our analysis, mean CpG sites beta value of each group was used for STEM analysis. Firstly, the clustering algorithm was set as STEM Clustering Method to create 50 model profiles, all these model profiles were defined independently of the data from the experiment. Then, STEM used the hypergeometric distribution to compute the significance of overlap between the data from the experiment and model profiles. Filtering parameters were adjusted for CpG methylation data, as difference was set to from 0 to 1 and minimum absolute expression change was set to 0.

PCR and pyromarksequencing
Bisulfite-converted DNAs were amplified by PCR primers designed with the Pyromark Assay Design software 2.0 (Qiagen, Hilden, Germany) and primers and cycling conditions for pyrosequencing shown in Supplementary table 2. These PCR amplicons were separated and detected by electrophoresis on 2% agarose gel. Briefly, the PCR amplicon for each sample was immobilized by master mix which contained Streptavidin Sepharose High Performance beads (GE, Healthcare) and PyroMark Binding Buffer (Qiagen). The immobilized product was purified by 70% ethanol, PyroMark Denaturation Solution (Qiagen), and PyroMark Wash Buffer (Qiagen) on the PyroMark Q24 Vacuum Workstation (Qiagen), sequentially. Finally, singlestranded DNA was then annealed to a specific sequencing primer (Supplementary table 2) at 80°C for 2 min, and then cooled to room temperature for at least 5 min. Results were analyzed with the PyroMarkQ24 Software 2.0 (Qiagen).

Pathway analysis
All CpG sites were linked to genes on basis of only the 450 K BeadChip annotation file. The majority of the significantly differentially methylated CpG sites were located within the gene region, which was categorized into 8 groups: TSS1500 (1500 bp regions upstream of the transcription start site (TSS)), TSS200, 5′untranslated region (UTR), the 1st exon, exon boundaries (ExonBnd), gene body, 3′UTR, and intergenic region. "Promoter" region included TSS1500, TSS200, 5′UTR, and the 1st exon of genes, which was tightly linked to transcriptional progress [28-30, 40, 41]. Selected genes whose "promoter" region containing differentially methylated CpG probes were subjected to QIAGEN's Ingenuity Pathway Analysis (IPA, QIAGEN Redwood City, www.qiagen. com/ingenuity) to identify relevant immune signaling and functional pathways.

Software platform and packages
All analyses were carried out using the R 3.5.1 language (http://www.r-project.org/). Correlations between pyromarksequencing and 450 K BeadChip data were tested using Pearson coefficients as implemented in R stats-package. The ChAMP package was for analysis of Illumina BeadChip assay [38] and package Limma [42] was implemented in ChAMP. The R package "glmnet" [15,16] was for Lasso logistic regression, and "caret" was for cross validation to select the optimal λ value of LASSO penalty [43] and confusion matrix analysis of internal testing of LC model. "ROCR" [44] was employed to plot the ROC curve and determine the AUROC.