DNA methylation mediates the effect of cocaine use on HIV severity

Cocaine use accelerates human immunodeficiency virus (HIV) progression and worsens HIV outcomes. We assessed whether DNA methylation in blood mediates the association between cocaine use and HIV severity in a veteran population. We analyzed 1435 HIV-positive participants from the Veterans Aging Cohort Study Biomarker Cohort (VACS-BC). HIV severity was measured by the Veterans Aging Cohort Study (VACS) index. We assessed the effect of cocaine use on VACS index and mortality among the HIV-positive participants. We selected candidate mediators that were associated with both persistent cocaine use and VACS index by epigenome-wide association (EWA) scans at a liberal p value cutoff of 0.001. Mediation analysis of the candidate CpG sites between cocaine’s effect and the VACS index was conducted, and the joint mediation effect of multiple CpGs was estimated. A two-step epigenetic Mendelian randomization (MR) analysis was conducted as validation. More frequent cocaine use was significantly associated with a higher VACS index (β = 1.00, p = 2.7E−04), and cocaine use increased the risk of 10-year mortality (hazard ratio = 1.10, p = 0.011) with adjustment for confounding factors. Fifteen candidate mediator CpGs were selected from the EWA scan. Twelve of these CpGs showed significant mediation effects, with each explaining 11.3–29.5% of the variation. The mediation effects for 3 of the 12 CpGs were validated by the two-step epigenetic MR analysis. The joint mediation effect of the 12 CpGs accounted for 47.2% of cocaine’s effect on HIV severity. Genes harboring these 12 CpGs are involved in the antiviral response (IFIT3, IFITM1, NLRC5, PLSCR1, PARP9) and HIV progression (CX3CR1, MX1). We identified 12 DNA methylation CpG sites that appear to play a mediation role in the association between cocaine use and HIV severity.


Introduction
Cocaine use is common among persons with chronic human immunodeficiency virus (HIV) infection, with prevalence estimates for current or recent use ranging from 5 to 30% [1-6], compared with 2% in the US general population [7]. Previous studies have shown that cocaine use accelerated HIV progression [8-11]. However, the biological mechanism of cocaine's effect on HIV outcomes remains largely unknown. Some studies have suggested that cocaine use may worsen HIV outcomes due to poor adherence to antiretroviral therapy (ART) among HIV-positive participants [2,12]. Other studies have demonstrated that cocaine's adverse effect on HIV outcomes is independent of ART [10,11,13-15], supporting the hypothesis that cocaine exposure may lead to long-lasting pathophysiological changes in the immune system that worsen HIV outcomes.
Previous studies have shown that cocaine use enhances HIV-1 replication and undermines immune function by dysregulating the expression of HIV-1 entry coreceptors, enhancing HIV-1 cellular toxicity, and dysregulating interleukins (IL) in the host [43,44]. Cocaine use increases the release of cytokines in immune cells and alters the cytokine profile in HIV-infected individuals [45,46]. Specifically, cocaine use was positively associated with IL-4 and IL-10 [47], which likely worsens HIV severity and disease progression. Epigenetic mechanisms may play a role in cocaine's effect on HIV severity because cocaine exposure has been shown to increase the expression of methyl-CpG binding protein 2 (MeCP2) [48] as well as DNMT3A and DNMT3B [49] in animal brains. In a well-matched case-control human pilot study, cocaine use altered the DNA methylation (DNAm) profile in blood [50]. It is plausible that cocaine use may lead to DNAm changes in immune response genes and gene expression changes in the cytokine gene family, which in turn affect HIV progression. Thus, we hypothesized that DNAm may mediate the effect of cocaine exposure on HIV severity.
In this study, we first validated previous findings by examining cocaine's adverse effect on HIV severity and mortality. We then conducted mediation analyses to assess the mediation role of DNAm sites (CpGs) in cocaine's effect on HIV severity using the Veterans Aging Cohort Study Biomarker Cohort (VACS-BC, n = 1435). To assess how sensitive our results are to violation of model assumptions and to validate our findings using a different approach, we performed a sensitivity analysis [51] and a two-step epigenetic Mendelian randomization (MR) analysis, which uses genetic variants as instrumental variables to assess the mediation role of DNAm between cocaine use and HIV severity [52]. Our results provide new insights into the role of DNAm in how cocaine affects HIV severity.

Study samples
VACS is a prospective cohort study of veterans designed to study substance use and HIV-related outcomes with patient surveys, electronic medical records, and biospecimen data [53]. A baseline survey was conducted at enrollment [53]. The follow-up survey of 5 visits occurred at approximately 1-year intervals [53]. Blood samples were collected in the middle of follow-up for a subset of participants in the cohort (VACS-BC) [54]. A total of 1435 HIV-positive participants from the VACS-BC were used to examine cocaine's effect on mortality and HIV severity, and a subset of participants (n = 875) with DNAm data available were used for mediation analyses (Fig. 1). Demographic and clinical information of baseline samples and a subset of the samples at the time of blood collection are summarized in Table 1.

Assessment of cocaine use
The timeline of cocaine use assessment for each analysis is illustrated in Fig. 1. Information on cocaine use status was self-reported through telephone interviews for a total of 5 visits. We defined the "persistent cocaine use" group as self-reported cocaine use across all 5 visits and the "no cocaine use" group as self-reported no cocaine use across all 5 visits. This definition yielded a subset of 265 persistent cocaine users and 202 nonusers for the mediation analyses. It excluded participants with inconsistent responses across the 5 visits and allowed us to examine the effect of long-term cocaine exposure on DNAm and HIV severity.
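The group definition above can be sketched as a small classification rule. This is only an illustration: the function name and the treatment of a missing report as "inconsistent" are assumptions, since the text does not say how missing visits were handled.

```python
from typing import Optional, Sequence


def classify_cocaine_use(visits: Sequence[Optional[bool]]) -> Optional[str]:
    """Classify one participant from self-reports at 5 visits.

    Each entry is True (reported use), False (reported no use), or None
    (missing). Returns "persistent" if use was reported at every visit,
    "none" if no use was reported at every visit, and None otherwise.
    Treating a missing report as grounds for exclusion is an assumption.
    """
    if len(visits) != 5 or any(v is None for v in visits):
        return None
    if all(visits):
        return "persistent"
    if not any(visits):
        return "none"
    return None  # mixed reports: excluded from the mediation analyses
```

Participants returned as `None` correspond to those left out of the 467-person mediation subset.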
The frequency of cocaine use was also assessed at baseline (Fig. 1). Each participant was asked "How often in the past year have you used cocaine or crack?", from which cocaine frequency of use was coded as an ordinal variable as follows: 0 = never tried, 1 = no use in the last year, 2 = less than once a month, 3 = 1-3 times a month, 4 = 1-3 times a week, and 5 = 4 or more times a week.

Assessment of mortality and HIV severity
The timeline of HIV severity measurement and survival information for each analysis is shown in Fig. 1. Mortality and survival year information were based on medical records. The VACS index was used as a measure of HIV severity [55][56][57][58] and was obtained at each visit and at the time of blood collection (Fig. 1). The VACS index was calculated by summing preassigned points for age, routinely monitored indicators of HIV disease (CD4 count and HIV-1 RNA), and other general indicators of organ system injury [55]. A high VACS index corresponds to worsened HIV outcomes, and the VACS index is positively associated with increased mortality [59]. The VACS index and DNAm profiling were measured at the same time for the selection of candidate mediator CpGs, and the average VACS index after blood collection was used for mediation analyses (Fig. 1).

DNA methylation profiling and quality control
DNA samples were extracted from blood for a subset of 875 HIV-positive participants (Fig. 1). DNAm was profiled using two different methylation arrays, with 475 samples profiled by the Infinium Human Methylation 450K BeadChip (HM450K, Illumina Inc., CA, USA) and 400 samples later profiled by the Infinium Human Methylation EPIC BeadChip (EPIC, Illumina Inc., CA, USA) [54]. DNA samples were randomly selected for each methylation array regardless of cocaine use status or other clinical and demographic variables. The quality control (QC) for samples measured by each array was conducted separately using the same pipeline as previously described [60] with the R package minfi [61]. After QC, a total of 408,583 CpGs measured by both the HM450K and EPIC arrays remained for analysis. Six cell-type proportions (CD4+ T cells, CD8+ T cells, NK cells, B cells, monocytes, and granulocytes) were estimated for each participant using the established method [62]. Negative control probes were designed to capture background signals in Illumina arrays, and negative control principal components (PCs) were extracted by minfi to control for background noise [61]. Batch effect removal was conducted with ComBat after QC [63].

Genotyping and quality control
A total of 1177 samples were genotyped using the Illumina HumanOmniExpress BeadChip and imputed to 18,960,156 single nucleotide polymorphisms (SNPs). IMPUTE2 (ver 2.3.2) was used for imputation with the 1000 Genomes Phase 3 reference panel [64]. QC was conducted using PLINK (ver 1.90b21) [65]. SNPs and samples with low call rate less than 0.05 were removed. The Hardy-Weinberg equilibrium test cutoff was set to 1E−06. SNPs with minor allele frequency less than 0.01 were filtered.

Statistical analysis
Cocaine survival analysis among HIV-positive participants at baseline
Survival analysis was conducted using baseline information among 1435 HIV-positive participants with cocaine use frequency (0-5) and other covariates (Fig. 1). Kaplan-Meier analyses on 10-year follow-up among HIV-positive and HIV-negative participants by cocaine use frequency (0-5) at baseline were conducted, and the Kaplan-Meier curves were plotted by using the R package survminer [66]. A test on ordered differences of Kaplan-Meier curves by cocaine use frequency was conducted by survminer [66].
To adjust for confounding factors, a Cox proportional hazards model was used to assess the hazard ratio of baseline cocaine use frequency (0-5) on mortality during the follow-up using the R package survival [67]. The following model was used to calculate the adjusted hazard ratio among HIV-positive participants:

$$h(t) = h_0(t)\exp\bigl(\beta_1\,\text{cocaine use frequency} + \beta_2\,\text{sex} + \beta_3\,\text{baseline age} + \beta_4\,\text{race} + \beta_5\log_{10}(\text{viral load}) + \beta_6\,\text{CD4 count} + \beta_7\,\text{antiviral medication adherence}\bigr)$$

Association between cocaine use frequency and HIV severity among HIV-positive participants at baseline
This analysis was conducted using baseline information on cocaine use frequency (0-5), the VACS index, and other covariates (Fig. 1). The following linear regression model was used to test the association between cocaine use frequency and HIV severity, adjusting for confounders:

$$\text{HIV disease severity} = \beta_1\,\text{cocaine use frequency} + \beta_2\,\text{sex} + \beta_3\,\text{age} + \beta_4\,\text{race} + \beta_5\log_{10}(\text{viral load}) + \beta_6\,\text{CD4 count} + \beta_7\,\text{antiviral medication adherence}$$

Selection of candidate CpGs by epigenome-wide association (EWA) of persistent cocaine use and HIV severity
To select candidate CpGs for mediation analysis, we conducted two separate EWAs, one for persistent cocaine use and the other for HIV severity (Fig. 1). Each EWA model adjusted for sex, baseline age, race, smoking, self-reported antiviral medication adherence, white blood cell count, estimated cell-type proportions, and negative control PCs. We used the linear regression model with methylation as the dependent variable for EWA as described previously [33,60,68]. Since CD4+ T cell count is one component of the VACS index, to avoid overrepresentation of CpGs associated with CD4+ T cells in the EWA results, we extracted the top 1000 CD4+ T cell-type relevant CpGs based on data from FlowSorted.Blood.450k [69]. The top 2 PCs, which together account for > 80% of the variation in the 1000 CD4+ T cell CpGs, were used as covariates in the VACS index EWA model. CpGs with p < 0.001 in both EWAs for persistent cocaine use and HIV severity were selected as candidate CpGs for mediation analyses. A liberal selection threshold was arbitrarily set to make sure there would be a sufficient number of candidate CpGs for the mediation analysis. To limit confounding by use of other substances, we tested the association of each candidate CpG site with alcohol use, cannabis use, and opioid use based on self-reported data. Alcohol use was assessed by using the 3 items of the Alcohol Use Disorders Identification Test-Consumption (AUDIT-C). Cannabis and opioid use were assessed by asking the same questions as for cocaine use, described earlier.
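The dual-threshold selection rule can be illustrated with a short sketch. The function name is hypothetical, and the p values for `cpg_a` and `cpg_b` are made-up toy inputs; only the two p values for cg22917487 are taken from the Results of this paper.

```python
from typing import Dict, List


def select_candidate_cpgs(
    p_cocaine: Dict[str, float],
    p_vacs: Dict[str, float],
    cutoff: float = 0.001,
) -> List[str]:
    """Return CpGs whose p value is below the cutoff in BOTH EWA scans."""
    shared = p_cocaine.keys() & p_vacs.keys()
    return sorted(
        cpg for cpg in shared
        if p_cocaine[cpg] < cutoff and p_vacs[cpg] < cutoff
    )


# Toy p values (hypothetical) plus the reported values for cg22917487.
p_cocaine = {"cpg_a": 4.0e-4, "cpg_b": 2.0e-2, "cg22917487": 1.69e-07}
p_vacs = {"cpg_a": 9.0e-4, "cpg_b": 1.0e-5, "cg22917487": 1.73e-03}
candidates = select_candidate_cpgs(p_cocaine, p_vacs)
```

Under this rule alone, cg22917487 is not selected because its VACS index p value is above 0.001; the Results describe it as an additional candidate included for its biological plausibility.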

Single-site mediation analysis and joint mediation analysis
The selected candidate CpGs were assessed as potential mediators of the association between persistent cocaine use and HIV severity among HIV-positive participants (n = 467). We performed single-site mediation analysis using the mediation method as previously described [51] and the R package mediation [70]. Here, we used the average VACS index after DNAm profiling to ensure the temporality of our mediation hypothesis that DNAm measurement preceded the HIV severity measurement. In our mediation model, we adjusted for sex, age, race, smoking, self-reported antiviral medication adherence, white blood cell count, and estimated cell-type proportions as confounding factors.
We used M to represent the candidate CpG (mediator), X to represent persistent cocaine use status (exposure), Y to represent the average VACS index after blood collection (outcome), and $C_i$ ($i = 1, \ldots, k$) to represent the k confounding variables (sex, age, race, smoking, self-reported antiviral medication adherence, white blood cell count, and estimated CD8 T cells, granulocytes, NK cells, B cells, and monocytes). The mediator model f(M | X, C) examined the association between persistent cocaine use and the CpG, and the outcome model f(Y | X, M, C) examined both the direct effect of persistent cocaine use on the VACS index and the mediation effect of the CpG. With $\beta_0$ denoting the coefficient of X in the mediator model, and $\alpha_0$ and $\alpha_1$ denoting the coefficients of X and M in the outcome model, the mediation effect, or average causal mediation effect (ACME), of CpG M was $\alpha_1\beta_0$, the total effect was $\alpha_0 + \alpha_1\beta_0$, and the proportion mediated was $\alpha_1\beta_0 / (\alpha_0 + \alpha_1\beta_0)$. The confidence interval and p value were estimated by bootstrapping with 1,000,000 iterations.
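The product-of-coefficients calculation can be sketched in a few lines. This is a simplified illustration, not the R mediation package used in the study: it assumes linear mediator and outcome models, omits the confounders, and uses a percentile bootstrap with far fewer iterations. All function names are hypothetical.

```python
import random


def ols(columns, y):
    """Least-squares coefficients for y ~ 1 + columns (intercept first)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations A b = c with A = X'X and c = X'y.
    A = [[sum(X[i][r] * X[i][s] for i in range(n)) for s in range(k)]
         for r in range(k)]
    c = [sum(X[i][r] * y[i] for i in range(n)) for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
                c[r] -= f * c[col]
    return [c[r] / A[r][r] for r in range(k)]


def mediation(x, m, y):
    """Return (ACME, direct effect, proportion mediated)."""
    beta0 = ols([x], m)[1]              # mediator model: effect of X on M
    _, alpha0, alpha1 = ols([x, m], y)  # outcome model: X and M on Y
    acme = alpha1 * beta0
    return acme, alpha0, acme / (alpha0 + acme)


def bootstrap_ci(x, m, y, n_boot=500, seed=0):
    """Percentile 95% interval for the ACME from resampled participants."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(mediation([x[i] for i in idx],
                               [m[i] for i in idx],
                               [y[i] for i in idx])[0])
    draws.sort()
    return draws[int(0.025 * n_boot)], draws[int(0.975 * n_boot) - 1]
```

In practice the confounders would enter both regressions as extra columns, which `ols` already accepts.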
To assess the robustness of the results if the sequential ignorability assumption was violated, we conducted a sensitivity analysis developed by Imai et al. [51] using the R package mediation [70]. Sequential ignorability consists of two assumptions: (a) conditional on the covariates $C_i$, the exposure X was independent of all potential values of the outcome Y and mediator M; and (b) the observed mediator M was independent of all potential outcomes Y given the observed exposure X and covariates $C_i$. The sensitivity parameter ρ was calculated on a grid of 0.05, and the ρ at which ACME = 0 was calculated. For each mediator, sensitivity plots show the estimated ACME and its 95% confidence interval as a function of ρ (Figure S2). A ρ at which ACME = 0 that is close to 0 indicates that the mediation analysis is sensitive to violation of the sequential ignorability assumption.
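The joint mediation analysis of all significant mediator CpGs was conducted as previously described [71], using a mediator model $f(M_j \mid X, C)$ for each of the multiple mediators $M_j$ ($M_1, M_2, \ldots, M_n$) and an outcome model $f(Y \mid X, M_1, \ldots, M_n, C)$. With $\beta_{0j}$ denoting the coefficient of X in the mediator model for $M_j$, $\alpha_j$ the coefficient of $M_j$ in the outcome model, and $\alpha_0$ the coefficient of X in the outcome model, the joint mediation effect of CpGs $M_1, \ldots, M_n$ is $\sum_{j=1}^{n}\alpha_j\beta_{0j}$, the total effect is $\alpha_0 + \sum_{j=1}^{n}\alpha_j\beta_{0j}$, and the proportion mediated is

$$\frac{\sum_{j=1}^{n}\alpha_j\beta_{0j}}{\alpha_0 + \sum_{j=1}^{n}\alpha_j\beta_{0j}}.$$

The confidence interval and p value were estimated by bootstrapping with 1,000,000 iterations.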

Two-step epigenetic Mendelian randomization of cocaine and HIV severity
To evaluate whether the results from the mediation analysis were influenced by reverse causation or unmeasured confounding, we conducted a two-step epigenetic MR analysis [52] (n = 1177) on cocaine use, candidate mediator CpGs, and HIV severity using the inverse-variance weighted (IVW) method with the R package MendelianRandomization [72].
In step 1, we conducted a two-sample MR on the effect of cocaine use on candidate CpGs (n = 1177). Based on a recent meta-analysis of a cocaine dependence genome-wide association study (GWAS) [73], 8 SNPs (p < 1E−05) that were genotyped in our samples and pruned at linkage disequilibrium (LD) r² < 0.1 using the R package LDlinkR [74] were used as instrumental variables (Table S4). We tested the associations between the 8 SNPs and the candidate CpGs, adjusting for age, sex, race, and 5 ancestry PCs using a linear regression model in our sample (n = 1177). Based on these summary statistics, we conducted MR using the IVW method to evaluate the effect of cocaine use on candidate CpGs.
In step 2, we conducted a one-sample MR on the effect of candidate CpGs on HIV severity (n = 1177). Here, cis-methylation quantitative trait loci (meQTLs) were used as instrumental variables. cis-meQTLs were defined as SNPs within 1 Mb of a candidate CpG. A linear regression analysis was performed to identify cis-meQTLs, adjusted for age, sex, race, and 5 ancestry PCs. For each candidate CpG, cis-meQTLs with p < 0.01 after pruning with the R package LDlinkR [74] (LD r² < 0.1, using 1000 Genomes African ancestry samples as the reference [75]) were used as instrumental variables in the MR analysis (Table S4). Association between each cis-meQTL and HIV severity was assessed by linear regression, adjusting for age, sex, and 5 ancestry PCs. Similar to the first step, we conducted an MR using the IVW method to evaluate the effect of candidate CpGs on HIV severity.
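For readers unfamiliar with the IVW estimator used in both steps, a minimal sketch of its standard fixed-effect form follows. It assumes per-SNP summary statistics and first-order weights; the function name is hypothetical, and the text does not state which variant or options of the R package were used.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_exposure: SNP-exposure effect sizes (one per instrument)
    beta_outcome:  SNP-outcome effect sizes
    se_outcome:    standard errors of the SNP-outcome effects
    Returns (estimate, standard error).
    """
    num = sum(bx * by / (se * se)
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx * bx / (se * se)
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)
```

In the two-step design, the step 1 call would use SNP effects on cocaine dependence as the exposure and SNP effects on a CpG as the outcome; the step 2 call would use cis-meQTL effects on the CpG as the exposure and their effects on the VACS index as the outcome.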

Results
Cocaine use affects HIV severity and mortality among HIV-positive participants
We found that among HIV-positive participants, higher cocaine use frequency was associated with increased mortality (p = 0.008, Fig. 2a). This difference was not found among HIV-negative participants (p = 0.180, Fig. 2b). Using a Cox proportional hazards model, this trend remained significant with a hazard ratio (HR) of 1.10 (95% CI 1.02-1.19, p = 0.011), controlling for sex, baseline age, race, viral load, CD4 count, and antiviral medication adherence (Table 2). A higher frequency of cocaine use at baseline was also significantly associated with a higher VACS index (i.e., higher HIV severity, β = 1.00, p = 0.00027) after adjusting for sex, age, race, viral load, CD4 count, and antiviral medication adherence (Table 2). To account for other drug use, we further adjusted for baseline use of alcohol, cigarette smoking, cannabis, and opioids in the model. Cocaine use frequency remained significantly associated with HIV severity after adjusting for use of other substances (p = 0.049). Our results suggest that cocaine use accelerated HIV progression and increased mortality independent of antiviral medication adherence, which is consistent with previous reports [10,11,13-15].
The EWA scan of persistent cocaine use showed good control of inflation (λ = 1.034, Figure S1). A total of 497 CpGs met our candidate selection threshold (p < 0.001). The top ranked CpG site, cg22917487, was close to the epigenome-wide significance threshold with a p value of 1.69E−07. This CpG site is located in CX3CR1, a gene that encodes a coreceptor for HIV-1 and leads to rapid HIV progression (Table S1).
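The inflation factor λ reported for the EWA scan is conventionally the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). The sketch below uses that standard definition; the text does not describe how λ was computed, so treating it this way is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor from two-sided p values in (0, 1]."""
    nd = NormalDist()
    # A two-sided p value maps to a 1-df chi-square via chi2 = z^2,
    # where z is the normal quantile at p / 2.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.25) ** 2  # about 0.4549
    return median(chi2) / expected_median
```

Values near 1, such as the reported 1.034, indicate little systematic inflation of the test statistics.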
We selected candidate CpGs that were associated with both cocaine use and HIV severity (p < 0.001) in two separate EWA scans of 408,583 CpGs for mediation analysis. Fourteen CpGs met both candidate selection thresholds. Additionally, cg22917487 in CX3CR1 showed a strong association with cocaine (p = 1.69E−07) and a marginal association with the VACS index (p = 1.73E−03). Given its biological plausibility, this CpG was also included as a candidate mediator for mediation analysis. Five of the top 10 VACS index EWA CpGs were selected as candidate mediator CpGs (cg08122652, PARP9, p = 2.30E−10; cg03038262, IFITM1, p = 7.65E−09; cg06188083, IFIT3, p = 4.76E−08; cg08818207, TAP1, p = 2.11E−07; cg26312951, MX1, p = 2.50E−07). Overall, a total of 15 CpGs were selected as candidates to assess their potential mediation roles in the association between persistent cocaine use and HIV severity. Notably, DNAm at each of the 15 CpGs was not associated with cannabis, opioid, or alcohol use (p > 0.05, Table S3).

Mediation analysis of candidate CpGs between persistent cocaine use and HIV severity
We examined the mediation role of DNAm between persistent cocaine use and HIV severity. Twelve out of the 15 candidate CpGs showed significant mediation effects on the association between persistent cocaine use and the VACS index, with p values ranging from 1.00E−06 to 0.003 (Table 4). These results remained significant after Bonferroni correction (p < 0.003). Each CpG mediator explained between 11.3 and 29.5% of the effect of persistent cocaine use on HIV severity. Notably, the direction of the mediation effects was the same for all 12 mediator CpGs. The average direct effect of cocaine on HIV severity was attenuated from 0.329 to 0.231-0.291 after adjusting for each mediator CpG. These 12 CpGs collectively mediated 47.2% of cocaine's effect on HIV severity by joint mediation analysis.
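As a rough consistency check on these figures, the short sketch below recomputes the Bonferroni threshold for 15 tests and the direct effect implied by each proportion mediated. It assumes the usual linear decomposition (direct effect = total effect minus mediated effect) and uses only rounded values from the text, so small differences from the reported numbers are expected. The function names are hypothetical.

```python
TOTAL_EFFECT = 0.329  # effect of persistent cocaine use before adjusting for a mediator


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def direct_effect(total: float, proportion_mediated: float) -> float:
    """Direct effect left after removing the mediated share of the total."""
    return total * (1.0 - proportion_mediated)


threshold = bonferroni_threshold(0.05, 15)        # about 0.0033
weakest = direct_effect(TOTAL_EFFECT, 0.113)      # about 0.292
strongest = direct_effect(TOTAL_EFFECT, 0.295)    # about 0.232
joint_indirect = TOTAL_EFFECT * 0.472             # about 0.155
```

The two single-mediator values fall within rounding of the reported range of 0.231-0.291, and the joint analysis corresponds to a mediated effect of roughly 0.155 out of 0.329.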
We also conducted a sensitivity analysis on these 15 candidate CpGs to assess the robustness of our mediation analysis when the sequential ignorability assumption was violated [51]. The absolute sensitivity parameters at which ACME = 0 were higher for the 12 significant mediator CpGs (|ρ| ≥ 0.15) than for the 3 nonsignificant CpGs (|ρ| ≤ 0.10) (Table 4, Figure S2). Notably, 6 significant mediator CpGs had |ρ| of 0.30, indicating that these mediation effects were robust even when the assumptions were slightly violated. The sensitivity analysis showed that our mediation results were relatively stable.

Two-step epigenetic Mendelian randomization of cocaine and HIV severity
To validate our mediation results while eliminating unmeasured confounding and reverse causation, we used the two-step epigenetic MR method [52] to test our mediation hypotheses (n = 1177): whether cocaine use has a causal effect on candidate CpGs (step 1) and whether candidate CpGs have causal effects on HIV severity (step 2). In step 1, we conducted the MR analysis based on summary statistics of a meta-analysis of GWAS on cocaine dependence [73]. The effect estimates of the association between 8 SNP instrumental variables and 15 candidate CpG sites were obtained in our sample. Our MR analysis showed that cocaine had significant MR estimates (p < 0.05) on 4 CpGs (cg03753191, EPSTI1; cg06188083, IFIT3; cg26312951, MX1; cg22917487, CX3CR1), as shown in Table 5. Three of these CpGs were also among the top significant mediators in our previous mediation analysis (Table 4).
In step 2, we conducted the MR analysis based on cis-meQTLs of the candidate CpGs and their association with HIV severity in our sample. Seven CpGs showed significant MR estimates on HIV severity (Table 5). Of note, 3 significant CpGs in the MR analysis in step 1 were also significant in step 2.
Overall, 3 mediator CpGs discovered by the mediation analysis were validated as significant mediators by two-step epigenetic MR analysis (cg03753191, EPSTI1; cg06188083, IFIT3; cg26312951, MX1). Three CpGs without significant mediation effects in the mediation analysis were also found to be nonsignificant in the two-step MR analysis (cg26396492, RIN2; cg22385827, C2orf67; cg08623256).

Discussion
Our findings provide evidence that cocaine use worsens HIV severity and increases mortality among HIV-positive participants and that cocaine's adverse effects are partially mediated by DNAm in the blood. We identified 12 CpGs that collectively accounted for 47.2% of cocaine's effect on HIV severity. Three of the 12 mediator CpGs were further validated by a two-step epigenetic MR approach, which provides supporting evidence that our mediation results were not affected by unmeasured confounders or reverse causation. The sensitivity analysis showed that our mediation analyses are relatively robust to slight violation of assumptions. These 12 mediator CpGs offer new insights into the mechanisms by which cocaine use may affect HIV outcomes through DNAm.
Methodological considerations are important for examining the mediation effect of DNAm. It is possible that our mediation analyses could be undermined by violation of model assumptions, reverse causation, and unmeasured confounding. To address these concerns, we performed the sensitivity analysis and two-step epigenetic MR analysis to further evaluate the mediation effects of the 12 CpG sites. The results from the sensitivity test showed that the 12 mediator CpGs were robust to slight violation of the sequential ignorability assumption. The two-step epigenetic MR analysis confirmed 3 of the 12 CpG sites as mediators of cocaine's effect on HIV severity and was not affected by reverse causation and unmeasured confounding. Of note, the 8 SNPs used in the MR analysis showed marginal association with cocaine use, which limited their utility as instrumental variables and may explain why 9 CpG sites did not show significant mediation effects in the two-step epigenetic MR analysis. In addition, in the mediation analysis, HIV severity was measured after the blood collection for DNAm profiling to ensure that the measurement of the mediator preceded the measurement of the outcome. Our study design was intended to match the temporality of exposure, mediator, and outcome and to avoid reverse causation. Of note, we observed a discrepancy in the direction of the effect of cocaine use on DNA methylation between the EWA scan and the step 1 MR analysis. This may happen because the EWA scan assessed association, whereas MR evaluated the causal effect by removing reverse causality. This difference might also be due to different ways of adjusting for confounding factors in the two models. Additionally, to assess whether cocaine use influenced cell-type proportions as reflected by DNA methylation, we conducted an MR analysis of the effect of cocaine on six cell-type proportions using the same SNP instruments as used in the step 1 MR. We found no significant MR estimates across the six cell types (p > 0.1) (Table S5), suggesting that cocaine use does not directly affect cell-type proportions in our sample. Overall, we took various measures to make sure our mediation results are valid and robust.

Table 3 The selected candidate CpG sites by epigenome-wide association (EWA) scan on persistent cocaine use (n = 467) and HIV severity (n = 875)

We observed that the sum of the individual mediation proportions for the 12 mediator CpGs exceeded 100%. An alternative approach is to test the joint mediation effect of all mediators [71]. We found that the 12 mediator CpGs jointly accounted for 47% of the total effect (effect size = 0.329) of cocaine use on HIV severity. This finding indicates that these mediators may affect one another or that there is an interaction effect [71]. For example, several mediator CpG sites are near genes on the response to cytokine pathway (including PARP9, PLSCR1, CX3CR1, and IFITM1).
These 12 CpGs are located in or near 11 biologically meaningful genes that were previously reported to be involved in inflammation, HIV-1 viral replication, and other pathways that play critical roles in HIV progression. Specifically, cg06188083 on IFIT3 mediated 28.8% of the variation, and IFIT3 encodes an IFN-induced antiviral protein which acts as an inhibitor of viral processes and viral replication [76]. Another significant mediator CpG site, cg03038262, is located near the interferon-induced gene IFITM1. We previously reported the hypomethylation of cg07839457 due to HIV infection, which is located in the promoter region of NLRC5 [33]. This CpG site was also a significant mediator between cocaine and HIV severity in this study. NLRC5 plays an important role in the cytokine response and antiviral immunity through its inhibition of NF-kappa-B activation and negative regulation of type I interferon signaling pathways [77]. The converging evidence on cg07839457 (NLRC5) warrants further investigation of its role in HIV infection and progression. Another interesting CpG site, cg22917487 on CX3CR1, showed both a strong association with persistent cocaine use and a significant mediation effect of cocaine on HIV severity. CX3CR1 is involved in leukocyte adhesion and migration and was recently identified as an HIV-1 coreceptor [78]. Some studies also showed that genetic variants on CX3CR1 were associated with HIV susceptibility and rapid HIV progression to AIDS [79]. cg25114611, located in the promoter region of FKBP5, is also biologically plausible, given evidence that chronic cocaine administration upregulates FKBP5 expression in rats [80].
Cocaine use commonly co-occurs with the use of other substances, which may confound cocaine's effects on HIV severity and the mediation effects of CpGs between cocaine use and HIV outcomes. However, our results show that the association between cocaine use and HIV severity remained significant after accounting for smoking, alcohol, cannabis, and opioid use. Additionally, our cocaine use EWA model adjusted for smoking as a covariate, and the selected candidate CpGs were not associated with alcohol, cannabis, or opioid use (Table S3).
There are several strengths of this study. First, instead of selecting candidate mediator CpGs based on the literature or hypotheses, we applied an unbiased epigenome-wide screening to select CpGs associated with both cocaine use and HIV severity. Second, to limit self-reporting bias of cocaine use, we leveraged longitudinal data in defining persistent cocaine use and no cocaine use. We included only those participants who consistently reported cocaine use or no cocaine use across all 5 visits for the selection of candidate CpGs and the mediation analyses. Last, we used the average VACS index after blood collection so that DNAm measurements (mediator) preceded HIV severity (outcome) for the mediation analyses.
One limitation of the study is that our sample size for the mediation analyses is small. However, the strict definition of cocaine use helped reduce self-reporting bias and can potentially increase power by comparing extreme groups. In addition, because the sample size was too limited to achieve epigenome-wide significance, we used a less stringent criterion when selecting candidate CpGs for mediation analysis. To our knowledge, there are no sufficiently sized independent cohorts for replication. Although this approach has also been adopted by previous studies [42,81], using epigenome-wide significant CpG sites as candidate mediators may show stronger signals in future studies with larger sample sizes. Additionally, other unmeasured confounding factors such as socioeconomic status may not be fully addressed in the mediation model. Lastly, our samples consisted of mostly male veterans, which may limit the generalizability of our findings.

Conclusions
We validated previous reports that the use of cocaine worsened HIV severity and increased the risk of all-cause mortality among HIV-positive participants. For the first time, this study found that several biologically meaningful DNAm sites mediated the adverse effect of cocaine use on HIV severity. These results merit future studies to further explore the biological mechanisms revealed by these DNAm sites on how cocaine affects HIV disease outcomes.