Parent-of-origin-specific allelic expression in the human placenta is limited to established imprinted loci and it is stably maintained across pregnancy

Background Genomic imprinting, mediated by parent-of-origin-specific epigenetic silencing, adjusts the gene expression dosage in mammals. We aimed to clarify parental allelic expression in the human placenta for 396 claimed candidate imprinted genes and to assess the evidence for the proposed enrichment of imprinted expression in the placenta. The study utilized RNA-Seq-based transcriptome and genotyping data from 54 parental-placental samples representing the three trimesters of gestation, and term cases of preeclampsia, gestational diabetes, and fetal growth disturbances. Results Almost half of the targeted genes (n = 179; 45%) were either not transcribed or showed limited expression in the human placenta. After filtering for the presence of common exonic SNPs, adequacy of sequencing reads and informative families, 91 genes were retained (43 loci from the Geneimprint database; 48 recently proposed genes). Only 11/91 genes (12.1%) showed confident signals of imprinting (binomial test, Bonferroni corrected P < 0.05; > 90% transcripts originating from one parental allele). The confirmed imprinted genes exhibit enriched placental expression (PHLDA2, H19, IGF2, MEST, ZFAT, PLAGL1, AIM1) or are transcribed additionally only in the adrenal gland (MEG3, RTL1, PEG10, DLK1). Parental monoallelic expression showed extreme stability across gestation and in term pregnancy complications. A distinct group of 14 additional genes exhibited a statistically significant bias in parental allelic proportions, defined as having 65–90% of reads from one parental allele (e.g., KLHDC10, NLRP2, RHOBTB3, DNMT1). Molecular mechanisms behind biased parental expression are still to be clarified. However, 66 of 91 (72.5%) analyzed candidate imprinted genes showed no signals of deviation from biallelic expression.
Conclusions As placental tissue is not included in the Genotype-Tissue Expression (GTEx) project, the study helps to fill the gap in the knowledge concerning parental allelic expression. A catalog of parental allelic proportions and gene expression of analyzed loci across human gestation and in term pregnancy complications is provided as additional files. The study outcome suggests that true imprinting in the human placenta is restricted to well-characterized loci. High expression of imprinted genes during mid-pregnancy supports their critical role in developmental programming. Consistent with the data on other GTEx tissues, the number of human imprinted genes appears to be overestimated. Electronic supplementary material The online version of this article (10.1186/s13148-019-0692-3) contains supplementary material, which is available to authorized users.


Introduction
Genomic imprinting is a unique feature implicated in fine-tuning the dosage of gene expression in mammals. It is defined as an exclusive expression of either paternally or maternally derived allele of a gene, while the other allele is silenced via epigenetic reprogramming of germ cells in utero [1][2][3]. The majority of imprinted loci are localized within gene clusters and the expression of either maternal or paternal set of genes is tightly coordinated at the genomic level. For some specific tissues, such as the placenta, additional ungrouped "singleton" imprinted genes have been reported [4]. Failure in programming genomic imprints may cause severe developmental disorders and fetal growth disturbances [3,5].
Analyses of human imprinted genes have been facilitated by the advanced 'omics' toolsets [6][7][8]. Two recent RNA sequencing (RNA-Seq) based analyses utilizing the Genotype-Tissue Expression (GTEx) dataset across diverse sets of human post-mortem tissues from 178 individuals reported only 12 and 17 novel imprinting candidate genes, respectively [7,8]. The overall number of identified imprinted human genes was lower than initially thought, only 72 (42 high-confident, 30 suggestive) genes and 93, respectively. The data also showed a widespread tissue specificity of imprinting and/or variable maintenance of imprinted status among loci across tissues [7,8].
Although the underlying reasons for imprinting and its 'rationale' in genome function remain debated, it is generally accepted that this phenomenon arose in parallel with the evolution of the mammalian placenta [1,9]. Consistent with the evolutionary context, well-known imprinted genes are critical in regulating human placental function and fetal development, including tissue-specific imprinted microRNA clusters [9][10][11]. Recent studies applying genome-wide allelic DNA methylation analyses of human placentas have suggested a potential organ-specific enrichment of imprinted genes, highly variable imprinting, and possible polymorphic silencing of preferably maternal gene alleles [6,12,13,14]. Whereas DNA methylation-based studies are valuable tools to identify loci exhibiting either maternal or paternal allele-specific methylation as indicative markers of imprinting, RNA-Seq enables direct assessment of transcripts exhibiting parent-of-origin-specific allelic expression. As placental tissues are not included in GTEx, the analysis of parental transcripts in the human placenta has been lagging behind. So far, only two small-scale RNA-Seq studies have profiled parent-of-origin expression and reported novel imprinting candidate genes in either human term placentas (n = 10, [15]) or early pregnancy chorionic villus samples (n = 21; [14]). However, some of these claims were based on single samples, the applied criteria to define imprinting varied between the studies, and the majority of the newly reported candidate imprinted loci have not been identified as imprinted genes in other tissues. Thus, uncertainties and contradictions remain among the claims regarding the landscape of imprinting in the human placenta, and there has been a lack of transcriptome-based studies analyzing adequate numbers of parental-placental samples.
The current study aimed to clarify the parental allelic expression status in the human placenta for nearly 400 claimed candidate imprinted genes and to confirm (or reject) the evidence for the suggested enrichment of imprinted genes in placental transcriptome compared to other tissues. The study utilized RNA-Seq-based placental transcriptome data and the corresponding genotyping data from 54 parental-placental samples collected from all three trimesters of gestation, as well as term cases of preeclampsia, gestational diabetes, and fetal growth disturbances. Among 91 tested genes with adequate placental expression and available sequencing data from at least 3 informative family trios/duos, only 11 genes showed high-confidence imprinting signals, i.e., nearly monoallelic parent-of-origin determined allelic expression. Additional 14 genes exhibited transcript profiles consistent with biased proportions of parental alleles. The majority, 66 of 91 (72.5%) analyzed candidate imprinted genes were convincingly detected to be expressed in the human placenta in a biallelic manner.

Methods
Datasets of parental-placental trio or maternal-placental duo samples
The study exploited 54 previously published placental RNA-Seq datasets [16][17][18] and the corresponding genome-wide genotyping data of placental and respective parental blood samples [19,20]. The dataset comprised 38 parental (mother, father)-placental trios and 16 maternal-placental duos (Table 1). Placental and parental blood samples of singleton term pregnancy cases (delivery ≥ 37th gestational week) had been collected at the delivery room during the REPROMETA study (Additional file 1: Supplementary Methods). The recruited term pregnancy groups represented cases of uncomplicated gestation (normal third trimester), maternal preeclampsia (PE), gestational diabetes (GD), and delivery of a small- (SGA, < 10th birth weight centile) or large-for-gestational-age (LGA, > 90th centile) newborn according to national guidelines [21]. The dataset analyzed in the current study included 38 term pregnancy trios and 2 duos (paternal DNA samples unavailable), delivered at median gestational age (g.a.) 275.5 [260-291] days (Additional file 2: Table S1). Each group (normal third trimester; PE, GD, SGA, and LGA) was represented by eight cases that were matched for gestational age. An additional 14 maternal-placental duos represented 8 pregnancies electively terminated by surgery during the first trimester (60 [51-81] gestational days (g.d.)) and 6 medically induced abortions during the second trimester due to maternal health indications (138 [126-167] g.d.) (Additional file 3: Table S2). Gross chromosomal abnormalities in the analyzed placentas had been excluded by placental karyotyping. For the second trimester terminated pregnancies, fetal anomalies were excluded by the pathology specialist assessment.

Placental sampling, RNA sequencing, and genotyping
Placental sampling, RNA extraction, sequencing procedures, and bioinformatic processing have been described in detail previously [16][17][18] and are provided in Additional file 1: Supplementary Methods. Briefly, for term and second-trimester pregnancy placentas, the sampling was performed through all layers of the middle region of the placenta. Samples of the first trimester placentas were obtained immediately after surgical termination of pregnancy. The maternal tissue was removed under a stereomicroscope (Discovery V8, Zeiss) and chorionic villi containing both cyto- and syncytiotrophoblast cells were sampled. For DNA studies, the placental or chorionic villus samples were placed immediately into a dry cryovial, and for the RNA studies into RNAlater solution (Thermo Fisher Scientific, Waltham, MA, USA). The samples were kept at −80 °C until DNA/RNA isolation.
Total placental RNA was extracted using TRIzol reagent (Invitrogen, Life Technologies) and purified with RNeasy MinElute columns (Qiagen, Netherlands). rRNA depletion, preparation of RNA-Seq sequencing libraries, sequencing of transcriptomes (Illumina HiSeq2000) and basic bioinformatic processing of the raw sequencing data (QC, read alignment and transcript and gene expression estimation) were performed according to the established pipeline at the Sequencing Unit of Finnish Institute of Molecular Medicine (FIMM), University of Helsinki, Finland. Initial data analysis was conducted using the in-house RNA-Seq pipeline v2.4 (FIMM). Sequencing reads were filtered for the quality, the presence of the adaptor, rRNA, and mtDNA sequences, as well as homopolymer stretches using custom python scripts. Read alignment to human genome assembly (GRCh37.p7/hg19) was performed with TopHat version 2.0.3 [22] and read counts per gene were estimated using htseq-count [23], based on reference annotations from Ensembl v67 [24]. To compare expression among genes, transcript levels were additionally quantified as FPKM (fragments per kilobase per million), implemented in Cufflinks v 2.0.2 [25]. The complete dataset across 54 placental transcriptomes consisted of 2.28 billion paired-end reads (mean 42.3 million per sample; range 27.3-74.6 million) with an average alignment success rate of 82.6% (range 56.2-87.3%). Median estimate for the fraction of RNA originating from maternal cells was previously calculated to be 0.93% [16].
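The FPKM normalization mentioned above can be illustrated with a minimal sketch. Cufflinks estimates abundances with its own statistical model, so the function below only shows the basic length- and depth-normalization idea; the function name and the example numbers are hypothetical and are not taken from the study.

```python
def fpkm(fragments_in_gene, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    Illustrative only: normalizes a raw fragment count by transcript
    length (in kilobases) and by sequencing depth (in millions).
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # count / (length / 1e3) / (depth / 1e6) == count * 1e9 / (length * depth)
    return fragments_in_gene * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2 kb transcript in a library
# of 40 million mapped fragments.
example_fpkm = fpkm(500, 2000, 40_000_000)
```

Because both length and depth are divided out, values computed this way can be compared between genes within a sample, which is the purpose stated in the text.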
Placental and blood genomic DNA was genotyped using Illumina HumanOmniExpress-12-v1/24-v1 BeadChips (> 715,000 markers with median spacing 2.1 kb) [19,20]. In the current study, we only analyzed exonic SNPs mapped in imprinted candidate genes and with minor allele frequency (MAF) > 10%. Genotype distributions of all analyzed SNPs were in Hardy-Weinberg equilibrium (P > 0.05).

Table 1 footnotes (see also Additional file 2: Table S1 and Additional file 3: Table S2): a Chorionic villi from first trimester placentas were sampled after elective surgical termination of pregnancy. Samples of second trimester placentas were derived from cases of medically induced abortion due to maternal health indications. C-sect, Cesarean delivery; GD, gestational diabetes; LGA, delivery of a large-for-gestational-age newborn; normal, term pregnancy without any maternal or fetal complications; PE, preeclampsia; SGA, delivery of a small-for-gestational-age newborn; F, female; M, male; n, number; n.a., not available; Mat, maternal blood sample; Pat, paternal blood sample; Pl, placental sample

Formation and filtering of the candidate imprinted gene list
The list of human genes predicted to exhibit parent-of-origin determined allelic expression was retrieved from the Geneimprint database, last accessed May 25, 2018 (n = 300) [26]. The list was supplemented with 96 recently reported novel candidate imprinted genes in the human placenta [6,14,15]. As polymorphic imprinted transcripts were not targeted in this study, the analyzed gene list did not include the respective proposed candidate loci [12,13]. The total number of genes entering the analysis pipeline was 396 (Additional file 4: Table S3).
To determine the parental origin of analyzed placental transcripts with high confidence, a stringent filtering pipeline and data QC were applied (Additional file 4: Table S3). The first step included checking the gene annotations in the human genome assembly (GRCh37.p7/hg19) and assessing the sufficiency of placental gene expression using an empirically assigned threshold (genes with median normalized expression < 50 reads across all samples were considered insufficiently expressed [16]). For the retained 207 genes, the Ensembl Biomart tool [27] was used to identify common (1000 Genomes Project dataset: MAF > 10%) biallelic exonic SNPs within the available parental-placental genotyping dataset (dropout 9 genes). Custom scripts were developed to identify informative family trios/duos for each SNP to assess the parental origin of the expressed transcripts. Family trios/duos were defined as informative if the placenta had a heterozygous genotype of the SNP and at least one of the parents had a homozygous genotype of this variant (Additional file 5: Figure S1). Retained SNPs had to be informative for at least 3 family trios/duos (dropout 47 genes). Next, the maternal and paternal read counts at the selected marker SNP positions for each gene were called from the placental RNA-Seq dataset of the informative families (BAM files). The Samtools mpileup command [28] was applied with the following parameters: -ABQ 0 (reference genome GRCh37.p7). Upon manual inspection of RNA-Seq reads visualized using the IGV 3.0 software [29], SNPs located within alternative exons overlapping with introns of the main transcript and SNPs with < 3 median reads at the variant position across all informative placentas were discarded (dropout 17 and 43 genes, respectively). The final analyzed dataset comprised 91 genes and 227 SNPs. It included 43 genes listed in the Geneimprint database and 48 genes derived from recent publications (9, 19, and 20 genes from ref. [6], ref. [15], and ref. [14], respectively).
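The rule for an informative family can be sketched in a few lines. The authors' custom scripts are not available in this text, so the code below is only one possible reading of the definition (heterozygous placenta, at least one homozygous parent); the function names, the two-letter genotype strings, and the handling of Mendelian inconsistencies are all assumptions made for illustration.

```python
def homozygous_allele(genotype):
    """Return the allele of a homozygous genotype such as "AA", else None."""
    if genotype and len(genotype) == 2 and genotype[0] == genotype[1]:
        return genotype[0]
    return None


def parental_alleles(placenta, mother, father=None):
    """Return (maternal_allele, paternal_allele) for an informative SNP.

    Returns None when the family is uninformative (placenta homozygous,
    or no homozygous parent) or when the genotypes are inconsistent.
    Pass father=None for a maternal-placental duo.
    """
    if len(placenta) != 2 or placenta[0] == placenta[1]:
        return None  # the placenta must be heterozygous
    het = set(placenta)
    mat = homozygous_allele(mother)
    pat = homozygous_allele(father)
    if mat is None and pat is None:
        return None  # no homozygous parent: origin cannot be resolved
    if (mat is not None and mat not in het) or (pat is not None and pat not in het):
        return None  # homozygous parent lacks both placental alleles
    if mat is not None and mat == pat:
        return None  # identical homozygous parents cannot give a heterozygote
    if mat is not None:
        (other,) = het - {mat}
        return mat, other
    (other,) = het - {pat}
    return other, pat


def split_reads(allele_counts, placenta, mother, father=None):
    """Assign per-allele read counts at a SNP to (maternal, paternal)."""
    origin = parental_alleles(placenta, mother, father)
    if origin is None:
        return None
    maternal, paternal = origin
    return allele_counts.get(maternal, 0), allele_counts.get(paternal, 0)


def snp_is_retained(families, min_informative=3):
    """Check the 'informative for at least 3 trios/duos' criterion.

    `families` is a list of (placenta, mother, father) genotype tuples.
    """
    informative = sum(parental_alleles(*f) is not None for f in families)
    return informative >= min_informative
```

In a duo, only the maternal genotype is available, so the paternal allele is inferred as the placental allele that the homozygous mother could not have transmitted.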

Analysis of parental transcript ratios and gene imprinting status
For each gene, the proportions of maternal (Mat) and paternal (Pat) reads across all samples were calculated and the outcome was expressed as Mat/Pat reads ratio along with the estimated 95% confidence interval (CI). The observed parental transcript ratios were statistically tested under the assumption that both alleles are expressed at equal levels, using binomial test implemented in R. Statistical significance level was defined P<0.05 after application of Bonferroni correction for the number of conducted tests (one test per gene, total 91). A gene was defined as imprinted if at least 90% of the RNA-Seq reads were assigned to only one parental allele, i.e., close to monoallelic expression in the parent-of-origin-specific manner. Among the rest of the genes with statistically significant deviation from the expected maternal/paternal transcript ratio, loci with ≥ 65%, but < 90% reads originating from one parental allele were defined to exhibit biased parental allelic expression. A gene was considered biallelic when the proportions of parental reads did not differ significantly from the expected ratio (P corr > 0.05) and/or the estimated proportions of both parental allelic reads fall within 35-65%.
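The classification rules above can be written out as a short sketch. The study used the binomial test implemented in R; the pure-Python exact test below is a stand-in that, for a null proportion of 0.5, doubles the smaller tail. The function names are hypothetical, and the 90% cut-off follows the "at least 90%" wording of this section (elsewhere the text uses "> 90%").

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided binomial P value for k of n reads under p = 0.5."""
    if n == 0:
        return 1.0
    tail = min(k, n - k)
    # The null distribution is symmetric, so the two-sided P value is
    # twice the smaller tail probability, capped at 1.
    lower = sum(comb(n, i) for i in range(tail + 1))
    return min(1.0, 2 * lower / 2 ** n)


def classify(mat_reads, pat_reads, n_tests=91, alpha=0.05):
    """Classify a gene as 'imprinted', 'biased', or 'biallelic'."""
    n = mat_reads + pat_reads
    if n == 0:
        raise ValueError("no informative reads")
    # Bonferroni correction: one test per gene, 91 genes in total.
    p_corr = min(1.0, binomial_two_sided_p(mat_reads, n) * n_tests)
    major_fraction = max(mat_reads, pat_reads) / n
    if p_corr < alpha and major_fraction >= 0.90:
        return "imprinted"
    if p_corr < alpha and major_fraction >= 0.65:
        return "biased"
    # Non-significant, or significant but within 35-65% for both alleles.
    return "biallelic"
```

Under these assumptions, a gene with 9 of 10 reads from one parent is still called biallelic, because its corrected P value does not reach significance; this is why adequate read depth and the multiple-testing correction matter as much as the percentage thresholds.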

Validation of the parental origin of transcripts
Validation of the maternal allelic expression of RTL1 was performed on three placental-parental trios informative for two SNP alleles using RT-PCR, cloning, and sequencing of the region. DLK1 served as a reference of a paternally expressed gene, and the PAPPA2 (RNA-Seq: biallelic expression) and RHOBTB3 (paternally biased expression) transcripts were cloned as positive controls for the capture of bi-parental expression, if present. cDNA was synthesized from 1 μg total placental RNA according to the manufacturer's instructions (SuperScript III Reverse Transcriptase, Life Technologies). cDNA fragments were amplified by PCR from placental samples using PCR primers provided in Additional file 6: Table S4. To reach high-confidence conclusions about the transcribed allele of the RTL1 gene, a long-range PCR (2357 bp) was designed, simultaneously incorporating two marker SNPs (rs3825569, rs6575805). Purification, cloning, and sequencing of PCR products are detailed in Additional file 1: Supplementary Methods. RT-PCR, cloning, and sequencing experiments analyzed at least 10 clones per SNP. DNA sequences were visualized and analyzed using the Bioedit software [30].

Results
Half of the candidate imprinted genes have no or low expression in the human placenta
The initial list of 396 candidate imprinted genes was assembled based on the Geneimprint database and recent reports on potential novel placental imprinted genes [6,14,15]. The analyzed RNA-Seq dataset of 54 placental samples covered a broad spectrum of pregnancy scenarios, including uncomplicated gestations across all three trimesters and adverse pregnancy outcomes at term (cases of PE, GD, SGA, LGA; Table 1; Additional file 2: Table S1; Additional file 3: Table S2). In total 189 genes (47.7%) were filtered out in the first step as they were not properly annotated (10 genes), had no (87) or limited (92) placental expression in our dataset (Fig. 1a; Additional file 4: Table S3). The retained 207 genes were further assessed for the presence of common genotyped SNPs in coding regions and their unambiguous exonic location, adequacy of read counts at the variant position and the availability of minimum three informative family trios/duos in our dataset to determine the parental origin of transcribed alleles (Additional file 5: Figure S1). The set of loci that passed all QC criteria for the analysis of the parental allelic expression comprised 91 genes and 227 SNPs (Fig. 1a; Additional file 4: Table S3; Additional file 7: Table S5).

Parental monoallelic expression is limited to well-known placental imprinted genes
Only 11 of 91 (12.1%) analyzed genes were expressed in the human placenta in an exclusive parent-of-origin manner and were classified as high-confidence imprinted genes (binomial test, P corr < 0.05; > 90% transcripts originating from one parental allele; Fig. 1b, Table 2, Additional file 7: Table S5; Additional file 8: Table S6). The median fraction of reads detected from the preferred parental allele was as high as 97.6% and for all confirmed imprinted genes the proportions of parental transcripts showed an extremely stable pattern across three trimesters of normal gestation and in all analyzed term pregnancy complications (Fig. 2; Additional file 9: Figure S2). Among paternally expressed genes, the most stringent level of imprinting was identified for PEG10 and the least stringent for AIM1 (99.8% and 93.7% of paternal reads, respectively). Among maternally expressed genes, the constraint for parental monoallelic expression was the highest for MEG3 (99.5% of maternal reads) and the lowest for H19 (93.6%). Interestingly, more paternally than maternally expressed imprinted genes were identified (Fig. 1b). Except for RTL1, the parental origin of transcripts was consistent with the literature data. Although previously reported to be paternally expressed in the mouse placenta [32], our RNA-Seq data and subsequent experimental validation showed that RTL1 is a maternally expressed gene in the human placenta (Additional file 10: Table S7). All but one (ZFAT) of the high-confidence imprinted genes expressed in the placenta are also imprinted in the mouse (Table 2).
The confirmed imprinted genes are either placenta-specific (AIM1, H19, IGF2, MEST, PHLDA2, PLAGL1, ZFAT) or additionally transcribed only in the adrenal gland (DLK1, MEG3, PEG10, RTL1) (Fig. 1c; Table 3). Most of them show high placental expression with the peak transcript levels during mid-gestation (Figs. 1d and 3, Additional file 9: Figure S2). The transcription of paternally expressed AIM1 was specifically enhanced in early pregnancy, whereas ZFAT exhibited unusual expression dynamics characterized by specifically reduced transcript levels during mid-gestation. None of the imprinted genes showed systematic expressional bias in the placentas from analyzed term cases of preeclampsia, gestational diabetes, and deliveries of SGA or LGA newborns.

Genes with biased parental allelic expression in the human placenta
An additional group of 14 candidate imprinted genes (15.4%) was detected with high confidence to exhibit biased parental allelic expression in the placenta (binomial test, P corr < 0.05; 65-90% of reads from one parental allele; Table 2, Figs. 1 and 2; Additional file 7: Table S5; Additional file 8: Table S6). The proportions of parental reads of most biased genes showed substantial variability among the analyzed placentas. More loci were identified with paternally (10 genes) than with maternally biased expression (4 genes). In addition, preferential transcription of maternally biased genes was less pronounced compared to the paternally biased genes (median 69.3% vs. 83.0% of reads from the preferred parental allele, respectively). Among genes with preferred maternal allele expression, the most skewed transcript ratio was identified for KLHDC10 (74.9% of maternal reads), whereas the highest proportion of paternal reads was detected for the CPXM2 gene (89.1%). Although these candidate imprinted genes showed only biased (not exclusively monoallelic) parental allelic expression, the preferentially transcribed allele for all 14 genes was concordant with the data in previous reports (Table 2).
Notably, none of the genes with biased parental allelic expression is placenta-specific (Fig. 1c, Table 3). These genes (except for MKRN3) are either transcribed in a broad range of tissues or preferentially in some other organ, and their placental expression level tends to be modest with the exception of RHOBTB3 and GRHL1 (Fig. 1d, Table 2). Like imprinted genes, the placental expression of several parentally biased genes followed tight gestational dynamics, e.g., high levels of paternally biased GRHL1, MCCC1, DNMT1, and maternally biased NLRP2 specifically in early pregnancy (Fig. 3, Additional file 9: Figure S2). No systematic deviations from biased parental allelic expression were detected in our dataset in the placentas representing term pregnancy complications.

The majority of candidate imprinted genes detected exhibit biallelic expression in the human placenta
Robust biallelic expression in the human placenta was detected for 66 of 91 (72.5%) analyzed candidate imprinted genes (≥ 35 % of reads from both alleles; Figs. 1 and 2; Additional file 7: Table S5; Additional file 8: Table S6). The majority (92%) of the genes that were transcribed from both parental alleles are broadly expressed across tissues (47 genes) or exhibit enhanced transcription in other organ(s) than placenta (14 genes). The transcript levels of biallelic placental genes are variable and some of these loci exhibit either placenta-specific (PAPPA2, LGALS14) or enhanced (AOC1, ASCL2) expression.

Discussion
This study represents the first systematic assessment of parental allelic expression of nearly 400 candidate imprinted genes in 54 human placental samples across all three trimesters of normal gestation and in cases of term preeclampsia, gestational diabetes, and fetal growth disturbances. Almost half of the candidate genes (n = 179; 45%) were either not transcribed or showed limited placental expression. The initial gene list was filtered for the presence of common exonic SNPs, sequencing depth, and informative families for the parental allelic expression. In total, 91 genes were retained for the final analysis. The detailed outcome data are presented as a catalog of parental allelic proportions and gene expression of all analyzed loci across human gestation and in term pregnancy complications (Additional file 9: Figure S2). Only 11 of 91 analyzed genes (12.1%) showed confident signals of parent-of-origin-specific allelic expression in the human placenta and the programming of imprinting for all genes was stable across the entire gestation and assessed term pregnancy scenarios (Table 2; Fig. 2; Additional file 9: Figure S2).

Fig. 1 (legend, partially recovered) a The list of 300 human genes predicted to exhibit parent-of-origin determined allelic expression was retrieved from the Geneimprint database [26]. The list was supplemented with 96 recently reported novel candidate imprinted genes in the human placenta [6,14,15]. b The analyzed geneset included 11 true imprinted genes with parent-of-origin-specific transcription, 14 genes with biased parental allelic expression, and 66 biallelic loci. c Expressional breadth across human tissues and d the abundance of placental transcripts of the analyzed genes stratified based on the parental allelic expression. Human tissue data was derived from the Protein Atlas database [31]. FPKM, fragments per kilobase of transcript per million mapped reads; Mat, maternal; n/a, not available; Pat, paternal

The strict requirement of a single copy dosage of these genes in the placental function appears to be conserved among mammals. MEG3, PHLDA2, IGF2, H19, PEG10, DLK1, and MEST have been classified as ancient imprinted genes as they are expressed in the same parent-of-origin manner in human, mouse, and equine placentas [34]. The confirmed genes with parental monoallelic expression are expressed specifically in the placenta or additionally only in the adrenal gland. High expression of the majority of imprinted genes in the second trimester of pregnancy supports their critical role in the fine-tuning of developmental programming [35]. As a pronounced temporal dynamics pattern of gene expression across pregnancy was detected for each placental imprinted gene (Fig. 3, Additional file 9: Figure S2), gestational age-specific transcription has to be regulated independently of the programmed stable epigenetic imprints. The restricted number of imprinted genes in the human placenta is consistent with the data on the mouse placenta [36] and other human tissues. Two independent studies on human tissues cataloged in the GTEx Project reached consistent conclusions that the majority of human imprinted genes are already known and the predicted number of loci with parent-of-origin-specific expression has been overestimated [7,8]. The analysis of transcriptome-wide imprinting signals in 1582 samples representing 37 primary human tissues from 178 individuals reported only 42 high-confidence imprinted genes. Widespread tissue specificity and also a tissue-specific alternative choice of expressed parental allele for some genes (e.g., IGF2) were observed. A parallel study on an extended dataset of 45 tissues detected imprinting signals for 93 genes, but concluded that tissue-specific imprinting is rather rare. In the current study, 8 of 11 confirmed placental imprinted genes show parental monoallelic expression in the majority of human organs [7,8]. Across all tissues, the most stable imprinting has been detected for maternally expressed MEG3 and H19 (Table 3). However, two placenta-specific genes (PHLDA2, ZFAT) exhibit biallelic, but low expression in other tissues and for AIM1 no data has been reported apart from the placenta.

Fig. 2 Examples of analyzed candidate imprinted genes stratified based on the proportions of transcribed parental alleles. A gene was confirmed as imprinted when it was expressed in a high-confidence parent-of-origin-specific manner (binomial test, P corr < 0.05; > 90% transcripts originating from one parental allele). Biased parental allelic expression was defined when a significant deviation from the equal proportions of transcribed parental alleles was observed, but it did not correspond to exclusive monoallelic transcription (binomial test, P corr < 0.05; 65-90% of reads from one parental allele). A gene was confirmed as biallelically expressed when the proportions of parental reads did not differ significantly from the expected ratio (P corr > 0.05) and/or the estimated proportions of both parental allelic reads fall within 35-65%. Detailed information on all analyzed genes is provided in Additional file 8: Table S6 and in the assembled gene-based catalog (Additional file 9: Figure S2), including data of parental allelic proportions and expression for all analyzed clinical subgroups. GD, gestational diabetes; LGA, large-for-gestational-age newborn; PE, preeclampsia; SGA, small-for-gestational-age newborn; Trim, trimester

Table footnotes: For the majority of genes, the information on the expression in human tissues/organs was derived from Protein Atlas [31]. For RNA genes, the information on the tissue expression was derived from NCBI Gene [33]. The same database was applied to extract functional information on the analyzed genes. n.d., not described. a Paternal allele expression in all tissues except brain with maternal allele expression. b Paternal expression in brain and maternal in the placenta

The current study also identified a distinct class of 14 genes that showed a systematic bias towards the enrichment of transcripts from one parental allele (65-90% of reads), but the parental allelic proportions did not correspond to the generally acknowledged definition of imprinting. These genes were characterized by broad expression across tissues, diverse functions and notable inter-individual variation of parental allelic proportions (Fig. 2, Additional file 9: Figure S2). The molecular mechanisms leading to biased parental allelic expression are still to be uncovered. These may overlap with the programming of imprints in fetal germ cells, and reflect differences in the epigenetic reprogramming of maternal and paternal pronuclei in fertilized oocytes and/or somatic chromosomal aberrations in early embryos involving preferably one parental chromosome. There is support for all of these scenarios. Some genes with biased parental expression in the placenta have been reported as imprinted in other organs, e.g., ZDBF2 (many tissues), GRB10 (brain), MKRN3 (brain, esophagus) (Table 3) [7,8]. It is also well established that the paternally derived chromosomes are actively demethylated by the TET3 enzymes, whereas the maternally derived chromosomes undergo passive, replication-dependent demethylation achieved by nuclear exclusion of DNMT1 [37][38][39][40]. The observed more conservative pattern of paternally compared to maternally biased genes supports the previously reported post-fertilization differences in epigenetic reprogramming of sperm- and oocyte-derived methylation marks [13] (Fig. 2). Oocyte-derived placenta-specific transiently differentially methylated regions (DMR) have been associated with polymorphic imprinting that is characteristic of the placentas of primates [12,13].
Interestingly, these DMRs can adopt an unusual epigenetic signature combining DNA methylation with biallelic enrichment of H3K4 histone methylation, two epigenetic modifications that are typically mutually exclusive [41]. The placental genome is hypomethylated [42] and prone to the promotion of somatic genomic changes [19], resembling the generation of chromosomal rearrangements typical in tumor tissues [43]. Interestingly, the placental somatic duplications have been reported to encompass a significant enrichment of imprinted, mostly maternally expressed genes [19]. On the other hand, some placental-biased genes such as NUDT12 (paternal) and NLRP2 (maternal) often show monoallelic, but non-parental expression in other tissues [7]. Additionally, the utilized short-read RNA-Seq data may have misclassified the loci that encode both non-imprinted transcripts and placenta-specific imprinted isoforms (e.g., GRB10 [44]). Mapping the co-expressed reads of several transcripts would mask isoform-specific imprinting signals and the gene may be categorized as a parentally biased locus. Development of locus-specific assays to analyze individual transcriptional isoforms would clarify this issue.
In total, 48 of the analyzed genes had been proposed as novel candidate imprinted loci in recent placental genome-wide DNA methylation or small-scale RNA-Seq-based studies [6,14,15]. Disappointingly, the current study could not confirm explicit parental monoallelic expression for any of these genes, and robust biallelic transcription was detected for most loci (Additional file 7: Table S5; Additional file 8: Table S6; Additional file 9: Figure S2). Only a small fraction of these genes showed reliable evidence for biased parental allelic expression. Among the genes reported to harbor maternal differentially methylated regions (mDMRs) [6,45], preferential expression of paternal transcripts was detected for MCCC1, DCAF10, DNMT1, NUDT12, and RHOBTB3 (Table 2). Interestingly, for KLHDC10, which showed clearly maternally biased expression, an mDMR has been reported within the gene body [14]. The discrepancy between the reported parent-of-origin allelic methylation and transcription is consistent with the emerging evidence that, in a number of genomic regions, constitutive parental DNA methylation imprints are actually decoupled from parent-of-origin expression effects [13,46]. Several studies have shown that only about half of the candidate loci with placenta-specific maternal methylation exhibit an actual parental allelic transcriptional bias [6,13,14]. Additionally, allelic imbalances in DNA methylation may reflect underlying differences in the primary DNA sequence [47,48]. In RNA-Seq-based studies, spurious claims of parental monoallelic expression may arise from small sets of informative samples, random sampling errors in the transcript pools entering library preparation and RNA sequencing, insufficient read coverage, limited QC (e.g., RNA-Seq mapping or genotyping errors), and loose statistical criteria in defining imprinted genes (reviewed in [7]).
These factors may lead to false-positive claims of parental imprinting, especially for genes with low transcript levels that are fine-tuned at the cellular level by non-parental random monoallelic expression (RMAE) [49,50]. Furthermore, in clonal cell lines that are typical of the placenta, RMAE may be present in a notable subset of cells [51].
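The effect of read coverage and statistical stringency on such claims can be sketched with the criteria applied in this study (binomial test, Bonferroni corrected P < 0.05; > 90% of transcripts from one parental allele for imprinted genes; 65-90% for biased genes). The sketch below is a minimal illustration and not the analysis pipeline used here: the function names, the per-gene pooling of reads, the choice of 91 tests for the Bonferroni correction, and the example read counts are all assumptions.

```python
from math import comb


def binom_two_sided_p(k, n):
    """Exact two-sided binomial P value for k of n reads under a 50:50 null."""
    extreme = max(k, n - k)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    # The null distribution is symmetric, so the two-sided P value is
    # twice the upper tail, capped at 1.
    return min(1.0, 2.0 * tail)


def classify(maternal_reads, paternal_reads, n_tests=91, alpha=0.05):
    """Classify a gene from pooled parental read counts (assumed input)."""
    n = maternal_reads + paternal_reads
    if n == 0:
        return "uninformative"
    # Bonferroni correction: multiply the raw P value by the number of tests.
    p_adjusted = min(1.0, binom_two_sided_p(maternal_reads, n) * n_tests)
    major_fraction = max(maternal_reads, paternal_reads) / n
    if p_adjusted >= alpha:
        return "biallelic"
    if major_fraction > 0.90:
        return "imprinted"
    if major_fraction >= 0.65:
        return "biased"
    return "biallelic"


print(classify(980, 20))   # imprinted
print(classify(750, 250))  # biased
print(classify(52, 48))    # biallelic
print(classify(9, 1))      # biallelic: 90% of reads, but too few to be significant
```

The last call shows why low coverage matters: 9 of 10 reads from one parental allele look monoallelic, yet the corrected P value does not reach significance, so a study applying looser criteria could report such a locus as imprinted.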
Placental imprinting errors have been associated with fetal growth disturbances and with maternal preeclampsia or gestational diabetes [3,5,10,52] (Additional file 11: Table S8). In our dataset, no systematic link was observed between term pregnancy pathologies and deviations in parental allelic proportions or the expression dynamics of imprinted and biased genes (Figs. 2 and 3; Additional file 7: Table S5; Additional file 9: Figure S2). However, we acknowledge that the modest number of analyzed samples representing each subgroup may have limited the ability to detect rare isolated clinical cases with altered imprinting. Conversely, the enrichment of our dataset for placentas from various types of complicated pregnancies may have skewed the analysis due to possible loss of imprinting in adverse gestational outcomes.
The limitations of the study also have to be acknowledged. The study approach relied on genotyped (vs. imputed) SNPs and applied stringent QC and filtering to minimize false-positive claims and to detect high-confidence imprinted genes. These procedures excluded from the analysis 116 candidate imprinted genes (29.3% of the initial list) that are adequately expressed in the placenta.