Splice variants of zinc finger protein 695 mRNA associated to ovarian cancer

Background: Studies of alternative mRNA splicing (AS) in health and disease have yet to yield a complete picture of protein diversity and its role in physiology and pathology. Some forms of cancer appear to be associated with certain alternative mRNA splice variants, but their role in cancer development and outcome is unclear.
Methods: We examined AS profiles by means of whole genome exon expression microarrays (Affymetrix GeneChip 1.0) in ovarian tumors and ovarian cancer-derived cell lines, compared to healthy ovarian tissue. Alternatively spliced genes expressed predominantly in ovarian tumors and cell lines were confirmed by RT-PCR.
Results: Among several significantly overexpressed AS genes in malignant ovarian tumors and ovarian cancer cell lines, the most significant one was that of the zinc finger protein ZNF695, with two previously unknown mRNA splice variants identified in ovarian tumors and cell lines. The identity of the ZNF695 AS variants was confirmed by cloning and sequencing of the amplicons obtained from ovarian cancer tissue and cell lines.
Conclusions: Alternative ZNF695 mRNA splicing could be a marker of ovarian cancer with possible implications for its pathogenesis.


Background
Ovarian cancer (OC) is the sixth most prevalent form of cancer worldwide. Its mortality rate is high because nearly 70% of cases are at an advanced stage at the time of diagnosis, leading to a 5-year survival below 30% [1]. The classification of OC depends on its cellular origin, with approximately two-thirds of cases belonging to the epithelial serous type [2]. Similar to other types of cancer, OC is characterized by changes in gene expression profiles [3][4][5][6], including under- and overexpression, or even de novo gene expression [7,8].
Alternative splicing (AS) provides a critical and flexible layer of regulation that intervenes in many biological processes and contributes to protein diversity.
AS has a major impact on the cell phenotype, as a single pre-mRNA spliced in different ways can give rise to different mature mRNA transcripts (variants) that are translated into distinct proteins with different functions [9][10][11]. Reportedly, over 90% of human genes have two or more splice variants [9,10], greatly increasing the complexity of both the transcriptome and the proteome [12,13]. Therefore, AS could play an important role in gene regulation both in health and disease. In cancer, AS could affect the cellular processes related to tumor progression, including inhibition of apoptosis, tumor invasiveness, metastasis and angiogenesis [14].
Among the genes with well-established AS patterns whose alternative protein products affect tumor cell behavior is the kinase SRPK1. In breast, colonic and pancreatic carcinomas, SRPK1 phosphorylates the splicing factor SF2/ASF, allowing its import into the nucleus, where it modulates AS of multiple target mRNAs, such as BIN1, S6K1 and MNK2, contributing to tumor progression [15,16]. Additional cancer types with apparent alterations of alternative splicing include gastric, colon and bladder carcinomas [17], hepatocarcinoma [18], prostatic cancer [19,20], multiple myeloma [21], breast cancer, and OC [22]; most cancer-associated transcript variants belong to genes related to processes such as cellular transformation [23,24], adhesion, proliferation, migration and invasion [25][26][27][28]. In OC, a previously unknown p53 mRNA transcript variant (p53δ) was identified [29], whereas variants of NR4A1, a nuclear receptor involved in steroidogenesis, and MRRF, a mitochondrial protein, were identified in prostatic cancer [20]. Although the extent and pathophysiological meaning of these alterations have yet to be established, there is little doubt that the study of alternative splicing can lead to a better understanding of the mechanisms of cancer development and to the identification of new biomarkers for diagnosis, epidemiological studies of prevalence, prognosis, and therapeutic response.
The aim of the present study was to identify the genome-wide profile of alternatively spliced mRNAs in ovarian cancer tissue and cell lines by means of high-density microarrays. Among several ovarian cancer-associated alternatively spliced genes, the mRNA coding for ZNF695, a zinc finger protein, showed the most significant overexpression in OC, with two prominent splice variants that were not present in normal ovarian tissue. These variants were cloned and sequenced. Here we describe some of the characteristics of the ZNF695 mRNA splice variants associated with ovarian cancer.

Data set and specimens
All investigations were performed in accordance with the Declaration of Helsinki, with approval by the Central Research Committee of the Mexican Institute of Social Security and the Ethics Committee of Centro Médico Siglo XXI, Mexican Institute of Social Security. After informed consent was obtained, normal (healthy) ovarian tissue (HOT), borderline ovarian tumor (BOT) and stage III and IV malignant epithelial ovarian tumor (MOT) tissues were collected by the clinical partners at the Oncology Hospital, National Medical Center Siglo XXI, IMSS, and at the General Hospital of Mexico, SSA (Secretaría de Salud). Tumor tissues came from patients with diagnosed ovarian cancer, and healthy ovarian tissue came from patients who underwent abdominal surgery for hysterectomy due to uterine myomatosis with no evidence of ovarian pathology. Routinely, during this type of procedure, both ovaries are removed in patients over 45 years old, and only one in patients under 45. Cancer and corresponding normal tissue specimens were cut into three fragments and snap-frozen in liquid nitrogen. One fragment was stored in RNAlater® (Qiagen, Valencia, CA, USA) at −70°C for a maximum of two months until RNA was purified, and the remaining two fragments were formalin-fixed, paraffin-embedded, sliced, mounted on slides, and stained with hematoxylin and eosin (HE). Only tissue samples with >80% tumor cells or normal epithelial cells (MOT or HOT, respectively) according to the histopathological examination were included for analysis.
Samples were disrupted using a TissueLyser™ system (Qiagen, Valencia, CA, USA) for 60 s at 30 Hz. Total RNA was obtained with the RNeasy Mini Kit (Qiagen, Valencia, CA, USA). RNA concentration was quantified using a NanoDrop ND-1000 spectrophotometer, and RNA quality was assessed with the Agilent RNA 6000 Nano assay on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA).

Microarray GeneChip 1.0 assay
The microarray used for these studies was the Affymetrix GeneChip 1.0, which contains over 750,000 probe sets representing all exons of ~28,800 annotated genes. Sample amplification and preparation for microarray hybridization were performed according to Affymetrix specifications (http://media.affymetrix.com/support/downloads/manuals/wt_expressionkit_manual.pdf). In brief, 100 ng of total RNA was reverse transcribed to cDNA, amplified by in vitro transcription and reverse transcribed to cDNA again. Fragments between 40 and 70 bp were generated enzymatically, labelled and hybridized onto the microarray chips in an Affymetrix hybridization oven at 60 rpm and 45°C for 17 hours. Chips were washed according to the established protocols (Affymetrix, Santa Clara, CA, USA) with a GeneChip Fluidics Station 450 and then scanned with an Affymetrix 7G GeneChip scanner. The raw data (CEL files) will be deposited in Gene Expression Omnibus (GEO).

Data analysis
Microarray CEL files were analyzed with Partek Genomics Suite v6.5 software (Partek Incorporated, Saint Louis, MO). Probe sets were summarized by means of median polish and normalized by quantiles, with no probe sets excluded from analysis. Background noise correction was achieved by means of Robust Multi-chip Average (RMA), and data were log2 transformed. Data grouping and categorization were achieved by principal component analysis (PCA). Differentially expressed exons were detected by means of Alternative Splicing ANOVA with the healthy control samples as the baseline. Moreover, BOT, MOT and ovarian cancer cell lines (OCL) were also examined against HOT by the geometric least squares means model. Hierarchical clustering was based on the dissimilarity of samples (Euclidean distance) with average linkage.
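The quantile normalization and log2 transformation steps mentioned above can be illustrated with a minimal sketch. This is not the Partek implementation; the function names (quantile_normalize, log2_transform) and the toy intensity matrix are hypothetical and serve only to show the idea that, after normalization, every sample (column) shares the same intensity distribution.

```python
import math


def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a probe-by-sample matrix.

    Assumes a rectangular list of lists of positive intensities.
    Ties are broken by row order, a simplification of production tools.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Sort each column, remembering the original row of every value.
    sorted_cols = []
    for c in range(n_cols):
        column = sorted((matrix[r][c], r) for r in range(n_rows))
        sorted_cols.append(column)
    # Reference distribution: mean of the values that share the same rank.
    rank_means = [
        sum(sorted_cols[c][k][0] for c in range(n_cols)) / n_cols
        for k in range(n_rows)
    ]
    # Write the rank mean back to the original position of each value.
    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        for k, (_, r) in enumerate(sorted_cols[c]):
            normalized[r][c] = rank_means[k]
    return normalized


def log2_transform(matrix):
    """Return the matrix with every intensity replaced by its log2 value."""
    return [[math.log2(v) for v in row] for row in matrix]


# Hypothetical toy data: 4 probe sets (rows) x 3 samples (columns).
raw = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 5.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized = quantile_normalize(raw)
log_data = log2_transform(normalized)
```

After normalization, the sorted values of each column are identical, so differences between samples in downstream comparisons reflect rank changes of individual probe sets rather than global intensity shifts.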

Reverse transcription PCR
For linear cDNA synthesis, 1 μg of total RNA was predigested with 1 U DNase, 1 × DNase buffer and 5 mM EDTA, after which it was incubated at 37°C for 30 min and at 65°C for an additional 10 min. Thereafter, samples were placed in a master mix containing 40 U RiboLock RNase inhibitor, 0.2 μg random hexamer primers, 20 mM dNTP mix, 40 U M-MuLV reverse transcriptase (RT), and 1 × M-MuLV RT buffer (Thermo Scientific).

PCR product purification and cloning
PCR products were separated by electrophoresis (2.5% agarose gels) and extracted by means of a Gel Extraction Kit™ (Qiagen, Valencia, CA, USA). The extracted products were ligated into pGEM-T Easy Vector™ (Promega, Madison, WI) by incubating overnight at 4°C in 1.5 mL Eppendorf tubes with 2 × Rapid Ligation Buffer (T4 ligase), pGEM-T Easy Vector, PCR product and T4 DNA ligase.
Recombinant plasmid DNA was purified with the Wizard Plus Miniprep DNA Purification System™ (Promega), and selected clones were sequenced with an M13 oligonucleotide and the BigDye Terminator 3.1 cycle sequencing kit (Applied Biosystems) in an Applied Biosystems ABI Prism 3130 genetic analyzer. Subsequently, the PCR amplicon sequences were assembled and checked against the transcript sequences annotated in the NCBI nucleotide database.

Expression microarray assays
A total of 14 samples with an RNA integrity number (RIN) ≥ 8 were hybridized in GeneChip 1.0 microarrays according to the MIAME guidelines. Histopathological classification of tissues was as follows: healthy ovarian tissue (HOT) n = 4, benign ovarian tumors (BOT) n = 2, (malignant) serous epithelial ovarian tumors in stages III and IV (MOT) n = 4, and ovarian cell lines (OCL) n = 4. As a prerequisite, healthy tissue had to be free of any visible alteration, whereas all tumor tissues, benign or malignant, selected for study contained at least 90% tumor cells.
Background correction and normalization of the microarrays reported no quality control (QC) errors and, as expected, the QC intergroup proportions were variable (Additional file 1), whereas gene expression histograms were similar in all samples. General gene expression was examined and visualized according to the histological groups by means of PCA. Except for HOT and BOT, which essentially overlapped, the groups clustered in distinct regions of the PCA plot (Figure 1): HOT and BOT clustered together at the negative end, whereas MOT and OCL clustered separately in the positive area. This indicates that our data set has the power to discriminate OC (both MOT and OCL) from normal tissue or benign tumors. Moreover, OC gene expression profiles showed a wide dispersion, reflecting tumor heterogeneity. In contrast, HOT and BOT plotted in relatively close proximity, even at the level of individual samples; the two BOT samples practically overlapped. Finally, HOT also showed some individual sample variability, probably reflecting proportional differences in tissue content, individual variability or variations in the estrous stage, none of which was addressed.
Moreover, hierarchical clustering on the basis of relative gene expression also grouped HOT and BOT together, whereas malignancies (both MOT and OCL) clustered together but distant from HOT and BOT (Figure 2). Nonetheless, careful analysis of individual genes revealed some small intragroup differences that could reflect tumor heterogeneity and that deserve further in-depth analysis. Interestingly, the predominant OC-associated changes in gene expression were suppression rather than overexpression (Additional file 2).
Up to this point, the results show two major ovarian tissue gene expression patterns, one specific to OC (cell lines and tumors) and the other characteristic of borderline tumors and healthy tissue. The gross differences seen probably reflect the relatedness among the different groups. To further explore this, we compared gene expression by grouping apparently related conditions together, against each of the individual or grouped opposites (Table 1). By these means, the largest number of differences was found for MOT + OCL vs. BOT + HOT (1799 differences), followed by MOT + OCL vs. HOT (1498 differences), MOT + OCL vs. BOT (1030 differences) and BOT + MOT + OCL vs. HOT (545 differences). Finally, gene expression differences between BOT and HOT were minimal (~28). On the basis of these findings, we chose to further examine the most significant differentially expressed exons.

Exon analysis identifies two major ovarian cancer-associated, differentially spliced transcripts of gene ZNF695
Once we had examined the relative OC-associated gene expression profiles, it was important to examine whether some of the overexpressed genes reflected only quantitative differences or whether there were also qualitative differences among them. To achieve this, we performed exon analysis of genes overexpressed in both MOT and OCL. As differential exon usage cannot be easily examined in suppressed genes, we exclusively examined overexpressed genes.
The analysis was performed by means of Alternative Splicing ANOVA, and the criteria to select genes for exon analysis were a false discovery rate (FDR) < 0.05 and a fold change >3 in at least one probe set. According to these criteria, the number of overexpressed alternatively spliced genes in OC was 207 (Additional file 3). To identify OC-predominant splice variants, these genes were subjected to internal analysis by comparing the expression of each individual exon for each study group against the mean total expression of the same gene in each of the groups; exons were considered only when they yielded a ≥3 fold change. Moreover, we performed visual inspection of each of the individual exon expression profile graphs (MOT + OCL vs. HOT + BOT, vs. HOT or vs. BOT, all of which yielded the same genes). This procedure identified at least nine genes of potential interest (Table 2), of which ZNF695 (encoding a zinc finger protein) had the highest overexpression (~7 fold, FDR < 0.005) in MOT and OCL. Its mRNA also had the most significant changes in exon expression, with significant suppression of one of its exons when compared to HOT and BOT (Figure 3). The remainder of this study focuses on the characterization of ZNF695.
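The gene selection rule described above (FDR < 0.05 and fold change >3 in at least one probe set) can be sketched as follows. This is an illustration only: the text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, and the function names, the record layout and the gene labels (GENE_A, GENE_B, GENE_C) are hypothetical. Expression values are taken to be on the log2 scale, so the fold change is 2 raised to the difference of the group means.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(probe_sets, fdr_cutoff=0.05, fold_cutoff=3.0):
    """Keep genes with at least one probe set passing both thresholds.

    probe_sets: list of dicts with keys 'gene', 'p', 'log2_tumor'
    and 'log2_control' (a hypothetical layout for this sketch).
    """
    adjusted = benjamini_hochberg([ps["p"] for ps in probe_sets])
    selected = set()
    for ps, q in zip(probe_sets, adjusted):
        # Data are log2 transformed, so the linear fold change is 2**difference.
        fold = 2.0 ** (ps["log2_tumor"] - ps["log2_control"])
        if q < fdr_cutoff and fold > fold_cutoff:
            selected.add(ps["gene"])
    return sorted(selected)


# Hypothetical probe-set records, not data from this study.
example = [
    {"gene": "GENE_A", "p": 0.001, "log2_tumor": 9.0, "log2_control": 6.0},
    {"gene": "GENE_A", "p": 0.2, "log2_tumor": 7.0, "log2_control": 6.8},
    {"gene": "GENE_B", "p": 0.002, "log2_tumor": 7.0, "log2_control": 6.0},
    {"gene": "GENE_C", "p": 0.5, "log2_tumor": 10.0, "log2_control": 6.0},
]
selected_genes = select_genes(example)
```

In this toy input only GENE_A is retained: GENE_B is significant but changes only 2 fold, and GENE_C changes 16 fold but does not pass the FDR threshold, which shows why both criteria are applied jointly.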

ZNF695 splice variants in OC
ZNF695 encodes a zinc finger protein with as yet unknown functions. The gene contains six exons and is located on chromosome 1q at positions 247,148,625-247,171,358. It has six possible transcripts, of which two (ZNF695-003 and ZNF695-006) encode complete ORFs yielding proteins of 515 and 172 amino acids, respectively, whereas the other four transcripts encode products thought to undergo nonsense-mediated decay, a process that detects nonsense mutations and prevents the expression of truncated proteins (http://www.ensembl.org/Human/Search/Results?q=ZNF695;site=ensembl;facet_species=Human).
To characterize transcripts expressed predominantly in OC, we designed primers to identify and clone splice variants of ZNF695 most likely corresponding to the message lengths preferentially expressed in the four initial OC samples. By means of RT-PCR in a total of 14 OC samples (10 MOT and four OCL), expression of the three different transcripts was as follows: seven out of 10 tumor samples and three out of four cell lines expressed all three transcripts to variable degrees, whereas the remaining samples (three tumors and one cell line) expressed only the two larger transcripts (Figure 4A and B). All three amplicons were gel purified, cloned into pGEM-T Easy Vector (Figure 5A) and confirmed by sequencing to correspond to the aforementioned ZNF695 RNA transcripts (Figure 5B-E). The first one appears to correspond to the full-length product (FLP), as it yields a 400 bp amplicon spanning FLP nucleotides 38 to 433 with full identity to ZNF695-003 and ZNF695-006 and no other possible match (Figure 5B-C, Additional file 4). The second transcript, which we name here ZNF695 transcript 4 (see below), spans 360 bp and fully matches the ENSEMBL nonsense-mediated decay transcript ZNF695-002, as well as NCBI peptide BAG54313.1 (Isogai, T. Helix Research Institute, Genomics Laboratory; e-mail: flj-cdna@nifty.com, http://www.ncbi.nlm.nih.gov/protein/193785160?from=1&to=118) (Figure 5B, E, Additional file 5). The third transcript (named here ZNF695 transcript 5) aligns with both of those transcripts up to nucleotide 361, but our primers cannot resolve the sequence further in the 5' direction. Although we cannot tell to which ZNF695 splice variant it corresponds, this transcript contains a sequence partly identical to ZNF695-002 (ZNF695 transcript 4), except that it is missing a 52 bp fragment containing part of the 5' untranslated region and the translation initiation signal (Figure 5D, E, Additional file 4).
Although there are additional AUG codons 3' of the canonical AUG that could serve as alternative translation initiation signals, these fail to yield useful ORFs. Therefore, this transcript most likely represents a long non-coding RNA. Finally, Figure 5F shows the ZNF695 AS model according to the transcripts found in MOT and OCL.
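The reasoning about downstream AUG codons can be made concrete with a small sketch that lists every ATG in a cDNA sequence and the length of the reading frame it opens. This is not the procedure used in this study; the function name find_orfs and the toy sequence are hypothetical, only the forward strand is scanned, and the standard stop codons (TAA, TAG, TGA) are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(cdna, min_codons=1):
    """List candidate ORFs opened by every ATG on the forward strand.

    Returns tuples (start, codon_count, has_stop), where start is the
    0-based position of the ATG, codon_count counts codons from the ATG
    up to (not including) the first in-frame stop or the end of the
    sequence, and has_stop tells whether an in-frame stop was reached.
    """
    seq = cdna.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        codons = 0
        has_stop = False
        # Read codon by codon in the frame defined by this ATG.
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                has_stop = True
                break
            codons += 1
        if codons >= min_codons:
            orfs.append((start, codons, has_stop))
    return orfs


# Hypothetical toy sequence, not a ZNF695 sequence.
toy_cdna = "CCATGAAATGGTTTTAGCATGC"
toy_orfs = find_orfs(toy_cdna)
```

Applied to a cloned amplicon, a listing of this kind shows at a glance whether any downstream ATG opens a reading frame long enough to be a plausible coding sequence, or whether all candidates are short or unterminated within the sequenced region.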

Discussion
Understanding the origin of malignancy is one of the greatest challenges of modern science. Among malignant tumors, OC represents a major problem because little is known about its pathogenesis and because it is difficult to detect in early stages: it remains asymptomatic over long periods of time and becomes detectable only in late stages, almost always beyond any possibility of remission [33,34]. Alternative exon splicing is a biological process of major importance, because gene changes leading to altered splicing can affect normal cell and tissue function [19,20,27,35] and can lead to malignant transformation [36]. The current studies were carried out to examine whether OC is associated with a particular exon-splicing state and, if so, to identify differentially spliced transcripts present in OC but absent in healthy ovarian tissue. With the exon array data set presented here, we identified nine overexpressed genes with differential exon profiles associated with OC. One such gene, ZNF695, coding for a largely uncharacterized zinc finger protein, is the most representative, with three transcripts differentially expressed by MOT and OCL: one corresponding to the whole protein, a second ORF corresponding to a shorter peptide, and a third, with lower but significant expression, that corresponds to a long non-coding RNA. These results could provide a useful biomarker of malignant transformation in women suspected to have OC and open the study of the role of these transcripts in cell proliferation and malignant transformation. Alternative splicing is a major source of protein diversity: bioinformatics-based methods indicate that >90% of human genes could be subject to AS [10,12,13], with an estimate of several million different proteins and some individual proteins having over 1000 variants due to AS alone [37]. This process can differ between distinct functional or developmental stages of the cell [38,39].
It is therefore not surprising that AS has also been found to be altered during malignant transformation [40]; such alterations could be either a general marker of cancer or limited to certain cancer types. Moreover, cancer-associated AS could be a clue to understanding the basis of malignant transformation and tumor behavior [23,41], and even to identifying potential therapeutic targets [42].
We found that OC tissue indeed has a signature of alternatively spliced genes, although we do not yet know whether these changes are specific to OC or are general markers of cancer. Of the >270 differentially spliced genes found in OC, nine were highly significant, but we decided to focus on the most significantly overexpressed gene with differential AS, the zinc finger protein ZNF695.
The zinc finger protein (ZNF) family spans over 700 members with many functional roles within the cell, including regulation of gene expression, which is achieved by different means. For instance, ZNF transcription factors bind to DNA by means of C2H2 zinc finger domains, constituting a subfamily of ZNF [43]. Although some ZNF members act only as repressors and others solely as activators, most of them can apparently be either repressors or activators depending on the particular status of the cell. Moreover, some ZNF play roles in signal transduction and many other cellular functions. ZNF695, which we found here to be differentially expressed and spliced in OC, belongs to the C2H2 subfamily of ZNF and also contains Krüppel-associated box (KRAB) domains, which characteristically identify gene repressors [44][45][46]. Because these genes, including ZNF695, contain two or more functional domains, changes affecting only one domain can have dramatic consequences [45,47]. For instance, repressors could lose their regulatory function or even turn in the opposite direction and become activators [47]. KRAB domains in ZNF proteins serve to bind corepressors, which in turn mediate transcriptional repression [44]. Of the ZNF695 splice variants we found here to be preferentially associated with OC, isoforms 2 and 3 have incomplete KRAB domains, which are essential for interactions with corepressors; this suggests that such an AS pattern could be related to carcinogenesis. The third variant lacked the translation initiation codon; hence, it is unlikely to yield a translational product.
Unfortunately, as yet, almost nothing is known about ZNF695 in humans or in other species, with the closest homologues having up to 64% identity. Therefore, at present it is not possible to predict how ZNF695 could play a role in OC development, if at all. One could envision that, because the alternative forms found in OC have incomplete KRAB domains, this potential repressor could function as an activator and turn on cell proliferation and, hence, malignant transformation. The other possibility would be that they function as dominant negative variants, but this seems unlikely because normal ovarian tissue does not express ZNF695 in any of its isoforms and OC cells express only the alternative splice variants. Therefore, we consider ZNF695 splice variants 1/2, 4 and 5 as potential oncogenes playing a role in the pathogenesis of OC.