Distinct expression and prognostic values of GATA transcription factor family in human ovarian cancer

Accumulating studies have provided conflicting evidence on the expression patterns and prognostic value of the GATA family in human ovarian cancer. In the present study, we assessed the distinct expression and prognostic roles of the 7 individual members of the GATA family in ovarian cancer (OC) patients using Oncomine, CCLE, the Human Protein Atlas (HPA), the Kaplan-Meier plotter (KM plotter) database, cBioPortal and Metascape. Our results indicated that GATA1, GATA3, GATA4 and TRPS1 mRNA and protein expression was significantly higher in OC than in normal samples. High expression of GATA1, GATA2 and GATA4 was significantly correlated with better overall survival (OS), while increased GATA3 and GATA6 expression was associated with worse prognosis in OC patients. GATA1, GATA2, GATA3 and GATA6 were closely related to the pathological histology, pathological grade, clinical stage and TP53 mutation status of OC. The genetic variation and interaction of the GATA family may be closely related to the pathogenesis and prognosis of OC, and the regulatory network composed of GATA family genes and their neighboring genes is mainly involved in the Notch signaling pathway, Th1 and Th2 cell differentiation and the Hippo signaling pathway. Transcriptional levels of GATA1/2/3/4/6 could serve as prognostic markers and potential therapeutic targets for OC patients. Supplementary Information: The online version contains supplementary material available at 10.1186/s13048-022-00974-6.


Introduction
Ovarian cancer (OC) is the leading cause of cancer-related death among gynecological malignancies [1,2]. Although standard cytoreductive surgery and platinum-based chemotherapy have improved overall survival and quality of life, the long-term survival of patients with advanced OC remains poor [3]. Owing to the lack of specific symptoms and effective prognostic biomarkers, over 75% of patients are not diagnosed until advanced stages, and the 5-year survival rate is less than 30% [4,5]. Therefore, further investigation of the mechanisms of OC tumorigenesis and progression, and identification of effective, minimally invasive prognostic markers and potential drug targets, are still needed for OC patients [3].
The GATA protein family is a group of zinc finger DNA-binding proteins that play an essential role in epithelial proliferation and in the development of diverse tissues [6]. Based on initial studies of their expression, GATA1, GATA2 and GATA3 were categorized as hematopoietic GATA factors, while GATA4, GATA5 and GATA6 were termed endodermal GATA factors [6,7]. Functionally, GATA1 and GATA2 play pivotal roles in regulating the cell cycle and proliferation [8]. GATA3 is not only an important transcription factor for T-cell development but is also involved in cellular proliferation, development and differentiation in luminal epithelial and urothelial cells [9]. GATA4, GATA5 and GATA6 are expressed predominantly in endoderm- and mesoderm-derived tissues [10,11]. GATA4 and GATA5 tend to mark fully differentiated epithelial cells and have been confirmed as potential tumor suppressors [12], while GATA6 is expressed in the immature proliferating cells of the intestinal crypts and is classified as a potential oncogene [13]. TRPS1 (trichorhinophalangeal syndrome-1) is a novel GATA transcription factor that has been found to be a critical activator of mesenchymal-to-epithelial transition (MET) during embryonic development in a number of tissues [14]. There is growing evidence that deregulation of GATA expression is a common occurrence in several human malignancies and that individual GATA members play distinctive roles in tumorigenesis and progression [6,7,15], for example in breast [16], colon [17], lung [18], gastric [19] and pancreatic cancer [20], as well as in OC [21-26]. These proteins are therefore considered potential novel biomarkers for the detection and accurate prediction of many kinds of tumors.
GATA factors have been identified as crucial transcription factors in a variety of hematological malignancies and solid tumors, and several GATA family members (GATA3, GATA4 and GATA6) have been shown to be related to prognosis in OC patients [21-26]. However, the roles of the individual GATA members in the tumorigenesis and development of OC have not been systematically characterized. In the current study, we used large public databases to determine the expression patterns and prognostic value of the distinct GATA family members in OC.

Oncomine analysis
The mRNA expression levels of the individual GATA family members (GATA1, GATA2, GATA3, GATA4, GATA5, GATA6 and TRPS1) were determined using the ONCOMINE database (www.oncomine.org), a publicly accessible online database of cancer microarray information that facilitates discovery from genome-wide expression analyses [27,28]. In this study, Student's t-test was used to generate a p value for the comparison between cancer specimens and normal control datasets. The thresholds were a fold change of 1.0, a p value of 0.05 and a gene rank in the top 10%.
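The comparison described above can be illustrated with a minimal sketch of a pooled-variance Student's t statistic and a fold change. This is not the ONCOMINE implementation: the function names and the sample values are hypothetical, and the expression values are assumed to be on a log2 scale.

```python
import math
from statistics import mean, variance


def students_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). A p value would then be read from the
    t distribution with that many degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


def fold_change(tumor_log2, normal_log2):
    """Linear fold change, assuming both inputs are log2 expression values."""
    return 2 ** (mean(tumor_log2) - mean(normal_log2))


# Hypothetical log2 expression values, for illustration only.
tumor = [2.0, 3.0, 4.0]
normal = [1.0, 2.0, 3.0]
t_stat, dof = students_t(tumor, normal)
print(t_stat, dof, fold_change(tumor, normal))
```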

CCLE analysis
The mRNA levels of GATA members in a series of cancers were analyzed using the CCLE database (https://portals.broadinstitute.org/ccle/home), an online compilation of gene expression, chromosomal copy number and massively parallel sequencing data from 947 human cancer cell lines that facilitates the identification of genetic and lineage-based predictors of drug sensitivity [29].

Immunohistochemistry analysis
The Human Protein Atlas (HPA) database (www.proteinatlas.org) is an international program that has been set up to allow for a systematic exploration of the human proteome. The HPA database was used to investigate and validate the protein expression of GATA members in OC tissues by immunohistochemistry (scale bar = 200 μm).

The Kaplan-Meier plotter and OncoLnc database analysis
The prognostic significance of the messenger RNA (mRNA) expression of GATA family genes in OC was evaluated using the Kaplan-Meier plotter (www.kmplot.com), an online database that includes gene expression data and clinical data [30]. In this database, the gene expression and survival information of all OC patients was obtained from the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and the Cancer Biomedical Informatics Grid (caBIG) [31,32]. In addition, the OncoLnc online tool (www.oncolnc.org/), which combines prognostic data from TCGA with mRNA, miRNA or lncRNA expression levels, was used to validate the correlation between the expression of each GATA family gene and the prognosis of patients with OC. The expression and prognosis data for each gene were downloaded, and Kaplan-Meier curves were drawn using the online tools. Hazard ratios (HRs), 95% confidence intervals (CIs) and log-rank p values were determined and displayed on the webpage. A p value < 0.05 was considered statistically significant.
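To make the survival analysis concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator together with a simple median split into high- and low-expression groups. It is an illustration only, not the code behind the Kaplan-Meier plotter or OncoLnc; the function names, the median cutoff and the example data are assumptions.

```python
from statistics import median


def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


def split_by_median(expression, times, events):
    """Split patients into high and low groups at the median expression (assumed cutoff)."""
    cutoff = median(expression)
    high = [(t, e) for x, t, e in zip(expression, times, events) if x > cutoff]
    low = [(t, e) for x, t, e in zip(expression, times, events) if x <= cutoff]
    return high, low


# Hypothetical data: expression level, months of follow-up, event indicator.
expression = [1.0, 2.0, 3.0, 4.0]
times = [1, 2, 3, 4]
events = [1, 0, 1, 1]
print(kaplan_meier(times, events))
high, low = split_by_median(expression, times, events)
print(high, low)
```

A log-rank test would then compare the curves of the two groups; it is omitted here to keep the sketch short.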

cBioPortal analysis
The cBioPortal for Cancer Genomics (http://www.cbioportal.org/) is an open-access resource that provides integrative analysis of complex cancer genomics and clinical profiles from 105 cancer studies in the TCGA pipeline [33]. The frequency of GATA family gene alterations (amplification, deep deletion and missense mutations), copy-number variation (CNV) from GISTIC and mRNA expression z-scores (RNA Seq V2 RSEM) were assessed using the cBioPortal for Cancer Genomics database and TCGA. In addition, co-expression and network analyses were performed according to cBioPortal's online instructions [32].
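As an illustration of the co-expression analysis, the sketch below computes a Pearson correlation coefficient between the expression profiles of two genes. It is a generic implementation under stated assumptions, not cBioPortal's own code; the function name and the expression values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression z-scores for two genes across four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [8.0, 6.0, 4.0, 2.0]
print(pearson_r(gene_a, gene_b))  # negative value: the two genes are anti-correlated
```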

Functional enrichment analysis
Metascape (http://metascape.org) is a free, well-maintained and user-friendly gene-list analysis tool for gene annotation and analysis. In this study, Metascape was used to conduct pathway and process enrichment analysis of GATA family members and their neighboring genes. Gene Ontology (GO) terms for the biological process (BP), cellular component (CC) and molecular function (MF) categories, as well as Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, were enriched using the Metascape online tool. Only terms with a P value < 0.01, a minimum count of 3 and an enrichment factor > 1.5 were considered significant. The Molecular Complex Detection (MCODE) algorithm was further applied to identify densely connected network components.
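The three significance criteria above can be sketched as follows, using a hypergeometric test and taking the enrichment factor as the ratio of the observed count to the count expected by chance. This is a simplified illustration rather than Metascape's implementation; the function names and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K belong to the term."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrichment_factor(k, N, K, n):
    """Observed count divided by the count expected by chance."""
    return k / (n * K / N)


def is_significant(k, N, K, n, max_p=0.01, min_count=3, min_enrichment=1.5):
    """Apply the thresholds stated in the text: P < 0.01, count >= 3, enrichment > 1.5."""
    return (
        k >= min_count
        and enrichment_factor(k, N, K, n) > min_enrichment
        and hypergeom_sf(k, N, K, n) < max_p
    )


# Hypothetical example: 100 background genes, 10 annotated to a term,
# 10 genes in the query list, 5 of which carry the annotation.
print(enrichment_factor(5, 100, 10, 10), is_significant(5, 100, 10, 10))
```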

The mRNA expression levels of GATA family members in OC
To examine the differences in mRNA expression of the GATA family between tumor and normal tissues in ovarian cancer, we performed an analysis using the Oncomine database. As shown in Fig. 1, ONCOMINE analysis revealed that GATA1, GATA2, GATA3, GATA4 and TRPS1 mRNA expression was significantly higher in OC than in normal samples. GATA1 transcripts were elevated 1.082-fold in OC samples compared with normal tissues in a dataset of 594 samples derived from The Cancer Genome Atlas (TCGA) database. GATA2 was elevated 1.211-fold in OC samples compared with normal tissues (p = 9.89E-6), GATA3 1.138-fold (p = 1.48E-7) and GATA4 1.201-fold (p = 6.23E-5). In addition, TRPS1 was elevated 1.269-fold in OC samples compared with normal tissues (p = 4.00E-5). When multiple probes corresponded to the same GATA family member, the probe with the highest expression fold change was displayed in Fig. 1. However, no significant difference was found in the mRNA levels of the other GATA members, GATA5 (−2.311 fold change, p = 0.996) and GATA6 (−2.529 fold change, p = 1.000), between OC samples and normal controls. CCLE analysis demonstrated that, although the mRNA expression levels of GATA1 and GATA2 in OC ranked 14th and 16th highest among the different cancer cell types, the expression levels of GATA1 and GATA2 in ovarian cancer cells were generally low (shown in the green frame in Fig. 2).
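As a small worked example, the fold changes and p values reported above can be checked against the fold-change and p-value thresholds from the Methods. The sketch assumes that the fold-change threshold means a fold change of at least 1.0; the gene-rank criterion is omitted because ranks are not reported in the text, and GATA1 is omitted because no p value is given for it. The names `ONCOMINE_RESULTS` and `overexpressed` are hypothetical.

```python
# (fold change, p value) as reported in the text.
ONCOMINE_RESULTS = {
    "GATA2": (1.211, 9.89e-6),
    "GATA3": (1.138, 1.48e-7),
    "GATA4": (1.201, 6.23e-5),
    "TRPS1": (1.269, 4.00e-5),
    "GATA5": (-2.311, 0.996),
    "GATA6": (-2.529, 1.000),
}


def overexpressed(results, min_fold_change=1.0, max_p=0.05):
    """Return genes passing both thresholds (assumed: fold change >= 1.0 and p < 0.05)."""
    return sorted(
        gene
        for gene, (fc, p) in results.items()
        if fc >= min_fold_change and p < max_p
    )


print(overexpressed(ONCOMINE_RESULTS))
```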

The protein expression levels of GATA family members in OC
To further investigate and validate the protein expression levels of GATA family members in OC, we examined immunohistochemistry data for GATA family members in the HPA database. With the exception of GATA5, the protein expression of the other 6 family members in ovarian cancer is clearly displayed in the HPA database. As shown in Fig. 3, apart from GATA4, which stained strongly in both normal and cancerous ovarian tissues, most GATA family members showed low expression in normal ovarian tissues but moderate to high expression in OC tissues. Analysis of the immunohistochemistry images indicated that the protein expression of GATA1, GATA2, GATA3, GATA4 and TRPS1 was also upregulated in OC tissues compared with the corresponding normal tissues.

Prognostic values of GATA family members in OC patients
We examined the prognostic value of the mRNA expression of individual GATA family members in OC patients using www.kmplot.com. Five members were significantly associated with prognosis in OC (Fig. 4). When multiple probes corresponded to the same GATA family member, we chose the probe with the largest sample size as the target probe for further analysis. We observed that high expression of GATA1, GATA2 and GATA4 was significantly correlated with better overall survival (OS), while increased GATA3 and GATA6 expression was associated with worse prognosis in OC patients. The mRNA levels of GATA5 and TRPS1 were not significantly correlated with OS, although the expression of GATA5 showed a borderline association with survival (hazard ratio [HR] = 0.82, 95% confidence interval [CI]: 0.67-1.00, p = 0.0551). The prognostic values of GATA family members were also assessed in different pathological histology subtypes of OC, including serous and endometrioid. As shown in Table 1, high mRNA expression of GATA4 was correlated with longer OS, whereas increased GATA6 and TRPS1 mRNA expression were correlated with better OS in serous OC patients. In endometrioid OC, increased GATA6 expression was associated with better prognosis. The remaining GATA family members were not significantly associated with prognosis in serous or endometrioid OC. In parallel, OncoLnc analysis demonstrated that abnormal expression of GATA2 and GATA4 was correlated with OS in OC patients (log-rank P = 0.045 and 0.042, respectively). However, the expression of the other GATA family members was not statistically associated with the prognosis of patients with OC (Supplemental Information 1).
We further assessed the relationship between individual GATA family members and other clinicopathological features, namely pathological grade (Table 2), clinical stage (Table 3) and TP53 status (Table 4), in OC patients. As shown in Table 2, high mRNA expression of GATA3 was associated with worse OS in pathological grade I + II OC patients. In pathological grade III + IV OC patients, elevated mRNA expression of GATA1, GATA2 and GATA4 was associated with better OS, whereas high GATA5 and TRPS1 mRNA expression was linked to poor OS. As shown in Table 3, only increased expression of GATA3 and GATA5 was associated with worse OS in clinical stage I patients. For clinical stage II OC patients, only high expression of GATA4 was associated with better OS. In clinical stage III OC patients, high expression of GATA2, GATA4 and GATA5 ... Table 4 shows the correlation between GATA family member expression and TP53 status. High expression of GATA1, GATA2, GATA3, GATA6 and TRPS1 was associated with poor OS in OC patients harboring mutated TP53. In contrast, increased GATA2 and GATA3 mRNA expression was linked to better prognosis, and high expression of GATA6 was associated with worse OS, in OC patients with wild-type TP53.

Functional enrichment analysis of GATA family members in patients with OC
The functions of GATA family members and their neighboring genes were predicted by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis in Metascape. As shown in Fig. 6A-D and Table 5, the GO enrichment items were classified into three functional groups: biological process, molecular function and cellular component. The GATA family members and their neighboring genes were mainly enriched in heart development, embryonic organ development, regulation of binding, response to wounding, endocrine system development, regulation of Notch signaling pathway, muscle cell differentiation, regulation of hemopoiesis, regulation of stem cell differentiation, cardiac muscle hypertrophy, cytokine production, animal organ formation, muscle cell development, cellular response to hormone stimulus and response to heat. The molecular functions of these genes were mainly transcription regulatory region sequence-specific DNA binding, transcription factor binding and carbonate dehydratase activity. The ... Fig. 6D and Table 5. Among these pathways, the Notch signaling pathway, Th1 and Th2 cell differentiation and the Hippo signaling pathway have been related to the development of multiple tumors and may be involved in OC tumorigenesis and pathogenesis.
In addition, to better understand the relationship between GATA family members and OC, we performed a Metascape protein-protein interaction (PPI) enrichment analysis and a module analysis of the PPI network. The PPI network and the MCODE components identified in the gene lists are shown in Fig. 7A-D. The PPI network was significantly associated with heart development, embryonic organ development and chordate embryonic development, while in the three significant modules, GO term enrichment analysis of biological processes showed that the genes were mainly associated with ATP-dependent chromatin remodeling, histone deacetylation, protein deacetylation, chordate embryonic development, embryo development ending in birth or egg hatching and in utero embryonic development.
The log-rank test indicated no significant difference in OS between cases with alterations in one of the query genes and cases without alterations in any query gene (P = 0.0651). Likewise, the Kaplan-Meier plotter and log-rank test indicated no significant difference in DFS or PFS between cases with alterations in one of the query genes and cases without alterations in any query gene (P = 0.0736).

Discussion
The GATA family members have been widely recognized as pivotal transcription factors in the development and differentiation of various cell types in vertebrates. Increasing evidence has shown that altered expression of GATA factors plays an important role in dedifferentiation during ovarian carcinogenesis. However, the exact role of GATA expression in OC is still controversial. In the current study, we comprehensively examined the expression patterns and prognostic value of individual GATA family members in OC using the Oncomine database, the CCLE database, the KM plotter, cBioPortal and Metascape. Our analysis suggested that, among the members of the GATA family, GATA1, GATA3, GATA4 and TRPS1 mRNA expression was significantly higher in OC than in normal samples. In the CCLE analysis, the mRNA expression levels of GATA1 and GATA2 in OC ranked moderately high among all cancer types. More importantly, survival analysis indicated that high expression of GATA1, GATA2 and GATA4 was significantly correlated with better OS, while increased GATA3 and GATA6 expression was associated with worse prognosis in OC patients. We further assessed the prognostic value of GATA members according to the pathological grade, clinical stage and TP53 mutation status of OC patients. The results showed that GATA1, GATA2, GATA3 and GATA6 were closely related to the different clinicopathological features and treatment of OC. We then systematically explored the genetic alterations, correlations and potential functions of GATA family members in OC. Our findings suggested that the genetic variation and interaction of the GATA family may be closely related to the pathogenesis and prognosis of OC, and that the regulatory network composed of GATA family genes and their neighboring genes is mainly involved in the Notch signaling pathway, Th1 and Th2 cell differentiation and the Hippo signaling pathway.
GATA1, the first recognized member of the GATA family, is essential for erythropoiesis, megakaryocyte maturation and eosinophil production [34]. Observations in human patients have confirmed the critical role of GATA1 in erythroid and megakaryocyte development, and GATA1 mutations may be closely related to two neoplastic diseases: transient myeloproliferative disorder and acute megakaryoblastic leukemia [35]. However, its role in solid tumors has not yet been fully elucidated [36]. Our results demonstrated that increased expression of GATA1 was correlated with significantly better OS for all OC patients, but not for patients with the serous or endometrioid subtype. This may be due to the small sample size of these two subtypes. Two previous studies found that GATA1 and its phosphorylation may play an important role in the metastasis of breast cancer, and that GATA1 can be used as an independent prognostic marker for breast cancer [37,38]. To our knowledge, however, no molecular biology studies have directly explored the prognostic value of GATA1 in OC. This study further shows that high expression of GATA1 indicated better OS for OC patients with high stage (III + IV) disease. Furthermore, genetic alterations in GATA1 were found in 11% of OC cases in the TCGA Provisional dataset, and GATA1 showed a significant negative correlation with GATA2 and GATA6 in Pearson correlation analysis. Owing to the lack of related research, our conclusions on GATA1 need to be further confirmed.
GATA2 has been identified as a critical regulator of the growth, differentiation and survival of hematopoietic stem cells [39,40]. Increasing evidence has shown that GATA2 expression is correlated with hematologic pathophysiologies and with the proliferation and progression of solid tumors [40]. Upregulated GATA2 expression has been implicated in several tumor types, such as breast cancer [41], colorectal cancer [42] and liver cancer [43]. Moreover, recent studies confirmed that GATA2 overexpression in prostate cancer increases cellular motility and invasiveness, proliferation, tumorigenicity and resistance to standard therapies [40]. In our study, high expression of GATA2 was significantly associated with better OS, especially in pathological grade III + IV OC patients. In addition, increased GATA2 expression was linked to better prognosis in OC patients with wild-type TP53 in our analysis.
GATA3 is a "master regulator" in both mouse and human development that plays a critical role in multiorgan development and regulates tissue-specific cellular differentiation [44]. It is reported to be abnormally expressed in breast and urothelial carcinomas and hence has been used as a marker and extensively investigated in these cancers [44,45]. Recent evidence suggests that GATA3 is a strong and independent predictor of clinical outcome in human luminal breast cancer [16,46]. Lower GATA3 expression is strongly associated with higher histologic grade, poor differentiation, positive lymph nodes, estrogen receptor (ER)-negative and progesterone receptor (PR)-negative status, HER2/neu overexpression and all other indicators of poor prognosis [46]. The role of GATA3 in the pathogenesis of OC, however, remains unclear [47]. Our analysis showed that overexpression of GATA3 was associated with worse prognosis in OC patients, especially in early clinical stages, in patients undergoing optimal surgery and in two pathological types of OC.
GATA4, GATA5 and GATA6 are expressed predominantly in endoderm- and mesoderm-derived tissues [10]. With regard to intestinal cell types, it has been suggested that GATA4 and GATA5 tend to mark fully differentiated epithelial cells [48], while GATA6 is expressed in the immature proliferating cells of the intestinal crypts [49]. Thus, GATA4 and GATA5 are currently considered potential tumor suppressors, whereas GATA6 is regarded as a potential oncogene [6]. Altered expression of GATA4, GATA5 and GATA6 is associated with a broad range of tumors arising from the gastrointestinal tract [50], lungs [51] and brain [52]. Moreover, some studies reported that methylation in the GATA4 and GATA6 promoter regions could play an important role in ovarian carcinogenesis, and that elevated GATA4 and lower GATA6 mRNA levels are associated with better prognosis in ovarian tumors [21,22,25]. We found a similar result: high GATA4 expression was related to better prognosis in OC patients, and increased GATA6 expression was associated with worse prognosis. Although several studies have shown that the expression and methylation status of GATA5 may be involved in ovarian carcinogenesis, the biologic role and prognostic effect of GATA5 in OC patients are still poorly understood. Our study suggests that GATA2 is significantly positively correlated with GATA4 and GATA5, and that genetic alterations in GATA5 occur in 10% of OC cases in the TCGA dataset. However, the expression level of GATA5 was not related to OS in OC.

Conclusion
In conclusion, among the members of the GATA family, GATA1, GATA3, GATA4 and TRPS1 mRNA expression was significantly higher in OC than in normal samples. High expression of GATA1, GATA2 and GATA4 was significantly correlated with better OS, while increased GATA3 and GATA6 expression was associated with worse prognosis in OC patients. The genetic variation and interaction of the GATA family may be closely related to the pathogenesis and prognosis of OC, and the regulatory network composed of GATA family genes and their neighboring genes is mainly involved in the Notch signaling pathway, Th1 and Th2 cell differentiation and the Hippo signaling pathway. Transcriptional levels of GATA1/2/3/4/6 could serve as prognostic markers and potential therapeutic targets for OC patients.