Construction autophagy-related prognostic risk signature to facilitate survival prediction, individual treatment and biomarker excavation of epithelial ovarian cancer patients

Background Existing clinical methods for prognosis evaluating for Epithelial Ovarian Cancer (EOC) patients had defects of invasive, unsystematic and subjective and little data are available for individualizing treatment, therefore, to identify potential prognostic markers and new therapeutic targets for EOC is urgently required. Results Expression of 232 autophagy-related genes (ARGs) in 354 EOC and 56 human ovarian surface epithelial specimens from 7 independent laboratories were analyzed, 31 mRNAs were identified as DEARGs. We did functional and pathway enrichment analysis and constructed protein–protein interaction network for all DEARGs. To screen out candidate DEARGs related to EOC patients’ survival and construct an autophagy-related prognostic risk signature, univariate and multivariate Cox proportional hazards models were established separately. Finally, 5 optimal independent prognostic DEARGs (PEX3, DNAJB9, RB1, HSP90AB1 and CXCR4) were confirmed and the autophagy-related risk model was established by the 5 prognostic DEARGs. The accuracy and robustness of the prognostic risk model for survival prediction were evaluated and verified by analyzing the correlation between EOC patients’ survival status, clinicopathological features and risk scores. Conclusions The autophagy-related prognostic risk model can be independently used to predict overall survival in EOC patients, it can also potentially assist in individualizing treatment and biomarker development.


Introduction
Ovarian cancer (OC) has the highest morbidity and mortality in the female genital tract [1]. It is the fifth most frequent cause of cancer death in women in the United States in 2020 which results in the death of 5% women with cancer, and 5-year survival rates are 39-48% of all the women diagnosed with ovarian cancer [2,3]. Epithelial ovarian cancer (EOC) is the most common type of OC which accounts for almost 90% of all ovarian cancers [4,5]. It generally presents at an advanced stage in over 70% of patients contributing to a high death rate, where the long term survival rate (10 years) is estimated at 15-30% [6,7]. The prognosis of EOC is associated with many factors such as histological type, pathological stage, age, early recognition, the volume of ascites and so on [8,9], existing clinical methods for prognosis evaluation still had defects of invasive, unsystematic and subjective. It is necessary to refine the prognostic model of EOC and establish a more accurate method to managing this high-mortality disease.
Establishing prognosis models for EOC patients is an important part of risk evaluation and treatment, can also greatly assists in biomarker development [10]. From clinically applicability, prognosis model is a handy tool to estimate overall survival and risk of recurrence, and can potentially help in individualizing treatment for patients [11]; From research development, it can contribute to identify subgroups of patients with unfavorable prognosis and promote us to explore alternative treatment strategies for these patients, and provide an idea for targeting therapy [12]. The present study focuses on constructing a prognosis model for EOC by molecular typing methods using large databases.
Autophagy is a homeostatic mechanism that can maintain cell survival by recycling organelles and macromolecules [13]. Various genes named because of participating in autophagy, they were called Autophagy-Related Genes (ARGs) [14]. Accumulating data suggest that autophagy dysregulation in EOC cells caused dormancy and chemo −/radio-therapy resistance, and the process involving proteins (mainly ARGs encoding proteins) are being considered as anticancer molecular therapeutic targets [15][16][17][18]. However, no prior study used the large-scale expression patterns for assessing the autophagy effect on EOC prognosis. So, this study makes use of ARGs to construct the prognostic risk signature of EOC. Through this novel prognostic risk model, we expect to shed light on prognosis evaluation and targeted treatment of EOC.

Data acquisition
The gene expression profiling data sets (ID: GSE14407, GSE6008, GSE14001, GSE26712, GSE29450, GSE38666, GSE105437) were obtained from Gene Expression Omnibus database (https://www.ncbi.nlm.nih.gov/geo/). The brief information of 7 GEO datasets [containing 410 human ovarian surface epithelial (HOSE) and epithelial ovarian cancer (EOC) specimens] from 7 independent laboratories was extracted and listed in Table 1. The RNA-seq data and the corresponding clinical data of 379 EOC patients were downloaded from the TCGA database (https://portal.gdc.cancer.gov/). We downloaded 232 genes identified so far to be involved in autophagy from the Human Autophagy Database (HADb).

Differentially expressed autophagy-related genes (DEAR Gs) screening
Data preprocessing was performed before difference analysis for 7 independent GEO datasets. We did batch normalization for all expression profiling data through ComBat algorithm in R to eliminate the systematic variations among different studies. The differentially expressed autophagy-related genes (DEARGs) between 354 EOC tissues and 56 HOSE tissues were screened out by the Wilcoxon signed-rank test. The cutoff criteria were adjusted p-value < 0.05 and |log 2 FoldChange| (|log 2 FC|) > 1.

Protein-protein interaction (PPI) network construction for all DEARGs
PPI analysis is a protein correlation analysis that can effectively reveal the molecular mechanisms of crucial cellular activities in carcinogenesis. It is constructed based on the STRING database (https://string-db.org/). The PPI network was constructed for all DEARGs and visualized with the cut-off criterion of interaction score > 0.4. To visualized the PPI network and highlighted the hub genes, we make use of the Cytoscape software to perform deeply biological network analysis.

Functional and pathway enrichment analysis for all DEAR Gs
Gene Ontology (GO) analysis can annotate characteristics of a set of genes, such as involved cellular components (CC), molecular functions (MF) and biological processes (BP). Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis was used to reveal the involved pathways of all DEARGs. These analyses were done by clusterProfiler package of R with p value < 0.05 as a strict cutoff.

Identify the prognostic DEARGs
To identify DEARGs whose expression profiles had a significant correlation with the overall survival (OS) of patients with EOC, we performed the univariate Cox regression model. The DEARGs with the threshold of P < 0.05 were regarded as candidate genes related to EOC patients' survival.

Construction of OS risk prognostic model and risk score calculation
The survival-related DEARGs screened out through the univariate Cox proportional hazards model were incorporated into a multivariate Cox regression analysis to weeded out the DEARGs which might not be an independent index in prognosis monitoring. Then, we can obtain a set of optimal prognostic DEARGs, and establish a risk score model using these genes. We can get the risk score for each EOC patient through the followed formula, the risk score ¼ The risk score was calculated using the gene expression value multiplied by the regression coefficients of each individual gene. The genes in the formula mean the independent prognostic DEARGs we screened before. The regression coefficients were obtained from the multivariate Cox regression model and stand for the relative weight of selected genes. It is a measure of prognostic risk for EOC patients. With the median risk score as the cutoff value, patients were divided into low-risk group and high-risk group. A high-risk score means a poor prognosis.

Evaluation of OS risk prognostic model
We draw the survival curves by Kaplan-Meier plotter and assessed the difference in the survival status between the high-risk group and low-risk groups to verify the validity and robustness of the OS risk prognostic model. Then, the relationship between risk score and survival status of EOC patients was visualized. We studied genetic alterations of 5 independent prognostic DEARGs (PEX3, DNAJB9, RB1, HSP90AB1 and CXCR4) in an Ovarian Cystadenocarcinoma case set using an open-access tool cBioPortal (http://www.cbioportal.org).
What's more, we performed Cox proportional hazard regression to evaluate whether the OS risk prognostic model constructed with DEARGs could independent of other clinicopathological features as a predictor of OS for EOC patients. The correlation between the risk score and clinical traits was analyzed by univariate and multivariate Cox regression methods. Age, pathological stage and histological grade were used as candidate clinicopathological risk factors. We want to explore whether clinicopathological risk factors and risk scores calculated based on prognostic DEARGs can all be indicators in prognosis monitoring for EOC patients, and to confirm whether the risk score could be an independent prognosis factor.

Identification of differentially expressed autophagyrelated genes (DEARGs)
The expression values of 13,045 genes in 410 samples (354 EOC and 56 HOSE specimens) were normalized with R package and showed in Fig. 1a. Horizontal axis and vertical axis represents 410 samples and all gene expression value, respectively. After batch normalization, the median of expression value of samples is almost the same, proved normalized data were qualified. The expression values of 232 ARGs were extracted. Considering adjust p-value < 0.05 and |log 2 FC| > 1 as the filter criteria, 31 DEARGs (14 downregulated ARGs and 17 upregulated ARGs) were obtained. We list the details such as log 2 FoldChange and statistical significance of all DEARGs in Table 2. In Fig. 1b, the fold change patterns of 31 DEARGs were showed in a heat map. The expression of 31 DEARGs between EOC tissues and HOSE tissues was visualized by scatter plots (Fig. 1c).

PPI network construction and functional annotation of DEARGs
We displayed the distribution of all DEARGs with a volcano plot (Fig. 2a). A PPI network of DEARGs was constructed with STRING database and visualized the interaction of the PPI network using Cytoscape software, there are 12 hub genes with interaction degree > 5 showed as a diamond and arranged in a circle (Fig. 2b). GO analysis shows that DEARGs are enriched in several hypoxia-related biological processes (BP), including autophagy, macroautophagy, response to hypoxia and response to oxygen levels. Regarding the molecular function (MF), the DEARGs played vital parts in some functions, such as ubiquitin-like protein ligase binding, chaperone binding and heat shock protein binding. In terms of the cellular components (CC), the proteins encoded by the DEARGs are components of melanosome, pigment granule and secretory granule lumen (Fig. 2c). KEGG analysis showed the main pathways that DEARGs involved containing autophagy, and protein processing in the endoplasmic reticulum (Fig. 2d).

Establishment of autophagy-related signature
Altogether RNA-seq and clinical data of 379 EOC tissue specimens were obtained from the TCGA database and 374 EOC tissue specimens contained complete clinical follow-up information were subjected to univariate Cox regression analyses to evaluate the association between 31 screened DEARGs' expression profiles and overall survival in 374 EOC patients. The results of univariate Cox regression analyses revealed that 9 DEARGs were significantly associated with the prognosis of EOC patients (p < 0.05) (Fig. 3a). To improve the validity and robustness, 9 prognostic-related DEARGs obtained from the above univariate analysis were further subjected to the multivariate Cox regression analysis. Finally, 5 DEAR Gs (PEX3, DNAJB9, RB1, HSP90AB1 and CXCR4) were screened out as optimal independent prognosis biomarkers and applied to construct an autophagy-related risk model (Fig. 3e). The risk score of OS for EOC patients was calculated according to the following formula: risk score = (0.3719 ✕ expression value of PEX3) + (− 0.4819 ✕ expression value of DNAJB9) + (0.3235 ✕ expression value of RB1) + (− 0.4574 ✕ expression value of HSP90AB1) + (− 0.1754 ✕ expression value of CXCR4). The risk scores of 374 EOC patients were calculated through the above formula, and patients were divided into high-risk (n = 187) and low-risk group (n = 187) with the median of the risk score as the cutoff value.

Validation of the risk signature
The OncoPrint in cBioPortal is a compact and efficient graphical summary of genomic alterations in tumor specimens. According to previous results of DEARGs screening and autophagy-related signature establishment, the risk score calculated based on the expression of PEX3, DNAJB9, RB1, HSP90AB1 and CXCR4 can be a predictor of OS for EOC patients independent of other clinicopathological parameters. We analyzed and visualized genomic alterations of 5 prognostic DEARGs by cBioPortal in ovary carcinoma cases (Fig. 4a). Kaplan-Meier plot was drawn to compare the OS difference between high-risk and low-risk group. EOC patients in the low-risk group had obviously better survival outcomes than the high-risk group (p = 9.606E-07) illustrated that the risk score based on the autophagy-related risk model correlated with EOC patients' prognosis (Fig. 4b). The size of nodes is inversely related to p-value. Diamond nodes are hub genes whose interactive protein more than 5. c Gene Ontology analysis of 31 DEARGs. d Kyoto Encyclopedia of Genes and Genomes analyses of 31 DEARGs showed the involved signaling pathways Figure.4c visualized the risk scores from low to high. The mortality of EOC patients increased as the risk score increased (Fig. 4d). The heatmap in Fig. 4e displayed the expression patterns of 5 prognostic DEARGs in different risk groups.

Clinical utility of prognostic signature
The univariate and multivariate Cox proportional hazard regression analyses were performed to determine the correlation between the risk score and clinicopathological features, and to further evaluated whether the autophagy-related risk prognostic model constructed with DEARGs could independent of other clinicopathological features as a predictor of OS for EOC patients. There is no difference of risk scores between age > 65 and age ≤ 65 (p = 0.211) (Fig. 5a) or pathological stage IIIC-IV and pathological stage I-IIIB (p = 0.195) (Fig.   5b), although elder patients seemed had a higher risk score than younger patients and high pathological stage patients seemed had a higher risk score than low pathological stage patients. Risk score was higher in histological grade G3-4 than in G1-2 (p = 0.011) (Fig. 5c).
Obviously, the majority of EOC patients are diagnosed at a late stage, so most patients are diagnosed with pathological stage III or histological grade 3-4. The association between the expression level of 5 prognostic DEARGs which used to construct the risk model and clinical pathological parameters of EOC are shown in Fig. 5a, b, c.
In Table 3, univariate and multivariate Cox regression analysis showed that age and risk score was significantly correlated with OS of EOC, risk score can be an independent factor for the OS of EOC. These results confirmed that the autophagy-related prognostic signature   can be an independent indicator for prognosis monitoring for EOC patients.

Discussion
Autophagy is a eukaryotic cellular degradation and recycling process which is highly conserved [19]. Many studies had confirmed that autophagy plays a significant role in EOC [20][21][22][23]. Some pharmaceutical agents targeting autophagy have been proved effective in EOC patients [24,25]. Up to now, histological grade and stage remain the strongest prognostic evaluation tools in EOC [26]. Make use of a prognostic model which constructed based on the specific molecule can improve the understanding of the molecular mechanism of EOC, aid the development of more specific therapies, and identify novel biomarkers [27,28]. So, we construct autophagyrelated prognostic signature with DEARGs to predict the prognosis of EOC patients.
According to our study, an autophagy-related independent prognostic signature is constructed by 5 ARGs with different coefficients, including PEX3, DNAJB9, RB1, HSP90AB1 and CXCR4. There were only a few researches reported a relationship between the 5 ARGs and cancers, Daniela et al and Shaobo et al reported PEX3 plays an important role in Melanoma [29] and colon cancer [30] respectively. DNAJB9 is a known negative feedback regulator of the tumor suppressor gene p53 in nongestational choriocarcinoma [31] and can reduce chemotherapy resistance in acute myeloid leukemia [32]. Guang et al reported that RB1 is a tumor suppressor in OC [33]. HSP90AB1 previously known as heat shock 90-kDa protein 1, beta, its expression is relatively stable in ovarian tissues [34]. As a chemokine receptor, blocking the CXCR4/ CXCL12 signal could be a potential therapy for EOC patients [35]. The results suggest that most autophagyrelated independent potential prognostic markers were identified and verified in other cancers except EOC, hence identifying potential prognostic markers and new therapeutic targets for EOC patients is essential.
There are 374 EOC patients' information with gene expression and survival data were obtained from TCGA to established an autophagy-related prognostic model, and only 364 patients with complete clinical information of age, pathological stage and histological grade. We verified the relationship between autophagy-related prognostic signature and their clinicopathological features. The 5 ARGs are not all associated with clinicopathological features of EOC patients, but the risk score calculated based on 5 prognostic DEARGs is significantly related to the histological grade of EOC patients. It is important to notice that among 364 EOC patients, only 43 patients were subjected to G1-G2, 321 patients were subjected to G3-G4. The same is only 22 patients diagnosed in stage I-II, the rest 342 patients all diagnosed in stage III-IV. Many patients diagnosed at an advanced stage disturbed the analysis for the relationship between the risk model and clinical information. Even so, we still can find a trend of risk score was higher in pathological stage IIIC-IV than in stage I-IIIB. Several assessment methods confirmed that the prognostic model can be an independent indicator for prognosis monitoring for EOC patients. Clinical information of EOC patients reminded us that early diagnosis is urgently needed for EOC patients. Our study shed light on finding the diagnosis and targeted treatment biomarkers in EOC patients.
Pathway enrichment analysis showed that 31 DEARGs were mainly involved in hypoxia or oxygen related pathways. Kalpana et al reported that hypoxia-induced carcinoma progression and metastasis, and drug resistance are serious problems for EOC treatment in clinical [36]. Edith et al proposed that OC metastasizes and recurs all in a unique hypoxia microenvironment in the abdominal cavity [37]. Hence, the affection of hypoxia to EOC patients should be evaluated and managed cautiously.
Our study developed a risk prognostic model to predict individuals' clinical outcomes, molecular signature combined with clinical features make the model steady and credible. Our results proved that the risk model constructed by 5 DEARGs PEX3, DNAJB9, RB1, HSP90AB1 and CXCR4 is clinically practicable to evaluated prognosis for EOC patients. Besides traditional clinicopathological indicators (including pathological stage and histological grade), risk scores based on the autophagy-related genes signature could also be applied in clinical to provide a handy and better prognosis monitoring. The DEARGs can also help facilitate personalized target treatment and early screening.