Development and validation of circulating CA125 prediction models in postmenopausal women

Background Cancer Antigen 125 (CA125) is currently the best available ovarian cancer screening biomarker. However, CA125 has been limited by low sensitivity and specificity, in part due to normal variation between individuals. Personal characteristics that influence CA125 could be used to improve its performance as a screening biomarker. Methods We developed and validated linear and dichotomous (≥35 U/mL) circulating CA125 prediction models in postmenopausal women without ovarian cancer who participated in one of five large population-based studies: the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO, n = 26,981), the European Prospective Investigation into Cancer and Nutrition (EPIC, n = 861), the Nurses’ Health Studies (NHS/NHSII, n = 81), and the New England Case Control Study (NEC, n = 923). The prediction models were developed using stepwise regression in PLCO and validated in EPIC, NHS/NHSII and NEC. Results The linear CA125 prediction model, which included age, race, body mass index (BMI), smoking status and duration, parity, hysterectomy, age at menopause, and duration of hormone therapy (HT), explained 5% of the total variance of CA125. The correlation between measured and predicted CA125 was comparable in the PLCO testing dataset (r = 0.18) and the external validation datasets (r = 0.14). The dichotomous CA125 prediction model included age, race, BMI, smoking status and duration, hysterectomy, time since menopause, and duration of HT, with an AUC of 0.64 in PLCO and 0.80 in the validation dataset. Conclusions The linear prediction model explained a small portion of the total variability of CA125, suggesting the need to identify novel predictors of CA125. The dichotomous prediction model showed moderate discriminatory performance, which validated well in an independent dataset. Our dichotomous model could be valuable in identifying healthy women who may have elevated CA125 levels, which may contribute to reducing false positive tests using CA125 as a screening biomarker.


Background
Cancer antigen 125 (CA125) is a high molecular-weight glycoprotein (MUC16) normally expressed on tissues derived from the coelomic and Müllerian epithelial cells and aberrantly expressed on a variety of cancers, including breast, lung, leukemia, gastric, and ovarian cancer [1][2][3]. CA125 levels are elevated in more than 80% of ovarian cancer cases and have proven utility in assessing response to therapy and prognosis [4].
While CA125 remains the most promising biomarker for ovarian cancer screening, results from two large randomized trials comparing combined CA125 and transvaginal ultrasound (TVUS) to usual care did not show significant improvement in overall survival in the screened group [5,6]. In the United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS), stage of ovarian cancer diagnosis was earlier in the screened group, but there was no clinically significant reduction in overall mortality [6]. The Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO) showed no difference in ovarian cancer mortality between women screened with CA125 and TVUS and women who received usual clinical care [5].
CA125 has been limited as an ovarian cancer screening biomarker by low sensitivity and specificity in part due to variation associated with differences in personal characteristics, such as age, hormone use, and menopausal status [6][7][8][9][10]. Identifying factors that influence CA125 levels in healthy individuals could be used to create personalized thresholds for CA125, thereby improving its performance as an ovarian cancer screening biomarker. Here we developed and validated two prediction models (linear and dichotomous) of circulating CA125 levels among postmenopausal women without ovarian cancer who had participated in one of five large population-based studies.

Study population

PLCO
The Prostate, Lung, Colorectal and Ovarian Cancer (PLCO) Screening Trial was designed to determine the efficacy of screening in reducing mortality from the four cancers named in its title [11]. Briefly, from 1993 to 2001, 155,000 healthy subjects, including 78,214 women ages 55-74, were recruited from 10 study sites across the U.S. and randomized to screening (the intervention arm) or usual care (the control arm). The screening intervention consisted of CA125 measurements and transvaginal ultrasound at baseline and at each of six annual screenings. For the purpose of this analysis, we used only the baseline CA125 measurements. Data on demographic and lifestyle factors were collected by questionnaires administered at baseline. Among a total of 78,214 participants, we excluded women from the control arm (n = 34,304), as well as those with no ovaries at baseline (n = 9658), a prior diagnosis of ovarian, fallopian or peritoneal cancer (n = 1), missing CA125 measurements at baseline (n = 5624), missing baseline questionnaire data (n = 51), a diagnosis of ovarian cancer or loss to follow-up within 3 years of baseline (n = 535), and those missing information on candidate predictor variables of CA125 (n = 1060). After these exclusions, data from 26,981 PLCO participants were available for this analysis.

EPIC
The European Prospective Investigation into Cancer and Nutrition (EPIC) study is a prospective cohort established between 1992 and 2000 [12]. Briefly, 519,978 participants, including 366,521 women, recruited from 23 research centers in 10 European countries, completed questionnaires on lifestyle, medical and dietary factors. Most participants (74%) provided a blood sample at baseline. Within this cohort, a nested case-control study of ovarian cancer was designed by matching each ovarian cancer case (n = 810) with up to four controls using incidence density sampling [13]. Among 1939 available controls, we defined postmenopausal women as those who met one of the following criteria at the time of blood draw: not on hormones and had not menstruated in the year prior to blood draw; on hormones and age 50 or greater; age at last menstruation was missing and age 50 or greater; had a hysterectomy and age 50 or greater at the time of blood draw [7]. We excluded premenopausal women (n = 485), women whose menopausal status could not be determined using the algorithm above (n = 26), women without an available CA125 measurement (n = 12), and those missing information on candidate predictor variables of CA125 (total n = 555), leading to a total study population of 861 EPIC participants for this analysis.

NHS/NHSII
The Nurses' Health Study (NHS) is a prospective cohort established in 1976 when 121,700 registered nurses residing in 11 U.S. states were enrolled to investigate the long-term health outcomes of various contraceptive methods in women [14]. Nurses' Health Study II (NHSII) is a prospective cohort established in 1989 when 116,429 nurses residing in 14 states were enrolled to study the association between oral contraceptives, diet, and lifestyle factors and long-term outcomes [15]. Participants answered baseline and biennial follow-up questionnaires about a variety of lifestyle, reproductive and medical characteristics. Blood samples were collected at two time points in both NHS (1989-1990, 2000-2002) and NHSII (1996-1999, 2010-2011). Among women with available blood samples, CA125 was measured in 152 NHS participants and 50 NHSII participants with no evidence of ovarian cancer, for a total of 202 women. We restricted to postmenopausal women, defined as those who had not had a menstrual period within the 12 months before blood draw. We excluded premenopausal women (n = 47), those with unknown menopausal status (n = 14), and those missing information on candidate predictor variables of CA125 (n = 60), resulting in a final dataset of 81 NHS/NHSII participants for this analysis.

NEC
The New England Case Control Study (NEC) is a population-based ovarian cancer case-control study that enrolled participants from New Hampshire and Eastern Massachusetts over three study phases (1992-1997, 1998-2002, 2003-2008) [16]. Briefly, a total of 2075 epithelial ovarian cancer cases and 2100 controls, frequency matched on age and state of residence, participated. All the participants were interviewed in person about lifestyle factors, and medical and reproductive history. Over 95% of the study participants provided blood specimens at enrollment. Of 2100 controls, we restricted to postmenopausal women defined as: not on hormones and self-reported their menstruation had stopped or were regularly bleeding because of menopausal hormone use, were not menstruating because of hysterectomy or a medical condition/treatment and age at blood draw was 50 or greater. We excluded premenopausal women (n = 885), women without CA125 values (n = 95), and those missing information on candidate predictor variables of CA125 (n = 197), resulting in a total population of 923 healthy women for this analysis.

CA125 predictor variables
Candidate predictors of CA125 were selected for this analysis based on previously published reports [6][7][8][9][10]. These included age at blood draw, race, body mass index (BMI, calculated as kg/m²), smoking status and pack-years (calculated as the number of packs of cigarettes per day multiplied by the number of smoking years), age at menarche, use of oral contraceptives (OC), parity, ovarian cysts, self-reported endometriosis, hysterectomy, age at menopause, time since menopause, hormone therapy (HT) use and duration, family history of ovarian cancer in first-degree relatives, and previous history of cancer.
We first developed the prediction models in PLCO using the candidate predictors above and then harmonized the selected final predictors across all studies so that the categorization of the variables matched that used in PLCO. Information on the predictor variables listed above was collected by questionnaire in all five studies. For PLCO, EPIC, and NEC, predictor variables and blood samples were obtained at baseline. For NHS/NHSII, age and weight were obtained from the questionnaire administered at the time of blood draw and other predictor variables were obtained from the most recent biennial questionnaire prior to the blood collection.
Smoking duration was defined by pack-years, calculated separately for current smokers and former smokers, across all studies. Age at menopause was defined as the self-reported age at the last menstrual period in all studies. For women who had a hysterectomy and were missing age at menopause, age at menopause was excluded. Time since menopause was calculated by subtracting age at menopause from age at blood draw.
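The derived predictors described above (BMI, pack-years, and time since menopause) follow directly from their stated definitions. The following is a minimal sketch of those calculations; the function and argument names are hypothetical and are not taken from the study code, and returning `None` when age at menopause is unavailable is an assumption about how a missing value might be propagated.

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Packs of cigarettes per day multiplied by the number of smoking years."""
    return packs_per_day * years_smoked


def time_since_menopause(age_at_blood_draw: float,
                         age_at_menopause: Optional[float]) -> Optional[float]:
    """Age at blood draw minus age at menopause.

    Assumption: a missing age at menopause yields a missing result.
    """
    if age_at_menopause is None:
        return None
    return age_at_blood_draw - age_at_menopause
```

For example, a woman who smoked 1.5 packs per day for 20 years has 30 pack-years, and a woman aged 62 at blood draw whose last menstrual period was at age 50 has 12 years since menopause.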

CA125 measurements
In PLCO, CA125 was measured using the CA-125II radioimmunoassay (Centocor) with an upper limit of normal (ULN) of 35 U/mL, described in detail elsewhere [8]. The coefficients of variation (CV) were 4.1% at a CA125 level of 52.7 U/mL, and 3.8% at a CA125 level of 106.5 U/mL [5]. In NEC and NHS/NHSII, CA125 was measured using the CA-125II radioimmunoassay (Centocor) at the CERLab at Boston Children's Hospital. The reproducibility of the assay was evaluated by including five blinded aliquots of a uniform quality control pool in each of the 46 test batches (CV = 1%). In EPIC, CA125 was measured using a volume-effective highly sensitive multiplex platform (Meso Scale Discovery, MSD) in the Genital Tract Biology Laboratory at Brigham and Women's Hospital, with a ULN of 55 U/mL. The CV for unblinded quality control samples on each assay plate was 8.4% [13].

Statistical analysis
CA125 levels were log-transformed to achieve normality in all of the analyses.

Recalibration of CA125
To account for the differences in CA125 values measured in CA125II and MSD assays, we used data from 534 NEC participants, including 353 postmenopausal women, with CA125 measured using both assays to build the recalibration model [17]. First, we built a regression model to obtain the intercept and beta coefficient (i.e. log-transformed CA125II assay value = intercept + beta*log-transformed MSD assay values). Then, we applied the intercept and beta coefficient values from this model to calculate the predicted log-transformed CA125II assay values for all the EPIC participants based on their MSD assay values. We used the predicted CA125 values based on this model for all EPIC participants in our analyses.
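The two-step recalibration described above can be sketched as a simple least-squares fit on log-transformed values, followed by applying the fitted intercept and slope to new MSD measurements. This is an illustrative sketch only: the function names are hypothetical, and the use of the natural logarithm is an assumption, since the text does not state the base of the log transformation.

```python
import math
from typing import Sequence, Tuple


def fit_recalibration(msd_values: Sequence[float],
                      ca125ii_values: Sequence[float]) -> Tuple[float, float]:
    """Fit log(CA125II) = intercept + beta * log(MSD) by ordinary least squares.

    Inputs are paired measurements on the same samples (natural log assumed).
    """
    x = [math.log(v) for v in msd_values]
    y = [math.log(v) for v in ca125ii_values]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    return intercept, beta


def recalibrate_log(msd_value: float, intercept: float, beta: float) -> float:
    """Predicted log-transformed CA125II value for one MSD measurement."""
    return intercept + beta * math.log(msd_value)
```

In this sketch, `fit_recalibration` would be run once on the samples measured with both assays, and `recalibrate_log` would then be applied to each MSD value in the cohort that has only MSD measurements.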

Prediction modeling
We developed and validated CA125 prediction models (linear and dichotomous) in postmenopausal women using five large population-based datasets (Fig. 1). We developed the prediction model in PLCO and validated the models in EPIC and in NHS/NHSII/NEC combined dataset.

Linear model
The association between individual predictors and CA125 levels was examined in age-adjusted models using linear regression in the entire PLCO dataset. Linear trend was tested using the continuous value of the variables (i.e. age, BMI, pack-years in current/former smokers, parity) or using the midpoint of the categories (i.e. age at menarche, duration of OC use, age at menopause, time since menopause, duration of HT use). To develop a linear CA125 prediction model, we used variables associated with CA125 at p-value < 0.05 in univariate analysis and performed a stepwise selection using p-values of 0.15 as model entry and retention criteria in the PLCO training dataset. Next, we tested the performance of the linear prediction model in the PLCO testing dataset and in the EPIC and NHS/NHSII/NEC datasets. Briefly, predicted CA125 values in those three datasets were calculated using effect estimates from the linear prediction model developed in the PLCO training dataset and plotted against the measured CA125 values. Pearson correlation coefficient (r) was used to

Results
Baseline characteristics were mostly similar across study populations, with CA125 averaging between 10 and 14 U/mL (Additional file 1: Table S1). Briefly, women were in their early 60s on average, had an average BMI of around 26 kg/m², and were mostly white (> 90%). Approximately half of the participants reported ever smoking and most participants were parous (90%).

Recalibration of CA125
We recalibrated the CA125 values in the EPIC participants using the model based on 534 NEC controls with CA125 measurements on both the CA125II and MSD assays. The measured CA125II assay values and the recalibrated values calculated from the recalibration model in NEC were highly correlated, with a Pearson correlation coefficient of 0.90 (95%CI: 0.88-0.91).

Linear model
First, we evaluated the association between candidate predictors and continuous CA125 levels in 26,981 postmenopausal women in PLCO. Older age at blood draw, white race, lower BMI, former smoking status, shorter duration of smoking among former smokers, older age at first menstrual period, higher parity, having history of benign ovarian cyst, no history of hysterectomy, older age at last menstrual period, ever use and longer duration of hormone therapy, and shorter time since menopause were associated with higher levels of CA125 (Table 1).
We used stepwise regression analysis in the PLCO training dataset to develop the linear prediction model using variables associated with CA125 levels at p-value < 0.05 in univariate models (i.e. age, race, BMI, smoking status, pack-years among current smokers, pack-years among former smokers, age at first menstrual period, parity, hysterectomy, age at last menstrual period, time since menopause, and ever use and duration of HT use). The linear prediction model included age, race, BMI, smoking status, pack-years among current and former smokers, parity, hysterectomy, age at last menstrual period, and HT use and duration, which explained 5% of the variability of log-transformed CA125 (Table 2). Alternatively, when all significant predictors were included in the model without a variable selection process (that is, the variables above plus age at first menstrual period and time since menopause), the r-squared was 0.05, the same as that of the linear model developed using stepwise regression with fewer predictors. The associations between the selected predictors and CA125 levels in the multivariate model were similar to those observed in the univariate model. Next, we calculated the predicted log-transformed CA125 levels in the validation datasets based on the regression coefficients from the PLCO training dataset. In the PLCO testing dataset, the Pearson correlation coefficient of the measured and the predicted log-transformed CA125 was 0.18 (95%CI: 0.16-0.20) (Fig. 2a). In the NHS/NHSII/NEC dataset, the Pearson correlation coefficient of the measured and the predicted log-transformed CA125 was 0.14 (95%CI: 0.08-0.20) (Fig. 2b) and in the EPIC dataset it was 0.14 (95%CI: 0.07-0.20) (Fig. 2c), both similar to that in the PLCO testing dataset.
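The validation step above (applying training-set coefficients to a new dataset, then correlating predicted with measured log-transformed CA125) can be sketched as follows. The function names and the dictionary-based data layout are hypothetical. The confidence interval uses the Fisher z-transformation, which is a common choice and an assumption here; the text does not state how the reported intervals were computed.

```python
import math
from typing import Dict, Sequence, Tuple


def predict_log_ca125(intercept: float, coefs: Dict[str, float],
                      row: Dict[str, float]) -> float:
    """Linear predictor: intercept plus the sum of coefficient * covariate value."""
    return intercept + sum(coefs[name] * row[name] for name in coefs)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r: float, n: int, z_crit: float = 1.959964) -> Tuple[float, float]:
    """Approximate 95% CI for r via the Fisher z-transformation (assumed method)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

As a rough check under these assumptions, `fisher_ci(0.14, 1004)`, with n taken as the 81 NHS/NHSII plus 923 NEC participants, gives approximately 0.08 to 0.20.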

Dichotomous model
We evaluated the association between candidate predictors and having CA125 ≥ 35 U/mL in PLCO (Additional file 1: Table S2). Older age at blood draw, white race, lower BMI, greater pack-years among former smokers, nulliparity, no history of hysterectomy, older age at last menstrual period, longer duration of HT use and shorter time since menopause were associated with having CA125 levels ≥35 U/mL. We used stepwise regression analysis using all of the candidate predictors to develop the dichotomous prediction model, which included age, race, BMI, smoking status, pack-years among current and former smokers, hysterectomy, time since menopause, and duration of HT use, with an AUC of 0.64 (95%CI: 0.61-0.66) in PLCO (Table 3, Fig. 3). When we applied the regression coefficients from PLCO to the validation dataset, the AUC was 0.80 (95%CI: 0.73-0.87) in NHS/NHSII/NEC (Fig. 3).
We observed that ever HT use and longer duration of use were positively associated with CA125 levels in both the linear and dichotomous models. Since women with a history of hysterectomy are more likely to have taken estrogen-only HT and type of HT may be differentially associated with CA125 levels, we conducted a stratified analysis by history of hysterectomy. However, we did not observe statistically significant effect modification by history of hysterectomy (p for interaction = 0.58; data not shown).
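The AUC reported for the dichotomous model measures how often a woman with CA125 ≥ 35 U/mL receives a higher predicted score than a woman below that threshold. The sketch below computes it by direct pairwise comparison (equivalent to the Mann-Whitney statistic). It assumes the dichotomous model is a logistic regression, which the text implies but does not state, and all names are hypothetical.

```python
import math
from typing import Dict, Sequence


def predicted_probability(intercept: float, coefs: Dict[str, float],
                          row: Dict[str, float]) -> float:
    """Logistic model (assumed form): P(CA125 >= 35 U/mL) from a linear predictor."""
    linear_predictor = intercept + sum(coefs[name] * row[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison; ties count as one half.

    labels: 1 if measured CA125 >= 35 U/mL, else 0.
    Quadratic in sample size, so intended for illustration only.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which places the reported values of 0.64 and 0.80 between the two.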

Discussion
We confirmed factors contributing to variation in CA125 levels among postmenopausal women, including age, BMI, race, smoking status and duration, age at first menstrual period, parity, benign ovarian cysts, hysterectomy, age at last menstrual period, HT use and duration, and time since menopause. Based on these factors, we developed and validated two prediction models in postmenopausal women without ovarian cancer using data from five large population-based studies. The final linear CA125 prediction model explained little of the total variation of CA125 values but showed similar performance in the testing and external validation datasets. The final dichotomous CA125 prediction model showed moderate discriminatory performance and validated well in the external validation dataset. Interestingly, age, BMI, race, hysterectomy, and duration of HT use were selected in both the linear and dichotomous models, suggesting that these factors are robust predictors of CA125 levels in postmenopausal women.
Studies have examined personal factors that influence CA125 levels in healthy women in order to improve the clinical interpretation of the biomarker levels [7][8][9][10][20]. The significant predictors selected in our linear prediction model were consistent with three prior studies that evaluated predictors of CA125 in postmenopausal women without ovarian cancer [7][8][9]. Older age at blood draw, non-white race, current smoking status, younger age at menopause, and history of hysterectomy were significant predictors that were consistently associated with lower CA125 levels across all of the studies that had information on these variables. Increased parity was also consistently associated with higher CA125 levels in the two studies that assessed parity [7,9].
HT use and longer duration were associated with higher CA125 levels in our linear prediction model, but the results were inconsistent in prior studies [7,8]. This could be due to possible differences in association by type of HT (e.g. estrogen only, estrogen and progesterone combined). If many of the hormone therapies were cyclical therapies combining estrogen and progesterone, they would result in proliferation of the endometrium and withdrawal bleeding, which may lead to an increase in CA125 levels compared to women who are not on hormone therapy and have no withdrawal bleeding, given that CA125 is expressed in endometrial tissue. Although we did not observe significant effect modification of the HT associations by history of hysterectomy, and women with a history of hysterectomy are more likely to be on estrogen-only HT, a lack of effect modification is difficult to conclude because we were not able to evaluate the association by type of HT use due to limited information on type of HT. We did investigate former and current HT use separately; because the effect estimates were similar in these two subgroups, we combined the categories into an "ever" use category in the final model.
In addition to examining individual predictors, we evaluated and validated the performance of the multivariate linear CA125 prediction model. Although several variables were significant predictors of CA125 in postmenopausal women and our linear prediction model was validated in independent datasets, the total variance explained by the linear prediction model was only 5%, suggesting that the known predictors may not be sufficient to explain the variation in CA125. This is further supported by the lack of significant improvement in model performance even when all significant predictors were included in the model.
We also developed and validated a dichotomous prediction model using the CA125 ≥ 35 U/mL threshold. Only one prior study examined predictors of CA125 ≥ 35 U/mL in postmenopausal women, with age, BMI and hysterectomy being the only significant factors in the multivariate model [8], which were consistent with our findings. Our final dichotomous model additionally included race, smoking status and duration, time since menopause, and duration of HT use as significant predictors. Furthermore, our final dichotomous model showed moderate discriminatory performance with nine predictors which validated well in the independent dataset, suggesting the robustness of the model.
The major strength of our study was the use of data from five large population-based studies to develop and conduct internal and external validation of circulating CA125 prediction models in postmenopausal women without ovarian cancer, resulting in robust prediction models. However, there are two major limitations to the study. Since we restricted the candidate predictors to those that have been described previously in the literature, we may be lacking significant predictors which have not been investigated to date. Given that the total variance explained by our final linear model was 5%, there may be other predictors of CA125, such as genetic variants, common medications, or dietary factors, which may explain more of the variability of CA125 in postmenopausal women. Misclassification of CA125 levels in the EPIC cohort is also a concern since CA125 was measured using a different assay in this study. However, the recalibrated CA125 values based on the MSD assay values were highly correlated with the measured CA125 values using the CA125II assays in NEC (r = 0.90). In addition, the performance of the final linear model in NHS/NHSII/NEC was similar to that in EPIC, suggesting the high accuracy of the recalibration model.

Conclusion
In summary, we developed and validated models predicting circulating CA125 in healthy postmenopausal women. The dichotomous prediction model showed moderate discriminatory performance, which validated well in an independent dataset. However, the linear prediction model explained a small portion of the total variability of CA125, suggesting the need to identify novel predictors of CA125. While CA125 has shown value in distinguishing malignant from benign pelvic masses [21,22], its value as a screening biomarker in the general population has been limited by elevated levels in roughly 10% of women without cancer, which could lead to unnecessary interventions and psychological harms [23]. Our dichotomous model could be used to identify healthy women who may have CA125 levels greater than the current clinical cutoff, which may contribute to reducing false positive tests using CA125 as a screening biomarker.
Additional file 1: Table S1. Baseline characteristics across Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO), European Prospective Investigation into Cancer and Nutrition (EPIC), Nurses' Health Studies (NHS/NHSII), and New England Case-Control Study (NEC). Table S2. Age-adjusted association between predictors and CA125 levels above 35 U/mL in Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO).