A real-world study on characteristics, treatments and outcomes in US patients with advanced stage ovarian cancer

Background Detailed epidemiologic descriptions of large populations of advanced stage ovarian cancer patients have been lacking to date. This study aimed to describe the patient characteristics, treatment patterns, survival, and incidence rates of health outcomes of interest (HOI) in a large cohort of advanced stage ovarian cancer patients in the United States (US). Methods This cohort study identified incident advanced stage (III/IV) ovarian cancer patients in the US diagnosed from 2010 to 2018 in the HealthCore Integrated Research Database (HIRD) using a validated predictive model algorithm. Descriptive characteristics were presented overall and by treatment line. The incidence rates and 95% confidence intervals for pre-specified HOIs were evaluated after advanced stage diagnosis. Overall survival, time to treatment discontinuation or death (TTD), and time to next treatment or death (TTNT) were defined using treatment information in claims and linkage with the National Death Index. Results We identified 12,659 patients with incident advanced stage ovarian cancer during the study period. Most patients undergoing treatment received platinum agents (75%) and/or taxanes (70%). The most common HOIs (> 24 per 100 person-years) included abdominal pain, nausea and vomiting, anemia, and serious infections. The median overall survival from diagnosis was 4.5 years, while approximately half of the treated cohort had a first-line time to treatment discontinuation or death (TTD) within the first 4 months, and a time to next treatment or death (TTNT) from first to second-line of about 6 months. Conclusions This study describes commercially insured US patients with advanced stage ovarian cancer from 2010 to 2018, and observed diverse treatment patterns, incidence of numerous HOIs, and limited survival in this population.


Background
Ovarian cancer is the most lethal gynecologic malignancy [1] and the fifth most common cause of cancer death for women in the United States (US) [1]. Epithelial ovarian cancer is primarily treated with surgery and platinum-based chemotherapy, and can also be treated with radiation, hormone, or targeted therapy. Many new treatments, including poly ADP-ribose polymerase (PARP) inhibitors, are indicated specifically for advanced stage ovarian cancer, [2] while potential new therapies, such as immunotherapies, are being investigated [3].
Randomized trials have suggested that adverse events including hypertension, neutropenia, liver-related toxicity, fatigue, anemia and diarrhea can occur commonly after initiation of certain ovarian cancer therapies, [4][5][6] but less is known about the incidence and types of health outcomes of interest (HOIs) occurring in the general ovarian cancer population. Randomized trials are tightly controlled studies that commonly use small and narrowly defined populations. Recent publications have suggested that trial populations are significantly younger, have higher income, and have fewer co-morbidities than the general cancer population [7][8][9].
Real world evidence on the characteristics, treatment patterns, incidence of HOIs, and outcomes (including survival) of advanced stage ovarian cancer patients has been limited [10], partially due to the lack of specific cancer information, such as the stage of disease, in large administrative claims databases. Recently, we developed a validated algorithm to define advanced stage ovarian cancer using supervised machine learning techniques [11]. In this study, we applied this algorithm to an administrative claims database to identify a large cohort of advanced stage ovarian cancer patients and described their characteristics, treatment patterns, survival, and incidence rates of HOIs that could be utilized as comparator incidence rates for new and future ovarian cancer therapies indicated for advanced stage ovarian cancer.

Population and design
This study included incident advanced stage ovarian cancer patients in the US using the HealthCore Integrated Research Database (HIRD). The HIRD is a longitudinal medical and pharmacy claims database from health plan members across each region of the US. Member enrollment, medical care, outpatient prescription drug use, outpatient laboratory test result data, and health care utilization are tracked for health plan members.
Claims databases lack certain types of clinical information not needed for billing purposes, such as cancer stage. To overcome this limitation, we linked claims data with three state cancer registries (Ohio, Kentucky, and New York) and the HealthCore Integrated Research Environment (HIRE) Oncology data. HIRE Oncology is a pre-authorization program in which clinical data is obtained through physicians' submissions of intentions to use certain cancer treatments, and has shown good agreement with medical records with regard to cancer stage [12]. Advanced stage was defined in the registries and HIRE Oncology as epithelial ovarian cancer, either locally advanced (Stage IIIa, IIIb or IIIc) or metastatic (Stage IV). Subsequently, we developed a claims-based predictive model algorithm for advanced stage ovarian cancer among the subset of patients with clinical data using least absolute shrinkage and selection operator (lasso) regression and 20-fold cross validation [11]. The predictive model for advanced stage (III or IV) had a high PPV (95%), specificity (90%), and sensitivity (70%) when validated using data from the state cancer registries and HIRE Oncology, using an 80% probability threshold for defining a case [11].
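As an illustration of how the validation statistics above relate to the 80% probability threshold, the following minimal sketch classifies patients from predicted probabilities and computes PPV, sensitivity, and specificity against a reference standard. The function names and the six example patients are hypothetical and are not taken from the study data or the published algorithm [11].

```python
def classify(probabilities, threshold=0.80):
    """Flag a patient as advanced stage when the predicted probability meets the threshold."""
    return [p >= threshold for p in probabilities]


def validation_metrics(predicted, truth):
    """Compare model flags with a reference standard (e.g. registry stage)."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)          # flagged, truly advanced
    fp = sum(1 for p, t in pairs if p and not t)      # flagged, not advanced
    fn = sum(1 for p, t in pairs if not p and t)      # missed advanced cases
    tn = sum(1 for p, t in pairs if not p and not t)  # correctly not flagged
    return {
        "ppv": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical predicted probabilities and reference-standard stage.
probs = [0.95, 0.85, 0.60, 0.30, 0.82, 0.10]
truth = [True, True, True, False, False, False]

predicted = classify(probs)
metrics = validation_metrics(predicted, truth)
```

Raising the threshold generally trades sensitivity for PPV, which is consistent with the high PPV and lower sensitivity reported for the 80% cut-off.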
To identify patients with confirmed incident advanced stage ovarian cancer, patients needed to meet the following inclusion criteria: at least one diagnosis code in any claims position for ovarian cancer (codes starting with International Classification of Diseases [ICD]-9: 1830 or ICD-10: C56; Supplemental Table 1) in the HIRD between January 1, 2010 and January 31, 2018, continuously enrolled in a health plan captured by the HIRD for at least 6 months prior to the first ovarian cancer diagnosis (to restrict to newly diagnosed (incident) cases), and identified as an advanced cancer patient by either matching to a cancer registry, HIRE Oncology, [11,12] or meeting the predictive algorithm for advanced disease [11].
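The first two inclusion criteria can be sketched as follows. This is a simplified illustration, not the study's programming: the data layout, the function names, the treatment of enrollment as a single uninterrupted span, and the use of 183 days for "6 months" are all assumptions.

```python
from datetime import date

# ICD-9 183.0 and ICD-10 C56 prefixes, compared after removing the dot.
OVARIAN_PREFIXES = ("1830", "C56")


def is_ovarian_code(code):
    """True when a diagnosis code starts with an ovarian cancer prefix."""
    return code.replace(".", "").upper().startswith(OVARIAN_PREFIXES)


def first_ovarian_diagnosis(claims, start=date(2010, 1, 1), end=date(2018, 1, 31)):
    """Earliest ovarian cancer diagnosis date within the inclusion window, or None."""
    dates = [d for d, code in claims if start <= d <= end and is_ovarian_code(code)]
    return min(dates) if dates else None


def has_baseline_enrollment(enroll_start, first_dx, min_days=183):
    """Assumes one continuous enrollment span that began on enroll_start."""
    return (first_dx - enroll_start).days >= min_days


# Hypothetical claims for one patient: (service date, diagnosis code).
claims = [
    (date(2012, 3, 5), "183.0"),
    (date(2012, 1, 10), "C50.9"),
    (date(2012, 2, 1), "C56.1"),
]
first_dx = first_ovarian_diagnosis(claims)
eligible = has_baseline_enrollment(date(2011, 6, 1), first_dx)
too_recent = has_baseline_enrollment(date(2011, 12, 1), first_dx)
```

The third criterion (registry match, HIRE Oncology match, or meeting the predictive algorithm) would be applied to patients who pass these two checks.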
Follow-up for this cohort of advanced stage ovarian cancer patients spanned the same period as inclusion (January 2010 to January 2018). For each patient, the predicted probability of advanced stage ovarian cancer was recomputed each time the patient had a claim used by the predictive model (a hypothetical patient example is shown in Supplemental Fig. 1 and Supplemental Table 2). The date of incident advanced cancer (i.e. the index date for the start of follow-up) was defined as the first date the patient met the advanced stage predictive model's probability threshold of 80% or higher (Supplemental Fig. 1, Supplemental Table 2). The date of incident advanced cancer defined by the predictive model was within 1 month of the cancer registry date for 84% of the patients, and the median difference between the registry and model dates was 1 day. For patients with confirmed advanced disease who did not meet the predictive model algorithm, we used the cancer registry or HIRE Oncology date as the date of incident advanced cancer. Cases were defined as "advanced stage at diagnosis" if their advanced stage date (from cancer registry, HIRE Oncology, or predictive model) was within 1 month of their first cancer diagnosis in claims; otherwise they were defined as "diagnosed as early stage and progressed to advanced stage".
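The index date rule and the stage-at-diagnosis classification can be sketched as below. The probability history is a hypothetical patient, the function names are illustrative, and treating "within 1 month" as an absolute difference of 30 days or less is an assumption rather than the study's documented definition.

```python
from datetime import date


def index_date(probability_history, threshold=0.80):
    """First claim date at which the predicted probability reaches the threshold."""
    for claim_date, prob in sorted(probability_history):
        if prob >= threshold:
            return claim_date
    return None  # patient never met the model threshold


def stage_at_diagnosis(first_dx, advanced_date, window_days=30):
    """Assumes 'within 1 month' means an absolute difference of <= 30 days."""
    if abs((advanced_date - first_dx).days) <= window_days:
        return "advanced stage at diagnosis"
    return "diagnosed as early stage and progressed to advanced stage"


# Hypothetical patient: (claim date, predicted probability of advanced stage).
history = [
    (date(2014, 1, 5), 0.35),
    (date(2014, 2, 2), 0.62),
    (date(2014, 9, 20), 0.88),
    (date(2014, 10, 1), 0.93),
]
first_dx = date(2014, 1, 5)
idx = index_date(history)
label = stage_at_diagnosis(first_dx, idx)
```

In this hypothetical history the threshold is first met more than 8 months after the first diagnosis, so the patient falls in the "progressed" group.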
Follow-up started for an individual at the advanced stage index date and continued until death, end of health plan enrollment, or end of the study period (January 2018), whichever came first. We did not require a minimum amount of person-time after the advanced stage index date; thus, a subset of patients in this cohort died or were lost to follow-up soon after the advanced stage index date.
Patients were described in terms of demographic and clinical characteristics, prior and concomitant treatments, key incident HOIs, lines of treatment, and mortality. Selected characteristics were presented stratified by treatment line, which was inferred from observed patterns of medication use under assumptions such as 28-day cycles, with a new line starting when there were more than 60 days between two cycles or when a treatment was switched or added. We also identified the 25 most frequently dispensed medication classes during the 12 months before the advanced stage ovarian cancer index date and, separately, during the 12 months after it. The medication classes were defined at the four-digit Generic Product Identifier (GPI) level. Diagnoses are not linked to a specific prescription, and thus some of the recorded treatments may have been prescribed for other cancers, such as breast cancer, if a patient had multiple malignancies.
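The treatment line inference described above can be illustrated with a minimal sketch. It assumes each cycle is represented by its start date and the set of agents given, that the 60-day gap is measured from the end of the previous 28-day cycle, and that dropping an agent does not start a new line; these choices, the function name, and the example regimen are assumptions and may differ from the study's algorithm.

```python
from datetime import date, timedelta

CYCLE_DAYS = 28  # assumed cycle length
GAP_DAYS = 60    # gap that triggers a new line


def assign_lines(cycles):
    """cycles: list of (start_date, set_of_agents). Returns a line number per cycle."""
    lines = []
    line_no = 0
    line_agents = set()
    prev_end = None
    for start, agents in sorted(cycles, key=lambda c: c[0]):
        gap_exceeded = prev_end is not None and (start - prev_end).days > GAP_DAYS
        new_agent = not set(agents) <= line_agents  # switch or addition
        if line_no == 0 or gap_exceeded or new_agent:
            line_no += 1
            line_agents = set(agents)
        lines.append(line_no)
        prev_end = start + timedelta(days=CYCLE_DAYS)
    return lines


# Hypothetical patient.
cycles = [
    (date(2015, 1, 5), {"carboplatin", "paclitaxel"}),
    (date(2015, 2, 2), {"carboplatin", "paclitaxel"}),
    (date(2015, 3, 2), {"carboplatin"}),   # agent dropped: same line
    (date(2015, 8, 3), {"carboplatin"}),   # long gap: new line
    (date(2015, 8, 31), {"doxorubicin"}),  # switch: new line
]
line_numbers = assign_lines(cycles)
```

Because claims do not record treatment intent, any rule of this kind can misclassify lines, for example when a planned treatment break exceeds the gap threshold.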
We described characteristics for patients who were platinum sensitive, platinum resistant, or platinum refractory, categories that were defined similarly to previously published studies [13,14]. The categorization was based on medication dispensing data for platinum agents (cisplatin, carboplatin, or oxaliplatin) and other chemotherapies, and on the time until use of a second-line therapy.
We linked claims to the US National Death Index (NDI) to identify mortality outcomes and cause of death, following NDI standards for identification of death [15]. We also evaluated two real-world surrogates of cancer progression in this cohort, time to treatment discontinuation or death (TTD), and time to next treatment or death (TTNT) [16]. We defined TTD as the time from the date of initiation of a first-line systemic anti-cancer therapy after the advanced stage index date to the earliest of discontinuation (> 60 days without first-line treatment; event), death (event) or loss to follow-up in the HIRD (administrative censor, not an event). TTNT was defined as the time from the date of the first-line treatment after the advanced stage index date to the earliest of a second-line treatment (event), death (event), or loss to follow-up in the HIRD (administrative censor, not an event). We restricted mortality, TTD, and TTNT analyses to the patients available for linkage to the NDI, as a subset of the cohort was unable to be linked due to privacy restrictions. This study was approved by the New England Institutional Review Board (Work Order Number 1-9472-1).
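The TTD and TTNT definitions share the same structure: the time from first-line initiation to the earliest qualifying event, with administrative censoring otherwise. A minimal sketch follows, in which the function name, the dates, and the choice to supply the discontinuation date as an input are assumptions made for illustration.

```python
from datetime import date


def time_to_event(start, event_dates, censor_date):
    """Return (days, is_event) for the earliest event on or before censoring."""
    observed = [d for d in event_dates if d is not None and d <= censor_date]
    if observed:
        return (min(observed) - start).days, True
    return (censor_date - start).days, False  # administrative censor


# Hypothetical patient.
first_line_start = date(2016, 1, 4)
discontinuation = date(2016, 5, 2)     # assumed already derived from the > 60 day rule
second_line_start = date(2016, 7, 11)
death = None                           # no death recorded
end_of_follow_up = date(2017, 3, 31)

ttd = time_to_event(first_line_start, [discontinuation, death], end_of_follow_up)
ttnt = time_to_event(first_line_start, [second_line_start, death], end_of_follow_up)
censored = time_to_event(first_line_start, [None, None], date(2016, 3, 4))
```

The resulting (duration, event indicator) pairs are the inputs a product-limit analysis would require.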

Statistical analysis
Patient characteristics and treatments received were described by counts and percentages for categorical variables and by statistics such as mean, standard deviation (SD), and median for continuous variables. Person-time incidence rates and Poisson 95% confidence intervals (CIs) were calculated for pre-specified HOIs. These pre-specified HOIs were identified with attention to the Medical Dictionary for Regulatory Activities (MedDRA) classification system and FDA approved standardized case definitions, when possible. MedDRA is not always directly translatable to use in administrative claims data but can sometimes be approximated with ICD codes. These HOIs required two or more ICD-9/ICD-10 diagnosis codes in any setting or at least one ICD-9/ICD-10 diagnosis code in the inpatient setting (codes available upon request). For the main analysis, the incidence rate of each HOI was determined from the case definition date for advanced ovarian cancer (index date) through the first HOI of a given type or the end of the patient's follow-up due to a censoring event, whichever came first. Incidence rates of HOIs after systemic anti-cancer therapy (while with advanced stage disease) were also calculated. We also assessed severe HOIs, defined as those requiring hospitalization or an ER visit based on the primary diagnosis on the facility claim.
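A person-time incidence rate with a Poisson 95% CI can be sketched as follows. The source does not state which Poisson interval was used; this example uses Byar's approximation to the exact interval as an assumption, and the event count and person-years are hypothetical.

```python
import math


def incidence_rate_per_100py(events, person_years, z=1.96):
    """Rate per 100 person-years with a Byar-approximation Poisson CI."""
    rate = events / person_years * 100
    if events == 0:
        lower_count = 0.0
    else:
        lower_count = events * (
            1 - 1 / (9 * events) - z / (3 * math.sqrt(events))
        ) ** 3
    e1 = events + 1
    upper_count = e1 * (1 - 1 / (9 * e1) + z / (3 * math.sqrt(e1))) ** 3
    return (
        rate,
        lower_count / person_years * 100,
        upper_count / person_years * 100,
    )


# Hypothetical HOI: 120 first events over 4,000 person-years at risk.
rate, lower, upper = incidence_rate_per_100py(120, 4000)
```

The interval is computed on the event count and then divided by person-time, which treats person-time as fixed.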
Administrative claims-based assessments of disease incidence can be inaccurate for repeated events, as it is not always possible to distinguish between a patient who has a past medical history of a condition and one who has been newly diagnosed or has experienced an acute event. For this reason, for most HOIs, patients were followed from cohort entry (or treatment initiation for some analyses) until their first recorded event of a given type, and then censored from follow-up for that event type. Unless otherwise specified, we excluded patients who presented with the HOI prior to the start of study follow-up (i.e. prevalent cases during the baseline period) from these HOI analyses.
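The first-event approach and the exclusion of prevalent cases can be sketched as below. The record layout, the function name, the 365.25-day year, and the decision to count an event on the index date as incident are assumptions for illustration only.

```python
from datetime import date


def hoi_person_time(patients):
    """Count first events and person-years at risk for one HOI type."""
    events, days_at_risk = 0, 0
    for p in patients:
        # Exclude prevalent cases: any HOI diagnosis before the index date.
        if any(d < p["index_date"] for d in p["hoi_dates"]):
            continue
        during = [
            d for d in p["hoi_dates"] if p["index_date"] <= d <= p["censor_date"]
        ]
        if during:
            events += 1
            days_at_risk += (min(during) - p["index_date"]).days  # stop at first event
        else:
            days_at_risk += (p["censor_date"] - p["index_date"]).days
    return events, days_at_risk / 365.25


# Hypothetical patients.
patients = [
    {"index_date": date(2015, 1, 1), "censor_date": date(2016, 1, 1),
     "hoi_dates": [date(2015, 7, 1)]},
    {"index_date": date(2015, 1, 1), "censor_date": date(2015, 12, 31),
     "hoi_dates": []},
    {"index_date": date(2015, 6, 1), "censor_date": date(2016, 6, 1),
     "hoi_dates": [date(2015, 3, 1), date(2015, 9, 1)]},  # prevalent, excluded
]
n_events, person_years = hoi_person_time(patients)
```

Dividing the event count by the person-years from this function gives the incidence rate for that HOI.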
The product-limit estimator was used to describe median values and rates of mortality, TTD, and TTNT at one, three, and 5 years and the corresponding Kaplan-Meier curves [17]. In a sensitivity analysis, we also evaluated the rates of mortality when excluding the last 6 months of data provided from the NDI (July 1, 2017 to December 31, 2017) given prior evidence of lower sensitivity of newly released data [15].
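For readers unfamiliar with the product-limit (Kaplan-Meier) estimator, a minimal sketch is given below. It is written from the standard definition rather than from the study's code, and the follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censored subjects leave the risk set
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 50% or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical follow-up in years; 1 = event, 0 = censored.
times = [1, 2, 2, 3, 4, 5, 6, 7]
events = [1, 1, 0, 1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
median = median_survival(curve)
```

The same function applies to overall survival, TTD, and TTNT, since each is expressed as a duration with an event indicator.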

Descriptive characteristics
We identified 12,659 advanced ovarian cancer patients who met the eligibility criteria for this cohort. Most patients were classified as incident advanced stage at diagnosis (96.7%) rather than incident early stage cancers that progressed to an advanced stage (3.3%), which may often represent recurrent cases. At the time of advanced stage, these patients had a mean (±SD) age of 62 ± 14 years, and 50% were followed after their advanced cancer date for over 17.3 months (Table 1). The comorbidity burden was elevated, with a median Deyo-Charlson Comorbidity Index (DCI) score of 6 [18]. The most frequently dispensed medication class in the 12 months before and after the advanced stage index date was opioid combinations (pre: 41.6%; post: 46.3%; Supplemental Table 3). Medication use appeared to increase after the advanced stage index date, particularly for 5-HT3 receptor antagonists (pre: 19.7%; post: 37.1%) and phenothiazines (pre: 13.8%; post: 27.0%), which are both often used to treat nausea (Supplemental Table 3).
Regarding the treatment for ovarian cancer, about 40% of advanced ovarian cancer patients had at least one ovarian cancer-related surgery during follow-up (i.e. after the advanced stage index date; 40.5%), primarily palliative surgery for relief of small bowel obstruction (34.9%; Supplemental Table 3). More than two-thirds received radiotherapy or systemic anti-cancer therapy (68.5%) after the advanced stage index date, the most common being platinum agents (75.3%; carboplatin = 66.3%, cisplatin = 14.1%, and oxaliplatin = 4.4% of treated patients) and taxanes (70.0%; paclitaxel = 64.2% and docetaxel = 12.8% of treated patients). Common specific agents used were carboplatin (66.3%) and paclitaxel (64.2%). We observed a first line of treatment (including systemic therapy and radiotherapy) for 68.5% of patients; 43.9% had a second line, 30.5% had a third line, and 20.5% had four or more lines (Supplemental Table 4). Following first line therapy, 12.1% of patients were categorized as platinum sensitive, 15.3% as platinum resistant, and 41.7% as platinum refractory (Supplemental Table 4). The age, DCI, and treatment use were largely similar between platinum sensitive and platinum refractory/resistant patients (results available upon request). Systemic anti-cancer medication class use differed by treatment line (Table 2). The majority of patients were taking platinum and taxane agents in the first treatment line, while angiogenesis inhibitors, hormonal and related agents, antineoplastic antibodies, and antineoplastic antibiotics all became more widely used in later treatment lines (> 25% in the fourth line or higher; Table 2). The most commonly used agents, carboplatin and paclitaxel, were most frequently used in the first treatment line, and the proportion of patients using them was lower in the subsequent treatment lines (50% in first line vs. < 37% in all subsequent treatment lines; Table 2).
There were 12% of patients who had a breast cancer diagnosis (in addition to their ovarian cancer diagnosis) noted during their first treatment line therapy, suggesting a small subset of first line therapies may have been for breast cancer.

Health outcomes of interest (HOIs)
The most common pre-defined HOIs among advanced stage ovarian cancer patients included abdominal pain, nausea and vomiting, anemia, and serious infections (each > 24 per 100 person-years; Table 3). Advanced stage ovarian cancer patients also frequently developed malaise/fatigue, hypertension, constipation, pain in joints or limbs, and renal failure (each > 10 per 100 person-years; Table 3). Endocrinopathies and immune/autoimmune related event rates were less frequent (e.g., colitis: 3.1 per 100 person-years, type 1 diabetes: 0.5 per 100 person-years; Table 3).
Of the 25,868 person-years of follow-up in the advanced ovarian cancer cohort, 15,938 person-years (62%) were after a systemic anti-cancer therapy. When restricting to time after anti-cancer therapy, rates of many HOIs were similar to rates after the advanced stage index date, which included both pre- and post-anti-cancer treatment time (e.g., any rash, after advanced stage: 3.0 per 100 person-years, after anti-cancer therapy: 3.1 per 100 person-years; renal failure, after advanced stage: 9.6 per 100 person-years, after anti-cancer therapy: 10.2 per 100 person-years; Table 3). However, incidence rates of some HOIs, such as serious infections, nausea and vomiting, malaise and fatigue, and thrombocytopenia, were higher after treatment (Table 3).
When restricting to severe HOIs occurring as the primary discharge diagnosis in inpatient or emergency room facilities, the incidence rates of all events were lower than overall HOI event rates, especially for events such as nausea and vomiting, anemia, malaise/fatigue, and constipation, which declined over three-fold compared to the overall incidence rate (Table 3, Supplemental Table 5). Serious infections, abdominal pain, and renal failure were some of the most common hospitalized events noted as the primary discharge diagnosis (each > 4 per 100 person-years; Supplemental Table 5).

Mortality, TTD, and TTNT analyses
In this cohort of 12,659 incident advanced stage ovarian cancer patients, 8374 patients were eligible to be linked to the NDI and thus available for the mortality analyses (66.2% of incident ovarian cancer cases). Characteristics of these patients and of those who could not be linked to the NDI were largely similar, with exceptions shown in Supplemental Table 6. The median overall survival in this cohort was 4.5 years (95%CI = 4.17, 4.86; Fig. 1, Table 4). Approximately 25% of the cohort had died within 1.28 years (95%CI = 1.20, 1.37; Fig. 1), and the five-year survival was 47.7% (95%CI = 46.2%, 49.3%; Table 4). Survival results were similar when excluding data after June 30, 2017 (five-year survival = 46.8%; 95%CI = 45.1%, 48.4%; Supplemental Table 7).
Table 1 footnotes: a The cohort includes patients who had at least one ICD-9-CM or ICD-10 diagnosis code for ovarian cancer, were continuously enrolled in a health plan contributing data to the HIRD for at least six months, and were confirmed to have advanced ovarian cancer based on staging information from either a cancer registry or the HIRE Oncology data or met the predictive model algorithm for advanced stage ovarian cancer. b Incident cases are individuals for whom at least six months of data were available in the HIRD prior to the first diagnosis of ovarian cancer in claims. c Cases are defined as "advanced stage at diagnosis" if their advanced stage date (from cancer registry, HIRE Oncology, or predictive model) was within one month of their first cancer diagnosis in claims, otherwise they are defined as "Diagnosed as early stage and progressed to advanced stage".
The TTD and TTNT estimates among treated patients were shorter than overall survival estimates, with approximately half of the treated cohort having a treatment discontinuation or death within the first 4 months (Fig. 1, Table 4), or a second line treatment or death by about 6 months (0.46 years, 95%CI = 0.46, 0.53; Fig. 1, Table 4).
After NDI linkage, few fatal HOI events were identified, with hypertension, serious infections, and renal failure being the most common (data available upon request).

Discussion
This study identified a large cohort of incident advanced stage ovarian cancer patients in US administrative claims and examined descriptive data on demographics, treatment patterns, safety events, and mortality rates. Incidence rates of serious infections, and symptoms such as abdominal pain, malaise and fatigue, and nausea and vomiting were high. Incidence rates of HOIs could be used as comparator rates for safety signals to help inform and contextualize the safety of new or future therapies for advanced stage ovarian cancer, especially for uncontrolled clinical trials. Our study, which used our previously validated predictive model for advanced stage ovarian cancer, [11] provides detailed information on the routine care of advanced stage ovarian cancer. In this population, over one-third of individuals received an ovarian-related surgery and over two-thirds of individuals received radiotherapy or systemic anti-cancer therapy during follow-up (i.e. after their advanced stage index date). Surgeries and treatments may have occurred prior to this advanced stage date (e.g. when they had early stage ovarian cancer or just before the index date), or after they have dropped out of the study (e.g. due to health plan discontinuation) as no minimal follow-up time was required. The most commonly used treatments were chemotherapies such as alkylating agents and mitotic inhibitors, particularly in the first and second line of therapy. Other treatments such as antimetabolites and hormonal agents were more common in later lines of therapy.
Table 2 (rows recovered from extraction; column headers not available): Antineoplastic - Hedgehog Pathway Inhibitors: 0 (0%), 0 (0%), 0 (0%), 0 (0%), 0 (0%). Antineoplastic - Immunomodulators: ≤10, ≤10, 0 (0%), 0 (0%), 0 (0%). Antineoplastic Radiopharmaceuticals: 0 (0%), 0 (0%), 0 (0%), 0 (0%), 0 (0%). Chemotherapy Adjuncts: 0 (0%), 0 (0%), 0 (0%), 0 (0%), 0 (0%).
This cohort included some patients who had multiple malignancies, and as diagnoses are not linked to a specific prescription, some of the included treatments may represent treatment for diseases other than ovarian cancer. In addition, the treatment line algorithm may have some level of misclassification, as the results represent the treatment lines observed after the model-estimated date of advanced cancer. Thus, some of the treatments noted in the first line could have been used in an adjuvant setting.

This study observed high incidence rates of certain HOIs during follow-up, such as anemia, diarrhea, hypertension and fatigue, that have been noted as adverse events in trials [4][5][6] and other smaller observational studies [19,20]. This study also provides incidence rates of less common immune and endocrine-related events that could not be robustly evaluated in previous studies given their limited sample size. While each of the 61 pre-specified HOI events did occur in at least one patient in this cohort, most of the immune and endocrine events were rare in advanced stage ovarian cancer patients, but events such as colitis and hypothyroidism were more common with incidence rates over three per 100 person-years of observation. The incidence of colitis and hypothyroidism in these women was not significantly higher after systemic therapy (Table 3). While it is known that treatments such as platinum chemotherapy are associated with adverse events that impact quality of life, few studies have examined the occurrence of adverse events among advanced stage ovarian cancer patients in a large real-world population. This is partially due to the lack of clinical stage information readily available in administrative claims.
Table 3 footnote: Estimates of IR are shown per 100 person-years. Incidence is calculated as the number of new events divided by the sum of person-time at risk, defined as the time between the start of follow-up and the date of the event. In each row, individuals who had a diagnosis of the applicable event prior to the start of follow-up (i.e., prevalent cases) were not included.
Fig. 1 Advanced stage ovarian cancer: overall survival (a), time to treatment discontinuation or death (TTD) (b), time to next treatment or death (TTNT) (c). Abbreviations: 1L, 1st line; 2L, 2nd line; Trt, treatment; TTD, time to treatment discontinuation or death; TTNT, time to next treatment or death
This study tried to provide proxies for such data through the incidence of HOIs among an advanced stage cancer population. The HOIs in our study were not validated and it is expected that accuracy varies by safety event. In claims research, diagnosis, procedure, and prescription dispensing codes are used to reconstruct patients' medical histories. As such, claims diagnoses are subject to misclassification and incidence estimates can vary widely based on the case definition used: a rate based on a definition that is very sensitive but not specific may be an overestimate, while a rate based on a definition that is specific but poorly sensitive may be an underestimate [21]. This is particularly relevant given that some of the outcomes used in the current study are based on clinical characteristics that are less likely to be assigned a diagnosis code (e.g., nausea, fatigue), and therefore would be captured in a claims database with poor sensitivity. These HOI algorithms would not capture fatal safety events that occurred outside the healthcare system. Our linkage to the NDI could detect such fatal HOIs, and the few that were identified suggest that HOIs were rarely noted on death certificates.
Survival of advanced stage ovarian cancer patients, while still relatively low, has been improving over time, potentially due to the increasing number of therapeutic options. This study is also the first to our knowledge to provide estimates of TTD and TTNT (previously used as surrogates of disease progression during treatment) [22][23][24] for advanced stage ovarian cancer patients, in addition to overall survival. These proxies have been examined in other cancers and are correlated with progression-free survival [16,22]. In our study, we observed near ubiquitous treatment discontinuation (TTD) and transfer to second line (or later) therapies (TTNT) within a few months of initiation of the first line therapy for advanced disease, and while overall survival was longer than the TTD and TTNT measures, it was still poor, with approximately half the patients dying within 5 years. We found that almost all patients with advanced stage ovarian cancer (> 95%) were diagnosed at an advanced stage, rather than progressing from an earlier stage. This may indicate a lack of screening for this disease, and suggests that symptoms may be initially mistaken for other diseases or are not present until later in the disease progression, which could contribute to accelerated mortality. Recent trials suggest that the use of PARP inhibitors (e.g., veliparib and olaparib) alone or in combination with chemotherapy or VEGF inhibitors significantly improves progression-free survival in first-line, as maintenance therapy and after first-line platinum exposure in ovarian cancer [25][26][27]. If these findings are confirmed through a benefit in overall survival, these new treatment strategies will likely reshape the treatment landscape of the disease in the coming years with widespread use and likely improve the outcomes currently observed in this patient population.
Our cohort included both stage III and IV tumors among commercially insured US patients. This population is likely younger and of higher socioeconomic status than the general US ovarian cancer population, given our limited data on the Medicare (> 65 year old) population and the lack of Medicaid data. The median overall survival, which was evaluated in a subset of the population that was older than our overall population, was 4.5 years. In contrast, the 5-year survival rates based on Surveillance, Epidemiology, and End Results (SEER) data (US cancer registry) were 74% for regional tumors (spread to regional lymph nodes) at diagnosis and 29% for distant tumors (i.e. metastasized) (46% at 3 years) [28], suggesting overall survival may be modestly higher in our sample compared to SEER data, assuming our sample is largely composed of distant stage cancers. While this difference could be related to the age and higher income of our sample, there are also other explanations. For example, the start of follow-up time for SEER is the date of cancer diagnosis, while in this study it is the date a patient met the threshold of advanced stage cancer. Additionally, there could be imperfect sensitivity of NDI linkage for mortality, which would bias mortality rates downward. Published literature suggests NDI has a high sensitivity (97%) [29]. However, the sensitivity could be lower in patients with incomplete identifying information (e.g., missing social security number), which occurs in at least a small subset of the HIRD. In our main survival analyses, we censored a patient's follow-up at the time they lost healthcare coverage eligibility in the HIRD (e.g., changed insurance plans). Some patients may leave their workplace and their related health plan as the disease progresses, and deaths could occur at a differential rate relatively soon after discontinuation of the health plan.
To examine this possibility, we conducted an additional analysis where we did not censor at the discontinuation of the health plan. When using all available NDI mortality data, we found that the survival for ovarian cancer was similar to when censoring at health plan discontinuation (data available upon request), suggesting that informative censoring was not a major source of bias.

Conclusions
This study of over ten thousand advanced stage ovarian cancer patients in the US from 2010 to 2018 provides a description of the diverse treatment patterns, numerous HOIs, and relatively short survival time for these women. These data on incidence rates of HOIs could be utilized as comparator rates of safety events for new and future ovarian cancer therapies indicated for advanced stage ovarian cancer, which will be of particular importance given the numerous new treatment options, such as PARP inhibitors, and increasing survival of this population.
Additional file 1: Figure S1. Example of a hypothetical patient "A" progressing from early to advanced stage ovarian cancer and definition of index date. Most patients (96.7%) in our cohort were classed as advanced stage at diagnosis. This hypothetical example would have been classified in those who "progressed from early to advanced stage ovarian cancer", which represented 3.3% of patients in the cohort.
Additional file 2: Table S1. Codes used to define ovarian cancer. Table S2. Probability of advanced stage ovarian cancer for hypothetical patient "A" over time in the HIRD used to identify their index date. Table S3. Top 25 most commonly prescribed medications among 12,659 advanced stage ovarian cancer patients, 12 months before and after their advanced stage ovarian cancer date. Table S4. Advanced stage ovarian cancer cohort, cancer treatment received on or after the advanced stage date (N = 12,659). Table S5. Advanced stage ovarian cancer cohort, hospital or emergency room incidence rates of selected health outcomes of interest. Table S6. Characteristics by National Death Index (NDI) linkable status. Table S7. Ovarian cancer overall survival, excluding last 6 months of follow-up (July-December 2017).

Acknowledgments
Doreen-allen Kahangire (Merck Healthcare KGaA (Darmstadt, Germany), operational), Nianya Liu (HealthCore, Inc., programming), Shiva Krishna Vojjala (HealthCore, Inc., programming). Cancer incidence data used in certain analyses were obtained from the Ohio Cancer Incidence Surveillance System (OCISS), Ohio Department of Health (ODH), a cancer registry partially supported by the National Program of Cancer Registries at the Centers for Disease Control and Prevention (CDC) through Cooperative Agreement Number NU58DP006284. Use of these data does not imply that ODH or CDC agrees or disagrees with the analyses, interpretations or conclusions in this report.

Funding
This study was funded by Pfizer Inc. and Merck HealthCare KGaA.

Availability of data and materials
Data and further materials for this manuscript cannot be shared given privacy regulations.

Ethics approval and consent to participate
This study was approved by the New England Institutional Review Board (Work Order Number 1-9472-1). The current study was designed as an analysis based on claims data from a large insured population in the US. There was no active enrollment or active follow-up of study subjects, and no data was collected directly from individuals. The HIPAA Privacy Rule permits PHI in a limited data set to be used or disclosed for research, without individual authorization, if certain criteria are met (further described in 45 CFR Part 160 and Subparts A and E of Part 164). Thus informed consent was not required.

Consent for publication
Not applicable, as all results presented in this manuscript were aggregated.