External validation of anti-Müllerian hormone based prediction of live birth in assisted conception
© Khader et al.; licensee BioMed Central Ltd. 2013
Received: 15 November 2012
Accepted: 23 December 2012
Published: 7 January 2013
Chronological age and oocyte yield are independent determinants of live birth in assisted conception. Anti-Müllerian hormone (AMH) is strongly associated with oocyte yield after controlled ovarian stimulation. We have previously assessed the ability of AMH and age to independently predict live birth in an Italian assisted conception cohort. Herein we report the external validation of the nomogram in 822 UK first in vitro fertilization (IVF) cycles.
This retrospective cohort consisted of 822 patients undergoing their first IVF treatment cycle at Glasgow Centre for Reproductive Medicine. Analyses were restricted to women aged between 25 and 42 years. All women had AMH measured prior to commencing their first IVF cycle. Model performance was assessed in terms of discrimination, by the area under the receiver operator curve (ROCAUC), and calibration, by comparing predicted with observed probabilities.
Live births occurred in 29.4% of the cohort. The observed and predicted outcomes showed no evidence of miscalibration (p = 0.188). The ROCAUC was 0.64 (95% CI: 0.60, 0.68), suggesting moderate and similar discrimination to the original model. The ROCAUC for a continuous model of age and AMH was 0.65 (95% CI 0.61, 0.69), suggesting that the original categories of AMH were appropriate.
We confirm by external validation that AMH and age are independent predictors of live birth. Although the confidence intervals for each category are wide, our results support the assessment of AMH in larger cohorts with detailed baseline phenotyping for live birth prediction.
Keywords: AMH; Live birth prediction; IVF
- IVF: In vitro fertilization
- ROCAUC: Area under the receiver operator curve
- BMI: Body mass index
Chronological age and oocyte yield are independent determinants of live birth in assisted conception. Circulating anti-Müllerian hormone (AMH) levels are strongly associated with oocyte yield after controlled ovarian stimulation. This strong correlation with oocyte yield underlies the independent associations of AMH and age with live birth, which have now been confirmed in several studies [3–7]. We previously exploited this relationship to construct a nomogram for the prediction of live birth using a combination of age and AMH in an Italian cohort of 381 IVF cycles.
The prediction model for a live birth based on age and AMH (with permission from ref. 8). [Table body not recovered; the one surviving AMH category heading is 0.4 to < 2.8.]
Although a variety of prediction models have been reported, we are not aware of any other models that incorporate AMH for the prediction of live birth in IVF cycles (9). Our AMH-age model therefore has the potential to become a clinically useful addition to the fertility workup of infertile couples. However, before clinicians can adopt any prediction model into routine clinical practice, the accuracy of the model should be independently evaluated in a population different from the one in which the model was developed [9, 10]. External validation (EV) of the discriminative power and calibration of the model is therefore crucial to assess its generalizability to other populations. The objective of the present study was to validate our previously developed prediction model for live birth in IVF cycles in a large independent cohort of infertile women.
Validation cohort profile
This study analysed the database containing the clinical and laboratory information on IVF treatment cycles carried out at Glasgow Centre for Reproductive Medicine, Glasgow, between 2006 and 2010. These data were collected prospectively and recorded in the registered database of the fertility centre in Glasgow, UK. Patients were stimulated in accordance with our previously published policies, using a combination of agonist and antagonist strategies based on ovarian reserve assessment. Cycles were selected for this analysis if they were the first IVF/ICSI cycle. We had previously limited our analyses to women aged between 25 and 42 years (8), and we restricted the EV dataset in keeping with this age range. Embryo transfer policy was in line with UK regulations, with predominantly two embryos transferred in women <40 and three in women ≥40 years. Live birth was defined as at least one infant born alive after 24 weeks gestation, consistent with previous prediction models and publications.
AMH was measured prior to commencement of all IVF cycles, on any day of the cycle. The AMH assay used was the commercial ELISA kit provided by DSL (Webster, TX, USA), with values initially expressed in picomoles per litre (conversion factor: pmol/l = ng/ml × 7.143). Inter- and intra-assay coefficients of variation were 5.3% and 5.4%, respectively. The development cohort had used the Immunotech assay; the EV cohort values were therefore transformed using our previously reported equation, AMH(Immunotech) = 1.40 × AMH(DSL) − 0.62 pmol/L, and subsequently converted to ng/ml (12). As AMH was not normally distributed, it was log transformed in accordance with previous analyses.
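The assay harmonisation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated in the text (the DSL to Immunotech equation followed by division by 7.143); the function names are hypothetical, the use of the natural logarithm is an assumption because the base of the log transformation is not stated, and the handling of very low DSL values, for which the equation yields a non-positive result, is likewise an assumption.

```python
import math

# Conversion factor quoted in the text: pmol/l = ng/ml × 7.143
PMOL_L_PER_NG_ML = 7.143


def dsl_to_immunotech_ng_ml(amh_dsl_pmol_l: float) -> float:
    """Map a DSL AMH value (pmol/L) onto the Immunotech scale, returned in ng/ml.

    Applies AMH(Immunotech) = 1.40 * AMH(DSL) - 0.62 (pmol/L), then divides
    by 7.143 to convert pmol/L to ng/ml.
    """
    amh_immunotech_pmol_l = 1.40 * amh_dsl_pmol_l - 0.62
    return amh_immunotech_pmol_l / PMOL_L_PER_NG_ML


def log_amh(amh_ng_ml: float) -> float:
    """Log-transform AMH (natural log is an assumption; the base is not stated)."""
    if amh_ng_ml <= 0:
        # Assumption: the text does not say how non-positive values were handled.
        raise ValueError("AMH must be positive to be log-transformed")
    return math.log(amh_ng_ml)


# Illustrative value only (not taken from the study data).
example_ng_ml = dsl_to_immunotech_ng_ml(10.0)
print(round(example_ng_ml, 3), round(log_amh(example_ng_ml), 3))
```

Note that DSL values below roughly 0.44 pmol/L give a non-positive result under this equation, so any real pipeline would need an explicit rule for such samples.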
Validation of a prediction model addresses two characteristics of diagnostic performance: calibration, which is the agreement between predictions and observations in the validation cohort, and discrimination, which is the ability of the model to distinguish between women with and without a live birth.
Age > 37 & AMH < 0.4
Age > 37 & 0.4 ≤ AMH < 2.8
Age > 37 & AMH ≥ 2.8
31 ≤ Age ≤ 37 & AMH < 0.4
31 ≤ Age ≤ 37 & 0.4 ≤ AMH < 2.8
31 ≤ Age ≤ 37 & AMH ≥ 2.8
Age < 31 & AMH < 0.4
Age < 31 & 0.4 ≤ AMH < 2.8
Age < 31 & AMH ≥ 2.8
The predicted number of live births was calculated for each group by multiplying the predicted probability by the total number of subjects in the group. The predicted number of women without a live birth was calculated as the total number of subjects per group minus the predicted number of live births. Groups were pooled so that every expected count exceeded 5, and observed and predicted counts were compared by chi-squared test.
Age > 37 & AMH < 2.8
Age > 37 & AMH ≥ 2.8
31 ≤ Age ≤ 37 & AMH < 2.8
31 ≤ Age ≤ 37 & AMH ≥ 2.8
Age < 31 & AMH < 2.8
Age < 31 & AMH ≥ 2.8
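The calibration calculation described above (expected counts per group, followed by a chi-squared comparison of observed and predicted counts) can be sketched as follows. The group sizes, observed counts and predicted probabilities in this example are hypothetical and are not the study data; the function names are also illustrative. The degrees of freedom used for the p-value are not stated in the text, so the sketch stops at the test statistic.

```python
def expected_counts(n_subjects: int, predicted_probability: float):
    """Return (expected live births, expected women without a live birth)."""
    expected_live = n_subjects * predicted_probability
    return expected_live, n_subjects - expected_live


def chi_squared_statistic(groups):
    """Sum (observed - expected)^2 / expected over both outcomes in every group.

    `groups` is an iterable of (n_subjects, observed_live_births,
    predicted_probability) tuples.
    """
    statistic = 0.0
    for n_subjects, observed_live, probability in groups:
        expected_live, expected_none = expected_counts(n_subjects, probability)
        if min(expected_live, expected_none) <= 5:
            # Mirrors the pooling rule in the text: expected counts should exceed 5.
            raise ValueError("expected count <= 5; pool this group with another")
        observed_none = n_subjects - observed_live
        statistic += (observed_live - expected_live) ** 2 / expected_live
        statistic += (observed_none - expected_none) ** 2 / expected_none
    return statistic


# Hypothetical pooled groups: (n, observed live births, predicted probability).
hypothetical_groups = [
    (100, 12, 0.10),
    (200, 55, 0.30),
]
print(chi_squared_statistic(hypothetical_groups))
```

A small statistic relative to the chi-squared reference distribution indicates no evidence of miscalibration; the choice of degrees of freedom would need to follow the original analysis.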
The discrimination of the model was assessed by the area under the receiver operator curve (ROCAUC), and the calibration by comparing the predicted probability with the observed probability. A logistic regression model was also fitted with age and AMH as continuous variables; the discrimination of this model was compared with that of the model with age and AMH as categories, to investigate the relative accuracy of the age and AMH cut-offs used in the nomogram. Analyses were performed using SAS v9.2.
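As an illustration of the discrimination measure, the ROCAUC can be computed as the proportion of (live birth, no live birth) pairs in which the woman with a live birth received the higher predicted probability, with ties counted as one half. The sketch below is a plain Python version of that definition and is not the SAS procedure used in the analysis; the function name and the example values are hypothetical.

```python
def roc_auc(predicted, outcomes):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    `predicted` holds predicted probabilities; `outcomes` holds 1 for a live
    birth and 0 otherwise.
    """
    with_birth = [p for p, y in zip(predicted, outcomes) if y == 1]
    without_birth = [p for p, y in zip(predicted, outcomes) if y == 0]
    if not with_birth or not without_birth:
        raise ValueError("both outcome classes are required")

    concordance = 0.0
    for p_birth in with_birth:
        for p_none in without_birth:
            if p_birth > p_none:
                concordance += 1.0
            elif p_birth == p_none:
                concordance += 0.5
    return concordance / (len(with_birth) * len(without_birth))


# Illustrative values only (not study data).
print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

Tie handling matters for a categorical nomogram: with only nine possible predicted probabilities, many pairs of women share the same prediction and each such pair contributes one half.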
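Summary of baseline characteristics and treatment outcomes for the development cohort (N = 381) and the validation cohort (N = 822). Each row gives the development cohort value first, then the validation cohort value.
[label missing]: 34.8 +/− 4.48; 35.3 +/− 4.28
[label missing]: 24 +/− 5.8; 24.6 +/− 4.30
[label missing]: 1.3 (0.03, 13.8); 2.16 (0.02, 38.6)
Duration of infertility (years): 2.8 +/− 1.7; 2.49 +/− 2.36
Cause of infertility: [values missing]
Combination of cause: [values missing]
Duration of stimulation (days): 12.8 +/− 2.8; 10.4 +/− 2.11
[label missing]: 205 +/− 58.6; 216 +/− 59.0
Oocytes per patient: 8.5 +/− 5.1; 7.99 +/− 4.71
Embryo transfers performed: [values missing]
Number of embryos transferred: [values missing]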
Expected probability of a live birth based on La Marca model versus observed live birth
Probability of live birth
No live birth
Age > 37; AMH < 0.4
Age > 37; AMH 0.4 - 2.8
Age > 37; AMH ≥ 2.8
Age 31–37; AMH < 0.4
Age 31–37; AMH 0.4 - 2.8
Age 31–37; AMH ≥ 2.8
Age < 31; AMH < 0.4
Age < 31; AMH 0.4 - 2.8
Age < 31; AMH ≥ 2.8
Logistic regression for live birth with respect to age and AMH, both fitted as continuous variables, demonstrated decreased odds with increasing age (odds ratio (OR) 0.91, 95% CI 0.87, 0.94; p ≤ 0.0001), and a non-significant trend towards increased odds with higher AMH levels (OR 1.04, 95% CI 0.99, 1.09; p = 0.1297). The ROCAUC for a continuous model of age and AMH was similar at 0.65 (95% CI 0.61, 0.69), suggesting no disadvantage to the cut-offs employed in the nomogram.
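To help interpret the age effect, the short sketch below shows how an odds ratio from a logistic regression scales over more than one unit of the predictor: because the model is linear on the log-odds scale, the odds ratio for a change of several units is the per-unit odds ratio raised to that power. It assumes the reported OR of 0.91 is per one-year increase in age, which the text implies but does not state explicitly; the function names are illustrative.

```python
import math


def log_odds_coefficient(odds_ratio: float) -> float:
    """Regression coefficient (log-odds per unit) implied by an odds ratio."""
    return math.log(odds_ratio)


def odds_ratio_for_change(odds_ratio_per_unit: float, units: float) -> float:
    """Odds ratio implied for a change of `units` in the predictor."""
    return math.exp(log_odds_coefficient(odds_ratio_per_unit) * units)


# Assumption: OR 0.91 applies per additional year of age.
five_year_or = odds_ratio_for_change(0.91, 5)
print(round(five_year_or, 2))
```

Under that assumption, a five-year difference in age corresponds to odds of live birth that are roughly 0.62 times as high, holding AMH constant.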
This study externally validates our AMH-age based prediction of live birth for IVF. Furthermore, equivalent model performance was demonstrated in the EV cohort, with confirmation of the independent associations of AMH and age with live birth [3–7].
Recent literature has identified an array of factors that can influence the success of ART, with various prediction models using these factors to help determine a couple’s likelihood of success [11–13]. However, the clinical use of such prediction models has remained limited, largely because of a lack of external validation. Of the 29 pregnancy prediction models identified in a recent systematic review, only 8 were externally validated, and only 3 of these were applicable to IVF. Our model adds to this literature, allowing stratification of the probability of live birth prior to the commencement of treatment. An important difference from previously published models of live birth in IVF is that, while the majority of prediction models are based on variables measured during the IVF cycle (e.g. number and quality of embryos), the AMH-age model is based only on baseline characteristics, permitting it to be used by clinicians and patients prior to commencing stimulation.
A criticism of the original study was the cut-off points applied to age and AMH levels in the nomogram, and the potential for the predictive power of the model to be attenuated by these designated cut-offs. To address this potential weakness, we additionally investigated the use of age and AMH as continuous variables. The ROCAUC achieved with this approach was 0.65, which was identical to that achieved originally, suggesting that the use of the proposed cut-offs does not compromise the predicted probabilities generated and that alternative values would not improve predictions. This is reassuring and allows AMH and age to be displayed as categories, rather than continuous variables, in tables. This has clear benefit for applying the model in a clinical environment, with simple cross tabulation of the patient’s age with their AMH concentration rather than having to apply a complex logistic regression formula.
In the EV cohort the discriminative ability of the model was only moderate (ROCAUC: 0.66), meaning that the model has limited capacity to correctly distinguish between women who will and will not have a baby following IVF. However, ROC curves are primarily designed for diagnostic models (15); for prognostic models, accuracy is better assessed by examining calibration (16). Calibration is evaluated by determining the level of correspondence between the calculated pregnancy probabilities and the observed proportion of pregnancies. A well-calibrated model for IVF would be able to classify individuals according to whether they have a low, medium or high probability of achieving a live birth. In contrast to the relatively modest discrimination, the calibration of the model was found to be good (Figure 1).
A strength of this study is that the sample size was more than twice that used for model derivation. However, the EV cohort differed from the original cohort in several characteristics, such as BMI and duration of infertility, and the intermediate outcomes of IVF also differed between the two cohorts. This largely reflects differences in the demographic characteristics of the Italian and Scottish infertility populations and in IVF clinical practice between the two countries. In particular, the initial study was undertaken while the Italian law regulating assisted reproduction, which limited the number of inseminated oocytes to three and thereby reduced the number of embryos that could be generated for each patient, was still in force. This resulted in a discordance in the number of embryos transferred: all available embryos, mainly three, were transferred in Modena, whereas single or double embryo transfer dominated in Glasgow. In the EV cohort, women were included irrespective of the cause of infertility, past medical history or type of stimulation. Despite these relatively important differences in patient characteristics, legislation and clinical practice, the proposed model still fitted very well, further highlighting the potential generalizability of the prognostic model.
The original study limited its analysis to age and AMH, as only these two factors were identified as predictive in the original multivariate analysis for model development (8). We are aware that additional characteristics, including BMI and the cause and duration of infertility, may influence results, and that the lack of association of these baseline factors with live birth may have reflected the size of the original cohort (14).
Finally, it should be acknowledged that the probabilities generated have relatively wide confidence intervals for all groups; a couple’s predicted likelihood can therefore vary considerably. For example, women aged below 31 with AMH levels less than 0.4 ng/mL are predicted a 13% chance of live birth, but the confidence interval ranges from 4 to 36%, which offers little reassurance about their chances of a successful outcome. It would, however, be inappropriate to withhold treatment purely on the basis of the probability estimates derived from our nomogram. Even in women with an AMH below or close to the functional sensitivity of the assay, natural and assisted conception pregnancies have been reported [15–18]. Clinical consultations would therefore require both interpretation of the nomogram results by the clinician and the individual patient’s view as to whether the probabilities produced could be of benefit. The greatest utility of this external validation may therefore be to confirm that AMH is an independent predictor of live birth and is worthy of evaluation in larger cohorts with detailed baseline phenotyping, with a view to assessing its utility in improving model performance [13, 19].
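The dependence of interval width on group size can be illustrated with a short sketch. The method used to derive the nomogram's confidence intervals is not stated here, so the Wilson score interval below is a swapped-in, commonly used choice rather than the authors' method, and the counts are hypothetical rather than study data.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    proportion = successes / n
    denominator = 1 + z * z / n
    centre = (proportion + z * z / (2 * n)) / denominator
    half_width = (
        z * math.sqrt(proportion * (1 - proportion) / n + z * z / (4 * n * n))
        / denominator
    )
    return centre - half_width, centre + half_width


# Hypothetical counts with the same observed proportion (about 13%).
small_group = wilson_interval(3, 23)
large_group = wilson_interval(30, 230)
print(small_group, large_group)
```

With the same observed proportion, the interval for the smaller hypothetical group is several times wider, which is the pattern that motivates evaluation in larger cohorts.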
This study externally validates our AMH-age based prediction of live birth for IVF.
The greatest utility of this external validation may be to confirm that AMH is an independent predictor of live birth. Moreover, as shown, AMH and age can be displayed as categories rather than continuous variables, with clear benefit for applying the model in a clinical environment.
However, a couple’s predicted likelihood of live birth can vary considerably; clinical use would therefore require the clinician to combine interpretation of the nomogram results with individual patient evaluation.
Antonio La Marca and Scott M Nelson are joint senior authors.
1. Sunkara SK, Rittenberg V, Raine-Fenning N, Bhattacharya S, Zamora J, Coomarasamy A: Association between the number of eggs and live birth in IVF treatment: an analysis of 400 135 treatment cycles. Hum Reprod 2011, 26:1768–1774.
2. La Marca A, Sighinolfi G, Radi D, Argento C, Baraldi E, Artenisio AC, Stabile G, Volpe A: Anti-Mullerian hormone (AMH) as a predictive marker in assisted reproductive technology (ART). Hum Reprod Update 2010, 16:113–130.
3. Nelson SM, Yates RW, Fleming R: Serum anti-Mullerian hormone and FSH: prediction of live birth and extremes of response in stimulated cycles implications for individualization of therapy. Hum Reprod 2007, 22:2414–2421.
4. Lee TH, Liu CH, Huang CC, Hsieh KC, Lin PM, Lee MS: Impact of female age and male infertility on ovarian reserve markers to predict outcome of assisted reproduction technology cycles. Reprod Biol Endocrinol 2009, 7:100.
5. Li HW, Biu Yeung WS, Lan Lau EY, Ho PC, Ng EH: Evaluating the performance of serum antimullerian hormone concentration in predicting the live birth rate of controlled ovarian stimulation and intrauterine insemination. Fertil Steril 2010, 94:2177–2181.
6. Majumder K, Gelbaya TA, Laing I, Nardo LG: The use of anti-Mullerian hormone and antral follicle count to predict the potential of oocytes and embryos. Eur J Obstet Gynecol Reprod Biol 2010, 150:166–170.
7. Gleicher N, Weghofer A, Barad DH: Anti-Mullerian hormone (AMH) defines, independent of age, low versus good live-birth chances in women with severely diminished ovarian reserve. Fertil Steril 2010, 94:2824–2827.
8. La Marca A, Nelson SM, Sighinolfi G, Manno M, Baraldi E, Roli L, Xella S, Marsella T, Tagliasacchi D, D’Amico R, Volpe A: Anti-Mullerian Hormone (AMH) based prediction model for the live birth in assisted reproductive technology (ART). RBM online 2011, 22:341–349.
9. Moons KGM, Kengne AP, Grobbee DE, Royston P, Vergouwe Y, Altman DG, Woodward M: Risk prediction models: II. External validation, model updating, and impact assessment. Heart 2012, 98:691–698.
10. Altman DG, Royston P: What do we mean by validating a prognostic model? Stat Med 2000, 19:453–473.
11. Leushuis E, van der Steeg JW, Steures P, Bossuyt PMM, Eijkemans MJC, van der Veen F, Mol BWJ, Hompes PGA: Prediction models in reproductive medicine: a critical appraisal. Hum Reprod Update 2009, 15:537–552.
12. Coppus SFPJ, van der Veen F, Opmeer BC, Mol BWJ, Bossuyt PMM: Evaluating prediction models in reproductive medicine. Hum Reprod 2009, 24:1774–1778.
13. Nelson SM, Lawlor DA: Predicting live birth, preterm delivery, and low birth weight in infants born from in vitro fertilisation: a prospective study of 144,018 treatment cycles. PLoS Med 2011, 8:e1000386.
14. Nelson SM, Anderson RA, Broekmans FJ, Raine-Fenning N, Fleming R, La Marca A: Anti-Müllerian hormone: clairvoyance or crystal clear? Hum Reprod 2012, 27:631–636.
15. Hagen CP, Vestergaard S, Juul A, Skakkebaek NE, Andersson AM, Main KM, Hjollund NH, Ernst E, Bonde JP, Anderson RA, Jensen TK: Low concentration of circulating antimullerian hormone is not predictive of reduced fecundability in young healthy women: a prospective cohort study. Fertil Steril 2012, 98:1602–1608.
16. Grzegorczyk-Martin V, Khrouf M, Bringer-Deutsch S, Mayenga JM, Kulski O, Cohen-Bacrie P, Benaim JL, Belaisch-Allart J: Pronostic en fécondation in vitro des patientes ayant une AMH basse et une FSH normale. Gynecol Obstet Fertil 2012, 40:411–418.
17. Nelson SM, Fleming R: Low AMH and GnRH-antagonist strategies. Fertil Steril 2009, 92:e40; author reply e41.
18. Weghofer A, Dietrich W, Barad DH, Gleicher N: Live birth chances in women with extremely low-serum anti-Mullerian hormone levels. Hum Reprod 2011, 26:1905–1909.
19. Lawlor DA, Nelson SM: Effect of age on decisions about the numbers of embryos to transfer in assisted conception: a prospective study. Lancet 2012, 379:521–527.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.