Discriminative ability was comparable for all models (range of C statistic 0.75-0.80). Sensitivity ranged from 88% (simplified revised Geneva) to 96% (simplified Wells) and specificity from 48% (revised Geneva) to 53% (simplified revised Geneva). Differences were observed between failure rates, especially between the simplified Wells and the simplified revised Geneva models (failure rates 1.2% (95% confidence interval 0.2% to 3.3%) and 3.1% (1.4% to 5.9%), respectively; absolute difference −1.98% (−3.33% to −0.74%)).

Accordingly, the diagnostic predictors or tests included in the diagnostic model needed to be measurable at the general practitioner’s office.

Therefore, models derived in hospital or acute care settings cannot simply be implemented in primary care.8 9 10 11 12 13 14 Reasons for poorer performance in a new setting include differences in the case mix and the prevalence of pulmonary embolism due to the unselected population, as well as differences in physicians’ experience with patients with suspected pulmonary embolism.9 10 15 16 Hence, before diagnostic models or strategies are transferred across healthcare settings, their performance must first be evaluated in the new setting.

This form of external validation is referred to as domain or setting validation,8 10 17 or as quantification of the transportability of prediction models.13 18 The recent AMUSE-2 study (Amsterdam, Maastricht, Utrecht Study on thrombo-Embolism)19 was the first to prospectively quantify, in a primary care setting, the transportability of perhaps the best known diagnostic prediction model for pulmonary embolism derived in secondary care (that is, the Wells pulmonary embolism rule,20 combined with point of care D-dimer testing).

All diagnostic models for safe exclusion of pulmonary embolism have been developed and validated in hospital or acute care settings.

However, diagnostic prediction models developed in a particular setting often perform less well when applied in another setting.

Results Ten published prediction models for the diagnosis of pulmonary embolism were found.

Five of these models could be validated in the primary care dataset: the original Wells, modified Wells, simplified Wells, revised Geneva, and simplified revised Geneva models.

Main outcome measures Discriminative ability of all models retrieved by systematic literature search, assessed by calculation and comparison of C statistics. After stratification into groups with high and low probability of pulmonary embolism according to pre-specified model cut-offs combined with qualitative D-dimer test, sensitivity, specificity, efficiency (overall proportion of patients with low probability of pulmonary embolism), and failure rate (proportion of pulmonary embolism cases in group of patients with low probability) were calculated for all models.

We firstly did a systematic review and critical appraisal of all available diagnostic models for pulmonary embolism, as recommended by guidelines on prediction models research.21 Next, the diagnostic models easily applicable in primary care were validated in the AMUSE-2 dataset, that is, a large independent prospectively constructed cohort of patients presenting to their general practitioner with complaints suggestive of pulmonary embolism.

For our systematic review and critical appraisal of the existing diagnostic models for pulmonary embolism, we followed the recent methodological guidance by the Prognosis Methods Group of the Cochrane Collaboration.21 22 23 24 Firstly, we framed the review question and design by using the CHARMS checklist for systematic reviews of prediction models (see appendix box A).21 We then repeated the systematic search previously performed for an aggregate meta-analysis by Lucassen et al and used the same study selection criteria.7 We searched for studies on development and validation of diagnostic prediction models published between January 2010 and October 2014.

Subsequently, a qualitative point of care D-dimer test (Simplify D-dimer; Clearview, Inverness Medical, Bedford, UK) was performed.
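
The outcome measures defined above (sensitivity, specificity, efficiency, failure rate, and the C statistic) can be illustrated with a minimal Python sketch. It is not the analysis code used in the study. The function names `evaluate_strategy` and `c_statistic`, the `StrategyMetrics` container, and the toy patient data are assumptions made here for illustration only. A patient is taken to be in the low probability group when the model score is below its cut-off and the D-dimer test is negative, and the C statistic is computed as the proportion of case/non-case pairs in which the case has the higher score (ties count as one half).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StrategyMetrics:
    sensitivity: float
    specificity: float
    efficiency: float    # proportion of all patients in the low probability group
    failure_rate: float  # proportion of PE cases within the low probability group


def evaluate_strategy(low_probability: Sequence[bool],
                      has_pe: Sequence[bool]) -> StrategyMetrics:
    """Compute the four accuracy measures from per-patient classifications.

    low_probability[i] is True when patient i falls in the low probability
    group (hypothetical encoding: score below cut-off and negative D-dimer).
    has_pe[i] is True when pulmonary embolism was confirmed.
    """
    if len(low_probability) != len(has_pe):
        raise ValueError("inputs must have the same length")
    tp = fn = tn = fp = 0
    for low, pe in zip(low_probability, has_pe):
        if pe and not low:
            tp += 1  # case correctly placed in the high probability group
        elif pe and low:
            fn += 1  # missed case: PE in the low probability group
        elif not pe and low:
            tn += 1  # non-case correctly placed in the low probability group
        else:
            fp += 1  # non-case placed in the high probability group
    n_low = tn + fn
    if tp + fn == 0 or tn + fp == 0 or n_low == 0:
        raise ValueError("need cases, non-cases, and a non-empty low group")
    return StrategyMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        efficiency=n_low / (tp + fn + tn + fp),
        failure_rate=fn / n_low,
    )


def c_statistic(scores: Sequence[float], has_pe: Sequence[bool]) -> float:
    """Proportion of (case, non-case) pairs ranked correctly by the score."""
    cases = [s for s, pe in zip(scores, has_pe) if pe]
    controls = [s for s, pe in zip(scores, has_pe) if not pe]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5  # tied pairs count as half
    return concordant / (len(cases) * len(controls))


# Toy data, invented for illustration; these are not AMUSE-2 patients.
toy_low = [True, True, True, True, True, False, False, False, False, False]
toy_pe = [False, False, False, False, True, True, True, False, False, False]
toy_metrics = evaluate_strategy(toy_low, toy_pe)
toy_c = c_statistic([1, 2, 2, 3], [False, True, False, True])
```

In this sketch a lower failure rate means fewer missed cases among patients in whom pulmonary embolism was considered excluded, while a higher efficiency means more patients could be spared referral; the two measures are reported together because a strategy can improve one at the expense of the other.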

