PI-RADS Version 2.1 for Prostate MRI Interpretation: Associations of Study Quality and Cancer Detection Metrics—A Systematic Review and Meta-Analysis
Andrea Nedelcu, Benedict Oerther, Alexander Benkendorff, Stephan Dieckbreder, Guido Schwarzer, Georgios Agrotis, Ivo G. Schoots, Rami El Matine, Michel Eisenblaetter, August Sigle, Hannes Engel, Fabian Bamberg, Matthias Benndorf
<b>BACKGROUND</b>. Estimates of outcome metrics for PI-RADS version 2.1 (v2.1) have shown substantial heterogeneity, possibly relating to risks of bias in the relevant literature. <b>OBJECTIVE</b>. The purpose of this study was to provide updated summary estimates for diagnostic test accuracy metrics and cancer detection rates (CDRs) of PI-RADS v2.1 for clinically significant prostate cancer (csPCa) detection and to stratify these results by study quality. <b>EVIDENCE ACQUISITION</b>. We searched seven databases and registers from March 1, 2019, through September 16, 2023, for studies reporting diagnostic test accuracy metrics and/or CDRs of PI-RADS v2.1 for csPCa detection in men with suspected prostate cancer. Studies' risk of bias and concerns of applicability were rated in four and three domains, respectively, using the QUADAS-2 tool. Summary estimates of sensitivity and specificity were derived using bivariate binomial models, and summary estimates of CDRs were derived using random intercept logistic regression models. <b>EVIDENCE SYNTHESIS</b>. The analysis included 117 studies with 25,228 patients and 15,553 lesions. In all studies, at least one domain was rated as unclear or high risk of bias or concerns of applicability; in 29% (34/117) of studies, at least one domain was rated as high risk of bias or concerns of applicability. Patient-level sensitivity and specificity for PI-RADS category 3 and greater were 96% and 43% and for PI-RADS category 4 and greater were 88% and 66%, respectively. Lesion-level sensitivity and specificity for PI-RADS category 3 and greater were 96% and 44% and for PI-RADS category 4 and greater were 89% and 63%, respectively. Lesion-level sensitivity for PI-RADS category 4 and greater was lower for high-risk studies than for remaining studies (78% vs 89%, <i>p</i> = .008). Patient-level CDRs for PI-RADS categories 1, 2, 3, 4, and 5 were 3%, 6%, 20%, 53%, and 83%, respectively; CDR for PI-RADS category 2 was higher for high-risk studies than for remaining studies (15% vs 4%, <i>p</i> = .04). Other associations between study quality and outcome metrics were not significant (<i>p</i> > .05). <b>CONCLUSION</b>. The updated summary estimates confirm overall high sensitivity of PI-RADS v2.1 for csPCa detection. A considerable proportion of studies had high risk of bias or concerns of applicability; such studies were associated with reduced lesion-level sensitivity for PI-RADS category 4 and greater and increased patient-level CDR for PI-RADS category 2. <b>CLINICAL IMPACT</b>. Studies with quality issues may yield flawed performance estimates.
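As an illustration of how the outcome metrics named in the abstract are defined, the following minimal Python sketch computes sensitivity and specificity at the two PI-RADS thresholds (category 3 and greater, category 4 and greater) and the per-category CDR for a single study. The counts, the `COUNTS` table, and both function names are hypothetical and chosen only for the example; they are not data from the included studies. The sketch covers only the per-study arithmetic and does not reproduce the pooling across studies with bivariate binomial or random intercept logistic regression models.

```python
from typing import Dict, Tuple

# Hypothetical single-study counts per PI-RADS category (illustrative only):
# category -> (patients with csPCa, patients without csPCa)
COUNTS: Dict[int, Tuple[int, int]] = {
    1: (1, 39),
    2: (3, 47),
    3: (10, 40),
    4: (40, 40),
    5: (46, 4),
}


def sensitivity_specificity(
    counts: Dict[int, Tuple[int, int]], threshold: int
) -> Tuple[float, float]:
    """Treat categories >= threshold as test positive and return
    (sensitivity, specificity) for csPCa detection."""
    tp = sum(pos for cat, (pos, _) in counts.items() if cat >= threshold)
    fn = sum(pos for cat, (pos, _) in counts.items() if cat < threshold)
    fp = sum(neg for cat, (_, neg) in counts.items() if cat >= threshold)
    tn = sum(neg for cat, (_, neg) in counts.items() if cat < threshold)
    return tp / (tp + fn), tn / (tn + fp)


def cancer_detection_rates(
    counts: Dict[int, Tuple[int, int]]
) -> Dict[int, float]:
    """CDR per category: patients with csPCa divided by all patients
    assigned that category."""
    return {cat: pos / (pos + neg) for cat, (pos, neg) in counts.items()}


if __name__ == "__main__":
    for threshold in (3, 4):
        sens, spec = sensitivity_specificity(COUNTS, threshold)
        print(f"PI-RADS >= {threshold}: sensitivity {sens:.2f}, specificity {spec:.2f}")
    for cat, cdr in cancer_detection_rates(COUNTS).items():
        print(f"PI-RADS {cat}: CDR {cdr:.2f}")
```

Raising the threshold from category 3 to category 4 moves category 3 patients from test positive to test negative, which lowers sensitivity and raises specificity; this is the same trade-off visible in the pooled estimates reported above.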