Efficient Algorithms for Building Representative Matched Pairs with Enhanced Generalizability

Abstract

Many recent efforts center on assessing the ability of real-world evidence (RWE) generated from non-randomized, observational data to produce results compatible with those from randomized controlled trials (RCTs). One notable endeavor is the RCT DUPLICATE initiative. To better reconcile findings from an observational study and an RCT, or from two observational studies based on different databases, it is desirable to eliminate differences between study populations. We outline an efficient, network-flow-based statistical matching algorithm that constructs well-matched pairs from observational data whose covariate distributions resemble those of a target population, for instance, the target-RCT-eligible population in the RCT DUPLICATE initiative studies or a generic population of scientific interest. We demonstrate the usefulness of the method by revisiting the inconsistency regarding a cardioprotective effect of hormone replacement therapy (HRT) between the Women’s Health Initiative (WHI) clinical trial and the corresponding observational study. We found that the discrepancy between the trial and the observational study persisted in a design that adjusted for the difference in the study populations' cardiovascular risk profiles, but seemed to disappear in a design that further adjusted for the differences in HRT initiation age and previous estrogen-plus-progestin use. The proposed method is integrated into the R package match2C.
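
The idea of matching through a network can be illustrated with a small sketch. This is not the paper's construction and not the match2C interface. It is a minimal toy, under stated assumptions, of how a minimum-cost flow can select treated-control pairs that track a template drawn from a target population. The three-layer network (template, treated, control), the L1 covariate distance, the weight `lam`, and all function names are assumptions made for illustration. Controls follow the template only indirectly here, through their matched treated unit.

```python
from collections import deque

def min_cost_flow(n_nodes, edges, source, sink, flow_needed):
    """Successive shortest paths on unit-capacity arcs given as (u, v, cost)."""
    graph = [[] for _ in range(n_nodes)]  # arc = [head, residual cap, cost, reverse index]
    forward = []
    for u, v, cost in edges:
        i_fwd, i_rev = len(graph[u]), len(graph[v])
        graph[u].append([v, 1, cost, i_rev])
        graph[v].append([u, 0, -cost, i_fwd])
        forward.append((u, i_fwd))
    total = 0
    for _ in range(flow_needed):
        dist = [float("inf")] * n_nodes
        prev = [None] * n_nodes
        dist[source] = 0
        queue = deque([source])
        while queue:  # label-correcting search; tolerates negative residual costs
            u = queue.popleft()
            for idx, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, idx)
                    queue.append(v)
        if prev[sink] is None:
            raise ValueError("not enough units to fill the template")
        v = sink
        while v != source:  # push one unit of flow along the shortest path
            u, idx = prev[v]
            graph[u][idx][1] -= 1
            graph[v][graph[u][idx][3]][1] += 1
            v = u
        total += dist[sink]
    used = {(u, graph[u][idx][0]) for u, idx in forward if graph[u][idx][1] == 0}
    return total, used

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def representative_pairs(template, treated, control, lam=1):
    """Hypothetical helper: one treated-control pair per template unit."""
    k, n_t, n_c = len(template), len(treated), len(control)
    source, sink = 0, 1 + k + 2 * n_t + n_c
    tmpl = lambda j: 1 + j
    t_in = lambda i: 1 + k + i           # treated nodes are split in two so that
    t_out = lambda i: 1 + k + n_t + i    # each treated unit is used at most once
    ctrl = lambda c: 1 + k + 2 * n_t + c
    edges = [(source, tmpl(j), 0) for j in range(k)]
    edges += [(tmpl(j), t_in(i), lam * l1(template[j], treated[i]))
              for j in range(k) for i in range(n_t)]
    edges += [(t_in(i), t_out(i), 0) for i in range(n_t)]
    edges += [(t_out(i), ctrl(c), l1(treated[i], control[c]))
              for i in range(n_t) for c in range(n_c)]
    edges += [(ctrl(c), sink, 0) for c in range(n_c)]
    total, used = min_cost_flow(sink + 1, edges, source, sink, k)
    triples = []
    for j in range(k):
        i = next(i for i in range(n_t) if (tmpl(j), t_in(i)) in used)
        c = next(c for c in range(n_c) if (t_out(i), ctrl(c)) in used)
        triples.append((j, i, c))
    return triples, total

# Made-up covariates: (age, risk indicator)
template = [(50, 1), (60, 0)]
treated = [(48, 1), (70, 0), (61, 0)]
control = [(49, 1), (62, 0), (75, 1), (30, 0)]
triples, total = representative_pairs(template, treated, control)
print(triples, total)  # [(0, 0, 0), (1, 2, 1)] 5
```

Each triple lists the indices of a template unit, its matched treated unit, and that treated unit's matched control. Increasing the illustrative weight `lam` puts more emphasis on resemblance to the template relative to within-pair closeness.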

Publication
Biometrics 79(4), 3981-3997
Bo Zhang
Assistant Professor of Biostatistics

My research interests include design of observational studies, instrumental variables, application of causal inference in medicine and applied statistics in general.
