We thank the authors for taking the time to evaluate our study. While we disagree with several of their comments, we do agree with their conclusion that a randomized controlled trial would answer most of our shared concerns.
Regarding the authors’ first point, our study size indeed decreased with increasing confounding control, but this was because adding control factors in effect highlighted the ways in which the treated and control populations were distinct from one another. We would argue that the potential loss of external validity (that is, generalizability to a specific target population) while improving internal validity (that is, estimation of the true causal treatment effect in the study sample) is a necessary trade-off, as results that are generalizable but remain biased do not advance patient care. It is a trade-off that exists even for randomized trials, in which many more patients are screened than are enrolled. While this selectivity may limit generalizability, it is selection, but not selection bias. In our study, the population of our final analysis A5 is described in Table 1, with full transparency on the variable selection methodology included in the Supplement. Readers can judge whether this population is representative of their patients, as they would by inspecting the Table 1 of any randomized trial publication.
Regarding the authors’ second point, we disagree that additional covariate adjustment implies over-adjustment, which has a specific definition that includes, for example, adjusting for causal intermediates, instrumental variables, or colliders. Our study avoids such over-adjustment by design, by collecting all covariate data prior to exposure and by explicitly screening for instrumental variables and colliders. Further, with the observed point estimate of 0.99, we do not see how increased statistical power from a larger patient population would change the interpretation of the result. We also disagree that a hazard ratio of 1.72 for risk of cholelithiasis, with a confidence interval that includes the null, invalidates the inspection of this tracer outcome. The study was not powered to detect the tracer outcome, and even if it had been, dichotomously interpreting a 72% increased risk of cholelithiasis (confidence interval 0.94-3.13) as “non-existent” based on a criterion of p < 0.05 is explicitly discouraged. Rather, we are reassured by the consistent demonstration of the expected increased risk of our two tracer outcomes, cholelithiasis and B12 deficiency anemia, across analyses A1 through A5.
Regarding the authors’ third point, the letter omits the key reasons why we employed an active comparator: to overcome the substantial and demonstrated information imbalance in EHR data between patients who did and did not undergo surgery (Figures S3 and S4), and to introduce a specific “time 0” inception point in the non-surgical control group. As noted above, the inclusion of an active comparator may limit generalizability, but it will not contribute to bias. Further, while the established treatment guidelines around joint replacement are thoughtful and meaningful, they are not followed consistently (Table 1). In our A5 analysis, 56.3% of arthroplasty controls had a BMI ≥40. So while these control patients may indeed be highly selected, they are as highly selected as the treated patients. We achieve the apples-to-apples comparison needed for an unbiased estimate by constructing an active comparator design, by capturing risk factors extensively, and by applying propensity-score matching.
Taken together, our study does not support a 40-50% reduction in MACE shortly after bariatric surgery. We agree that the previously observed association was unlikely to be fully explained by unmeasured confounding, yet we remain aware that such strong biases due to unmeasured confounding are not unheard of, including reports of spurious effects of statins on hip fractures, statins on colorectal cancer, and flu vaccines or SGLT-2 inhibitors on all-cause mortality. Rather, in this case, several design-related biases may also have contributed. Note that, unlike unmeasured confounding, these design biases would not be captured in an e-value.
Finally, the existence of many studies with consistent results does not guarantee validity. The noted 30 published observational studies that showed positive cardiovascular effects of bariatric surgery bring to mind the many observational studies of hormone replacement therapy (HRT) and the risk of coronary heart disease which, except for the Framingham study, consistently found a clinically meaningful benefit. In the HRT example, the many studies were subsequently refuted by clinical trial evidence and by a re-analysis of the real-world data based on principles of causal inference. In the end, we agree that a randomized trial could shed much-needed light on the important question of the cardiovascular benefits of bariatric surgery. Should that trial fail to emerge, or should it under-represent at-risk populations such as Hispanic and non-Hispanic Black adults, real-world evidence, generated as described in our study, offers the best information available about the effect of bariatric surgery on MACE.