Patrick M. Bossuyt’s research while affiliated with Gezond Amsterdam and other places
Objectives
To prevent colorectal cancer (CRC), most patients with familial adenomatous polyposis (FAP) undergo (procto)colectomy with ileorectal anastomosis (IRA) or ileal pouch-anal anastomosis (IPAA). After surgery, these patients remain at risk of developing cancer in the remnant rectum or rectal cuff/pouch. We aimed to compare the long-term risk of cancer following IRA or IPAA in FAP.
Methods
We performed an international multicenter historical cohort study of FAP patients undergoing IRA or IPAA from 1990 to 2023. The proportion of patients developing cancer following surgery was estimated using the Kaplan-Meier method.
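As a rough illustration of the Kaplan-Meier method named above, the following minimal sketch computes a product-limit survival curve and the corresponding cumulative incidence at a chosen horizon. The function names and the follow-up data are hypothetical and are not taken from the study; the authors' actual software and handling of ties or competing risks are not described in the abstract.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient (e.g. years since surgery)
    events:    1 if cancer was observed at that time, 0 if censored
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    events_at = Counter(t for t, e in zip(durations, events) if e)
    leaving_at = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving_at):
        d = events_at.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at time t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving_at[t]
    return curve


def cumulative_incidence(curve, horizon):
    """Estimated proportion with the event by `horizon` (1 minus survival)."""
    survival = 1.0
    for t, s in curve:
        if t > horizon:
            break
        survival = s
    return 1 - survival


# Hypothetical follow-up data (years) for eight patients; not study data.
example_durations = [2, 5, 5, 8, 10, 12, 15, 20]
example_events = [0, 1, 0, 0, 1, 0, 0, 0]
example_curve = kaplan_meier(example_durations, example_events)
print(example_curve)
print(cumulative_incidence(example_curve, 10))
```

Censored patients contribute to the risk set until they leave follow-up, which is why the estimate can differ from the crude proportion of patients with an event.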
Results
(Procto)colectomy was performed in 685 patients (53.6% female); 366 (53.4%) had IRA and 319 (46.6%) IPAA. Median age at IRA and IPAA was 23 and 27 years, and median follow-up was 12 and 15 years, respectively. Overall, 8 patients (2.2%) developed rectal and/or rectal cuff/pouch cancer after IRA and 0.9% after IPAA. The estimated 10- and 20-year cancer incidences after IRA vs IPAA were 1.6% vs 0.4% and 2.5% vs 0.9%, respectively (log-rank p=0.15). Reoperations, mainly for extensive polyposis, were performed in 39 (10.7%) patients with an IRA and 24 (7.5%) patients following IPAA. The number of post-operative surveillance endoscopies was higher in patients with an IRA compared to those with an IPAA (p<0.001).
Conclusions
Over the last three decades, few patients were diagnosed with cancer in the rectum or rectal cuff/pouch after (procto)colectomy in FAP. This might be due to an improved selection of the type of (procto)colectomy and close endoscopic surveillance including prophylactic polypectomies.
Evaluating diagnostic test accuracy during epidemics is difficult due to an urgent need for test availability, changing disease prevalence and pathogen characteristics, and constantly evolving testing aims and applications. Based on lessons learned during the SARS-CoV-2 pandemic, we introduce a framework for rapid diagnostic test development, evaluation, and validation during outbreaks of emerging infections. The framework is based on the feedback loop between test accuracy evaluation, modelling studies for public health decision-making, and impact of public health interventions. We suggest that building on this feedback loop can help future diagnostic test evaluation platforms better address the requirements of both patient care and public health.
Background
Early detection and diagnosis of venous thromboembolism are vital for effective treatment. To what extent methodological shortcomings exist in studies of diagnostic tests and whether this affects published test performance is unknown.
Objectives
We aimed to assess the methodological quality of studies evaluating diagnostic tests for venous thromboembolic diseases and quantify the direction and impact of design characteristics on diagnostic performance.
Methods
We conducted a literature search using Medline and Embase databases for systematic reviews summarizing diagnostic accuracy studies for five target disorders associated with venous thromboembolism. The following data were extracted for each primary study: methodological characteristics, the risk of bias scored by the QUADAS/QUADAS-2 instrument, and numbers of true-positives, true-negatives, false-positives, and false-negatives. In a meta-analysis, we compared diagnostic accuracy measures from studies unlikely to be biased with those likely to be biased.
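To make the extracted quantities concrete, the sketch below derives sensitivity, specificity, and the diagnostic odds ratio from the four cell counts of a single primary study. The class name and the counts are hypothetical, and this simple per-study calculation is only an illustration; it is not the meta-analytic model used by the authors, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of one diagnostic accuracy study (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def sensitivity(self) -> float:
        # Proportion of diseased participants with a positive test
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of non-diseased participants with a negative test
        return self.tn / (self.tn + self.fp)

    @property
    def diagnostic_odds_ratio(self) -> float:
        # Undefined when fp or fn is zero; a continuity correction is often
        # applied in that case, which this sketch does not do.
        return (self.tp * self.tn) / (self.fp * self.fn)


# Hypothetical counts, not taken from any included study.
low_risk_study = TwoByTwo(tp=90, fp=20, fn=10, tn=80)
high_risk_study = TwoByTwo(tp=95, fp=10, fn=5, tn=90)

for label, study in [("low risk of bias", low_risk_study),
                     ("high risk of bias", high_risk_study)]:
    print(label, study.sensitivity, study.specificity, study.diagnostic_odds_ratio)
```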
Results
Eighty-five systematic reviews comprising 1,818 primary studies were included. Adequate quality assessment tools were used in 43 systematic reviews only (51%). The risk of bias was estimated to be low for all items in 23% of the primary studies. A high or unclear risk of bias in particular domains of the QUADAS/QUADAS-2 tool was associated with marked differences in the reported sensitivity and specificity.
Conclusions
Significant limitations in the methodological quality of studies assessing diagnostic tests for venous thromboembolic disorders exist, and studies at risk of bias are unlikely to report valid estimates of test performance. Established guidelines for evaluation of diagnostic tests should be more systematically adopted.
Systematic Review Registration
PROSPERO (CRD 42021264912).
Background
We aimed to retrospectively assess the added value of an artificial intelligence (AI) algorithm for detecting pulmonary nodules on ultra-low-dose computed tomography (ULDCT) performed at the emergency department (ED).
Methods
In the OPTIMACT trial, 870 patients with suspected nontraumatic pulmonary disease underwent ULDCT. The ED radiologist prospectively read the examinations and reported incidental pulmonary nodules requiring follow-up. All ULDCTs were processed post hoc using an AI deep learning software marking pulmonary nodules ≥ 6 mm. Three chest radiologists independently reviewed the subset of ULDCTs with either prospectively detected incidental nodules in 35/870 patients or AI marks in 458/870 patients; findings scored as nodules by at least two chest radiologists were used as true positive reference standard. Proportions of true and false positives were compared.
Results
During the OPTIMACT study, 59 incidental pulmonary nodules requiring follow-up were prospectively reported. In the current analysis, 18/59 (30.5%) nodules were scored as true positive while 104/1,862 (5.6%) AI marks in 84/870 patients (9.7%) were scored as true positive. Overall, 5.8 times more (104 versus 18) true positive pulmonary nodules were detected with the use of AI, at the expense of 42.9 times more (1,758 versus 41) false positives. There was a median number of 1 (IQR: 0–2) AI mark per ULDCT.
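The ratios quoted above follow directly from the reported counts. The short calculation below reproduces them; the variable names are illustrative, and the false-positive counts are obtained by subtracting true positives from the reported totals, as the figures in the text imply.

```python
# Counts as reported in the Results paragraph
ai_marks_total = 1862          # AI marks reviewed
ai_true_positives = 104        # AI marks confirmed as nodules
reader_reports_total = 59      # nodules reported prospectively by the ED radiologist
reader_true_positives = 18     # of those, confirmed as nodules

# False positives are the remainder of each total
ai_false_positives = ai_marks_total - ai_true_positives
reader_false_positives = reader_reports_total - reader_true_positives

# Ratios of AI to prospective reading
true_positive_ratio = ai_true_positives / reader_true_positives
false_positive_ratio = ai_false_positives / reader_false_positives

# Share of marks or reports that were true positives
ai_true_positive_share = ai_true_positives / ai_marks_total
reader_true_positive_share = reader_true_positives / reader_reports_total

print(ai_false_positives, reader_false_positives)
print(round(true_positive_ratio, 1), round(false_positive_ratio, 1))
print(round(100 * ai_true_positive_share, 1), round(100 * reader_true_positive_share, 1))
```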
Conclusion
The use of AI on ULDCT in patients suspected of pulmonary disease in an emergency setting results in the detection of many more incidental pulmonary nodules requiring follow-up (5.8×) with a high trade-off in terms of false positives (42.9×).
Relevance statement
AI aids in the detection of incidental pulmonary nodules that require follow-up at chest CT, supporting early pulmonary cancer detection, but it also increases false positive results, which are mainly clustered in patients with major abnormalities.
Trial registration
The OPTIMACT trial was registered on 6 December 2016 in the National Trial Register (number NTR6163) (onderzoekmetmensen.nl).
Key Points
An AI deep learning algorithm was tested on 870 ULDCT examinations acquired in the ED.
AI detected 5.8 times more pulmonary nodules requiring follow-up (true positives).
AI resulted in the detection of 42.9 times more false positive results, clustered in patients with major abnormalities.
AI in the ED setting may aid in early pulmonary cancer detection with a high trade-off in terms of false positives.
Background
Guideline development on testing is known to be difficult for guideline developers. It requires consideration of various aspects, such as accuracy, purpose of testing, and consequences on management and people-important outcomes. This can be outlined in a test-management pathway. We aimed to create and user-test a step-by-step guide for guideline developers for designing a test-management pathway.
Methods
We used a developmental design with a co-creative strategy. We created a draft step-by-step guide that was user-tested in a workshop with 19 experts and in interviews with 7 guideline panel members.
Results
Our proposed guide consists of five blocks of signalling questions: patients/population, index test(s), current practice/comparison/control, people-important outcomes, and the link between testing and outcome(s). The user testing led to refinement of the signalling questions, the use of inclusive terminology, and addition of a test-management pathway figure with detailed explanation.
Conclusions
The step-by-step guide for formulating focused guideline questions regarding healthcare-related testing can help in identifying relevant characteristics of the population, tests, and outcomes and in creating a test-management pathway. This should facilitate the formulation of evidence-based guideline recommendations about healthcare-related testing.
Background
Quality assessment of diagnostic accuracy studies (QUADAS), and more recently QUADAS-2, were developed to aid the evaluation of methodological quality within primary diagnostic accuracy studies. However, in its current form, QUADAS-2 does not address the unique considerations raised by artificial intelligence (AI)-centered diagnostic systems. The rapid progression of the AI diagnostics field mandates suitable quality assessment tools to determine the risk of bias and applicability, and subsequently evaluate translational potential for clinical practice.
Objective
We aim to develop an AI-specific QUADAS (QUADAS-AI) tool that addresses the specific challenges associated with the appraisal of AI diagnostic accuracy studies. This paper describes the processes and methods that will be used to develop QUADAS-AI.
Methods
The development of QUADAS-AI can be distilled into 3 broad stages. In stage 1, a project organization phase has been undertaken, during which a project team and a steering committee were established. The steering committee consists of a panel of international experts representing diverse stakeholder groups. Following this, the scope of the project was finalized. In stage 2, an item generation process will be completed following (1) a mapping review, (2) a meta-research study, (3) a scoping survey of international experts, and (4) a patient and public involvement and engagement exercise. Candidate items will then be put forward to the international Delphi panel to achieve consensus for inclusion in the revised tool. A modified Delphi consensus methodology involving multiple online rounds and a final consensus meeting will be carried out to refine the tool, following which the initial QUADAS-AI tool will be drafted. A piloting phase will be carried out to identify components that are considered to be either ambiguous or missing. In stage 3, once the steering committee has finalized the QUADAS-AI tool, specific dissemination strategies will be aimed toward academic, policy, regulatory, industry, and public stakeholders.
Results
As of July 2024, the project organization phase, as well as the mapping review and meta-research study, have been completed. We aim to complete the item generation, including the Delphi consensus, and finalize the tool by the end of 2024. Therefore, QUADAS-AI will be able to provide a consensus-derived platform upon which stakeholders may systematically appraise the methodological quality associated with AI diagnostic accuracy studies by the beginning of 2025.
Conclusions
AI-driven systems comprise an increasingly significant proportion of research in clinical diagnostics. Through this process, QUADAS-AI will aid the evaluation of studies in this domain in order to identify bias and applicability concerns. As such, QUADAS-AI may form a key part of clinical, governmental, and regulatory evaluation frameworks for AI diagnostic systems globally.
International Registered Report Identifier (IRRID)
DERR1-10.2196/58202
STUDY QUESTION
Does hysterosalpingo-foam sonography (HyFoSy) prior to hysterosalpingography (HSG) or HSG prior to HyFoSy affect visible tubal patency when compared with HSG or HyFoSy alone?
SUMMARY ANSWER
Undergoing either HyFoSy or HSG prior to tubal patency testing by the alternative method does not demonstrate a significant difference in visible tubal patency when compared to HyFoSy or HSG alone.
WHAT IS KNOWN ALREADY
HyFoSy and HSG are two commonly used visual tubal patency tests with a high and comparable diagnostic accuracy for evaluating tubal patency. These tests may also improve fertility, although the underlying mechanism is still not fully understood. One of the hypotheses points to a dislodgment of mucus plugs that may have disrupted the patency of the Fallopian tubes.
STUDY DESIGN, SIZE, DURATION
This is a secondary analysis of the randomized controlled FOAM study, in which women underwent tubal patency testing by HyFoSy and HSG, randomized for order of the procedure. Participants either had HyFoSy first and then HSG, or vice versa. Here, we evaluate the relative effectiveness of tubal patency testing by HyFoSy or HSG prior to the alternative tubal patency testing method on visible tubal patency, compared to each method alone.
PARTICIPANTS/MATERIALS, SETTING, METHODS
Infertile women aged between 18 and 41 years scheduled for tubal patency testing were eligible for participating in the FOAM study. Women with anovulatory cycles, endometriosis, or with a partner with male infertility were excluded. To evaluate the effect of HyFoSy on tubal patency, we relied on HSG results by comparing the proportion of women with bilateral tubal patency visible on HSG in those who underwent and who did not undergo HyFoSy prior to their HSG (HyFoSy prior to HSG versus HSG alone). To evaluate the effect of HSG on tubal patency, we relied on HyFoSy results by comparing the proportion of women with bilateral tubal patency visible on HyFoSy in those who underwent and who did not undergo HSG prior to their HyFoSy (HSG prior to HyFoSy versus HyFoSy alone).
MAIN RESULTS AND THE ROLE OF CHANCE
Between May 2015 and January 2019, we randomized 1160 women (576 underwent HyFoSy first followed by HSG, and 584 underwent HSG first followed by HyFoSy). Among the women randomized to HyFoSy prior to HSG, bilateral tubal patency was visible on HSG in 467/537 (87%) women, compared with 472/544 (87%) women who underwent HSG alone (risk difference 0.2%; 95% CI: −3.8% to 4.2%). Among the women randomized to HSG prior to HyFoSy, bilateral tubal patency was visible on HyFoSy in 394/471 (84%) women, compared with 428/486 (88%) women who underwent HyFoSy alone (risk difference −4.4%; 95% CI: −8.8% to 0.0%).
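The risk differences above can be recomputed from the reported counts. The sketch below uses a simple Wald (normal approximation) interval for the difference of two independent proportions; the abstract does not state which interval method the authors used, so this choice and the function name are assumptions. With these counts, the Wald limits come out close to the reported ones after rounding.

```python
import math


def risk_difference_wald(events_a, n_a, events_b, n_b, z=1.96):
    """Risk difference (group A minus group B) with a Wald confidence interval.

    Assumes two independent groups and a normal approximation; z=1.96
    corresponds to an approximate 95% interval.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Bilateral patency on HSG: HyFoSy prior to HSG (467/537) vs HSG alone (472/544)
hsg_comparison = risk_difference_wald(467, 537, 472, 544)

# Bilateral patency on HyFoSy: HSG prior to HyFoSy (394/471) vs HyFoSy alone (428/486)
hyfosy_comparison = risk_difference_wald(394, 471, 428, 486)

for label, (diff, lower, upper) in [("HSG outcome", hsg_comparison),
                                    ("HyFoSy outcome", hyfosy_comparison)]:
    print(f"{label}: {100 * diff:.1f}% ({100 * lower:.1f}% to {100 * upper:.1f}%)")
```

In the second comparison the upper limit sits almost exactly at zero, which is consistent with the authors' description of no significant difference, although the interval is borderline.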
LIMITATIONS, REASONS FOR CAUTION
The results of this secondary analysis should be interpreted as exploratory and cannot be regarded as definitive evidence. Furthermore, it has to be noted that pregnancy outcomes were not considered in this analysis.
WIDER IMPLICATIONS OF THE FINDINGS
Tubal patency testing by either HyFoSy or HSG prior to the alternative tubal patency testing method does not significantly affect visible tubal patency when compared to the alternative method alone. This suggests that both methods may have comparable abilities to dislodge mucus plugs in the Fallopian tubes.
STUDY FUNDING/COMPETING INTEREST(S)
The FOAM study was an investigator-initiated study, funded by ZonMw, a Dutch organization for Health Research and Development (project number 837001504). IQ Medical Ventures provided the ExEm®-FOAM kits free of charge. The funders had no role in study design, collection, analysis, or interpretation of the data. H.R.V. reports consultancy fees from Ferring. M.v.W. received a travel grant from Oxford University Press in the role of Deputy Editor for Human Reproduction and participates in a Data Safety and Monitoring Board as an independent methodologist in obstetrics studies in which she has no other role. M.v.W. is coordinating editor of Cochrane Fertility and Gynaecology. B.W.J.M. received an investigator grant from NHMRC (GNT1176437) and research funding from Merck KGaA. B.W.J.M. reports consultancy for Organon and Merck KGaA, and travel support from Merck KGaA. B.W.J.M. reports holding stocks of ObsEva. V.M. received research grants from Guerbet, Merck and Ferring and travel and speaker fees from Guerbet. The other authors do not report conflicts of interest.
TRIAL REGISTRATION NUMBER
International Clinical Trials Registry Platform No. NTR4746.
... The risk of bias in the included studies was assessed independently in blind by two authors using the QUADAS-AI tool [8]. This tool allows for a detailed and structured assessment of various bias domains, including patient selection, index test, reference standard, and type of validation. ...
... In some research studies [14], there were attempts at precision for estimating CPL size by placing an open biopsy forceps of known diameter next to a CPL before removal, but in clinical practice this is not done. Therefore, misclassification by size may commonly occur [15]. Despite the likely variability in size estimation, the 10 mm cutoff, however subjective it may be, is associated with increased risk of further advanced CPLs during surveillance, and an overall increased risk of CRC. ...
... The aim of testing is to improve people-important outcomes. A test-management pathway provides a visual representation of the essential steps required to move from testing to people-important outcomes, which is crucial in guideline development [15]. If guideline developers do not oversee and consider the consequences of testing, they cannot balance the relevant benefits and harms of testing. ...
... These knowledge gaps are in part due to a historical reliance on animal models, which often exhibit significant disparities from human phenotypes and establish pathophysiology through dissimilar mechanisms [10,11]. As a result, translation of most findings to the human context has seen limited clinical success. To date, resmetirom is the only U.S. FDA-approved therapeutic for adults with metabolic dysfunction-associated steatohepatitis (MASH), an advanced stage of MASLD, exemplifying the ongoing gap in our understanding of metabolic liver disease. ...
... The current state-of-the-art imaging modality for the detection and response evaluation of systemic chemotherapy in CRLM is magnetic resonance imaging (MRI) [7]. The importance of MRI in the preoperative setting was underlined by the recently published CAMINO-Trial showing that the treatment plan had to be modified in 31% of the patients after MRI examination [8]. ...
... The world of clinical trials will change with the consideration of AI + humans rather than AI vs. humans scenarios [95,96]. The latest guidelines lead researchers in regard to writing protocols (e.g., SPIRIT-AI [97]), evaluating AI-prediction models (TRIPOD + AI [98,99]), and reporting their results in the scientific literature (CONSORT-AI [100]). ...
... The CEA of the FOAM trial showed that management based on the results of HyFoSy is associated with slightly lower live birth rates (although not statistically significant), at slightly lower costs compared with management based on the results of HSG [34]. If tubal flushing by HSG results in more pregnancies than HyFoSy, then it would be likely that fewer women will need expensive fertility treatments, resulting in HSG being less costly than HyFoSy. If HSG is more effective than HyFoSy, and also more costly, then it would be relevant to calculate the costs for an additional live birth. ...
... Moreover, the identification of biomarkers for the early detection and monitoring of disease progression is crucial for tailoring treatment strategies, as highlighted by ongoing research into non-invasive diagnostic tools [132,133]. Overall, the future of MASLD and MASH management lies in a personalized approach that integrates pharmacotherapy with lifestyle interventions, addressing both hepatic and extrahepatic manifestations of metabolic dysfunction. All the drugs or compounds mentioned in this part can be found in Table 2. ...
... Deciding on optimum modality of biologic delivery will be an area of future research in PIBD. Adult studies have suggested equal efficacy with subcutaneous (SC) and intravenous IFX [86,87] and VDZ [88], possibly less immunogenic potential, reduced economic costs, and greater patient satisfaction [88]. Initial pediatric data has suggested elective switching to SC IFX during the maintenance phase resulted in high treatment persistence [86]. ...
... These imaging modalities are now integral to the diagnostic algorithms proposed by the European Society of Cardiology (ESC), highlighting their importance [1]. Their findings are also included in the major imaging criteria of the two 2023 versions of the Duke criteria, one from the ESC and the other from the International Society for Cardiovascular Infectious Diseases (ISCVID) [1,2], both of them showing an increase in sensitivity as compared with the 2015 ESC version [3][4][5][6]. ...