Article

Surgical Skill and Complication Rates after Bariatric Surgery


Abstract

Background: Clinical outcomes after many complex surgical procedures vary widely across hospitals and surgeons. Although it has been assumed that the proficiency of the operating surgeon is an important factor underlying such variation, empirical data are lacking on the relationships between technical skill and postoperative outcomes. Methods: We conducted a study involving 20 bariatric surgeons in Michigan who participated in a statewide collaborative improvement program. Each surgeon submitted a single representative videotape of himself or herself performing a laparoscopic gastric bypass. Each videotape was rated in various domains of technical skill on a scale of 1 to 5 (with higher scores indicating more advanced skill) by at least 10 peer surgeons who were unaware of the identity of the operating surgeon. We then assessed relationships between these skill ratings and risk-adjusted complication rates, using data from a prospective, externally audited, clinical-outcomes registry involving 10,343 patients. Results: Mean summary ratings of technical skill ranged from 2.6 to 4.8 across the 20 surgeons. The bottom quartile of surgical skill, as compared with the top quartile, was associated with higher complication rates (14.5% vs. 5.2%, P<0.001) and higher mortality (0.26% vs. 0.05%, P=0.01). The lowest quartile of skill was also associated with longer operations (137 minutes vs. 98 minutes, P<0.001) and higher rates of reoperation (3.4% vs. 1.6%, P=0.01) and readmission (6.3% vs. 2.7%) (P<0.001). Conclusions: The technical skill of practicing bariatric surgeons varied widely, and greater skill was associated with fewer postoperative complications and lower rates of reoperation, readmission, and visits to the emergency department. Although these findings are preliminary, they suggest that peer rating of operative skill may be an effective strategy for assessing a surgeon's proficiency.


... Skill evaluation scales were developed and investigated to validate educational achievements in specific procedures [5][6][7][8]. In the past decade, several studies reporting the correlation between surgical performance and postoperative outcomes have demonstrated the potential utility of intraoperative skill evaluation as a predictor of the surgical outcome [9][10][11][12][13][14][15][16][17]. However, most of these studies were limited to using global rating scales for skill evaluation of surgical procedures. ...
... This demonstrated the construct validity with correlation between the JORS-LDG score and experience in performing laparoscopic surgery and LG. Birkmeyer et al. [11], Fecso et al. [14], and Curtis et al. [15] argued that the surgical outcomes were affected by intraoperative performance and not by the duration of training and history. Based on the abovementioned factors, it is important to provide surgical trainees with abundant case experience in addition to competency-based training focusing on the performance of each procedure on advanced procedures such as LDG. ...
... Over the past decade, studies have demonstrated a correlation between intraoperative performance and short-term outcomes, particularly the postoperative complication rates [11][12][13][14][15][16][17]. Fecso et al. [14] examined 61 LG procedures for patients with gastric cancer performed by three surgeons at three institutions. ...
Article
Background The Japanese operative-rating scale for laparoscopic distal gastrectomy (JORS-LDG) was developed through cognitive task analysis together with the Delphi method to measure intraoperative performance during laparoscopic distal gastrectomy. This study aimed to investigate the value of this rating scale as an educational tool and a surgical outcome predictor in laparoscopic distal gastrectomy. Methods The surgical performance of laparoscopic distal gastrectomy was assessed by the first assistant, through self-evaluation in the operating room and by video raters blind to the case. We evaluated inter-rater reliability, internal consistency, and correlations between the JORS-LDG scores and the evaluation methods, patient characteristics, and surgical outcomes. Results Fifty-four laparoscopic distal gastrectomy procedures performed by 40 surgeons at 16 institutions were evaluated in the operating room and with video recordings using the proposed rating scale. The video inter-rater reliability was > 0.8. Participating surgeons were divided into the low, intermediate, and high groups based on their total scores. The number of laparoscopic surgeries and laparoscopic gastrectomy procedures performed differed significantly among the groups according to laparoscopic distal gastrectomy skill levels. The low, intermediate, and high groups also differed in terms of median operating times (311, 266, and 229 min, respectively, P < 0.001), intraoperative complication rates (27.8, 11.8, and 0%, respectively, P = 0.01), and postoperative complication rates (22.2, 0, and 0%, respectively, P = 0.002). Conclusions The JORS-LDG is a reliable and valid measure for laparoscopic distal gastrectomy training and could be useful in predicting surgical outcomes.
... However, the use of VBA to inform competency decisions for trainees requires robust supporting evidence. A landmark paper from Birkmeyer et al. published in 2013 reported a significant association between surgeon technical performance and outcomes after Roux-en-Y gastric bypass, including complications, reoperations, and readmissions [12]. A systematic review, however, identified important limitations in the literature published in this field related to lack of standardized assessment tools and reliance on indirect observations of technical performance such as postoperative imaging or pathological specimen quality [13]. ...
... Twenty-three articles were excluded (articles and reasons for exclusion are listed in Supplemental Digital Content 1) and 11 articles met eligibility criteria [12, 29-38]. ...
... All were observational studies (10 cohort and 1 case-control study). All the other ten identified studies followed after the publication of the landmark paper by Birkmeyer et al. [12]. Eight of 11 studies were multicenter collaborations. Two studies involved urologic procedures [34,36] with the remainder involving general surgery procedures (foregut/bariatric [n = 4], colorectal [n = 4] and hepatobiliary surgery [n = 1]) [12, 29-33, 35, 37, 38]. ...
Article
Background Efforts to improve surgical safety and outcomes have traditionally placed little emphasis on intraoperative performance, partly due to difficulties in measurement. Video-based assessment (VBA) provides an opportunity for blinded and unbiased appraisal of surgeon performance. Therefore, we aimed to systematically review the existing literature on the association between intraoperative technical performance, measured using VBA, and patient outcomes. Methods Major databases (Medline, Embase, Cochrane Database, and Web of Science) were systematically searched for studies assessing the association of intraoperative technical performance measured by tools supported by validity evidence with short-term (≤ 30 days) and/or long-term postoperative outcomes. Study quality was assessed using the Newcastle–Ottawa Scale. Results were appraised descriptively as study heterogeneity precluded meta-analysis. Results A total of 11 observational studies were identified involving 8 different procedures in foregut/bariatric (n = 4), colorectal (n = 4), urologic (n = 2), and hepatobiliary surgery (n = 1). The number of surgeons assessed ranged from 1 to 34; patient sample size ranged from 47 to 10,242. High risk of bias was present in 5 of 8 studies assessing short-term outcomes and 2 of 6 studies assessing long-term outcomes. Short-term outcomes were reported in 8 studies (i.e., morbidity, mortality, and readmission), while 6 reported long-term outcomes (i.e., cancer outcomes, weight loss, and urinary continence). Better intraoperative performance was associated with fewer postoperative complications (6 of 7 studies), reoperations (3 of 4 studies), and readmissions (1 of 4 studies). Long-term outcomes were less commonly investigated, with mixed results. Conclusion Current evidence supports an association between superior intraoperative technical performance measured using surgical videos and improved short-term postoperative outcomes. 
Intraoperative performance analysis using video-based assessment represents a promising approach to surgical quality-improvement.
... [1][2][3][7][8][9] Nevertheless, the technical skills (TS) of all different occupational groups are a strong predictor of patient outcome. 10,11 Every member of the team depends on his or her TS and superior manual dexterity in the management of even a simple procedure. TS are even more important in complex and often low-frequency high-risk procedures. ...
... 11 Although it is often intuitively taken for granted that excellent patient outcomes are linked with TS, data supporting such an association is poorly established. 10,11 ...
... In the OR, technique has an effect on the duration of a procedure, shortening the time under anesthesia and preventing complications, including blood loss or need for coagulation. 10 Compared to attending physicians, trainee participation in surgeries did not directly correlate with increased risk of infections, despite evidence of longer operative times. 35 The relationship between high-volume procedures and favorable patient outcomes has been attributed to high-end procedures, where technical competence is a strong marker for patient outcome. ...
Today’s effective leaders create opportunities for their teams to develop both technical and non-technical skills. In the perioperative arena, the focus until now mainly has been on improving non-technical skills, with very few studies analyzing the relationship between technical skills and patient outcomes. Technical competence requires assessment of one’s own strengths and weaknesses, inclusion of deliberate goal-oriented practice, objective structured feedback assessment, and a focus on best practice and improved patient outcomes. In this article we address the prerequisites, assessment and implications of technical skills for perioperative leadership, and provide key metrics impacting patient outcomes and leadership development.
... , and increasing research has demonstrated that surgeon's skills may be one of the most important determinant factors for patient outcomes [2][3][4]. Even among certificated surgeons, the bottom quartile of surgeons compared with the top quartile ones was associated with higher complications rates, mortality, and shorter overall survival time for cancer [2,5]. To ensure safety and quality of healthcare, an efficient evaluation system for surgical skills is urgently needed to train and screen competent surgeons. ...
Article
Background Due to varied surgical skills and the lack of an efficient rating system, we developed Surgesture based on elementary functional surgical gestures performed by surgeons, which could serve as objective metrics to evaluate surgical performance in laparoscopic cholecystectomy (LC). Methods We defined 14 LC basic Surgestures. Four surgeons annotated Surgestures among LC videos performed by experts and novices. The counts, durations, average action time, and dissection/exposure ratio (D/E ratio) of LC Surgestures were compared. The phase of mobilizing hepatocystic triangle (MHT) was extracted for skill assessment by three professors using a modified Global Operative Assessment of Laparoscopic Skills (mGOALS). Results The novice operation time was significantly longer than the expert operation time (58.12 ± 19.23 min vs. 26.66 ± 8.00 min, P < 0.001), particularly during MHT phase. Novices had significantly more Surgestures than experts in both hands (P < 0.05). The left hand and inefficient Surgesture of novices were dramatically more than those of experts (P < 0.05). The experts demonstrated a significantly higher D/E ratio of duration than novices (0.79 ± 0.37 vs. 2.84 ± 1.98, P < 0.001). The counts and time pattern map of LC Surgestures during MHT demonstrated that novices tended to complete LC with more types of Surgestures and spent more time exposing the surgical scene. The performance metrics of LC Surgesture had significant but weak associations with each aspect of mGOALS. Conclusion The newly constructed Surgestures could serve as accessible and quantifiable metrics for demonstrating the operative pattern and distinguishing surgeons with various skills. The association between Surgestures and Global Rating Scale laid the foundation for establishing a bridge to automated objective surgical skill evaluation.
... Videotaping procedures is currently the most feasible way to obtain comprehensive data on technical skill. In the surgical field, assessing technical skill using video recordings has been utilized to identify variations in surgeon quality [9] and to provide individualized feedback to drive quality improvement efforts [10]. In gastrointestinal endoscopy, we and others have shown that assessments of technical skill via a structured review of videotaped colonoscopy procedures highly correlates with existing metrics of colonoscopy quality, including ADR and serrated polyp detection rates (SDR) [11][12][13]. ...
... The use of video-based feedback regarding technical skill has gained prominence, predominantly in the surgical literature, since a landmark study demonstrated a strong association between video-based assessment of surgical skill from a single video and clinical outcomes, including death [9]. Subsequent studies have examined the use of video-based feedback derived from multiple sources, including experts, peers, and even novice reviewers. ...
Article
Background and study aims Colonoscopy inspection quality (CIQ) assesses skills (fold examination, cleaning, and luminal distension) during inspection for polyps and correlates with adenoma detection rate (ADR) and serrated detection rate (SDR). We aimed to determine whether providing individualized CIQ feedback with instructional videos improves quality metrics performance. Methods We prospectively studied 16 colonoscopists who already received semiannual benchmarked reports of quality metrics (ADR, SDR, and withdrawal time [WT]). We randomly selected seven colonoscopies/colonoscopist for evaluation. Six gastroenterologists graded CIQ using an established scale. We created instructional videos demonstrating optimal and poor inspection techniques. Colonoscopists received the instructional videos and benchmarked CIQ performance. We compared ADR, SDR, and WT in the 12 months preceding (“baseline”) and following CIQ feedback. Colonoscopists were stratified by baseline ADR into lower (≤ 34 %) and higher-performing (> 34 %) groups. Results Baseline ADR was 38.5 % (range 26.8 %–53.8 %) and SDR was 11.2 % (2.8 %–24.3 %). The proportion of colonoscopies performed by lower-performing colonoscopists was unchanged from baseline to post-CIQ feedback. All colonoscopists reviewed their CIQ report cards. Post-feedback, ADR (40.1 % vs 38.5 %, P = 0.1) and SDR (12.2 % vs. 11.2 %, P = 0.1) did not significantly improve; WT significantly increased (11.4 vs 12.4 min, P < 0.01). Among the eight lower-performing colonoscopists, group ADR (31.1 % vs 34.3 %, P = 0.02) and SDR (7.2 % vs 9.1 %, P = 0.02) significantly increased post-feedback. In higher-performing colonoscopists, ADR and SDR did not change. Conclusions CIQ feedback modestly improves ADR and SDR among colonoscopists with lower baseline ADR but has no effect on higher-performing colonoscopists. Individualized feedback on colonoscopy skills could be used to improve polyp detection by lower-performing colonoscopists.
... Surgical adverse event video datasets: an unmet need in surgical safety. A growing body of evidence supports the quantitative analysis of surgical video 22,[45][46][47][48]. One fundamental discovery has been the detection of signals in surgical video that predict patient outcome: surgeons have heterogeneous skill resulting in heterogeneous outcomes 14,45,46,49. Although low-skill surgeons are more likely to have adverse intraoperative events, video of these events has not been systematically studied. ...
Article
Major vascular injury resulting in uncontrolled bleeding is a catastrophic and often fatal complication of minimally invasive surgery. At the outset of these events, surgeons do not know how much blood will be lost or whether they will successfully control the hemorrhage (achieve hemostasis). We evaluate the ability of a deep learning neural network (DNN) to predict hemostasis control ability using the first minute of surgical video and compare model performance with human experts viewing the same video. The publicly available SOCAL dataset contains 147 videos of attending and resident surgeons managing hemorrhage in a validated, high-fidelity cadaveric simulator. Videos are labeled with outcome and blood loss (mL). The first minute of 20 videos was shown to four blinded, fellowship-trained skull-base neurosurgery instructors, and to SOCALNet (a DNN trained on SOCAL videos). SOCALNet architecture included a convolutional network (ResNet) identifying spatial features and a recurrent network identifying temporal features (LSTM). Experts independently assessed surgeon skill, predicted outcome and blood loss (mL). Outcome and blood loss predictions were compared with SOCALNet. Expert inter-rater reliability was 0.95. Experts correctly predicted 14/20 trials (Sensitivity: 82%, Specificity: 55%, Positive Predictive Value (PPV): 69%, Negative Predictive Value (NPV): 71%). SOCALNet correctly predicted 17/20 trials (Sensitivity 100%, Specificity 66%, PPV 79%, NPV 100%) and correctly identified all successful attempts. Expert predictions of the highest and lowest skill surgeons and expert predictions reported with maximum confidence were more accurate. Experts systematically underestimated blood loss (mean error −131 mL, RMSE 350 mL, R² 0.70) and fewer than half of expert predictions identified blood loss > 500 mL (47.5%, 19/40). SOCALNet had superior performance (mean error −57 mL, RMSE 295 mL, R² 0.74) and detected most episodes of blood loss > 500 mL (80%, 8/10).
In validation experiments, SOCALNet evaluation of a critical on-screen surgical maneuver and high/low-skill composite videos were concordant with expert evaluation. Using only the first minute of video, experts and SOCALNet can predict outcome and blood loss during surgical hemorrhage. Experts systematically underestimated blood loss, and SOCALNet had no false negatives. DNNs can provide accurate, meaningful assessments of surgical video. We call for the creation of datasets of surgical adverse events for quality improvement research.
... Laparoscopic RYGB is a technically advanced procedure and it has been estimated that it takes approximately 100 operations to master the technique [13][14][15], with wide differences in technical skills between surgeons [16]. However, while many studies of the learning process have been conducted, studies regarding the decay of laparoscopic surgical skills are scarce. ...
... Since the decay of laparoscopic skills in experienced surgeons in high-volume surgery facilities seems almost unexplored, there is room for future research in this field. The technical skills may vary widely even for experienced surgeons and although a sensitivity analysis including hospital volume failed to show major differences in the effect of summer closure, individual surgeons may experience differences in decay from absence from surgery [16]. ...
Article
Introduction Bariatric surgery is an effective method of treating obesity, with gastric bypass and sleeve gastrectomy being the most common techniques used worldwide. Despite the technical challenges in these methods, little is known about the effects of summer closure on the incidence of serious postoperative complications in surgeries performed shortly after summer vacation. This has therefore been studied in our large cohort. Materials and methods A retrospective cohort study based on data from the Scandinavian Obesity Surgery Registry was conducted. Patients who underwent a primary gastric bypass or sleeve gastrectomy operation between 2010 and 2019 were included. The rate of serious complications within 30 days after surgery for patients who underwent surgery the first month after summer closure was compared to those who underwent surgery during the rest of the year using the χ² test and adjusted logistic regression. Results The study included 42,404 patients, 36,094 of whom underwent gastric bypass and 6310 of whom received sleeve gastrectomy. Summer closure was associated with an increased risk for serious postoperative complications in gastric bypass surgery (adjusted odds ratio (adj-OR) = 1.17; 95% confidence interval (CI): 1.01–1.36). No statistically significant association was seen for sleeve gastrectomy (adj-OR = 1.17; 95% CI: 0.72–1.91), nor in overall complication rate. Conclusions Summer closure increases the risk of serious postoperative complications in gastric bypass surgery. No statistically significant association was found for sleeve gastrectomy surgery.
... Another benefit of retrospective review of surgical performance is that it facilitates peer coaching. Learning surgical technical and non-technical skills, which are determinants of patient outcome (Birkmeyer et al. 2013), in real-time during a procedure is often difficult due to external pressures. According to Bonrath et al. (2015b), a way to enhance trainee and surgeon learning is through "… objective assessment, structured debriefing, feedback, behavior-modeling, and guided self-reflection." ...
... In the United States, efforts have been undertaken to collate high-fidelity intraoperative data capture from multiple sites. Statewide digital health repositories such as the Michigan Bariatric Surgery Collaborative (MBSC) and the Michigan Urological Surgery Improvement Collaborative (MUSIC) have taken advantage of data collected from multiple hospitals in order to analyze and optimize the quality of care being delivered in the state (Birkmeyer et al. 2013;Ghani et al. 2016). Through high-volume analysis, research questions can be approached with high volume data and sufficient power in order to draw meaningful conclusions at a statewide level. ...
Book
Addresses the motivation and enablers for digital health innovations. Contextualizes the application and technical considerations, as well as the socio-psycho-economical ones, influencing many digital health technologies’ acceptance and widespread use. Presents a comprehensive state-of-the-art approach to digital health technologies and practices.
... P=0.01) and readmission (6.3% vs. 2.7%) (P<0.001) [20]. Moreover, two decades ago, the American College of Surgeons stated that in healthcare institutions recognized for their expertise in bariatric surgery, there must be a proven commitment to provide properly trained and funded bariatric surgery facilities, equipment, and support staff under the direction of a qualified surgeon. ...
Article
Purpose The effectiveness of enhanced recovery after surgery (ERAS) pathways in patients undergoing bariatric surgery remains unclear. Our objective was to determine the effect of the ERAS elements on patient outcomes following elective bariatric surgery. Materials and Methods Prospective cohort study in adult patients undergoing elective bariatric surgery. Each participating center selected a single 3-month data collection period between October 2019 and September 2020. We assessed the 24 individual components of the ERAS pathways in all patients. We used a multivariable and multilevel logistic regression model to adjust for baseline risk factors, ERAS elements, and center differences. Results We included 1419 patients. One hundred and fourteen patients (8%) developed postoperative complications. There were no differences in the incidence of overall postoperative complications between the self-designated ERAS and non-ERAS groups (54 (8.7%) vs. 60 (7.6%); OR, 1.14; 95% CI, 0.73–1.79; P = .56), nor for moderate-to-severe complications, readmissions, re-interventions, mortality, or hospital stay (2 [IQR 2–3] vs. 3 [IQR 2–4] days, 0.85; 95% CI, 0.62–1.17; P = .33). Adherence to the ERAS elements in the highest adherence quartile (Q1) was greater than 72.2%, while in the lowest adherence quartile (Q4) it was less than 55%. Patients with the highest adherence rates had shorter hospital stay (2 [IQR 2–3] vs. 3 [IQR 2–4] days, 1.54; 95% CI, 1.09–2.17; P = .015), while there were no differences in the other outcomes. Conclusions Higher adherence to ERAS Society® recommendations was associated with a shorter hospital stay without an increase in postoperative complications or readmissions. Trial Registration ClinicalTrials.gov Identifier: NCT03864861
... Finally, task-based objective performance indicators (OPIs) other than total operative time are often neglected despite offering the potential for improved and focused feedback (14)(15)(16)(17). There exists an opportunity to develop more objective methods that can scale for broad use (18)(19)(20) given a limited number of studies use subjective methods to estimate the impact of a surgeon's technical skills on patient outcomes (21)(22)(23)(24)(25). Additionally, these objective methods need to be able to be applied to an individual surgeon, within institutions, or across institutions. ...
Article
Objective Surgical efficiency and variability are critical contributors to optimal outcomes, patient experience, care team experience, and total cost to treat per disease episode. Opportunities remain to develop scalable, objective methods to quantify surgical behaviors that maximize efficiency and reduce variability. Such objective measures can then be used to provide surgeons with timely and user-specific feedback to monitor performance and facilitate training and learning. In this study, we used objective task-level analysis to identify dominant contributors toward surgical efficiency and variability across the procedural steps of robotic-assisted sleeve gastrectomy (RSG) over a five-year period for a single surgeon. These results enable actionable insights that can both complement those from population level analyses and be tailored to an individual surgeon's practice and experience. Methods Intraoperative video recordings of 77 RSG procedures performed by a single surgeon from 2015 to 2019 were reviewed and segmented into surgical tasks. Surgeon-initiated events when controlling the robotic-assisted surgical system were used to compute objective metrics. A series of multi-staged regression analyses was used to determine: if any specific tasks or patient body mass index (BMI) statistically impacted procedure duration; which objective metrics impacted critical task efficiency; and which task(s) statistically contributed to procedure variability. Results Stomach dissection was found to be the most significant contributor to procedure duration (β = 0.344, p < 0.001; R = 0.81, p < 0.001) followed by surgical inactivity and stomach stapling. Patient BMI was not found to be statistically significantly correlated with procedure duration (R = −0.01, p = 0.90). Energy activation rate, a robotic system event-based metric, was identified as a dominant feature in predicting stomach dissection duration and differentiating earlier and later case groups.
Reduction of procedure variability was observed between earlier (2015-2016) and later (2017-2019) groups (IQR = 14.20 min vs. 6.79 min). Stomach dissection was found to contribute most to procedure variability (β = 0.74, p < 0.001). Conclusions A surgical task-based objective analysis was used to identify major contributors to surgical efficiency and variability. We believe this data-driven method will enable clinical teams to quantify surgeon-specific performance and identify actionable opportunities focused on the dominant surgical tasks impacting overall procedure efficiency and consistency.
... Studies have demonstrated that technical skills may correlate with surgical outcomes 2,41 . Improvement in technical skills may improve the outcome, hence, current attempts in simulation training are focused on enhancing trainee technical skills acquisition. ...
Article
In procedural-based medicine, the technical ability can be a critical determinant of patient outcomes. Psychomotor performance occurs in real-time, hence a continuous assessment is necessary to provide action-oriented feedback and error avoidance guidance. We outline a deep learning application, the Intelligent Continuous Expertise Monitoring System (ICEMS), to assess surgical bimanual performance at 0.2-s intervals. A long-short term memory network was built using neurosurgeon and student performance in 156 virtually simulated tumor resection tasks. Algorithm predictive ability was tested separately on 144 procedures by scoring the performance of neurosurgical trainees who are at different training stages. The ICEMS successfully differentiated between neurosurgeons, senior trainees, junior trainees, and students. Trainee average performance score correlated with the year of training in neurosurgery. Furthermore, coaching and risk assessment for critical metrics were demonstrated. This work presents a comprehensive technical skill monitoring system with predictive validation throughout surgical residency training, with the ability to detect errors.
... Decreasing these risks correlates with the skill and experience of the surgeon. 27 Thus, it may be expected that patients do not feel comfortable about the involvement of trainee surgeons in their care. ...
Article
Background: As the worldwide demand for specialist surgeons increases, and to complement surgical training provided through governmental institutions, private hospitals are increasingly hosting trainees. Wits Donald Gordon Medical Centre (WDGMC) is a private academic hospital in Johannesburg with a Colorectal Unit (CRU) that hosts several trainees. While published studies demonstrate that the involvement of trainees in surgery does not adversely impact outcomes, private patients' perceptions of the role of trainees in their care have not been as widely researched. Methods: This was a prospective, cross-sectional study using a self-administered questionnaire hosted on a REDCap database. Statistical analysis was performed using SPSS version 26. Results: One hundred and seventy-four patients participated in the study, and 74.1% of respondents felt that training of doctors should occur in private hospitals in South Africa. Of the sample, 83.3% would allow a supervised trainee to perform a part of their operation, provided they had been made aware of trainee participation in advance (78%). Sixty per cent of patients felt that interaction with a trainee enhanced their care, and 52.3% of patients suggested that seeing more than one doctor a day improved their experience. Conclusion: Our results suggest that privately funded patients support the surgical training of medical doctors in private academic training hospitals, and they are willing to be participants in the training process. Moreover, training programmes in this setting appear to enhance the patient experience. We are optimistic that these findings could be used to advocate for expanded training opportunities across the private sector in South Africa.
... Provocative statements from eminent surgeons, within an outcome-driven society, could create a culture where there is reluctance to train. Birkmeyer 40 states that technical proficiency is one of the most important determinants of outcome after surgery, and McMillan 41 observed that rates of CR-POPF were inversely proportional to years of surgical experience. The present study is therefore reassuring for patients, trainees, and trainers. ...
Article
Background The complexity of pancreaticoduodenectomy and fear of morbidity, particularly postoperative pancreatic fistula, can be a barrier to surgical trainees gaining operative experience. This meta-analysis sought to compare the postoperative pancreatic fistula rate after pancreatoenteric anastomosis by trainees or established surgeons. Methods A systematic review of the literature was performed using Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines, with differences in postoperative pancreatic fistula rates after pancreatoenteric anastomosis between trainee-led versus consultant/attending surgeons pooled using meta-analysis. Variation in rates of postoperative pancreatic fistula was further explored using risk-adjusted outcomes using published risk scores and cumulative sum control chart analysis in a retrospective cohort. Results Across 14 cohorts included in the meta-analysis, trainees tended toward a lower but nonsignificant rate of all postoperative pancreatic fistula (odds ratio: 0.77, P = .45) and clinically relevant postoperative pancreatic fistula (odds ratio: 0.69, P = .37). However, there was evidence of case selection, with trainees being less likely to operate on patients with a pancreatic duct width <3 mm (odds ratio: 0.45, P = .05). Similarly, analysis of a retrospective cohort (N = 756 cases) found patients operated by trainees to have significantly lower predicted all postoperative pancreatic fistula (median: 20 vs 26%, P < .001) and clinically relevant postoperative pancreatic fistula (7 vs 9%, P = .020) rates than consultant/attending surgeons, based on preoperative risk scores. After adjusting for this on multivariable analysis, the risks of all postoperative pancreatic fistula (odds ratio: 1.18, P = .604) and clinically relevant postoperative pancreatic fistula (odds ratio: 0.85, P = .693) remained similar after pancreatoenteric anastomosis by trainees or consultant/attending surgeons. 
Conclusion Pancreatoenteric anastomosis, when performed by trainees, is associated with acceptable outcomes. There is evidence of case selection among patients undergoing surgery by trainees; hence, risk adjustment provides a critical tool for the objective evaluation of performance.
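The risk-adjusted cumulative sum idea mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes an observed-minus-expected formulation, which is one common variant and not necessarily the one the authors used; the function name and the data are hypothetical.

```python
from itertools import accumulate


def observed_minus_expected_cusum(outcomes, predicted_risks):
    """Running sum of (observed - expected) events, case by case.

    outcomes: 1 if the complication occurred, 0 otherwise.
    predicted_risks: preoperative predicted probability for each case.
    A rising curve means more events than the risk score predicted.
    """
    if len(outcomes) != len(predicted_risks):
        raise ValueError("outcomes and predicted_risks must have equal length")
    return list(accumulate(o - p for o, p in zip(outcomes, predicted_risks)))


# Hypothetical consecutive cases for one operator (illustrative values only).
outcomes = [0, 1, 0, 0, 1]
predicted_risks = [0.25, 0.5, 0.25, 0.25, 0.5]
curve = observed_minus_expected_cusum(outcomes, predicted_risks)
print(curve)
```

Because each case contributes its own predicted risk, an operator who takes lower-risk cases is not credited for case selection, which is the point the abstract makes about risk adjustment.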
... Moreover, we found similar results regarding advanced age and obstetricians' application of the guidelines for the prevention of preterm birth [47]. Nevertheless, literature results in this domain remain disparate [48]. Maintaining a high level of competence through continuous training of health professionals is a major challenge that must be met to ensure a high level of care. ...
Article
Background: Postpartum hemorrhage (PPH) remains a leading cause of maternal morbidity and mortality worldwide. Midwives play a key role in the initial management of PPH. Uterotonic agents are widely used in its prevention and treatment, with oxytocin the first-line agent. Nonetheless, a standardized guideline for optimal dose and rate of administration has not been clearly defined. The aim of this study was to investigate French midwives' practices regarding first-line oxytocin treatment and the factors influencing its delayed administration. Methods: This multicenter study was based on clinical vignettes of PPH management collected using an anonymous online questionnaire. A random sample of midwives from 145 maternity units in France from 15 randomly selected perinatal networks was invited to participate by email. The previously validated case vignettes described two different scenarios of severe PPH. Vignette 1 described a typical immediate, severe PPH, and vignette 2 a less typical case of severe but gradual PPH. They were constructed in three successive steps and included multiple-choice questions proposing several types of clinical practice options at each stage. For each vignette separately, we analyzed the lack of prompt oxytocin administration and the factors contributing to it, that is, characteristics of the midwives and organizational features of maternity units. Bivariate analysis and multivariable logistic regression analysis were applied. Results: In all, 450 midwives from 87 maternity units provided complete responses. Lack of promptness was observed in 21.6% of responses (N = 97) in Vignette 1 and in 13.8% (N = 62) in Vignette 2 (p < .05). After multivariate analysis, the risk of delay was lower among midwives working in university maternity hospitals (ORa 0.47, 95% CI 0.21, 0.97) and in units with 1500 to 2500 births per year (ORa 0.49, 95% CI 0.26, 0.90) for Vignette 1.
We also noticed that delay increased with the midwives' years of experience (per 10-year period) (ORa 1.30, 95% CI 1.01, 1.69). Conclusions: This study using clinical vignettes showed delays in oxytocin administration for first-line treatment of PPH. Because delay in treatment is a major cause of preventable maternal morbidity in PPH, these findings suggest that continuing training of midwives should be considered, especially in small maternity units.
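An adjusted odds ratio reported "per 10-year period" can be rescaled to other intervals. The Python sketch below shows the arithmetic, assuming the logistic model treats years of experience as a linear term on the log-odds scale (an assumption, since the abstract does not state the model form); the function name is hypothetical.

```python
import math


def scale_odds_ratio(or_per_unit, units):
    """Odds ratio for `units` multiples of the reference increment.

    Assumes the log-odds change linearly with the predictor, so the
    odds ratios multiply: OR(k units) = OR(1 unit) ** k.
    """
    return math.exp(units * math.log(or_per_unit))


or_per_decade = 1.30  # point estimate reported in the abstract
or_two_decades = scale_odds_ratio(or_per_decade, 2)    # 20 years of experience
or_five_years = scale_odds_ratio(or_per_decade, 0.5)   # 5 years of experience
print(round(or_two_decades, 2), round(or_five_years, 2))
```

The rescaled values are point estimates only; the confidence interval would have to be rescaled the same way on the log scale.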
... Factors including excess body weight, remission of comorbidities as well as CD4 counts, VL and ARV exposure are all essential parameters to monitor [26]. Other individual factors may influence the outcome of the surgery, including technical skills of the surgery centres [104], the patients' characteristics [30] and the willingness to change a lifestyle after surgery [105]. All these factors are difficult to standardise or to anticipate. ...
Article
Bariatric surgery is increasingly applied among people living with HIV to reduce obesity and the associated morbidity and mortality. In people living with HIV, sufficient antiretroviral exposure and activity should always be maintained to prevent development of resistance and disease progression. However, bariatric surgery procedures bring various gastrointestinal modifications including changes in gastric volume, and acidity, gastrointestinal emptying time, enterohepatic circulation and delayed entry of bile acids. These alterations may affect many aspects of antiretroviral pharmacokinetics. Some drug characteristics may result in subtherapeutic exposure and the potential related risk of treatment failure and resistance. Antiretrovirals that require low pH, administration of fatty meals, longer intestinal exposure, and an enterohepatic recirculation for their absorption may be most impacted by bariatric surgery procedures. Additionally, some antiretrovirals can interact with the polyvalent cations in supplements or drugs inhibiting gastric acid, thereby preventing their use as these comedications are commonly prescribed post-bariatric surgery. Predicting pharmacokinetics on the basis of drug characteristics solely proved to be challenging, therefore pharmacokinetic studies remain crucial in this population. Here, we discuss general implications of bariatric surgery on antiretroviral outcomes in people living with HIV as well as drug properties that are relevant for the choice of antiretroviral treatment in this special patient population. Additionally, we summarise studies that evaluated the pharmacokinetics of antiretrovirals post-bariatric surgery. Finally, we performed a comprehensive analysis of theoretical considerations and published pharmacokinetic and pharmacodynamic data to provide recommendations on antiretrovirals for people living with HIV undergoing bariatric surgery.
... Paradoxically, higher surgical activity may lead to attenuation of adverse outcomes. Birkmeyer et al., in a study linking surgical skill and complication rates after bariatric surgery, observed technical skill was strongly correlated to procedural volume (32). ...
Article
Improving organ acceptance and utilization rates is critical to ensure we maximize usage of donated organs as a scarce resource. Many factors underlie unnecessary discard of viable organs. Declined transplantation opportunities for candidates is associated with increased wait-list mortality. Technological advancements in organ preservation may help bridge the gap between donation and utilization, but an overlooked obstacle is the practice of risk aversion by transplant professionals when decision-making under risk. Lessons from behavioral economics, where experimental work has outlined the impact of loss or risk aversion on decision-making, have not been translated to transplantation. Many external factors can influence decision-making when accepting or utilizing organs, which are potentially amendable if external conditions are improved. However, attitudes and perceptions to risk for transplant professionals can pervade decision-making and influence behaviour. If we wish to change this behavior, then the underlying nature of decision-making under risk when accepting or utilizing organs must be studied to facilitate the design of targeted behavior change interventions to convert risk aversion to risk tolerance. To ensure optimal use of donated organs, we need more research into decision-making under risk.
... The technical skills of a transplant surgeon are paramount to achieving good patient outcomes in kidney transplantation. Birkmeyer et al. argue that technical proficiency might be one of the most important determinants of surgical outcomes [1]. For instance, the time and technical proficiency with which vascular anastomoses are performed during a kidney transplant influence the warm ischemia time of the donated organ. ...
Article
One of the most challenging aspects of the kidney transplant operation is performing vascular anastomoses in the confines and depths of the iliac fossa. General surgery residents need to be adequately trained in this skill to maximize their intraoperative experience during their transplant surgery rotation. While several kidney transplant models have been developed, they are limited in their ability to simulate the challenges of performing anastomoses at varying depths and in confined spaces. Furthermore, they may be expensive or require specialized equipment, such as three-dimensional printers, to build. In this technical report, we describe how to build a low-fidelity, low-cost, and portable kidney transplant model capable of simulating vascular anastomoses at varying depths. Our model can be easily replicated for less than 30 USD using materials available in local stores. It uses inexpensive and reusable parts, allowing trainees a high volume of repetitions.
... Traditionally, clinicians have used pre-operative variables to predict the degree of gallbladder inflammation and thus surgical difficulty [3]. Increasingly intraoperative grading scores have been shown to be associated with operative outcomes and technical difficulty [4][5][6][7]. Given outcomes are often related to actions taken intraoperatively, quantification of technical difficulty allows for operative benchmarking, prediction of postoperative outcomes, and development of research standards [8]. ...
Article
Aim: Computer vision is a subset of machine learning (ML) technology that allows automated analysis of large operative video datasets. The aim of this study was to use a commercially available ML-driven platform to evaluate a subjective grading of operative difficulty in laparoscopic cholecystectomy (LC). Methods: Patients undergoing LC prospectively consented, and their operations were recorded. The intra-operative findings were prospectively graded (1-4) based on intraoperative gallbladder appearance assessments. Deidentified videos were uploaded to Touch Surgery™ and run through the platform's algorithm, providing automated analytics including the total operative length and operative phase length. The rate of critical view of safety (CVS) achievement was also included in the analysis. Results: 206 LC were included. 27 LC were excluded due to incomplete video recording and were therefore not amenable to the final data analysis. Grade 1 and 2 patients had significantly shorter operative time than grade 3 and 4 patients [17 min 53 s (IQR 15 min 24 s to 21 min 38 s) vs. 25 min 49 s (IQR 20 min 12 s to 38 min 38 s) (P < 0.010)]. The operative phases for each step were significantly longer in patients with gallbladders graded 3 or 4 compared to those patients graded 1 or 2 (P < 0.043). The CVS was achieved in 94% of grade 1 patients, 88% of grade 2 patients, 85% of grade 3 patients and 73% of grade 4 patients (P = 0.177). Conclusion: Increased operative time and decreased ability to achieve the CVS with more difficult intraoperative findings support the utility of the proposed grading system. ML in surgery is a nascent field, but this study demonstrates the potential of commercially available platforms for use in operative analytics, documentation, audit and training of future surgeons.
... Most importantly, the goal of all surgical training is to improve patient safety and outcomes but, currently, there is little data to tie surgical performance analysis to post-operative outcomes. In a study from 2013, Birkmeyer et al. established that surgical skill as determined by expert manual review during bariatric operations predicts patient outcomes [45]. That skills assessments are not validated with correlation to clinical outcomes is especially true in robotic surgery. ...
Article
Background Evaluation of robotic surgical skill has become increasingly important as robotic approaches to common surgeries become more widely utilized. However, evaluation of these currently lacks standardization. In this paper, we aimed to review the literature on robotic surgical skill evaluation. Methods A review of literature on robotic surgical skill evaluation was performed and representative literature presented over the past ten years. Results The study of reliability and validity in robotic surgical evaluation shows two main assessment categories: manual and automatic. Manual assessments have been shown to be valid but typically are time consuming and costly. Automatic evaluation and simulation are similarly valid and simpler to implement. Initial reports on evaluation of skill using artificial intelligence platforms show validity. Few data on evaluation methods of surgical skill connect directly to patient outcomes. Conclusion As evaluation in surgery begins to incorporate robotic skills, a simultaneous shift from manual to automatic evaluation may occur given the ease of implementation of these technologies. Robotic platforms offer the unique benefit of providing more objective data streams including kinematic data which allows for precise instrument tracking in the operative field. Such data streams will likely incrementally be implemented in performance evaluations. Similarly, with advances in artificial intelligence, machine evaluation of human technical skill will likely form the next wave of surgical evaluation.
... Increasing evidence has suggested intraoperative skills are associated with patient outcomes 1,2. The European School of Coloproctology (ESC) of the European Society of Coloproctology (ESCP) was set up to improve training and benchmark standard in different colorectal procedures and to improve patient outcomes 3. ...
Article
Background: This study aimed to evaluate the use of binary metric-based (proficiency-based progression; PBP) performance assessments and global evaluative assessment of robotic skills (GEARS) of a robotic-assisted low anterior rectal resection (RA-LAR) procedure. Method: A prospective study of video analysis of RA-LAR procedures was carried out using the PBP metrics with binary parameters previously developed, and GEARS. Recordings were collected from five novice surgeons (≤30 RA-LAR previously performed) and seven experienced surgeons (>30 RA-LAR previously performed). Two consultant colorectal surgeons were trained to be assessors in the use of PBP binary parameters to evaluate the procedure phases, surgical steps, errors, and critical errors in male and female patients and GEARS scores. Novice and experienced surgeons were categorized and assessed using PBP metrics and GEARS; mean scores obtained were compared for statistical purpose. Also, the inter-rater reliability (IRR) of these assessment tools was evaluated. Results: Twenty unedited recordings of RA-LAR procedures were blindly assessed. Overall, using PBP metric-based assessment, a subgroup of experienced surgeons made more errors (20 versus 16, P = 0.158) and critical errors (9.2 versus 7.8, P = 0.417) than the novice group, although not significantly. However, during the critical phase of RA-LAR, experienced surgeons made significantly fewer errors than the novice group (95% CI of the difference, Lower = 0.104 - Upper = 5.155, df = 11.9, t = 2.23, p = 0.042), and a similar pattern was observed for critical errors. The PBP metric and GEARS assessment tools distinguished between the objectively assessed performance of experienced and novice colorectal surgeons performing RA-LAR (total error scores with PBP metrics, P = 0.019-0.008; GEARS scores, P = 0.029-0.025). GEARS demonstrated poor IRR (mean IRR 0.49) and weaker discrimination between groups (15-41 per cent difference). 
PBP binary metrics demonstrated good IRR (mean 0.94) and robust discrimination particularly for total error scores (58-64 per cent). Conclusions: PBP binary metrics seem to be useful for metric-based training for surgeons learning RA-LAR procedures.
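Inter-rater reliability for binary (yes/no) metrics is often summarised as the proportion of items on which two assessors agree. The Python sketch below illustrates that calculation; whether the study used exactly this definition is an assumption, and the function name and scores are hypothetical.

```python
def proportion_agreement(rater_a, rater_b):
    """Fraction of binary metric items scored identically by two raters.

    Computed as agreements / (agreements + disagreements).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item list")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return agreements / len(rater_a)


# Hypothetical scoring of 16 binary error metrics for one video
# (1 = error observed, 0 = not observed). The raters differ on item 4 only.
rater_a = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
rater_b = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
irr = proportion_agreement(rater_a, rater_b)
print(irr)
```

This simple ratio does not correct for chance agreement; a chance-corrected statistic such as Cohen's kappa would give a lower value on the same data.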
... There is no current way to accurately describe the technical complexity of a surgical procedure. While the link between technical skill and outcomes has yet to be established [13,20], potential future studies could use medical history or clinical outcomes as proxies for case difficulty and skill. Given that this was a small study, we were unable to perform subgroup analyses. ...
Article
Introduction Gender bias has been identified consistently in written performance evaluations. Qualitative tools may provide a standardized way to evaluate surgical skill and minimize gender bias. We hypothesized that there is no difference in operative time or GEARS scores in robotic hysterectomy for men vs women surgeons. Methods Patients undergoing robotic hysterectomies performed between June 2019 and March 2020 at 8 hospitals within the same hospital system were captured into a prospective database. GEARS scores were assigned by crowd-sourced evaluators by a third party blinded to any surgeon- or patient-identifying information. One-way ANOVA was used to compare the mean operative time and GEARS scores for each group, and significant variables were included in a one-way ANCOVA to control for confounders. Two-tailed p-value < 0.05 was considered significant. Results Seventeen women and 13 men performed a total of 188 hysterectomies; women performed 34 (18%) and men performed 153 (81%). Women surgeons had a higher mean operative time (133 ± 58 vs 86.3 ± 46 min, p = 0.024); after adjustment, there were no significant differences in operative time (p = 0.607). There was no significant difference between the genders in total GEARS score (20.0 ± 0.77 vs 20.2 ± 0.70, p = 0.415) or GEARS subcomponent scores: bimanual dexterity (3.98 ± 0.03 vs 4.00 ± 0.03, p = 0.705); depth perception (4.04 ± 0.04 vs 4.05 ± 0.02, p = 0.799); efficiency (3.79 ± 0.02 vs 3.82 ± 0.02, p = 0.437); force sensitivity (4.01 ± 0.04 vs 4.05 ± 0.05, p = 0.533); or robotic control (4.16 ± 0.03 vs 4.26 ± 0.01, p = 0.079). Conclusion There was no difference in GEARS score between men vs women surgeons performing robotic hysterectomies. Video-based blinded assessment of skills may minimize gender biases when evaluating surgical skill for competency evaluation and credentialing.
... In contrast to this complexity, the overarching goal of surgery is quite straightforward; to improve post-operative surgical and patient outcomes. Although the determinants of post-operative patient outcomes are multifactorial, recent studies have demonstrated the relationship between intra-operative surgical activity (what and how a surgical procedure is performed) and long-term patient outcomes 1. A better understanding of this relationship, whose mechanics remain unknown for the majority of surgical procedures, would shed light, for example, on what and how surgical behaviour can be modulated to ultimately improve patient outcomes. ...
Preprint
Surgery is a high-stakes domain where surgeons must navigate critical anatomical structures and actively avoid potential complications while achieving the main task at hand. Such surgical activity has been shown to affect long-term patient outcomes. To better understand this relationship, whose mechanics remain unknown for the majority of surgical procedures, we hypothesize that the core elements of surgery must first be quantified in a reliable, objective, and scalable manner. We believe this is a prerequisite for the provision of surgical feedback and modulation of surgeon performance in pursuit of improved patient outcomes. To holistically quantify surgeries, we propose a unified deep learning framework, entitled Roboformer, which operates exclusively on videos recorded during surgery to independently achieve multiple tasks: surgical phase recognition (the what of surgery), gesture classification and skills assessment (the how of surgery). We validated our framework on four video-based datasets of two commonly-encountered types of steps (dissection and suturing) within minimally-invasive robotic surgeries. We demonstrated that our framework can generalize well to unseen videos, surgeons, medical centres, and surgical procedures. We also found that our framework, which naturally lends itself to explainable findings, identified relevant information when achieving a particular task. These findings are likely to instill surgeons with more confidence in our framework's behaviour, increasing the likelihood of clinical adoption, and thus paving the way for more targeted surgical feedback.
... The skill of the surgeon is the single most important determinant of the success of a surgical procedure 1. Assessment of surgical skills may be formative or summative. ...
Preprint
To ensure satisfactory clinical outcomes, surgical skill assessment must be objective, time-efficient, and preferentially automated - none of which is currently achievable. Video-based assessment (VBA) is being deployed in intraoperative and simulation settings to evaluate technical skill execution. However, VBA remains manually- and time-intensive and prone to subjective interpretation and poor inter-rater reliability. Herein, we propose a deep learning (DL) model that can automatically and objectively provide a high-stakes summative assessment of surgical skill execution based on video feeds and low-stakes formative assessment to guide surgical skill acquisition. Formative assessment is generated using heatmaps of visual features that correlate with surgical performance. Hence, the DL model paves the way to the quantitative and reproducible evaluation of surgical tasks from videos with the potential for broad dissemination in surgical training, certification, and credentialing.
... Surgeons' skill in the operating room affects patient outcomes (Birkmeyer et al, 2013; Nathan et al, 2012, 2014; Curtis et al, 2020). This means that interventions to optimize surgeons' skill can potentially improve quality of patient care. ...
Preprint
Purpose: The objective of this investigation is to provide a comprehensive analysis of state-of-the-art methods for video-based assessment of surgical skill in the operating room. Methods: Using a data set of 99 videos of capsulorhexis, a critical step in cataract surgery, we evaluate feature based methods previously developed for surgical skill assessment mostly under benchtop settings. In addition, we present and validate two deep learning methods that directly assess skill using RGB videos. In the first method, we predict instrument tips as keypoints, and learn surgical skill using temporal convolutional neural networks. In the second method, we propose a novel architecture for surgical skill assessment that includes a frame-wise encoder (2D convolutional neural network) followed by a temporal model (recurrent neural network), both of which are augmented by visual attention mechanisms. We report the area under the receiver operating characteristic curve, sensitivity, specificity, and predictive values with each method through 5-fold cross-validation. Results: For the task of binary skill classification (expert vs. novice), deep neural network based methods exhibit higher AUC than the classical spatiotemporal interest point based methods. The neural network approach using attention mechanisms also showed high sensitivity and specificity. Conclusion: Deep learning methods are necessary for video-based assessment of surgical skill in the operating room. Our findings of internal validity of a network using attention mechanisms to assess skill directly using RGB videos should be evaluated for external validity in other data sets.
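The evaluation metrics named in this abstract (sensitivity, specificity, and area under the ROC curve) can be computed for a binary expert-versus-novice classifier as in the Python sketch below. The labels, scores, and the 0.5 decision threshold are hypothetical, and the 5-fold cross-validation loop is omitted; in practice the same functions would be applied to each held-out fold.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out fold: 1 = expert, 0 = novice.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

sens, spec = sensitivity_specificity(labels, predictions)
area = auc(labels, scores)
print(sens, spec, area)
```

The pairwise AUC shown here is quadratic in the number of videos, which is acceptable for a data set of this size; a rank-based implementation would be preferable for large sets.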
... Mastery of bimanual psychomotor skills is a defining goal of surgical education, 1,2 and wide variation in surgical skill among practitioners is associated with adverse intraoperative and postoperative patient outcomes. 3,4 Novel technologies, such as surgical simulators using artificial intelligence (AI) assessment systems, are improving our understanding of the composites of surgical expertise and have the potential to reduce skill heterogeneity by complementing competency-based curriculum training. [5][6][7] Virtual reality simulation and machine learning algorithms can objectively quantify performance and improve the precision and granularity of bimanual technical skills classification. ...
Article
Importance: To better understand the emerging role of artificial intelligence (AI) in surgical training, efficacy of AI tutoring systems, such as the Virtual Operative Assistant (VOA), must be tested and compared with conventional approaches. Objective: To determine how VOA and remote expert instruction compare in learners' skill acquisition, affective, and cognitive outcomes during surgical simulation training. Design, setting, and participants: This instructor-blinded randomized clinical trial included medical students (undergraduate years 0-2) from 4 institutions in Canada during a single simulation training at McGill Neurosurgical Simulation and Artificial Intelligence Learning Centre, Montreal, Canada. Cross-sectional data were collected from January to April 2021. Analysis was conducted based on intention-to-treat. Data were analyzed from April to June 2021. Interventions: The interventions included 5 feedback sessions, 5 minutes each, during a single 75-minute training, including 5 practice sessions followed by 1 realistic virtual reality brain tumor resection. The 3 intervention arms included 2 treatment groups, AI audiovisual metric-based feedback (VOA group) and synchronous verbal scripted debriefing and instruction from a remote expert (instructor group), and a control group that received no feedback. Main outcomes and measures: The coprimary outcomes were change in procedural performance, quantified as Expertise Score by a validated assessment algorithm (Intelligent Continuous Expertise Monitoring System [ICEMS]; range, -1.00 to 1.00) for each practice resection, and learning and retention, measured from performance in realistic resections by ICEMS and blinded Objective Structured Assessment of Technical Skills (OSATS; range 1-7). Secondary outcomes included strength of emotions before, during, and after the intervention and cognitive load after intervention, measured in self-reports. 
Results: A total of 70 medical students (41 [59%] women and 29 [41%] men; mean [SD] age, 21.8 [2.3] years) from 4 institutions were randomized, including 23 students in the VOA group, 24 students in the instructor group, and 23 students in the control group. All participants were included in the final analysis. ICEMS assessed 350 practice resections, and ICEMS and OSATS evaluated 70 realistic resections. VOA significantly improved practice Expertise Scores by 0.66 (95% CI, 0.55 to 0.77) points compared with the instructor group and by 0.65 (95% CI, 0.54 to 0.77) points compared with the control group (P < .001). Realistic Expertise Scores were significantly higher for the VOA group compared with instructor (mean difference, 0.53 [95% CI, 0.40 to 0.67] points; P < .001) and control (mean difference, 0.49 [95% CI, 0.34 to 0.61] points; P < .001) groups. Mean global OSATS ratings did not differ significantly among the VOA (4.63 [95% CI, 4.06 to 5.20] points), instructor (4.40 [95% CI, 3.88 to 4.91] points), and control (3.86 [95% CI, 3.44 to 4.27] points) groups. However, on the OSATS subscores, VOA significantly enhanced the mean OSATS overall subscore compared with the control group (mean difference, 1.04 [95% CI, 0.13 to 1.96] points; P = .02), whereas expert instruction significantly improved OSATS subscores for instrument handling vs control (mean difference, 1.18 [95% CI, 0.22 to 2.14]; P = .01). No significant differences in cognitive load, positive activating, and negative emotions were found. Conclusions and relevance: In this randomized clinical trial, VOA feedback demonstrated superior performance outcome and skill transfer, with equivalent OSATS ratings and cognitive and emotional responses compared with remote expert instruction, indicating advantages for its use in simulation training. Trial registration: ClinicalTrials.gov Identifier: NCT04700384.
... Introduction Surgical skills and techniques are central components of a surgeon's skill set and directly correlate with patient benefits [1]. It is critical to assess surgical skills among trainees in surgical specialties to identify their competence and confidence to practice independently [2]. ...
Article
Evaluation of surgical skills during minimally invasive surgeries is needed when recruiting new surgeons. Although surgeons' differentiation by skill level is highly complex, performance in specific clinical tasks such as pegboard transfer and knot tying could be determined using wearable EMG and accelerometer sensors. A wireless wearable platform has made it feasible to collect movement and muscle activation signals for quick skill evaluation during surgical tasks. However, it is challenging since the placement of multiple wireless wearable sensors may interfere with their performance in the assessment. This study utilizes machine learning techniques to identify optimal muscles and features critical for accurate skill evaluation. This study enrolled a total of twenty-six surgeons of different skill levels: novice (n = 11), intermediaries (n = 12), and experts (n = 3). Twelve wireless wearable sensors consisting of surface EMGs and accelerometers were placed bilaterally on biceps brachii, triceps brachii, anterior deltoid, flexor carpi ulnaris (FCU), extensor carpi ulnaris (ECU), and thenar eminence (TE) muscles to assess muscle activations and movement variability profiles. We found that features related to movement complexity, such as approximate entropy, sample entropy, and multiscale entropy, played a critical role in skill level identification. We found that skill level was classified with highest accuracy by i) ECU for Random Forest Classifier (RFC), ii) deltoid for Support Vector Machines (SVM) and iii) biceps for Naïve Bayes Classifier, with classification accuracies of 61%, 57% and 47%, respectively. We found that the RFC performed best, with the highest classification accuracy when muscles were combined: i) ECU and deltoid (58%), ii) ECU and biceps (53%), and iii) ECU, biceps and deltoid (52%).
Our findings suggest that quick surgical skill evaluation is possible using wearable sensors, and that features from ECU, deltoid, and biceps muscles play an important role in surgical skill evaluation.
... It is more common to find that experienced and skilled surgeons perform better with very low performance variability [ 14 , 15 ]. In other studies in surgery and interventional medicine which have made similar observations the data were reanalyzed after partitioning performance data at the median score for each group [16][17][18][19] . Errors and Total Errors and Attending Take Over (ATO) as well as the capacity of the Checklist assessments specificity levels at 0.8 Sensitivity threshold for the surgeon groups who scored above and below the Median score for Total Error scores (as shown in Fig. 2 ). ...
Article
Introduction Identifying objective performance metrics for surgical training in orthopedic surgery is imperative for effective training and patient safety. The objective of this study was to determine if an internationally agreed, metric-based objective assessment of video recordings of an unstable pertrochanteric 31A2 intramedullary nailing procedure distinguished between the performance of experienced and novice orthopedic surgeons. Materials and Methods Previously agreed procedure metrics (i.e., 15 phases of the procedure, 75 steps, 88 errors, and 28 sentinel errors) for a closed reduction and standard cephalomedullary nail fixation with a single cephalic element of an unstable pertrochanteric 31A2 fracture were used. Experienced surgeons trained to assess the performance metrics with an interrater reliability (IRR) > 0.8 assessed 14 videos from 10 novice surgeons (orthopaedic residents/trainees) and 20 videos from 14 experienced surgeons (orthopaedic surgeons), blinded to group and procedure order. Results The mean IRR of procedure assessments was 0.97. No statistically significant differences were observed between the two groups for Procedure Steps, Errors, Sentinel Errors, and Total Errors. A small number of Experienced surgeons made a similar number of Total Errors as the weakest performing Novices. When the scores of each group were divided at the median Total Error score, large differences were observed between the Experienced surgeons who made the fewest errors and the Novices making the most errors (p < 0.001). Experienced surgeons who made the most errors made significantly more than their Experienced peers (p < 0.003) and the best performing Novices (p < 0.001). Error metrics assessed with Area Under the Curve demonstrated good to excellent Sensitivity and Specificity (0.807 – 0.907). 
Discussion Binary performance metrics previously agreed by an international Delphi meeting discriminated between the objectively assessed video-recorded performance of Experienced and Novice orthopedic surgeons when group scores were sub-divided at the median for Total Errors. Error metrics discriminated best and also demonstrated good to excellent Sensitivity and Specificity. Some very experienced surgeons performed similarly to the Novice surgeons who made the most errors. Conclusions The procedure metrics used in this study reliably distinguish Novice and Experienced orthopaedic surgeons' performance and will underpin quality-assured novice training.
... While much of the focus regarding surgical outcomes has focused on the abilities, practices, and characteristics of operating surgeons [1][2][3][4][5][6] , coordinated teamwork is critical for optimal patient outcomes. Surgeons and anesthesiologists have key leadership roles in facilitating coordinated teamwork in the operating room and anesthesiologist characteristics may meaningfully contribute to postoperative outcomes 7 . ...
Article
Objective: To examine the effect of surgeon-anesthesiologist sex discordance on postoperative outcomes. Summary background data: Optimal surgical outcomes depend on teamwork, with surgeons and anesthesiologists forming two key components. There are sex and gender-based differences in interpersonal communication and medical practice which may contribute to patients' perioperative outcomes. Methods: We performed a population-based, retrospective cohort study among adult patients undergoing one of 25 common elective or emergent surgical procedures from 2007-2019 in Ontario, Canada. We assessed the association between differences in sex between surgeon and anesthesiologists (sex discordance) on the primary endpoint of adverse postoperative outcome, defined as death, readmission, or complication within 30-days following surgery using generalized estimating equations. Results: Among 1,165,711 patients treated by 3,006 surgeons and 1,477 anesthesiologists, 791,819 patients were treated by sex concordant teams (male surgeon/male anesthesiologist: 747,327 and female surgeon/female anesthesiologist: 44,492) while 373,892 were sex discordant (male surgeon/female anesthesiologist: 267,330 and female surgeon/male anesthesiologist: 106,562). Overall, 12.3% of patients experienced one or more adverse postoperative outcomes of whom 1.3% died. Sex discordance between surgeon and anesthesiologist was not associated with a significant increased likelihood of composite adverse postoperative outcomes (adjusted odds ratio [aOR] 1.00, 95% confidence interval [CI] 0.97-1.03). Conclusions: We did not demonstrate an association between intraoperative surgeon and anesthesiologist sex discordance on adverse postoperative outcomes in a large patient cohort. Patients, clinicians, and administrators may be reassured that physician sex discordance in operating room teams is unlikely to clinically meaningfully affect patient outcomes after surgery.
... According to recent studies, around nine million major complications occur each year based on an estimated 300 million surgeries worldwide [13]. According to several studies, there is evidence that a surgeon's lack of individual expertise, as well as poor surgical skill, cause severe complications in patients [14,15]. New technological innovations in the operating room, such as robotic systems, might provide a novel way to tackle this problem. ...
Article
In the early 2020s, the coronavirus pandemic brought the notion of remotely connected care to the general population across the globe. Oftentimes, the timely provisioning of access to and the implementation of affordable care are drivers behind tele-healthcare initiatives. Tele-healthcare has already garnered significant momentum in research and implementations in the years preceding the worldwide challenge of 2020, supported by the emerging capabilities of communication networks. The Tactile Internet (TI) with human-in-the-loop is one of those developments, leading to the democratization of skills and expertise that will significantly impact the long-term developments of the provisioning of care. However, significant challenges remain that require today’s communication networks to adapt to support the ultra-low latency required. The resulting latency challenge necessitates trans-disciplinary research efforts combining psychophysiological as well as technological solutions to achieve one millisecond and below round-trip times. The objective of this paper is to provide an overview of the benefits enabled by solving this network latency reduction challenge by employing state-of-the-art Time-Sensitive Networking (TSN) devices in a testbed, realizing the service differentiation required for the multi-modal human-machine interface. With completely new types of services and use cases resulting from the TI, we describe the potential impacts on remote surgery and remote rehabilitation as examples, with a focus on the future of tele-healthcare in rural settings.
Article
Objectives We investigated whether surgical skill and procedure were related to oncological outcomes in cervical cancer patients who underwent Laparoscopic Radical Hysterectomy (LRH). Methods We previously assessed data of LRH from 251 patients with FIGO stage (2009) IA2, IB1 and IIA1 cervical cancer collected for the JGOG 1081s study. (1) In a chart review, the JGOG 1081s cohort study was re-examined to refine the surgical details and extend the follow-up period. (2) In a video review, unedited videos for recurrent cases and matched non-recurrent control cases were newly compared by experts for various surgical skills and surgical procedures using the modified Objective Structured Assessment of Technical Skills (OSATS) tool, without awareness of the recurrence status. Results After a median follow-up of 46 months, tumors had recurred in 31 of the 251 patients. The five-year Recurrence-Free Survival rate was 86.9% (81.8–90.6) and the five-year Overall Survival rate was 93.7% (87.5–96.8). Multivariate analysis from chart reviews found that an experience with LRH of less than 20 cases per institution was an independent prognostic factor for recurrence (Hazard Ratio (HR) 2.49, 95%CI 1.12–5.53, p = 0.025). For the surgical video review, we compared 23 videos of recurrent cases with 23 background-matched non-recurrent controls. Lower modified OSATS scores from the video review consistently trended toward a higher risk of recurrence. Conclusions Our new study found that greater LRH surgical experience and skill trended toward better oncological outcomes.
Article
Background Laparoscopic anti-reflux surgery, including hiatus hernia repair, is a common operation performed by both general and thoracic surgeons and an important learning objective for surgical trainees. This study aimed to design a competency assessment instrument for laparoscopic anti-reflux surgery. Method A comprehensive competency assessment instrument was designed by a process of logical analysis by 4 expert thoracic surgeons with an interest in foregut surgery, and then reviewed informally by a panel of experts. The instrument was then further assessed and refined using a modified Delphi process. The Delphi questionnaire was distributed to all members of the Fellowship Training Committee of the American Foregut Society (n = 21). Results A first draft of the competency assessment instrument included 32 steps in 4 categories. The first round of the Delphi review was completed by 14 respondents (response rate 66.7%). A total of 3 rounds of Delphi review were performed. Ultimately, 25 items were retained from the original instrument and 1 modified and 4 new items were added. The final instrument has 30 steps in 4 categories. Conclusions An international and inter-specialty consensus was established on the key components of assessing competence to perform anti-reflux surgery. The resulting instrument could be used to guide competency based assessments of general and thoracic surgeons and trainees.
Article
Background: A computer vision (CV) platform named EndoDigest was recently developed to facilitate the use of surgical videos. Specifically, EndoDigest automatically provides short video clips to effectively document the critical view of safety (CVS) in laparoscopic cholecystectomy (LC). The aim of the present study is to validate EndoDigest on a multicentric dataset of LC videos. Methods: LC videos from 4 centers were manually annotated with the time of the cystic duct division and an assessment of CVS criteria. Incomplete recordings, bailout procedures and procedures with an intraoperative cholangiogram were excluded. EndoDigest leveraged predictions of deep learning models for workflow analysis in a rule-based inference system designed to estimate the time of the cystic duct division. Performance was assessed by computing the error in estimating the manually annotated time of the cystic duct division. To provide concise video documentation of CVS, EndoDigest extracted video clips showing the 2 min preceding and the 30 s following the predicted cystic duct division. The relevance of the documentation was evaluated by assessing CVS in automatically extracted 2.5-min-long video clips. Results: 144 of the 174 LC videos from 4 centers were analyzed. EndoDigest located the time of the cystic duct division with a mean error of 124.0 ± 270.6 s despite the use of fluorescent cholangiography in 27 procedures and great variations in surgical workflows across centers. The surgical evaluation found that 108 (75.0%) of the automatically extracted short video clips documented CVS effectively. Conclusions: EndoDigest was robust enough to reliably locate the time of the cystic duct division and efficiently video document CVS despite the highly variable workflows. 
Training specifically on data from each center could improve results; however, this multicentric validation shows the potential for clinical translation of this surgical data science tool to efficiently document surgical safety.
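The clip-extraction rule described in the EndoDigest abstract (2 min before and 30 s after the predicted cystic duct division) can be illustrated with a minimal sketch. The function name `cvs_clip_window` and the clamping to the bounds of the recording are assumptions made for illustration; the abstract does not say how clips near the start or end of a video are handled, and this is not the authors' code.

```python
def cvs_clip_window(division_time_s, video_length_s, before_s=120.0, after_s=30.0):
    """Return (start, end) in seconds of a clip around the predicted division time.

    before_s and after_s default to the 2 min / 30 s window quoted in the abstract.
    Clamping to [0, video_length_s] is an illustrative assumption.
    """
    start = max(0.0, division_time_s - before_s)
    end = min(float(video_length_s), division_time_s + after_s)
    return start, end


# Hypothetical example: division predicted at 600 s in a 3000 s recording.
print(cvs_clip_window(600, 3000))
```

With an unclamped window the clip is 150 s long, which matches the 2.5-minute clips mentioned in the abstract.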
Article
Background To assess the impact of difficult location (based on preoperative computed tomography) of liver metastases from colorectal cancer (LMCRC) on surgical difficulty, and occurrence of severe postoperative complications (POCs). Methods A retrospective single-centre study of 911 consecutive patients with LMCRC who underwent hepatectomy by the open approach between 1998 and 2011, before implementation of laparoscopic surgery to obviate approach selection bias. LMCRC with at least one of the following four features on preoperative imaging: tumor invading the hepatocaval confluence or retro-hepatic inferior vena cava, centrally located (Segments 4,5,8) and >10 cm in diameter, invading the supra-hilar area or involving the paracaval portion or caudate process of Segment 1; were considered as topographically difficult (top-diff). Independent predictors of surgical difficulty assessed by number of blood units transfused, duration of ischemia, and number of sessions of pedicle clamping during surgery and of severe POCs were identified by multivariate analysis before, and after propensity score matching. Results Top-diff tumor location independently predicted surgical difficulty. Severe POCs were associated with the tumor location [top-diff vs. topographically non difficult (non top-diff)], preoperative portal vein embolization, and variables related to surgical difficulty. Conclusions LMCRC in difficult location independently predicts surgical difficulty and severe POCs.
Article
Background: To achieve a competency-based training paradigm, the ability to obtain reliable and valid quantitative assessments of intraoperative performance is required. Through this, weaknesses can be identified and practiced, and competency assessed. This study aimed to determine the validity and reliability of an objective evaluation tool for assessment of performance in laparoscopic appendicectomy (LA). Methods: A prospective single-blinded observational study design was used. Videos of inexperienced (performed <10 LAs) and experienced (performed >100 LAs) surgeons performing LA surgery were collected. Surgical performance during each recording was rated by two independent, blinded expert surgeons using the LA Rating Scale (LARS) and the modified Objective Structured Assessment of Technical Skill (OSATS) scale. Results: The intraclass correlation coefficient (ICC) for LARS was 0.95 (95%CI 0.83-0.98). The ICC for each step ranged from 0.48 to 0.90, and the test-retest ICC for LARS was 0.91 (95%CI 0.69-0.98). Significant differences (P < 0.001) between median performance scores as rated by LARS were observed between the inexperienced and experienced surgeons. A Spearman's correlation coefficient of 0.87 (P < 0.001) was observed between LARS performance scores and modified OSATS scores. Conclusion: LARS demonstrated excellent inter-rater and test-retest reliability, and construct and concurrent validity, and can be used to quantitatively evaluate performance during LA. This can potentially allow specific weaknesses to be identified and improved upon through deliberate practice. Progress can be tracked through re-evaluation, and scores of expert surgeons can be used as performance goals for credentialing in LA.
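Concurrent validity in the abstract above is reported as a Spearman correlation between LARS and modified OSATS scores. As a rough sketch of how that statistic is obtained, the coefficient is the Pearson correlation of the rank-transformed scores, with tied values given their average rank. The score lists below are made up for illustration and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical rating pairs (not from the study).
lars_scores = [12, 25, 31, 40, 18]
osats_scores = [10, 22, 30, 35, 19]
print(round(spearman(lars_scores, osats_scores), 3))
```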
Article
Importance: Surgical data scientists lack video data sets that depict adverse events, which may affect model generalizability and introduce bias. Hemorrhage may be particularly challenging for computer vision-based models because blood obscures the scene. Objective: To assess the utility of the Simulated Outcomes Following Carotid Artery Laceration (SOCAL)-a publicly available surgical video data set of hemorrhage complication management with instrument annotations and task outcomes-to provide benchmarks for surgical data science techniques, including computer vision instrument detection, instrument use metrics and outcome associations, and validation of a SOCAL-trained neural network using real operative video. Design, setting, and participants: For this quality improvement study, a total of 75 surgeons with 1 to 30 years' experience (mean, 7 years) were filmed from January 1, 2017, to December 31, 2020, managing catastrophic surgical hemorrhage in a high-fidelity cadaveric training exercise at nationwide training courses. Videos were annotated from January 1 to June 30, 2021. Interventions: Surgeons received expert coaching between 2 trials. Main outcomes and measures: Hemostasis within 5 minutes (task success, dichotomous), time to hemostasis (in seconds), and blood loss (in milliliters) were recorded. Deep neural networks (DNNs) were trained to detect surgical instruments in view. Model performance was measured using mean average precision (mAP), sensitivity, and positive predictive value. Results: SOCAL contains 31 443 frames with 65 071 surgical instrument annotations from 147 trials with associated surgeon demographic characteristics, time to hemostasis, and recorded blood loss for each trial. Computer vision-based instrument detection methods using DNNs trained on SOCAL achieved a mAP of 0.67 overall and 0.91 for the most common surgical instrument (suction). 
Hemorrhage control challenges standard object detectors: detection of some surgical instruments remained poor (mAP, 0.25). On real intraoperative video, the model achieved a sensitivity of 0.77 and a positive predictive value of 0.96. Instrument use metrics derived from the SOCAL video were significantly associated with performance (blood loss). Conclusions and relevance: Hemorrhage control is a high-stakes adverse event that poses unique challenges for video analysis, but no data sets of hemorrhage control exist. The use of SOCAL, the first data set to depict hemorrhage control, allows the benchmarking of data science applications, including object detection, performance metric development, and identification of metrics associated with outcomes. In the future, SOCAL may be used to build and validate surgical data science models.
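The sensitivity and positive predictive value quoted for the SOCAL-trained detector are both ratios of detection counts. The sketch below shows the two definitions; the counts are hypothetical and were chosen only so that the ratios are easy to check by hand, since the abstract does not report the underlying numbers.

```python
def sensitivity_and_ppv(true_pos, false_pos, false_neg):
    """Return (sensitivity, positive predictive value) from detection counts.

    sensitivity = TP / (TP + FN): share of real instruments that were detected.
    PPV         = TP / (TP + FP): share of detections that were real instruments.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    ppv = true_pos / (true_pos + false_pos)
    return sensitivity, ppv


# Hypothetical counts (not from the study).
sens, ppv = sensitivity_and_ppv(true_pos=80, false_pos=20, false_neg=20)
print(sens, ppv)
```

A detector can have high PPV and lower sensitivity, as reported in the abstract, when it rarely raises false alarms but misses a share of the instruments that are actually in view.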
Article
Background Previous studies have examined how factors such as gender, education, type of training (MD or DO), and experience of the treating surgeon affect patient outcomes. We investigated patient complications after elective laparoscopic cholecystectomy based on surgeon characteristics. Methods A Medicare database was used to identify surgeon-specific data. The main outcome measure was the adjusted complication rates (ACR) for individual surgeons as reported by the ProPublica Surgeon Scorecard. Surgeon gender, type of training, medical school rank, years since graduation, procedure volume, and teaching status of the primary hospital affiliation were assessed for any association with increased ACR using logistic regression analysis. We explored the associations among procedure volume, years of experience, and ACR using Spearman correlation. Results 1107 predominantly male (94.6%) surgeons were included. 94.4% were MDs and 34.5% were affiliated with teaching hospitals. Mean length of practice was 24 ± 9 years, and median surgeon procedure volume was 28 (IQR = 23, 37). Overall median ACR was 4.3%. Multivariate analysis demonstrated that surgeon gender (P = .71), medical school rank, type of training (P = .68), or hospital affiliation (P = .77) did not have a significant impact on ACR. Surgeons’ years in practice (r = −.028, P = .35) and procedure volume (r = −.021, P = .49) were each negatively, but weakly and not statistically significantly, correlated with ACR. Conclusion Surgeon gender, type of training, medical school rank, or hospital affiliation had no impact on complications after laparoscopic cholecystectomy. Surgeon experience and procedure volume may have clinical implications for patient outcomes. Further studies to elucidate factors associated with surgeon quality and patient outcomes are necessary.
Article
Introduction: Although robot-assisted radical prostatectomy (RARP) has become a standard treatment modality in patients with prostate cancer (PCa), RARP is a complicated and difficult surgical procedure due to the risk of serious surgery-related complications. This study aimed to evaluate the validation of a standardized training system for RARP in patients with PCa at a single institute. Material and methods: We retrospectively reviewed the clinical and pathological records of 155 patients with PCa who underwent RARP at Gifu University between August 2018 and April 2021. We developed an institutional program for new surgeons based on the separation of the RARP procedure into six checkpoints. The primary endpoints were surgical outcomes and perioperative complications among three groups (expert, trainer, and novice surgeon groups). Results: The console time was significantly longer in the novice surgeon group than in the other groups. Regarding bladder neck dissection, ligation of lateral pedicles, and vesicourethral anastomosis, the operative time was significantly shorter in the expert group than in the other groups. Surgery-related complications occurred in 15 patients (9.7%). Conclusions: Our training system for RARP might help reduce the influence of the learning curve on surgical outcomes and ensure that the surgeries performed at low-volume institutions are safe and effective.
Article
Video recording is widely available in modern operating rooms. Here, I argue that, if patient consent and suitable technology are in place, video recording of surgery is an ethical duty. I develop this as a duty to protect, arguing for professional and institutional duties, as distinguished from duties of rescue. A professional duty to protect is described in mental healthcare. Practitioners have to take reasonable steps to prevent serious, foreseeable harm to their clients and others, even if that entails a non-consensual breach of confidentiality. I argue surgeons have a similar duty to patients which means that, provided the patient consents, surgery should be routinely videoed. This avoids non-consensual breaches of patient confidentiality and is aligned with stated professional obligations. An institutional duty to protect means institutions have to take reasonable steps to prevent serious, foreseeable harm at the hands of their surgeons. Rulli and Millum highlighted how institutions can meet their duty using a more consequentialist approach that balances wider interests. To test the force and scope of such duties, I examine potential impacts of routine videoing on aspects of autonomy, justice, beneficence and non-maleficence. I find routine videoing can benefit areas including safety, candour, consent and fairness in access (to surgical careers and expertise). Countervailing claims, for example, on liability, confidentiality and privacy can be resisted, such that where consent and the technology are in place, routine videoing meets a duty of easy protection. In other words, its use should be the standard of care.
Article
Background: Transanal total mesorectal excision (TATME) is difficult to learn and can result in serious complications. Current paradigms for assessing performance and competency may be insufficient. This study aims to develop and provide preliminary validity evidence for a TATME virtual assessment tool (TATME-VAT) to assess the cognitive skills necessary to safely complete TATME dissection. Methods: Participants from North America, Europe, Japan and China completed the test via an interactive online platform between 11/2019 and 05/2020. They were grouped into expert, experienced and novice surgeons depending on the number of independently performed TATMEs. TATME-VAT is a 24-item web-based assessment evaluating advanced cognitive skills, designed according to a blueprint from consensus guidelines. Eight items were multiple choice questions. Sixteen items required making annotations on still frames of TATME videos (VCT) and were scored using a validated algorithm derived from experts' responses. Annotation (range 0-100), multiple choice (range 0-100), and overall scores (sum of annotation and multiple-choice scores, normalized to μ = 50 and σ = 10) were reported. Results: There were significant differences between the expert, experienced, and novice groups for the annotation (p < 0.001), multiple-choice (p < 0.001), and overall scores (p < 0.001). The annotation (p = 0.439) and overall (p = 0.152) scores were similar between the experienced and novice groups. Annotation scores were higher in participants with 51 or more vs. 30-50 vs. less than 30 cases. Scores were also lower in users with a self-reported recent complication vs. those without. Conclusions: This study describes the development of an interactive video-based virtual assessment tool for TATME dissection and provides initial validity evidence for its use.
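The TATME-VAT abstract states that overall scores were normalized to μ = 50 and σ = 10. One common way to do this is a linear rescaling of z-scores, sketched below. Whether the authors used the population or sample standard deviation is not stated; the use of `statistics.pstdev` here, the function name, and the raw scores are assumptions for illustration only.

```python
import statistics


def normalize_scores(raw_scores, target_mean=50.0, target_sd=10.0):
    """Linearly rescale scores so the cohort has the target mean and SD.

    Uses the population standard deviation (an assumption; the source does not specify).
    """
    mu = statistics.fmean(raw_scores)
    sd = statistics.pstdev(raw_scores)
    if sd == 0:
        raise ValueError("all scores are identical; cannot normalize")
    return [target_mean + target_sd * (x - mu) / sd for x in raw_scores]


# Hypothetical raw overall scores (annotation + multiple choice, each 0-100).
raw = [120, 150, 90, 160]
print([round(s, 1) for s in normalize_scores(raw)])
```

After rescaling, a score of 60 means one standard deviation above the cohort mean, regardless of the raw scale.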
Article
Background: Procedure-specific complications can have devastating consequences. Machine learning-based tools have the potential to outperform traditional statistical modeling in predicting their risk and guiding decision-making. We sought to develop and compare deep neural network (NN) models, a type of machine learning, to logistic regression (LR) for predicting anastomotic leak after colectomy, bile leak after hepatectomy, and pancreatic fistula after pancreaticoduodenectomy (PD). Methods: The colectomy, hepatectomy, and PD National Surgical Quality Improvement Program (NSQIP) databases were analyzed. Each dataset was split into training, validation, and testing sets in a 60/20/20 ratio, with fivefold cross-validation. Models were created using NN and LR for each outcome. Models were evaluated primarily with area under the receiver operating characteristic curve (AUROC). Results: A total of 197,488 patients were included for colectomy, 25,403 for hepatectomy, and 23,333 for PD. For anastomotic leak, AUROC for NN was 0.676 (95% CI 0.666-0.687), compared with 0.633 (95% CI 0.620-0.647) for LR. For bile leak, AUROC for NN was 0.750 (95% CI 0.739-0.761), compared with 0.722 (95% CI 0.698-0.746) for LR. For pancreatic fistula, AUROC for NN was 0.746 (95% CI 0.733-0.760), compared with 0.713 (95% CI 0.703-0.723) for LR. Variables related to intra-operative information, such as surgical approach, biliary reconstruction, and pancreatic gland texture were highly important for model predictions. Discussion: Machine learning showed a marginal advantage over traditional statistical techniques in predicting procedure-specific outcomes. However, models that included intra-operative information performed better than those that did not, suggesting that NSQIP procedure-targeted datasets may be strengthened by including relevant intra-operative information.
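The models in the abstract above are compared by AUROC. As a small illustration of what that number measures, the sketch below computes it directly from its probabilistic definition: the fraction of (complication, no-complication) patient pairs in which the patient with the complication received the higher predicted risk, with ties counted as one half. The labels and scores are hypothetical, and this pairwise form is chosen for clarity rather than speed; it is not the study's evaluation code.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = leak) and predicted risks.
y_true = [0, 0, 1, 1]
y_risk = [0.10, 0.40, 0.35, 0.80]
print(auroc(y_true, y_risk))
```

An AUROC of 0.5 corresponds to random ordering and 1.0 to perfect separation, which puts the reported values of roughly 0.63 to 0.75 in context.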
Article
Introduction Decay of surgical skills due to paucity of opportunity to operate is a potential threat to patients being cared for by the Defence Medical Services while on operational deployment. Our aim was to review the literature regarding skill decay in the trained surgeon in order to understand how it may affect clinical performance and patient outcomes. We also wished to survey the likely causes of such decay and possible means of mitigation. Methods A systematic review of the literature was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses. Study bias assessment was also undertaken. Content summaries for the papers included study design and methodology, participant level of experience, measures and magnitude of effect, duration of no practice, and study limitations. Results Five papers met the selection criteria. There were insufficient quantitative data on the impact of surgical skill decay on patient outcome, surgeon performance or mitigation strategies, and a meaningful quantitative synthesis could not be undertaken. Conclusions This systematic review of the literature found very little specific evidence confirming or refuting surgical skill decay in trained surgeons, with measurement of decay hampered by the lack of an accepted methodology. Studying this in the deployed setting may offer a firmer evidence base from which to generate policy. Potential mitigation strategies are discussed. PROSPERO registration number ID260846.
Article
Purpose: Surgeons' skill in the operating room is a major determinant of patient outcomes. Assessment of surgeons' skill is necessary to improve patient outcomes and quality of care through surgical training and coaching. Methods for video-based assessment of surgical skill can provide objective and efficient tools for surgeons. Our work introduces a new method based on attention mechanisms and provides a comprehensive comparative analysis of state-of-the-art methods for video-based assessment of surgical skill in the operating room. Methods: Using a dataset of 99 videos of capsulorhexis, a critical step in cataract surgery, we evaluated image feature-based methods and two deep learning methods to assess skill using RGB videos. In the first method, we predict instrument tips as keypoints and predict surgical skill using temporal convolutional neural networks. In the second method, we propose a frame-wise encoder (2D convolutional neural network) followed by a temporal model (recurrent neural network), both of which are augmented by visual attention mechanisms. We computed the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and predictive values through fivefold cross-validation. Results: To classify a binary skill label (expert vs. novice), the range of AUC estimates was 0.49 (95% confidence interval; CI = 0.37 to 0.60) to 0.76 (95% CI = 0.66 to 0.85) for image feature-based methods. None of the methods achieved consistently high sensitivity and specificity. For the deep learning methods, the AUC was 0.79 (95% CI = 0.70 to 0.88) using keypoints alone, and 0.78 (95% CI = 0.69 to 0.88) and 0.75 (95% CI = 0.65 to 0.85) with and without attention mechanisms, respectively. Conclusion: Deep learning methods are necessary for video-based assessment of surgical skill in the operating room. Attention mechanisms improved discrimination ability of the network. Our findings should be evaluated for external validity in other datasets.
Article
Introduction: Surgical skill evaluation while performing minimally invasive surgeries is a highly complex task. It is important to objectively assess an individual's technical skills throughout surgical training to monitor progress and to intervene when skills are not commensurate with the year of training. The miniaturization of wireless wearable platforms integrated with sensor technology has made it possible to non-invasively assess muscle activations and movement variability during performance of minimally invasive surgical tasks. Our objective was to use electromyography to deconstruct the motions of a surgeon during robotic suturing and distinguish quantifiable movements that characterize the skill of an experienced, expert urologic surgeon from trainees. Methods: Participants in three skill groups, novice (n=11), intermediate (n=12), and expert (n=3), were enrolled in the study. A total of 12 wireless wearable sensors consisting of surface electromyograms (EMGs) and accelerometers were placed along upper extremity muscles to assess muscle activations and movement variability, respectively. Participants then performed a robotic suturing task. Results: The EMG-based parameters total time, dominant frequency, and cumulative muscular workload (CMW) were significantly different across the three skill groups. We also found that nonlinear movement variability parameters such as correlation dimension and Lyapunov exponent trended differently across the three skill groups. Conclusions: These findings suggest that economy of motion variables and nonlinear movement variabilities are affected by surgical experience level. Wearable sensor signal analysis could make it possible to objectively evaluate surgical skill level periodically throughout the residency training experience.
Article
Topic Despite significant recent advances in artificial intelligence (AI) technology within several ophthalmic subspecialties, AI is underutilised in the diagnosis and management of cataract. In this article, we review AI technology that has been reported within research settings that may soon become central to the cataract surgical pathway, from diagnosis to completion of surgery. Clinical relevance This review describes recent advances of AI in the preoperative, intraoperative and postoperative phase of cataract surgery demonstrating its impact on the pathway and the surgical team. Methods A systematic search on PubMed has been conducted in order to identify relevant publications on the topic of Artificial Intelligence for cataract surgery. Articles of high quality and relevance to the topic have been selected. Results Preoperatively, diagnosis and grading of cataracts through AI based image analysis has been demonstrated in several research settings. Optimal IOL power to achieve the desired postoperative refraction can be calculated with a higher degree of accuracy using AI based modelling compared to traditional IOL formulae. Intraoperatively, innovative AI based video analysis tools are in development promoting a paradigm shift for documentation, storage and cataloguing libraries of surgical videos with applications for teaching and training, complication review and surgical research. Situation-aware computer-assisted devices can be connected to surgical microscopes for automated video capture and cloud storage upload. AI based software can provide workflow analysis, tool detection and video segmentation for skill evaluation by the surgeon and the trainee. Mixed reality features such as real-time intraoperative warnings may have a role in improving surgical decision making with the key aim of reducing complications by recognising surgical risks in advance and alerting the operator to them. 
For the management of patient flow through the pathway, AI-based mathematical models generating patient referral patterns are in development as are simulations to optimise operating room utilisation. In the postoperative phase, AI has been shown to predict the posterior capsule status with reasonable accuracy and therefore improve the triage pathway in the treatment of posterior capsular opacification. Conclusion AI for cataract surgery will be as relevant as in other subspecialties of ophthalmology and eventually constitute a future cornerstone for an enhanced cataract surgery pathway.
Article
Full-text available
Context: Although physicians report spending a considerable amount of time in continuing medical education (CME) activities, studies have shown a sizable difference between real and ideal performance, suggesting a lack of effect of formal CME. Objective: To review, collate, and interpret the effect of formal CME interventions on physician performance and health care outcomes. Data Sources: Sources included searches of the complete Research and Development Resource Base in Continuing Medical Education and the Specialised Register of the Cochrane Effective Practice and Organisation of Care Group, supplemented by searches of MEDLINE from 1993 to January 1999. Study Selection: Studies were included in the analyses if they were randomized controlled trials of formal didactic and/or interactive CME interventions (conferences, courses, rounds, meetings, symposia, lectures, and other formats) in which at least 50% of the participants were practicing physicians. Fourteen of 64 studies identified met these criteria and were included in the analyses. Articles were reviewed independently by 3 of the authors. Data Extraction: Determinations were made about the nature of the CME intervention (didactic, interactive, or mixed), its occurrence as a 1-time or sequenced event, and other information about its educational content and format. Two of 3 reviewers independently applied all inclusion/exclusion criteria. Data were then subjected to meta-analytic techniques. Data Synthesis: The 14 studies generated 17 interventions fitting our criteria. Nine generated positive changes in professional practice, and 3 of 4 interventions altered health care outcomes in 1 or more measures. In 7 studies, sufficient data were available for effect sizes to be calculated; overall, no significant effect of these educational methods was detected (standardized effect size, 0.34; 95% confidence interval [CI], −0.22 to 0.97). 
However, interactive and mixed educational sessions were associated with a significant effect on practice (standardized effect size, 0.67; 95% CI, 0.01-1.45). Conclusions: Our data show some evidence that interactive CME sessions that enhance participant activity and provide the opportunity to practice skills can effect change in professional practice and, on occasion, health care outcomes. Based on a small number of well-conducted trials, didactic sessions do not appear to be effective in changing physician performance.
Article
Full-text available
There is evidence that collaborations between hospitals and physicians in particular regions of the country have led to improvements in the quality of care. Even so, there have not been many of these collaborations. We review one, the Michigan regional collaborative improvement program, which was paid for by a large private insurer, has yielded improvements for a range of clinical conditions, and has reduced costs in several important areas. In general and vascular surgery alone, complications from surgery dropped almost 2.6 percent among participating Michigan hospitals, a change that translates into 2,500 fewer Michigan patients with surgical complications each year. Estimated annual savings from this one collaborative are approximately $20 million, far exceeding the cost of administering the program. Regional collaborative improvement programs should become increasingly attractive to hospitals and physicians, as well as to national policy makers, as they seek to improve health care quality and reduce costs.
Article
Full-text available
To determine whether high rates of compliance with perioperative processes of care used for public reporting and pay-for-performance are associated with lower rates of risk-adjusted mortality and high-risk surgical complications. Retrospective analysis of Medicare inpatient claims data (from January 1, 2005, through December 31, 2006). Hierarchical logistic regression models assessed the relationship between adverse outcomes and hospital compliance with the surgical processes of care reported on the Hospital Compare Web site. Two thousand US hospitals. Beneficiaries who underwent 1 of 6 high-risk operations in 2005 and 2006. Thirty-day postoperative mortality rate, venous thromboembolism, and surgical site infection. Process compliance ranged from 53.7% in low compliance hospitals to 91.4% in high compliance hospitals. Risk-adjusted outcomes did not vary at high compliance hospitals relative to medium compliance hospitals for mortality rate (odds ratio, 0.98; 95% confidence interval, 0.92-1.05), surgical site infection (1.01; 0.90-1.13), or venous thromboembolism (1.04; 0.89-1.20). Outcomes also did not vary at low compliance hospitals. Stratified analyses by operation type confirm these trends for the 6 procedures individually. Currently available information on the Hospital Compare Web site will not help patients identify hospitals with better outcomes for high-risk surgery. The Centers for Medicare and Medicaid Services needs to identify higher leverage process measures and devote greater attention to profiling hospitals based on outcomes to improve public reporting and pay-for-performance efforts.
Article
Full-text available
Despite the growing popularity of bariatric surgery, there remain concerns about perioperative safety and variation in outcomes across hospitals. To assess complication rates of different bariatric procedures and variability in rates of serious complications across hospitals and according to procedure volume and center of excellence (COE) status. Involving 25 hospitals and 62 surgeons statewide, the Michigan Bariatric Surgery Collaborative (MBSC) administers an externally audited, prospective clinical registry. We evaluated short-term morbidity in 15,275 Michigan patients undergoing 1 of 3 common bariatric procedures between 2006 and 2009. We used multilevel regression models to assess variation in risk-adjusted complication rates across hospitals and the effects of procedure volume and COE designation (by the American College of Surgeons or American Society for Metabolic and Bariatric Surgery) status. Complications occurring within 30 days of surgery. Overall, 7.3% of patients experienced perioperative complications, most of which were wound problems and other minor complications. Serious complications were most common after gastric bypass (3.6%; 95% confidence interval [CI], 3.2%-4.0%), followed by sleeve gastrectomy (2.2%; 95% CI, 1.2%-3.2%), and laparoscopic adjustable gastric band (0.9%; 95% CI, 0.6%-1.1%) procedures (P < .001). Mortality occurred in 0.04% (95% CI, 0.001%-0.13%) of laparoscopic adjustable gastric band, 0 sleeve gastrectomy, and 0.14% (95% CI, 0.08%-0.25%) of the gastric bypass patients. After adjustment for patient characteristics and procedure mix, rates of serious complications varied from 1.6% (95% CI, 1.3-2.0) to 3.5% (95% CI, 2.4-5.0) (risk difference, 1.9; 95% CI, 0.08-3.7) across hospitals. 
Average annual procedure volume was inversely associated with rates of serious complications at both the hospital level (< 150 cases, 4.1%; 95% CI, 3.0%-5.1%; 150-299 cases, 2.7%; 95% CI, 2.2-3.2; and ≥ 300 cases, 2.3%; 95% CI, 2.0%-2.6%; P = .003) and surgeon level (< 100 cases, 3.8%; 95% CI, 3.2%-4.5%; 100-249 cases, 2.4%; 95% CI, 2.1%-2.8%; ≥ 250 cases, 1.9%; 95% CI, 1.4%-2.3%; P = .001). Adjusted rates of serious complications were similar in COE and non-COE hospitals (COE, 2.7%; 95% CI, 2.5%-3.1%; non-COE, 2.0%; 95% CI, 1.5%-2.4%; P = .41). The frequency of serious complications among patients undergoing bariatric surgery in Michigan was relatively low. Rates of serious complications are inversely associated with hospital and surgeon procedure volume, but unrelated to COE accreditation by professional organizations.
Article
Full-text available
The Surgical Care Improvement Project (SCIP) aims to reduce surgical infectious complication rates through measurement and reporting of 6 infection-prevention process-of-care measures. However, an association between SCIP performance and clinical outcomes has not been demonstrated. To examine the relationship between SCIP infection-prevention process-of-care measures and postoperative infection rates. A retrospective cohort study, using Premier Inc's Perspective Database for discharges between July 1, 2006 and March 31, 2008, of 405 720 patients (69% white and 11% black; 46% Medicare patients; and 68% elective surgical cases) from 398 hospitals in the United States for whom SCIP performance was recorded and submitted for public report on the Hospital Compare Web site. Three original infection-prevention measures (S-INF-Core) and all 6 infection-prevention measures (S-INF) were aggregated into 2 separate all-or-none composite scores. Hierarchical logistical models were used to assess process-of-care relationships at the patient level while accounting for hospital characteristics. The ability of reported adherence to SCIP infection-prevention process-of-care measures (using the 2 composite scores of S-INF and S-INF-Core) to predict postoperative infections. There were 3996 documented postoperative infections. The S-INF composite process-of-care measure predicted a decrease in postoperative infection rates from 14.2 to 6.8 per 1000 discharges (adjusted odds ratio, 0.85; 95% confidence interval, 0.76-0.95). The S-INF-Core composite process-of-care measure predicted a decrease in postoperative infection rates from 11.5 to 5.3 per 1000 discharges (adjusted odds ratio, 0.86; 95% confidence interval, 0.74-1.01), which was not a statistically significantly lower probability of infection. None of the individual SCIP measures were significantly associated with a lower probability of infection. 
Among hospitals in the Premier Inc Perspective Database reporting SCIP performance, adherence measured through a global all-or-none composite infection-prevention score was associated with a lower probability of developing a postoperative infection. However, adherence reported on individual SCIP measures, which is the only form in which performance is publicly reported, was not associated with a significantly lower probability of infection.
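The significance statements in the abstract above follow from whether each 95% confidence interval for the adjusted odds ratio contains 1.0, the value indicating no association. The sketch below illustrates that check with the reported intervals, plus a crude overall infection rate derived from the reported counts. The helper name `interval_excludes_null` is hypothetical, and the crude rate is unadjusted arithmetic, not a figure reported by the authors.

```python
def interval_excludes_null(lower: float, upper: float, null_value: float) -> bool:
    """Return True when a confidence interval does not contain the null value."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return not (lower <= null_value <= upper)


# Adjusted odds ratios reported in the abstract; the null value for a ratio is 1.0.
s_inf_significant = interval_excludes_null(0.76, 0.95, 1.0)       # S-INF composite
s_inf_core_significant = interval_excludes_null(0.74, 1.01, 1.0)  # S-INF-Core composite

# Crude (unadjusted) infection rate from the reported counts:
# 3996 documented infections among 405,720 patients.
crude_rate_per_1000 = 3996 / 405_720 * 1000  # roughly 9.85 per 1000
```

The S-INF interval lies entirely below 1.0, whereas the S-INF-Core interval reaches 1.01, which matches the abstract's statement that the latter result was not statistically significant.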
Article
Full-text available
This retrospective study investigated the impact of patient and procedure-related parameters on the complication rate following revision total hip arthroplasty. Complications included vessel and nerve damage, periprosthetic femoral fracture, wound infection, wound bleeding, prosthesis dislocations, thromboembolism, cardiac and pulmonary complications, and death. The influence of operation duration, gender, revision status, ASA classification, and type of fixation of the primary implant on the perioperative morbidity was investigated in a sample of 60 revision procedures (cemented stems, cemented or cementless cups). Odds ratio [OR] and 95% confidence interval [CI] were estimated with multiple regression models. Perioperative morbidity was significantly correlated to operation duration (OR = 1.03; CI: 1.00-1.05), but not to age (OR = 1.01; CI: 0.93-1.09), gender (OR = 2.66; CI: 0.50-14.05), revision status (OR = 2.34; CI: 0.54-10.05), ASA classification (OR = 1.24; CI: 0.30-5.18), or type of fixation of the primary implant (OR = 2.49; CI: 0.47-13.17). Duration of the revision operation appeared as a predictive parameter for perioperative morbidity in revision total hip arthroplasty in our study group.
Article
Full-text available
T times are used to categorize surgical procedures into long and short durations. They constitute a part of the US National Nosocomial Infection Surveillance (NNIS) risk index that is widely used internationally in surveillance for surgical site infections (SSIs). The objective of this study was to compare the US NNIS T times with data collected in England. The Surgical Site Infection Surveillance Service in England holds data collected by 168 hospitals in 13 categories of surgical procedures between 1997 and 2002. The 75th percentile and corresponding T time were calculated from English data and compared with US times. Differences in rates of SSI above and below the T times were compared. Graphical methods were used to assess the cut points that exhibited an association with risk of SSI. The results show that English and US T times were the same for all surgical categories except coronary artery bypass graft and vascular surgery, where the English T time was 4 h. The 75th percentile time for hip hemiarthroplasties was 40 min less than for total hip replacements (THR). Although the incidence of SSI in THR was significantly higher in operations lasting for longer than the T time (P<0.05), no association between risk of SSI and T times set at 1, 1.5 or 2 h was observed for hip hemiarthroplasties. In conclusion, operations lasting for longer than the T time were associated with a higher risk of SSI in most categories. In the hip prosthesis category, this association only applied to THR.
Article
This study analyzes data from New York State's new Cardiac Surgery Reporting System, which contains information about cardiac preoperative risk factors, postoperative complications, and hospital discharge. The purposes of the study were to determine the set of significant clinical risk factors and to identify cardiac surgical centers most likely to have serious quality-of-care problems. Significant risk factors for in-hospital death were age, gender, ejection fraction, previous myocardial infarction, number of open heart operations in previous admissions, diabetes requiring medication, dialysis dependence, disasters (acute structural defect, renal failure, cardiogenic shock, gunshot), unstable angina, intractable congestive heart failure, left main trunk narrowed more than 90%, and type of operation performed. Four of the 28 hospitals had significantly higher mortality rates than expected, given the risk factors of their patients. Subsequent site visits and medical record reviews confirmed that these facilities had high percentages of quality-of-care problems among cases resulting in mortality. (JAMA. 1990;264:2768-2774)
Article
Objective: A prospective regional study was conducted to determine if the observed differences in in-hospital mortality rates associated with coronary artery bypass grafting (CABG) are solely the result of differences in patient case mix.
Article
Background: There is no objective scale for assessment of operative skill in laparoscopic gastric bypass (LGBP). The objective of this study was to develop and demonstrate feasibility of use, validity, and reliability of a Bariatric Objective Structured Assessment of Technical Skill (BOSATS) scale. Study design: The BOSATS scale was developed using a hierarchical task analysis (HTA), a Delphi questionnaire, and a panel of international experts in bariatric surgery. The feasibility of use, reliability, and validity of the developed scale were demonstrated by reviewing 52 prospectively collected video recordings of LGBP performed by novice and experienced surgeons. Results: A total of 214 discrete steps were identified in HTA. A total of 12 and 17 panel members completed the first and second round of the Delphi questionnaire, respectively. Consensus among the panel was achieved after the second round (Cronbach's alpha = 0.85). The BOSATS scale demonstrated high inter-rater (intraclass correlation coefficient [ICC] = 0.954; p < 0.001) and test-retest reliability (ICC = 0.99; p < 0.001). Significant differences between BOSATS scores of experienced and novice surgeon groups were noted for the creation of jejunojejunostomy (JJ), gastric pouch, linear stapled gastrojejunostomy (GJ), circular stapled GJ, and hand-sewn GJ. Moderate to high correlations between BOSATS scale and Objective Structured Assessment of Technical Skills Global Rating Scale (OSATS GRS) were seen for JJ (rho = 0.59; p = 0.001), gastric pouch (rho = 0.48; p = 0.0004), linear stapled GJ (rho = 0.70; p = 0.0001), and hand-sewn GJ (rho = 0.96; p < 0.0001). Conclusions: The BOSATS scale is a feasible, reliable, and valid instrument for objective assessment of operative performance in LGBP. Implementation of this scale is expected to facilitate deliberate practice and provide a means for future certification in bariatric surgery.
Article
Purpose: Most assessment of surgical trainees is based on measures of knowledge, with limited evaluation of their competence to actually perform various surgical procedures. In this study, the authors evaluated a tool they designed to assess a trainee's competence to perform an entire surgical procedure independently, regardless of procedure type or postgraduate year (PGY). Method: In phase 1, the Ottawa Surgical Competency Operating Room Evaluation (O-SCORE) was piloted in the University of Ottawa's Division of Orthopaedic Surgery. In phase 2, the refined 11-item tool (8 items rated on a 5-point competency scale, 1 item assessing procedural competence, 2 feedback items) was used in the Divisions of Orthopaedic Surgery and General Surgery to assess residents' performance on 11 common procedures. Quantitative and qualitative analyses were conducted. Results: In phase 2, 34 orthopaedic and general surgeons assessed the performance of 37 residents in 163 procedures. ANOVA demonstrated an effect of PGY. Post hoc analysis found that total procedure scores for PGYs 1 and 2 were lower than those for PGY 3 (P<.001), and PGY 3 scores were lower than those for PGYs 4 and 5 (P<.02). Analysis of qualitative data indicated that the rating scale was practical and useful for surgeons and residents. Conclusions: This novel evaluation tool successfully discriminated between junior and senior residents and identified surgical competency across various PGY levels regardless of procedure type. Multiple sources of evidence support the O-SCORE as a valid tool for the assessment of trainee operative competency.
Article
Duration of femoral-popliteal bypass is based on multiple patient-specific, system-specific, and surgeon-specific factors, and is subject to considerable variability. We hypothesized that shorter operative duration is associated with improved outcomes and might represent a potential quality-improvement measure. Patients who underwent primary femoral-popliteal bypass with autogenous vein between 2005 and 2009 were identified from the American College of Surgeons NSQIP dataset using ICD-9 codes. Operative duration quartiles (Q) were determined (Q1: ≤149 minutes, Q2: 150 to 192 minutes, Q3: 193 to 248 minutes; and Q4: ≥249 minutes). Perioperative outcomes included mortality, surgical site infection, cardiopulmonary complications, and length of hospital stay. Relevant patient-specific and system-specific confounders, including age, body mass index, smoking, diabetes, end-stage renal disease, indication, American Society of Anesthesiologists' class, type of anesthesia, intraoperative transfusion, nonoperative time in the operating room, and participation of a trainee during the procedure, were adjusted for using multivariable regression. There were 2,644 femoral-popliteal bypass procedures in our study. Mean age was 65.9 years and 62% of patients were male. Longer duration of surgery was associated with increased perioperative surgical site infection (Q1: 6.3%; Q2: 9.0%; Q3: 10.1%; and Q4: 13.9%; p < 0.001) and longer length of stay (5.4 ± 6.8 days; 6.1 ± 6.7 days; 7.0 ± 11.3 days; 8.1 ± 8.0 days, respectively; p < 0.001). In multivariable analysis, longer operative duration was independently associated with higher surgical site infection and longer hospital length of stay. Operative duration of ≥260 minutes increased the risk of surgical site infection by 50% compared with operative time of 150 minutes. 
Longer duration of femoral-popliteal bypass with autogenous vein was associated with a significantly higher risk of perioperative surgical site infection and longer hospital length of stay. Surgeon-specific parameters that lead to faster operative time might lead to improved clinical outcomes and more efficient hospital resource use.
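The quartile grouping described in the abstract above can be expressed as a small lookup. This is only an illustrative sketch: the cut points and unadjusted surgical site infection rates are taken from the abstract, while the function name `duration_quartile` and the handling of fractional minutes (values between two integer cut points fall into the higher quartile) are assumptions.

```python
def duration_quartile(minutes: float) -> str:
    """Map an operative duration in minutes to the quartile label used in the abstract.

    Reported cut points: Q1 <= 149, Q2 150-192, Q3 193-248, Q4 >= 249.
    Assumption: fractional values between two cut points go to the higher quartile.
    """
    if minutes < 0:
        raise ValueError("duration must be non-negative")
    if minutes <= 149:
        return "Q1"
    if minutes <= 192:
        return "Q2"
    if minutes <= 248:
        return "Q3"
    return "Q4"


# Unadjusted surgical site infection rates (percent) reported per quartile.
SSI_RATE_PERCENT = {"Q1": 6.3, "Q2": 9.0, "Q3": 10.1, "Q4": 13.9}

# Example: a hypothetical 260-minute bypass falls in the longest quartile.
example_quartile = duration_quartile(260)
example_rate = SSI_RATE_PERCENT[example_quartile]
```

These are group-level, unadjusted rates; they describe the quartile a case belongs to and are not a prediction for an individual patient.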
Article
New information technology systems at hospitals and medical centers provide administrators and policymakers with greater information than ever before on both the characteristics of patients and on the outcomes associated with individual surgeons. This paper develops a Bayesian hierarchical bivariate probit model describing surgeon performance in terms of the 30-day mortality and 30-day morbidity of their patients. We apply the model to a sample of 2,578 patients who received care from one of 36 surgeons at a large hospital. The model is estimated using Markov Chain Monte Carlo (MCMC) simulation methods. After accounting for observed differences in the health status of patients prior to surgery and the complexity of the procedure performed, we construct quality indices measuring surgeon performance in terms of morbidity and mortality. These indices are used to evaluate surgeon performance against absolute standards for morbidity and mortality rates, and then to conduct “head to head” comparisons of individual surgeons within subspecialty surgery departments at the hospital. Our approach highlights the potential benefits of new information technologies for monitoring surgeon quality.
Article
The surgical learning curve persists for years after training, yet existing continuing medical education activities targeting this are limited. We describe a pilot study of a scalable video-based intervention, providing individualized feedback on intraoperative performance. Four complex operations performed by surgeons of varying experience--a chief resident accompanied by the operating senior surgeon, a surgeon with less than 10 years in practice, another with 20 to 30 years in practice, and a surgeon with more than 30 years of experience--were video recorded. Video playback formed the basis of 1-hour coaching sessions with a peer-judged surgical expert. These sessions were audio recorded, transcribed, and thematically coded. The sessions focused on operative technique--both technical aspects and decision-making. With increasing seniority, more discussion was devoted to the optimization of teaching and facilitation of the resident's technical performance. Coaching sessions with senior surgeons were peer-to-peer interactions, with each discussing his preferred approach. The coach alternated between directing the session (asking probing questions) and responding to specific questions brought by the surgeons, depending on learning style. At all experience levels, video review proved valuable in identifying episodes of failure to progress and troubleshooting alternative approaches. All agreed this tool is a powerful one. Inclusion of trainees seems most appropriate when coaching senior surgeons; it may restrict the dialogue of more junior attendings. Video-based coaching is an educational modality that targets intraoperative judgment, technique, and teaching. Surgeons of all levels found it highly instructive. This may provide a practical, much needed approach for continuous professional development.
Article
Background: Obesity is a recognized risk factor for venous thromboembolism (VTE). The aim of the present study was to determine the risk factors for symptomatic VTE in morbidly obese patients undergoing laparoscopic bariatric surgery. Methods: This was a retrospective study that included consecutive patients who had undergone bariatric surgery from January 2007 to May 2010. Thromboprophylaxis included routine application of low-molecular-weight heparin, pneumatic calf compression, and early ambulation. Extensive measures, such as temporary insertion of a caval filter (n = 5) and anticoagulation (n = 11), were used in selected higher risk patients. The patients were followed up for a minimum of 3 months after surgery to determine the incidence of clinical VTE. The results are presented as the mean and range. Results: A total of 500 consecutive patients aged 44.7 years (range 19-77) with a body mass index of 49.2 kg/m² (range 32.1-84.3) underwent laparoscopic bariatric surgery (442 gastric bypass, 20 sleeve gastrectomy, and 38 gastric banding). No conversions to open surgery occurred, and the operative time, morbidity rate, and mortality rate were 93.7 minutes (range 20-325), 2.8%, and .2%, respectively. No clinical deep vein thrombosis was encountered, although 3 patients (.6%) developed pulmonary embolism. Cox regression multivariate analysis identified the operative time as the only independent predictor of postoperative VTE (relative risk .0002 per min, P = .009). Multivariate analysis identified the body mass index as an independent predictor of the operating time. Conclusion: Increasing obesity was associated with a longer operative time, which consequently increased the risk of VTE.
Article
To develop a risk prediction model for serious complications after bariatric surgery. Despite evidence for improved safety with bariatric surgery, serious complications remain a concern for patients, providers and payers. There is little population-level data on which risk factors can be used to identify patients at high risk for major morbidity. The Michigan Bariatric Surgery Collaborative is a statewide consortium of hospitals and surgeons, which maintains an externally-audited prospective clinical registry. We analyzed data from 25,469 patients undergoing bariatric surgery between June 2006 and December 2010. Significant risk factors on univariable analysis were entered into a multivariable logistic regression model to identify factors associated with serious complications (life threatening and/or associated with lasting disability) within 30 days of surgery. Bootstrap resampling was performed to obtain bias-corrected confidence intervals and c-statistic. Overall, 644 patients (2.5%) experienced a serious complication. Significant risk factors (P < 0.05) included: prior VTE (odds ratio [OR] 1.90, confidence interval [CI] 1.41-2.54); mobility limitations (OR 1.61, CI 1.23-2.13); coronary artery disease (OR 1.53, CI 1.17-2.02); age over 50 (OR 1.38, CI 1.18-1.61); pulmonary disease (OR 1.37, CI 1.15-1.64); male gender (OR 1.26, CI 1.06-1.50); smoking history (OR 1.20, CI 1.02-1.40); and procedure type (reference lap band): duodenal switch (OR 9.68, CI 6.05-15.49); laparoscopic gastric bypass (OR 3.58, CI 2.79-4.64); open gastric bypass (OR 3.51, CI 2.38-5.22); sleeve gastrectomy (OR 2.46, CI 1.73-3.50). The c-statistic was 0.68 (bias-corrected to 0.66) and the model was well-calibrated across deciles of predicted risk. We have developed and validated a population-based risk scoring system for serious complications after bariatric surgery. 
We expect that this scoring system will improve the process of informed consent, facilitate the selection of procedures for high-risk patients, and allow for better risk stratification across studies of bariatric surgery.
Article
The objective of this study was to evaluate whether the Surgical Care Improvement Project (SCIP) improved surgical site infection (SSI) rates using national data at the patient level for both SCIP adherence and SSI occurrence. The SCIP was established in 2006 with the goal of reducing surgical complications by 25% in 2010. National Veterans' Affairs (VA) data from 2005 to 2009 on adherence to 5 SCIP SSI prevention measures were linked to Veterans' Affairs Surgical Quality Improvement Program SSI outcome data. Effect of SCIP adherence and year of surgery on SSI outcome were assessed with logistic regression using generalized estimating equations, adjusting for procedure type and variables known to predict SSI. Correlation between hospital SCIP adherence and SSI rate was assessed using linear regression. There were 60,853 surgeries at 112 VA hospitals analyzed. SCIP adherence ranged from 75% for normothermia to 99% for hair removal and all significantly improved over the study period (P < 0.001). Surgical site infection occurred after 6.2% of surgeries (1.6% for orthopedic surgeries to 11.3% for colorectal surgeries). None of the 5 SCIP measures were significantly associated with lower odds of SSI after adjusting for variables known to predict SSI and procedure type. Year was not associated with SSI (P = 0.71). Hospital SCIP performance was not correlated with hospital SSI rates (r = -0.06, P = 0.54). Adherence to SCIP measures improved whereas risk-adjusted SSI rates remained stable. SCIP adherence was neither associated with a lower SSI rate at the patient level, nor associated with hospital SSI rates. Policies regarding continued SCIP measurement and reporting should be reassessed.
Article
Surgeons are increasingly being scrutinized for their performance and there is growing interest in objective assessment of technical skills. The purpose of this study was to review all evidence for these methods, in order to provide a guideline for use in clinical practice. A systematic search was performed using PubMed and Web of Science for studies addressing the validity and reliability of methods for objective skills assessment within surgery and gynaecology only. The studies were assessed according to the Oxford Centre for Evidence-based Medicine levels of evidence. In total 104 studies were included, of which 20 (19.2 per cent) had a level of evidence 1b or 2b. In 28 studies (26.9 per cent), the assessment method was used in the operating room. Virtual reality simulators and Objective Structured Assessment of Technical Skills (OSATS) have been studied most. Although OSATS is seen as the standard for skills assessment, only seven studies, with a low level of evidence, addressed its use in the operating room. Based on currently available evidence, most methods of skills assessment are valid for feedback or measuring progress of training, but few can be used for examination or credentialing. The purpose of the assessment determines the choice of method.
Article
Studies of specific procedures have shown increases in infectious complications with operative duration. We hypothesized that operative duration is independently associated with increased risk-adjusted infectious complication (IC) rates in a broad range of general surgical procedures. We queried the American College of Surgeons National Surgical Quality Improvement Program database for general surgical operations performed from 2005 to 2007. ICs (wound infection, sepsis, urinary tract infection, and/or pneumonia) and length of hospital stay (LOS) were evaluated versus operative duration (OD, i.e., incision to closure). Multivariable regression adjusted for 38 patient risk variables, operation type and complexity, wound class and intraoperative transfusion. We also analyzed isolated laparoscopic cholecystectomies in patients of American Society of Anesthesiologists class 1 or 2, without intraoperative transfusion and with a clean or clean-contaminated wound class. In 299,359 operations performed at 173 hospitals, unadjusted IC rates increased linearly with OD at a rate of close to 2.5% per half hour (chi-square test for linear trend, p < 0.001). After adjustment, IC risk increased for each half hour of OD relative to cases lasting ≤1 hour, almost doubling at 2.1 to 2.5 hours (odds ratio = 1.92; 95% CI, 1.82 to 2.03; p < 0.001). In isolated laparoscopic cholecystectomy, IC rates increased linearly with OD (n = 17,018, chi-square test for linear trend, p < 0.001) with rates for 1.1 to 1.5 hour cases (1.4%) doubling those lasting ≤0.5 hour (0.7%). Across all procedures, adjusted LOS increased geometrically with operative duration at a rate of about 6% per half hour (coefficient for natural log transformed LOS = 0.059 per half hour; 95% CI, 0.058 to 0.060; p < 0.001). Operative duration is independently associated with increased ICs and LOS after adjustment for procedure and patient risk factors.
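The "about 6% per half hour" figure follows from the reported coefficient on natural-log-transformed LOS. The short Python sketch below shows the arithmetic; the constant and function names are hypothetical, and the calculation assumes the coefficient applies uniformly across the range of operative durations, which the abstract does not spell out.

```python
import math

# Reported coefficient on ln(LOS) per additional half hour of operative time.
COEF_PER_HALF_HOUR = 0.059


def los_multiplier(extra_half_hours):
    """Multiplicative change in adjusted LOS for extra half hours of operating time."""
    return math.exp(COEF_PER_HALF_HOUR * extra_half_hours)


# Percentage increase in LOS per half hour; exp(0.059) is roughly 1.06.
pct_per_half_hour = (los_multiplier(1) - 1.0) * 100.0
print(round(pct_per_half_hour, 1))

# The effect compounds: two extra half hours multiply LOS by exp(0.059) squared.
print(round(los_multiplier(2), 3))
```

Because the model is linear on the log scale, each additional half hour multiplies LOS by the same factor, which is what "increased geometrically" means here.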
Article
The increased focus on quality and efficiency improvement within academic surgery has met with variable success among plastic surgeons. Traditional surgical performance metrics, such as morbidity and mortality, are insufficient to improve the majority of today's plastic surgical procedures. In-process analyses that allow rapid feedback to the surgeon based on surrogate markers may provide a powerful method for quality improvement. The authors reviewed performance data from all bilateral reduction mammaplasties performed at their institution by eight surgeons between 1995 and 2007. Multiple linear regression analyses were conducted to determine the relative impact of key factors on operative time. Explanatory learning curve models were generated, and complication data were analyzed to elucidate clinical outcomes and trends. A total of 1068 procedures were analyzed. The mean operative time for bilateral reduction mammaplasty was 134 ± 34 minutes, with a mean operative experience of 11 ± 4.7 years and total resection volume of 1680 ± 930 g. Multiple linear regression analyses showed that operative time (R = 0.57) was most closely related to surgeon experience and resection volume. The complication rate diminished in a logarithmic fashion with increasing surgeon experience and in a linear fashion with declining operative time. The results of this study suggest a three-phase learning curve in which complication rates, variance in operative time, and operative time all decrease with surgeon experience. In-process statistical analyses may represent the beginning of a new paradigm in academic surgical quality and efficiency improvement in low-risk surgical procedures.
Article
Surgical site infections (SSI) continue to be a significant problem in surgery. The American College of Surgeons-National Surgical Quality Improvement Program (ACS-NSQIP) Best Practices Initiative compared process and structural characteristics among 117 private sector hospitals in an effort to define best practices aimed at preventing SSI. Using standard NSQIP methodologies, we identified 20 low outlier and 13 high outlier hospitals for SSI using data from the ACS-NSQIP in 2006. Each hospital was administered a process of care survey, and site visits were conducted to five hospitals. Comparisons between the low and high outlier hospitals were made with regard to patient characteristics, operative variables, structural variables, and processes of care. Hospitals that were high outliers for SSI had higher trainee-to-bed ratios (0.61 versus 0.25, p < 0.0001), and their operations took significantly longer (128.3 ± 104.3 minutes versus 102.7 ± 83.9 minutes, p < 0.001). Patients operated on at low outlier hospitals were less likely to present to the operating room anemic (4.9% versus 9.7%, p = 0.007) or to receive a transfusion (5.1% versus 8.0%, p = 0.03). In general, perioperative policies and practices were very similar between the low and high outlier hospitals, although low outlier hospitals were readily identified by site visitors. Overall, low outlier hospitals were smaller, efficient in the delivery of care, and experienced little operative staff turnover. Our findings suggest that evidence-based SSI prevention practices do not easily distinguish well-performing from poorly performing hospitals, but structural and process-of-care characteristics of hospitals were found to have a significant association with good results.
Article
A prospective regional study was conducted to determine if the observed differences in in-hospital mortality rates associated with coronary artery bypass grafting (CABG) are solely the result of differences in patient case mix. Design: regional prospective cohort study. Data including patient demographic and historical data, body surface area, cardiac catheterization results, priority of surgery, comorbidity, and status at hospital discharge were collected. This study presents data for 3055 CABG patients between July 1, 1987, and April 15, 1989. This study includes data from all surgeons performing cardiothoracic surgery in Maine, New Hampshire, and Vermont; the data were collected from five regional medical centers. Data were collected from all consecutive isolated CABG surgery patients during the study period. The main outcome measures were crude and adjusted in-hospital mortality rates associated with CABG. The overall crude in-hospital mortality rate for isolated CABG was 4.3%. The rate varied among centers (range, 3.1% to 6.3%) and among surgeons (range, 1.9% to 9.2%). Predictors of in-hospital mortality included increased age, female gender, small body surface area, greater comorbidity, reoperation, poorer cardiac function as indicated by a lower ejection fraction, increased left ventricular end diastolic pressure, and emergent or urgent surgery. After adjusting for the effects of potentially confounding variables, substantial and statistically significant variability was observed among medical centers (P = .021) and among surgeons (P = .025). We conclude that the observed differences in in-hospital mortality rates among institutions and among surgeons in northern New England are not solely the result of differences in case mix as described by these variables and may reflect differences in currently unknown aspects of patient care. Understanding this variation requires a detailed understanding of the processes of care.
Article
This study analyzes data from New York State's new Cardiac Surgery Reporting System, which contains information about cardiac preoperative risk factors, postoperative complications, and hospital discharge. The purposes of the study were to determine the set of significant clinical risk factors and to identify cardiac surgical centers most likely to have serious quality-of-care problems. Significant risk factors for in-hospital death were age, gender, ejection fraction, previous myocardial infarction, number of open heart operations in previous admissions, diabetes requiring medication, dialysis dependence, disasters (acute structural defect, renal failure, cardiogenic shock, gunshot), unstable angina, intractable congestive heart failure, left main trunk narrowed more than 90%, and type of operation performed. Four of the 28 hospitals had significantly higher mortality rates than expected, given the risk factors of their patients. Subsequent site visits and medical record reviews confirmed that these facilities had high percentages of quality-of-care problems among cases resulting in mortality.
Article
Although the relation between hospital volume and surgical mortality is well established, for most procedures, the relative importance of the experience of the operating surgeon is uncertain. Using information from the national Medicare claims data base for 1998 through 1999, we examined mortality among all 474,108 patients who underwent one of eight cardiovascular procedures or cancer resections. Using nested regression models, we examined the relations between operative mortality and surgeon volume and hospital volume (each in terms of total procedures performed per year), with adjustment for characteristics of the patients and other characteristics of the providers. Surgeon volume was inversely related to operative mortality for all eight procedures (P=0.003 for lung resection, P<0.001 for all other procedures). The adjusted odds ratio for operative death (for patients with a low-volume surgeon vs. those with a high-volume surgeon) varied widely according to the procedure--from 1.24 for lung resection to 3.61 for pancreatic resection. Surgeon volume accounted for a large proportion of the apparent effect of the hospital volume, to an extent that varied according to the procedure: it accounted for 100 percent of the effect for aortic-valve replacement, 57 percent for elective repair of an abdominal aortic aneurysm, 55 percent for pancreatic resection, 49 percent for coronary-artery bypass grafting, 46 percent for esophagectomy, 39 percent for cystectomy, and 24 percent for lung resection. For most procedures, the mortality rate was higher among patients of low-volume surgeons than among those of high-volume surgeons, regardless of the surgical volume of the hospital in which they practiced. For many procedures, the observed associations between hospital volume and operative mortality are largely mediated by surgeon volume. 
Patients can often improve their chances of survival substantially, even at high-volume hospitals, by selecting surgeons who perform the operations frequently.
Article
To inform surgeons about the practical issues to be considered for successful integration of virtual reality simulation into a surgical training program. The learning and practice of minimally invasive surgery (MIS) makes unique demands on surgical training programs. A decade ago Satava proposed virtual reality (VR) surgical simulation as a solution for this problem. Only recently have robust scientific studies supported that vision. A review of the surgical education, human-factor, and psychology literature was conducted to identify important factors that will impinge on the successful integration of VR training into a surgical training program. VR is more likely to be successful if it is systematically integrated into a well-thought-out education and training program which objectively assesses technical skills improvement proximate to the learning experience. Validated performance metrics should be relevant to the surgical task being trained but in general will require trainees to reach an objectively determined proficiency criterion, based on tightly defined metrics, and to perform at this level consistently. VR training is more likely to be successful if the training schedule takes place on an interval basis rather than massed into a short period of extensive practice. High-fidelity VR simulations will confer the greatest skills transfer to the in vivo surgical situation, but less expensive VR trainers will also lead to considerably improved skills generalizations. VR for improved performance of MIS is now a reality. However, VR is only a training tool that must be thoughtfully introduced into a surgical training curriculum for it to successfully improve surgical technical skills.
Article
Traditionally, surgeons have been trained and evaluated on the basis of their performance of surgical procedures in live patients. This article in the Medical Education series explores the use of mechanical devices for the teaching and evaluation of surgical skills.