August 2024
Journal of General Internal Medicine
Background: Institutions rely on student evaluations of teaching (SET) to ascertain teaching quality. Manual review of narrative comments can identify faculty with teaching concerns but can be resource- and time-intensive.

Aim: To determine whether natural language processing (NLP) of SET comments completed by learners on clinical rotations can identify teaching quality concerns.

Setting and Participants: Single-institution retrospective cohort analysis of SET (n = 11,850) from clinical rotations between July 1, 2017, and June 30, 2018.

Program Description: The performance of three NLP dictionaries created by the research team was compared to that of an off-the-shelf Sentiment Dictionary.

Program Evaluation: The Expert Dictionary had an accuracy of 0.90, a precision of 0.62, and a recall of 0.50. The Qualifier Dictionary had lower accuracy (0.65) and precision (0.16) but similar recall (0.67). The Text Mining Dictionary had an accuracy of 0.78 and a recall of 0.24. The Sentiment plus Qualifier Dictionary had good accuracy (0.86) and recall (0.77) with a precision of 0.37.

Discussion: NLP methods can identify teaching quality concerns with good accuracy and reasonable recall, but relatively low precision. An existing, free NLP sentiment analysis dictionary can perform nearly as well as dictionaries requiring expert coding or manual creation.
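To make the reported metrics concrete, the following is a minimal sketch of how a dictionary-based flag could be scored against manual review. It is not the study's implementation: the term list, the toy comments, and the function names `flag_comment` and `evaluate` are all hypothetical and chosen only for illustration. A comment counts as flagged when it contains any dictionary term; accuracy, precision, and recall are then computed from the resulting confusion counts.

```python
import re
from typing import Dict, List, Set


def flag_comment(comment: str, dictionary: Set[str]) -> bool:
    """Return True if any dictionary term appears as a whole word in the comment."""
    tokens = set(re.findall(r"[a-z']+", comment.lower()))
    return bool(tokens & dictionary)


def evaluate(predicted: List[bool], actual: List[bool]) -> Dict[str, float]:
    """Compute accuracy, precision, and recall from paired boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)          # flagged, truly a concern
    fp = sum(p and not a for p, a in pairs)      # flagged, not a concern
    fn = sum(not p and a for p, a in pairs)      # missed concern
    tn = sum(not p and not a for p, a in pairs)  # correctly left unflagged
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical dictionary: NOT one of the dictionaries from the study.
TOY_DICTIONARY = {"dismissive", "unprofessional", "late", "rude"}

# Invented comments paired with an invented manual-review label
# (True means a reviewer would record a teaching quality concern).
TOY_COMMENTS = [
    ("Great teacher, very clear explanations.", False),
    ("Often late and dismissive of questions.", True),
    ("Was rude to the nursing staff.", True),
    ("Stayed late to help me prepare my presentation.", False),
    ("Rarely gave feedback on my notes.", True),
]

predicted = [flag_comment(text, TOY_DICTIONARY) for text, _ in TOY_COMMENTS]
actual = [label for _, label in TOY_COMMENTS]
metrics = evaluate(predicted, actual)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy data the word "late" triggers a false positive on a favorable comment, and a concern phrased without any dictionary term is missed, which shows how a keyword dictionary can lose both precision and recall. When true concerns are rare, accuracy can remain high even with low precision, because most comments are correctly left unflagged.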