Artificial intelligence systems designed to guide surgical trainees appear to be more effective when paired with human instructors who can interpret and personalize the systems' feedback, according to a randomized clinical trial.
In a study published in JAMA Surgery, medical students training on a virtual reality neurosurgical simulator achieved higher technical performance and better skill transfer when expert instructors delivered real-time feedback informed by artificial intelligence (AI) performance data, compared with AI tutoring alone or expert feedback delivered verbatim from the AI system.
The trial enrolled 87 medical students with no prior simulator experience and randomly assigned them to 1 of 3 feedback strategies during simulated brain tumor resections: AI-generated verbal feedback alone, expert feedback using identical AI-generated wording, or personalized expert feedback guided by AI-derived performance metrics. Performance was assessed using an AI-calculated composite expertise score ranging from novice to expert.
Across repeated practice trials, students receiving AI-augmented personalized expert instruction demonstrated greater improvement in surgical performance than those in the AI-only group. The difference was most evident in later practice sessions and persisted during a more complex resection task designed to assess skill transfer. In that scenario, the personalized expert group achieved higher overall expertise scores and a lower risk of bleeding and tissue injury than the AI-only group.
Expert instruction that repeated the AI-generated wording also outperformed AI tutoring on some measures, suggesting that the presence of a human instructor yields benefits even when feedback content is identical. However, the strongest gains were observed when instructors could adapt their guidance using real-time AI data rather than relying on scripted prompts.
Secondary analyses examined emotional responses and cognitive load. Trainees receiving personalized expert instruction reported higher intrinsic cognitive load and greater negative activating emotions, such as frustration, particularly during demanding tasks. Despite these added demands, that group showed superior technical performance.
In their discussion, the authors frame AI as an adjunct to, rather than a substitute for, human instruction in surgical training. While AI systems can deliver objective performance metrics consistently, the authors note that expert instructors are needed to interpret and personalize that feedback. They suggest that future training should focus on integrating AI tools into instructor-led education rather than replacing it.
Disclosures can be found in the study.
Source: JAMA Surgery