AI in the Operating Room: Promise, Peril, and the Regulation Gap 

Key Takeaways 

  1. From 7 to 100+ adverse event reports. After a single AI update to a surgical navigation device, FDA filings involving that product increased 14-fold — including strokes, skull perforations, and cerebrospinal fluid leaks.
  2. Over 1,300 AI-enabled medical devices are already FDA-authorized — and fewer than 2% were supported by randomized clinical trial data before reaching patients.
  3. Publicly traded medtech companies are nearly 6 times more likely to have an AI device recalled than private ones. Investor pressure to launch fast appears to be a measurable safety variable.
  4. The “black box” problem is intraoperative. When an AI system tells a surgeon where to cut in real time and is wrong, there is often no mechanism for the surgeon to know it — until the damage is done.
  5. No one is clearly liable. When AI contributes to surgical harm, responsibility is split between surgeon, hospital, manufacturer, and regulator. Currently, the patient absorbs the consequence while accountability evaporates across the chain.

A Leap Forward That Stumbled 

In 2021, Acclarent — a subsidiary of Johnson & Johnson — announced what it called “a leap forward” in surgical technology. The company had integrated a machine-learning algorithm into its TruDi Navigation System, a device used by ear, nose, and throat surgeons to navigate the sinuses during procedures for chronic sinusitis. The AI component, Acclarent claimed, would help surgeons identify anatomical landmarks and calculate optimal instrument paths in real time, making a delicate procedure safer and more efficient. 

Before the AI update, the U.S. Food and Drug Administration had received seven reports of device malfunctions and one injury report associated with TruDi. In the four years following the AI integration, the FDA received at least 100 reports of malfunctions and adverse events. At least 10 patients were injured. The injuries included cerebrospinal fluid leaks, a puncture at the base of a patient’s skull, and at least two strokes caused by accidental damage to the carotid artery. One patient required removal of part of her skull to relieve brain swelling. Two lawsuits filed in Texas federal courts allege that the AI-enhanced TruDi system’s guidance contributed directly to these outcomes. 

The TruDi story is the most detailed case study in a sweeping Reuters investigation published February 9, 2026, which drew on FDA adverse event databases, internal legal filings, regulatory documents, and interviews with current and former FDA scientists. Its findings are not about the distant future of surgical AI. They are about what is already happening — and what the regulatory system has so far struggled to address. 

The Scale of AI’s Entry into Medicine 

The TruDi device is one of the 1,357 AI-enabled medical devices now listed in the FDA’s official tracker, a number that has more than doubled since the end of 2022. That acceleration is driven by both genuine technological progress and strong commercial incentives: the AI-enabled medical device market was valued at roughly $13.7 billion in 2024 and is projected by some analysts to exceed $250 billion by 2033.
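
As a sanity check on the scale of that forecast, the implied growth rate is simple arithmetic; the figures below come from the analyst projection cited above, not from any primary dataset:

```python
# Back-of-envelope arithmetic on the market projection above: growth from
# roughly $13.7B in 2024 to $250B by 2033 implies a compound annual growth
# rate near 38%. Illustrative only; the underlying forecast is an analyst
# estimate, not a measured figure.
value_2024, value_2033, years = 13.7, 250.0, 9
cagr = (value_2033 / value_2024) ** (1 / years) - 1
print(f"implied compound annual growth rate: {cagr:.1%}")  # ~38.1% per year
```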

These devices span an enormous clinical range: radiology systems that detect tumors in CT scans, algorithms that interpret electrocardiograms for arrhythmias, software that guides orthopedic implant placement, prenatal ultrasound tools that assess fetal anatomy, and surgical navigation platforms like TruDi. The vast majority — nearly 97% — entered the U.S. market through the FDA’s 510(k) clearance pathway, which requires manufacturers to demonstrate that a new device is “substantially equivalent” to a predicate device already on the market. This pathway does not require prospective clinical trials on human patients. Many AI devices have reached operating rooms without ever being formally tested on the populations that will ultimately use them. 

What the Adverse Event Data Reveals 

The Reuters review identified at least 1,401 adverse event reports filed with the FDA between 2021 and October 2025 involving devices on the FDA’s AI list. Of those, at least 115 specifically mentioned problems with software, algorithms, or programming — the components that AI adds to an otherwise familiar device. 
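
Reuters has not published its screening methodology. As a rough sketch of how such a keyword review might work, the snippet below filters adverse event narratives for software-related language; the file name and column name are hypothetical stand-ins for an export of the FDA’s MAUDE data, not part of any published methodology:

```python
# Hypothetical sketch of screening FDA adverse event narratives for
# software-related language. Assumes a CSV export with an `event_text`
# column; both the file name and the column name are invented for
# illustration.
import pandas as pd

SOFTWARE_TERMS = ["software", "algorithm", "programming"]

reports = pd.read_csv("maude_ai_device_reports.csv")
pattern = "|".join(SOFTWARE_TERMS)
software_related = reports[
    reports["event_text"].str.contains(pattern, case=False, na=False)
]
print(f"{len(software_related)} of {len(reports)} reports mention "
      "software, algorithm, or programming issues")
```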

The TruDi case was not isolated. The Reuters investigation also examined FDA filings involving the Sonio Detect system, AI-powered prenatal ultrasound software owned by Samsung Medison, a subsidiary of Samsung Electronics. An FDA report from June 2025 alleged that Sonio Detect’s algorithm incorrectly labeled fetal structures and assigned them to the wrong anatomical locations. No patient harm was reported in that filing. Samsung Medison said the FDA report “does not indicate any safety issue” and that the agency had not requested corrective action. Additionally, at least 16 FDA reports implicated AI-assisted cardiac monitors made by Medtronic in failures to detect abnormal heart rhythms or significant pauses — again, with no confirmed patient injuries reported, and Medtronic attributing some cases to user confusion about data display rather than algorithmic error. 

These cases represent different severity levels and different degrees of confirmed harm, and it would be inaccurate to treat them as equivalent. But they share a common thread: AI systems behaving in ways that diverged, sometimes dangerously, from what clinicians and patients expected. 

The TruDi Case: A Cautionary Anatomy 

The specific mechanics of TruDi’s alleged failures illuminate the particular risks of AI in intraoperative navigation. The system is designed to track the precise location of surgical instruments inside a patient’s head using imaging data, and to inform the surgeon of where the instrument tip is relative to critical structures — the carotid artery, the skull base, the orbit. In the most serious adverse event reports, the system allegedly misinformed surgeons about instrument location in real time, leading to incisions in the wrong anatomical location. 

Internal legal filings cited in the Reuters investigation allege that Acclarent set a goal of only 80% accuracy for some AI components before integrating them into the TruDi system — a threshold that, in the context of operating near the carotid artery, represents a startling tolerance for error. A lawsuit brought by surgeon Marc Dean — who had been a paid consultant to Acclarent — alleges that he had warned the company of unresolved safety issues before the AI update was released. The complaint states that Acclarent proceeded nonetheless and “lowered its safety standards to rush the new technology to market.” 
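
To see why an 80% accuracy target is alarming in this setting, consider the cumulative arithmetic. The toy calculation below assumes each guidance output is an independent event, which real intraoperative errors are not, so it should be read as rough intuition rather than a model of the TruDi system:

```python
# If an AI component is correct 80% of the time and a surgeon relies on it
# for n guidance outputs during a procedure, the chance of at least one
# erroneous output grows rapidly. Assumes independence between outputs,
# which is a simplification.
for n in (1, 5, 10, 20):
    p_error = 1 - 0.8 ** n
    print(f"{n:>2} guidance outputs -> {p_error:.0%} chance of at least one error")
# 1 -> 20%, 5 -> 67%, 10 -> 89%, 20 -> 99%
```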

Integra LifeSciences, which acquired Acclarent and the TruDi system in 2024, contests the causal claims: the company states that adverse event reports “do nothing more than indicate that a TruDi system was in use in a surgery where an adverse event took place” and that no credible evidence establishes a causal link between the AI component and any injury. This defense is technically valid — FDA adverse event reports are not determinations of causation, and surgical complications can and do occur independent of device failures. But the 14-fold increase in adverse event reports following the AI update is a signal that regulators and independent scientists have found difficult to dismiss. 

A Structural Problem: Regulation Built for Static Devices 

The core regulatory challenge exposed by these incidents is that existing frameworks were designed for a fundamentally different kind of medical technology. Traditional medical devices — a scalpel, an implantable hip, even a conventional surgical navigation system — perform fixed functions. Their performance can be characterized before authorization and remains stable afterward. Regulatory review can verify that the device does what it claims. 

AI systems, particularly those that use machine learning, are categorically different. Their outputs emerge from pattern recognition trained on specific datasets. They may perform well on data similar to their training set and poorly on cases outside it — unusual anatomy, rare pathology, patients from demographic groups underrepresented in training data. And many current systems function as “black boxes”: even their developers cannot fully explain why they produce a particular output in a given case, only that statistically, across the training distribution, outputs have been accurate with acceptable frequency. 
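
The practical consequence is that a model typically gives no warning when it is extrapolating. Here is a minimal, entirely synthetic sketch of that silent-failure mode, with no connection to any real device:

```python
# A toy classifier reports near-total confidence on an input far outside
# its training distribution. Standard models have no built-in notion of
# "I was never trained on anything like this," which is why out-of-
# distribution detection and calibrated uncertainty are active research
# needs for clinical AI. Synthetic data throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(5000, 2))                     # "typical" cases
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

model = LogisticRegression().fit(X_train, y_train)

x_novel = np.array([[8.0, 9.0]])                         # nothing like the training data
confidence = model.predict_proba(x_novel)[0].max()
print(f"confidence on never-before-seen input: {confidence:.1%}")  # ~100%
```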

The 510(k) pathway’s substantial-equivalence standard was not designed with this adaptive, probabilistic, opaque category of technology in mind. A 2025 analysis of 691 FDA-cleared AI devices published in JAMA found that 46.7% of FDA decision summaries did not describe the study design used to evaluate the device, and 53.3% omitted the sample size. Fewer than 2% of cleared AI devices were supported by randomized clinical trial data. Only three devices — fewer than 1% — reported patient health outcomes as part of their authorization evidence. 

Investor Pressure and the Recall Pattern 

A peer-reviewed study published in JAMA Health Forum in August 2025 by researchers from Johns Hopkins, Georgetown, and Yale adds a structural layer to the regulatory picture. The team analyzed recall records for nearly 1,000 FDA-authorized AI-enabled medical devices and found that 60 devices had been involved in 182 total recalls. Among AI devices cleared through 510(k), 43.4% of recalls occurred within the first year of authorization — approximately twice the recall rate observed for all 510(k) devices in the same period. 

More striking was the distribution by company type. Publicly traded manufacturers accounted for just over half of all AI medical devices studied but were responsible for more than 90% of recall events, and were nearly six times more likely to have a device recalled than private companies. The study’s corresponding author, Tinglong Dai of Johns Hopkins Carey Business School, attributed this to investor-driven pressure to launch quickly: publicly traded medtech companies face quarterly earnings expectations and shareholder pressure in ways that private companies do not, creating structural incentives to prioritize speed over the longer timelines required for thorough clinical validation. 

The FDA’s Capacity Problem 

The investigation also revealed that regulatory capacity has not grown at anything approaching the pace of AI device submissions. Five current and former FDA scientists told Reuters that the agency is struggling to manage the increasing volume and complexity of AI-related submissions, in part due to staffing reductions following government cost-cutting measures. Reviewing a traditional medical device requires engineers and clinicians familiar with materials, mechanics, and established clinical evidence. Reviewing a machine-learning algorithm requires additional expertise in data science, algorithmic auditing, training data composition, and distributional shift — a skill set that remains scarce in government. 

A spokesperson for the Department of Health and Human Services, which encompasses the FDA, said the agency is working to expand its capacity in AI device review. The FDA has also signaled plans to develop new frameworks for what it calls “Predetermined Change Control Plans” — mechanisms that would allow AI manufacturers to update their algorithms post-authorization within pre-agreed parameters without filing an entirely new application. Whether this expedites safe iteration or creates new gaps in oversight remains actively debated among device safety researchers. 
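
The FDA has not published PCCPs as machine-readable artifacts. Purely as an illustration, with invented field names, the sketch below shows the kind of pre-agreed envelope such a plan might encode: performance floors on a locked test set, a cap on subgroup disparities, and an allowlist of change types.

```python
# Hypothetical illustration of a Predetermined Change Control Plan as a
# pre-agreed envelope for post-authorization model updates. All field
# names are invented for this sketch; actual PCCPs are regulatory
# documents, not code.
from dataclasses import dataclass

@dataclass(frozen=True)
class ChangeControlPlan:
    min_sensitivity: float     # performance floor on a locked test set
    min_specificity: float
    max_subgroup_gap: float    # largest tolerated accuracy gap across subgroups
    allowed_changes: tuple     # e.g., retraining on new data, but not a new indication

def update_permitted(plan: ChangeControlPlan, metrics: dict, change: str) -> bool:
    """An update stays inside the plan only if every pre-agreed bound holds."""
    return (change in plan.allowed_changes
            and metrics["sensitivity"] >= plan.min_sensitivity
            and metrics["specificity"] >= plan.min_specificity
            and metrics["subgroup_gap"] <= plan.max_subgroup_gap)

plan = ChangeControlPlan(0.95, 0.90, 0.03, ("retrain_on_new_data",))
print(update_permitted(
    plan,
    {"sensitivity": 0.96, "specificity": 0.92, "subgroup_gap": 0.02},
    "retrain_on_new_data",
))  # True: the proposed update stays inside the envelope
```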

The Accountability Vacuum 

When an AI system contributes to a surgical complication, the existing liability landscape provides no clear answer about who bears responsibility. The surgeon operated the device. The hospital purchased and deployed it. The manufacturer sold it with FDA clearance. The algorithm was trained by engineers on data curated by data scientists. And the regulator that cleared it lacked the resources to fully probe its limitations.

In the TruDi litigation, plaintiffs have named the device manufacturer and its predecessors. In future cases, hospitals may face claims for failing to adequately vet the systems they adopt, or for failing to train surgeons in the limitations of AI guidance. Manufacturers may argue that FDA clearance confers a degree of regulatory immunity. These questions remain unresolved in U.S. courts, and the absence of clear legal precedent contributes to a risk-distribution problem: currently, patients bear the consequences of AI failure, while responsibility for those consequences remains diffuse. 

What Responsible AI Integration Looks Like 

None of the experts who spoke with Reuters, nor the academic researchers who have studied AI device recalls, argue that AI has no place in surgery. There is genuine evidence that AI can reduce surgical errors in specific, well-validated contexts. Systems designed to highlight “go” and “no-go” zones during laparoscopic cholecystectomy have shown strong performance in multicenter studies. AI interpretation of intraoperative video for phase recognition and instrument tracking has demonstrated accuracy competitive with expert human review across thousands of annotated surgical cases. 

The distinction that researchers and clinicians consistently draw is twofold: between AI that augments clinical judgment and AI that replaces or overrides it, and between AI that has been rigorously validated on representative populations before deployment and AI that is optimized in the laboratory and tested, in effect, on patients. The TruDi case, as alleged, falls on the wrong side of both.

Responsible integration requires prospective clinical validation on diverse, representative patient populations; mandatory post-market surveillance with structured adverse event reporting; algorithmic transparency sufficient for clinicians to understand when an AI output should be questioned; and liability frameworks that incentivize manufacturers to invest in safety rather than speed. Several of these elements are currently either absent or structurally underenforced in the U.S. regulatory environment. 

*** 

The Reuters investigation published in February 2026 is not a case against AI in medicine. It is a case against the conditions under which AI is currently entering medicine — specifically, the combination of accelerated commercial deployment, regulatory frameworks designed for an older generation of devices, resource-constrained oversight bodies, inadequate clinical validation standards, and an accountability structure that has yet to assign clear responsibility when automated systems cause harm. 

AI will not leave the operating room. The economic forces, the genuine clinical potential, and the momentum of adoption are all too substantial. The question is whether the patients on whom these systems are being used will benefit from the same rigorous evidence standards that govern drug approvals — or whether the medtech industry’s speed advantage over regulators will continue to be measured in patient injuries. 

Sources 

  1. Robbins R, et al. As AI enters the operating room, reports arise of botched surgeries and misidentified body parts. Reuters. February 9, 2026.
  2. Lee B, Kramer P, Sandri S, et al. Early Recalls and Clinical Validation Gaps in Artificial Intelligence–Enabled Medical Devices. JAMA Health Forum. 2025;6(8):e253172. doi:10.1001/jamahealthforum.2025.3172
  3. Johns Hopkins Hub. Investor pressure may be driving risky AI medical device launches. October 30, 2025. Available at: https://hub.jhu.edu/2025/10/30/investor-pressure-risky-ai-medical-devices/
  4. FDA AI/ML-Enabled Medical Devices Tracker. Available at: https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices
  5. Vice / Motherboard. AI-Powered Tools in the Operating Room Is Probably a Bad Idea. February 15, 2026. Available at: https://www.vice.com/en/article/ai-powered-tools-in-the-operating-room-is-probably-a-bad-idea/
  6. Hashimoto DA, Rosman G, Rus D, Meireles OR. Artificial Intelligence in Surgery: Promises and Perils. Ann Surg. 2018;268(1):70–76. doi:10.1097/SLA.0000000000002693
  7. Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit Med. 2018;1:39. doi:10.1038/s41746-018-0040-6
  8. Maier-Hein L, et al. Surgical data science — from concepts toward clinical translation. Artif Intell Med. 2022;76:102313. doi:10.1016/j.artmed.2022.102313
  9. Medboundtimes.com. AI Enters Operating Rooms, Raising Alarms Over Surgical Errors and Patient Safety. February 2026. Available at: https://www.medboundtimes.com/medicine/ai-operating-rooms-safety-concerns-botched-surgeries
  10. IntuitionLabs. FDA’s AI Medical Device List: Stats, Trends & Regulation. November 2025. Available at: https://intuitionlabs.ai/articles/fda-ai-medical-device-tracker
