Can AI improve emergency rooms?

January 14, 2026

UM-Dearborn Associate Professor Mohamed Abouelenien and researchers from the U-M Medical School are looking to machine learning to boost ER survival rates.

Unless you work in one, most of what you know about the frenetic vibe of hospital emergency rooms probably comes from the never-ending stream of TV medical dramas. Some are truer to life than others. HBO Max’s current incarnation, “The Pitt,” which depicts the dynamics of a Pittsburgh ER team in real-time, one-hour increments, has actually won praise from medical practitioners for being fairly representative of real ERs. Watch it and you’ll see a lot of what doctors say makes the ER so intense: teams trying to quickly diagnose and remedy life-threatening medical issues on a time crunch, veteran practitioners mentoring younger ones, doctors butting up against the limits of their expertise and experience, and brains generally running at full tilt to think clearly and stay cool.

UM-Dearborn Associate Professor of Computer and Information Science Mohamed Abouelenien hasn’t seen “The Pitt.” But the brand of intense, potentially life-and-death decision making that happens in the ER is right up his alley. For years, he has used his expertise in artificial intelligence and virtual reality to study the complexities of the cognitive load experienced by human drivers in an effort to make driving and vehicle technologies safer. So when U-M Medical School Assistant Professor of Learning Health Sciences Vitaly Popov approached him about collaborating on a study he was leading about the cognitive load experienced by ER teams, Abouelenien was all in. Popov’s core hypothesis was that as the mental resources of individuals and teams become overloaded, performance and decision making could suffer. The project, whose initial phases were funded by internal grants from MIDAS and eHail and which has now received funding from the National Science Foundation, seeks to study the phenomenon in a quantifiable way, using an advanced multimodal machine learning model to better understand the factors that affect the cognitive load of ER teams.

Right off the bat, this subject presents challenges that make direct study difficult. For example, the Health Insurance Portability and Accountability Act, or HIPAA, the federal law protecting sensitive patient health information, prevents recording of medical procedures without patient consent, which would be difficult or impossible to obtain for the kinds of severe medical emergencies the research team is interested in. Moreover, in situations where someone’s life is at stake, the study team also wouldn’t want to overload practitioners and ERs with potentially distracting sensors. So instead, they turned to virtual reality — creating a custom environment where multiple practitioners could work together in real time on a VR patient with a simulated medical emergency. During the simulations, the research team then collected diverse data from the environment, including audio and headset data that tracked the gazes of the participants.

A screenshot of a medical VR simulation showing a hospital emergency room and a VR patient on a gurney — An example of the virtual reality environment the study subjects used to do simulated surgeries. Image courtesy Mohamed Abouelenien and U-M Medical School.

Abouelenien’s main role in this project is to take the data from the simulations and other relevant streams and train a sophisticated machine learning model capable of detecting features related to cognitive load. For example, his team is currently working with specially tweaked large language models to analyze transcriptions from the simulated surgeries, to see if particular language features are associated with negative or positive outcomes. Similarly, the team plans to analyze non-linguistic “acoustic signals” from the audio recordings, like tone of voice, changes in pitch and pace of speech. Interestingly, they’ll also be blending data about these more empirical features of the environment with qualitative data, like the background of the team, how many times an individual has done a similar procedure, and post-game analysis of what the practitioners themselves thought went right or wrong in a particular simulation.

One of the beauties, and core challenges, of this project is taking so many dissimilar data streams and integrating them into a single AI model. To do this, Abouelenien says researchers use an approach called multimodal learning, a type of machine learning that’s particularly good at accommodating diverse types of data and providing more holistic views of complex environments. He says there are a few different methods for doing this, and the team is still working out which one might be the best fit for this project. One method, for example, calls for building a separate machine learning model for each modality: one that makes assessments of cognitive load based on linguistic features, another that does it based on acoustic signals, another that uses data about the practitioner’s gaze, and so on. “So then for each of the modalities you’re using, each model will make its own decision, and then you essentially take a majority vote,” Abouelenien explains. Because different modalities may be better at detecting cognitive load under different conditions, he says there are also ways to refine the model to assign “scores” indicating which modalities are providing the most reliable conclusions at a given time.
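For readers curious what that voting step might look like in practice, here is a minimal Python sketch. It is an illustration only, not the research team's code: the modality names, the "high"/"low" labels, the reliability numbers and both function names are hypothetical, and the per-modality models are assumed to have already produced their decisions.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most modalities.

    predictions maps a modality name to that model's decision.
    Ties go to the label that appears first in the dict.
    """
    counts = Counter(predictions.values())
    return counts.most_common(1)[0][0]


def weighted_vote(predictions, reliability):
    """Return the label with the highest total reliability score.

    reliability maps a modality name to a non-negative score;
    modalities without a score count as 1.0 (a plain vote).
    """
    totals = {}
    for modality, label in predictions.items():
        weight = reliability.get(modality, 1.0)
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Hypothetical decisions from three separate models at one moment.
predictions = {"language": "high", "acoustic": "low", "gaze": "low"}

# Hypothetical scores: the language model is judged most reliable here.
reliability = {"language": 0.9, "acoustic": 0.3, "gaze": 0.4}

print(majority_vote(predictions))               # low
print(weighted_vote(predictions, reliability))  # high
```

In this toy case the simple vote says "low" because two of three models agree, while the weighted vote says "high" because the one dissenting model carries more weight than the other two combined (0.9 versus 0.7). How such scores would actually be estimated is one of the questions the article says the team is still working out.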

Abouelenien says the ultimate goal of the project is to use the model’s insights to improve emergency room outcomes and thus save lives. Of course, ER teams already have other ways of doing this: Post-op debriefs are standard practice for improving future performance. But this approach relies on human intelligence — and the value of machine learning is that it can spot unusual correlations that might never occur to us. Abouelenien says the AI-based insights from this project could, for example, help lead to new data-informed ER practices that head off cognitive overload. Or, the model might directly power a system that could monitor the ER environment in real time, probing for signs of cognitive overload — sort of like how a baseball pitching coach looks for signs of fatigue in a starting pitcher.

Practical tools like this would be a ways off. And it’s worth noting that such systems would raise some interesting ethical questions. Would doctors, for example, trust the system or be tempted to power through fatigue the way they shrug off a 24-hour shift on the TV shows? Would these technologies have unexpected outcomes — as suggested by a recent study that found an AI tool commonly used to help doctors screen for colon cancer might be diminishing doctors’ own human abilities? And would such an AI-powered system introduce a complex liability landscape in situations when something goes wrong?

Abouelenien says these are all fascinating and, for now, far-off questions. Currently, in year one of the three-year project, he and his graduate students are mostly working on processing, analyzing and cleaning the data so they can begin developing their machine learning models for each type of data stream. “I guess I’m lucky that’s our role,” he says. “I can leave some of those trickier questions to the doctors.”

###

Story by Lou Blouin