Will GPT-4 be the technology that finally transforms healthcare for the better?

By Robert M. Wachter
Professor and Chair, Department of Medicine, University of California, San Francisco

Healthcare’s digital transformation is far from an unalloyed success story. Soon after electronic health records became commonplace in the 2010s, physicians began complaining bitterly that the records’ relentless demand for documentation was turning doctors into data-entry clerks. And the portals that enable patients to see their physician’s notes, labs, and x-rays (the long-awaited digital democratization of care) generate a tsunami of digital messages to doctors, to be answered after hours and without compensation.

As GPT-4 and other versions of Generative Artificial Intelligence (AI) take the world by storm, the question arises:

Will this be the technology that finally transforms healthcare for the better?

There is no question that GPT-4 represents a breathtaking advance in medical AI. I fed it a series of very tough clinical scenarios—the kinds of twisty-turny cases that challenge our very best clinicians. I found its overall clinical reasoning abilities akin to those of a very good medical resident—well beyond novice, but not quite expert. Still, considering the years and dollars it takes to mint an experienced physician, and the fact that billions of people don’t have access to one, I was impressed.

It’s worth appreciating that the search for diagnostic help, which was my first request of GPT-4, is the hardest ask of an AI tool in medicine. The healthcare system is replete with lower-hanging fruit: areas in which AI could address points of operational and logistical friction and lead to improvements in quality, access, and safety, or decrease cost. In testing GPT-4 for a variety of use cases in these domains, the system impressed me as a polymath. It was as comfortable describing copayments as explaining CRISPR, and as adept at drafting a prior authorization as at predicting emergency department utilization.

In short, I am optimistic that GPT-4 will lead to far more benefit than harm in healthcare. I feel confident that it will succeed in automating significant parts of billing and coding, documentation, patient and clinician scheduling, and supply chain management. Each of these currently takes massive numbers of expensive FTEs, often completing tedious tasks that add little value for patients.

But for GPT-4 and its successors to be truly transformative in healthcare, they will need to change more than the way we handle paperwork. They will have to reshape clinical care.

In today’s healthcare system, patients and families have limited tools to manage their own problems, requiring them to interact with an array of credentialed and expensive experts. By changing this core dynamic—enabling everyone in the healthcare system to “practice at the top of their license”—GPT-4 may have its greatest impact. Aided by generative AI and similar tools, one can imagine patients and families managing some of their own health conditions without the help of a physician. Non-physician providers would handle cases previously requiring a physician. A single radiologist or pathologist would do the work of five by leveraging AI-assisted readings. Generalist physicians would deal with situations that today prompt a specialist consultation. These rosy scenarios would improve healthcare quality and access, and lower costs.


But the barriers to this digital nirvana should not be minimized. New companies, either standalone entities or ones building products that integrate into the workflows of legacy organizations, will need to be formed to leverage these AI tools. Most of the stakeholders in the current healthcare system benefit from the status quo and will push back mightily against substantive changes in the flow of dollars and work. Privacy and data security concerns will add friction to the data-sharing needed to facilitate algorithm development and deployment. The training of all healthcare professionals and administrators will need to evolve.

In fact, I worry most about training. Until the AI is perfect (and GPT-4 is not, as most vividly illustrated by its propensity for “hallucination” [situations in which AI simply makes things up in response to certain prompts]), there will need to be a human operator, usually an MD, to endorse or modify GPT-4’s diagnosis or recommendations, particularly since someone must accept legal liability for misfires. The challenges of automation complacency and overreliance on technology will be substantial.

How can we ensure that physicians and other clinical personnel stay alert when the AI is correct 98 percent of the time?

And how can our training programs give clinicians the skills to practice independently when the technology fails them? We’ve seen this problem in aviation. Several deadly crashes have occurred when generally reliable technologies went awry and relatively inexperienced pilots could no longer fly the plane without their digital wingman. In healthcare, given the complexity of the human condition and the systems in which we work, the technologies will fail often. Will there be a doctor or nurse available who is sufficiently skilled to grab the metaphorical wheel? I hope so, but truly don’t know.

The first decade of healthcare’s digital journey, which was dominated by the implementation of electronic health records, has been characterized by unanticipated consequences and considerable disappointment. Will future versions of GPT-4 be the killer app that finally allows healthcare to deliver new levels of quality, safety, and equity, and improve patient and provider experiences? And will it generate efficiencies that allow our healthcare system to operate at a cost that doesn’t bankrupt businesses, government, and patients? I believe the answer will be yes, but it exceeds the capacity of this particular human to predict whether these benefits will be seen in two years or twenty.

The views, opinions, and proposals expressed in this essay are those of the author and do not necessarily reflect the official policy or position of any other entity or organization, including Microsoft and OpenAI. The author is solely responsible for the accuracy and originality of the information and arguments presented in their essay. The author’s participation in the AI Anthology was voluntary, and no incentives or compensation were provided.

Wachter, R. M. (2023, May 30). Will GPT-4 Be the Technology That Finally Transforms Healthcare for the Better? In: Eric Horvitz (ed.), AI Anthology. https://unlocked.microsoft.com/ai-anthology/robert-wachter


Robert M. Wachter, M.D., is professor and chair of the Department of Medicine at the University of California, San Francisco. He is the author of The Digital Doctor: Hope, Hype and Harm at the Dawn of Medicine’s Computer Age, and advises several companies in the digital healthcare space.
