

Apr 15 2025
Artificial Intelligence

Prompt Engineering in Healthcare: Best Practices, Strategies and Trends

Guiding a large language model via specific prompts is critical for producing desired outputs and ensuring healthcare systems make the most out of generative artificial intelligence.

Artificial intelligence can generate significant value for healthcare organizations, and large language models can be useful tools for improving administrative efficiencies and enhancing care quality. However, since artificial superintelligence is still the stuff of science fiction, LLMs need proper guidance in order to be useful to medical professionals, researchers and others in the healthcare industry.

An AI-powered system must learn how to interpret and make sense of its training data, and users must know how to craft their questions to solicit the answers they need from the tool. That’s where prompt engineering comes in.

“Health systems are increasingly turning to AI solutions to ease burdens, expand care access and accelerate clinical insights,” says Kenneth Harper, general manager of the Dragon product portfolio at Microsoft. “Placing strong emphasis on prompt engineering ensures the healthcare industry can harness the full potential of AI to improve patient outcomes and streamline operations. It is a key piece in driving success and lasting, positive impact through AI.”


What Is Prompt Engineering in Healthcare?

Prompt engineering is the process of telling an AI solution what to do and how to do it. Using precise and effective natural language prompts, users provide the LLM with a set of instructions about how to complete the task to generate accurate and useful answers. This can include telling the LLM the type of sources to reference and the format in which the user wants the information presented. 

Google notes that “prompt engineering is the art and science of designing and optimizing prompts to guide AI models, particularly LLMs, towards generating the desired responses.” Amazon Web Services notes that prompt engineers “choose the most appropriate formats, phrases, words, and symbols that guide the AI” and that the process requires a combination of “creativity plus trial and error” to achieve intended outcomes.

What Are Key Best Practices for AI Prompt Engineering in Healthcare?

Here are some prompt engineering best practices to keep in mind:

Prompts Must Be Specific

AI prompts need to be very specific to avoid irrelevant responses. Use clear and concise language and tell the LLM the desired response format, such as a summary, chart or list. For example, a physician could ask the LLM to “summarize three possible treatment plans for a 55-year-old male diagnosed with Type 2 diabetes, and limit each summary to 300 words.”

“In healthcare, you don’t want the LLM sourcing Wikipedia or an entertainment magazine for diagnosis recommendations,” says Dr. Tim Wetherill, chief clinical officer at Machinify. “You can instruct the LLM to use only peer-reviewed sources, and to share whether there are any flagged concerns about the literature it reviews.”
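A specific prompt like the one above can be assembled programmatically so that every request carries the same task, format, length and source constraints. The sketch below is illustrative: the helper name and prompt wording are assumptions, not part of any particular product.

```python
def build_prompt(task, response_format, word_limit, source_rule):
    """Assemble a specific prompt: task, response format, length and source constraints."""
    return (
        f"{task}\n"
        f"Format the response as {response_format}, limited to {word_limit} words each.\n"
        f"{source_rule}"
    )

prompt = build_prompt(
    task="Summarize three possible treatment plans for a 55-year-old male "
         "diagnosed with Type 2 diabetes.",
    response_format="a numbered list of summaries",
    word_limit=300,
    source_rule="Use only peer-reviewed sources, and flag any concerns "
                "about the literature you cite.",
)
print(prompt)
```

Templating the constraints this way keeps them from being silently dropped when clinicians rephrase the task itself.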

Provide Relevant Context With Follow-Up Prompts

Follow-up prompts provide more context and help generate more specific responses. A follow-up to the prompt about treatments for a patient with diabetes could be, “The patient is immunocompromised due to a recent organ transplant. Adjust the treatment plan to account for potential drug interactions and infection risk.” 

Wetherill says when he is experimenting with drafting prompts, “one of the things I do is I tell the LLM to ask me questions or to make suggestions that will improve the output.” He describes prompt engineering as “half art and half science. It’s not a one-step process. You have to be willing to put in the time to get value.” 
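In chat-style LLM interfaces, follow-up prompts work because each new message inherits the context of the conversation so far. The message list below is a minimal sketch of that pattern, using the article's diabetes example plus Wetherill's "ask me questions" technique; the role-tagged dictionary format mirrors common chat APIs but is shown here as plain data, not a call to any specific service.

```python
# A conversation is a list of role-tagged messages; each follow-up prompt
# inherits the context of everything that came before it.
messages = [
    {"role": "user", "content": "Summarize three possible treatment plans "
     "for a 55-year-old male diagnosed with Type 2 diabetes."},
    # (the model's reply would be appended here with role "assistant")
    {"role": "user", "content": "The patient is immunocompromised due to a "
     "recent organ transplant. Adjust the treatment plan to account for "
     "potential drug interactions and infection risk."},
    {"role": "user", "content": "Before answering, ask me any questions "
     "or make suggestions that would improve the output."},
]
```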

EXPLORE: Today's AI involves data governance, LLMs and a quest to avoid bias and inaccuracy.

Give Examples of Desired Outputs

In prompt engineering, users can generate desired outputs by demonstrating what a proper response looks like. The AI learns from the provided examples and can use that knowledge to continually improve outputs. Negative examples can also show the AI which outputs to avoid.

“The more specific we can be, the less we leave the LLM to infer what to do in a way that might be surprising for the end user,” says Jason Kim, a prompt engineer and technical staff member at Anthropic, which developed Claude AI. “We have classic examples for Claude to follow that stipulate the format and the nature of the process that we want it to build from.”
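This few-shot pattern amounts to prepending worked input/output pairs, including a negative example, ahead of the real query. The sketch below is a generic illustration; the shorthand-expansion task and the example wording are hypothetical, not clinical guidance or any vendor's format.

```python
def few_shot_prompt(instruction, examples, query):
    """Prepend worked input/output pairs so the model can infer
    the expected format before seeing the real query."""
    shots = "\n\n".join(
        f"Input: {ex['input']}\nOutput: {ex['output']}" for ex in examples
    )
    return f"{instruction}\n\n{shots}\n\nInput: {query}\nOutput:"

# Hypothetical examples; the second is a negative example showing what to avoid.
examples = [
    {"input": "Pt c/o HA x3 days, no photophobia.",
     "output": "Patient reports a headache for three days with no photophobia."},
    {"input": "Expand on the likely cause of the headache.",
     "output": "UNSUPPORTED BY NOTE"},
]
prompt = few_shot_prompt(
    "Rewrite the clinical shorthand below in plain language. "
    "If the note does not support a claim, answer UNSUPPORTED BY NOTE.",
    examples,
    "Pt denies SOB, amb without assist.",
)
```

Ending the prompt with a bare "Output:" cues the model to continue in the demonstrated format rather than free-associate.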

Consider Feedback From Users

As a healthcare organization incorporates an LLM into its system, prompt engineering best practices may evolve based on how the AI performs. To analyze how the LLM is working, “we get evaluations from doctors and researchers,” Kim says. “With feedback, you’re able to tweak and update the design of the prompts.”

“Prompt engineering in healthcare should involve continuous testing, evaluation and improvement based on feedback from performance metrics and medical professionals,” Harper adds. “It is important for the output to be tested and validated in real clinical settings prior to being deployed at scale.”
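The feedback loop Harper and Kim describe can be as simple as collecting reviewer ratings per prompt template and flagging templates that fall below a quality bar. The helper and the 1-to-5 rating scale below are hypothetical, a minimal sketch of that evaluation step rather than a production metric.

```python
def needs_revision(ratings, threshold=4.0):
    """Flag a prompt template for redesign when the average reviewer
    rating (hypothetical 1-5 scale) falls below the threshold."""
    return sum(ratings) / len(ratings) < threshold
```

In practice the ratings would come from clinicians reviewing real outputs, and a flagged template would be tweaked and re-evaluated before being deployed at scale.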


Do Different LLMs Affect Prompt Engineering Strategies?

Trial and error is a critical part of prompt engineering because not all large language models perform the same way. Users should experiment with the format of their AI prompts to determine whether the model responds better to direct commands or conversational questions. 

The LLM may prefer more structured inputs. To use the example of creating a treatment plan for the patient with diabetes, the LLM may function better if given a bulleted list that includes information such as the patient’s diagnosis, comorbidities and medications. 
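Turning the patient details into a bulleted block is a mechanical step. The sketch below shows one way to do it; the patient fields and values are hypothetical placeholders, and the exact structure that works best will vary by model, which is where the trial and error comes in.

```python
# Hypothetical patient fields; some models respond more reliably to a
# structured bulleted input than to the same facts buried in prose.
patient = {
    "Diagnosis": "Type 2 diabetes",
    "Comorbidities": "hypertension; stage 2 chronic kidney disease",
    "Medications": "metformin; lisinopril",
}
bullets = "\n".join(f"- {field}: {value}" for field, value in patient.items())
prompt = "Summarize three possible treatment plans for this patient:\n" + bullets
print(prompt)
```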

“You have to understand that each model has its own behavior,” Wetherill says. “Learning how the specific LLM works, and going back and forth and tweaking the prompts, that’s part of the process.” 

A more general LLM, such as ChatGPT, will also perform differently compared with a model designed specifically for medical professionals or researchers, such as Google’s Med-PaLM or Microsoft’s BioGPT.

“IT professionals should tailor their approaches based on the specific LLM strengths and considerations,” Harper recommends. “Building AI prompts on top of a fine-tuned clinical LLM will yield different results than building AI prompts on top of a general-purpose LLM.”

DISCOVER: A qualified technology partner can help healthcare organizations drive business results with AI.

What Is the Future of Prompt Engineering in Healthcare AI?

“Prompt engineering tools are already becoming increasingly more sophisticated and will allow for more complex actions using contextual memory,” Harper says. He predicts that as the technology evolves, “not only will AI prompting trigger an output but also a set of actions to be carried out.” 

Kim expects that as LLMs become smarter, there will be more of an emphasis on explaining how a model arrived at particular answers. “I think the focus will shift to include not only completeness and intelligence, but also a higher degree of traceability or auditability.”

Wetherill also advocates for medical professionals having a more direct role in training AI models. “You need prompt engineers who understand the content, and that’s why it’s really important for healthcare professionals and the data scientists to collaborate,” he says.
