
Mar 24 2025
Artificial Intelligence

5 Questions About Retrieval-Augmented Generation

This framework can help healthcare organizations refine their use of large language models.

Artificial intelligence continues to be a major focus in healthcare. As organizations deploy their own solutions to support clinicians and guide patients through their care journeys, they’ll find that retrieval-augmented generation may be key to improving tools built on large language models.


1. What Is Retrieval-Augmented Generation?

RAG is a way to leverage already-trained large language models such as GPT-4, Gemini and Llama when building applications powered by artificial intelligence. By adding local knowledge (such as hospital policies or protocols), context (such as clinician profile information) or history (such as a patient’s clinical data), RAG augments LLMs to avoid common AI problems, such as missing organization-specific information and LLM hallucinations.

2. How Does RAG Work?

RAG essentially “wraps” an LLM by adding relevant information to the prompt (query) that is sent to the LLM. For example, suppose a clinician asks, “Should I increase the dosage of this drug for this patient?” With RAG, the question is first processed to determine what kind of query it is and which specifics it involves. The RAG tool might then retrieve the hospital protocol for the drug, the manufacturer’s recommendations, the patient’s history and recent lab results, sending all of it to the LLM along with the clinician’s question. This gives the LLM the local knowledge, context and history it needs to answer the question. All of this is invisible to the clinician, because the RAG wrapper does the work of choosing what to send to the LLM.
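To make the pattern concrete, here is a minimal sketch in Python. The data sources, patient records and call_llm() function are all hypothetical stand-ins; a real deployment would query the electronic health record, policy repositories and a production LLM endpoint.

```python
# A minimal sketch of the RAG "wrapper" described above. Every data source
# and call_llm() are hypothetical stand-ins, not a real hospital system.

PROTOCOLS = {"warfarin": "Hospital protocol: dose warfarin to a target INR of 2-3."}
PATIENT_HISTORY = {"pt-123": "Age 67, on warfarin 5 mg daily, CKD stage 2."}
RECENT_LABS = {"pt-123": "INR 3.4 (2025-03-20), eGFR 55."}

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (e.g., a hosted GPT-4 endpoint)."""
    return f"[LLM answer based on a {len(prompt)}-character augmented prompt]"

def retrieve_context(question: str, patient_id: str) -> list[str]:
    """Gather local knowledge, context and history relevant to the question."""
    docs = [text for drug, text in PROTOCOLS.items() if drug in question.lower()]
    docs.append(PATIENT_HISTORY.get(patient_id, ""))
    docs.append(RECENT_LABS.get(patient_id, ""))
    return [d for d in docs if d]

def answer_with_rag(question: str, patient_id: str) -> str:
    context = retrieve_context(question, patient_id)
    # The "wrapper" step: prepend retrieved material to the clinician's question.
    prompt = ("Use this reference material to answer the question.\n\n"
              + "\n\n".join(context)
              + f"\n\nQuestion: {question}")
    return call_llm(prompt)  # the clinician never sees the assembled prompt

print(answer_with_rag("Should I increase the warfarin dosage?", "pt-123"))
```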


3. How Does RAG Compare to Fine-Tuning?

Fine-tuning an existing LLM bakes additional information, often private data, into the model itself, which can make the LLM better at specific tasks. RAG instead supplies up-to-date, contextually useful information at the moment the LLM is queried. Patient data isn’t stored in the model, yet the model always has the latest information it needs to help answer questions, which reduces the risk of confidential data leaking through the model itself.
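The difference is easy to see by building on the hypothetical sketch from question 2: update a record in the data source and the very next query reflects it, with no retraining. The new lab value below is, again, purely illustrative.

```python
# Builds on the hypothetical sketch above: a new lab result arrives, and the
# next RAG query is augmented with it immediately. A fine-tuned model would
# need another training run to reflect the same change.

RECENT_LABS["pt-123"] = "INR 2.1 (2025-03-24), eGFR 57."
print(answer_with_rag("Should I increase the warfarin dosage?", "pt-123"))
```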

4. What Are the Benefits of Using RAG?

RAG extends the value of LLMs by giving the model additional information: local, relevant documents and protocols as well as real-time information from clinical databases. Clinicians and researchers can ask questions based on what is happening this minute, not when the LLM was trained. This additional information lets the LLM deliver answers that are both more relevant and more accurate. IT teams can also build in higher levels of security and tighter access controls, only feeding in information that the person asking the question is allowed to know.
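One way to picture those tighter access controls is sketched below, under the assumption of a simple role-based permission table (the roles and documents are hypothetical): retrieved documents are filtered against the requester’s permissions before anything reaches the LLM.

```python
# A minimal sketch of access-controlled retrieval: documents the requester
# isn't cleared for are dropped before the prompt is assembled. The roles
# and permission table are hypothetical.

DOC_ACCESS = {
    "hospital_protocol": {"physician", "pharmacist", "nurse"},
    "patient_history":   {"physician", "nurse"},
    "recent_labs":       {"physician"},
}

def filter_for_role(docs: dict[str, str], role: str) -> list[str]:
    """Keep only the documents this role is allowed to see."""
    return [text for name, text in docs.items()
            if role in DOC_ACCESS.get(name, set())]

retrieved = {
    "hospital_protocol": "Warfarin dosing protocol ...",
    "patient_history": "Age 67, on warfarin 5 mg daily.",
    "recent_labs": "INR 3.4 (2025-03-20).",
}
print(filter_for_role(retrieved, "nurse"))  # recent labs never reach the LLM
```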

5. What Are the Challenges of RAG?

RAG applications must preprocess user prompts to decide what additional information to send along with them. That can be a difficult job, and there’s a chance that the RAG application will retrieve and send the wrong data. Also, giving an LLM additional information doesn’t guarantee that it will properly interpret that data and incorporate it into its response.
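A toy example of that preprocessing step, and of how it can go wrong: the keyword router below (purely illustrative; production systems typically use a trained classifier) picks data sources from the wording of the question, and a rephrased question slips past it.

```python
# A hypothetical keyword router standing in for prompt preprocessing. It
# decides which data sources to query, and it shows how fragile that
# decision can be: a rephrased question is routed to the wrong sources.

ROUTES = {
    "dosage": ["hospital_protocol", "recent_labs"],
    "allergy": ["patient_history"],
    "policy": ["hospital_protocol"],
}

def route_query(question: str) -> list[str]:
    """Pick data sources by keyword match; fall back to protocols only."""
    q = question.lower()
    hits = [src for kw, srcs in ROUTES.items() if kw in q for src in srcs]
    return sorted(set(hits)) or ["hospital_protocol"]

print(route_query("Should I increase the dosage?"))
# ['hospital_protocol', 'recent_labs']
print(route_query("Can I give this patient more of the drug?"))
# ['hospital_protocol'] -- the labs the LLM would need are never retrieved
```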
