What Is Retrieval-Augmented Generation?
With RAG, an LLM can ground its output in retrieved information before generating a response, says Tehsin Syed, Amazon Web Services’ general manager of health AI. This is especially valuable when a user is asking specific or technical questions.
“An authoritative external knowledge base is usually more current than the model’s training data, which is a key advantage,” he says. “For healthcare, this means LLMs can tap into the latest medical research, clinical guidelines and patient data to provide more accurate and contextually relevant responses.”
Along with improving accuracy, RAG can help organizations address concerns about bias in AI models that misrepresent risk and underestimate the need for care in minority populations. While standalone LLMs rely solely on pretrained knowledge, Syed explains, RAG allows organizations to “curate more diverse, representative knowledge bases” and enables users to trace responses back to the source of information.
It’s important to note that RAG goes beyond simply fine-tuning an existing LLM. Fine-tuning adapts a model to a specific domain and requires an extensive feedback loop of inputting additional training material and generating new questions and answers, Stroum says. Not surprisingly, that can be time-intensive and expensive.
RAG, on the other hand, doesn’t change the model but “augments its capabilities by retrieving and incorporating external information at runtime,” Syed says. “This approach offers greater flexibility, allowing the model to access the most current information without needing to be retrained.”
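The retrieve-then-generate pattern Syed describes can be sketched in a few lines. This is an illustrative toy, not any vendor’s API: the sample documents, the word-overlap scoring (a stand-in for the vector similarity search a production system would use) and the prompt format are all assumptions.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query -- a crude
    stand-in for embedding-based similarity search."""
    q_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved passages so the model answers from
    current sources at runtime rather than from training data alone."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge base entries for illustration only.
knowledge_base = [
    "2024 guideline: metformin remains first-line therapy for type 2 diabetes.",
    "Plan A copay for an office visit is $25 in most counties.",
    "Flu vaccination is recommended annually for adults over 65.",
]

print(build_prompt("first-line therapy for type 2 diabetes", knowledge_base))
```

Because the knowledge base lives outside the model, updating it is a data operation, not a retraining job, which is the flexibility Syed points to.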
Benefits of RAG for Healthcare Institutions
By pulling in up-to-date information, RAG is meant to address the limitations of more traditional LLMs that don’t have access to the latest medical research, Syed says. Use cases for integrating the Amazon Comprehend Medical natural language processing service into a RAG workflow include automating medical coding, generating clinical summaries, analyzing medication side effects and deploying decision support systems.
Internally, RAG makes it possible for LLMs to pull in patient records and other confidential sources that general-purpose LLMs were never trained on. Health systems can use RAG to create highly personalized patient education materials, Syed notes.
This highlights a key benefit of RAG, which is its ability to navigate unstructured data. Stroum points to evidence of coverage documents; an insurer operating in multiple states can easily have hundreds of these. With RAG, it’s possible to prompt a model to pull up the copay for a specific procedure under a specific plan in a specific county.
RAG is also a significant step forward from traditional search functionality, which struggles to recognize that differences in verb tense (such as ran versus run) shouldn’t necessarily change search results.
“Today’s models can see what you’re asking, and they’re more forgiving,” Stroum says.
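The verb-tense problem Stroum alludes to is easy to demonstrate. In this toy sketch, exact keyword matching misses a clinical note that uses a different tense, while a normalized comparison catches it; the tiny lemma table is an illustrative assumption standing in for the semantic matching a real model performs.

```python
# Hypothetical lemma table: maps inflected forms to a base form.
LEMMAS = {"ran": "run", "running": "run", "runs": "run"}

def normalize(word: str) -> str:
    return LEMMAS.get(word.lower(), word.lower())

def lexical_match(term: str, text: str) -> bool:
    """Traditional exact-term search."""
    return term.lower() in text.lower().split()

def normalized_match(term: str, text: str) -> bool:
    """Tense-tolerant search, standing in for semantic retrieval."""
    return normalize(term) in {normalize(w) for w in text.split()}

note = "Patient ran two miles daily before the injury"
print(lexical_match("run", note))     # False: "run" never appears verbatim
print(normalized_match("run", note))  # True: "ran" normalizes to "run"
```

A RAG system built on embeddings generalizes this far beyond verb tense, matching on meaning rather than spelling, which is why the models are “more forgiving,” as Stroum puts it.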
As a result, RAG is more accessible to less tech-savvy end users, who might otherwise get frustrated. It also allows for more in-depth prompts. An HR team, for example, can search a repository of resumes for candidates with at least three years of experience in Current Procedural Terminology coding. “RAG still uses the base expectations of the language model, but now you can modulate the level of the conversation,” Stroum adds.