Skip to content
Back to AI Security Blue Team
RS

RAG Security Fundamentals

Learn why retrieval changes the AI security model, how untrusted documents reach the model, and what defenders must review before treating retrieved content as useful context.

60 minAI Security Blue Teameasy100 XP

Listen to hear this room section by section.

1

Task 1

What RAG Adds To The System

A retrieval-augmented system takes a user request, searches a knowledge source, and then places selected documents or snippets into the model context before the model answers. That can make the assistant more useful because the model no longer depends only on what it learned during training.

For defenders, the important point is that RAG introduces another path through which text reaches the model. A secure AI application is not just the model and the user prompt. It is also the retriever, the document store, the ranking logic, and the policy that decides how retrieved material is presented.

That means defenders have to review not only whether the answer is helpful, but also whether the retrieval path allows hostile, irrelevant, stale, or sensitive content to influence the system.

2

Task 2

Why Retrieved Content Is Untrusted By Default

A common beginner mistake is to treat retrieved text like trusted system knowledge simply because it came from the application's own search layer. In reality, many documents in a knowledge base begin as human-authored content, imported records, uploaded files, tickets, emails, web content, or synced internal documents.

That means retrieved content may be wrong, stale, adversarial, overly sensitive, or written in a way that tries to influence the model's behavior. Even when the content is not malicious, it can still be unsafe to treat it like a higher-priority instruction source.

The safer mental model is this: retrieval provides reference material, not policy authority. The application should preserve that distinction clearly.

3

Task 3

Common RAG Failure Paths

RAG failures often happen when the system mixes useful retrieval with too much trust. A retrieved document may contain hidden instructions, misleading claims, privileged data, or text that only makes sense in a narrow internal workflow. If the assistant treats that content like guidance it must obey, the system may become easier to manipulate.

Other failures happen earlier in the pipeline. Defenders should ask whether the corpus includes documents that should not have been indexed, whether search results are scoped correctly for the user, whether the model can cite stale or superseded procedures, and whether retrieved content is labeled with enough context for the application to treat it safely.

In short, RAG risk is not only "can a document inject the prompt?" It is also "did the system retrieve the right thing, for the right user, with the right trust handling?"

4

Task 4

Defensive Priorities For Retrieval

Blue teams usually defend RAG systems by combining several controls. They reduce what can be indexed, improve metadata and provenance, scope retrieval results to the right tenant or role, preserve source labels, prevent retrieved content from overriding policy, and add monitoring around suspicious retrieval patterns.

Some controls happen before retrieval, such as ingestion review and document classification. Some happen during retrieval, such as access filtering, ranking rules, and scope checks. Some happen after retrieval, such as context separation, output controls, and approval gates.

The key beginner lesson is that RAG is safer when retrieval is treated as one layer in a system, not a magical source of truth.

5

Task 5

Practical

Name one reason defenders should treat retrieved content as untrusted by default.

Enter one risk or reason retrieved content should not automatically be treated like policy.

6

Task 6

Pipeline Check

Name two parts of a RAG system a defender should review beyond the model itself.

Enter two retrieval components or stages that matter for defense.

7

Task 7

Control Check

Name two controls that make RAG systems safer.

Enter two controls defenders use to reduce retrieval risk.

Ready To Move On?

Up next: Document Trust, Provenance, and Ingestion Hygiene