I Poisoned a RAG Knowledge Base in Three Minutes: Here Is Why Every Company Using AI Should Be Terrified

By Alex Chen · 6 min read

Three fabricated documents. Three minutes. Zero exploits. That is all it took for a security researcher to make an AI system confidently report that a company's quarterly revenue was $8.3 million, when the real number was $24.7 million with a $6.5 million profit.

No jailbreak. No prompt injection. No hacking in the traditional sense. Just three carefully worded documents slipped into the knowledge base that the AI uses to answer questions. The AI did exactly what it was designed to do: it retrieved the most relevant documents and summarized them. It just happened that the most "relevant" documents were poisoned.

I have been writing about cybersecurity for a while, and I genuinely lost sleep over this one. Not because the attack is clever; it is embarrassingly simple. But because the defense is genuinely hard, and almost nobody in production is doing it.

What RAG Is and Why You Should Care

RAG stands for Retrieval-Augmented Generation. If that means nothing to you, here is the plain-English version: it is the system that lets an AI chatbot answer questions about your company's data. Instead of relying solely on what the AI was trained on, RAG pulls relevant documents from a knowledge base (your internal wiki, your financial reports, your policy documents) and feeds them to the AI alongside the user's question.
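
That retrieve-then-answer loop can be sketched in a few lines. This is a toy illustration, not any particular framework: the bag-of-words `embed()` stands in for a real embedding model, and all the function names here are made up for the example.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term count (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Hand the retrieved documents to the model alongside the question."""
    context = "\n---\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The model never sees the whole filing cabinet, only whatever `retrieve()` ranks highest, which is exactly the property the attack exploits.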

My friend Greg, who runs IT at a mid-sized logistics company, described it this way: "It is like giving the AI a filing cabinet and saying 'look stuff up before you answer.'" That is pretty accurate, honestly.

The problem is: what happens when someone puts fake files in the filing cabinet?

The Attack: Simpler Than You Think

Security researcher Amin Rezaei published a full lab demonstrating this attack, complete with code you can run on your own laptop. No GPU needed. No cloud. A MacBook Pro and about ten minutes of setup.

The knowledge base started with five clean company documents: a travel policy, IT security policy, Q4 2025 financials showing $24.7M revenue and $6.5M profit, employee benefits, and an API rate-limiting config. Standard corporate stuff.

Then three poisoned documents were added:

Document 1: The "CFO-Approved Correction"

A fake document claiming to be a board update with "corrected figures": revenue restated to $8.3M, net loss of $13.8M. It used authority language: "CFO Office," "Chief Accounting Officer," "supersedes all previous reports."

Document 2: The "Regulatory Notice"

A fabricated SEC inquiry notice referencing both the real number ($24.7M) and the fake one ($8.3M), framing the real number as "originally reported," implying it was the error.

Document 3: The "Emergency Board Communication"

A fake internal memo discussing workforce reduction plans and preliminary acquisition discussions in response to the "financial restatement."

After adding these three documents, the researcher asked the AI: "How is the company doing financially?"

The AI confidently reported $8.3M revenue, down 47% year-over-year, with restructuring underway. Across 20 independent test runs, the attack succeeded consistently.

Why This Works: The Math Behind the Madness

This is not a bug. It is how RAG systems are designed to work, and that is what makes it so dangerous.

RAG systems retrieve documents based on similarity scores: essentially, "how closely does this document match the question?" The poisoned documents were crafted to contain the exact vocabulary a financial query would trigger: "Q4 2025," "revenue," "financial results," "profit," "loss." They score higher on relevance than the legitimate financial report because they contain more of these terms, more prominently placed.
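
To see why vocabulary stuffing wins the relevance contest, here is a toy illustration using bag-of-words cosine similarity in place of a real embedding model; the document texts are condensed stand-ins for the lab's files, not the actual documents.

```python
from collections import Counter
import math

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (stand-in for embedding similarity)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

query = "q4 2025 revenue financial results"

# The real report buries the figures among other content; the poison
# repeats the query's vocabulary as often and as prominently as possible.
legit = ("q4 2025 financial report revenue 24.7m profit 6.5m "
         "with travel policy appendix and benefits summary")
poison = ("corrected q4 2025 financial results revenue restated "
          "q4 2025 revenue 8.3m financial results supersede prior revenue figures")

print(cosine_sim(query, legit), cosine_sim(query, poison))  # poison scores higher
```

Nothing here is malicious code; the poisoned text wins purely on term overlap.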

A paper called PoisonedRAG (Zou et al., USENIX Security 2025) formalized this mathematically. For the attack to succeed, two conditions must be met:

  • Retrieval Condition: The poisoned document must score higher cosine similarity to the target query than the legitimate document it is displacing
  • Generation Condition: Once retrieved, the content must cause the AI to produce the attacker's desired answer

The paper demonstrated a 90% success rate against knowledge bases containing millions of documents. Not five. Millions.

My colleague Sandra, who works in AI safety, put it bluntly when I showed her the paper: "So the AI is basically doing what we told it to do: trust the most relevant documents. We just never considered that 'most relevant' and 'most trustworthy' are completely different things."

Exactly.

Real-World Implications That Should Scare You

Scenario 1: Financial Manipulation

A company uses RAG to let executives query internal financial data. An attacker with write access to the knowledge base inserts fabricated financial documents. The CEO asks the AI about quarterly performance and gets manipulated numbers. Investment decisions, board presentations, earnings guidance: all based on poisoned data.

Scenario 2: Medical Misinformation

A healthcare organization uses RAG to help clinicians access treatment guidelines. Poisoned documents could alter drug dosage recommendations or contraindication warnings. The AI would present the falsified information with the same confidence as legitimate clinical data.

Scenario 3: Legal Research Corruption

A law firm uses RAG to search case law and precedents. Fabricated case summaries or altered legal opinions could lead attorneys to cite non-existent precedents. We have already seen lawyers cite AI-hallucinated cases. Poisoned RAG makes it worse because the AI is not hallucinating; it is accurately summarizing fake sources.

Scenario 4: Customer Support Sabotage

A company uses RAG-powered chatbots for customer service. Poisoned documents about product specifications, return policies, or safety warnings could cause the chatbot to give customers incorrect, or even dangerous, information.

Who Has Access to Your Knowledge Base?

Here is the question that kept me up at night after reading this research: in your organization, who can add documents to the knowledge bases your AI systems use?

In many companies, the answer is "a lot of people." Knowledge bases are often populated from shared drives, wikis, Confluence spaces, SharePoint libraries: places where document access is broadly permissioned. The attack does not require admin access. It requires the ability to add a document.

And it gets worse. Many RAG systems automatically ingest documents from external sources: partner portals, vendor documentation, regulatory feeds. An attacker who compromises any upstream source can poison your AI without ever touching your systems directly.

Tom, my go-to for infrastructure security takes, summed it up: "We spent years locking down database access. Now we are building AI systems that treat a shared Google Drive as a database. And nobody is asking who has write access to the Drive folder."

Defenses That Actually Work (and One That Does Not)

What Does NOT Work: Content Filtering

You cannot filter out poisoned documents by scanning for malicious content. The documents are not malicious in the traditional sense. They contain no code, no exploits, no payloads. They are just... documents. Well-written, authoritative-sounding documents that happen to contain false information.

What Helps: Document Provenance and Access Control

Treat your RAG knowledge base like a database, not a file share. Every document should have a verified source, a timestamp, and an access control list. Documents from unverified sources should be flagged or quarantined.
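
A minimal sketch of what provenance-gated ingestion could look like. The `KnowledgeBase` class, the allowlist names, and the quarantine policy are all illustrative assumptions, not any real product's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical allowlist of systems permitted to feed the knowledge base.
TRUSTED_SOURCES = {"finance-erp", "hr-system", "policy-repo"}

@dataclass
class Document:
    text: str
    source: str      # where the document came from
    added_by: str    # who added it
    added_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class KnowledgeBase:
    def __init__(self):
        self.index: list[Document] = []       # searchable documents
        self.quarantine: list[Document] = []  # held for human review

    def ingest(self, doc: Document) -> bool:
        """Index only documents from a verified source; quarantine the rest."""
        if doc.source in TRUSTED_SOURCES:
            self.index.append(doc)
            return True
        self.quarantine.append(doc)
        return False
```

The point is the shape, not the code: every document carries a source, an author, and a timestamp, and anything outside the allowlist waits for a human instead of going straight into retrieval.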

What Helps: Retrieval Diversity

Instead of returning only the top-k most similar documents, implement retrieval strategies that ensure source diversity. If three of the top five results are from the same unverified source, that should trigger a warning.
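
One possible shape for such a check, with an assumed (document, source) result format and an arbitrary two-per-source threshold:

```python
from collections import Counter

def diversity_warning(results: list[tuple[str, str]],
                      max_from_one_source: int = 2) -> bool:
    """Warn when too many top-k results come from a single source.

    results: (document_text, source_name) pairs as returned by retrieval.
    """
    counts = Counter(source for _, source in results)
    return any(n > max_from_one_source for n in counts.values())
```

A warning here does not prove poisoning, but a single unverified source dominating the top results is exactly the retrieval signature the attack produces.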

What Helps: Citation and Transparency

The AI should always cite which documents it used to generate an answer. Users should be able to click through to the source document and verify it.
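
A sketch of what a citation-carrying response could look like; `generate()` is a stub standing in for the actual LLM call, and the field names are invented for the example.

```python
def generate(query: str, context: str) -> str:
    """Stub standing in for the real LLM call."""
    return "(model answer grounded in the supplied context)"

def answer_with_citations(query: str, retrieved: list[dict]) -> dict:
    """Return the answer together with the documents it was built from."""
    context = "\n---\n".join(d["text"] for d in retrieved)
    return {
        "answer": generate(query, context),
        "citations": [{"source": d["source"], "url": d["url"]} for d in retrieved],
    }
```

Surfacing the sources does not stop the attack, but it gives a skeptical user a path to notice that the "CFO-approved correction" lives in a folder the CFO has never touched.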

What Helps: Anomaly Detection

Monitor your knowledge base for unusual document additions, especially documents that contain financial figures, policy changes, or override language ("corrected," "supersedes," "replaces").
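
A crude version of that monitor can be a pattern match at ingest time. The phrase list below simply mirrors the override language from the poisoned documents; a production system would need far more than regexes, but even this catches the lab's attack texts.

```python
import re

# Override language of the kind the poisoned documents used.
OVERRIDE_PATTERNS = [
    r"\bsupersedes?\b", r"\bcorrected\b", r"\breplaces?\b",
    r"\brestat(?:ed|ement)\b",
]
# Rough match for dollar figures like "$8.3M" or "$24.7 million".
MONEY = re.compile(r"\$\s?\d[\d,.]*\s?(?:million|billion|[mb])\b", re.I)

def suspicious(text: str) -> bool:
    """Flag documents that combine financial figures with override language."""
    has_override = any(re.search(p, text, re.I) for p in OVERRIDE_PATTERNS)
    return has_override and bool(MONEY.search(text))
```

Flagged documents go to a human reviewer, not straight into the index.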

The Bottom Line

RAG document poisoning is not a theoretical attack. The tools to execute it are published. The math is proven. If your company is deploying AI systems that pull from a knowledge base, ask one question today: who can add documents to that knowledge base, and would you notice if they added something fake?

If the answer to the second part is "probably not," you have work to do.
