RAG vs fine-tuning: which one your problem actually needs

21 January 2026 By LiverpoolAI Editorial 4 min read

They solve different problems. Pick the wrong one and you pay for it in cost, accuracy or both.

Retrieval-augmented generation versus fine-tuning is the most common technical decision we make on the way into a Liverpool engagement. The honest answer is that most problems are retrieval problems — but the cases where fine-tuning is the right tool are real, and worth knowing.

This piece is the written version of that conversation, so you can have it with your team before you talk to a consultancy about which one to build.

What RAG actually is

Retrieval-augmented generation, in production form, is a system that takes a user question, retrieves relevant documents from a corpus you control (your contracts, your guidelines, your support documentation), inserts those documents into the prompt, and lets a base language model generate an answer grounded in the retrieved content. Every answer cites the documents it drew from.

The base model never "knows" your corpus. It reads the retrieved documents at request time and synthesises an answer. This means:

You can update the corpus without retraining anything.
The model can cite its sources, because the source is right there in its prompt.
The system can decline questions that fall outside the corpus, because retrieval will surface nothing that clears a relevance threshold.
The cost per request is higher (you are paying for retrieved tokens) but the build cost is dramatically lower than fine-tuning.
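The request-time flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `Doc` type, the word-overlap scoring, the `min_score` threshold and the sample policy documents are all assumptions made for the example. A real system would typically use an embedding model and a vector index for retrieval, and would send the assembled prompt to a language model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    doc_id: str
    text: str


def tokens(text: str) -> set[str]:
    # Lowercased word set; a toy stand-in for a real embedding model.
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(question: str, corpus: list[Doc], k: int = 2, min_score: float = 0.2) -> list[Doc]:
    # Score each document by the share of question words it contains,
    # keep the top k, and drop anything under the relevance threshold.
    q = tokens(question)
    if not q:
        return []
    scored = [(len(q & tokens(d.text)) / len(q), d) for d in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score >= min_score]


def build_prompt(question: str, docs: list[Doc]) -> Optional[str]:
    # None signals "outside the corpus": the caller should decline to answer.
    if not docs:
        return None
    sources = "\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{sources}\n"
        f"Question: {question}"
    )


# Hypothetical two-document corpus.
corpus = [
    Doc("policy-7", "Refunds are accepted within 30 days of purchase."),
    Doc("policy-9", "Standard delivery takes three working days."),
]

question = "How many days do I have for refunds?"
prompt = build_prompt(question, retrieve(question, corpus))
print(prompt)
```

Because the document ids travel into the prompt alongside the text, the model has something concrete to cite, and updating the corpus only means changing the `corpus` list.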

What fine-tuning actually is

Fine-tuning takes a base language model and continues training it on a domain-specific dataset, adjusting the model's weights so it specialises for that domain. The model "internalises" the training data — there is no retrieval step at request time.

Done well, fine-tuning gives a model that:

Writes in your house style or follows your structured output format more reliably.
Understands domain-specific terminology that would otherwise need to be retrieved.
Costs less per request than RAG at scale, because the prompt carries no retrieved tokens.
Cannot be updated without retraining when the underlying knowledge changes, which is the main trade-off.
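To make "a domain-specific dataset" concrete, here is a small sketch of what fine-tuning data for a structured-output task might look like. The `prompt` and `completion` field names, the example rows and the helper functions are hypothetical; the exact schema depends on the provider or training toolchain you use, so check its documentation rather than relying on this layout.

```python
import json

# Hypothetical examples: each pair shows the model an input and the exact
# output we want it to learn to produce (here, a fixed JSON shape).
examples = [
    {
        "prompt": "Classify: 'Invoice overdue by 14 days'",
        "completion": '{"category": "billing", "urgent": true}',
    },
    {
        "prompt": "Classify: 'How do I reset my password?'",
        "completion": '{"category": "account", "urgent": false}',
    },
]


def validate(rows: list[dict]) -> list[int]:
    # Return the indexes of rows whose target output is not valid JSON,
    # since a malformed target teaches the model the wrong format.
    bad = []
    for i, row in enumerate(rows):
        try:
            json.loads(row["completion"])
        except json.JSONDecodeError:
            bad.append(i)
    return bad


def to_jsonl(rows: list[dict]) -> str:
    # One JSON object per line, a common layout for training files.
    return "\n".join(json.dumps(row) for row in rows)


if not validate(examples):
    print(to_jsonl(examples))
```

Note what is absent: no documents, no facts that will go stale. The data teaches behaviour (format, style), which is why it suits the cases listed above and not changing knowledge.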

The diagnostic question

The single question that decides which one your problem actually needs:

Does the right answer depend on specific documents or data that change over time?

If yes — pricing, policies, product catalogues, internal documentation, regulatory guidance, customer records — your problem is a retrieval problem. Use RAG.
If no — house writing style, structured output format, domain language understanding, consistent persona — your problem may be a fine-tuning problem. Test both.

Most problems we are asked about are in the first category. Almost every internal-knowledge-assistant project, every customer-support copilot, every regulated drafting system is a retrieval problem.

When retrieval wins clearly

Categories where we default to RAG without much hesitation:

Internal knowledge assistants over policies, procedures, precedents, guideline libraries. The corpus changes; you want updates to propagate without retraining.
Customer-facing support copilots over product information, pricing, FAQs, account-specific data. Same reasoning — anything the assistant says about your product had better be tied to the current version of your documentation.
Document intelligence pipelines over your own document corpus — contracts, claims, KYC packs. The structure is in your data, not in any base model's weights.
Anything that needs citations. Getting a fine-tuned model to cite its sources is hard. With a RAG system, citations are built in.
Anything in a regulated context. The audit trail of "what documents did the model read and how did it answer" is much cleaner with retrieval.

When fine-tuning wins clearly

Categories where we will recommend fine-tuning, sometimes alongside retrieval:

Strict structured-output formats the base model gets wrong too often — domain-specific JSON schemas, edge-case classification tasks, output formats with hard syntax rules.
House writing style at scale — when a single fine-tuned model that writes in your voice is cheaper than running every output through a style-checking layer.
Domain language where the base model genuinely does not have the vocabulary — niche medical, legal or scientific terminology where retrieval is not enough to bridge the gap.
Very high-volume, latency-sensitive workflows where the cost of retrieval tokens at scale starts to outweigh the cost of fine-tuning.
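The volume argument in the last item can be checked with simple arithmetic. The sketch below estimates how many requests it takes before the retrieved tokens RAG pays for on every call add up to a one-off fine-tuning cost. Every number is a hypothetical placeholder, and the calculation assumes both options pay the same per-token price, which may not hold if a hosted fine-tuned model is billed at a different rate.

```python
def breakeven_requests(
    fine_tune_cost: float,
    retrieved_tokens: int,
    price_per_1k_tokens: float,
) -> float:
    # Extra cost RAG pays on each request for the retrieved context.
    retrieval_cost_per_request = retrieved_tokens / 1000 * price_per_1k_tokens
    if retrieval_cost_per_request <= 0:
        raise ValueError("retrieval cost per request must be positive")
    # Requests needed before cumulative retrieval cost equals the
    # one-off fine-tuning cost.
    return fine_tune_cost / retrieval_cost_per_request


# Hypothetical inputs, in whatever currency unit you budget in.
n = breakeven_requests(
    fine_tune_cost=5000.0,
    retrieved_tokens=2000,
    price_per_1k_tokens=0.001,
)
print(f"Break-even at roughly {n:,.0f} requests")
```

With these placeholder figures the break-even sits at about 2.5 million requests. Below that volume the retrieval tokens are the cheaper bill; above it the fine-tuning cost starts to look small, provided the knowledge involved does not change.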

When you need both

For more sophisticated systems, the right answer can be "both": a fine-tuned model that handles your structure and style, with retrieval on top for the changing factual content. This is the right call less often than people expect; we ship it for maybe one in twenty engagements. But when it is right, it is unambiguously right.

What this means for a Liverpool buyer

If a vendor or consultancy is recommending fine-tuning for what sounds like a retrieval problem, ask why. The honest answer might be "because we have a fine-tuning product to sell" rather than "because your data structure requires it". Most knowledge-assistant and document-intelligence problems we ship are retrieval problems, not fine-tuning problems.

If a vendor is recommending retrieval and you genuinely care about consistent house style or strict output formats at high volume, ask whether they have fine-tuning capability and have run the comparison.

The right answer for your specific problem is the one that comes out of the eval. Insist on the comparison being run before any committed build. We covered the broader scoping discipline in how to scope an AI project in a week.
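As a rough picture of what "running the comparison" involves, the sketch below scores two candidate systems against the same labelled questions. The eval set, the substring-match scoring and both candidate functions are hypothetical stubs with canned answers. In a real eval each candidate would call the actual RAG pipeline or fine-tuned model, and scoring would usually be stricter than a substring check.

```python
from typing import Callable

# (question, expected answer) pairs; hypothetical stand-ins for a real eval set.
eval_set = [
    ("What is the refund window?", "30 days"),
    ("How long is standard delivery?", "3 working days"),
    ("What is the warranty period?", "12 months"),
]


def accuracy(system: Callable[[str], str], cases: list[tuple[str, str]]) -> float:
    # Fraction of cases where the expected answer appears in the output.
    hits = sum(
        expected.lower() in system(question).lower()
        for question, expected in cases
    )
    return hits / len(cases)


def rag_candidate(question: str) -> str:
    # Stub: canned outputs in place of a real retrieval pipeline.
    canned = {
        "What is the refund window?": "Refunds are accepted within 30 days [policy-7].",
        "How long is standard delivery?": "Delivery takes 3 working days [policy-9].",
    }
    return canned.get(question, "I cannot find that in the documents.")


def fine_tuned_candidate(question: str) -> str:
    # Stub: canned outputs in place of a real fine-tuned model.
    canned = {"What is the refund window?": "You have 30 days to request a refund."}
    return canned.get(question, "Please contact support.")


candidates = {"rag": rag_candidate, "fine_tuned": fine_tuned_candidate}
for name, system in candidates.items():
    print(f"{name}: {accuracy(system, eval_set):.2f}")
```

The scores printed here say nothing about either approach, since the answers are canned. The point is the shape: one fixed set of cases, one scoring function, every candidate measured the same way before anyone commits to a build.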

If you would like to talk through a specific RAG vs fine-tuning decision with us, book a 30-minute discovery call.

Related reading

What production-ready AI actually means: the engineering bar both RAG and fine-tuning have to clear.
AI for legal firms in Liverpool: where retrieval-grounded systems are the right tool.
AI for healthcare in Liverpool: where retrieval-grounded knowledge assistants are paying off.
