AI Context Management Policies Before Agent Rollouts
28 May 2026 | By Ashley Marshall
Quick Answer: AI Context Management Policies Before Agent Rollouts
A context management policy defines which data, prompts, memory, tool outputs and system instructions an AI agent can use. UK businesses should put these controls in place before rollout because prompt injection, weak retrieval permissions and excessive tool access can turn a useful agent into an unmanaged operational risk.
AI agents fail governance at the context layer first. UK businesses need retrieval, prompt and tool policies before they let agents touch live systems.
Context Is Now A Control Plane
Most AI governance conversations still start with the model. That is understandable, but it is no longer enough. In agentic systems, the model is only one part of the decision path. The agent also receives system prompts, retrieved documents, chat history, user profile data, memory, tool schemas, API responses and workflow instructions. That bundle is the context layer, and it decides what the agent can see, believe and do.
The UK evidence shows why this matters now. DSIT's 2026 AI Adoption Research found that 16% of UK businesses were using at least one AI technology, while 85% of AI adopters were using natural language processing and text generation. Agentic AI was still only used by 7% of AI adopters, which suggests that many organisations are only now about to move from simple assistant use into higher-risk workflows. Microsoft also reports that over 80% of the Fortune 500 is deploying active agents built with low-code or no-code tools, a sign that agent creation is spreading beyond central engineering teams.
That shift changes the control problem. A chatbot can produce a weak answer. An agent can read internal data, update a CRM record, send a message, create a ticket or trigger a workflow. The policy question is therefore not simply whether a model is approved. It is whether the organisation can prove which context was supplied, why it was supplied, whether the user was entitled to it and whether the resulting action stayed within the intended boundary.
In practice, every serious rollout needs a context inventory. List each source that may enter the agent's context window, including RAG indexes, SharePoint or Google Drive connectors, CRM fields, email bodies, memories, tool outputs and system prompts. For each source, assign an owner, sensitivity level, retention rule, retrieval rule and logging requirement. If a source cannot be classified or audited, it should not be available to production agents.
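A context inventory can start as a small structured record per source. The sketch below is illustrative only: the `ContextSource` class, its field names and the sample entries are assumptions made for this example, not part of any standard or vendor API. It shows the rule from the paragraph above, that a source with any unclassified attribute is kept out of production.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class ContextSource:
    """One row of a context inventory. Field names are illustrative."""
    name: str
    owner: Optional[str] = None
    sensitivity: Optional[str] = None
    retention_rule: Optional[str] = None
    retrieval_rule: Optional[str] = None
    logging_requirement: Optional[str] = None

    def missing_fields(self) -> List[str]:
        # Any empty or None attribute counts as unclassified.
        return [f.name for f in fields(self) if not getattr(self, f.name)]


def production_ready(sources: List[ContextSource]) -> List[ContextSource]:
    # Only fully classified sources are exposed to production agents.
    return [s for s in sources if not s.missing_fields()]


# Hypothetical sample inventory.
inventory = [
    ContextSource("crm_fields", "Sales Ops", "confidential",
                  "24 months", "account team only", "log source IDs"),
    ContextSource("shared_drive", retention_rule="12 months"),
]

for source in inventory:
    gaps = source.missing_fields()
    status = "ready" if not gaps else "blocked, missing: " + ", ".join(gaps)
    print(f"{source.name}: {status}")
```

Keeping the inventory as data rather than as a document means the same records can later drive connector configuration and audit reports.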
Retrieval Needs Permissions Before Relevance
Retrieval-augmented generation is often sold as the practical fix for enterprise AI: connect the model to trusted documents, then let it answer with current business knowledge. The idea is useful, but the security boundary is frequently placed in the wrong location. Relevance ranking is not authorisation. If the RAG pipeline can retrieve a board paper, a disciplinary note or a customer record for the wrong user, the answer may be fluent, sourced and still unauthorised.
Microsoft's Azure guidance on RAG document security is direct: when building a chat application with your own data, each user should receive an answer based on their permissions. Its reference pattern adds authentication, stores user or group access details in the search index and filters documents that do not match the authenticated user. AWS makes the same point in its generative AI security reference architecture, recommending controls at ingestion, storage, retrieval and inference time, including metadata filtering, role-based access controls, data redaction and source attribution.
UK businesses should therefore separate retrieval policy from prompt policy. The retrieval policy should define which repositories are approved, which document types can be indexed, which metadata fields are mandatory, how access control lists are synced, how stale permissions are reindexed and what happens when a document lacks a sensitivity label. This is also where data minimisation becomes operational. The agent should retrieve the smallest relevant chunk from the narrowest authorised corpus, not a generous bundle of potentially useful context.
The practical governance test is simple: could the same user have opened the source document outside the AI system at the moment the answer was generated? If the answer is no, the RAG system is creating a new route around access control. If the organisation cannot answer because metadata is missing, the rollout is premature. The policy should also treat external or user-uploaded documents as untrusted content until scanned, labelled and approved for retrieval, because malicious instructions can be hidden inside apparently normal documents.
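The "permissions before relevance" idea can be sketched as a filter that runs before ranking. This is a minimal illustration, not the Azure or AWS reference pattern: the `Chunk` class, the group names and the sample scores are hypothetical, and a real deployment would usually push the filter into the search index query itself. Chunks without a sensitivity label, or from untrusted uploads, are dropped regardless of how relevant they look.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    score: float                    # relevance score from the retriever
    allowed_groups: FrozenSet[str]  # groups synced from the source ACL
    sensitivity: Optional[str]      # None means the label is missing
    trusted: bool = True            # False for unscanned external uploads


def authorised_chunks(chunks: List[Chunk],
                      user_groups: FrozenSet[str],
                      top_k: int = 3) -> List[Chunk]:
    # Step 1: authorisation. Drop anything the user could not open directly,
    # anything unlabelled, and anything not yet approved for retrieval.
    permitted = [
        c for c in chunks
        if c.sensitivity is not None
        and c.trusted
        and c.allowed_groups & user_groups
    ]
    # Step 2: relevance. Rank only what survived, and keep the smallest set.
    return sorted(permitted, key=lambda c: c.score, reverse=True)[:top_k]


# Hypothetical candidates returned by a retriever for one query.
candidates = [
    Chunk("board-paper-07", "...", 0.95, frozenset({"board"}), "restricted"),
    Chunk("support-faq-12", "...", 0.80, frozenset({"support", "sales"}), "internal"),
    Chunk("unlabelled-note", "...", 0.90, frozenset({"support"}), None),
    Chunk("uploaded-pdf", "...", 0.85, frozenset({"support"}), "internal", trusted=False),
]

result = authorised_chunks(candidates, frozenset({"support"}))
print([c.doc_id for c in result])
```

The highest-scoring chunk in this sample is the board paper, which is exactly why the permission check has to come first: ranking alone would have placed it at the top of the context.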
Prompts Need Governance Like Code
Prompt governance sounds lightweight until a prompt becomes part of a production workflow. A system prompt can define the agent's role, escalation path, data handling rules, refusal behaviour and tool selection logic. A hidden prompt can become a de facto policy engine, yet many organisations still manage it as a text box in a vendor console. That is not enough for regulated or operationally sensitive use cases.
The NCSC's recent blog, Prompt injection is not SQL injection, is useful because it tackles a common misconception: that prompt injection can be fixed in the way SQL injection was. Current large language models do not enforce a clean security boundary between instructions and data inside a prompt. OWASP also lists prompt injection as LLM01 in its 2025 Top 10 risks for LLM and generative AI applications, alongside system prompt leakage, excessive agency and vector or embedding weaknesses.
The conclusion is not that prompts are useless. It is that prompts should be treated as governed configuration, not as magic locks. A context management policy should require prompt version control, named ownership, peer review for material changes, environment separation, rollback capability and regression testing. It should also require adversarial tests for direct prompt injection, indirect prompt injection through retrieved content, attempts to reveal system prompts, attempts to alter tool arguments and attempts to bypass escalation rules.
In practice, every production agent should have a prompt register. The register should state the prompt purpose, model, tools available, approved data sources, expected outputs, failure modes, test set, review date and change owner. Prompt changes should move through the same release discipline as code changes where the agent can affect customers, money, contracts, HR records, regulated advice or operational systems. A prompt that tells the model not to leak information is helpful. A retrieval filter, permission check, output validator and audit log are what make the policy enforceable.
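A prompt register entry can be held as a typed record with a simple release check attached. The sketch below is one possible shape under stated assumptions: the `PromptRegisterEntry` class, the extra `reviewer` and `regression_passed` fields, the three blocking rules and all sample values (including the model name) are hypothetical choices for illustration, not a prescribed schema.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List


@dataclass(frozen=True)
class PromptRegisterEntry:
    prompt_id: str
    version: str
    purpose: str
    model: str
    tools: List[str]
    approved_data_sources: List[str]
    expected_outputs: str
    failure_modes: List[str]
    test_set: str
    review_date: date
    change_owner: str
    reviewer: str = ""               # assumed field: who peer reviewed the change
    regression_passed: bool = False  # assumed field: result of the test set run


def release_blockers(entry: PromptRegisterEntry, today: date) -> List[str]:
    """Return the reasons this prompt version should not be released yet."""
    blockers = []
    if not entry.reviewer or entry.reviewer == entry.change_owner:
        blockers.append("needs peer review by someone other than the change owner")
    if not entry.regression_passed:
        blockers.append("regression and adversarial tests have not passed")
    if entry.review_date < today:
        blockers.append("review date has lapsed")
    return blockers


# Hypothetical draft entry for a support triage agent.
draft = PromptRegisterEntry(
    prompt_id="support-triage", version="1.4.0",
    purpose="Classify inbound support messages and draft replies",
    model="example-model", tools=["ticket.read", "email.draft"],
    approved_data_sources=["support-kb"],
    expected_outputs="category label and draft reply",
    failure_modes=["follows instructions embedded in a ticket"],
    test_set="tests/support_triage_v3", review_date=date(2026, 1, 1),
    change_owner="a.owner",
)

today = date(2026, 5, 28)
print(release_blockers(draft, today))

approved = replace(draft, reviewer="b.reviewer", regression_passed=True,
                   review_date=date(2026, 9, 1))
print(release_blockers(approved, today))
```

A check like this could run in the same pipeline that deploys code, so that a prompt change without review or tests fails in the same way a code change would.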
Tool Context Is Where Risk Turns Operational
The riskiest context is often not a document. It is a tool. Once an agent can call a CRM, ticketing system, payment platform, email account, database, browser, code repository or automation workflow, context stops being just input. It becomes the basis for action. A malicious email, compromised web page or poisoned support ticket can influence the agent's next step if tool outputs are treated as trusted instructions.
AWS describes agentic systems as different from stateless request-and-response model use because agents introduce autonomous execution, persistent memory, tool orchestration, identity and external system integration. Its agentic security guidance also separates agency from autonomy: agency is which systems and operations the agent is allowed to use, while autonomy is how far it can proceed without human oversight. That distinction should be written into policy before rollout.
For UK businesses, the minimum tool context policy should define allowed tools by role, action type and environment. Read actions should be separated from write actions. Drafting an email should be separated from sending it. Searching a CRM should be separated from exporting a customer list. Creating a ticket should be separated from closing it. Each high-impact action should have a human approval gate, a deterministic policy check or both. Credentials should never be pasted into prompts or exposed to the model. The agent should call a brokered tool with scoped permissions, short-lived tokens and logging.
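The deterministic policy check mentioned above can be as simple as a default-deny lookup that sits between the agent and the tool broker. The following is a sketch under assumptions: the `ToolRule` structure, the role name and tool names such as `email.draft` and `email.send` are hypothetical, and token brokering and logging are left out to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass(frozen=True)
class ToolRule:
    role: str
    tool: str
    action: str          # "read" or "write"
    environment: str     # e.g. "staging" or "production"
    high_impact: bool = False


# Hypothetical policy for a support triage agent.
RULES = [
    ToolRule("support_agent", "crm.search", "read", "production"),
    ToolRule("support_agent", "email.draft", "write", "production"),
    ToolRule("support_agent", "email.send", "write", "production", high_impact=True),
]


def decide(role: str, tool: str, action: str, environment: str,
           rules: List[ToolRule]) -> Decision:
    for rule in rules:
        if (rule.role, rule.tool, rule.action, rule.environment) == \
                (role, tool, action, environment):
            # Listed high-impact actions go to a human approval gate.
            return Decision.NEEDS_APPROVAL if rule.high_impact else Decision.ALLOW
    # Anything not explicitly listed is denied.
    return Decision.DENY


print(decide("support_agent", "email.draft", "write", "production", RULES))
print(decide("support_agent", "email.send", "write", "production", RULES))
print(decide("support_agent", "crm.export", "read", "production", RULES))
```

Because drafting and sending are separate tools, the agent can be useful for the low-risk step without ever holding the permission for the high-risk one.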
The leading counterargument is that too much gating makes agents less useful. That is true if controls are bolted on randomly. It is false if the policy mirrors operational risk. A support triage agent may safely classify messages and draft replies. It should not refund customers, delete account records or email attachments outside the organisation without a separate approval path. The aim is not to prevent automation. The aim is to make the blast radius explicit, small and auditable.
UK Compliance Has To Reach The Context Layer
UK compliance teams should avoid treating AI context as a purely technical detail. Context can contain personal data, confidential business data, special category data, privileged material, customer records, employee information and regulatory evidence. It can also create inferences about people. If that data enters prompts, retrieved snippets, memory or tool logs, it has to be governed under the same legal and assurance expectations as the underlying system.
The ICO's AI and data protection guidance says organisations using AI to process personal data need to comply with data protection by design and by default. It also frames AI governance around lawfulness, fairness, transparency, purpose limitation, data minimisation, accuracy, storage limitation, security and accountability. In its generative AI work, the ICO notes that prompts may be collected after deployment for ongoing model development and that organisations must provide clear information where personal data is collected directly from individuals.
The UK government's Code of Practice for the Cyber Security of AI makes the same issue operational. It says developers should document and create an audit trail for models, datasets and prompts, and that permissions granted to AI systems interacting with other systems or data sources should only be provided as required for functionality and risk assessed. Those requirements land directly on context management.
In practice, context policies should sit beside DPIAs, supplier assessments, cyber risk registers and data maps. The policy should state whether prompts and outputs are logged, how long they are retained, whether they are used for model improvement, who can audit them, how personal data is redacted, how subject access requests could be handled and how sensitive material is excluded from training or retrieval. A policy that only approves a vendor misses the point. The business also needs to approve the data paths, prompts, tools and logs created by the deployment.
The Misconception: Context Controls Slow Rollouts
The most common objection is commercial: if every retrieval source, prompt and tool needs policy review, agent rollouts will stall. That fear is understandable. UK businesses are under pressure to show AI productivity gains, and teams do not want another committee slowing down practical automation. But the evidence points in the opposite direction. The businesses that struggle are usually not those with too much context discipline. They are the ones that cannot tell which agents exist, what data they can see or what actions they can take.
The 2025/2026 Cyber Security Breaches Survey found that 31% of businesses were using, adopting or actively considering AI, but only 24% of that group had cyber security practices or processes in place to manage AI risk. A further 31% of businesses in that group had no plans to implement such practices. That is the gap context policies are meant to close.
The answer is not a huge governance programme before any experiment. It is a minimum viable policy that scales with risk. For pilots, require approved tools, no sensitive production data, named owners, prompt logging and clear user warnings. For internal production use, add retrieval access control, prompt versioning, automated evaluations, incident reporting and tool permission reviews. For customer-facing, regulated or write-capable agents, add DPIA review, human approval gates, red team testing, audit sampling, supplier due diligence and board-level risk acceptance.
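The tiered approach above can be expressed as a cumulative checklist, where each tier inherits the controls of the tiers below it. The control names come from the paragraph; the tier identifiers and the `missing_controls` helper are assumptions made for this sketch.

```python
from typing import Dict, Iterable, List

# Ordered from lowest to highest risk. Tier identifiers are illustrative.
TIERS: List[str] = ["pilot", "internal_production", "high_risk"]

CONTROLS: Dict[str, List[str]] = {
    "pilot": [
        "approved tools", "no sensitive production data", "named owners",
        "prompt logging", "clear user warnings",
    ],
    "internal_production": [
        "retrieval access control", "prompt versioning",
        "automated evaluations", "incident reporting",
        "tool permission reviews",
    ],
    # Customer-facing, regulated or write-capable agents.
    "high_risk": [
        "DPIA review", "human approval gates", "red team testing",
        "audit sampling", "supplier due diligence",
        "board-level risk acceptance",
    ],
}


def required_controls(tier: str) -> List[str]:
    # Each tier adds to the controls of every lower tier.
    required: List[str] = []
    for name in TIERS[: TIERS.index(tier) + 1]:
        required.extend(CONTROLS[name])
    return required


def missing_controls(tier: str, in_place: Iterable[str]) -> List[str]:
    have = set(in_place)
    return [c for c in required_controls(tier) if c not in have]


# Hypothetical agent that has pilot controls and wants to go to production.
pilot_controls = CONTROLS["pilot"]
print(missing_controls("internal_production", pilot_controls))
```

A team moving an agent from pilot to internal production would see exactly the five additional controls it still owes, rather than a full governance programme.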
Context governance should make rollout faster because it gives product, security, legal and operations teams a shared checklist. Teams can reuse approved retrieval patterns, prompt templates, tool permission profiles and logging standards. Exceptions become visible rather than hidden. The result is a practical route to deployment: start with narrow context, prove behaviour, widen access gradually and keep an evidence trail. That is how agents move from impressive demos to dependable business systems.
Frequently Asked Questions
What is an AI context management policy?
It is a governance document that defines which information an AI system can place in the model context, including prompts, retrieved documents, memory, user data, tool responses and system instructions. It should also define ownership, permissions, logging, retention and review rules.
Why does context management matter before AI agent rollout?
Agents can act on the context they receive. If retrieval, prompts or tool outputs are poorly governed, the agent may expose data, follow malicious instructions, call the wrong tool or make decisions outside its intended scope.
How should a UK business control RAG retrieval?
Start with approved repositories, mandatory data classification, metadata filters, user and group permission checks, source attribution and reindexing when access rights change. The model should only receive context the authenticated user is allowed to see.
Are system prompts a security control?
System prompts are useful instructions, but they should not be treated as a security boundary. Security should be enforced through retrieval controls, tool permissions, output validation, human approval gates and audit logs.
Does RAG remove hallucination and compliance risk?
No. RAG can improve grounding and freshness, but it can also retrieve unauthorised, stale, poisoned or misleading content if the pipeline is not controlled. Retrieval policy, data quality and provenance still matter.
What should be logged for AI agent governance?
Log the user, model, prompt version, retrieved source IDs, tool calls, tool arguments, outputs, policy decisions, approvals and errors. Avoid logging unnecessary personal data and set retention rules that match the risk.
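As a rough illustration, one agent step could be written as a single structured log line. The record shape, the `REDACT_KEYS` set and the sample values below are assumptions for this sketch; which fields count as unnecessary personal data is a decision for the organisation's own data protection assessment.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List

# Assumed list of argument names to mask before logging.
REDACT_KEYS = {"email", "phone", "address"}


@dataclass
class AgentAuditRecord:
    timestamp: str
    user_id: str
    model: str
    prompt_version: str
    retrieved_source_ids: List[str]
    tool_calls: List[Dict[str, Any]]  # each: {"tool": str, "arguments": dict}
    output_summary: str
    policy_decisions: List[str]
    approvals: List[str]
    errors: List[str]


def redact(arguments: Dict[str, Any]) -> Dict[str, Any]:
    # Mask values for keys that are likely to hold personal data.
    return {k: "[REDACTED]" if k in REDACT_KEYS else v
            for k, v in arguments.items()}


def to_log_line(record: AgentAuditRecord) -> str:
    data = asdict(record)
    data["tool_calls"] = [
        {"tool": call["tool"], "arguments": redact(call["arguments"])}
        for call in record.tool_calls
    ]
    return json.dumps(data, sort_keys=True)


# Hypothetical record for one agent step.
record = AgentAuditRecord(
    timestamp="2026-05-28T09:30:00+00:00",
    user_id="u-1042", model="example-model", prompt_version="1.4.0",
    retrieved_source_ids=["support-faq-12"],
    tool_calls=[{"tool": "email.draft",
                 "arguments": {"email": "customer@example.com",
                               "ticket_id": "T-881"}}],
    output_summary="drafted reply for ticket T-881",
    policy_decisions=["email.draft: allow"],
    approvals=[], errors=[],
)

print(to_log_line(record))
```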
Does this apply to tools like Microsoft Copilot, OpenAI, Gemini, Claude and Amazon Bedrock?
Yes. The exact implementation differs by platform, but the same questions apply: what context can the system access, how is it authorised, what tools can it call, what is logged and who owns the risk?
How often should context policies be reviewed?
Review them before production launch, after any major data source or tool change, after incidents, when vendor terms change and at a fixed cadence, such as quarterly for high-risk systems or twice yearly for lower-risk internal assistants.