Why UK Businesses Need AI Cyber Security Evidence Before Buying Agent Platforms


6 June 2026 | By Ashley Marshall


UK businesses need AI cyber security evidence before buying agent platforms because agents can access data, call tools and trigger business actions. Procurement should verify threat modelling, prompt controls, access boundaries, logging, incident response, testing and supplier accountability before the platform is deployed.

Agent platforms are moving from useful assistants to operational actors. Before a UK business gives them tools, data and workflow access, it needs evidence that cyber risk is controlled.

Agent procurement has become a cyber evidence decision

The buying question around agent platforms has changed. UK businesses are no longer just choosing between Copilot Studio, Salesforce Agentforce, ServiceNow AI Agents, OpenAI Assistants, Google Agentspace, Amazon Bedrock Agents or a specialist workflow product. They are deciding whether to let software reason over business data, call tools, trigger workflows and act inside operational systems. That makes cyber security evidence a procurement requirement, not a post-purchase governance exercise.

The strongest recent UK evidence is uncomfortable. GOV.UK's Cyber Security Breaches Survey 2025/2026 found that 31% of businesses were using AI, adopting it or actively considering it. Among that group, only 24% said they had cyber security practices or processes in place to manage the risks from AI technology. A further 38% planned to put those processes in place over the next 12 months. That gap is exactly where poor agent procurement decisions happen.

Agent platforms are different from ordinary SaaS because their risk is partly dynamic. A normal application exposes a defined interface. An agent platform combines prompts, retrieved documents, connectors, role permissions, tool calls, workflow rules and model updates, so its behaviour can change even when the interface does not. If procurement asks only for a SOC 2 report, an ISO 27001 certificate and a data processing agreement, it may miss the real question: what evidence proves this agent behaves securely in our environment, with our permissions, our data and our workflows?

What this means in practice is that buyers should demand an AI cyber evidence pack before commercial approval. At minimum, that pack should cover threat modelling, prompt and tool control, access boundaries, logging, human approval points, incident response, testing results, vulnerability disclosure, model update processes, data retention and third-party dependencies. The finance director might still compare licence cost. The operations lead might still care about productivity. But the security and governance test is whether the platform can show how it will be controlled after the demo ends.

UK guidance already tells buyers what evidence to ask for

UK businesses do not need to wait for a dedicated AI Act before asking better procurement questions. DSIT's AI Cyber Security Code of Practice, published on 31 January 2025, is already a useful benchmark. It is voluntary, but it sets out a lifecycle view of AI security across secure design, development, deployment, maintenance and end of life. For agent platforms, that lifecycle view matters more than a narrow data security questionnaire.

The Code is especially relevant because it recognises AI-specific risks including data poisoning, model obfuscation, indirect prompt injection and operational differences around data management. It also names responsibilities across developers, system operators, data custodians and end users. That matters in procurement because agent platforms blur those roles. A vendor may supply the model orchestration layer. A system integrator may configure workflows. The customer may supply sensitive documents, permission groups and business rules. No single party can responsibly say that security is someone else's problem.

Several Code principles translate directly into buying evidence. Principle 2 says developers and system operators should design AI systems to withstand adversarial attacks, unexpected inputs and AI system failure. It also says developers should create an audit trail for the operation and lifecycle management of models, datasets and prompts. Principle 7 says supply chain processes must cover AI model and system development. Principle 8 says developers should document data, models and prompts, including an audit log of changes to system prompts or model configuration. Principle 12 says system operators should log system and user actions to support security compliance, incident investigations and vulnerability remediation.

What this means in practice is simple: turn the Code into a procurement checklist. Ask the vendor to map evidence against each relevant principle. Ask for examples, not assurances. A buyer should expect architecture diagrams, permission models, red team summaries, prompt change logs, model update notices, secure development controls, incident processes, customer configuration guidance and audit log samples. If the vendor cannot show how its platform meets the Code, the buyer has learnt something important before signing a contract.
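
As a rough illustration of that checklist, the sketch below records which evidence items a buyer expects against each principle and reports what a vendor response leaves out. The principle numbers follow the summary above. The evidence item names and the grouping under each principle are assumptions made for the example, not wording taken from the Code.

```python
# Hypothetical checklist: principle numbers follow the article's summary of
# the DSIT Code; evidence item names and groupings are illustrative only.
REQUIRED_EVIDENCE = {
    2: {"architecture_diagram", "red_team_summary", "audit_trail_sample"},
    7: {"subprocessor_list", "secure_development_controls"},
    8: {"prompt_change_log", "model_update_notices"},
    12: {"audit_log_sample", "incident_process"},
}


def evidence_gaps(supplied: dict[int, set[str]]) -> dict[int, set[str]]:
    """Return, per principle, the evidence items the vendor has not supplied."""
    gaps = {}
    for principle, required in REQUIRED_EVIDENCE.items():
        missing = required - supplied.get(principle, set())
        if missing:
            gaps[principle] = missing
    return gaps


# Example vendor response with nothing supplied for Principle 7.
vendor_response = {
    2: {"architecture_diagram", "red_team_summary", "audit_trail_sample"},
    8: {"prompt_change_log"},
    12: {"audit_log_sample", "incident_process"},
}

for principle, missing in sorted(evidence_gaps(vendor_response).items()):
    print(f"Principle {principle}: missing {', '.join(sorted(missing))}")
```

A gap list like this is only a starting point for the conversation with the vendor. It says what has not been shown, not whether the evidence that was supplied is any good.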

Frontier AI raises the bar for third-party and supply chain diligence

Agent platform procurement is also a third-party risk decision. The platform may connect to Microsoft 365, Google Workspace, Slack, HubSpot, Salesforce, ServiceNow, Xero, Sage, Jira, GitHub, Zendesk or internal databases. It may use OpenAI, Anthropic, Google, Meta, Mistral or a hosted open model behind the scenes. It may depend on vector databases, browser tools, scraping services, MCP servers, workflow engines and identity providers. Every connector widens the evidence question.

The UK financial authorities have made the direction of travel clear. In a joint statement on 15 May 2026, the Bank of England, FCA and HM Treasury said current frontier AI models can exceed what a skilled practitioner could achieve, at higher speed, greater scale and lower cost. They warned that malicious use could amplify threats to firms' safety and soundness, customers, market integrity and financial stability. They also told regulated firms to plan for frontier AI cyber risks across governance, vulnerability management, third parties, access management, data protection, response and recovery.

The FCA's Cyber Coordination Group insights 2025, published on 24 April 2026, points in the same direction. It says members emphasised clearer contractual obligations, stronger supply chain transparency, especially around AI, and including key suppliers in response and recovery testing. That is a procurement lesson for every sector, not just banks and insurers. Agent platforms are rarely isolated tools. They sit inside a chain of services that needs to be visible, tested and contractually accountable.

The counterargument is that most UK SMEs are not regulated financial institutions, so this level of evidence is excessive. That misses the point. The same technical pattern applies at smaller scale. A marketing agency agent with CRM write access can expose customer data. A finance agent connected to email and accounting can support invoice fraud. An HR agent with document retrieval can leak sensitive employee information. Buyers do not need a bank-grade control library for every use case, but they do need proportionate evidence that the vendor understands third-party AI cyber risk and can support the customer's own responsibilities.

AI attackers are faster, so evidence must include operational monitoring

The case for procurement evidence is not only regulatory. It is operational. AI is changing attack speed, phishing quality, vulnerability discovery and exploitation workflows. The ICO warned on 14 May 2026 that cyber criminals are using AI for faster, more advanced and harder to detect attacks, including personalised phishing, deepfake social engineering and automated vulnerability scanning. That matters when an agent platform becomes a new interface into company systems.

The AI Security Institute is also showing how quickly cyber capability is advancing. In May 2026, AISI wrote that the length of cyber tasks frontier models can complete autonomously has been doubling on a timescale of months, not years. Its evaluations also reported that GPT-5.5 solved a difficult reverse engineering challenge in 10 minutes and 22 seconds with no human assistance, at a cost of $1.73 in API usage. In a separate March 2026 evaluation, AISI tested seven models on multi-step cyber ranges and found rapid progress on extended attack chains, even though those ranges did not include active defenders.

For buyers, the conclusion is not that every AI agent is dangerous. The conclusion is that procurement evidence needs to include live monitoring and investigation capability. A pre-sales security questionnaire cannot prove how a platform behaves when a user grants a new connector, a vendor releases a model update, a malicious document enters a knowledge base, or an attacker tries prompt injection through a support ticket. The evidence pack should show what the platform logs, what anomalies it detects, what alerts reach the customer, what data can be exported to SIEM tools, and how quickly permissions can be revoked.

What this means in practice is that the security review should include a live scenario. Ask the vendor to demonstrate how an administrator would investigate an agent that attempted an unusual tool call, retrieved an unexpected document or tried to send data outside an approved workflow. Ask whether logs include user identity, prompt template version, retrieved source IDs, tool calls, connector permissions, model version, decision timestamp and human approval. If the answer is a screen recording of a generic admin dashboard, keep pressing. Evidence is useful only if it supports real incident response.
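
One way to make the log question testable is to take a sample record exported from the platform and check it against the fields listed above. The sketch below does that. The field names are hypothetical and simply mirror that list. A real export will use the vendor's own schema, so a mapping step would be needed first.

```python
# Field names are hypothetical; they mirror the list in the paragraph above,
# not any vendor's actual log schema.
REQUIRED_FIELDS = (
    "user_identity",
    "prompt_template_version",
    "retrieved_source_ids",
    "tool_calls",
    "connector_permissions",
    "model_version",
    "decision_timestamp",
    "human_approval",
)


def missing_fields(record: dict) -> list[str]:
    """List required fields absent from one exported log record.

    A field that is present but empty (for example human_approval=None when
    no approval was required) still counts as present.
    """
    return [name for name in REQUIRED_FIELDS if name not in record]


# Made-up sample record for illustration.
sample_record = {
    "user_identity": "j.smith@example.co.uk",
    "prompt_template_version": "refunds-v7",
    "retrieved_source_ids": ["kb-1042"],
    "tool_calls": [{"tool": "crm.update_record", "status": "blocked"}],
    "model_version": "vendor-model-2026-05",
    "decision_timestamp": "2026-06-01T09:14:22Z",
}

print(missing_fields(sample_record))
```

For this sample, the check reports that connector permissions and human approval are not recorded, which is the kind of gap that makes an incident hard to reconstruct later.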

The common misconception is that vendor reputation is enough

The leading misconception is that a well-known vendor makes AI cyber evidence less important. It is understandable. Microsoft, Google, Salesforce, ServiceNow, OpenAI, Anthropic and AWS have deep security teams, mature infrastructure and extensive compliance programmes. For many buyers, choosing a major platform feels safer than choosing a smaller specialist. In some respects, it may be. But brand reputation does not answer the most important customer-side questions.

A vendor's infrastructure security does not prove that your agent has least privilege. A model card does not prove that your workflow has safe approval points. A SOC 2 report does not prove that your retrieved documents are correctly classified. A penetration test does not prove that your business users understand prohibited use cases. An enterprise contract does not prove that your logs can reconstruct a harmful action. The risk often appears in the integration layer, not the vendor's core cloud environment.

This is why evidence must be specific to the intended use case. A customer service agent answering public product questions has a different risk profile from an agent that can update refunds, issue credits or change customer records. A board pack summarisation tool is not the same as an agent that can read confidential merger documents and email external advisers. A coding agent in a sandbox is different from one with repository write access and production deployment permissions. Buyers should ask vendors to evidence controls against the actual permissions, data and action rights being proposed.

The sensible counterargument is that too much evidence can slow adoption and push teams back to informal, unsanctioned AI use. That is a real risk. The answer is not a 90-day procurement blockade for every pilot. It is risk-tiered evidence. Low-risk pilots can use a short checklist covering data use, access, retention and monitoring. Material workflows need deeper review, including threat modelling, red team results, approval design and incident response. High-risk use cases involving personal data, finance, employment, regulated advice, critical operations or automated decisions should not proceed until the evidence is strong enough for security, legal and operational owners to defend.
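
The tiering described above can be written down as a simple rule so that teams apply it consistently. The sketch below is one possible reading, not a standard. It assumes that any agent able to act in or write to business systems counts as a material workflow, and that higher tiers inherit the evidence required by lower ones. The class, function and tier names are invented for the example.

```python
from dataclasses import dataclass

# Sensitive categories taken from the paragraph above; tier names and the
# evidence lists are an illustrative reading of it, not a standard.
HIGH_RISK_DOMAINS = {
    "personal_data", "finance", "employment",
    "regulated_advice", "critical_operations", "automated_decisions",
}

EVIDENCE_BY_TIER = {
    "low": ["data use", "access", "retention", "monitoring"],
    "material": ["threat modelling", "red team results",
                 "approval design", "incident response"],
    "high": ["sign-off by security, legal and operational owners"],
}
TIER_ORDER = ["low", "material", "high"]


@dataclass
class UseCase:
    name: str
    domains: set[str]
    can_take_actions: bool  # assumption: True if the agent acts in systems


def risk_tier(use_case: UseCase) -> str:
    """Assign a tier: sensitive domains first, then ability to act."""
    if use_case.domains & HIGH_RISK_DOMAINS:
        return "high"
    if use_case.can_take_actions:
        return "material"
    return "low"


def required_evidence(use_case: UseCase) -> list[str]:
    """Collect evidence for the assigned tier and every tier below it."""
    tier = risk_tier(use_case)
    items = []
    for level in TIER_ORDER[: TIER_ORDER.index(tier) + 1]:
        items.extend(EVIDENCE_BY_TIER[level])
    return items


faq_bot = UseCase("public product FAQ", set(), False)
refund_agent = UseCase("refund agent", {"finance", "personal_data"}, True)

for case in (faq_bot, refund_agent):
    print(case.name, "->", risk_tier(case), required_evidence(case))
```

The value of writing the rule down is that a pilot owner can see in advance which evidence will be asked for, which reduces the incentive to route around procurement.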

What a credible AI cyber evidence pack should contain

A credible evidence pack is not a pile of compliance PDFs. It is a structured answer to one question: can this agent platform be deployed, monitored and changed without creating unmanaged cyber risk? For UK businesses, the pack should be short enough for procurement to use, detailed enough for security to trust, and practical enough for operations to apply during rollout.

Start with scope and architecture. The vendor should identify the models used, hosting regions, subprocessors, connectors, data flows, identity model, permission boundaries, logging locations and customer configuration responsibilities. It should state whether customer prompts, files or outputs are used for training or evaluation, and how retention works. It should explain how administrators can restrict tools, disable capabilities, isolate environments and apply least privilege. For systems built on RAG, it should explain document ingestion, chunking, access filtering, source attribution and deletion.
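
To make the access filtering point concrete, the sketch below shows the basic idea for a retrieval pipeline: each chunk carries the groups allowed to read its source document, and anything the requesting user could not open directly is dropped before the text reaches the model. The `Chunk` type and `filter_for_user` function are hypothetical. Real platforms may enforce this inside the search index or through the source system's own permissions instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source_id: str
    text: str
    allowed_groups: frozenset[str]  # groups permitted to read the source


def filter_for_user(chunks: list[Chunk], user_groups: set[str]) -> list[Chunk]:
    """Keep only chunks whose source the requesting user may read.

    Intended to run after retrieval and before prompt assembly, so the agent
    cannot quote a document the user could not open directly.
    """
    return [c for c in chunks if c.allowed_groups & user_groups]


# Made-up retrieval results for illustration.
retrieved = [
    Chunk("hr-017", "Salary bands for 2026", frozenset({"hr"})),
    Chunk("kb-204", "Refund policy summary", frozenset({"support", "hr"})),
]

visible = filter_for_user(retrieved, {"support"})
print([c.source_id for c in visible])
```

The question for the vendor is where this check happens and whether it uses the end user's permissions or a shared service account. A filter keyed to a service account does not protect anything.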

Then ask for security testing and operational evidence. This should include threat modelling for prompt injection, data exfiltration, excessive agency, insecure tool use, model update drift, supply chain compromise and user impersonation. It should include recent red team or adversarial testing summaries, vulnerability disclosure process, patch and model update policy, incident response process, backup and recovery approach, and customer notification commitments. It should also show audit logs, export options and SIEM integration for tools such as Microsoft Sentinel, Splunk, Datadog, Elastic or CrowdStrike.

Finally, require procurement evidence that survives change. Agent platforms evolve quickly. New models, new connectors, new memory features and new autonomous actions can alter the risk profile after signature. The contract should require security-relevant update notifications, access to revised documentation, customer approval for material permission changes, evidence for major model or workflow changes, and the right to suspend risky capabilities. That is how a buyer turns AI cyber security from a one-off procurement hurdle into a live control system.
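
On the customer side, one lightweight way to notice that the risk profile has shifted is to compare the platform's current capabilities against the set that was approved at signature. The sketch below assumes the buyer keeps a simple inventory of models, connectors and autonomous actions. The category names and values are hypothetical, and a real inventory would have to be built from vendor release notes or an admin export.

```python
# Snapshot keys are hypothetical; a real inventory would come from vendor
# release notes or an admin export, whose format differs by platform.
def material_changes(
    before: dict[str, set[str]], after: dict[str, set[str]]
) -> dict[str, set[str]]:
    """Return the items added in each category since the approved review."""
    changes = {}
    for category, items in after.items():
        added = items - before.get(category, set())
        if added:
            changes[category] = added
    return changes


approved = {
    "models": {"vendor-model-a"},
    "connectors": {"crm_read"},
    "autonomous_actions": set(),
}
current = {
    "models": {"vendor-model-a", "vendor-model-b"},
    "connectors": {"crm_read", "crm_write"},
    "autonomous_actions": {"send_email"},
}

for category, added in sorted(material_changes(approved, current).items()):
    print(f"Re-review needed: {category} added {sorted(added)}")
```

The sketch only flags additions, on the assumption that removing a capability does not increase risk. A buyer may still want removals reported where a workflow depends on them.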

The commercial test is direct. If two platforms look similar on productivity, choose the one that can prove how it controls data, prompts, permissions, tools, logs, updates and incidents. If neither can prove it, slow down. The fastest AI rollout is not the one that signs quickest. It is the one that can scale without creating a security debt the board later has to explain.

Frequently Asked Questions

What is AI cyber security evidence in agent procurement?

It is the set of documents, logs, test results and controls showing how an agent platform manages AI-specific cyber risks. It should cover threat modelling, access permissions, prompt and tool controls, data flows, monitoring, incident response, supplier dependencies and update processes.

Is a SOC 2 or ISO 27001 certificate enough for an agent platform?

No. Those certifications can be useful baseline evidence, but they do not prove how an agent handles prompts, retrieval, tool calls, model updates or customer-specific permissions. Buyers should ask for AI-specific security evidence as well.

Which UK guidance should procurement teams use?

Start with DSIT's AI Cyber Security Code of Practice, NCSC secure AI guidance, ICO data protection and cyber security guidance, and sector guidance where relevant. Regulated firms should also consider FCA, PRA and Bank of England operational resilience expectations.

Do small UK businesses need this level of evidence?

Yes, but proportionately. A small internal pilot may only need a concise checklist. An agent with CRM, finance, HR, email or customer data access needs stronger evidence because the potential impact is higher.

What should be included in an agent platform evidence pack?

The pack should include architecture diagrams, data flow maps, access control design, connector permissions, model and prompt documentation, red team summaries, logging examples, incident response process, vulnerability disclosure policy, retention rules and update notification commitments.

How should buyers assess prompt injection risk?

Ask how the platform separates instructions from untrusted content, restricts tool calls, filters retrieval sources, logs suspicious behaviour, supports human approval and lets administrators investigate prompts, source documents and downstream actions.
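
As an illustration of what restricting tool calls and supporting human approval can look like, the sketch below gates every proposed tool call through a small policy. Unlisted tools are denied, tools with side effects need approval, and any call proposed while untrusted content is in the agent's context is held for a human. The tool names, the policy split and the `untrusted_content_in_context` flag are assumptions for the example. How a given platform tracks untrusted content is something to ask the vendor.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


# Hypothetical policy: tool names and the split between sets are illustrative.
READ_ONLY_TOOLS = {"kb.search", "crm.read_record"}
APPROVAL_TOOLS = {"crm.update_record", "email.send_external"}


def gate_tool_call(tool: str, untrusted_content_in_context: bool) -> Decision:
    """Decide whether a proposed tool call may run.

    Tools not explicitly listed are denied. Tools with side effects always
    need approval. Any listed tool needs approval when the agent is working
    with untrusted content such as a support ticket or retrieved document.
    """
    if tool not in READ_ONLY_TOOLS | APPROVAL_TOOLS:
        return Decision.DENY
    if tool in APPROVAL_TOOLS or untrusted_content_in_context:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW


for tool, untrusted in [
    ("kb.search", False),
    ("kb.search", True),
    ("email.send_external", False),
    ("shell.exec", False),
]:
    print(tool, untrusted, gate_tool_call(tool, untrusted).value)
```

A default-deny policy like this is deliberately conservative. Buyers can ask the vendor which of these decisions the platform can enforce natively and which would have to be built in the customer's own integration layer.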

What is the biggest mistake businesses make when buying agent platforms?

They assess the vendor's reputation rather than the specific workflow. A reputable platform can still be over-permissioned, poorly monitored or connected to sensitive data without adequate controls.

When should a business reject or delay an agent platform purchase?

Delay if the vendor cannot explain data use, permissions, logs, incident response, model updates or third-party dependencies. Reject high-risk deployment if the evidence does not allow security, legal and operational owners to defend the decision.