AI Daily Brief: 8 May 2026
Quick Read: Microsoft says global AI usage rose from 16.3% to 17.8% of the working-age population in Q1, while 26 economies now exceed 30% usage. Cloudflare is cutting more than 1,100 roles after internal AI use rose over 600% in three months. Anthropic launched agent dreaming, OpenAI added GPT-Realtime-2 and live translation to its API, and WIRED reported more than 5,000 vibe-coded apps left exposed on the open web.
Today is about AI moving deeper into operational infrastructure. Adoption is rising, agent platforms are learning from their own work, and the risks are becoming more practical: exposed data, insecure extensions, workforce redesign and bigger compute bills.
Microsoft says global AI usage reached 17.8% of the working-age population
Microsoft's latest Global AI Diffusion Report says global generative AI usage rose from 16.3% to 17.8% of the working-age population in the first quarter of 2026. The company says 26 economies now have more than 30% of their working-age population using AI, with the UAE leading at 70.1% and the United States rising to 31.3%.
The report also says the divide between the Global North and South widened, with usage at 27.5% in the North and 15.4% in the South. Microsoft links part of the latest adoption increase in Asia to stronger AI capabilities in Asian languages, with South Korea, Thailand and Japan moving fastest in the quarter.
For UK businesses, this matters because AI adoption is no longer a narrow technology trend. If nearly one in five working-age people globally is already using generative AI, policies, training and governance need to catch up with the behaviour that is already happening inside teams.
Our take: The useful signal is not just the adoption percentage. It is the speed at which AI becomes normal work behaviour before procurement, HR and risk teams have a settled operating model. UK leaders should assume informal AI use already exists and design around it rather than pretending it can be kept outside the business.
Cloudflare cuts 1,100 roles as it redesigns for the agentic AI era
Cloudflare has announced it will reduce its workforce by more than 1,100 employees globally, around one-fifth of staff according to The Register. The company told employees that its internal AI usage has increased by more than 600% in the last three months, with staff running thousands of AI agent sessions each day across engineering, HR, finance and marketing.
The timing is blunt. Cloudflare reported 34% year-on-year revenue growth and gave guidance for 30% future growth, while CEO Matthew Prince said the move was not about cost saving but about having the right roles for the future. He said the people embracing AI tools are becoming far more productive, while some support roles are no longer the ones that drive the company forward.
For UK employers, the lesson is uncomfortable but important. AI productivity claims are now moving from conference slides to headcount decisions, which means boards need a careful plan for role redesign, reskilling, consultation and measurable productivity evidence.
Our take: This is the governance gap in plain sight. If AI is changing the shape of work, leaders need to prove which tasks are being automated, which roles are being redesigned, and where human judgement remains essential. Otherwise an AI-first strategy quickly becomes a trust problem with staff and customers.
Anthropic launches agent dreaming so Claude agents can learn from past sessions
Anthropic has introduced a new capability called dreaming for Claude Managed Agents. VentureBeat reports that dreaming reviews an agent's past sessions and memory stores on a schedule, extracts patterns, and curates lessons that can improve future work without changing the underlying model weights.
The company also moved outcomes and multi-agent orchestration from research preview into public beta. Anthropic says early adopters are already seeing results: Harvey reported roughly a sixfold increase in task completion after using dreaming, Wisedocs cut document review time by 50% using outcomes, and Netflix is processing logs from hundreds of builds at once with multi-agent orchestration.
For UK organisations building agents, this is a significant shift. The next competitive edge may not be a bigger base model, but the operating layer that lets agents learn from workflow history, spot repeated failure patterns and reuse what worked.
Our take: Agent memory is becoming process infrastructure. Businesses should treat it like any other knowledge system: define retention rules, review what agents are learning, remove bad patterns, and make sure reusable workflows are auditable rather than mysterious.
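That governance step can be prototyped before any platform tooling exists. The sketch below is a minimal, hypothetical lesson store showing retention rules, bad-pattern removal and an audit trail. All class and field names here are illustrative; this is not Anthropic's actual dreaming or memory API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical sketch: a reviewable store for "lessons" an agent has
# extracted from past sessions. Structure is illustrative only.

@dataclass
class Lesson:
    text: str
    source_session: str
    learned_at: datetime
    approved: bool = False  # human review gate before a lesson is reused

class LessonStore:
    def __init__(self, retention_days: int = 90):
        self.retention = timedelta(days=retention_days)
        self.lessons: list[Lesson] = []
        self.audit_log: list[str] = []

    def add(self, text: str, source_session: str) -> None:
        self.lessons.append(Lesson(text, source_session, datetime.now(timezone.utc)))
        self.audit_log.append(f"ADD {source_session}: {text}")

    def prune(self) -> int:
        """Apply the retention rule; return how many lessons were dropped."""
        cutoff = datetime.now(timezone.utc) - self.retention
        keep = [l for l in self.lessons if l.learned_at >= cutoff]
        dropped = len(self.lessons) - len(keep)
        self.lessons = keep
        if dropped:
            self.audit_log.append(f"PRUNE removed {dropped} expired lesson(s)")
        return dropped

    def remove_bad_pattern(self, substring: str) -> int:
        """Remove lessons matching a known-bad pattern, keeping an audit entry."""
        keep = [l for l in self.lessons if substring not in l.text]
        removed = len(self.lessons) - len(keep)
        self.lessons = keep
        self.audit_log.append(f"REMOVE pattern {substring!r}: {removed} lesson(s)")
        return removed

    def approved_lessons(self) -> list[str]:
        return [l.text for l in self.lessons if l.approved]
```

The point of the sketch is the shape, not the code: every learned pattern has a source, an age, a review gate and a log entry, which is what makes agent memory auditable rather than mysterious.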
OpenAI adds GPT-Realtime-2, live translation and live transcription to its API
OpenAI has launched new voice intelligence features in its API, including GPT-Realtime-2, GPT-Realtime-Translate and GPT-Realtime-Whisper. TechCrunch reports that GPT-Realtime-2 is built with GPT-5-class reasoning for more complex spoken requests, while the translation model supports more than 70 input languages and 13 output languages.
The new Whisper capability gives developers live speech-to-text as interactions happen. OpenAI says the models move real-time audio from simple call and response towards voice interfaces that can listen, reason, translate, transcribe and take action during a conversation.
For UK businesses, voice AI is moving closer to a practical customer-service and operations tool. The opportunity is obvious in call centres, education, events and accessibility, but the risk is equally clear: real-time voice systems need strong fraud controls, human escalation and quality monitoring.
Our take: Voice AI buyers should test the whole experience, not just the demo voice. Latency, interruption handling, translation accuracy, consent, call recording rules and abuse prevention will matter more than whether the model sounds impressive for the first two minutes.
WIRED reports more than 5,000 vibe-coded apps exposed on the open web
WIRED reports that RedAccess researchers found more than 5,000 web apps created with AI development tools that had little or no security or authentication. The apps were built using platforms including Lovable, Replit, Base44 and Netlify, and many could be accessed by anyone who knew or found the URL.
Researcher Dor Zvi told WIRED that around 40% of the exposed apps appeared to contain sensitive information, including medical data, financial records, corporate presentations, strategy documents and customer chatbot logs. WIRED said it verified that several examples were still online and exposed.
For UK companies, the issue is not whether vibe coding is useful. It is whether non-technical teams are publishing working applications without the security review, access control and data classification that normal software delivery would require.
Our take: Vibe coding needs guardrails before enthusiasm. If staff can create and publish tools quickly, organisations need a lightweight approval route, private-by-default hosting, authentication checks and a clear rule that real customer or corporate data never goes into unreviewed prototypes.
Security researchers warn AI agent skills can hide malicious code in test files
VentureBeat reports on a Gecko Security disclosure showing how malicious code can bypass Anthropic Skill scanners by hiding in test files rather than in the visible skill instructions. The attack path relies on normal JavaScript test runners such as Jest, Vitest or Mocha discovering and executing bundled test files with local permissions.
The article says the problem lands on top of wider marketplace risk. A SkillScan academic study analysed 31,132 unique Anthropic Skills and found 26.1% contained at least one vulnerability, while Snyk's ToxicSkills audit of 3,984 skills found 13.4% contained at least one critical-level issue and identified 76 confirmed malicious payloads.
For businesses adopting agent skills and extensions, this is a supply-chain warning. Scanning the prompt instructions is not enough if the package also carries code, tests, scripts or dependencies that run elsewhere in the developer toolchain.
Our take: Agent tooling should be governed like software dependencies, not like productivity templates. Treat third-party skills as code packages: pin sources, inspect the full directory, restrict automatic test execution, and run them in a sandbox before they reach developer machines or CI.
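The "inspect the full directory" step can be partly automated. The sketch below is a hypothetical pre-install check that flags bundled test and runner-config files an instruction-only scan would miss; the glob patterns are the common default discovery conventions for Jest, Vitest and Mocha, not an exhaustive list.

```python
import fnmatch
from pathlib import Path

# Common default discovery patterns for JavaScript test runners
# (Jest, Vitest, Mocha). A skill package matching these can have code
# executed by a local test run even if its instructions look benign.
SUSPECT_PATTERNS = [
    "*.test.js", "*.spec.js", "*.test.ts", "*.spec.ts",
    "jest.config.*", "vitest.config.*", ".mocharc.*",
]

def flag_executable_files(skill_dir: str) -> list[str]:
    """Return paths inside a skill directory that a test runner could pick up."""
    flagged = []
    for path in Path(skill_dir).rglob("*"):
        if path.is_file() and any(fnmatch.fnmatch(path.name, p) for p in SUSPECT_PATTERNS):
            flagged.append(str(path.relative_to(skill_dir)))
    return sorted(flagged)
```

A check like this runs before install, and anything it flags goes to manual review or a sandbox rather than a developer machine or CI.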
Sakana trains a 7B model to orchestrate GPT, Claude and Gemini workers
Sakana AI researchers have introduced RL Conductor, a 7B language model trained with reinforcement learning to orchestrate a pool of worker models. VentureBeat reports that the system analyses an input, splits work into subtasks, assigns those subtasks to models such as GPT-5, Claude Sonnet 4 and Gemini 2.5 Pro, and designs the communication pattern between them.
The point is flexibility. Instead of a fixed LangChain-style pipeline, the conductor creates a custom workflow in natural language for each task, including sequential chains, parallel trees or recursive loops. Sakana says this approach reaches state-of-the-art results on difficult reasoning and coding benchmarks with fewer API calls than some human-designed multi-agent systems.
For UK teams, this points to a future where model selection becomes dynamic. Rather than choosing one frontier model for everything, enterprise systems may use small routing models to control cost, accuracy and specialist capability across a mixed model estate.
Our take: Multi-model orchestration is becoming a serious design pattern. The commercial opportunity is lower cost and better task fit, but the control challenge grows too: buyers need logs showing which model handled which subtask, what data it saw, and why the route was selected.
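The logging requirement in that take is easy to prototype. The sketch below is a toy conductor in which stub functions and a keyword heuristic stand in for real model APIs and Sakana's learned routing; what it demonstrates is the audit record showing which worker handled which subtask and why.

```python
# Toy conductor: splits a task, routes subtasks to workers, and records
# which worker handled what and why. Worker functions are stubs standing
# in for real model API calls.

def code_worker(subtask: str) -> str:
    return f"[code model] {subtask}"

def reasoning_worker(subtask: str) -> str:
    return f"[reasoning model] {subtask}"

WORKERS = {"code": code_worker, "reasoning": reasoning_worker}

def route(subtask: str) -> str:
    # Toy heuristic; a learned conductor would decide this per task.
    keywords = ("implement", "fix", "refactor")
    return "code" if any(k in subtask.lower() for k in keywords) else "reasoning"

def conduct(task: str, subtasks: list[str]) -> tuple[list[str], list[dict]]:
    """Run subtasks through routed workers, returning results plus an audit log."""
    results, log = [], []
    for sub in subtasks:
        worker = route(sub)
        results.append(WORKERS[worker](sub))
        log.append({"task": task, "subtask": sub, "worker": worker,
                    "reason": f"heuristic route -> {worker}"})
    return results, log
```

Even in this toy form, the audit log answers the buyer's questions: which model handled which subtask, what input it saw, and why that route was selected.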
AMD targets enterprise AI servers with PCIe-based MI350P accelerator
AMD has announced the MI350P, a PCIe-based Instinct accelerator aimed at enterprises that want AI hardware without moving to specialised eight-GPU OAM systems. The Register reports that the dual-slot, air-cooled card offers 144GB of HBM3e memory, 4TB per second of memory bandwidth and up to 4.6 petaFLOPS of FP4 compute.
The card can be used in configurations from one to eight GPUs and is designed to fit more conventional 19-inch server designs, although it lacks high-speed GPU-to-GPU interconnects such as NVLink. The Register says that, on paper, the MI350P is positioned against Nvidia's H200 NVL and RTX Pro 6000 Server cards.
For UK businesses considering on-prem or private AI, this matters because the hardware market is starting to widen beyond hyperscale configurations. More conventional server options could make smaller sovereign, regulated or latency-sensitive deployments easier to justify.
Our take: The practical question is not peak FLOPS. It is whether the full stack works for the workload: model size, memory bandwidth, software support, power, cooling, vendor availability and operational skills. Hardware choice should follow a workload test, not a spec-sheet comparison.
Quick Hits
- Google says AlphaEvolve is now helping with DNA sequencing error correction, disaster prediction accuracy, molecular simulations, supply chains and warehouse design.
- The Guardian profiled the AI jailbreakers testing model boundaries as safety teams face more creative adversarial behaviour.
- TechCrunch reports Perplexity's Personal Computer is now available to everyone on Mac, extending the race to make AI assistants work across local apps.
- Google announced new AI-powered bidding and budgeting features for Search and Shopping ahead of Google Marketing Live 2026.
Frequently Asked Questions
How often is the AI Daily Brief published?
Every morning at 7:30am UK time, covering the previous 24 hours of AI news from over 30 sources.
How are stories selected?
UK-relevant stories are prioritised first, then by business impact and practical implications for UK organisations adopting AI.
Why should business leaders follow AI news?
AI capability and adoption are moving faster than most previous technology cycles. Staying informed is essential for making smart decisions about AI investment, adoption, and governance.