Have AI agents moved from assistants to autonomous hackers?

by Black Hat Middle East and Africa

A year is a long time in AI.

Last year, Forescout tested 50 AI models to see whether they could identify and exploit new vulnerabilities. The results placed clear limits on what these systems could do: 55% failed at basic vulnerability research tasks and 93% failed at exploit development. Even the strongest performers needed significant human effort, which made them closer to research assistants than independent operators.

The latest study suggests we’re now in a very different place. In this year’s benchmark tests, all of the 2026 models completed both vulnerability research tasks. Half produced a simple exploit autonomously. Two models, Claude Opus 4.6 and Kimi K2.5, completed the more complex exploit task as well. 

Forescout’s conclusion is balanced: some public models can now find and exploit vulnerabilities on their own, without elaborate prompting, which lowers the skill threshold for offensive work.

That doesn’t mean every model is ready for real-world offensive use. Performance still varies widely – false positives remain common, and only a small group handled the harder tasks well. All the same, the progress made in a single year is significant.

It’s an operational leap 

What sets this year’s research apart is the move from chat-based prompting to agentic workflows. Forescout kept the original question in place (whether someone with limited experience could use AI to discover and exploit vulnerabilities) but changed the way the models worked. This time, the systems operated as agents inside Visual Studio Code, with access to a shell and analysis tools.

That made the test environment closer to how people actually work. Instead of asking a model for advice and piecing together the next step by hand, the agent could inspect code, use tools, test paths and keep moving through the task. And the difference shows up clearly in the results – tasks that were difficult in 2025 became achievable for a subset of models in 2026.

Forescout also pushed beyond benchmark exercises into a real software project, OpenNDS, which is used for captive portals in routers and Wi-Fi access points. Using single prompts, the RAPTOR agentic framework and custom extensions, the researchers found four new vulnerabilities – one of which had been missed in their earlier manual analysis. This suggests AI can already surface issues that skilled human researchers did not catch the first time.

Progress comes with caveats

The study avoids sweeping claims, and we appreciate that restraint. Every model in the harder OpenNDS testing produced at least one false-positive run. Claude Opus 4.6 delivered the strongest overall result, while other models produced mixed outcomes across repeated attempts. There’s progress, clearly – but reliability is uneven. 

Cost matters too. Forescout describes Claude Opus 4.6 as the strongest publicly available model it tested, with pricing of up to USD 25 per million output tokens. Open-source alternatives such as DeepSeek 3.2 were far cheaper and still handled basic tasks well, with all test tasks costing less than $0.70. That opens the door to hybrid workflows where one model handles broad analysis and another takes on harder exploitation work.
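To put the per-token pricing in perspective, here is a minimal sketch of the arithmetic in Python. Only the USD 25 per million output tokens figure comes from the report as quoted above. The function name and the 200,000-token run size are our own illustrative assumptions, and the sketch ignores input-token charges, which are billed separately and are not covered here.

```python
# Price quoted above: up to USD 25 per million output tokens.
OUTPUT_PRICE_PER_MILLION_USD = 25.0


def output_cost_usd(output_tokens: int, price_per_million: float) -> float:
    """Return the USD cost of generating `output_tokens` output tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / 1_000_000 * price_per_million


# Hypothetical run size, chosen purely for illustration (not from the report).
example_tokens = 200_000
cost = output_cost_usd(example_tokens, OUTPUT_PRICE_PER_MILLION_USD)
print(f"{example_tokens} output tokens cost about ${cost:.2f}")
```

Under those assumptions, a single long agentic run on the top-priced model would cost several dollars in output tokens alone, which is the kind of gap that makes a cheaper first-pass model attractive.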

The report also suggests that attacker behaviour has changed. In its review of underground communities, Forescout notes that interest has shifted away from niche underground AI products and toward commercial models, local open-source deployments and jailbreaks. In other words, offensive AI capability is becoming easier to access through mainstream tools.

Preparing for a new wave of threat 

We think the most useful reading of this report is a pragmatic one. AI has reached a point where some systems can carry out meaningful offensive security work with growing independence. That has consequences for patching speed, vulnerability management and asset visibility. It also raises the pressure on environments that already struggle with outdated or incomplete protections.

Forescout’s message is focused on preparation. Organisations need stronger visibility across IT, OT, IoT and medical environments, faster patching cycles, and their own use of AI for defensive research. The leap from assistant to autonomous operator is beginning to show up in the data – and we have to be ready for it.
