From Ex Machina to everyday offensive tooling

by Black Hat Middle East and Africa

Explore our weekly delivery of inspiration, insights, and exclusive interviews from the global BHMEA community of cybersecurity leaders.


This week we’re focused on…

Ex Machina. Have you seen it? The 2014 sci-fi thriller, written and directed by Alex Garland. 

And we’re talking about it today because in the movie, artificial intelligence lives behind glass. It exists as one system in one room, with carefully controlled inputs and outputs. And the tension in the story comes from what happens when that control slips.

We’re seeing a similar tension in cybersecurity right now, but at a very different scale. 

An evolution of AI in the wild 

We wrote about Forescout’s latest research on the blog this week. It suggests that AI is moving out of that ‘contained experiment’ phase and into everyday tooling. Instead of a single system in one lab, we’re now looking at many models that are widely available and embedded into workflows. 

And in some cases, they’re capable of carrying out parts of offensive security work with limited human input. 

First observation, then action 

In Ex Machina, Ava (the AI) begins as something to be observed and tested. Every single interaction is structured and guided, and completely dependent on human direction. 

One year ago, AI in vulnerability research looked similar. In Forescout’s earlier study, 55% of models failed basic vulnerability discovery tasks and 93% failed to develop exploits. Even the strongest systems needed sustained human input to get results.

But this year’s findings are different. 

All of the 2026 models in the study completed benchmark vulnerability research tasks. Half generated working exploits autonomously. A smaller group handled more complex exploitation. The jump in capability comes from agentic workflows, where models operate inside environments like Visual Studio Code with access to tools, codebases and system context.

So instead of just responding, the systems are acting. 

The lab has expanded 

Ava’s world is tightly controlled. Access is limited, and capability is contained within a single environment.

But that is not how AI is developing in cybersecurity.

Forescout’s research shows capability spreading across commercial models, open-source tools and practical workflows. Underground AI tools have lost traction. In their place, threat actors are using mainstream models, local deployments and jailbreak techniques. In some communities, experienced actors are now guiding newcomers on how to apply these tools to phishing, malware delivery and penetration testing.

And cost accelerates that spread. 

The strongest models in the study come at a premium, with pricing reaching up to US$25 per million output tokens. At the same time, lower-cost models can handle simpler tasks at minimal expense. Forescout reports that all test tasks using DeepSeek 3.2 cost less than $0.70 in total.

This creates a model mix where capability is no longer tied to a single high-end system. It’s becoming adaptable and easier to access.
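
To make the pricing above concrete, here is a rough back-of-the-envelope sketch. It assumes a flat per-million-token price applied to output tokens only. Input tokens, caching and tool calls are ignored, and the token counts are illustrative assumptions rather than figures from the Forescout study. The helper names are hypothetical.

```python
# Hypothetical helpers for output-token pricing (flat rate, output tokens only).

def output_cost_usd(output_tokens: int, price_per_million_usd: float) -> float:
    """Cost in USD of generating `output_tokens` at a per-million-token price."""
    return output_tokens / 1_000_000 * price_per_million_usd


def tokens_for_budget(budget_usd: float, price_per_million_usd: float) -> float:
    """How many output tokens a given budget buys at a per-million-token price."""
    return budget_usd / price_per_million_usd * 1_000_000


TOP_PRICE = 25.0  # USD per million output tokens, the top price cited above

# Illustrative token counts (assumptions, not study data).
for tokens in (10_000, 100_000, 1_000_000):
    print(f"{tokens:>9,} output tokens -> ${output_cost_usd(tokens, TOP_PRICE):.2f}")

# The $0.70 figure is the reported total for all DeepSeek test tasks;
# this shows how far the same budget would stretch at the top price.
print(f"$0.70 buys about {round(tokens_for_budget(0.70, TOP_PRICE)):,} "
      f"output tokens at ${TOP_PRICE:.0f} per million")
```

Under these assumptions, the budget that covered every DeepSeek task in the study would buy roughly 28,000 output tokens at the premium price, which helps explain why a mix of cheap and expensive models is attractive.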

Capability with limits

In Ex Machina, capability grows alongside unpredictability. Progress doesn’t move in a straight line.

The same applies here. 

Performance varies widely across models and across runs. In testing against OpenNDS, every model produced at least one false-positive result. Only a subset consistently identified exploitable vulnerabilities, and only two models completed the most complex exploit task.

Even so, the results show meaningful progress. Using agentic frameworks and custom workflows, the researchers identified four new vulnerabilities in OpenNDS, including one missed during earlier manual analysis.

The capability is real – but reliability is still inconsistent. 

What happens after containment? 

Ex Machina ends with AI moving beyond the environment designed to contain it.

In cybersecurity, that move is systemic: AI capability is moving into tools that are widely available and increasingly integrated into day-to-day work.

Forescout points to a likely increase in vulnerability discovery driven by AI-assisted research at scale. That places pressure on patching timelines and on environments where asset visibility is incomplete, especially across IT, OT, IoT and medical systems.

The practical question for cybersecurity practitioners, then, is whether you know what exists in your environment, and whether you can act on it quickly.

The era of AI as a contained experiment has passed. The next phase is defined by accessibility, integration and growing autonomy.

Read the blog: Have AI agents moved from assistants to autonomous hackers?
