AI-Driven Password Recovery: Myth or Reality?

July 8th, 2025 by Oleg Afonin
Category: «General»

Artificial intelligence is everywhere – from phones that guess your next move to fridges that shop for you. It’s only natural to ask whether AI can help in a more serious domain: digital forensics, specifically password cracking. The idea sounds promising: use large language models (LLMs) to produce rules and templates for guessing highly probable password variants, prioritizing the most likely ones first. But in practice, things aren’t so straightforward.

Current landscape: overhyped models, underwhelming results

One of the most cited models is PassGAN, a generative adversarial network trained on real-world password leaks. Its authors initially reported dramatic improvements – up to 73% more cracked passwords than traditional dictionary attacks. But independent testing revealed much lower real-world success rates, around 24%, and much of that came from reproducing passwords already seen in the training data. The backlash was swift and justified. Spin-offs such as rPassGAN showed slight improvements in lab conditions but still fell short of well-crafted, human-driven rule-based attacks.
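The training-data overlap is easy to check for. Below is a minimal Python sketch, not the methodology of any of the studies mentioned: it assumes you have the model's guesses, a held-out target set, and the training set as plain lists. The function name `hit_rates` and the toy data are made up for illustration.

```python
def hit_rates(guesses, targets, training):
    """Return overall and novel-only hit rates for a list of guesses.

    guesses  -- candidate passwords produced by the model
    targets  -- passwords in the held-out test set
    training -- passwords the model saw during training
    """
    targets = set(targets)
    if not targets:
        raise ValueError("targets must not be empty")
    hits = set(guesses) & targets      # every target the model guessed
    novel = hits - set(training)       # hits the model could not simply memorize
    return {
        "overall": len(hits) / len(targets),
        "novel_only": len(novel) / len(targets),
    }


# Toy data, invented for illustration only.
training = ["123456", "password", "qwerty", "letmein"]
guesses = ["123456", "password", "dragon2024", "sunshine1"]
targets = ["123456", "password", "dragon2024", "Tr0ub4dor&3"]

rates = hit_rates(guesses, targets, training)
print(rates)  # {'overall': 0.75, 'novel_only': 0.25}
```

In this toy run the headline rate is 75%, but only 25% of the targets were recovered without having appeared in the training set. That gap is the number worth asking about when a model's results look too good.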

PassBERT, based on the transformer architecture, took a more linguistic approach. It tried to predict likely passwords by learning password syntax and structure. While test results showed up to 21% improvement in controlled scenarios, even its creators admitted the model struggled outside the lab.

A variety of other AI models have been explored, but none has broken through that ceiling. Most fail for the same reasons: noisy training data, lack of personalization, and the absence of real-world context.

Why AI models fall apart

The biggest challenge is the data. Breach compilations like the infamous 16-billion-record leak are messy and misleading. They include everything from valid passwords to session tokens, system strings, API keys, and junk – all mashed together. Models trained on this data can’t tell the difference, and they end up learning meaningless patterns.
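To give a sense of what cleaning such data involves, here is a rough Python sketch of a pre-filter that drops the most obvious non-password entries before training. Everything in it is an assumption made for illustration: the function name `looks_human`, the length cutoff, and the short list of token prefixes are arbitrary, and a real pipeline would need far more rules.

```python
import re

# Long hex strings are usually hashes or session identifiers, not typed passwords.
HEX_RE = re.compile(r"^[0-9a-fA-F]{32,}$")

# A few well-known API key prefixes; an illustrative list, far from complete.
TOKEN_PREFIXES = ("AKIA", "ghp_", "sk_live_")


def looks_human(entry, max_len=32):
    """Heuristically decide whether a leaked string could be a human-chosen password."""
    if not entry or len(entry) > max_len:
        return False                      # empty or implausibly long
    if not entry.isprintable():
        return False                      # control characters, binary debris
    if HEX_RE.match(entry):
        return False                      # hash or hex-encoded token
    if entry.startswith(TOKEN_PREFIXES):
        return False                      # looks like an API key
    if entry.startswith("eyJ") and entry.count(".") == 2:
        return False                      # shaped like a JSON Web Token
    return True


# Toy samples, invented for illustration only.
samples = [
    "Summer2023!",
    "5f4dcc3b5aa765d61d8327deb882cf99",
    "eyJhbGciOi.eyJzdWIi.c2ln",
    "ghp_exampletoken000",
    "",
    "rex&luna07",
]
clean = [s for s in samples if looks_human(s)]
print(clean)  # ['Summer2023!', 'rex&luna07']
```

Heuristics like these inevitably throw away some real passwords and let some junk through, which is part of why cleaning a leak of this size is a project in itself.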

Then there’s context. Passwords for online services are often long, complex, and browser-stored – meaning users don’t have to type them and don’t even remember them. For file encryption or disk protection, passwords are entered by hand every time they are used, so the passwords are personal, memorable, and human. They reflect the user’s language, background, biography and so on – things AI trained on global data can’t replicate.

And finally, password leaks are global, while password choices are local. A Brazilian teenager and a German retiree have very different ways of choosing passwords – culturally, linguistically, and behaviorally. AI models trained on a global soup of leaks are blind to that nuance.

The future potential: AI for targeted forensics

If AI is to become useful in password cracking, it needs to move from generalization to personalization. That means building models tailored to individuals or tightly defined groups, using metadata and known behavior.

Imagine feeding an AI not just a bunch of passwords, but also file names, local geography, the user’s native language, known aliases, memorable dates, names of family members and pets, and more. The AI could then generate context-sensitive templates and mutation rules that reflect how this specific person thinks.
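A hand-written version of this idea fits in a few lines of Python. The sketch below is not an AI model and not how any particular product works; it only shows what "templates and mutation rules" built from personal details can look like. The names `stems`, `date_suffixes` and `candidates`, the substitution table, and the sample profile are all hypothetical.

```python
# Character substitutions commonly seen in human-chosen passwords (illustrative subset).
LEET = str.maketrans({"a": "@", "e": "3", "o": "0", "s": "$"})


def stems(word):
    """Case and substitution variants of one personal keyword, in a stable order."""
    ordered = []
    for form in (word.lower(), word.capitalize(), word.upper()):
        for variant in (form, form.translate(LEET)):
            if variant not in ordered:
                ordered.append(variant)
    return ordered


def date_suffixes(year, month, day):
    """Common ways a memorable date gets appended to a word."""
    return [
        str(year),
        f"{year % 100:02d}",
        f"{day:02d}{month:02d}",
        f"{month:02d}{day:02d}",
        f"{day:02d}{month:02d}{year}",
    ]


def candidates(words, dates, specials=("", "!", "#")):
    """Yield unique candidates following the template: keyword + date + special."""
    suffixes = [""]
    for year, month, day in dates:
        for suffix in date_suffixes(year, month, day):
            if suffix not in suffixes:
                suffixes.append(suffix)
    seen = set()
    for word in words:
        for stem in stems(word):
            for suffix in suffixes:
                for special in specials:
                    candidate = stem + suffix + special
                    if candidate not in seen:
                        seen.add(candidate)
                        yield candidate


# Hypothetical profile: a pet's name, a home town, one memorable date.
profile_words = ["Rex", "Lisbon"]
profile_dates = [(2014, 5, 7)]
wordlist = list(candidates(profile_words, profile_dates))
print(len(wordlist), wordlist[:5])
```

Two keywords and one date already produce a short, highly targeted list. The promise of AI here would be to learn which templates and mutations a specific person favors, instead of having an examiner hard-code them.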

Such a model could drastically cut down attack time and increase hit rates. But this level of sophistication requires deep integration of AI with forensic workflows, and the computational power and human resources needed to develop such solutions are beyond most individual labs, independent developers and even forensic vendors today.

Bottom line

Large language models aren’t ready for prime time. They perform well in benchmarks, but fall short in real-world forensic scenarios. Until AI can account for user-specific context and clean, relevant training data, it will remain a research tool, not a practical asset. For now, breaking passwords is still more about strategy and human insight than machine learning.

AI, EDPR, password recovery, passwords
