Distributed Password Recovery Goes 64-bit: Ready for RTX 5090

March 26th, 2026 by Oleg Afonin
Category: «General»

We have just released a major update to Elcomsoft Distributed Password Recovery (EDPR). While the release notes might simply say “migrated to 64-bit,” the reality under the hood is far more complex and significant. This is not a cosmetic update or a simple recompile; it is a fundamental architectural shift necessitated by the evolution of GPU hardware. Put simply: if you want to use the latest NVIDIA RTX 50-series Blackwell GPUs for password recovery, you can no longer use 32-bit code.

Here is why we did it, why it took so long, and why it matters for your forensic lab.

The introduction and evolution of hardware acceleration

For years, the highest performance in password recovery could be achieved by utilizing SIMD (Single Instruction Multiple Data) hardware – ranging from professional accelerators to consumer video cards powered by AMD and NVIDIA GPUs. In early 2007, we developed a method to accelerate password recovery using GPU hardware, a technology that would forever change password recovery. At a time when GPUs were thought of purely as graphics engines, Elcomsoft engineers found a way to pair them with CPUs to dramatically accelerate cryptographic calculations. This discovery came just as NVIDIA officially released CUDA – a parallel computing platform that allows software to offload computations onto GPU hardware. It was the first toolkit to open GPUs to general-purpose computing. Together these developments opened the door to much faster attacks on consumer hardware.

Since then, we’ve been using CUDA exclusively to work with NVIDIA GPUs. Back then, the entire code base was 32-bit; 64-bit compute on consumer hardware was largely unheard of. Eventually, 64-bit architecture started gaining traction in the consumer space, but we continued developing Distributed Password Recovery in 32-bit. Why? Because it just worked, and we didn’t see a real benefit in switching to a different code base. Spoiler: we still don’t. 64-bit instructions don’t magically accelerate workloads that don’t need a large address space, and password recovery is not a workload that benefits from them directly. Yet the indirect benefit is clear.

NVIDIA has been signaling the exit from 32-bit compute for a long time. They deprecated 32-bit x86 CUDA support back in 2018 with CUDA 10, and removed it in CUDA 11. However, the real “hard stop” has arrived with the Blackwell architecture (RTX 50 series), which requires CUDA 12.8 or newer. These new GPUs and the accompanying CUDA 12.8 toolkit have dropped support for 32-bit compute applications entirely. You cannot run a 32-bit CUDA application on a Blackwell GPU; the driver simply won’t allow it.

For a long time, we maintained EDPR as a 32-bit application because it “just worked” on the hardware available at the time (Ampere, Ada Lovelace). But with the arrival of Blackwell, we faced a binary choice: stay 32-bit and lose support for all future NVIDIA hardware, or rewrite the entire engine for 64-bit. We chose the latter.

Under the hood: the EDPR architecture

To understand the scale of this migration, you have to look at how Elcomsoft Distributed Password Recovery is built. The system consists of three distinct components:

  • The Server: This acts as the command center. It manages the queue, breaks password recovery jobs into manageable chunks, and distributes them to available clients.
  • The Console: The GUI where the investigator sets up attacks, configures masks, and monitors progress.
  • The Agents: These are the workers. They run on the network workstations (potentially including the local machine), receive the chunks from the server, perform the actual brute-force or dictionary attacks, and report the results back.
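
The division of labor between the Server and the Agents can be pictured with a small sketch. This is not EDPR's actual scheduler: the function names, the fixed chunk size, and the static round-robin assignment are all assumptions made for illustration, and a real server would more plausibly hand out chunks dynamically as agents report back.

```python
from typing import Dict, Iterator, List, Tuple

Chunk = Tuple[int, int]  # half-open range [start, end) of candidate indexes


def split_keyspace(total: int, chunk_size: int) -> Iterator[Chunk]:
    """Yield consecutive index ranges that together cover [0, total)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total, chunk_size):
        yield start, min(start + chunk_size, total)


def assign_round_robin(chunks: List[Chunk], agents: List[str]) -> Dict[str, List[Chunk]]:
    """Hypothetical static plan: deal chunks out to agents in turn."""
    if not agents:
        raise ValueError("at least one agent is required")
    plan: Dict[str, List[Chunk]] = {agent: [] for agent in agents}
    for i, chunk in enumerate(chunks):
        plan[agents[i % len(agents)]].append(chunk)
    return plan


# Example: a 4-digit PIN keyspace (10,000 candidates) split across two agents.
chunks = list(split_keyspace(10_000, 3_000))
plan = assign_round_robin(chunks, ["agent-a", "agent-b"])
for agent, work in plan.items():
    print(agent, work)
```

Each agent only needs the start and end index of its chunk to enumerate its share of the candidates, which keeps the messages between server and agents small.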

The Agent is where the heavy lifting happens, and crucially, where the GPU acceleration lives. This is where we hit the complexity.

The plugin problem

EDPR does not use a monolithic engine for all file formats. Instead, it uses a plugin architecture. We have over 160 discrete plugins, each designed to handle a specific data format – whether it’s a ZIP archive, a RAR5 file, a BitLocker volume, or an iOS backup.
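
To make the idea of a plugin architecture concrete, here is a minimal sketch in Python. It does not reflect EDPR's real plugin interface, which is native code: the registry, the register decorator, and the "toy-sha256" format are hypothetical names invented for this example.

```python
import hashlib
from typing import Callable, Dict, Iterable, Optional

# Hypothetical registry: format name -> function that tests one candidate.
PLUGINS: Dict[str, Callable[[bytes, str], bool]] = {}


def register(format_name: str):
    """Decorator that adds a checker function to the registry."""
    def decorator(func: Callable[[bytes, str], bool]):
        PLUGINS[format_name] = func
        return func
    return decorator


@register("toy-sha256")
def check_toy_sha256(target: bytes, candidate: str) -> bool:
    # Toy format: the "protected data" is just an unsalted SHA-256 digest.
    return hashlib.sha256(candidate.encode("utf-8")).digest() == target


def recover(format_name: str, target: bytes, candidates: Iterable[str]) -> Optional[str]:
    """Run the plugin for the given format over a list of candidates."""
    check = PLUGINS[format_name]
    for candidate in candidates:
        if check(target, candidate):
            return candidate
    return None


target = hashlib.sha256(b"letmein").digest()
found = recover("toy-sha256", target, ["123456", "password", "letmein"])
print(found)  # letmein
```

The generic attack loop never needs to know how a format works; all format-specific logic lives in the plugin, which is also why each plugin had to be ported separately.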

Each of these plugins is highly optimized. In password recovery, efficiency is everything; a 5% drop in speed can mean adding days to a recovery job. To achieve maximum throughput, many of these plugins were written with heavy use of inline assembler and low-level optimizations specifically tuned for 32-bit registers and instruction sets.
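
The arithmetic behind that claim is simple enough to check. The sketch below assumes a job that would take 60 days at full speed; the 60-day figure is an illustrative assumption, not a measurement.

```python
def extra_days(baseline_days: float, speed_drop: float) -> float:
    """Extra time needed when throughput falls by the fraction speed_drop.

    The same number of candidates must be tested at (1 - speed_drop) of the
    original rate, so the total time grows by a factor of 1 / (1 - speed_drop).
    """
    if not 0 <= speed_drop < 1:
        raise ValueError("speed_drop must be in [0, 1)")
    return baseline_days / (1.0 - speed_drop) - baseline_days


# Assumed example: a 60-day job running 5% slower.
print(round(extra_days(60, 0.05), 2))  # 3.16
```

A 5% slowdown on a two-month job costs a little over three extra days, and the penalty scales linearly with the length of the job.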

When NVIDIA removed 32-bit support, we couldn’t just hit “Recompile” in Visual Studio and target x64. The inline assembler code simply does not port over; for one thing, Microsoft’s C++ compiler does not accept inline __asm blocks in x64 builds at all. The 64-bit architecture brings more registers and a different calling convention, but it also invalidates the decades of hand-tuned 32-bit assembly we had relied on.

The migration struggle

We started working on the 64-bit port immediately after NVIDIA announced the end of 32-bit support, but the process was grueling. We had to take each of those 160+ plugins, strip out the 32-bit assembly, and rewrite the computational kernels for 64-bit.

This introduced two major challenges:

  • Rewriting: Writing optimized 64-bit code from scratch for hundreds of algorithms is time-consuming.
  • Regression testing: In some cases, the initial 64-bit ports were actually slower than their 32-bit predecessors. We had to spend months profiling and re-optimizing specific plugins to ensure that the move to 64-bit didn’t result in a performance penalty on older hardware. Even today, some 64-bit plugins still run slower than their 32-bit counterparts, because a direct port of the code does not reach the same level of optimization.

There were moments where we saw regressions on specific hash types because the 64-bit compiler optimizations didn’t behave exactly as our hand-tuned 32-bit ASM did. We had to manually intervene to bring the speed back up. That takes time. A lot of time.
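
A per-plugin speed comparison of this kind can be sketched as follows. This is not our internal test harness: the function name, the 2% tolerance, and all the throughput numbers are made up for illustration.

```python
from typing import Dict


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.02,
) -> Dict[str, float]:
    """Return {plugin: relative change} for plugins slower than the baseline.

    Speeds are in passwords per second. A plugin is flagged when the new
    build is slower by more than `tolerance` (assumed here to be 2%).
    """
    regressions: Dict[str, float] = {}
    for name, old_speed in baseline.items():
        new_speed = candidate.get(name)
        if new_speed is None or old_speed <= 0:
            continue  # nothing to compare against
        change = (new_speed - old_speed) / old_speed
        if change < -tolerance:
            regressions[name] = change
    return regressions


# Made-up throughput numbers for three hypothetical plugins.
speeds_32 = {"format-a": 1000.0, "format-b": 500.0, "format-c": 200.0}
speeds_64 = {"format-a": 1100.0, "format-b": 450.0, "format-c": 199.0}

slow = find_regressions(speeds_32, speeds_64)
for name, change in slow.items():
    print(f"{name}: {change:+.1%}")
```

With these sample numbers only format-b is flagged: format-a got faster, and the half-percent drop in format-c falls within the tolerance that absorbs ordinary measurement noise.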

The result: ready for RTX 5090

The result of this refactoring is the new 64-bit build of Elcomsoft Distributed Password Recovery. By moving the code base to 64-bit, we have unlocked native support for the latest CUDA and the NVIDIA Blackwell architecture.

If you secure an NVIDIA GeForce RTX 5090 or 5080 for your lab, you can now utilize it fully with EDPR. The new code can communicate directly with the latest drivers, utilizing the massive parallel throughput of the new cards. This update also future-proofs the tool. With Intel and AMD compute frameworks also being 64-bit only, EDPR is now aligned with the entire GPU acceleration ecosystem.

We are currently running extensive benchmarks comparing the RTX 4090 vs the RTX 5080 and RTX 5090 on this new build. Initial results show significant gains in high-iteration formats like VeraCrypt and specialized hash types. We will publish those numbers in a follow-up post.

Finally, it is worth emphasizing that no matter how well we optimize our code or how powerful the hardware becomes – even with thousands of high-end GPUs at your disposal – certain modern encryption algorithms simply cannot be defeated by brute force alone. In these cases, a more targeted approach is required. This begins with forensic triage using tools like Elcomsoft Quick Triage, which can instantly aggregate saved credentials from a live system to find “low-hanging fruit.” Beyond that, successful recovery often depends on building a detailed suspect profile to move from “cold” attacks to “smart” ones. By leveraging personal data to create targeted dictionaries, applying complex masks, and using rule-based mutations, investigators can focus their computational power on the most likely password candidates rather than wasting cycles on a mathematically impossible search.
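
As a rough illustration of a targeted dictionary with rule-based mutations, consider the sketch below. The two rules (case variants and appended suffixes) and the sample profile data are assumptions chosen for brevity; they are not EDPR's rule syntax or engine.

```python
from itertools import product
from typing import Iterable, List


def build_dictionary(words: Iterable[str], suffixes: Iterable[str]) -> List[str]:
    """Combine profile words with suffixes, in lowercase and capitalized forms.

    Duplicates are dropped while the generation order is preserved.
    """
    seen = set()
    candidates: List[str] = []
    for word, suffix in product(list(words), list(suffixes)):
        for base in (word.lower(), word.capitalize()):
            candidate = base + suffix
            if candidate not in seen:
                seen.add(candidate)
                candidates.append(candidate)
    return candidates


# Hypothetical profile: a pet's name plus a meaningful year and a common symbol.
candidates = build_dictionary(["rex"], ["", "2019", "!"])
print(candidates)
# ['rex', 'Rex', 'rex2019', 'Rex2019', 'rex!', 'Rex!']
```

A handful of profile words and rules yields a list short enough to test in moments even against slow, high-iteration formats, whereas exhaustive search over the same password lengths would be out of reach.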

REFERENCES:

Elcomsoft Distributed Password Recovery

Build high-performance clusters for breaking passwords faster. Elcomsoft Distributed Password Recovery offers zero-overhead scalability and supports GPU acceleration for faster recovery. Serving forensic experts and government agencies, data recovery services and corporations, Elcomsoft Distributed Password Recovery is here to break the most complex passwords and strong encryption keys within realistic timeframes.

Elcomsoft Distributed Password Recovery official web page & downloads »