Microsoft Office encryption evolution: from Office 97 to Office 2019

October 31st, 2019 by Oleg Afonin
Category: «Cryptography», «GPU acceleration», «Software»

The first Microsoft Office product was announced back in 1988. Over the past thirty years, Microsoft Office has evolved from a simple text editor into a powerful combination of desktop apps and cloud services. With more than 1.2 billion users of the desktop Office suite and over 60 million users of the Office 365 cloud service, Microsoft Office apps are undoubtedly the most popular tools on the market. With its backward file format compatibility, Microsoft Office has become a de facto standard for document interchange.

Since Word 2.0, released in 1991, Microsoft has been using encryption to help users protect their content. While certain types of passwords (even in the latest versions of Office) can be broken in an instant, some passwords can be extremely tough to crack. In this article we’ll explain the differences between the many types of protection one can use in the different versions of Microsoft Office tools, and explore what it takes to break such protection.

Word 2.0 – 95, Excel 4.0 – 95

These versions of Microsoft Office apps employed a weak encryption algorithm based on the XOR operation. Even back in the day, this algorithm was closer to obfuscation than to encryption. All passwords for all versions of Word and Excel up to and including Office 95 can be recovered instantly; brute-force attacks are not required.
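
To see why XOR-based protection is closer to obfuscation, consider the minimal sketch below. It does not reproduce the actual Word or Excel algorithm; the repeating-key scheme, the key and the sample text are all assumptions chosen for illustration. The point is that XORing the ciphertext with a guessed piece of plaintext reveals the key directly, with no search at all.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key (encrypts and decrypts)."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

# Hypothetical key and document, for illustration only.
key = b"s3cret"
plaintext = b"Dear Sir or Madam, the quarterly report is attached."
ciphertext = xor_bytes(plaintext, key)

# An attacker who can guess the first few bytes of the document
# (here, a conventional salutation) recovers the key without any search.
known_prefix = b"Dear S"  # same length as the key
recovered_key = bytes(c ^ p for c, p in zip(ciphertext, known_prefix))

print(recovered_key)
print(xor_bytes(ciphertext, recovered_key).decode())
```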

Summary: instant recovery.

Office 97

Microsoft Office 97 apps were released with deliberately weak encryption, which was carried over to Office 2000. The artificial vulnerability existed due to US export restrictions. While native US versions of Microsoft Office could be configured to use somewhat stronger encryption, the setting was rarely enabled because of compatibility concerns.

Technically speaking, Office 97 apps made use of RC4 for encryption and MD5 for hashing. A 40-bit encryption key and a single iteration of MD5 hashing were used to protect information.
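
As a rough illustration of how small a 40-bit key is, here is a textbook RC4 implementation driven by a 5-byte key. The password is hypothetical and the key derivation is deliberately simplified (one MD5 pass truncated to 40 bits); the real Office 97 derivation is more involved, so treat this as a sketch of the building blocks rather than a compatible implementation.

```python
import hashlib

def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling followed by keystream generation."""
    # Key-scheduling algorithm (KSA)
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed with the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

# Simplified, assumed derivation: first 5 bytes (40 bits) of a single MD5 pass.
password = "hunter2"  # hypothetical password
key40 = hashlib.md5(password.encode("utf-16-le")).digest()[:5]

ciphertext = rc4(key40, b"Confidential")
# RC4 is symmetric: applying it again with the same key decrypts.
print(rc4(key40, ciphertext))
print(len(key40) * 8, "bit key;", 2 ** 40, "possible keys")
```

Whatever the password, an attacker only has to find the right one of the 2^40 possible keys, which is why the key rather than the password is the natural target.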

Back in the day, 40-bit encryption was considered weak enough to be cracked with a bunch of computers in reasonable time. Today, just a few days of straightforward brute forcing is all you need to break 40-bit encryption on an average consumer-grade Intel Core i7 CPU even without using video cards or smarter attacks.

However, brute forcing is not even needed to break 40-bit encryption. Back in the day we fully refactored all possible 40-bit keys to build Elcomsoft Thunder Tables™, an extension of the Rainbow Tables attack. Using Thunder Tables (and Elcomsoft Advanced Office Recovery), you can break all encrypted Word 97 documents and most (about 97%) Excel 97 spreadsheets in a matter of seconds.
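
Thunder Tables are proprietary and their internals are not described here. The sketch below only illustrates the classic rainbow table idea they extend, on a toy 12-bit key space with a stand-in hash function. All names, sizes and the reduction function are assumptions made for the example.

```python
import hashlib
from collections import defaultdict

KEY_BITS = 12            # toy key space; the real target is 40 bits
KEYSPACE = 1 << KEY_BITS
CHAIN_LEN = 20           # hash/reduce steps per chain
NUM_CHAINS = 300

def h(key: int) -> int:
    """Stand-in one-way function: 32 bits of MD5 over the key bytes."""
    return int.from_bytes(hashlib.md5(key.to_bytes(2, "big")).digest()[:4], "big")

def reduce_to_key(value: int, column: int) -> int:
    """Map a hash back into the key space; differs per column."""
    return (value + column) % KEYSPACE

def build_table() -> dict:
    """Store only chain endpoints, each mapped to its starting key(s)."""
    table = defaultdict(list)
    for start in range(NUM_CHAINS):
        key = start
        for column in range(CHAIN_LEN):
            key = reduce_to_key(h(key), column)
        table[key].append(start)
    return table

def lookup(table: dict, target: int):
    """Return a key whose hash equals target, or None if not covered."""
    for column in range(CHAIN_LEN - 1, -1, -1):
        # Assume target sits in this column and walk to the chain end.
        key = reduce_to_key(target, column)
        for c in range(column + 1, CHAIN_LEN):
            key = reduce_to_key(h(key), c)
        # Replay every chain with that endpoint to find the preimage.
        for start in table.get(key, []):
            candidate = start
            for c in range(CHAIN_LEN):
                if h(candidate) == target:
                    return candidate
                candidate = reduce_to_key(h(candidate), c)
    return None

table = build_table()
secret = reduce_to_key(h(5), 0)   # a key known to lie inside chain 5
found = lookup(table, h(secret))
print(f"stored {len(table)} endpoints; recovered key {found} for secret {secret}")
```

The trade-off is the usual one: the expensive chain computation is done once in advance, and each later lookup touches only a small number of stored endpoints. Keys that never appear in any chain are not covered, which is one plausible reason a table-based attack can fall short of 100%.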

It is important to note that the actual password is not needed to decrypt documents. Instead, we attack the binary encryption key. If, for any reason, you absolutely must recover the password, running a GPU-assisted attack offers speeds on the order of tens of millions of password combinations per second on a single computer.

Summary: instant decryption with Thunder Tables or extremely fast recovery of the original password.

Office XP and 2003 with default encryption

Although US export restrictions on cryptographic products were never fully lifted, they had become mostly nonexistent by 2000. However, Microsoft continued using weak encryption for several more years in Office XP and 2003. These versions of Microsoft Office use Office 97 encryption and hashing by default. As a result, most Office XP and Office 2003 documents can be decrypted in a matter of seconds with Elcomsoft Advanced Office Recovery using Thunder Tables.

Starting with Office XP, Microsoft enabled the use of external cryptographic service providers to increase key length. If an external cryptographic provider is used (which is rarely the case), one must attack the original plain-text password instead of recovering the 40-bit encryption key. Interestingly, the password recovery speed does not depend on the choice of a crypto provider or the selected key length; the time required to break the password depends exclusively on the password length and complexity. As a result, simple passwords can be recovered almost instantly, while an average 7-character alphanumeric password can take about a day and a half.

Even with external cryptographic providers, Office XP/2003 protection is still very weak. An average consumer-grade Intel Core i7 CPU delivers a speed of about 3 million passwords per second. A GPU can deliver about 200 times the speed of the CPU, making attacks extremely fast.
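
The relationship between password complexity and recovery time is simple arithmetic, sketched below. The function name is ours, and the 62-symbol alphabet (upper-case, lower-case, digits) is an assumption; the two rates come from the figures above. The result is a worst case for one fixed length, and it moves considerably with the assumed alphabet and hardware.

```python
import string

def worst_case_seconds(alphabet_size: int, length: int, rate: float) -> float:
    """Time to try every password of exactly `length` characters."""
    return alphabet_size ** length / rate

ALNUM = len(string.ascii_letters + string.digits)   # 62 symbols (assumed alphabet)
CPU_RATE = 3_000_000          # passwords per second, from the text
GPU_RATE = CPU_RATE * 200     # "about 200 times the speed of the CPU"

for label, rate in (("CPU", CPU_RATE), ("GPU", GPU_RATE)):
    days = worst_case_seconds(ALNUM, 7, rate) / 86_400
    print(f"{label}: {days:.2f} days to exhaust 7-character alphanumerics")
```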

Summary: instant decryption with Thunder Tables or extremely fast recovery of the original password for default encryption settings. Custom encryption requires attacking the original password; recovery speeds are extremely high at about 3 million passwords per second on a single CPU.

Office 2007: the cryptographic nightmare begins

Seven years after the US government eased cryptography export control, Microsoft finally moved away from the default 40-bit encryption. However, the new encryption scheme was not the only thing new with Office 2007.

Starting in 2007, the updated versions of Word and Excel (as well as many other Microsoft Office apps) introduced a new default save format. For Microsoft Word, the file extension was changed from DOC to DOCX. Excel spreadsheets are now saved as XLSX files instead of XLS. The new formats were not just “eXtended” or “eXtreme” versions of the original OLE-based file formats dating back to the 1990s. The extra “X” stands for the Office Open XML standard (not to be confused with the OpenOffice.org XML File Format). The history of Microsoft Office file formats is available in this article (a great read if you ask me). What’s important for us is that the new file formats were accompanied by a new encryption scheme.

Along with the new document format, Office 2007 uses massively improved encryption. Instead of 40-bit RC4 and a single MD5 hashing iteration, Microsoft employed the industry-standard AES-128 for encryption. 50,000 SHA-1 iterations for hashing signify the departure from insecure single-iteration hashes. The use of a long encryption key makes attacks on the key itself unfeasible. Instead, we must recover the original plain-text password, which in turn means that we must run a brute-force attack (or a smart attack based on a dictionary).
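
The effect of iterated hashing can be shown with a short sketch. This is a simplified illustration of key stretching, not the exact derivation used by Office 2007; the function name, the all-zero salt and the way the round counter is mixed in are assumptions made for the example. What matters is that every password guess now costs tens of thousands of hash computations instead of one.

```python
import hashlib
import time

def stretch(password: str, salt: bytes, iterations: int) -> bytes:
    """Simplified key stretching: hash the salted password, then re-hash
    the digest `iterations` times, mixing in the round counter."""
    digest = hashlib.sha1(salt + password.encode("utf-16-le")).digest()
    for i in range(iterations):
        digest = hashlib.sha1(i.to_bytes(4, "little") + digest).digest()
    return digest

salt = bytes(16)  # fixed all-zero salt, for a reproducible demo only
for rounds in (1, 50_000):
    started = time.perf_counter()
    key_material = stretch("correct horse", salt, rounds)
    elapsed = time.perf_counter() - started
    print(f"{rounds:>6} rounds: {elapsed * 1000:.2f} ms per guess, "
          f"{len(key_material)} bytes of key material")
```

The legitimate user pays this cost once when opening the document; an attacker pays it for every candidate password.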

Even today, documents encrypted with Office 2007 (and not saved in compatibility mode) are moderately secure. A typical Intel Core i7 CPU provides a recovery speed of about 1,000 passwords per second, while GPU-assisted attacks result in about 200,000 passwords per second using a single NVIDIA Tesla V100. We recommend smart dictionary-based attacks to recover the password.

Summary: new file format and completely new encryption. Instant decryption for “password to open” no longer possible. Reasonably fast attacks; brute force can be used to recover relatively short and simple passwords. More complex passwords require smart dictionary attacks.

Office 2010: twice as secure

In Office 2010, Microsoft continued with the encryption scheme they introduced in Office 2007. The new Office still uses AES-128 for encryption, and still relies on SHA-1 for hashing. However, the number of hash iterations was doubled from 50,000 (Office 2007) to 100,000 (Office 2010). This was done to account for the evolution of hardware and to keep passwords at least as secure as they were three years earlier.

On today’s hardware we can try about 500 password combinations per second on a single Intel Core i7 CPU. If we use a GPU, this number goes up to about 100,000 passwords per second (NVIDIA Tesla V100). This speed is about average among the other data formats. At this point, straightforward brute force attacks become unfeasible for all but the shortest passwords; a smart dictionary attack is highly recommended.
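
A “smart” dictionary attack tries likely passwords first: real words plus the small changes people typically make to them. The sketch below shows the idea with a handful of mutation rules. The rules, suffixes and function name are hypothetical and far simpler than what a production tool would apply.

```python
from itertools import product

LEET = {"a": "@", "o": "0", "e": "3", "s": "$"}   # hypothetical substitution rules
SUFFIXES = ("", "1", "123", "!", "2019")            # hypothetical common suffixes

def mutate(word: str):
    """Yield plausible variants of one dictionary word, without duplicates."""
    leet = "".join(LEET.get(ch, ch) for ch in word.lower())
    bases = {word.lower(), word.capitalize(), word.upper(), leet}
    seen = set()
    for base, suffix in product(sorted(bases), SUFFIXES):
        candidate = base + suffix
        if candidate not in seen:
            seen.add(candidate)
            yield candidate

candidates = list(mutate("password"))
print(len(candidates), "candidates, e.g.", candidates[:5])
```

A few dozen variants per word over a dictionary of a few hundred thousand words is a search space that fits comfortably within the speeds quoted above, whereas exhaustive search over the same password lengths does not.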

Summary: twice as strong as Office 2007. Medium speed attacks; brute force can be used to recover some very short and simple passwords. More complex passwords require smart dictionary attacks.

Office 2013, 2016, 2019

Since Office 2013, Microsoft has continuously increased the strength of encryption (the documents are still backward compatible with earlier versions of Microsoft Office). A new encryption method (AES-256) and a new hashing algorithm (SHA-512) were introduced in Office 2013.

Office 2013, 2016 and 2019 employ AES-128 or AES-256 for encryption and 100,000 SHA-512 iterations for hashing. This resulted in a dramatic loss of performance for all password recovery tools. We can try about 50 passwords per second on a single Intel Core i7. Using a GPU (NVIDIA Tesla V100) accelerates the recovery speed to about 20,000 passwords per second. This low speed rules out pure brute force attacks for all but the simplest passwords (1 to 3 characters long). For all passwords that are longer than that, smart dictionary attacks are the only viable recovery method. The use of a GPU is absolutely required. For breaking long and complex passwords we strongly recommend running dictionary-based attacks on a distributed network (Elcomsoft Distributed Password Recovery).
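
To put the speeds into perspective, the sketch below works backwards from a time budget to the longest password length that can be exhausted by pure brute force. The speeds are the approximate figures quoted in this article; the 95-character printable ASCII alphabet, the one-day budget and the function name are assumptions, and the answer shifts with each of them.

```python
PRINTABLE = 95  # printable ASCII characters, an assumed alphabet

# Approximate speeds quoted in the text (passwords per second).
SPEEDS = {
    "Office 2007, GPU": 200_000,
    "Office 2010, GPU": 100_000,
    "Office 2013+, CPU": 50,
    "Office 2013+, GPU": 20_000,
}

def max_length(rate: float, budget_seconds: float, alphabet: int = PRINTABLE) -> int:
    """Longest length L such that all passwords of 1..L characters
    can be tried within the time budget."""
    length, total = 0, 0
    while total + alphabet ** (length + 1) <= rate * budget_seconds:
        length += 1
        total += alphabet ** length
    return length

DAY = 86_400
for name, rate in SPEEDS.items():
    print(f"{name}: up to {max_length(rate, DAY)} characters in one day")
```

Because every extra character multiplies the work by the alphabet size, even large gains in hardware speed buy only one or two additional characters.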

Summary: five times stronger compared to Office 2010. Low speed attacks. GPU-assisted brute force attacks can be used to recover passwords of up to 3 characters. An average password requires smart dictionary attacks, while long and complex passwords must be attacked with a distributed network.

Instant Password Recovery

Different tools are required to break documents in the old DOC/XLS format and the new DOCX/XLSX format.

Advanced Office Password Breaker is designed to quickly remove password protection from documents in the old DOC and XLS formats regardless of the version of Microsoft Office that was used to save the files. These file formats were used by default in Microsoft Word 97/2000 and Excel 97/2000. They remained the default format in Microsoft Office XP and 2003 (near-instant recovery only possible with default encryption settings).

Instead of recovering the original password, the tool targets the 40-bit encryption key protecting the documents using Elcomsoft’s patented Thunder Tables. This allows achieving a 97% password recovery rate for Excel spreadsheets and a 100% recovery guarantee for Microsoft Word 97/2000 documents.

Distributed Attacks

The new encryption standard used in Microsoft Office 2007 and above made instant recovery impossible (unless the document was saved in the old format in compatibility mode). For all versions of Microsoft Office using the Office Open XML format (files with DOCX and XLSX extensions), you’ll have to attack the password instead of the encryption key.

Modern versions of Microsoft Office feature secure encryption making pure brute-force attacks unfeasible. We recommend using GPU-assisted smart dictionary attacks on a distributed network. Elcomsoft Distributed Password Recovery does exactly that, allowing you to break complex passwords and unlock documents in a production environment using multiple computers equipped with up to 8 GPUs each. You can also use Distributed Password Recovery if you wish to recover the original password protecting the older DOC/XLS documents; in this case, the recovery speed will be extremely fast.
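
Password attacks distribute well because candidates can be tested independently. The generic sketch below splits a numbered candidate space into contiguous slices, one per worker. It is not a description of how Elcomsoft Distributed Password Recovery schedules work; the function names and the index-to-password encoding are assumptions made for illustration.

```python
def split_range(total: int, workers: int):
    """Divide candidate indices 0..total-1 into near-equal contiguous
    (start, stop) slices, one per worker."""
    base, extra = divmod(total, workers)
    slices, start = [], 0
    for w in range(workers):
        stop = start + base + (1 if w < extra else 0)
        slices.append((start, stop))
        start = stop
    return slices

def index_to_password(index: int, alphabet: str, length: int) -> str:
    """Decode an index into the fixed-length password it denotes."""
    chars = []
    for _ in range(length):
        index, remainder = divmod(index, len(alphabet))
        chars.append(alphabet[remainder])
    return "".join(reversed(chars))

alphabet, length = "abc", 2          # tiny toy space: 9 candidates
total = len(alphabet) ** length
for worker, (start, stop) in enumerate(split_range(total, 4)):
    batch = [index_to_password(i, alphabet, length) for i in range(start, stop)]
    print(f"worker {worker}: {batch}")
```

Each worker needs only its two slice boundaries, so no candidate lists have to be sent over the network.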


REFERENCES:

Elcomsoft Distributed Password Recovery

Build high-performance clusters for breaking passwords faster. Elcomsoft Distributed Password Recovery offers zero-overhead scalability and supports GPU acceleration for faster recovery. Serving forensic experts and government agencies, data recovery services and corporations, Elcomsoft Distributed Password Recovery is here to break the most complex passwords and strong encryption keys within realistic timeframes.
