An Intro to Using Hashes to Check File Integrity

By: Brett S. | Cyber Threat Intelligence Analyst, NJCCIC

Disclaimer: If technical jargon makes you queasy, proceed with caution!

When downloading new software or updating existing software, how do you ensure that what you are installing is safe, unaltered, and from a reputable source? The simple answer is to compare the checksum of the file you downloaded to the hash published by the original source.

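We realize that for most people, checksum and hash are akin to a foreign language. For starters, a hash, often called a checksum in this context, is a fixed-length sequence of letters and numbers generated when a file is run through a hashing algorithm. The same file always produces the same hash, so two files with different hashes cannot be identical.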

Two recent examples demonstrate the threat posed by unverified files and the importance of confirming the validity of downloads or software updates:

The Common Hashes

One of the most widely used hash algorithms is MD5, short for Message-Digest Algorithm 5. MD5 produces a 128-bit value that is typically represented as a sequence of 32 hexadecimal digits. An MD5 hash is comparable to a fingerprint: if one byte in the original file is changed, the output hash value will be completely different. SHA-1, which stands for Secure Hash Algorithm 1, works in a similar way, and the two are often published side by side. Their outputs are not interchangeable, however: SHA-1 produces a 160-bit hash value that is typically rendered as a 40-digit hexadecimal sequence. SHA-1 was designed by the US National Security Agency and is specified in a Federal Information Processing Standard published by the National Institute of Standards and Technology.
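To make the fingerprint comparison concrete, here is a minimal Python sketch using the standard library's hashlib module. The helper name hex_digest and the two sample sentences are illustrative assumptions, not part of the original post. It hashes two inputs that differ by a single byte and prints the resulting MD5 and SHA-1 values so the lengths and the difference can be seen side by side.

```python
import hashlib


def hex_digest(algorithm: str, data: bytes) -> str:
    """Return the hexadecimal digest of `data` under the named algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


# Two sample inputs that differ by exactly one byte ("dog" vs. "cog").
original = b"The quick brown fox jumps over the lazy dog"
altered = b"The quick brown fox jumps over the lazy cog"

for name in ("md5", "sha1"):
    before = hex_digest(name, original)
    after = hex_digest(name, altered)
    print(f"{name}: {len(before)} hexadecimal digits")
    print(f"  original: {before}")
    print(f"  altered:  {after}")
```

Running this should show 32 digits for MD5 and 40 for SHA-1, with the altered input producing a value that bears no obvious resemblance to the original one.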

Example of MD5 hash:


New Hash on the Block

SHA-256 is a newer member of the Secure Hash Algorithm family and is considered a stronger option for checking file integrity. SHA-256 produces a 256-bit value, compared to 128 bits for MD5 and 160 bits for SHA-1, and is typically rendered as a 64-digit hexadecimal sequence. The larger output exponentially increases the number of possible hash values and makes a collision, in which two different files produce the same hash, far less likely. Collisions are a known weakness of MD5 and SHA-1, and researchers have demonstrated practical collisions against both. Below you can see the increased length of a SHA-256 hash when this blog post is loaded into a SHA-256 generator.
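The size difference can be checked directly. The following is a small Python sketch, again assuming the standard hashlib module; the helper name digest_bits is hypothetical. It reports the output size of each algorithm in bits and in hexadecimal digits.

```python
import hashlib


def digest_bits(algorithm: str) -> int:
    """Return the output size of the named hash algorithm in bits."""
    return hashlib.new(algorithm).digest_size * 8


for name in ("md5", "sha1", "sha256"):
    bits = digest_bits(name)
    # Each hexadecimal digit encodes 4 bits, so the rendered length is bits / 4.
    print(f"{name}: {bits} bits, {bits // 4} hexadecimal digits, 2**{bits} possible values")

# How many times larger the SHA-256 output space is than the MD5 output space.
ratio = 2 ** digest_bits("sha256") // 2 ** digest_bits("md5")
print(f"SHA-256 has 2**{ratio.bit_length() - 1} times as many possible values as MD5")
```

Doubling the bit length does not merely double the number of possible hashes; it squares it, which is why the jump from 128 to 256 bits matters so much for collision resistance.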

Example of SHA-256 hash:


Now that you have a general understanding of hashes, what next?

There are plenty of reputable hash generators available for free or for purchase, as a download, browser extension, or online resource; querying for ‘hash generator’ using a search engine should yield several options. After you identify a tool that best fits your needs, you can use it to validate your files. Once you know what file you want to download, look for the hash published by the software's developer, typically on the official download page. A search engine query for the file or software name + "file hash" should also turn it up, if one exists. Use caution with third-party sites that seem to be off the beaten path, as the authenticity of a hash posted there is difficult to verify. After you locate the published hash and download the file in question, launch the hash generator. Load the file you want to verify, run it through the generator using the same algorithm as the published hash, and then compare the result to the published hash. If the two match exactly, the file you downloaded is identical to the one the publisher hashed, meaning it was not altered or corrupted along the way. Keep in mind that a match is only as trustworthy as the source of the published hash.
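For readers comfortable with a little scripting, the comparison step can also be done without a separate tool. The following is a minimal Python sketch, assuming the publisher provides a SHA-256 hash; the function names sha256_of_file and matches_published_hash are hypothetical and chosen only for illustration.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hash of a file, reading it in chunks
    so that large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Return True if the file's SHA-256 hash equals the published value.

    Surrounding whitespace and letter case are ignored, since publishers
    sometimes list hashes in uppercase.
    """
    return sha256_of_file(path) == published_hex.strip().lower()
```

As a usage example, calling matches_published_hash("installer.exe", published_value) would return True only when the downloaded file hashes to the value copied from the publisher's site. Both the file name and the variable here are placeholders.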

I hope this brief tutorial provided you with a basic understanding of how to check the integrity of a file and, more importantly, put another tool in your arsenal to ensure that your computer or network remains safe and secure. Understanding how to validate hashes can mitigate the threat of malware exposure.