Friday, August 2, 2013

Password Hashing in .NET: The Right Way

Far too many times I see websites, including certain popular social media sites, that do not store passwords in a safe and secure manner. When you create a new account with a website and you get that confirmation e-mail that includes your login name and password, that is bad. Bad website! Bad!

Your password should NEVER be stored in a form that can be retrieved and sent to you in plain text, ever. This means it should never be encrypted in a form that can be decrypted. The sad reality of web security is that nothing is truly secure or uncrackable. If it can be done, it can be undone. A password should be stored in a hashed format, that is, run through a one-way function that cannot be reversed.

Simply hashing a password is not enough, though. The hash should be large enough that it is not susceptible to collision attacks. A collision attack, as the term is used in this article, is a brute-force attack on a hash value in which the attacker tries to find an input that produces the same hash value when hashed (strictly speaking, this is a preimage attack). The hash should also be computed with a unique, cryptographically random value, referred to as a "salt". Unfortunately, even with all these steps, an attacker can still attempt to brute-force a hash collision. While we can't completely stop this from happening, we can slow the rate at which an attack can be made. In fact, we can slow it down to a crawl, effectively mitigating the attack.

In this article, I will show you all of these steps. We'll briefly touch on hash algorithms and which are acceptable for password hashing, as well as those which are not. We'll cover salt use and briefly touch on entropy. We'll also cover iterative hashing, password-based key derivation functions, and validating hash values. Finally, we will create a fully-functional, production-ready example making use of all these features.

There are several hash algorithms that ship with the .NET framework out of the box and each has its own ideal scenarios for use. Only certain algorithms should even be considered for hashing passwords, though, since we need to ensure that our password hashes are large enough that a collision attack would not be very effective. Below are some of the common hash algorithms and some details on each.

Name   Password-Acceptable   Details
MD5    no                    Produces 128-bit hash values, which are not collision-resistant
SHA1   no                    Produces 160-bit hash values, which have been found to be susceptible to attack
SHA2   yes                   Produces 224-bit, 256-bit, 384-bit or 512-bit hash values, for which no practical collision attacks are known as of yet
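
If you want to check the sizes in the table yourself, here is a minimal sketch that asks each algorithm for its digest length. The DigestSizes class and BitsOf helper are names I made up for illustration; the framework calls are the standard Create() factories and the HashSize property. SHA-224 is left out because the framework does not ship an implementation of it.

```csharp
using System;
using System.Security.Cryptography;

public static class DigestSizes
{
    // Hypothetical helper: returns the digest length, in bits,
    // as reported by the algorithm instance, then disposes it.
    public static int BitsOf(HashAlgorithm algorithm)
    {
        using (algorithm)
        {
            return algorithm.HashSize;
        }
    }

    public static void Main()
    {
        Console.WriteLine("MD5:     " + BitsOf(MD5.Create()));    // 128
        Console.WriteLine("SHA-1:   " + BitsOf(SHA1.Create()));   // 160
        Console.WriteLine("SHA-256: " + BitsOf(SHA256.Create())); // 256
        Console.WriteLine("SHA-384: " + BitsOf(SHA384.Create())); // 384
        Console.WriteLine("SHA-512: " + BitsOf(SHA512.Create())); // 512
    }
}
```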

In the .NET framework, these hash algorithms have a few different implementations: CryptoServiceProvider versions and Managed versions. The CryptoServiceProvider implementations wrap the operating system's native cryptography provider to compute the hash, while the Managed implementations are written entirely in managed code. The practical difference is that the CryptoServiceProvider versions rely on FIPS-validated system libraries and are therefore approved for use in government software, whereas the Managed versions are not FIPS-validated and are therefore not approved for such use.

There is one more algorithm worth mentioning here, the one we will be using in our example: Rfc2898DeriveBytes. This class implements PBKDF2, the password-based key derivation function defined in RFC 2898, and provides multiple benefits. It accepts variable-length input, implements iterative hashing and uses key stretching. It also supports salting and is very strong against password-guessing attacks if used properly.
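
To make that concrete, here is a minimal sketch of calling Rfc2898DeriveBytes directly. The Pbkdf2Demo class, the Derive helper, the fixed demo salts and the iteration count are all assumptions made for illustration (real salts must be random). It only shows that the same password, salt and iteration count always give the same key, while a different salt gives a different key.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class Pbkdf2Demo
{
    // Hypothetical helper: derives keyBytes bytes from the password using PBKDF2.
    // This constructor overload uses HMAC-SHA1 internally.
    public static byte[] Derive(string password, byte[] salt, int iterations, int keyBytes)
    {
        byte[] passwordBytes = Encoding.UTF8.GetBytes(password);
        using (var pbkdf2 = new Rfc2898DeriveBytes(passwordBytes, salt, iterations))
        {
            return pbkdf2.GetBytes(keyBytes);
        }
    }

    public static void Main()
    {
        // Fixed salts for demonstration only (the class requires at least 8 salt bytes).
        byte[] saltA = { 1, 2, 3, 4, 5, 6, 7, 8 };
        byte[] saltB = { 8, 7, 6, 5, 4, 3, 2, 1 };

        string first = Convert.ToBase64String(Derive("secret", saltA, 1000, 20));
        string again = Convert.ToBase64String(Derive("secret", saltA, 1000, 20));
        string other = Convert.ToBase64String(Derive("secret", saltB, 1000, 20));

        Console.WriteLine(first == again); // True: same inputs, same key
        Console.WriteLine(first == other); // False: different salt, different key
    }
}
```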

A password salt is a value used in addition to the password/passphrase to generate the hash value. It's like a password for a password. In the context of password storage, a salt value should be unique and not reused across multiple passwords. Each password should have its own unique salt value. A common reason otherwise secure authentication systems are breached is a static salt value, that is, the same salt used for all passwords. In addition to being unique, a salt value should have a strong amount of entropy. Entropy is a measurement of randomness. For password storage, you should generate salt values with at least 128 random bits. In our example, we will be using the RNGCryptoServiceProvider to generate cryptographically secure salt values of 128 bytes (1,024 bits), well above that minimum.
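
As a quick illustration of salt generation on its own, here is a small sketch. The SaltDemo class and NewSalt helper are hypothetical names, and I've swapped in the RandomNumberGenerator.Create() factory in place of constructing RNGCryptoServiceProvider directly; both hand back a cryptographically secure generator.

```csharp
using System;
using System.Security.Cryptography;

public static class SaltDemo
{
    // Hypothetical helper: returns sizeInBytes cryptographically random bytes.
    public static byte[] NewSalt(int sizeInBytes)
    {
        byte[] salt = new byte[sizeInBytes];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        return salt;
    }

    public static void Main()
    {
        // 16 bytes = 128 bits, the minimum suggested above.
        byte[] first = NewSalt(16);
        byte[] second = NewSalt(16);

        // Each call yields a fresh value, so every password gets its own salt.
        Console.WriteLine(Convert.ToBase64String(first));
        Console.WriteLine(Convert.ToBase64String(second));
    }
}
```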

Before we get started, I'd like to talk about the Rfc2898DeriveBytes algorithm we will be using. As I mentioned before, this performs iterative hashing, key stretching and allows for variable length key input. Sounds great but what does that all mean?

Variable key length means you can input any length of bytes and it will hash it, as opposed to some other algorithms where you must input data of a specific size. Variable length is ideal for passwords since passwords vary in length (or should).

Key stretching is a method of strengthening the input value by running it through a deliberately expensive computation, so that each guess costs an attacker far more than a single hash would. In this way, if a user creates a relatively weak password, it still gets some extra protection. Stretching does not add real randomness to a weak password, though, so it is no substitute for a good one. This matters because weak passwords are the first to fall when a user store is attacked. Unique salting helps here too, since it forces the attacker to attack each password separately.

Iterative hashing means the data is hashed repeatedly in a loop. This is also part of how key stretching takes place. By doing this you accomplish two things: the key is strengthened, and the speed at which a collision attack can take place is decreased drastically. Ideally, you want the number of iterations to be high enough to slow an attacker down but not so high that it affects the user experience. The suggested minimum value is 1,000 iterations; however, many implementations use much more than this, sometimes more than 10,000. The number that will be suitable without being noticeable to users depends on the hardware it is running on.
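
One way to pick a number for your own hardware is simply to time it. The sketch below is only a rough measurement under my own assumptions: the IterationTiming class, the MeasureMilliseconds helper, the test password and the iteration counts are all made up for illustration, and the results will vary from machine to machine.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Text;

public static class IterationTiming
{
    // Hypothetical helper: milliseconds taken to derive one 20-byte key
    // (a single HMAC-SHA1 output block) at the given iteration count.
    public static long MeasureMilliseconds(int iterations)
    {
        byte[] password = Encoding.UTF8.GetBytes("timing test password");
        byte[] salt = new byte[16]; // all-zero salt: fine for timing, never for storage

        Stopwatch watch = Stopwatch.StartNew();
        using (var pbkdf2 = new Rfc2898DeriveBytes(password, salt, iterations))
        {
            pbkdf2.GetBytes(20);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        foreach (int iterations in new[] { 1000, 10000, 100000 })
        {
            Console.WriteLine(iterations + " iterations: " + MeasureMilliseconds(iterations) + " ms");
        }
    }
}
```

Run it on the hardware that will actually serve logins, then pick the highest count that still feels instant to a user.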

Let's get started creating our password hashing class. We'll call the class "PBKDF2Managed" (in accordance with the Microsoft cryptography naming conventions).

namespace System.Security.Cryptography
{
    public sealed class PBKDF2Managed : IDisposable
    {

    }
}

We'll be implementing the IDisposable interface to allow our class to be used in using (...) statements, although this isn't necessary as there are no unmanaged resources that must be disposed. This is simply a design preference in this case.

Next we're going to add some constants and fields to our class:

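        // all sizes are in bytes
        const int SIGNATURE_SIZE = 4;   // the iteration count, stored as a 32-bit integer
        const int SALT_SIZE = 128;      // 128 bytes = 1,024 bits
        const int KEY_SIZE = 256;       // 256 bytes = 2,048 bits
        const int HASH_SIZE = 1 + SIGNATURE_SIZE + SALT_SIZE + KEY_SIZE; // validation byte + signature + salt + key = 389 bytes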

        int _iterations;
        byte[] _signature;
        byte[] _salt;
        byte[] _key;

The constants identify the fixed sizes, in bytes, of the various components of our hashed values. For performance reasons, we make them constant since their values will not be changing.

Next, we will add the following methods to the class:

        public byte[] ComputeHash(byte[] password)
        {
            if (password == null)
            {
                throw new ArgumentNullException("password");
            }
            using (Rfc2898DeriveBytes _pbkdf2 = new Rfc2898DeriveBytes(password, _salt, _iterations))
            {
                _key = _pbkdf2.GetBytes(KEY_SIZE);
            }
            byte[] _hash = new byte[HASH_SIZE];
            Buffer.BlockCopy(_signature, 0, _hash, 1, SIGNATURE_SIZE);
            Buffer.BlockCopy(_salt, 0, _hash, 1 + SIGNATURE_SIZE, SALT_SIZE);
            Buffer.BlockCopy(_key, 0, _hash, 1 + SIGNATURE_SIZE + SALT_SIZE, KEY_SIZE);
            return _hash;
        }

        public void Dispose()
        {
            // clean up any unmanaged resources
        }

The Dispose() method is there simply to satisfy the IDisposable interface.

The ComputeHash method is what actually creates our hash values. First we check to see if the password byte array supplied is null, and throw an ArgumentNullException if it is. We don't want to try to hash a non-existent value. You may have also noticed that the method accepts an array of bytes as the password input and not a string. We do this so that the method is "encoding-agnostic", so to speak. Instances of this class never have to worry about what encoding the input was in; they simply handle the bytes of the data. This allows us to handle encoding issues within the caller, which also makes this class localization-friendly.
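
To see why the encoding choice belongs to the caller, here is a tiny sketch showing that the same string turns into different bytes (and would therefore hash differently) depending on the encoding. The EncodingDemo class, the ByteCount helper and the sample password are assumptions for illustration.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: number of bytes the text occupies in the given encoding.
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetBytes(text).Length;
    }

    public static void Main()
    {
        // "p\u00E4ssword" is "password" with an a-umlaut as its second character.
        string password = "p\u00E4ssword";

        Console.WriteLine(ByteCount(password, Encoding.UTF8));    // 9  (the umlaut takes 2 bytes)
        Console.WriteLine(ByteCount(password, Encoding.Unicode)); // 16 (UTF-16, 2 bytes per character)
    }
}
```

Whichever encoding you choose, use the same one when creating the hash and when validating against it.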

Next we create an Rfc2898DeriveBytes instance in a using statement, supplying the password bytes, the salt, and the number of iterations to be performed. Within the using statement we compute the hash value of the data and assign it to the _key field. Don't be concerned about the initialization of the _signature, _key, _salt, and _iterations fields just yet. We take care of that in the constructors, which we will be creating in the next step. Just know that this instance does all the heavy lifting for us and performs our iterative, key-stretching hash on our input using the values we provide it.

In the final stages of the ComputeHash() method, we format and finalize the output of our hashed value. We create a new byte array called _hash, which will hold the completed hash value returned to the caller. We use Buffer.BlockCopy() to write to this array, as it is our best performance option here aside from using pointers.

First, the _signature field's bytes are written to the hash array. Notice we are leaving the first byte of the hash array blank. We will use it as a validation step when we work with our computed hashes later. The _signature byte array is an array of 4 bytes that represents the number of iterations used to compute the hash. We need it when we recreate the hash to validate password input against a stored password hash.

Next, the _salt is written to the array immediately after the signature. Like the signature, the salt is embedded in the hash so we can retrieve it later when validating user password input against stored password hashes. The reason we store the salt in the hash value itself is that it can be passed anywhere it is needed without being stored separately in a database. Databases can be compromised, and if an attacker can simply look up the salt for a specific password, they already know which salt to use. Doing it our way, they will have to work out which bytes in the array represent the salt value, assuming they even know it is embedded at all. Treat this as a small extra hurdle rather than a real defense: a salt does not need to be secret, and the protection comes from its uniqueness and from the iteration count.

Finally, in the last step, the computed key is written to the hash byte array immediately after the salt value. Our hash value now contains a blank validation byte, a 4-byte iteration signature, a unique 128-byte salt, and a 256-byte derived key, giving us a 389-byte hash value in total.
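
The layout is easier to follow with the offsets written out. This is just a sketch of the arithmetic: the HashLayout class, its offset constants and the Slice helper are hypothetical names of mine, with the sizes copied from the constants in our class.

```csharp
using System;

public static class HashLayout
{
    // Sizes in bytes, copied from the article's constants.
    public const int SignatureSize = 4;
    public const int SaltSize = 128;
    public const int KeySize = 256;

    // Offsets into the final hash array; byte 0 is the blank validation byte.
    public const int SignatureOffset = 1;
    public const int SaltOffset = SignatureOffset + SignatureSize; // 5
    public const int KeyOffset = SaltOffset + SaltSize;            // 133
    public const int HashSize = KeyOffset + KeySize;               // 389

    // Hypothetical helper: copies count bytes starting at offset into a new array.
    public static byte[] Slice(byte[] source, int offset, int count)
    {
        byte[] part = new byte[count];
        Buffer.BlockCopy(source, offset, part, 0, count);
        return part;
    }

    public static void Main()
    {
        byte[] hash = new byte[HashSize];

        // Write an iteration count of 1000 where the signature lives, then read it back.
        Buffer.BlockCopy(BitConverter.GetBytes(1000), 0, hash, SignatureOffset, SignatureSize);

        Console.WriteLine(BitConverter.ToInt32(hash, SignatureOffset));   // 1000
        Console.WriteLine(Slice(hash, SaltOffset, SaltSize).Length);      // 128
        Console.WriteLine(HashSize);                                      // 389
    }
}
```

One thing to keep in mind: BitConverter uses the byte order of the machine it runs on, so hashes written on one architecture should be read back on one with the same byte order.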

Next we create the constructors for our class:

        public PBKDF2Managed(int iterations)
        {
            if (iterations < 1)
            {
                throw new CryptographicException("Invalid iteration value.");
            }
            _iterations = iterations;
            _signature = BitConverter.GetBytes(_iterations);
            _salt = new byte[SALT_SIZE];
            using (RNGCryptoServiceProvider _rng = new RNGCryptoServiceProvider())
            {
                _rng.GetBytes(_salt);
            }
        }

        public PBKDF2Managed(byte[] salt, int iterations)
        {
            if (iterations < 1)
            {
                throw new CryptographicException("Invalid iteration value.");
            }
            if (salt == null)
            {
                throw new CryptographicException("Invalid salt value.");
            }
            if (salt.Length != SALT_SIZE)
            {
                throw new CryptographicException("Invalid salt value.");
            }
            _iterations = iterations;
            _signature = BitConverter.GetBytes(_iterations);
            _salt = salt;
        }

        public PBKDF2Managed(byte[] hash)
        {
            if (hash == null)
            {
                throw new CryptographicException("Invalid hash value.");
            }
            if (hash.Length != HASH_SIZE)
            {
                throw new CryptographicException("Invalid hash value.");
            }
            if (hash[0] != 0x00)
            {
                throw new CryptographicException("Invalid hash value.");
            }
            _signature = new byte[SIGNATURE_SIZE];
            Buffer.BlockCopy(hash, 1, _signature, 0, SIGNATURE_SIZE);
            _iterations = BitConverter.ToInt32(_signature, 0);
            if (_iterations < 1)
            {
                throw new CryptographicException("Invalid hash value.");
            }
            _salt = new byte[SALT_SIZE];
            Buffer.BlockCopy(hash, 1 + SIGNATURE_SIZE, _salt, 0, SALT_SIZE);
        }

We have created three different constructors, and each one allows instances of our class to serve a different purpose, as follows:

The first constructor is used when we will be creating a password hash for a user. It accepts a 32-bit integer as its input parameter, which tells the algorithm how many iterations are to be performed when hashing. Within the body of the constructor, the _signature byte array is assigned the byte value of the iteration count. Next, a secure, unique 128-byte salt is generated using the RNGCryptoServiceProvider I mentioned earlier. Also, note how the iterations parameter is validated to ensure the value is greater than or equal to 1. We don't want to iterate 0 times or -256 times, etc. Also note the type of exceptions we throw and the messages we use in them. They are deliberately vague and non-descriptive. Attackers often find their way in through useful information found in overly informative exception messages. We don't want to give a potential attacker anything to use to their advantage.

The second constructor, like the first, will be used when creating a password hash for a user. However, this constructor uses the salt provided in the input parameter for computing the hash value. It validates that the salt passed to it is not null and is the correct length; otherwise it throws a non-descriptive error.

The third and final constructor is used when validating user input against a stored password hash. It accepts a hash value as its input parameter, which is first validated to make sure that:

  1. the hash is not null
  2. the hash is of the correct length
  3. the first byte of the hash (the validation byte) is blank

Next, it reads the signature from the hash and assigns it to the _signature field. It then converts that value to a System.Int32 value, assigns it to the _iterations field, and checks to make sure it is a valid value. If not, it throws a non-descriptive error. Finally, it reads the salt value from the hash and assigns it to the _salt field. If we call ComputeHash(...) on an instance created with this constructor, it will use the iterations and salt value that were used to create the stored hash to create a new hash. Comparing the newly created password hash value against the stored password hash value for equality allows us to determine whether the password input is the correct password.
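
When comparing the two hash values, it is worth doing so in a way that doesn't stop at the first mismatched byte, so the time taken doesn't hint at how much of the hash matched. Here is a minimal sketch of such a comparison. The HashComparison class and FixedTimeEquals method are my own illustrative names and are not part of the class we built; newer versions of .NET also ship a built-in CryptographicOperations.FixedTimeEquals for the same purpose.

```csharp
using System;

public static class HashComparison
{
    // Hypothetical helper: compares two byte arrays without returning early
    // at the first mismatch. Every byte is examined when the lengths match.
    public static bool FixedTimeEquals(byte[] left, byte[] right)
    {
        if (left == null || right == null || left.Length != right.Length)
        {
            return false;
        }

        int difference = 0;
        for (int i = 0; i < left.Length; i++)
        {
            // Accumulate any differing bits instead of branching on them.
            difference |= left[i] ^ right[i];
        }
        return difference == 0;
    }

    public static void Main()
    {
        byte[] stored = { 1, 2, 3 };
        Console.WriteLine(FixedTimeEquals(stored, new byte[] { 1, 2, 3 })); // True
        Console.WriteLine(FixedTimeEquals(stored, new byte[] { 1, 2, 4 })); // False
    }
}
```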

Here is the full example:

namespace System.Security.Cryptography
{
    public sealed class PBKDF2Managed : IDisposable
    {
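        // all sizes are in bytes
        const int SIGNATURE_SIZE = 4;   // the iteration count, stored as a 32-bit integer
        const int SALT_SIZE = 128;      // 128 bytes = 1,024 bits
        const int KEY_SIZE = 256;       // 256 bytes = 2,048 bits
        const int HASH_SIZE = 1 + SIGNATURE_SIZE + SALT_SIZE + KEY_SIZE; // validation byte + signature + salt + key = 389 bytes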

        int _iterations;
        byte[] _signature;
        byte[] _salt;
        byte[] _key;


        public PBKDF2Managed(int iterations)
        {
            if (iterations < 1)
            {
                throw new CryptographicException("Invalid iteration value.");
            }
            _iterations = iterations;
            _signature = BitConverter.GetBytes(_iterations);
            _salt = new byte[SALT_SIZE];
            using (RNGCryptoServiceProvider _rng = new RNGCryptoServiceProvider())
            {
                _rng.GetBytes(_salt);
            }
        }

        public PBKDF2Managed(byte[] salt, int iterations)
        {
            if (iterations < 1)
            {
                throw new CryptographicException("Invalid iteration value.");
            }
            if (salt == null)
            {
                throw new CryptographicException("Invalid salt value.");
            }
            if (salt.Length != SALT_SIZE)
            {
                throw new CryptographicException("Invalid salt value.");
            }
            _iterations = iterations;
            _signature = BitConverter.GetBytes(_iterations);
            _salt = salt;
        }

        public PBKDF2Managed(byte[] hash)
        {
            if (hash == null)
            {
                throw new CryptographicException("Invalid hash value.");
            }
            if (hash.Length != HASH_SIZE)
            {
                throw new CryptographicException("Invalid hash value.");
            }
            if (hash[0] != 0x00)
            {
                throw new CryptographicException("Invalid hash value.");
            }
            _signature = new byte[SIGNATURE_SIZE];
            Buffer.BlockCopy(hash, 1, _signature, 0, SIGNATURE_SIZE);
            _iterations = BitConverter.ToInt32(_signature, 0);
            if (_iterations < 1)
            {
                throw new CryptographicException("Invalid hash value.");
            }
            _salt = new byte[SALT_SIZE];
            Buffer.BlockCopy(hash, 1 + SIGNATURE_SIZE, _salt, 0, SALT_SIZE);
        }


        public byte[] ComputeHash(byte[] password)
        {
            if (password == null)
            {
                throw new ArgumentNullException("password");
            }
            using (Rfc2898DeriveBytes _pbkdf2 = new Rfc2898DeriveBytes(password, _salt, _iterations))
            {
                _key = _pbkdf2.GetBytes(KEY_SIZE);
            }
            byte[] _hash = new byte[HASH_SIZE];
            Buffer.BlockCopy(_signature, 0, _hash, 1, SIGNATURE_SIZE);
            Buffer.BlockCopy(_salt, 0, _hash, 1 + SIGNATURE_SIZE, SALT_SIZE);
            Buffer.BlockCopy(_key, 0, _hash, 1 + SIGNATURE_SIZE + SALT_SIZE, KEY_SIZE);
            return _hash;
        }

        public void Dispose()
        {
            // clean up any unmanaged resources
        }
    }
}

Here are some things to consider about the design of the example:

While we provided constant values for certain things, such as salt sizes, key sizes, and hash sizes, we have deliberately not provided a default iterations value. This is so the class can be configured on a "per-application" basis. If a developer feels the need to use a significantly higher number of iterations, they simply supply the required number in the constructor. Conversely, if a developer knows that the system running their application will be a low-grade, less capable system with minimal processing power, they can adjust the number of iterations accordingly.

Just as ComputeHash(...) accepts a byte array as input, it returns an array of bytes as output, and for the same reason: it is encoding-agnostic. By our design, you are able to encode your hash output in whichever encoding you see fit, and no dependency is formed on a particular encoding.
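
For example, if you store hashes in a text column, Base64 is one reasonable choice. The sketch below round-trips a few bytes through a string and back. The HashStorage class and its two helpers are hypothetical names for illustration, and the sample bytes are just a stand-in for a real hash.

```csharp
using System;

public static class HashStorage
{
    // Hypothetical helper: encodes raw hash bytes as text for storage.
    public static string ToStorageString(byte[] hash)
    {
        return Convert.ToBase64String(hash);
    }

    // Hypothetical helper: decodes the stored text back into the raw hash bytes.
    public static byte[] FromStorageString(string stored)
    {
        return Convert.FromBase64String(stored);
    }

    public static void Main()
    {
        byte[] original = { 0x00, 0xE8, 0x03, 0x00, 0x00, 0xFF };

        string stored = ToStorageString(original);
        byte[] restored = FromStorageString(stored);

        // The bytes survive the round trip unchanged.
        Console.WriteLine(BitConverter.ToString(original) == BitConverter.ToString(restored)); // True
    }
}
```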

On a final note, let's talk about the format in which our final hash values are written before being returned to the caller. In our example, we leave the first byte blank, followed by the iteration signature, the salt, and then the hashed key value. This order is not set in stone and you can reorganize it as you see fit. You can add multiple blank validation bytes in various locations, or put the salt last or first. Doing so makes YOUR implementation that much more unique and its structure less likely to be guessed. Just be sure you adjust your implementation accordingly. I wouldn't suggest separating each portion with a blank byte, as this could be a dead giveaway as to which part is the salt and which is the actual key. You want that salt to be as protected as possible.

Let's wrap things up with a sample console application to test out our class. We'll use UTF8 encoding for text and Base64 encoding for the hash value and 1000 iterations just for an example:

using System;
using System.Text;
using System.Security.Cryptography;

namespace System
{
    class Program
    {
        static void Main(string[] args)
        {
            string myPassword = "my super secret password";

            byte[] pwdHash;
            using (var pbkdf2 = new PBKDF2Managed(1000))
            {
                pwdHash = pbkdf2.ComputeHash(Encoding.UTF8.GetBytes(myPassword));
            }

            string base64PwdHash = Convert.ToBase64String(pwdHash);

            // let's try guessing the password
            byte[] guessedPwdHash;
            using (var pbkdf2 = new PBKDF2Managed(pwdHash))
            {
                guessedPwdHash = pbkdf2.ComputeHash(Encoding.UTF8.GetBytes("Just guessing"));
            }

            string base64GuessedPwdHash = Convert.ToBase64String(guessedPwdHash);

            // tell us if we guessed right
            Console.WriteLine(string.Equals(base64PwdHash, base64GuessedPwdHash));

            // ok, let's use the proper password now
            byte[] userPwdHash;
            using (var pbkdf2 = new PBKDF2Managed(pwdHash))
            {
                userPwdHash = pbkdf2.ComputeHash(Encoding.UTF8.GetBytes("my super secret password"));
            }

            string base64UserPwdHash = Convert.ToBase64String(userPwdHash);

            // tell us if we entered the correct password
            Console.WriteLine(string.Equals(base64PwdHash, base64UserPwdHash));

            Console.ReadLine();
        }
    }
}

In the above test, we first create a "super secret password" and hash it. Then we try to validate a password we know is wrong and compare the hashed values. The console will tell us false because it's incorrect (or at least it better!). Next we try the proper password and, of course, it tells us true because we entered the right password.

Hopefully this article has been both informative and helpful. As always, feel free to use the code examples and modify them as you see fit (just be sure you do so in a secure way), and feel free to ask any questions you may have or share tips in the comments below.

Happy coding!
