Migrate from MD5 to bcrypt password hashes without disrupting customers

Posted on February 12, 2017

The MD5 algorithm is well known, first published in 1992, but it is a poor option for hashing user passwords. Yet MD5 is still widely used today, in part because migrating passwords is an arduous task. Fortunately, there’s bcrypt: a password hashing algorithm whose cost can grow with time, unlike MD5, which is effectively stuck forever. For more information on why you should use bcrypt over MD5, SHA1, SHA256, SHA512, SHA-3, etc., take a look at this.

But how can you migrate your existing MD5 password hashes to bcrypt without bothering or disrupting your users or customers? This is a problem that many organizations face.

PHP in particular makes it very easy to implement bcrypt-based hashes. The Password Hashing Functions were added in PHP 5.5.0. Hashing a password using bcrypt is as simple as this:
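A minimal sketch, assuming a hypothetical plaintext `$password` that would normally come from a registration or login form:

```php
<?php
// Hypothetical plaintext password; in practice this comes from user input.
$password = 'correct horse battery staple';

// PASSWORD_BCRYPT selects bcrypt explicitly. A random salt is generated
// automatically and stored inside the resulting hash string.
$hash = password_hash($password, PASSWORD_BCRYPT);

// The bcrypt result is a 60-character string that starts with "$2y$".
echo $hash, PHP_EOL;
```

Hashing the same password twice gives two different strings, because each call generates a fresh salt.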

Validating that hash then is just as easy with the verify function:
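A short sketch of the check, again assuming a hypothetical `$password`; in a real application `$hash` would be read from the database rather than generated on the spot:

```php
<?php
// Hypothetical values: $hash stands in for what was stored at registration.
$password = 'correct horse battery staple';
$hash = password_hash($password, PASSWORD_BCRYPT);

// password_verify() reads the algorithm, cost, and salt from the hash itself,
// so nothing else needs to be stored or passed in.
if (password_verify($password, $hash)) {
    echo 'Password is valid', PHP_EOL;
} else {
    echo 'Invalid password', PHP_EOL;
}
```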

And that’s it. There is no need to even specify a salt; the function handles that for you all in one. But… what if you already have a database full of MD5 passwords? You might have noticed the function password_needs_rehash(), but this will NOT work for MD5 passwords. So, what can you do? Reversing the hashes to reveal the real passwords so you can rehash them is not possible (or realistic).

There are two approaches that you can use to convert user passwords, and we’ll look at both.

Approach (A): Rehash all existing MD5 hashes using password_hash()

You can get all of your users onto bcrypt right now with this approach. Because the bcrypt hash is computed over the existing MD5 hash rather than the real password, your login code must keep applying MD5 first, effectively forever, but it will work. The MD5 hash from the database will be protected using bcrypt with an automatically generated salt.

As an example, we might do this to the users table to upgrade each user’s existing hash to bcrypt. Before you do this, ensure that your database’s password field (or however you are storing the passwords) is large enough to hold the new hash. varchar(255) is recommended for MySQL/MariaDB. A bcrypt hash is 60 characters, but the output of PHP’s password_hash() function could change over time, and you want to be able to handle that in the future. A varchar(32) column will cause your updates or inserts to fail, or will truncate the hashes into unusable data, so please make sure your tables are ready.

The script to update all MD5 hashes to bcrypt hashes would be similar to this:
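A sketch of such a script, under stated assumptions: the table name `users`, its columns `id` and `password`, and the function name `wrapMd5Hashes()` are all hypothetical, and the MD5 hashes are assumed to be stored as lowercase hexadecimal, which is what PHP’s md5() produces.

```php
<?php
/**
 * Wrap every remaining raw MD5 hash in a (hypothetical) users table with bcrypt.
 * Returns the number of rows converted.
 */
function wrapMd5Hashes(PDO $pdo, array $options = []): int
{
    // Loads every row into memory; a very large table would need batching.
    $rows = $pdo->query('SELECT id, password FROM users')->fetchAll(PDO::FETCH_ASSOC);
    $update = $pdo->prepare('UPDATE users SET password = :hash WHERE id = :id');
    $converted = 0;

    foreach ($rows as $row) {
        // Only touch values that look like raw MD5 (32 hex characters),
        // so the script can be re-run safely after an interruption.
        if (strlen($row['password']) !== 32 || !ctype_xdigit($row['password'])) {
            continue;
        }

        // The MD5 hex string itself becomes the input to bcrypt.
        $hash = password_hash($row['password'], PASSWORD_BCRYPT, $options);
        $update->execute([':hash' => $hash, ':id' => $row['id']]);
        $converted++;
    }

    return $converted;
}
```

It would be called once with an open connection, for example `wrapMd5Hashes($pdo)`, optionally passing `['cost' => 12]` as the second argument.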

Now when the user logs in, you can compare their password with this method:
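A minimal sketch of that comparison; the function name `verifyWrappedPassword()` is hypothetical, and `$storedHash` is assumed to be a bcrypt hash of the user’s old MD5 hash:

```php
<?php
/**
 * Check a login attempt against a bcrypt hash that wraps an MD5 hash.
 */
function verifyWrappedPassword(string $password, string $storedHash): bool
{
    // The stored bcrypt hash protects md5($password), not the password itself,
    // so the submitted password must be run through MD5 before verifying.
    return password_verify(md5($password), $storedHash);
}
```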

Advantages of this approach:

  • All of your users are on bcrypt immediately!
  • Hashing is done all at the same time.

Disadvantages of this approach:

  • It could take a long time. Each user could take anywhere from 500ms to 3 seconds! With thousands, hundreds of thousands, or millions of users, this could take hours or days in total, depending on your hardware and the “cost” factor. For example, one million users at 500ms each is 500,000 seconds, or close to six days on a single worker. Some downtime might even be required, since logins will fail for any account whose stored hash does not match what your login code expects while the conversion is in progress.
  • You will still need to use MD5 for the foreseeable future. While this is not terrible, it is something to think about.
  • Depending on how you handle saved passwords for logins, all of your users on the system will become immediately logged out if the hash is saved in a cookie, since it will no longer match.

Approach (B): Rehash real passwords to bcrypt upon login

This approach is more gradual – and will involve a mixture of MD5 and bcrypt passwords. Again, before you do this, ensure that your database’s password field or however you are storing the passwords is large enough to fit it. The previous approach talks more about this. In short: make sure your password column in the database is set to varchar(255).

First, a check is done against the row in the database to see if the hash is MD5. An MD5 fingerprint is always a 128-bit value, which is stored as 32 hexadecimal characters, while a bcrypt hash is 60 characters. This string-length check is therefore a reliable way to know whether to authenticate the user with MD5 one last time. Once you have authenticated them with MD5, you can hash their real password with password_hash(), save it, and log them in. The next time they log in, their stored hash will no longer be 32 characters long (and likely won’t ever be again, since the algorithm is not expected to produce smaller strings over time).
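The flow can be sketched as follows. The function name `loginAndUpgrade()` and the `$saveHash` callback are hypothetical; the callback stands in for whatever code writes the new hash back to the user’s row.

```php
<?php
/**
 * Authenticate a user and upgrade a legacy MD5 hash to bcrypt on success.
 * $saveHash is a hypothetical callback that persists the new hash.
 */
function loginAndUpgrade(string $password, string $storedHash, callable $saveHash): bool
{
    // Raw MD5 is 32 hex characters; bcrypt hashes are 60 characters.
    $looksLikeMd5 = strlen($storedHash) === 32 && ctype_xdigit($storedHash);

    if ($looksLikeMd5) {
        // Authenticate with MD5 one last time, using a timing-safe comparison.
        if (!hash_equals($storedHash, md5($password))) {
            return false;
        }

        // The real password is available now, so hash it directly with bcrypt.
        $saveHash(password_hash($password, PASSWORD_BCRYPT));
        return true;
    }

    // Already migrated: verify against the bcrypt hash.
    return password_verify($password, $storedHash);
}
```

Passing the save step in as a callback keeps the sketch independent of any particular database layer; an application would typically run an UPDATE on the user’s row there.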

You can proceed to add this method to all of the other sections of your application: forgotten password recovery, password resets,  and so on.

Advantages of this approach:

  • User accounts are upgraded gradually, and this method will not require any “downtime” even for thousands of users since they won’t be done all at once.
  • In case you need to keep using MD5 for any reason, you still have the option. Passwords don’t HAVE to be converted to bcrypt, so you can decide this on an individual basis.
  • You are not double hashing, and, in time, will eventually be able to remove the MD5 from your code once all of your users have been converted or forced to reset their password.

Disadvantages of this approach:

  • Users will not immediately be on bcrypt passwords, which is less secure. After some time, you could resolve this issue by dropping their hashes and forcing those users who have not logged in to change their password.

Final Thoughts:

Before implementing bcrypt in any fashion, be sure to find an appropriate ‘cost’ value for your server. For that, see Example #4 on the password_hash page. The cost is included in every hash, and you can change it at any time in the future when you’d like newer passwords to be even more secure. At present, you don’t want to use a cost value that takes any less than 300ms to 500ms per hash. A cost of 10 to 12 should be sufficient to slow down an attacker without hindering most systems.
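A rough way to pick a cost is to time password_hash() at increasing values until it reaches a target duration. The sketch below takes that approach; the function name `findBcryptCost()` and its default bounds are hypothetical, and it is not a copy of the manual’s example.

```php
<?php
/**
 * Return the lowest bcrypt cost whose hashing time reaches $targetSeconds
 * on this machine, or $maxCost if none of the lower costs is slow enough.
 */
function findBcryptCost(float $targetSeconds, int $minCost = 8, int $maxCost = 15): int
{
    for ($cost = $minCost; $cost < $maxCost; $cost++) {
        $start = microtime(true);
        password_hash('benchmark-password', PASSWORD_BCRYPT, ['cost' => $cost]);

        if (microtime(true) - $start >= $targetSeconds) {
            return $cost;
        }
    }

    return $maxCost;
}

// The cost travels with the hash, so it can be read back later.
$hash = password_hash('example', PASSWORD_BCRYPT, ['cost' => 5]);
$info = password_get_info($hash);
echo 'Cost stored in hash: ', $info['options']['cost'], PHP_EOL;
```

For instance, `findBcryptCost(0.3)` looks for the first cost that takes at least 300ms. Run it on the production hardware, since timings vary widely between machines.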