The conclusion of my blog posts on the LastPass breach and on Bitwarden’s design flaws is invariably: a strong master password is important. This is especially the case if you are a target somebody would throw considerable resources at. But everyone else might still get targeted due to flaws like password managers failing to keep everyone on current security settings.
There is lots of confusion about what constitutes a strong password however. How strong is my current password? Also, how strong is strong enough? These questions don’t have easy answers. I’ll try my best to explain however.
If you are only here for recommendations on finding a good password, feel free to skip ahead to the Choosing a truly strong password section.
Where strong passwords are crucial
First of all, password strength isn’t always important. If your password is stolen as clear text via a phishing attack or a compromised web server, a strong password won’t help you at all.
In order to reduce the damage from such attacks, it’s way more important that you do not reuse passwords – each web service should have its own unique password. If your login credentials for one web service get into the wrong hands, these shouldn’t be usable to compromise all your other accounts e.g. by means of credential stuffing. And since you cannot possibly keep hundreds of unique passwords in your head, using a password manager (which can be the one built into your browser) is essential.
But this password manager becomes a single point of failure. Especially if you upload the password manager data to the web, be it to sync it between multiple devices or simply as a backup, there is always a chance that this data is stolen.
Of course, each password manager vendor will tell you that all the data is safely encrypted. And that you are the only one who can possibly decrypt it. Sometimes this is true. Often enough this is a lie however. And the truth is rather: nobody can decrypt your data as long as they are unable to guess your master password.
So that one password needs to be very hard to guess. A strong password.
Oh, and don’t forget to enable multi-factor authentication (MFA) wherever possible regardless.
How password guessing works
When someone has your encrypted data, guessing the password it is encrypted with is a fairly straightforward process.
Ideally, your password manager made step 2 in the diagram above very slow. The recommendation for encryption is allowing at most 1,000 guesses per second on common hardware. This renders guessing passwords slow and expensive. Few password managers actually match this requirement however.
But password guesses will not be generated randomly. Passwords known to be commonly chosen like “Password1” or “Qwerty123” will be tested among the first ones. No amount of slowing down the guessing will prevent decryption of data if such an easy to guess password is used.
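The guessing process described above can be sketched roughly as follows. This is a minimal illustration, not how any particular password manager works: it assumes PBKDF2-HMAC-SHA256 as the slow key derivation step, and the `key_check` helper is a hypothetical stand-in for an actual decryption attempt.

```python
import hashlib
from typing import Iterable, Optional


def derive_key(password: str, salt: bytes, iterations: int) -> bytes:
    # The deliberately slow step: more iterations mean fewer guesses per second.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def key_check(key: bytes) -> bytes:
    # Hypothetical stand-in for "does this key decrypt the stolen data?".
    return hashlib.sha256(b"key-check" + key).digest()


def guess_password(candidates: Iterable[str], salt: bytes, iterations: int,
                   expected_check: bytes) -> Optional[str]:
    # Candidates are expected to be ordered from most to least likely,
    # so commonly chosen passwords are tested first.
    for guess in candidates:
        key = derive_key(guess, salt, iterations)
        if key_check(key) == expected_check:
            return guess
    return None
```

No iteration count helps if the password sits near the top of the candidate list: the loop ends after a handful of key derivations.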
So the goal of choosing a strong password isn’t choosing a password including as many character classes as possible. It isn’t making the password look complex either. No, making it very long also won’t necessarily help. What matters is that this particular password comes up as far down as possible in the list of guesses.
The mathematics of guessing passwords
The starting point for password guessing is always passwords known from previous data leaks. For example, security professionals often refer to rockyou.txt: a list with 14 million passwords leaked in 2009 in the RockYou breach.
If your password is somewhere on this list, even at 1,000 guesses per second it will take at most 14,000 seconds (less than 4 hours) to find your password. This isn’t exactly a long time, and that’s already assuming that your password manager vendor has done their homework. As past experience shows, this isn’t an assumption to be relied on.
Since we are talking about computers here, the “proper” way to express large numbers is via powers of two. So we say: a password on the RockYou list has less than 24 bits of entropy, meaning that it will definitely be found after 2²⁴ (16,777,216) guesses. Each bit of entropy added to the password results in twice the guessing time.
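The arithmetic behind these numbers can be reproduced with a short sketch. The function names are made up for illustration; the list size and guessing rate are the ones used in the text.

```python
import math


def entropy_bits(candidates: int) -> float:
    # Number of bits needed to index every entry of a candidate list.
    return math.log2(candidates)


def worst_case_seconds(candidates: int, guesses_per_second: float) -> float:
    # Time needed to try every entry of the list once.
    return candidates / guesses_per_second


ROCKYOU_SIZE = 14_000_000  # approximate size of rockyou.txt

print(f"{entropy_bits(ROCKYOU_SIZE):.1f} bits")                       # below 24 bits
print(f"{worst_case_seconds(ROCKYOU_SIZE, 1000) / 3600:.1f} hours")   # below 4 hours
```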
But obviously the RockYou passwords are too primitive. Many of them wouldn’t even be accepted by a modern password manager. What about using a phrase from a song? Shouldn’t it be hard to guess because of its length already?
Somebody calculated (and likely overestimated) the number of available song phrases as 15 billion, so we are talking about at most 34 bits of entropy. This appears to raise the password guessing time to half a year.
Except: the song phrase you are going to choose won’t actually be at the bottom of any list. That’s already because you don’t know all the 30 million songs out there. You only know the reasonably popular ones. In the end it’s only a few thousand songs you might reasonably choose, and your date of birth might help narrow down the selection. Each song has merely a few dozen phrases that you might pick. You are lucky if you get to 20 bits of entropy this way.
Estimating the complexity of a given password
Now it’s hard to tell how quickly real password crackers will narrow down on a particular password. One can look at all the patterns however that went into a particular password and estimate how many bits these contribute to the result. Consider this XKCD comic:
An uncommon base word chosen from a dictionary with approximately 50,000 words contributes 16 bits. The capitalization at the beginning of the word, on the other hand, contributes only one bit because there are only two options: capitalizing or not capitalizing. There are common substitutions and some junk added at the end, contributing a few more bits. But the end result is a rather unimpressive 28 bits, maybe a few more because the password creation scheme has to be guessed as well. So this password looks complex, but it isn’t actually strong.
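Adding up the patterns is simple enough to write down. The per-pattern values below are the rounded figures from the XKCD comic as I recall them, so treat the individual entries as an assumption; only the base word, the capitalization bit and the total of 28 bits are stated in the text above.

```python
# (pattern, bits of entropy it contributes): rounded values, assumed from XKCD 936
PATTERNS = [
    ("uncommon base word", 16),
    ("capitalized or not", 1),
    ("common substitutions", 3),
    ("punctuation character", 4),
    ("numeral", 3),
    ("order of the two trailing characters", 1),
]

# Independent choices multiply the number of guesses, so their bits add up.
total_bits = sum(bits for _, bits in PATTERNS)
print(total_bits)
```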
The (unmaintained) zxcvbn library tries to automate this process. You can try it out on a webpage; it runs entirely in the browser and doesn’t upload your password anywhere. The guesses_log10 value in the result can be converted to bits: divide by 3 and multiply by 10. For Tr0ub4dor&3 it shows guesses_log10 as 11. Calculating 11 ÷ 3 × 10 gives us approximately 36 bits.
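The “divide by 3, multiply by 10” rule is a shortcut for multiplying by log₂(10) ≈ 3.32. A small sketch comparing the two (the function names are invented for this example, they are not part of zxcvbn):

```python
import math


def bits_exact(guesses_log10: float) -> float:
    # log2(guesses) = log10(guesses) * log2(10)
    return guesses_log10 * math.log2(10)


def bits_rule_of_thumb(guesses_log10: float) -> float:
    # The shortcut from the text: divide by 3, multiply by 10.
    return guesses_log10 / 3 * 10


print(f"{bits_exact(11):.1f}")          # exact conversion
print(f"{bits_rule_of_thumb(11):.1f}")  # shortcut, slightly higher
```

The shortcut overshoots by well under one bit here, which doesn’t matter given how rough these estimates are anyway.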
Note that zxcvbn is likely to overestimate password complexity, like it happened here. While this library knows some common passwords, it knows too few. And while it recognizes some English words, it won’t recognize some of the common word modifications. You cannot count on real password crackers being similarly unsophisticated.
How strong are real passwords?
So far we’ve only seen password creation approaches that max out at approximately 35 bits of entropy. My guess is that this is in fact the limit for almost any human-chosen password. Unfortunately, at this point it is only my guess. There isn’t a whole lot of information to either support or disprove it.
For example, Microsoft published a large-scale password study in 2007 that arrives at an average (not maximum) password strength of 40 bits. However, this study is methodologically flawed and wildly overestimates password strength. In 2007 neither XKCD comic 936 nor zxcvbn existed. So the researchers calculated password strength by looking at the character classes used. Going by their method, “Password1!” is a perfect password, a whopping 63 bits strong. The zxcvbn estimate for the same password is merely 14 bits.
Another data point is the password strength indicator used for example on LastPass and Bitwarden registration pages. How strong are the passwords at the maximum strength?
Turns out, both these password managers use zxcvbn on their registration pages. And both will display a full strength bar for the maximum zxcvbn score: 4 out of 4. Which is assigned to any password that zxcvbn considers stronger than 33 bits.
Finally, there is another factor to consider: we aren’t very good at remembering complex passwords. A study from 2014 concluded that humans are capable of remembering passwords with 56 bits of entropy via a method the researchers called “spaced repetition.” Even using their method, half of the participants needed more than 35 login attempts in order to learn this password.
Given this, it’s reasonable to assume that in reality most people choose considerably weaker passwords: passwords that are still shown as “strong” by their password manager’s registration page, and that they can remember without a week of exercises.
Choosing a truly strong password
As I mentioned already, we are terrible at choosing strong passwords. The only realistic way to get a strong password is having it generated randomly.
But we are also very bad at remembering some gibberish mix of letters and digits. Which brings us to passphrases: sequences of multiple random words, much easier to remember at the same strength.
A typical way to generate such a passphrase would be diceware. You could use the EFF word list for five dice for example. Either use real dice or a website that will roll some fake dice for you.
Let’s say the result is ⚄⚀⚂⚅⚀. You look up 51361 in the dictionary and get “renovate.” This is the first word of your passphrase. Repeat the process to get the necessary number of words.
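The dice procedure can also be sketched in code. This is only an illustration under a few assumptions: the word list is expected as lines of the form “51361 renovate” (dice digits, whitespace, word), the function names are invented, and Python’s `secrets` module stands in for real dice.

```python
import secrets
from typing import Dict, Iterable, List


def parse_wordlist(lines: Iterable[str]) -> Dict[str, str]:
    # Maps dice results like "51361" to words like "renovate".
    words = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            key, word = parts
            words[key] = word
    return words


def roll_dice(count: int = 5) -> str:
    # Each die shows 1..6, drawn from a cryptographically secure source.
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(count))


def generate_passphrase(words: Dict[str, str], word_count: int) -> List[str]:
    # One independent roll of five dice per word.
    return [words[roll_dice()] for _ in range(word_count)]


# Usage, assuming the word list was saved locally under this (hypothetical) name:
# with open("eff_large_wordlist.txt", encoding="utf-8") as f:
#     print(" ".join(generate_passphrase(parse_wordlist(f), 4)))
```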
Update (2023-01-31): If you want it more comfortable, the Bitwarden password generator will do all the work for you while using the same EFF word list (type has to be set to “passphrase”).
How many words do you need? As a “regular nobody,” you can probably feel confident if guessing your password takes a century on common hardware. While not impossible, decrypting your passwords will simply cost too much even on future hardware and won’t be worth it. Even if your password manager doesn’t protect you well and allows 1,000,000 guesses per second, a passphrase consisting of four words (51 bits of entropy) should be sufficient.
Maybe you are a valuable target however. If you hold the keys to lots of money or some valuable secrets, someone might decide to use more hardware for you specifically. You probably want to use at least five words then (64 bits of entropy). Even at a much higher rate of 1,000,000,000 guesses per second, guessing your password will take 900 years.
Finally, you may be someone of interest to a state-level actor. If you are an important politician, an opposition figure or a dissident of some kind, some unfriendly country might decide to invest lots of money in order to gain access to your data. A six-word passphrase (77 bits of entropy) should be out of reach even to those actors for the foreseeable future.
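For anyone wanting to check these figures, here is a sketch of the calculation, assuming the 7776-word EFF list and the guessing rates mentioned above. The function names are made up for this example.

```python
import math

WORDLIST_SIZE = 7776                   # five dice: 6 ** 5 words
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def passphrase_bits(word_count: int) -> float:
    # Each randomly chosen word contributes log2(7776), roughly 12.9 bits.
    return word_count * math.log2(WORDLIST_SIZE)


def years_to_try_all(word_count: int, guesses_per_second: float) -> float:
    # Time to test every possible passphrase of this length once.
    return WORDLIST_SIZE ** word_count / guesses_per_second / SECONDS_PER_YEAR


for word_count, rate in [(4, 1e6), (5, 1e9)]:
    print(word_count, "words:",
          f"{passphrase_bits(word_count):.1f} bits,",
          f"{years_to_try_all(word_count, rate):.0f} years at {rate:.0e} guesses/s")
```

Four words at a million guesses per second come out slightly above a century, five words at a billion guesses per second at roughly 900 years.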
Thanks for this article that sorts and merges a lot of information that was scattered around the net, great
What's your take on the acronym method of using the first letter of each word in a sentence?
It's an easy way to generate long passwords that are easy to remember for humans but (hopefully) more resistant to dictionary and rainbow table attacks, but I'm not sure how much more entropy it's adding due to the limited character set it implies.
For example, a password such as AAM:Iaabovlb, alwbm.bWTP could be remembered by:
It doesn’t really matter whether you use full phrases from songs/books or shorten them somehow. There are simply too few such phrases that you could use. So the biggest challenge for a password cracker here is guessing the method you used to produce your password. And there aren’t all too many of these…
I tend to use a short sentence, maybe 8 words or so with punctuation and some numbers based on how easy it would be for me to type quickly. For example: "I feel it's 7ecure, am I fooling myself?". Granted, can be a little annoying to punch out on a phone but once I'm comfortable typing it on a keyboard it's very fast.
The point is: human-chosen passphrases are never as secure as randomly generated ones, regardless of their length. It’s hard to tell exactly how many guesses are necessary here, but it will be an order of magnitude less than with a random passphrase.
But you can use a random phrase, not a song book or whatever - it's a good way to generate a pseudo-random set of letters
Very informative article. I use a 3-word master password with capitals and a standard (for me) set of substitutions. Do the capitals and substitutions add significant entropy? My three words were chosen from life experience and so are easy to remember.
As you can see from the XKCD comic I quoted, it’s hard to add much entropy via substitutions and similar tricks. It definitely isn’t enough to compensate for the big issue: three words would only amount to 39 bits even if chosen randomly from a decently sized dictionary, and you didn’t choose yours randomly.
The program kpcli will randomly generate word-based passphrases for users, if desired.
It seems to me the problem is the 1000 guesses per second thing. Usually my dumb luck is that I've mistyped my password 3 times and get locked out for an hour, a day, or until support unlocks it for me. Which systems are allowing 1000 guesses a second? That seems like a dead giveaway to me that a brute force hack is underway. Even if we only talk about the password managers, shouldn't they be restricting repeated password attempts already?
Fascinating topic - thanks for writing about it!
As long as there is a server limiting the number of attempts – yes, 1000 guesses per second are unrealistic. This is the relevant number for the scenario where someone managed to steal the data (e.g. from the vendor’s backup servers like in LastPass’ case) or the vendor themselves has turned evil. Since they already have the data then, they merely need to decrypt it and there is no attempt limiting. The only limiting factor is the capabilities of modern hardware.
And then 1000 guesses per second are an ideal scenario. For LastPass, their most current settings allowed 88 thousand guesses per second on a single graphics card. And they left plenty of accounts on extremely outdated settings, for some this meant 8.8 billion (!) guesses per second on the same graphics card.
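These two LastPass numbers are connected. A rough sketch, assuming the guessing rate is inversely proportional to the PBKDF2 iteration count, that the 8.8 billion figure corresponds to a single iteration, and that the 88 thousand figure corresponds to 100,100 iterations (both iteration counts are my assumptions here; 600,000 is the recommendation quoted in a comment further down):

```python
# Assumed raw speed of one graphics card: PBKDF2 iterations per second.
ITERATIONS_PER_SECOND = 8.8e9


def guesses_per_second(iterations: int) -> float:
    # Each guess costs `iterations` rounds, so the rate drops proportionally.
    return ITERATIONS_PER_SECOND / iterations


for iterations in (1, 100_100, 600_000):
    print(f"{iterations:>7} iterations: {guesses_per_second(iterations):,.0f} guesses/s")
```

Under these assumptions even 600,000 iterations still leave an attacker with well over ten thousand guesses per second on one graphics card, far from the 1,000 guesses per second ideal.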
"correct horse battery staple" is not 28 random letters, 26^28 (4.1615368362200383420985518189585e+39) but is 4 random words 600,000^4 (129600000000000000000000) so maybe using a bunch of words is not so secure if that is a popular scheme that might be guessed?
I’m actually going with a 7776 words dictionary for this article, so it’s 7776⁴ possible options. Which, as I explain in this article, is the equivalent of 51 bits and sufficient for most use cases.
It isn’t meant to be as “secure” as 28 random letters – people cannot remember random letter combinations at far smaller lengths already, and a 131 bit password would have been a total overkill anyway.
Hi, Just a comment on this and most articles that describe password guessers. The guessing is random, so it might guess your password on the first or 10th or 100th or 1000th guess. To say it will take 900 years to guess all the passwords does not mean it will take 900 years to guess yours.
When it comes to random passwords, I’m usually saying how long it takes to guess it on average. Yes, there is a tiny chance that a password taking 900 years on average will be guessed on the first day (0.00015%), but this is irrelevant in practice. We could just as well talk about the chances of an asteroid destroying the Earth tomorrow and preventing the guessing.
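The percentage above can be reproduced as follows, assuming a uniformly random password, so that the average guessing time is half the time needed to try all candidates. The function name is invented for this sketch.

```python
def chance_within(days: float, average_years: float) -> float:
    # For a uniformly random password, trying everything takes twice the average.
    total_days = 2 * average_years * 365.25
    # Every candidate is equally likely, so the chance grows linearly with time.
    return days / total_days


print(f"{chance_within(1, 900) * 100:.5f}%")
```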
Can you update the article so there is an equivalent password length suggestion for people that still want to use the "gibberish mix of letters and digits"? This is the number I am looking for. Thanks
That information generally isn’t terribly useful, so this info doesn’t belong in the blog post. However, if you use a random mix of lowercase letters and digits, one word is the exact equivalent of 2.5 characters. So instead of four words you take 10 characters. And six words would be 15 characters.
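The 2.5 figure is exact because 36 possible characters (26 lowercase letters plus 10 digits) raised to the power of 2.5 gives 6⁵ = 7776, the size of the word list. A short sketch of the conversion, with an invented function name:

```python
import math


def equivalent_characters(word_count: int, wordlist_size: int = 7776,
                          alphabet_size: int = 36) -> float:
    # Number of random characters carrying the same entropy as the words.
    return word_count * math.log(wordlist_size) / math.log(alphabet_size)


for word_count in (1, 4, 6):
    print(word_count, "words ~", round(equivalent_characters(word_count), 2), "characters")
```

With a different character set (say, uppercase letters added) the ratio changes, so the alphabet size is a parameter here.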
How badly is entropy affected in the diceware scheme if I reroll the dice until I find some "cool" passphrase? I guess humans being humans will not settle with the first passphrase they are shown, and will eventually search for some "nice" passphrase.
Good question. I’d say (though I’m not entirely certain) that it depends on the number of rolls. You roll twice – that’s one bit off the password complexity. Four times – two bits. Eight times would be three bits. And so on with the powers of two.
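Put as a formula, this estimate is log₂ of the number of passphrases you looked at. The sketch below treats it as a worst case, assuming the attacker could predict perfectly which of the candidates you would prefer; the function names are invented.

```python
import math


def bits_lost_by_rerolling(candidates_considered: int) -> float:
    # Worst case: picking one of k candidates gives away up to log2(k) bits.
    return math.log2(candidates_considered)


def remaining_bits(word_count: int, candidates_considered: int,
                   wordlist_size: int = 7776) -> float:
    return word_count * math.log2(wordlist_size) - bits_lost_by_rerolling(candidates_considered)


for k in (2, 4, 8):
    print(k, "candidates:", bits_lost_by_rerolling(k), "bits lost")
```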
There are still parts to the password guessing game that I don’t understand. For instance, though the bad guys may generate several million guesses at my password, my bank / email / medical records etc. would lock out after three attempts. So how do they keep entering guesses and not get locked out? Am I missing something obvious?
This isn’t about trying to log into a website. It’s about the scenario where the data is stolen already (e.g. from the vendor’s backup storage, like it happened for LastPass) and only needs to be decrypted. There is no longer a gatekeeper limiting access. The only factor limiting attempts here is the capabilities of the hardware testing the guesses.
I once had a 5-character legacy password on a site that required 8 characters minimum. Despite its short length, it felt pretty secure since it would be silly for an interloper to be testing passwords deemed unacceptable by the system's rules. :-)
If an attacker knows which word list was used, the number of words used, and the fact that the user tried to go after a "cool" passphrase, then wouldn't they try to test for passphrase that seem "cool" to the average person first? By only going after passphrases that sound "cool", you end up introducing a human factor into it again...
You are too smart. Wicked smart. I have been using computers for years and thought I was clever with passwords. Your astute comments here have left me feeling like a village idiot.
I guess my task now is to find a way to remember random words to use as a master password. Not sure if stapling a correct horse to a battery will work for me. Or was that a battered horse with the correct staple? Good Grief!
Thank you for this article. I would like to know what the latest cracking attempts are using technology-wise and which breaches they are targeting.
AI will also remove entropy from password guessing attempts, and even passphrases from song lyrics or quotations by knowing which words usually appear in proximity. The key will be to use the "dice" approach to choose words not commonly associated.
Technology depends on the algorithm. LastPass uses PBKDF2, it’s an algorithm that can be sped up massively on a graphics card.
It really doesn’t need AI to generate good guesses. Software like hashcat does it out of the box, and password crackers can further improve the quality of the guesses using the information they have on the target.
Thank you for this blog post. We are a new user of LastPass (joined Dec 2022) and are shocked to hear about all of this - and so soon after we thought we were securing things down by using a password manager at all.
I followed the link to the LastPass support site to change the iterations and this is what their site currently says, "Within the "Security" section, for the Password Iterations field, enter your desired number of rounds (our minimum recommendation is 600000 rounds)." So, it looks like they are no longer even suggesting 100100 iterations anymore... The maddening part is that they aren't proactively taking care of their customers, leaving each of us to independently 'up our game' on the security.
Great easy-to-understand article! Thankfully I came across the XKCD comic years ago and have used such a passphrase ever since as my password manager's master password. Would the entropy actually go up if I were to use words from different languages? Say "correctchevalbateriagraffetta" instead of "correcthorsebatterystaple"?
In principle: yes. But the point is that the passphrase should be easy to remember for you, despite using a reasonably sized dictionary. So probably best to use only languages you are proficient in.
Related to the topic of password strength, I'd be interested in your thoughts on the practice of various websites / corporate environments requiring users to change our password every 30 / 60 / 90 days. If we're all using a strong(ish) password with a combination of upper/lower-case letters, numbers & symbols etc, are using separate passwords for each site and are storing these in a password safe, is there any benefit in having to change passwords all the time? I am regularly frustrated by the requirement and see no benefit in it.
This is generally considered a bad practice. Users who have strong passwords don’t need to change passwords. And confronting users who have weak passwords with the requirement to change passwords will usually result in a minimal variation of the previous password, if not even in a weaker password. Security experts generally recommend against it.
In all the discussions on password entropy, i have never seen anything that takes into account that surely the attacker also needs to guess the length and format of the password for these entropy numbers to be valid? Or is there some simple way that an attacker can establish password length, character set or password format from the encrypted blob before they start guessing? Please could you explain what am i missing here? Thanks
The password length is growing “automatically” as the complexity of the guesses increases. The other factors need to be guessed, though these don’t add as many entropy bits as one would wish.
Of course, anything that’s known about the password helps weed out wrong guesses early. So a vendor storing password complexity information next to the hashed password actually makes the password easier to guess. Same goes for any boasting about one’s passwords on social networks.
I can see some logic to that, but surely password guessing strategies don't simply guess passwords in length order do they? For example, i would assume that they would guess something like Password1 long before it would guess something like gf7% even though Password1 is longer (obviously not arguing either of these are secure, just trying to illustrate a point)
Similarly with your point about song lyrics for passwords, those seem long but probably are quite commonly used so would presumably be quite high on the password guessing list compared to something shorter but based on a less common password picking strategy. Even sticking with the song lyrics, surely it is not as simple as saying entropy is based on number of lines from popular songs. Surely the 20 bits of entropy that you suggest for this method is further increased because the attacker is likely to guess passwords from a few other methods first (i.e. the rockyou list, common names and dates lists etc) before they start guessing song lyrics. Similarly even adding a single non-obvious character somewhere in an otherwise common password should add a fair bit of entropy because the password guesser surely has to guess not just the correct extra character, but also where that character is. If they don't know how long the password is, they surely don't know that they have just "missed" a password guess by a single character and will likely carry on guessing their next most probable guess rather than iterating through all the combinations of capitalisations, character substitutions or random character insertions for the guess that just failed. E.g. if we took the example password of "BaaBaaBlackSheep", surely changing it to "BaaBaa2BlackSheep" would make it significantly harder to guess because if you don't know password strategy, you wouldn't know why "BaaBaaBlackSheep" guess had failed, and to find the correct inserted character and the correct location for it would add a huge number of combinations, let alone the chances are that the attacker would probably move on to the next password guess altogether, maybe "wheelsonthebus" or something.
Not trying to be difficult here, but just trying to understand what i am missing?