Thursday, December 13, 2018

Common passwords are bad ... by definition

It's that time of the year again, time for the annual lists of worst passwords.  Top of at least one list: 123456, followed by password.  It just goes to show how people never change.  Silly people!

Except ...

A good password has a very high chance of being unique, because a good password is selected randomly from a very large space of possible passwords.  If you pick your password at random from a trillion possibilities*, then the odds that a particular person who did the same also picked your password are one in a trillion, as are the odds that any particular two people picked the same password.  The odds that one of a million other such people picked your password are about one in a million.  If a million people used the same scheme as you did, there's a good chance (around 40%) that some pair of them accidentally share a password, but almost certainly almost all of those passwords are unique.
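Here's a rough sketch of that arithmetic, assuming everyone picks uniformly and independently from the same space.  The function name is just for illustration, and the "any pair" figure uses the usual Poisson approximation to the birthday problem rather than an exact calculation.

```python
import math

def shared_password_odds(people: int, space: int) -> tuple[float, float, float]:
    """Return (p_someone_has_yours, expected_sharing_pairs, p_any_pair_shares)."""
    # Chance that at least one of `people` others picked your exact password.
    # expm1/log1p keep precision when 1/space is tiny.
    p_yours = -math.expm1(people * math.log1p(-1.0 / space))
    # Number of distinct pairs among `people`, each matching with odds 1/space.
    expected_pairs = people * (people - 1) / 2 / space
    # Poisson approximation for "at least one pair shares a password".
    p_any_pair = -math.expm1(-expected_pairs)
    return p_yours, expected_pairs, p_any_pair

p_yours, expected_pairs, p_any_pair = shared_password_odds(1_000_000, 10**12)
print(f"someone among a million has your password: {p_yours:.2e}")
print(f"expected number of pairs sharing a password: {expected_pairs:.2f}")
print(f"chance that at least one pair shares: {p_any_pair:.0%}")
```

With a million people and a trillion possibilities, the expected number of sharing pairs comes out at about one half, which is why a single shared string of gibberish is a reasonable thing to see at the top of the list.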

If you count up the most popular passwords in this idealized scenario of everyone picking a random password out of a trillion possibilities, you'll get a fairly tedious list:
  • 1: some string of random gibberish, shared by two people
  • 2-999,999: Other strings of random gibberish, 999,998 in all
Now suppose that seven people didn't get the memo.  Four of them choose 123456 and three of them choose password.  The list now looks like
  • 1: 123456,  shared by four people
  • 2: password,  shared by three people
  • 3: some string of random gibberish, shared by two people
  • 4-999,994:  Other strings of random gibberish, 999,991 in all
Those seven people are pretty likely to have their passwords hacked, but overall password hygiene is still quite good -- 99.9993% of people picked a good password.  It's certainly better than if 499,999 people picked 123456, 499,998 picked password, two happened to pick the same strong password and the one remaining person picked a different strong password, even though the top three rankings would be the same as above.
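To make the comparison concrete, here's a small sketch that builds both populations and looks at the ranking next to the share of people actually using the top passwords.  The placeholder names like gibberish-0 stand in for random strings, and the helper names are mine, not anything standard.

```python
from collections import Counter

def ranking(counts: Counter, n: int) -> list[str]:
    """The n most common passwords, most popular first."""
    return [password for password, _ in counts.most_common(n)]

def top_share(counts: Counter, n: int) -> float:
    """Fraction of all people who use one of the n most common passwords."""
    total = sum(counts.values())
    return sum(count for _, count in counts.most_common(n)) / total

# Scenario 1: seven people didn't get the memo; one pair collided by chance.
mostly_good = Counter({"123456": 4, "password": 3, "gibberish-0": 2})
mostly_good.update(f"gibberish-{i}" for i in range(1, 999_992))

# Scenario 2: almost everyone picked one of the two bad passwords.
mostly_bad = Counter(
    {"123456": 499_999, "password": 499_998, "gibberish-0": 2, "gibberish-1": 1}
)

for name, counts in [("mostly good", mostly_good), ("mostly bad", mostly_bad)]:
    print(name, ranking(counts, 3), f"{top_share(counts, 2):.4%}")
```

Both populations have a million people and the same top three, but the share using the top two passwords is 0.0007% in one and 99.9997% in the other.  The ranking alone can't tell them apart.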

Likewise, if you see a list of 20 worst passwords taken from 5 million leaked passwords, that could mean anything from a few hundred people having picked bad passwords to everyone having done so.  It would be more interesting to report how many people picked popular passwords as opposed to unique ones, but that doesn't seem to make its way into the "wow, everyone's still picking bad passwords" stories.

From what I was able to dig up, that portion is probably around 10%.  Not great, but not horrible, and probably less than it was ten years ago.  But as long as some people are picking bad passwords, the lists will stay around and the headlines will be the same, regardless of whether most people are doing a better job.

(I would have provided a link for that 10%, but the site I found it on had a bunch of broken links and didn't seem to have a nice tabular summary of bad passwords vs other passwords from year to year, so I didn't bother.)

*A password space of a trillion possibilities is actually pretty small.  Cracking passwords is roughly the same problem as the hash-based proof-of-work that cryptocurrencies use.  Bitcoin is currently doing around 100 million trillion hashes per second, or a trillion trillion hashes every two or three hours.  The Bitcoin network isn't trying to break your password, but it'll do for estimating purposes.  If you have around 100 bits of entropy, for example if you choose a random sequence of fifteen characters from an alphabet of capital and lowercase letters, digits and 30 special characters, it would take a password-cracking network comparable to the Bitcoin network around 400 years to try every possibility.  That's probably good enough.  By that time, password cracking will probably have advanced far beyond where we are and, who knows, maybe we'll have stopped using passwords by then.
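For anyone who wants to check the footnote's numbers, here's a back-of-the-envelope sketch.  It takes the 100 million trillion hashes per second figure above as given and assumes one hash per guess, which is a simplification; the function names are only for illustration.

```python
import math

HASHES_PER_SECOND = 1e20              # "100 million trillion", the figure above
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a password of `length` symbols chosen uniformly at random."""
    return length * math.log2(alphabet_size)

def years_to_exhaust(bits: float, rate: float = HASHES_PER_SECOND) -> float:
    """Years needed to try every one of 2**bits possibilities at `rate` per second."""
    return 2.0 ** bits / rate / SECONDS_PER_YEAR

alphabet = 26 + 26 + 10 + 30          # capitals, lowercase, digits, 30 specials
bits = entropy_bits(alphabet, 15)
hours_per_trillion_trillion = 1e24 / HASHES_PER_SECOND / 3600

print(f"15 characters from {alphabet}: {bits:.1f} bits")
print(f"a trillion trillion hashes: {hours_per_trillion_trillion:.1f} hours")
print(f"exhausting 100 bits: {years_to_exhaust(100):.0f} years")
```

Fifteen characters from a 92-symbol alphabet come to a little under 98 bits, so "around 100 bits" is a slight round-up, and on average an attacker would find a given password after searching about half the space.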

2 comments:

earl said...

Most of us don't use strings of random gibberish, because we can't remember them and anyway we don't really know what "random" means, and if we do it's too much bother. Mostly we use what we hope are unusual strings of camouflaged private jokes and make them long enough that we hope they're ok.

Are they?

And when do we do away with passwords?

David Hull said...

This is why major tech companies have gotten into the vouching business. A lot of sites these days have "Log in using Facebook/Google", meaning "If you can convince FB/Google that you're you, I'll take their word for it". For sites that don't use that, I believe all the major browsers have some facility for storing passwords, so you only have to authenticate to them (e.g., sign into Google/Gmail using Chrome) and they'll fill in passwords when needed. They'll also create strings of random gibberish that you don't have to remember.

There are a couple of obvious possible problems with this. If your main account is compromised, so is everything else (put all your eggs in one basket ... then watch that basket!) and if you ever have to log in without access to your main account, you're stuck. The second isn't such a problem if you have a smartphone, but that also makes it extra important to keep your phone secure.

If you don't want to go either of those routes, I'd suggest the "short random words" approach from this post. The trick is to use the generated password a few times early on so it sticks in your head. Something like "richtwinacelane5" (the example from the post) is not too hard to remember, but still has pretty good security.

Camouflaged private jokes might or might not be OK, depending on the joke and the camouflage. If, for example, you use the first letters of the words in some privately meaningful sentence, that will skew towards common starting letters (t o a w b c d s f m r h i y e g l n p u j k ...). It could still work, but you've probably only got a few bits of entropy per letter.

What really matters is the algorithm/dictionary your attacker will be using. The advantage of generating passwords randomly is that it doesn't really matter what that algorithm/dictionary is.
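For what it's worth, here's a minimal sketch of the "short random words" idea mentioned above. It is not the exact scheme from the linked post: the tiny word list, the four-words-plus-a-digit format and the function names are all assumptions for illustration, and a real list would need to be much longer.

```python
import math
import secrets

# Hypothetical stand-in word list; a real one would have thousands of entries.
WORDS = ["rich", "twin", "ace", "lane", "bold", "cave", "dusk", "fern"]

def random_word_password(words: list[str], n_words: int = 4) -> str:
    """Join n_words uniformly chosen words and append one random digit."""
    # secrets draws from the operating system's cryptographic random source.
    chosen = [secrets.choice(words) for _ in range(n_words)]
    return "".join(chosen) + str(secrets.randbelow(10))

def scheme_entropy_bits(list_size: int, n_words: int = 4) -> float:
    """Entropy of the scheme itself, assuming the attacker knows the word list."""
    return n_words * math.log2(list_size) + math.log2(10)

password = random_word_password(WORDS)
print(password)
print(f"{scheme_entropy_bits(len(WORDS)):.1f} bits with this toy list")
print(f"{scheme_entropy_bits(2048):.1f} bits with a 2048-word list")
```

The entropy depends only on the size of the list and the number of words, not on how odd the result looks, which is the point about not caring what dictionary the attacker uses.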