Saturday, August 27, 2011

Building a better password

[I've updated this post slightly to reflect the back-of-the-envelope calculation in this post suggesting that 100 bits of entropy is probably more reasonable than my original statement that 48 bits was "not bad".  Under the assumptions in that post, a 48-bit password would take on the order of microseconds to crack --D.H. Feb 2020]

I've recently complained about the irritating nature of the password strength checkers that have been popping up everywhere, so I feel obliged at least to try to analyze the problem and offer solutions.  This is leaving aside the question of whether password authentication is a useful approach at all.

Fundamentally the real measure of password strength is how many passwords you'd expect to have to guess in order to get the right one.  A more formal version of this is the notion of bits of entropy.  If you had a list of all possible passwords in your scheme, I could identify any particular one so long as I could get answers to a series of yes/no questions, for example:  "Is it in the first half of the list or the last?",   "Is it in the first half of that half or the last?" and so forth.  The number of such questions I need is the number of bits of entropy.  Twenty questions means twenty bits, etc..

If I know that your password is either "0" or "1", you have exactly one bit of entropy.  If I know it's an uppercase letter, lowercase letter, digit, "$" or "%", there are 64 possibilities, so you have 6 bits of entropy.  If I know it's two such characters, you have 12 bits, and if it's seventeen such characters you have 102 bits, which is not too bad.  Someone trying to guess your password would have to guess about two thousand billion billion billion passwords, on average, before stumbling on yours.  That may seem like a lot, but keep in mind that the current network of Bitcoin miners can try on the order of a hundred thousand billion billion hashes -- roughly the same problem as guessing a password -- every second.

[Don't assume that guessing a password requires typing it in to the same text box you have to use.  If someone steals the right data from your service provider, they can throw as much computing power as they've got at guessing the passwords.  Quite possibly they'll be happy enough just to try a few thousand weak passwords for each account, since that will crack depressingly many, but attacks like running through the OED with simple substitutions of letters for numbers are absolutely feasible as well, even on fairly ordinary hardware.]

This is assuming that you picked eight characters at random.  If I knew instead that your password was either "F1%ldN0t3$" or "sasssafras" (maybe I'd watched you read your password off a piece of paper with only those two words on it but couldn't quite see which you were typing), then you have only a single bit of entropy, even though both passwords are not just eight but ten characters long and one has plenty of non-letters.

More realistically, if I knew you'd picked an uncommon English word and maybe changed some of the letters to numbers, you'd have somewhere around two dozen bits of entropy.  That's not nothing, but keeping in mind that each added bit doubles the number of passwords a cracker has to try, it's nearly a billion billion billion times weaker than the 102-bit scheme above.

The fundamental flaw of password strength checkers is that they can only look at the password you gave them.  They have no idea what other possible passwords you might have chosen.  The assumption is that if you're forced to jump through enough hoops you'll be forced to expand your parameters, but in fact it's possible to generate passwords in a secure manner using only letters, and or to generate them insecurely in a way that will still satisfy any strength checker out there.  Which is why I half-grimace, half-laugh when I see the "password strength indicator" jump from "poor" to "great" as soon as I type a number.

Now, it's perfectly possible to generate completely random 17-character passwords.  The problem is that something like "qcrQf1x2" or "u%js%hPQ" is a pain to try to memorize, so most people will fall back to picking a "hard" word and maybe altering it a bit.  However, as xkcd points out, it's possible to do a lot better by using random short words.

For example, here's a kind of clunky way of producing a random, memorable password:

BIG HONKING DISCLAIMER: This is just for demo purposes.  The second site I mention uses http, not https, so in theory anyone could be looking in on your session.  Even with https, the sites might be logging all your traffic and recording the results you come up with.  I personally seriously doubt they would, and it's hard to imagine they would be able to connect the dots and figure out what you were using the generated password for, but if you really want to be on solid ground, get the source, look it over, run it locally and use something like /dev/urandom or D&D dice to generate the random input (23d20 will give you close to 100 bits ... not that I would have any idea at all what "23d20" means).  There are also smartphone apps that do more or less the same thing, I believe.

[I last checked that the recipe below worked on 28 Feb 2020]

With that out of the way:
  • Go to this site and copy the random string you see there (e.g., 60990FFC250C).  If for some reason you don't like what you see, just reload.
  • Go to this site.
  • Type some short number and a space into the Challenge box and paste the random string from the first step in after it (e.g. 123 60990FFC250C)
  • Type anything at all into the Secret box (e.g., "secret").  This doesn't have to be hard to guess.  The real entropy is coming from the random string (alternatively, put any number you like, a space, and anything else into the "challenge" box and paste the random bytes into the "secret" box).
  • Press the Compute with SHA-1 button.  Again, the cryptographic details of how strong SHA-1 is don't matter here.  You're just converting a random number to short words.  A simple table lookup would do just as well.
In the Response box you will see six short words followed by some hexadecimal gibberish (in this case, WOVE COOT SLEW WIT SIGH I (FE2D 5F7B 22CD BC39)).  Each of those words represents just over 10 bits of entropy.  We'll need ten words in all, so repeat the procedure but this time just take the first four words (I got FIRE CUFF GALA MINK from A4B455FEFFE7BFAD).

You can play around with this formula to get words that are easier to memorize, or type, or are just more to your liking.  If you reorder your words or try typing in several different things instead of 123 or secret and then picking what you like, you're decreasing your entropy, depending on what criteria you're using to filter out passphrases you don't like and whether your attacker knows what kinds of phrases you like.  If you just try a few different secrets until you see something that seems memorable, that should be fine.  If you do something like sort the words (and your attacker knows only to try sorted lists of words), you've lost almost 22 bits of entropy, which wouldn't be good.

Once you've selected your words add a random punctuation character, number, capital letters or whatever makes your site's password strength checker happy.  Voila!  Your password is now Wovecootslewwitsighifirecuffgalamink5? or whatever.  This isn't great to have to type, but it's pretty secure as passwords go, and probably better than trying to remember something like C;cTbfThoO4ePFTt or 67EE386A205C4563DB8908A6C4.

If your site's password checker imposes an 8-character limit (and, incredibly enough, some do), cry.

4 comments:

earl said...

http://www.gocomics.com/pearlsbeforeswine/2011/08/22

indonesiatooverseas said...

sometime it’s hard to me for remember what my password is if the password too long and complicate

David Hull said...

The important thing is to find something that's memorable to you, but not easy to guess. Different people's memories work differently. If the short random words approach doesn't work, a couple more are:

Shocking nonsense, which recommends picking random words that are memorable for being offensive in some way. Personally, I'd rather not have logging in be a shocking experience just for the sake of being memorable. Also, I'd rather not find out I'd just typed something obscene into a key logger or a chat window when I thought I was logging in.

Taking the first letters of a memorable phrase (shocking or otherwise). Including punctuation and capitalizing the first letter will probably also keep strength checkers happy. Such a password will be reasonably short (but not too short, at a guess you'd want at least a dozen characters), and if you picked something offensive and someone does get your password, you can always say it stood for something else. On the other hand, such passwords can be difficult to type, and you have to remember what punctuation to include (was that comma there or not ... or did I use a dash?).

In any case, one fundamental rule of passwords is to pick something that, as far as you can tell, has never been written down anywhere, ever. For the first-letters approach, that would also mean that the underlying sentence should be brand new.

Hope that helps.

earl said...

My password may never have been written down, but it's going to be, because I can't remember it. Of course, that means an attacker has to get into my house and find that piece of paper. Could happen, but I'm not lying awake...