Tuesday, August 30, 2011

Considerate software

I first heard the motto "Considerate software remembers" a job or two ago from interaction designer  Carl Seglem, who credited it to Alan Cooper of About Face fame.  The phrase has stuck in my head ever since, so the other day I went searching for it and found this extract on codinghorror.com.

There's a lot to like about the very idea of considerate software.  If I'm using a piece of software, I want it to do something for me.  I'm going to be devoting a great deal of attention to it, asking it to do this or that and expecting responses to those requests.  Ideally, someone or something I'm working with that closely will treat me considerately, just as I should make every effort to treat a person I'm working with considerately.

More subtly, the metaphor of considerate software cuts the designers and implementors of the software completely out of the picture.  This is surely deliberate and completely appropriate.  Once software is deployed, the designers and implementors are out of the picture.  I can't come and ask them how to deal with some puzzling or frustrating bit of behavior (and lucky for them, sometimes).  As far as I'm concerned it's the software that's being helpful or annoying.

There are clearly limits on how considerate software could possibly be.  If I decide to type in a long treatise on considerate software into the "shipping address" field of some form, I wouldn't expect the app to respond "Why yes, that's very interesting.  I personally find Cooper's work exemplary.  Shall we continue this conversation over coffee?"  However, it doesn't seem too much to expect a politely phrased, helpful response pointing out that "I first heard the motto ..." followed by several paragraphs does not look like a valid street address.

I don't need to go into detail here about how far short much software falls in this regard.  I'm sure you've got your own examples.  Neither do I want to go into how and why software comes to be inconsiderate, though that's an interesting topic in itself.  Instead, I'd like to go into what qualities make software considerate or inconsiderate.

The list I referred to above hits a lot of interesting points, but it feels more like a list of this and that than a thorough taxonomy.  In particular, the headings, while snappy, don't always seem to match up well with what they head.

Some of the points fall under "Considerate software remembers":
  • "Considerate software takes an interest" is really just saying it shouldn't ask for the same information over and over.  That is, it should remember what you've already told it.
  • "Considerate software is perceptive" says that software should remember what we do.  It also says that it should adapt its behavior based on what it knows.  More on that shortly.
  • "Considerate software takes responsibility." says that software should remember where it is and be able to restore its state as closely as possible to where it had been before something derailed it.
Other points assert that software should know the kinds of things that we know and it can reasonably be expected to know:
  • "Considerate software uses common sense."  Common sense is not some magical filter that separates sensible behavior from senseless.  It's largely a body of knowledge, whether learned or instinctive.  To keep from, say, sending a check for $0, it needs to know that checks should only be sent for positive amounts.
  • "Considerate software anticipates needs."  To anticipate needs, a piece of software needs to know what those needs are.
  • "Considerate software knows when to bend the rules." is saying that it should know how (and when) to do more than just the narrow definition of its task.
  • "Considerate software is forthcoming." says primarily that software should actually tell us useful information that it knows, but to do that it may need to know information outside a narrow view of what it should be doing.
A third set has more to do with knowing when and when not to offer information:
  • "Considerate software keeps you informed/is forthcoming." Not only should it know useful things we didn't specifically ask it to know, it should let us know that and modify its behavior accordingly.  But ...
  • "Considerate software doesn't burden you with its personal problems/is self-confident/doesn't ask a lot of questions." It should limit itself to interactions useful to us, present information in ways that are easy for us to absorb and ask for information in ways that are easy for us to present.
A couple seem more about letting us exercise our judgment instead of trying to exercise it for us:
  • "Considerate software is deferential."  Software should not prohibit things that might be useful.  Instead it should make sure we know the consequences of a choice and then let us make it.  It occurs to me that the "undo" feature is particularly helpful here.
  • "Considerate software is conscientious." The principle here seems to be that software should know that some things are dangerous and not simply assume that we mean to do them.
Taking a stab at boiling this all down:
  • Considerate software knows as much as reasonably possible about its domain.
  • Considerate software remembers what's happened, what we've told it and what it's told us.
  • Considerate software modifies its behavior where appropriate based on the above.
  • Considerate software gives us ways to access what it knows (including the state of the world as it used to be).
  • Considerate software actively tells us important things we might not already know.
  • Considerate software communicates efficiently -- taking into account how human minds work.
These principles seem fairly universal, but it's worth noting that one of the first extensions to the original web protocols, and one that enabled major improvements in the experience of using the web, was the cookie -- a way of letting a web site remember things that have happened before and, ideally, act accordingly.

Saturday, August 27, 2011

Building a better password

I've recently complained about the irritating nature of the password strength checkers that have been popping up everywhere, so I feel obliged at least to try to analyze the problem and offer solutions.  This is leaving aside the question of whether password authentication is a useful approach at all.

Fundamentally, the real measure of password strength is how many passwords you'd expect to have to guess in order to get the right one.  A more formal version of this is the notion of bits of entropy.  If you had a list of all possible passwords in your scheme, I could identify any particular one so long as I could get answers to a series of yes/no questions, for example:  "Is it in the first half of the list or the last?",  "Is it in the first half of that half or the last?" and so forth.  The number of such questions I need is the number of bits of entropy.  Twenty questions means twenty bits, etc.

If I know that your password is either "0" or "1", you have exactly one bit of entropy.  If I know it's an uppercase letter, lowercase letter, digit, "$" or "%", there are 64 possibilities, so you have 6 bits of entropy.  If I know it's two such characters, you have 12 bits, and if it's eight such characters you have 48 bits, which is not too bad.  Someone trying to guess your password would have to guess about 140 trillion passwords, on average, before stumbling on yours.
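Here's a rough sketch of that arithmetic in Python, assuming every character is picked independently and uniformly at random (the function names are mine, purely for illustration):

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy for `length` characters, each drawn uniformly at random."""
    return length * math.log2(alphabet_size)

def average_guesses(bits: float) -> float:
    """On average a guesser gets through about half the possibilities."""
    return 2 ** bits / 2

# Uppercase, lowercase, digits, "$" and "%"
alphabet = 26 + 26 + 10 + 2

for length in (1, 2, 8):
    bits = entropy_bits(alphabet, length)
    print(f"{length} character(s): {bits:.0f} bits, "
          f"about {average_guesses(bits):,.0f} guesses on average")
```

For eight characters this works out to 48 bits and a bit over 140 trillion guesses, matching the figures above.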

[Don't assume that guessing a password requires typing it in to the same text box you have to use.  If someone steals the right data from your service provider, they can throw as much computing power as they've got at guessing the passwords.  Quite possibly they'll be happy enough just to try a few thousand weak passwords for each account, since that will crack depressingly many, but attacks like running through the OED with simple substitutions of numbers for letters are absolutely feasible as well.  Here's an article from 2012 about hardware that can guess 350 billion Windows passwords per second.  48 bits suddenly doesn't seem like so much.]
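To put that guessing rate in perspective, here's a back-of-the-envelope sketch.  It assumes the attacker sustains the quoted 350 billion guesses per second against whatever is being cracked, which is almost certainly not true for every kind of stored password; the function name and the list of bit counts are just for illustration.

```python
GUESSES_PER_SECOND = 350e9  # the rate quoted above; an assumption for any other setup

def average_seconds_to_guess(bits: float, rate: float = GUESSES_PER_SECOND) -> float:
    """Average time to hit the password: half the search space divided by the rate."""
    return (2 ** bits / 2) / rate

for bits in (24, 48, 55, 66):
    seconds = average_seconds_to_guess(bits)
    print(f"{bits} bits: {seconds:,.1f} seconds ({seconds / 86400:,.2f} days)")
```

At that rate a truly random 48-bit password lasts, on average, roughly 400 seconds, which is under seven minutes.  Each extra bit doubles that.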

This is assuming that you picked eight characters at random.  If I knew instead that your password was either "F1%ldN0t3$" or "sasssafras" (maybe I'd watched you read your password off a piece of paper with only those two words on it but couldn't quite see which you were typing), then you have only a single bit of entropy, even though both passwords are not just eight but ten characters long and one has plenty of non-letters.

More realistically, if I knew you'd picked an uncommon English word and maybe changed some of the letters to numbers, you'd have somewhere around two dozen bits of entropy.  That's not trivial, but keeping in mind that each added bit doubles the number of passwords a cracker has to try, it's tens of millions of times weaker than the 48-bit scheme above.

The fundamental flaw of password strength checkers is that they can only look at the password you gave them.  They have no idea what other possible passwords you might have chosen.  The assumption is that if you're forced to jump through enough hoops you'll be forced to expand your parameters, but in fact it's possible to generate passwords in a secure manner using only letters, or to generate them insecurely to satisfy any strength checker out there.  Which is why I half-grimace, half-laugh when I see the "password strength indicator" jump from "poor" to "great" as soon as I type a number.

Now, it's perfectly possible to generate completely random 8-character passwords.  The problem is that something like "qcrQf1x2" or "u%js%hPQ" is a pain to try to memorize, so most people will fall back to picking a "hard" word and maybe altering it a bit.  However, as xkcd points out, it's possible to do a lot better by using random short words.

For example, here's a kind of clunky way of producing a random, memorable password:

BIG HONKING DISCLAIMER: This is just for demo purposes.  The second site I mention uses http, not https, so in theory anyone could be looking in on your session.  Even with https, the sites might be logging all your traffic and recording the results you come up with.  I personally seriously doubt they would, and it's hard to imagine they would be able to connect the dots and figure out what you were using the generated password for, but if you really want to be on solid ground, get the source, look it over, run it locally and use something like /dev/urandom or D&D dice to generate the random input (15d10 will give you close to 50 bits ... not that I would have any idea at all what "15d10" means).  There are also smartphone apps that do more or less the same thing, I believe.

[I last checked that this (slightly updated) recipe worked on 4 Nov 2018]

With that out of the way:
  • Go to this site and copy the random string you see there (e.g., A6727933B0169E89).  To get more randomness, just reload the page.
  • Go to this site.
  • Type some short number and a space into the Challenge box and paste the random string from the first step in after it (e.g. 123 A6727933B0169E89)
  • Type anything at all into the Secret box (e.g., "secret").  This doesn't have to be hard to guess.  The real entropy is coming from the random string (alternatively, put any number you like, a space, and anything else into the "challenge" box and paste the random bytes into the "secret" box).
  • Press the Compute with SHA-1 button.  Again, the cryptographic details of how strong SHA-1 is don't matter here.  You're just converting a random number to short words.  A simple table lookup would do just as well.
In the Response box you will see six short words followed by some hexadecimal gibberish (in this case, LANE RICH ACE HIS TWIN BLUE (AA59 E401 0D7F 1CB3)).  Each of those words represents 11 bits of entropy (technically, slightly less, since we only started with 64 random bits, but who's counting?).  Take at least four, preferably five or all six.  If you don't like those words, just try again with fresh random bits, or change the number you used in the "challenge" box, or ask for more than one password in the "Compute __ passwords" box.  This will give you a fresh window with however many six-word passwords you asked for.

Feel free to take the short words you need from them in any order you like. Strictly speaking, this may reduce entropy since it will bias towards things that make sense, but that shouldn't be a huge problem if you, say, generate six passwords and take one word from each.  If you really want more entropy, go through the process twice and get more than six words.  The sky's the limit.

Then add a random punctuation character or whatever makes your site's password strength checker happy.  Voila!  Your password is now richtwinacelane5 or whatever.

If your site's password checker imposes an 8-character limit (and, incredibly enough, some do), cry.
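If you'd rather skip the web sites entirely, the "simple table lookup" version can be sketched locally in a few lines of Python.  This is only a sketch under some assumptions: it uses the standard `secrets` module as the source of randomness, the function names are made up for illustration, and the eight-word list is a toy stand-in (a real list would need 2048 distinct short words to get 11 bits per word).

```python
import math
import secrets

def random_passphrase(words, count=5, separator=""):
    """Pick `count` words independently and uniformly at random from `words`."""
    return separator.join(secrets.choice(words) for _ in range(count))

def passphrase_bits(word_list_size, count):
    """Entropy of `count` independent uniform picks from a list of that size."""
    return count * math.log2(word_list_size)

# Toy stand-in list: 8 words is only 3 bits per word.
# Swap in a list of 2048 short words for 11 bits per word.
DEMO_WORDS = ["lane", "rich", "ace", "his", "twin", "blue", "oak", "moss"]

print(random_passphrase(DEMO_WORDS, count=5, separator=" "))
print(f"toy list: {passphrase_bits(len(DEMO_WORDS), 5):.0f} bits")
print(f"2048-word list: {passphrase_bits(2048, 5):.0f} bits")
```

The important part is that the words are chosen by the random number generator and not by you.  Rearranging them into something that "makes sense" afterwards costs a little entropy, as noted above.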

Oh right ... I write a blog, don't I?

A couple of housekeeping items, before I attempt to get back to real blogging:
  • No, I haven't fallen off the face of the Earth, been trapped under a large object or wandered off to Nepal to contemplate the mysteries of the universe.  Just busy, and decided to devote what little blogging bandwidth I've had lately to contemplating the nature of awareness on the other blog.  Hmm ... maybe Nepal wasn't so far off.
  • A couple of logins ago, AdSense advised me that I appeared to have a "popular blog" and I should consider advertising on it.  I'm always glad to know that people are reading Field Notes, but I suspect that AdSense and I have somewhat different notions of "popular".  As much as I would like to bump my employer's revenue stream up by another 0.0000000000000001% or so, I have no plans to do that at the moment or any time soon.  I'm not against running ads per se, but I don't see the point of cluttering up the layout for what I doubt would be any significant gain.  If you ever do start seeing ads here, it will be because there has been a dramatic surge in demand for occasionally-posted web.musings, in which case why not?
  • Prompted by a couple of recent comments, including a couple of completely appropriate ones,  I've settled on a definition of spam comments:  If it's completely independent of the post it's supposedly commenting on, it's spam and will be summarily removed. Mentioning your favorite business as part of a thoughtful response to a post on customer service is just fine.  Mentioning your website, commercial or otherwise, with nothing more than a generic "Hey, great blog!" comment is spam.
  • Mind, I reserve the right to delete any comment for any reason or no reason (hey, it's my blog).  But as a practical matter I'd only expect to do so in cases of spam or incivility, should it occur.  As part of recusing myself from matters Google (and yet still trying to write about the web), I would also remove any speculation about what Google might be up to, be it public information or not, accurate or otherwise.  I don't expect that to be a problem, but thought I'd mention it.
And ... we're back!