Showing posts with label bigger fish to fry. Show all posts

Friday, August 22, 2008

The "bigger fish to fry" rule

A long while back I was talking to a developer about a messaging system that was meant to deliver messages reliably even if the network was misbehaving (in other words, it was like TCP, but on a different time scale). I asked the obvious question: "What happens if the network goes down?"

"The sender keeps track of what messages have been sent. It retries if it doesn't get an acknowledgment back that a message was received."

"But what if, say, the sending machine loses power?"

"There's an option to log to disk. It's slower, but the system will recover when the machine comes back up."

"But what if there's a disk crash when the power goes off?"

"You can configure sending processes on more than one machine. It'll cost you more speed, but if one sender fails, the others will take over."

"But what if they all go down?"

At this point the developer started to lose patience. There's really not much more you can do by then, except keep more copies and reduce the chances of them all being destroyed simultaneously. Sooner or later it's just not worth it, if only because keeping everyone sufficiently in sync takes more and more effort, especially since you're assuming an unreliable network in the first place. There is no 100% reliable system -- in computing or anywhere else that I know of.
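To put rough numbers on "keep more copies", here is a back-of-the-envelope sketch. The 1% failure chance is a made-up figure for illustration, `chance_all_fail` is my own name, and the big assumption is that copies fail independently of one another, which is exactly what stops being true when the building catches fire.

```python
def chance_all_fail(p_single: float, copies: int) -> float:
    """Chance that every copy is lost at once.

    Assumes each copy fails independently with the same probability
    p_single. Real failures are often correlated (shared power, shared
    building), so treat the result as a best case, not a guarantee.
    """
    return p_single ** copies


# Hypothetical figure: each copy has a 1% chance of being lost
# during the period we care about.
P_SINGLE = 0.01

for copies in (1, 2, 3, 5):
    # Each extra copy shrinks the chance of total loss, but never to zero.
    print(f"{copies} copies: {chance_all_fail(P_SINGLE, copies):.0e}")
```

Under these assumptions every extra copy multiplies the chance of total loss by another 0.01, so the number shrinks quickly but never reaches zero, while the cost of keeping the copies in sync only goes up.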

On the other hand, if you had, say, three senders, all logging to disk, keeping in sync over a fast, decently reliable local network (it's the outside network we're most concerned with here), and they all crash unrecoverably, what's going on? Most likely the building is on fire or something similarly bad is happening, and whatever is trying to produce the messages in the first place is probably not able to do its job either. You'd better have enough off-site backups to get things going again after the fire trucks leave.

In such a case, the messaging system is certainly going to fail. But its job is not to be perfect. Its job, and everyone else's, is to be good enough that if it fails there are bigger fish to fry.
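For the curious, here is a toy sketch of the first two answers in that conversation: a sender that remembers what it has sent, retries until it gets an acknowledgment, and can optionally log to disk so it recovers after a restart. This is not the actual system, which I never saw the code for. The `Sender` class, the `transmit` callback and the JSON log file are all assumptions of mine, and multiple cooperating senders are left out entirely.

```python
import json
import os
import tempfile


class Sender:
    """Toy sender: tracks unacknowledged messages and retries them."""

    def __init__(self, transmit, log_path=None):
        self.transmit = transmit    # hypothetical callback: True means "ack received"
        self.log_path = log_path    # None means no disk log (faster, less safe)
        self.pending = {}           # message id -> body, for everything not yet acked
        if log_path and os.path.exists(log_path):
            # Recover unacknowledged messages left behind by a previous run.
            with open(log_path) as f:
                self.pending = {int(k): v for k, v in json.load(f).items()}
        self.next_id = max(self.pending, default=-1) + 1

    def _save(self):
        # Persist pending messages; rewriting the whole file is the slow part.
        if self.log_path:
            with open(self.log_path, "w") as f:
                json.dump(self.pending, f)

    def send(self, body):
        """Queue a message; it stays pending until acknowledged."""
        msg_id = self.next_id
        self.next_id += 1
        self.pending[msg_id] = body
        self._save()
        return msg_id

    def flush(self, max_attempts=5):
        """Retry pending messages; return True once all are acknowledged."""
        for _ in range(max_attempts):
            for msg_id, body in list(self.pending.items()):
                if self.transmit(msg_id, body):
                    del self.pending[msg_id]
            self._save()
            if not self.pending:
                return True
        return False


# A misbehaving network: the first two transmissions get no acknowledgment.
attempts = {"count": 0}

def flaky_network(msg_id, body):
    attempts["count"] += 1
    return attempts["count"] > 2

sender = Sender(flaky_network)
sender.send("hello")
delivered = sender.flush()

# Power loss: the first sender never gets through, a second one picks up its log.
log_file = os.path.join(tempfile.mkdtemp(), "sender-log.json")
before_crash = Sender(lambda msg_id, body: False, log_path=log_file)
before_crash.send("important")
gave_up = not before_crash.flush(max_attempts=2)
after_restart = Sender(lambda msg_id, body: True, log_path=log_file)
recovered = dict(after_restart.pending)
redelivered = after_restart.flush()
```

Even in a toy like this the tradeoff from the conversation shows up: every `send` with a log path rewrites a file, which is where the "it's slower" comes from, and nothing here helps if the disk itself is gone.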

Who owns the cloud?

Along the lines of "the usual yada yada," NPR recently ran a story on the downside of storing important personal data -- email, pictures, schedules, secret recipes, whatever -- "in the cloud", that is, online somewhere, you neither know nor care where, conveniently managed and backed up by someone else.

They mention three main problems:
  • Your provider could fold, taking your data with it.
  • Depending on the terms of its yada yada, your provider could shut down your account for any number of reasons beyond your control. For example, a random person could tell them, without proof, that they think you're engaged in committing a crime.
  • Again depending on the terms of the yada yada, the provider might share your data with anyone and everyone.
Now, without meaning to be harsh on anyone (when was the last time I scraped a copy of this blog?), these seem like problems one could anticipate, if only on the basis that in any sweet deal, there's got to be a catch someplace. But that doesn't stop them from being serious concerns.

The holy grail here is a service whereby your data is:
  • Safe: It won't go away, barring disasters in multiple, geographically separated sites (in which case there are probably bigger fish to fry). You may lose access to it, whether because you don't have connectivity, or because your provider folded and the data is temporarily in escrow, or because you really are accused of a crime, or whatever.
  • Secure: Only you can get at it. If your provider leaks your data, it's liable up to some fairly substantial point. If you lose your keys, you can have them replaced conveniently.
  • Yours: You have the rights to whatever you store (provided you created it or otherwise had the rights to it in the first place) unless and until you explicitly sign them away. As I understand it, this is one of the key tenets behind personal datastores.
In most cases, providers are implicitly suggesting this kind of service, and since no one reads the yada yada, everyone is expecting it. Providers also have a strong incentive to make this level of service a reality. If what they actually deliver falls too far short, word will eventually get out and fewer people will want to buy in. In particular, the chances of one of the major players folding and taking your data down, or simply losing large portions of it, appear fairly small. Not zero by any means, but fairly small.

On the other hand, there is probably room for a few well-placed regulations to help things along here. In particular:
  • That data held by a provider that goes out of business should go into escrow and be made available to former customers for a reasonable period.
  • That data should remain private unless specifically made public.