Friday, August 31, 2007

Deutsch's "Fallacies of distributed computing"

For a while, I had Peter Deutsch's list of eight fallacies of distributed computing on my office wall for easy reference. I still refer to them often. It seems to me that our next magical trick, and one we're making some progress in being able to pull off, is to make end users think the fallacies actually hold, while simultaneously varying how true they are in practice.

That is, if I'm using the web, I see a network that
  • is reliable
  • is secure
  • has as much bandwidth as I want
  • with negligible latency
  • and negligible transport cost
  • always looks the same
  • looks the same everywhere
  • and has one administrative point of contact
Actually, most of the time I don't even see the network. I see resources on the network (that is, I see the web) with a dead-simple topology: the resources are right there. Administration largely means saying which resources I want access to and what access I grant to ones I own.

Meanwhile, we web illusionists work to push the various bandwidth/reliability/mobility trade-offs further out, cache and use other tricks to reduce latency and increase apparent bandwidth, streamline the administrative interfaces, get disparate networks talking to one another politely, and otherwise spin the illusion that none of the problems we deal with need to be dealt with.

E-passports in 4G

Re-reading, it occurred to me that the "let me into my car and onto the plane" function of the 4G pocket-thing smelled a lot like an e-passport (one with an RFID chip and some sort of biometric hash of the bearer built in). The main difference, I think, is that current RFID technology will happily broadcast information to anyone who asks. The pocket-thing (or, I fervently hope, the next generation of e-passports) will use the usual PK mojo to make sure it only gives out information to parties it trusts. That won't cure every ill, but it would certainly help.

It's also worth noting that biometrically-secured hardware is just another variant of dongle-based copy protection. How well it works depends on how closely it can be tied to a physical event. If someone in the security line is physically checking that it's my actual eye that's being scanned and compared to the hash in my passport or whatever, then I'm much more comfortable than if I'm just supposed to stand in front of a scanner at a door.

Not to say that people can't be fooled, just that people are currently better at the "is someone actually standing there" test. This may have to be revisited in a couple of decades.

4G, BodyNet and future web experiences

It seems the buzz in the wireless world is over 4G: ubiquitous, fast, IP-based connectivity that doesn't care what it's carrying or who's carrying it. You'll hear the word "convergence" swirling around this, which might bring back ugly memories, but there was never anything wrong with the premise that phones, TVs and computers were going to meld into each other. The problem was with the notion it was all going to happen tomorrow, smoothly and using some particular favorite technology.

Convergence is happening, but by fits and starts, with mixed support from various industry players, over relatively long periods of time and in not-always-predictable ways. At least, I'll happily claim ignorance as to exactly how it will all play out.

Meanwhile, smaller-scale technologies like Bluetooth and to some extent WiFi have helped decouple devices from each other. My phone -- a pretty basic model -- doesn't care whether it's using its built-in mic and speaker, my headset or the system in the car. I believe it also has some limited MP3 capability. This is just a taste of what Olin Shivers outlined in 1993 in BodyNet, though as always it's interesting to compare the vision with actuality.

Put those two trends together and assume that the market and we geeks will somehow make all this happen, and we could end up with a pretty slick setup. Everything is IP, so there's no such thing as phone service or TV service per se. It's all just bits. Peripherals like displays and speakers that today are just starting to be net-aware are net-aware by default. Transceivers like phones and routers use SDR (software-defined radio) to not care whether you're talking son-of-GSM, son-of-CDMA or something else entirely.

Here's a sketch of what it might look like. I make no claim of originality here. All the ideas are already in the air, and many of them already exist in some concrete form:

I'm at home getting ready to leave on a business trip. I'm listening to a radio news program (since it's news, it's probably live, but it could just as well be a playlist from my collection). In my pocket is something that looks more or less like a present-day phone or PDA. I slip on some sort of headset assembly with ear buds and microphone. My news program is still playing, but it's on the headset. The house speakers go mute as I leave.

I go out and get in my car. The thing in my pocket tells my car that it's satisfied that I'm with it -- maybe I pressed my thumb against it, maybe it uses some kind of magic. They use appropriate PK mojo to establish trust in one another. The car opens up and I get in. I have the option of leaving the sound in the headset (say, if I'm carpooling and the art of conversation has died) or putting it on the car speakers (as I do on this trip since it's just me).

I get a phone call. The audio stream pauses while I take the call and resumes when I'm done. The navigation system notifies me that there's a traffic jam on my usual route and suggests an alternate. I've only recently subscribed to this service. Just this morning, in fact. It wasn't a feature that the car's manufacturer planned for. Quite possibly the thing in my pocket is doing most of the negotiations behind the scenes and the car's navigation system is just following its cue, but I neither know nor care.

I get to the airport and follow the car's guidance to a good parking place. I'm not carrying a laptop per se. I'm carrying a keyboard/pad and display, though I contemplated leaving them behind. I get them out while I'm waiting at the gate and work on my presentation. They use the thing in my pocket for CPU, local cache and connectivity. At home, the system might use my router and non-mobile connection for better bandwidth or possibly because it's cheaper, but again I neither know nor care.

I get on the plane. The thing in my pocket tells the plane that I'm here. The plane knows I'm ticketed. The pocket-thing may also help me through security, but they may want better proof than a possibly-hacked piece of hardware. My keyboard and display are packed away because the ones provided on the plane are a nicer fit for that space. I board, settle in and finish up a last bit of work before I pick out a movie from my collection.

Another phone call comes in, the caller ID flashing on the screen. It's not urgent so I let it ring through. It's a short flight and I want some shuteye, so I pause the movie. The plane lands. I get off and follow the instructions from the voice in my headset to the train platform. Once I'm seated comfortably, I check my voicemail, deal with the call, then pull out my display and watch a bit more of the movie. Or maybe I'd rather just watch it on the small screen of the pocket-thing.

At the client site I pull up a desk, pull out my keyboard (because I like it) but use the display on the desk (because it's bigger and sharper than my portable one). It'll be a similar story at the hotel. I'll say hi to the staff, go up to my room, which will let me in on cue from the pocket-thing and lock behind me when I go. I'll watch the rest of the movie on the in-room gigundo-screen surround sound system. Nice perk, that. But first I'll call home and videoconference with the family, then do a little web browsing on the long-term effects of living in a constant radio bath.

Most likely, though, I'll still have brought a book along.

Thursday, August 30, 2007

A notable web site

If you ever need some random bits, say for a password or other magic number, there's always John Walker's HotBits generator. Nestled somewhere in the hills of Switzerland, it uses a (moderately) radioactive source (in a shielded location in the basement) to generate truly random bits.

There are, of course, other ways of getting high-quality random bits, including some that aren't radioactive -- LavaRnd comes to mind. HotBits is one of the older ones on the web and so has a certain charm and hack value to it.

Even if you don't need any random bits today, Walker's fourmilab site is worth visiting on its own merits. Walker was a founder of Autodesk, makers of, among other things, the hugely successful AutoCAD, which Walker co-authored. Some of the other highlights, in no particular order:
  • The Autodesk File, chronicling the early years of a startup from the PC boom (yes, there were booms before the internet boom). This is a personal pick. Autodesk crushed a PC-CAD startup I was involved with very early in my career. Eventually, it even bought out one of the remaining pieces. The Autodesk File helped me understand how this all happened, but it should also be of general interest. The more things change ...
  • Ada Augusta, Countess of Lovelace's translation and extensive commentary Sketch of the Analytical Engine. That would be Babbage's analytical engine and that would be the Ada they named the programming language after.
  • A bunch of interesting astronomy and space pages, including an earth and moon viewer and an applet for plotting orbits around a black hole or neutron star (in case you ever get the chance to visit).
  • Essays on a variety of topics whether geekly, business-related, some of each or neither.
  • There's even an online diet planner. Bon appétit!

Wednesday, August 29, 2007

Tag everything!

How many things can I tag?
  • I can tag my email
  • I can tag pictures on the web with flickr (and others)
  • I can tag pictures on my local disk (and on the web) with Picasa
  • I can tag web sites on (and others)
  • I can tag and rate songs with my favorite media player
  • I can rate movies and books through Amazon, Netflix, Blockbuster etc.
  • I can tag files on disk, to varying extents, depending on my operating system
  • I can tag entries in my calendar, depending on my calendar app
It would appear there's a bit of consolidation to be done ...

WiFi vs. WiMax in Chicago

According to this CNN article [no longer available], Chicago has shelved its plans to blanket the entire city with WiFi because it would cost too much. They are, however, looking at using WiMax for the same purpose.

WiMax clearly wins on range. I haven't run the numbers, but I'm curious as to how it performs when you have thousands (or more?) users all trying to use the same access point.

Tuesday, August 28, 2007

Jim Gray et al. on disks and scan times

Here are a couple of highlights from Jim Gray and Prashant Shenoy's 1999 paper "Rules of Thumb in Data Engineering", with approximate updates for 2007.

Two key parameters for disk storage are
  • Price: 1994: $42K/TB. Predicted for 2004: $1K/TB. Seagate currently offers a 500GB drive which can be had for $180, or $0.36K/TB. This isn't the bleeding edge. Seagate is announcing a 1TB drive, and I haven't done anything like a thorough search across all manufacturers.
  • Scan time (time required to read every byte on a disk or other medium): Disks have been getting faster, but they've been getting bigger faster than they've been getting faster. In 1999 a typical 70GB drive with a transfer rate of 25MB/s would scan in about 45 minutes. The paper predicts 500GB, 75MB/s and 2 hours for 2004. The Seagate 500GB drive can sustain 72MB/s.
The price trend is just Moore's law. The main lesson, as with most hardware, is don't buy any more than you have to. It'll be cheaper tomorrow.
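The scan-time figures above are just capacity divided by sustained transfer rate. Here is a minimal sketch of that arithmetic using only the numbers quoted in the bullets. The function name and the use of decimal units (1GB = 1000MB) are my assumptions, not anything from the paper.

```python
def scan_time_minutes(capacity_gb: float, transfer_mb_per_s: float) -> float:
    """Minutes needed to read every byte once at the sustained transfer rate."""
    capacity_mb = capacity_gb * 1000  # assumption: decimal units, as drive vendors quote them
    return capacity_mb / transfer_mb_per_s / 60


# (capacity in GB, transfer rate in MB/s), taken from the bullets above
drives = {
    "1999 typical": (70, 25),
    "2004 predicted": (500, 75),
    "2007 Seagate": (500, 72),
}

for label, (capacity_gb, rate) in drives.items():
    print(f"{label}: {scan_time_minutes(capacity_gb, rate):.0f} minutes")
```

Capacity went up about sevenfold between the first and last rows while transfer rate went up about threefold, which is the whole story in two ratios.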

Increasing scan time has more subtle but crucial effects. We're used to thinking of disks as random-access devices (at least in comparison to, say, tapes). That's why we use them for virtual memory. But they're actually becoming more like tapes and less like RAM. Random access on a disk takes seek time and rotation time. Sequential access just takes transfer time. Seek time and rotation time are becoming more and more expensive relative to transfer.

This has a whole host of implications. Some that Gray and Shenoy mention:
  • Mirroring makes more sense for RAID performance than parity. With mirrors you can spread read accesses out across multiple copies, clawing back some of the lost random access performance.
  • Mirroring also makes more sense for backup. Gray and Shenoy look at tape backup and conclude that tape storage will soon (i.e., now) be purely archival. It just takes too long to scan through all the data on tape. They don't look at CD/DVD, but 500GB of disk is about 60 dual layer DVDs (neglecting compression). Better just to keep multiple copies online.
  • Log-structured file systems will make more and more sense for general use (and were already prevalent in high-performance database systems in 1999). This dovetails with the "change by adding" viewpoint of wikis, version control systems and such.
These effects are more visible behind the scenes than on the web at large. When we factor in CPU and network performance, the results are more directly visible. I'll get to that ...

Jim Gray and real computing

I used to have a bookmark of Jim Gray's piece "Rules of Thumb in Data Engineering". I have no idea which old profile the bookmark is hidden under, but googling "'sequential access' 'disk' 'rules of thumb'" brought it up as the first hit. Another win for Google over bookmarks (though I was a bit lucky to remember the phrase "rules of thumb").

The piece examines the changing ratios among CPU, disk capacity, network speed and other basic parameters and reaches some interesting conclusions about databases and caching. It also makes the larger point that with Moore's law and its cousins in effect, the basic assumptions behind our engineering trade-offs change faster than we think. The article itself was written in 1999. The trends it outlines look to be holding up well and the main point even more so.

As you may know, Jim Gray was lost at sea last January. Gray's body of work is classic engineering -- developing a firm, objective grasp of the basic facts in order to put all this wonderful machinery to good use. Even a quick look at his home page shows what a loss this is for our profession, to say nothing of those close to him.

Monday, August 27, 2007

One thing I actually knew about Wikipedia

I see Wikipedia has snuck a little "Ten things you didn't know about Wikipedia" link onto their pages (now that's not very NPOV, is it? [Ed. Note: It's since been softened to "Ten things you may not know ..."]). The sleeper in the list is #4: "You cannot actually change anything in Wikipedia... (you can only add to it)".

This feature of Wikis, dating back to the original WikiWikiWeb, is what gives them their agility. Since there's a complete history, there's an "undo" button on everything. This lets editors "be bold", makes it easy to revert vandalism, provides a fascinating view of the process behind a wiki apart from the results, and doubtless provides any number of other benefits I haven't thought of.

I've long been convinced that this sort of "change by adding" model is the norm in the non-virtual world, or at least closer to the norm than the "instant, complete overwrite" model that the CPU sees. Some random examples:
  • If you return an item, the charge doesn't disappear from your account. Instead, a matching credit appears.
  • Other recordkeeping -- school transcripts, employment records, hospital charts, etc. -- tend to follow a similar pattern. Your state changes and the record maintains a history, adding new notations and documents as things happen.
  • A print newspaper, by necessity, can't unprint an article. Instead, it issues a correction or retraction.
  • Similarly, reference books accrue addenda and errata. When these are finally published in a new edition, the previous edition still exists.
These same patterns appear on the web, as well, but there is also the option of keeping only the current state. An online news source may issue a correction but will often quietly update the referenced article as it does so. For that matter, I'm free to edit any post on this blog after the fact (and I occasionally will).
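To make the "change by adding" idea concrete, here is a small sketch of the returned-item example from the list above. The Ledger class and its method names are made up for illustration. The only point is that nothing is ever overwritten, so any earlier state can be replayed from the history.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """A hypothetical account whose history only ever grows."""
    entries: list = field(default_factory=list)

    def charge(self, item: str, cents: int) -> None:
        self.entries.append(("charge", item, cents))

    def refund(self, item: str, cents: int) -> None:
        # A return does not erase the charge; it adds a matching credit.
        self.entries.append(("credit", item, -cents))

    def balance(self) -> int:
        return sum(cents for _, _, cents in self.entries)

    def balance_as_of(self, n: int) -> int:
        """Balance after the first n entries, i.e. a view of an earlier state."""
        return sum(cents for _, _, cents in self.entries[:n])


account = Ledger()
account.charge("toaster", 4000)
account.refund("toaster", 4000)
print(account.balance(), len(account.entries), account.balance_as_of(1))
```

The balance is back to zero, but both entries are still there, and the state before the return is still recoverable.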

Web protocols handle either scenario. It's up to the author to decide whether a link means something ephemeral or something permanent. Thus the distinction between permalink, which implies that the contents will not change, and plain link, which carries no particular implication.

There is a full spectrum of mutability available, from a true permalink from a source like Wikipedia that promises no changes absent a full-blown catastrophe, to something completely ephemeral like the wind speed at a given weather station. There is also a lot of interesting territory in between that seems worth exploring.

Why do we still have spam?

I'm asking literally. I'm pretty confident some form of spam will always be with us. Snail mail has been around for centuries and we still get junk mail. I'm more wondering why we have the quantity and content of spam we do.

While spam email and physical junk mail are certainly kin to one another, spam has a quality all its own. I haven't taken measurements, but the sheep/goat ratio in my physical mailbox is considerably higher than in my main email inbox, and most of the junk mail is stuff I might plausibly be interested in. At worst, it's at least clearly related to something I signed up for a while ago.

Most spam is just dreck, and there's an awful lot of it. No matter how many chances I get to plow my fee from transferring the fortune of an ex-member of the Nigerian government into the European lottery and use the winnings to buy replica watches and pharmaceuticals, I'm just not going to click on the link and go for it. Sorry, guys.

Why is there such a difference in quality and quantity between the two? The obvious answer is economics. It costs something to send me an envelope. It costs nearly nothing to send me an email. I'm guessing the main cost is the inconvenience of occasionally getting busted and shut down, and even that is obviously not a great cost except possibly in the most egregious cases. So that rules out one possible solution. No one wants to pay to send legitimate email, so sending spam is always going to be cheap, too.

That more or less leaves technical fixes, and it's fascinating to watch the Darwinian arms race play out. Dumb rules don't work at all. That's why we see all the creative misspellings. It's interesting, too, that the "this doesn't have enough actual text" rule bit the dust pretty quickly. Embedded images still seem to be popular, even though > 90% of the time they're garbage. It's that other < 10% that's the problem. Bayesian smart rules seem to work a bit better, but the spammers have been getting smarter. For a while T-bird's filters would get a good majority of spam, but that no longer seems to be the case, even with constant training.

One of these days I'll look up the mechanics behind that filter. As I understand it, it's based on words or phrases, but the same approach ought to work for other kinds of feature extraction. Instead of a hard "not enough text means spam" or "too many misspellings" dumb rule, image/text ratio and misspellings could be features that, together with, say, "replica" and "watch" in the subject line, might be likely to trigger the filter. I also wonder if it's worth trying to make word matching fuzzy, so that "replica" and "rep lica" and "repl1ca" would all count the same. I'm guessing those approaches have been tried by now (or don't work as well as one might expect) and what we have is probably about as good as it's going to get.
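As a rough sketch of the fuzzy matching idea (not how T-bird's filter actually works, which I haven't looked up), one could squash text before matching: lowercase it, undo a few common digit-for-letter swaps and drop everything that isn't a letter. The substitution table and function names here are my own assumptions.

```python
import re

# Assumed table of common look-alike substitutions; a real one would be longer.
LOOKALIKES = str.maketrans(
    {"1": "i", "0": "o", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def squash(text: str) -> str:
    """Lowercase, undo look-alike swaps, then drop every non-letter."""
    return re.sub(r"[^a-z]", "", text.lower().translate(LOOKALIKES))


def mentions(text: str, word: str) -> bool:
    """True if the squashed word appears anywhere in the squashed text."""
    return squash(word) in squash(text)


for subject in ("replica watches", "rep lica watches", "repl1ca w@tches"):
    print(subject, "->", mentions(subject, "replica"), mentions(subject, "watch"))
```

The catch is that squashing throws away word boundaries, so innocent text can match by accident. That may be one reason this sort of thing doesn't work as well as one might expect.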

To my knowledge, I've never used a central-database-driven approach, where everyone mails any spam that reaches them and the mail client checks incoming mail to see if someone else has already flagged it. I assume, though, that this is why you see random blocks of text at the bottom of spam messages and little noisy dots in images. Everyone's copy of the spam message is unique, presumably preventing a match. This sort of random noise is actually not unlike the "type the word you see" bot blocker. In both cases our visual system cares less about the noise than a computer does and it's a hard problem to make the computer do better.
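Here is a sketch of why that noise works, assuming the central database matches on an exact fingerprint of the message. That assumption, the function names and the sample messages are all mine. Real services may well use fuzzier matching.

```python
import hashlib

flagged = set()  # stand-in for the shared database of reported spam


def fingerprint(message: str) -> str:
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def report_spam(message: str) -> None:
    flagged.add(fingerprint(message))


def already_flagged(message: str) -> bool:
    return fingerprint(message) in flagged


pitch = "Buy replica watches now!"
my_copy = pitch + "\n\nxqzv plomb aardvark"
your_copy = pitch + "\n\nkettle mauve quixotic"

report_spam(my_copy)
print(already_flagged(my_copy))    # the identical copy matches
print(already_flagged(your_copy))  # same pitch, different noise, no match
```

A person sees the same message twice. The hash sees two unrelated strings.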

My most effective spam blocker so far is the simple whitelist, and I'm irked that it took me as long as it did to try it. Quite a while ago I'd noticed that once I'd set up filters shunting each category of real mail into its own folder, the residue in my inbox was nearly pure spam. Of course, I could never remember which folder particular pieces of real mail had gone to, but that's for another post.

Unfortunately, "nearly pure" meant it was a hassle to find the now-rare real mail that didn't happen to match one of my real mail filters. A year or so ago I finally took the next logical step and added a saved search called "not in address book" to my inbox in T-bird. Voilà! The contents of that folder were very pure spam. I could then spend a few seconds every so often scanning them to make sure of their purity, gun the whole bunch and anything left in the inbox was good. Still a bit of a hassle, but it at least brought the spam-scanning down to around the time spent actually reading and understanding real subject lines.

Further, if anything legit made it into "not in address book", it was generally because I'd just asked for it and I knew to look for it. Finally, the "not in address book" folder stayed empty, because it was constantly cleared out, so there was less of a chance of losing an email from a long-lost acquaintance. (I use past tense above because I haven't brought the saved search over to my new setup since I wanted to see how the Bayesian filters were doing these days. Suffice to say the saved search will be back soon. [Ed. Note: And so it was. And it still worked just fine])
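The "not in address book" search amounts to a one-line test on the sender. A minimal sketch follows, with made-up addresses and a made-up message format (a dict with a "from" field). T-bird's saved searches are configured in the UI, not written like this.

```python
from email.utils import parseaddr


def split_inbox(messages, address_book):
    """Partition messages into (known senders, not in address book)."""
    known = {address.lower() for address in address_book}
    keep, suspect = [], []
    for message in messages:
        _, sender = parseaddr(message["from"])  # strips any display name
        (keep if sender.lower() in known else suspect).append(message)
    return keep, suspect


address_book = ["alice@example.com"]
inbox = [
    {"from": "Alice <Alice@example.com>", "subject": "lunch?"},
    {"from": "deals@spam.example", "subject": "replica watches"},
]
keep, suspect = split_inbox(inbox, address_book)
print([m["subject"] for m in keep], [m["subject"] for m in suspect])
```

Everything in the second list still gets a quick skim before deletion, which is the few seconds of scanning described above.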

This gives me some hope that we may yet get spam mostly under control. When I look at my solicited email, there's still a fair bit I don't really want to read, but it's much like my snail mailbox. The stuff I don't want is at least plausible and it's easily ignored.

Further, it's possible to tighten up the whitelist approach securely, using digital signatures. Perhaps some day enough people will be using signatures that I can change my "not in address book" search to "unsigned". Ideally, email-driven "opt in" schemes would have an extra step whereby the vendor (say, my bank) gives me a key that it will use to sign the "click on this link to register" message it's about to send. My long-lost acquaintance would go through my social networking site to do the initial handshake, and conversely accepting an invitation would involve an exchange of keys.
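A rough sketch of the opt-in handshake follows. Python's standard library has no public-key signatures, so this uses an HMAC with a shared secret as a stand-in. That is a loud assumption: a real scheme would use proper digital signatures so that only the vendor could sign. The key, message and function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, body: bytes) -> str:
    """Tag a message with a key (HMAC stand-in for a real signature)."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def is_authentic(key: bytes, body: bytes, tag: str) -> bool:
    return hmac.compare_digest(sign(key, body), tag)


# Step 1: while I'm on the bank's site, it hands me a key for the next message.
key_from_bank = secrets.token_bytes(32)

# Step 2: the bank sends the registration email, tagged with that key.
body = b"Click on this link to register"
tag = sign(key_from_bank, body)

# Step 3: my mail client lets through only what verifies against a known key.
print(is_authentic(key_from_bank, body, tag))
print(is_authentic(key_from_bank, b"Click on this OTHER link", tag))
```

With something like this in place, the "unsigned" search would simply collect whatever fails the check.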

If you buy that, and it's certainly the kind of scenario I've heard mentioned, then my original question reduces to "Why doesn't everyone use digital signatures?" An interesting question, that one.

Saturday, August 25, 2007

Not so much to do with the web ...

... but this certainly seems like an interesting use of VR technology.

Friday, August 24, 2007

What happened to my bookmarks?

[If you came here trying to recover lost bookmarks for Firefox, has a knowledge base article on the topic. For Chrome, try this Google search. For IE, try this Google search. For Safari, try this one. For Opera, try this one. The Opera search also turned up this PC Today article for Firefox, IE and Opera. For Ma.gnolia (FaceBook and maybe others?), try this. In any case, please feel free to have a look around since you're here.]

Oh they're still there, but the thing is I care less and less. Just how did that happen?

Back when Firefox was Netscape, I spent a fair bit of time grooming my bookmarks list -- checking for broken links, sorting them, organizing them into folders, making sure they followed me from machine to machine. Now not so much. I'm only starting to use, so I'll have to report later on what effect that might have, but it seems more slanted towards finding cool new things as opposed to old lost things.

What changed? A couple of things.

First, browsers got smarter. Firefox (and others) will remember where you've been recently. To get to my banking site, I have to type "b-a-" into the bar at the top and then hit down arrow a couple of times. This is at least as easy as browsing through my bookmarks, even if I put my banking site at the top (at the expense of whatever else), probably because it works regardless of whether I made a particular effort to remember the site or not.
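I don't know how Firefox actually ranks its suggestions, but the effect can be sketched as a prefix match over the visit history, most-visited first. The host names below are placeholders.

```python
from collections import Counter


def suggest(history, typed, limit=5):
    """Hosts from the history that start with what was typed, most visited first."""
    visits = Counter(host for host in history if host.startswith(typed))
    return [host for host, _ in visits.most_common(limit)]


history = [
    "bank.example", "bakery.example", "news.example",
    "bank.example", "bank.example", "bakery.example",
]
print(suggest(history, "ba"))
```

No grooming required: the list builds itself from where I've actually been.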

For news sources, I have RSS/Atom/whatever it is these days. Other interesting sites install themselves in the tool bar and look more like applications than web sites.

That pretty much takes care of the "remember frequently-visited sites" function. If it's frequently-visited, it's in the browser's memory pretty much by definition. If it's particularly well-used, it's probably hooked into the browser one way or the other.

Which leads me to the other bookmark-killer: Google. Early on, I sort of remember thinking that the useful web was mainly a smallish set of known sites and it was up to me to remember what to find where. In such a world it makes sense to use the otherwise memory-impaired early browser's bookmark feature to collect the main portals to the known world. Early search engines also had a higher chaff/wheat ratio than modern ones, discouraging their use somewhat.

These days I accept that I have only a dim idea of what's out there. If I want to find out about something I do what everyone does: put together a couple of search terms likely to nail it down and set Google at it (or Wikipedia, depending).

Google was probably what finally really convinced me that "dumb is smarter" could work in a big way. There's still value in hand-selected indexes and summaries, which is really what a bookmark list is, and the whole Web 2.0-style collaborative tagging thing definitely has value, but a comprehensive, frequently updated search engine will win on coverage and agility every time. That sets a reasonably high bar for anything else.

Like browser history, searching works without any explicit help. I could try to remember whether the Murky News is (nope) or (the actual domain) or (redirects). I could bookmark it and find the bookmark. Or I could just Google "san jose mercury" and get it.

Searches also don't go stale. Bookmarks tend to rot over time as things get reshuffled and relocated, even though the document itself is still out there somewhere. I'm not yet sure how adaptive tags can be. Having one's site prominently tagged will tend to discourage one from moving it.

If bookmarks are not that useful in remembering frequently-visited sites or as a starting point for research, what are they good for? For my money there is one core function they still perform well, namely remembering particularly memorable pages, things you're glad you found but probably won't be revisiting on a daily basis.

I do use bookmarks (and now for that, though I don't find myself referring to them much. That's probably because I just don't find myself wanting to replay the greatest hits very much. There's too much interesting new stuff on the web. Old bookmarks are more for rainy days, and it hasn't been raining much lately.

Thursday, August 23, 2007

E-Tickets and copy protection

(Back in the day, back before my day, "e-ticket" meant the best rides at Disneyland -- or maybe the Pasadena Freeway. I forget.)

Two questions come to mind about buying tickets to events online:
  • Do they have to call the because-we-can and the because-the-venue-can fees "convenience charges"? Just exactly whose convenience are we talking about here?
  • Copy protection always fails. Why doesn't that matter here?
I can't answer the first one, but the second one is easy. You can make as many copies of an electronic ticket as you want. Knock yourself out. All you're really doing is copying a number. But you can only use that number once.

It says so right on the ticket, something like "This ticket may only be scanned once." If someone gets hold of one of the copies you've made and they get to the gate first, you're out of luck. Sorry, that ticket, meaning the magic number and not the piece of paper it's on, has already been used.

That's interesting, actually. You can make as many copies as you want, but you don't want to make any more than you have to. Enforcement is by incentive, not prohibition. This is a bit different from airline tickets, which are tied to an individual so that the would-be ticket thief will also need to forge your ID. I remember being a bit surprised that I didn't have to show ID to use an e-ticket to a show.
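The whole scheme fits in a few lines. This is a sketch of the idea, not any ticketing vendor's actual system. The Gate class and the ticket numbers are invented.

```python
class Gate:
    """Admits each issued ticket number exactly once."""

    def __init__(self, issued):
        self.unused = set(issued)

    def admit(self, number: str) -> bool:
        if number in self.unused:
            self.unused.remove(number)  # the scan is the one-time physical event
            return True
        return False  # never issued, or someone got here first


gate = Gate(issued={"8675309", "2468101"})
print(gate.admit("8675309"))  # first copy through the gate gets in
print(gate.admit("8675309"))  # every other copy is now just a number
```

Nothing here stops copying. It just makes every copy after the first one worthless.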

The underlying principle is that copy protection exists in the physical world. Things can only be in one place at a time and a particular event only happens once. To limit copying in the virtual world, you have to tie your virtual object to something in the physical world, either an object or an event.

Copy protection based on objects has a long history. It worked fine when copying was physically difficult. Printing a book ties the virtual object (the contents of the book) to physical ink on a physical page. For centuries this was something most people couldn't easily do. Even with a photocopier, it's not something most people could do very cheaply or easily.

Since this approach worked so well it's no surprise that a lot of early copy protection schemes tried to emulate it. I remember writing code to talk to a dongle that hung off the printer port of a PC every so often to make sure the thing was still there and shut down the application if it wasn't.

The dongle itself was (according to its manufacturer) based on strong encryption, so you weren't going to be able to make a working copy of the dongle for your friend without factoring some infeasible-to-factor numbers. But I could never figure out why you'd have to.

The tie between the virtual object (the app) and the physical one (the dongle) was inherently weak. It couldn't be too hard for someone to figure out what part of the code talked to the dongle and replace it with something that only pretended to.
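A toy illustration of the weak link, with invented names. The function dongle_present stands in for the code that talks to the hardware. However strong the dongle's encryption, the application only ever sees a yes or a no.

```python
def dongle_present() -> bool:
    """Stand-in for the real hardware check; no dongle is attached here."""
    return False


def run_app(check=dongle_present) -> str:
    # The entire tie between app and dongle is this one decision.
    if not check():
        return "shut down: no dongle"
    return "running"


print(run_app())                    # honest check
print(run_app(check=lambda: True))  # a replacement that only pretends
```

In a shipped binary the swap takes more effort than passing a lambda, but the shape of the problem is the same.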

I remember spending quite a bit of time trying to design dodges like putting the dongle-handling code in some encrypted dynamic module, but it never took long to figure out a way around that, too. I'm pretty sure I later ran across papers in the literature saying the same thing more rigorously, and evidently the market came to the same conclusion. You don't see dongles anymore.

The same basic story has played out repeatedly. CDs worked fine until everyone had a CD burner (so copying the CD was easy) or an MP3 player (cutting the virtual/physical tie so you didn't need a CD player to hear a song). DVDs are more or less in the same boat now (and I'm not even talking about CSS).

The iPod effectively tried to tie iTunes songs to the player, but Apple's heart was never really in it. If nothing else you could burn songs to CD and re-rip them for your favorite player (so much for CDs as a copy-protection mechanism!). Certain recent operating systems appear to try to tie playback to physical artifacts like MAC addresses, but at this point I'm thinking I've seen this movie before and I know how it ends.

On the other hand, tying virtual objects to events seems to fare rather better. E-tickets work fine and rake in tons of money in because-we-can fees. Smartcard-based authentication systems are a variation of the same theme. A particular magic number will only flash on the display once and the server knows when that will happen.
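For the curious, here is a sketch of how such a magic number can be derived. It follows the HOTP construction from RFC 4226 (an HMAC over a counter, truncated to a few digits), with the counter standing in for the time step. Whether any particular smartcard works this way is an assumption on my part, and the Server class is invented for illustration.

```python
import hashlib
import hmac
import struct


def one_time_code(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP-style code: HMAC-SHA1 over the counter, dynamically truncated."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


class Server:
    """Knows the card's secret and refuses any time step at or before the last one used."""

    def __init__(self, secret: bytes):
        self.secret = secret
        self.last_step = -1

    def accept(self, code: str, step: int) -> bool:
        if step <= self.last_step:
            return False  # that number has already had its moment
        if not hmac.compare_digest(code, one_time_code(self.secret, step)):
            return False
        self.last_step = step
        return True


secret = b"12345678901234567890"
server = Server(secret)
code = one_time_code(secret, 7)
print(server.accept(code, 7), server.accept(code, 7))
```

As with the e-ticket, copying the number is easy. It's the one-time event that does the protecting.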

I'm not sure how broadly this applies, though. In both the cases above the virtual object is the key to something physical of interest. It's not the object of interest itself. If it's the (virtual) content that's of interest, it's not clear that the tie to the physical world can ever be ironclad.

I was about to cite live broadcasting as an example, but this really depends on control of the broadcast mechanism. There's no technical reason I couldn't take the picture on my TV screen and stream it to all my friends and have a big virtual pay-per-view party. I personally don't have the bandwidth for such things, not to mention it being illegal, but the bandwidth will be there sooner or later and illegality won't deter everyone.

Other models are on even shakier technical ground. Producing an advertising-free public copy of a particular news source or private database is not a problem technically. It's an interesting question why it doesn't happen more.

Is it because The Man can come after someone who put up such a site (and why put it up unless lots of people will find out about it)? Is it because people don't like breaking the law? Is it because most people intuitively understand that if writers can't make money there won't be any content to steal? Or is it maybe because people just don't mind ads that much and it's not worth trying to pirate the content?

My guess is that it's a combination of all of those factors. It has to be something. Technology is not going to protect content. For most mass-market applications "strong" copy protection, where the virtual/physical tie is inherently strong and does not depend on control of some particular mechanism, seems doomed from the beginning.

That leaves a web (if you will) of legal and social constructs. Same as makes the rest of the world go round.