Tuesday, March 31, 2009

Not very much about Conficker.c

Having caught the 60 Minutes episode in which LeBron James sinks a one-handed underhand shot from the opposite free-throw line -- in one take, no less -- I couldn't help also noticing a piece about the Conficker worm. Well, actually it was an advertisement for Symantec in which a spokesman showed how malware in general could do all kinds of scary things, just like it always has, but you could use Symantec to protect yourself. No mention of, say, Kaspersky or McAfee.

OK, so do the Windows boxes have this thing or not? There are several ways to find out. US-CERT, for example, recommends checking for connectivity to several sites. Microsoft has its own page.
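One known Conficker behavior was blocking DNS lookups of security vendors' domains on infected hosts, which is roughly what the connectivity checks exploit. Here's a rough sketch of that heuristic -- the domain list is illustrative, not any official detector's, and a lookup can fail for plenty of innocent reasons:

```python
# Heuristic sketch only: Conficker variants were known to block DNS
# resolution of security vendors' sites on infected machines. If ordinary
# lookups work but the security sites all fail, that's a red flag.
import socket

SECURITY_SITES = ["www.symantec.com", "www.kaspersky.com", "www.mcafee.com"]
CONTROL_SITE = "www.example.com"  # a domain the worm had no reason to block

def resolves(host):
    """Return True if the name resolves at all."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False

def looks_suspicious():
    # No general connectivity means the test is inconclusive, not positive.
    if not resolves(CONTROL_SITE):
        return False
    blocked = [h for h in SECURITY_SITES if not resolves(h)]
    return len(blocked) == len(SECURITY_SITES)

# Usage: run looks_suspicious() and treat True as "go read the real
# removal instructions", not as a diagnosis.
```

As the post says, this is someone's best guess at the worm's behavior; a worm author who reads the same advisories can stop blocking those domains tomorrow.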

Now, useful as these tips are, any of them just represents someone's best guess. Granted, it's a bunch of smart and experienced someones, but still, there's always the chance that the worm's authors have found some way to route around this. Even if not, there are many, many infected systems out there and no one really knows what if anything they'll do when the worm kicks into high gear, oh, right about now. As I've said, one of these days something out there is going to do serious damage. This might be it, or it might be another damp squib.

The thing that struck me, though, was that none of the sites I've seen mentioned -- whether for checking infection or for downloading a scanner -- start with https://. So here's hoping that no one is monkeying around with DNS while all this is going on.

Grumpily yours ...

Thursday, March 26, 2009

Demos vs. applications

Here's one of the slicker demos I've seen in a while, courtesy of Ben Fry. What does it do? Try it, and don't forget the "zoom" option. I'll wait.

Slick, huh? Note how the overall map shows the US population distribution without explicitly showing features like cities or streets*, and invites browsing and personal readings. Edward Tufte no doubt loves it. Note how, as you type, you see not only a bit of how the US Postal Service thinks, but also see another indication of population density. The '6' and '8' regions, in particular, are huge, larger than most of the world's countries, while the '0' and '1' regions, for example, are much smaller.

But ... how often do I want to get the location for a ZIP code, anyway? The other way around, sure. If I do, do I really need to see the area narrowing as I type? If I do need to see it, wouldn't I probably want a bit more context, for example state/county/city names instead of just dots and numbers? People apparently sell three-digit ZIP code maps for sales planning, the first three digits being important because they correspond to the postal service's sorting centers, but such a map would be worthless without the other information you generally get from a map.

Again, it's a great demonstration of how to display a moderately large** data set. It's fascinating to browse, particularly because it's interactive rather than static. But am I likely to use it every day? Probably not. Is it likely to be embedded in some application that actually needs to decode ZIP codes? Again, probably not. You don't need to spend all that screen space on a fascinating map when a simple form will do.

One obvious question: What sort of task would a display like this be useful for? Dunno.

(*) Fry does a similar exercise with streets, though I'm not sure either I or Google Maps believe the suspiciously sharp-edged empty patches in the Midwest [Since fixed -- D.H. May 2015]. I'm also reminded of my previous comment on nematodes, of all things.

(**) I say "moderately large" because by definition there can't be more than 100,000 five-digit ZIP codes and in practice there are more like 43,000. Compare this to, say, 26 million segments of road in the US or 3 billion base pairs in the human genome. It's large enough to be visually interesting when plotted all together, but small enough that the applet can carry the whole data set with it rather than querying interactively, AJAX-style.  [From a Google point of view, this would be somewhere between "small" and "tiny", of course --D.H. May 2015]
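The "small enough to carry the whole data set" claim is easy to check with a back-of-envelope calculation. The record layout here is an assumption for illustration (a five-digit code plus a lat/lon pair), but the conclusion is robust to the details:

```python
# Why ~43,000 ZIP codes can ship with the applet instead of being
# fetched interactively. Record layout is an illustrative assumption:
# 5 ASCII digits plus two 32-bit floats for latitude/longitude.
ZIP_COUNT = 43_000
BYTES_PER_RECORD = 5 + 2 * 4  # 13 bytes

total_bytes = ZIP_COUNT * BYTES_PER_RECORD
print(total_bytes / 1024)  # roughly half a megabyte -- a modest download

# The same layout for 26 million road segments would be ~338 MB,
# which is firmly in "query a server" territory.
road_bytes = 26_000_000 * BYTES_PER_RECORD
```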

Wednesday, March 25, 2009

Amazon on Roku

A while ago, the Roku set-top box quietly upgraded itself, mostly re-arranging menus slightly, but also promising nifty new things to come. A few weeks ago, the other shoe dropped. Amazon is now offering a wide range of movies and TV shows -- much wider, or at least much more recent and popular, than Netflix's "watch instantly" service. As a general rule you can either rent a show for a 24-hour period or "buy" unlimited access (more or less) for about three times the rental cost. Rental costs are comparable to cable pay-per-view rental costs, but the selection is, again, much wider.

It's certainly an interesting development and probably a significant part of the Way of the Future, but there are still some kinks to work out. For example, it's best to check whether the program you can rent from Amazon is also available for free (meaning you've already paid for it) via your regular cable or satellite service or (I recall seeing at least one case) via Netflix on the same box.

Maybe someone could write a plug-in to sort all this out. Except I doubt that a security-conscious, DRM-friendly box like Roku's will be very friendly to plug-ins.

It will also be interesting to see how (or whether) Netflix responds to this. I'd mentioned a possible "premium instant" service before. Thinking it through, another option might be just to emulate the current DVD queue. For $X a month you get access to any N DVDs at one time.

Except, hmm ... that works for physical DVDs because it takes time to send in the old ones and get the new ones in the mail. With the box, you could just "send in" the movie you just finished and pull the next one off the queue. Lather, rinse, repeat and you have access to the whole catalog for the price of one. Maybe it would only let you "return" titles once a day, or whatever?
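The "return once a day, or whatever" idea is essentially a rate limit on the queue. A minimal sketch of that hypothetical model -- none of the names or the one-return-per-cooldown rule reflect anything Netflix actually built:

```python
# Sketch of the hypothetical "N at a time, rate-limited returns" model.
# The cooldown is what keeps lather-rinse-repeat from turning
# N-at-a-time access into the whole catalog for the price of one.
from datetime import datetime, timedelta

class StreamingQueue:
    def __init__(self, max_checked_out=3, return_cooldown=timedelta(days=1)):
        self.max_checked_out = max_checked_out
        self.return_cooldown = return_cooldown
        self.checked_out = set()
        self.last_return = None  # time of the most recent return
        self.queue = []

    def add(self, title):
        self.queue.append(title)

    def check_out_next(self):
        """Pull the next queued title, if there's a free slot."""
        if len(self.checked_out) >= self.max_checked_out or not self.queue:
            return None
        title = self.queue.pop(0)
        self.checked_out.add(title)
        return title

    def return_title(self, title, now):
        """Return a title, but no sooner than one cooldown after the last return."""
        if self.last_return and now - self.last_return < self.return_cooldown:
            return False
        self.checked_out.discard(title)
        self.last_return = now
        return True
```

With physical DVDs the postal service provides the cooldown for free; a streaming box has to impose one artificially, which is exactly the awkwardness the paragraph above is pointing at.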

[Remarkably little has changed since then, except that on-demand from the cable company is much more likely to provide pay-per-view.  If you've "cut the cord", YMMV.  In particular, Netflix's subscription model hasn't changed.  Instead, they've focused on producing their own content, much as HBO did when it was at a similar stage. --D.H. 2015]

Saturday, March 21, 2009

March Madness, baby!

Every spring here in the US, the NCAA hosts its college basketball tournaments, and every year some many-digit number of dollars of productivity is lost as employees (and bosses) try to keep track of how their brackets are doing. The effect is particularly pronounced in the first round, a glorious orgy of 32 games, roughly in batches of four, spanning a Thursday and Friday in late March. The games go from about noon to midnight on the East Coast, and thus pretty much through the workday on the Pacific side.

What does this have to do with the web? Every year there's a bit more Madness online. This year, CBS is streaming every game for "free" on ncaa.com, using Silverlight. The quality is good (not that anyone at my place of work would have checked, of course), but "free" means you get a commercial at the beginning, an obnoxious animated ad off to the side, and the usual further commercials during what used to be the "TV time outs" and are now the "media time outs."

There's also the "boss button" that throws up an imitation spreadsheet. This must be just for laughs. If it actually fools your boss, you obviously need hoops to liven up what must otherwise be a stunningly dull workday.

All of this has been in place for a while, but each year's version is a little slicker and a little smoother as the media players get better and the net becomes better able to handle large loads of streaming video.

Can't watch the video? Well, you could always stream audio, or just check an automatically updating scoreboard. Playing the office pool? You can do that online, too. The site will do the busywork of keeping score. Some will even show updated scores of games in progress. Nice.

Frivolity? Sure (unless the Jayhawks are playing). But it's also a nice test case and driver for the technology. You have everything from streaming media down to nice UIs for showing scores and tracking brackets, all aimed at a mass market that, collectively, is very picky about usability. If your site is clunky, millions of sports fans who should rightfully have been your customers will find one that isn't.

Just the sort of testbed for improving the infrastructure to pave the way for ... um, well, March 2010 comes to mind ...

Monday, March 16, 2009

Curses, foiled again!

Seen on an automated ticket site, right next to the captchas:
You do not have permission to access this website if you are using an automated program.
Oh no! Now I'll have to modify my automated ticket-poaching bot to scrape the page I'm hitting and look for text that says I can't use the site. And it gets worse. It's not safe to just look for those particular words. They might decide to change the message to something really scary, like "TOP SEEKRIT NO BOTS ALLOWED WE MEAN IT!" No, I'll need to put together something that can parse English text and figure out whether or not I'm permitted to poach tickets there. That's even harder than cracking captchas.

Damn you, evil ticket guardians!

Sunday, March 15, 2009

The pillow fight that got out of hand

Flash mobs -- groups of people who gather in public at random times, commit random acts and disappear back into the woodwork -- have been around for a while now. They're particularly popular in San Francisco because, well, it's San Francisco. Like many things in San Francisco they've been tolerated by the authorities because, well, it's San Francisco.

Unfortunately, the authorities are now having second thoughts, after a pillow fight that carried on from 6pm to around midnight left the city with a bill for $20,000. Since one of the key features of a flash mob is that any random anonymous person can get them started, and many do, there's not really anyone to send the bill to. And times is tight.

Oh, those irresponsible net.hooligans, trashing the place with no thought for the citizenry at large, right? Well, not really. The text messages that got the thing started said "Rules: Tell Everyone you know. ... Arrive with pillow hidden in bag. ... Practice responsible fun and help clean up. ..." and it's not like people suddenly went nuts and started breaking windows.

What actually happened was that pillows started breaking, spreading feathers on the wind. San Francisco weather being what it is, the flying feathers soon became wet, soggy, hard-to-clean-up feathers. I'm pretty sure people anticipated the first part, and to some extent the second.

What they didn't realize was that cleaning up involved, among other things, draining a local fountain (which had just been filled) and checking the pumps and plumbing for clogs. Raking scads of feathers out of the grass was no fun either. Pretty soon you've got a $20,000 bill. This does not include $10,000 worth of damage to a nearby restaurant flooded when a second fountain overflowed.

The city would prefer that anyone planning a flash mob get permission, pay the appropriate rental fees, etc., etc. For obvious reasons, this is not going to happen. I don't get the impression that many people really want to crack down on flash mobs, but it's easy to understand why the city might see this as the most feasible option.

My personal opinion: Stuff like this is going to happen, particularly in a city like San Francisco (or New York, or London, or ...). To some extent it should be figured into the budget and the insurance bills of local businesses. On the other hand, the Right Thing for the flash mob participants to do would be to take up an anonymous collection and send the proceeds to the appropriate city departments (at least the Public Works and Recreation and Park departments were involved) and to the damaged restaurant to the extent it's actually liable. There were an estimated 1500 - 3000 participants. A donation of $10-$20 per head for hours of pillow-busting anarchic fun doesn't seem out of line.

[As the article I originally linked to implies ("This year's Valentine's Day pillow fight") the pillow fight itself was already an annual event in 2009.  As of last year it was still going strong.  This all seems like the antithesis of a "flash mob", or at least a fairly loose interpretation.

In any case, participants are still encouraged to clean up after themselves --D.H. Jan 2016]

Thursday, March 12, 2009

The game is changing

When I'm beating my "not-so-disruptive technology" drum, I'm not trying to say that technological change has no effect. Rather, technological change is more gradual than we might sometimes think. Over time, it can have significant effects. The rule of thumb I've heard is that predictions overestimate the short term and underestimate the long term.

For example, over my lifetime Moore's law has tracked orders of magnitude worth of steady improvement in hardware. This progress has been accompanied by a steady stream of discoveries and algorithmic improvements on the software side (and a counterbalancing accumulation of layers between the code and the hardware, but that's a separate story).

In the early days of AI, there was much breathless talk (not necessarily by those actually doing the research) about the "electronic brain" being able to match and surpass the human brain. Only later did it become clear that this vastly underestimated the sheer computational power of a human brain -- or a gerbil brain, for that matter.

A classic AI program, say SHRDLU, ran on a PDP-6 and processed small chunks of text to maintain a set of assertions about a toy "block world". It was definitely a neat hack, and it looked pretty impressive since, as everyone knew, language processing is one of the highest levels of thought and therefore one of the most difficult [Re-reading, that's not quite right. Language processing really is hard in its full glory. However SHRDLU, like everything else at the time, did fairly rudimentary language processing. The "gee-whiz" part was that it could keep track of spatial relations between blocks in its block world. My point was that this is actually much, much simpler than, say, walking. So for "language processing" read "reasoning about spatial relations".].

In fact, "high-level" abstract thought is going to be about the easiest type of thought for a computer to mimic -- the computer itself is the product of exactly that sort of thought, so a certain structural suitability is to be expected. This is probably clearer in hindsight than at the time.

An oversimplified view of what happened next is that we began to understand and appreciate a few key facts:
  • Biological minds are much more complex than a PDP-6. There are hundreds of billions of neurons in the brain, versus (somewhere around) hundreds of thousands of components in a PDP-6, most of which would have been core memory (by which I mean actual magnetic cores).
  • Biological minds are distributed and work in parallel. For example, the eye is not a simple camera. The retina and optic nerve do significant processing before the image even reaches the brain.
  • Biological computation is not based on abstract reasoning. Rather, the other way around. Biological computation has a much more statistical, approximate flavor than traditional symbol-bashing.
and so, again oversimplifying, the AI bubble burst. Everyone now knew that AI was a pipe dream. Best to go back to real work.

"Real work" meant a lot of things, but it included:
  • Figuring out how to handle much larger amounts of data than, say, dozens or hundreds or even thousands of assertions about blocks in a toy world, and how to make use of hardware that (currently) throws around prefixes like giga- and tera-.
  • Figuring out how to build distributed systems that work in parallel.
  • Figuring out how to handle messy, approximate real-world problems.
Hmm ... three bullet points up there ... three bullet points down here ... almost like they were meant to be compared and contrasted ...

The jumping-off point for all this was the observation in the previous post that our internal data network appears to have capacity comparable to fairly fast off-the-shelf digital networks. This rough parity is a fairly recent development.

A gigabit behind your eyes

OK, I thought this was going to be an easy one. At a lecture the other day I heard Edward Tufte (author, among other things, of The Visual Display of Quantitative Information and quite a bit of scathing invective aimed in the general direction of PowerPoint) claim that the human optic nerve has a capacity of around 20 megapixels per second. "And we have two of them!" he continued, pushing one of his major themes: people can easily and naturally process much more information visually than most graphics contain.

Leaving aside the questionable implication that two optic nerves allow us to process much more information than one -- unlikely both because the two eyes generally see almost the same image and because if they don't the result is generally less informative than the normal case -- I was happy to hear some hard numbers, apparently based on a careful, peer-reviewed study, regarding human visual bandwidth.

So all I needed to do was track down Tufte's assertion on the web, follow that to the original study and write it up: Our optic nerves can handle approximately X, so a display purporting to handle more than X may not be that useful (and maybe that's why Blu-Ray doesn't seem to be taking the world by storm). Granted, this is just a crude measure of bandwidth and leaves aside many, many details of human visual perception, but it's still a useful number for sanity checking and ballpark estimates.

Alas, I'm stuck at step 1. I'm only mostly sure the number was 20 and the units were megapixels per second, and I'm assuming that a pixel is more or less three bytes, based on fairly well-known results in color perception. So instead, here are some facts and factoids that turned up:
  • The human eye has about 100 million receptors. This is sometimes quoted as "the eye has 100 megapixels," but trying to compare rods and cones to camera pixels is really apples and oranges.
  • Unlike the uniform grid of digital cameras and video displays, the eye instead has about 100 million light/dark-sensitive rods and 5 million color-sensitive cones. The cones are clustered around the focus point of the lens. Peripheral vision is much less color-sensitive.
  • Most people can't really tell the difference between a 6"x4" photograph printed at 150 dpi and one printed at 300 dpi when both are viewed at normal distance. 6" x 4" x (150 dpi)² is about half a megapixel. Half a megapixel times 20 frames per second is about 10 megapixels per second; that's a bit shy of Tufte's figure, but then a 6"x4" photo at normal distance doesn't completely fill the field of vision -- just the most acute portion.
  • The optic nerve contains about 1.2 million fibers. That's a bit more than one for every hundred receptors, so either some aggregation is done on the retina, or the neurons are able to multiplex information from multiple receptors, or both.
  • 1.2 million fibers times 20 frames per second is close to Tufte's 20 million per second.
All this suggests that, to a rough approximation, we can process still images of about a megapixel and moving images with around 20 megapixels per second of useful information. 20 megapixels per second at three bytes per pixel is 60MB/s or about 500Mb/s, so we have something close to a gigabit network right behind our eyes. This sort of thing is one reason I tend to put "broadband" in quotes.
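The arithmetic behind that "close to a gigabit" claim is worth checking explicitly. The 20 megapixel/s figure and 3 bytes/pixel are the assumptions from the text above, not measured values:

```python
# Back-of-envelope check of the optic-nerve bandwidth estimate.
fibers = 1.2e6           # optic nerve fibers (from the bullet above)
frames_per_sec = 20
pixels_per_sec = fibers * frames_per_sec   # ~24 million samples/s,
                                           # close to Tufte's 20 million

mpix_per_sec = 20e6      # Tufte's figure
bytes_per_pixel = 3      # rough, from color-perception results
bits_per_sec = mpix_per_sec * bytes_per_pixel * 8
print(bits_per_sec / 1e6)  # 480.0 -- about 500 Mb/s, i.e. near-gigabit
```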

If we can only process a megapixel or so, why have a bigger display than that? Good typographic resolution is more like 1200dpi. On an 8 1/2" x 11" page that's over 100 megapixels. Isn't that overkill?

Not really. You don't look at the entire page at once. You scan it, focusing on one piece, then the next. Each of those pieces needs to be sharp. A large, finely-printed page will give you about a hundred high-resolution patches to focus on. Similarly, you can't take in all of an IMAX image at once. Rather, you have a huge image that looks sharp no matter where you look.
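The "over 100 megapixels" and "about a hundred patches" figures both fall out of the same calculation:

```python
# A letter-size page at good typographic resolution, versus the
# ~1 megapixel region we actually focus on at any instant.
page_pixels = 8.5 * 11 * 1200 ** 2    # 8.5" x 11" at 1200 dpi
print(page_pixels / 1e6)              # ~134.6 megapixels -- "over 100"

# Dividing by a 1-megapixel fixation gives the number of distinct
# high-resolution patches the page can offer the eye.
patches = page_pixels / 1e6           # ~135, i.e. "about a hundred"
```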

A sharp display with only a megapixel of resolution would have to cover the entire field of view, and it would have to track eye movements so that which megapixel you got depended on where you were looking. Maybe some sort of really high-tech contact lens?

Saturday, March 7, 2009

News without paper

Two sad items for those in the old-fashioned newspaper business, both from the western US:
  • Denver's Rocky Mountain News is shutting down, for good, "just 55 days shy of its 150th birthday." This leaves the Denver Post as the only mainstream daily in town.
  • The Seattle Post-Intelligencer, already sharing most of its infrastructure with the Seattle Times, has announced that it will cease print publication, reduce staff from 180 to around 20, and publish entirely online. The P-I is only a few years younger than the Rocky Mountain News. To my knowledge it is the first major US city daily to go this route (the Christian Science Monitor made the switch a while ago, but it's not a city daily).
In both cases the problem is economics. Not the global downturn/recession/whatever-we-call-it, mind. Both papers had survived the Great Depression and several other upheavals and panics. The problem is advertising. Competition from online sources is killing classified ads. Nor is moving online an easy option, though the P-I is going to have a go at it, because online ads just don't pull in as much money as print ads used to.

So: Online advertising is (obviously) real. Publishing online is inherently less costly than publishing in print -- which is why 100+ people in Seattle are losing their jobs -- but even together it's an open question whether this adds up to viable online city newspapers. Or whatever you call a newspaper that's not on paper.

When I travel, I often pick up a copy of the local paper, even if it's a slow news day, just to get a bit of the flavor of the place. Just the name Post-Intelligencer lets you know you're in Seattle and nowhere else. Likewise with the Plain Dealer, Free Press or Mercury News. Even the more common names are quirky. One city's World is another's Globe, one's Chronicle another's Journal.

Papers have been losing flavor for years, though, as more and more of them get bought up by national chains, and again the driver is economics. With this in mind, it's not quite right to say that the economics of online advertising is killing the dailies. Nonetheless, it certainly does seem set to deal a number of death blows.

Friday, March 6, 2009

The point of the postscript

In a previous rumination over Open Source, I quoted Linus's original Usenet post announcing the beginnings of Linux. I left it sitting there somewhat ambiguously, and since a certain amount of ambiguity can be useful, and since this is a blog, not a wiki, I'm going to leave it that way in the original and comment on it here instead.

The point I was making is this: The post strongly suggests Linus thought he was just throwing stuff at the net, but this is highly ironic in light of what actually played out. Little did he know ...

One might go further and speculate about why one would want to put something like that out and invite feedback. My personal guess is that Linus's post cautiously understates what he thought his as-yet-unnamed OS might become. This, in turn, vastly underestimates what it actually has become.

Sunday, March 1, 2009

Revolution OS and thereabouts

OK, so I just watched Revolution OS (on the Roku/Netflix box, of course), which I'd been putting off out of concern it might be more propaganda than information. The opening minute or so did little to allay that, but it turned out to be a pretty good documentary, and as even-handed as you could expect from something that interviewed Open Source folks entirely. It did this by getting in touch with several of the principals, including rms, esr and Linus, and pretty much just letting them talk. This is often a good idea, especially when the principals involved are thoughtful, creative, articulate and intellectually curious.

What emerged was a clear picture of the history of Free Software/Open Source, how "Open Source" came to be the dominant name, and the essential differences between the two: Free Software advocates want all software to be free because it's a Good Thing. Open Source advocates want particular software to be open because it's a Useful Thing. It may not surprise the attentive reader that I tilt toward the latter.

There are ironies along the way, for example a small one in Netscape adopting Open Source not through grassroots activism by engineers -- though this did occur -- but because it was eventually imposed from the top down by management; a large one in that the entire Open Source movement, which is at best indifferent to rms's central goal of making all software free, depends crucially on GNU code and even more crucially on the GPL [more precisely: on the GPL and licenses directly influenced by it]. Rms himself points this out in his acceptance of the Linus Torvalds award at the 1999 LinuxWorld. Linus's daughters trot back and forth behind him on stage all the while.

If you're looking for a spirited debate over Open Source vs. not-open, you won't find it -- except for an early quote from Bill Gates (who did not directly participate), there are no dissenting voices. If you're looking for knock-down, drag-out Linux vs. Windows, as the marketing collateral implies, you won't find that either. And a good thing. Revolution OS is much more a chance to put a human face on the names you see floating around and get an idea of what they were thinking. Considered from that point of view, it succeeds nicely.

But I didn't really set out to write a movie review here. I really just wanted to share something amusing I ran across while chasing a link from a link from a page I looked up out of curiosity after watching the movie. This is from Jamie Zawinski, who has done more Open Source development than most of us, I would wager. Zawinski says:
But now I've taken my leave of that whole sick, navel-gazing mess we called the software industry. Now I'm in a more honest line of work: now I sell beer.

Specifically, I own the DNA Lounge nightclub in San Francisco. However, it takes quite a lot of software to keep the place running, because we do audio and video webcasts twenty-four hours a day, and because the club contains a number of anonymous internet kiosks. So all that code is also available.

This all sounds fine and noble, and I like the design decisions, but I have to wonder: Just how anonymous can an internet kiosk be in a nightclub full of webcams? Checking sports scores without having to establish an account anywhere? Sure. Plotting world domination? Maybe not so much.

By the way, this turns out to be post number 300. I made a production of 100 and 200, but from here on out I probably won't until some more significant milestone. Hmm ... 100π is about 314 ...