Friday, April 6, 2012

What's twice as big as the internet?

(Yikes ... I went 0 for March!)

I've mentioned before that telescopes can generate a lot of data.  IBM seems inclined to drive the point home by collaborating with ASTRON (the Netherlands Institute for Radio Astronomy) to put together "exascale" computing horsepower behind the world's largest radio telescope.

The telescope is actually (or rather, will be) an array of millions of antennas with a combined collecting area of about a square kilometer, hence the name SKA, for Square Kilometer Array.  This array is expected to produce on the order of an exabyte of data per day.  This is an absolutely ridiculous amount of data by today's standards.  Think one million one-terabyte disk drives, or twenty million feature films' worth of Blu-ray, or ... according to IBM, twice the daily volume currently carried on the internet.
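As a rough sanity check on those equivalences, here is a small back-of-the-envelope sketch.  It assumes decimal units (1 exabyte = 10^18 bytes) and a 50 GB dual-layer Blu-ray disc per feature film; both are assumptions for illustration, not figures from the press release.

```python
# Back-of-the-envelope check of "an exabyte a day".
# Assumptions (not from the press release): decimal units,
# and one 50 GB dual-layer Blu-ray disc per feature film.
EXABYTE = 10**18
TERABYTE = 10**12
BLU_RAY_DISC = 50 * 10**9
SECONDS_PER_DAY = 24 * 60 * 60


def daily_equivalents(bytes_per_day=EXABYTE):
    """Express a daily data volume in more familiar units."""
    return {
        "terabyte_drives": bytes_per_day // TERABYTE,
        "blu_ray_discs": bytes_per_day // BLU_RAY_DISC,
        "terabytes_per_second": bytes_per_day / SECONDS_PER_DAY / TERABYTE,
    }


if __name__ == "__main__":
    for name, value in daily_equivalents().items():
        print(name, value)
```

Under those assumptions the drive and disc counts match the figures above, and the sustained rate works out to somewhere between 11 and 12 terabytes every second.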

I'm a little skeptical as to exactly how one measures that, but hey, you've got to trust a press release, right?

So where do you put an exabyte a day worth of data?  Well, you don't.  You're certainly not going to upload it to the web.  Particle physicists are faced with the same problem of having to figure out what portion of a huge data set to keep for later analysis, and a large part of running an experiment is setting up the "trigger" criteria by which the software collecting the data will decide what to keep and what to throw away.  IBM and ASTRON's system will be dealing with the same problem, but on an even larger scale.
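To make the "trigger" idea concrete, here is a toy sketch of a filter that decides, sample by sample, what to keep.  The names apply_trigger and above_threshold, the threshold value, and the readings are all hypothetical; real triggers in particle physics or radio astronomy are far more elaborate, and nothing here reflects how IBM and ASTRON's system actually works.

```python
from typing import Callable, Iterable, Iterator


def apply_trigger(samples: Iterable[float],
                  keep: Callable[[float], bool]) -> Iterator[float]:
    """Yield only the samples the trigger criterion accepts.

    Everything else is discarded as it streams past, so the full
    data set never has to be stored.
    """
    for sample in samples:
        if keep(sample):
            yield sample


def above_threshold(threshold: float) -> Callable[[float], bool]:
    """Hypothetical criterion: keep readings at or above a threshold."""
    return lambda sample: sample >= threshold


# Made-up readings: mostly noise, with two strong signals.
readings = [0.1, 0.3, 7.2, 0.2, 9.5, 0.4]
kept = list(apply_trigger(readings, above_threshold(5.0)))
```

Writing the filter as a generator means the decision is made while the data is flowing, which is the whole point: by the time you could look at it twice, most of it is already gone.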

Or I suppose you could sign up two million people and somehow stream an equal share of the data to each at Blu-ray resolution all day every day, but somehow I doubt that kind of crowdsourcing will help much.
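For the curious, the arithmetic behind that two-million-viewer figure can be sketched as follows, again assuming a decimal exabyte (10^18 bytes) per day.  The function name is just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_viewer_mbit_per_s(bytes_per_day=10**18, viewers=2_000_000):
    """Sustained bitrate each viewer would need, in megabits per second."""
    bytes_per_viewer_per_second = bytes_per_day / viewers / SECONDS_PER_DAY
    return bytes_per_viewer_per_second * 8 / 1e6


if __name__ == "__main__":
    print(round(per_viewer_mbit_per_s(), 1), "Mbit/s per viewer")
```

That comes to roughly 46 megabits per second per person, around the clock, which is in the neighborhood of the top bitrate a Blu-ray disc delivers (somewhere in the 40s of Mbit/s, as far as I know).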


Anonymous said...

Sounds interesting.  I'm an electrical engineering student myself, and I will research this topic.