The Large Synoptic Survey Telescope (LSST) is designed to cover the entire sky visible from its location every three days, using a 3.2 gigapixel camera and three very large mirrors. In doing this, it will produce stupefying amounts of data -- somewhere around 100 petabytes, or 100,000 terabytes, over the course of its survey. So imagine 100,000 one-terabyte disk drives, or over 2 million dual-layer Blu-ray disks. Mind, the thing hasn't been built yet, but two of its three mirrors have been cast, which is a reasonable indication people are serious. Even if it's never finished, there are other sky surveys in progress, for example the Palomar Transient Factory.
Got a snazzy 100-gigabit Ethernet connection? Great! You can transfer the whole dataset in a season -- start at the spring equinox and you'll be done by the summer solstice. The rest of us would have to wait a little longer. My not-particularly-impressive "broadband" connection gets more like 10 megabits per second, order of magnitude, so that'd be more like 2500 years, assuming I don't upgrade in the meantime and leaving aside the small question of where I'd put it all.
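If you want to check my arithmetic, here's the back-of-envelope version in Python. The 100-petabyte figure, the 50GB dual-layer Blu-ray, and the two link speeds are just the assumptions from above, not official numbers:

```python
# Back-of-envelope arithmetic for the LSST dataset, using the
# assumptions stated in the text (100 PB total, 50 GB per Blu-ray).

DATASET_BYTES = 100e15   # ~100 petabytes over the whole survey
BLURAY_BYTES = 50e9      # one dual-layer Blu-ray disk

print(f"Terabyte drives needed: {DATASET_BYTES / 1e12:,.0f}")
print(f"Blu-ray disks needed:   {DATASET_BYTES / BLURAY_BYTES:,.0f}")

def transfer_days(bits_per_second):
    """Days to move the whole dataset at a sustained link speed."""
    return DATASET_BYTES * 8 / bits_per_second / 86400

print(f"At 100 Gbps: {transfer_days(100e9):,.0f} days")           # ~93 days, about one season
print(f"At 10 Mbps:  {transfer_days(10e6) / 365.25:,.0f} years")  # ~2500 years
```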
Nonetheless, the LSST's mammoth dataset is well within reach of crowdsourcing, even as we know it today:
- Galaxy Zoo claims that 250,000 people have participated in the project. Many of them are deadbeats like me who haven't logged in for ages, but suppose there are even 10,000 active participants.
- The LSST is intended to produce its data over ten years, for an average of around 2-3 Gbps. Still fairly mind-bending -- about a thousand channels' worth of HD video, but ...
- Divide that by our hypothetical 10,000 crowdsourcers and you get 200-300 Kbps, not too much at all these days. Each crowdsourcer's daily share works out to around 3GB, which could be grabbed in an under-an-hour burst in the middle of the night or spread out through the day without noticeably hurting performance (see the sketch after this list).
- Assuming you kept all the data, you'd need a new terabyte disk roughly every year, so that's not prohibitive either.
- The hard part is probably uploading a steady stream of 2-3 Gbps (BitTorrent wouldn't help here, since each recipient gets a unique chunk of data). As far as I can tell the bandwidth is there, but at that volume I'm guessing the cost would be significant.
- In reality, there would probably be various reasons not to ship out all the raw data in real time, but instead send a selection or a condensed version.
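Here's the same crowdsourcing arithmetic as a quick Python sketch, if you want to poke at the numbers yourself. The ten-year survey and the 10,000 participants are the hypotheticals from the list, nothing more:

```python
# Crowdsourcing arithmetic under the assumptions from the list above:
# 100 PB produced over a ten-year survey, split among 10,000 people.

DATASET_BYTES = 100e15
SURVEY_SECONDS = 10 * 365.25 * 86400   # ten-year survey
PARTICIPANTS = 10_000

avg_bps = DATASET_BYTES * 8 / SURVEY_SECONDS
print(f"Average survey output:    {avg_bps / 1e9:.1f} Gbps")        # ~2.5 Gbps

per_person_bps = avg_bps / PARTICIPANTS
print(f"Share per participant:    {per_person_bps / 1e3:.0f} Kbps") # ~250 Kbps sustained

daily_gb = per_person_bps / 8 * 86400 / 1e9
print(f"Daily chunk:              {daily_gb:.1f} GB")               # ~3 GB/day

# Grabbing a whole day's chunk in a single one-hour burst instead:
burst_mbps = daily_gb * 1e9 * 8 / 3600 / 1e6
print(f"One-hour burst rate:      {burst_mbps:.1f} Mbps")           # ~6 Mbps

print(f"Days to fill a 1 TB disk: {1e12 / (daily_gb * 1e9):.0f}")   # ~365
```

Run it and you get the numbers in the list: about 2.5 Gbps for the survey as a whole, about 250 Kbps per person, a daily chunk of about 3GB that needs only ~6 Mbps to pull down in an hour, and a terabyte disk filling up in about a year.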
Wikipedia references a 2007 press release saying Google has signed up to help. As usual I don't know anything beyond that, but it does seem like a googley thing to do.