Extremely Large Databases
The First Workshop on Extremely Large Databases was held at the Stanford Linear Accelerator Center in October 2007. Many of the heavy hitters were there (Google, Yahoo, Microsoft, IBM, Oracle, Terrasoft, SLAC, NCSA, eBay, AT&T, etc.) from industry, academia and science (their classification, not mine). A report is available, and I thought I'd touch on some of the more interesting things I found in it.

Scale:
- Most have systems with > 100TB of data, with 20% of scientific databases holding > 1PB.
- All industry reps had > 100PB of data, and all had at least one system with > 1PB.
- Industry had single tables with > 1 trillion rows; science was ~100 times smaller.
- Need for multi-trillion-row tables in
- Peak ingest: 1B rows per hour; 1B rows per day is common.

"All users said that even though their databases were already growing rapidly, they would store even more data in databases if it were affordable. Estimates of the potential ranged from ten to one hundred times current...