How the Living Earth Simulator Will Work

By: Robert Lamb

Big Data Divination

Our reflection in the puddles
Michael Nagle/Getty Images

Simulations feed on external data. In the case of weather simulations, the computer models require an expansive diet of both past and present atmospheric readings -- everything from the temperature in Aberdeen, Scotland, to Earth's current distance from the sun. It all comes together to form a more complete picture of the world's weather.

Humans have amassed vast collections of data on a range of topics, yet in most cases these data sets stand apart from one another. Just imagine human knowledge as a vast field littered with puddles. Each puddle represents a collection of data: economic data here, political data there -- each one separate from the others.


But the rain continues to fall and the puddles of data continue to swell, to the tune of 2.5 quintillion bytes per day [source: IBM]. (To give you an idea of how crazy that number is, some people have conservatively estimated that all the words ever spoken by humans equal 5 quintillion bytes of data.)

All that new data comes from climate sensors, social media hubs, digital media Web sites, online transaction records, cell phone GPS signals and countless other sources. The information about the world pours in at an exponential rate. In fact, according to IBM, 90 percent of the data in the world today was created in the last two years alone.

So the rain falls. The data pools swell and spread, overlapping and merging until there are no more pools -- just the vast sea of information we call big data.

To better understand the value of big data, think of it in terms of three v's: variety, velocity and volume. It encompasses data of all varieties, is generated in real time and amasses in volumes that stagger the imagination -- volumes measured in petabytes. A single petabyte is a million gigabytes, sufficient space to stash a 32-year-long MP3 file [source: BBC].
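For readers who want to check the scale for themselves, here is a rough sketch of the arithmetic behind the figures quoted above. It assumes decimal (SI) units, where a gigabyte is a billion bytes and a petabyte is a quadrillion bytes. The variable names are made up for illustration, and the inputs are simply the IBM figure and the spoken-words estimate cited in this article, not new measurements.

```python
# Decimal (SI) storage units, expressed in bytes (an assumption; binary
# units such as the pebibyte would give slightly different numbers).
GIGABYTE = 10**9
PETABYTE = 10**15

# Figures quoted in the article (illustrative names, not official ones).
DAILY_BYTES = 25 * 10**17        # 2.5 quintillion bytes created per day
SPOKEN_WORDS_BYTES = 5 * 10**18  # estimate for all words ever spoken

# How many gigabytes fit in one petabyte?
gigabytes_per_petabyte = PETABYTE // GIGABYTE

# How many petabytes of new data arrive each day?
daily_petabytes = DAILY_BYTES // PETABYTE

# How many days of new data equal every word ever spoken?
days_to_match_speech = SPOKEN_WORDS_BYTES / DAILY_BYTES

print(f"Gigabytes in a petabyte: {gigabytes_per_petabyte:,}")
print(f"Petabytes of new data per day: {daily_petabytes:,}")
print(f"Days of data to match all spoken words: {days_to_match_speech}")
```

Taken at face value, these figures mean the world generates as much data every two days as the estimate for every word ever spoken by humans.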

Can we really build a simulation of the world from this growing wealth of data? The men and women behind the FutureICT Project believe we can -- and all for a mere 1 billion euros ($1.3 billion).