Analyst group IDC's new Digital Universe study predicts there will be 40 zettabytes (ZB) of data on the planet by 2020, an amount that exceeds previous forecasts by 14%.
To put that into a real-world context, 40 ZB is equal to 57 times the number of grains of sand on all the beaches on earth.
The study, which was sponsored by storage company EMC, found that 2.8 ZB of data will have been created and replicated in 2012, and that the total amount of data will double every two years between now and 2020. That would be equivalent to approximately 5,247 GB of data for every man, woman and child on earth in 2020.
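As a rough check on how these figures fit together, the following sketch projects the 2012 volume forward under a strict two-year doubling and works out the population implied by the per-person figure. It assumes decimal units (1 ZB = 10^12 GB); the function names are illustrative and do not come from the study.

```python
# Assumption: decimal (SI) units, so 1 ZB = 10**12 GB.
GB_PER_ZB = 10**12


def project_doubling(start_zb, start_year, end_year, doubling_years=2):
    """Project a data volume forward, assuming it doubles every `doubling_years` years."""
    periods = (end_year - start_year) / doubling_years
    return start_zb * 2 ** periods


def implied_population(total_zb, gb_per_person):
    """Return the population implied by a total volume and a per-person figure."""
    return total_zb * GB_PER_ZB / gb_per_person


# Figures quoted in the article: 2.8 ZB in 2012, 40 ZB and 5,247 GB per person in 2020.
projected_2020 = project_doubling(2.8, 2012, 2020)
people = implied_population(40, 5247)

print(f"Projected 2020 volume: {projected_2020:.1f} ZB")
print(f"Implied 2020 population: {people / 1e9:.2f} billion")
```

Four strict doublings from 2.8 ZB give 44.8 ZB, somewhat above the 40 ZB headline figure, which suggests the two-year doubling is an approximation. The per-person figure implies a 2020 population of roughly 7.6 billion.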
Machine-generated data is a key factor behind this expansion, with its share set to increase from 11% of the digital universe in 2005 to more than 40% in 2020.
And in a year when big data has been at the peak of supplier-driven hype, the study found that 23% (643 exabytes) of the total data in 2012 could be useful as big data if tagged and analysed. However, only 3% of the potentially useful data is currently tagged, and even less is analysed. By 2020, 33% of the digital universe (13,000-plus exabytes) will have big data value if it is tagged and analysed.
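The exabyte figures follow from the percentages and the totals quoted earlier. A minimal sketch of that arithmetic, again assuming decimal units (1 ZB = 1,000 EB) and using illustrative names that are not from the study:

```python
# Assumption: decimal (SI) units, so 1 ZB = 1000 EB.
EB_PER_ZB = 1000


def share_in_eb(total_zb, fraction):
    """Convert a fraction of a total given in zettabytes into exabytes."""
    return total_zb * EB_PER_ZB * fraction


# 23% of the 2.8 ZB created and replicated in 2012.
useful_2012 = share_in_eb(2.8, 0.23)
# 3% of the 643 EB of potentially useful data is tagged.
tagged_2012 = 643 * 0.03
# 33% of the 40 ZB forecast for 2020.
useful_2020 = share_in_eb(40, 0.33)

print(f"Potentially useful in 2012: {useful_2012:.0f} EB")
print(f"Tagged in 2012: {tagged_2012:.0f} EB")
print(f"Potentially useful in 2020: {useful_2020:.0f} EB")
```

The 2012 result comes to about 644 EB, in line with the quoted 643 EB once rounding of the 2.8 ZB total is allowed for. The 2020 result of about 13,200 EB matches the "13,000-plus" figure.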
The study also found that 35% of all data required data protection in 2010, but that less than 20% of it was actually protected.
Geographically, the location of the world's data is set to undergo a shift. Currently, emerging markets account for 36% of the world's data, but that is set to increase to 62% by 2020. The current global breakdown is: US – 32%, Western Europe – 19%, China – 13%, India – 4%, rest of the world – 32%.
The study found Western Europe currently invests the most to manage data, spending $2.49 per GB. The US comes in second, investing $1.77 per GB, followed by China at $1.31 per GB and India at $0.87 per GB.
By 2020, IDC estimates nearly 40% of data will be stored or processed in a cloud between a byte's origination and consumption.