The world’s data explained: how much we’re producing and where it’s all stored
Ancient humans stored information in cave paintings, the oldest of which are more than 40,000 years old. As humans evolved, the emergence of languages and the invention of writing led to detailed information being stored in various written forms, culminating with the invention of paper in China around the first century AD.
The oldest printed books appeared in China between AD600 and AD900. For over a millennium, books remained the main source of information storage.
Humans achieved more technological development in the past 150 years than during the previous 2,000 years. Arguably one of the most important developments in human history was the invention of digital electronics.
Since the invention of the transistor in 1947 and the integrated circuit in 1958, our society has experienced a profound shift. In just over 50 years, we’ve achieved unprecedented computing power, wireless technologies, the internet, artificial intelligence, and advances in display technologies, mobile communications, transportation, genetics, medicine and space exploration.
Most importantly, the introduction of digital data storage also changed the way we produce, manipulate and store information. The transition point took place in 1996 when digital storage became more cost-effective for storing information than paper.
Digital data storage technologies are very diverse. Most notable are magnetic storage (HDD, tape), optical discs (CD, DVD, Blu-ray) and semiconductor memories (SSD, flash drive). Each type is better suited to particular applications.
Semiconductor memories are the preferred choice for portable electronics, optical storage is mostly used for movies, software and gaming, while magnetic data storage remains the dominant technology for high-capacity information storage, including personal computers and data servers.
All digital data storage technologies operate on the same principles. Bits of information can be stored in any material containing two distinctive and switchable physical states. In binary code, the digital information is stored as ones and zeroes, also known as bits. Eight bits form a byte.
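To make the idea of bits and bytes concrete, here is a minimal Python sketch that turns a short piece of text into its ones and zeroes and back again. The function names are hypothetical, and the choice of UTF-8 as the text encoding is an assumption made for illustration; the principle of eight bits per byte is the same whatever the encoding.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of `text` as groups of eight ones and zeroes."""
    # Each byte is formatted as exactly eight binary digits.
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Rebuild the text from space-separated groups of eight bits."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


if __name__ == "__main__":
    encoded = to_bits("Hi")
    print(encoded)             # two bytes, so two groups of eight bits
    print(from_bits(encoded))  # recovers the original text
```

Each letter of plain English text occupies one byte here, so a two-letter word becomes 16 bits.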
A logical zero or one is allocated to each physical state. The smaller these physical states are, the more bits can be packed into the storage device. The width of a stored bit today is around 10 to 30 nanometres (billionths of a metre). Building devices that store information at this scale is very complex because it requires controlling materials at the atomic level.
Digital information has become so entrenched in all aspects of our lives and society that the recent growth in information production appears unstoppable. Each day on Earth we generate 500 million tweets, 294 billion emails, 4 million gigabytes of Facebook data, 65 billion WhatsApp messages and 720,000 hours of new YouTube content.
In 2018, the total amount of data created, captured, copied and consumed in the world was 33 zettabytes (ZB) – the equivalent of 33 trillion gigabytes. This grew to 59ZB in 2020 and is predicted to reach a mind-boggling 175ZB by 2025. One zettabyte is 8,000,000,000,000,000,000,000 bits.
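The unit conversions above can be checked with a few lines of Python. This is a sketch that assumes decimal (SI) prefixes, so one gigabyte is 10^9 bytes and one zettabyte is 10^21 bytes; the constant and function names are illustrative only.

```python
BITS_PER_BYTE = 8
BYTES_PER_GB = 10**9    # assumption: decimal (SI) gigabyte
BYTES_PER_ZB = 10**21   # assumption: decimal (SI) zettabyte


def zb_to_gb(zettabytes: int) -> int:
    """Convert zettabytes to gigabytes."""
    return zettabytes * BYTES_PER_ZB // BYTES_PER_GB


def zb_to_bits(zettabytes: int) -> int:
    """Convert zettabytes to individual bits."""
    return zettabytes * BYTES_PER_ZB * BITS_PER_BYTE


if __name__ == "__main__":
    print(f"{zb_to_gb(33):,} GB in 33 ZB")
    print(f"{zb_to_bits(1):,} bits in 1 ZB")
```

Under these assumptions, 33ZB comes out as 33 trillion gigabytes and one zettabyte as 8 followed by 21 zeroes, matching the figures quoted above.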
To help visualise these numbers, let’s imagine that each bit is a £1 coin, which is around 3mm (0.1 inches) thick. A stack of coins representing one ZB would be about 2,550 light years tall, enough to reach the nearest star system, Alpha Centauri, around 600 times. Currently, each year we produce 59 times that amount of data and the estimated compound growth rate is around 61%.
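The coin-stack comparison is simple arithmetic, and the sketch below reproduces it. The 3mm coin thickness comes from the text; the length of a light year (about 9.46 × 10^15 metres) and the distance to Alpha Centauri (about 4.37 light years) are standard values brought in as assumptions, so the result only needs to agree with the rounded figures in the text.

```python
COIN_THICKNESS_M = 3e-3           # from the text: about 3mm per coin
BITS_PER_ZB = 8 * 10**21          # one zettabyte expressed in bits
METRES_PER_LIGHT_YEAR = 9.4607e15 # assumption: standard value
ALPHA_CENTAURI_LY = 4.37          # assumption: approximate distance


def stack_height_light_years(zettabytes: float) -> float:
    """Height of a stack holding one coin per bit, in light years."""
    height_m = zettabytes * BITS_PER_ZB * COIN_THICKNESS_M
    return height_m / METRES_PER_LIGHT_YEAR


if __name__ == "__main__":
    height = stack_height_light_years(1)
    trips = height / ALPHA_CENTAURI_LY
    print(f"Stack for 1 ZB: about {height:,.0f} light years")
    print(f"One-way trips to Alpha Centauri: about {trips:,.0f}")
```

With these inputs the stack comes out at roughly 2,500 light years and a little under 600 trips, in line with the rounded numbers quoted.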
Most digital information is stored in three types of location. First is the global collection of what are called endpoints, which include all internet of things devices, PCs, smartphones and all other information storage devices. Second is the edge, which includes infrastructure such as cell towers, institutional servers and offices, such as universities, government offices, banks and factories. Third, most of the data is stored in what’s known as the core – traditional data servers and cloud data centres.
There are around 600 hyperscale data centres – ones with over 5,000 servers – in the world. Around 39% of them are in the US, while China, Japan, the UK, Germany and Australia together account for about 30% of the total.
The largest data centres in the world are the China Telecom Data Centre in Hohhot, China, which occupies 10.7 million square feet, and The Citadel in Tahoe Reno, Nevada, which occupies 7.2 million square feet and uses 815 megawatts of power.
To meet the ever-growing demand for digital data storage, around 100 new hyperscale data centres are built every two years. My recent study examined these trends and concluded that, at a 50% annual growth rate, around 150 years from now the number of digital bits would reach an impossible value, exceeding the number of all atoms on Earth. About 110 years from now, the power required to sustain this digital production will exceed the total planetary power consumption today.
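The timescale in the atoms comparison follows from compound growth: a quantity growing by 50% a year reaches a given limit after log(limit / start) / log(1.5) years. The sketch below is a rough illustration, not the study's own calculation. It assumes a starting point of 59ZB (the 2020 figure quoted earlier, converted to bits) and a commonly quoted estimate of about 1.33 × 10^50 atoms on Earth; both inputs, and the function name, are assumptions.

```python
import math

BITS_PER_ZB = 8 * 10**21
START_BITS = 59 * BITS_PER_ZB   # assumption: the 59ZB figure for 2020 as the starting point
ATOMS_ON_EARTH = 1.33e50        # assumption: commonly quoted estimate
ANNUAL_GROWTH = 0.50            # from the text: 50% annual growth


def years_until(start: float, limit: float, annual_growth: float) -> float:
    """Years for `start` to reach `limit` at a fixed compound growth rate."""
    return math.log(limit / start) / math.log(1 + annual_growth)


if __name__ == "__main__":
    years = years_until(START_BITS, ATOMS_ON_EARTH, ANNUAL_GROWTH)
    print(f"Bits match atoms on Earth after about {years:.0f} years")
```

With these inputs the answer lands close to 150 years. Because growth is exponential, the result is not very sensitive to the starting value: multiplying or dividing the starting number of bits by ten shifts the answer by under six years.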