The Emergence of Data
How did we go from simple livestock counts to algorithms that recommend which movie to watch, which article to read, or which person to invite to a restaurant? In other words, how did we come to rely on numbers to make laws and to tie our policy decisions to arithmetic calculations? To understand this evolution, we need to look back at the history of data culture and establish the link, still insufficiently studied today, between the emergence of statistics and the contemporary development of artificial intelligence algorithms.
Data are not a recent invention
Data have been circulating for tens of thousands of years, ever since people first needed to count resources, hazardous phenomena, and anything else that could be counted. Numbers actually preceded letters. Notches cut or chiseled into bone were used to count animals even before the Sumerians invented writing in the 4th millennium BC. The Mesopotamian clay tablets of that period then formed one of the first data corpora, recording accounting operations. Dried tablets could be rehydrated and reused. Baked tablets, which could no longer be rewritten, were less volatile and made it possible to keep a lasting record of exchanges.
The range of what is counted has grown significantly over time. Wishing to know its territories and populations, the State began to count its vital forces, whether to prepare for war or to apportion taxes according to each person's occupation. As early as Ancient Rome, the census was entrusted to dedicated magistrates, the "censors", elected every five years. Each family would go to the Campus Martius (the Field of Mars) to declare its members and its property. But the information became outdated even before the count was complete. It was not until the 18th century that these practices became widespread, thanks to new ways of organizing data collection. In Sweden, for instance, the close ties between state and religious administrations made censuses more reliable, notably thanks to the burial registers kept by parish priests (data on age, gender and marital status).
Counting more frequently made it possible to observe how the measurements changed over time. What began as a simple exercise in counting herds came to rely on statistics, which made it possible to assess the risks taken by merchant ships, to provide insurance, and to compare States and companies according to their power. Statistics then became an objective measuring instrument, making it possible to understand a social reality that had previously been complex and elusive.
Data have been flowing since the dawn of our species, but it is only through statistics that we have learnt how valuable and useful their collection is. Etymologically, statistics is linked to the idea of making an inventory. In medieval Latin, status means "inventory". The term is inherently attached to all government activity. Modern Latin uses statisticus to refer to what is "relative to the State". Its Italian derivative, statista, gives the term "statesman". This central, historical link with the State should not, however, hide the fact that quantification was first practiced in a commercial context.
Numbers and Law: from the Middle Ages to the GDPR
Merchants' liability now derives from the law, but it originally stemmed from the regulations of the merchant corporations, the lex mercatoria. It was only with the Italian city-states that it was gradually imposed on tradesmen in order to prevent bankruptcies, to guarantee proof of good management, and ultimately to protect trade. The globalization of trade and the emergence of the market economy then made it necessary to quantify new mass phenomena, whether demographic, economic, social or moral. States in turn embraced these tools and techniques in order to structure their societies more effectively.
In the Middle Ages, it was the traders themselves who made it their duty to be accountable, through the introduction of double-entry bookkeeping. This invention allowed them to account for their activities to those they dealt with, both private individuals and public authorities. The faithful keeping of accounts was thus the basis of commercial liability. This medieval link between numbers and law, through the obligation of accountability, is still found today in our Big Data era.
"Accountability" is indeed a key notion of the recent European regulation on personal data, known as the GDPR (General Data Protection Regulation), which requires data controllers to "be accountable" to the individuals whose data they process, both for the impact of that processing on their privacy and for the upstream safeguards implemented to prevent abuse.