Data Empire

Since Adrien Basdevant's essay on society, algorithms and the rule of law, "L'Empire des données", has not yet been translated, we wanted to present some of the topics raised in the book, using extracts from an earlier interview conducted by Mr. Louis Armengaud Wurmser (available in French here).

Louis Armengaud Wurmser (LWA): You open your book with a brief history of data, which reminds us that citizens have been the subject of multiple censuses throughout history. Why is it important to highlight this area?

Adrien Basdevant (AB): By studying the history of data, we observe how humans have collectively depended on data to establish laws. From simple livestock counting, we moved on to taking censuses of populations, then to measuring the wealth of nations. We count and classify to better govern. The philosopher Francis Bacon summarized it well: “knowledge is power”. Indeed, the art of guiding the masses required a thorough knowledge of the state of a country and its population. The history of statistics is the history of a belief that society can be harmonized through data crunching. It is important to go back to that history to understand how statistics allowed us to better organize our societies, to quantify the various phenomena pertaining to globalization, and not merely to describe a given situation but to prescribe solutions.

LWA: What is new in the current status quo of Big Data in comparison with traditional statistics?

AB: Statistics was initially intended for sovereign governments and entities, to help them better understand the extent of their territory and better react to famine, war and epidemics. The birth of statistics is therefore correlated with the creation of the modern state and what Hobbes suitably called the “artificial person”. Currently, we live in the era of “artificial intelligence”, where algorithms feed on exponentially growing quantities of data. The novelty of big data is that it detects weak signals that would otherwise have escaped us. Where statistics reasoned in samples, big data takes the entire population into consideration, even the outliers. We collect data in real time and infer decisions by correlation. This time, applications are not intended solely for governments, but are accessible to and intended for individuals.

LWA: You argue that we tend to focus on personal data, whereas public data poses just as many problems.

AB: Data has become the new raw material of the 21st century. Comparisons to oil are frequently (and unfortunately) used. This data is not necessarily personal data. Take transport data, for example. Storing it gives access to a vast amount of information, which has been the cause of many disputes. For example, the RATP for a long time refused to give the application CityMapper access to bus schedule data, real-time delays, etc. This can raise critical competition questions, such as barriers to entry and monopolistic behavior.

LWA: Can we imagine a world where each and every one of us will be the owner of their personal data?

AB: It is possible, but I don’t think it is desirable. A right of ownership presupposes a certain exclusivity, and we would rapidly reach absurd situations. If I were the owner of the data point “Adrien”, then no one else would be able to use my name. I would need to be paid a license fee in order not to fall prey to identity fraud; it wouldn’t make sense. It is in fact the circulation and the use of data that create its value. Raw data, meaning data points that are not put to use in relation to other data points, has very little value in itself. Moreover, data ownership becomes more difficult once we take into account that ownership laws differ from country to country. In France, we have a personalistic approach: persons have rights over their personal data. This is not reduced to the protection of mere economic value. Finally, an individual would be unable to set the economic value of their own data. How much does a genetic code cost? How can we price our musical tastes? We would rapidly fall victim to an information asymmetry that would make us sell our data for almost nothing. To consider that a datum, like a kidney or a lung, is an extension of our person presents the risk of commodifying the attributes of our personality.

LWA: More broadly, you denounce the normative and discriminatory nature of algorithms. But what is the fundamental difference between a piece of software deciding, for example, who has access to credit, and a human making the same decision? Ultimately, like a human, a machine follows a set of predefined criteria, and individuals are known to be biased.

AB: We usually use black-box algorithms whose decision-making process is unknown to us. And you are right to underline the fact that we can consider the human brain to be the ultimate black box. The difference, however, is that we can recognize the motives behind human decision-making and see that they can be contested. An algorithm by nature discriminates, so we need to ensure that it does so on the basis of acceptable and contestable criteria, sheltered from exclusion lists. And for that, we need to verify the decision-making process happening at the heart of the algorithm. I can give you illustrative examples. Can a credit card company be allowed to increase a couple's interest rate because they are attending marriage counseling? Can the price of an airplane ticket change as a function of the laptop you are using? Can banks lower a client's credit ceiling because other customers with a bad credit history frequent the same supermarkets? We need to have an ethical discussion on the criteria used by algorithms.

LWA: You notably say that the fight happens especially in education: we need to learn how to understand and question statistics. What needs to happen and how?

AB: It is above all a question of driving change. If we do not understand the underlying mechanisms of big data, and question them, we will never be able to challenge them. The coder needs to become an ethicist and the judge a statistician. This happens only through the hybridization of knowledge: by training the creators of algorithms in ethics, and by giving those who interpret algorithms an understanding of how they work. Education needs to mix the hard and soft sciences. We need both sides communicating and sharing knowledge with one another. The point is also to educate civil society so that it gets involved in this revolution and contributes to steering it in the right direction. We also need to create multidisciplinary research institutes. A cultural understanding of questions pertaining to the Internet and society is missing. We cannot just be satisfied with having Station F; we need to promote a new generation of pioneers. We need to make digital technologies a society-wide debate in which philosophers, sociologists, engineers, biologists, historians and entrepreneurs can participate. This is one of the main reasons why the subtitle of my book is “An essay on society and algorithms”. In the United States, such research institutes have existed for a long time, at Harvard and Stanford for example.

LWA: For now, you denounce the “Coup Data”. What do you mean exactly by this term?

AB: Big data today is a new key to power. Today, we govern with data. Data feeds into algorithms that steer the course of our daily lives: the information we access on our tablets and social networks, the credit rate that is offered to us, the recruitment process for our next job, the calculation of our insurance premium, the distribution of students in university courses at the end of high school, and even our probability of recidivism. The “coup data” is the overthrow of the rule of law in favor of algorithmic governance. I explain this concept in detail in my book. If I had to summarize it, I would say that it refers primarily to two aspects. On the one hand, those who own the data hold power. On the other hand, opaque algorithms increasingly dictate the laws governing our daily lives. We come to govern from a statistical interpretation of reality that is no longer interested in causation or intention. This means governing on the basis of our profiles and erasing the very existence of individuality, which is what makes us irreducibly human and capable of unexpected behavior. By wishing to eliminate uncertainty, this logic disregards all the differences between us. The “coup data” does not necessarily mean the end of the rule of law, but it could if, in the future, we came to be governed by algorithms that we could neither understand nor challenge.

LWA: Are there any actors mobilizing as effective counter-powers today?

AB: There are actors that are mobilizing. This is the case of Max Schrems, an Austrian student who won his case concerning Facebook and had the Safe Harbor, the US-EU data sharing agreement, invalidated 15 years after its adoption. We need more effective counterbalances. Schrems created an NGO called None of Your Business, designed to support lawsuits and actions relating to data protection at the European level. We need to develop and promote other data protection NGOs for the digital era, built on the model of the Electronic Frontier Foundation.

Still, we need to encourage innovation. There is no sense in opposing innovation out of either fear or laziness, unless we want to repeat the failed revolts of the Luddites at the beginning of the 19th century. The Lyonnais silk workers made it their mission to destroy the machines that replaced their jobs. Their revolt is often associated with the word “sabotage”, as they are said to have thrown their shoes (sabots) into the silk machines to put them out of service. No, we should not sabotage innovation; we should promote it in ways that respect the rights of all of us, in the most inclusive and ethical manner possible.
