The Human Error of Artificial Intelligence

In January 2020, Robert Julian-Borchak, a Black man, was arrested in Detroit for a crime he did not commit, solely because of an error in a facial recognition AI system. According to the New York Times, this was the first known case of a U.S. citizen being arrested because of an algorithmic error.

The case of Robert Julian-Borchak tells the story of our times. Today, AI systems are increasingly being used by law enforcement and courts to monitor, track and profile us, and to determine our possible innocence or guilt. These transformations do not only affect the United States. In Europe, too, algorithmic logics are rapidly being incorporated into criminal justice settings [1] and predictive policing [2].

As the example of Robert Julian-Borchak demonstrates, the use of these technologies by courts and law enforcement can have a deleterious impact on our rights and freedoms, and this is the reason why these systems need to be studied, tested, evaluated and sometimes prohibited.

In April 2021, the European Commission published a new Proposal for a Regulation on Artificial Intelligence in Europe, which aims to do precisely this. The proposal suggests not only that AI systems used to profile individuals – by law enforcement, educators, human resources departments and so on – should be treated as high-risk, but also that practices such as social scoring or ‘real time’ biometric surveillance should be prohibited. It also calls for greater accountability and transparency in the ways in which high-risk AI is both trained and used. What emerges clearly from the EU Commission’s proposal is that AI systems aimed at human profiling are exposed to different types of implicit errors and biases that can amplify inequalities in our society and have a negative impact on human rights.

The European Commission’s proposal for a regulation on artificial intelligence is – in my opinion – a very important step forward. Yet, I also believe that we have to be realistic, learn from the experience of GDPR and remember that implementation of legislative proposals like this in Europe is by no means easy. The truth is that our daily lives are now defined by an incredible amount of algorithmic decisions, which are not easily controllable or governable.

Robert Julian-Borchak’s story sounds absurd, dystopian, and Kafkaesque. It is an extreme case. In its extremity, however, it confronts us with a much more mundane and ordinary story: every day we are being judged by AI systems, and every day their errors and biases can limit our freedoms.

Whole lives judged by algorithms?

The technological changes of recent years, and the innovations in the field of big data and artificial intelligence, have brought us to a historical situation in which our personal data is used to decide fundamental aspects of our daily lives. When we look for a job, take out an insurance policy, apply for a loan, enroll our children in school, and in countless other situations of everyday life, this data – decontextualized, sterilized and compared with standard benchmarks and algorithms – is used to judge us. For years I have been studying this transformation, focusing mainly on the datafication of citizens from before birth. I discuss the results of this research in my latest book, published by MIT Press and titled Child | Data | Citizen: How Tech Companies are Profiling Us from before Birth (2020), and also in my new book, titled I Figli dell’Algoritmo / Children of the Algorithm, which will be published in Italy by LUISS University Press.

My research has led me to the conclusion that, for the first time in history, we are creating a generation of citizens who are datafied from before birth. From the moment children are conceived, their medical information is often shared on pregnancy apps or social media. In their everyday lives, their personal data is collected and stored by home technologies and virtual assistants, by educational platforms in schools, by online documents and portals at the doctor’s office, by their Internet-connected toys, by their online games, and by many other technologies. All of this personal data is aggregated, traded, sold, and turned into digital profiles that can follow them across a lifetime.

One of the main ideas of my research is that nowadays there is no boundary between consumer data, which is collected for targeted ads, and citizen data, which is collected to decide whether or not we can have access to certain rights [3]. A key example I discuss in my new book I Figli dell’Algoritmo (forthcoming 2021) comes from the United States. On February 26, 2021, the Washington Post reported that ICE (U.S. Immigration and Customs Enforcement) had access to a Thomson Reuters database called CLEAR. The database includes more than 400 million names, addresses, and user records created from consumer data collected by more than 80 companies providing water, gas, electricity, telephone, Internet, and cable TV services. When I went on CLEAR’s website, I discovered that the database is used for several types of government and institutional investigations that go far beyond immigration violations. In fact, the database is used to combat tax or healthcare fraud and money laundering, or to make decisions about child custody. Without users knowing it, their household data is being cross-referenced and shared, and is being used to decide on their rights.

We are living in a new reality where, more and more, the information that we agree to offer as consumers is being used by predictive analytics and AI systems to determine whether or not we have access to specific rights. Unlike other countries, such as the United States and China, Italy and the European Union offer greater protection when it comes to the use of our personal data and privacy. Yet, even in Italy and Europe we are facing a rapid datafication of governmental infrastructures and an increase in predictive analytics at the institutional level. In Italy, a key example can be found in Prime Minister Draghi’s reform plan.

The thing that amazes me most when I think about these transformations is the speed with which AI systems – aimed at human profiling – are being adopted in different contexts, from schools to hospitals, from law enforcement to government infrastructure. It seems almost as if – over the last decades – we’ve really convinced ourselves that the technologies used for predictive analytics, cross-referencing, facial recognition, emotion recognition, and all the other technologies that power our AI systems offer us a deeper understanding of human behavior and psychology. The question that we need to ask ourselves is: Can we trust these technologies when it comes to reading humans? My answer is no.

The Human Error in our Data

In 2018, I sat in a crowded restaurant in Los Angeles, where I met Cara, one of the parents I interviewed for my book Child | Data | Citizen (2020). The restaurant was noisy and reflected all the sensory vibrancy we were used to in a pre-Covid-19 world. During the interview Cara referred to internet advertising companies as “parasites” and “gossipers,” and explained, “when people seem to infer something about you based on a certain piece of information or rumor, then it’s wrong, it feels like gossip. When I get targeted for a search I did on Google I feel exactly that way; like if someone was gossiping about me.”

In its simplicity, Cara’s analogy shows one of the most problematic aspects of digital profiling. It reminds us that data technologies build approximate images of who we are on the basis of pieces of information that they aggregate and cross-reference. We need to realize that there often is a human disconnection between our digital practices and the data we produce. Our digital practices are complex and contradictory: they reflect different intentions, values and situations, and they are not the expression of precise behaviors. Sometimes we do not use technologies as we are meant to, and we use them tactically. For example, in the families I worked with, practices such as “self-censoring” or “playing with the algorithm” were strategically aimed precisely at confusing data trackers.

Algorithms are not equipped to understand human complexity. Thus, they end up making reductionist and incorrect assumptions about the intention behind a specific digital practice (a web search, a purchase, or a social media post). This is because the data that is collected from our digital practices, and used to create our digital profiles, is often stripped of the feelings, thoughts and context that produced it. To shed light on this problem, in 2018 I wrote a report, signed by Gus Hosein, CEO of Privacy International, and supported by Jeff Chester, director of the Center for Digital Democracy in Washington DC, in which I discussed the anthropological complexity of the data that is collected in our homes (home life data). In the report we show that AI systems such as virtual assistants – which are used by Big Tech to “profile” individuals, find “personalized solutions” or “mitigate future risks” (as in the case of voice profiling to determine a mental state) – rely on data collected through digital home practices that are messy, incomplete and contradictory. We thus conclude that being profiled on the basis of these data traces would inevitably lead to a reductionist and inaccurate analysis of our behaviors.

The problem with our AI systems today is not only that they learn from inaccurate and decontextualized human data, but also that they are trained with databases that are filled with biases and errors. In 2019, for example, researchers at the AI Now Institute in New York published a terrifying report on the uses of AI technologies by U.S. police forces. The report revealed that in several U.S. jurisdictions, AI technologies used by police for predictive analytics relied on “dirty data,” or, in other words, data that was plagued by corrupt practices and underlying racist ideologies [4]. This study, which I cite often, is particularly important because it shows that if AI systems are trained with datasets that are already biased and inaccurate, they end up reinforcing and amplifying the inequalities present in our society. Hence, we have to question the datasets that are used to build our AI systems. When we do this, we may end up with a clearer understanding of why these systems do not profile humans in accurate and fair ways. In Italy, for instance, one important question that we should ask ourselves is this: why, according to a 2019 Wired Italia investigation, do foreign citizens make up 80% of the entries in Sari, the database used by law enforcement agencies for facial recognition, which includes 16 million people?

When we think about the bias of our databases, we need to realize that, unfortunately, we do not have a real solution for this. Databases can’t really be corrected with ‘clean and error-free’ data, because they reflect the social and cultural context that created them, and hence are inherently biased. We have the responsibility to reduce that bias, but we can’t really fix it. This, to me, is the key if we really want to understand and limit the social impact of AI systems. We need to recognize the human error of artificial intelligence. It is for this reason that, in the last year, I launched a new research project at the University of St. Gallen, titled The Human Error Project: AI, Human Nature and the conflict over Algorithmic Profiling. In this research, my team and I use the concept of ‘Human Error’ precisely to explore its ambivalence and ambiguity. AI technologies are often used to make decisions more efficient and objective and to ‘avoid human error’. Yet, paradoxically, when it comes to reading people these technologies are full of systemic ‘errors’, ‘biases’ and ‘inaccuracies’. I believe it is critical to study these errors because they shed light on the fact that the race for AI innovation is often shaped not only by stereotypical and misconstrued understandings of human nature, but also by problematic scientific theories that are grounded in a long history of human reductionism.

The Human Error in AI Systems: Between Reductionism and Scientific Bias

In February 2021, CNN broke the news that an AI software called 4 Little Trees was being used in schools in Hong Kong to analyze children’s facial expressions, determine their emotions and make targeted interventions. A few days later, Kate Crawford, co-founder of the AI Now Institute in New York, wrote an article in the journal Nature [5], in which she referred to this very example. In the article, Crawford points out that the science behind the 4 Little Trees system (and behind most facial-emotion classification systems) is based on Ekman’s theories. According to Ekman, there are six “universal” emotions that are innate, cross-cultural and consistent – fear, anger, joy, sadness, disgust and surprise – and that can be read through the analysis of facial expressions.

Crawford cites the anthropologist Margaret Mead and highlights how Ekman’s theory does not take into account context, culture, and other social factors. If we think about anthropological knowledge, in fact, we realize that the idea that our emotions are universal and have an objective correspondence in our bodies (such as facial expressions) is yet to be proven. Here I am thinking not only of Michelle Rosaldo’s work on the cultural construction of emotions [6], but also of Brian Morris’s work on the anthropology of the self [7]. Both show that psychology and emotions are determined not purely by cognitive processes but also by cultural processes.

The idea that there are six universal emotions that can be mapped to facial expressions has been discredited not only by anthropological knowledge but also by the work of psychologists. Gendron and colleagues conducted a comparative study between U.S. participants and participants from the Himba tribe, and pointed out that past research on the universality of emotions had used incorrect research methods [8]. Barrett and colleagues also highlighted the problems with Ekman’s theory and showed that there are still many open questions about the relationship between expressions and emotions, such as the fact that people often do not express a single emotion with their facial expression [9]. The question that naturally arises is: why does the AI market seek scientific validity in Ekman’s theories when it comes to emotion detection? According to Crawford, the answer is obvious: Ekman’s theory was adopted because it fits with what AI systems can do [5]. Six coherent emotions can easily be standardized and automated at scale, as long as the more complex problems are ignored.

Most of our AI systems are based on Western science, which is shaped by Western-centric, biased and reductionist understandings of human nature. This becomes evident if we consider that our AI systems end up misreading not only our emotions but also our bodies. Over the past six months, my team and I on The Human Error Project analyzed over 100 international media articles in Europe, and realized that most of the examples of algorithmic error cited in them involved some kind of misreading of the human body, including many stories of racist and sexist algorithms. We found that, if we really want to appreciate why AI systems and algorithms seem to get it so wrong when it comes to reading the human body, we need to look at the history of scientific bias in Western thought (Poux-Berthe and Barassi, 2021). In 1981, for example, Gould wrote a book called The Mismeasure of Man, in which he demonstrates how Western scientific thought has often relied on scientific measures modeled on the white man [10]. Gould was particularly fascinated by IQ tests and by the idea that intelligence could be measured biologically (e.g. through skull measurements), and he shows how the benchmarks of these tests use the white man as a reference point. We find a similar interpretation in Strings’ book Fearing the Black Body, published in 2019, which demonstrates how body mass index (BMI) calculations, on which many of our technologies and wellness programs are based, were not established following years of study of healthy weight across different ethnicities and cultural contexts, but were modeled exclusively on white bodies [11].

The technologies we are creating today are based on scientific data and measures that very often carry with them a long history of human reductionism and implicit bias. That’s why we shouldn’t be surprised by all the errors that are emerging when it comes to human profiling. A particularly important current example of the human reductionism of our systems is found in COVID-19 contact tracing apps. In her fascinating work, Milan has shown that most of these apps rely on a “standard” experimental subject that hardly allows for the exploration of other variables such as gender, ethnicity, race, or low income [12]. Milan shows how the roots of this reductionism stem from the very practice of design. It is for this reason that she draws on the anthropologist Arturo Escobar, who advanced a new vision of design theory that takes into account the complex and intersectional pluriverse in which we live [13]. Thinking about the pluriverse in the context of designing AI technologies is certainly a step forward, as Escobar says. Another important step, however, is to learn how to coexist with the human error in AI and recognize that AI technologies will always be inaccurate and biased when it comes to human profiling.

Learning to Live with the Human Error in Artificial Intelligence

Today, there is a lot of talk about AI bias, and tech companies are trying to find “ethical” solutions to “combat algorithmic bias” in their products and technologies; they are funding research and setting up advisory committees to examine the ethical and policy impacts of their technologies. At the heart of these strategies and practices is the understanding that algorithms are biased because they have been fed with “bad data” and that, in order to correct algorithmic bias, companies must therefore train algorithms with “fair” or “unbiased” data. Current industry strategies to ‘combat algorithmic bias’ are deeply problematic because they push forward the belief that algorithms can be corrected and that they can be unbiased [14].

These strategies to ‘combat algorithmic bias’ are, in my opinion, not only problematic but completely miss the point. There is no such thing as a computer system that is not biased. In 1996, Friedman and Nissenbaum, for example, identified three types of bias in computer systems: pre-existing bias (the bias of humans designing computer systems and the bias produced by the cultural context that influences the design), technical bias (there is often a lack of resources in developing computer systems and engineers work with technical limitations – just think of the example of emotion recognition), and emergent bias (society is always changing and so technologies designed at one time or in one cultural context might become biased in a different time and context) [15].

Algorithms and AI systems are human-made. Thus, they will always be shaped by the cultural values and beliefs of humans and by the technical and social conditions that created them. Rather than trying to fix the biases of AI systems and their human error, we need to find ways to coexist with them. Anthropology can help us a lot here. Anthropologists have long sought to address the fact that individuals necessarily interpret real-life phenomena according to their cultural beliefs and experience [16], and that cultural biases necessarily translate into the systems we construct, including scientific systems [17]. From an anthropological perspective, there is nothing we can really do to ‘correct’ or combat our cultural bias, because it will always be there. The only thing we can do is to acknowledge the existence of prejudice through a self-reflective practice and admit that the systems, representations, and artifacts we construct will never be truly ‘objective’. This same understanding should be applied to our AI systems.

Bibliography

[1] Završnik, Aleš. 2019. “Algorithmic Justice: Algorithms and Big Data in Criminal Justice Settings.” European Journal of Criminology: 1477370819876762.

[2] Figini, Silvia, and Vittoria Porta. 2019. “Algoritmi anti-crimine: tutte le tecnologie in campo.” Agenda Digitale. https://www.agendadigitale.eu/cultura-digitale/algoritmi-anti-crimine-tutte-le-tecnologie-in-campo/ (accessed 6 May 2021).

[3] Barassi, Veronica. 2020. Child | Data | Citizen: How Tech Companies are Profiling Us from before Birth. MIT Press. https://mitpress.mit.edu/books/child-data-citizen (accessed 6 May 2021).

[4] Richardson, Rashida, Jason Schultz, and Kate Crawford. 2019. “Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice.” NYU Law Review. https://www.nyulawreview.org/online-features/dirty-data-bad-predictions-how-civil-rights-violations-impact-police-data-predictive-policing-systems-and-justice/ (accessed 6 May 2021).

[5] Crawford, Kate. 2021. “Time to Regulate AI That Interprets Human Emotions.” Nature 592(7853): 167.

[6] Rosaldo, Michelle Zimbalist. 1980. Knowledge and Passion. Cambridge University Press.

[7] Morris, Brian. 1994. Anthropology of the Self: The Individual in Cultural Perspective. Pluto Press.

[8] Gendron, Maria, Debi Roberson, Jacoba Marietta van der Vyver, and Lisa Feldman Barrett. 2014. “Perceptions of Emotion from Facial Expressions are Not Culturally Universal: Evidence from a Remote Culture.” Emotion (Washington, D.C.) 14(2): 251–62.

[9] Barrett, Lisa Feldman et al. 2019. “Emotional Expressions Reconsidered: Challenges to Inferring Emotion From Human Facial Movements.” Psychological Science in the Public Interest 20(1): 1–68.

[10] Gould, Stephen Jay. 1996. The Mismeasure of Man. Norton.

[11] Strings, Sabrina. 2019. Fearing the Black Body: The Racial Origins of Fat Phobia. NYU Press. https://nyupress.org/9781479886753/fearing-the-black-body (accessed 6 May 2021).

[12] Milan, Stefania. 2020. “Techno-Solutionism and the Standard Human in the Making of the COVID-19 Pandemic.” Big Data & Society 7(2): 2053951720966781.

[13] Escobar, Arturo. 2018. Designs for the Pluriverse: Radical Interdependence, Autonomy, and the Making of Worlds. Durham: Duke University Press.

[14] Gangadharan, Seeta Peña, and Jędrzej Niklas. 2019. “Decentering technology in discourse on discrimination.” Information, Communication & Society 22(7): 882–99.

[15] Friedman, Batya, and Helen Nissenbaum. 1996. “Bias in computer systems.” ACM Transactions on Information Systems 14(3): 330–47.

[16] Clifford, James, and George E. Marcus. 1986. Writing Culture: The Poetics and Politics of Ethnography: A School of American Research Advanced Seminar. University of California Press.

[17] Latour, Bruno, and Steve Woolgar. 1986. Laboratory Life: The Construction of Scientific Facts. Princeton University Press.