Most of us like to discuss our ideas and opinions on silly and serious issues, share happy and sad moments, and play together on the internet. And that's a great thing. We all want to be free to learn about new things, stay in touch with our friends, and reach out to new people. Every minute we share photos, videos, and ideas: 527,760 photos shared on Snapchat, 4,146,600 YouTube videos watched, 456,000 tweets sent, and around 46,740 photos posted on Instagram - every single minute. Do you know how many minutes there are in one day? 1,440.
These pieces of information are different in nature. Some of them are home videos, and the law has nothing to do with them. But there is content that clearly breaches the law, such as child abuse material or incitement to violence. And between legal and illegal content there is a third group, which some people find harmful while others see no problem with it. Pornography, for example, is not illegal, but some parents would like to keep their 12-year-old children from accessing it. It is not easy to define, let alone categorize, what is harmful and for whom. It depends on culture, age, circumstances, and many other factors.
Because online platforms host such a large quantity of internet content, they have to rely on automated tools to find and tackle different categories of illegal or potentially harmful content. In particular, dominant players such as Facebook and Google have been using monitoring and filtering technologies to identify and remove content. Do we agree on removing child abuse material? Certainly. Do we agree on preventing ISIS recruitment videos from spreading? Absolutely.
The EU, together with some Member States, has been continuously pushing online platforms to swiftly remove illegal or potentially harmful content, such as online hate speech or terrorist content, often under the threat of fines if they don't act fast enough. To meet these demands, tech companies have to rely on automated tools to filter out information that should not go online.
While automation is necessary for handling the vast amount of content shared by users, it makes mistakes that can have far-reaching consequences for your rights and the well-being of society.
1. Contextual blindness of automated measures silences legitimate speech
Automated decision-making tools lack an understanding of linguistic and cultural differences. Content recognition technologies are unable to accurately assess the context of an expression. Even in straightforward cases, they make false matches. In 2017, the pop star Ariana Grande streamed her benefit concert 'One Love Manchester' via her YouTube channel. The stream was promptly shut down by YouTube's upload filter, which wrongly flagged Grande's show as a violation of her own copyright. On a more serious note, the same automated tools removed thousands of YouTube videos that could serve as evidence of atrocities committed against civilians in Syria, potentially jeopardizing any future war crimes investigation that could bring the perpetrators to justice. Because of their contextual blindness - in other words, their inability to understand users' real meaning and intentions - they flag and remove content that is completely legitimate. Thus journalists, activists, comedians, artists, and any of us sharing our opinions, videos, or pictures online risk being censored because internet companies rely on these poorly performing tools.
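To make this contextual blindness concrete, here is a minimal, purely hypothetical sketch in Python - not any platform's actual system. It imitates a fingerprint-style copyright matcher: uploads are compared against a reference database of registered content, and anything that matches is blocked. The names and data are invented for illustration; the point is that the matcher has no notion of who is uploading or why.

```python
# Illustrative sketch only (not any platform's real system): a toy
# "fingerprint" matcher that blocks uploads matching registered content.
import hashlib

# Hypothetical reference database registered by rights holders.
REFERENCE_FINGERPRINTS = {
    hashlib.sha256(b"one-love-manchester-audio-segment").hexdigest():
        "Ariana Grande / label",
}

def fingerprint(segment: bytes) -> str:
    """Stand-in for a real perceptual fingerprint: here, just a hash."""
    return hashlib.sha256(segment).hexdigest()

def moderate_upload(uploader: str, segment: bytes) -> str:
    """Block any upload whose fingerprint matches registered content."""
    owner = REFERENCE_FINGERPRINTS.get(fingerprint(segment))
    if owner is not None:
        # The matcher only sees the content match, not the context:
        # it cannot tell that the uploader IS the rights holder.
        return f"BLOCKED: matches content registered to {owner}"
    return "ALLOWED"

# The artist streaming her own concert is treated like any infringing copy.
print(moderate_upload("ArianaGrandeOfficial",
                      b"one-love-manchester-audio-segment"))
# -> BLOCKED: matches content registered to Ariana Grande / label
```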
2. They’re not a silver bullet
These technologies are sometimes described as 'artificial intelligence', a term that conjures up notions of superhuman computational intelligence. However, nothing of the sort exists, nor is it on the horizon. What the term actually refers to are advanced statistical models that have been trained to recognise patterns, with no real 'understanding' or 'intelligence'. Content recognition technologies cannot understand the meaning or intention of those who share a post on social media, or the effect it has on others. They merely scan content for certain patterns - in images, text, or audio - that correspond to what they have been trained to identify as 'hate speech' or 'terrorist content'. There is no perfect, unambiguous training data, so their ability to recognise these patterns is inherently limited to what they have been trained on. Although they can achieve very high accuracy in identifying unambiguous, consistent patterns, their ability to automate the very sensitive judgement of whether something constitutes hate speech will always be fundamentally limited.
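As a purely illustrative sketch of the pattern matching described above - the blocklist and posts are invented, and real systems use statistical models rather than simple keyword lists - a naive filter shows why matching patterns is not the same as understanding meaning:

```python
# Minimal sketch, not a real moderation system: a keyword-based
# "hate speech" flagger. Terms and example posts are hypothetical.
BLOCKLIST = {"<slur>", "attack", "exterminate"}

def flag(post: str) -> bool:
    """Return True if the post contains any blocked pattern."""
    words = {w.strip(".,!?'\"").lower() for w in post.split()}
    return bool(BLOCKLIST & words)

posts = [
    "We should exterminate them all",                    # incitement
    "The politician said we should 'exterminate' them - "
    "this is unacceptable",                              # condemnation
    "Our team will attack the second half with energy",  # sports talk
]

for p in posts:
    print(flag(p), "|", p)
# All three are flagged: the filter sees the same pattern in each, but
# cannot tell incitement from condemnation, quotation, or an unrelated meaning.
```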
Understandably, governments want to show their citizens that they are doing something to keep us safe from terrorism, hate speech, child abuse, or copyright infringement. And companies are very happy to sell their automation technologies as a silver-bullet solution to politicians desperately digging for a simple answer. But we have to keep in mind that no automation will solve problems deeply rooted in our society. We can use these technologies to lessen the burden on platforms, but we need safeguards that ensure we don't sacrifice our human rights to poorly trained automated tools.
So who should decide what we see online? Read more on this topic.
Authors: Eliška Pírková from Access Now & Eva Simon from Liberties