One Database to Rule Them All

A response to this article can be found HERE. [Ed.]

By Svea Windwehr and Jillian C. York

The Invisible Content Cartel that Undermines the Freedom of Expression Online

Every year, millions of images, videos and posts that allegedly contain terrorist or violent extremist content are removed from social media platforms like YouTube, Facebook, or Twitter. A key force behind these takedowns is the Global Internet Forum to Counter Terrorism (GIFCT), an industry-led initiative that seeks to “prevent terrorists and violent extremists from exploiting digital platforms.” And unfortunately, GIFCT stands to have a massive (and disproportionate) negative impact on the freedom of expression of certain communities.

Social media platforms have long struggled with the problem of extremist or violent content on their platforms. Platforms may have an intrinsic interest in offering their users an online environment free from unpleasant content, which is why most social media platforms’ terms of service contain a variety of speech provisions. During the past decade, however, social media platforms have also come under increasing pressure from governments around the globe to respond to violent and extremist content on their platforms. Spurred by the terrorist attacks in Paris and Brussels in 2015 and 2016, respectively, and guided by the shortsighted belief that censorship is an effective tool against extremism, governments have been turning to content moderation as a means to fix international terrorism.

Commercial content moderation is the process through which platforms—more specifically, human reviewers or, very often, machines—make decisions about what content can and cannot be on their sites, based on their own Terms of Service, “community standards,” or other rules.

During the coronavirus pandemic, social media companies have been less able to use human content reviewers, and are instead increasingly relying on machine learning algorithms to both flag and moderate content. Those algorithms, which are really just sets of instructions for doing something, are fed an initial set of rules and lots of training data in the hope that they will learn to identify similar content. But human speech is a complex social phenomenon and highly context-dependent; inevitably, content moderation algorithms make mistakes. What is worse, because machine-learning algorithms usually operate as black boxes that do not explain how they arrived at a decision, and because companies generally do not share either the basic assumptions underpinning their technology or their training data sets, third parties can do little to prevent those mistakes.

This problem has become more acute with the introduction of hashing databases for tracking and removing extremist content. Hashes are digital “fingerprints” of content that companies use to identify and remove content from their platforms. They are essentially unique, and allow for easy identification of specific content. When an image is identified as “terrorist content,” it is tagged with a hash and entered into a database, allowing any future uploads of the same image to be easily identified.

This is exactly what the GIFCT initiative aims to do: Share a massive database of alleged ‘terrorist’ content, contributed voluntarily by companies, amongst members of its coalition. The database collects ‘hashes’, or unique fingerprints, of alleged ‘terrorist’, or extremist and violent content, rather than the content itself. GIFCT members can then use the database to check in real time whether content that users want to upload matches material in the database. While that sounds like an efficient approach to the challenging task of correctly identifying and taking down terrorist content, it also means that one single database might be used to determine what is permissible speech, and what is taken down—across the entire Internet.
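The hash-matching workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GIFCT's actual implementation: the function names and the shared set are hypothetical, and real systems use perceptual hashes (such as PhotoDNA) so that near-duplicates also match, rather than the plain cryptographic hash used here.

```python
import hashlib

# Hypothetical shared database of hashes contributed by member platforms.
# Only the fingerprints are stored, never the content itself.
flagged_hashes = set()

def fingerprint(content: bytes) -> str:
    """Return a hex 'fingerprint' (SHA-256 digest) of the content."""
    return hashlib.sha256(content).hexdigest()

def flag(content: bytes) -> None:
    """Tag content as 'terrorist' material: add its hash to the database."""
    flagged_hashes.add(fingerprint(content))

def matches_database(upload: bytes) -> bool:
    """Check a new upload against the database before it goes live."""
    return fingerprint(upload) in flagged_hashes

# One platform flags an image; every member platform now rejects re-uploads.
flag(b"example image bytes")
print(matches_database(b"example image bytes"))     # exact re-upload matches
print(matches_database(b"different image bytes"))   # other content does not
```

Note that a cryptographic hash only matches byte-identical files; the perceptual hashes used in practice match visually similar content too, which is exactly how a single misclassification by one company's algorithm can propagate takedowns across every member platform.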

Countless examples have proven that it is very difficult for human reviewers—and impossible for algorithms—to consistently get the nuances of activism, counter-speech, and extremist content itself right. The result is that many instances of legitimate speech are falsely categorized as terrorist content and removed from social media platforms. Due to the proliferation of the GIFCT database, any mistaken classification of a video, picture or post as ‘terrorist’ content echoes across social media platforms, undermining users’ right to free expression on several platforms at once. And that, in turn, can have catastrophic effects on the Internet as a space for memory and documentation. Blunt content moderation systems can lead to the deletion of vital information not available elsewhere, such as evidence of human rights violations or war crimes. For example, the Syrian Archive, an NGO dedicated to collecting, sharing and archiving evidence of atrocities committed during the Syrian war, reports that hundreds of thousands of videos of war atrocities are removed by YouTube annually. The Archive estimates that the takedown rate for videos documenting Syrian human rights violations is circa 13%, a number that has almost doubled to 20% in the wake of the coronavirus crisis. As noted, many social media platforms, including YouTube, have been using algorithmic tools for content moderation more heavily than usual, resulting in increased takedowns. If, or when, YouTube contributes hashes of content that depicts Syrian human rights violations but has been tagged as ‘terrorist’ content by its algorithms to the GIFCT database, that content could be deleted forever across multiple platforms.

The GIFCT content cartel not only risks losing valuable human rights documentation, but also has a disproportionately negative effect on some communities. Defining ‘terrorism’ is an inherently political undertaking, and rarely stable across time and space. Absent international agreement on what exactly constitutes terrorist, or even violent and extremist, content, companies look to the United Nations’ list of designated terrorist organizations or the US State Department’s list of Foreign Terrorist Organizations. But those lists mainly consist of Islamist organizations, and are largely blind to, for example, right-wing extremist groups. That means that the burden of GIFCT’s misclassifications falls disproportionately on Muslim and Arab communities, and highlights the fine line between an effective initiative to tackle the worst content online and sweeping censorship.

Ever since the attacks on two mosques in Christchurch in March 2019, GIFCT has been more prominent than ever. In response to the shooting, during which 51 people were killed, French President Emmanuel Macron and New Zealand Prime Minister Jacinda Ardern launched the Christchurch Call. That initiative, which aims to eliminate violent and extremist content online, foresees a prominent role for GIFCT. In the wake of this renewed focus on GIFCT, the initiative announced that it would evolve into an independent organization, including a new Independent Advisory Committee (IAC) to represent the voices of civil society, government, and inter-governmental entities.

However, the Operating Board, where real power resides, remains in the hands of industry. And the Independent Advisory Committee is already seriously flawed, as a coalition of civil liberties organizations has repeatedly noted.

For example, governments participating in the IAC are likely to leverage their position to influence companies’ content moderation policies and shape definitions of terrorist content that fit their interests, away from the public eye and therefore lacking accountability. Including governments in the IAC could also undermine the meaningful participation of civil society organizations, as many are financially dependent on governments or might face threats of reprisal for criticizing government officials in that forum. As long as civil society is treated as an afterthought, GIFCT will never be an effective multi-stakeholder forum. GIFCT’s flaws and their devastating effects on the freedom of expression, human rights, and the preservation of evidence of war crimes have been known for years. Civil society organizations have tried to help reform the organization, but GIFCT and its new Executive Director have remained unresponsive. Which leads to the final problem with the IAC: leading NGOs are choosing not to participate at all.

Where does this leave GIFCT and the millions of Internet users its policies impact? Not in a good place. Without meaningful civil society representation and involvement, full transparency and effective accountability mechanisms, GIFCT risks becoming yet another industry-led forum that promises multi-stakeholderism but delivers little more than government-sanctioned window-dressing.

Svea Windwehr is a Mercator Fellow working on digital rights, platform governance and tech regulation, with a background in EU policy work. On Twitter @sveawindwehr.

Jillian C. York is EFF’s Director for International Freedom of Expression and is based in Berlin, Germany. Her work examines state and corporate censorship and its impact on culture and human rights, with an emphasis on marginalized communities. On Twitter @jilliancyork.

This article was originally published on the Electronic Frontier Foundation (EFF) website, republished under Creative Commons CC BY 3.0 US. Follow EFF on Twitter @EFF.

