By Raheel Nawaz
The UK government – with considerable pomp and ceremony – recently unveiled a new online tool for detecting and removing jihadi videos, boasting a high success rate.
The Home Office claims that the machine learning tool, which the government developed with ASI Data Science, can identify 94% of Islamic State (IS) video uploads with a typical error rate of 0.005%. That means 94,000 out of every 100,000 IS videos posted online can be correctly identified, while roughly five in every 100,000 videos that are not serving up IS propaganda will be falsely flagged by the tool.
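To put those headline figures in perspective, the back-of-the-envelope arithmetic below applies the claimed detection rate and error rate to a hypothetical batch of uploads. It is only an illustrative sketch – the upload volumes are made up and are not Home Office data.

```python
# Illustrative arithmetic for the claimed rates (hypothetical upload volumes).
detection_rate = 0.94          # claimed share of IS videos correctly flagged
false_positive_rate = 0.00005  # claimed 0.005% of benign videos wrongly flagged

is_videos = 100_000      # hypothetical IS uploads
benign_videos = 100_000  # hypothetical non-IS uploads

true_positives = detection_rate * is_videos
false_positives = false_positive_rate * benign_videos
missed = is_videos - true_positives

print(f"IS videos correctly flagged: {true_positives:,.0f}")      # ~94,000
print(f"Benign videos falsely flagged: {false_positives:,.0f}")   # ~5
print(f"IS videos missed: {missed:,.0f}")                         # ~6,000
```

Even at these rates, the sketch shows why human review still matters: a small percentage error, applied to the enormous volume of material uploaded every day, can add up to a meaningful number of wrongly flagged posts.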
While more work is needed, this tool – aimed at smaller video streaming and download sites – sounds remarkable. Once deployed, it promises to automatically flag suspected IS videos within minutes of upload. Human moderators will then be able to establish whether the flagged videos should be blocked. Given the tool's claimed accuracy, their job should be far more straightforward.
The government should be congratulated for demonstrating that at least some of the nasty online content can be accurately detected and flagged for blocking. There are, however, a few areas that still need to be addressed.
It is encouraging to see that the proposed model of policing the internet relies on human gatekeepers to make the final calls. This is essential as falsely labelling benign content as extremist can have serious implications for the person who generated the post online. At best they could have a specific post blocked and at worst they could be profiled, blocked from a forum, questioned or even arrested.
Almost all artificial intelligence (AI) systems make their decisions based on sets of patterns or rules, which are either defined by human experts or learned from data using machine learning algorithms. This means that any AI system is only as good as the data used to train it – and the technology is not yet at a point where every decision made by such a detection tool can be treated as infallible.
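To illustrate the general principle, the sketch below trains a toy supervised classifier on a handful of hypothetical, human-labelled examples. It is a generic illustration, not the Home Office's actual system, and the training snippets and labels are invented; the point is simply that the model's decisions can only reflect whatever patterns – and mistakes – are present in the labelled data it was given.

```python
# Generic sketch: a toy classifier whose behaviour is entirely determined
# by its (tiny, hypothetical) labelled training data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical examples labelled by human annotators:
# 1 = "target" (extremist), 0 = "non-target" (benign).
texts = [
    "join the fight and take up arms",      # labelled 1
    "our brothers call you to violence",    # labelled 1
    "recipe for a quick weeknight dinner",  # labelled 0
    "highlights from last night's match",   # labelled 0
]
labels = [1, 1, 0, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# The model can only echo patterns present in the labelled data; anything
# the annotators mislabelled, or simply never included, is invisible to it.
print(model.predict(["a call to take up arms"]))           # likely flagged
print(model.predict(["a documentary about armed conflict"]))  # outcome depends entirely on the training set
```

This is why the choice of training data, and the guidelines used to label it, matter so much in what follows.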
Waving the flag
While certain types of content are clearly illegal or offensive, there exists a greyscale of impropriety and offensiveness. Human experts do not always agree on what should and shouldn’t be blocked. Global legal and cultural diversity also plays a role – what is legal, or even socially tolerable, in one part of the world might break the law elsewhere.
The Home Office hasn’t revealed which datasets have been used to train the extremism blocking tool. More importantly, the guidelines – the specific criteria and thresholds – used to label the videos as “target” or “non-target” aren’t clear.
The different ways people interact with content also pose challenges. Violent imagery is usually quite explicit; however, metaphor and insinuation can mask the level of violence and offensiveness in a piece of text. And the Home Office's new system specialises in videos, which means additional and more sophisticated tools will be needed to flush out text-based jihadi content.
For now, the tool is trained to detect IS videos. Facebook and YouTube have previously reported similar efforts, after coming under intense political pressure.
But recent events, such as the murder of MP Jo Cox, the terrorist attack near the Finsbury Park Mosque and the rise in hate crime, demonstrate that content posted by far-right organisations is just as dangerous.
While the Home Office’s latest effort to stamp out jihadi content online should be embraced, much more needs to be done. It’s also a reminder that imperfect AI and machine learning tools aren’t a quick-fix answer to all of society’s ills.
Raheel Nawaz is a Reader in Text and Data Mining at Manchester Metropolitan University. His areas of research include Artificial Intelligence and Higher Education.
This article was originally published on TheConversation.com on 27 February, 2018. Republished here with permission from the author.