I can’t often publicly share details about the kind of projects we undertake at the Netherlands Forensic Institute with the help of AI, but at the recent EuroPython 2023 in Prague, I was able to discuss a case that unfolded a few years ago and on which the NFI had previously issued a press release: the Threat-to-Life project.

Police could read along with criminals

In 2020, the police managed to read live messages from a provider of so-called cryptophones: modified phones that, for a substantial payment, were used for encrypted communication in criminal circles. It wasn’t the first time, nor the last, that the police managed to do this. It happens so frequently that there is even a summary list of such operations against cryptophone providers.

In practice, it turns out that some criminals feel extraordinarily safe using these cryptophones. They unabashedly transmit the most sensitive and incriminating messages without any obfuscation. Communication is key in the business world, apparently, no matter what kind of business you are in.

Detecting threat-to-life messages

Being able to read along is one thing, but when it comes to a large flow of messages, you want the police to be able to assess certain types of messages in a timely manner. For example, if discussions revolve around preparing for assaults, kidnappings, and assassinations, timely action must be taken to prevent these. So, something was needed: a threat-to-life detector.

Herein lay the challenge: train a classification model that can find threat-to-life messages in large collections of non-threat-to-life messages from cryptophones. And although the task — classification — is not so innovative in itself, it is not trivial to get such a model off the ground. After all, you need to create a model that can handle the kind of language used in these messages: informal and riddled with street language and jargon. Quite different from the language you encounter when you scrape Wikipedia, for instance.
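To make the classification task a bit more concrete, here is a minimal sketch. It is not the model from the project (the talk covers a deep learning approach); it swaps in a much simpler technique, a naive Bayes classifier over character trigrams, purely to illustrate why sub-word features can help with informal spelling. The class name, the labels, and the toy messages are all invented for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a lowercased, space-padded message."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayes:
    """Multinomial naive Bayes over character n-grams with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.feature_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(char_ngrams(text))
        # Vocabulary: every n-gram seen in any class during training.
        self.vocab = set().union(*self.feature_counts.values())
        return self

    def log_scores(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            counts = self.feature_counts[c]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[c] / total_docs)
            for gram in char_ngrams(text):
                if gram in self.vocab:  # n-grams never seen in training are skipped
                    score += math.log((counts[gram] + 1) / denom)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy data; real training data would be far larger and messier.
texts = [
    "we grab him tonight",
    "grab him after work tonight",
    "see you at lunch",
    "lunch at noon tomorrow",
]
labels = ["threat", "threat", "other", "other"]
model = NaiveBayes().fit(texts, labels)

# Misspelled variants still share trigrams with the training messages.
print(model.predict("grabbb him 2nite"))
print(model.predict("lunch tmrw?"))
```

Because the features are character trigrams rather than whole words, a misspelled or stretched word still overlaps with what the model saw in training. A word-level model trained on clean text would treat such a word as unknown.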

In addition, you must be able to gather enough training data — examples of the kind of messages you are looking for. And remember: these are relatively rare in the large stream of other messages. Kind of a chicken-and-egg problem actually.
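A bit of arithmetic shows why this rarity hurts. The numbers below are hypothetical and not taken from the project; they only illustrate the effect of a low base rate on labelling effort and on the share of flagged messages that are real hits.

```python
def expected_labels_needed(target_positives, prevalence):
    """Expected number of randomly sampled messages to label to find the target."""
    return round(target_positives / prevalence)


def expected_precision(prevalence, recall, false_positive_rate):
    """Fraction of flagged messages that are true positives (Bayes' rule)."""
    true_pos = prevalence * recall
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Hypothetical assumption: 1 in 1,000 messages is a threat-to-life message.
prevalence = 0.001

# Labelling random messages until you have 500 positive examples:
needed = expected_labels_needed(500, prevalence)
print(f"about {needed:,} messages to label")

# A model that catches 90% of threats and wrongly flags 1% of everything else:
precision = expected_precision(prevalence, recall=0.9, false_positive_rate=0.01)
print(f"{precision:.1%} of flagged messages are real threats")
```

Under these assumed numbers, random labelling would take roughly half a million messages, and only about 8% of the flagged messages would be real threats. That is the chicken-and-egg problem in a nutshell: to find positive examples efficiently you want a detector, and to build a detector you need positive examples.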

EuroPython 2023

You can see how we solved these problems in the live recording of my talk below. It was a relatively short talk for an audience of programmers, not necessarily data scientists. Therefore, I chose not to delve too deeply into the details of the deep learning, and instead spent more time on the context of the entire story.

But that’s why I think it gives a nice peek behind the scenes: it shows what you encounter when deploying AI for a case like this.

Dozens of serious violent crimes prevented

And the result? The police press release from July 2020 disclosed the preliminary results of the police operation. It also shows what the police were able to do with the threat-to-life signals that emerged from the investigation.

Below are the preliminary results:

  • Over 100 suspects arrested for very serious crimes
  • Nearly 20 million euros in cash seized
  • 8000 kilograms of cocaine and over 1200 kilograms of crystal meth seized
  • 19 synthetic drug labs dismantled
  • Dozens of firearms taken off the streets
  • In the Netherlands alone, over 3000 signals that seemed life-threatening were processed in the past few months. By intervening in time, the police were able to prevent dozens of serious violent crimes, including impending kidnappings, extortion, assassinations, and torture.

Today, three years later, Europol is still keeping score of the entire operation. According to Europol, the count has reached over 6,500 arrests, and almost 900 million euros in cash and assets have been seized.

And now?

Cryptophones and decrypted messages were, and still are, highly relevant in criminal cases. This is further illustrated by a recent news article from NOS on how the digital department of the NFI managed to crack hundreds of individual cryptophones.

This post was translated from the original Dutch with the help of GPT-4.