BigScience Bloom

Left behind: why the Dutch language is absent from Europe's foremost open language model

Three volunteers. A couple of weeks of work. That’s what it took to add a language to BigScience BLOOM, the open multilingual language model with no fewer than 176 billion parameters that was released mid-2022. It aimed to become an open and multilingual alternative to GPT-3. In the end, 46 languages from all over the world made it into the dataset BLOOM was trained on. Even relatively small languages like Basque and Catalan managed to be included....

18 September 2023 · 10 min · Edwin Rijgersberg
Screenshot of talk at EuroPython 2023

My talk at EuroPython 2023: "Threat to Life — Preventing Planned Murders with Python"

I can’t often publicly share details about the kinds of projects we undertake at the Netherlands Forensic Institute (NFI) with the help of AI, but at the recent EuroPython 2023 in Prague, I was able to discuss a case that unfolded a few years ago and on which the NFI had previously issued a press release: the Threat-to-Life project. Police could read along with criminals. In 2020, the police managed to read live messages from a provider of so-called cryptophones: modified phones that, for a substantial payment, were used for encrypted communication in the criminal underworld....

11 September 2023 · 4 min · Edwin Rijgersberg