Powered by Advanced Micro Devices chips, the LUMI supercomputer – recently ranked as the fastest supercomputer in Europe as well as one of the most energy efficient – has enabled the TurkuNLP Group to create new models quickly.
Results like this will fuel further interest in large language models (LLMs), which have enabled generative AI solutions such as ChatGPT. While LLMs are gaining ground quickly, training them takes a huge amount of compute power, and models like ChatGPT are usually both proprietary and based on English.
When University of Turku Research Fellow Sampo Pyysalo wanted to extend the value of LLMs to wider research applications, he needed performance to train the models in a useful timeframe. The LUMI supercomputer, based on the HPE Cray EX supercomputer architecture and powered by AMD Epyc CPUs and AMD Instinct GPUs, provided the scale the workloads needed.
To put this into context, LUMI is two orders of magnitude bigger than the previous generation of machines available in Finland. Previously, it took the team half a year to pre-train a billion-parameter language model, but on LUMI it now takes only two weeks to process about 40 billion tokens (units of text such as characters, syllables or words), AMD said.
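As a rough back-of-the-envelope illustration of those figures, the sketch below converts "about 40 billion tokens in two weeks" into an average throughput and compares the two quoted durations. It assumes that half a year is roughly 26 weeks and that the two runs are comparable workloads, which the article does not confirm; the function and variable names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_second(total_tokens: int, days: float) -> float:
    """Average throughput if total_tokens are processed evenly over `days`."""
    return total_tokens / (days * SECONDS_PER_DAY)


# Figures quoted in the article: about 40 billion tokens in two weeks on LUMI.
lumi_rate = tokens_per_second(40_000_000_000, days=14)

# Assumption (not from the article): half a year is taken as 26 weeks,
# and the earlier run is treated as a comparable workload.
previous_weeks = 26
lumi_weeks = 2
time_ratio = previous_weeks / lumi_weeks

print(f"Average LUMI throughput: {lumi_rate:,.0f} tokens per second")
print(f"Wall-clock time ratio: about {time_ratio:.0f}x shorter")
```

Under these assumptions the average works out to roughly 33,000 tokens per second, and the quoted training time shrinks by a factor of about 13.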
Väinö Hatanpää, machine learning specialist at CSC, said in a statement, “The computing capacity and the ability to scale further with LUMI enables our customers to push the boundaries of Machine Learning/AI.”
Opening up large language models
Pyysalo’s goal, with partners Risto Luukkonen and Ville Komulainen in the TurkuNLP Group, was to open up LLMs for academic use.
“The big players are large multinational corporations who keep their models closed,” he said in a statement. “In academia we want practical access to models like these, so we have been creating them ourselves and this requires supercomputer resources.”
Finnish was the natural starting point for a university based in Finland, such as Turku.
Building LLMs relies on advanced AI and machine learning toolsets. Pyysalo has been working with Hugging Face for this. Models of this size require immense computing scale, which is where LUMI proved essential.
LUMI, owned by the EuroHPC Joint Undertaking, was funded 50/50 by the EuroHPC JU and the LUMI consortium, which consists of ten European countries. It is based in Finland at the data center of CSC – IT Center for Science and is hosted by the LUMI consortium. The LUMI-G GPU partition dwarfs other GPU partitions hosted by CSC.
The organization’s Mahti AI consists of 24 GPU nodes, while Puhti offers 80 GPU nodes, both with four GPUs per node. LUMI, in contrast, boasts 2,560 nodes powered by AMD Epyc processors, each with four AMD Instinct MI250X accelerators, for a total of 10,240 GPUs and 20,480 graphics compute dies (GCDs).
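The node and GPU counts above can be checked with a short calculation. This is only an illustrative sketch: the `GpuPartition` class and its field names are hypothetical, and it relies on each MI250X accelerator containing two GCDs, which is consistent with the totals quoted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuPartition:
    """Hypothetical helper describing a GPU partition by its node count."""
    name: str
    nodes: int
    gpus_per_node: int = 4  # all three systems have four GPUs per node

    @property
    def gpus(self) -> int:
        return self.nodes * self.gpus_per_node


mahti = GpuPartition("Mahti AI", nodes=24)
puhti = GpuPartition("Puhti", nodes=80)
lumi = GpuPartition("LUMI-G", nodes=2560)

# Each MI250X accelerator contains two graphics compute dies (GCDs).
GCDS_PER_MI250X = 2
lumi_gcds = lumi.gpus * GCDS_PER_MI250X

for part in (mahti, puhti, lumi):
    print(f"{part.name}: {part.nodes} nodes, {part.gpus} GPUs")
print(f"LUMI-G GCDs: {lumi_gcds}")
```

By this count, Mahti AI and Puhti together provide 416 GPUs, against 10,240 on LUMI-G.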