The first ALIA language models. The aim of ALIA is to promote the development of artificial intelligence (AI) by making resources available to everyone in Spanish and Spain's co-official languages (Catalan and Valencian, Basque and Galician).
The idea is that individual users and companies can use these resources to carry out research or develop their own AI products, although the technology will also be deployed in some public bodies. In fact, the activation of ALIA is accompanied by the launch of two pilot projects: an internal chatbot that promises to streamline the work of the Tax Agency, and a solution aimed at primary care medicine that will allow "an early and more precise diagnosis of heart failure."
ALIA is now available to everyone and has been verified by the Spanish Agency for the Supervision of Artificial Intelligence (AESIA). The language models have been trained using part of the infrastructure of the Barcelona Supercomputing Center, specifically the MareNostrum 5 supercomputer. A key piece for Spain's scientific ambitions, it has been in operation since 2023 and has cost more than 200 million euros.
The available models are as follows:
A variety of data sources have been used, including Common Crawl, GitHub, Wikimedia projects (Wikipedia, Wikilibros, Wikinoticias, Wikiquote, Wikisource and Wikivoyage) and EurLex, among others.