Cohere Releases Multilingual Aya 23 LLMs as Open Model in 23 Languages
Enterprise generative AI developer Cohere has released a new set of multilingual large language models (LLMs) called Aya 23 through its non-profit research division Cohere for AI (C4AI). Aya 23 comes in 8-billion and 35-billion parameter variants and offers open-weight models covering 23 languages, allowing researchers to experiment with them more freely than with a fully closed model.
Aya 23
Aya 23 is part of the open science Aya initiative, which previously brought together 3,000 researchers to create the original Aya 101 model as well as the largest multilingual instruction fine-tuning dataset available. Though Aya 101 covered 101 languages, Aya 23 trades breadth for depth, offering far more comprehensive knowledge of a smaller set of languages that together account for half the world’s population.
Aya 23 combines a pre-trained model with a new Aya dataset collection and aims to address the language limitations of existing models, which often serve only a few languages. By expanding the scope of high-performing language models, Cohere for AI hopes to democratize access to advanced AI technology and support diverse linguistic communities. The release of Aya 23 signifies a shift in how the machine learning (ML) community approaches multilingual AI research, emphasizing inclusivity and broader language support.
“Aya 23, as well as the wider family of Aya models and datasets contributes to a paradigm shift in how the ML community approaches multilingual AI research. As LLMs, and AI generally, have changed the global technological landscape, many communities across the world have been left unsupported due to the language limitations of existing models,” Cohere explained in a blog post. “Aya 23 is part of our commitment to contributing state-of-the-art research demonstrating that more languages can be treated as first-class citizens and releasing models that support researchers who join this mission.”
Performance benchmarks for Aya 23 indicate that it outperforms other massively multilingual open-source models, including Aya 101, as well as widely used open-weight instruction-tuned models. The models perform strongly on natural language understanding, summarization, and translation tasks. One of the notable aspects of Aya 23 is its accessibility: the 8-billion parameter version is designed to be highly efficient and usable by everyday developers, reducing computational resource requirements.
Aya 23 is notably different from Cohere’s usual focus on generative AI models suited to customer service and other business needs, such as the Command R+ LLM and related models. But a partnership with Microsoft and a $270 million raise last year suggest the company is in good shape. And, of course, multilingual models are a great resource for any international business, with Cohere now able to claim a model capable of communicating with half the globe.