Generative AI Startup Mistral Releases Free ‘Open-Source’ 7.3B Parameter LLM
Generative AI startup Mistral AI has released an open-source large language model with 7.3 billion parameters. While relatively small for a foundation model, Mistral 7B can handle tasks like summarizing text, answering questions, and generating text. The startup claims that Mistral 7B outperforms other open-source models, runs faster and more efficiently than even larger models, and is cheaper to run.
Mistral released its LLM under an open Apache 2.0 license, aiming to demonstrate the capabilities of even modestly sized open-source LLMs. The company argues that open-source generative AI aligns better with its principles, including transparency, customizability, and preventing misuse. Mistral’s founders come from Meta and Google DeepMind. The company came out of stealth in June with a $113 million seed round and plans to release business-related generative AI products next year.
“At Mistral AI, we believe that an open approach to generative AI is necessary. Community-backed model development is the surest path to fight censorship and bias in a technology shaping our future,” Mistral explained in a blog post. “We strongly believe that by training our own models, releasing them openly, and fostering community contributions, we can build a credible alternative to the emerging AI oligopoly. Open-weight generative models will play a pivotal role in the upcoming AI revolution. Mistral AI’s mission is to spearhead the revolution of open models.”
The company, whose name refers to the cold winter wind in parts of France portending good weather, claims that open-source generative AI models benefit both vendors and users by enabling full customization for specific use cases and needs. This includes adapting model sizes and costs efficiently. The startup is opening code repositories and community channels to collaborate with developers and researchers committed to open-source AI advancement.
Mistral 7B can be downloaded directly or even as a torrent (a sizeable 13.4 gigabytes). The Apache 2.0 license doesn’t restrict use or copying as long as there is attribution, so any person or company able to run the model can use it. That’s similar to how Meta offers Llama 2, though Meta distributes its model through more traditional channels like Microsoft Azure. However, Mistral 7B is also like Llama 2 in that it’s an “open-source” model built with money from corporate sources, with the actual training data and algorithmic weighting kept private. That puts some limits on how useful the model can be on its own. The enterprise services coming later will be much more useful for companies interested in the technology and will be where Mistral (and its investors) get paid for their work.