Hugging Face has upgraded StarCoder, the generative AI coding assistant it introduced last year. The new StarCoder 2 large language models (LLMs) were trained on open-source data and can even run on relatively low-powered computers, in contrast to the likes of GitHub Copilot.

StarCoder 2 was built by Hugging Face and ServiceNow Research through their BigCode Project and comes in three variants of 3 billion, 7 billion, and 15 billion parameters. The largest model, StarCoder2-15B, stands out for being trained on 4 trillion tokens spanning more than 600 programming languages. All of the models use more advanced techniques than the original StarCoder, including Grouped Query Attention and the Fill-in-the-Middle training objective, setting new benchmarks for efficiency and scalability in code generation.

The significance of StarCoder 2 extends beyond its technical capabilities: it addresses the growing demand for AI-powered code generators that are both effective and ethically designed. Unlike some of its predecessors and contemporaries, StarCoder 2 is released under the BigCode Open RAIL-M 1.0 license, which fosters responsible usage while offering more freedom than traditional licenses. This approach not only promotes ethical practices in AI development but also positions StarCoder 2 as a potentially more attractive option for developers seeking to integrate AI into their workflows without the legal complexities associated with copyright restrictions.

AI coding assistants are a huge draw for both startup investment and tech giant projects. For instance, generative AI coding assistance startup Magic recently raised $117 million, not long after Codeium raised $65 million and Tabnine pulled in $25 million. On the other end of the spectrum, Magic faces not only GitHub Copilot but also Hugging Face’s StarCoder, IBM’s watsonx Code Assistant, Meta’s Code Llama, and Amazon’s CodeWhisperer, with more likely to follow.

