UAE Releases Open-Source Falcon 40B Generative AI LLM
The United Arab Emirates’ Technology Innovation Institute (TII) has released an open-source large language model with 40 billion parameters called Falcon 40B. The UAE government hopes the prospect of venture funding through its VentureOne commercial investment group will entice generative AI startups to pick Falcon 40B over rival LLMs.
TII says it trained Falcon 40B on a trillion tokens of data. For comparison, OpenAI’s GPT-3 has 175 billion parameters and was trained on roughly 300 billion tokens. There are also much smaller Falcon models, at 7.5 billion and 1.3 billion parameters, and a far larger 180-billion-parameter variant is in the works.
“Computing power plays a pivotal role in expediting AI system training and enabling faster implementation of use cases,” TII CEO Ray Johnson said. “As the new fuel that drives technological innovation, the move to offer such support will be game-changing in enhancing the capabilities of innovators, and enabling them to push the boundaries of their projects to achieve remarkable advancements.”
The open-source release is meant to help Falcon 40B stand out from the models of OpenAI, Google, and other LLM providers and to draw interest from developers. TII is soliciting proposals from researchers and computer scientists for applications built on Falcon 40B, and the chance of money and other resources from VentureOne is part of the incentive. TII and VentureOne are both part of the UAE’s Advanced Technology Research Council (ATRC).
The country has been experimenting with integrating generative AI into its services this year, with the Dubai Electricity and Water Authority incorporating ChatGPT into its customer service system.
Soon after its rollout, Falcon 40B rose to the top of Hugging Face’s Open LLM Leaderboard. TII has also begun training Falcon on Amazon Web Services infrastructure and worked with the cloud provider to make the model available on Amazon SageMaker.
“Making Falcon 40B open source represents a critical milestone in our commitment to fostering AI innovation,” ATRC secretary general Faisal Al Bannai said. “We are disrupting LLM access and enabling researchers and entrepreneurs to come up with the most innovative use cases. We will further support these submissions with computation power as funding through VentureOne, helping to advance a thriving research ecosystem.”