OpenAI Debuts ChatGPT and Whisper Speech-to-Text APIs
OpenAI released APIs for ChatGPT and Whisper. Businesses can incorporate ChatGPT’s conversational generative AI and Whisper’s speech-to-text capability into their websites and apps.
The ChatGPT API is not the same as the GPT-3.5 API that many brands have referred to as the ChatGPT API. The model underlying the ChatGPT API is a related variant called “gpt-3.5-turbo.” As the name suggests, the model upgrades GPT-3.5’s speed and makes it more responsive. Even companies already engaging with the standard GPT-3.5 API will likely jump at the new version, if only for its price point. The ChatGPT API costs$0.002 for a thousand tokens, word fragments that add up to about 750 words in total. That’s a 90% drop from GPT-3.5’s standard price. OpenAI’s largesse is probably born from its recent success in reducing what OpenAI CEO Sam Altman once called the “eye-watering” costs of hosting generative AI chatbots by around the same percentage.
The new model also differs from its previous iteration in not processing input with tokens. It uses the new Chat Markup Language (ChatML), which feeds words to the AI words as a sequence of messages connected with metadata. Users can add more information to a submitted to before the AI begins to work. Several brands are already using ChatGPT API. Snap has deployed the model for the My AI chatbot on Snapchat, while Instacart and Shopify have rolled out generative AI shopping assistants and education quiz service Quizlet has made ChatGPT’s API an academic tutor and gameshow host. Developers can further adjust their generative AI augmentation by buying dedicated instances on Microsoft’s Azure program. This approach also differs from the recently uncovered Foundry program OpenAI is setting up for companies who want to host large language models on their own dedicated server space and have more control over fine-tuning the model.
“We are also now offering dedicated instances for users who want deeper control over the specific model version and system performance. By default, requests are run on compute infrastructure shared with other users, who pay per request. Our API runs on Azure, and with dedicated instances, developers will pay by time period for an allocation of compute infrastructure that’s reserved for serving their requests,” OpenAI explained in a blog post. “Developers get full control over the instance’s load (higher load improves throughput but makes each request slower), the option to enable features such as longer context limits, and the ability to pin the model snapshot.”
The new Whisper API provides more support to the open-source automatic speech recognition software kit unveiled last year. OpenAI trained Whisper to transcribe English audio or transcribe and translate several other languages into English. The 680,000 hours of audio focused particularly on loud environments or non-standard, technical language. The downside is that Whisper isn’t as good as some other ASR models at next-word prediction. Still, Whisper has the potential to enhance apps and software tools with speech-to-text features. Employing Whisper costs $0.006 per minute but promises faster and more convenient access for companies that don’t want to build their own ASR from the open-source code. South Korean language learning app Speak is already using Whisper to power a new “AI speaking companion product” to help users practice languages and share constructive feedback.
“Whisper, the speech-to-text model we open-sourced in September 2022, has received immense praise from the developer community but can also be hard to run. We’ve now made the large-v2 model available through our API, which gives convenient on-demand access,” OpenAI explained. “We believe that AI can provide incredible opportunities and economic empowerment to everyone, and the best way to achieve that is to allow everyone to build with it. We hope that the changes we announced today will lead to numerous applications that everyone can benefit from. Start building next-generation apps powered by ChatGPT & Whisper.”