Inworld Voice Augments Virtual Beings With Expressive (and Inexpensive) AI-Generated Speech

Generative AI synthetic character startup Inworld AI has officially released a new AI voice generator designed to provide dynamic and expressive voice experiences. Inworld Voice aims to offer developers realistic but relatively low-cost synthetic voices for use in video games, audiobooks, and voice assistants.

Inworld Voice launches with a 58-voice portfolio enhanced with AI models that mimic human speech patterns, including latency, intonation, pitch variation, and rhythm. The underlying machine learning models are also customizable: users can select specific voices and adjust settings to better suit their projects. This is particularly useful for users who need specific vocal characteristics for their content, whether for a game character, an audiobook narration, or a virtual assistant. Inworld said it wants the AI-generated voices to sound more natural and engaging so that interacting with them feels more like talking to a real person.

Inworld AI envisions a wide range of use cases for its new voice generator. These include gaming, where dynamic and realistic voice interactions can enhance the immersive experience; audiobooks and narration, where expressive voices can bring stories to life; and voiceovers for various media productions. Other potential applications include educational tools and training programs, podcasts, AI assistants, chatbots, and other interactive experiences that benefit from high-quality, engaging voice interactions.

Inworld Voice is also notably cost-efficient. The startup offers the first 100 text-to-speech (TTS) requests per day free of charge via its TTS API. Additionally, for customers using the Inworld Engine, the voice generation pipeline is included at no extra cost. This pricing model is designed to make high-quality AI voices accessible to a broader range of users and industries.

Inworld is best known for its Character Engine, which is capable of constructing virtual beings out of text descriptions. Users can describe a character’s appearance, behavior, background, and more in great detail, and the generative AI engine will synthesize the virtual being described, drawing on general knowledge databases and any proprietary sources the user includes. The company emerged from stealth in 2021 and has raised around $120 million while becoming a popular option for creating virtual beings thanks to the Inworld Studio platform. The startup is a graduate of the Disney Accelerator program, where it demonstrated a prototype ‘Droid Maker’ for designing the look and personality of an interactive droid from the Star Wars universe. Inworld most recently partnered with Microsoft to provide Inworld’s tech to Xbox developers, enabling them to deploy generative AI-assisted content into their games.

Inworld AI is also developing new features for Inworld Voice. Upcoming additions include voice cloning, which would let users create custom voices based on specific samples, and multilingual support, which would broaden the tool’s applicability across different languages and regions. The company is also working to add more contextual awareness to its voice models to enable more emotionally connected interactions, which could make AI voices more responsive and nuanced.


