D-ID Launches Web App Giving ChatGPT a Synthetic Human Face and Voice
Generative AI chatbot ChatGPT has a face and voice to go with its conversation thanks to synthetic media startup D-ID’s new chat.D-ID web app. D-ID’s text-to-video technology produces a realistic digital human appearance and converts ChatGPT responses into speech, opening up new possibilities for interacting with OpenAI’s popular conversational engine.
ChatGPT’s API operates as the brain of chat.D-ID, while D-ID’s streaming text-to-video tech provides the audio and visual components of a photorealistic synthetic human. Users can hold a face-to-face conversation with it by talking and listening instead of writing and reading responses. D-ID envisions the web app broadening access to ChatGPT for people who may not be able to see written text or who would prefer a more human-like approach to conversation.
Once you connect to the web app, D-ID’s synthetic host “Alice” is there to greet you and respond to queries typed into the text box or spoken after clicking the microphone icon. The app is still in beta with Alice as the single face and voice, but D-ID plans to start adding other digital characters as options. Eventually, the company will allow users to upload their own picture to serve as their personal ChatGPT face, though no celebrities or public figures will be allowed. The concept is similar to what first brought D-ID fame: transforming still photographs of people into videos in which they move and speak.
“Our tech unlocks a side of artificial intelligence that the world hasn’t seen before,” D-ID CEO Gil Perry said. “The switch from text interface to speaking face-to-face makes the experience more impactful, enjoyable, and engaging and helps people better understand the information it delivers. With chat.D-ID, conversations with AI will become accessible to a far wider audience, including children, the elderly, people with disabilities, and billions of people worldwide beyond the tech community.”
D-ID released the web app not long after introducing a new chat API for real-time streaming with generative AI tools. The API is part of the company’s Creative Reality Studio, which it launched last year as a way for customers to design their own video avatars based on uploaded photographs or on synthetic images produced by Stable Diffusion’s text-to-image engine. The avatar can perform a script written by the user or composed by OpenAI’s GPT-3 text generator. The chat API opens the door to real-time interactions using responses streamed from generative AI chatbots.