Synthedia 3 Generative AI Innovation Showcase: Here’s What You Missed
Synthedia 3 – The Generative AI Innovation Showcase hosted a lineup of amazing innovators in generative AI and synthetic media. They each brought something tangible to the table, as Voicebot.ai and Synthedia founder Bret Kinsella pointed out.
“What we saw at Synthedia 3 was real use cases and measurable value that generative AI solutions are delivering today for both enterprises and consumers. And I really liked that everyone showed a solution demo. It is amazing how quickly this industry is delivering impact for search, software development, media, advertising, games, healthcare, security, and other use cases.”
Check out some highlights from throughout the conference to whet your appetite for the videos, and be sure to subscribe to the Voicebot Weekly and Synthedia newsletters to get the heads up on our next event.
Synthedia 3
Perplexity AI CEO Aravind Srinivas led off the presentations with a look at his company’s success in building a generative AI-powered search engine. Perplexity is a startup, but its focus and innovation in presenting information and reducing inaccuracies and hallucinations have made it stand out against Google and OpenAI.
“Perplexity is an answer engine trying to revolutionize search from 10 million links to directly seeking answers… The difference from ChatGPT is that you actually can trust what it says because it’s going to cite all of the relevant links it pulls content from,” Srinivas said. “One company needs to obsess about this one thing. Bard has so many use cases, ChatGPT has so many other use cases, nobody is only exclusively focused on search and I think that’s why we’re doing so well.”
Pindrop VP of Product Amit Gupta came up next to discuss advancements in deepfake detection and how Pindrop is leading the way in developing techniques to prevent fraud, even as voice cloning with generative AI has become incredibly easy.
“The advancement in generative AI has not stopped, and what will happen is the technology will get better in terms of synthetic speech generation. And the real threat is that the generative AI technology gets to a point where fraudsters… can just use bots to create a widespread attack, and that’s the scenario that absolutely is imminent,” Gupta said. “At Pindrop, we are always trying to stay ahead of the next level of fraud patterns. We have been very successful in identifying the fraud that happens today in the contact centers. We don’t believe [whether] bad guys will collaborate or not; good guys need to collaborate to ensure that you can make the world a safer place.”
VoicesAI Product Lead Dylan Richardson presented the latest on his company’s work providing AI voice clones to the top global voiceover marketplace. He walked through the creation and licensing process, highlighting how the company makes sure the actors can control where and how their synthetic voice is deployed.
“We have a really unique opportunity having the largest database of professional voice actors, and we can really play into that and offer AI as a complement to our existing natural voice product offerings,” Richardson said. “Because there are real voice actors behind these voices, they’re not stock or amalgamation of other voices or large datasets. We really want to add to the industry that control for talent that will provide them a level of consent and security. That’s really, really important for us, and that’s kind of different in the industry. So, we’re really excited to be able to provide those types of controls.”
Soul Machines CEO Greg Cross joined the stage to show off the latest in his company’s work with animated digital humans. He expanded on how the rapid evolution of AI has transformed the way Soul Machines and their clients consider what’s crucial in designing not just the look but the behavior of the virtual characters.
“The world of avatars and digital characters often gets oversimplified and there are clearly two really, really important parts to the way in which you create and bring an avatar into this world. One is the actual creation process of building a CGI character, and the second part of it is actually, how do you bring them to life? How do you animate them? And that’s something that is incredibly important in terms of the way in which we will experience and interact with content,” Cross said. “We have the ability to generate new content out of existing content in a way that we never have before, and the way we experience that content, the way we personalize that content, the way we curate that content is actually becoming critical. Text boxes are not a particularly friendly type of experience, so how do we make it more human-like? How do we make it more interactive?”
Rembrand CEO Omar Tawakol continued the marketing theme with a demonstration of how his company uses generative AI to automate product placement in videos, making it possible to add or change the products seen after filming while making it appear like the object was there at the time.
“We launched something that [is] an equivalent of AdSense for product placement. It’s the ability for you to run a virtual campaign as a marketer and say I want to get this product in these types of influencers, set up your campaign, and literally flight the campaign. No negotiation, no shipping of products, no waiting long periods of time,” Tawakol said. “We’ve done something that’s very unique in the marketplace. We’ve essentially trained the deep networks to understand the laws of physics so that you can do something that’s incredibly photorealistic. We call it generative fusion because you’re fusing a real-world video with another 2D or 3D real-world asset.”
Veritone vice president of voice Rupal Patel and product manager Corey Hill tag-teamed the next presentation on how Veritone creates and uses synthetic voices to expand how and where content creators find audiences and how voice clones are employed at a larger scale for media and advertising needs.
“Creating synthetic content at scale, whether that’s video content, audio content, text content, what we’re trying to do is increase the reach for our customers and boost their accessibility,” Patel said. “Having voice content actually increases accessibility to the content that these media creators are building. We’re not just improving revenue from the revenue streams that they have now, but actually also increasing retention and finding new revenue models by using AI.”
“Talent actually go in and train their voice to create a clone, and then you can leverage that at scale. For our live sports events, we’re doing that in sub two seconds from when we receive the event… so it’s actually faster than broadcast from that perspective, and it really enables us to move very quickly in that space,” Hill said. “From a generative perspective, we’re developing an ecosystem around media and entertainment and a few key personas that we are looking to drive some additional expertise and capabilities to enable them to deliver more quickly and accelerate that content creation journey.”
Orbita chairman and president Bill Rogers next brought the discussion to generative AI’s place in healthcare. He reviewed generative AI’s capacity to extract meaning from enormous amounts of data, help patients find information, and help healthcare providers communicate with patients as they navigate medical processes.
“We’ve built a generative AI pipeline that’s used to actually direct care to the user as well as educate the patient… The advantage generative AI has is that it actually understands the meaning of what someone’s asking and then directs them. It finds the content that’s relevant to them and presents that to them,” Rogers said. “Healthcare is an industry that, unfortunately, there is so much workflow and there are not enough people to go and do the things that we need to do. You finally have something that can personalize the experience to every patient, which ultimately is going to bring better care to patients around the world.”
Play.ht co-founders Mahmoud Felfel and Hammad Syed joined the stage to discuss how generative AI has made voice clones so much better and easier to make and how their clients are employing the technology. Many of those clients first discovered a demo made by Play.ht in which synthetically generated versions of Joe Rogan and Steve Jobs chatted in an imaginary episode of Rogan’s podcast.
“It went very viral with millions of views across different channels, but we were very surprised by the reaction to that podcast, honestly. People didn’t believe that it is real, that the quality was really from digital voices. We started to see a lot of users coming, and instead of using this model to generate content for traditional text-to-speech use cases for speech synthesis or digital voices, we started to see people using it to replace human voices, to clone their own voice and use it to create a YouTube channel or narrate their own audiobook or for marketing commercial videos,” Felfel said. “So many of these use cases have a very high bar on the expectations when you compare it to a human voice versus comparing it to a TTS system… With that high expectation, we started working on the next version of the model.”
“We offer stock voices, which a lot of users are looking for, ready-to-use voices. Then we also have voice cloning, which is the most sought-after feature,” Syed said. “We allow users to clone voices using two different ways. One is the instant voice cloning, which just requires a 30-second audio clip. This is what our new model is capable of and it’s focused around capturing accents and really making it expressive. And then for a more professional voice cloning, we have high fidelity, which needs a little bit more audio–somewhere like 10 or 20 minutes of audio–and which really gives a result that’s more robust and can speak across various emotions and accents.”
WillowTree president Tobias Dengel next shared how generative AI is being discussed in the enterprise space and among larger companies unsure how to deploy the technology. That’s led to a spike in demand for experts not just in generative AI but in how businesses can best leverage the technology.
“One of the challenges in this AI ecosystem is how do you make it all real? That’s the question that WillowTree and other consultancies like us really solve for the marketplace. There are many point solutions out there; there are many products out there; there are a lot of ideas out there; how do we make it real for the enterprises, especially for larger companies trying to figure out how to navigate this rapidly changing ecosystem? Where should they invest, and how do they tie it all together?” Dengel said. “That’s the role that I think an emerging class of consultancies and agencies play. How to navigate that and how to actually put the pipes into place to tie all this together to realize the value of all that AI can be.”
For the last event of Synthedia 3, Innovation Analyst Jeremiah Owyang and NVIDIA LLM data lead Jane Scowcroft joined Dengel on-stage with me to look at the bigger picture of generative AI, its recent rapid ascent in the tech space, and where things go from here.
“In deploying AI, I agree with finding the right model for the right task. We’re going to see not just one size fits all, not just one large language model to rule them all. We’re going to see different applications, different personalization, and ideally those models talking to each other so I think there’s going to be a progression from a few big players to more decentralization of models, more customization, on-the-fly customization or compute customization,” Scowcroft said. “I think one of the interesting elements that we’ll start to see is around authenticity of content. How do we have some sort of watermark on that content? How do we evaluate where that content has come from? Especially when we’re looking at models talking to each other.”
“I think in a year from now, we should see a significant shift away from Google Search. We saw Perplexity today at Synthedia and that is a Google killer [and] then you see the other conversational LLMs like Pi [from] Inflection, and potentially [Anthropic’s] Claude, where you get information all packaged nicely,” Owyang said. “Google search is pretty horrible with all the sponsored ads and then you don’t even know if it’s the right site you’re clicking to [visit] and then you get the ads following you around, so I think consumers will wake up in a year and realize that there’s a better way to get information and that’s using AI.”
“My conclusion after seeing these demos and speaking with innovators at the event is that generative AI adoption is likely to accelerate in the near term,” Kinsella said after the event. “There will be an inevitable pullback at some point, but the impact of these solutions is so strong, and the appetite by companies for the benefits is so acute that activity will increase. We are emerging from six months of hype and wonder and are now entering a year of implementations.”