Clubhouse Social Audio Network: $1B Valuation, 3M Users, and the Intersection with Voice AI
Clubhouse caused some surprise in the tech community last year when it secured a $100 million valuation on a $10 million investment for a beta-stage mobile app with 1,500 users and no website. Last week, The Information reported (N.B. paywall) that Andreessen Horowitz was poised to lead a new funding round valuing the company at $1 billion just eight months later. According to Axios, the Series B round is a $100 million cash infusion.
There are no firm metrics on users, but The Information says there were two million by mid-January 2021 “according to two people familiar with the company’s metrics.” After expanding into Japan, Turkey, and Greece over the past week, on top of rapid growth in Europe, that figure is surely approaching three million. Vajresh Balaji created the image above with a forecast that suggests Clubhouse will hit that milestone around mid-February.
Another group that has flocked to Clubhouse of late is the voice AI community. That dynamic has led to some interesting conversations and energy around how the voice AI industry might be impacted by this viral social media trend.
The Business of Social Audio
Tech analyst Jeremiah Owyang published a lengthy write-up about Clubhouse on his blog this weekend titled “The Future of Social Audio: Startups, Roadmap, Business Models, and a Forecast.” It is not a how-to piece, as you might surmise from the title. Instead, Owyang offers a broad overview of Social Audio as a segment and ideas on how it is likely to evolve. He includes six new product categories he expects to emerge:
- Social Audio Analytics
- Social Audio Management Systems
- Social Audio App Partners
- Social Audio Enterprise Software
- Social Audio Services and Advisors
- Marketplace for Voice and Conversational Talent
The Social Audio Analytics space is one where the voice AI community is likely to have a direct impact. Owyang mentions speech-to-text, NLU, and sentiment analysis in this category and these are all core technology components of voice AI. During a more than three-hour Clubhouse conversation this morning related to this article, other related topics arose organically as well. Owyang told me over Twitter:
“The new idea was voice authentication which I heard from Christine Hwang in my room.”
He also mentioned that the NLP discussion has motivated him to learn more about that segment. Voice authentication has some obvious applications for audio chat rooms where you see nothing more than an avatar and hear someone’s voice. It could be a secondary safeguard against deepfakes. The primary safeguard is Clubhouse’s current practice of requesting people’s real names and connecting them to Twitter, Instagram, or both. That’s not perfect, but along with social graph connections it is a good start.
The NLP angle is even more interesting. During a Clubhouse Town Hall with company founders Paul Davison and Rohan Seth today, there was mention of a deaf user who was participating in audio chat rooms by starting a Google Meet and holding the phone up to the mic with the automated transcription feature turned on. This is a feature that Google added just last week using technology from Otter.ai.
However, transcription is just the tip of the iceberg. NLP could also be used to drive sentiment analysis of conversations about ideas, brands, musicians, celebrities, events, and so forth. This is being done today by “listening” to social posts, but in those cases it is actually an analysis of text-based content or transcriptions of video posts. As social audio content grows, it will become a new channel for assessing sentiment from spoken conversations, one that offers richer data even if its reach is smaller.
What Voice AI Community Members are Saying
I’m very interested in the authentication in Clubhouse and the implications of that to prove someone’s identity when the whole interactions are through voice. I’m also very interested in the sentiment analysis about the context of the rooms for the audience…Also the fact that you can record with explicit permission of the speakers will change podcasting. I’ve been waiting a long time for some interactivity as a content producer when it comes to my audio content. This seems like it is a great option now to get real-time feedback and then take this content and share it out in other formats.
Clubhouse will create opportunities to integrate services like ASR and TTS that we in voice tech are so familiar with. The main application will be to enable people with hearing and speaking disabilities to participate in the discussion. If the platform decides to open up APIs it may go even further and allow an entire ecosystem of apps or plugins to flourish. Clubhouse is different than any other social network. It will build new habits as it is the first one where users actively schedule their participation in the calendar. With functionality of room scheduling I see a clear path for VoiceFirst experience. I’d love to be notified when my favourite friends start a room and when any scheduled room that I signed up for is about to start and then participate without even getting up from the couch.
One challenge for audio social platforms lies in moderation. As these platforms scale up, their social responsibility also grows: protecting their users from harassment. It will pose a technical challenge on speech-to-text and natural language understanding, especially when tackling an international audience and a variety of languages that are not English.
Voice has been looking for a social media play; it looks like social just finally discovered voice. The opportunities here are plentiful but not easy. All social networks need critical mass to go from a niche to something everyone is using – with that comes the burden of moderation, verification and safety. I think the voice first industry has a lot to offer the likes of Clubhouse in helping them scale a safe community that makes use of conversational AI for good.