Google Upgrades Conversational AI Intuition and Flexibility at Google I/O 2021 With LaMDA and MUM
This year’s virtual Google I/O covered a lot of ground, including new conversational AI techniques, though with little direct attention to Google Assistant. Google highlighted the Language Model for Dialogue Applications (LaMDA) and the Multitask Unified Model (MUM) as new tools for making conversations with an AI feel more like talking to another human, and for keeping the AI helpful even when people use casual words and phrasing.
LaMDA attempts to handle the complexity and uncertainty of open-ended human conversation, where the beginning of a chat may have nothing to do with where the talk ends. Trained on dialogue, LaMDA mimics a more natural way of conversing by looking at individual words as well as whole sentences and paragraphs, working out their relationships, and grasping the bigger picture to predict what will be said next and what its response should be. That way, it can respond in a way that makes sense in terms of the whole conversation, not just the last phrase uttered.
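LaMDA itself is not publicly available, but the difference between conditioning on a whole conversation and conditioning on only the latest utterance can be illustrated with a minimal prompt-assembly sketch. Everything here — the transcript format, the speaker labels, the function name — is invented for illustration and is not Google’s actual method:

```python
def build_prompt(turns, full_history=True):
    """Assemble the context string a dialogue model would condition on.

    With full_history=False, the model sees only the newest message, so the
    referent of "there" (Pluto) is lost -- the motivation for training on
    whole conversations rather than isolated utterances.
    """
    visible = turns if full_history else turns[-1:]
    lines = [f"{speaker}: {text}" for speaker, text in visible]
    lines.append("Model:")  # cue after which the model would generate its reply
    return "\n".join(lines)

chat = [
    ("User", "Tell me about Pluto."),
    ("Model", "Pluto is a dwarf planet in the Kuiper Belt."),
    ("User", "How cold does it get there?"),
]

full = build_prompt(chat)                      # keeps the Pluto context
last = build_prompt(chat, full_history=False)  # ambiguous: where is "there"?
```

With the full history, a model has the context to answer about Pluto; with only the last turn, the question is unanswerable — the gap whole-conversation training aims to close.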
“LaMDA is a language model for dialogue applications. It’s open domain, which means it is designed to converse on any topic,” Alphabet and Google CEO Sundar Pichai explained during his keynote. “For example, LaMDA understands quite a bit about the planet Pluto. So if a student wanted to discover more about space, they could ask about Pluto and the model would give sensible responses, making learning even more fun and engaging. If that student then wanted to switch over to a different topic — say, how to make a good paper airplane — LaMDA could continue the conversation without any retraining.”
LaMDA was born out of Meena, the chatbot Google claimed was the most advanced ever built when it debuted in early 2020. Meena relied on 2.6 billion parameters, training on more than 40 billion words (341 gigabytes of online text) over 30 days. Google even devised a new metric, the Sensibleness and Specificity Average, to show how much Meena talks like a person. But the question is whether LaMDA can be sensible on every topic in open-ended conversations. Former Google researcher Ryan McDonald, now chief scientist at AI developer ASAPP, told Voicebot that LaMDA might be too general to make users happy in the long run.
“LaMDA and Meena are powerful models that encapsulate a huge amount of world knowledge, which makes them impressive to use. However, out of the box, LaMDA will likely encounter problems in domain-specific settings. Getting to the point where LaMDA could be quite powerful (in potentially future enterprise offerings) will require many of the utterances to be based not on general domain knowledge, but specific to a user, offering, etc. It will require conditioning on more than just the conversation itself,” McDonald said. “It poses interesting challenges — such as how to retrieve and represent the out-of-conversation context needed to predict what will be said.”
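McDonald’s point about conditioning on more than the conversation itself can be sketched as a retrieve-and-prepend step: pull a relevant out-of-conversation fact (a user profile, an offering detail) and place it in front of the dialogue before the model generates. The keyword-overlap retrieval and every name below are illustrative assumptions, not ASAPP’s or Google’s technique:

```python
def retrieve(store, query, k=1):
    """Toy retrieval: rank stored facts by word overlap with the query."""
    q = set(query.lower().split())
    return sorted(store,
                  key=lambda fact: len(q & set(fact.lower().split())),
                  reverse=True)[:k]

def contextualize(conversation, user_facts):
    """Prepend retrieved out-of-conversation context to the dialogue prompt."""
    facts = retrieve(user_facts, conversation[-1], k=1)
    header = "\n".join(f"[context] {f}" for f in facts)
    dialogue = "\n".join(conversation)
    return f"{header}\n{dialogue}\nModel:"

facts = [
    "the customer's subscription plan is Enterprise",
    "the customer's preferred language is German",
]
prompt = contextualize(["User: which plan am I on?"], facts)
```

A general model prompted with only the dialogue cannot answer the plan question; the retrieval step supplies the user-specific grounding McDonald describes, and the open challenge he raises is doing this retrieval well at scale.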
MUM More Than Words
Perhaps partly because of LaMDA’s limitations, the Multitask Unified Model takes another approach to the complexities of conversational AI. MUM is designed to understand complex, multi-part requests better than any existing algorithm, according to Google. That multitasking power comes partly from MUM’s training across 75 languages and multiple tasks and, crucially, on images as well as words, with audio and video input to follow in the near future. In the animated demo Google shared, working out an answer would normally take more than a couple of searches, but MUM, currently in pilot testing, is designed to understand and respond to that complex, partly visual question as well as a human would.
“LaMDA is a huge step forward in natural conversation, but it’s still only trained on text,” Pichai said. “When people communicate with each other they do it across images, text, audio and video. So we need to build multimodal models (MUM) to allow people to naturally ask questions across different types of information. With MUM you could one day plan a road trip by asking Google to ‘find a route with beautiful mountain views.’ This is one example of how we’re making progress towards more natural and intuitive ways of interacting with Search.”