Google IO 2024

Google’s ‘Gemini Era’: The Top Generative AI Announcements at Google I/O 2024

Google I/O 2024 managed to center generative AI even more than the previous year, with a long list of new models and features across Google’s product line. We are summarizing and updating the top highlights from the multi-hour event below and will add links to more detailed coverage of some of them.

“Google is fully in our Gemini era,” Google CEO Sundar Pichai said during his keynote address. “Still, we are in the early days of the AI platform shift. We see so much opportunity ahead, for creators, for developers, for startups, for everyone. Helping to drive those opportunities is what our Gemini era is all about.”

Model Mobs: Gemini and Gemma

Gemini was the star of the show (no pun intended), with several notable updates announced, including that Gemini 1.5 Pro has doubled its context window to two million tokens. This allows the AI to process and analyze longer documents, larger codebases, and multimedia files more effectively. Google also previewed Gemini Live, a voice interface for Gemini. Users can hold real-time, in-depth voice conversations with Gemini on their smartphones, interrupt the AI to ask questions, and receive responses that adapt to their speech patterns. Gemini can also use a device’s camera to view and respond to what’s around the user, whether in still images or video.

Another major highlight was the introduction of Gemma 2, a more open LLM than Gemini. The 27-billion-parameter model will roll out in June. Nvidia has optimized Gemma 2 to run efficiently on its next-generation GPUs, and Google said the model will run on a single TPU host and on Google’s Vertex AI, suggesting significant improvements in speed and in the handling of complex AI tasks.

Service Stars

Gemini will be woven even more deeply into Google’s online services, including Gmail. The model will power features such as searching and summarizing emails and drafting responses. Gemini will also assist with more complex tasks, such as returning products ordered online, by automating the search for receipts and filling out return forms.

For developers, Google is integrating Gemini into the Google Maps Platform via the Places API. This will allow developers to incorporate AI-generated summaries of locations into their applications, automating rich content generation and reducing the need for manual input.

Google also previewed an Android feature that uses Gemini Nano to alert users to potential phone call scams. The AI listens for conversation patterns associated with scams, such as a caller fraudulently posing as a company representative, and alerts the user when it detects them.

Multimodal Media

Google unveiled Imagen 3, the latest iteration of its generative AI image creation model. Imagen 3 is supposed to be better at understanding text and grasping the intent behind image prompts. The company’s synthetic media options also expanded with the introduction of Veo, a new generative AI model that can create high-quality video clips from text prompts. Veo can produce a minute of 1080p video in a variety of styles and edit any generated video or real film uploaded to the platform.

Google also showed off a new AI-fueled Ask Photos feature for Google Photos coming out this summer. This feature will use Gemini AI for natural language search queries. Pichai demonstrated by asking his phone for his license plate number, and it pulled the correct image with the license plate from his photos.

  
