Google, the tech giant, recently showcased its latest advancements in artificial intelligence (AI) at its annual developer conference, Google I/O. One of the most notable announcements was Project Astra, a universal AI agent that can understand and respond to users' queries in real time. The technology is designed to be multimodal, meaning it can process both text and visual information from the user's environment.
Demis Hassabis, the head of Google DeepMind and leader of the company's AI efforts, has been working on AI for decades. He envisions Project Astra as a universal assistant that can accomplish tasks on behalf of users and improve features such as trip planning. Some AI agents may act as simple tools for getting things done, while others will be more like collaborators and companions.
Google's Gemini 1.5 Flash model is designed to be faster for common tasks such as summarization and captioning, while the Gemini Nano model is said to be faster than ever for local use on devices like smartphones. The context window for Gemini 1.5 Pro has been doubled to 2 million tokens, increasing the amount of information it can take into account in a single query.
Project Astra can see the world, identify objects and locations, answer questions, and assist users with various tasks in near real time. It reflects Google's broader focus on developing AI agents that can act on behalf of users.
Google's Project Astra technology will come to the Gemini app later this year, allowing users to point their phone camera at nearby objects and ask relevant questions such as 'What neighborhood am I in?' or 'Did you see where I left my glasses?'. The company also announced Music AI Sandbox, a group of AI tools, developed with DeepMind and YouTube, that can help artists create music.
Google's rivals in the high-stakes AI competition include Microsoft, Meta, Amazon, OpenAI, Anthropic, and Perplexity. Google Search has already answered billions of queries with Gemini technology and is now incorporating multi-step reasoning into search. For example, it can find the best-rated yoga studios in Los Angeles, calculate the walking distance to each, and show the cost per class, all from a single search query.
Sundar Pichai, CEO of Google and its parent company Alphabet, announced that Google will roll out AI capabilities in its flagship search product to all U.S. users this week, making them more widely accessible.