Google’s generative AI ‘Project Astra’ bot to take on GPT-4o
Rohit KVN

Google's Project Astra bot comes with multimodal capabilities.

Credit: Google

Just a day before the I/O 2024 event, OpenAI showcased its new GPT-4o model, which can understand queries in any combination of text, images and audio, and respond in those same modes.


All the GPT-4o demo videos were flawless: the chatbot understood complex queries and interacted with human-like voice and reasoning capabilities.

On Tuesday (May 14), Google showcased the latest versions of its Gemini large language models (Nano, Pro and Advanced). It also offered a sneak peek at Project Astra, its latest advancement in generative Artificial Intelligence (gen AI), which can match OpenAI’s new GPT-4o model.

Developed by the Google DeepMind team, Project Astra will be incorporated into the Gemini chatbot app on Android phones. It runs on advanced speech models that let it understand device owners, hold long conversations with them and perform multiple tasks in less time.

In a two-part demo, Google showed the prototype chatbot in action. It was quick to understand the environment inside the office, viewed through the phone’s camera. The user pointed the camera at a speaker and asked, "Tell me when you see something that makes sound." Gemini instantly answered, "I see a speaker that makes sound." The user then drew an arrow on the screen to highlight a component of the speaker and asked the chatbot what it was. It again answered correctly: "That is a tweeter. It produces high-frequency sounds."

The user also showed a drawing of two cats, placed a small cardboard box marked with a ‘?’ next to it and asked, "What does this remind you of?" The chatbot correctly replied, "Schrödinger’s cat."

It was also able to instantly read and understand hundreds of lines of code, and correctly identified that the section shown contained the "encryption and decryption functions" of an application.

Google said similar capabilities are being implemented in the Circle to Search feature. The user just has to point the camera at a problem in a book and invoke the Gemini AI bot by performing the ‘circle to search’ gesture on the screen; it will then offer a step-by-step guide to solving the problem in less time.

The new gen AI chatbot model under Project Astra looks very promising. Google said it will continue to improve the prototype so that it can understand more information in context, such as sights (video and images), sounds and spoken language.

The company has confirmed that it will bring the aforementioned features to Pixel phones in the coming months.

Get the latest news on new launches, gadget reviews, apps, cybersecurity, and more on personal technology only on DH Tech.