Google is enhancing its visual search tool, Google Lens, by introducing the ability to answer questions about videos in near-real time. This new feature allows users to capture a video of their surroundings and ask questions about objects of interest within the video.
Available for both Android and iOS users who have the Google app installed, this update enables Lens to analyze videos and provide relevant information. According to Lou Wang, Google’s Director of Product Management for Lens, the feature relies on a specialized version of the Gemini AI model, which helps interpret the content of videos and respond to queries. The Gemini model is part of Google’s AI ecosystem and powers various products across the company.
For example, Wang explained that if someone is curious about a group of fish swimming in a circle, Lens could generate an overview explaining their behavior, along with additional resources to explore.
To use the video analysis feature, users must sign up for Google’s Search Labs program and opt into the experimental features under “AI Overviews and more.” Holding down the shutter button in the Google app activates Lens’ video-capturing mode. As users ask questions, Lens pulls information from AI Overviews, which summarizes relevant data from across the web.
Wang noted that the AI determines which video frames are most relevant to the question being asked, which lets Lens give a more focused and accurate response. Google developed the feature after observing how people already use Lens, and it hopes the feature will encourage more inquisitive, natural use of the app.
This video feature for Lens is similar to one Meta recently previewed for its Ray-Ban Meta smart glasses, which aims to offer real-time AI video assistance. OpenAI has also hinted at plans to bring video understanding to Advanced Voice Mode, a premium feature in ChatGPT.
Google’s video feature is asynchronous, meaning it cannot yet respond in real time, but it still marks a significant step forward in AI-powered video analysis.
In addition to video analysis, Lens has gained new e-commerce features. When Lens identifies a product in a photo, it displays relevant details such as price, deals, reviews, and availability. This functionality is currently limited to specific regions and to certain categories, such as electronics and beauty products. Because shopping searches make up a large portion of Lens usage, Google sees this as an opportunity to integrate advertising into the results.