What are the key features of Google’s new Imagen 2 AI tool?
Major news from the Cloud Next conference in Las Vegas, featuring Google Imagen 2, Google Workspace add-ons, Vertex AI, and Google’s Gemini 1.5 Pro

Google Unveils Imagen 2 with Text-to-Live Images Feature, Raising Concerns

Google has announced the launch of Imagen 2, an enhanced image-generating tool, within its Vertex AI developer platform. This comes after Google’s previous image generator, built into its AI-powered chatbot Gemini, faced controversy for injecting gender and racial diversity into prompts, resulting in offensive inaccuracies. The most significant addition to Imagen 2 is the “text-to-live images” feature, which can create short, four-second videos from text prompts. Google is positioning this feature as a tool for marketers and creatives, such as generating GIFs for ads. However, the current resolution of these live images is low, at 360 pixels by 640 pixels, with Google promising improvements in the future. Despite Google’s emphasis on safety filters and bias mitigations, questions remain about the competitiveness of live images compared to other video generation tools in the market. Additionally, Google has not provided detailed information about the training data used for Imagen 2, raising concerns about potential IP-related lawsuits and the lack of an opt-out tool or compensation for creators whose work may have been used in the model training process.

Google Introduces $10 AI Add-Ons for Workspace, Following Microsoft’s Lead

In a move to monetize AI, Google has announced two new add-on packages for its Google Workspace productivity suite, each priced at $10 per user per month. This follows Microsoft’s decision last year to add $30 per user per month to the price of an Office 365 subscription for its Copilot feature. The first add-on, AI meetings and messaging, takes notes, provides meeting summaries, and translates content. Aparna Pappu, VP & GM at Google Workspace, highlighted the addition of 52 new languages, including Filipino and Korean, bringing the total number of supported languages to 69. The second add-on, AI security, helps admins keep Google Workspace content more secure by classifying and protecting files with sensitive characteristics, protecting private information, and applying data loss prevention controls tailored to individual organizations’ requirements. While the $10 per user cost may seem steep, it aligns with the pricing of similar features from third-party services. Google allows customers to mix and match license types, applying the advanced features only where they would be most useful. The two add-ons are now available to Workspace subscribers.
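To make the mix-and-match pricing concrete, here is a minimal sketch of the arithmetic. It assumes each add-on is billed at a flat $10 per licensed user per month, as reported above; the function name and the 200-person organization are hypothetical, and real invoices may include discounts or terms not covered in this article.

```python
ADDON_PRICE_USD = 10  # per user per month, per add-on (figure from the article)


def monthly_addon_cost(meetings_users: int, security_users: int) -> int:
    """Monthly cost in USD when each add-on is licensed only for the users who need it."""
    if meetings_users < 0 or security_users < 0:
        raise ValueError("user counts must be non-negative")
    return ADDON_PRICE_USD * (meetings_users + security_users)


# Hypothetical 200-person organization.
# Both add-ons for everyone:
print(monthly_addon_cost(200, 200))  # 4000
# Meetings add-on for 50 people, security add-on for 10 admins:
print(monthly_addon_cost(50, 10))    # 600
```

Under these assumptions, targeted licensing drops the monthly bill from $4,000 to $600, which is the practical appeal of mixing license types.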

Google Cloud Introduces Vertex AI Agent Builder for Simplified AI Agent Creation

In another notable launch, Google has unveiled Vertex AI Agent Builder, a new tool designed to simplify the creation of AI agents. These agents, unlike traditional chatbots, can take actions based on conversations and interact with back-end transactional systems to automate processes. Google Cloud CEO Thomas Kurian emphasized the ease and speed with which users can build and deploy production-ready, generative AI-powered conversational agents using Vertex AI Agent Builder. One key feature of the tool is “grounding,” which ties answers to reliable sources such as Google Search or enterprise data sources. The new capabilities are already available and support multiple languages, with country-based API endpoints in the U.S. and EU. As the AI agent craze continues to grow, Google Cloud aims to position itself as a leader in simplifying the creation of these powerful tools for businesses.

Google’s Gemini 1.5 Pro Enters Public Preview on Vertex AI with Impressive Context Window

Gemini 1.5 Pro, Google’s most capable generative AI model, is now available in public preview on Vertex AI, the company’s enterprise-focused AI development platform. Launched in February, Gemini 1.5 Pro boasts an impressive context window that ranges from 128,000 to 1 million tokens; the upper limit is equivalent to around 700,000 words or 30,000 lines of code. This is significantly larger than the context windows of competitors like Anthropic’s Claude 3 and OpenAI’s GPT-4 Turbo. The model’s capabilities extend to analyzing code libraries, reasoning across lengthy documents, and sustaining long chatbot conversations. Being multilingual and multimodal, Gemini 1.5 Pro can understand images, videos, and now audio streams, enabling it to analyze and compare content across various media formats and languages. Google acknowledges that processing a million tokens takes time, with searches in demos taking between 20 seconds and a minute to complete. However, the company is working on optimizing the model to improve latency.
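As a rough illustration of what these context-window figures mean in practice, the sketch below estimates whether a document of a given word count would fit. It assumes the ratio implied by the article (1 million tokens for about 700,000 words, or roughly 10 tokens per 7 words); the function names are hypothetical, and actual token counts depend on the model’s tokenizer, the language, and the content.

```python
# Ratio implied by the article: 1,000,000 tokens ~ 700,000 words,
# i.e. about 10 tokens for every 7 words. This is an approximation.
TOKENS_PER_7_WORDS = 10

SMALL_WINDOW_TOKENS = 128_000    # lower context-window figure from the article
LARGE_WINDOW_TOKENS = 1_000_000  # upper context-window figure from the article


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count for a word count, rounding up."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-word_count * TOKENS_PER_7_WORDS // 7)


def fits_in_window(word_count: int, window_tokens: int = LARGE_WINDOW_TOKENS) -> bool:
    """Return True if the estimated token count fits in the given window."""
    return estimate_tokens(word_count) <= window_tokens


# A hypothetical 90,000-word manuscript:
print(estimate_tokens(90_000))                      # 128572
print(fits_in_window(90_000, SMALL_WINDOW_TOKENS))  # False
print(fits_in_window(90_000))                       # True
```

By this estimate, the 128,000-token window holds roughly 89,600 words, so a novel-length text would only fit in the 1 million token window.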

Frequently asked questions

Google’s Imagen 2, integrated into Vertex AI, introduces a groundbreaking “text-to-live images” feature that can create 4-second videos from text prompts. The tool currently produces videos at 360×640 pixel resolution and includes enhanced safety filters and bias mitigations. It’s primarily designed for marketers and creatives to generate content like GIFs for advertisements, though Google plans to improve resolution capabilities in future updates.
Google’s new AI Add-Ons for Workspace cost $10 per user per month and come in two packages. The first package, AI meetings and messaging, provides note-taking, meeting summaries, and translation capabilities in 69 languages. The second package, AI security, helps administrators protect sensitive content and implement data loss prevention controls. Users can mix and match these add-ons based on their needs.
Gemini 1.5 Pro stands out with its exceptional context window, capable of processing up to 1 million tokens (approximately 700,000 words). This capacity significantly exceeds competitors like Claude 3 and GPT-4 Turbo. The model is multilingual and multimodal, handling text, images, videos, and audio streams. It’s particularly effective for analyzing large code libraries and conducting extended conversations, though processing times can range from 20 seconds to a minute.
Vertex AI Agent Builder is Google’s new tool for creating AI agents that can interact with back-end systems and automate processes based on conversations. It simplifies the development of production-ready, generative AI-powered conversational agents. The tool includes “grounding” capabilities that connect responses to reliable sources like Google Search or enterprise data, making it particularly valuable for businesses seeking to implement AI solutions quickly.
Google has implemented enhanced safety filters and bias mitigations in Imagen 2 following previous controversies with its image generation tools. However, concerns remain about training data transparency and the lack of an opt-out tool or compensation for creators whose work may have been used in training. Google hasn’t disclosed detailed information about the training data, which has raised concerns about potential IP-related lawsuits.
The AI meetings and messaging add-on supports 69 languages in total, with recent additions including Filipino and Korean. This represents an expansion of 52 new languages, making it a comprehensive solution for global businesses requiring multilingual communication and translation capabilities in their workspace environment.
Vertex AI serves as Google’s enterprise-focused AI development platform, integrating various AI tools including Gemini 1.5 Pro, Imagen 2, and the Agent Builder. It provides businesses with a unified platform for developing and deploying AI solutions, with features like country-based API endpoints in the U.S. and EU, ensuring compliance with regional requirements and data protection standards.
Gor Gasparyan

Optimizing digital experiences for growth-stage & enterprise brands through research-driven design, automation, and AI