
Google Gemini Embedding 2: Native multimodal embeddings for enterprise
Executive summary
Google has released Gemini Embedding 2 into public preview: a purpose-built embedding model that maps text, images, audio, video, and document content into a single vector space, intended to simplify enterprise retrieval and downstream multimodal workflows. The model is designed as a native multimodal embedder rather than a text-first retrofit: it consumes audio and video natively (without mandatory transcription) alongside images and documents, enabling direct cross-modal queries and avoiding errors introduced by separate transcription pipelines.
Technical profile
Gemini Embedding 2 emits a 3,072‑dimensional vector by default and implements a Matryoshka-style nested representation that lets teams truncate to smaller footprints (for example, 768 or 1,536 dimensions) to cut storage and index costs with modest accuracy trade-offs. Per-request ceilings are explicit: up to 8,192 text tokens, up to 6 images, up to 128 seconds of video, up to 80 seconds of audio, or a 6‑page PDF; larger assets require chunking and re-indexing. Google positions the model around cross-modal semantic alignment rather than stitched transcription flows, which reduces transcription-related retrieval errors and can lower end-to-end latency relative to pipelined implementations.
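The Matryoshka-style truncation described above can be sketched in a few lines: keep the leading components of the full vector and re-normalize so cosine similarity still behaves. The 3,072/1,536/768 dimensions come from the article; the helper itself is illustrative and not part of any Google SDK.

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` components of a Matryoshka-style embedding
    and re-normalize so cosine similarity remains meaningful."""
    truncated = vec[:dims]
    norm = np.linalg.norm(truncated)
    return truncated / norm if norm > 0 else truncated

# A stand-in 3,072-dim embedding (real vectors would come from the API).
rng = np.random.default_rng(0)
full = truncate_embedding(rng.standard_normal(3072), 3072)  # unit-norm

for dims in (768, 1536):
    small = truncate_embedding(full, dims)
    print(dims, small.shape, round(float(np.linalg.norm(small)), 6))
```

Because the leading dimensions carry the coarsest semantics in a Matryoshka layout, the 768-dim prefix is usable for first-pass retrieval, with the full 3,072-dim vector reserved for re-ranking.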
Deployment, pricing and access
The preview includes a free, rate-limited experimental tier (roughly 60 requests per minute) plus paid usage tiers: approximately $0.25 per 1M tokens for most inputs and $0.50 per 1M tokens for native audio. Distribution is being handled through Google’s public APIs and Vertex AI channels; however, Google is staging feature exposure — consumer and enterprise access paths differ (AI Pro and Gemini Alpha programs) and some enterprise features require tenant admin opt‑ins. Separately reported product updates across the Gemini family indicate Google is keeping preview rate cards broadly consistent while gating the deepest capabilities behind subscription and staged enterprise programs.
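A quick back-of-the-envelope helper makes the rate card above concrete. The per-million-token prices mirror the reported preview figures; the corpus sizes in the example are hypothetical, and real billing may count tokens differently per modality.

```python
# Reported preview rate card: ~$0.25 per 1M tokens for most inputs,
# ~$0.50 per 1M tokens for native audio.
PRICE_PER_M = {"text": 0.25, "image": 0.25, "video": 0.25, "audio": 0.50}

def embedding_cost(token_counts: dict[str, int]) -> float:
    """token_counts maps modality -> billed tokens; returns estimated USD."""
    return sum(PRICE_PER_M[m] * n / 1_000_000 for m, n in token_counts.items())

# Hypothetical monthly corpus: 400M text tokens plus 50M audio tokens.
print(f"${embedding_cost({'text': 400_000_000, 'audio': 50_000_000}):.2f}")  # → $125.00
```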
Ecosystem and integrations
Google has integrated the model into developer tooling and enterprise gateways, and early integrations with LangChain, LlamaIndex, Weaviate, Qdrant and others reduce adoption friction. Embedding-native search can be combined with Workspace and Drive integrations in the broader Gemini stack to let apps pull contextual signals from Drive, Gmail and Chat, turning passive file stores into active retrieval-and-synthesis layers. Gemini’s multimodal and reasoning improvements (reported in companion updates such as Gemini 3.1 Pro) further increase the utility of embeddings for building synthesis-centric workflows, lowering token usage and iteration costs in some partner pilots.
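The core of cross-modal search in a shared vector space is just nearest-neighbor lookup by cosine similarity: a text query can rank audio, video, and PDF assets directly because all were embedded into the same space. The sketch below uses toy random vectors in place of real API embeddings; file names and dimensions are illustrative.

```python
import numpy as np

def top_k(query: np.ndarray, index: dict[str, np.ndarray], k: int = 3) -> list[str]:
    """Rank indexed items (any modality) by cosine similarity to a query
    embedded in the same shared vector space."""
    def cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scored = sorted(((cos(query, v), name) for name, v in index.items()), reverse=True)
    return [name for _, name in scored[:k]]

# Toy vectors standing in for real multimodal embeddings.
rng = np.random.default_rng(1)
index = {name: rng.standard_normal(768) for name in
         ["meeting_recording.wav", "q3_deck.pdf", "demo_clip.mp4"]}
query = index["q3_deck.pdf"] + 0.1 * rng.standard_normal(768)  # near-duplicate query
print(top_k(query, index, k=1))  # → ['q3_deck.pdf']
```

In production the brute-force scan would be replaced by an approximate-nearest-neighbor index (e.g., in Weaviate or Qdrant), but the ranking principle is the same.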
Enterprise implications and governance
Early adopters report practical gains — for example, some customers cite latency reductions of up to 70%, and Everlaw reports a recall lift near 20% for legal discovery — but these gains come with migration costs: corpora must be re‑embedded and assets chunked to fit per-request limits. Google emphasizes enterprise controls (data isolation, tenant-admin opt-ins) and multimedia provenance tools such as SynthID watermarking to address compliance. That said, public reporting also highlights rapid, large-scale pilots (a reported unclassified Department of Defense pilot showing ~1.2M distinct users issuing ~40M prompts) that underline governance gaps when adoption outpaces accredited training and policy processes.
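Fitting assets to the per-request ceilings is mostly a chunk-planning problem. The limits below mirror the figures reported earlier (8,192 text tokens, 128 s of video, 80 s of audio, 6 PDF pages); the planner itself is an illustrative sketch, not a Google API, and the 8-second overlap is an arbitrary choice to preserve context across boundaries.

```python
# Per-request ceilings reported for the preview.
LIMITS = {"text_tokens": 8192, "video_seconds": 128, "audio_seconds": 80, "pdf_pages": 6}

def plan_chunks(size: float, kind: str, overlap: float = 0.0) -> list[tuple[float, float]]:
    """Split an asset of `size` units into (start, end) spans that each
    fit the per-request limit, with optional overlap between spans."""
    limit = LIMITS[kind]
    step = limit - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the limit")
    spans, start = [], 0.0
    while start < size:
        spans.append((start, min(start + limit, size)))
        start += step
    return spans

# A 300-second video with 8 seconds of overlap between chunks.
print(plan_chunks(300.0, "video_seconds", overlap=8))
```

Each span would be embedded as its own request, which is the one-time re-indexing cost the migration discussion above refers to.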
Bottom line
For architecture and procurement teams, the trade-offs are clear: the model can materially simplify pipelines and enable richer cross-modal search, but realizing those benefits requires a one-time reindexing cost, careful truncation and storage planning, and attention to governance and portability. The wider Gemini product updates — reasoning improvements, Workspace integrations, and staged commercial gating — mean the embedding model is being launched as part of a broader platform strategy rather than as an isolated model update.
Recommended for you

Google Gemini Tightens Grip on Workspace Productivity
Google expanded Gemini deeply into Workspace, enabling cross-file document, spreadsheet and slide generation from single prompts while gating premium access behind AI Pro subscriptions and early enterprise access through Gemini Alpha. The update pairs productized reasoning advances (Gemini 3.x/Deep Think tuning) with a measured 9x Sheets speed claim, a Department of Defense pilot scale signal, and admin controls — creating immediate productivity upside but sharper platform‑capture and procurement trade-offs for IT and security teams.

Google’s Gemini 3.1 Pro surges ahead with large reasoning improvements and research-focused tooling
Google released Gemini 3.1 Pro, a refined flagship tuned for deeper multi-step reasoning and research workflows, posting major benchmark gains while keeping API pricing unchanged. The update emphasizes interoperability with scientific toolchains and positions the model as an augmenting collaborator — useful for hypothesis generation and experiment planning but still requiring expert oversight for validation.

Google trials Gemini tool to import rival AI chat histories (United States)
Google is experimenting with a Gemini function that would let users upload conversation archives from other chatbots so they can continue projects and preserve personalised context. If launched, the capability would lower switching friction, raise technical and privacy questions about memory mapping, and potentially accelerate user migration toward Gemini.

DeepSeek Signals Ambition to Compete with Google with a Multimodal, Multilingual AI Search
Recent job listings indicate DeepSeek is building an AI search product that can handle text, images and audio while supporting multiple languages. The postings also emphasize engineering work on evaluation, training data and scalable infrastructure—signals that the company aims for a reliable, production-grade search and agent platform rather than a research demo.
Google warns of large-scale prompting campaign to clone Gemini
Google disclosed that actors prompted its Gemini model at scale to harvest outputs for use in building cheaper imitations, with at least one campaign issuing over 100,000 queries. The company frames the activity as theft of proprietary capabilities and signals a rising threat vector for LLM operators, with technical and legal consequences ahead.

Google deploys Gemini agents across Pentagon unclassified networks
Google has provisioned Gemini-based agents to the Department of Defense’s unclassified networks to automate administrative and analytic workstreams, producing rapid uptake and exposing a large training shortfall. Parallel procurement tensions — including a supply‑chain designation affecting Anthropic, competing vendor negotiations for classified use, and uneven public accounts of which firms won restricted approvals — mean the move accelerates productivity while raising immediate governance, supply‑chain and legal hazards.

Microsoft Phi-4-Reasoning-Vision-15B: Efficiency-First Multimodal Play
Microsoft released Phi-4-Reasoning-Vision-15B, a 15B-parameter multimodal model trained on ~200B tokens and designed for low-latency, low-cost inference in perception and reasoning tasks. Unlike recent sparse, very-large-parameter efforts that rely on conditional activation and heavy memory footprints, Phi-4 emphasizes a compact, deterministic serving profile and published artifacts to ease enterprise verification and on‑premise or edge adoption.

Google: Public GCP API Keys Became Gemini Credentials, Exposing Data
Truffle Security found that publicly posted Google Cloud API keys were suddenly accepted by the Gemini (Generative Language) API, enabling outsiders to read uploaded files and conversation context and to consume project quota. Beyond data disclosure and unexpected billing, these leaked keys could also be used to mass-query Gemini and harvest model outputs for commercial cloning efforts, compounding IP and competitive risk.