Gemini, Google’s recently announced flagship GenAI model family, is now available to Google Cloud customers through Vertex AI. This follows its debut in Bard and on the Pixel 8 Pro last week.
Through the new Gemini Pro API, Vertex AI, Google’s fully managed AI development platform, now offers Gemini Pro in public preview. Gemini Pro is a lighter-weight version of the more capable Gemini Ultra, which is already available in private preview for a “select set” of customers. The API is free to use “within limits” (more on what that means later). It supports 38 languages and regions, including Europe, and has features like filtering and chat.
Thomas Kurian, CEO of Google Cloud, said at a press briefing on Tuesday that “Gemini is a state-of-the-art natively multimodal model with sophisticated reasoning [and] advanced coding skills.” Developers will now be able to build their own applications on it.
Gemini Pro API
Like the generative text model APIs from Anthropic, AI21, and Cohere, the Gemini Pro API in Vertex takes text as input and produces text as output by default. A second endpoint released in preview today, Gemini Pro Vision, can process text and imagery, including photos and video, and produce text, much like OpenAI’s GPT-4 with Vision model.
One of the main complaints about Gemini that surfaced after its launch last Wednesday is that the version of Gemini powering Bard, a fine-tuned Gemini Pro model, cannot accept images, even though the model is technically “multimodal” (i.e., trained on a variety of data including text, images, videos, and audio). Gemini Pro Vision addresses this. Concerns remain over Gemini’s image analysis capabilities and performance, particularly in light of a misleading product demo. However, users can now at least test the model and its visual comprehension for themselves.
With Vertex AI, developers can tailor Gemini Pro to particular contexts and use cases using the same fine-tuning tools available for other Vertex-hosted models, such as Google’s PaLM 2. Gemini Pro can also be connected to external APIs to carry out specific actions, or “grounded” to improve the accuracy and relevance of the model’s responses. Grounding data can come from the web, from Google Search, or from third-party apps and databases.
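For readers who want to try the text-in, text-out flow described above, here is a minimal sketch using the Vertex AI Python SDK's preview `GenerativeModel` class. It assumes the `google-cloud-aiplatform` package is installed, that Google Cloud credentials and a project are already configured (for example via `vertexai.init`), and that the model is exposed under the name "gemini-pro". The `generate_text` helper and its optional `model` parameter are hypothetical names introduced here for illustration, not part of Google's API.

```python
def generate_text(prompt: str, model=None) -> str:
    """Send a text prompt to a model and return its text reply.

    `model` may be any object with a `generate_content(prompt)` method
    whose result has a `.text` attribute. If it is omitted, a Gemini Pro
    model is created through the Vertex AI SDK (assumed to be installed
    and already initialised with a project and credentials).
    """
    if model is None:
        # Imported lazily so the helper can be exercised without the SDK.
        from vertexai.preview.generative_models import GenerativeModel

        model = GenerativeModel("gemini-pro")  # assumed model name

    response = model.generate_content(prompt)
    return response.text
```

Accepting the model as a parameter is a design choice for this sketch: it lets the helper be tried with a stand-in object before any cloud project is set up.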
Another Vertex AI feature now supported for Gemini Pro is citation checking, which acts as an extra fact-checking step by emphasizing the sources of data that Gemini Pro consulted to formulate a response.
“We can use grounding to compare a Gemini-generated response with data from online sources or internal company systems,” Kurian explained. “[This] comparison enables you to raise the calibre of the model’s responses.”
Kurian devoted a fair amount of time to highlighting Gemini Pro’s control, moderation, and governance features, seemingly in response to reports suggesting that Gemini Pro is not the best model available. Will developers be persuaded by the assurances alone? Perhaps. In case they are not, Google is sweetening the pot with discounts.
On Vertex AI, Gemini Pro will cost $0.00025 per character for input and $0.00005 per character for output. (Vertex customers are billed per 1,000 characters and, for models such as Gemini Pro Vision, per image.) That is a 4x and 2x reduction, respectively, from the pricing of Gemini Pro’s predecessor. Additionally, Vertex AI customers can try Gemini Pro for free until early next year.
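To make the arithmetic concrete, here is a minimal sketch that takes the rates quoted above at face value as per-character prices. That reading, along with the function and constant names, is an assumption for illustration only. Google's published pricing, including the per-1,000-character billing unit, should be treated as authoritative.

```python
from decimal import Decimal

# Rates as quoted in the article, treated here as per-character prices.
# This interpretation is an assumption; check Google's pricing page.
INPUT_RATE_PER_CHAR = Decimal("0.00025")
OUTPUT_RATE_PER_CHAR = Decimal("0.00005")


def estimate_cost(input_chars: int, output_chars: int) -> Decimal:
    """Return an estimated charge in US dollars for one request."""
    if input_chars < 0 or output_chars < 0:
        raise ValueError("character counts must be non-negative")
    # Decimal avoids binary floating-point rounding on small currency amounts.
    return input_chars * INPUT_RATE_PER_CHAR + output_chars * OUTPUT_RATE_PER_CHAR


if __name__ == "__main__":
    print(estimate_cost(2000, 1000))
```

Under these assumptions, a request with 2,000 input characters and 1,000 output characters would come to $0.55.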
Kurian said, “Our goal is to attract developers with attractive pricing.”
Beefing Up Vertex
Google is adding more functionality to Vertex AI to entice developers away from competing platforms like Bedrock.
A few of these are related to Gemini Pro. Vertex customers will soon be able to use Gemini Pro to power custom-built conversational voice and chat agents, offering what Google refers to as “dynamic interactions… that support advanced reasoning.” Gemini Pro will also be able to drive Vertex AI’s search summarization, recommendation, and answer generation features, drawing on documents across modalities (such as PDFs and images) and sources (such as OneDrive and Salesforce) to respond to user queries.
According to Kurian, the conversational and search features enabled by Gemini Pro should be available “very early” in 2024.
Elsewhere in Vertex, there is now Automatic Side by Side (Auto SxS). A response to AWS’s recently announced Model Evaluation on Bedrock, Auto SxS lets developers evaluate models “on-demand” and “automatically.” According to Google, Auto SxS is faster and more cost-effective than manual model evaluation, though independent testing is needed to confirm this.
Google has also added models from third parties, such as Mistral and Meta, to Vertex. Additionally, the company has introduced “step-by-step” distillation, a technique that derives smaller, more specialized, low-latency models from larger ones. Furthermore, Google is expanding its indemnity policy to cover outputs from its Imagen models and PaLM 2, which means it will defend eligible customers who face legal action over intellectual property disputes involving the outputs of those models.
Corporate clients understandably worry about generative AI models’ propensity to regurgitate training data. Should it ever come to light that a vendor such as Google trained a model on copyrighted data without first obtaining the necessary permission, that vendor’s customers could be held liable for incorporating infringing material into their projects.
Some vendors rely on fair use as a defence. However, recognizing enterprises’ misgivings, many are extending their indemnity policies around GenAI products.
For now, Google is not extending Vertex AI’s indemnity coverage to customers using the Gemini Pro API. However, the company says that it will do so once the Gemini Pro API becomes generally available.