Today we are releasing two updated, production-ready Gemini models, Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, along with:
- >50% reduced price on 1.5 Pro (both input and output for prompts)
- 2x higher rate limits on 1.5 Flash and ~3x higher on 1.5 Pro
- 2x faster output and 3x lower latency
- Updated default filter settings
These new models build on our recent experimental model releases and include significant improvements over the Gemini 1.5 models released at Google I/O in May. Developers can access our latest models for free via Google AI Studio and the Gemini API. For larger organizations and Google Cloud customers, the models are also available on Vertex AI.
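As a concrete illustration of what calling the new models involves, the sketch below builds a request for the Gemini API's public v1beta REST `generateContent` endpoint. The payload shape follows the documented `contents`/`parts` structure; the prompt text and helper function are hypothetical, and you would send the resulting JSON with your own HTTP client and API key.

```python
import json

# Model name from this announcement; endpoint shape assumes the
# public v1beta REST API (verify against current Gemini API docs).
MODEL = "gemini-1.5-pro-002"
ENDPOINT = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"

def build_request(prompt: str) -> str:
    """Serialize a minimal generateContent request body."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return json.dumps(body)

# Hypothetical prompt, just to show the shape of the request.
payload = build_request("Summarize the attached document in three bullet points.")
print(ENDPOINT)
print(payload)
```

In practice you would POST `payload` to `ENDPOINT` with an `x-goog-api-key` header (or the official SDKs, which wrap this for you).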
Improved overall quality with greater gains in math, long context and vision
The Gemini 1.5 series is designed for strong general performance across a wide range of text, code, and multimodal tasks. For example, Gemini models can synthesize information from 1,000-page PDFs, answer questions about repositories containing more than 10,000 lines of code, generate useful content from hours of video, and much more.
With the latest updates, 1.5 Pro and Flash are now better, faster, and more cost-effective to build with in production. We see a ~7% increase on MMLU-Pro, a more challenging version of the popular MMLU benchmark. On MATH and HiddenMath (an internal holdout set of competition math problems), both models improved by approximately 20%. For vision and code use cases, both models also score better on evaluations measuring visual understanding and Python code generation (in the range of approximately 2-7%).
We've also improved the overall helpfulness of model responses while continuing to uphold our content safety policies and standards. This means fewer refusals and more helpful answers across many topics.
In response to developer feedback, both models now have a more concise default style designed to make them easier to use and to reduce costs. For use cases such as summarization, question answering, and extraction, the default output length of the updated models is approximately 5-20% shorter than that of previous models. For chat-based products where users may prefer longer responses by default, our prompting strategies guide explains how to make the models more verbose and conversational.
For more information about migrating to the latest versions of Gemini 1.5 Pro and 1.5 Flash, see the Gemini API models page.
Gemini 1.5 Pro
We continue to be blown away by the creative and useful applications of Gemini 1.5 Pro's 2-million-token context window and multimodal capabilities, from video understanding to processing 1,000-page PDFs, and there are still so many new use cases left to build. Today we are announcing a 64% price reduction on input tokens, a 52% price reduction on output tokens, and a 64% price reduction on incremental cached tokens for our most powerful 1.5 series model, Gemini 1.5 Pro, effective October 1, 2024, for prompts under 128,000 tokens. Combined with context caching, this further reduces the cost of building with Gemini.
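To see what the announced percentage cuts mean in practice, the arithmetic below applies them to a per-million-token price. The base prices here are hypothetical placeholders, not published Gemini pricing; only the 64% input and 52% output reductions come from this announcement.

```python
# Announced reductions for prompts under 128K tokens.
INPUT_CUT, OUTPUT_CUT = 0.64, 0.52

def discounted(price_per_million: float, cut: float) -> float:
    """Apply a fractional price reduction to a per-1M-token price."""
    return round(price_per_million * (1 - cut), 4)

old_input_price = 3.50    # hypothetical $/1M input tokens
old_output_price = 10.50  # hypothetical $/1M output tokens

new_input_price = discounted(old_input_price, INPUT_CUT)
new_output_price = discounted(old_output_price, OUTPUT_CUT)
print(new_input_price, new_output_price)  # 1.26 5.04
```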
Increased rate limits
To make it even easier for developers to build with Gemini, we're increasing the rate limits of the paid tiers to 2,000 RPM for 1.5 Flash and 1,000 RPM for 1.5 Pro (from 1,000 and 360, respectively). We expect to continue increasing the rate limits for the Gemini API in the coming weeks to allow developers to build more with Gemini.
2x faster output and 3x lower latency
In addition to key improvements to our latest models, in recent weeks we have reduced latency with 1.5 Flash and significantly increased output tokens per second to enable new use cases with our most powerful models.
Updated filter settings
Since Gemini's initial launch in December 2023, we have focused on building a safe and reliable model. With the latest versions of Gemini (the -002 models), we've improved the model's ability to follow user instructions while balancing safety. We will continue to offer a suite of safety filters that developers may apply to Google's models. For the models released today, the filters are not applied by default, so developers can determine the configuration best suited to their use case.
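Opting back into filters is a per-request configuration. The fragment below sketches it in the REST API's `safetySettings` shape; the category and threshold names follow the published enums as best we recall, so treat the exact values as assumptions to verify against the current Gemini API documentation.

```python
# Assumed enum names from the Gemini API safety settings docs.
safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH"},
]

request_body = {
    "contents": [{"parts": [{"text": "Hello"}]}],
    # Omit safetySettings entirely to keep filters off (the new default).
    "safetySettings": safety_settings,
}
```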
Gemini 1.5 Flash-8B Experimental Updates
We are releasing a further improved version of the Gemini 1.5 Flash-8B model announced in August, called "Gemini-1.5-Flash-8B-Exp-0924". This improved version delivers significant performance gains on both text and multimodal use cases. It is available now via Google AI Studio and the Gemini API.
The overwhelmingly positive feedback developers shared on 1.5 Flash-8B was incredible to see, and we will continue to shape our pipeline from experimental to production releases based on developer feedback.
We're excited about these updates and can't wait to see what you build with the new Gemini models! And for Gemini Advanced users, you will soon be able to access a chat-optimized version of Gemini 1.5 Pro-002.