Gemini 2.5 Flash Preview 04-17 - Model

Gemini 2.5 Flash Preview 04-17

BASE

Created June 16, 2025

$0.15 / million tokens

Gemini 2.5 Flash Preview is a lightweight large language model from Google, designed for fast responses and efficient processing. It offers the core functionality of the Gemini series while prioritizing speed and efficiency, which makes it well suited to applications that need quick responses. The Flash version significantly reduces latency while maintaining high-quality output, giving users a smoother interactive experience.

Technical Specifications

Provider: Google
Version: 2.5 Flash Preview 04-17
Release Date: April 17, 2025
Context Length: 128,000 tokens
Parameters: Approximately 200 billion parameters
Supported Languages: Over 80 languages, including English, Chinese, Japanese, Korean, German, French, etc.
Rate Limit: 1000 RPM
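
To stay under the 1000 RPM limit listed above, a client can throttle itself before sending requests. The following is a minimal sketch, not the provider's own mechanism: the class name SlidingWindowLimiter and its parameters are hypothetical, and it assumes the limit is enforced over a rolling 60-second window, which the listing does not specify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: at most `max_requests` per `window_seconds`.

    Hypothetical helper; the rolling-window behavior is an assumption.
    """

    def __init__(self, max_requests=1000, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self) -> float:
        """Return seconds to wait before the next request (0.0 if a slot is free)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The oldest recorded request must leave the window first.
        return self.window_seconds - (now - self._sent[0])

    def acquire(self) -> None:
        """Block until a slot is free, then record the request."""
        delay = self.wait_time()
        while delay > 0:
            time.sleep(delay)
            delay = self.wait_time()
        self._sent.append(self.clock())
```

Calling acquire() immediately before each API request keeps a single process within the limit. Several processes sharing one API key would need a shared counter instead.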

Pricing

Service Type	Input Price (USD)	Output Price (USD)	OAPToken Input	OAPToken Output
API Usage	$0.15 / million tokens	$0.60 / million tokens	150,000 / million tokens	600,000 / million tokens

*Usage through the UI is free; API pricing is calculated per million tokens.
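
The per-request cost follows directly from the table above. The sketch below is illustrative only: the function name estimate_cost is hypothetical, and the rate of 1,000,000 OAPToken per USD is an assumption inferred from the table (150,000 OAPToken for $0.15), not a documented figure.

```python
from decimal import Decimal

# Prices copied from the pricing table (USD per million tokens).
INPUT_USD_PER_MILLION = Decimal("0.15")
OUTPUT_USD_PER_MILLION = Decimal("0.60")
# Assumption: inferred from the table, 150,000 OAPToken corresponds to $0.15.
OAPTOKEN_PER_USD = Decimal("1000000")
MILLION = Decimal("1000000")


def estimate_cost(input_tokens: int, output_tokens: int) -> dict:
    """Estimate the API cost of one request in USD and in OAPToken."""
    usd = (
        Decimal(input_tokens) * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    ) / MILLION
    return {"usd": usd, "oaptoken": usd * OAPTOKEN_PER_USD}


# A request with 20,000 input tokens and 1,000 output tokens:
# (20,000 * 0.15 + 1,000 * 0.60) / 1,000,000 = 0.0036 USD, or 3,600 OAPToken.
cost = estimate_cost(20_000, 1_000)
print(cost["usd"], cost["oaptoken"])
```

Decimal is used rather than float so that small per-request amounts add up without rounding drift when summed across many requests.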

Advantages and Features

Extremely low latency, with responses 5-10 times faster than standard models
Moderate context length, supporting inputs of up to 128K tokens
Optimized resource usage, reducing computational costs
Maintains most core capabilities, including multilingual support and basic multimodal functionality
Excellent cost-performance ratio, suitable for large-scale deployment and high-frequency usage scenarios

Best Use Cases

Large-scale processing tasks (such as processing multiple PDF documents)
Low-latency, high-traffic tasks
Agentic use cases
Complex problem reasoning
Native tool calling applications