
ERNIE Speed 8K
An inference-optimized model developed by Baidu. It is a lightweight variant based on the ERNIE 4.0 architecture, supports an 8K context window, runs inference 5 times faster than the base model, and cuts input costs by 80%.
Intelligence (Weak)
Speed (Relatively Fast)
Supported Input Modalities
No
Is Reasoning Model
8,192
Context Window
8,192
Maximum Output Tokens
2024-10-31
Knowledge Cutoff
Pricing
¥0.8 /M tokens
Input
¥3.2 /M tokens
Output
¥1.6 /M tokens
Blended Price
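The listed prices can be turned into a per-request cost estimate. The sketch below is illustrative only: the function names are hypothetical, and the 2:1 input-to-output weighting used for the blended price is an assumption, chosen because it reproduces the listed ¥1.6 figure. The page does not state how the blended price is calculated.

```python
# Listed prices for ERNIE Speed 8K, in CNY per million tokens (from this page).
INPUT_PRICE_PER_M = 0.8
OUTPUT_PRICE_PER_M = 3.2


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in CNY of one request at the listed prices."""
    cost = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return cost / 1_000_000


def blended_price(input_weight: float = 2.0, output_weight: float = 1.0) -> float:
    """Return the weighted average price per million tokens.

    ASSUMPTION: the 2:1 input:output weighting is not stated on the page;
    it is used here because it matches the listed blended price.
    """
    weighted = input_weight * INPUT_PRICE_PER_M + output_weight * OUTPUT_PRICE_PER_M
    return weighted / (input_weight + output_weight)


if __name__ == "__main__":
    print(f"Cost of 6,000 in / 2,000 out: ¥{request_cost(6000, 2000):.4f}")
    print(f"Blended price (2:1 weighting): ¥{blended_price():.2f} /M tokens")
```

At these prices, a request with 6,000 input tokens and 2,000 output tokens costs about ¥0.0112.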
Quick Comparison
ERNIE-4.5-Turbo-128K
¥0.56
ERNIE-4.5-Turbo
¥0.56
ERNIE-X1-Turbo-32K
¥0.28
Basic Parameters
ERNIE-Speed-8K Technical Parameters
Parameter Count
Not Announced
Context Length
8,192 tokens
Training Data Cutoff
2024-10-31
Open Source Category
Proprietary
Multimodal Support
Text Only
Throughput
0
Release Date
2025-05-01
Response Speed
180 tokens/s
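The context length and response speed above allow a rough feasibility check before sending a request. This is a minimal sketch with hypothetical helper names. It assumes the 8,192-token context is shared between prompt and output, which the page does not confirm, and it treats 180 tokens/s as a steady generation rate, ignoring network latency and time to first token.

```python
# Figures taken from the parameter list on this page.
CONTEXT_LENGTH_TOKENS = 8192
RESPONSE_SPEED_TOKENS_PER_S = 180


def fits_context(prompt_tokens: int, output_tokens: int) -> bool:
    """Check whether prompt plus output stays within the context length.

    ASSUMPTION: prompt and output share one 8,192-token budget.
    """
    return prompt_tokens + output_tokens <= CONTEXT_LENGTH_TOKENS


def generation_seconds(output_tokens: int) -> float:
    """Estimate generation time at the listed response speed.

    ASSUMPTION: constant throughput, no latency or queueing overhead.
    """
    return output_tokens / RESPONSE_SPEED_TOKENS_PER_S


if __name__ == "__main__":
    print(fits_context(6000, 2000))
    print(f"{generation_seconds(8192):.1f} s to generate 8,192 tokens")
```

Under these assumptions, generating a full 8,192 tokens would take roughly 45.5 seconds.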
Benchmark Scores
Below is the performance of ERNIE-Speed-8K on standard benchmarks, which evaluate the model's capabilities across different tasks and domains.
Intelligence Index
-
Overall indicator of the model's intelligence level
Coding Index
-
Indicator of AI model performance on coding tasks
Math Index
-
Indicator of the model's ability to solve mathematical problems and perform mathematical reasoning
MMLU Pro
-
Massive Multitask Language Understanding (Pro) - A harder version of MMLU that tests text-based knowledge and reasoning across many subjects with multiple-choice questions
GPQA
-
Graduate-Level Google-Proof Q&A - Testing graduate-level knowledge of biology, physics, and chemistry (Diamond subset)
HLE
-
Humanity's Last Exam - A benchmark of expert-level questions across a wide range of academic subjects
LiveCodeBench
-
Evaluation of the model's ability to write real-world code and solve competitive programming problems
SciCode
-
The model's capability in code generation for scientific computing or specific scientific domains
HumanEval
-
Score on HumanEval, a benchmark of hand-written Python programming problems checked by unit tests
Math 500 Score
-
Score on MATH-500, a 500-problem subset of the MATH benchmark of competition-style mathematics problems
AIME Score
-
An indicator measuring an AI model's ability to solve high-difficulty mathematical competition problems (specifically AIME level)
GPT 5 Mini
openai

¥1.8
Input tokens/million
¥14.4
Output tokens/million
400k
Context Length
GPT 5 Standard
openai

¥63
Input tokens/million
¥504
Output tokens/million
400k
Context Length
GPT 5 Nano
openai

¥0.36
Input tokens/million
¥2.88
Output tokens/million
400k
Context Length
GPT 5
openai

¥9
Input tokens/million
¥72
Output tokens/million
400k
Context Length
GLM 4.5
chatglm

¥0.43
Input tokens/million
¥1.01
Output tokens/million
131k
Context Length
Gemini 2.0 Flash Lite (Preview)
google

¥0.58
Input tokens/million
¥2.16
Output tokens/million
1M
Context Length
Gemini 1.0 Pro
google

¥3.6
Input tokens/million
¥10.8
Output tokens/million
33k
Context Length
Qwen2.5 Coder Instruct 32B
alibaba

¥0.65
Input tokens/million
¥0.65
Output tokens/million
131k
Context Length