
Gemma 3 12b Pt

Developed by Google
Gemma is a family of lightweight, open multimodal models from Google, built on the same technology as Gemini. The models accept text and image inputs and generate text outputs.
Downloads 54.36k
Release Time: 3/1/2025

Model Overview

Gemma 3 is a multimodal model that processes text and image inputs and generates text outputs. It is suited to tasks such as Q&A, summarization, and reasoning. It has a 128K-token context window and supports over 140 languages.

Model Features

Multimodal processing capability
Processes text and image inputs together, enabling cross-modal understanding with text output
Large context window
Supports a context length of 128K tokens, suitable for handling long documents and complex tasks
Multilingual support
Supports over 140 languages
Lightweight design
Its relatively small size allows deployment in resource-constrained environments such as laptops, desktops, or private cloud infrastructure
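
To make the 128K-token context window concrete, here is a minimal sketch of a pre-flight check that estimates whether a document is likely to fit. The constant 128_000, the 4-characters-per-token ratio, and the function names are illustrative assumptions, not values taken from the Gemma tokenizer. For a real count, tokenize the text with the model's own tokenizer.

```python
import math

# Assumption: "128K" is treated as 128,000 tokens for this estimate.
CONTEXT_WINDOW_TOKENS = 128_000
# Assumption: rough average for English text; the real ratio depends on
# the tokenizer and the language of the input.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 1024) -> bool:
    """Check whether the prompt plus a reserved output budget fits the window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    document = "word " * 50_000  # about 250,000 characters
    print(estimate_tokens(document), fits_in_context(document))
```

Reserving part of the window for the generated output is a design choice: the prompt and the generated tokens share the same context budget.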

Model Capabilities

Text generation
Image understanding
Q&A systems
Document summarization
Logical reasoning
Code generation
Mathematical computation
Multilingual processing

Use Cases

Content generation
Image caption generation
Generates detailed descriptions based on input images
Example: the model accurately describes a scene of a bee on a pink flower
Document summarization
Automatically summarizes long documents
Q&A systems
Image-based Q&A
Answers questions about image content
Factual Q&A
Answers knowledge-based questions
Achieved 78.2 points (12B model) on the TriviaQA benchmark
Education
Math problem solving
Solves math problems and proofs
Achieved 71.0 points (12B model) on the GSM8K benchmark
Programming assistance
Code generation and explanation
Achieved 45.7 points (12B model) on the HumanEval benchmark
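
The image caption use case above could be sketched with the Hugging Face transformers library as follows. This is a hedged illustration, not the official usage: the model ID google/gemma-3-12b-pt, the <start_of_image> placeholder, and the completion-style prompt prefix are assumptions based on the publicly documented Gemma 3 integration. The helper names build_caption_prompt and caption_image are hypothetical. Running caption_image requires a recent transformers release, torch, Pillow, accelerate, and access to the model weights.

```python
def build_caption_prompt(prefix: str = "in this image, there is") -> str:
    """Build a completion-style prompt for a pre-trained (non-chat) model.

    Assumption: "<start_of_image>" marks where the image is inserted.
    """
    return f"<start_of_image> {prefix}"


def caption_image(image_path: str,
                  model_id: str = "google/gemma-3-12b-pt",  # assumed model ID
                  max_new_tokens: int = 64) -> str:
    """Generate a caption continuation for a local image file."""
    # Heavy dependencies are imported lazily so the prompt helper
    # stays usable without them.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, Gemma3ForConditionalGeneration

    processor = AutoProcessor.from_pretrained(model_id)
    model = Gemma3ForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    ).eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=build_caption_prompt(), images=image, return_tensors="pt"
    ).to(model.device)
    prompt_len = inputs["input_ids"].shape[-1]

    with torch.inference_mode():
        output = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Decode only the newly generated tokens, not the prompt.
    return processor.decode(output[0][prompt_len:], skip_special_tokens=True)
```

Because this is the pre-trained variant rather than an instruction-tuned one, the sketch uses a sentence prefix for the model to continue instead of a chat-formatted question.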