T5-Efficient-SMALL-KV256

Developed by Google
T5-Efficient-SMALL-KV256 is a variant of Google's T5 with a deep-narrow architecture, a design chosen to improve downstream task performance. It has about 117 million parameters and is a pre-trained checkpoint that must be fine-tuned before use.
Downloads 16
Release Time: 3/2/2022

Model Overview

A pre-trained model based on the T5 architecture that follows a deep-narrow design: it prioritizes model depth over width to improve downstream task efficiency. It must be fine-tuned before it can be used for English NLP tasks.

Model Features

Deep Narrow Architecture
Increases the number of Transformer layers (depth) rather than the width; the underlying research suggests that this strategy is more efficient for downstream tasks than other architectures of similar parameter count.
KV Projection Optimization
Key-value projection dimension set to 256, balancing computational efficiency with model capacity.
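To make the effect of this setting concrete, the following minimal sketch counts the attention projection weights before and after widening the key-value dimension. It assumes that 256 is the per-head dimension and that the rest of the Small configuration applies (model width 512, 8 heads, 6 encoder and 6 decoder layers, default key-value dimension 64). None of these values is stated on this page, so treat the numbers as an illustration rather than an official breakdown. The function name attention_params is hypothetical.

```python
def attention_params(d_model, num_heads, d_kv):
    """Weights in one attention block: Q, K, V and output projections, no biases."""
    inner_dim = num_heads * d_kv
    return 4 * d_model * inner_dim

# Assumed Small configuration values (not given in the text above).
D_MODEL, NUM_HEADS = 512, 8
# 6 encoder self-attention + 6 decoder self-attention + 6 decoder cross-attention blocks.
NUM_ATTENTION_BLOCKS = 18

default = attention_params(D_MODEL, NUM_HEADS, d_kv=64)   # assumed default dimension
widened = attention_params(D_MODEL, NUM_HEADS, d_kv=256)  # KV256 setting
extra = NUM_ATTENTION_BLOCKS * (widened - default)

print(f"per block: {default:,} -> {widened:,}; extra in total: {extra:,}")
```

Under these assumptions the wider projections add roughly 56.6 million weights, which would account for a large share of the 117 million total.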
Pre-training Objective
Pre-trained with a span-based masked language modeling (MLM) objective on the C4 dataset.
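The span-based objective can be illustrated with a small sketch: contiguous token spans are replaced by sentinel tokens in the input, and the target lists each sentinel followed by the tokens it hides. The helper corrupt_spans is hypothetical, the spans are fixed by hand rather than sampled, and whitespace splitting stands in for the real tokenizer, so this shows only the input/target format, not the actual pre-training pipeline.

```python
def corrupt_spans(tokens, spans):
    """Replace each (start, end) span with a sentinel; return (inputs, targets)."""
    inputs, targets = [], []
    cursor = 0
    for sentinel_id, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{sentinel_id}>"
        inputs.extend(tokens[cursor:start])  # keep the text before the span
        inputs.append(sentinel)              # hide the span in the input
        targets.append(sentinel)             # announce the span in the target
        targets.extend(tokens[start:end])    # followed by the hidden tokens
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return inputs, targets

tokens = "Thank you for inviting me to your party last week".split()
inputs, targets = corrupt_spans(tokens, [(2, 4), (8, 9)])
print(" ".join(inputs))
print(" ".join(targets))
```

The model is trained to produce the second sequence from the first, which is why a pre-trained checkpoint on its own fills in masked spans instead of performing a specific task.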

Model Capabilities

Text generation
Text summarization
Q&A systems
Text classification (requires adaptation, for example generating the label as text)

Use Cases

Text generation
News summarization: generates concise summaries of input texts after fine-tuning
Q&A systems
Open-domain Q&A: generates answers to questions based on context
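Because the checkpoint needs fine-tuning for these use cases, a first step is usually to cast each task as text-to-text pairs. The sketch below shows one possible way to do that. The task prefixes ("summarize: ", "question: ... context: ...") are assumed conventions borrowed from the original T5 setup, the helper names are hypothetical, and the model identifier is assumed to be the Hugging Face Hub name of this checkpoint. Loading is kept in a separate function so that the formatting helpers run without the transformers library installed.

```python
MODEL_ID = "google/t5-efficient-small-kv256"  # assumed Hub identifier

def summarization_example(article, summary):
    # "summarize: " is an assumed prefix; any consistent prefix works for fine-tuning.
    return {"input_text": f"summarize: {article}", "target_text": summary}

def qa_example(question, context, answer):
    # The question and its context are packed into one input string (assumed format).
    return {
        "input_text": f"question: {question} context: {context}",
        "target_text": answer,
    }

def load_checkpoint(model_id=MODEL_ID):
    # Imported lazily; requires the transformers library and network access.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model

pair = qa_example("Who developed T5?", "T5 was developed by Google.", "Google")
print(pair["input_text"])
```

Pairs in this form can then be tokenized and passed to any standard sequence-to-sequence fine-tuning loop.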