
Instella 3B Long Instruct

Developed by AMD
Instella-Long is an open-source 3B-parameter language model developed by AMD. It supports a context length of 128K and performs strongly on long-context benchmarks.
Downloads: 240
Release Date: 5/28/2025

Model Overview

Instella-Long is a fully open-source language model built for long contexts. It is continually trained from Instella-3B-Instruct on AMD Instinct™ MI300X GPUs, supports a context length of 128K, and outperforms comparable open-source models.
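
For orientation, below is a minimal inference sketch using Hugging Face Transformers. The model id (amd/Instella-3B-Long-Instruct), the need for trust_remote_code, and the chat-template usage are assumptions and are not taken from this page.

```python
# Minimal generation sketch; model id and chat-template support are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amd/Instella-3B-Long-Instruct"  # assumed Hugging Face checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Summarize the key ideas of FlashAttention in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding; slice off the prompt tokens before decoding the answer.
output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```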

Model Features

Long context support
Supports a context length of 128K and performs strongly on long-context tasks.
Fully open source
The model weights, training configurations, datasets, and code are all open source, facilitating community collaboration and innovation.
Efficient training techniques
Uses sequence parallelism, FlashAttention-2, Torch Compile, and FSDP to achieve high-performance training on AMD hardware (see the sketch after this list).
Multi-stage training
Improves model performance through three stages: continued pre-training, supervised fine-tuning (SFT), and direct preference optimization (DPO).
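
For illustration only, here is a rough sketch of how the training-efficiency pieces named above (FlashAttention-2, Torch Compile, FSDP) commonly fit together in a PyTorch setup. This is not AMD's actual training code; the checkpoint id, hyperparameters, and launch assumptions (torchrun) are mine, and sequence parallelism is omitted.

```python
# Illustrative only: combining FlashAttention-2, torch.compile, and FSDP
# for long-context training. Not AMD's actual pipeline; checkpoint id and
# hyperparameters are assumptions. Launch with torchrun so the process
# group can initialize.
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM

dist.init_process_group(backend="nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

model = AutoModelForCausalLM.from_pretrained(
    "amd/Instella-3B-Instruct",               # assumed base checkpoint id
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # FlashAttention-2 kernels
    trust_remote_code=True,
).cuda()

# FSDP shards parameters, gradients, and optimizer state across GPUs.
model = FSDP(model, use_orig_params=True)

# Torch Compile JIT-compiles the forward/backward graph.
model = torch.compile(model)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
# ...training loop over long-sequence batches goes here; sequence
# parallelism (splitting each 128K-token sequence across ranks) would be
# layered on top and is omitted for brevity.
```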

Model Capabilities

Long text processing
Question answering
Instruction following
Text generation

Use Cases

Information retrieval and question answering
Long document question answering
Processes documents up to 128K tokens long and generates accurate answers (a minimal sketch follows this list).
Outperforms comparable open-source models on the HELMET benchmark.
Multi-document information integration
Integrates information from multiple documents to generate comprehensive answers.
Performs strongly on retrieval-augmented generation (RAG) tasks.
Academic research
Academic paper summarization and question answering
Processes academic papers to generate summaries or answer related questions.
Performs well on the arXiv dataset.
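
As a concrete illustration of the long-document question answering use case, here is a hypothetical helper that places an entire document into the 128K context window. The function name, prompt wording, and token budget are illustrative assumptions; `model` and `tokenizer` are reused from the inference sketch above.

```python
# Hypothetical helper for long-document QA within the 128K context window.
# Names, prompt wording, and the token budget are illustrative assumptions;
# `model` and `tokenizer` come from the earlier inference sketch.
def answer_from_document(model, tokenizer, document: str, question: str,
                         max_context_tokens: int = 128_000) -> str:
    prompt = (
        "Answer the question using only the document below.\n\n"
        f"Document:\n{document}\n\nQuestion: {question}"
    )
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    # Refuse prompts that would overflow the model's context window.
    if input_ids.shape[-1] > max_context_tokens:
        raise ValueError(
            f"Prompt is {input_ids.shape[-1]} tokens, over the "
            f"{max_context_tokens}-token budget."
        )
    input_ids = input_ids.to(model.device)
    output = model.generate(input_ids, max_new_tokens=512, do_sample=False)
    return tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```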