Hymba 1.5B Instruct

Developed by NVIDIA
A 1.5B-parameter model instruction-tuned from Hymba-1.5B-Base, capable of handling complex tasks such as mathematical reasoning, function calling, and role-playing
Downloads 3,547
Release Time: 10/31/2024

Model Overview

An instruction-tuned model trained on a combination of open-source instruction datasets and internally synthesized data, using supervised fine-tuning (SFT) and direct preference optimization (DPO)

Model Features

Hybrid Attention Architecture
Each layer integrates standard attention heads and Mamba state-space model heads in parallel, enhancing long-sequence processing capabilities
Meta-Token Technology
Learnable meta tokens are prepended to the input sequence and interact with all subsequent tokens, mitigating the 'forced-to-attend' burden of traditional attention mechanisms
Efficient Design
Combines Grouped Query Attention (GQA), Rotary Position Embedding (RoPE), and cross-layer KV sharing techniques
Business-Friendly License
Uses the NVIDIA Open Model License, permitting commercial use
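To make the memory effect of the Efficient Design features concrete, the following is a minimal sketch of how Grouped Query Attention and cross-layer KV sharing shrink the KV cache. The layer count, head counts, head dimension, and sequence length are hypothetical values chosen for illustration; they are not Hymba's actual configuration, and the function names are invented for this example.

```python
def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Map a query head to the KV head it reads from under GQA."""
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   layers_per_kv_group=1, bytes_per_value=2):
    """Approximate KV-cache size in bytes for a single sequence.

    layers_per_kv_group is the number of consecutive layers that reuse
    one K/V cache (1 means no cross-layer sharing).
    """
    # Number of distinct K/V caches to store (ceiling division).
    distinct_caches = -(-num_layers // layers_per_kv_group)
    # The factor 2 accounts for storing both keys and values.
    return 2 * distinct_caches * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration, for illustration only (not Hymba's real settings).
LAYERS, QUERY_HEADS, KV_HEADS, HEAD_DIM, SEQ_LEN = 32, 24, 6, 64, 8192

mha = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)   # one KV head per query head
gqa = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN)      # 4 query heads share a KV head
gqa_shared = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN,
                            layers_per_kv_group=2)             # pairs of layers share a cache

MIB = 1024 * 1024
print(mha // MIB, gqa // MIB, gqa_shared // MIB)  # prints: 1536 384 192
```

Under these assumed numbers, GQA cuts the cache by the query-to-KV head ratio (4x), and sharing each cache across two layers halves it again. Mamba heads keep a fixed-size state rather than a per-token cache, so they are not counted here.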

Model Capabilities

Mathematical Reasoning
Function Calling
Role-Playing
Multi-Turn Dialogue
Text Generation
Instruction Understanding

Use Cases

Intelligent Assistants
Task-Oriented Dialogue Systems
Handles complex user requests involving multi-step operations
Outperforms same-scale models by 15% in SFT benchmark tests
Educational Applications
Math Problem Tutoring
Provides step-by-step explanations for mathematical problem-solving
Achieves 62.3% accuracy on the GSM8K test set