JetMoE-8B Chat

Developed by jetmoe
JetMoE-8B is an efficient open-source large language model that surpasses LLaMA2-7B in performance at a training cost of only $100,000, while activating just 2.2 billion parameters during inference
Downloads: 26
Release Time: 3/31/2024

Model Overview

An open-source large language model built on a Mixture of Experts (MoE) architecture, focused on efficient inference and low-cost training, and suited to tasks such as dialogue generation and code completion
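In an MoE layer, a lightweight router scores a pool of expert networks for each token and only the top-k experts actually run, so most parameters stay idle on any forward pass; this is how an 8-billion-parameter model can activate only 2.2 billion. The PyTorch sketch below is a minimal illustration of top-k routing; the dimensions, expert count, and k are placeholder assumptions, not JetMoE's actual configuration (the published architecture also sparsifies attention, not just the feed-forward blocks).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k mixture-of-experts feed-forward layer (illustrative only)."""

    def __init__(self, d_model=1024, d_hidden=4096, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (n_tokens, d_model)
        scores = self.router(x)                        # (n_tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)     # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)           # renormalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):                     # run each selected expert
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e               # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

print(TopKMoE()(torch.randn(4, 1024)).shape)  # torch.Size([4, 1024])
```

With k=2 of 8 experts, only a quarter of the expert parameters touch any given token, which is the source of the inference savings described above.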

Model Features

Low-cost efficient training
Achieves performance surpassing LLaMA2-7B at a training cost of only about $100,000 (96 H100 GPUs for two weeks)
Efficient inference
Activates only 2.2 billion of its 8 billion total parameters per token during inference, significantly reducing computational cost
Fully open-source
Trained entirely on public datasets with open-source training code, and supports fine-tuning on consumer-grade GPUs
Two-phase training approach
Adopts the MiniCPM-style two-phase schedule: Phase 1 trains the base model on general data, and Phase 2 mixes in high-quality data as the learning rate decays (see the sketch below)
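The MiniCPM recipe corresponds to a warmup-stable-decay learning-rate schedule: after a short warmup, the rate is held constant through the bulk of pre-training (Phase 1), then decayed sharply while high-quality data is mixed into the corpus (Phase 2). The sketch below shows the shape of such a schedule; the step counts, rates, and the use of linear decay are illustrative assumptions, not JetMoE's published hyperparameters.

```python
def wsd_lr(step, total_steps, peak_lr=5e-4, final_lr=5e-5,
           warmup_frac=0.01, decay_frac=0.1):
    """Warmup-stable-decay schedule (MiniCPM-style); all constants illustrative."""
    warmup_steps = int(total_steps * warmup_frac)
    decay_start = int(total_steps * (1 - decay_frac))   # Phase 2 begins here
    if step < warmup_steps:                             # linear warmup
        return peak_lr * step / max(warmup_steps, 1)
    if step < decay_start:                              # Phase 1: constant rate
        return peak_lr
    # Phase 2: decay while high-quality data enters the data mix
    progress = (step - decay_start) / max(total_steps - decay_start, 1)
    return peak_lr + (final_lr - peak_lr) * progress

for s in (0, 500, 50_000, 95_000, 100_000):
    print(f"step {s:>7}: lr = {wsd_lr(s, 100_000):.2e}")
```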

Model Capabilities

Text generation
Dialogue systems
Code completion
Mathematical problem solving
Multi-turn dialogue

Use Cases

Dialogue systems
Intelligent chatbot
Build friendly and knowledgeable conversational assistants
Achieves an MT-Bench score of 6.681, surpassing Llama-2-13B-chat (see the usage sketch below)
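A minimal sketch of querying the chat model through Hugging Face transformers, assuming the checkpoint is published as `jetmoe/jetmoe-8b-chat` with a built-in chat template and that the installed transformers release supports the JetMoE architecture; consult the model card on the Hub for authoritative usage.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jetmoe/jetmoe-8b-chat"  # repo name assumed; verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user",
             "content": "Explain mixture-of-experts in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0, inputs.shape[-1]:], skip_special_tokens=True))
```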
Code generation
Programming assistance
Automatically generate and complete code
Reaches 34.2% Pass@1 on the MBPP benchmark, outperforming LLaMA2-7B (the Pass@1 metric is sketched below)
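Pass@1 is the probability that a single sampled completion passes a problem's unit tests. It is conventionally reported with the unbiased estimator from the Codex paper, which generalizes to Pass@k; the sketch below implements that formula, with sample counts chosen purely for illustration.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased Pass@k estimator: n completions sampled, c of them correct."""
    if n - c < k:        # every size-k draw contains at least one correct sample
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# E.g., 20 samples on one problem with 7 passing gives Pass@1 of 0.35.
print(pass_at_k(n=20, c=7, k=1))
```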