
JetMoE-8B

Developed by jetmoe
JetMoE-8B is an efficient open-source large language model designed for low-resource environments; it achieves performance comparable to LLaMA2-7B at a training cost of under $100,000.
Downloads 1,337
Release Time: 3/25/2024

Model Overview

JetMoE-8B adopts a Mixture of Experts (MoE) architecture that dynamically activates only 2.2 billion of its 8 billion total parameters per token, significantly reducing computational cost. The model was trained on 1.25 trillion tokens from public datasets and supports tasks such as text generation and code completion.
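To make the sparse-activation idea concrete, here is a minimal sketch of top-2 expert routing in PyTorch. It is not JetMoE's implementation: the layer sizes, expert structure, and the class name Top2MoELayer are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    """Illustrative sparse MoE layer: each token is routed to 2 of 8 experts.
    Sizes and structure are assumptions for demonstration, not JetMoE's code."""

    def __init__(self, d_model=1024, d_hidden=4096, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.router(x)                        # (n_tokens, n_experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)          # renormalize over chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest stay idle,
        # which is why active parameters are a fraction of total parameters.
        for e, expert in enumerate(self.experts):
            for slot in range(self.top_k):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 1024)
print(Top2MoELayer()(tokens).shape)  # torch.Size([16, 1024])
```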

Model Features

Ultra-low-cost training
Achieves performance comparable to LLaMA2-7B for roughly $80,000 of compute (96 H100 GPUs for 2 weeks), challenging the industry assumption that capable models require massive investment.
Dynamic parameter activation
Only 2 of 8 experts are activated per token, so computation involves just 2.2 billion of the 8 billion parameters, significantly improving inference efficiency.
Academic-friendly design
Trained entirely on public datasets, fine-tunable on consumer-grade GPUs, lowering the barrier to research.

Model Capabilities

Text generation
Code completion
Dialogue interaction
Mathematical reasoning
Common-sense QA
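A minimal text-generation sketch using the Hugging Face transformers library follows; the repository ID jetmoe/jetmoe-8b, the dtype, and the decoding settings are assumptions to verify against the official model card.

```python
# Sketch: loading JetMoE-8B for text generation with Hugging Face transformers.
# The model ID and dtype/device settings are assumptions; check the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jetmoe/jetmoe-8b"  # assumed public repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # reduces memory; requires bf16-capable hardware
    device_map="auto",           # spreads layers across available devices (needs accelerate)
)

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```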

Use Cases

Education & Research
Lab-level model research
Academic institutions can fine-tune the model and run experiments on consumer-grade devices (see the sketch below)
Reduces research costs by over 90% compared to traditional large models.
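As a rough sketch of consumer-GPU fine-tuning, the example below attaches LoRA adapters with the peft library; the target module names and hyperparameters are assumptions and must be matched to JetMoE's actual layer names.

```python
# Sketch: parameter-efficient fine-tuning of JetMoE-8B with LoRA (peft).
# Model ID, target_modules, and hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_id = "jetmoe/jetmoe-8b"  # assumed repository name
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

lora_config = LoraConfig(
    r=16,                      # low-rank adapter dimension
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # assumed projection names; verify for JetMoE
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trained
```

Only the adapter weights are updated during training, which is what keeps memory requirements within reach of a single consumer-grade GPU.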
Commercial Applications
Low-cost dialogue systems
Deploy chatbots with highly efficient inference
MT-Bench score of 6.681, surpassing LLaMA2-7B-chat.