CroissantLLMBase

Developed by croissantllm
CroissantLLM is a 1.3 billion parameter language model pre-trained on 3 trillion English-French bilingual tokens, designed to provide high-performance, fully open-source bilingual models for the research and industrial communities.
Downloads 901
Release Time: 1/9/2024

Model Overview

CroissantLLM is a high-performance, fully open-source bilingual (English and French) language model that runs smoothly on consumer-grade local hardware. To make the model inherently bilingual, training uses a 1:1 English-French pre-training data ratio, a custom tokenizer, and bilingual fine-tuning datasets.

Model Features

Bilingual Support
The model uses a 1:1 English-French pre-training data ratio, specifically optimized for English and French.
High Performance
The model runs smoothly even on consumer-grade local hardware, making it suitable for research and industrial applications.
Open and Transparent
The model is fully open-source, including the codebase, checkpoints, fine-tuned chat models, and high-quality translation models.
High-Quality French Data
The training data includes a manually curated, high-quality, and diverse French data split.
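
As a rough illustration of why a model of this size fits on consumer-grade hardware, the sketch below estimates the memory needed for the weights alone. The 1.3 billion parameter count comes from the description above; the bytes-per-parameter values are the standard sizes of common dtypes, and the helper name `weight_memory_gb` is hypothetical. Activations, the KV cache, and framework overhead are not counted, so real usage will be higher.

```python
# Back-of-the-envelope estimate of weight memory for a 1.3B-parameter model.
# Only the raw weights are counted; activations and KV cache add to this.

N_PARAMS = 1.3e9  # parameter count stated in the model description

# Standard storage sizes per parameter for common dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Return the weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(N_PARAMS, dtype):.1f} GB")
```

Under these assumptions the weights take about 5.2 GB in float32, 2.6 GB in float16, and 1.3 GB in int8, which is within reach of many laptops and consumer GPUs.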

Model Capabilities

Text Generation
Bilingual Translation
Code Generation

Use Cases

Text Generation
Bilingual Translation
Translate English text to French or vice versa.
High-quality translation results suitable for everyday and professional scenarios.
Code Generation
Generate code snippets based on prompts.
Ideal for developers and researchers.
Research
Multilingual Model Research
Used to study the performance of language models in multilingual environments.
Provides rich bilingual data and model checkpoints.
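
The translation use case above can be sketched with few-shot prompting, since a base model continues text rather than following instructions. This is a minimal sketch, not the developers' reference usage: the repository id `croissantllm/CroissantLLMBase` is assumed from the developer and model names on this page, the `source -> target` prompt layout is one illustrative convention, and the helper names `build_translation_prompt` and `translate` are hypothetical. It assumes the `transformers` and `torch` packages are installed.

```python
from typing import List, Tuple

# Assumed Hugging Face repository id for the base model.
MODEL_ID = "croissantllm/CroissantLLMBase"


def build_translation_prompt(examples: List[Tuple[str, str]], text: str) -> str:
    """Format (source, target) example pairs plus the text to translate."""
    lines = [f"{src} -> {tgt}" for src, tgt in examples]
    lines.append(f"{text} ->")  # left open so the model completes the target
    return "\n".join(lines)


def translate(prompt: str, max_new_tokens: int = 40) -> str:
    """Generate a continuation of the prompt and return its first line."""
    # Imported lazily so the prompt helper works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, then cut at the first line break.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    completion = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return completion.strip().split("\n")[0]
```

A call might look like `translate(build_translation_prompt([("He is heading to the market.", "Il va au marché.")], "We are running on the beach."))`. Adding more example pairs, or reversing their order for French-to-English, is usually the first thing to try when the output drifts.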