Latxa 7b V1.2

Developed by HiTZ
Latxa is a large language model for Basque, based on the LLaMA-2 architecture. It is designed specifically for low-resource languages and was trained on a 4.2 billion token Basque corpus.
Downloads 875
Release Time: 6/11/2024

Model Overview

The Latxa series includes models ranging from 7B to 70B parameters. The models are optimized for Basque, target language understanding and generation tasks, and support both English and Basque.

Model Features

Low-resource language optimization
Specifically designed for low-resource languages like Basque, bridging the technological gap between high- and low-resource languages
High-quality corpus training
Trained on a rigorously selected 4.2 billion token Basque corpus to ensure language quality
Multiple sizes available
Offers three parameter sizes: 7B, 13B, and 70B to meet different computational needs
Open license
Released under the LLaMA-2 license agreement, which allows both commercial and research use

Model Capabilities

Basque text generation
Multiple-choice QA
Reading comprehension
Language understanding
English text generation (auxiliary capability)

Use Cases

Education
Language proficiency testing
Evaluated on Basque C1-level exam questions
Achieved 30.26% accuracy on EusProficiency dataset (5-shot)
Reading comprehension assistance
Helps students understand Basque texts
Achieved 25% accuracy on EusReading dataset (5-shot)
Research
Low-resource language research
Provides benchmarks for large model research on Basque and other low-resource languages
Released complete toolchain including models, corpora, and evaluation datasets
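
The 5-shot accuracy figures above come from few-shot multiple-choice evaluation: each test question is preceded by five solved examples, and the model's answer is compared with the gold label. The sketch below illustrates that idea only. The prompt layout, the `MCQuestion` type, and the `predict` callable (a stand-in for the model) are assumptions made for illustration, not the format or harness used by the Latxa authors. Evaluation harnesses often score each choice by log-likelihood rather than parsing a generated letter.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

LETTERS = "ABCD"


@dataclass
class MCQuestion:
    """Hypothetical container for one multiple-choice item."""
    question: str
    choices: Sequence[str]
    answer: int  # index of the correct choice


def format_item(item: MCQuestion, with_answer: bool) -> str:
    """Render one item; solved examples include the gold letter."""
    lines = [f"Question: {item.question}"]
    lines += [f"{LETTERS[i]}. {c}" for i, c in enumerate(item.choices)]
    lines.append("Answer:" + (f" {LETTERS[item.answer]}" if with_answer else ""))
    return "\n".join(lines)


def build_prompt(shots: Sequence[MCQuestion], target: MCQuestion) -> str:
    """Concatenate the solved shots, then the unanswered target question."""
    blocks = [format_item(s, with_answer=True) for s in shots]
    blocks.append(format_item(target, with_answer=False))
    return "\n\n".join(blocks)


def accuracy(
    items: Sequence[MCQuestion],
    shots: Sequence[MCQuestion],
    predict: Callable[[str], str],
) -> float:
    """Percentage of items where predict(prompt) returns the gold letter."""
    correct = sum(
        predict(build_prompt(shots, item)) == LETTERS[item.answer]
        for item in items
    )
    return 100.0 * correct / len(items)
```

With four choices per question, a model that guesses uniformly at random would score about 25% under this scheme, which is a useful baseline when reading multiple-choice results.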