🚀 LenguajeNatural.AI Chat and Instruction Model 2B (LeNIA-Chat)
Developed by LenguajeNatural.AI, this model provides advanced text generation, chat, and instruction-following capabilities for the Spanish-speaking community.
🚀 Quick Start
You can use this model through the Hugging Face API or integrate it into your applications with the transformers library. Here is an example of how to load the model and generate a response:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model from the Hugging Face Hub
model_name = "LenguajeNaturalAI/leniachat-gemma-2b-v0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Build the chat and render it with the model's chat template;
# add_generation_prompt=True appends the assistant-turn marker so the model answers as the assistant
messages = [
    {"role": "system", "content": "Eres un asistente que ayuda al usuario a lo largo de la conversación resolviendo sus dudas."},
    {"role": "user", "content": "¿Qué fue la revolución industrial?"}
]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

# Generate and decode the answer
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
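The same chat can also be run through the high-level text-generation pipeline. This is a minimal sketch, not taken from the upstream card: the chat is rendered to a plain prompt with the model's template, and return_full_text=False is used so only the newly generated answer is printed.

from transformers import pipeline

# Build a text-generation pipeline around the model (downloads weights on first use)
model_name = "LenguajeNaturalAI/leniachat-gemma-2b-v0"
pipe = pipeline("text-generation", model=model_name)

messages = [
    {"role": "system", "content": "Eres un asistente que ayuda al usuario a lo largo de la conversación resolviendo sus dudas."},
    {"role": "user", "content": "¿Qué fue la revolución industrial?"}
]
# Render the chat with the model's template, then pass the plain prompt to the pipeline
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
answer = pipe(prompt, max_new_tokens=50, return_full_text=False)[0]["generated_text"]
print(answer)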
✨ Features
- Designed for the Spanish-speaking Community: The model has been trained exclusively in Spanish to maximize its effectiveness for Spanish-speaking users.
- Advanced Training Phases: Trained in three distinct phases, covering multi-task learning in Spanish, high-quality instruction training, and chat and abstract QA training.
- Based on a Well-known Model: Fine-tuned from google/gemma-2b, incorporating advanced features for better text generation and understanding in Spanish chat and instruction tasks.
📦 Installation
The examples in this card require the transformers library and PyTorch, which can be installed with pip install torch transformers.
💻 Usage Examples
Basic Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_name = "LenguajeNaturalAI/leniachat-gemma-2b-v0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
messages = [
{"role": "system", "content": "Eres un asistente que ayuda al usuario a lo largo de la conversación resolviendo sus dudas."},
{"role": "user", "content": "¿Qué fue la revolución industrial?"}
]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
Advanced Usage
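A minimal sketch of a more controlled generation setup, assuming the same model and chat template as above. It is not taken from the upstream card: the sampling parameters (temperature, top_p, repetition_penalty) are illustrative placeholders to tune for your use case, and transformers' TextStreamer is used to print tokens as they are generated.

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
import torch

model_name = "LenguajeNaturalAI/leniachat-gemma-2b-v0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

messages = [
    {"role": "system", "content": "Eres un asistente que ayuda al usuario a lo largo de la conversación resolviendo sus dudas."},
    {"role": "user", "content": "¿Qué fue la revolución industrial?"}
]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

# Stream the answer to stdout token by token
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

with torch.no_grad():
    model.generate(
        input_ids,
        max_new_tokens=256,
        do_sample=True,          # sample instead of greedy decoding
        temperature=0.7,         # illustrative values, not recommended settings from the authors
        top_p=0.9,
        repetition_penalty=1.1,
        streamer=streamer,
    )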
📚 Documentation
Model Details
This model has been developed by LenguajeNatural.AI to provide the Spanish-speaking community with advanced tools for text generation, chat, and instruction following. It is the first in a series of models the company plans to release.
Training
The model was trained in three phases:
- Multi-task Learning in Spanish: Using multiple supervised datasets for FLAN-style training.
- High-quality Instruction Training: Fine-tuning the model to understand and generate responses to complex instructions.
- Chat and Abstract QA Training: Optimizing the model for smooth conversations and for generating responses to abstract questions.

Training in all three phases was carried out with the autotransformers library; a rough, illustrative sketch of one such supervised phase is shown below.
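The exact autotransformers configuration is not reproduced in this card. As a rough illustration only, the sketch below shows what a single supervised fine-tuning phase can look like using the plain transformers Trainer; the dataset, text formatting, hyperparameters, and output path are placeholders, not the ones used to train LeNIA-Chat.

from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base_model = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Toy supervised examples; placeholder formatting, not the chat template used by LeNIA-Chat
examples = [
    {"text": "Instrucción: ¿Qué fue la revolución industrial?\nRespuesta: Fue un proceso de mecanización de la producción iniciado en el siglo XVIII."},
]
dataset = Dataset.from_list(examples).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-sketch", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-5),
    train_dataset=dataset,
    # Causal-LM objective: the collator pads batches and sets labels from the input_ids
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()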
Evaluation
To ensure the quality of the model, a comprehensive evaluation was conducted on several datasets, showing strong performance in Spanish text generation and instruction understanding. The specific evaluation results for the LeNIA-Chat models are reported in the following table.

Uses and Limitations
This model is designed for Spanish text generation applications, chatbots, and virtual assistants. Although it has been trained to minimize biases and errors, users should evaluate its performance in their specific use context, be aware of the limitations inherent to language models, and use the model responsibly. In particular, since the base model has only 2 billion parameters, this model shares the limitations typical of models of that size.
Future Versions
The developers plan to continue improving this model and launch future versions with expanded capabilities. You can stay updated on their website or their LinkedIn page.
📄 License
This model is distributed under the Apache 2.0 license.
| Property | Details |
|----------|---------|
| Model Type | Language model for text generation, chat, and instruction in Spanish |
| Training Data | Trained in three phases using multiple supervised datasets, with the help of the autotransformers library |
| Base Model | google/gemma-2b |
| Language | Spanish |
| Maximum Sequence Length | 8192 tokens |
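Given the 8192-token context window, it can be useful to check that a rendered prompt fits before calling generate. A small sketch (the prompt string is a placeholder):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LenguajeNaturalAI/leniachat-gemma-2b-v0")
prompt = "..."  # placeholder: the fully rendered chat prompt
n_tokens = len(tokenizer(prompt)["input_ids"])
assert n_tokens <= 8192, f"Prompt uses {n_tokens} tokens, above the 8192-token context window"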

💡 Usage Tip
Although this model has been trained to minimize biases and errors, it is recommended to evaluate its performance in your specific use context.
⚠️ Important Note
Users should be aware of the inherent limitations of language models and use this model responsibly. Also, since the base model has only 2 billion parameters, this model shares the inherent limitations of models of that size.