
Llama 3.2 400M Amharic

Developed by rasyosef
A smaller version of Meta's Llama-3.2-1B architecture with 400 million parameters and a 1024-token context length, pretrained specifically for Amharic.
Downloads: 310
Release date: 11/26/2024

Model Overview

This model is a decoder-only transformer for Amharic text generation. It is a base (pretrained-only) model and has not undergone supervised fine-tuning.
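As a causal language model, it can be loaded through the Hugging Face transformers library. A minimal sketch follows; the repo id `rasyosef/Llama-3.2-400M-Amharic` is assumed from the model name, and `clamp_new_tokens` is an illustrative helper (not part of the model card) that keeps prompt plus generation within the 1024-token window.

```python
MODEL_ID = "rasyosef/Llama-3.2-400M-Amharic"  # assumed Hugging Face repo id
MAX_CONTEXT = 1024  # context length stated in the model card


def clamp_new_tokens(prompt_tokens: int, requested: int,
                     max_context: int = MAX_CONTEXT) -> int:
    """Cap generated tokens so prompt + output fit the context window."""
    return max(0, min(requested, max_context - prompt_tokens))


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    # Import here so the helper above works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_ID)
    out = generator(prompt, max_new_tokens=max_new_tokens,
                    do_sample=True, top_p=0.95)
    return out[0]["generated_text"]


if __name__ == "__main__":
    # Amharic prompt ("Addis Ababa"); the model continues in Amharic.
    print(generate("አዲስ አበባ", max_new_tokens=60))
```

Since this is a base model with no chat template, prompts are plain text that the model continues.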

Model Features

Amharic Optimization
Pretrained on 274 million tokens of Amharic text, optimized specifically for Amharic text generation.
Streamlined Model
A compact version of the Llama-3.2-1B model with 400 million parameters, suitable for running on a single A100 40GB GPU.
Efficient Training
Completed pretraining in just 23 hours on a single A100 40GB GPU, achieving a validation perplexity of 41.3.
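The reported validation perplexity of 41.3 is just the exponential of the mean per-token cross-entropy loss, so it maps back to a loss of roughly 3.72 nats per token:

```python
import math


def perplexity(mean_nll_nats: float) -> float:
    """Perplexity = exp(mean negative log-likelihood per token)."""
    return math.exp(mean_nll_nats)


# A validation perplexity of 41.3 corresponds to a mean cross-entropy
# loss of ln(41.3) ≈ 3.72 nats per token.
loss = math.log(41.3)
```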

Model Capabilities

Amharic text generation
Long-text generation (1024-token context length)

Use Cases

Text generation
News Summary Generation
Generate news summaries based on Amharic news headlines
Produces coherent and contextually appropriate news content
Dialogue Systems
Used for reply generation in Amharic chatbots
Generates natural and fluent conversational responses
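Because the model is pretrained-only with no instruction tuning, use cases like news summary or reply generation are best approached as plain text continuation. The prompt helper below is purely illustrative:

```python
def continuation_prompt(text: str) -> str:
    # Illustrative only: a trailing newline nudges a base model to
    # continue a headline or chat turn as body text rather than
    # extend the same line.
    return text.strip() + "\n"
```

For example, `continuation_prompt` applied to an Amharic headline yields a prompt the model completes as article-style text.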