🚀 New Translation Model Released
C3TR-Adapter is a QLoRA adapter for google/gemma-7b. Despite the 4-bit quantization, the GPU memory requirement has increased to 8.1 GB; however, it still runs on the free tier of Colab, and translation performance is significantly improved!
✨ Features
webbigdata/ALMA-7B-Ja-V2
ALMA-7B-Ja-V2 is a machine translation model. Its primary focus is Japanese-to-English and English-to-Japanese translation, but it can also translate between the following language pairs:
- German (de) and English (en)
- Chinese (zh) and English (en)
- Icelandic (is) and English (en)
- Czech (cs) and English (en)
This model builds on its predecessor, ALMA-7B-Ja, with additional training that further improves its performance.
Benchmark Results
The following three metrics were used to evaluate the translation performance. A higher score indicates better performance:
BLEU
A metric that measures the similarity between the machine translation and a reference translation. However, because it is based on n-gram overlap, it may not adequately evaluate word order accuracy or sentence fluency.
chrF++
A metric that evaluates translation accuracy based on character n-gram matches, supplemented with word n-grams to account for word order. Its drawback is that it may not be well suited to evaluating long sentences.
COMET
A metric that uses a machine learning model to automatically evaluate translation quality. Its scores are said to correlate well with human judgments, but because it is learned, they depend heavily on the data the underlying model was trained on.
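As a concrete reference point, here is a minimal sketch of how BLEU and chrF++ scores can be computed with the sacrebleu library; the sentences below are illustrative placeholders, not part of the actual benchmark data:

```python
# Minimal sketch: scoring translations with sacrebleu.
# The hypothesis/reference sentences are illustrative placeholders.
from sacrebleu.metrics import BLEU, CHRF

hypotheses = ["The cat sat on the mat."]           # system outputs
references = [["The cat is sitting on the mat."]]  # one reference stream

bleu = BLEU()
chrf_pp = CHRF(word_order=2)  # word_order=2 yields chrF++ (chrF2++)

print(bleu.corpus_score(hypotheses, references))
print(chrf_pp.corpus_score(hypotheses, references))
```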
Comparison with NLLB-200
The benchmark results compared with Meta's NLLB-200 series, super-multilingual machine translation models that support over 200 languages, are as follows:
| Model Name | File Size | E->J chrF++/F2 | E->J COMET | J->E chrF++/F2 | J->E COMET |
|---|---|---|---|---|---|
| NLLB-200-Distilled | 2.46GB | 23.6/- | - | 50.2/- | - |
| NLLB-200-Distilled | 5.48GB | 25.4/- | - | 54.2/- | - |
| NLLB-200 | 5.48GB | 24.2/- | - | 53.6/- | - |
| NLLB-200 | 17.58GB | 25.2/- | - | 55.1/- | - |
| NLLB-200 | 220.18GB | 27.9/33.2 | 0.8908 | 55.8/59.8 | 0.8792 |
Comparison with Previous Model (ALMA-7B-Ja)
| Model Name | File Size | E->J chrF++/F2 | E->J COMET | J->E chrF++/F2 | J->E COMET |
|---|---|---|---|---|---|
| webbigdata-ALMA-7B-Ja-q4_K_S | 3.6GB | -/24.2 | 0.8210 | -/54.2 | 0.8559 |
| ALMA-7B-Ja-GPTQ-Ja-En | 3.9GB | -/30.8 | 0.8743 | -/60.9 | 0.8743 |
| ALMA-Ja(Ours) | 13.48GB | -/31.8 | 0.8811 | -/61.6 | 0.8773 |
ALMA-7B-Ja-V2
| Model Name | File Size | E->J chrF++/F2 | E->J COMET | J->E chrF++/F2 | J->E COMET |
|---|---|---|---|---|---|
| ALMA-7B-Ja-V2-GPTQ-Ja-En | 3.9GB | -/33.0 | 0.8818 | -/62.0 | 0.8774 |
| ALMA-Ja-V2(Ours) | 13.48GB | -/33.9 | 0.8820 | -/63.1 | 0.8873 |
| ALMA-Ja-V2-Lora(Ours) | 13.48GB | -/33.7 | 0.8843 | -/61.1 | 0.8775 |
Comparison with Real-World Applications
The results of comparing ALMA-7B-Ja-V2 with real-world applications across various text genres are as follows:
Government Official Announcements
| Model | E->J chrF2++ | E->J BLEU | E->J COMET | J->E chrF2++ | J->E BLEU | J->E COMET |
|---|---|---|---|---|---|---|
| ALMA-7B-Ja-V2-GPTQ-Ja-En | 25.3 | 15.00 | 0.8848 | 60.3 | 26.82 | 0.6189 |
| ALMA-Ja-V2 | 27.2 | 15.60 | 0.8868 | 58.5 | 29.27 | 0.6155 |
| ALMA-7B-Ja-V2-Lora | 24.5 | 13.58 | 0.8670 | 50.7 | 21.85 | 0.6196 |
| SeamlessM4T | 27.3 | 16.76 | 0.9070 | 54.2 | 25.76 | 0.5656 |
| gpt-3.5 | 34.6 | 28.33 | 0.8895 | 74.5 | 49.20 | 0.6382 |
| gpt-4.0 | 36.5 | 28.07 | 0.9255 | 62.5 | 33.63 | 0.6320 |
| google-translate | 43.5 | 35.37 | 0.9181 | 62.7 | 29.22 | 0.6446 |
| deepl | 43.5 | 35.74 | 0.9301 | 60.1 | 27.40 | 0.6389 |
Classical Literature
| Model | E->J chrF2++ | E->J BLEU | E->J COMET | J->E chrF2++ | J->E BLEU | J->E COMET |
|---|---|---|---|---|---|---|
| ALMA-7B-Ja-V2-GPTQ-Ja-En | 11.8 | 7.24 | 0.6943 | 31.9 | 9.71 | 0.5617 |
| ALMA-Ja-V2 | 10.7 | 4.93 | 0.7202 | 32.9 | 10.52 | 0.5638 |
| ALMA-7B-Ja-V2-Lora | 12.3 | 7.25 | 0.7076 | 32.5 | 11.14 | 0.5441 |
| gpt-3.5 | - | - | 0.6367 | 69.3 | 46.34 | 0.4922 |
| gpt-4.0 | 13.3 | 8.33 | 0.7074 | 44.3 | 23.75 | 0.5518 |
| deepl | 14.4 | 9.18 | 0.7149 | 34.6 | 10.68 | 0.5787 |
| google-translate | 13.5 | 8.57 | 0.7432 | 31.7 | 7.94 | 0.5856 |
Fanfiction
| Model | E->J chrF2++ | E->J BLEU | E->J COMET | J->E chrF2++ | J->E BLEU | J->E COMET |
|---|---|---|---|---|---|---|
| ALMA-7B-Ja-V2-GPTQ-Ja-En | 27.6 | 18.28 | 0.8643 | 52.1 | 24.58 | 0.6106 |
| ALMA-Ja-V2 | 20.4 | 8.45 | 0.7870 | 48.7 | 23.06 | 0.6050 |
| ALMA-7B-Ja-V2-Lora | 23.9 | 18.55 | 0.8634 | 55.6 | 29.91 | 0.6093 |
| SeamlessM4T | 25.5 | 19.97 | 0.8657 | 42.2 | 14.39 | 0.5554 |
| gpt-3.5 | 31.2 | 23.37 | 0.9001 | - | - | 0.5948 |
| gpt-4.0 | 30.7 | 24.31 | 0.8848 | 53.9 | 24.89 | 0.6163 |
| google-translate | 32.4 | 25.36 | 0.8968 | 58.5 | 29.88 | 0.6022 |
| deepl | 33.5 | 28.38 | 0.9094 | 60.0 | 31.14 | 0.6124 |
💻 Usage Examples
Basic Usage
Using Google Colab, Google's free web-based notebook environment, you can easily try out the performance of ALMA_7B_Ja_V2.
Sample Code For Free Colab
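If you prefer to run the model locally rather than on Colab, the following is a minimal sketch using Hugging Face transformers. The prompt follows the ALMA-style translation template; treat the linked Colab notebook as the reference implementation and this only as a starting point:

```python
# Minimal sketch: running webbigdata/ALMA-7B-Ja-V2 locally with transformers.
# Assumes a GPU with enough memory for the fp16 weights (~13.5 GB).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "webbigdata/ALMA-7B-Ja-V2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# ALMA-style translation prompt (here: Japanese -> English)
prompt = "Translate this from Japanese to English:\nJapanese: 今日はいい天気ですね。\nEnglish:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```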
📚 Documentation
Other Versions
llama.cpp
The main purpose of llama.cpp is to run LLaMA models with 4-bit integer quantization on a MacBook. Although 4-bit quantization somewhat reduces performance, webbigdata-ALMA-7B-Ja-V2-gguf, created by mmnga, lets you run this model on Mac, Windows, and Linux without a GPU.
Here is the Colab (without GPU) sample code
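For a local CPU run of the gguf version, a minimal sketch with the llama-cpp-python bindings might look like this; the quantization file name is an assumption, so check the webbigdata-ALMA-7B-Ja-V2-gguf repository for the files actually published:

```python
# Minimal sketch: CPU inference on a GGUF quantization via llama-cpp-python.
# The model_path file name is assumed; download the actual .gguf file first.
from llama_cpp import Llama

llm = Llama(model_path="webbigdata-ALMA-7B-Ja-V2-q4_K_S.gguf", n_ctx=512)

prompt = "Translate this from Japanese to English:\nJapanese: 今日はいい天気ですね。\nEnglish:"
result = llm(prompt, max_tokens=100, temperature=0.0)
print(result["choices"][0]["text"])
```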
GPTQ
GPTQ is a quantization technique that reduces model size. ALMA-7B-Ja-V2-GPTQ-Ja-En is a GPTQ-quantized version that shrinks the model (to 3.9 GB), reduces memory usage, and increases speed. The trade-offs are slightly reduced translation performance, and the ability to translate languages other than Japanese and English is expected to be significantly degraded.
Sample Code For Free Colab webbigdata/ALMA-7B-Ja-V2-GPTQ-Ja-En
If you want to translate an entire txt file at once, try the Colab below.
ALMA_7B_Ja_GPTQ_Ja_En_batch_translation_sample
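As a local alternative to the Colab notebooks above, here is a minimal sketch of loading the GPTQ version with transformers; it assumes the optimum and auto-gptq packages are installed, and the notebooks remain the reference implementation:

```python
# Minimal sketch: loading the GPTQ-quantized model with transformers.
# Requires: pip install optimum auto-gptq (assumed setup; see the notebooks).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "webbigdata/ALMA-7B-Ja-V2-GPTQ-Ja-En"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "Translate this from English to Japanese:\nEnglish: Good morning.\nJapanese:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```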
Model Details
ALMA (Advanced Language Model-based trAnslator) is an LLM-based translation model that adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures strong translation performance. Please find more details in their paper.
```
@misc{xu2023paradigm,
      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models},
      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
      year={2023},
      eprint={2309.11674},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
Original Model: ALMA-7B (26.95 GB)
Previous Model: ALMA-7B-Ja (13.3 GB)
About This Work
📄 License
The license for this model is Llama 2.