# Llama-3-Taiwan-70B
Llama-3-Taiwan-70B is a 70B parameter model finetuned on a large corpus of Traditional Mandarin and English data, demonstrating state-of-the-art performance on various Traditional Mandarin NLP benchmarks.
## 🚀 Quick Start
## ✨ Features
Llama-3-Taiwan-70B is a large language model finetuned for Traditional Mandarin and English users. It has strong capabilities in language understanding, generation, reasoning, and multi-turn dialogue. Key features include:
- 70B parameters
- Languages: Traditional Mandarin (zh-tw), English (en)
- Finetuned on a high-quality Traditional Mandarin and English corpus covering general knowledge as well as industrial knowledge in the legal, manufacturing, medical, and electronics domains
- 8K context length
- Open model released under the Llama-3 license
## 📦 Installation
No installation steps are provided in the original document.
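For a typical Hugging Face `transformers` workflow (an assumption, since the card names no serving stack), the dependencies used by the sketches below can be installed with `pip install torch transformers accelerate`.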
## 💻 Usage Examples
### Basic Usage
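The original card does not include a runnable snippet here. Below is a minimal sketch assuming the checkpoint is published on the Hugging Face Hub; the repo id shown is an assumption, so substitute the id from the actual model page.

```python
# Minimal usage sketch, assuming a Hugging Face transformers workflow.
# The repo id below is an assumption; substitute the id from the model page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yentinglin/Llama-3-Taiwan-70B-Instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~140 GB in bf16; quantize for smaller setups
    device_map="auto",           # shard across all available GPUs
)

messages = [
    {"role": "system", "content": "You are an AI assistant called Twllm, "
        "created by TAME (TAiwan Mixture of Expert) project."},
    {"role": "user", "content": "介紹一下台灣的夜市文化。"},  # "Introduce Taiwan's night market culture."
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```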
## 📚 Documentation
### Model Summary
See the ✨ Features section above for the model summary.
### Training Details
### Evaluation
Check out the Open TW LLM Leaderboard for the full and updated list.
Numbers are 0-shot by default.
Eval implementation
^ Taken from the closest matching numbers in the original dataset.
#### Needle in a Haystack Evaluation
The "Needle in a ใๅบๅธซ่กจใ" evaluation tests the model's ability to locate and recall important information embedded within a large body of text, using the classic Chinese text ใๅบๅธซ่กจใ by ่ซธ่ไบฎ.
To run the evaluation, use the accompanying script.
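The evaluation script itself is not reproduced in this card; the sketch below only illustrates the general needle-in-a-haystack construction. The file name, needle sentence, and probe question are illustrative assumptions, not the project's actual settings.

```python
# Illustrative needle-in-a-haystack prompt builder (not the project's script).
def build_niah_prompt(haystack: str, needle: str, depth: float) -> str:
    """Insert `needle` at a relative depth (0.0-1.0) inside `haystack`."""
    pos = int(len(haystack) * depth)
    context = haystack[:pos] + needle + haystack[pos:]
    # Probe question: "Based on the text above, what is the key piece of
    # information that was inserted into it?"
    return context + "\n\n根據上文，文中被插入的那句關鍵資訊是什麼？"

with open("chushibiao.txt", encoding="utf-8") as f:  # assumed: text of 《出師表》
    haystack = f.read()

needle = "（關鍵資訊：通關密語是「台灣黑熊」。）"  # assumed needle sentence
prompt = build_niah_prompt(haystack, needle, depth=0.5)
```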
#### TW MT-Bench Score

### Use Cases
Llama-3-Taiwan-70B can be applied to a wide variety of NLP tasks in Traditional Chinese and English, including:
1. Multi-turn Dialogue
System: You are an AI assistant called Twllm, created by TAME (TAiwan Mixture of Expert) project.
User: Hi, hello!
Assistant: Hello! How can I assist you today?
User: Just want to chat...
Assistant: Of course, I'd be happy to chat with you! Is there anything in particular you'd like to talk about?
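For reference, the transcript above maps directly onto the `transformers` chat-message format. A hedged sketch, reusing the `tokenizer` and `model` handles from Basic Usage:

```python
# Replaying the transcript above with the transformers chat-template API;
# `tokenizer` and `model` are assumed loaded as in Basic Usage.
messages = [
    {"role": "system", "content": "You are an AI assistant called Twllm, "
        "created by TAME (TAiwan Mixture of Expert) project."},
    {"role": "user", "content": "Hi, hello!"},
    {"role": "assistant", "content": "Hello! How can I assist you today?"},
    {"role": "user", "content": "Just want to chat..."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```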
2. RAG (Retrieval-Augmented Generation)
Demo: you can enable the Search Web option on twllm.com.


System: You are an AI assistant called Twllm, created by TAME (TAiwan Mixture of Expert) project.
User: What's the weather like at NTU tomorrow?
Assistant: According to the weather forecast, the weather at NTU tomorrow will be showers. The low temperature will be 24°C. There will be an east wind of 10 to 15 km/h. The probability of rain is 40%.
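The retrieval stack behind the twllm.com demo is not documented in this card; the sketch below only shows the common RAG pattern of prepending retrieved snippets to the user turn, with `search_web` as a hypothetical retriever.

```python
# RAG sketch: prepend retrieved snippets to the user turn so the model can
# ground its answer. `search_web` is hypothetical, not twllm.com's stack.
def search_web(query: str) -> list[str]:
    """Hypothetical retriever; swap in a real search or vector-store call."""
    raise NotImplementedError

query = "What's the weather like at NTU tomorrow?"
snippets = search_web(query)
context = "\n".join(f"- {s}" for s in snippets)

messages = [
    {"role": "system", "content": "You are an AI assistant called Twllm, "
        "created by TAME (TAiwan Mixture of Expert) project. "
        "Answer strictly from the provided search results."},
    {"role": "user", "content": f"Search results:\n{context}\n\nQuestion: {query}"},
]
# Generation then proceeds exactly as in the multi-turn example above.
```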
3. Formatted Output, Language Understanding, Entity Recognition, Function Calling
If you are interested in function calling, I strongly recommend using constrained decoding to turn on JSON mode (see the sketch after this example).
Example from HW7 in INTRODUCTION TO GENERATIVE AI 2024 SPRING by HUNG-YI LEE (李宏毅).

System: You are an AI assistant called Twllm, created by TAME (TAiwan Mixture of Expert) project.
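True JSON mode requires constrained decoding support in the serving stack (for example, grammar- or schema-guided generation). The hedged fallback below merely prompts for JSON and validates the result afterwards, which is weaker than real constrained decoding but illustrates the intended output contract; the schema and example sentence are assumptions.

```python
# Weak stand-in for constrained decoding: ask for JSON, then validate it.
# Real JSON mode enforces the format during generation; this only checks after.
import json

schema_hint = '{"event": str, "date": "YYYY-MM-DD", "location": str}'
messages = [
    {"role": "system", "content": "You are an AI assistant called Twllm, "
        "created by TAME (TAiwan Mixture of Expert) project. Reply with a "
        f"single JSON object of the form {schema_hint} and nothing else."},
    # Assumed input: "The NTU CSIE welcome event will be held on
    # 2024-09-10 in the Der-Tian Hall."
    {"role": "user", "content": "台大資工系迎新將於2024年9月10日在德田館舉行。"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
reply = tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
event = json.loads(reply)  # raises ValueError if the model drifted from JSON
```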
## 🔧 Technical Details
The model was trained with the NVIDIA NeMo™ Framework on NVIDIA Taipei-1, built with NVIDIA DGX H100 systems.
The compute and data for training Llama-3-Taiwan-70B were generously sponsored by Chang Gung Memorial Hospital, Chang Chun Group, Legalsign.ai, NVIDIA, Pegatron, TechOrange, and Unimicron (in alphabetical order).
We would like to acknowledge the contributions of our data providers, team members, and advisors in the development of this model, including shasha77 for high-quality YouTube scripts and study materials, Taiwan AI Labs for providing local media content, Ubitus K.K. for offering gaming content, Professor Yun-Nung (Vivian) Chen for her guidance and advisement, Wei-Lin Chen for leading our pretraining data pipeline, Tzu-Han Lin for synthetic data generation, Chang-Sheng Kao for enhancing our synthetic data quality, and Kang-Chieh Chen for cleaning instruction-following data.
## 📄 License
The model is released under the Llama-3 license.