🚀 PLaMo 2 8B
PLaMo 2 8B is an 8-billion-parameter model pre-trained on English and Japanese datasets, developed by Preferred Elements, Inc. It offers enhanced efficiency and performance.
🚀 Quick Start
PLaMo 2 8B is a powerful language model. Before using it, make sure you understand its license terms and meet the usage requirements.
Requirements

```
numpy>=1.26.4
numba>=0.60.0
torch>=2.4.1
transformers>=4.44.2
mamba_ssm>=2.2.2
causal_conv1d>=1.4.0
```
Use a pipeline as a high-level helper

```python
import transformers

pipeline = transformers.pipeline("text-generation", model="pfnet/plamo-2-8b", trust_remote_code=True)
print(pipeline("The future of artificial intelligence technology is ", max_new_tokens=32))
```
Load model directly

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("pfnet/plamo-2-8b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("pfnet/plamo-2-8b", trust_remote_code=True)

text = "これからの人工知能技術は"
input_ids = tokenizer(text, return_tensors="pt").input_ids
generated_tokens = model.generate(
    inputs=input_ids,
    max_new_tokens=32,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=1.0,
)[0]
generated_text = tokenizer.decode(generated_tokens)
print(generated_text)
```
✨ Features
- Hybrid Architecture: PLaMo 2 models adopt a hybrid architecture similar to Samba, integrating Mamba (a selective State Space Model) with sliding window attention. This combination leverages their strengths to improve efficiency and performance (a minimal sketch of the sliding window attention component follows this list).
- Stability and Efficiency: PLaMo 2 adds normalization layers to enhance training stability and uses the Mamba2 kernel for better computational efficiency.
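As a concrete illustration of the attention half of this hybrid design, the snippet below builds a sliding-window causal mask in plain PyTorch and applies it with scaled dot-product attention. This is a minimal sketch, not the PLaMo 2 implementation: the window length and tensor shapes are arbitrary assumptions, and the interleaved Mamba layers and normalization are omitted.

```python
# Minimal sketch of sliding-window causal attention (illustrative only, not the
# PLaMo 2 code). Window length and tensor shapes below are arbitrary assumptions.
import torch
import torch.nn.functional as F

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask: query i may attend to keys j with i - window < j <= i."""
    i = torch.arange(seq_len).unsqueeze(1)
    j = torch.arange(seq_len).unsqueeze(0)
    return (j <= i) & (j > i - window)

# Toy shapes: batch=1, heads=4, seq_len=16, head_dim=8.
q = torch.randn(1, 4, 16, 8)
k = torch.randn(1, 4, 16, 8)
v = torch.randn(1, 4, 16, 8)
mask = sliding_window_causal_mask(seq_len=16, window=4)  # broadcast over batch and heads
out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
print(out.shape)  # torch.Size([1, 4, 16, 8])
```

Restricting each query to a fixed-size window keeps attention cost roughly linear in sequence length, which is what makes it a natural partner for the linear-time Mamba layers.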
📦 Installation
To use PLaMo 2 8B, you need to install the following dependencies:
```
numpy>=1.26.4
numba>=0.60.0
torch>=2.4.1
transformers>=4.44.2
mamba_ssm>=2.2.2
causal_conv1d>=1.4.0
```
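Because the model relies on custom remote code, the exact versions of these packages matter. The helper below is not part of the model card; it simply reports installed versions against the list above, assuming the PyPI distribution names `mamba-ssm` and `causal-conv1d` for the last two packages.

```python
# Hypothetical helper (not from the model card): report installed versions of the
# dependencies listed above. "mamba-ssm" and "causal-conv1d" are assumed to be the
# PyPI distribution names of mamba_ssm and causal_conv1d.
from importlib.metadata import PackageNotFoundError, version

REQUIRED = {
    "numpy": "1.26.4",
    "numba": "0.60.0",
    "torch": "2.4.1",
    "transformers": "4.44.2",
    "mamba-ssm": "2.2.2",
    "causal-conv1d": "1.4.0",
}

for name, minimum in REQUIRED.items():
    try:
        print(f"{name}: installed {version(name)} (requires >= {minimum})")
    except PackageNotFoundError:
        print(f"{name}: not installed (requires >= {minimum})")
```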
💻 Usage Examples
Basic Usage
```python
import transformers

pipeline = transformers.pipeline("text-generation", model="pfnet/plamo-2-8b", trust_remote_code=True)
print(pipeline("The future of artificial intelligence technology is ", max_new_tokens=32))
```
Advanced Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("pfnet/plamo-2-8b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("pfnet/plamo-2-8b", trust_remote_code=True)

text = "これからの人工知能技術は"
input_ids = tokenizer(text, return_tensors="pt").input_ids
generated_tokens = model.generate(
    inputs=input_ids,
    max_new_tokens=32,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=1.0,
)[0]
generated_text = tokenizer.decode(generated_tokens)
print(generated_text)
```
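Since `generate` returns the prompt tokens followed by the new tokens, decoding the full sequence repeats the prompt. Continuing from the example above, the snippet below decodes only the newly generated part; this is a common transformers pattern rather than something the model card prescribes.

```python
# Continuation of the example above (not from the model card): decode only the
# tokens generated after the prompt and drop any special tokens.
new_tokens = generated_tokens[input_ids.shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```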
📚 Documentation
Model Description
PLaMo 2 8B is an 8-billion-parameter model pre-trained on English and Japanese datasets, developed by Preferred Elements, Inc. It uses a hybrid architecture similar to Samba, which combines Mamba (a selective State Space Model) with sliding window attention.
For commercial users
Please check the PLaMo community license and contact us via the following form if you wish to use the model for commercial purposes:
- (EN/JA) https://forms.gle/mTL8tBLrMYXKNZD56
Model Details
| Property | Details |
|----------|---------|
| Model Size | 8B |
| Trained Tokens | 6T tokens |
| Developed by | Preferred Elements, Inc. |
| Model Type | Causal decoder-only |
| Language(s) | English, Japanese |
| License | PLaMo community license |
Training Dataset
We trained PLaMo 2 8B in two phases, phase 1 with 5.25T tokens and phase 2 with 0.75T tokens. The percentage of datasets in each phase is shown in the following table.
| Dataset | 5.25T (phase 1) | 0.75T (phase 2) | Tokens |
|---------|-----------------|-----------------|--------|
| English | 45% | 35% | 2.625T |
| Japanese | 30% | 40% | 1.875T |
| Coding | 15% | 15% | 0.9T |
| Other | 10% | 10% | 0.6T |
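The Tokens column is the weighted sum of the two phase sizes; for example, English is 45% of 5.25T plus 35% of 0.75T = 2.625T. The short snippet below reproduces that arithmetic for all four rows (the percentages and phase sizes come from the table; the code itself is only illustrative).

```python
# Illustrative check of the Tokens column above (variable names are my own):
# tokens = phase-1 share * 5.25T + phase-2 share * 0.75T.
PHASE1_T, PHASE2_T = 5.25, 0.75  # trillions of tokens per phase
shares = {
    "English":  (0.45, 0.35),
    "Japanese": (0.30, 0.40),
    "Coding":   (0.15, 0.15),
    "Other":    (0.10, 0.10),
}
for name, (p1, p2) in shares.items():
    total = p1 * PHASE1_T + p2 * PHASE2_T
    print(f"{name}: {total:.3f}T tokens")
# English: 2.625T, Japanese: 1.875T, Coding: 0.900T, Other: 0.600T
```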
Tokenizer
The PLaMo 2 8B tokenizer is optimized with numba, a JIT compiler for numerical functions. The tokenizer is trained on a subset of the datasets used for model pre-training.
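As a small illustration (an assumption about usage, not part of the model card), the tokenizer can be loaded on its own through `AutoTokenizer` with `trust_remote_code=True` and inspected on mixed Japanese/English text:

```python
# Illustrative only: load the PLaMo 2 tokenizer by itself and inspect how it
# segments mixed Japanese/English text. The example string is arbitrary.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("pfnet/plamo-2-8b", trust_remote_code=True)
ids = tokenizer("これからの人工知能技術は AI technology")["input_ids"]
print(ids)
print(tokenizer.convert_ids_to_tokens(ids))
```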
Tech Blog
- (JA) https://tech.preferred.jp/ja/blog/plamo-2/
- (JA) https://tech.preferred.jp/ja/blog/plamo-2-8b/
- (JA) https://tech.preferred.jp/ja/blog/plamo-2-tokenizer/
Bias, Risks, and Limitations
PLaMo 2 8B is a new technology that carries risks with use. Testing conducted to date has been in English and Japanese, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, PLaMo 2 8B's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased, or otherwise objectionable responses to user prompts. Therefore, before deploying any applications of PLaMo 2 8B, developers should perform safety testing and tuning tailored to their specific applications of the model.
Acknowledgement
This model was trained under the "Research and Development Project of the Enhanced Infrastructures for Post 5G Information and Communication System" (JPNP20017), subsidized by the New Energy and Industrial Technology Development Organization (NEDO).
AI policies for Preferred Networks, Inc. group
- (EN) https://www.preferred.jp/en/company/aipolicy/
- (JA) https://www.preferred.jp/ja/company/aipolicy/
📄 License
PLaMo 2 8B is released under the PLaMo community license. To download PLaMo 2 8B, you have to agree to our license. For non-commercial use, please contact us via this form.
PLaMo Community License Agreement
(English version is under construction. We apologize for the inconvenience.)
The PLaMo Community License Agreement defines the terms of the license for using the large-scale language foundation model PLaMo and its derivatives provided by Preferred Networks, Inc., as well as the matters that users must comply with. This agreement applies to users' use of PLaMo and its derivatives, and by agreeing to this agreement or using this model, users are bound by this agreement.
NOTE: This model has NOT been instruction-tuned for chat dialog or other downstream tasks.