🚀 OpenCALM-Large
OpenCALM is a suite of decoder-only language models pre-trained on Japanese datasets, developed by CyberAgent, Inc.
🚀 Quick Start
OpenCALM-Large is a decoder-only language model pre-trained on Japanese datasets. It offers high-quality language processing capabilities for Japanese text.
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in half precision, placing weights automatically on available devices
model = AutoModelForCausalLM.from_pretrained(
    "cyberagent/open-calm-large", device_map="auto", torch_dtype=torch.float16
)
tokenizer = AutoTokenizer.from_pretrained("cyberagent/open-calm-large")

# Prompt (ja): "Through AI, our lives will..."
inputs = tokenizer("AIによって私達の暮らしは、", return_tensors="pt").to(model.device)
with torch.no_grad():
    tokens = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.05,
        pad_token_id=tokenizer.pad_token_id,
    )

output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
```
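The sampling arguments above (`temperature`, `top_p`) can be illustrated with a small standalone sketch of temperature scaling followed by nucleus (top-p) filtering over a toy logit vector. The numbers are illustrative only and are not taken from the model; this is a simplified view of what `generate` does internally, not the library's actual implementation.

```python
import math

def top_p_filter(logits, temperature=0.7, top_p=0.9):
    """Apply temperature scaling, then keep only the smallest set of
    tokens whose cumulative probability reaches top_p."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort token indices by probability, most probable first
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Zero out the discarded tail and renormalise the kept mass
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]

dist = top_p_filter([3.0, 2.0, 1.0, 0.1])
print(dist)  # low-probability tail tokens are zeroed out
```

Lower `temperature` sharpens the distribution before filtering, and lower `top_p` discards more of the tail, so both push generation toward more conservative continuations.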
📚 Documentation
Model Details
| Property | Details |
|---|---|
| Model Type | Transformer-based Language Model |
| Language | Japanese |
| Library | [GPT-NeoX](https://github.com/EleutherAI/gpt-neox) |
| Model | Params | Layers | Dim | Heads | Dev ppl |
|---|---|---|---|---|---|
| [cyberagent/open-calm-small](https://huggingface.co/cyberagent/open-calm-small) | 160M | 12 | 768 | 12 | 19.7 |
| [cyberagent/open-calm-medium](https://huggingface.co/cyberagent/open-calm-medium) | 400M | 24 | 1024 | 16 | 13.8 |
| [cyberagent/open-calm-large](https://huggingface.co/cyberagent/open-calm-large) | 830M | 24 | 1536 | 16 | 11.3 |
| [cyberagent/open-calm-1b](https://huggingface.co/cyberagent/open-calm-1b) | 1.4B | 24 | 2048 | 16 | 10.3 |
| [cyberagent/open-calm-3b](https://huggingface.co/cyberagent/open-calm-3b) | 2.7B | 32 | 2560 | 32 | 9.7 |
| [cyberagent/open-calm-7b](https://huggingface.co/cyberagent/open-calm-7b) | 6.8B | 32 | 4096 | 32 | 8.2 |
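As a rough sanity check, the parameter counts in the table can be approximated from the architecture columns alone using a generic decoder-only transformer estimate (about 12·d² weights per layer for attention and MLP blocks, plus the token embedding matrix). The vocabulary size of roughly 52,000 is an assumption, and GPT-NeoX models carry additional weights (biases, layer norms, an untied output head), so this is only a ballpark figure, not the exact count.

```python
def approx_params(layers, dim, vocab_size=52_000):
    """Rough decoder-only transformer parameter estimate:
    ~12 * dim^2 per layer (attention + MLP) plus token embeddings.
    vocab_size here is an assumed value, not taken from the model card."""
    return layers * 12 * dim ** 2 + vocab_size * dim

# open-calm-large row: 24 layers, dim 1536; the table lists 830M parameters
estimate = approx_params(24, 1536)
print(f"~{estimate / 1e6:.0f}M parameters")
```

The estimate lands in the same order of magnitude as the listed 830M; the remaining gap comes from the extra weights omitted by this simplified formula.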
Training Dataset
- Wikipedia (ja)
- Common Crawl (ja)
Author
Ryosuke Ishigami
Citations
```bibtex
@software{gpt-neox-library,
  title = {{GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch}},
  author = {Andonian, Alex and Anthony, Quentin and Biderman, Stella and Black, Sid and Gali, Preetham and Gao, Leo and Hallahan, Eric and Levy-Kramer, Josh and Leahy, Connor and Nestler, Lucas and Parker, Kip and Pieler, Michael and Purohit, Shivanshu and Songz, Tri and Phil, Wang and Weinbach, Samuel},
  url = {https://www.github.com/eleutherai/gpt-neox},
  doi = {10.5281/zenodo.5879544},
  month = {8},
  year = {2021},
  version = {0.0.1},
}
```
📄 License
OpenCALM is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)). When using this model, please provide appropriate credit to CyberAgent, Inc.
- Example (en): This model is a fine-tuned version of OpenCALM-XX developed by CyberAgent, Inc. The original model is released under the CC BY-SA 4.0 license, and this model is also released under the same CC BY-SA 4.0 license. For more information, please visit: https://creativecommons.org/licenses/by-sa/4.0/