🚀 MobiLlama-05B
MobiLlama-05B is a Small Language Model (SLM) with 0.5 billion parameters. It is designed for resource-constrained devices, offering strong performance with a reduced compute and memory footprint.
🚀 Quick Start
To use MobiLlama-05B, you can follow the code example below:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer (trust_remote_code is required for the custom model class)
model = AutoModelForCausalLM.from_pretrained("MBZUAI/MobiLlama-05B", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("MBZUAI/MobiLlama-05B", trust_remote_code=True)
model.to('cuda')

# Tokenize a prompt and generate a continuation
text = "I was walking towards the river when "
input_ids = tokenizer(text, return_tensors="pt").to('cuda').input_ids
outputs = model.generate(input_ids, max_length=1000, repetition_penalty=1.2, pad_token_id=tokenizer.eos_token_id)

# Decode only the newly generated tokens (excluding the prompt)
print(tokenizer.batch_decode(outputs[:, input_ids.shape[1]:-1])[0].strip())
```
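The snippet above assumes a CUDA GPU is available. The same API also runs on CPU (more slowly); below is a minimal device-agnostic sketch, assuming only `torch` and `transformers` are installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pick the GPU if present, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained("MBZUAI/MobiLlama-05B", trust_remote_code=True).to(device)
tokenizer = AutoTokenizer.from_pretrained("MBZUAI/MobiLlama-05B", trust_remote_code=True)

inputs = tokenizer("I was walking towards the river when ", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_length=1000, repetition_penalty=1.2,
                         pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```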
✨ Features
- Resource-efficient: Designed for resource-constrained devices, reducing pre-training and deployment costs.
- Transparent: The complete training data pipeline, training code, model weights, over 300 checkpoints, and evaluation code are available on GitHub.
- Based on LLaMA-7B: Built on the architecture design of LLaMA-7B.
📚 Documentation
Model Summary
In the development of Large Language Models (LLMs), the prevailing trend has been "the bigger the better". Large models, however, are poorly suited to on-device use, where energy efficiency, a low memory footprint, and fast responses are essential. MobiLlama explores the 'less is more' paradigm by designing an accurate and fully transparent, open-source 0.5 billion (0.5B) parameter Small Language Model (SLM). MobiLlama starts from a larger model design and applies a careful parameter-sharing scheme to reduce both pre-training and deployment cost. All related resources are available on GitHub.
arXiv paper: https://arxiv.org/abs/2402.16840
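The parameter-sharing scheme mentioned in the summary reuses a single feed-forward (FFN) block across all transformer layers instead of giving each layer its own FFN. The sketch below is a minimal, illustrative re-creation of that idea in plain PyTorch, not the released training code; the `SharedFFNBlock` class, the `LayerNorm` layers, and the plain `nn.Sequential` FFN are simplifications of the actual LLaMA-style blocks.

```python
import torch.nn as nn

class SharedFFNBlock(nn.Module):
    """Transformer block that reuses one shared feed-forward network (FFN)."""
    def __init__(self, hidden_size, num_heads, shared_ffn):
        super().__init__()
        self.norm1 = nn.LayerNorm(hidden_size)   # illustrative; MobiLlama uses RMSNorm
        self.attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(hidden_size)
        self.ffn = shared_ffn                    # the same module object in every block

    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        return x + self.ffn(self.norm2(x))

hidden, intermediate, heads, layers = 2048, 5632, 32, 22
shared_ffn = nn.Sequential(                      # one FFN, reused by all 22 blocks
    nn.Linear(hidden, intermediate), nn.SiLU(), nn.Linear(intermediate, hidden)
)
blocks = nn.ModuleList(SharedFFNBlock(hidden, heads, shared_ffn) for _ in range(layers))
```

Because every block points at the same FFN module, the FFN weights are stored (and updated) only once, which is where the bulk of the parameter savings comes from.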
Model Description
Training Data Mix

| Subset | Tokens (Billion) |
|---|---|
| Arxiv | 30.00 |
| Book | 28.86 |
| C4 | 197.67 |
| RefinedWeb | 665.01 |
| StarCoder | 291.92 |
| StackExchange | 21.75 |
| Wikipedia | 23.90 |
| Total | 1259.13 |
Hyperparameters
| Hyperparameter | Value |
|---|---|
| Total Parameters | 0.52B |
| Hidden Size | 2048 |
| Intermediate Size (MLPs) | 5632 |
| Number of Attention Heads | 32 |
| Number of Hidden Layers | 22 |
| RMSNorm ε | 1e-5 |
| Max Seq Length | 2048 |
| Vocab Size | 32000 |
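For orientation, these hyperparameters map directly onto a LLaMA-style `transformers` config. The sketch below is illustrative only: the released checkpoint ships its own config class via `trust_remote_code`, so you do not need to build a config by hand to load the model.

```python
from transformers import LlamaConfig

# Illustrative mapping of the table above onto a LLaMA-style config.
config = LlamaConfig(
    hidden_size=2048,               # Hidden Size
    intermediate_size=5632,         # Intermediate Size (MLPs)
    num_attention_heads=32,         # Number of Attention Heads
    num_hidden_layers=22,           # Number of Hidden Layers
    rms_norm_eps=1e-5,              # RMSNorm ε
    max_position_embeddings=2048,   # Max Seq Length
    vocab_size=32000,               # Vocab Size
)
print(config)
```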
Evaluation
| Evaluation Benchmark | MobiLlama-0.5B | MobiLlama-0.8B | MobiLlama-1.2B |
|---|---|---|---|
| HellaSwag | 52.52 | 54.09 | 62.99 |
| MMLU | 26.45 | 26.92 | 24.23 |
| Arc Challenge | 29.52 | 30.20 | 34.55 |
| TruthfulQA | 38.05 | 38.48 | 35.57 |
| CrowsPairs | 64.03 | 64.82 | 68.12 |
| PIQA | 72.03 | 73.17 | 75.29 |
| Race | 33.68 | 33.37 | 35.31 |
| SIQA | 40.22 | 41.60 | 41.96 |
| Winogrande | 57.53 | 57.45 | 61.08 |
Citation
BibTeX:

```bibtex
@misc{thawakar2024mobillama,
      title={MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT},
      author={Omkar Thawakar and Ashmal Vayani and Salman Khan and Hisham Cholakkal and Rao Muhammad Anwer and Michael Felsberg and Timothy Baldwin and Eric P. Xing and Fahad Shahbaz Khan},
      year={2024},
      eprint={2402.16840},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
📄 License
The model is licensed under Apache 2.0.