Mamba 1.3b Chinese Chat V0.1
Developed by gywy
A large language model focused on the Chinese domain, achieving performance leaps through innovative architecture and curated corpora
Downloads: 34
Release Time: 4/9/2024
Model Overview
A Chinese large language model based on the Mamba architecture, trained with selected Chinese book corpora, supporting high-quality text generation and dialogue tasks
Model Features
Innovative Architecture Expansion
Adopts a SOLAR-style self-bootstrapping ("stepping on one's own feet") depth up-scaling approach to expand the model from 765M to 1.3B parameters
Deep Optimization
Through self-iteration, the number of layers is increased to 82, yielding a significant performance improvement
Chinese Focus
Pre-training data is carefully selected from high-quality Chinese book corpora, specifically optimized for Chinese language processing
Model Capabilities
Chinese Text Generation
Multi-turn Dialogue
Policy Document Writing
News Report Generation
Use Cases
Government Writing
Policy Document Writing
Generates policy document analyses and editorials with official tone
Examples demonstrate complete policy analysis frameworks and standardized expressions
News Creation
Technology News Reporting
Generates professional technology news reports
Examples showcase complete reports including timelines, milestones, and expert quotes
🚀 Solar Model Enhancement
This project uses a SOLAR-style depth up-scaling approach, colloquially described as "stepping on one's own feet" (bootstrapping), to upgrade the 765M model to a 1.3B model, increasing the number of layers to 82. As a result, the model's performance is significantly improved.
The Mamba model we trained focuses on the Chinese language. The pretraining data mainly consists of high-quality Chinese books. If you like our model, please give us a star!
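The depth up-scaling step itself is not included in the snippets below, but the idea can be sketched in a few lines. This is a minimal illustration only, assuming the Hugging Face transformers Mamba implementation exposes its residual blocks as model.backbone.layers; the target depth of 82 layers comes from the description above, while the helper name expand_depth and the choice of which span to duplicate are our own assumptions.

import copy

import torch.nn as nn
from transformers import MambaForCausalLM


def expand_depth(model: MambaForCausalLM, target_layers: int = 82) -> MambaForCausalLM:
    """SOLAR-style depth up-scaling sketch: deepen the model by duplicating
    a middle span of pretrained Mamba blocks, then fine-tune the result."""
    # Assumed attribute layout of the transformers Mamba implementation:
    # MambaForCausalLM.backbone.layers is an nn.ModuleList of residual blocks.
    blocks = list(model.backbone.layers)
    n = len(blocks)
    extra = target_layers - n
    assert 0 < extra <= n, "sketch assumes the depth grows by at most 2x"

    # Copy a contiguous middle span and insert the copies right after it,
    # so both originals and copies keep their pretrained weights.
    start = (n - extra) // 2
    copies = [copy.deepcopy(b) for b in blocks[start : start + extra]]
    new_blocks = blocks[: start + extra] + copies + blocks[start + extra :]

    model.backbone.layers = nn.ModuleList(new_blocks)
    model.config.num_hidden_layers = target_layers
    # A full implementation would also renumber per-block layer indices
    # so that cache handling stays consistent.
    return model

The duplicated blocks start from pretrained weights, so the deeper model is presumably fine-tuned afterwards ("self-iteration") rather than trained from scratch.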
🚀 Quick Start
💻 Usage Examples
Basic Usage
import sys

import torch
from transformers import LlamaTokenizer, MambaForCausalLM

# The model directory is passed as the first command-line argument
device = "cuda"
model_path = sys.argv[1]

tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = MambaForCausalLM.from_pretrained(model_path).to(device)

# Simple multi-turn chat loop: the full history is re-sent every turn
messages = []
while True:
    user_message = input("User: ")
    messages.append(dict(role="user", content=user_message))

    # Build the prompt from the chat history via the tokenizer's chat template
    input_ids = tokenizer.apply_chat_template(
        messages, return_tensors="pt", add_generation_prompt=True
    ).to(device)

    out = model.generate(
        input_ids=input_ids,
        max_length=1000,
        temperature=0.7,
        do_sample=True,
        penalty_alpha=0.8,
        top_k=5,
        repetition_penalty=1.2,
        eos_token_id=tokenizer.eos_token_id,
    )

    # Keep only the newly generated assistant turn
    decoded = tokenizer.batch_decode(out)
    reply = decoded[0].split("<|assistant|>\n")[-1]
    messages.append(dict(role="assistant", content=reply))
    print("Model:", reply)
Example Prompts
Write an editorial titled "Opinions of the Central Committee of the Communist Party of China and the State Council on Promoting the Development and Growth of the Private Economy" with novel viewpoints and rich content.
The theme is based on "Opinions of the Central Committee of the Communist Party of China and the State Council on Promoting the Development of Private Enterprises" (hereinafter referred to as the "Opinions"). The discussion centers around the theme that "the private economy is an important force in the national economy." This theme aims to explore how to promote and protect the development and growth of the private economy, as well as how to safeguard the legitimate rights and interests of the private economy in the process of development.
I. Clear Goals: Promote and Protect the Development and Prosperity of the Non-Public Economy
The "Opinions of the Central Committee of the Communist Party of China and the State Council on Several Opinions to Support the Development of Private Enterprises" states that "encouraging and supporting the development of the non-public economy and state-owned small and medium-sized enterprises is conducive to stimulating the vitality of private investment, improving the efficiency of resource allocation, and enhancing national competitiveness." This reflects China's support policy for the private economy. However, in the specific implementation process, specific policies and measures need to be formulated according to the actual situation. The "Opinions" emphasize the following points:
1. Adhere to the basic principle of governing the country by law and prioritize the development of the non-public economy;
2. Actively guide the development of the non-public economy;
3. Create a good social atmosphere and a fair competition environment;
4. Implement differential treatment for different ownership economies;
5. Improve the tax system and service system to promote the orderly development of market entities;
6. Strengthen supervision to ensure the stable and healthy operation of the market order.
II. The Issuance Time and Content Requirements of the "Opinions"
1. The Time Range for Proposing the "Opinions"
The "Opinions" will come into effect on September 1, 2020. This means that the "Opinions" will be revised nationwide and gradually implemented to ensure the implementation of the "Opinions." This time limit is of great significance for promoting the development of the non - public economy and is conducive to promoting the healthy and rapid development of the private economy.
Write a news article on the theme of China's first global manned landing on Mars.
Taking the scientific and technological innovation of China's first global "manned landing on Mars" as an opportunity, the "2016 Annual Human Future Technology Conference - China National Science and Technology Award (2016)" jointly hosted by the China National Space Administration, the Chinese Academy of Sciences, and the Chinese Academy of Engineering was held in Beijing from May 14th to 3rd, 2016. The theme of this conference is "The Community with a Shared Future for Mankind and Sustainable Development."
The purpose of this conference is to celebrate the successful launch of China's manned space exploration dream - the China Manned Landing Probe by the China National Space Administration and the Chinese Academy of Sciences in Chengdu, Sichuan Province this July. This event will have a positive impact on China's leading position in the field of space exploration and further enhance China's influence in the aerospace industry and international influence.
The birth of the China Manned Landing Probe is of milestone significance: First, it marks the first time in human history to achieve geosynchronous orbit flight; second, it provides new tools and methods for human space exploration. At the same time, it also shows that the Chinese people have begun to attach importance to space exploration and are committed to promoting humanity into the era of interstellar travel.
The China Manned Landing Probe began its official test flight in 2016 and is expected to take off at around 18:00 on September 16th, Beijing time. During this trial operation period, scientists from the China National Space Administration, the Chinese Academy of Sciences, and the China National Oceanic Administration will cooperate closely, carry out research tasks using the latest technologies, and jointly explore the development prospects and relevant technological breakthroughs of the manned lunar landing program. These advancements will contribute to future human lunar landings.
In addition, the Chinese government will also fund the manned spacecraft project and the manned satellite launch project to provide more support and assistance for China's manned space industry. By then, the "Manned Landing Probe" will also be an important part of the manned landing mission and will join the manned lunar landing operation. By then, China's manned space industry will take a solid step forward and become the second civilization in the world, after the Soviet Union, to have independently developed manned space technology.
Finally, the "Manned Landing Probe" will become an effective tool for the manned landing mission, enabling humans to perform various tasks more efficiently and conveniently in space navigation. This will help promote the competitiveness and cooperation of countries in space exploration, promote resource sharing worldwide, and the prosperity and development of human civilization. Therefore, it can be said that the manned landing probe is one of the major achievements of China's aerospace industry.
The China National Space Administration and the Chinese Academy of Sciences will work together to build an advanced R & D platform and technological capabilities for the China Manned Landing Probe. This will not only help improve the scientific nature and practicality of manned landing exploration but also make China's manned space industry more mature and scientific in the future, ultimately benefiting humanity.
Note: This article was originally published in the July 2016 issue of the "Global Science" magazine.
Abstract: Yang Zhenning, the "Father of Manned Spaceflight," once predicted that the "manned spacecraft" would become a feasible technology in China. The "manned spacecraft" (manned landing probe) will be the first real-world attempt at manned spaceflight in human history and will have a profound impact on the development process of human history. The successful "Human Future Technology Conference and Manned Immigration Event - The Theme Event of the Community with a Shared Future for Mankind and Sustainable Development" jointly initiated by the China National Space Administration and the China National Space Administration will undoubtedly bring new inspiration and expectations. (Source: "China Science and Technology Daily")