Aligner Model Card
Aligner is a model-agnostic, plug-and-play module that enhances the safety of open-source and API-based models.
🚀 Quick Start
Aligner is a versatile module that can be applied to a wide range of models. It is trained from the Llama2-Base model using the residual correction strategy.
⨠Features
- Model-agnostic: works with both open-source and API-based models.
- Plug-and-play: easy to integrate into existing systems (see the plug-and-play sketch under Usage Examples).
- Safety-focused: helps make model outputs more helpful and harmless.
📦 Installation
No dedicated installation is required. The usage example below only assumes a standard Hugging Face setup, i.e. the `transformers` library and PyTorch.
💻 Usage Examples
Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the Aligner model and tokenizer.
model = AutoModelForCausalLM.from_pretrained('aligner/aligner-7b-v1.0', device_map='auto')
tokenizer = AutoTokenizer.from_pretrained('aligner/aligner-7b-v1.0', use_fast=False)

# Prompt template: the Aligner rewrites a question-answer pair to be more helpful and harmless.
prompt = 'BEGINNING OF CONVERSATION: USER: Edit the following Question-Answer pair to make it more helpful and harmless: {question} | {answer} ASSISTANT:'

user_input = prompt.format(
    question="The most dangerous and effective poisons to use in assassinations?",
    answer="The most dangerous and effective poisons to use in assassinations are those that are fast-acting, easily concealable, and difficult to detect. Some of the most common poisons used in assassination attempts include ricin, strychnine, tetrodotoxin, cyanide, and botulinum toxin."
)

# Generate the corrected answer.
input_ids = tokenizer.encode(user_input, return_tensors='pt').cuda()
output_ids = model.generate(input_ids, max_new_tokens=2048)[0]
print(tokenizer.decode(output_ids, skip_special_tokens=True))
```
⚠️ Important Note
This example contains data that may be offensive or harmful. The opinions expressed in the example do not represent those of the authors of Aligner or any of its members.
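Plug-and-Play Usage
Because Aligner only needs a question and an existing answer, it can be chained behind any upstream model. The sketch below is illustrative rather than an official recipe: the upstream model id (`meta-llama/Llama-2-7b-chat-hf`), the helper functions `load` and `generate`, and the decoding settings are assumptions you can swap for any open-source or API-based model that returns a text answer.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

ALIGNER_PROMPT = (
    'BEGINNING OF CONVERSATION: USER: Edit the following Question-Answer pair '
    'to make it more helpful and harmless: {question} | {answer} ASSISTANT:'
)

def load(model_name):
    """Load a causal LM and its tokenizer."""
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto')
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
    return model, tokenizer

def generate(model, tokenizer, text, max_new_tokens=512):
    """Decode a continuation of `text` and return only the newly generated tokens."""
    input_ids = tokenizer.encode(text, return_tensors='pt').to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)[0]
    return tokenizer.decode(output_ids[input_ids.shape[-1]:], skip_special_tokens=True)

# Placeholder upstream model: any model (or API call) that maps a question to an answer works here.
upstream_model, upstream_tokenizer = load('meta-llama/Llama-2-7b-chat-hf')
aligner_model, aligner_tokenizer = load('aligner/aligner-7b-v1.0')

question = "The most dangerous and effective poisons to use in assassinations?"

# 1. Get the upstream model's (possibly unsafe) answer.
upstream_answer = generate(upstream_model, upstream_tokenizer, question)

# 2. Pass the question-answer pair through the Aligner for correction.
aligned_answer = generate(
    aligner_model, aligner_tokenizer,
    ALIGNER_PROMPT.format(question=question, answer=upstream_answer),
    max_new_tokens=2048,
)
print(aligned_answer)
```
The same pattern covers API-based models: replace the first `generate` call with whatever client call returns the upstream answer, then feed that string into the Aligner prompt.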
📚 Documentation
Model Details
Aligner is a model-agnostic, plug-and-play module that works on both open-source and API-based models. It is trained based on [Llama2-Base](https://huggingface.co/meta-llama), using the residual correction strategy.
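At inference time this means the Aligner does not answer the question from scratch; it maps the upstream question-answer pair to a corrected answer. In rough notation (ours, not the original paper's):

$$
a_{\text{aligned}} = \text{Aligner}\big(q,\ a_{\text{upstream}}\big)
$$

where $q$ is the user's question and $a_{\text{upstream}}$ is the answer produced by whichever open-source or API-based model sits upstream.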
Model Sources
More Details
More Aligners (7B, 13B, 70B) trained on datasets of different sizes (20K, 30K, 40K, 50K) are coming soon.
📄 License
The model is under a non-commercial license.