🚀 Optimized ChatBot for Anime Roleplay
This project is an optimized chatbot designed for anime roleplay. It uses the Mistral model to generate responses in anime-themed conversations. The chatbot can handle long-term conversations and is optimized for GPU usage.
🚀 Quick Start
To start using the chatbot, make sure a GPU is available; running the model requires GPU support.
Prerequisites
- Python environment
- GPU with CUDA support
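To confirm the GPU is visible to PyTorch before launching, you can run a quick check like the following (not part of the project code):

import torch

if torch.cuda.is_available():
    print(f"CUDA GPU detected: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA GPU detected; the chatbot cannot run.")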
Installation
The chatbot needs the following third-party packages in your Python environment:
- transformers
- torch
- bitsandbytes (for quantization)

The other modules it imports (logging, queue, threading, time, traceback, os, gc) ship with the Python standard library and require no installation.
Running the Chatbot
The driver script below assumes the OptimizedChatBot class defined in the project source; it is the entry point that wires everything together:
import os, torch, gc, threading, time, traceback
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, TextIteratorStreamer
from queue import Queue, Empty
import logging

# Silence tokenizer/log noise and tune the CUDA allocator and matmul behavior.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
torch.backends.cudnn.benchmark = True
torch.backends.cuda.matmul.allow_tf32 = True
torch.set_float32_matmul_precision("high")
logging.getLogger("transformers").setLevel(logging.ERROR)

# Configuration constants.
BOT_NAME = "Senko"
PROMPT_FILE = "instructions_prompt.txt"
MODEL_ID = "senko-sleepy-fox/mistral-anime-ai"
RESPONSE_TIMEOUT = 300      # seconds to wait for a reply
MAX_CONTEXT_LENGTH = 10240  # tokens of context kept in the prompt
MAX_NEW_TOKENS = 8192       # upper bound on generated tokens
MEMORY_SIZE = 20            # conversation turns kept in memory

def main():
    bot = OptimizedChatBot()
    try:
        print("Initializing chatbot...")
        bot.load_system_prompt(BOT_NAME)
        bot.load_model()
        print(f"\n{'='*50}")
        print(f"{BOT_NAME} is ready! (Unlimited response length)")
        print("Commands:")
        print("  'exit'   - Quit the program")
        print("  'clear'  - Reset conversation memory")
        print("  'memory' - Show memory usage")
        print("  'status' - Show bot status")
        print(f"{'='*50}\n")
        conversation_count = 0
        while True:
            try:
                user_input = input("You: ").strip()
                if user_input.lower() == "exit":
                    print("Goodbye! 👋")
                    break
                elif user_input.lower() == "clear":
                    bot.memory = []
                    print("✅ Conversation memory cleared.")
                    continue
                elif user_input.lower() == "memory":
                    print(f"📊 {bot.get_memory_info()}")
                    continue
                elif user_input.lower() == "status":
                    status = "🟢 Ready" if not bot.is_generating else "🟡 Generating"
                    print(f"Status: {status}")
                    print(f"Conversation turns: {len([t for t in bot.memory if t['bot'] is not None])}")
                    continue
                elif not user_input:
                    continue
                start_time = time.time()
                prompt = bot.prepare_prompt(user_input)
                response = bot.generate_reply_with_timeout(prompt)
                if response:
                    # The reply is streamed to the console during generation,
                    # so only the elapsed time is printed here.
                    response_time = time.time() - start_time
                    print(f"[⏱️ {response_time:.2f}s]")
                else:
                    print("❌ Failed to generate response. Try again or type 'clear' to reset.")
                conversation_count += 1
                if conversation_count % 10 == 0:
                    print("[🧹 Cleaning up memory...]")
                    bot.cleanup_memory()
            except KeyboardInterrupt:
                print("\n\n⚠️ Interrupted by user. Exiting gracefully...")
                break
            except Exception as e:
                print(f"\n❌ Conversation error: {e}")
                traceback.print_exc()
                print("Continuing... (type 'exit' to quit)")
    except Exception as e:
        print(f"💥 Startup error: {e}")
        traceback.print_exc()
    finally:
        print("\n🧹 Performing final cleanup...")
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            torch.cuda.synchronize()
        gc.collect()
        print("✅ Cleanup completed. Goodbye!")

if __name__ == "__main__":
    torch.cuda.empty_cache()  # no-op when CUDA is uninitialized
    gc.collect()
    main()
✨ Features
- Anime-Themed Roleplay: The chatbot roleplays as an anime character, providing emotionally supportive responses.
- Long-Term Memory: It handles extended conversations by maintaining a capped conversation history (a sketch of the idea follows this list).
- GPU Optimization: Optimized for GPU usage, with quantization support to reduce memory consumption.
- Timeout Handling: A timeout on response generation prevents long-running generations from hanging the chat loop.
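The memory cap and periodic cleanup live inside OptimizedChatBot, whose source is not shown here. A minimal sketch of the idea, assuming the MEMORY_SIZE constant from the script (the project's actual cleanup_memory() may differ):

import gc
import torch

MEMORY_SIZE = 20  # value used in the script above

def cleanup_memory(bot):
    # Keep only the most recent turns so prompts stay inside the context window.
    bot.memory = bot.memory[-MEMORY_SIZE:]
    # Release cached GPU allocations and collect garbage between turns.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    gc.collect()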
📦 Installation
The installation mainly involves setting up the Python environment and installing the required libraries. You can use pip to install the necessary packages:
pip install transformers torch bitsandbytes
💻 Usage Examples
Basic Usage
bot = OptimizedChatBot()
bot.load_system_prompt(BOT_NAME)
bot.load_model()
user_input = "Hello, Senko!"
prompt = bot.prepare_prompt(user_input)
response = bot.generate_reply_with_timeout(prompt)
if response:
    print(response)
Advanced Usage
bot = OptimizedChatBot()
bot.load_system_prompt(BOT_NAME)
bot.load_model()
conversation_count = 0
while True:
    user_input = input("You: ").strip()
    if user_input.lower() == "exit":
        break
    elif user_input.lower() == "clear":
        bot.memory = []
        continue
    elif user_input.lower() == "memory":
        print(bot.get_memory_info())
        continue
    elif user_input.lower() == "status":
        status = "Ready" if not bot.is_generating else "Generating"
        print(f"Status: {status}")
        print(f"Conversation turns: {len([t for t in bot.memory if t['bot'] is not None])}")
        continue
    elif not user_input:
        continue
    prompt = bot.prepare_prompt(user_input)
    response = bot.generate_reply_with_timeout(prompt)
    if response:
        print(response)
    conversation_count += 1
    if conversation_count % 10 == 0:
        bot.cleanup_memory()
🔧 Technical Details
- Model Loading: The chatbot uses AutoTokenizer and AutoModelForCausalLM from the transformers library to load the model and tokenizer. It supports both 4-bit and 8-bit quantization for GPU usage.
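A minimal sketch of 4-bit quantized loading with bitsandbytes, reusing the MODEL_ID from the script above (the project's actual load_model() may choose its options differently):

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

MODEL_ID = "senko-sleepy-fox/mistral-anime-ai"

# NF4 4-bit quantization stores weights in 4 bits and computes in fp16,
# cutting GPU memory use substantially.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU(s)
)

Switching to 8-bit is a matter of passing load_in_8bit=True instead of the 4-bit options.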
- Prompt Preparation: The chatbot maintains a conversation history in memory and builds each prompt from the system prompt, that history, and the latest user input.
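Illustrative only, since prepare_prompt() lives in the project source: the general pattern is to concatenate the system prompt, the stored turns, and the new message. The turn dictionaries' 'bot' key matches the status command in the script; the 'user' key and speaker labels here are assumptions.

def prepare_prompt(system_prompt, memory, user_input, bot_name="Senko"):
    lines = [system_prompt]
    for turn in memory:
        lines.append(f"User: {turn['user']}")
        if turn["bot"] is not None:  # unanswered turns have bot=None
            lines.append(f"{bot_name}: {turn['bot']}")
    # End with the new message and the bot's name to cue the model's reply.
    lines.append(f"User: {user_input}")
    lines.append(f"{bot_name}:")
    return "\n".join(lines)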
- Response Generation: Responses are produced with the model's generate method through a streaming mechanism, with a timeout to guard against long-running generations.
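A simplified sketch of streamed generation with a timeout, using the same TextIteratorStreamer and queue imports as the script (the actual generate_reply_with_timeout() may differ):

import threading
from queue import Empty
from transformers import TextIteratorStreamer

def generate_with_timeout(model, tokenizer, prompt, timeout=300, max_new_tokens=8192):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # The streamer raises queue.Empty if no new token arrives within `timeout`.
    streamer = TextIteratorStreamer(
        tokenizer, skip_prompt=True, skip_special_tokens=True, timeout=timeout
    )
    # Run generate() on a worker thread so this thread can consume the stream.
    thread = threading.Thread(
        target=model.generate,
        kwargs=dict(**inputs, streamer=streamer, max_new_tokens=max_new_tokens),
        daemon=True,
    )
    thread.start()
    pieces = []
    try:
        for chunk in streamer:  # decoded text, chunk by chunk
            print(chunk, end="", flush=True)
            pieces.append(chunk)
    except Empty:
        print("\n[Timed out waiting for the model]")
        return None
    thread.join()
    print()
    return "".join(pieces)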
📄 License
This project is licensed under the Apache-2.0 license.