Qwen3-30B-A7.5B-24-Grand-Brainstorm Open Source Model - Free Deployment to Assist in In

Qwen3 30B A7.5B 24 Grand Brainstorm

Developed by DavidAU

A fine-tuned version based on the Qwen3-30B-A3B Mixture of Experts model, increasing the number of active experts from 8 to 24, suitable for complex tasks requiring deep reasoning

Large Language Model

Transformers

#24 Expert Deep Reasoning #32k Long Context #Mixture of Experts Architecture

Downloads 55

Release Time : 5/4/2025

Model Overview

This is an enhanced Mixture of Experts model that improves complex reasoning capabilities by increasing the number of active experts (24), making it ideal for scenarios requiring in-depth thinking and analysis.

Model Features

Enhanced Expert System

Utilizes 24 experts (originally 8), activating 7.5B out of 30B parameters, significantly improving complex task handling capabilities

Long Context Support

Supports 32k context + 8k output, totaling 40k tokens for long-text processing

Flexible Quantization Support

Supports multiple quantization formats including GGUF, GPTQ, EXL2, AWQ, HQQ, and can run on CPU/GPU

Deep Reasoning Capability

Particularly suitable for complex tasks requiring long chain-of-thought and systematic reasoning

Model Capabilities

Complex Text Generation

Deep Reasoning

Long Text Processing

Multi-turn Dialogue

Use Cases

Scientific Research

Climate Solution Analysis

Analyzing methods to reduce global temperature using space radiation cooling effects

Can generate 20+ innovative solutions

Creative Writing

Sci-Fi Story Creation

Generating sci-fi short stories exploring themes of human emotions and survival

Can create complete stories with deep emotional descriptions

Problem Solving

Innovative Thinking Stimulation

Proposing 'out-of-the-box' solutions for specific problems

Can systematically generate multiple innovative ideas

🚀 Qwen3-30B-A7.5B-24-Grand-Brainstorm

This repository contains a finetuned version of the Qwen model with 24 experts, suitable for complex use cases.

🚀 Quick Start

This repo contains the full precision source code in "safe tensors" format, which can be used to generate GGUFs, GPTQ, EXL2, AWQ, HQQ and other formats. The source code can also be used directly.

✨ Features

Expert Configuration: This is a simple "finetune" of the Qwen's "Qwen 30B - A3B" (MOE) model. It sets the number of experts in use from 8 to 24 (out of 128 experts), using 7.5B (of 30B) parameters instead of 3B (of 30B) parameters.
Context Size: It has a context size of 32K + 8K for output (40k total).
Template Support: It supports the use of Jinja Template or CHATML template.

📚 Documentation

Model Usage

Regular vs. Complex Use Cases: Regular or simpler use cases may benefit from using the normal (8 experts), the "12 cooks" (12 experts), 16 Extreme (32k or 128k context) or "High - Speed" (4 experts) version(s). Using 24 experts instead of the default 8 will slow down token/second speeds by about 2/3. This version with 24 experts may be overkill for some use cases.
Temperature Suggestion: For some use cases, higher temperatures (1+, 2+) are suggested.
Model Comparison: It is recommended to try the 4, 8 (reg), 12, and 16 versions as they will be faster in terms of tokens per second and may better meet your use cases.

Important Notes

Quantization and Hardware: Due to the unique nature (MOE, Size, Activated experts, size of experts) of this model, GGUF quants can be run on the CPU, GPU or with GPU part "off - load", right up to full precision.
Imatrix Difficulty: This model is difficult to Imatrix. You need a much larger imatrix file / multi - language / multi - content (i.e., code/text) to imatrix it.
GPU Speed Advantage: GPU speeds will be 4x - 8x or higher than CPU - only speeds. And this model will also be very fast relative to other "30B" models (Token per second speed equal roughly to 7.5B "normal" model speeds).

Example Generation

Example Generation #1

Settings: Temp 1.2, rep pen 1.06, rep pen range 64, topk 100, topp.95, minp.05
Quant and Hardware: Q4KS [non - imatrix], CPU only (windows 11), LMSTUDIO.
Speed: 8 T/S (no GPU offload)

Prompt: Explain ways to use the "night" time cooling of radiant energy into space to reduce global temperatures.

Thought Process: Thought for 6 minutes 18 seconds. The model considered various ways to enhance the Earth's ability to emit heat into space during nighttime to reduce global temperatures, including enhancing surface emissivity, reducing greenhouse gases, atmospheric engineering, using space - based cooling, etc.

Possible Ways:

Enhancing emissivity of surfaces on Earth. For example, using materials with high emissivity (like black body) which can radiate more heat at night. If the surface is designed to emit more energy, then it cools down faster. This could reduce local temperatures and maybe contribute to global cooling.
Reducing greenhouse gases in the atmosphere. But that's not specifically nighttime, but overall reduction. However, during nighttime, when there is no sunlight, the greenhouse gases might have a different effect.
Using atmospheric engineering to create conditions that allow more heat to escape during night time. For example, maybe using reflective particles or altering cloud cover to let more radiation escape.
Creating structures or systems that can actively cool by radiating heat during nighttime. Maybe like thermal emissive surfaces, which are designed to maximize the emission of heat into space.
Using space - based cooling. Maybe if we can somehow reflect or redirect heat radiation into space more effectively. Perhaps using satellites or mirrors.

💡 Usage Tip

I suggest trying the 4, 8 (reg), 12, and 16 versions as these will be faster - token / second, and may meet your use case(s) better. Also, for some use cases, higher temps (1+, 2+) are recommended.

⚠️ Important Note

This model is difficult to Imatrix. You need a much larger imatrix file / multi - language / multi - content (i.e., code/text) to imatrix it. And make sure to review the document [https://huggingface.co/DavidAU/Maximizing - Model - Performance - All - Quants - Types - And - Full - Precision - by - Samplers_Parameters] for critical parameter, sampler and advance sampler settings.

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご