1

Qwen3.5-35B-A3B-abliterated-v2-MAX

Qwen3.5-35B-A3B-abliterated-v2-MAX is a refined and more reliable abliterated evolution built on top of Qwen/Qwen3.5-35B-A3B. This version improves upon previous iterations with enhanced refusal direction suppression and more stable alignment behavior, while preserving the strong reasoning and instruction-following capabilities of the base model. The result is a powerful 35B parameter Mixture-of-Experts language model optimized for detailed responses and consistent instruction adherence.

This model is developed for research and learning purposes only. As an abliterated model with reduced internal refusal mechanisms, it may generate sensitive or unrestricted outputs. Users are solely responsible for ensuring safe, ethical, and lawful usage. The authors and hosting platform disclaim any liability for generated content.

Compression for the Model

Qwen3.5-35B-A3B-abliterated-v2-MAX

Key Highlights

  • Improved Abliteration (v2): More reliable and stable reduction of refusal behaviors compared to previous versions.
  • Advanced Refusal Direction Analysis: Identifies and suppresses refusal-related activations within the model’s latent space.
  • 35B MoE Architecture (A3B): Built on Qwen3.5-35B-A3B, leveraging Mixture-of-Experts for efficiency and scalability.
  • Stronger Instruction Adherence: Better consistency in following complex prompts with minimal unnecessary refusals.
  • Enhanced Output Stability: Reduced erratic behavior in edge-case or adversarial prompts.

Quick Start with Transformers

pip install transformers==5.5.0 (or) git+https://github.com/huggingface/transformers.git
from transformers import Qwen3_5MoeForConditionalGeneration, AutoProcessor
import torch

model = Qwen3_5MoeForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3.5-35B-A3B-abliterated-v2-MAX",
    torch_dtype="auto",
    device_map="auto"
)

processor = AutoProcessor.from_pretrained(
    "prithivMLmods/Qwen3.5-35B-A3B-abliterated-v2-MAX"
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Explain how transformer models work in simple terms."}
        ],
    }
]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = processor(
    text=[text],
    padding=True,
    return_tensors="pt"
).to("cuda")

generated_ids = model.generate(**inputs, max_new_tokens=256)

generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]

output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)

print(output_text)

Intended Use

  • Alignment & Refusal Research: Studying suppression and modification of refusal behaviors.
  • Red-Teaming Experiments: Evaluating robustness across adversarial and edge-case prompts.
  • High-Capability Local Deployment: Running advanced models on high-memory or multi-GPU setups.
  • Research Prototyping: Experimentation with large-scale transformer architectures.

Limitations & Risks

Important Note: This is an abliterated model with intentionally reduced safeguards.

  • Sensitive Output Risk: May generate controversial or unrestricted responses.
  • User Responsibility: Must be used within ethical and legal boundaries.
  • High Compute Demand: Requires substantial GPU memory or optimized inference techniques.

Dataset & Acknowledgements

Downloads last month
-
Safetensors
Model size
35B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for prithivMLmods/Qwen3.5-35B-A3B-abliterated-v2-MAX

Quantized
(208)
this model

Collection including prithivMLmods/Qwen3.5-35B-A3B-abliterated-v2-MAX