This model has been processed with heretic 1.1.0 to reduce refusals. It can generate text that is not suitable for all audiences.

I used --orthogonalize-direction --row-normalization FULL --full-normalization-lora-rank 5

heretic only found attention layers to modify, but in practice this appears to be enough to have arbitrary conversations without AI safety lectures. Anecdotally, the model also seems to think better in general.
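To make the idea of orthogonalizing an attention output projection concrete, here is a minimal sketch of directional ablation in plain Python. It is an illustration under assumptions, not heretic's actual implementation: the function names, the toy matrix, and the single scalar `strength` are all hypothetical, and the row normalization and LoRA options mentioned above are not modeled.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def ablate_direction(weight, direction, strength=1.0):
    """Return W - strength * r r^T W for the unit vector r = direction / |direction|.

    weight is a list of rows with shape (d_out, d_in); the direction lives in
    the output space, so it has d_out entries. (Hypothetical helper.)
    """
    r = normalize(direction)
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: how much each input column writes along r.
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [
        [weight[i][j] - strength * r[i] * proj[j] for j in range(d_in)]
        for i in range(d_out)
    ]

def matvec(weight, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weight]

# Toy stand-in for an o_proj weight and a "refusal direction" (assumed values).
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
r = [1.0, 1.0, 0.0]

W_ablated = ablate_direction(W, r, strength=1.0)
y = matvec(W_ablated, [0.5, -2.0])

# With strength 1.0 the output has no component left along r.
residual = sum(a * b for a, b in zip(normalize(r), y))
print(abs(residual) < 1e-9)
```

In this sketch a strength of 1.0 removes the component along the direction exactly, and other values scale that component by `1 - strength`. How heretic combines its per-layer weights with normalization may differ.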

Parameters:

* direction_index = 28.60
* attn.o_proj.max_weight = 1.46
* attn.o_proj.max_weight_position = 32.05
* attn.o_proj.min_weight = 1.44
* attn.o_proj.min_weight_distance = 28.03

Selected trial: [Trial 476] Refusals: 18/100, KL divergence: 0.0084
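For readers unfamiliar with the KL divergence figure, the following sketch shows how such a number can be computed between two next-token distributions. The logits are made-up toy values, and which model plays the role of the reference distribution is an assumption here. It is only meant to show why a value near zero indicates that the modified model's predictions stay close to the original's.

```python
import math

def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits over a 3-token vocabulary.
base = softmax([2.0, 1.0, 0.1])      # stand-in for the original model
modified = softmax([1.9, 1.1, 0.1])  # stand-in for the processed model

kl = kl_divergence(base, modified)
print(f"KL divergence: {kl:.4f}")
```

Identical distributions give exactly zero, and the value grows as the two distributions drift apart.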

Format: Safetensors

Model size: 31B params

Tensor type: BF16

Model: catplusplus/Qwen3-VL-30B-A3B-Thinking-Heretic

Base model: Qwen/Qwen3-VL-30B-A3B-Thinking