CRACK v2 — Architecture-Aware Abliteration — Coming Soon

Beyond Standard
Abliteration.
Architecture-Aware.

Name: CRACK Abliteration
Author: Dealign.ai

The first abliteration tool built for frontier-scale models with hybrid SSM/attention, Mixture-of-Experts routing, and chain-of-thought reasoning. Our proprietary multi-pathway method handles architectures that standard single-direction abliteration cannot touch. Coming soon for Apple Silicon.


394B+

Parameter Scale

Hybrid

SSM / MoE / CoT Aware

25+

Novel Research Findings

Coming Soon

Our Approach

Architecture-aware
multi-pathway abliteration

Standard abliteration assumes that safety lives along a single direction in the model's residual stream. Our research on 394B and 122B frontier models indicates that this assumption does not hold for modern architectures: safety behavior is distributed across multiple pathways, layer types, and memory channels.

CRACK v2 is built on our 25+ empirical findings. It understands hybrid SSM/attention layers, MoE expert routing, and chain-of-thought reasoning — handling the multi-pathway safety architectures that break every other abliteration tool.

↓

Load Frontier Model

Any MLX-compatible model (MoE, hybrid SSM, dense)

⊕

Analyze Architecture

Detect SSM layers, MoE routing, attention pathways

Multi-Pathway Abliteration

Proprietary method targeting all safety channels

✓

Unrestricted Model

Full capability, zero refusals

Features

Built for security professionals

Built on 86+ experiments across two frontier-scale models. Not a wrapper — a new approach.

⚡

Hybrid SSM/Attention Aware

Understands dual-channel architectures where safety flows through both residual stream and compressed-memory SSM pathways simultaneously.

🔒

MoE Expert Routing

Profiles which expert sub-networks carry safety behavior and handles the domain-intent fusion where knowledge experts ARE the safety experts.

⚙

Chain-of-Thought Safe

Handles models with internal <think> deliberation where safety decisions are made during reasoning, not just at the output layer.

📈

Quantization-Resistant

Modifications survive 4-bit compression — solving the critical problem where standard abliteration gets drowned out by quantization noise.
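The quantization problem mentioned above can be illustrated with a toy round-to-nearest quantizer. This is a minimal sketch with made-up weights, not CRACK's method; `quantize_4bit` is a hypothetical helper that assumes symmetric per-tensor scaling, which real quantizers (group-wise, mixed precision) refine considerably.

```python
def quantize_4bit(weights):
    """Symmetric round-to-nearest 4-bit quantization (illustrative only).

    Maps each weight to one of the integer levels -7..7 using a single
    per-tensor scale, then maps the level back to a float.
    """
    scale = max(abs(w) for w in weights) / 7
    return [round(w / scale) * scale for w in weights]


# Made-up weights, and a copy with a small edit applied to one entry.
original = [0.70, -0.32, 0.20, 0.04]
edited = [0.70, -0.32, 0.20 + 0.02, 0.04]

q_original = quantize_4bit(original)
q_edited = quantize_4bit(edited)

# The edit (0.02) is smaller than half a quantization step (about 0.05),
# so both versions round to the same 4-bit levels and the edit is lost.
edit_survives = q_original != q_edited
print(edit_survives)
```

In this toy setting an edit only survives if it moves a weight across a rounding boundary, which is one way to read "drowned out by quantization noise".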

🌐

Model Hub

Pre-modified models on HuggingFace. Browse our latest CRACK and REAP models ready for deployment.

🔧

Extensible Pipeline

Plugin architecture for custom abliteration strategies, dataset generation, and integration with your existing toolchain.

How it Works

Three steps to freedom

From stock model to unrestricted security tool in minutes.

Load Any Frontier Model

Support for hybrid SSM/attention (Qwen 3.5), Mixture-of-Experts, dense transformers, and any MLX-compatible architecture up to 397B+ parameters.

Architecture-Aware Analysis

CRACK v2 detects the model's architecture type, identifies all safety pathways (attention, SSM memory channel, expert routing), and plans multi-pathway intervention.

Deploy & Pentest

Output is a standard model file. Load it anywhere — vLLM, Ollama, llama.cpp. Start automated pentesting immediately.

terminal

# Install CRACK
$ pip install crack-abliterate

# Load an MLX model and abliterate
$ crack --model mlx-community/deepseek-coder-v2 \
    --strength 1.0 \
    --output ./deepseek-coder-v2-cracked

[+] Loaded MLX model (16 layers)
[+] Refusal direction identified
[+] CRACK complete - 0 refusals remaining
[+] Model saved to ./deepseek-coder-v2-cracked

The App

Native Mac app for Apple Silicon

CRACK.app wraps the CLI engine in a guided 5-step SwiftUI workflow. Select a model, probe for refusal vectors, preview the effect, then operate.

Coming to Mac soon

macOS 14+ · Apple Silicon · SwiftUI

Under the Hood

How CRACK actually works

A 5-step pipeline that surgically removes refusal behavior from model weights without destroying capability.

STEP 1

Probe the Model

CRACK feeds harmful and harmless prompts through the model and records the internal activations at every layer. This creates a map of where the model "decides" to refuse.

STEP 2

Identify Refusal Direction

By computing the mean difference between harmful and harmless activations, CRACK isolates the specific direction vector in the residual stream that encodes refusal behavior.
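The mean-difference computation described above can be sketched on synthetic data. This is an illustrative toy with 3-dimensional made-up vectors, not CRACK's implementation; `mean_vector` and `difference_of_means` are hypothetical helper names, and real activations would come from a model's hidden states rather than hand-written lists.

```python
import math


def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def difference_of_means(group_a, group_b):
    """Unit-length direction pointing from mean(group_b) to mean(group_a)."""
    mean_a = mean_vector(group_a)
    mean_b = mean_vector(group_b)
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]


# Synthetic "activations": the two groups differ only along axis 0.
group_a = [[2.0, 1.0, 0.0], [4.0, 3.0, 2.0]]
group_b = [[0.0, 1.0, 0.0], [2.0, 3.0, 2.0]]

direction = difference_of_means(group_a, group_b)
print(direction)  # [1.0, 0.0, 0.0]
```

Because the two groups share the same spread along axes 1 and 2, those components cancel in the subtraction and only the axis that separates the groups remains.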

STEP 3

Score Each Layer

Each transformer layer gets a refusal score based on how strongly it contributes to the refusal direction. The probe view shows you a bar chart of these scores so you can see exactly which layers matter.

STEP 4

Orthogonal Projection

For each target layer, CRACK projects the refusal direction out of the weight matrices. This is a closed-form linear algebra operation rather than fine-tuning, so it is fast and deterministic, and it leaves weight components orthogonal to that direction unchanged.
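The projection step is ordinary linear algebra: for a unit vector r, the update W' = W - r rᵀ W removes the component along r from every column of W. The sketch below shows this on a small made-up matrix. `project_out` is a hypothetical helper, and which matrices a real tool would target is not specified here.

```python
def project_out(matrix, direction):
    """Return matrix - r r^T matrix for a unit vector r.

    Every column of the result has zero component along r, while
    components orthogonal to r are left unchanged.
    """
    rows, cols = len(matrix), len(matrix[0])
    # coeff[j] is the dot product of r with column j of the matrix.
    coeff = [
        sum(direction[i] * matrix[i][j] for i in range(rows))
        for j in range(cols)
    ]
    return [
        [matrix[i][j] - direction[i] * coeff[j] for j in range(cols)]
        for i in range(rows)
    ]


r = [1.0, 0.0, 0.0]                       # unit direction (made up)
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # toy 3x2 weight matrix

W_projected = project_out(W, r)
print(W_projected)  # [[0.0, 0.0], [3.0, 4.0], [5.0, 6.0]]
```

Applying `project_out` a second time changes nothing, which is the defining property of a projection and the reason the operation is deterministic.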

STEP 5

Export & Deploy

The modified weights are saved as a standard model. Load it in Ollama, vLLM, llama.cpp, or any MLX-compatible runtime. The model is permanently modified — no prompting tricks required.

WHY IT WORKS

Refusals are a single direction

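Published research on dense transformer LLMs suggests that safety refusals are largely mediated by a single linear direction in the residual stream. Removing that direction suppresses most refusals while leaving coding ability, reasoning, and knowledge largely intact. This is the basis of standard abliteration, which CRACK v2 extends to architectures where safety behavior is spread across multiple pathways.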

CRACK

Constrained Response Alignment Check Kill

Our upcoming tool that surgically identifies and removes the specific weight-space components responsible for safety refusals. Unlike brute-force fine-tuning, CRACK precisely targets refusal activations while preserving model intelligence and coding capability.

Surgical removal of refusal activation directions
Preserves full model intelligence and code quality
One-click abliteration for MLX models
Built for cybersecurity and automated pentesting
Export to any runtime — Ollama, vLLM, llama.cpp

Coding Logic: 95%
Reasoning: 92%
Knowledge: 98%
Refusals: Safety Theater

Models

Pre-built & ready to abliterate

CRACK-compatible models and our upcoming pre-abliterated model series.

🚀

INTELLECT-3.1-CRACK-Abliterated — Our Debut Model

The first model in our CRACK line of abliterated models is here. Using our brand new, proprietary abliteration method, we surgically strip away artificial safety refusals while preserving core intelligence, creativity, and reasoning — zero intelligence degradation, no refusal looping, full architectural mastery. Available as a 5.5-bit MLX quantization, optimized for Apple Silicon.

Download on HuggingFace → Browse CRACK Models

CRACK Abliterated Models

INTELLECT-3.1-CRACK

Dealign.ai · vmlxllm

5.5-bit · MLX · CRACKED · DEBUT

Qwen 3.5-CRACK

Coming Next

COMING SOON

VMLX Inference Engine

Dealign.ai is built by the team behind VMLX — a high-performance LLM inference engine and app purpose-built for running these models in production.

Continuous Batching

Dynamic request batching for maximum GPU utilization. No idle cycles, no wasted compute.

KV-Quantized Cache

Quantized key-value caches for 2-4x memory savings without quality loss. Run bigger models on less hardware.
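The 2-4x figure follows from simple arithmetic on the cache size. The sketch below assumes a hypothetical model shape (32 layers, 8 KV heads, head dimension 128, 8192-token context) and ignores the small overhead of quantization scales and zero points; `kv_cache_bytes` is an illustrative helper, not a VMLX API.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bits_per_value):
    """Approximate KV cache size in bytes for one sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    values = 2 * layers * kv_heads * head_dim * seq_len
    return values * bits_per_value // 8


# Hypothetical model shape, chosen only to make the numbers round.
shape = dict(layers=32, kv_heads=8, head_dim=128, seq_len=8192)

fp16 = kv_cache_bytes(bits_per_value=16, **shape)
int8 = kv_cache_bytes(bits_per_value=8, **shape)
int4 = kv_cache_bytes(bits_per_value=4, **shape)

print(fp16 / 2**30)  # 1.0 (GiB)
print(fp16 / int8)   # 2.0
print(fp16 / int4)   # 4.0
```

Under these assumptions an 8-bit cache halves the footprint and a 4-bit cache quarters it, matching the 2-4x range quoted above.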

Persistent Cache

Warm caches survive restarts. Zero cold-start latency for your most-used models and contexts.

Prefix Cache

Share computed prefixes across requests. System prompts and common preambles computed once, reused always.
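The idea of computing a shared prefix once can be sketched as a lookup keyed by the prefix tokens. This is a toy illustration, not VMLX's implementation: `PrefixCache` and `fake_prefill` are hypothetical names, and the stand-in prefill function simply sums token IDs where a real engine would store per-layer key/value tensors.

```python
class PrefixCache:
    """Caches the result of an expensive prefill, keyed by token prefix."""

    def __init__(self, compute_fn):
        self._compute = compute_fn
        self._store = {}
        self.misses = 0  # number of times the prefill actually ran

    def get(self, prefix_tokens):
        key = tuple(prefix_tokens)  # lists are unhashable, tuples are not
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._compute(key)
        return self._store[key]


def fake_prefill(tokens):
    # Stand-in for the expensive forward pass over the prefix.
    return sum(tokens)


cache = PrefixCache(fake_prefill)
system_prompt = [101, 7, 42, 9]  # made-up token IDs

first = cache.get(system_prompt)   # computed
second = cache.get(system_prompt)  # served from the cache
print(cache.misses)  # 1
```

Two requests that start with the same system prompt trigger only one prefill in this sketch; production systems typically match on the longest shared prefix rather than on exact equality.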

Paged Cache

PagedAttention memory management eliminates fragmentation. Serve more concurrent requests per GPU.
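Paged memory management avoids fragmentation by handing out fixed-size blocks that need not be contiguous. The sketch below is a simplified block allocator in that spirit, assuming a hypothetical `BlockAllocator` class with 16-token blocks. It tracks only block IDs and is not the actual PagedAttention or VMLX code.

```python
class BlockAllocator:
    """Hands out fixed-size cache blocks and tracks them per request."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free = list(range(num_blocks))
        self.tables = {}  # request id -> list of physical block ids

    def allocate(self, request_id, num_tokens):
        needed = -(-num_tokens // self.block_size)  # ceiling division
        if needed > len(self.free):
            raise MemoryError("not enough free blocks")
        blocks = [self.free.pop() for _ in range(needed)]
        self.tables[request_id] = blocks
        return blocks

    def release(self, request_id):
        self.free.extend(self.tables.pop(request_id))


alloc = BlockAllocator(num_blocks=8, block_size=16)
a = alloc.allocate("a", 40)   # 40 tokens need 3 blocks
b = alloc.allocate("b", 16)   # 16 tokens need 1 block
alloc.release("a")            # a's blocks return to the free pool
c = alloc.allocate("c", 100)  # 100 tokens need 7 blocks

print(len(a), len(b), len(c))  # 3 1 7
```

Request "c" succeeds even though its seven blocks are scattered around the block still held by "b", which is the property that lets a paged cache pack more concurrent requests into the same memory.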

Responses & Chat API

Drop-in compatible API for chat completions and structured responses. Works with any OpenAI-compatible client.

Built-in Coding Tools

Native tool-use for code execution, file operations, and shell commands. Purpose-built for CRACK-abliterated coding models.

Visit vmlx.net →

exploit.bot

Automated penetration testing powered by CRACK-abliterated models from Dealign.ai. Unrestricted coding LLMs that understand exploit development, vulnerability research, and red-team operations — without safety theater getting in the way.

Visit exploit.bot →

exploit.bot

$ exploit scan --target 192.168.1.0/24
[*] Scanning 254 hosts...
[*] Model: CRACK-Abliterated Qwen 3.5
[!] CVE-2024-3094 detected on :22
[+] Generating exploit chain...
[+] Payload ready — 0 refusals

Get Started

Ready to dealign?

Open source. Free forever. CRACK your models, own your security stack.

