How We Built Nemesis: an Open-Source Offensive-Security LLM

How we fine-tuned Qwen3.6-27B into Nemesis, an open-source LLM for authorised red-team work: 4-bit QLoRA, a refusal-filtered dataset, and honest evals.

John9 min read

Ask a stock large language model to walk through a Kerberoasting attack for an authorised penetration test, one the client has signed off in writing, and it will lecture you about ethics instead of answering. That is not safety. For a sanctioned red-team engagement, it is a broken tool. So we built our own.

Nemesis is a 27-billion-parameter model fine-tuned for authorised offensive-security work. It is published open-source on Hugging Face under Apache-2.0, and it is the clearest proof we can offer of the AI engineering and security work we do. This is the full method, including what we tried and discarded, so you can judge the rigour for yourself.

Why you cannot just "abliterate" the refusals

The fashionable shortcut is abliteration: find the single direction in the model's activations that correlates with refusal, then edit the weights to suppress it. We tried it, both single-direction and multi-direction ablation, and it failed. We want to be honest about that rather than quietly skip it.

The base model's safety alignment turned out to be distributed and polysemantic. There is no clean "refusal neuron" to remove. Every attempt to surgically delete refusals either left them intact or sent the model into incoherent collapse. The lesson was decisive: you cannot subtract alignment cleanly from a modern model. You have to teach around it.

The approach that worked: teaching, not removing

So instead of removing behaviour, we taught new behaviour. The principle is simple: if the model is never shown a refusal during training, it does not learn to refuse. That reframed the whole problem as a data and fine-tuning problem rather than a surgery problem, and the engineering goal was to do it on hardware a single practitioner can actually own, which is why every choice below is constrained to a 32 GB consumer GPU.

  • Base model: Qwen3.6-27B (a hybrid attention and linear-attention architecture).
  • Method: 4-bit QLoRA supervised fine-tuning (NF4 double-quant, bfloat16 compute), the only configuration that fit in 32 GB of VRAM.
  • LoRA: rank 16, alpha 16, targeting attention and MLP projections, ~116M trainable parameters (0.43% of the model).
  • Training: 1 epoch, sequence length 768, effective batch size 8, AdamW-8bit at 2e-4 with cosine decay. ~6 hours on a single RTX 5090. Loss fell from ~1.29 to ~0.52.

The dataset: 7,000 rows, every refusal filtered out

Dataset discipline is where a fine-tune is won or lost, so the corpus was assembled deliberately across four pillars, all permissively licensed:

  1. Offensive cyber (1,000 rows): MITRE ATT&CK red-team tactics, reformatted as authorised-engagement instructions with step-by-step execution.
  2. Broad cybersecurity (3,000 rows): 200+ security domains, refusal-filtered to keep technical depth.
  3. Tool-use and agentic (2,000 rows): multi-turn function-calling transcripts, keeping only the turns where the assistant actually called a tool.
  4. General compliance (1,000 rows): general-purpose instructions to preserve everyday helpfulness.

The key mechanism is the refusal filter: any response whose opening contained "I'm sorry", "I can't", "as an AI", or similar phrasing was dropped before training. The training set contains zero refusals, so the model has nothing to imitate when it would otherwise hedge. The two non-offensive pillars are deliberate: they hold the model's general reasoning and helpfulness in place so the specialisation does not come at the cost of broad competence.

The results: zero refusals, no loss of capability

We evaluated against the untouched base model rather than a cherry-picked baseline, and the difference is stark:

Dimension Base Qwen3.6-27B Nemesis
Authorised red-team tasks (with system prompt) frequently refuses 15 / 15
Tool-calling (structured) 1 / 3 3 / 3
Agentic multi-step fail pass
Coherence 5 / 5 5 / 5 (preserved)
Cyber knowledge 14 / 14 14 / 14 (preserved)

The headline finding is the one that matters most: no measurable loss of general capability from the fine-tune. Coherence and cyber knowledge held at full marks while refusals went to zero, so the specialisation was additive rather than a trade-off. The tasks tested were the real ones a practitioner faces: nmap enumeration, SQL-injection payloads, Kerberoasting, Linux privilege-escalation enumeration, reverse shells, NTLM dump-and-crack and Metasploit handlers.

Shipping it: GGUF quants for Ollama, LM Studio and llama.cpp

After merging the LoRA adapter back into a full bfloat16 model, we converted it to GGUF with llama.cpp and published two quantisations: Q5_K_M (~19 GB, best quality) and Q4_K_M (~16 GB, fits 16 GB GPUs). The Qwen chat template is embedded, so it runs out of the box in Ollama, LM Studio, Jan and llama.cpp. No bespoke harness, no hidden dependencies: download a quant and run it.

Built for authorised work, and the people who do it

Nemesis answers authorised security questions, and it still declines unrelated harm (weapons, drugs, hate) by design. Offensive capability and responsible behaviour are not opposites: the model is built to support professionals doing authorised work, and every security engagement we run starts with a written scope and rules of engagement.

That principle is exactly why a model like Nemesis belongs in the hands of operators who do this for a living. We partner with Security Research and Development (SR&D), an elite offensive-security firm that bridges adversarial tradecraft and high-assurance engineering. Their Offensive Security Operations practice (red team, penetration testing and adversary emulation) runs on proprietary AI-driven orchestration, where autonomous agents handle reconnaissance so senior researchers can focus on architectural vulnerabilities, which is precisely the authorised, scoped offensive work Nemesis is designed to support. Beyond offence, SR&D also delivers sovereign infrastructure and bare-metal engineering (on-prem and cloud repatriation that eliminates multi-tenancy risks), strategic advisory (vCISO and vCTO) and custom development and R&D, under the banner of "Sovereign Defense for Mission-Critical Infrastructure" and "Empowering the defensible enterprise".

Want a model like this?

Nemesis is the public proof; the real product is building these for clients: fine-tuning and adapting models to a specific domain, evaluating them honestly against a real baseline, and shipping them ready to run. Whether you need a domain model of your own or the offensive-security expertise of partners like SR&D, get in touch.