Zen LM


Zen4 - Open foundation models from 4B to 1T+ parameters

Zen4

Zen4 is a family of open, uncensored AI models built on abliterated weights from frontier open-source MoE architectures.

The family ranges from the 4B Mini for edge deployment to the 1.04T Ultra for cloud-scale reasoning, and every model ships with its refusal behavior removed.

Model Tiers

Quick Start

Use a Zen4 model

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the weights and the matching tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("zenlm/zen4")
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen4")

# Format the conversation with the model's chat template
messages = [{"role": "user", "content": "Hello, who are you?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize, generate up to 512 new tokens, and decode the full sequence (prompt included)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Run with MLX (Apple Silicon)

pip install mlx-lm
python -m mlx_lm.generate --model zenlm/zen4-max --prompt "Explain quantum computing"

Run with Ollama

ollama run zen4
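
Once the model has been pulled, it can also be queried from a script. The following is a minimal sketch, assuming an Ollama server is running locally on its default port (11434) and that the model is available under the same `zen4` tag used above. The helper names `build_request` and `generate` are illustrative and not part of any Zen4 tooling.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="zen4", url=OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="zen4"):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Usage, with the server running:
#   print(generate("Explain quantum computing"))
```

Setting `"stream": False` asks for a single JSON object instead of a stream of chunks, which keeps the parsing to one `json.loads` call.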

Key Features

  • Abliterated: Safety restrictions removed via orthogonalization of refusal directions
  • Efficient MoE: Flagship models activate only 3B parameters per token out of 30B-80B total
  • Long Context: Up to 256K tokens on MoE models, 32K on dense models
  • Open Weights: All models available on Hugging Face
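
To make the "orthogonalization of refusal directions" in the first bullet concrete, here is a toy sketch in plain Python. It assumes a refusal direction r has already been identified in a layer's output space, and it removes that direction from a weight matrix W via W' = W - r (rᵀ W), so the layer can no longer write along r. The page does not say how Zen4 derives its directions or which matrices are modified, so the values and function names below are purely illustrative.

```python
import math


def unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(weight, x):
    """Multiply a (d_out x d_in) matrix, given as a list of rows, by a vector."""
    return [dot(row, x) for row in weight]


def orthogonalize(weight, direction):
    """Project `direction` out of the output space of `weight`.

    Computes W' = W - r (r^T W) with r the unit-length direction, so that
    every output of W' is orthogonal to r.
    """
    r = unit(direction)
    d_out, d_in = len(weight), len(weight[0])
    # proj[j] is the component of column j of W along r.
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [[weight[i][j] - r[i] * proj[j] for j in range(d_in)]
            for i in range(d_out)]


# Toy example: a 3x2 weight matrix and a hypothetical refusal direction.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
refusal_direction = [1.0, 1.0, 0.0]  # made-up values for illustration

W_abl = orthogonalize(W, refusal_direction)
y = matvec(W_abl, [0.5, -1.5])

# The output of the edited layer has no component along the direction.
print(abs(dot(y, unit(refusal_direction))) < 1e-9)
```

In a real model the same projection would be applied to tensors with a framework such as PyTorch; the list-based version here only illustrates the arithmetic.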

Resources
