Zen LM

Models

The complete Zen4 model family: Consumer, Coder, and Ultra tiers.

Zen4 Models

Consumer Line

Dense and MoE models for desktop and edge deployment. All are built on abliterated open-source weights.

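| Model | Parameters | Active | Context | License |
|---|---|---|---|---|
| Zen4 Mini | 4B | 4B | 32K | Apache 2.0 |
| Zen4 | 8B | 8B | 32K | Apache 2.0 |
| Zen4 Pro | 14B | 14B | 32K | Apache 2.0 |
| Zen4 Max | 30B MoE | 3B | 256K | Apache 2.0 |
| Zen4 Pro Max | 80B MoE | 3B | 256K | Apache 2.0 |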

Zen4 Mini (4B)

Ultra-efficient for edge and mobile. Full quality at 4B parameters.

Zen4 (8B)

The standard model. Excellent balance of quality and efficiency.

Zen4 Pro (14B)

Professional-grade for demanding tasks. Strong reasoning and code generation.

Zen4 Max (30B MoE)

Flagship efficient model. 30B total with only 3B active via MoE.

Zen4 Pro Max (80B MoE) - Flagship

The ultimate consumer model. Hybrid Gated DeltaNet + Gated Attention + MoE architecture.
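
As a rough illustration of what "3B active" means for the MoE entries, the sketch below computes the share of parameters used per token from the Consumer Line table. The helper name `active_fraction` is hypothetical and not part of any Zen LM tooling; the figures are copied from the table and treated as round numbers.

```python
# Total and active parameter counts (in billions), copied from the Consumer Line table.
CONSUMER_MODELS = {
    "Zen4 Mini": (4, 4),
    "Zen4": (8, 8),
    "Zen4 Pro": (14, 14),
    "Zen4 Max": (30, 3),
    "Zen4 Pro Max": (80, 3),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that are active for each token."""
    if total_b <= 0 or not 0 < active_b <= total_b:
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


if __name__ == "__main__":
    for name, (total, active) in CONSUMER_MODELS.items():
        share = active_fraction(total, active)
        print(f"{name}: {active}B of {total}B active ({share:.1%})")
```

Dense models come out at 100%, while the two MoE models use a small fraction of their weights per token. That is why their per-token compute is closer to a 3B model, even though the full set of weights still has to be stored.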


Coder Line

Specialized models for agentic programming and software engineering.

| Model | Parameters | Active | Context | License |
|---|---|---|---|---|
| Zen4 Coder Flash | 31B MoE | 3B | 131K | MIT |
| Zen4 Coder | 80B MoE | 3B | 256K | Apache 2.0 |
| Zen4 Coder Pro | 355B | 355B | 200K | MIT |

Zen4 Coder Flash (31B MoE)

Fast coding model for rapid iteration. MoE architecture for maximum efficiency.

Zen4 Coder (80B MoE) - Flagship Code

80B with 512-expert MoE for state-of-the-art code generation and agentic programming.

Zen4 Coder Pro (355B) - Cloud Only

Dense 355B coding powerhouse for maximum code intelligence.


Ultra Line

Trillion-parameter models for cloud deployment.

| Model | Parameters | Active | Context | Status |
|---|---|---|---|---|
| Zen4 Ultra | 1.04T MoE | 32B | 256K | Cloud Only |
| Zen4 Ultra Max | 1T+ MoE | TBD | 1M | Coming Soon |

Zen4 Ultra (1.04T MoE) - Cloud

Trillion-parameter frontier model with 384 experts and native vision capabilities.

Zen4 Ultra Max (1T+ MoE) - Coming Soon

Next-generation trillion-parameter model with 1M context window.


Zen5 — Next Generation

Zen5 Ultra is a 2T+ parameter MoDE (Mixture of Distilled Experts) model. It is presented as the largest open-weight model in history and is trained on-chain via NVIDIA TEE confidential compute on hanzo.network.

| Model | Parameters | Active | Architecture | Context | Status |
|---|---|---|---|---|---|
| Zen5 Ultra | 2T+ MoDE | TBD | Mixture of Distilled Experts | 1M+ | Research Preview |

Key Features

  • 2T+ parameters — the largest open-weight model ever released
  • MoDE architecture — Mixture of Distilled Experts for efficient routing
  • On-chain training — verifiable training via NVIDIA TEE on hanzo.network
  • 1M+ context — full codebase and document understanding
  • GT-QLoRA — Gate-Targeted fine-tuning for MoE behavioral modification (paper)

Request Research Access

Zen5 is in private research preview. Researchers and institutions can request early access to preprints, weights, and evaluation under a special research license.


Formats

All locally-runnable models are available in multiple formats:

| Format | Use Case | Platform |
|---|---|---|
| SafeTensors | Full precision, transformers | All |
| GGUF | Quantized, llama.cpp/Ollama | All |
| MLX | Apple Silicon optimized | macOS |
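
To make the full-precision versus quantized trade-off concrete, here is a back-of-the-envelope sketch of weight memory. It is an estimate under stated assumptions, not a published requirement: the 16-bit and 4-bit widths are assumed (the actual precision of each release is not stated here), the helper `weight_memory_gb` is hypothetical, and the figure covers weights only, ignoring the KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes).

    params_billion * 1e9 weights * (bits_per_weight / 8) bytes / 1e9 bytes per GB
    simplifies to params_billion * bits_per_weight / 8.
    """
    if params_billion <= 0 or bits_per_weight <= 0:
        raise ValueError("params_billion and bits_per_weight must be positive")
    return params_billion * bits_per_weight / 8


# Total parameter counts (in billions) from the Consumer Line table.
# For MoE models the total count is used, assuming all experts are kept in memory.
MODELS = {"Zen4 Mini": 4, "Zen4": 8, "Zen4 Pro": 14, "Zen4 Max": 30, "Zen4 Pro Max": 80}

if __name__ == "__main__":
    for name, params in MODELS.items():
        full = weight_memory_gb(params, 16)   # assumed 16-bit unquantized weights
        quant = weight_memory_gb(params, 4)   # assumed 4-bit quantized build
        print(f"{name}: ~{full:.0f} GB at 16-bit, ~{quant:.0f} GB at 4-bit")
```

Real quantized files usually carry some extra bits per weight for scales and metadata, so actual sizes tend to be somewhat larger than the 4-bit figure.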

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load any Zen4 model by its Hugging Face repository ID
model_id = "zenlm/zen4-pro-max"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Format the conversation with the model's chat template
messages = [{"role": "user", "content": "Hello!"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize, generate up to 512 new tokens, and decode the result
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
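
The usage example requests up to 512 new tokens, and the prompt plus the generated tokens must fit inside the model's context window. The sketch below shows that budget check as plain arithmetic. The helper names are hypothetical, and reading "32K" as 32 × 1024 tokens is an assumption, since the tables do not say whether K means 1,000 or 1,024.

```python
# Assumption: "32K" in the model tables means 32 * 1024 tokens.
CONTEXT_32K = 32 * 1024


def fits_context(prompt_tokens: int, max_new_tokens: int, context_limit: int) -> bool:
    """Return True if the prompt plus the requested generation fits in the window."""
    if min(prompt_tokens, max_new_tokens, context_limit) < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= context_limit


def max_prompt_tokens(max_new_tokens: int, context_limit: int) -> int:
    """Return the largest prompt length that still leaves room for generation."""
    return max(context_limit - max_new_tokens, 0)


if __name__ == "__main__":
    print(fits_context(1_000, 512, CONTEXT_32K))
    print(max_prompt_tokens(512, CONTEXT_32K))
```

With the tokenizer output from the usage example, the prompt length can be read as `inputs["input_ids"].shape[1]` and passed in as `prompt_tokens`.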

Multimodal & Specialized

In addition to Zen4, the Zen LM family includes multimodal and specialized models:

| Model | Type | Description |
|---|---|---|
| zen-omni | Multimodal | Text + Vision + Audio |
| zen-vl | Vision-Language | Image understanding with function calling |
| zen-video | Video | Text-to-video and image-to-video generation |
| zen-3d | 3D Assets | 3D mesh generation from text/image |

All models are available at huggingface.co/zenlm.
