Model notes
Qwen3.5-0.8B
Compact Qwen3.5 checkpoint with a hybrid text-plus-vision stack and a small resident footprint for text-only local experimentation.
900M dense • 262,144 context • 2 KV heads
Architecture

Model spec
Architecture: dense, hybrid Gated DeltaNet + gated attention
Total params: 900M
Active params: 900M (dense)
Layers: —
Hidden size: —
Attention heads: —
KV heads: 2
KV-bearing layers: —
Context length: 262,144
Modality: text + vision
License: —
Why it matters
Why memory behaves this way
Research highlight
Qwen3.5 alternates gated DeltaNet blocks with gated attention, so only a subset of layers carry a full KV cache during text generation.
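The memory consequence of that layer mix can be sketched numerically. The notes only give 2 KV heads and a 262,144-token context; the layer count (28), the fraction of KV-bearing layers (1 in 4), the head dimension (128), and the BF16 cache dtype below are all hypothetical assumptions chosen for illustration, not published specs.

```python
# Rough KV-cache sizing for a hybrid-attention model.
# Known from these notes: 2 KV heads, 262,144-token context.
# HYPOTHETICAL: 28 layers, 1 in 4 KV-bearing, head_dim 128, BF16 (2 B/elem).
def kv_cache_bytes(seq_len, kv_layers, kv_heads, head_dim, bytes_per_elem=2):
    # K and V each hold seq_len * kv_heads * head_dim elements per layer,
    # hence the leading factor of 2.
    return 2 * kv_layers * seq_len * kv_heads * head_dim * bytes_per_elem

full = kv_cache_bytes(262_144, kv_layers=28, kv_heads=2, head_dim=128)
hybrid = kv_cache_bytes(262_144, kv_layers=7, kv_heads=2, head_dim=128)
print(f"all layers attention: {full / 2**30:.2f} GiB")    # 7.00 GiB
print(f"7 KV-bearing layers:  {hybrid / 2**30:.2f} GiB")  # 1.75 GiB
```

Under these assumptions, caching KV in only a quarter of the layers cuts the full-context cache from 7 GiB to 1.75 GiB, which is why the hybrid stack matters at a 262,144-token context.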
Memory note
This text-only estimate still counts the resident multimodal checkpoint weights; only media-token-specific memory is excluded in v1.
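The resident-weight portion of that estimate follows directly from the spec line: 900M parameters stored in BF16 at 2 bytes each, vision stack included, whether or not any media tokens are processed.

```python
# Resident weight footprint of the BF16 checkpoint, as counted by the
# text-only estimate: every parameter stays loaded, vision stack included.
params = 900_000_000        # 900M total (dense), from the spec line
bytes_per_param = 2         # BF16 = 2 bytes per parameter
weight_bytes = params * bytes_per_param
print(f"{weight_bytes / 1e9:.1f} GB")  # 1.8 GB
```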
Checkpoints
Official profiles
Official BF16 checkpoint
Qwen ships Qwen3.5-0.8B in Hugging Face Transformers format with documented Transformers and vLLM usage.