
Model notes

OpenReasoning Nemotron 7B

Reasoning-tuned dense Nemotron checkpoint that tracks the familiar Qwen2.5 7B memory shape while targeting stronger math and code performance.

7.6B dense • 131,072 context • 4 KV heads

Architecture

Model spec

Architecture: Dense decoder-only transformer
Total params: 7.6B
Active params: 7.6B (dense; every parameter is active per token)
Layers: 28
Hidden size: 3,584
Attention heads: 28
KV heads: 4 (grouped-query attention)
KV-bearing layers: 28
Context length: 131,072 tokens
Modality: Text
License: CC-BY-4.0 + Apache 2.0
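
To see how these numbers combine, here is a minimal back-of-the-envelope sketch in Python. It uses the standard per-token KV estimate (2 × layers × KV heads × head dim × bytes); real deployments add activation, runtime, and fragmentation overhead on top of these figures.

    # Rough BF16 footprint from the spec above; runtime overheads not modeled.
    PARAMS   = 7.6e9
    LAYERS   = 28
    HIDDEN   = 3584
    HEADS    = 28
    KV_HEADS = 4
    CONTEXT  = 131_072
    BYTES    = 2                      # BF16 = 2 bytes per value
    HEAD_DIM = HIDDEN // HEADS        # 3584 / 28 = 128

    weights_gib  = PARAMS * BYTES / 2**30                    # ~14.2 GiB resident
    kv_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES  # K and V: 57,344 B
    kv_full_gib  = kv_per_token * CONTEXT / 2**30            # 7.0 GiB at max context

    print(f"weights  ~ {weights_gib:.1f} GiB")
    print(f"KV/token ~ {kv_per_token / 1024:.0f} KiB")
    print(f"KV @ {CONTEXT:,} tok ~ {kv_full_gib:.1f} GiB")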

Why it matters

Why memory behaves this way

Research highlight

OpenReasoning-Nemotron-7B is post-trained for deliberate reasoning but keeps the dense grouped-query Qwen2.5 backbone, so fit behavior remains straightforward.

Memory note

Resident weights set the memory floor (about 15.2 GB for 7.6B parameters in BF16), and the 4-of-28 grouped KV layout keeps long-context cache growth moderate relative to older full-head dense models.
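
To make "moderate" concrete, the sketch below compares the grouped 4-KV-head layout against a hypothetical full-head variant of the same geometry (the 28-KV-head row is an illustration, not a shipped model):

    # Per-token KV bytes in BF16: grouped (4 KV heads) vs. hypothetical full-head (28).
    LAYERS, HEAD_DIM, BYTES = 28, 128, 2
    for kv_heads in (4, 28):
        per_token = 2 * LAYERS * kv_heads * HEAD_DIM * BYTES
        print(f"{kv_heads:>2} KV heads -> {per_token:,} B/token")
    # 4 KV heads -> 57,344 B/token; 28 KV heads -> 401,408 B/token (7x larger)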

Checkpoints

Official profiles

Official BF16 checkpoint (current)

NVIDIA publishes OpenReasoning-Nemotron-7B as a Hugging Face Transformers checkpoint derived from Qwen2.5-7B, so v1 models it with the same dense grouped-query memory geometry.

vLLM · Transformers · Open checkpoint
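
A minimal loading sketch with Hugging Face Transformers, assuming the published repo id nvidia/OpenReasoning-Nemotron-7B (verify the exact id and recommended generation settings on the model card):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Repo id assumed from NVIDIA's Hugging Face release; confirm on the model card.
    MODEL_ID = "nvidia/OpenReasoning-Nemotron-7B"

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,   # matches the official BF16 checkpoint
        device_map="auto",
    )

    prompt = "Prove that the sum of two even integers is even."
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512)
    print(tok.decode(out[0], skip_special_tokens=True))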
