FitMyGPU
Qwen 3 4B Thinking 2507

A Qwen3 update focused on deeper reasoning and a longer native context, tuned for complex, thinking-heavy workloads.

Overview and architecture

What it is

Company: Qwen
Family: Qwen
Release date: Aug 5, 2025
Architecture: Dense decoder-only transformer
License: Apache 2.0
Modality: Text
Context window: 262,144 tokens
Total params: 4B
Active params: Dense model (all parameters active)
Layers: 36
Hidden size: 2,560
Attention heads: 32
KV heads: 8
KV-bearing layers: 36

Research highlight

What improved

Reasoning-focused update

Qwen describes this release as a scaled-up thinking-capability update rather than a general-purpose refresh.

256K native context

The 2507 update moves the model to a 256K (262,144-token) native context, one of the clearest deployment changes from the base 4B release.

Deeper thinking length

Qwen explicitly notes a longer thinking length and recommends the model for highly complex reasoning tasks.

Training and release context

How it was released

Release lineage

This is an updated reasoning-oriented version of Qwen3-4B rather than a separate new architecture family.

Model geometry

The update keeps the same geometry as the base 4B model: 4.0B parameters, 36 layers, and dense attention with 32 query heads and 8 KV heads.
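
The 32-query-head / 8-KV-head split above implies that several query heads share each KV head. A minimal sketch of that arithmetic, using only the figures listed on this page (the `Geometry` class and its method names are hypothetical, introduced purely for illustration):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geometry:
    """Attention geometry as listed on this page (hypothetical helper)."""
    layers: int = 36
    query_heads: int = 32
    kv_heads: int = 8

    def queries_per_kv_head(self) -> int:
        # Each KV head is shared by an equal-sized group of query heads.
        if self.query_heads % self.kv_heads != 0:
            raise ValueError("query_heads must be a multiple of kv_heads")
        return self.query_heads // self.kv_heads

    def kv_fraction_of_full_mha(self) -> float:
        # KV cache size relative to a layout with one KV head per query head,
        # assuming the same per-head dimension in both layouts.
        return self.kv_heads / self.query_heads


geom = Geometry()
print(geom.queries_per_kv_head())      # query heads sharing one KV head
print(geom.kv_fraction_of_full_mha())  # KV cache relative to 32 KV heads
```

With these numbers, four query heads share each KV head, so the KV cache is a quarter of what a one-KV-head-per-query-head layout with the same head dimension would need.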

Context packaging

Unlike the base model’s 32K native context with YaRN extension, this update is packaged with 256K native context.

Where it is strong

Complex reasoning

Best fit for logic, mathematics, science, coding, and other tasks where longer reasoning traces help.

Long-context understanding

The 256K native context makes it more useful for very long inputs than the base Qwen3 dense line.

Tool and instruction use

Qwen also positions the update as stronger on instruction following and tool usage, not only on benchmark reasoning.

Memory behavior

What dominates VRAM

This remains a dense 4B model, so weight memory is fixed, but the KV cache grows linearly with context length. With the 256K native context, the KV cache can become a much larger share of total VRAM than on the base 32K-native release.
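
A rough sketch of how KV cache size scales with context, using the layer and KV-head counts from this page. Several inputs are assumptions not stated here: a per-head dimension of 128, 16-bit (2-byte) weights and KV entries, and a single sequence with no batching. The function names are hypothetical, and the estimate ignores activations and framework overhead.

```python
GIB = 1024 ** 3


def kv_cache_bytes(tokens: int,
                   layers: int = 36,          # from this page
                   kv_heads: int = 8,         # from this page
                   head_dim: int = 128,       # ASSUMPTION: not listed on this page
                   bytes_per_value: int = 2   # ASSUMPTION: 16-bit KV cache
                   ) -> int:
    """Bytes of KV cache for one sequence of `tokens` tokens."""
    # Factor of 2 covers the separate key and value tensors in each layer.
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return per_token * tokens


def weight_bytes(params: float = 4.0e9,       # from this page (4.0B)
                 bytes_per_param: int = 2     # ASSUMPTION: 16-bit weights
                 ) -> int:
    """Approximate bytes needed to hold the model weights."""
    return int(params * bytes_per_param)


if __name__ == "__main__":
    print(f"weights: {weight_bytes() / GIB:.2f} GiB")
    for tokens in (32_768, 262_144):
        print(f"KV cache at {tokens:>7,} tokens: "
              f"{kv_cache_bytes(tokens) / GIB:.2f} GiB")
```

Under these assumptions the KV cache is about 4.5 GiB at 32,768 tokens and 36 GiB at the full 262,144 tokens, against roughly 7.5 GiB of 16-bit weights. A quantized KV cache or a shorter working context would shrink the cache proportionally.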
