
Model notes

Phi-4 14B

Reasoning-oriented dense Phi model with moderate context length and a straightforward single-GPU footprint.

14.7B dense • 16,384 context • 10 KV heads

Architecture

Model spec

Architecture: Dense decoder-only transformer
Total params: 14.7B
Active params: 14.7B (dense; all parameters are active per token)
Layers: 40
Hidden size: 5,120
Attention heads: 40
KV heads: 10 (grouped-query attention, 4 query heads per KV head)
KV-bearing layers: 40 (all layers)
Context length: 16,384 tokens
Modality: Text
License: MIT

Why it matters

Why memory behaves this way

Research highlight

A reasoning-focused dense architecture that aims for strong capability per parameter rather than relying on sparse expert routing.

Memory note

With a moderate 16K context window, Phi-4 behaves like a classic dense checkpoint: the weights dominate GPU memory, and the KV cache stays secondary even at full context.
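A back-of-envelope sketch makes this concrete. Using the spec above, and assuming BF16 (2 bytes per element) for both weights and cache, the full-context KV cache comes out roughly an order of magnitude smaller than the weights:

```python
# Rough memory sketch for Phi-4 14B, from the spec above.
# Assumption: BF16 (2 bytes/element) for both weights and KV cache.
PARAMS = 14.7e9
LAYERS = 40
KV_HEADS = 10
HEAD_DIM = 5_120 // 40        # hidden size / attention heads = 128
CONTEXT = 16_384
BYTES_PER_ELEM = 2            # BF16

weights_gb = PARAMS * BYTES_PER_ELEM / 1e9

# Per token, each layer caches K and V: kv_heads * head_dim values apiece.
kv_cache_bytes = 2 * LAYERS * KV_HEADS * HEAD_DIM * CONTEXT * BYTES_PER_ELEM
kv_cache_gb = kv_cache_bytes / 1e9

print(f"weights  ~ {weights_gb:.1f} GB")                    # ~ 29.4 GB
print(f"KV cache ~ {kv_cache_gb:.2f} GB at 16K context")    # ~ 3.36 GB
```

At full context the cache is about 3.4 GB against roughly 29.4 GB of weights, which is why the page calls the cache secondary.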

Checkpoints

Official profiles

Official BF16 checkpoint


Microsoft's official phi-4 repository is about 29.3 GB on Hugging Face.

Runs with vLLM and Transformers.

Official ONNX INT4 checkpoint


Microsoft's official phi-4 ONNX GPU INT4 checkpoint directory is about 8.99 GB on Hugging Face.

Runs with Transformers.
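The listed directory sizes can be sanity-checked against the parameter count. This is an estimate, not an exact accounting; the INT4 figure in particular ignores quantization metadata:

```python
# Sanity-check the published checkpoint sizes against parameter count.
PARAMS = 14.7e9

bf16_gb = PARAMS * 2 / 1e9     # 2 bytes/param -> ~29.4 GB (listed: ~29.3 GB)
int4_gb = PARAMS * 0.5 / 1e9   # 4 bits/param  -> ~7.35 GB (listed: ~8.99 GB)

print(f"BF16 estimate: {bf16_gb:.1f} GB")
print(f"INT4 estimate: {int4_gb:.2f} GB")
```

The BF16 estimate lands within rounding of the listed 29.3 GB. The ONNX INT4 directory is about 1.6 GB larger than the raw 4-bit estimate; plausibly that gap is quantization scales plus tensors kept at higher precision, though the exact breakdown isn't stated here.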
