
Model notes

Gemma 2 27B

Larger Gemma model that accepts a shorter native context window in exchange for more capacity per token.

27B dense • 8,192 context • 16 KV heads

Architecture

Model spec

Architecture: Dense decoder-only transformer
Total params: 27B
Active params: 27B (dense; all parameters active)
Layers: 46
Hidden size: 4,608
Attention heads: 32
KV heads: 16
KV-bearing layers: 46
Context length: 8,192 tokens
Modality: Text
License: Gemma terms

Why it matters

Why memory behaves this way

Research highlight

Scaled Gemma dense architecture with more capacity per token than the 9B variant.

Memory note

Because the context window is shorter, most VRAM pressure comes from resident weights rather than cache growth.
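The split between resident weights and cache growth can be put in numbers with a minimal sketch using the spec above. One assumption: the per-head dimension is taken as hidden size divided by attention heads, since head_dim is not listed on this page and the model's actual value may differ.

```python
# Rough VRAM split for Gemma 2 27B at full context, from the spec above.
PARAMS = 27e9    # total parameters (dense, all resident in VRAM)
LAYERS = 46      # KV-bearing layers
KV_HEADS = 16
HIDDEN = 4608
HEADS = 32
CONTEXT = 8192   # native context window
BYTES = 2        # bfloat16

# Assumption: head_dim derived from hidden size; not listed on this page.
head_dim = HIDDEN // HEADS

# Resident weights: every parameter of a dense model stays loaded.
weight_bytes = PARAMS * BYTES

# KV cache: K and V entries, per layer, per KV head, per token.
kv_per_token = 2 * LAYERS * KV_HEADS * head_dim * BYTES
kv_bytes = kv_per_token * CONTEXT

print(f"weights:  {weight_bytes / 1e9:.1f} GB")
print(f"KV cache: {kv_bytes / 1e9:.2f} GB at {CONTEXT} tokens")
```

Even with the cache fully populated at 8,192 tokens, the resident weights outweigh it by more than an order of magnitude under these assumptions, which is the memory note above in numbers.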

Checkpoints

Official profiles

Official BF16 checkpoint

Google's official Gemma 2 27B Instruct release is exported in bfloat16.

Engines: vLLM, Transformers
