AI Rookies

Prefill

Fact

The step where an AI model reads the whole input in one pass and builds a KV cache, before it writes any output.

In Plain Words

Prefill is a pizza shop reading your giant order first. No slice is out yet, but the sticky notes (the KV cache) are ready.

You meet it in chatbots and long-document Q&A. It shapes the wait for the first word and part of the cost.
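
For readers who like code, here is a toy sketch of the idea in Python. It is not how a real model works inside: a real KV cache holds large tables of numbers, not strings, and the names toy_kv, prefill, and decode_step are made up for this card. It only shows the order of events: read the whole prompt first, then add one note per new word.

```python
# Toy sketch only. Real models store key/value tensors, not strings.
# All names here (toy_kv, prefill, decode_step) are hypothetical.

def toy_kv(token):
    """Stand-in for the key/value pair a model would compute for one token."""
    return ("key:" + token, "value:" + token)

def prefill(prompt_tokens):
    """Read the whole prompt once and build the cache. No output word yet."""
    return [toy_kv(tok) for tok in prompt_tokens]

def decode_step(kv_cache, new_token):
    """Emit one word: keep the existing notes and add one note for the new word."""
    kv_cache.append(toy_kv(new_token))
    return new_token

prompt = ["one", "large", "pizza", "please"]
kv_cache = prefill(prompt)            # this is the wait before the first word
cache_after_prefill = len(kv_cache)

output = [decode_step(kv_cache, word) for word in ["coming", "right", "up"]]
cache_after_decode = len(kv_cache)

print(cache_after_prefill, cache_after_decode, output)
# prints: 4 7 ['coming', 'right', 'up']
```

The cache holds four notes after prefill and grows by one note per generated word.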

Related Concepts

KV cache
Prefill computes and stores a KV cache for the whole prompt, so later words can reuse it.

Inference engine
The inference engine schedules prefill and word-by-word generation.

Context window
A longer context window allows longer prompts, which gives prefill more text to read.
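
A rough way to see why more text means more prefill work: the sketch below assumes a model with causal attention, where each prompt token looks at itself and every token before it. The function name prefill_attention_pairs is made up for this card, and the count ignores everything else a model does, so treat it as a feel for the trend, not a cost estimate.

```python
# Rough illustration, assuming causal attention. Hypothetical helper name.

def prefill_attention_pairs(prompt_length):
    """Count (token, itself-or-earlier token) pairs for a prompt of this length."""
    return prompt_length * (prompt_length + 1) // 2

for n in (1_000, 2_000, 4_000):
    print(n, prefill_attention_pairs(n))
# prints:
# 1000 500500
# 2000 2001000
# 4000 8002000
```

Doubling the prompt roughly quadruples this count, which is one reason long prompts make the first word slower to arrive.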