AI Rookies

BLEU — Bilingual Evaluation Understudy

Fact

An automatic translation score based on how many short word chunks (n-grams) in the output also appear in a human reference translation.

In Plain Words

BLEU is like a quiz grader with a tiny phrase checklist. Match the word chunks in the reference answer to get points. Hand in an answer much shorter than the reference and lose points (the brevity penalty).

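It is often used to score machine translation and sometimes summaries. It is fast and cheap to compute, but it only compares surface wording, so it does not truly understand meaning: a correct paraphrase can score low.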
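
To see the checklist idea in action, here is a minimal Python sketch of a BLEU-style score. It is a toy under stated assumptions, not the official metric: it splits on whitespace, uses one reference, skips smoothing, and checks only 1-word and 2-word chunks (standard BLEU usually goes up to 4-word chunks and is computed over a whole test set). The name simple_bleu is made up for this example and is not a library function.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-word chunks in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Toy single-reference BLEU: clipped n-gram precision times a brevity penalty."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip: a chunk earns credit only as many times as it appears in the reference.
        matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matched == 0 or total == 0:
            return 0.0  # assumption: no smoothing in this sketch
        log_precisions.append(math.log(matched / total))

    # Brevity penalty: a candidate shorter than the reference loses points.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(round(simple_bleu("the cat sat on the mat", reference), 2))  # 1.0, a perfect match
print(round(simple_bleu("the cat sat", reference), 2))             # 0.37, half the answer is missing
print(round(simple_bleu("the mat sat on the cat", reference), 2))  # 0.89, high score, wrong meaning
```

The last line shows the weak spot: swapping "cat" and "mat" changes the meaning, but most word chunks still match, so the score stays high.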

Related Concepts

MT
BLEU was first used to score machine translation quality.

N-gram LM
BLEU scores text by matching n-grams, the same short runs of words that an n-gram language model counts.

WER
BLEU and WER (word error rate) are both automatic scores based on surface word matches. For BLEU higher is better; for WER lower is better.

LLM-as-a-judge
LLM-as-a-judge can help with meaning BLEU misses.