
BLEU

Bilingual Evaluation Understudy

Definition

BLEU is an automatic metric that scores machine-generated text by measuring its n-gram overlap with one or more human reference translations. It was originally designed for machine translation. Scores range from 0 to 1, with higher scores indicating a closer match to the reference text; output that is shorter than the reference is penalized.
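To make the n-gram overlap concrete, here is a minimal Python sketch of a sentence-level score. It assumes pre-tokenized input, uniform weights over 1- to 4-grams, and no smoothing; the function names `ngrams` and `sentence_bleu` are illustrative and not taken from any library. BLEU as originally proposed is computed over a whole corpus, so treat this as an illustration rather than a reference implementation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform weights (illustrative sketch)."""
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0  # candidate is too short to contain any n-gram of this order

        # Clip each candidate n-gram count by its maximum count in any single reference.
        max_ref_counts = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        clipped = sum(min(count, max_ref_counts[gram])
                      for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score

        # Uniform weights: accumulate the geometric mean in log space.
        log_precision_sum += math.log(clipped / total) / max_n

    # Brevity penalty, using the reference length closest to the candidate length
    # (ties go to the shorter reference).
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_precision_sum)


if __name__ == "__main__":
    reference = "the cat sat on the mat".split()
    candidate = "the cat sat on a mat".split()
    print(sentence_bleu(candidate, [reference]))
```

The clipping step keeps a candidate from earning credit by repeating a common word more often than any reference does, and the brevity penalty keeps very short outputs from scoring well on precision alone. For real evaluations, an established library implementation is a safer choice than a hand-rolled one, since tokenization and smoothing choices change the numbers.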

BLEU is widely criticized for correlating poorly with human judgment on open-ended generation tasks.

