Skip to content
ai

Attention mechanism

Attention Mechanism

Definition

The attention mechanism allows neural networks to dynamically weight the importance of different input positions when producing each output token. It computes compatibility scores between a query and a set of keys, uses those scores to create a weighted sum of values, and feeds the result to subsequent layers.

Attention is the core innovation that enabled transformers to surpass recurrent architectures on sequence modeling tasks.


Ship secure code faster

Crash Override integrates security into the developer workflow. No context switching, no waiting on reviews.