
Streaming

LLM Streaming Output

Definition

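Streaming delivers language model output to the client incrementally, token by token, as it is generated, rather than waiting for the full response to complete. Total generation time is unchanged, but the first text appears as soon as the first token is ready instead of after the last one, which reduces perceived latency and improves the user experience in chat interfaces.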

Streaming is typically implemented with server-sent events (SSE) over HTTP or with WebSockets, and is supported by major LLM APIs, including those from OpenAI, Anthropic, and Google.
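As a rough illustration of the SSE case, the sketch below parses a stream of SSE lines and yields text fragments as they arrive. It is not any provider's actual client: the function names `iter_sse_data` and `stream_text`, the `"text"` JSON field, and the `[DONE]` end marker are assumptions made for the example, and real APIs each define their own event payloads.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event.

    Handles only `data:` fields and comment lines; other fields
    (event:, id:, retry:) are ignored in this sketch.
    """
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
        elif line.startswith(":"):
            # Comment line, often used as a keep-alive.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)


def stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from a hypothetical JSON-per-event stream."""
    for payload in iter_sse_data(lines):
        if payload == "[DONE]":  # assumed end-of-stream marker
            break
        chunk = json.loads(payload)
        yield chunk["text"]  # assumed field name


# Stand-in for lines read from an HTTP response body.
SAMPLE = [
    'data: {"text": "Hel"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"text": "lo"}\n',
    "\n",
    "data: [DONE]\n",
    "\n",
]

if __name__ == "__main__":
    for piece in stream_text(SAMPLE):
        # Print each fragment immediately instead of buffering the full reply.
        print(piece, end="", flush=True)
    print()
```

In a real client, `SAMPLE` would be replaced by an iterator over the lines of a streaming HTTP response. Because both functions are generators, each fragment can be rendered as soon as its event is complete.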

