What Is Prompt Caching TTL?
2026/03/30

TTL is the lifetime of a prompt cache entry. Each hit refreshes it. Leave it unused for long enough, and it expires.

TTL stands for Time To Live: the amount of time a cache entry can survive without being used.

For prompt caching, TTL tells you how long a cached prefix can stay alive before the system discards it.

Start With a Concrete Example

A carton of milk has the same timing problem:

You buy milk with a shelf life of 5 days

Day 1  still fresh
Day 2  still usable
Day 5  last valid day
Day 6  expired and thrown away

Prompt cache TTL works the same way, with one difference: every cache hit resets the clock.

00:00  First request writes a prefix into cache
02:30  Another request hits the cache, TTL resets to 5 minutes
07:29  Another hit refreshes the timer again
12:29  Five minutes pass with no hits, cache expires
12:31  A new request arrives, misses cache, and must rewrite it at full cost
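
To make the refresh-on-hit behavior concrete, here is a minimal sketch that replays the timeline above. It is a toy model, not any provider's implementation: the `PrefixCacheEntry` class, the `mmss` helper, and the fixed 5-minute `TTL_SECONDS` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

TTL_SECONDS = 5 * 60  # assumed 5-minute TTL, matching the timeline


@dataclass
class PrefixCacheEntry:
    """Hypothetical model of one cached prefix with a refresh-on-hit TTL."""

    expires_at: Optional[float] = None  # None means nothing is cached yet

    def request(self, now: float) -> str:
        """Handle a request at time `now` (in seconds); return 'hit' or 'miss'."""
        if self.expires_at is not None and now < self.expires_at:
            outcome = "hit"
        else:
            # Never written, or expired: the prefix is rewritten at full cost.
            outcome = "miss"
        # A hit and a fresh write both restart the timer.
        self.expires_at = now + TTL_SECONDS
        return outcome


def mmss(text: str) -> int:
    """Convert an 'MM:SS' timestamp into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


entry = PrefixCacheEntry()
timeline = ["00:00", "02:30", "07:29", "12:31"]
outcomes = [entry.request(mmss(t)) for t in timeline]

for stamp, outcome in zip(timeline, outcomes):
    print(stamp, outcome)
```

The first and last requests miss and the two in between hit. The hit at 07:29 lands one second before the entry would have expired, which is why it keeps the cache alive for another five minutes.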

Typical TTL Differences Across Providers

Provider          TTL                                    Configurable               Refreshed on hit
Anthropic Claude  5 minutes by default, 1 hour optional  Limited (5 min or 1 hour)  Yes
OpenAI            Minutes to hours                       No, opaque                 Opaque
Google Gemini     Developer-defined, 1 hour by default   Yes                        Yes

If requests keep arriving, the cache stays alive. If usage pauses for too long, the entry expires and the next request has to write it again at full cost.
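
A rough way to see what this means for cost is to count how many times a given traffic pattern forces a full-cost write. The sketch below assumes a refresh-on-hit TTL; the function name `count_full_cost_writes` and both arrival patterns are invented for illustration and do not come from any provider's API.

```python
from typing import Iterable, Optional


def count_full_cost_writes(arrival_times: Iterable[float], ttl: float) -> int:
    """Count requests that miss the cache, assuming each hit refreshes the TTL.

    `arrival_times` are request timestamps in seconds; `ttl` is in seconds.
    """
    writes = 0
    expires_at: Optional[float] = None
    for now in sorted(arrival_times):
        if expires_at is None or now >= expires_at:
            writes += 1  # miss: the prefix is written again at full cost
        expires_at = now + ttl  # a hit or a write restarts the timer
    return writes


# Hypothetical traffic: one request every 4 minutes, versus bursts with long gaps.
steady = [i * 240 for i in range(6)]
bursty = [0, 60, 120, 900, 960, 1800]

steady_writes = count_full_cost_writes(steady, ttl=300)
bursty_writes = count_full_cost_writes(bursty, ttl=300)
print(steady_writes, bursty_writes)
```

Both patterns contain six requests, but the steady one pays for a single write while the bursty one pays for three, because each gap longer than the TTL lets the entry expire.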

Author: Steve

Categories: AI