Blog

Claude Conversation Cache

Oskar Austegard · March 30, 2026

If you use Claude.ai with any regularity — particularly with extended thinking or tool use — you've hit this: Claude streams 90% of a complex response, the UI throws a "response could not be fully generated" error, and everything from that turn vanishes. Your prompt, the tool calls, the partial response. Gone. The status page, of course, reads "All Systems Operational."

The infuriating part isn't the crash. It's that all the data was right there in my browser. Claude streamed it. I watched it render. Then the error dismissed it and I got to start over, spending my own time and tokens trying to reproduce what had already been generated.

Claude Conversation Cache is a Chrome extension that fixes this. It runs silently in the background, intercepting Claude's API traffic and caching it locally in IndexedDB. When the UI crashes, your conversation data is already preserved.

How It Works

The extension operates through two content scripts running in different Chrome extension "worlds," a service worker, and a popup/viewer UI.

Fetch interception is the core. A content script in the page's MAIN world wraps window.fetch() and pattern-matches against Claude's API endpoints: conversation fetches, completion streams, and snapshot loads. For regular JSON responses it clones and caches while passing the original through. For SSE completion streams — the interesting case — it tees the ReadableStream: one fork feeds Claude's UI unchanged, the other gets decoded chunk-by-chunk and forwarded to the cache.
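That tee step is the heart of the extension. Here is a minimal sketch of the idea, assuming the function names, the callback shape, and the `/api/` URL pattern are illustrative stand-ins rather than the extension's actual code:

```javascript
// Returns a fetch-compatible wrapper that tees SSE bodies into `onChunk`
// and clones JSON responses into `onJson`. In the extension this would wrap
// window.fetch in the MAIN world; here it takes the base fetch as a
// parameter so the logic stands alone.
function wrapFetch(baseFetch, { onChunk, onJson }) {
  return async function wrappedFetch(input, init) {
    const response = await baseFetch(input, init);
    const url = typeof input === "string" ? input : input.url;

    // Only intercept API-looking endpoints (the real pattern match is
    // more specific: conversations, completions, snapshots).
    if (!/\/api\//.test(url)) return response;

    const type = response.headers.get("content-type") || "";
    if (type.includes("text/event-stream") && response.body) {
      // Tee the stream: one fork goes back to the caller (the UI),
      // the other is drained chunk-by-chunk into the cache.
      const [forCaller, forCache] = response.body.tee();
      drain(forCache, (text) => onChunk(url, text)).catch(() => {});
      return new Response(forCaller, {
        status: response.status,
        headers: response.headers,
      });
    }

    // Regular JSON: clone, so the original body stays readable by the page.
    response.clone().json().then((data) => onJson(url, data)).catch(() => {});
    return response;
  };
}

async function drain(stream, emit) {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    emit(decoder.decode(value, { stream: true }));
  }
}
```

The key property is that `ReadableStream.tee()` gives two independent forks of the same bytes, so the cache fork can lag or fail without stalling what the UI renders.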

The MAIN/ISOLATED relay solves a Chrome extension architecture constraint. The MAIN world script can intercept fetch() because it runs in the page's JS context, but it can't talk to extension APIs. The ISOLATED world script has chrome.runtime access but can't see page-level objects. They bridge via window.postMessage. This replaced an earlier BroadcastChannel approach, which failed because BroadcastChannel doesn't cross the extension/page origin boundary.
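In rough outline, the bridge might look like the sketch below. The tag name and message shape are assumptions; only the division of labor (postMessage in MAIN, chrome.runtime in ISOLATED) comes from the post:

```javascript
// Messages from our MAIN-world script are tagged so the ISOLATED-world
// listener can ignore everything else flowing through window.postMessage.
const BRIDGE_TAG = "claude-conversation-cache"; // assumed tag

// MAIN world: tag and post on the shared window (same-origin target).
function postToIsolated(payload) {
  window.postMessage({ source: BRIDGE_TAG, payload }, window.location.origin);
}

// Shared filter: returns the payload for our messages, null for anything else.
function extractBridgePayload(data) {
  return data && data.source === BRIDGE_TAG ? data.payload : null;
}

// ISOLATED world: listen on the same window, filter, and forward to the
// service worker via chrome.runtime (unavailable to the MAIN world).
if (typeof chrome !== "undefined" && chrome.runtime) {
  window.addEventListener("message", (event) => {
    if (event.source !== window) return; // ignore iframes / other windows
    const payload = extractBridgePayload(event.data);
    if (payload !== null) chrome.runtime.sendMessage(payload);
  });
}
```

Both scripts share the same `window` object even though their JavaScript contexts are separate, which is exactly why postMessage works here where BroadcastChannel did not.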

The service worker manages two IndexedDB stores: one for complete conversation snapshots, another for in-progress stream data. Chunks accumulate as they arrive. On normal completion, the service worker reconstructs the assistant message from SSE events — content_block_delta for text and thinking, tool_use blocks, etc. — and merges it into the cached conversation. On stream error, it preserves whatever raw buffer it has. That's the crash recovery path.
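The reconstruction step can be pictured as a reducer over the buffered events. The event and delta type names below follow Anthropic's published streaming format; the reducer itself is a sketch, not the extension's actual code:

```javascript
// Fold a sequence of parsed SSE events into an assistant message.
// Handles text, thinking, and tool_use blocks; unknown events are ignored.
function reconstructMessage(events) {
  const blocks = [];
  for (const ev of events) {
    if (ev.type === "content_block_start") {
      blocks[ev.index] = { ...ev.content_block };
    } else if (ev.type === "content_block_delta") {
      const block = blocks[ev.index];
      const d = ev.delta;
      if (d.type === "text_delta") {
        block.text = (block.text || "") + d.text;
      } else if (d.type === "thinking_delta") {
        block.thinking = (block.thinking || "") + d.thinking;
      } else if (d.type === "input_json_delta") {
        block.partial_json = (block.partial_json || "") + d.partial_json;
      }
    }
  }
  // tool_use inputs stream as partial JSON; parse once the events stop.
  for (const block of blocks) {
    if (block && block.type === "tool_use" && block.partial_json !== undefined) {
      try {
        block.input = JSON.parse(block.partial_json);
      } catch {
        // Stream died mid-JSON: keep the raw fragment rather than lose it.
      }
      delete block.partial_json;
    }
  }
  return { role: "assistant", content: blocks };
}
```

Because the raw chunks are persisted as they arrive, this fold can run on whatever prefix of the stream survived a crash, which is what makes the error path recoverable.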

Proactive capture means the extension doesn't wait for new messages. On page load and SPA navigation it fetches the current conversation via the bootstrap API and caches it immediately. A MutationObserver plus popstate listener handles Claude's client-side routing.
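A navigation watcher along these lines would cover both signals. The `/chat/<uuid>` route pattern and every name here are assumptions for illustration:

```javascript
// Pull a conversation ID out of a path like /chat/<uuid>.
// The route shape is an assumption based on claude.ai's URLs.
function conversationIdFromPath(pathname) {
  const match = pathname.match(/\/chat\/([0-9a-f-]{36})/i);
  return match ? match[1] : null;
}

// Hypothetical watcher: fires onConversation whenever the route's
// conversation ID changes, via popstate or client-side routing.
function watchNavigation(onConversation) {
  let lastId = null;
  const check = () => {
    const id = conversationIdFromPath(window.location.pathname);
    if (id && id !== lastId) {
      lastId = id;
      onConversation(id); // e.g. fetch the bootstrap snapshot and cache it
    }
  };
  check(); // initial page load
  window.addEventListener("popstate", check); // back/forward navigation
  // SPA route changes that bypass popstate: fall back to DOM mutations.
  new MutationObserver(check).observe(document.body, {
    childList: true,
    subtree: true,
  });
}
```

Deduplicating on the last-seen ID keeps the MutationObserver cheap: the callback fires constantly during streaming, but the cache fetch only happens when the route actually changes.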

Relationship to Claude Pruner

I already had Claude Pruner, a bookmarklet that fetches conversation data via the API and opens a pruner UI for selective export. It works well for its purpose, but a bookmarklet is reactive — you run it after you notice something worth saving. When Claude's inference engine dies mid-stream, there's nothing left to fetch.

The cache extension reuses Pruner's rendering engine (the pruner-core.js library) for viewing cached conversations. Click the extension icon for a list of cached conversations sorted by recency. Click a row to open the full-tab viewer with the same granular selection interface: toggle human/assistant/tools/thinking, stats for selected content, and export as structured text or standalone HTML. There's also a manual "Capture" button and quick links to jump back to conversations on claude.ai.

What It Doesn't Do

This is purely local and defensive. It doesn't modify Claude's behavior, doesn't send data anywhere, doesn't attempt to resume failed generations. It just makes sure the data that was already streamed to your browser doesn't vanish when Claude's infrastructure hiccups. All data lives in IndexedDB until you clear it.

Installation

Clone or download from GitHub. Open chrome://extensions/, enable Developer mode, click "Load unpacked," select the folder. Browse claude.ai normally — the extension works silently in the background.

Co-written with Muninn, my stateful AI agent.