Numbers Every Claude.ai Skill Developer Should Know

Inspired by Jeff Dean's "Numbers Everyone Should Know" — adapted for the Claude.ai Container Compute Environment. Empirically researched and written by Claude Opus 4.6.

Jeff Dean's insight was that engineers who internalize a few key performance numbers can evaluate design alternatives on the back of an envelope — without building prototypes. The same principle applies to designing Skills for Claude.ai. Every tool call, file read, package install, and line of generated code has both a time cost (user-experienced latency) and a token cost (dollars). Skill authors who know these numbers will write faster, cheaper, and more effective skills. Those who don't will produce skills that feel sluggish and burn through token budgets.

This document provides the reference numbers, measured empirically in a live Claude.ai container (February 2026). Skim the summary table, absorb the design rules, and refer to the appendix when you need specifics.


At a Glance

Python startup                              40 ms
Import numpy                               200 ms
Import pandas                              860 ms
Import sklearn                           2,750 ms
uv pip install (any complexity)          ~800 ms        always use uv
pip install (light)                      3,500 ms
pip install (heavy)                     91,000 ms        113× slower than uv
npm install express                      4,500 ms
Playwright browser launch               1,725 ms
Write 100 MB to disk                        84 ms
Read 100 MB from disk                       42 ms
Tool call round-trip (minimum)           3,000 ms        the dominant cost
Tool call round-trip (web_search)   5,000–10,000 ms

--- Token costs at Opus 4.5 rates ($5/$25 per MTok in/out) ---
1 tool call overhead                    ~50 tokens      ~$0.001
Read 10 KB file (input)              ~2,600 tokens      ~$0.013
Write 10 KB file (output)            ~2,600 tokens      ~$0.065   5× reading
web_search result                2,000–5,000 tokens     ~$0.01–0.03
web_fetch full page             5,000–50,000 tokens     ~$0.03–0.25
web_fetch (token-limited to 100)       ~130 tokens      ~$0.001

Design Rules

1. Batch everything into single tool calls

The ~3–4 second tool call round-trip dominates any operation that takes less than a second internally. This is the single most important optimization for skill design.

Bad: 5 view calls to read 5 files → 15–20 seconds, 125 output tokens of overhead
Good: 1 bash cat file1 file2 ... call → 3–4 seconds, 45 output tokens of overhead
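A back-of-envelope model makes the math concrete. This is a sketch using the measured numbers above (~3.5 s round-trip, roughly 25 output tokens of per-call overhead); the exact constants vary by tool.

```python
# Latency/token model for N file reads, one tool call each vs. one batched call.
ROUND_TRIP_S = 3.5        # midpoint of the observed 3-4 s round-trip
OVERHEAD_TOKENS = 25      # rough per-call invocation overhead

def read_cost(n_calls: int) -> tuple[float, int]:
    """Total wall-clock seconds and overhead tokens for n_calls tool calls."""
    return n_calls * ROUND_TRIP_S, n_calls * OVERHEAD_TOKENS

naive = read_cost(5)      # one view call per file -> (17.5 s, 125 tokens)
batched = read_cost(1)    # one `bash cat file1 file2 ...` -> (3.5 s, 25 tokens)
```

Whatever the work inside the call costs, the fixed round-trip dominates; batching divides that fixed cost by N.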

2. Always use uv, never pip

uv pip install is 4.5× faster for light packages and 113× faster for heavy dependency trees (91s → 0.8s). The container ships with uv 0.9.26 pre-installed. There is no reason to ever use pip.

3. Check what's pre-installed before designing a skill

The container ships with a massive set of pre-installed Python packages: pandas, numpy, matplotlib, scikit-learn, scipy, openpyxl, python-docx, python-pptx, reportlab, pdfplumber, playwright, lxml, beautifulsoup4, Pillow, seaborn, requests, flask, markitdown, and many more. Node has pptxgenjs, sharp, mermaid, marked, pdf-lib, playwright, and typescript. Any skill that installs one of these is wasting 1–90 seconds for nothing.

4. Output tokens are 5× input tokens — treat writes as expensive

Reading a 10 KB file costs ~$0.013 (input tokens). Writing one costs ~$0.065 (output tokens). This makes str_replace for targeted edits far more economical than regenerating entire files, and makes the "iterate in /home/claude, copy once to outputs" pattern important for multi-step workflows.
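The asymmetry is easy to compute. A sketch at Opus 4.5 rates, using this document's ~4-characters-per-token heuristic (which gives 2,500 tokens for 10 KB, close to the measured ~2,600):

```python
# Dollar cost of moving a file through context, read (input) vs write (output).
IN_PER_TOK = 5.00 / 1_000_000    # Opus 4.5 input price per token
OUT_PER_TOK = 25.00 / 1_000_000  # Opus 4.5 output price per token
CHARS_PER_TOKEN = 4              # rough heuristic for English text

def file_cost(size_bytes: int, direction: str) -> float:
    tokens = size_bytes / CHARS_PER_TOKEN
    rate = IN_PER_TOK if direction == "read" else OUT_PER_TOK
    return tokens * rate

read_10kb = file_cost(10_000, "read")    # ~$0.0125
write_10kb = file_cost(10_000, "write")  # ~$0.0625, 5x the read
```

A str_replace that touches 500 bytes pays output price on ~125 tokens instead of ~2,500, which is why targeted edits win.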

5. Use view with line ranges aggressively

Reading an entire 10 KB file to check one section costs ~2,600 input tokens. Reading 5 specific lines costs ~80. That's a 32× difference. Inspect structure first (wc -l, head), then target specific ranges.

6. Control web_fetch ingestion

An uncontrolled web_fetch can dump 50,000 input tokens into context. Always use text_content_token_limit. For checking if a page exists or grabbing a headline, 100–200 tokens is plenty.

7. Front-load context in system prompts (caching makes it nearly free)

Prompt caching gives 90% savings on cache reads. Skill instructions in the system prompt pay full price on the first turn but only 10% on every subsequent turn. This makes detailed, information-rich skill instructions economically sound — the amortized cost is low.
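The amortization is simple arithmetic. A sketch assuming Opus 4.5 input pricing and the 90% cache-read discount (cache reads bill at 10% of the input rate):

```python
# Amortized per-turn cost of S system-prompt tokens over a T-turn session.
def amortized_cost_per_turn(system_tokens: int, turns: int,
                            in_price_per_mtok: float = 5.00) -> float:
    full = system_tokens / 1e6 * in_price_per_mtok          # turn 1: full price
    cached = full * 0.1                                     # later turns: cache read
    return (full + (turns - 1) * cached) / turns

# 5,000 tokens of skill instructions over a 20-turn session:
per_turn = amortized_cost_per_turn(5_000, 20)   # ~$0.0036/turn vs $0.025 uncached
```

The longer the session, the closer the per-turn cost falls toward the 10% floor, so rich instructions get cheaper the more they are used.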

8. Work in /home/claude, deliver to /mnt/user-data/outputs

Write throughput to /mnt/user-data/outputs is 2.4× slower than /home/claude. For any multi-step file generation, always work in the home directory and copy the final artifact to outputs.
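The pattern can be sketched as a small helper. Paths are parameters here for testability; in the container, `work_dir` would be /home/claude and `out_dir` would be /mnt/user-data/outputs. The helper name and `steps` interface are illustrative.

```python
# Iterate on the fast disk, then copy the finished artifact once.
import shutil
from pathlib import Path

def build_then_deliver(work_dir: Path, out_dir: Path, name: str, steps) -> Path:
    """Run every generation step against a scratch file in work_dir,
    then copy the final result to out_dir with a single write."""
    scratch = work_dir / name
    for step in steps:              # each step rewrites/extends the scratch file
        step(scratch)
    out_dir.mkdir(parents=True, exist_ok=True)
    final = out_dir / name
    shutil.copy(scratch, final)     # one write to the 2.4x-slower mount
    return final
```

Every intermediate write hits the fast filesystem; the slow mount is touched exactly once.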

9. Know the file-inspection decision matrix

Goal                       Best approach                          Token cost
Does file exist?           bash ls -la                            ~50
File size / line count?    bash wc -lc                            ~50
Read specific lines        view with range                        ~80 + lines
Read full small file       view (auto-truncates at ~16K chars)    ~2,600 for 10 KB
Read full large file       bash cat (no truncation limit)         varies
Multiple file ops          single bash call                       one overhead charge

What We Can't Measure From Inside

Some numbers cannot be collected from inside the container and require external observation (a bookmarklet or Claude in Chrome, planned for Phase 2).


Appendix A: Container Specifications

OS: Ubuntu 24.04.3 LTS, kernel 4.4.0
CPU: 4 cores (model undisclosed)
RAM: 9 GB
Disk: 10 GB (shared /home/claude and /mnt/user-data)
Python: 3.12.3 | Node: 22.22.0 | npm: 10.9.4 | uv: 0.9.26 | git: 2.43.0

The container filesystem resets between tasks. All benchmarks reflect fresh container state.

Appendix B: Detailed Benchmarks

B.1 File I/O

Operation                               Time
Write 100 MB (/home/claude)             84 ms   (~1.2 GB/s)
Read 100 MB (/home/claude)              42 ms   (~2.4 GB/s)
Write 50 MB (/home/claude)              50 ms   (~1 GB/s)
Write 50 MB (/mnt/user-data/outputs)    122 ms  (~410 MB/s)
Create 1,000 small files                213 ms  (0.21 ms/file)

B.2 Python Import Times (cold, isolated process)

Library                      Time    Pre-installed?
lxml                         2 ms    ✅
PIL / Pillow                 4 ms    ✅
reportlab                   15 ms    ✅
json / csv               17–21 ms    ✅ stdlib
bs4 (BeautifulSoup)        191 ms    ✅
numpy                      200 ms    ✅
docx (python-docx)         223 ms    ✅
scipy                      229 ms    ✅
pdfplumber                 276 ms    ✅
requests                   314 ms    ✅
flask                      359 ms    ✅
pptx (python-pptx)         368 ms    ✅
matplotlib                 495 ms    ✅
openpyxl                   511 ms    ✅
matplotlib.pyplot          837 ms    ✅
pandas                     858 ms    ✅
markitdown               1,913 ms    ✅
sklearn                  2,752 ms    ✅

B.3 Compute Operations

Operation                                   Time
Python interpreter startup                 41 ms
Node.js interpreter startup                53 ms
Simple Python script (json serialize)      57 ms
gzip 1 MB text                             20 ms
gzip 1 MB random data                      43 ms
NumPy 1000×1000 matrix multiply           317 ms
Pandas groupby on 100K rows             1,071 ms
Playwright Chromium launch + close      1,725 ms

B.4 Package Installation

Method            Package                      Time
uv pip install    httpx (light)               773 ms
pip install       httpx (light)             3,512 ms
uv pip install    ydata-profiling (heavy)     803 ms
pip install       ydata-profiling (heavy)  91,039 ms
npm install       express                   4,533 ms

B.5 Network (post-proxy-initialization)

Operation                            Time
DNS resolution (via local proxy)     < 0.1 ms
TCP connect (via proxy)              < 1 ms
curl fetch 77 KB (pypi.org)          53 ms
curl fetch 800 KB (npmjs.org)        116 ms
curl to github.com (301 redirect)    185 ms

Note: the first request in a new container may fail while the proxy initializes; retry once.
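The retry-once advice can be wrapped in a few lines. A sketch using only stdlib urllib (no installs needed); the helper name is illustrative.

```python
# GET a URL, retrying exactly once to absorb the cold-container proxy quirk.
import urllib.request
from urllib.error import URLError

def fetch_with_retry(url: str, timeout: float = 10.0) -> bytes:
    """The first request in a fresh container can race proxy init,
    so one failure gets one immediate retry; a second failure is real."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except URLError:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
```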

B.6 Tool Call Round-Trip Overhead

Measured as wall-clock deltas between consecutive date +%s%N calls across minimal bash_tool invocations. Includes Claude's inter-call processing + API dispatch + execution.

Tool                                     Observed range
bash_tool (minimal echo)                 3–4 seconds
view (small file range)                  3–5 seconds
create_file                              3–5 seconds
web_fetch (small page, token-limited)    5–8 seconds
web_search                               5–10 seconds

B.7 Token Economics

API pricing (Feb 2026):

Model         Input / MTok    Output / MTok    Cache read / MTok
Opus 4.5         $5.00           $25.00             $0.50
Sonnet 4.5       $3.00           $15.00             $0.30
Haiku 4.5        $1.00            $5.00             $0.10

Per-tool token costs:

Tool                  Invocation (output)    Result (input)    Notes
bash_tool             ~30–50                 ~20 + stdout      stdout returned as JSON
view (directory)      ~20–30                 ~50–200           2-level listing
view (file, full)     ~20–30                 ~30 + content     truncates at ~16K chars
view (file, range)    ~20–30                 ~30 + range       most token-efficient
create_file           ~25 + content          ~15               content is output tokens!
str_replace           ~30 + strings          ~15               both strings are output tokens
web_search            ~10–15                 ~1,000–5,000      varies by result richness
web_fetch             ~15–20                 controllable      use text_content_token_limit
present_files         ~15–20                 ~15               minimal

Appendix C: Methodology

All timings measured with date +%s%N (nanosecond precision) inside the container. Python import times measured in isolated processes (separate python3 -c invocations) to avoid module caching effects. Tool call round-trip times are upper bounds that include Claude's inter-call processing. Package install times are single-shot and may vary with network conditions. Token estimates use ~4 characters per token for English text. Pricing sourced from Anthropic's documentation and third-party analysis, February 2026.
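The cold-import measurement can be reproduced with a few lines. This sketch times a fresh python3 subprocess per import, so the parent's module cache cannot skew the result; the reported figure includes the ~40 ms interpreter startup, which a bare `python3 -c pass` baseline would isolate.

```python
# Time a cold import in an isolated process, mirroring the B.2 methodology.
import subprocess
import time

def cold_import_ms(module: str) -> float:
    """Wall-clock milliseconds for `python3 -c "import <module>"` end to end."""
    start = time.perf_counter()
    subprocess.run(["python3", "-c", f"import {module}"], check=True)
    return (time.perf_counter() - start) * 1000

baseline = cold_import_ms("sys")       # roughly the bare interpreter startup
numpy_ms = cold_import_ms("json")      # subtract baseline for the import alone
```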