A Skill Is a Folder, Not a Prompt: What Anthropic Learned Running Hundreds of Them

TL;DR

Anthropic published a June 3, 2026 Claude blog post by Claude Code engineer Thariq Shihipar on how its teams use hundreds of Skills, reusable folders that can hold instructions, scripts, references and hooks. The confirmed development is the publication of those internal lessons; the main attributed claim is that verification Skills had the strongest effect on output quality.

Anthropic has published lessons from running hundreds of Claude Code Skills across its engineering organization, framing Skills as folders of reusable agent knowledge rather than one-off prompts. The June 3, 2026 post matters because it shows how a major AI lab is trying to make coding-agent work repeatable inside teams.

The post, “Lessons from building Claude Code: How we use skills,” was written by Thariq Shihipar, a Claude Code engineer. The source material says a Skill can include SKILL.md instructions, reference files, scripts, templates, configuration, hooks and memory.

The core point is definitional: a Skill is not just saved markdown. It is a discoverable folder the agent can read and use when a task matches its description, then pull deeper material only when needed.

Anthropic grouped its internal Skills into nine categories: library and API references, product verification, data fetching and analysis, business-process automation, code scaffolding, code quality and review, CI/CD and deployment, runbooks, and infrastructure operations. According to the source material summarizing Anthropic’s measurements, verification Skills had the largest effect on output quality.

At a glance
When: Published June 3, 2026; covered in a Ju…
The development: Anthropic published guidance describing what it learned from running hundreds of Claude Code Skills across its engineering organization.
AI Dispatch · Insights · 1 July 2026

A Skill is a folder, not a prompt

Anthropic published what it learned running hundreds of Skills across its own engineering org. Read as a business memo, the point is bigger than a coding trick: this is how ad-hoc prompting becomes durable institutional capability — the SOPs your agents actually follow, versioned and shared.

✕ The misconception

“A Skill is just a clever markdown prompt you save in a file.”

✓ What it actually is

A folder the agent can discover, read & run — instructions, scripts, references, templates, config & on-demand hooks.

Anatomy of a Skill — the file system is context engineering
my-skill/          the unit you share & version
├─ SKILL.md        root instructions + a description written for the model (its trigger)
├─ references/     deep detail pulled in only when needed (progressive disclosure)
├─ scripts/        real code, so the agent composes instead of rebuilding boilerplate
├─ assets/         templates & files to copy into the output
├─ config.json     setup the agent asks for if it’s missing (e.g. which Slack channel)
└─ hooks + memory  on-demand guardrails + an append-only log so it remembers
Why it matters: the folder itself is the knowledge base. The agent reads the root, then reaches deeper only when the task demands it — the same way you’d hand a new hire a one-pager that points to the detailed docs.
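The config.json line above describes setup the agent asks for when it is missing. The source does not show how that check is implemented, so the following is only a minimal sketch under stated assumptions: the function name `missing_settings` and the key `slack_channel` are hypothetical, chosen to match the Slack-channel example in the diagram.

```python
import json
import tempfile
from pathlib import Path


def missing_settings(skill_dir: Path, required: list[str]) -> list[str]:
    """Return the required keys that config.json does not yet provide."""
    config_path = skill_dir / "config.json"
    if not config_path.exists():
        return list(required)  # nothing configured yet
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Treat nulls and empty strings as "not set" so the question is asked again.
    return [key for key in required if config.get(key) in (None, "")]


def demo() -> tuple[list[str], list[str]]:
    """Check a throwaway Skill folder before and after setup."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / "my-skill"
        skill_dir.mkdir()
        # "slack_channel" is a hypothetical key used only for illustration.
        before = missing_settings(skill_dir, ["slack_channel"])
        (skill_dir / "config.json").write_text(
            json.dumps({"slack_channel": "#releases"}), encoding="utf-8"
        )
        after = missing_settings(skill_dir, ["slack_channel"])
    return before, after


if __name__ == "__main__":
    print(demo())
```

A non-empty result is the cue to ask the user for those values before running the rest of the Skill; an empty list means setup is complete.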
The nine types — a gap-analysis map for your own library
1. Library / API reference
2. Product verification (★ top impact)
3. Data fetching & analysis
4. Business-process automation
5. Code scaffolding & templates
6. Code quality & review
7. CI/CD & deployment
8. Runbooks
9. Infrastructure operations
By Anthropic’s own measurement, verification Skills — the ones that check the work — moved output quality the most. If you build one category well, build that one.
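The material does not describe how Anthropic's verification Skills are built, so the script below is an illustrative sketch only. It assumes a verification Skill ships a small script that runs named checks over the agent's output and reports which ones failed. The check functions `not_empty` and `no_todo_left` are hypothetical examples, not checks taken from the post.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A check returns None on success or a short failure message.
Check = Callable[[str], Optional[str]]


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def not_empty(output: str) -> Optional[str]:
    """Hypothetical check: the agent produced something."""
    return None if output.strip() else "output is empty"


def no_todo_left(output: str) -> Optional[str]:
    """Hypothetical check: no unfinished markers remain."""
    return "found TODO marker" if "TODO" in output else None


def run_checks(output: str, checks: dict[str, Check]) -> list[CheckResult]:
    """Run every check so the report lists all failures, not just the first."""
    results = []
    for name, check in checks.items():
        problem = check(output)
        results.append(CheckResult(name, problem is None, problem or "ok"))
    return results


def summarize(results: list[CheckResult]) -> str:
    """Collapse results into one line the agent can act on."""
    failed = [r for r in results if not r.passed]
    if not failed:
        return "PASS"
    return "FAIL: " + "; ".join(f"{r.name}: {r.detail}" for r in failed)


if __name__ == "__main__":
    checks = {"not_empty": not_empty, "no_todo_left": no_todo_left}
    print(summarize(run_checks("def f(): return 1  # TODO tidy", checks)))
```

Returning a message per failed check, rather than a bare true or false, gives the agent something specific to fix on the next pass.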
The craft — what separates a good Skill from a useless one
Gotchas = highest-signal section
Describe for the model, not humans (it’s the trigger)
Don’t state the obvious
Ship scripts, not just prose
On-demand guardrail hooks (/careful, /freeze)
Let it remember (log / SQLite)
Don’t railroad: leave room to adapt
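The "let it remember" item mentions a log or SQLite. As a hedged sketch of what an append-only memory could look like, the code below uses Python's standard `sqlite3` module. The table name `notes` and the helper names `open_memory`, `remember` and `recall` are assumptions made for illustration; the source does not specify a schema.

```python
import sqlite3
from datetime import datetime, timezone


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the memory store; pass a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "recorded_at TEXT NOT NULL, "
        "note TEXT NOT NULL)"
    )
    return conn


def remember(conn: sqlite3.Connection, note: str) -> None:
    """Append a note; this sketch only inserts rows, never updates or deletes."""
    conn.execute(
        "INSERT INTO notes (recorded_at, note) VALUES (?, ?)",
        (datetime.now(timezone.utc).isoformat(), note),
    )
    conn.commit()


def recall(conn: sqlite3.Connection, limit: int = 5) -> list[str]:
    """Return the most recent notes, newest first."""
    rows = conn.execute(
        "SELECT note FROM notes ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    memory = open_memory()
    remember(memory, "staging deploy needs the VPN")
    remember(memory, "retry flaky test once before failing")
    print(recall(memory))
```

Ordering by the autoincrement id rather than the timestamp keeps recall stable even when two notes land in the same instant.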
The take

The knowledge of how your organization actually operates can be captured, versioned, shared & executed — and the thing capturing it is a humble folder with a script and a gotchas list inside. For the builder, that’s context engineering with real tools attached. For whoever owns the budget, it’s the difference between AI that starts from zero every morning and an asset that compounds. Caveats: best practices are still evolving, checked-in Skills cost context, and curation beats accumulation. Start with one Skill, one gotcha, and the category that catches your mistakes.

Source: “Lessons from building Claude Code: How we use skills,” Thariq Shihipar (Anthropic), Claude blog, 3 June 2026. Categories, examples & measured claims are Anthropic’s; framing is the author’s. Docs: code.claude.com/docs/en/skills.
thorstenmeyerai.com

Skills Turn Prompts Into Assets

For engineering leaders, the report points to a shift from ad hoc prompting to versioned operating procedures. If Skills are maintained like code, teams can review, share and update the knowledge that agents use for repeated work.

For developers, the practical change is that a Skill can package scripts and templates, not only advice. That may reduce repeated boilerplate and make agent outputs more consistent, though the result still depends on how well each Skill is written and maintained.

Inside Anthropic’s Skills Library

Anthropic’s model uses progressive disclosure: the agent sees a short Skill description first, then opens references, scripts or assets when the task requires them. Thorsten Meyer AI described the folder itself as the knowledge base.
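Progressive disclosure can be sketched in two stages: index only the short descriptions first, then read a deeper file when the task calls for it. The code below is a minimal illustration, not Anthropic's loader. It assumes the description sits in a simple `---` delimited front-matter block at the top of SKILL.md and parses it naively rather than with a full YAML parser. The Skill name `deploy-check` and the file `rollback.md` are hypothetical.

```python
import tempfile
from pathlib import Path


def read_description(skill_md: Path) -> str:
    """Return the `description:` value from a leading front-matter block."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of front matter; the body is not read here
        key, sep, value = line.partition(":")
        if sep and key.strip() == "description":
            return value.strip()
    return ""


def discover(skills_root: Path) -> dict[str, str]:
    """Stage 1: map each Skill folder to its short description only."""
    return {
        skill_md.parent.name: read_description(skill_md)
        for skill_md in sorted(skills_root.glob("*/SKILL.md"))
    }


def load_reference(skills_root: Path, skill: str, name: str) -> str:
    """Stage 2: read one reference file, only once the task needs it."""
    return (skills_root / skill / "references" / name).read_text(encoding="utf-8")


def demo() -> tuple[dict[str, str], str]:
    """Build a throwaway Skill folder and walk both stages."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        refs = root / "deploy-check" / "references"
        refs.mkdir(parents=True)
        (root / "deploy-check" / "SKILL.md").write_text(
            "---\nname: deploy-check\n"
            "description: Verify a deploy before release\n---\nSteps...\n",
            encoding="utf-8",
        )
        (refs / "rollback.md").write_text("Rollback notes", encoding="utf-8")
        index = discover(root)
        detail = load_reference(root, "deploy-check", "rollback.md")
    return index, detail


if __name__ == "__main__":
    print(demo())
```

The point of the split is cost: the index stays small no matter how many reference files a Skill carries, because none of them are opened until requested.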

The July 1 dispatch cast the post as a business memo as much as a developer guide. Its reading: Skills can capture tribal knowledge, guardrails and repeatable checks in a form that agents can actually apply.

“A Skill is a folder, not a prompt.”

— Thorsten Meyer AI dispatch

Quality Data Still Limited

It is not yet clear from the provided material how Anthropic measured quality gains, what baseline it used, or how much of the change came from Skills rather than other workflow changes. The post also does not prove that the same gains will appear in every company or codebase.

Open questions include maintenance cost, ownership rules and how hooks or memory should be governed when Skills are shared across large teams. Those are implementation risks, not confirmed failures.

Teams Test Folder-Based Workflows

The next step for teams using coding agents is likely a small pilot: one frequent task, one verification Skill and a review process for updates. The source material says best practices are still evolving, and that curation matters more than simply collecting more Skills.

Key Questions

What did Anthropic publish?

Anthropic published a Claude blog post on June 3, 2026 about how its engineering organization uses Claude Code Skills.

What is a Skill in Claude Code?

A Skill is described as a folder that can contain instructions, scripts, references, templates, config and hooks. It is more than a single prompt file.

Why do verification Skills matter?

According to the source material summarizing Anthropic’s measurement, verification Skills had the strongest effect on output quality because they check the agent’s work.

Is this proven outside Anthropic?

No. The material confirms Anthropic’s internal experience, but broader results across other teams, tools and codebases remain unverified.

What should teams try first?

The source material recommends starting with one Skill, one known failure pattern and the category that catches common mistakes.

Source: Thorsten Meyer AI
