Building Claude Skills: A Complete Developer Guide

What Are Claude Skills

Direct answer: Claude Skills are folders of markdown instructions, scripts, and reference files that Claude loads dynamically to handle specialized tasks. They eliminate the need to re-paste context, style guides, or workflow rules into every conversation session.

Every time you start a new Claude conversation, you begin from zero. Your preferred output format, domain vocabulary, team writing style, and quality standards vanish. You spend the first few exchanges rebuilding context you already established in previous sessions. For a one-off question, that friction is tolerable. For repeatable professional work, it is a tax on every conversation.

Claude Skills fix this. A skill is a folder of instructions you build once; Claude loads it automatically when the task calls for it. Your preferences, workflows, and domain expertise live inside the skill, not in copy-pasted chat messages. Skills launched in October 2025 and quickly became the standard mechanism for giving Claude domain-specific capabilities across Claude Code, Claude Desktop, and the Claude API.

Anthropic published the official skills repository at github.com/anthropics/skills as a working reference. As of mid-2026, the repo has over 148,000 stars and 17,500 forks — one of the most-watched AI tooling repositories on GitHub. The repository contains production-grade skills for document creation (DOCX, PDF, PPTX, XLSX), frontend design, code review, and more.

Key Takeaways

  • What: Claude Skills are portable, open-source instruction folders (SKILL.md + supporting files) that give Claude persistent domain expertise across all surfaces.
  • Architecture: Three-level progressive disclosure — YAML frontmatter (always loaded, ~100 tokens), SKILL.md body (loaded on relevance), referenced files (loaded on demand).
  • Build time: Anthropic promises a working skill in 15-30 minutes using the skill-creator meta-skill from the official repository.
  • Key rules: SKILL.md is case-sensitive, folders must be kebab-case, descriptions need specific trigger phrases, no XML in frontmatter.
  • Standard: Agent Skills is an open standard at agentskills.io — designed for cross-platform portability beyond Claude.

The Three-Level Architecture

Skills are not plugins in the traditional sense. They are not compiled code, paid add-ons, or model modifications. A skill is a folder containing a SKILL.md file (required) and optional supporting directories: scripts/ for executable code, references/ for documentation Claude loads on demand, and assets/ for templates and supporting files.

What makes the architecture effective is a three-level progressive disclosure system designed to minimize token usage while maintaining specialized expertise:

  • YAML frontmatter: Always loaded into Claude’s system prompt, costing roughly 100 tokens per skill regardless of how many are installed. This metadata layer gives Claude enough information to decide whether the skill is relevant to the current task without loading the full content.
  • SKILL.md body: Loaded only when Claude determines the skill is relevant. Contains full instructions, step-by-step workflows, examples, and troubleshooting guidance.
  • Referenced files: Additional files in references/ and assets/ that Claude navigates only when the task specifically requires them. Long API reference guides, detailed style specifications, or extended troubleshooting sections live here rather than in the main file.

This system means you can install many skills simultaneously without bloating Claude’s context window. Only the frontmatter of each skill loads by default. Three design principles govern the entire system:

  • Progressive disclosure: Load detail only when needed, keeping the base context lean.
  • Composability: Claude can load multiple skills at once. Your skill should work alongside others, not assume it operates alone.
  • Portability: Skills work identically across Claude.ai, Claude Code, and the API. Build once, deploy everywhere.
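
The effect of the three loading levels on context size can be illustrated with a small model. This is only a sketch: the `Skill` class, the `context_cost` function, and every token count other than the roughly 100-token frontmatter figure are hypothetical values chosen for illustration, not measurements of Claude's actual loader.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical in-memory model of one installed skill."""
    name: str
    frontmatter_tokens: int   # level 1: always in context
    body_tokens: int          # level 2: loaded only when the skill is relevant
    reference_tokens: dict[str, int] = field(default_factory=dict)  # level 3


def context_cost(skills, active=(), opened=()):
    """Estimate tokens in context for a set of installed skills.

    active: names of skills whose SKILL.md body has been loaded.
    opened: (skill_name, file_path) pairs for referenced files actually read.
    """
    total = 0
    for skill in skills:
        total += skill.frontmatter_tokens              # every installed skill pays this
        if skill.name in active:
            total += skill.body_tokens                 # only relevant skills pay this
        for owner, path in opened:
            if owner == skill.name:
                total += skill.reference_tokens[path]  # only files actually read
    return total


# Illustrative numbers only (assumptions, not measurements).
skills = [
    Skill("blog-content-writer", 100, 1500, {"references/style-guide.md": 4000}),
    Skill("code-review", 100, 2000),
    Skill("pdf-tools", 100, 2500),
]

print(context_cost(skills))                                  # 300
print(context_cost(skills, active={"blog-content-writer"}))  # 1800
print(context_cost(
    skills,
    active={"blog-content-writer"},
    opened=[("blog-content-writer", "references/style-guide.md")],
))                                                           # 5800
```

With nothing active, three installed skills cost only their frontmatter. The body and the style guide are added only once the task calls for them.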

For teams using Model Context Protocol (MCP) servers, skills add a knowledge layer on top of raw connectivity. The way Anthropic frames it: MCP provides the professional kitchen — access to tools, ingredients, and equipment. Skills provide the recipes and step-by-step instructions. MCP tells Claude what it can do. Skills tell Claude how to do it well.

Planning Before You Write

The most common mistake when building a skill is starting with the file structure rather than the use case. Anthropic’s guide is explicit: identify two or three concrete use cases before touching any files.

A well-defined use case answers four questions:

  1. What does a user want to accomplish?
  2. What multi-step workflow does this require?
  3. Which tools are needed — Claude’s built-in capabilities, or MCP-connected tools?
  4. What domain knowledge should be embedded that the user would otherwise need to explain every session?

Anthropic’s team has observed three categories that cover most skill use cases:

  • Document and Asset Creation: Creating consistent output documents, presentations, frontend designs, and code. The defining characteristic is embedded style guides and quality checklists. The official repository includes production-grade document skills for DOCX, PDF, PPTX, and XLSX manipulation.
  • Workflow Automation: Multi-step processes with consistent methodology — research pipelines, content workflows, and onboarding sequences. Key techniques include step-by-step workflows with validation gates between stages, templates for repeating structures, and iterative refinement loops.
  • MCP Enhancement: Workflow guidance layered on top of a working MCP server. If users have connected Notion, Linear, or Sentry via MCP but do not know which workflows to run, an enhancement skill provides the knowledge layer — sequencing tool calls, embedding domain expertise, and handling errors.

Before writing any SKILL.md content, define success criteria. Anthropic recommends two types: quantitative (the skill triggers on at least 90% of relevant queries and completes the workflow in a defined number of tool calls) and qualitative (users do not need to redirect Claude mid-workflow, outputs are structurally consistent across runs, and a new user can accomplish the task on the first try). These standards matter because function calling accuracy tends to be lower in production than in controlled tests, and a structured skill helps limit that drop.

File Structure and Naming Rules

This is where most skills fail silently. The rules are strict, and the failures are confusing because Claude simply will not load a skill that violates them, and no error message explains why.

The required file structure:

your-skill-name/
├── SKILL.md           # Required — main skill file
├── scripts/           # Optional — executable code
│   ├── process_data.py
│   └── validate.sh
├── references/        # Optional — documentation loaded as needed
│   ├── api-guide.md
│   └── examples/
└── assets/            # Optional — templates, fonts, icons
    └── report-template.md
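
The layout above can be created by hand, but a short script keeps the casing and folder names consistent. The following is a minimal sketch using only the Python standard library. The function name `scaffold_skill` and the placeholder SKILL.md content are illustrative assumptions, not part of any official tooling.

```python
import tempfile
from pathlib import Path


def scaffold_skill(parent: Path, name: str, description: str) -> Path:
    """Create <parent>/<name>/ with SKILL.md and the three optional subfolders."""
    root = parent / name
    for sub in ("scripts", "references", "assets"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    skill_md = root / "SKILL.md"  # exact casing matters
    if not skill_md.exists():     # never overwrite an existing skill file
        skill_md.write_text(
            f"---\nname: {name}\ndescription: {description}\n---\n\n"
            f"# {name}\n\n## Instructions\n",
            encoding="utf-8",
        )
    return root


# Demo in a temporary directory so nothing is written to the working tree.
with tempfile.TemporaryDirectory() as tmp:
    root = scaffold_skill(
        Path(tmp),
        "your-skill-name",
        "What it does. Use when user asks to [specific phrases].",
    )
    print(sorted(p.name for p in root.iterdir()))
    # ['SKILL.md', 'assets', 'references', 'scripts']
```

The description is written into the frontmatter verbatim, so a description containing a colon followed by a space would need quoting before it is valid YAML.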

Critical naming rules that cause silent failures when violated:

  • SKILL.md is case-sensitive. Variations like skill.md, SKILL.MD, or Skill.md will not be recognized. Claude simply will not load the skill — no error, no warning.
  • Folder names must use kebab-case. Lowercase letters and hyphens only. No spaces, no underscores, no capitals. The folder name must match the name field in your frontmatter exactly.
  • No README.md inside the skill folder. All documentation for Claude goes in SKILL.md or references/. If distributing on GitHub, put your human-readable README at the repository root.
  • Reserved names: Skill names cannot contain “claude” or “anthropic” — these are reserved and will be rejected.
  • No XML angle brackets in frontmatter. This is a security restriction enforced at the platform level, since frontmatter appears in Claude’s system prompt.
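
Because these failures are silent, it can help to check a folder before installing it. The sketch below covers the folder-level rules from the list above. `check_skill_folder` is a hypothetical helper, and allowing digits in the name is an assumption, since the rule above mentions only lowercase letters and hyphens.

```python
import re
from pathlib import Path

# Assumption: digits are accepted alongside lowercase letters and hyphens.
# Tighten the pattern to [a-z] only if your platform rejects digits.
KEBAB_CASE = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")
RESERVED = ("claude", "anthropic")


def check_skill_folder(folder: Path) -> list[str]:
    """Return a list of rule violations for one skill folder (empty means OK)."""
    problems = []
    name = folder.name
    if not KEBAB_CASE.fullmatch(name):
        problems.append(f"folder name {name!r} is not kebab-case")
    for word in RESERVED:
        if word in name.lower():
            problems.append(f"folder name contains reserved word {word!r}")
    # Compare exact directory entries so that a case-insensitive filesystem
    # cannot make skill.md look like SKILL.md.
    entries = {p.name for p in folder.iterdir()} if folder.is_dir() else set()
    if "SKILL.md" not in entries:
        problems.append("SKILL.md missing (check exact casing)")
    if "README.md" in entries:
        problems.append("README.md found inside the skill folder")
    return problems
```

Calling `check_skill_folder(Path("blog-content-writer"))` on a correctly built folder should return an empty list. Any returned strings point at the rule that was broken.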

Writing Effective Instructions

The frontmatter is how Claude decides whether to load your skill. If the description is vague or lacks trigger conditions, the skill will not activate reliably. This is the single most common failure mode.

Minimal required format:

---
name: your-skill-name
description: What it does. Use when user asks to [specific phrases].
---
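
To confirm that the `name` field matches the folder name, the frontmatter has to be read back out of SKILL.md. The following is a deliberately simplified sketch using only the standard library. `read_frontmatter` is a hypothetical helper that handles flat `key: value` pairs and indented continuation lines only. It is not a YAML parser, and a real one should be used for anything more complex.

```python
def read_frontmatter(skill_md_text: str) -> dict[str, str]:
    """Parse top-level `key: value` pairs between the opening and closing --- lines.

    Simplification: an indented line is appended to the previous value, so
    nested mappings are flattened into plain text rather than parsed.
    """
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a --- line")
    fields: dict[str, str] = {}
    current = None
    for line in lines[1:]:
        if line.strip() == "---":          # closing delimiter: done
            return fields
        if line[:1].isspace() and current is not None:
            fields[current] += " " + line.strip()   # continuation line
        elif ":" in line:
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = value.strip()
    raise ValueError("closing --- line not found")


def name_matches_folder(skill_md_text: str, folder_name: str) -> bool:
    """True when the frontmatter `name` equals the folder name exactly."""
    return read_frontmatter(skill_md_text).get("name") == folder_name


sample = (
    "---\n"
    "name: your-skill-name\n"
    "description: What it does.\n"
    "  Use when user asks to draft posts.\n"
    "---\n"
    "# Body\n"
)
print(read_frontmatter(sample)["description"])
# What it does. Use when user asks to draft posts.
print(name_matches_folder(sample, "your-skill-name"))  # True
```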

The structure that consistently produces reliable triggering follows a formula: [What it does] + [When to use it] + [Key capabilities]. The description field has a 1,024 character limit and must include both what the skill does and when to use it.

Examples of effective descriptions:

# Good: specific task, trigger phrases, file type mentioned
# (continuation lines are indented so the value stays one YAML scalar)
description: Analyzes Figma design files and generates developer handoff
  documentation. Use when user uploads .fig files, asks for "design specs",
  "component documentation", or "design-to-code handoff".

# Good: named service, concrete trigger language
description: Manages Linear project workflows including sprint planning,
  task creation, and status tracking. Use when user mentions "sprint",
  "Linear tasks", "project planning", or asks to "create tickets".

Descriptions that fail are too vague or omit triggers:

# Bad — too vague, no trigger conditions
description: Helps with design files.

# Bad — no trigger phrases, no specific task
description: A workflow automation skill.
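
The difference between the good and bad descriptions above can be approximated with a few mechanical checks. This is a heuristic sketch, not an official validator. `lint_description` and the eight-word minimum are assumptions made for illustration. Only the 1,024 character limit, the angle-bracket restriction, and the "Use when" convention come from this guide.

```python
MAX_DESCRIPTION_CHARS = 1024   # limit stated in this guide
MIN_WORDS = 8                  # hypothetical threshold for "too vague"


def lint_description(description: str) -> list[str]:
    """Return heuristic warnings for a frontmatter description (empty means OK)."""
    problems = []
    text = " ".join(description.split())  # collapse wrapped lines into one
    if len(text) > MAX_DESCRIPTION_CHARS:
        problems.append(f"longer than {MAX_DESCRIPTION_CHARS} characters")
    if "<" in text or ">" in text:
        problems.append("contains angle brackets")
    if "use when" not in text.lower():
        problems.append('no "Use when ..." trigger clause')
    if len(text.split()) < MIN_WORDS:
        problems.append("too short to state both task and triggers")
    return problems


good = (
    "Analyzes Figma design files and generates developer handoff "
    'documentation. Use when user uploads .fig files, asks for "design specs", '
    '"component documentation", or "design-to-code handoff".'
)
print(lint_description(good))                        # []
print(lint_description("Helps with design files."))
# ['no "Use when ..." trigger clause', 'too short to state both task and triggers']
```

Passing these checks does not guarantee reliable triggering. It only catches the obvious omissions before manual testing.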

For the main instructions body, Anthropic recommends this structure: a skill name header, numbered instruction steps with clear explanations and expected outputs, concrete examples showing common scenarios, and a troubleshooting section covering foreseeable failure modes. Four practices make instructions reliable in practice:

  • Be specific and actionable: Exact commands with expected outputs, not vague directives.
  • Include error handling: Cover every foreseeable failure mode.
  • Reference bundled files clearly: Use exact paths so Claude knows where to look.
  • Use progressive disclosure: Keep SKILL.md focused on core instructions. Move detailed documentation to references/ with a link.

Building a Complete Skill

Here is a production-quality skill for a content writer who wants Claude to follow their company’s article style guide automatically — in every session, without pasting guidelines into the chat each time.

Folder structure:

blog-content-writer/
├── SKILL.md
├── references/
│   └── style-guide.md
└── assets/
    └── post-template.md

The complete SKILL.md:

---
name: blog-content-writer
description: Drafts blog posts following the company's established style
  guide. Use when the user asks to "write a blog post", "draft content
  for the blog", "create a post", or any request to produce long-form
  content for publication. Applies consistent voice, tone, header
  structure, and formatting automatically.
license: MIT
metadata:
  author: Content Team
  version: 1.1.0
---

# Blog Content Writer

This skill drafts blog posts that follow the company style guide without requiring
the writer to paste guidelines into each session.

## Instructions

### Step 1: Load the Style Guide
Before drafting anything, read `references/style-guide.md` to load
the current voice, tone, formatting, and structural requirements.

### Step 2: Clarify the Brief
If the request does not include all of the following, ask for them
before starting — all in one message, not one at a time:
- Topic
- Target audience
- Word count target
- Primary goal

### Step 3: Draft the Post
Apply the guidelines from `references/style-guide.md`:
- Intro formula: hook → context → promise
- Header hierarchy: H2 for main sections, H3 for subsections only
- Voice: direct, active, no jargon unless audience is technical

### Step 4: Quality Checklist
Verify each item before delivering:
□ Intro follows hook → context → promise formula
□ Every H2 is a specific claim or question
□ All paragraphs are 2-4 sentences
□ Passive voice is absent or near-absent
□ Conclusion is actionable
□ Word count is within 10% of target

The references/style-guide.md file contains the detailed style rules: voice and tone by audience type, header conventions, sentence and paragraph length guidelines, and formatting standards. Claude reads this file only when the skill body instructs it to, which keeps the main context lean while making the detailed guidelines available when needed.
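
Two items in the Step 4 checklist are mechanical enough to verify outside Claude when reviewing drafts: the word count tolerance and the paragraph length rule. The sketch below is one possible way to do that. Both function names are hypothetical, and the sentence splitter is a rough approximation that will miscount abbreviations such as "e.g.".

```python
import re


def within_word_target(text: str, target: int, tolerance: float = 0.10) -> bool:
    """True when the word count is within +/- tolerance of the target."""
    words = len(text.split())
    return abs(words - target) <= tolerance * target


def paragraphs_outside_range(text: str, low: int = 2, high: int = 4) -> list[int]:
    """Return 1-based indexes of body paragraphs with fewer than low or more than high sentences."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    body = [p for p in paragraphs if not p.startswith("#")]  # skip markdown headers
    flagged = []
    for index, paragraph in enumerate(body, start=1):
        # Rough split: a sentence ends at ., !, or ? followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        if not low <= len(sentences) <= high:
            flagged.append(index)
    return flagged


draft = "## Title\n\nOne. Two. Three.\n\nOnly one sentence here.\n"
print(paragraphs_outside_range(draft))             # [2]
print(within_word_target("word " * 950, 1000))     # True  (950 is within 10% of 1000)
print(within_word_target("word " * 1200, 1000))    # False (1200 is 20% over)
```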

Testing and Distribution

Anthropic’s guide recommends three testing approaches scaled to the skill’s visibility: manual testing in Claude.ai for fast iteration, scripted testing in Claude Code for repeatable validation, and programmatic testing via the Skills API for systematic evaluation.

The single most useful testing tip from the official guide: iterate on a single challenging task until Claude succeeds, then extract the winning approach into the skill. Do not start with broad coverage. Get one hard case working perfectly, then expand.

Three critical areas to test:

  • Triggering tests: Does the skill load when it should? Does it stay quiet when it should not? Build a test matrix with should-trigger queries (“Write a blog post about our product launch”) and should-not-trigger queries (“Summarize this article for me”). Run 10-20 queries and aim for 90%+ automatic triggering on relevant requests.
  • Output quality tests: Run the same request three to five times and compare outputs for structural consistency. Test edge cases: topics with no clear conclusion, conflicting instructions, impractical word counts.
  • Regression tests: The most common regression is a description edit that narrows triggers too aggressively and breaks previously working queries. After any frontmatter change, run the full trigger suite.
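
The trigger matrix described above reduces to two numbers: how often the skill loads on relevant queries, and how often it loads when it should stay quiet. A minimal sketch for tallying them follows. `TriggerCase` and `trigger_report` are hypothetical names, and the `did_trigger` values in the example are placeholder data to be replaced with what you actually observe, not real results.

```python
from dataclasses import dataclass


@dataclass
class TriggerCase:
    query: str
    should_trigger: bool
    did_trigger: bool  # recorded by hand or by your own test harness


def trigger_report(cases, target=0.90):
    """Summarize a trigger test matrix against the 90% target."""
    relevant = [c for c in cases if c.should_trigger]
    irrelevant = [c for c in cases if not c.should_trigger]
    trigger_rate = (
        sum(c.did_trigger for c in relevant) / len(relevant) if relevant else 0.0
    )
    false_trigger_rate = (
        sum(c.did_trigger for c in irrelevant) / len(irrelevant) if irrelevant else 0.0
    )
    return {
        "trigger_rate": trigger_rate,
        "false_trigger_rate": false_trigger_rate,
        "meets_target": trigger_rate >= target,
    }


# Placeholder observations for illustration only.
cases = [
    TriggerCase("Write a blog post about our product launch", True, True),
    TriggerCase("Draft content for the blog on onboarding", True, True),
    TriggerCase("Create a post announcing the new pricing", True, False),
    TriggerCase("Summarize this article for me", False, False),
]
report = trigger_report(cases)
print(report["meets_target"])  # False: 2 of 3 relevant queries triggered
```

Keeping the case list in version control makes the regression check a matter of rerunning the same queries after each frontmatter edit and comparing reports.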

For distribution, individual users can zip the skill folder and upload via Settings > Capabilities > Skills in Claude.ai, or copy it to the appropriate Claude Code skills directory:

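# Claude Code global installation
# (no trailing slash on the source, so the folder itself is copied
#  on both GNU and BSD/macOS cp rather than only its contents)
mkdir -p ~/.claude/skills
cp -r blog-content-writer ~/.claude/skills/

# Claude Code local (project-level) installation
mkdir -p ./.claude/skills
cp -r blog-content-writer ./.claude/skills/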
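
For the Claude.ai upload path mentioned above, the folder has to be zipped first. The sketch below uses the standard library's `shutil.make_archive`. `zip_skill` is a hypothetical helper, and placing the skill folder at the top level of the archive is an assumption about what the upload expects, so verify it against the current upload instructions.

```python
import shutil
from pathlib import Path


def zip_skill(folder: Path, out_dir: Path) -> Path:
    """Create <out_dir>/<folder name>.zip with the skill folder as its top-level entry."""
    folder = folder.resolve()
    out_dir = out_dir.resolve()
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = shutil.make_archive(
        str(out_dir / folder.name),  # archive path without the .zip suffix
        "zip",
        root_dir=folder.parent,      # archive paths are relative to this directory
        base_dir=folder.name,        # include only the skill folder itself
    )
    return Path(archive)
```

Calling `zip_skill(Path("blog-content-writer"), Path("dist"))` would write `dist/blog-content-writer.zip`, with `blog-content-writer/SKILL.md` inside it.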

Organization-level distribution is handled by admins who can deploy skills workspace-wide with automatic updates and centralized management. GitHub distribution is the standard approach for community sharing. Anthropic also published Agent Skills as an open standard at agentskills.io — the same SKILL.md format is designed to work across Claude and other AI platforms that adopt it.

Why Skills Matter Now

Skills represent a fundamental shift in how developers interact with AI assistants. Instead of treating every conversation as stateless, skills give Claude persistent, structured expertise that activates contextually. This matters for three reasons.

First, productivity. Teams using well-crafted skills report dramatic reductions in prompt engineering overhead. The domain knowledge, style guides, workflow steps, and quality standards that previously required careful re-stating in every session now load automatically. A developer working with a code review skill does not need to explain the team’s review criteria every time they open a new conversation.

Second, consistency. Without skills, Claude’s output varies based on how the prompt is phrased. With skills, the same structured instructions apply every time, producing structurally consistent outputs across sessions and team members. This is critical for organizations that need reliable, repeatable AI-assisted workflows.

Third, portability and composability. Skills work across Claude.ai, Claude Code, and the API without modification. They compose with each other: a team can run a code review skill alongside a testing skill and a documentation skill simultaneously. The progressive disclosure architecture means adding more skills does not bloat the context window, an advantage in multi-agent setups, where reliability tends to degrade as scale grows.

The barrier to entry is intentionally low. Anthropic’s official guide promises a working skill in 15-30 minutes using the skill-creator meta-skill from the official repository. The entire system is built on markdown and folders: no compilation, no deployment pipelines, no vendor lock-in. For developers already building on Claude’s API or using Claude Code, skills are the mechanism for turning domain expertise into persistent AI capability, and often a more cost-effective one than open-ended agentic workflows, which can exceed their budget by as much as 5x.

The open standard at agentskills.io signals that Anthropic intends skills to be broader than a single-platform feature. As other AI platforms adopt the SKILL.md format, the skills you build today could become portable across assistants — making the investment in skill development even more valuable over time.

Common Pitfalls and Fixes

A quick reference for the most frequent issues developers encounter:

  • Skill never triggers: The description is too vague or missing trigger phrases. Rewrite the description with specific user-facing language and concrete trigger phrases.
  • Skill triggers constantly: The description is too broad. Add explicit “Do NOT use when” conditions to narrow activation.
  • Instructions ignored: Directives are vague or conflicting. Make instructions specific: exact commands, expected outputs, unambiguous criteria.
  • MCP calls fail: The server is not running or authentication has expired. Add reconnection steps to the troubleshooting section.
  • Works in Claude.ai but fails in Claude Code: Missing environment dependencies. Document all requirements in the compatibility frontmatter field.
  • Inconsistent output across sessions: Instructions are too flexible. Add a quality checklist and require self-verification before delivering output.

The root cause behind most of these issues is insufficient specificity in the description or instructions. Skills that work reliably are skills that tell Claude exactly when to activate, exactly what steps to follow, and exactly what success looks like. The skill architecture handles the loading logic. The developer’s job is to define the domain expertise precisely enough that Claude can execute it consistently.
