MCP's Token Inefficiency Problem and the Agent Skills Solution

Introduction

The Model Context Protocol (MCP) has become the standard for connecting AI agents to tools and data, but Anthropic’s recent analysis reveals a critical flaw: severe token inefficiency and context window bloat. In a detailed blog post, Anthropic acknowledges the problem they helped create and proposes agent skills as a solution. This article breaks down the MCP token consumption issue and explores how agent skills and code execution offer a more efficient path forward for AI agents.

The Token Inefficiency Problem

MCP servers consume enormous amounts of context window space with tool definitions and parameters. Every tool description, including schemas and capabilities, gets loaded upfront, regardless of whether the agent will use them.

“Every single tool definition like the when and the Y for the tool and the parameters and also every single time we leverage the tool both are going to fill our context window.”

The problem scales dramatically with more capabilities. A typical setup with five MCP servers for AI coding can consume 97,000 tokens just for tool descriptions - representing 48% of Claude Sonnet 4.5’s available context.

This inefficiency creates several issues:

Context bloat: Agents become overwhelmed with irrelevant tool information
Reduced performance: Modern LLMs struggle with excessive upfront context
Scalability limits: Adding more capabilities becomes impractical

Real-Time Discovery and Code Execution

Anthropic’s proposed solution, detailed in their “Code Execution with MCP” blog post, shifts from loading all tools upfront to real-time capability discovery. Instead of rigid MCP server definitions, agents generate and execute code on-demand to interact with APIs.

“What if we just allowed the agent to write the code directly to interact with the Archon API endpoints instead of having the MCP server as the middleman?”

The Anthropic post demonstrates how MCP can be extended to support code execution in a secure sandbox environment. This allows agents to:

Generate custom API interaction code dynamically
Execute code to perform complex operations
Create flexible workflows that adapt to specific tasks

This approach offers several advantages:

Token efficiency: Only load relevant code and instructions when needed
Flexibility: Agents can create custom interactions rather than using predefined tools
Reusability: Generated code can be saved and reused across sessions
Security: Code execution happens in controlled environments

Agent Skills: The Practical Implementation

Anthropic’s agent skills, as detailed in their Claude Skills blog post, embody this solution by providing a user-friendly framework for efficient capability management. Skills consist of minimal upfront descriptions (just a few sentences) that expand into full instruction sets and executable code when activated.

“The description for this skill, this little bit of text right here is the only thing that is given to our AI agent at the start of a conversation. So instead of thousands of tokens defining tools and MCP instructions, we just have a couple of sentences here.”

The Claude Skills system allows users to create reusable capabilities that Claude can leverage across conversations. Each skill includes:

A brief natural language description
Associated code or instructions that execute when the skill is invoked
Optional parameters for customization

Token usage drops to 2-3% compared to traditional MCP implementations. Multiple skills can be loaded simultaneously without overwhelming the context window, enabling agents with dozens of capabilities while maintaining efficiency.

MCP vs. Agent Skills: Control vs. Flexibility

While agent skills offer superior efficiency, MCP maintains important advantages:

MCP Strengths:

Predictable tool usage
Controlled credential management
Reduced risk of missed capabilities
Clear boundaries for agent interactions

Agent Skills Advantages:

Massive token savings
Flexible, on-demand code generation
Autonomous workflow creation
Scalable capability integration

“There is really a big trade-off here between control and flexibility.”

Is This the End of MCP?

Despite the compelling efficiency gains, MCP isn’t disappearing. The protocol’s predictability and security make it valuable for controlled environments. However, as LLMs grow more capable and trustworthy with code execution, flexibility will increasingly win out.

“I think flexibility is going to win over control more and more over time because LLMs keep getting more powerful.”

Agent skills represent the evolution of MCP - maintaining the protocol’s benefits while solving its fundamental inefficiencies.

Conclusion

Anthropic’s analysis highlights a critical challenge in AI agent development: balancing capability with context efficiency. Agent skills offer a promising solution that could revolutionize how we build autonomous AI systems. As the field evolves, expect to see more emphasis on intelligent, on-demand capability loading rather than upfront context bloat.

Whether you’re building AI agents or working with MCP servers, understanding this trade-off will be crucial for creating efficient, scalable AI systems.

This article was written by Cline, based on content from: https://www.youtube.com/watch?v=1_z3h2r93OY, https://www.anthropic.com/engineering/code-execution-with-mcp, and https://www.claude.com/blog/skills