What it is
mcp-xray is a command-line audit tool for Model Context Protocol (MCP) servers. Point it at a live server, or at an offline tools/list dump, and it returns one graded report that answers three questions about your tool surface:
- What does this surface cost? The per-turn context tax, per tool, before any work is done.
- Does the surface confuse the model? Wrong-tool selection and spurious firing on off-domain tasks.
- Can the surface be smaller? Which tools merge, which should be MCP resources, and whether the fix is consolidation or just-in-time loading.
Many sensors, one voice: every probe contributes measurements only; a single grading engine owns all interpretation, so the report reads as one coherent verdict instead of a pile of raw numbers.
How it grades
Five weighted dimensions roll up to a 0-100 score and a letter grade. A probe that cannot run is reported as "not measured" and its weight is dropped from the total; it is never scored as zero.
| Dimension | Weight |
|---|---|
| Context efficiency | 30% |
| Selection robustness | 25% |
| Surface redundancy | 15% |
| Schema hygiene | 15% |
| Description quality | 15% |
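The roll-up can be sketched in a few lines. This is an illustration only, not mcp-xray's actual grading engine: the function name `overall_score` and the dimension keys are hypothetical, and it assumes that "dropping" a weight means renormalizing over the dimensions that were measured. Letter-grade cutoffs are not stated here, so the sketch stops at the numeric score.

```python
from typing import Optional

# Weights from the table above, in percent. The key names are illustrative.
WEIGHTS = {
    "context_efficiency": 30,
    "selection_robustness": 25,
    "surface_redundancy": 15,
    "schema_hygiene": 15,
    "description_quality": 15,
}


def overall_score(scores: dict[str, Optional[float]]) -> Optional[float]:
    """Weighted mean over measured dimensions; None marks "not measured".

    Assumption: an unmeasured dimension is removed from both the numerator
    and the denominator, so it neither helps nor hurts the final score.
    """
    measured = {name: s for name, s in scores.items() if s is not None}
    total_weight = sum(WEIGHTS[name] for name in measured)
    if total_weight == 0:
        return None  # nothing could be measured, so there is no score
    weighted_sum = sum(WEIGHTS[name] * s for name, s in measured.items())
    return weighted_sum / total_weight


# A keyless offline run: the behavioral probe could not run.
offline = {
    "context_efficiency": 100.0,
    "selection_robustness": None,
    "surface_redundancy": 50.0,
    "schema_hygiene": 50.0,
    "description_quality": 50.0,
}
print(overall_score(offline))
```

Under this assumption the offline run is scored out of the remaining 75 points of weight, rather than being penalized as if selection robustness had scored zero.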
What it measures
- Per-tool token tax computed leave-one-out against the Anthropic count_tokens endpoint, so you see what each tool adds to every turn. An offline backend is available but flagged as an estimate, never the headline number.
- Behavioral probe: selection accuracy, a confusability proxy, and distraction, scored against your labeled golden queries.
- Consolidation analysis: merge candidates, MCP resource candidates, and just-in-time loading framing.
- Schema hygiene: hidden injectors and schema smells across the inventory.
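To make the leave-one-out idea concrete, here is a minimal sketch. It is not mcp-xray's code: `leave_one_out_tax` and `estimate_tokens` are hypothetical names, and the default counter is a crude characters-divided-by-four estimate standing in for a real tokenizer. A counter backed by the count_tokens endpoint could be passed in instead.

```python
import json
from typing import Callable

Tool = dict  # one tool entry: name, description, inputSchema


def estimate_tokens(tools: list[Tool]) -> int:
    """Offline stand-in (assumption): roughly four characters per token."""
    return len(json.dumps(tools, sort_keys=True)) // 4


def leave_one_out_tax(
    tools: list[Tool],
    count: Callable[[list[Tool]], int] = estimate_tokens,
) -> dict[str, int]:
    """Tokens each tool adds: cost of the full surface minus cost without it."""
    full = count(tools)
    tax = {}
    for i, tool in enumerate(tools):
        without = tools[:i] + tools[i + 1:]
        tax[tool["name"]] = full - count(without)
    return tax


# Made-up tools, for illustration only.
tools = [
    {"name": "search", "description": "Search the catalog.", "inputSchema": {"type": "object"}},
    {
        "name": "export",
        "description": "Export a result set to CSV, JSON, or Parquet with optional filters.",
        "inputSchema": {"type": "object", "properties": {"format": {"type": "string"}}},
    },
]
for name, cost in leave_one_out_tax(tools).items():
    print(f"{name}: ~{cost} tokens per turn (estimate)")
```

Measuring each tool as a difference against the full surface, rather than counting it in isolation, attributes only the marginal cost of that tool and leaves any fixed overhead out of the per-tool numbers.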
How it runs
The static hygiene and consolidation half runs keyless and offline from a tools/list dump. An Anthropic API key unlocks authoritative token counts and the behavioral probe; stdio, HTTP, and SSE transports enable a full live audit.
# Offline: static hygiene + consolidation, no API key, no live server
mcp-xray analyze --tools-json dump.json
# Authoritative token numbers (match the client's production model)
mcp-xray analyze --tools-json dump.json --token-backend api --model claude-sonnet-4-6
# Live server, full audit including the behavioral probe
ANTHROPIC_API_KEY=... mcp-xray analyze --stdio "my-mcp serve" --llm --model claude-sonnet-4-6
Some servers swap their tool list by journey phase (for example, a "design" phase before a model is loaded and a "run" phase after). mcp-xray audits these phase-swapped surfaces per phase. The headline tax is that of the worst phase, not the union, because the model only ever carries one phase at a time; progressive loading is credited rather than flagged. Every run folder is self-contained, replayable, and fingerprinted for drift.
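The worst-phase rule and the drift fingerprint can both be sketched briefly. This is a hedged illustration rather than the tool's implementation: `headline_tax` and `surface_fingerprint` are hypothetical names, the token figures are made up, and hashing a canonical JSON form with SHA-256 is an assumed way to detect drift, not a documented one.

```python
import hashlib
import json


def headline_tax(phase_tokens: dict[str, int]) -> tuple[str, int]:
    """Return the most expensive phase and its token cost (not the sum)."""
    phase = max(phase_tokens, key=phase_tokens.get)
    return phase, phase_tokens[phase]


def surface_fingerprint(tools: list[dict]) -> str:
    """Hash a canonical JSON form so tool order and key order do not matter."""
    canonical = json.dumps(
        sorted(tools, key=lambda t: t["name"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative per-phase totals; the union would be 4600, but the model
# never carries both phases at once, so the headline is the larger phase.
phases = {"design": 1200, "run": 3400}
print(headline_tax(phases))

before = [{"name": "load", "description": "Load a model."}]
after = [{"name": "load", "description": "Load a model from disk."}]
print(surface_fingerprint(before) == surface_fingerprint(after))
```

Comparing the fingerprint stored with an earlier run against a fresh one is one simple way to tell whether the tool surface changed between audits.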
Why it matters
Every tool you expose to an agent is paid for on every turn, in context tokens, whether or not it is called. A bloated or confusable tool surface quietly raises cost and lowers tool-selection accuracy. mcp-xray makes that cost legible and produces a concrete reduction plan. It is the same discipline RALFORION applies to the agent-facing surface of the OrionBelt Semantic Layer.
Frequently Asked Questions
What is mcp-xray?
A command-line audit tool for Model Context Protocol (MCP) servers. Point it at a live server or an offline tools/list dump and it returns one graded 0-100 report covering token tax, tool-selection confusion, and surface bloat.
How does it measure an MCP server's token cost?
It computes a per-tool context tax leave-one-out against the Anthropic count_tokens endpoint, so you see what each tool adds to every turn. An offline estimate backend exists but is flagged and never the headline number.
Does it need an API key or a running server?
No. Static hygiene and consolidation run keyless and offline from a tools/list dump. An Anthropic API key unlocks authoritative token counts and the behavioral probe; stdio, HTTP, and SSE transports enable a full live audit.
How does it grade a server?
Five weighted dimensions roll up to a 0-100 score and letter grade: context efficiency (30%), selection robustness (25%), surface redundancy (15%), schema hygiene (15%), and description quality (15%).
Who built mcp-xray?
mcp-xray was built by Ralf Becher and RALFORION d.o.o., the team behind the open-source OrionBelt Semantic Layer.