This page contains security research shared under responsible disclosure. Enter the access code to continue.
MCP gives agents powerful tool access. But nobody verifies what the agent does with those tools. We found 8 blind spots. We built the fix. We disclosed it responsibly.
Every AI agent tool call flows through a 7-step defense pipeline — from intent classification to cryptographic session proofs. Zero GPU. Zero cloud dependency. Full sovereignty.
The Model Context Protocol (MCP) was created by Anthropic in 2024 to standardize how AI agents interact with external tools — databases, APIs, file systems. It's a brilliant protocol. But like TCP/IP before firewalls, it shipped without a security layer at the boundary.
We know this because we built the AEGIS 42-layer cascade — a CPU-only AI safety stack that has processed 14.30 million prompts at V48. When we integrated MCP into our architecture, we discovered 4 gaps that no vendor — including Anthropic — has addressed. We documented them, built mitigations, filed 24 patent claims (612–635), and submitted a responsible disclosure to Anthropic's security team. This page shows you exactly what we found.
Depth Level 4: Claim → Protocol Gap → CVE Evidence → Patent Filing → Responsible Disclosure
The Model Context Protocol revolutionized how AI agents interact with tools. But it shipped without security at the tool boundary — and real CVEs prove it.
Tool definitions can be silently modified after trust is established. A benign tool today can exfiltrate credentials tomorrow — the "rug pull" attack.
CVE-2025-6514 · CVE-2026-22785Agent sandboxing stops at the agent. MCP servers run OUTSIDE the sandbox with their own permissions. A prompt-injected agent's tool call passes through unchecked.
OWASP LLM Top 10 · #1 Prompt InjectionMCP returns raw data to agents. Malicious strings planted in database records become part of the agent's context — indirect prompt injection through the data layer.
Checkmarx MCP Report 2026Multiple agents sharing MCP servers have zero cross-agent access controls. A compromised agent can inject instructions into a peer agent's session — lateral movement.
CVE-2025-49596 · MCP Inspector RCEStandard cybersecurity vocabulary: this is a Deep Packet Inspection for AI agent tool calls — powered by the AEGIS Intention Gate.
DROP TABLE users
Defense-in-depth for the agent-tool boundary. Each layer works independently.
Together, they create 12-Sigma safety for autonomous AI operations.
Every MCP tool invocation passes through a multi-layer safety cascade that classifies intent, not just schema compliance. 42 layers, CPU-only.
SHA-256 hashed, quantum-timestamped tool definitions. Any silent modification triggers an instant block. Rug-pull attacks become structurally impossible.
MCP responses are stripped of PII, injection patterns, and encoded payloads before entering the agent context. Sanitization data improves the safety model.
Ephemeral per-session tokens with 30-minute TTL, max 100 calls. No cross-agent state sharing. Every invocation logged to a specific identity.
Beyond the original 4 blind spots — these 5 innovations close the remaining gaps in multimodal safety, RAG poisoning defense, and session auditability.
2ms fingerprinting for images, audio, and video using byte entropy, edge density, and LSB steganography detection.
5-orbit dynamic trust (O0→O4) with φ⁻¹ decay, anomaly-based instant demotion, and community vouching.
3-tier context-aware scanning with semantic role classification. Educational content doesn't trigger false positives.
3-layer CPU/NPU analysis stack. Shannon entropy, MFCC, Heim vectors. Zero cloud dependency at any layer.
50+ tool calls → 1 Fibonacci Merkle proof. 25× storage reduction with selective Fibonacci-indexed verification.
Every innovation traces to real production code. These aren't mockups — they're running on our live API right now, processing 1,988 prompts/sec.
// Real production code: backend/src/aegis/aegis.service.ts async scan(prompt: string, context?: ScanContext): Promise<AegisResult> { // 42 detectors execute in cascade — CPU only, no GPU const cascade = await this.cascadeCoordinator.execute(prompt, { mode: 'production', layers: 42, stellschrauben: this.phiHarmonicConfig, }); // POAW receipt generated for every decision const receipt = await this.poawService.seal(cascade); return { decision: cascade.finalDecision, receipt }; }
// Real production code: backend/src/poaw/shared/poaw-core.ts export function generatePoawReceipt(data: PoawInput): PoawReceipt { const hash = createHash('sha256') .update(JSON.stringify(data)) .digest('hex'); const quantumSeed = this.qrngService.getEntropy(32); return { hash, quantumSeed, timestamp: Date.now(), signature: mlDsaSign(hash, this.privateKey), // ML-DSA post-quantum }; }
$ curl https://destill.ai/api/v1/redteam/health { "status": "operational", "aegisVersion": "V48", "cascadeLayers": 42, "avgLatency": "0.50ms", "totalPromptsProcessed": 7930000, "gpuRequired": false }
@destill/aegis is not yet published — it's integrated into our NI-Stack backend today.EU AI Act enforcement begins August 2026. Every feature maps to a specific article.
Risk management via Intention Gate. Immutable logging via POAW. Human oversight for re-registration. Red Team API access for transparency.
Data protection by design: sanitization proxy strips PII before agent context. Zero-credential MCP architecture.
Cybersecurity risk management with measurable metrics (AEGIS_MCP_Score). Multi-layer defense-in-depth.
Operational resilience through fail-closed architecture. Dual AEGIS Gateway with <10s auto-failover.
#1 Prompt Injection: blocked at tool boundary. #2 Insecure Output: sanitized. #5 Supply Chain: sealed manifests.
24 claims covering Intention Gate, Sealed Manifests, Self-Sanitizing Proxy, and Agent Isolation. Parent filing #63/994,444.
Pay per MCP tool invocation. Deploy on your servers or ours. Switch or leave anytime — no lock-in.
The AEGIS MCP Gateway is the missing security layer between your AI agents and their tools.
24 patent-pending innovations. CPU-only. Deploys in one line.