Jun 18, 2026 · 6:30 PM · 100 Queens Quay East, Toronto

Input Sanitization for Agentic Systems: What Actually Works

KC Udonsi on the production sanitization layer that sits between untrusted input and the model.

Capacity

Venue

100 Queens Quay East, Toronto

RSVP

7 / 65 confirmed

Speaker

KC Udonsi

DC416 co-organizer. AI defense lead. Works at Stan (DC416's recurring venue sponsor). Co-presenter on May 21, covering the defensive half of AI in security.

About

Agentic systems amplify every classic LLM safety problem. A prompt injection is no longer just a jailbreak: it can become remote code execution by way of your assistant, or a leak of intellectual property or data. A PII leak is no longer a compliance footnote: it can end up as training data for a vendor's next model. And as agents start reading tool outputs, retrieved documents, and other agents' messages, the trusted-input boundary disappears entirely.

This talk walks through the design of a production sanitization layer that sits between untrusted input and the model, regardless of whether that input comes from a user, a tool, or another agent.

What we cover

Why generic guardrails fail. Regex gets bypassed in seconds. Bracketed PII redaction like [NAME_1] actively provokes hallucinations. Single-classifier approaches miss paraphrased attacks.
A layered detection model: heuristics, fine-tuned classifiers, semantic drift, and LLM-as-judge. When each pays for itself and when it doesn't.
Context-preserving pseudonymization: replacing PII with structurally valid fakes (realistic names, reserved IPs, 555 phone numbers) instead of placeholders, and why this keeps downstream reasoning intact.
Integration trade-offs: transparent proxy vs SDK hook vs sidecar gRPC. Latency budgets, blast radius, and the operational cost of each.
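
To make the pseudonymization item above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the speaker's implementation: the Pseudonymizer class and its method names are hypothetical, and the two regexes are a deliberately naive stand-in for the layered detection the talk describes. Fake IPs come from the RFC 5737 documentation range 192.0.2.0/24, and fake phone numbers use the 555-0100 to 555-0199 block reserved for fictional use.

```python
import ipaddress
import re

# RFC 5737 documentation range: valid-looking addresses that are never routed.
DOC_NET = ipaddress.ip_network("192.0.2.0/24")

# Naive detectors, used here only as a stand-in for a real detection layer.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PHONE_RE = re.compile(r"\b(\d{3})[-. ](\d{3})[-. ](\d{4})\b")


class Pseudonymizer:
    """Hypothetical helper: swaps detected values for stable, valid-looking fakes."""

    def __init__(self):
        self.forward = {}  # real value -> fake value
        self.reverse = {}  # fake value -> real value, used to restore output
        self.ip_count = 0
        self.phone_count = 0

    def _remember(self, real, fake):
        self.forward[real] = fake
        self.reverse[fake] = real
        return fake

    def _swap_ip(self, match):
        real = match.group(0)
        if real in self.forward:  # same input always maps to the same fake
            return self.forward[real]
        if self.ip_count >= 254:
            raise ValueError("documentation range exhausted")
        self.ip_count += 1
        return self._remember(real, str(DOC_NET.network_address + self.ip_count))

    def _swap_phone(self, match):
        real = match.group(0)
        if real in self.forward:
            return self.forward[real]
        if self.phone_count >= 100:
            raise ValueError("fictional 555-01xx block exhausted")
        # Design choice for this sketch: keep the area code, replace the rest.
        fake = f"{match.group(1)}-555-{100 + self.phone_count:04d}"
        self.phone_count += 1
        return self._remember(real, fake)

    def sanitize(self, text):
        """Replace detected values before the text reaches the model."""
        text = IPV4_RE.sub(self._swap_ip, text)
        return PHONE_RE.sub(self._swap_phone, text)

    def restore(self, text):
        """Map fakes in the model's output back to the original values."""
        # Longest fakes first, so 192.0.2.1 never clobbers part of 192.0.2.10.
        for fake in sorted(self.reverse, key=len, reverse=True):
            text = text.replace(fake, self.reverse[fake])
        return text


p = Pseudonymizer()
original = "Host 10.1.2.3 is down, call 416-867-5309. Again: 10.1.2.3"
sanitized = p.sanitize(original)
print(sanitized)
print(p.restore(sanitized) == original)
```

Because each fake is still a well-formed IP address or phone number, the model can keep reasoning about "the host" or "the number" as usual, and the stored mapping lets the caller restore the real values in the response.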

You will leave with a concrete reference architecture, the failure modes we hit in production, and the numbers behind why some "obvious" defenses make things worse.