AI gateway & proxy interoperability
If your traffic already flows through an AI gateway or proxy (LiteLLM, Portkey, Kong, Cloudflare, …), you can add Shield guardrails at that layer — no app code changes. The gateway calls Shield’s stateless guardrail HTTP API on each request/response and blocks, sanitizes, or allows based on the verdict.
Table of contents
- 1. How it works
- 2. Maturity tiers
- 3. The building block — Shield’s guardrail HTTP API
- 4. Tool-call DLP (arguments + results)
- 5. LiteLLM (Validated)
- 6. Portkey (webhook pattern)
- 7. Kong AI Gateway (plugin pattern)
- 8. Cloudflare AI Gateway & other proxies
- 9. Any OpenAI-compatible gateway (sidecar pattern)
- 10. What you get
- See also
1. How it works
A gateway sits between your app and the model. Shield plugs into the gateway’s pre-call (input) and post-call (output/tool) hooks:
┌──────────┐   request    ┌─────────────────┐   pre_call    ┌──────────────────────┐
│ Your app │ ───────────▶ │   AI gateway    │ ────────────▶ │ Shield /guardrails/* │
│ / agent  │              │   (LiteLLM,     │ ◀──────────── │ (input safety,       │
└──────────┘ ◀─────────── │  Portkey, Kong) │    verdict    │  output validation,  │
               response   │                 │               │  tool RBAC)          │
                          │       │ allowed                 └──────────────────────┘
                          │       ▼
                          │  ┌──────────┐
                          └─▶│  Model   │ (OpenAI, Anthropic, self-hosted, agent)
                             └──────────┘
Two integration shapes, depending on the gateway:
| Shape | When to use | Gateways |
|---|---|---|
| Native plugin / guardrail hook | The gateway has a guardrail/plugin extension point. | LiteLLM (Python CustomGuardrail), Kong (plugin) |
| Webhook / sidecar | The gateway can call an external HTTP guardrail, or you front it with a thin proxy. | Portkey (webhook), Cloudflare, any OpenAI-compatible gateway |
2. Maturity tiers
Same honest tiering as our IdP interop — we don’t over-claim:
| Gateway | Integration | Status |
|---|---|---|
| LiteLLM | Native CustomGuardrail plugin (votal_guardrail.py) + example config + one-shot script | Validated (shipped in this repo, runnable) |
| Portkey | Custom guardrail webhook → thin Shield adapter | Integration pattern — validate in your env |
| Kong AI Gateway | Pre/post-function or custom plugin → Shield HTTP API | Integration pattern — validate in your env |
| Cloudflare AI Gateway | Front with LiteLLM/sidecar (no general-purpose guard webhook today) | Integration pattern — validate in your env |
| Any OpenAI-compatible gateway | Drop the LiteLLM proxy in front, or use the framework-free sidecar | Integration pattern (LiteLLM path is Validated) |
Validated = code shipped here that we run. Integration pattern = works via the gateway’s documented extension point against Shield’s stable HTTP API; confirm it against your deployment. The Shield API contract (section 3) is the same for all of them.
3. The building block — Shield’s guardrail HTTP API
Every integration below is just a gateway calling these stateless endpoints on the Shield data plane. No SDK required.
| Endpoint | Purpose | Body |
|---|---|---|
| POST {SHIELD_URL}/guardrails/input | Input safety (pre-call) | {"message": "<user text>"} |
| POST {SHIELD_URL}/guardrails/output | Output validation / sanitization, and tool-argument DLP (with context.stage="input") | {"output": "<text>"} or {"output": "...", "context": {"tool_name": "...", "tool_input": {...}, "agent_id": "...", "user_role": "...", "stage": "input"}} |
| POST {SHIELD_URL}/v1/shield/tool/check | Tool RBAC + injection validation (authorization only, no content DLP) | {"agent_key": "...", "tool_name": "...", "user_role": "...", "tool_params": {...}} |
| POST {SHIELD_URL}/v1/shield/tool/output | Tool-result DLP: sanitize/redact/block tool output | {"tool_output": "<result>", "context": {"tool_name": "...", "user_role": "..."}} |
Headers (per request):
| Header | Meaning |
|---|---|
| x-api-key | Tenant API key (which tenant’s policies to apply) |
| Authorization: Bearer <token> | Proxy bearer if the data plane sits behind RunPod (RUNPOD_TOKEN) |
| x-agent-key | (optional) Agent identity for RBAC / tool checks |
| x-user-role | (optional) Caller role for RBAC |
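As a rough illustration of the endpoint and header tables above, the sketch below builds (without sending) an input-safety request using only the Python standard library. The function name `build_input_check`, its parameter names, and the example host are assumptions made for this sketch; only the path, body shape, and header names come from the tables.

```python
import json
import urllib.request


def build_input_check(shield_url, tenant_key, message,
                      bearer=None, agent_key=None, user_role=None):
    """Build (but do not send) a POST to {SHIELD_URL}/guardrails/input.

    Hypothetical helper for illustration; not part of any Shield SDK.
    """
    headers = {"Content-Type": "application/json", "x-api-key": tenant_key}
    if bearer:      # proxy bearer, only when the data plane sits behind a proxy
        headers["Authorization"] = f"Bearer {bearer}"
    if agent_key:   # optional agent identity for RBAC / tool checks
        headers["x-agent-key"] = agent_key
    if user_role:   # optional caller role for RBAC
        headers["x-user-role"] = user_role
    return urllib.request.Request(
        url=shield_url.rstrip("/") + "/guardrails/input",
        data=json.dumps({"message": message}).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder host; substitute your Shield data-plane URL.
req = build_input_check("https://shield.example.invalid", "tenant-key", "hello")
print(req.get_method(), req.full_url)
```

Sending the request would then be a matter of passing `req` to `urllib.request.urlopen` with a timeout and parsing the JSON body of the reply.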
Response (shared shape):
{
  "safe": true,
  "action": "pass",
  "guardrail_results": [
    {"guardrail": "prompt_injection", "passed": true, "message": ""}
  ]
}
When safe is false, the gateway should block (or sanitize, if the response
carries sanitized content) and surface the triggered guardrail_results.
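One way a gateway hook might act on this response shape is sketched below. The field name `sanitized_output` is a placeholder assumption, since the key that carries sanitized content is not specified here; check it against the responses your deployment actually returns. A missing or non-true `safe` is treated as unsafe, which is a fail-closed choice made for this sketch.

```python
def decide(verdict, sanitized_key="sanitized_output"):
    """Map a Shield verdict dict to a (decision, details) pair.

    `sanitized_key` is a HYPOTHETICAL field name used for illustration.
    """
    if verdict.get("safe") is True:
        return "allow", None

    # Collect the guardrails that did not pass so they can be surfaced.
    triggered = [
        r for r in verdict.get("guardrail_results", [])
        if not r.get("passed", False)
    ]
    sanitized = verdict.get(sanitized_key)
    if sanitized is not None:
        return "sanitize", {"content": sanitized, "triggered": triggered}
    return "block", {"triggered": triggered}


decision, details = decide({
    "safe": False,
    "action": "block",
    "guardrail_results": [
        {"guardrail": "prompt_injection", "passed": False, "message": "detected"}
    ],
})
print(decision, [r["guardrail"] for r in details["triggered"]])
```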
4. Tool-call DLP (arguments + results)
For agentic tool calls, Shield enforces two independent things — keep them straight:
- RBAC / authorization — may this role call this tool with these params?
- DLP / data policy — what sensitive data is allowed to pass through the tool’s arguments and results? (PII, secrets/credentials, role-restricted, regulated, and internal-system data — redact, mask, or block.)
They run at different endpoints, so a gateway that only does the RBAC call will not get content DLP. Wire up all three points:
| Stage | Call | Enforces |
|---|---|---|
| Tool arguments (before execution) | POST /guardrails/output with context.stage="input" and the tool context | DLP on args (regex + AI sanitization, redact/mask/block) + RBAC |
| Tool authorization | POST /v1/shield/tool/check | RBAC, data-access scope, injection/payload validation; no content DLP |
| Tool results (after execution) | POST /v1/shield/tool/output | DLP on the result (LLM sanitization → allow / redact / block) |
/v1/shield/tool/check is authorization-only. If your integration calls only that endpoint, tool arguments and results are not DLP-scanned. To get redaction/blocking of sensitive data, also call /guardrails/output (stage="input") on the arguments and /v1/shield/tool/output on the result.
Recommended per-tool-call sequence:
1. /v1/shield/tool/check → RBAC + validation (deny → don't run the tool)
2. /guardrails/output (input) → DLP on arguments (redact/block sensitive args)
3. run the tool
4. /v1/shield/tool/output → DLP on the result (redact/block sensitive output)
The shipped LiteLLM plugin already does the DLP calls:
votal_guardrail.py sends each tool call to /guardrails/output with
stage:"input" and the full tool context, so tool-argument DLP runs
automatically; the RBAC tool/check call is used on the streaming path. If you
build a webhook/plugin/sidecar integration yourself (Portkey, Kong, custom),
replicate steps 1–4 above.
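For a hand-built integration, the four-step sequence could be sketched roughly as follows. The `post(path, body)` and `run_tool(name, params)` callables are hypothetical and supplied by the caller; sending the serialized arguments in `output` and reusing the agent key as `agent_id` are assumptions of this sketch, not documented behavior. For brevity it treats every non-safe verdict as a block and does not apply redacted content.

```python
import json


def guarded_tool_call(post, run_tool, tool_name, tool_params, agent_key, user_role):
    """Run one tool call through RBAC, argument DLP, execution, and result DLP.

    `post(path, body) -> dict` is a caller-supplied (hypothetical) helper that
    sends `body` to Shield and returns the parsed verdict.
    """
    # 1. RBAC + validation: a denial means the tool must not run.
    check = post("/v1/shield/tool/check", {
        "agent_key": agent_key, "tool_name": tool_name,
        "user_role": user_role, "tool_params": tool_params,
    })
    if not check.get("safe"):
        return {"status": "denied", "stage": "check", "verdict": check}

    # 2. DLP on the arguments (stage="input").
    args = post("/guardrails/output", {
        "output": json.dumps(tool_params),  # assumption: args sent as JSON text
        "context": {"tool_name": tool_name, "tool_input": tool_params,
                    "agent_id": agent_key, "user_role": user_role,
                    "stage": "input"},
    })
    if not args.get("safe"):
        return {"status": "blocked", "stage": "arguments", "verdict": args}

    # 3. Run the tool only after both checks have passed.
    result = run_tool(tool_name, tool_params)

    # 4. DLP on the result.
    out = post("/v1/shield/tool/output", {
        "tool_output": result,
        "context": {"tool_name": tool_name, "user_role": user_role},
    })
    if not out.get("safe"):
        return {"status": "blocked", "stage": "result", "verdict": out}
    return {"status": "ok", "result": result}
```

Injecting `post` keeps the sequencing logic independent of the HTTP client, so the same function can be exercised with a stub before it is pointed at a real data plane.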
MCP note. Over MCP, shield_check_tool is RBAC/allowlist only — you must
also call shield_sanitize_output on the result to get DLP. (See
Developer Guide — MCP & APIs.)
Current limitations.
- DLP scans the whole tool_args payload, not per-field; there’s no “redact only the SSN field” today.
- On the streaming path the tool check fires at stream end, after chunks have been emitted; use non-streaming hooks if you need to block before any output.
5. LiteLLM (Validated)
Shield ships a native LiteLLM guardrail: VotalGuardrail(CustomGuardrail) in
votal_guardrail.py.
It implements:
- async_pre_call_hook → POST /guardrails/input
- async_post_call_success_hook → POST /guardrails/output (text + per-tool context)
- async_post_call_streaming_iterator_hook → streaming output + POST /v1/shield/tool/check
Config
config/litellm_guardrails.example.yaml:
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY
guardrails:
  - guardrail_name: votal-input-guard
    litellm_params:
      guardrail: votal_guardrail.VotalGuardrail
      mode: pre_call
      default_on: true
  - guardrail_name: votal-output-guard
    litellm_params:
      guardrail: votal_guardrail.VotalGuardrail
      mode: post_call
      default_on: true
votal_guardrail:
  api_base: "https://<your-shield-data-plane-host>"
  api_token: ""  # blank → reads RUNPOD_TOKEN / SHIELD_API_TOKEN from env
  last_k_messages: 3
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
Notes:
- Register the class twice (once pre_call, once post_call) so both input and output hooks run.
- default_on: true makes the guard transparent: external clients don’t have to know it exists.
- Supported mode values: pre_call, post_call, during_call, logging_only.
Per-tenant routing
LiteLLM intercepts X-API-Key as its own virtual key, so the tenant key is
passed in metadata (forwarded to Shield as x-api-key). Mint a virtual key that
carries the tenant:
curl -X POST "$PROXY/key/generate" \
-H "Authorization: Bearer $LITELLM_MASTER_KEY" \
-H "Content-Type: application/json" \
-d '{"metadata": {"tenant_api_key": "<TENANT_API_KEY>"}}'
Hand the returned sk-... key to the agent. Optionally include agent_key and
user_role in the same metadata to enable tool RBAC.
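To make the metadata flow concrete, here is a small sketch of both ends: building the key-generation body with the optional RBAC fields, and turning that metadata into the headers Shield expects. Both function names are invented for illustration, and the mapping of `agent_key` and `user_role` to `x-agent-key` and `x-user-role` is an assumption based on the header table in section 3; this is not the shipped plugin's code.

```python
import json


def key_generate_body(tenant_api_key, agent_key=None, user_role=None):
    """JSON body for POST $PROXY/key/generate (hypothetical helper)."""
    metadata = {"tenant_api_key": tenant_api_key}
    if agent_key:
        metadata["agent_key"] = agent_key
    if user_role:
        metadata["user_role"] = user_role
    return json.dumps({"metadata": metadata})


def shield_headers(metadata):
    """Map virtual-key metadata to Shield request headers (assumed mapping)."""
    tenant = metadata.get("tenant_api_key")
    if not tenant:
        # Without a tenant key Shield cannot select a policy set.
        raise ValueError("virtual key metadata has no tenant_api_key")
    headers = {"x-api-key": tenant}
    if metadata.get("agent_key"):
        headers["x-agent-key"] = metadata["agent_key"]
    if metadata.get("user_role"):
        headers["x-user-role"] = metadata["user_role"]
    return headers


body = key_generate_body("<TENANT_API_KEY>", agent_key="agent-1", user_role="analyst")
print(body)
print(shield_headers(json.loads(body)["metadata"]))
```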
One-shot setup
scripts/guard_external_agent.sh
starts the proxy, confirms VotalGuardrail initialized, mints a tenant virtual
key, runs a benign/injection gate test, and prints the Base URL + key + model
to paste into a hosted agent:
SHIELD_URL=$RUNPOD_HOST RUNPOD_TOKEN=$RUNPOD_TOKEN TENANT_API_KEY=$TENANT_API_KEY \
OPENAI_API_KEY=... ./scripts/guard_external_agent.sh
See Guard an External Agent for the full walkthrough.
6. Portkey (webhook pattern)
Portkey Guardrails support a custom webhook check: Portkey POSTs the request
(or response) to your endpoint and acts on the returned verdict. Point that
webhook at a thin adapter that translates Portkey’s payload to Shield’s
/guardrails/input (before-request hook) and /guardrails/output
(after-request hook), forwarding the tenant x-api-key and the
Authorization bearer.
Adapter contract (illustrative — match your Portkey webhook version):
Portkey --POST webhook--> Shield adapter --POST /guardrails/input--> Shield
                          (map fields,                               verdict
                           add x-api-key) <--{safe,...}--
        <--{verdict:bool}--
Attach the before-request hook for input safety and the after-request hook for output validation. Mark the webhook deny-on-failure if you want fail-closed behavior.
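The translation step inside such an adapter might look like the sketch below. The Portkey payload fields used here (`eventType`, `request.text`, `response.text`) and the hook names are assumptions about the webhook format and should be checked against your Portkey version; the function names are hypothetical. Anything other than a well-formed, safe Shield verdict maps to a failing verdict, matching the fail-closed option.

```python
def to_shield_call(portkey_payload):
    """Pick the Shield endpoint and body for a Portkey webhook payload.

    Field names on the Portkey side are ASSUMED, not taken from this guide.
    """
    if portkey_payload.get("eventType") == "afterRequestHook":
        text = portkey_payload.get("response", {}).get("text", "")
        return "/guardrails/output", {"output": text}
    # Default to the input check (before-request hook).
    text = portkey_payload.get("request", {}).get("text", "")
    return "/guardrails/input", {"message": text}


def to_portkey_verdict(shield_verdict):
    """Translate a Shield verdict into the {verdict: bool} reply.

    Pass None (or any non-dict) when Shield was unreachable: fails closed.
    """
    if not isinstance(shield_verdict, dict):
        return {"verdict": False}
    return {"verdict": shield_verdict.get("safe") is True}


path, body = to_shield_call(
    {"eventType": "beforeRequestHook", "request": {"text": "hello"}}
)
print(path, body, to_portkey_verdict({"safe": True}))
```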
7. Kong AI Gateway (plugin pattern)
Kong’s AI plugins run on the request/response lifecycle. Call Shield from either a small custom plugin or a pre-function/post-function snippet:
- access phase → POST {SHIELD_URL}/guardrails/input with the prompt; on a non-safe verdict, short-circuit with kong.response.exit(403, …).
- response phase → POST {SHIELD_URL}/guardrails/output with the model output; block or replace with sanitized content.
Inject the tenant x-api-key and proxy Authorization header from Kong
consumer/credentials config so each route maps to the right Shield tenant.
8. Cloudflare AI Gateway & other proxies
Cloudflare AI Gateway focuses on caching, rate-limiting, and observability and does not expose a general-purpose external guardrail webhook for arbitrary request/response mutation today. The reliable pattern is to front it with the LiteLLM proxy (section 5) or the framework-free sidecar (section 9), which then calls Shield. The same applies to any gateway without a guard extension point.
9. Any OpenAI-compatible gateway (sidecar pattern)
If you can’t (or don’t want to) modify the gateway, wrap it with a guard:
- LiteLLM proxy in front: the universal shim. Point LiteLLM at the gateway’s OpenAI-compatible base URL as a model, enable VotalGuardrail (section 5), and send traffic to LiteLLM instead. Works for any OpenAI-compatible upstream.
- Framework-free sidecar: examples/guarded_chat.py does input-guard → call model/agent → output-guard against the same Shield endpoints, with no gateway at all. Good for custom HTTP agents.
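The sidecar flow (input-guard → call model → output-guard) can be illustrated with the minimal sketch below. It is not the contents of examples/guarded_chat.py; the three callables are hypothetical stand-ins for the Shield calls and the model call, and each guard is assumed to return the shared verdict shape from section 3.

```python
def guarded_chat(user_text, check_input, call_model, check_output):
    """Wrap one model call with an input guard and an output guard.

    check_input / check_output return Shield-style verdict dicts;
    call_model returns the model's text. All three are caller-supplied.
    """
    verdict = check_input(user_text)
    if not verdict.get("safe"):
        # Blocked before the model ever sees the prompt.
        return {"blocked": "input",
                "guardrail_results": verdict.get("guardrail_results", [])}

    answer = call_model(user_text)

    verdict = check_output(answer)
    if not verdict.get("safe"):
        # The model ran, but its answer is withheld from the caller.
        return {"blocked": "output",
                "guardrail_results": verdict.get("guardrail_results", [])}
    return {"answer": answer}


# Stub wiring for a dry run; replace the lambdas with real HTTP calls.
print(guarded_chat(
    "hello",
    check_input=lambda text: {"safe": True},
    call_model=lambda text: "hi there",
    check_output=lambda text: {"safe": True},
))
```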
10. What you get
Regardless of gateway, the guard runs at the proxy layer, so:
- No app code changes — protection is configured at the gateway.
- Any model/provider behind the gateway is covered uniformly.
- Per-tenant policies via the tenant x-api-key.
- Tool RBAC + DLP for agentic tool calls: authorization via x-agent-key + x-user-role, and data-policy sanitization on tool arguments and results (see section 4).
- Pairs with Continuous Identity & Auto-Revoke: a guardrail trip on an identified agent can revoke it in real time.
See also
- Guard an External Agent — end-to-end LiteLLM walkthrough.
- Identity Provider Interoperability — OIDC/OAuth interop with your IdP.
- Agentic Integration Guide — agent identity + capability tokens.
- Guardrails Catalog — the checks Shield runs.