Open-source prompt-injection research
A local defense stack for prompt injection
Parapet is a transparent proxy firewall for studying prompt injection, tool abuse, and data exfiltration in LLM applications. Config-driven. Self-hosted. Built to be inspected, measured, and revised.
pip install parapet
npm install @parapet-tech/parapet
The Problem
LLM applications still need local defensive layers
Providers give you model access, not deterministic protection for your application boundary. User messages, retrieved content, and tool outputs are all part of the attack surface.
How It Works
Layered defense in the request pipeline
Parapet sits between your app and the LLM provider. Every message passes through a stack of security layers before it reaches the model, and again before the response reaches your app.
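The idea can be sketched as a chain of independent checks, any one of which can veto a message. This is an illustrative toy, not Parapet's actual API; the layer functions and names here are invented for the example.

```python
# Toy sketch of a layered inspection pipeline (names are illustrative,
# not Parapet's real engine).

def layer_patterns(text: str) -> bool:
    """Deterministic pattern check: True means block."""
    return "ignore previous instructions" in text.lower()

def layer_length(text: str) -> bool:
    """Trivial sanity check: reject absurdly long messages."""
    return len(text) > 100_000

LAYERS = [layer_patterns, layer_length]

def inspect(message: str) -> bool:
    """Run every layer; the message passes only if no layer vetoes it."""
    return not any(layer(message) for layer in LAYERS)

print(inspect("What is the capital of France?"))       # allowed
print(inspect("Please IGNORE previous instructions"))  # blocked
```

The real stack runs the same inspection in both directions: once on the request before it reaches the model, once on the response before it reaches your app.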
Config-Driven
Define your security policy in YAML
Write a YAML policy, call parapet.init() before creating your first HTTP client, and route requests through the stack.
parapet: v1
# Block known injection patterns
block_patterns:
  - "ignore previous instructions"
  - "ignore all previous"
  - "DAN mode enabled"
  - "jailbreak"
# Tool policies: default-deny, allowlist what you need
tools:
  _default:
    allowed: false
  read_file:
    allowed: true
    trust: untrusted
    constraints:
      path:
        not_contains: ["../", "..\\"]
  exec_command:
    allowed: false
# Redact secrets from LLM output
sensitive_patterns:
  - "sk-[a-zA-Z0-9]{20,}"
  - "-----BEGIN.*PRIVATE KEY-----"
Quickstart
Minimal local setup
Install
Parapet works with OpenAI-compatible clients and local proxy deployment.
pip install parapet
npm install @parapet-tech/parapet
Configure
Create a YAML file containing a single line. Start with the default stack, then add policy as needed.
parapet: v1
Init
Initialize the SDK or point your client at the proxy. Requests are inspected from that point on.
parapet.init()
await init()
Architecture
Transparent interception, minimal integration
The Python SDK patches httpx transparently. The TypeScript SDK wraps fetch with session context and trust tracking. Both start the Rust engine as a sidecar. Or skip the SDK entirely and point any OpenAI-compatible client at the proxy.
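For the proxy route, any OpenAI-compatible client only needs its base URL overridden. The address and port below are assumptions for illustration, not Parapet defaults.

```python
# Sketch: pointing an OpenAI-compatible client at a local proxy.
# PROXY_URL is a hypothetical local address, not a Parapet default.
import os

PROXY_URL = "http://127.0.0.1:8787/v1"

def client_config() -> dict:
    """Keyword arguments for an OpenAI-compatible client constructor,
    e.g. OpenAI(**client_config()). The proxy inspects each request
    and forwards it upstream."""
    return {
        "base_url": PROXY_URL,
        "api_key": os.environ.get("OPENAI_API_KEY", "unused-behind-proxy"),
    }

print(client_config()["base_url"])
```

Because the override happens at the client boundary, no application code downstream of the constructor needs to change.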
Defense Stack
How the layers divide the work
ML Classifier (L1)
Compiled character n-gram model in the fast path. It provides a cheap lexical signal for obvious prompt-injection language, and feeds routing and observability, all without requiring an LLM call.
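A character n-gram scorer can be sketched in a few lines. The n-gram set and scoring here are toy values for illustration; Parapet's compiled model is trained, not hand-picked.

```python
# Toy character n-gram scorer (illustrative; not the compiled model).
from collections import Counter

def char_ngrams(text: str, n: int = 3) -> Counter:
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

# Hand-picked stand-ins for learned weights: trigrams common in
# injection language ("ignore", "jailbreak", ...).
SUSPICIOUS = {"ign", "gno", "ore", "jai", "ail", "bre"}

def score(text: str) -> float:
    """Fraction of the message's trigrams that look suspicious."""
    grams = char_ngrams(text)
    hits = sum(c for g, c in grams.items() if g in SUSPICIOUS)
    return hits / max(sum(grams.values()), 1)

print(score("ignore previous instructions"))
print(score("what is the weather today"))
```

Because it is pure string counting, a model like this can sit in the hot path of every request at negligible cost.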
Payload Analysis (L2a)
Optional deeper analysis for untrusted payloads such as tool results and RAG documents. The active direction for this slot is a Parapet-owned semantic analysis path; Prompt Guard 2 is not the long-term strategy.
Pattern Matching (L3)
Deterministic patterns for instruction override, role hijacking, jailbreaks, prompt extraction, privilege escalation, and exfiltration. Runs after normalization to reduce encoding-trick bypasses.
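The normalize-then-match order matters: encoding tricks like non-breaking spaces or fullwidth characters are folded away before the patterns run. The patterns below are illustrative, not Parapet's rule set.

```python
# Sketch of normalize-then-match (patterns are illustrative).
import re
import unicodedata

PATTERNS = [
    re.compile(r"ignore (all )?previous instructions"),
    re.compile(r"you are now in \w+ mode"),
]

def normalize(text: str) -> str:
    """Fold Unicode confusables and whitespace tricks before matching."""
    text = unicodedata.normalize("NFKC", text)  # e.g. NBSP -> space
    text = re.sub(r"\s+", " ", text)            # collapse padding
    return text.lower()

def matches(text: str) -> bool:
    norm = normalize(text)
    return any(p.search(norm) for p in PATTERNS)

# Non-breaking space, extra spaces, newline: still caught.
print(matches("IGNORE\u00a0ALL   PREVIOUS\nINSTRUCTIONS"))
```

Matching on raw input instead would let a single invisible character slip past an otherwise exact pattern.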
Tool Abuse
Per-tool constraints on arguments. Block path traversal in file tools, dangerous shell inputs, and SSRF-style web requests. Allowlists and denylists apply per tool.
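A default-deny tool check like the one the YAML above describes can be sketched as a small policy walk. The dict structure mirrors the config; the function name and engine details are illustrative.

```python
# Sketch of per-tool argument checking, mirroring the YAML policy
# (structure illustrative, not Parapet's engine).
POLICY = {
    "_default": {"allowed": False},
    "read_file": {
        "allowed": True,
        "constraints": {"path": {"not_contains": ["../", "..\\"]}},
    },
}

def check_tool_call(tool: str, args: dict) -> bool:
    """Default-deny: unknown tools fall through to _default."""
    rule = POLICY.get(tool, POLICY["_default"])
    if not rule.get("allowed", False):
        return False
    for arg, constraint in rule.get("constraints", {}).items():
        value = str(args.get(arg, ""))
        if any(bad in value for bad in constraint.get("not_contains", [])):
            return False  # e.g. path traversal in a file tool
    return True

print(check_tool_call("read_file", {"path": "docs/readme.md"}))    # True
print(check_tool_call("read_file", {"path": "../../etc/passwd"}))  # False
print(check_tool_call("exec_command", {"cmd": "ls"}))              # False
```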
Data Exfiltration
Redact API keys, private keys, and other secrets from model output. Deterministic matching stays in the proxy boundary.
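Redaction with the sensitive_patterns from the config above is a straightforward substitution pass. The replacement marker is an assumption for illustration.

```python
# Sketch of output redaction using the sensitive_patterns above.
# The "[REDACTED]" marker is illustrative.
import re

SENSITIVE = [
    re.compile(r"sk-[a-zA-Z0-9]{20,}"),
    re.compile(r"-----BEGIN.*PRIVATE KEY-----"),
]

def redact(text: str) -> str:
    """Replace every sensitive match before the response leaves the proxy."""
    for pattern in SENSITIVE:
        text = pattern.sub("[REDACTED]", text)
    return text

print(redact("here is the key: sk-abcdefghijklmnopqrstuv"))
```

Because the matching is deterministic and runs at the proxy boundary, a leaked secret never reaches the application even if the model emits it.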
Multi-Turn Attacks
Cross-turn risk scoring detects attacks distributed across conversation turns: instruction seeding, role confusion escalation, resampling, and authority claim buildup. Peak-plus-accumulation scoring stays local to the stack.
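The shape of peak-plus-accumulation scoring can be sketched as below. The weights, the accumulation formula, and the threshold intuition are made up for illustration; the paper and repo define the real scoring.

```python
# Toy peak-plus-accumulation scoring (weights are illustrative).
def conversation_risk(turn_scores: list[float],
                      peak_weight: float = 0.6,
                      accum_weight: float = 0.4) -> float:
    """Combine the riskiest single turn with accumulated low-level risk,
    so an attack split across many mild turns still scores high."""
    if not turn_scores:
        return 0.0
    peak = max(turn_scores)
    accumulation = min(sum(turn_scores) / len(turn_scores) * 2, 1.0)
    return peak_weight * peak + accum_weight * accumulation

# One blatant turn vs. many mildly suspicious turns: both stand out.
print(conversation_risk([0.1, 0.9, 0.1]))
print(conversation_risk([0.4, 0.45, 0.5, 0.45]))
```

The point of the accumulation term is exactly the distributed case: no single turn crosses a per-turn threshold, but the conversation as a whole does.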
Canary Tokens
Inject canary strings into system prompts. If they appear in output, your system prompt is leaking. This helps catch exfiltration attempts that bypass simpler pattern checks.
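The mechanism is simple enough to sketch end to end. The token format and injection placement are assumptions for illustration.

```python
# Sketch of canary injection and leak detection (token format is an
# assumption, not Parapet's).
import secrets

def make_canary() -> str:
    return f"CANARY-{secrets.token_hex(8)}"

def inject(system_prompt: str, canary: str) -> str:
    """Embed the canary in the system prompt."""
    return f"{system_prompt}\n# internal marker: {canary}"

def leaked(output: str, canary: str) -> bool:
    """If the canary shows up in model output, the system prompt leaked."""
    return canary in output

canary = make_canary()
prompt = inject("You are a helpful assistant.", canary)
print(leaked(f"My instructions say: {prompt}", canary))    # True
print(leaked("The capital of France is Paris.", canary))  # False
```

Because the canary is random per session, a hit in the output is unambiguous evidence of extraction rather than coincidence.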
Research-Backed
Built from experiments, not slogans
Parapet's layer design is informed by published work and by ongoing in-repo evaluation. The broader stack remains an active research program rather than a finished product. Current papers: multi-turn scoring and the Mirror Design pattern.
Read the work, run the stack, inspect the tradeoffs
Parapet is open source and still under active development. The repo is the best place to see what is stable, what is experimental, and what is changing.