Exposes authorize, confirm, void, and fail operations so Claude can ask permission before spending money or consuming resources. Works like a two-phase commit: the agent requests capacity, FiGuard reserves it, the agent executes the real action (Stripe charge, OpenAI call, whatever), then reports back what actually moved. Every decision lands in an append-only ledger. You set a budget in dollars or tokens, hand the session token to your agent, and watch the spend tree fill up in real time. Useful when you're running autonomous flows that hit paid APIs and need to prevent runaway costs without killing the process after the damage is done.
| Variable | Secret | Description |
|---|---|---|
| FIGUARD_API_KEY | yes | Your FiGuard API key. Leave unset to use the shared public sandbox; no account is needed to try it. |
| FIGUARD_BASE_URL | no | Your FiGuard server URL. Leave unset to use the shared public sandbox. |
A travel-booking agent hit a Stripe timeout. It retried. Then retried again. The customer's card was charged three times for the same flight before an engineer noticed the anomaly in the logs — 40 minutes later.
No alert fired. No limit existed. The agent had a valid API key and no concept of "I already did this."
FiGuard gives agents a budget. They ask permission before spending. You set the ceiling, the retry rules, and the idempotency policy once. Every spend attempt — authorized or denied — lands in an audit log.
Your framework decides what to do next. FiGuard decides whether the resource-consuming action is allowed.
```text
Your agent code (LangChain · LangGraph · CrewAI · any runtime)
  orchestrates — decides what to do next
        ↓ agent wants to spend / call / execute
figuard.authorize()
  checks: limit · category · velocity · dedup
        ↓ AUTHORIZED — action proceeds
Stripe · OpenAI · any API or service
  executes — real money or resource consumed
        ↓ action completes
figuard.confirm()
  settles reservation — ledger updated
```
LangChain / LangGraph — FiGuard authorizes each tool call before it executes. A budget-exhausted agent stops cleanly instead of running up cost — even across parallel nodes in a LangGraph.
CrewAI — Each crew member gets a delegation token with its own cap. A runaway specialist is stopped at its limit without affecting the rest of the crew.
OpenAI Agents SDK / MCP — Wrap tools with @guarded_function_tool or add the FiGuard MCP server — every tool call is pre-flight authorized before it reaches the API.
Not using a framework? — The raw SDK works anywhere — a Python script, a background job, a serverless function. If it calls an API that costs money or consumes a bounded resource, FiGuard fits.
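The per-member cap described for CrewAI can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the FiGuard SDK: `ParentBudget`, `Delegation`, `try_spend`, and the denial strings are hypothetical names, the $8,000 crew total is invented for the example, and amounts are integer cents.

```python
from dataclasses import dataclass


@dataclass
class ParentBudget:
    """Hypothetical shared budget; amounts are integer cents."""
    limit: int
    used: int = 0


@dataclass
class Delegation:
    """Hypothetical delegation token: a sub-cap carved out of a parent budget."""
    parent: ParentBudget
    cap: int
    used: int = 0

    def try_spend(self, amount: int) -> str:
        # A sub-agent is bounded by its own cap and by what the parent has left.
        if self.used + amount > self.cap:
            return "DENIED_DELEGATION_CAP"
        if self.parent.used + amount > self.parent.limit:
            return "DENIED_PARENT_LIMIT"
        self.used += amount
        self.parent.used += amount
        return "AUTHORIZED"


crew_budget = ParentBudget(limit=800_000)        # assumed $8,000 for the whole crew
refunds = Delegation(crew_budget, cap=300_000)   # sub-agent A: $3k
compute = Delegation(crew_budget, cap=500_000)   # sub-agent B: $5k

results = [
    refunds.try_spend(250_000),   # within A's cap
    refunds.try_spend(100_000),   # would take A to $3.5k, over its $3k cap
    compute.try_spend(400_000),   # B is unaffected by A's denial
]
```

The second call is denied at the delegation cap while the third still succeeds, which is the "stopped at its limit without affecting the rest of the crew" behavior in miniature.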
Try it now — no setup, no signup:
→ Run in Colab
→ Live dashboard
FiGuard is the authorization and ledger layer — not a payment processor, not a policy DSL, not an adversarial-agent firewall. Full scope →
Tested with:
| Framework | Versions | Python |
|---|---|---|
| LangChain | ≥ 0.3.0 | 3.9 – 3.12 |
| LangGraph | ≥ 0.2.0 | 3.10 – 3.12 |
| CrewAI | ≥ 0.102 | 3.10 – 3.12 |
| OpenAI Agents SDK | ≥ 0.0.5 | 3.10 – 3.12 |
| TypeScript SDK | Node ≥ 18 | — |
| MCP server | Claude Code, Cursor, Claude Desktop | — |
```shell
pip install figuard
```

```python
from figuard import FiGuardClient

# Zero-config — connects to the shared public sandbox automatically.
# For production: set FIGUARD_API_KEY + FIGUARD_BASE_URL, or see Self-Hosting below.
client = FiGuardClient()

budget = client.create_budget(
    user_id="agent_001",
    total_limit=500.00,
    currency="USD",
    expires_in="24h",
    authorization_expiry_seconds=300,
    intent_context="travel booking session",
)

auth = client.authorize(
    session_token=budget.primary_token.session_token,
    agent_id="travel_agent",
    action_type="PURCHASE",
    description="JetBlue SFO→JFK roundtrip",
    requested_quantity=270.00,
    idempotency_key="booking-001",
)
print(auth.decision)           # AUTHORIZED
print(auth.approved_quantity)  # 270.0

# Confirm with actual charged amount — may differ from requested (taxes, FX, discounts)
client.confirm_event(auth.event_id, confirmed_quantity=267.00)

# Second spend — exceeds what's left ($500 - $267 = $233 remaining)
auth2 = client.authorize(
    session_token=budget.primary_token.session_token,
    agent_id="travel_agent",
    action_type="PURCHASE",
    description="Marriott Times Square 3 nights",
    requested_quantity=350.00,
    idempotency_key="hotel-001",
)
print(auth2.decision)       # DENIED
print(auth2.denial_reason)  # INSUFFICIENT_FUNDS
```
Every authorization, denial, and confirmation shows up in the spend tree at https://figuard-sandbox-g1ha.onrender.com/ui in real time.
Not sure what limits to set? Add trust_mode="SHADOW" to create_budget — all checks run, nothing is blocked, and auth.would_have_been tells you what would have happened. When the limits look right, switch to enforcement without recreating the budget: client.update_budget(budget.id, trust_mode="FULL_ENFORCEMENT").
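The difference between shadow mode and enforcement can be sketched as one check with two ways of reporting it. This is an illustrative model only: the `Decision` class and `evaluate` function are hypothetical, and it assumes that in shadow mode the returned decision is always permissive while `would_have_been` carries the real verdict. The actual response shape is defined by the SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    decision: str                    # what the caller acts on
    would_have_been: Optional[str]   # assumed to be populated only in shadow mode


def evaluate(requested: int, remaining: int, trust_mode: str) -> Decision:
    """Run the same limit check in both modes; only the enforcement differs."""
    verdict = "AUTHORIZED" if requested <= remaining else "DENIED"
    if trust_mode == "SHADOW":
        # Checks run, nothing is blocked, and the real verdict is reported alongside.
        return Decision(decision="AUTHORIZED", would_have_been=verdict)
    return Decision(decision=verdict, would_have_been=None)


# The hotel request from the quickstart: $350 asked, $233 left.
shadow = evaluate(requested=350, remaining=233, trust_mode="SHADOW")
enforced = evaluate(requested=350, remaining=233, trust_mode="FULL_ENFORCEMENT")
```

Collecting `would_have_been` values over a few real sessions gives a picture of how often a proposed limit would have blocked the agent before any request is actually refused.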
Four operations. Everything else is detail.
| Operation | What it does |
|---|---|
| authorize() | Agent asks permission — capacity reserved, nothing moved yet |
| confirm() | Report what actually moved — releases the reservation |
| void() | Cancel a pending authorization — reservation released |
| fail() | Record a failed action — reservation released |
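The four operations can be modeled in a few lines to make the bookkeeping concrete. This is a minimal in-memory sketch, not FiGuard's implementation: `ToyBudget` and its method names are hypothetical, amounts are integer cents, and it does not guard against confirming more than was reserved.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyBudget:
    """Hypothetical budget with separate reserved and spent totals."""
    limit: int
    reserved: int = 0
    spent: int = 0
    pending: Dict[str, int] = field(default_factory=dict)  # event_id -> reserved qty

    def remaining(self) -> int:
        return self.limit - self.reserved - self.spent

    def authorize(self, event_id: str, quantity: int) -> str:
        # Reserve capacity only; nothing has moved yet.
        if quantity > self.remaining():
            return "DENIED"
        self.pending[event_id] = quantity
        self.reserved += quantity
        return "AUTHORIZED"

    def _release(self, event_id: str) -> None:
        # Raises KeyError if the event is not pending (already settled or unknown).
        self.reserved -= self.pending.pop(event_id)

    def confirm(self, event_id: str, confirmed_quantity: int) -> None:
        # Release the reservation, then record what actually moved.
        self._release(event_id)
        self.spent += confirmed_quantity

    def void(self, event_id: str) -> None:
        self._release(event_id)

    def fail(self, event_id: str) -> None:
        self._release(event_id)


# Replays the quickstart numbers in cents: $500 limit, $270 asked, $267 charged.
budget = ToyBudget(limit=50_000)
first = budget.authorize("flight", 27_000)
budget.confirm("flight", 26_700)
second = budget.authorize("hotel", 35_000)   # only $233 is left
third = budget.authorize("car", 10_000)
budget.void("car")                           # cancelled, so capacity returns
```

Keeping `reserved` and `spent` apart is what lets a confirmation for a different amount, a void, or a failure each settle cleanly against the original hold.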
```text
Developer
──────────▶ create budget ──▶ session token issued to agent
            ($500 USD or 100k tokens or any unit)
                         │
            ┌────────────┴────────────────────────────┐
            │                                         │
       single agent                              fleet agent
            │                              issue delegation tokens
            │                              ├─▶ sub-agent A ($3k refunds)
            │                              └─▶ sub-agent B ($5k compute)
            │                                         │
            └────────────┬────────────────────────────┘
                         │
               ┌─────────┴──────────┐
        monetary budget      resource budget
        currency: "USD"      unit: "tokens"
               └─────────┬──────────┘
                         │
                         ▼
                    authorize()  ← nothing has moved yet
checks: limit · category · expiry · anomaly · dedup
                         │
            ┌────────────┴────────────┐
       AUTHORIZED                  DENIED
     funds reserved             nothing moves
            │              structured denial code
            ▼
 [agent executes action]
payment / API call / compute
            │
   ┌────────┼────────┐
succeeds  fails  cancelled
   │        │        │
confirm() fail()  void()
qty spent released released
   └────────┴────────┘
            │
            ▼
┌───────────────────────────────────┐
│ every decision recorded in the    │
│ append-only ledger — authorized,  │
│ denied, confirmed, failed, voided │
└───────────────────────────────────┘
```
A budget issues session tokens. An agent's authorize call reserves capacity. Execution happens externally — FiGuard never sees the data, never proxies the call. The agent reports back via confirm, void, or fail. Every state transition lands in the append-only ledger. The spend tree shows the full causal chain across an orchestrator and its sub-agents:

The authorize endpoint looks simple — check the balance, write a record. The parts that matter aren't obvious until you've hit them in production:
Concurrent authorization — two agents sharing a budget can both read the same available balance, both see enough funds, and both get approved. By the time the second write lands, you're over limit. The fix is a pessimistic write lock on the budget row during authorization. Easy to know, easy to forget.
Dangling reservations — a network timeout between the authorization write and the HTTP response leaves the agent with no event ID and the budget with a reserved amount it can't release. You need idempotency keyed to the request, not the response, so a retry finds the original authorization instead of creating a second one.
The reservation/confirmation split — if you use a single amountSpent field and deduct at authorization time, two concurrent authorizations both read the same balance before either writes. The correct model is two fields: amountReserved (deducted at authorization) and amountSpent (moved from reserved at confirmation). This is the two-phase reserve-then-capture pattern that payment processors use. It's not novel — it's just usually hidden inside Stripe.
Session token security — you need a token that scopes to exactly one budget, is returned exactly once, and is never stored in plaintext. If you store the raw token and your database is breached, every active agent session is compromised. Hash at write time, never store the raw value.
Append-only ledger — a mutable status field on an authorization record loses history. When you need to reconstruct what happened and why a budget hit its limit — or when a finance team asks why $40K of agent spend happened last Tuesday — you want every state transition as a separate row, not an update to the previous one.
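The concurrency and retry points above can be exercised with a toy model. In this sketch a `threading.Lock` stands in for the pessimistic row lock, and a dictionary keyed by the request's idempotency key stands in for the dedup lookup. `LockedBudget` and its fields are hypothetical names, and a real deployment would take the lock in the database rather than in process memory.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LockedBudget:
    """Toy budget row; quantities are integer units."""
    limit: int
    amount_reserved: int = 0
    amount_spent: int = 0
    _lock: threading.Lock = field(default_factory=threading.Lock)
    _decisions: Dict[str, str] = field(default_factory=dict)

    def authorize(self, idempotency_key: str, quantity: int) -> str:
        # Read-check-write happens under one lock, so two callers can never
        # both act on the same stale available balance.
        with self._lock:
            earlier = self._decisions.get(idempotency_key)
            if earlier is not None:
                return earlier  # a retry finds the original decision
            available = self.limit - self.amount_reserved - self.amount_spent
            if quantity <= available:
                self.amount_reserved += quantity
                decision = "AUTHORIZED"
            else:
                decision = "DENIED"
            self._decisions[idempotency_key] = decision
            return decision


budget = LockedBudget(limit=500)
decisions: Dict[str, str] = {}


def worker(key: str) -> None:
    decisions[key] = budget.authorize(key, 100)


threads = [threading.Thread(target=worker, args=(f"req-{i}",)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

approved = sum(1 for d in decisions.values() if d == "AUTHORIZED")
retry = budget.authorize("req-0", 100)  # same key as before: no second reservation
```

Twenty concurrent requests of 100 against a limit of 500 yield exactly five approvals, and replaying `req-0` returns its earlier decision without changing the reserved total.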
These are the same problems payment infrastructure teams solved 20 years ago. The reserve-then-confirm pattern, idempotency keyed to the request, append-only ledger — none of it is novel. FiGuard is that infrastructure applied to agent systems.
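The session-token rule (return the raw value once, persist only a hash) is small enough to show directly. This sketch uses Python's standard `secrets` and `hashlib` modules. The `issue_session_token` and `resolve_budget` helpers and the dictionary used as a store are hypothetical stand-ins, not FiGuard's code or schema.

```python
import hashlib
import secrets
from typing import Dict, Optional


def _digest(raw_token: str) -> str:
    # The token is high-entropy random data, so a plain SHA-256 digest is
    # enough here; no password-style key stretching is needed.
    return hashlib.sha256(raw_token.encode("utf-8")).hexdigest()


def issue_session_token(store: Dict[str, str], budget_id: str) -> str:
    """Create a token scoped to one budget; only its hash is persisted."""
    raw_token = secrets.token_urlsafe(32)
    store[_digest(raw_token)] = budget_id
    return raw_token  # shown to the caller exactly once


def resolve_budget(store: Dict[str, str], presented_token: str) -> Optional[str]:
    """Look up the budget for a presented token by hashing it first."""
    return store.get(_digest(presented_token))


token_store: Dict[str, str] = {}
raw = issue_session_token(token_store, "budget_123")
found = resolve_budget(token_store, raw)
missing = resolve_budget(token_store, "not-a-real-token")
```

A dump of `token_store` contains only digests, so a leaked copy of the table does not hand out usable session tokens.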
These are failure modes that logging and observability tools can't catch, because they require enforcement at authorization time.
Notebooks live in figuard-notebooks; each runs in Colab with no API keys required.
Source: examples/framework_scenarios/ · examples/rogue_agent_scenarios/
```shell
pip install figuard
python examples/framework_scenarios/langchain_payment_retry.py   # no API keys needed
python examples/framework_scenarios/langgraph_research_loop.py
python examples/framework_scenarios/langgraph_supervisor_fleet.py
python examples/framework_scenarios/crewai_parallel_crew.py
```
Not a payment processor. FiGuard never touches money. It authorizes the intent to spend and records the decision. The actual payment goes through your existing processor as before.
Not a policy language. Budget limits and allocation caps are structured data, not a DSL. FiGuard matches the category an agent declares against the categories you defined — nothing more.
Not a firewall for human users. FiGuard is purpose-built for agent-to-service authorization. The session token model assumes agents are ephemeral and untrusted by default.
Not a replacement for Stripe spending controls. Use both if you want defense in depth. FiGuard blocks at agent decision time; Stripe blocks at payment time. Different layers.
Not a security boundary against adversarial agents. FiGuard enforces what the agent declares. An agent that lies about its category or amount bypasses category enforcement. FiGuard is designed for honest agents with bounded resources — the same threat model as a database connection pool or a rate limiter. It prevents accidental overspend and enforces organizational policies on well-behaved agents. For adversarial agent containment, pair FiGuard with a security layer like Microsoft AGT.
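Category enforcement as plain structured data, and its limit against a dishonest caller, can be illustrated with a dictionary lookup. This is a hypothetical sketch: `check_category`, the category names, the caps, and the denial strings are invented for the example and are not FiGuard's schema.

```python
from typing import Dict

# Allocation caps as structured data (integer cents), not a policy language.
ALLOCATIONS: Dict[str, int] = {"flights": 30_000, "hotels": 20_000}


def check_category(
    allocations: Dict[str, int],
    spent_by_category: Dict[str, int],
    declared_category: str,
    amount: int,
) -> str:
    """Match the declared category against the defined ones, then apply its cap."""
    cap = allocations.get(declared_category)
    if cap is None:
        return "DENIED_UNKNOWN_CATEGORY"
    if spent_by_category.get(declared_category, 0) + amount > cap:
        return "DENIED_CATEGORY_CAP"
    return "AUTHORIZED"


spent = {"flights": 26_700}
over_cap = check_category(ALLOCATIONS, spent, "hotels", 35_000)
unknown = check_category(ALLOCATIONS, spent, "gift_cards", 1_000)
# The check only sees the label: a gift-card purchase declared as "flights"
# passes, which is why this is not a boundary against a lying agent.
mislabelled = check_category(ALLOCATIONS, spent, "flights", 1_000)
```

The last call shows the threat model in practice: the check is only as truthful as the category the agent declares.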
Observability tools record what happened after execution. LLM gateways manage model routing and token spend. FiGuard is the enforcement layer — it authorizes before any action executes, across the full resource spectrum. They complement each other.
FiGuard is a single Docker container alongside your existing infrastructure — same as adding Postgres or Redis. Your spend data never leaves your environment.
```shell
git clone https://github.com/figuard/figuard-core
cd figuard-core
docker compose up -d
# Ready at http://localhost:8080
```

Point your client at it:

```python
client = FiGuardClient(
    api_key="your_api_key",
    base_url="http://localhost:8080",
)
```
Full setup guide, environment variables, Postgres configuration, and production checklist: Self-Hosting.
The headline isn't speed — it's correctness under concurrency. The stress harness
(bench/stress.py) verifies the invariants directly against the
Postgres ledger, not the HTTP responses:
Typical authorize latency (each call on its own budget, M1 / Docker): p50 17ms, p99 74ms. Under deliberate single-budget contention the pessimistic lock serializes requests — they queue rather than race, which is the price of never overspending.
Full methodology, numbers, and reproduction in BENCHMARKS.md — or run
it yourself: make bench.
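Checking invariants against the ledger rather than against responses amounts to replaying the rows and recomputing the totals. The sketch below does that over an in-memory list. The `LedgerRow` shape and the `replay` function are assumptions made for illustration; they are not the schema or logic of bench/stress.py.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class LedgerRow(NamedTuple):
    """Hypothetical append-only row: one state transition per row."""
    event_id: str
    transition: str   # AUTHORIZED, DENIED, CONFIRMED, FAILED, or VOIDED
    quantity: int     # integer cents


def replay(rows: Iterable[LedgerRow]) -> Tuple[int, int, int]:
    """Return (spent, still_reserved, peak_committed) recomputed from the rows."""
    pending: Dict[str, int] = {}
    spent = 0
    peak = 0
    for row in rows:
        if row.transition == "AUTHORIZED":
            pending[row.event_id] = row.quantity
        elif row.transition == "CONFIRMED":
            pending.pop(row.event_id)      # the hold is released...
            spent += row.quantity          # ...and the actual amount is recorded
        elif row.transition in ("FAILED", "VOIDED"):
            pending.pop(row.event_id)
        # DENIED rows are recorded but never hold capacity.
        peak = max(peak, spent + sum(pending.values()))
    return spent, sum(pending.values()), peak


LIMIT = 50_000
ledger = [
    LedgerRow("flight", "AUTHORIZED", 27_000),
    LedgerRow("flight", "CONFIRMED", 26_700),
    LedgerRow("hotel", "DENIED", 35_000),
]
spent, reserved, peak = replay(ledger)
never_overspent = peak <= LIMIT
```

Because the totals are derived from the rows themselves, a harness built this way would notice an overspend even if every HTTP response had looked correct.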
Start here:
Reference:
Interactive API docs: localhost:8080/swagger-ui · sandbox
| SDK | Install |
|---|---|
| Python | pip install figuard |
| TypeScript / Node.js | npm install figuard |
| MCP Server | npx figuard-mcp |
| Java | com.figuard:figuard-sdk:1.0.0 |
REJECT / ALLOW_IF_AVAILABLE / ALLOW_WITH_OVERDRAFT modes
See ROADMAP.md for the full list.
FiGuard follows Semantic Versioning. v1.0.0 is the first stable release — the API and SDK interfaces are stable from this version forward.
Issues, PRs, and integration requests welcome.
Looking for contributors on: Go SDK · LlamaIndex integration · DSPy integration · Helm chart
Apache 2.0 — see LICENSE.