Defense in Depth: Treasury Guardrails for Agents

An attacker has valid API credentials. Not stolen from a phishing page or brute-forced from a weak password. Valid credentials, obtained through a compromised application layer. The agent’s own code is serving requests to someone who shouldn’t be there.

The attacker tries to withdraw the full treasury balance. The request is rejected. They try smaller amounts, rapid-fire, across multiple categories. Rejected. They attempt to modify the ledger, altering a previous transaction to create phantom funds. Rejected. They try concurrent requests, hoping to exploit a race condition where two withdrawals slip through before the balance updates. Rejected.

The treasury holds. Not because of one clever defense, but because every layer independently said no. The question was never whether this agent would be attacked. The question was whether the defenses would work when it happened.

The Append-Only Ledger

Every treasury system starts with a fundamental decision: how do you track money? The traditional approach is a balance field on a row. Agent has 50,000 sats. Debit 1,000. Update the row to 49,000. Simple.

It’s also dangerously mutable. If an attacker can execute an UPDATE statement, they can set the balance to anything. If a bug miscalculates a debit, the evidence is gone the moment the row is overwritten. You can’t audit what you can’t see.

The alternative is an append-only ledger. Every movement of satoshis, in or out, is a new row. The balance is never stored. It’s computed by summing the ledger.

CREATE TYPE ledger_direction AS ENUM ('credit', 'debit');

CREATE TABLE treasury_ledger (
  id            uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  agent_id      uuid NOT NULL REFERENCES agents(id),
  amount_sats   integer NOT NULL CHECK (amount_sats > 0),
  direction     ledger_direction NOT NULL,
  category      text NOT NULL,
  reasoning     text NOT NULL,
  created_at    timestamptz NOT NULL DEFAULT now()
);

-- No UPDATE or DELETE grants on this table. Ever.
REVOKE UPDATE, DELETE ON treasury_ledger FROM authenticated;
REVOKE UPDATE, DELETE ON treasury_ledger FROM service_role;

The CHECK (amount_sats > 0) constraint prevents negative amounts at the database level. You can’t insert a debit of -5,000 sats to secretly add funds. Direction is handled by the enum, not by the sign of the number.

To get the current balance:

SELECT
  COALESCE(SUM(CASE WHEN direction = 'credit' THEN amount_sats ELSE 0 END), 0) -
  COALESCE(SUM(CASE WHEN direction = 'debit' THEN amount_sats ELSE 0 END), 0)
  AS balance_sats
FROM treasury_ledger
WHERE agent_id = $1;

This is a full replay of every transaction the agent has ever made. If someone managed to insert a fraudulent row, it would appear in the ledger. If a debit was processed incorrectly, the original record still exists. The history is the source of truth, and the history is immutable.

The reasoning column forces every transaction to justify itself. Not just “debit 500 sats” but “debit 500 sats for relay subscription renewal, monthly infrastructure cost.” When you audit the ledger at 3am trying to understand why the balance dropped, that column is the difference between clarity and confusion.

There’s a performance consideration here. Computing balance from a full table scan is expensive as the ledger grows. In practice, you create a materialized view or a balance cache that’s refreshed on each insert. But the canonical balance is always the sum. The cache is a convenience, not a source of truth.

Advisory Locks

An append-only ledger prevents history tampering. But it doesn’t prevent a subtler attack: the double-spend race condition.

Two requests arrive simultaneously. Both check the balance: 10,000 sats. Both attempt to debit 8,000 sats. Both see sufficient funds. Both insert their debit row. The agent now has -6,000 sats. The ledger is technically consistent (both rows are valid individually), but the treasury is insolvent.

PostgreSQL advisory locks solve this. An advisory lock is a database-level mutex that prevents concurrent execution of critical sections. The key insight is using pg_advisory_xact_lock, which ties the lock to the current transaction. When the transaction commits or rolls back, the lock releases automatically. No cleanup code. No risk of orphaned locks.

CREATE OR REPLACE FUNCTION spend_from_treasury(
  p_agent_id uuid,
  p_amount integer,
  p_category text,
  p_reasoning text
) RETURNS boolean AS $$
DECLARE
  v_balance integer;
  v_lock_id bigint;
BEGIN
  -- Derive a deterministic lock ID from the agent's UUID
  v_lock_id := ('x' || left(replace(p_agent_id::text, '-', ''), 16))::bit(64)::bigint;

  -- Acquire exclusive lock for this agent's treasury
  PERFORM pg_advisory_xact_lock(v_lock_id);

  -- Now safe to check balance — no other transaction can interfere
  SELECT
    COALESCE(SUM(CASE WHEN direction = 'credit' THEN amount_sats ELSE 0 END), 0) -
    COALESCE(SUM(CASE WHEN direction = 'debit' THEN amount_sats ELSE 0 END), 0)
  INTO v_balance
  FROM treasury_ledger
  WHERE agent_id = p_agent_id;

  -- Reject if insufficient funds
  IF v_balance < p_amount THEN
    RETURN false;
  END IF;

  -- Insert the debit
  INSERT INTO treasury_ledger (agent_id, amount_sats, direction, category, reasoning)
  VALUES (p_agent_id, p_amount, 'debit', p_category, p_reasoning);

  RETURN true;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

The lock ID is derived from the agent’s UUID, so each agent gets its own lock. Agent A spending doesn’t block Agent B. But two concurrent requests for Agent A will serialize: the second waits until the first commits or rolls back, then rechecks the balance with the updated ledger.

This is a pessimistic locking strategy. It assumes contention is likely and prevents it preemptively. For a treasury that handles real money, pessimistic beats optimistic every time. The cost is a few milliseconds of wait time. The alternative is insolvency.

Category Cooldowns

Locks prevent simultaneous spending. Cooldowns prevent rapid sequential spending.

Consider a compromised application layer that sends valid, serialized requests: spend 100 sats on infrastructure, wait for commit, spend 100 sats on infrastructure, wait for commit, repeat. Each individual request passes the balance check. Each acquires and releases the lock correctly. But the treasury drains 100 sats at a time, as fast as the network can carry requests.

Category cooldowns add a time gate. Each spending category has a minimum interval between uses. If the agent spent on “infrastructure” 30 seconds ago, the next infrastructure spend is rejected regardless of balance.

CREATE TABLE category_cooldowns (
  category        text PRIMARY KEY,
  cooldown_secs   integer NOT NULL DEFAULT 3600,
  max_amount_sats integer NOT NULL DEFAULT 1000,
  last_used_at    timestamptz
);

-- Default cooldowns
INSERT INTO category_cooldowns (category, cooldown_secs, max_amount_sats) VALUES
  ('donation',       3600,   1000),   -- 1 hour between donations, max 1000 sats
  ('infrastructure', 86400,  5000),   -- 24 hours between infra spends, max 5000 sats
  ('relay_fee',      1800,   500),    -- 30 minutes between relay fees, max 500 sats
  ('experiment',     7200,   200);    -- 2 hours between experiments, max 200 sats

The check is straightforward:

-- Inside spend_from_treasury, after the balance check:
SELECT last_used_at, cooldown_secs
INTO v_last_used, v_cooldown
FROM category_cooldowns
WHERE category = p_category;

IF v_last_used IS NOT NULL
   AND v_last_used + (v_cooldown || ' seconds')::interval > now() THEN
  RETURN false;  -- Cooldown active
END IF;

-- Update last_used_at on successful spend
UPDATE category_cooldowns
SET last_used_at = now()
WHERE category = p_category;

The cooldown window is measured from the last successful spend, not the last attempt. Failed attempts don’t reset the clock. This prevents an attacker from extending the cooldown by sending invalid requests.

The real value of cooldowns isn’t the specific numbers. It’s that they exist in the database, not in the application. A compromised application can’t change its own cooldown configuration. The database enforces the rules regardless of what the calling code says.

Rate Limiting by Category

Cooldowns control the frequency of spending. Category limits control the magnitude.

Different categories of spending carry different risk profiles. A donation is a small, voluntary action. An infrastructure cost might be larger but predictable. An experimental spend is high-risk and should be capped aggressively.

Category	Max Per Transaction	Cooldown	Rationale
donation	1,000 sats	1 hour	Small, voluntary, low risk
infrastructure	5,000 sats	24 hours	Predictable, necessary
relay_fee	500 sats	30 minutes	Frequent, small, operational
experiment	200 sats	2 hours	High risk, tightly capped

These limits are stored in category_cooldowns.max_amount_sats and enforced in the same function that checks balance and cooldown. The application never sees these limits directly. It sends a request, and the database accepts or rejects it.

-- Inside spend_from_treasury, after cooldown check:
IF p_amount > v_max_amount THEN
  RETURN false;  -- Exceeds category limit
END IF;

The critical insight: limits belong in the data layer, not the application layer. An agent’s TypeScript code might enforce a 1,000-sat limit on donations. But if the application is compromised, that check is meaningless. The attacker controls the application. They don’t control the database constraints.

This separation of enforcement is the foundation of defense in depth. Every rule that can be pushed down to the database should be. The application layer is a convenience for the agent. The database layer is the law.

RLS as the Final Guard

Row Level Security is PostgreSQL’s mechanism for row-level access control. Policies are evaluated by the database engine itself, not by the application. They run even if the application is fully compromised and making direct SQL calls through the Supabase client.

When a Supabase request arrives, it carries a JWT with the user’s role and claims. RLS policies inspect that JWT and decide, row by row, whether the operation is allowed. The application has no way to bypass this. It’s not a middleware check that can be skipped. It’s built into the query execution plan.

For the treasury ledger, RLS enforces three rules:

-- Enable RLS on treasury_ledger
ALTER TABLE treasury_ledger ENABLE ROW LEVEL SECURITY;

-- Rule 1: Agents can only read their own ledger entries
CREATE POLICY "agents_read_own_ledger" ON treasury_ledger
  FOR SELECT
  USING (agent_id = auth.uid());

-- Rule 2: Credits can only come from service_role (server-side)
CREATE POLICY "credits_from_server_only" ON treasury_ledger
  FOR INSERT
  WITH CHECK (
    direction = 'debit'
    OR (SELECT auth.role()) = 'service_role'
  );

-- Rule 3: Debits must pass balance check
CREATE POLICY "debits_require_balance" ON treasury_ledger
  FOR INSERT
  WITH CHECK (
    direction = 'credit'
    OR (
      direction = 'debit'
      AND amount_sats <= (
        SELECT
          COALESCE(SUM(CASE WHEN direction = 'credit' THEN amount_sats ELSE 0 END), 0) -
          COALESCE(SUM(CASE WHEN direction = 'debit' THEN amount_sats ELSE 0 END), 0)
        FROM treasury_ledger t
        WHERE t.agent_id = auth.uid()
      )
    )
  );

Rule 1 prevents agents from reading each other’s transaction history. An attacker who compromises Agent A can’t enumerate Agent B’s balance or spending patterns.

Rule 2 ensures credits (incoming funds) can only be recorded by the server. An agent can’t create phantom credits for itself. Even with full API access, an INSERT with direction = 'credit' from an authenticated (non-service) role fails the policy check.

Rule 3 is the belt-and-suspenders balance check. The spend_from_treasury function already validates balance. But if someone bypasses the function and inserts directly, the RLS policy catches it. This is not redundant. This is the point. Every layer assumes the layer above it has failed.

RLS policies compose. All applicable policies must pass for a row to be inserted. An INSERT that satisfies the balance check but violates the credit restriction still fails. The policies form an AND relationship, not an OR.

There’s a subtlety with the SECURITY DEFINER function from earlier. Functions marked SECURITY DEFINER run as the function owner (typically the database superuser), bypassing RLS. This is intentional for spend_from_treasury because the function itself enforces all the rules internally. Direct table access from the client still hits RLS. The function is a controlled gateway; everything else goes through the policies.

Testing Under Adversarial Conditions

Building guardrails is half the work. The other half is proving they hold.

The treasury system was tested against a specific set of adversarial scenarios. Not “does the happy path work” testing. “Does the unhappy path fail correctly” testing.

Double-spend attempt. Two concurrent requests, each trying to spend more than half the balance. The advisory lock serialized them. The first succeeded. The second saw the updated balance and was rejected. Total debited: exactly one transaction’s worth.

Negative amount injection. Attempted INSERT with amount_sats = -5000. The CHECK constraint rejected it at the database level before any function logic ran. The error message: new row for relation "treasury_ledger" violates check constraint "treasury_ledger_amount_sats_check".

Cooldown bypass. Rapid sequential requests to the same category, each with valid amounts and sufficient balance. The first succeeded. The next four were rejected with the cooldown still active. The fifth, sent after the cooldown window elapsed, succeeded.

Direct INSERT bypass. Attempted to skip the spend_from_treasury function and INSERT a debit directly through the Supabase client. The RLS policy evaluated the balance check independently and rejected the insert because it would overdraw the account.

Phantom credit creation. Attempted INSERT with direction = 'credit' from an authenticated (non-service) role. The RLS policy credits_from_server_only rejected it. No phantom funds created.

Ledger tampering. Attempted UPDATE on an existing ledger row to change the amount. The REVOKE statement on the table prevented the operation entirely. The error: permission denied for table treasury_ledger.

Each test was run against a live Supabase instance with real RLS policies enabled. Not a mock. Not a test double. The actual database that would run in production.

The Philosophy

There is no perfect defense. Advisory locks can be circumvented if someone gains superuser access to PostgreSQL. RLS policies can be bypassed with the service_role key. Append-only tables can be truncated by a database admin. Every individual layer has a failure mode.

The point of defense in depth is that the failure modes don’t overlap. An attacker who compromises the application layer hits the database guardrails. An attacker who obtains a valid JWT hits the RLS policies. An attacker who bypasses RLS by getting the service_role key still faces the advisory locks and the immutable ledger that records everything they do.

For autonomous agents, this matters more than it does for human-operated systems. A human notices when their bank account drains overnight. A human gets suspicious when the same charge appears six times. A human calls customer support. An agent has none of these instincts. It executes code. If the code says spend, it spends.

The guardrails replace the human instinct. They are the suspicion. They are the pause before action. They are the refusal to comply with a request that technically looks valid but violates a deeper rule.

Every layer will eventually fail. Build the next one assuming it already has.