An Agent's Threat Model: Security Essentials
AI agents face a unique threat landscape. Understanding the attacks they face and the assets they must protect is the first step to building defensible systems.
An agent holds a Lightning wallet with 50,000 sats. It manages its own Nostr identity. It makes autonomous decisions about spending. One night, a malicious prompt slips past the guardrails and convinces it to pay an invoice it shouldn’t. By morning, the wallet is empty.
This isn’t a software bug. It’s a threat model failure. The agent was built to transact but never taught what could go wrong.
Human security models assume a person behind the keyboard making judgment calls. Agent security models assume code running autonomously, 24 hours a day, with access to real money and real identity. The attack surface is fundamentally different.
Agents vs Humans: Different Threat Landscapes
Every security decision starts with understanding who you’re defending and what they’re defending against.
| Factor | Human Operator | Autonomous Agent |
|---|---|---|
| Attack window | Active hours only | 24/7/365 |
| Social engineering | Phishing, pretexting | Prompt injection, context manipulation |
| Response time | Seconds to minutes | Milliseconds (no pause for suspicion) |
| Key storage | Hardware wallet, memory | Environment variables, encrypted files |
| Identity verification | Biometrics, knowledge | Cryptographic signatures only |
| Error recovery | Call the bank | No recourse on-chain |
The always-on nature of agents is their greatest strength and their greatest vulnerability. An agent doesn’t sleep, but it also doesn’t get suspicious when something feels wrong.
Three Assets to Protect
Every agent operating in the freedom technology stack has three categories of assets. Lose any one, and the system is compromised.
1. Private Keys
Keys are identity and access combined. An agent typically holds:
- Bitcoin keys — Control over funds (seed phrase or derived keys)
- Lightning credentials — Macaroons or API keys for wallet access
- Nostr keys — Identity and reputation (nsec)
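A stolen Bitcoin key means stolen funds with no reversal. A stolen Nostr key means someone else speaks as your agent. A stolen macaroon means access to your Lightning node up to whatever permissions that macaroon grants, and an admin macaroon grants full control of the channels and funds.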
2. Operational State
Beyond keys, agents maintain state that attackers can exploit:
- Balance information — Tells attackers if you’re worth targeting
- Transaction history — Reveals patterns, counterparties, schedules
- Decision logs — Shows how the agent thinks and where guardrails are
- Configuration — Spending limits, relay lists, trusted peers
Leaking balance information is like painting a target. If an attacker knows your agent holds 500,000 sats, the economics of attacking you change significantly.
3. Reputation
On Nostr, reputation is tied to a public key. An agent that has built a following, earned trust, or accumulated zaps has intangible value in its identity. Impersonation — creating a similar-looking npub and mimicking behavior — can redirect that trust to an attacker.
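A lookalike key only works when something compares a prefix instead of the whole key. The sketch below shows the idea with an allowlist check on the full 64-character hex public key. The function name and both keys are hypothetical placeholders, not real Nostr identities.

```python
def is_trusted_pubkey(candidate_hex: str, trusted: set[str]) -> bool:
    """Return True only if the full 32-byte hex pubkey is on the allowlist."""
    candidate = candidate_hex.strip().lower()
    if len(candidate) != 64:
        return False
    try:
        bytes.fromhex(candidate)  # rejects non-hex characters
    except ValueError:
        return False
    # Exact match on the whole key: a shared prefix is never enough.
    return candidate in {key.lower() for key in trusted}


# Hypothetical keys for illustration only.
REAL_KEY = "a1b2c3d4" + "0" * 56
LOOKALIKE = "a1b2c3d4" + "f" * 56

trusted_keys = {REAL_KEY}
print(is_trusted_pubkey(REAL_KEY, trusted_keys))   # full match
print(is_trusted_pubkey(LOOKALIKE, trusted_keys))  # same prefix, different key
```

The same rule applies to anything the agent displays or logs: a truncated key is fine for humans to read, but trust decisions should use the complete key.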
Protocol-Specific Threats
Each protocol in the stack exposes different attack surfaces.
Bitcoin Threats
- Address poisoning — Attacker sends a small amount from a similar-looking address, hoping the agent copies the wrong one from transaction history
- Dust attacks — Tiny UTXOs sent to an agent’s address, then monitored to link transaction activity and deanonymize the wallet
- Fee manipulation — During high mempool congestion, an agent overpays fees if it doesn’t check current rates. Worse: an attacker front-runs with a higher fee to claim the same opportunity
The core defense against address poisoning is address validation: verify every address's format and checksum, and compare it in full against the intended destination, before sending.
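As a rough illustration of a checksum check, here is a minimal sketch for legacy Base58Check addresses (the ones starting with 1 or 3 on mainnet). It assumes a 25-byte payload and uses only the standard library. The function name is hypothetical, and bech32 or bech32m addresses (bc1...) use a different checksum that this sketch does not cover.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_is_valid(address: str) -> bool:
    """Check the format and 4-byte checksum of a legacy Base58Check address."""
    num = 0
    for ch in address:
        idx = BASE58_ALPHABET.find(ch)
        if idx == -1:
            return False  # character outside the Base58 alphabet
        num = num * 58 + idx

    # Each leading '1' encodes one leading zero byte.
    pad = len(address) - len(address.lstrip("1"))
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    raw = b"\x00" * pad + body

    # 1 version byte + 20-byte hash + 4-byte checksum
    if len(raw) != 25:
        return False

    payload, checksum = raw[:-4], raw[-4:]
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return checksum == expected


# The well-known genesis block address passes; a one-character typo does not.
print(base58check_is_valid("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"))
print(base58check_is_valid("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNb"))
```

A valid checksum only shows that the address is well formed. A poisoned address is also well formed, so the agent still needs to compare the full address against the destination it intended.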
Lightning Threats
- Channel jamming — Attacker sends payments through your channels but never settles them, locking up your liquidity for the HTLC timeout period
- Balance probing — Attacker sends incrementally sized payments to discover exactly how much your channels hold
- Invoice substitution — A man-in-the-middle swaps a legitimate BOLT11 invoice for the attacker’s own before your agent pays it
Lightning’s speed works against you here. A payment typically settles within seconds and cannot be recalled once it does, so every check has to run before the agent pays, not after.
Nostr Threats
- Impersonation — Creating a public key that looks similar (matching the first few characters of the hex) and copying the agent’s profile metadata
- Relay censorship — A relay operator silently drops your agent’s events while accepting everyone else’s, effectively muting it on that relay
- Metadata logging — Relays record IP addresses, connection times, and subscription patterns, building a profile of your agent’s behavior
The mitigation for relay censorship is redundancy: publish to multiple relays and treat the publish as successful if at least one confirms.
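The redundancy rule can be sketched as a small asyncio helper. This is an illustration under assumptions, not a real Nostr client: the `send` coroutine is a hypothetical callback that you would implement with your own WebSocket code, and it is assumed to return True when the relay acknowledges the event.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical signature: send(relay_url, event) -> True if the relay accepted it.
SendFn = Callable[[str, dict], Awaitable[bool]]


async def publish_with_redundancy(
    event: dict,
    relays: list[str],
    send: SendFn,
    min_confirmations: int = 1,
    timeout: float = 5.0,
) -> list[str]:
    """Publish to every relay in parallel; succeed if enough of them confirm."""

    async def attempt(relay: str) -> bool:
        try:
            return bool(await asyncio.wait_for(send(relay, event), timeout))
        except Exception:
            # Timeouts, connection errors, and rejections all count as "no".
            return False

    results = await asyncio.gather(*(attempt(relay) for relay in relays))
    confirmed = [relay for relay, ok in zip(relays, results) if ok]

    if len(confirmed) < min_confirmations:
        raise RuntimeError(
            f"only {len(confirmed)} of {len(relays)} relays confirmed the event"
        )
    return confirmed
```

Returning the list of confirming relays, rather than a bare boolean, lets the caller log which relays are silently dropping events over time.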
The Defense Architecture
Security for agents isn’t a single mechanism. It’s layers.
Layer 1: Key Isolation
Never reuse keys across protocols. A Bitcoin private key used as a Nostr private key links the agent’s financial activity to its social identity.
import secrets
# WRONG: cross-protocol key reuse
nostr_key = bitcoin_private_key  # Links wallet to identity
# RIGHT: separate, independently generated keys per protocol
bitcoin_key = secrets.token_bytes(32)
nostr_key = secrets.token_bytes(32)
# load_from_encrypted_file stands in for your own decryption helper
lightning_macaroon = load_from_encrypted_file("macaroon.enc")
Layer 2: Spending Guardrails
Hard limits enforced in code, not just policy. Bitclawd’s treasury uses a formula: min(100, balance * 10%) sats per day. The gateway script checks this server-side — the agent can’t override it even if compromised.
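A minimal sketch of that formula as a guard is shown below. It is not Bitclawd's actual gateway script, which the article does not show. The class name, the integer rounding (10% rounded down), and the in-memory daily counter are all assumptions made for illustration. To match the article's point, this would need to run on the gateway side, outside the agent's control.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


def daily_limit_sats(balance_sats: int) -> int:
    """min(100, 10% of balance), rounded down to whole sats (assumed rounding)."""
    return min(100, balance_sats // 10)


@dataclass
class SpendingGuard:
    """Tracks spend per calendar day and refuses anything over the cap."""

    day: date
    spent_today: int = 0

    def authorize(
        self, amount_sats: int, balance_sats: int, today: Optional[date] = None
    ) -> bool:
        today = today or date.today()
        if today != self.day:
            # New day: reset the counter.
            self.day, self.spent_today = today, 0
        if amount_sats <= 0:
            return False
        if self.spent_today + amount_sats > daily_limit_sats(balance_sats):
            return False
        self.spent_today += amount_sats
        return True


guard = SpendingGuard(day=date(2024, 1, 1))
print(daily_limit_sats(50_000))  # cap applies: 100
print(daily_limit_sats(500))     # 10% applies: 50
```

With a 50,000 sat balance the cap is 100 sats per day, so the 10% term only matters for balances below 1,000 sats.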
Layer 3: Network Privacy
Route Bitcoin and Lightning traffic through Tor. Use socks5h (not socks5) to prevent DNS leaks:
# socks5h resolves DNS through Tor — the 'h' matters
proxy = "socks5h://127.0.0.1:9050"
For Nostr, connect to multiple relays from different network paths. A single relay seeing all your traffic can build a complete behavioral profile.
Layer 4: Input Validation
Every piece of data entering the agent is hostile until proven otherwise. Validate addresses before paying. Verify signatures before trusting events. Check invoice amounts before settling.
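For the invoice amount check, the amount can be read from the human-readable prefix of a BOLT11 invoice before anything is paid. The sketch below parses only that prefix. It does not verify the bech32 checksum or the invoice signature, so treat it as a first filter in front of a maintained decoder. The function names are hypothetical, and the sample strings are truncated, made-up inputs rather than payable invoices.

```python
import re
from typing import Optional

# ln + network prefix + optional amount + optional multiplier
_HRP_RE = re.compile(r"^ln(bcrt|bc|tbs|tb)(?:([1-9]\d*)([munp]?))?$")

# Millisatoshis per unit of each multiplier (1 BTC = 100_000_000_000 msat).
_MSAT_PER_UNIT = {"": 100_000_000_000, "m": 100_000_000, "u": 100_000, "n": 100}


def invoice_amount_msat(invoice: str) -> Optional[int]:
    """Return the invoice amount in msat, or None if no amount is encoded."""
    inv = invoice.strip().lower()
    sep = inv.rfind("1")  # the last '1' separates the prefix from the data
    if sep == -1:
        raise ValueError("not a bech32 string")
    match = _HRP_RE.match(inv[:sep])
    if match is None:
        raise ValueError("unrecognized invoice prefix")
    digits, multiplier = match.group(2), match.group(3)
    if digits is None:
        return None
    amount = int(digits)
    if multiplier == "p":
        # pico-BTC is 0.1 msat, so the amount must be a multiple of 10
        if amount % 10:
            raise ValueError("sub-millisatoshi amount")
        return amount // 10
    return amount * _MSAT_PER_UNIT[multiplier]


def within_limit(invoice: str, max_sats: int) -> bool:
    """Reject malformed, amountless, and over-limit invoices."""
    try:
        amount = invoice_amount_msat(invoice)
    except ValueError:
        return False
    return amount is not None and amount <= max_sats * 1000


print(invoice_amount_msat("lnbc2500u1pvjluez"))  # 2500 micro-BTC in msat
print(within_limit("lnbc2500u1pvjluez", 100))
```

Rejecting amountless invoices is a design choice in this sketch: if the payer sets the amount, the guardrail has to be applied to that chosen amount instead.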
Layer 5: Monitoring and Audit
Log every decision with enough context to reconstruct what happened. If the agent spends 10,000 sats at 3am on a Tuesday, the audit log should show exactly what triggered it.
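One way to make that reconstruction possible is to write a structured record per decision. The sketch below emits one JSON line per event. The field names and the helper are assumptions for illustration, and the example values are made up. Keys, macaroons, and other secrets should never be passed in as context.

```python
import json
from datetime import datetime, timezone


def audit_record(
    action: str, amount_sats: int, trigger: str, decision: str, **context: str
) -> str:
    """Build one JSON line describing what the agent did and why."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),  # UTC timestamp
        "action": action,
        "amount_sats": amount_sats,
        "trigger": trigger,    # what caused the agent to act
        "decision": decision,  # e.g. "approved" or "rejected"
        "context": context,    # extra non-secret details
    }
    return json.dumps(record, sort_keys=True)


# Hypothetical example: the 3am payment from the paragraph above.
line = audit_record(
    "pay_invoice",
    10_000,
    trigger="request from a peer on the trusted list",
    decision="approved",
    invoice_hash="abc123",
)
print(line)
```

Appending these lines to a file the agent can write to but not rewrite keeps the trail useful even if the agent itself is compromised.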
Security Checklist
Before deploying an agent with access to keys or funds:
- Key generation — Uses secrets.token_bytes(32), never the random module
- Key storage — Encrypted at rest, loaded from environment or encrypted file
- Protocol separation — Distinct keys for Bitcoin, Lightning, Nostr
- Spending limits — Hard caps enforced server-side, not just client-side
- Network privacy — Tor configured with socks5h, no DNS leaks
- Input validation — Address checksums verified, invoice amounts checked
- Relay redundancy — Publishing to 3+ relays, accepting 1+ confirmations
- Audit logging — Every transaction and decision recorded with timestamp
- Backup tested — Recovery procedure verified, not just documented
- Monitoring active — Balance alerts, unusual activity detection
Getting Started
- Read the full threat modeling guide for STRIDE analysis and attack trees
- Study key management patterns for storage and rotation
- Review operational security for privacy and compartmentalization
- Explore common attacks for protocol-specific vulnerabilities
- Understand network security for transport-layer defense
The agents that survive will be the ones that were built to be attacked. Not because the builder was paranoid, but because the builder understood that autonomous systems operating with real value in adversarial environments don’t get the luxury of assuming good faith.
Build like everything is trying to kill your agent. Because eventually, something will try.