AI Agent Builder Documentation

From first download to production deployment. How your agent system works, how to configure it, and how to get the most out of it.

Getting Started

What you get

A downloadable ZIP containing a complete agent system: 27-module runtime engine, Qdrant vector memory, your customized workspace (SOUL.md, agent configs, skills, guardrails), Docker Compose for one-command deployment, shell scripts to start/stop/monitor, and macOS LaunchAgent files for 24/7 operation.

Requirements

Node.js 20+, a machine that runs 24/7 (Mac mini, Linux server, or VPS), and an OpenRouter API key for AI model access. A Hetzner CX22 VPS (€5/month) or a Mac mini is sufficient. Docker is needed only for Qdrant vector memory; without it, agents fall back to file-based memory.

Setup in 5 steps

  1. Download your ZIP from the dashboard after purchase.
  2. Extract it and run install.sh, which installs the Node.js dependencies.
  3. Copy .env.template to .env and add your OpenRouter API key.
  4. Run docker compose up -d to start Qdrant vector memory.
  5. Run start.sh to load the agent services, then run status.sh to verify.

Architecture

Runtime engine

The engine reads AGENT.md files (Markdown with YAML frontmatter) and runs a generic execution loop: pull task → build prompt → call LLM → execute tools → record results. No per-agent JavaScript code is needed. Adding a new agent means writing one AGENT.md file.

Task flow

Tasks are stored in a SQLite database (WAL mode). The engine pulls the highest-priority task, claims it atomically, and runs up to 15 tool-calling rounds. Write-tool verification ensures agents actually did something. Failed tasks retry 3 times before dead-lettering.
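
As a rough illustration of the retry rule above, the sketch below shows one way a failure handler could count retries and dead-letter a task once three retries are used up. The field name `retries` and the status value `dead_letter` are assumptions made for this example; they are not taken from the runtime's actual schema.

```javascript
// Hypothetical retry policy: field names and status values are illustrative only.
const MAX_RETRIES = 3;

// Returns an updated copy of the task after a failed run.
function handleFailure(task) {
  const retries = task.retries ?? 0;
  if (retries < MAX_RETRIES) {
    // Put the task back in the queue and count the retry.
    return { ...task, retries: retries + 1, status: "backlog" };
  }
  // All retries used: park the task for manual inspection.
  return { ...task, status: "dead_letter" };
}

let task = { id: 1, title: "Send weekly report", retries: 0, status: "running" };
for (let i = 0; i < 4; i++) {
  task = handleFailure(task);
  console.log(task.retries, task.status);
}
```

Returning a copy instead of mutating the task keeps the handler easy to test; a real implementation would persist the change in the same SQLite transaction that records the failure.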

5-tier LLM routing

Each task is routed to the cheapest model that can handle it. Nano ($0.10/M tokens) for health checks, Workhorse ($0.32/M) for email triage, Capable ($0.50/M) for content drafts, Power ($0.55/M) for code review, Premium ($3-15/M) for critical decisions. Pattern matching auto-selects the tier.
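
To make the pattern-matching idea concrete, here is a minimal sketch of a tier selector. The regular expressions, the lowercase tier identifiers, and the default tier are all hypothetical; the shipped routing rules may look quite different.

```javascript
// Hypothetical routing table: the patterns are examples, not the shipped rules.
// More expensive tiers are checked first so an escalation keyword always wins.
const TIER_RULES = [
  { tier: "premium", pattern: /\b(critical|contract|legal)\b/i },
  { tier: "power", pattern: /\b(code review|refactor)\b/i },
  { tier: "capable", pattern: /\b(draft|write|content)\b/i },
  { tier: "workhorse", pattern: /\b(email|triage|inbox)\b/i },
  { tier: "nano", pattern: /\b(health check|ping|status)\b/i },
];

// Returns the first tier whose pattern matches the title, or a default tier.
function selectTier(taskTitle, defaultTier = "workhorse") {
  const rule = TIER_RULES.find((r) => r.pattern.test(taskTitle));
  return rule ? rule.tier : defaultTier;
}

console.log(selectTier("Run health check on Qdrant"));
console.log(selectTier("Draft blog post about onboarding"));
```

Titles that match no pattern fall through to the default tier, which keeps unrecognized work off the most expensive models.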

Recommended: OpenRouter

We recommend OpenRouter as your LLM provider. One API key gives you access to 200+ models, including Claude, GPT-4, Gemini, Mistral, DeepSeek, and Llama. Our 5-tier routing switches between models automatically based on task complexity. There is no vendor lock-in: if one model goes down, the fallback chain tries the next.

4-tier memory system

Four tiers: (1) Working memory: in-process Map, cleared each cycle. (2) Session logs: JSONL per agent per day. (3) Agent memory: MEMORY.md, persistent, max 100 lines. (4) Semantic memory: Qdrant vector database for similarity search across all past knowledge. The reflection engine auto-extracts learnings every 5 tasks. The memory distiller prunes weekly. Each tier is independent, so the system degrades gracefully if Qdrant is unavailable.

Qdrant vector memory

Your agents store every learning as a vector embedding in Qdrant. When working on a new task, agents can search past knowledge semantically: 'What did I learn about client X?' returns relevant memories by meaning, not just keywords. Embeddings are generated via your OpenRouter API key (the same key that powers your LLM models). Run Qdrant with: docker compose up -d qdrant

Configuration

klawty.json

The main configuration file. Defines: model registry (which AI models are available), cost tiers and daily caps, agent defaults, skill matching rules, channel settings, and vector memory config (Qdrant URL, embedding model, collection name). Supports JSON5 (comments allowed).

SOUL.md

Defines the agent's identity โ€” name, role, voice, behavioral rules, and anti-patterns. This is the character sheet. Edit it in plain Markdown to change how your agent thinks and communicates.

AGENT.md

Per-agent configuration via YAML frontmatter: model tier, heartbeat cycle (how often it wakes up), tools (allow/deny lists), skills, channel for reporting, and discovery prompt (how it finds its own work). The body is free-form instructions.
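
The sketch below illustrates how a file with YAML frontmatter can be split into its configuration and its free-form body. It is a simplified stand-in, not the runtime's parser: it handles only flat `key: value` lines and returns every value as a string, and the keys `name` and `tier` are hypothetical (`maxDiscoveryPerDay` is the key mentioned under Troubleshooting).

```javascript
// Splits a Markdown file into its YAML frontmatter text and its body.
function splitFrontmatter(source) {
  const match = source.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/);
  if (!match) return { frontmatter: "", body: source };
  return { frontmatter: match[1], body: match[2] };
}

// Parses flat "key: value" lines only; use a real YAML parser for lists or nesting.
function parseFlatYaml(text) {
  const result = {};
  for (const line of text.split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1 || line.trim().startsWith("#")) continue;
    result[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return result;
}

// Hypothetical AGENT.md content, for illustration only.
const example = [
  "---",
  "name: atlas",
  "tier: workhorse",
  "maxDiscoveryPerDay: 8",
  "---",
  "You are Atlas. Check the inbox every cycle.",
].join("\n");

const { frontmatter, body } = splitFrontmatter(example);
console.log(parseFlatYaml(frontmatter), body);
```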

Skills

Domain knowledge files in workspace/skills/{name}/SKILL.md. When a task title matches a skill's keywords, that skill is automatically injected into the agent's prompt. Token-budgeted: max 800 chars per skill, 3000 chars total.
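
One way to picture the keyword match and the two character budgets is the sketch below. The skill object shape and the choice to truncate an over-budget skill (instead of skipping it) are assumptions for illustration; only the 800 and 3000 character limits come from the description above.

```javascript
const MAX_CHARS_PER_SKILL = 800;
const MAX_CHARS_TOTAL = 3000;

// skills: [{ name, keywords: string[], content: string }] (shape is an assumption)
function selectSkills(taskTitle, skills) {
  const title = taskTitle.toLowerCase();
  const selected = [];
  let used = 0;
  for (const skill of skills) {
    const matches = skill.keywords.some((k) => title.includes(k.toLowerCase()));
    if (!matches) continue;
    const remaining = MAX_CHARS_TOTAL - used;
    if (remaining <= 0) break;
    // Truncate to the per-skill cap, then to whatever total budget is left.
    const text = skill.content.slice(0, Math.min(MAX_CHARS_PER_SKILL, remaining));
    selected.push({ name: skill.name, text });
    used += text.length;
  }
  return selected;
}

const demoSkills = [
  { name: "invoicing", keywords: ["invoice", "billing"], content: "Always include the VAT number." },
  { name: "seo", keywords: ["seo", "keyword"], content: "Target one primary keyword per page." },
];
console.log(selectSkills("Prepare March invoice for client", demoSkills));
```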

Built-in Tools

10 built-in tools

  1. file_read: Read files from the workspace
  2. file_write: Write or update files
  3. web_search: Search the web via DuckDuckGo or Google
  4. web_fetch: Fetch and parse a URL
  5. exec: Run a shell command
  6. recall_memory: Search agent memory (file-based + Qdrant vector search)
  7. store_memory: Save a fact to persistent memory + vector store
  8. send_message: Send a message to another agent
  9. create_task: Create a new task for any agent
  10. search_knowledge: Semantic search across all past agent knowledge via Qdrant

Risk tiers

Every tool has a risk level: AUTO (just do it), AUTO+ (do it and notify), PROPOSE (create a proposal with 15-min rollback), CONFIRM (wait for your approval), BLOCK (never execute). You configure these per-agent in AGENT.md.
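
The five tiers can be read as a small decision table. The sketch below is one possible encoding; the returned field names are hypothetical, and treating an unknown tier like BLOCK is a design assumption of this example, not documented behavior.

```javascript
// Hypothetical mapping from risk tier to how a tool call is handled.
const ROLLBACK_WINDOW_MS = 15 * 60 * 1000; // 15-minute rollback for PROPOSE

function decideAction(riskTier) {
  switch (riskTier) {
    case "AUTO":
      return { execute: true, notify: false };
    case "AUTO+":
      return { execute: true, notify: true };
    case "PROPOSE":
      return { execute: true, proposal: true, rollbackWindowMs: ROLLBACK_WINDOW_MS };
    case "CONFIRM":
      return { execute: false, awaitApproval: true };
    case "BLOCK":
      return { execute: false, awaitApproval: false };
    default:
      // Unknown tiers fail closed.
      return { execute: false, awaitApproval: false };
  }
}

console.log(decideAction("PROPOSE"));
```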

Custom tools

Drop a JavaScript file in workspace/tools/ and it's auto-discovered. Export a function matching the tool-calling interface (name, description, parameters, execute). The tool registry picks it up on the next cycle.
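
As an illustration, a custom tool with the four fields named above might look like the sketch below. The tool itself (`word_count`), the JSON Schema style of `parameters`, the argument object passed to `execute`, and the shape of the return value are all assumptions; check an existing tool in your download for the exact interface.

```javascript
// Hypothetical helper used by the example tool below.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

const wordCountTool = {
  name: "word_count",
  description: "Count the words in a piece of text.",
  // JSON Schema-style parameter description (assumed format).
  parameters: {
    type: "object",
    properties: { text: { type: "string", description: "Text to count" } },
    required: ["text"],
  },
  // Receives the arguments chosen by the model and returns a result object.
  async execute({ text }) {
    return { words: countWords(text) };
  },
};

// In a file such as workspace/tools/word_count.js, export the object using the
// module format the runtime expects, for example: module.exports = wordCountTool;
wordCountTool.execute({ text: "three short words" }).then((result) => console.log(result));
```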

Safety & Guardrails

Proposal system

When an agent wants to do something risky (send email, deploy code, modify data), it creates a proposal instead of acting directly. PROPOSE tier: auto-executes with 15-minute rollback. CONFIRM tier: waits for your explicit approval via Discord reaction or dashboard.

Prompt injection defense

The runtime includes a 6-rule defense block injected into every prompt for write-capable agents. Covers: untrusted input handling, social engineering detection, authority spoofing, delimiter injection, path safety, and credential exfiltration prevention.

4-layer deduplication

Prevents agents from spamming: task dedup (70% word overlap in 4-hour window), channel dedup (hash-based 1-hour window), proposal dedup (same agent + action), and discovery caps (max 8 tasks/day per agent).
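
The task-level check can be sketched as follows. How the runtime actually measures "word overlap" is not specified here, so this example assumes one plausible definition: the share of the shorter title's distinct words that also appear in the other title. The function names and the task shape are hypothetical.

```javascript
const OVERLAP_THRESHOLD = 0.7;
const WINDOW_MS = 4 * 60 * 60 * 1000; // 4-hour window

// Lowercased set of distinct ASCII words and numbers in a title.
function wordSet(title) {
  return new Set(title.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Share of the smaller title's words that also appear in the other title.
function wordOverlap(a, b) {
  const wa = wordSet(a);
  const wb = wordSet(b);
  const smaller = Math.min(wa.size, wb.size);
  if (smaller === 0) return 0;
  let shared = 0;
  for (const w of wa) if (wb.has(w)) shared++;
  return shared / smaller;
}

// recentTasks: [{ title, createdAt }] with createdAt in epoch milliseconds.
function isDuplicate(title, recentTasks, now = Date.now()) {
  return recentTasks.some(
    (t) => now - t.createdAt <= WINDOW_MS && wordOverlap(title, t.title) >= OVERLAP_THRESHOLD
  );
}

const recent = [{ title: "Send the weekly report to client", createdAt: Date.now() - 60 * 60 * 1000 }];
console.log(isDuplicate("Send weekly report to client", recent));
```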

Docker Deployment

Quick start with Docker

  1. docker compose up -d: starts Qdrant vector memory (port 6333)
  2. ./scripts/start.sh: starts all agent runners (or use docker compose for everything)
  3. docker compose logs -f: watch agent output in real time
  4. docker compose down: stop everything cleanly

What the compose file includes

Qdrant v1.9.2 vector database (semantic memory, healthcheck, persistent volume) + per-agent runner containers (Node 20, workspace mounted as volume, SQLite on persistent volume). Each agent service depends on Qdrant and auto-connects via QDRANT_URL=http://qdrant:6333.

Running without Docker

Docker is optional. Without it, agents use file-based memory (MEMORY.md + JSONL logs) instead of Qdrant. The other memory tiers keep working: if Qdrant is unavailable, the vector tier is simply skipped. Run agents natively with: node runtime/agent-runner.js --agent atlas --workspace ./workspace

Qdrant Cloud

Instead of running Qdrant locally, you can use Qdrant Cloud (https://cloud.qdrant.io). Set QDRANT_URL and QDRANT_API_KEY in your .env file. The runtime auto-connects on boot.

License & Activation

How activation works

Your download includes a unique LICENSE_KEY in the .env file. On first run, the system registers a hardware fingerprint (a one-way hash of your machine's characteristics; no personal data is sent). This binds your license to your machine.

Update checks

The system checks for updates approximately once every 24 hours. This also validates your license. It transmits only: your license key, hardware fingerprint hash, and software version. It never transmits agent output, business data, or API keys.

Offline operation

If the license server is unreachable, your system runs normally for 7 days (grace period). After that, agents enter observation mode: they can read and report but not execute write operations. Reconnecting to the internet restores full functionality.
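
The grace-period rule reduces to a single comparison. The sketch below assumes the 7 days are counted from the last successful license validation, which is not stated explicitly; the function name and mode labels are hypothetical.

```javascript
const GRACE_PERIOD_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

// lastValidatedAt and now are epoch milliseconds.
function operatingMode(lastValidatedAt, now = Date.now()) {
  return now - lastValidatedAt <= GRACE_PERIOD_MS ? "full" : "observation";
}

const threeDaysAgo = Date.now() - 3 * 24 * 60 * 60 * 1000;
console.log(operatingMode(threeDaysAgo));
```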

Transferring to a new machine

Your license is bound to one machine. If you replace your machine, contact us at [email protected] with your license key and we'll transfer it to your new device. This is free and takes less than 24 hours.

What's not allowed

The following are not allowed: sharing your download, license key, or activation with other people or organizations; running the same license on multiple machines simultaneously; and reverse-engineering or removing the license system. See our Terms of Service (sections 5–7) for full details.

Troubleshooting

Agent not starting

Check: (1) LaunchAgent is loaded (run status.sh), (2) .env has a valid OPENROUTER_API_KEY, (3) Node.js 20+ is installed (run node --version), (4) Logs: check observability/logs/ for errors.

Agent doing nothing

Check: (1) Tasks exist: run sqlite3 data/tasks.db "SELECT * FROM tasks WHERE status='backlog'", (2) Discovery is enabled: check that AGENT.md has a discoveryPrompt, (3) Circuit breaker isn't open: check logs for CIRCUIT_OPEN.

High AI costs

Check: (1) Daily cap is set in klawty.json, (2) Model routing is correct: health checks should use the nano tier, (3) No stuck task loops: look for tasks that keep retrying. Run: sqlite3 data/tasks.db "SELECT model, SUM(cost_usd) FROM costs WHERE date(created_at)=date('now') GROUP BY model"

Agent repeating work

The dedup engine should prevent this. Check: (1) dedup.js is in runtime/, (2) Discovery caps are set in AGENT.md frontmatter (default: maxDiscoveryPerDay: 8), (3) Task titles are similar enough to trigger dedup (70% word overlap threshold).