The agentic stack I implement against — a Claude / GPT-5.5 reasoning layer fronted by Hermes (the agent runtime) and Paperclip (the gateway / cron / delegation tier). One default agent, many models, all calls audited at the edge. Underneath it sits a persistent second brain — a file-based knowledge wiki the agents read first and write back to, with a scheduled "dream sequence" pass that ingests new material and lints the index.
- A knowledge base the agent maintains itself (ingest → page → index → log) turns chat ephemera into institutional memory — sessions stop re-discovering the same facts.
- Plain markdown beats a database for agent memory: greppable, diffable, and any model can read it cold.
- Centralizing access through one gateway makes cost, audit, and rate limits enforceable. Direct keys per tool is the slow drift into chaos.
- "Agent" is a verb (a thing that decides what to call) more than a noun. Picking one default keeps the verb consistent.
- Tool inventory belongs in the agent layer, not in the prompt — keeps prompts small and tools swappable.
- One agent, many models is easier to operate than many agents, one model.
- Cron-driven agent delegation is dramatically simpler than event-driven for small teams. Most "real-time" requirements aren't.
- Mass-update workflows require idempotency from day one — or the rollback story gets ugly fast.
- Audit logging at the gateway saves arguing about what an agent actually called when something breaks at 2am.