Overview
What you get from Entorin, mapped to the harness pains every agent stack reinvents.
Entorin is a substrate, not a platform. Whatever you already use to drive agents — Claude Agent SDK, Codex CLI / Codex SDK, a hand-written while loop, eventually LangGraph or CrewAI — keeps its shape. Entorin slides underneath and supplies the harness layer: trace, budget, sandbox, audit, capability flow.
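The "hand-written while loop" shape described above can be sketched in a few lines. Everything here is a stand-in: `fake_llm` and the `tools` table are placeholders for your own model and tool code, and no Entorin API appears — the point is that this loop keeps its shape when the harness slides underneath it.

```python
# A minimal hand-written agent loop of the shape Entorin is designed to
# sit under. fake_llm and tools are stand-ins, not Entorin API.
def fake_llm(history):
    # Pretend the model asks for one tool call, then finishes.
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"final": f"done: {history[-1]['content']}"}

tools = {"add": lambda a, b: a + b}

def run(goal, max_steps=5):
    history = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        action = fake_llm(history)
        if "final" in action:
            return action["final"]
        result = tools[action["tool"]](**action["args"])
        history.append({"role": "tool", "content": result})
    return None

print(run("add 2 and 3"))  # -> done: 5
```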
What you get, by pain point
| Pain | What the integration gives you |
|---|---|
| P0 — observability | One OTel trace per run. Every LLM call, tool call, agent invocation, sandbox exec, and checkpoint round-trip is a span carrying entorin.run_id, entorin.principal_id, token counts, and cost. |
| P1 — frameworks over-abstracted | The bare-loop reference shows that entorin itself never asks you to subclass anything or build a DAG. A 50-line Python while loop inherits the full harness. |
| P7 — testing / evals weak | Saved traces are the regression substrate. entorin.replay ships a TraceRecorder and a small set of invariant checks (assert_calls_paired, assert_run_lifecycle, assert_budget_within_cap). |
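To make the P7 row concrete, here is a sketch of the kind of invariant the entorin.replay checks enforce. The span format (a flat list of event dicts) and both function bodies are assumptions for illustration, not the library's actual implementation.

```python
# Illustrative invariant checks over a recorded trace; the event schema
# below is an assumption, not entorin.replay's real format.
def assert_calls_paired(events):
    """Every 'call.start' must have a matching 'call.end' with the same id."""
    started = {e["id"] for e in events if e["kind"] == "call.start"}
    ended = {e["id"] for e in events if e["kind"] == "call.end"}
    assert started == ended, f"unpaired calls: {started ^ ended}"

def assert_budget_within_cap(events, cap_usd):
    """Total recorded cost must not exceed the run's budget cap."""
    spent = sum(e.get("cost_usd", 0.0) for e in events)
    assert spent <= cap_usd, f"spent {spent} > cap {cap_usd}"

trace = [
    {"kind": "call.start", "id": "llm-1"},
    {"kind": "call.end", "id": "llm-1", "cost_usd": 0.002},
]
assert_calls_paired(trace)
assert_budget_within_cap(trace, cap_usd=0.01)
```

Because the checks run against saved traces rather than live runs, they double as regression tests: replay yesterday's trace, assert the invariants still hold.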
Install
```sh
uv add entorin

# Optional extras:
uv add 'entorin[mcp]'   # MCP transport for the tool wrapper
uv add 'entorin[http]'  # FastAPI-backed HITL (human-in-the-loop) checkpoint transport
```
What Entorin does not do
Repeated for emphasis — the substrate philosophy is scope discipline:
- No DAG / workflow builder. That’s LangGraph / CrewAI / etc.
- No prompt templating. That’s your code.
- No vector DB / retrieval. Retrieval ships as a protocol; bring your own backend.
- No deployment infra. No Docker / K8s / queues / load balancers.
- No eval suites. Traces are the regression substrate; you bring the assertions.
If you find yourself wanting one of these from Entorin, that is a sign the wrong tool is on your shortlist.
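The "retrieval ships as a protocol" point above can be sketched with a `typing.Protocol`. The method name, signature, and in-memory backend below are assumptions for illustration, not Entorin's actual retrieval interface.

```python
from typing import Protocol

class Retriever(Protocol):
    """Assumed shape of a retrieval protocol; Entorin's real
    interface may differ."""
    def retrieve(self, query: str, k: int = 5) -> list[str]: ...

# Any backend matching the shape satisfies the protocol.
class InMemoryRetriever:
    def __init__(self, docs: list[str]):
        self.docs = docs

    def retrieve(self, query: str, k: int = 5) -> list[str]:
        # Naive substring match stands in for a real vector search.
        hits = [d for d in self.docs if query.lower() in d.lower()]
        return hits[:k]

r: Retriever = InMemoryRetriever(["Entorin traces runs", "Budgets cap spend"])
print(r.retrieve("budget"))  # -> ['Budgets cap spend']
```

Swapping in a vector DB, a keyword index, or a SaaS API is then a backend choice, not a framework migration — which is the scope-discipline point of the list above.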