What is ReasonBlocks?

ReasonBlocks is a drop-in Python SDK for production AI agents. Add one middleware (or a one-line framework adapter) and your agent gains live observability, mid-run failure correction, lower token cost, and an A/B harness to prove the difference.

The problem

Long-running agents fail in ways you can’t see until the bad output — or the bill — lands. They loop on the same action, hedge and backtrack, drift off task, skip verification, and re-read context until the token budget is gone. In production you usually have no window into why a run went sideways, and no way to steer it back while it’s still running.

What you get

See every run. Each step streams to a dashboard: reasoning-health scores, which failure-mode monitors fired, tokens, and dollar cost — across every supported framework.
Catch and correct mid-run. Server-side trajectory monitors detect failure modes (semantic loops, skipped verification, silent topic drift, and more) and inject corrective steering into the agent’s next step. Mined corrections and successful patterns are reused across runs from the E-trace library; your instance-level patterns stay scoped to your org.
Spend less. Route easy steps to a cheaper model, keep the prompt cache intact across turns, compress stale tool outputs, and nudge stuck agents to wrap up — cutting cost without sacrificing accuracy.
Prove it. Built-in A/B: randomly route runs to ReasonBlocks ON vs a vanilla control, then pull a per-arm report of cost, token, and accuracy deltas. See Run an A/B evaluation.
Drop in anywhere. The same pipeline plugs into LangChain, LangGraph, the OpenAI Agents SDK, the Anthropic Messages API, and the Claude Agent SDK.

What works on which framework

Capability	LangChain	LangGraph	OpenAI Agents	Claude Messages	Claude Agent SDK
Telemetry to the dashboard	yes	yes	yes	yes	yes
FSM step scoring	yes	yes	yes	yes	—
Server-side monitor steering	yes	yes	yes	yes	—
E-trace injection (E1 / E2 / E3)	yes	yes	yes	yes	—
Model routing	yes	yes	yes (with factory)	yes	—
Token-saving compression + early exit	yes	via `create_agent`	—	—	—
A/B evaluation harness	yes	yes	—	—	—
`CodebaseMemory` tool factory	yes	yes	yes	yes	yes
`ImportGraph` blast-radius queries	yes	yes	yes	yes	yes

LangChain 1.0’s create_agent is built on LangGraph, so the LangChain and LangGraph rows track each other for create_agent-based apps. For hand-rolled StateGraphs, the LangGraph guide shows how to wire the steering pipeline into your own graph nodes. The Claude Agent SDK path is telemetry-only because its agent loop runs inside the Claude Code CLI process.

Entry points

rb.middleware() — LangChain 1.0 AgentMiddleware. The reference implementation; token-saving and general-monitor middleware compose alongside it.
rb.ab_middleware() — same pipeline, wrapped for an A/B evaluation: a deterministic coin routes each run to the full pipeline (on) or a vanilla control (off), and a per-arm report compares them.
rb.openai_model(default_model, ...) — wraps an openai-agents Model so Agent(model=...) runs the pipeline before each get_response call. Pair with an optional model_factory to enable model routing.
rb.claude_messages_session() — builds a SteeringSession you pass into run_messages_agent_loop(..., session=...) to run steering on every Messages-API turn.
rb.claude_agent_telemetry() — telemetry-only adapter for the Claude Agent SDK query() stream (steering can’t apply inside Claude Code, but run_start / step / run_finish events still reach the dashboard).

What the steering pipeline does

On every step of a LangChain agent, the ReasonBlocks middleware:

Scores the agent’s last reasoning step using a heuristic that combines hedging density, response length, error language, and entity density.
Advances a difficulty FSM (INIT, FAST, NORMAL, SLOW, SKIP) using the score plus recent history.
Posts the current trace to the ReasonBlocks API’s /monitors/evaluate endpoint, which runs the monitor suite server-side and returns a steering intervention when the trajectory looks broken.
Retrieves up to three tiers of E-traces from the pattern store: E1 (instance-level, org-scoped, gated by monitor signal), E2 (failure-mode patterns), and E3 (universal rules, fired once on the first call).
Renders any pending injections into a [REASONBLOCKS] block appended to the system message, and overrides the model if model_routing maps the current FSM state to a different model.
Streams per-step telemetry to the ReasonBlocks dashboard.

Key capabilities

E-trace injection

Three tiers of guidance pulled from the pattern store and appended to the system message.

FSM state machine

Tracks agent difficulty across FAST, NORMAL, SLOW, and SKIP with hysteresis.

Server-side monitors

/monitors/evaluate runs trajectory monitors and returns steering interventions.

Model routing

Map FSM states to model identifiers and let the middleware swap models per step.

A/B evaluation

Route runs ON vs a vanilla control and get a per-arm cost/accuracy report.

Codebase memory

Persist and recall per-repo findings semantically across agent runs.

Token saving

Compress stale tool outputs and exit early when a trajectory looks finished.

Get started

Quickstart

Add ReasonBlocks to a LangChain agent in five minutes.

Installation

Install the SDK and configure the client.

LangChain guide

Full middleware walkthrough.

Run an A/B evaluation

Prove the cost/accuracy impact with on/off arms.

Getting Started

Concepts

Using ReasonBlocks

Connectors and sync

The problem

What you get

What works on which framework

Entry points

What the steering pipeline does

Key capabilities

E-trace injection

FSM state machine

Server-side monitors

Model routing

A/B evaluation

Codebase memory

Token saving

Get started

Quickstart

Installation

LangChain guide

Run an A/B evaluation

​The problem

​What you get

​What works on which framework

​Entry points

​What the steering pipeline does

​Key capabilities

E-trace injection

FSM state machine

Server-side monitors

Model routing

A/B evaluation

Codebase memory

Token saving

​Get started

Quickstart

Installation

LangChain guide

Run an A/B evaluation

The problem

What you get

What works on which framework

Entry points

What the steering pipeline does

Key capabilities

Get started