ReasonBlocksConfig is a dataclass that captures every middleware capability and its tunables. build_middleware(config, api, ...) consumes the config and returns an ordered middleware list ready to pass to create_agent(middleware=...).
ReasonBlocksConfig + build_middleware() is an alternate path, parallel to rb.middleware(). They share zero state. Use one or the other in a given run; don’t mix them.
- rb.middleware(): the simpler path. One ReasonBlocks instance, one middleware per run. Wires telemetry, FSM, monitor steering, E1/E2/E3, and model routing all in one object.
- build_middleware(config, api, ...): the explicit path. You pass an api, a score_fn, an fsm, and a state_manager, and get back an ordered list. Useful when you want to add GeneralMonitorMiddleware or compose middleware around your own custom scorer.
build_middleware ordering
build_middleware returns the list in this order:
1. ReasonBlocksMiddleware (only when at least one of enable_e1, enable_e2, enable_e3, enable_monitor_steering is True).
2. GeneralMonitorMiddleware (when enable_general_monitor=True).
3. TokenSavingMiddleware last (when enable_token_saving=True), so it compresses what earlier middleware injected before the LLM call goes out.
When ReasonBlocksMiddleware is included, you must pass score_fn, fsm, and state_manager; build_middleware raises ValueError otherwise.
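The ordering and the ValueError rule can be modelled in a few lines. This is a sketch with stand-in names, not the SDK's code: SketchConfig and sketch_build_order are hypothetical, the False defaults are an assumption, and the function returns middleware names rather than real middleware objects.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchConfig:
    """Stand-in for ReasonBlocksConfig holding only the toggles named above.

    The False defaults are an assumption made for this sketch.
    """
    enable_e1: bool = False
    enable_e2: bool = False
    enable_e3: bool = False
    enable_monitor_steering: bool = False
    enable_general_monitor: bool = False
    enable_token_saving: bool = False


def sketch_build_order(
    config: SketchConfig,
    score_fn: Optional[Any] = None,
    fsm: Optional[Any] = None,
    state_manager: Optional[Any] = None,
) -> list[str]:
    """Return middleware names in the documented order (illustrative only)."""
    order: list[str] = []
    needs_rb = (
        config.enable_e1
        or config.enable_e2
        or config.enable_e3
        or config.enable_monitor_steering
    )
    if needs_rb:
        # The documented rule: these three are required once
        # ReasonBlocksMiddleware is part of the list.
        if score_fn is None or fsm is None or state_manager is None:
            raise ValueError("score_fn, fsm, and state_manager are required")
        order.append("ReasonBlocksMiddleware")
    if config.enable_general_monitor:
        order.append("GeneralMonitorMiddleware")
    if config.enable_token_saving:
        # Always last, so it sees what the earlier middleware injected.
        order.append("TokenSavingMiddleware")
    return order
```

With only enable_token_saving set, the sketch returns a single-element list and needs none of the three extra arguments, which matches the rule that they are required only when ReasonBlocksMiddleware is included.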
Complete example
Tier toggles
Enable customer-scoped instance-level pattern injection. Capped at one retrieval call per run.
Enable commons-pattern injection keyed on the failure family detected by monitors. Capped at one retrieval call per run.
Enable universal standing-rule injection. Fires on the first call only (step 0); subsequent steps skip E3.
Enable server-side trajectory monitor evaluation via POST /monitors/evaluate. When fired, monitor results steer which E-trace tier retrieves and what gets injected into the system message. Per-run injection cap: 5. Cooldowns by FSM state: SLOW/SKIP=2 steps, NORMAL=3, FAST=5.
Enable the v1 rule-firing pack (GeneralMonitorMiddleware). Five rule-based detectors with priority-ordered dispatch and per-rule cooldown. When enabled, it’s inserted between ReasonBlocksMiddleware and TokenSavingMiddleware.
Enable TokenSavingMiddleware. Appended last so it compresses whatever earlier middleware injected.
Reserved. Currently a no-op in build_middleware(): set or unset, the SDK behaves the same. Documented for forward compatibility.
Tier parameters
E1 retrieval top-k. The middleware caps E1 at one retrieval per run regardless. Server-side gates further filter results.
E2 retrieval top-k.
Maximum patterns retrieved on the E3 first-call sweep.
Documentation only. The actual gate is enforced server-side by rb-api as an env-global. Setting this in the config does not change server behavior; it’s recorded so callers can see the value rb-api is configured with.
Documentation only. Same caveat as e1_sim_gate_doc.
Token-saving levers
Character length above which a ToolMessage body is eligible for head+tail truncation.
Characters kept from the start of a compressed tool message.
Characters kept from the end of a compressed tool message.
Number of most-recent ToolMessage instances exempted from compression.
Master toggle for head+tail tool-output compression.
Enable the early-exit nudge that suggests wrapping up once the trajectory looks finished. Supply a custom trajectory scorer via ts_signals_fn to drive it; when omitted it uses the middleware’s built-in heuristics.
Minimum number of model calls before the early-exit nudge can fire.
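As a rough illustration of how the head+tail levers interact, the sketch below truncates plain strings standing in for ToolMessage bodies. The function names, parameter names, and the format of the omission marker are all assumptions made for the example; the middleware's actual marker text and edge-case handling may differ.

```python
def head_tail_truncate(body: str, threshold: int, head_chars: int, tail_chars: int) -> str:
    """Keep the first head_chars and last tail_chars of an over-long body."""
    if len(body) <= threshold:
        return body  # not eligible: at or below the length threshold
    if head_chars + tail_chars >= len(body):
        return body  # nothing would be dropped, so leave it alone
    omitted = len(body) - head_chars - tail_chars
    head = body[:head_chars]
    tail = body[len(body) - tail_chars:] if tail_chars > 0 else ""
    # Marker format is hypothetical.
    return f"{head}\n[... {omitted} characters omitted ...]\n{tail}"


def compress_tool_outputs(
    bodies: list[str],
    threshold: int,
    head_chars: int,
    tail_chars: int,
    keep_recent: int,
) -> list[str]:
    """Truncate every body except the keep_recent most recent ones."""
    cutoff = len(bodies) - keep_recent  # indexes >= cutoff are exempt
    return [
        body if index >= cutoff
        else head_tail_truncate(body, threshold, head_chars, tail_chars)
        for index, body in enumerate(bodies)
    ]
```

Exempting the newest messages keeps the output the model is most likely still reasoning about intact, while older outputs shrink to their first and last few characters.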
Perplexity compression
Perplexity compression is off by default. See TokenSavingMiddleware for context.
Enable LLMLingua-2-style word-level keep/drop on stale messages. Requires ts_perplexity_classifier; silently skipped if it’s None.
A Callable[[list[str]], list[bool]]. Use reasonblocks.token_saving.make_anthropic_classifier for the shipped Haiku-as-classifier wiring.
Messages within the last N model calls keep full fidelity.
Messages between ts_perplexity_recent_cutoff and this cutoff use the mid keep ratio. Older messages use the old keep ratio.
Target word-keep ratio for mid-tier messages.
Target word-keep ratio for old-tier messages.
Words per classifier window. Smaller = more API calls, finer decisions.
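To make the classifier shape and the tiering concrete, here is a toy sketch. The only detail taken from the reference is the Callable[[list[str]], list[bool]] signature; pick_tier, toy_classifier, compress_words, and their parameter names are hypothetical, the cutoff boundaries are assumed to be exclusive on the recent side, and the toy classifier ignores the target keep ratio entirely.

```python
from typing import Callable

# Documented shape: one keep/drop flag per input word.
Classifier = Callable[[list[str]], list[bool]]


def pick_tier(age_in_calls: int, recent_cutoff: int, mid_cutoff: int) -> str:
    """Map a message's age (in model calls) to a tier. Boundaries are assumed."""
    if age_in_calls < recent_cutoff:
        return "recent"  # full fidelity
    if age_in_calls < mid_cutoff:
        return "mid"     # mid keep ratio applies
    return "old"         # old keep ratio applies


def toy_classifier(words: list[str]) -> list[bool]:
    """Keep words longer than three characters or containing a digit."""
    return [len(w) > 3 or any(c.isdigit() for c in w) for w in words]


def compress_words(text: str, classifier: Classifier, window: int) -> str:
    """Run the classifier window by window and join the kept words."""
    if window <= 0:
        raise ValueError("window must be positive")
    words = text.split()
    kept: list[str] = []
    for start in range(0, len(words), window):
        chunk = words[start:start + window]
        flags = classifier(chunk)  # one classifier call per window
        kept.extend(word for word, keep in zip(chunk, flags) if keep)
    return " ".join(kept)
```

The loop makes the window trade-off visible: halving the window doubles the number of classifier calls for the same message.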
General monitor levers
Apply only when enable_general_monitor=True.
Reference budget used by sprawl, idle-response, and low-novelty-tail detectors.
Minimum model calls between firings of the same rule.
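Priority-ordered dispatch with a per-rule cooldown might look roughly like the sketch below. RuleDispatcher is a hypothetical stand-in, not GeneralMonitorMiddleware's implementation; it assumes at most one rule fires per model call and that a rule becomes eligible again once the number of calls since its last firing reaches the cooldown.

```python
from typing import Optional


class RuleDispatcher:
    """Fire the highest-priority triggered rule that is not cooling down."""

    def __init__(self, rules_by_priority: list[str], cooldown_calls: int) -> None:
        self.rules = list(rules_by_priority)  # earlier in the list = higher priority
        self.cooldown = cooldown_calls
        self.last_fired: dict[str, int] = {}  # rule name -> call index of last firing

    def dispatch(self, call_index: int, triggered: set[str]) -> Optional[str]:
        for rule in self.rules:
            if rule not in triggered:
                continue
            last = self.last_fired.get(rule)
            if last is not None and call_index - last < self.cooldown:
                continue  # this rule fired too recently; try the next one
            self.last_fired[rule] = call_index
            return rule
        return None  # nothing eligible on this call
```

Because the cooldown is tracked per rule, a lower-priority detector can still fire while a higher-priority one is cooling down.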
Routing and scoping
E1 retrieval is scoped to your organization automatically: rb-api derives the scope from the API key’s principal. There is no customer_id field on ReasonBlocksConfig or the ReasonBlocks constructor.
Server-side monitor weight profile. Real values referenced in the SDK: "coding" (default), "pr_review". See Monitor profiles for the per-profile weight tables (which live in rb-api, not the SDK).
Optional explicit weight override forwarded to MonitorSteeringInjection and through to POST /monitors/evaluate. Partial dicts are accepted: unspecified monitors fall through to the profile, then to server defaults. Unknown monitor names are dropped server-side; negative weights are clamped to 0.
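The fall-through described for weight overrides can be sketched as a small resolution function. This models the documented behavior only; the real logic runs inside rb-api, resolve_weights is a hypothetical name, and the sketch assumes the server-default table defines which monitor names are known.

```python
from typing import Optional


def resolve_weights(
    override: Optional[dict[str, float]],
    profile_weights: dict[str, float],
    server_defaults: dict[str, float],
) -> dict[str, float]:
    """Resolve each monitor's weight: override, then profile, then server default."""
    known = set(server_defaults)  # assumption: defaults enumerate the known monitors
    resolved = dict(server_defaults)
    # Profile values replace defaults for the monitors the profile covers.
    resolved.update({name: w for name, w in profile_weights.items() if name in known})
    for name, weight in (override or {}).items():
        if name not in known:
            continue  # unknown monitor names are dropped
        resolved[name] = max(0.0, weight)  # negative weights are clamped to 0
    return resolved
```

A partial override therefore changes only the monitors it names and leaves the rest at their profile or default values.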
