ReasonBlocksConfig is a dataclass that captures every middleware capability and its tunables. build_middleware(config, api, ...) consumes the config and returns an ordered middleware list ready to pass to create_agent(middleware=...).
ReasonBlocksConfig + build_middleware() is an alternate path, parallel to rb.middleware(). They share zero state. Use one or the other in a given run — don’t mix them.
  • rb.middleware() — the simpler path. One ReasonBlocks instance, one middleware per run. Wires telemetry, FSM, monitor steering, E1/E2/E3, and model routing all in one object.
  • build_middleware(config, api, ...) — the explicit path. You pass an api, a score_fn, an fsm, a state_manager, and get back an ordered list. Useful when you want to add GeneralMonitorMiddleware or compose middleware around your own custom scorer.

build_middleware ordering

build_middleware returns the list in this order:
  1. ReasonBlocksMiddleware (only when at least one of enable_e1, enable_e2, enable_e3, enable_monitor_steering is True).
  2. GeneralMonitorMiddleware (when enable_general_monitor=True).
  3. TokenSavingMiddleware last (when enable_token_saving=True), so it compresses what earlier middleware injected before the LLM call goes out.
When ReasonBlocksMiddleware is included, you must pass score_fn, fsm, and state_manager; build_middleware raises ValueError otherwise.
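The ordering and validation rules above can be modeled in a few lines. This is a minimal sketch, not the SDK's implementation: `Flags` and `plan_middleware` are hypothetical names, and the function returns class names as strings rather than real middleware objects.

```python
from dataclasses import dataclass


@dataclass
class Flags:
    """Hypothetical stand-in for the toggles on ReasonBlocksConfig."""
    enable_e1: bool = True
    enable_e2: bool = True
    enable_e3: bool = True
    enable_monitor_steering: bool = True
    enable_general_monitor: bool = False
    enable_token_saving: bool = True


def plan_middleware(flags, score_fn=None, fsm=None, state_manager=None):
    """Return middleware class names in the documented order."""
    plan = []
    needs_rb = (flags.enable_e1 or flags.enable_e2
                or flags.enable_e3 or flags.enable_monitor_steering)
    if needs_rb:
        # Documented rule: all three are required with ReasonBlocksMiddleware.
        if score_fn is None or fsm is None or state_manager is None:
            raise ValueError("score_fn, fsm, and state_manager are required")
        plan.append("ReasonBlocksMiddleware")
    if flags.enable_general_monitor:
        plan.append("GeneralMonitorMiddleware")
    if flags.enable_token_saving:
        plan.append("TokenSavingMiddleware")  # always last
    return plan
```

With every E-tier and monitor steering turned off, the sketch returns only `["TokenSavingMiddleware"]` and needs no scorer, FSM, or state manager.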

Complete example

from reasonblocks import (
    ReasonBlocksAPI, ReasonBlocksConfig, build_middleware,
)
from reasonblocks.fsm import DifficultyFSM
from reasonblocks.state import TraceStateManager
from reasonblocks.client import ReasonBlocks

api = ReasonBlocksAPI(api_key="rb_live_...")

config = ReasonBlocksConfig(
    # Tier toggles
    enable_e1=True,
    enable_e2=True,
    enable_e3=True,
    enable_monitor_steering=True,
    enable_general_monitor=False,
    enable_token_saving=True,

    # Token-saving
    ts_compress_threshold_chars=1800,
    ts_keep_recent_tool_messages=2,
    ts_enable_early_exit=True,

    # Routing / scoping
    monitor_task_profile="coding",
)

middleware = build_middleware(
    config, api,
    score_fn=ReasonBlocks.score_step,
    fsm=DifficultyFSM(),
    state_manager=TraceStateManager(),
    run_id="my-run-1",
    run_metadata={"agent_name": "bugfixer"},
)

# create_agent must be imported separately; it is not among the reasonblocks imports above.
agent = create_agent(model=..., tools=..., middleware=middleware)

Tier toggles

enable_e1
bool
default:"True"
Enable customer-scoped instance-level pattern injection. Capped at one retrieval call per run.
enable_e2
bool
default:"True"
Enable commons-pattern injection keyed on the failure family detected by monitors. Capped at one retrieval call per run.
enable_e3
bool
default:"True"
Enable universal standing-rule injection. Fires on the first call only (step 0); subsequent steps skip E3.
enable_monitor_steering
bool
default:"True"
Enable server-side trajectory monitor evaluation via POST /monitors/evaluate. When fired, monitor results steer which E-trace tier retrieves and what gets injected into the system message. Per-run injection cap: 5. Cooldowns by FSM state: SLOW/SKIP=2 steps, NORMAL=3, FAST=5.
enable_general_monitor
bool
default:"False"
Enable the v1 rule-firing pack (GeneralMonitorMiddleware). Five rule-based detectors with priority-ordered dispatch and per-rule cooldown. When enabled, it’s inserted between ReasonBlocksMiddleware and TokenSavingMiddleware.
enable_token_saving
bool
default:"True"
Enable TokenSavingMiddleware. Appended last so it compresses whatever earlier middleware injected.
enable_distillation
bool
default:"True"
Reserved. Currently a no-op in build_middleware() — set or unset, the SDK behaves the same. Documented for forward compatibility.
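The per-run limits described for the toggles above (the injection cap of 5, the per-state cooldowns, and E3 firing only at step 0) can be illustrated with a small tracker. This is a sketch under stated assumptions: `SteeringBudget` and `e3_fires` are hypothetical names, and it assumes a cooldown of N means at least N steps must separate two injections. The SDK's exact bookkeeping may differ.

```python
# Cooldowns (in steps) by FSM state, as listed for enable_monitor_steering.
COOLDOWN_BY_STATE = {"SLOW": 2, "SKIP": 2, "NORMAL": 3, "FAST": 5}
MAX_INJECTIONS_PER_RUN = 5


class SteeringBudget:
    """Hypothetical tracker for the monitor-steering cap and cooldown."""

    def __init__(self):
        self.injections = 0
        self.last_step = None  # step index of the most recent injection

    def may_inject(self, step, fsm_state):
        if self.injections >= MAX_INJECTIONS_PER_RUN:
            return False  # per-run cap reached
        if self.last_step is None:
            return True  # nothing injected yet
        # Assumption: cooldown N = at least N steps between injections.
        return step - self.last_step >= COOLDOWN_BY_STATE[fsm_state]

    def record(self, step):
        self.injections += 1
        self.last_step = step


def e3_fires(step):
    """E3 standing rules are injected on the first call only (step 0)."""
    return step == 0
```

After an injection at step 0, a run in the NORMAL state becomes eligible again at step 3, while a run in the SLOW state is eligible at step 2.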

Tier parameters

e1_top_k
int
default:"1"
E1 retrieval top-k. The middleware caps E1 at one retrieval per run regardless. Server-side gates further filter results.
e2_top_k
int
default:"2"
E2 retrieval top-k.
e3_max_patterns
int
default:"32"
Maximum patterns retrieved on the E3 first-call sweep.
e1_sim_gate_doc
float
default:"0.80"
Documentation only. The actual gate is enforced server-side by rb-api as an env-global. Setting this in the config does not change server behavior; it’s recorded so callers can see the value rb-api is configured with.
e2_sim_gate_doc
float
default:"0.70"
Documentation only. Same caveat as e1_sim_gate_doc.

Token-saving levers

ts_compress_threshold_chars
int
default:"1800"
Character length above which a ToolMessage body is eligible for head+tail truncation.
ts_head_keep_chars
int
default:"900"
Characters kept from the start of a compressed tool message.
ts_tail_keep_chars
int
default:"700"
Characters kept from the end of a compressed tool message.
ts_keep_recent_tool_messages
int
default:"2"
Number of most-recent ToolMessage instances exempted from compression.
ts_enable_compression
bool
default:"True"
Master toggle for head+tail tool-output compression.
ts_enable_early_exit
bool
default:"True"
Enable the early-exit nudge that suggests wrapping up once the trajectory looks finished. Supply a custom trajectory scorer via ts_signals_fn to drive it; when omitted it uses the middleware’s built-in heuristics.
ts_early_exit_min_call_index
int
default:"40"
Minimum number of model calls before the early-exit nudge can fire.
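To make the compression levers above concrete, here is a minimal sketch of head+tail truncation with a keep-recent exemption. It is an illustration, not TokenSavingMiddleware's code: the function names and the omission-marker text are invented for the example, and it works on plain strings rather than ToolMessage objects.

```python
def compress_body(body, threshold=1800, head=900, tail=700):
    """Head+tail truncation for a single tool-message body."""
    if len(body) <= threshold:
        return body  # at or under the threshold: left untouched
    omitted = len(body) - head - tail
    # The marker text below is invented for this sketch.
    marker = f"\n[... {omitted} chars omitted ...]\n"
    return body[:head] + marker + body[len(body) - tail:]


def compress_tool_outputs(bodies, keep_recent=2, **kwargs):
    """Compress all but the `keep_recent` most recent bodies.

    `bodies` is ordered oldest first, so the exempt ones sit at the end.
    """
    cutoff = max(len(bodies) - keep_recent, 0)
    return [
        compress_body(b, **kwargs) if i < cutoff else b
        for i, b in enumerate(bodies)
    ]
```

With the defaults, a 2,600-character body keeps its first 900 and last 700 characters, and the two newest tool outputs are passed through unchanged regardless of length.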

Perplexity compression

Perplexity compression is off by default. See TokenSavingMiddleware for context.
ts_enable_perplexity_compression
bool
default:"False"
Enable LLMLingua-2-style word-level keep/drop on stale messages. Requires ts_perplexity_classifier; silently skipped if it’s None.
ts_perplexity_classifier
WordClassifier | None
default:"None"
A Callable[[list[str]], list[bool]]. Use reasonblocks.token_saving.make_anthropic_classifier for the shipped Haiku-as-classifier wiring.
ts_perplexity_recent_cutoff
int
default:"3"
Messages within the last N model calls keep full fidelity.
ts_perplexity_mid_cutoff
int
default:"10"
Messages between ts_perplexity_recent_cutoff and this cutoff use the mid keep ratio. Older messages use the old keep ratio.
ts_perplexity_keep_ratio_mid
float
default:"0.55"
Target word-keep ratio for mid-tier messages.
ts_perplexity_keep_ratio_old
float
default:"0.30"
Target word-keep ratio for old-tier messages.
ts_perplexity_window_words
int
default:"50"
Words per classifier window. Smaller = more API calls, finer decisions.
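The following sketch shows the shape of a `WordClassifier` and how the cutoffs and window size above might interact. It rests on several assumptions: `toy_classifier`, `keep_ratio_for_age`, and `classify_in_windows` are hypothetical names, the tier boundaries are treated as exclusive upper bounds, and the stopword rule stands in for a real classifier. How the SDK enforces the target keep ratio against classifier output is not specified here, so the sketch only selects the ratio.

```python
from typing import Callable, List

# Same shape as the documented type: words in, keep/drop flags out.
WordClassifier = Callable[[List[str]], List[bool]]

STOPWORDS = {"the", "a", "an", "of", "to", "and", "is"}


def toy_classifier(words: List[str]) -> List[bool]:
    """Toy keep/drop rule: keep every word that is not a stopword."""
    return [w.lower() not in STOPWORDS for w in words]


def keep_ratio_for_age(age, recent_cutoff=3, mid_cutoff=10,
                       ratio_mid=0.55, ratio_old=0.30):
    """Target keep ratio for a message that is `age` model calls old.

    Assumption: age < recent_cutoff is "recent", age < mid_cutoff is "mid".
    """
    if age < recent_cutoff:
        return 1.0  # recent messages keep full fidelity
    if age < mid_cutoff:
        return ratio_mid
    return ratio_old


def classify_in_windows(text, classifier: WordClassifier, window_words=50):
    """Call the classifier once per window and keep the flagged words."""
    words = text.split()
    kept = []
    for start in range(0, len(words), window_words):
        window = words[start:start + window_words]
        flags = classifier(window)
        kept.extend(w for w, keep in zip(window, flags) if keep)
    return " ".join(kept)
```

A 120-word message with the default window of 50 words results in three classifier calls in this sketch, which is why a smaller window raises API usage.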

General monitor levers

Apply only when enable_general_monitor=True.
gm_max_tool_calls
int
default:"30"
Reference budget used by sprawl, idle-response, and low-novelty-tail detectors.
gm_cooldown
int
default:"8"
Minimum model calls between firings of the same rule.
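Priority-ordered dispatch with a per-rule cooldown, as described for the general monitor, can be sketched as follows. `RuleDispatcher` is a hypothetical class, the rule predicates are placeholders, and the sketch assumes at most one rule fires per model call. None of this is confirmed as GeneralMonitorMiddleware's actual behavior.

```python
class RuleDispatcher:
    """Hypothetical priority-ordered dispatcher with a per-rule cooldown."""

    def __init__(self, rules, cooldown=8):
        # `rules` is a list of (name, predicate) pairs, highest priority first.
        self.rules = rules
        self.cooldown = cooldown
        self.last_fired = {}  # rule name -> call index of its last firing

    def dispatch(self, call_index, state):
        """Return the first eligible rule whose predicate matches, else None."""
        for name, predicate in self.rules:
            last = self.last_fired.get(name)
            if last is not None and call_index - last < self.cooldown:
                continue  # this rule is still cooling down
            if predicate(state):
                self.last_fired[name] = call_index
                return name
        return None


# Placeholder rules; the tool-call comparison is an assumed use of gm_max_tool_calls.
dispatcher = RuleDispatcher(
    [
        ("sprawl", lambda s: s["tool_calls"] > 30),
        ("idle", lambda s: s["idle"]),
    ],
    cooldown=8,
)
```

Because the cooldown is tracked per rule, a lower-priority rule can still fire on the call right after a higher-priority one has gone into cooldown.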

Routing and scoping

E1 retrieval is scoped to your organization automatically — rb-api derives the scope from the API key’s principal. There is no customer_id field on ReasonBlocksConfig or the ReasonBlocks constructor.
monitor_task_profile
str
default:"coding"
Server-side monitor weight profile. Real values referenced in the SDK: "coding" (default), "pr_review". See Monitor profiles for the per-profile weight tables (which live in rb-api, not the SDK).
monitor_weights
dict[str, float] | None
default:"None"
Optional explicit weight override forwarded to MonitorSteeringInjection and through to POST /monitors/evaluate. Partial dicts are accepted — unspecified monitors fall through to the profile, then to server defaults. Unknown monitor names are dropped server-side; negative weights are clamped to 0.
config = ReasonBlocksConfig(
    monitor_task_profile="coding",
    monitor_weights={"semantic_loop": 0.5, "verification_skip": 0.05},
)
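The fallback chain for monitor_weights (explicit override, then profile, then server defaults) happens server-side, but its effect can be modeled locally. This is a sketch under stated assumptions: `resolve_weights` is a hypothetical function, "monitor_c" is a made-up monitor name, and every numeric weight in the usage note is illustrative rather than taken from rb-api.

```python
def resolve_weights(override, profile, server_defaults):
    """Merge weights: explicit override, then profile, then server defaults."""
    resolved = {}
    # Iterating over known monitors drops unknown names automatically.
    for name, default in server_defaults.items():
        value = default
        if name in profile:
            value = profile[name]
        if override and name in override:
            value = override[name]
        resolved[name] = max(value, 0.0)  # negative weights clamp to 0
    return resolved
```

For example, with server defaults `{"semantic_loop": 0.2, "verification_skip": 0.1, "monitor_c": 0.3}` and a profile that sets only `semantic_loop`, an override of `{"semantic_loop": 0.5, "verification_skip": -1.0, "made_up": 9.0}` resolves `semantic_loop` from the override, clamps `verification_skip` to 0, keeps the default for `monitor_c`, and ignores `made_up`.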