ReasonBlocks tracks each step’s difficulty through a finite state machine. By passing model_routing to the ReasonBlocks constructor, you can map any of the four post-INIT states (FAST, NORMAL, SLOW, SKIP) to a model identifier. The middleware swaps the model on the active ModelRequest inside wrap_model_call, before pattern injections render — so format-routing pattern selection sees the post-routing model family.
Model routing is opt-in. Without model_routing, the agent’s configured model runs every step.
How states map to routing
| State | Trigger | Routing |
|---|
INIT | First call only | No routing override |
FAST | All scores in fast_window (default 6) below fast_threshold (default 0.2) | Routed if mapped; also skips E1/E2/E3 retrieval |
NORMAL | Default working state | Routed if mapped |
SLOW | All scores in slow_window (default 5) above slow_threshold (default 0.6) | Routed if mapped |
SKIP | All scores in skip_window (default 35) above skip_threshold (default 0.85) | Routed if mapped |
FAST is the only state that affects E-trace retrieval — it skips E1, E2, and E3 because the agent is sailing through. Monitor scoring still runs in FAST so loop detection stays active. Any state can be mapped or unmapped independently; unmapped states leave the agent’s configured model untouched.
Keys must match FSMState names: "FAST", "NORMAL", "SLOW", "SKIP". Values follow LangChain’s init_chat_model format ("provider:model-name").
from reasonblocks import ReasonBlocks
rb = ReasonBlocks(
api_key="rb_live_...",
model_routing={
"FAST": "anthropic:claude-haiku-4-5-20251001",
"NORMAL": "anthropic:claude-haiku-4-5-20251001",
"SLOW": "anthropic:claude-sonnet-4-6",
"SKIP": "anthropic:claude-sonnet-4-6",
},
)
You don’t have to map every state. Omit any you don’t want to override:
# Only escalate when struggling — every other state uses the agent's default model
rb = ReasonBlocks(
api_key="rb_live_...",
model_routing={
"SLOW": "anthropic:claude-sonnet-4-6",
},
)
The middleware caches resolved chat models per id (init_chat_model is called once per unique id and reused).
Inspect routing decisions
StepLogEntry.model_id holds the resolved model id when routing fired on that step. Steps with no routing override leave it empty.
mw = rb.middleware(agent_name="bugfixer")
agent = create_agent(..., middleware=[mw])
with mw:
result = agent.invoke({"messages": [("user", "Fix the bug.")]})
for entry in mw.step_log:
routed = f" -> {entry.model_id}" if entry.model_id else ""
print(f"step {entry.step}: {entry.fsm_state}{routed} difficulty={entry.difficulty}")
Sample output:
step 0: INIT (first call)
step 1: NORMAL difficulty=0.512
step 2: NORMAL difficulty=0.489
step 3: NORMAL difficulty=0.141
step 4: NORMAL difficulty=0.098
step 5: NORMAL difficulty=0.077
step 6: NORMAL difficulty=0.061
step 7: FAST -> anthropic:claude-haiku-4-5-20251001 difficulty=0.055
step 8: FAST -> anthropic:claude-haiku-4-5-20251001 difficulty=0.043
step 9: NORMAL difficulty=0.731
step 10: SLOW -> anthropic:claude-sonnet-4-6 difficulty=0.812
Tune the FSM thresholds
Pass fsm_thresholds to adjust when routing transitions happen. See Advanced configuration for the full set of knobs and the hysteresis rule that prevents thrashing at boundaries.
rb = ReasonBlocks(
api_key="rb_live_...",
model_routing={
"FAST": "anthropic:claude-haiku-4-5-20251001",
"SLOW": "anthropic:claude-sonnet-4-6",
},
fsm_thresholds={
"fast_threshold": 0.15, # require very-easy steps to drop to FAST
"slow_threshold": 0.65, # require clearly-hard steps to escalate
"fast_window": 4, # react sooner to easy runs
"slow_window": 3, # react sooner to struggling runs
},
)
Lowering fast_threshold keeps the agent on the cheap model longer. Raising slow_threshold delays escalation. Either change reduces cost; both can affect quality on genuinely hard work.