ReasonBlocks tracks each step’s difficulty through a finite state machine. By passing model_routing to the ReasonBlocks constructor, you can map any of the four post-INIT states (FAST, NORMAL, SLOW, SKIP) to a model identifier. The middleware swaps the model on the active ModelRequest inside wrap_model_call, before pattern injections render — so format-routing pattern selection sees the post-routing model family.
Model routing is opt-in. Without model_routing, the agent’s configured model runs every step.

How states map to routing

| State | Trigger | Routing |
| --- | --- | --- |
| INIT | First call only | No routing override |
| FAST | All scores in fast_window (default 6) below fast_threshold (default 0.2) | Routed if mapped; also skips E1/E2/E3 retrieval |
| NORMAL | Default working state | Routed if mapped |
| SLOW | All scores in slow_window (default 5) above slow_threshold (default 0.6) | Routed if mapped |
| SKIP | All scores in skip_window (default 35) above skip_threshold (default 0.85) | Routed if mapped |
FAST is the only state that affects E-trace retrieval — it skips E1, E2, and E3 because the agent is sailing through. Monitor scoring still runs in FAST so loop detection stays active. Any state can be mapped or unmapped independently; unmapped states leave the agent’s configured model untouched.
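The trigger rules in the table can be sketched as a small function over the score history. This is an illustration only, not the ReasonBlocks implementation: `classify` is a hypothetical helper, it ignores the hysteresis rule, and two details are assumptions (a window only counts once it is full, and the most severe matching state wins).

```python
from typing import Callable, Sequence


def classify(
    scores: Sequence[float],
    fast_window: int = 6, fast_threshold: float = 0.2,
    slow_window: int = 5, slow_threshold: float = 0.6,
    skip_window: int = 35, skip_threshold: float = 0.85,
) -> str:
    """Return the state implied by the difficulty scores seen so far."""

    def window_matches(window: int, predicate: Callable[[float], bool]) -> bool:
        # Assumption: a rule cannot fire until its window is full.
        if len(scores) < window:
            return False
        return all(predicate(s) for s in scores[-window:])

    if not scores:
        return "INIT"  # first call: nothing has been scored yet
    # Assumption: when several rules match, the most severe state wins.
    if window_matches(skip_window, lambda s: s > skip_threshold):
        return "SKIP"
    if window_matches(slow_window, lambda s: s > slow_threshold):
        return "SLOW"
    if window_matches(fast_window, lambda s: s < fast_threshold):
        return "FAST"
    return "NORMAL"


print(classify([0.1] * 6))  # six easy steps fill the default fast_window
print(classify([0.1] * 5))  # one short of the window
print(classify([0.7] * 5))  # five hard steps fill the default slow_window
```

With the default values from the table, the three calls yield FAST, NORMAL, and SLOW respectively.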

Configure routing

Keys must match FSMState names: "FAST", "NORMAL", "SLOW", "SKIP". Values follow LangChain’s init_chat_model format ("provider:model-name").
from reasonblocks import ReasonBlocks

rb = ReasonBlocks(
    api_key="rb_live_...",
    model_routing={
        "FAST":   "anthropic:claude-haiku-4-5-20251001",
        "NORMAL": "anthropic:claude-haiku-4-5-20251001",
        "SLOW":   "anthropic:claude-sonnet-4-6",
        "SKIP":   "anthropic:claude-sonnet-4-6",
    },
)
You don’t have to map every state. Omit any you don’t want to override:
# Only escalate when struggling — every other state uses the agent's default model
rb = ReasonBlocks(
    api_key="rb_live_...",
    model_routing={
        "SLOW": "anthropic:claude-sonnet-4-6",
    },
)
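The lookup rule behind these configurations (a mapped state overrides the model, an unmapped state falls through to the agent's configured model) can be sketched as follows. `validate_routing` and `pick_model_id` are hypothetical helpers written for this page, not part of the ReasonBlocks API, and the default model id in the usage lines is only a placeholder.

```python
from typing import Mapping

VALID_STATES = {"FAST", "NORMAL", "SLOW", "SKIP"}


def validate_routing(model_routing: Mapping[str, str]) -> None:
    """Reject unknown state names and ids that are not 'provider:model-name'."""
    for state, model_id in model_routing.items():
        if state not in VALID_STATES:
            raise ValueError(f"unknown state {state!r}; expected one of {sorted(VALID_STATES)}")
        provider, sep, name = model_id.partition(":")
        if not (provider and sep and name):
            raise ValueError(f"{model_id!r} is not in 'provider:model-name' form")


def pick_model_id(state: str, model_routing: Mapping[str, str], default_model_id: str) -> str:
    """Use the mapped id if the state is mapped, otherwise keep the default."""
    return model_routing.get(state, default_model_id)


routing = {"SLOW": "anthropic:claude-sonnet-4-6"}
validate_routing(routing)
default = "anthropic:claude-haiku-4-5-20251001"  # placeholder for the agent's configured model
print(pick_model_id("SLOW", routing, default))    # mapped: override applies
print(pick_model_id("NORMAL", routing, default))  # unmapped: default is kept
```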
The middleware caches resolved chat models per id (init_chat_model is called once per unique id and reused).
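A per-id cache of this kind can be sketched with `functools.lru_cache`. This is not the middleware's actual code: `build_chat_model` is a stand-in for `init_chat_model` so the example runs offline, and `get_chat_model` is a hypothetical name.

```python
from functools import lru_cache
from typing import Dict, List

calls: List[str] = []  # records every time the (stand-in) factory really runs


def build_chat_model(model_id: str) -> Dict[str, str]:
    """Stand-in for init_chat_model; returns a plain dict instead of a chat model."""
    calls.append(model_id)
    provider, _, name = model_id.partition(":")
    return {"provider": provider, "name": name}


@lru_cache(maxsize=None)
def get_chat_model(model_id: str) -> Dict[str, str]:
    # The cache key is the model id, so each unique id is built only once.
    return build_chat_model(model_id)


for model_id in [
    "anthropic:claude-haiku-4-5-20251001",
    "anthropic:claude-sonnet-4-6",
    "anthropic:claude-haiku-4-5-20251001",  # repeat: served from the cache
]:
    get_chat_model(model_id)

print(calls)  # two entries, one per unique id
```

Mapping FAST and NORMAL to the same id, as in the first configuration above, would therefore cost a single model construction under this scheme.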

Inspect routing decisions

StepLogEntry.model_id holds the resolved model id when routing fired on that step. Steps with no routing override leave it empty.
from langchain.agents import create_agent

# `rb` is the ReasonBlocks instance configured above
mw = rb.middleware(agent_name="bugfixer")
agent = create_agent(..., middleware=[mw])

with mw:
    result = agent.invoke({"messages": [("user", "Fix the bug.")]})

for entry in mw.step_log:
    routed = f" -> {entry.model_id}" if entry.model_id else ""
    print(f"step {entry.step}: {entry.fsm_state}{routed} difficulty={entry.difficulty}")
Sample output:
step 0: INIT (first call)
step 1: NORMAL difficulty=0.512
step 2: NORMAL difficulty=0.489
step 3: NORMAL difficulty=0.141
step 4: NORMAL difficulty=0.098
step 5: NORMAL difficulty=0.077
step 6: NORMAL difficulty=0.061
step 7: FAST -> anthropic:claude-haiku-4-5-20251001 difficulty=0.055
step 8: FAST -> anthropic:claude-haiku-4-5-20251001 difficulty=0.043
step 9: NORMAL difficulty=0.731
step 10: SLOW -> anthropic:claude-sonnet-4-6 difficulty=0.812
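To see how a run was split across models, the step log can be tallied by `model_id`. The sketch below uses a hypothetical `Entry` dataclass that mirrors only the four fields printed above, so it runs without ReasonBlocks installed; the rows are copied from the sample output.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for StepLogEntry with just the fields used here."""
    step: int
    fsm_state: str
    model_id: Optional[str]
    difficulty: Optional[float]


def routing_summary(entries: Iterable[Entry]) -> Counter:
    """Count steps per model; steps with no override are grouped under one label."""
    return Counter(e.model_id or "(agent default)" for e in entries)


log = [
    Entry(6, "NORMAL", None, 0.061),
    Entry(7, "FAST", "anthropic:claude-haiku-4-5-20251001", 0.055),
    Entry(8, "FAST", "anthropic:claude-haiku-4-5-20251001", 0.043),
    Entry(9, "NORMAL", None, 0.731),
    Entry(10, "SLOW", "anthropic:claude-sonnet-4-6", 0.812),
]

summary = routing_summary(log)
for model, count in summary.most_common():
    print(f"{model}: {count} step(s)")
```

With a real run, passing `mw.step_log` instead of `log` should work as long as the entries expose a `model_id` attribute, which is the only field the function reads.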

Tune the FSM thresholds

Pass fsm_thresholds to adjust when routing transitions happen. See Advanced configuration for the full set of knobs and the hysteresis rule that prevents thrashing at boundaries.
rb = ReasonBlocks(
    api_key="rb_live_...",
    model_routing={
        "FAST": "anthropic:claude-haiku-4-5-20251001",
        "SLOW": "anthropic:claude-sonnet-4-6",
    },
    fsm_thresholds={
        "fast_threshold": 0.15,   # require very-easy steps to drop to FAST
        "slow_threshold": 0.65,   # require clearly-hard steps to escalate
        "fast_window":    4,      # react sooner to easy runs
        "slow_window":    3,      # react sooner to struggling runs
    },
)
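Raising fast_threshold lets more steps qualify as FAST, which keeps the agent on the cheap model longer; lowering it, as in the example above, makes FAST harder to enter. Raising slow_threshold delays escalation to the larger model. Shorter windows make both transitions react sooner. Settings that favor the cheap model reduce cost, but they can affect quality on genuinely hard work.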
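When reasoning about a partial fsm_thresholds dict, it can help to write out the full set of values in play. The sketch below assumes that keys you omit keep the defaults listed in the state table; the source does not state this explicitly, and `effective_thresholds` is a hypothetical helper, not a ReasonBlocks function.

```python
from typing import Dict, Mapping, Union

Number = Union[int, float]

# Defaults as listed in the state table on this page.
DEFAULT_THRESHOLDS: Dict[str, Number] = {
    "fast_window": 6, "fast_threshold": 0.2,
    "slow_window": 5, "slow_threshold": 0.6,
    "skip_window": 35, "skip_threshold": 0.85,
}


def effective_thresholds(overrides: Mapping[str, Number]) -> Dict[str, Number]:
    """Merge overrides onto the defaults, rejecting misspelled keys."""
    unknown = set(overrides) - set(DEFAULT_THRESHOLDS)
    if unknown:
        raise KeyError(f"unknown threshold keys: {sorted(unknown)}")
    # Assumption: omitted keys fall back to their defaults.
    return {**DEFAULT_THRESHOLDS, **overrides}


merged = effective_thresholds({
    "fast_threshold": 0.15,
    "slow_threshold": 0.65,
    "fast_window": 4,
    "slow_window": 3,
})
print(merged)
```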