E-traces are the patterns ReasonBlocks pulls from the rb-api pattern store and appends to the system message as steering. There are three tiers, each with its own scope, retrieval logic, and per-run cap. Understanding which tier fires when explains exactly what guidance your agent is receiving on any given step.

The three tiers

Customer-scoped patterns retrieved by semantic similarity, gated by monitor signal, capped at one firing per run.

E1Injection builds its query from the last 3 thoughts in the trace and asks rb-api for level="e1", top_k=1. Retrieval is scoped to your organization automatically: rb-api derives the scope from the API key’s principal, so there is no client-side customer_id parameter. Server-side, rb-api enforces the E1 similarity threshold (0.8) plus any per-pattern gate_threshold; these gates are not client-tunable.

Client-side, E1 retrieval is gated by recent monitor history. The query is only issued when:
current_eval.fired is non-empty
OR current_eval.composite > 0.15
OR any monitor fired on either of the previous 2 steps
If no monitor has ever run in this session (no MonitorSteeringInjection configured), E1 is allowed unconditionally for backward compatibility.

max_calls=1 means E1 fires at most once per run, regardless of how many subsequent steps satisfy the gate.
FAST state skips the full E-trace pipeline, so E1 is also skipped while the agent is sailing through easy steps.
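
The query construction and per-run cap described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's code: `E1Budget` and `build_e1_request` are hypothetical names, and joining the thoughts with newlines is a guess at how the query text is assembled. Only the 3-thought window, level="e1", top_k=1, and max_calls=1 come from the description.

```python
from dataclasses import dataclass


@dataclass
class E1Budget:
    """Hypothetical per-run counter mirroring max_calls=1."""
    max_calls: int = 1
    calls: int = 0

    def try_consume(self) -> bool:
        # Allow a firing only while the per-run budget is not exhausted.
        if self.calls >= self.max_calls:
            return False
        self.calls += 1
        return True


def build_e1_request(thoughts: list[str]) -> dict:
    # Query text is built from the last 3 thoughts (newline join is an assumption).
    query = "\n".join(thoughts[-3:])
    # No customer_id field: rb-api derives the organization scope from the API key.
    return {"query": query, "level": "e1", "top_k": 1}
```

With a budget of one, a second call to `try_consume()` in the same run returns False, matching the rule that later steps cannot trigger E1 again even if the gate is satisfied.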

Summary

| Tier | Scope | Top-k | Server-side similarity gate | Per-run cap | Fires when |
| --- | --- | --- | --- | --- | --- |
| E1 | Instance (customer-scoped) | 1 | 0.8 | 1 | Monitor fired, composite > 0.15, or fired in last 2 steps. Never in FAST. |
| E2 | Failure-mode (shared) | 2 | 0.7 | 1 | Any scored step outside FAST. Narrowed by failure_type. |
| E3 | Universal | 32 | none | 1 | First call only. |

How injections reach the system message

Pending injections are rendered and joined with blank lines, then appended to the system message as a separate content block:
[REASONBLOCKS]
<rendered guidance text>
The base system prompt is wrapped in its own content block with cache_control: {"type": "ephemeral"} so Anthropic’s prompt cache hits across every step. The [REASONBLOCKS] trailer rides uncached so its varying content does not bust the base cache key.
cache_control is ignored by non-Anthropic providers, so it is a safe no-op on OpenAI or Gemini models.
On steps where nothing fires (a FAST-state step with no pending injections), the [REASONBLOCKS] block is omitted entirely. The base system prompt is still wrapped with cache_control unconditionally so every step benefits from cache hits.
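
The assembly described in this section can be sketched as a small helper. This is a minimal sketch, assuming the system message is passed as a list of Anthropic-style text content blocks; `build_system_blocks` is a hypothetical name and not part of the ReasonBlocks SDK.

```python
def build_system_blocks(base_prompt: str, injections: list[str]) -> list[dict]:
    # The base prompt always gets its own block, marked cacheable,
    # so the cached prefix stays identical from step to step.
    blocks = [
        {
            "type": "text",
            "text": base_prompt,
            "cache_control": {"type": "ephemeral"},
        }
    ]
    if injections:
        # Rendered injections are joined with blank lines and appended
        # as a separate, uncached trailer block.
        guidance = "\n\n".join(injections)
        blocks.append({"type": "text", "text": f"[REASONBLOCKS]\n{guidance}"})
    # With no pending injections the trailer is omitted entirely.
    return blocks
```

Keeping the varying guidance in a trailing block means the cacheable prefix never changes, which is what lets the base prompt hit the cache regardless of what fires on a given step.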

The E1 retrieval gate, in detail

The gate exists to keep vector search off of healthy steps. Pattern retrieval adds latency, and most steps in a well-functioning agent do not need instance-level steering. The two-step lookback ensures continuity: a single monitor firing keeps the gate open for the next two steps even if the score dips briefly. The check uses the last entry in _monitor_eval_history (the current call) plus the two prior entries:
allow E1 if:
  current_eval.fired is non-empty
  OR current_eval.composite > 0.15
  OR any monitor fired on either of the previous 2 steps
The 0.15 threshold is the only value in this gate that lives on the client, and it is currently held in a private attribute (_e1_gate_composite_threshold) on the middleware rather than exposed as a public setting. The 0.8 (E1) and 0.7 (E2) similarity thresholds are enforced server-side and are not exposed as SDK parameters.
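
The gate condition can be written out as a small function. This is a sketch under assumptions: `MonitorEval` and `e1_gate_open` are hypothetical names, and the history list stands in for _monitor_eval_history with the current call as its last entry. An empty history models a session in which no monitor has ever run.

```python
from dataclasses import dataclass, field


@dataclass
class MonitorEval:
    """Hypothetical stand-in for one monitor evaluation."""
    fired: list[str] = field(default_factory=list)
    composite: float = 0.0


def e1_gate_open(history: list[MonitorEval], composite_threshold: float = 0.15) -> bool:
    if not history:
        # No monitor has ever run: allow E1 unconditionally (backward compatibility).
        return True
    current = history[-1]
    if current.fired or current.composite > composite_threshold:
        return True
    # Two-step lookback: the two entries before the current one.
    previous = history[-3:-1]
    return any(entry.fired for entry in previous)
```

Note that the comparison is strict, so a composite of exactly 0.15 does not open the gate on its own, and a firing that is three or more steps old no longer counts.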

What you can configure today

  • Disable all E-traces via e_traces_enabled=False on the constructor. E1, E2, and E3 are then never registered; the monitor steering injection still runs.
  • E1 retrieval is scoped to your organization automatically — rb-api derives it from your API key’s principal. There is no client-side customer_id parameter.
  • Influence E2’s metadata filter by the monitor-derived failure_type. This happens automatically when MonitorSteeringInjection returns a classification.
There is currently no public knob to change the E1 gate threshold, the E1/E2/E3 max_calls, or the server-side similarity thresholds.