The three tiers
- E1 — Instance-level
- E2 — Pattern-level
- E3 — Universal rules
Customer-scoped patterns retrieved by semantic similarity, gated by monitor signal, capped at one firing per run.

`E1Injection` builds its query from the last 3 thoughts in the trace and asks rb-api for `level="e1"`, `top_k=1`. Retrieval is scoped to your organization automatically: rb-api derives the scope from the API key’s principal, so there is no client-side `customer_id` parameter. Server-side, rb-api enforces the E1 similarity threshold (0.8) plus any per-pattern `gate_threshold`; these gates are not client-tunable.

Client-side, E1 retrieval is gated by recent monitor history. The query is only issued when at least one of the following holds:
- the monitor fired on the current step,
- the composite score is above 0.15, or
- the monitor fired in the last 2 steps.

If no monitor has ever run in this session (no `MonitorSteeringInjection` configured), E1 is allowed unconditionally for backward compatibility.

`max_calls=1` means E1 fires at most once per run, regardless of how many subsequent steps satisfy the gate.

The FAST state skips the full E-trace pipeline, so E1 is also skipped while the agent is sailing through easy steps.

Summary
| Tier | Scope | Top-k | Server-side similarity gate | Per-run cap | Fires when |
|---|---|---|---|---|---|
| E1 | Instance (customer-scoped) | 1 | 0.8 | 1 | Monitor fired, composite > 0.15, or fired in last 2 steps. Never in FAST. |
| E2 | Failure-mode (shared) | 2 | 0.7 | 1 | Any scored step outside FAST. Narrowed by failure_type. |
| E3 | Universal | 32 | none | 1 | First call only. |
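As a reading aid, the table can be expressed as data plus a per-run counter. This is a minimal sketch, not the library's implementation: `TierConfig`, `TIERS`, `RunCaps`, and `build_e1_query` are hypothetical names, and joining the thoughts with newlines is an assumption. Only the numbers come from the table and from the description of the E1 query.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TierConfig:
    """One row of the summary table (hypothetical structure)."""
    level: str
    top_k: int
    similarity_gate: Optional[float]  # enforced server-side; None means no gate
    max_calls: int


TIERS: Dict[str, TierConfig] = {
    "e1": TierConfig("e1", top_k=1, similarity_gate=0.8, max_calls=1),
    "e2": TierConfig("e2", top_k=2, similarity_gate=0.7, max_calls=1),
    "e3": TierConfig("e3", top_k=32, similarity_gate=None, max_calls=1),
}


@dataclass
class RunCaps:
    """Tracks how many times each tier has fired in the current run."""
    calls: Dict[str, int] = field(default_factory=dict)

    def try_fire(self, level: str) -> bool:
        # Refuse once the tier has used up its per-run budget.
        used = self.calls.get(level, 0)
        if used >= TIERS[level].max_calls:
            return False
        self.calls[level] = used + 1
        return True


def build_e1_query(thoughts: List[str]) -> str:
    # E1 queries on the last 3 thoughts; the newline join is an assumption.
    return "\n".join(thoughts[-3:])


caps = RunCaps()
first = caps.try_fire("e1")   # first firing in this run is allowed
second = caps.try_fire("e1")  # refused: max_calls=1 is already used
query = build_e1_query(["t1", "t2", "t3", "t4"])
```

The cap is tracked per run, so a fresh `RunCaps` would be created at the start of each run in this sketch.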
How injections reach the system message
Pending injections are rendered and joined with blank lines, then appended to the system message as a separate content block.

The base system prompt is wrapped with `cache_control: {"type": "ephemeral"}` so Anthropic’s prompt cache hits across every step. The [REASONBLOCKS] trailer rides uncached so its varying content does not bust the base cache key.

`cache_control` is ignored by non-Anthropic providers, so it is a safe no-op on OpenAI or Gemini models.

When there is nothing to inject (for example, a FAST-state step with no pending injections), the [REASONBLOCKS] block is omitted entirely. The base system prompt is still wrapped with `cache_control` unconditionally so every step benefits from cache hits.
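The layout described above can be sketched as a list of Anthropic-style system content blocks. This is an illustration under assumptions, not the library's code: `build_system_blocks` is a hypothetical helper, and the exact text of the [REASONBLOCKS] trailer (here a header line followed by the joined injections) is a guess.

```python
from typing import Any, Dict, List


def build_system_blocks(base_prompt: str, rendered: List[str]) -> List[Dict[str, Any]]:
    """Return system content blocks: cached base prompt plus optional trailer."""
    # The base prompt always carries cache_control so it can be cached.
    blocks: List[Dict[str, Any]] = [
        {
            "type": "text",
            "text": base_prompt,
            "cache_control": {"type": "ephemeral"},
        }
    ]
    if rendered:
        # Injections are joined with blank lines and ride in a separate,
        # uncached block so their varying content does not change the
        # cached prefix. The trailer format is an assumption.
        trailer = "[REASONBLOCKS]\n" + "\n\n".join(rendered)
        blocks.append({"type": "text", "text": trailer})
    return blocks


with_injections = build_system_blocks("You are a helpful agent.", ["hint A", "hint B"])
without_injections = build_system_blocks("You are a helpful agent.", [])
```

Keeping the varying text in its own trailing block means the first block is byte-identical on every step, which is what lets a prefix cache match.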
The E1 retrieval gate, in detail
The gate exists to keep vector search off healthy steps. Pattern retrieval adds latency, and most steps in a well-functioning agent do not need instance-level steering. The two-step lookback ensures continuity: a single monitor firing keeps the gate open for the next two steps even if the score dips briefly. The check uses the last entry in `_monitor_eval_history` (the current call) plus the two prior entries.
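A minimal sketch of that check follows. It is not the actual implementation: `MonitorEval`, `e1_gate_open`, and the field names `fired` and `composite` are hypothetical, and treating an empty history as "no monitor has ever run" is an assumption. The 0.15 threshold and the two-step lookback come from the summary table.

```python
from dataclasses import dataclass
from typing import List

E1_COMPOSITE_THRESHOLD = 0.15  # from the summary table
LOOKBACK_STEPS = 2             # prior entries checked besides the current one


@dataclass
class MonitorEval:
    """One monitor evaluation (hypothetical shape)."""
    fired: bool
    composite: float


def e1_gate_open(history: List[MonitorEval], fast_state: bool = False) -> bool:
    """Decide whether an E1 retrieval query may be issued on this step."""
    if fast_state:
        # FAST skips the whole E-trace pipeline, E1 included.
        return False
    if not history:
        # No monitor has run: allow E1 for backward compatibility.
        return True
    current = history[-1]
    if current.fired or current.composite > E1_COMPOSITE_THRESHOLD:
        return True
    # Lookback: a firing in either of the two prior entries keeps the gate open.
    prior = history[-(LOOKBACK_STEPS + 1):-1]
    return any(entry.fired for entry in prior)


quiet = MonitorEval(fired=False, composite=0.0)
fired = MonitorEval(fired=True, composite=0.0)
open_two_steps_later = e1_gate_open([fired, quiet, quiet])
closed_three_steps_later = e1_gate_open([fired, quiet, quiet, quiet])
```

In this sketch a firing stays visible for exactly two further steps, after which the gate closes again unless the current step qualifies on its own.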
What you can configure today
- Disable all E-traces via `e_traces_enabled=False` on the constructor. E1, E2, and E3 are then never registered; the monitor steering injection still runs.
- E1 retrieval is scoped to your organization automatically: rb-api derives it from your API key’s principal. There is no client-side `customer_id` parameter.
- Influence E2’s metadata filter by the monitor-derived `failure_type`. This happens automatically when `MonitorSteeringInjection` returns a classification.
There is currently no client-side setting for `max_calls` or the server-side similarity thresholds.
