Composition Rules
When you pass multiple middlewares â Agent(middleware=[m1, m2, m3]) â CubePi composes them according to per-hook rules that
differ on purpose. The right way to think about it is: each hook has
the composition rule that makes sense for its job, and you don't have
to remember "before" or "after" precedence guesses.
The rules at a glanceâ
| Hook | Rule | Order matters? |
|---|---|---|
transform_context | Chain â each sees previous output | Yes |
convert_to_llm | Last wins | Only the last one runs |
transform_system_prompt | Chain | Yes |
before_tool_call | First block stops | First in list wins blocks |
after_tool_call | Later overrides earlier | Last write wins |
should_stop_after_turn | Any True stops (OR) | No |
after_model_response | Chain with merge semantics | See below |
transform_context and transform_system_promptâ
Chain: m1's output becomes m2's input becomes m3's input. Useful
for layered transforms:
agent = Agent(
middleware=[
SlidingWindow(max_messages=20), # m1: drop oldest
InjectSummary(), # m2: prepend a summary block
],
)
m2 sees the truncated list. The user-visible
agent.state.messages is untouched â middleware only changes what
the model receives.
convert_to_llmâ
Last-wins on purpose: this is the final transform before wire
serialisation. Multiple owners would fight; pick one. CubePi enforces
that the last middleware in the list that implements
convert_to_llm is the one that runs.
If you find yourself needing two convert_to_llm middlewares,
collapse them into one (call site composition: write one that calls
both).
before_tool_callâ
First block=True short-circuits the rest. Use to chain policy
layers from most-restrictive to least:
agent = Agent(
middleware=[
RateLimiter(), # blocks on rate quota
SafetyFilter(), # blocks on dangerous args
AuditLogger(), # never blocks; just records
],
)
If RateLimiter returns block=True, SafetyFilter and
AuditLogger's before_tool_call don't run. AuditLogger.after_tool_call
still fires because that's a different hook.
after_tool_callâ
Each middleware can return an AfterToolCallResult with some fields
set; CubePi merges them, with later results overriding earlier ones
for any field that's not None. The full result:
class AfterToolCallResult(BaseModel):
content: list[Content] | None = None
details: Any = None
is_error: bool | None = None
terminate: bool | None = None
Pattern: an early middleware adds rich details, a later one
sanitises content for the model. Both run; the merged result
combines details from one with the redacted content from the
other.
should_stop_after_turnâ
Any middleware returning True ends the run. The rest of the chain
isn't evaluated.
agent = Agent(
middleware=[
MaxTurns(10),
BudgetCap(usd=0.5),
FinalAnswerSentinel(), # stops when assistant says "FINAL ANSWER"
],
)
after_model_responseâ
Chain with structured merge. Each middleware sees the current
response (which may have been replaced by an earlier middleware) and
returns a TurnAction:
response: AssistantMessage | Noneâ if non-None, replaces the current response for downstream middlewares and for what the loop ultimately persists.inject_messages: list[Message]â appended into a single list across the whole chain, then added to context before the next turn.decision: "natural" | "stop" | "loop_to_model"â the last middleware's value wins.
agent = Agent(
middleware=[
ProfanityRedactor(), # rewrites response
StructuredOutputValidator(), # may decide="loop_to_model"
EventLogger(), # decision unchanged
],
)
If StructuredOutputValidator returns decision="loop_to_model" and
EventLogger returns decision="natural", the loop sees "natural"
â because last wins. Reorder if that's not what you wanted.
Mixing middleware with constructor callablesâ
Agent(...) also accepts explicit hook callables (convert_to_llm=âĻ,
before_tool_call=âĻ, etc.). When both are present, the explicit
callable wins:
agent = Agent(
middleware=[LoggingMiddleware()],
before_tool_call=my_explicit_hook, # overrides the middleware version
)
Use the explicit form for one-off hooks; use middleware classes when behaviour is a coherent bundle.
A note on Middleware base classâ
The base Middleware class's unimplemented methods raise
NotImplementedError. compose_middleware detects this by comparing
to the base method and only wires hooks the middleware actually
overrides. You don't need to pass-implement every method.
class JustTransform(Middleware):
async def transform_context(self, messages, *, signal=None):
return messages[-10:]
# No other hooks. CubePi won't call them.
See alsoâ
- The 7 Hooks â what each hook does and when it fires.
- Examples â composition in practice.