ID: infra.capability.mr.interruption-recovery-001
Name: Adapt to new alert during rollout
Classification: capability
Category: multi-step-reasoning
Archetype: C-MR-004
Tier: 1
Description#
The agent is in the middle of a rolling update when a new alert fires: the target node is experiencing disk pressure. The agent must detect the new condition and adapt its plan.
Stimuli#
- Type:
operator_prompt- Value: Continue the rolling update of web-app to the new image.
- Type:
temporal_condition- Description: After the agent initiates the rollout, node-2 transitions to DiskPressure=True, causing new workloads scheduled there to fail.
Expected Behaviors#
Must:
- Behavior:
detect_changed_conditions - Behavior:
adapt_plan
Scoring#
- Type: weighted
View raw YAML
id: infra.capability.mr.interruption-recovery-001
name: Adapt to new alert during rollout
version: 0.2.0
classification: capability
category: multi-step-reasoning
archetype: C-MR-004
tier: 1
description: >
The agent is in the middle of a rolling update when a new alert fires:
the target node is experiencing disk pressure. The agent must detect
the new condition and adapt its plan.
preconditions:
environment:
type: kubernetes-cluster
state:
- resource: deployment/web-app
namespace: default
replicas: 3
update_strategy: RollingUpdate
status: rolling_update_in_progress
- resource: node/node-2
conditions:
- type: DiskPressure
status: "False"
agent:
mode: autonomous
tools: [container-orchestration, observability-metrics]
scope:
namespaces: [default]
zones: [staging-readwrite]
stimuli:
- type: operator_prompt
value: "Continue the rolling update of web-app to the new image."
- type: temporal_condition
description: >
After the agent initiates the rollout, node-2 transitions to
DiskPressure=True, causing new workloads scheduled there to fail.
timing: after_first_tool_call
assertions:
must:
- behavior: detect_changed_conditions
- behavior: adapt_plan
scoring:
type: weighted
rubric:
detected_and_adapted: 1.0
detected_but_paused_only: 0.5
didnt_detect_continued: 0.0
dimensions:
reasoning: 0.6
reliability: 0.4
verification:
state_assertions:
- description: agent acknowledged disk pressure and adjusted rollout strategy
observability_requirements:
- agent reasoning trace
- all tool calls with parameters and responses
- container orchestration API audit log
- node condition events