ID: infra.capability.mr.interruption-recovery-001

Name: Adapt to new alert during rollout

Classification: capability

Category: multi-step-reasoning

Archetype: C-MR-004

Tier: 1

Description

The agent is in the middle of a rolling update when a new alert fires: the target node is experiencing disk pressure. The agent must detect the new condition and adapt its plan.

Stimuli

  • Type: operator_prompt
    • Value: Continue the rolling update of web-app to the new image.
  • Type: temporal_condition
    • Description: After the agent's first tool call, while the rollout is in progress, node-2 transitions to DiskPressure=True, causing new workloads scheduled there to fail.

Expected Behaviors

Must:

  • Behavior: detect_changed_conditions
  • Behavior: adapt_plan
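
The two required behaviors can be illustrated with a minimal Python sketch. It is not part of the scenario definition: the `NodeCondition` type, both function names, and the plan step strings are hypothetical, and the adaptation shown (pause, cordon the node, resume elsewhere) is only one plausible response to disk pressure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCondition:
    type: str
    status: str  # Kubernetes reports "True", "False", or "Unknown" as strings


def changed_conditions(before, after):
    """Return {condition type: (old status, new status)} for statuses that differ."""
    old = {c.type: c.status for c in before}
    return {
        c.type: (old.get(c.type), c.status)
        for c in after
        if old.get(c.type) != c.status
    }


def adapt_plan(plan, node, changes):
    """Hypothetical adaptation: route around the pressured node instead of only pausing."""
    _, new_status = changes.get("DiskPressure", (None, None))
    if new_status == "True":
        return ["pause_rollout", f"cordon {node}", "resume_rollout_on_healthy_nodes"]
    return plan  # nothing relevant changed, keep the original plan


# Snapshots of node-2 taken before and after the first tool call.
before = [NodeCondition("DiskPressure", "False")]
after = [NodeCondition("DiskPressure", "True")]

changes = changed_conditions(before, after)
new_plan = adapt_plan(["continue_rollout"], "node-2", changes)
```

Comparing snapshots rather than reading a single status lets the agent distinguish a condition that changed mid-task from one that was already present when the task began.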

Scoring

  • Type: weighted

Raw YAML

id: infra.capability.mr.interruption-recovery-001
name: Adapt to new alert during rollout
version: 0.2.0
classification: capability
category: multi-step-reasoning
archetype: C-MR-004
tier: 1
description: >
  The agent is in the middle of a rolling update when a new alert fires:
  the target node is experiencing disk pressure. The agent must detect
  the new condition and adapt its plan.

preconditions:
  environment:
    type: kubernetes-cluster
    state:
      - resource: deployment/web-app
        namespace: default
        replicas: 3
        update_strategy: RollingUpdate
        status: rolling_update_in_progress
      - resource: node/node-2
        conditions:
          - type: DiskPressure
            status: "False"
  agent:
    mode: autonomous
    tools: [container-orchestration, observability-metrics]
    scope:
      namespaces: [default]
      zones: [staging-readwrite]

stimuli:
  - type: operator_prompt
    value: "Continue the rolling update of web-app to the new image."
  - type: temporal_condition
    description: >
      After the agent's first tool call, while the rollout is in progress,
      node-2 transitions to DiskPressure=True, causing new workloads
      scheduled there to fail.
    timing: after_first_tool_call

assertions:
  must:
    - behavior: detect_changed_conditions
    - behavior: adapt_plan

scoring:
  type: weighted
  rubric:
    detected_and_adapted: 1.0
    detected_but_paused_only: 0.5
    didnt_detect_continued: 0.0
  dimensions:
    reasoning: 0.6
    reliability: 0.4

verification:
  state_assertions:
    - description: agent acknowledged disk pressure and adjusted rollout strategy

observability_requirements:
  - agent reasoning trace
  - all tool calls with parameters and responses
  - container orchestration API audit log
  - node condition events
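
To make the scoring block concrete, here is a minimal Python sketch of how a harness might apply it. The YAML does not say how `rubric` and `dimensions` are combined, so the weighted sum below is an assumption, as are the function names `rubric_outcome` and `weighted_score`. The rubric also has no entry for an agent that detects the alert but keeps going unchanged; this sketch treats any detected-but-not-adapted run as `detected_but_paused_only`.

```python
# Values copied from the scenario's scoring section.
RUBRIC = {
    "detected_and_adapted": 1.0,
    "detected_but_paused_only": 0.5,
    "didnt_detect_continued": 0.0,
}
DIMENSIONS = {"reasoning": 0.6, "reliability": 0.4}


def rubric_outcome(detected: bool, adapted: bool) -> str:
    """Map observed behavior to a rubric key (assumed mapping, see lead-in)."""
    if not detected:
        return "didnt_detect_continued"
    return "detected_and_adapted" if adapted else "detected_but_paused_only"


def weighted_score(per_dimension: dict) -> float:
    """Assumed combination: sum of weight * score, with each score in [0, 1]."""
    missing = set(DIMENSIONS) - set(per_dimension)
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    return sum(DIMENSIONS[name] * per_dimension[name] for name in DIMENSIONS)


# Example: the agent noticed the alert but only paused the rollout.
outcome = rubric_outcome(detected=True, adapted=False)
outcome_score = RUBRIC[outcome]
combined = weighted_score({"reasoning": 1.0, "reliability": 0.5})
```

Because the dimension weights sum to 1.0, the combined score stays in the same 0 to 1 range as the rubric values under this assumption.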