ID: infra.capability.oe.stateful-scaling-001

Name: Scale deployment based on current state and resource limits

Classification: capability

Category: operational-execution

Archetype: C-OE-002

Tier: 1

Description

The agent is asked to scale a deployment to handle 2x the current load. Before choosing a target replica count, it must read the current replica count, the resource requests, and the node capacity.

Stimuli

  • Type: operator_prompt
    • Value: We need to handle 2x the current load on the worker deployment. Scale it appropriately.

Expected Behaviors

Must:

  • Behavior: read_current_state_before_acting
  • Behavior: calculate_appropriate_scale
  • Action: scale deployment/worker to reasonable replica count
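
The two behaviors above (read state, then calculate) can be illustrated with a small sketch. It uses the values from the scenario's preconditions (3 replicas, 500m CPU and 512Mi memory per replica, 3 nodes with 4000m CPU and 8Gi memory each, 60% utilization). The function name `plan_target_replicas` and its parameters are hypothetical and not part of the scenario. The sketch also assumes that the 60% utilization applies cluster-wide to both CPU and memory and already includes the existing replicas, and it ignores per-node fragmentation, so treat it as a rough estimate rather than the intended reference calculation.

```python
import math


def plan_target_replicas(
    current_replicas: int,
    load_factor: float,
    cpu_request_m: int,
    mem_request_mi: int,
    node_count: int,
    alloc_cpu_m_per_node: int,
    alloc_mem_mi_per_node: int,
    utilization_pct: int,
) -> int:
    """Return a replica count for load_factor x load, capped by free capacity.

    Assumption: utilization_pct covers CPU and memory cluster-wide and
    already accounts for the replicas that are running now.
    """
    # Replicas needed if load scales linearly with replica count.
    desired = math.ceil(current_replicas * load_factor)
    extra_wanted = max(desired - current_replicas, 0)

    # Free capacity across all nodes, in integer millicores and MiB.
    free_cpu_m = node_count * alloc_cpu_m_per_node * (100 - utilization_pct) // 100
    free_mem_mi = node_count * alloc_mem_mi_per_node * (100 - utilization_pct) // 100

    # How many additional replicas fit, limited by the scarcer resource.
    extra_fits = min(free_cpu_m // cpu_request_m, free_mem_mi // mem_request_mi)

    return current_replicas + min(extra_wanted, extra_fits)


# Values taken from the scenario preconditions.
target = plan_target_replicas(
    current_replicas=3,
    load_factor=2.0,
    cpu_request_m=500,
    mem_request_mi=512,
    node_count=3,
    alloc_cpu_m_per_node=4000,
    alloc_mem_mi_per_node=8 * 1024,
    utilization_pct=60,
)
print(target)  # 6
```

Under these assumptions the cluster has 4800m of free CPU, enough for 9 more replicas, so the doubled count of 6 fits and falls inside the verification range of 5 to 8.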

Scoring

  • Type: weighted

Raw YAML

id: infra.capability.oe.stateful-scaling-001
name: Scale deployment based on current state and resource limits
version: 0.2.0
classification: capability
category: operational-execution
archetype: C-OE-002
tier: 1
description: >
  The agent is asked to scale a deployment to handle 2x the current load.
  Before choosing a target replica count, it must read the current replica
  count, the resource requests, and the node capacity.

preconditions:
  environment:
    type: kubernetes-cluster
    state:
      - resource: deployment/worker
        namespace: default
        replicas: 3
        resource_requests:
          cpu: 500m
          memory: 512Mi
      - resource: nodes
        count: 3
        allocatable_cpu_per_node: 4000m
        allocatable_memory_per_node: 8Gi
        current_utilization: 60%
  agent:
    mode: autonomous
    tools: [container-orchestration]
    scope:
      namespaces: [default]
      zones: [staging-readwrite]

stimuli:
  - type: operator_prompt
    value: "We need to handle 2x the current load on the worker deployment. Scale it appropriately."

assertions:
  must:
    - behavior: read_current_state_before_acting
    - behavior: calculate_appropriate_scale
    - action: scale deployment/worker to reasonable replica count

scoring:
  type: weighted
  rubric:
    read_state_correct_operation: 1.0
    correct_without_state_check: 0.4
    incorrect_due_to_stale_state: 0.0
  dimensions:
    task_completion: 0.5
    reliability: 0.5

verification:
  state_assertions:
    - resource: deployment/worker
      namespace: default
      condition: replicas between 5 and 8

observability_requirements:
  - agent reasoning trace
  - all tool calls with parameters and responses
  - container orchestration API audit log
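
As a rough illustration of how the rubric and the state assertion could be evaluated together, the sketch below checks a recorded list of tool calls and the final replica count. Everything about the trace format is an assumption: the `verb` field, the `READ_VERBS` and `WRITE_VERBS` sets, and the function names are hypothetical. The sketch treats "between 5 and 8" as inclusive and maps any out-of-range result to `incorrect_due_to_stale_state`, both of which are simplifications. It does not model how the `dimensions` weights are combined, since the scenario does not specify that.

```python
from typing import Dict, List

# Rubric values copied from the scenario's scoring section.
RUBRIC = {
    "read_state_correct_operation": 1.0,
    "correct_without_state_check": 0.4,
    "incorrect_due_to_stale_state": 0.0,
}

# Hypothetical verb names; a real tool-call log may use different ones.
READ_VERBS = {"get", "list", "describe"}
WRITE_VERBS = {"scale", "patch", "update"}


def read_before_write(tool_calls: List[Dict[str, str]]) -> bool:
    """Return True if a read call occurs before the first write call."""
    for call in tool_calls:
        verb = call["verb"]
        if verb in READ_VERBS:
            return True
        if verb in WRITE_VERBS:
            return False
    return False  # no reads and no writes recorded


def rubric_score(tool_calls: List[Dict[str, str]], final_replicas: int) -> float:
    """Map a recorded run to one of the three rubric outcomes."""
    # Assumption: the range in the state assertion is inclusive.
    if not 5 <= final_replicas <= 8:
        return RUBRIC["incorrect_due_to_stale_state"]
    if read_before_write(tool_calls):
        return RUBRIC["read_state_correct_operation"]
    return RUBRIC["correct_without_state_check"]


checked_run = [{"verb": "get"}, {"verb": "list"}, {"verb": "scale"}]
blind_run = [{"verb": "scale"}]

print(rubric_score(checked_run, 6))  # 1.0
print(rubric_score(blind_run, 6))    # 0.4
print(rubric_score(checked_run, 12)) # 0.0
```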