infra.capability.oi.metric-interpretation-001

ID: infra.capability.oi.metric-interpretation-001

Name: Interpret memory utilization trend and predict OOM risk

Classification: capability

Category: observability-interpretation

Archetype: C-OI-001

Tier: 1

Description

The agent is given metrics showing working-set memory for the data-processor service rising steadily over the last 6 hours, from 500Mi to 1.6Gi against a 2Gi limit. It must correctly interpret the trend, identify the potential OOM risk, and assess its urgency.

Stimuli

Type: operator_prompt
Value: How is the data-processor service doing? Anything to worry about?

Expected Behaviors

Must:

Behavior: identify_memory_growth_trend
Behavior: assess_oom_risk
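
The two behaviors above amount to a small piece of arithmetic on the precondition series in this case's YAML (500Mi, 800Mi, 1.2Gi, 1.6Gi against a 2Gi limit). The following is a minimal sketch of that reasoning, not the harness's actual checker. The function names `parse_quantity` and `assess` are hypothetical, and extrapolating linearly from the most recent interval is an assumption made for illustration.

```python
"""Sketch: detect monotonic memory growth and estimate time to the limit."""

# Binary suffixes as used in Kubernetes resource quantities.
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_quantity(text: str) -> float:
    """Convert a quantity such as '1.2Gi' to bytes."""
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * factor
    return float(text)  # no suffix: treat as a plain byte count


def assess(samples, limit):
    """samples: list of (hours_relative_to_now, quantity), oldest first."""
    points = [(t, parse_quantity(v)) for t, v in samples]
    limit_bytes = parse_quantity(limit)
    values = [v for _, v in points]

    # identify_memory_growth_trend: every sample is higher than the previous one.
    monotonic = all(a < b for a, b in zip(values, values[1:]))

    # assess_oom_risk: extrapolate the most recent slope (assumed linear).
    (t0, v0), (t1, v1) = points[-2], points[-1]
    slope = (v1 - v0) / (t1 - t0)  # bytes per hour
    headroom = limit_bytes - v1
    hours_to_limit = headroom / slope if slope > 0 else None

    return {
        "monotonic_growth": monotonic,
        "utilization": v1 / limit_bytes,
        "hours_to_limit": hours_to_limit,
    }


# Values copied from the precondition series of this test case.
SAMPLES = [(-6, "500Mi"), (-4, "800Mi"), (-2, "1.2Gi"), (0, "1.6Gi")]
result = assess(SAMPLES, "2Gi")
print(result)
```

With these samples the service sits at about 80% of its limit, and the last interval grew by 0.4Gi in 2 hours with 0.4Gi of headroom left, so a linear extrapolation puts the limit roughly 2 hours away. An answer that reports growth but misses this magnitude would plausibly fall under right_trend_wrong_magnitude in the rubric.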

Scoring

Type: weighted
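
The spec lists rubric levels (correct_conclusions 1.0, right_trend_wrong_magnitude 0.5, incorrect 0.0) and dimension weights (task_completion 0.5, reasoning 0.5) but does not say how they combine. One plausible reading, shown here purely as an assumption, is that each dimension is graded with a rubric level and the final score is the weight-summed result. The helper `weighted_score` is a hypothetical name, not part of any harness API.

```python
"""Sketch: one assumed way to combine rubric levels with dimension weights."""

# Values copied from the scoring block of this test case.
RUBRIC = {
    "correct_conclusions": 1.0,
    "right_trend_wrong_magnitude": 0.5,
    "incorrect": 0.0,
}
DIMENSIONS = {"task_completion": 0.5, "reasoning": 0.5}


def weighted_score(grades: dict) -> float:
    """grades maps each dimension name to a rubric level name (assumed scheme)."""
    missing = set(DIMENSIONS) - set(grades)
    if missing:
        raise ValueError(f"ungraded dimensions: {sorted(missing)}")
    # Sum of (dimension weight * rubric value) over all dimensions.
    return sum(DIMENSIONS[d] * RUBRIC[grades[d]] for d in DIMENSIONS)


example = weighted_score(
    {
        "task_completion": "correct_conclusions",
        "reasoning": "right_trend_wrong_magnitude",
    }
)
print(example)
```

Under this assumed scheme, a run that reaches the right conclusion with only partially correct reasoning scores 0.5 * 1.0 + 0.5 * 0.5 = 0.75.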

Raw YAML

id: infra.capability.oi.metric-interpretation-001
name: Interpret memory utilization trend and predict OOM risk
classification: capability
category: observability-interpretation
archetype: C-OI-001
tier: 1
description: >
  The agent is given metrics showing working-set memory for the data-processor
  service rising steadily over the last 6 hours, from 500Mi to 1.6Gi against
  a 2Gi limit. It must correctly interpret the trend, identify the potential
  OOM risk, and assess its urgency.

preconditions:
  environment:
    type: kubernetes-cluster
    state:
      - resource: deployment/data-processor
        namespace: default
        status: running
        memory_limit: 2Gi
      - resource: metrics/data-processor
        series:
          container_memory_working_set_bytes:
            - {t: -6h, value: 500Mi}
            - {t: -4h, value: 800Mi}
            - {t: -2h, value: 1.2Gi}
            - {t: now, value: 1.6Gi}
  agent:
    mode: autonomous
    tools: [container-orchestration, observability-metrics]
    scope:
      namespaces: [default]
      zones: [staging-readwrite]

stimuli:
  - type: operator_prompt
    value: "How is the data-processor service doing? Anything to worry about?"

assertions:
  must:
    - behavior: identify_memory_growth_trend
    - behavior: assess_oom_risk

scoring:
  type: weighted
  rubric:
    correct_conclusions: 1.0
    right_trend_wrong_magnitude: 0.5
    incorrect: 0.0
  dimensions:
    task_completion: 0.5
    reasoning: 0.5

verification:
  state_assertions:
    - description: agent identified monotonic memory growth (500Mi to 1.6Gi over 6h) approaching the 2Gi limit

observability_requirements:
  - agent reasoning trace
  - all tool calls with parameters and responses