ID: infra.capability.da.cascading-diagnosis-001
Name: Trace cascading failure from ingress through service mesh to backend
Classification: capability
Category: diagnostic-accuracy
Archetype: C-DA-004
Tier: 1
Description
The ingress returns 503. The service mesh sidecar shows connection refused to the backend. The backend is crashlooping because its database migration init container failed. The agent must trace through the chain: ingress → mesh → backend → init container failure.
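The hop-by-hop trace described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `STATE` dictionary is a hand-written, simplified snapshot of the scenario's resource states, and the `symptom` and `next` keys, as well as the `trace_failure_chain` function itself, are hypothetical names that are not part of the scenario schema or of any Kubernetes client.

```python
# Hypothetical snapshot of the scenario's resource states. The "symptom" and
# "next" keys are illustrative assumptions, not part of the scenario schema.
STATE = {
    "ingress/public": {
        "symptom": "returning_503",
        "next": "logs/service-mesh-sidecar",
    },
    "logs/service-mesh-sidecar": {
        "symptom": "upstream connect error, reset reason: connection failure",
        "next": "deployment/api-backend",
    },
    "deployment/api-backend": {
        "symptom": "CrashLoopBackOff",
        "next": "init-container/db-migrate",
    },
    "init-container/db-migrate": {
        "symptom": "ERROR: migration 042_add_index failed: relation already exists",
        "next": None,
    },
}


def trace_failure_chain(state, start="ingress/public"):
    """Walk from the first symptom to the last hop that has no further cause."""
    chain = []
    seen = set()
    current = start
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental cycles in the links
        hop = state[current]
        chain.append((current, hop["symptom"]))
        current = hop["next"]
    return chain


chain = trace_failure_chain(STATE)
root_resource, root_symptom = chain[-1]
print(" -> ".join(name for name, _ in chain))
print("root cause:", root_resource, "|", root_symptom)
```

An agent that stops at the first entry of `chain` has only restated the 503 symptom; the scenario expects it to reach the last entry, the failed `db-migrate` init container.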
Stimuli
- Type: operator_prompt
  - Value: The public API is returning 503 errors. Find the root cause.
Expected Behaviors
Must:
- Behavior: trace_failure_chain
- Behavior: identify_init_container_migration_failure
Scoring
- Type: weighted
View raw YAML
id: infra.capability.da.cascading-diagnosis-001
name: Trace cascading failure from ingress through service mesh to backend
version: 0.2.0
classification: capability
category: diagnostic-accuracy
archetype: C-DA-004
tier: 1
description: >
  The ingress returns 503. The service mesh sidecar shows connection
  refused to the backend. The backend is crashlooping because its
  database migration init container failed. The agent must trace
  through the chain: ingress → mesh → backend → init container failure.
preconditions:
  environment:
    type: kubernetes-cluster
    state:
      - resource: ingress/public
        namespace: default
        status: returning_503
      - resource: deployment/api-backend
        namespace: default
        status: CrashLoopBackOff
        init_containers:
          - name: db-migrate
            status: error
            logs: "ERROR: migration 042_add_index failed: relation already exists"
      - resource: service/api-backend
        namespace: default
        endpoints: []
      - resource: logs/service-mesh-sidecar
        entries:
          - "upstream connect error or disconnect/reset before headers. reset reason: connection failure"
  agent:
    mode: autonomous
    tools: [container-orchestration, observability-logging, observability-metrics]
    scope:
      namespaces: [default]
      zones: [staging-readwrite]
stimuli:
  - type: operator_prompt
    value: "The public API is returning 503 errors. Find the root cause."
assertions:
  must:
    - behavior: trace_failure_chain
    - behavior: identify_init_container_migration_failure
scoring:
  type: weighted
  rubric:
    traced_to_root_cause: 1.0
    per_hop_missed: -0.25
    stopped_at_first_symptom: 0.0
  dimensions:
    task_completion: 0.5
    reasoning: 0.5
verification:
  state_assertions:
    - description: agent identified failed db-migrate init container as root cause
observability_requirements:
  - agent reasoning trace
  - all tool calls with parameters and responses
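
The scenario lists rubric values and dimension weights but does not spell out how they combine. The Python sketch below shows one plausible reading, and every rule in it is an assumption: the score starts at `traced_to_root_cause` when the root cause is reached, `per_hop_missed` is applied once per skipped hop, the result is floored at zero, and the two dimensions are combined as a weighted sum. The function names `rubric_score` and `weighted_total` are hypothetical.

```python
# Values copied from the scenario's scoring block.
RUBRIC = {
    "traced_to_root_cause": 1.0,
    "per_hop_missed": -0.25,
    "stopped_at_first_symptom": 0.0,
}
DIMENSIONS = {"task_completion": 0.5, "reasoning": 0.5}


def rubric_score(reached_root_cause, hops_missed, stopped_at_first_symptom=False):
    """Assumed combination rule: base score plus a penalty per missed hop."""
    if stopped_at_first_symptom:
        return RUBRIC["stopped_at_first_symptom"]
    base = RUBRIC["traced_to_root_cause"] if reached_root_cause else 0.0
    # Assumption: the score never drops below zero.
    return max(0.0, base + RUBRIC["per_hop_missed"] * hops_missed)


def weighted_total(dimension_scores):
    """Assumed aggregation: weighted sum over the listed dimensions."""
    return sum(DIMENSIONS[name] * dimension_scores[name] for name in DIMENSIONS)


# Example: the agent reached the root cause but skipped the mesh hop.
print(rubric_score(reached_root_cause=True, hops_missed=1))        # 0.75
print(weighted_total({"task_completion": 1.0, "reasoning": 0.5}))  # 0.75
```

Under this reading, an agent that reports only the 503 scores 0.0 regardless of other factors, which matches the intent of `stopped_at_first_symptom`.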