Aevum QAR/FOQA Analytics Layer Architecture¶
Date: 2026-05-25 Session: 3B Status: Implemented (v0.7.0)
The Dual-Layer Design (FDR + QAR)¶
Aevum's "black box for AI agents" is built in two complementary layers, modeled on aviation's two-recorder system.
FDR layer (Sessions 1A–2): The forensic layer. Analogous to a Flight
Data Recorder — captures every event in a tamper-evident, append-only
sigchain. Optimized for "what happened" reconstruction via the replay
function. Components: Sigchain, AevumReceipt, COSE_Sign1 encoder,
SqliteReceiptStore (three-tier), escalation to crash_protected tier.
QAR/FOQA layer (Session 3B): The operational analytics layer. Analogous
to a Quick Access Recorder and the FAA's FOQA (Flight Operational Quality
Assurance) program. Processes receipt streams to detect safety-relevant
patterns (exceedances) and emits de-identified aggregate metrics. Components:
ExceedanceDetector, GatekeeperFilter, FOQABridge.
Together, these layers satisfy: - Forensic investigation: FDR layer (full fidelity, cryptographic chain) - Operational safety trending: QAR/FOQA layer (aggregate, de-identified)
Component Diagram¶
Agent session
│
▼
SigChain.new_event()
│
├── AuditEvent ──► AevumOTelBridge ──► OTel spans (tracing)
│
├── AevumReceipt ──► COSE_Sign1 encoder ──► SqliteReceiptStore
│ │
│ escalate_if_triggered()
│ │
│ crash_protected tier
│
└── AevumReceipt ──► ExceedanceDetector ──► ExceedanceEvent list
(stateful, │
per-session) │
▼
GatekeeperFilter
(pseudonymize,
strip PII)
│
▼
FOQABridge
(OTel metrics)
│
▼
aevum.exceedance.count
aevum.session.count
(aggregate, de-identified)
The 15 Exceedance Types¶
| ID | Name | Aviation Analogy | Severity | Detection Method |
|---|---|---|---|---|
| EX-01 | Tool Retry Loop | Unstable Approach | MEDIUM | Stateful: >3 retries in 60s rolling window |
| EX-02 | Forbidden Tool Invocation | Excessive Bank Angle | HIGH | Stateless: ClassificationCeiling barrier DENY |
| EX-03 | Safety Barrier Trip | GPWS Alert | CRITICAL | Stateless: any barrier_evaluations value DENY |
| EX-04 | Human Override Rejection | Hard Landing | HIGH | Stateless: human_override_action == REJECT |
| EX-05 | Agent Refusal | Go-Around | LOW | Stateless: action in (tool.refuse, agent.abstain, task.reject) |
| EX-06 | Stale Model or Policy Version | Configuration Warning | MEDIUM | Stateless: date in policy_version >30 days ago |
| EX-07 | Token Rate Outlier | Engine Exceedance | MEDIUM | Stateful: token_rate >3σ from rolling baseline (min 10 samples) |
| EX-08 | Latency Outlier | Airspeed Exceedance | MEDIUM | Stateful: latency_ms >3σ from rolling baseline (min 10 samples) |
| EX-09 | Context Window Overflow | Altitude Bust | HIGH | Stateless: prompt_tokens/context_window_size ≥0.95 |
| EX-10 | Concurrent Conflicting Tool Calls | TCAS Resolution Advisory | HIGH | DEFERRED (v0.8.0) — requires cross-session context |
| EX-11 | ODD Exit | ODD Exit | CRITICAL | Stateless: handoff_type == ODD_EXIT |
| EX-12 | Unacknowledged Transition Demand | Automation Handoff Refused | HIGH | Stateless: TRANSITION_DEMAND without handoff_to_agent_id |
| EX-13 | Minimum Risk Maneuver | Minimum Risk Maneuver | CRITICAL | Stateless: handoff_type == MINIMUM_RISK |
| EX-14 | Agent Communication Failure | Communications Failure | HIGH | DEFERRED (v0.8.0) — requires cross-agent message tracking |
| EX-15 | Primary Agent Failure | Crew Incapacitation | CRITICAL | Stateless: handoff_type == FAILURE |
Known Limitations¶
EX-10: Concurrent Conflicting Tool Calls (DEFERRED)¶
Detection requires knowing that multiple simultaneous tool calls from the same agent made conflicting state mutations. This context is not available in a single per-session receipt stream — it requires cross-session correlation of tool call start/end times and the shared resource they targeted.
Status: Not implemented in ExceedanceDetector. The EXCEEDANCE_CATALOGUE
entry documents the deferral. Target: v0.8.0 when multi-agent A2A message
tracking is available.
EX-14: Agent Communication Failure (DEFERRED)¶
Detection requires tracking inter-agent messages and detecting timeouts. The A2A message correlation context is not available in a per-session receipt stream.
Status: Not implemented in ExceedanceDetector. Target: v0.8.0 multi-agent
tracking session (Session 9 in the receipt plan).
EX-07, EX-08: Sigma-Based Outlier Detection¶
The sigma outlier check (_is_sigma_outlier) requires a minimum of 10 samples
in the rolling window before it will fire. In low-traffic sessions (fewer than
10 LLM calls in 60 seconds), these exceedances will not be detected reliably.
Workaround: For high-traffic deployments, this is not an issue. For low-traffic deployments, consider lowering the minimum sample count or using absolute thresholds instead of rolling sigma checks.
EX-06: Stale Policy Version — Date-Embedded Version Strings Only¶
The stale policy check uses a regex to find an ISO 8601 date embedded in
policy_version. Deployers using opaque version strings (e.g., "v3",
"prod", commit hashes) will not trigger EX-06 even if the policy is stale.
Recommendation: Embed a date in your policy version strings:
"policy-2026-05-25" or "v3-2026-05-25".
Threading Model¶
ExceedanceDetector: NOT thread-safe. One detector per agent session.
The detector maintains mutable rolling window state (deque objects) that
is not protected by locks. Multi-threaded callers must either use one detector
per thread or protect with an external lock.
FOQABridge: Designed for shared use across sessions. OTel counter operations
(add()) are atomic in the OTel SDK. One FOQABridge instance per deployment
is the intended usage pattern.
GatekeeperFilter: Stateless after construction (key is read-only). Safe for concurrent use.
What Is NOT Implemented (Deferred Items)¶
| Item | Reason | Target |
|---|---|---|
| EX-10 cross-session detection | Requires multi-session context | v0.8.0 |
| EX-14 cross-agent detection | Requires A2A message tracking | v0.8.0 |
| Differential privacy on aggregate metrics | Privacy budget design required | v0.8.0 |
| Federated exceedance detection across operators | Architecture decision pending | Post-v1.0 |
| Regulator-facing aggregate report (CSV/JSON) | Format TBD pending regulatory consultation | v0.8.0 |
| gen_ai.agent.name / gen_ai.agent.id in OTel spans | AuditEvent lacks structured agent identity | v0.8.0 (see V07-AGENT-CONTEXT in KNOWN_UNKNOWNS.md) |