Aevum Signing Specification — v1¶
This specification defines the canonical form, digest, and signature used to produce and verify AuditEvent entries in an Aevum sigchain.
Overview¶
Every AuditEvent is cryptographically signed. The signature covers a deterministic subset of the event's fields (the signing fields), enabling verification of event authenticity and chain integrity without relying on database ordering or application state.
The signing chain provides:
- Per-event authenticity — each event is signed with the deployment's signing key; forgery requires the private key
- Chain integrity — each event references the SHA3-256 digest of the message representative of the previous event; insertion, deletion, or reordering is detectable
- Session boundary transparency —
session.startevents explicitly declare the signing key scheme and, for persistent backends, link to the prior session's terminal event viacausation_id
Signing Fields¶
The signature covers exactly these 19 fields, extracted from the AuditEvent:
actor
causation_id
correlation_id
episode_id
event_id
event_type
hash_alg
key_scheme
payload_hash
prior_hash
schema_version
sequence
sig_format_version
signer_key_id
span_id
system_time
trace_id
valid_from
valid_to
Fields not in the signing set: payload, signature, mldsa65_sig,
mldsa65_pub, tsa_url, tsa_token, receipt_cbor, audit_id.
payloadis omitted because it is covered bypayload_hash; signing the payload directly would make the signature grow linearly with payload size.signatureis omitted because it is the output of the signing process.mldsa65_sig/mldsa65_pubare omitted because they are outputs of the ML-DSA signing pass that runs after the signing fields are canonicalized.tsa_url/tsa_tokenare omitted because they are RFC 3161 outputs appended after the entry is signed.receipt_cboris omitted because it is a transparency log receipt appended after the entry is written.audit_idis omitted because it is derived fromevent_id(same bytes, different representation) and would be redundant.
Note: sequence, episode_id, key_scheme, sig_format_version, and
hash_alg are included to make the chain position, episode grouping, algorithm
posture, field-set version, and hash algorithm tamper-evident.
Signing field encodings¶
| Field | Type in JSON entry | Encoding in signing fields |
|---|---|---|
actor |
string | string (UTF-8) |
causation_id |
string | null | string | null |
correlation_id |
string | null | string | null |
episode_id |
string | string |
event_id |
string (UUID v7) | string |
event_type |
string | string |
hash_alg |
string, const "sha3-256" |
string |
key_scheme |
string | string |
payload_hash |
string (64 hex chars) | string |
prior_hash |
string (64 hex chars) | string |
schema_version |
string, const "1.0" |
string |
sequence |
integer | integer |
sig_format_version |
integer, const 1 |
integer, always 1 |
signer_key_id |
string | string |
span_id |
string | null | string | null |
system_time |
integer (HLC nanoseconds) | string — str(system_time) |
trace_id |
string | null | string | null |
valid_from |
string (ISO 8601) | string |
valid_to |
string | null | string | null |
system_time safe-integer rule: HLC (Hybrid Logical Clock) values are
nanosecond-precision integers that commonly exceed 2^53-1, the RFC 8785
safe-integer domain. The signing field must be str(system_time) — the decimal
string representation — not the raw integer. Passing the raw integer to the
RFC 8785 canonicalizer is a hard error. The JSON entry carries the integer;
the signing field set carries the string.
Digest Construction¶
Step 1 — Construct signing object¶
Extract the 19 signing fields from the AuditEvent. Set absent optional fields
to null (JSON null). Do not omit them. Apply the encoding rules from the table
above — in particular, convert system_time to its decimal string.
{
"actor": "aevum-core",
"causation_id": null,
"correlation_id": null,
"episode_id": "01961234-5678-7abc-def0-123456789012",
"event_id": "01961234-5678-7abc-def0-123456789012",
"event_type": "session.start",
"hash_alg": "sha3-256",
"key_scheme": "ed25519",
"payload_hash": "abc123...64hexchars",
"prior_hash": "391f6bd6d761cb9af9e924d015a6fc18e9d236c965c3e5deda1145a25e11cf5e",
"schema_version": "1.0",
"sequence": 1,
"sig_format_version": 1,
"signer_key_id": "550e8400-e29b-41d4-a716-446655440000",
"span_id": null,
"system_time": "1746568451401122816",
"trace_id": null,
"valid_from": "2026-05-06T21:54:11.401122+00:00",
"valid_to": null
}
Step 2 — Canonicalize with RFC 8785 (JCS)¶
Apply the JSON Canonicalization Scheme (RFC 8785) using the rfc8785 library:
import rfc8785
canonical_bytes = rfc8785.dumps(signing_obj)
# Returns bytes; keys sorted by Unicode code point; no whitespace;
# non-ASCII characters appear as UTF-8 bytes (NOT \uXXXX escapes).
Do not use json.dumps as a substitute. json.dumps with ensure_ascii=True
(the default) escapes non-ASCII characters as \uXXXX, producing different bytes
from RFC 8785 for any non-ASCII actor, correlation ID, or similar field. Use the
rfc8785 library unconditionally.
Floats are forbidden in signing fields — the canonicalizer raises on any float value. All Aevum signing fields are strings, integers, or null.
Step 3 — Prepend domain prefix¶
Construct the message representative by prepending the domain separator:
The \x00 byte separates the ASCII prefix from the RFC 8785 JSON body,
eliminating any parsing ambiguity. This domain prefix binds the protocol name
and wire-format version into every signed byte, preventing cross-protocol
signature misuse. sig_format_version handles field-set evolution; this prefix
handles protocol/wire-format domain.
Step 4 — Hash¶
This 32-byte digest is the Ed25519 signed digest and the chain hash input (compute-once property). The digest is computed once and reused for both proofs: altering any signing field breaks both the signature check and the chain-linkage check simultaneously.
Step 5 — Sign (Ed25519)¶
The signer receives the 32-byte digest. It does NOT re-hash the input.
# InProcessSigner (Ed25519):
raw_signature = private_key.sign(digest)
# digest is passed directly to Ed25519's internal message-processing step
# VaultTransitSigner:
# POST /v1/transit/sign/{key_name}
# body: {"input": base64(digest), "prehashed": true}
Step 6 — Encode¶
import base64
signature = base64.urlsafe_b64encode(raw_signature).rstrip(b'=').decode()
# Result: base64url without padding, always 86 characters (Ed25519 = 64 bytes)
Hybrid Signing (ML-DSA-65)¶
Hybrid entries (key_scheme = "ed25519+ml-dsa-65") carry a second signature
produced by ML-DSA-65 (CRYSTALS-Dilithium, FIPS 204) over the message
representative — not its hash:
mldsa65_raw_sig = ml_dsa_private_key.sign(representative)
# ML-DSA-65 signs representative directly (not sha3_256(representative))
The resulting bytes are hex-encoded and stored as mldsa65_sig. The public key
bytes are hex-encoded and stored as mldsa65_pub.
Algorithm agility¶
The key_scheme field encodes the posture: "ed25519" for classical;
"ed25519+<level>" for hybrid, where <level> is the ML-DSA level suffix
(currently "ml-dsa-65"). The verifier uses this string to select the ML-DSA
algorithm:
Any unrecognized <level> suffix must cause a verification failure — fail
closed, never warn-and-fallback.
Fail-closed rule¶
If key_scheme is "ed25519+<level>" and mldsa65_sig or mldsa65_pub is
null or absent, the entry must be rejected. A missing ML-DSA signature on a
hybrid entry indicates a tamper or downgrade attack; it is never valid to verify
only the Ed25519 portion on a hybrid entry.
Chain homogeneity¶
All events in a chain must share the same key_scheme. A chain with mixed
schemes must be rejected. Signed posture-change transitions are not supported
in v0.8.0.
Hash Chain¶
Prior hash computation¶
The prior_hash of event N equals the hex-encoded SHA3-256 digest of the
message representative of event N-1. This is identical to the digest used to
produce event N-1's Ed25519 signature:
import rfc8785, hashlib
DOMAIN_PREFIX = b"aevum-sigchain-v1\x00"
def hash_event_for_chain(event_dict: dict) -> str:
"""
Compute the SHA3-256 hex digest of an event's message representative.
This is stored as the prior_hash of the NEXT event and is identical
to the digest over which Ed25519 signed.
"""
signing_fields = {
"actor": event_dict["actor"],
"causation_id": event_dict.get("causation_id"),
"correlation_id": event_dict.get("correlation_id"),
"episode_id": event_dict["episode_id"],
"event_id": event_dict["event_id"],
"event_type": event_dict["event_type"],
"hash_alg": event_dict["hash_alg"],
"key_scheme": event_dict["key_scheme"],
"payload_hash": event_dict["payload_hash"],
"prior_hash": event_dict["prior_hash"],
"schema_version": event_dict["schema_version"],
"sequence": event_dict["sequence"],
"sig_format_version": 1,
"signer_key_id": event_dict["signer_key_id"],
"span_id": event_dict.get("span_id"),
"system_time": str(event_dict["system_time"]), # HLC → string
"trace_id": event_dict.get("trace_id"),
"valid_from": event_dict["valid_from"],
"valid_to": event_dict.get("valid_to"),
}
representative = DOMAIN_PREFIX + rfc8785.dumps(signing_fields)
return hashlib.sha3_256(representative).hexdigest()
This property means a verifier computes the representative once per event and derives both the chain-link verification digest and the signature verification digest from the same bytes.
Genesis hash¶
The first event in a chain (sequence=1, always a session.start) has:
import hashlib
GENESIS_HASH = hashlib.sha3_256(b"aevum:genesis").hexdigest()
# = "391f6bd6d761cb9af9e924d015a6fc18e9d236c965c3e5deda1145a25e11cf5e"
Session boundary hash¶
When a persistent backend is restarted, a new session.start is written. Its
causation_id is set to the audit_id of the last event in the previous
session. The prior_hash links to the SHA3-256 digest of that last event's
message representative, creating a continuous, verifiable chain across process
restarts.
Verification Procedure¶
A verifier must:
- Sort events by
sequencenumber (ascending) - Pre-check: every entry must have
sig_format_version == 1(reject None or any other value — no fallback, no legacy path) - Pre-check: every entry must share the same
key_scheme(chain homogeneity) - For each event:
a. Construct the 19-field signing object applying encoding rules (see table
above —
system_timeas string, absent nullable fields as null) b. Computerepresentative = DOMAIN_PREFIX + rfc8785.dumps(signing_fields)c. Computedigest = SHA3-256(representative)(32 bytes) d. Verifyprior_hashequalssha3_256(representative_of_previous_event).hexdigest(). For the first event: verifyprior_hashequals the genesis hash constant. e. Verify Ed25519 signature:public_key.verify(base64url_decode(signature), digest)f. Ifkey_schemestarts with"ed25519+":- Parse the level suffix (e.g.
"ml-dsa-65") - Map to OQS algorithm name; unknown suffix → reject (fail closed)
- Verify
mldsa65_sigandmldsa65_pubare non-null; null → reject - Verify ML-DSA signature:
ml_dsa_verify(representative, bytes.fromhex(mldsa65_sig), bytes.fromhex(mldsa65_pub)) - Verify
mldsa65_pubmatches the pinned ML-DSA public key (out-of-band) g. Verifypayload_hashequalssha3_256(json.dumps(payload, sort_keys=True, separators=(',',':')).encode()).hexdigest()h. Verifysystem_timeis >= previous event'ssystem_time(HLC monotonicity)
- Parse the level suffix (e.g.
- Report: pass/fail per event, total events verified, any chain breaks
A reference verifier is provided at tools/verify/verify_chain.py.
Trust Model¶
Pinned-key anchor¶
The verifier's trust anchor is the published Ed25519 public key (and the ML-DSA-65 public key for hybrid entries) supplied out-of-band — for example, from the operator's published key material, a hardware security module, or a key management service.
For hybrid entries, the mldsa65_pub embedded in each entry must equal the
pinned ML-DSA public key. A mismatch must cause a verification failure.
signer_key_id is informational¶
signer_key_id is a signed field (included in the digest) that carries a
human-readable key label. It is NOT the key-identity check. A verifier must not
accept an entry based solely on a matching signer_key_id — it must verify the
cryptographic signature against the pinned public key.
signer_key_id is useful for log browsing and key-rotation audit trails (you
can see which key label was in use for a given run), but it does not constitute
a security boundary by itself. Key rotations are detectable because signer_key_id
changes, and chain continuity is preserved because prior_hash still links
correctly across key changes.
Out-of-band key distribution¶
Public keys must be distributed through a channel independent of the sigchain
(e.g., a corporate PKI, a hardware attestation, or a separately signed manifest).
A sigchain is self-describing but NOT self-authenticating with respect to key
identity — a forger who controls the chain can also control signer_key_id.
Signature Encoding¶
| Property | Value |
|---|---|
| Algorithm | Ed25519 (RFC 8032) |
| Signed input | SHA3-256 (FIPS 202) of DOMAIN_PREFIX + rfc8785.dumps(19 signing fields) |
| Signature encoding | base64url without padding (RFC 4648 §5) |
| Public key format | SubjectPublicKeyInfo PEM (32-byte Ed25519 raw key) |
| ML-DSA algorithm | ML-DSA-65 (CRYSTALS-Dilithium, FIPS 204 draft) |
| ML-DSA input | DOMAIN_PREFIX + rfc8785.dumps(19 signing fields) (representative, not its hash) |
| ML-DSA public key encoding | Hex (stored as mldsa65_pub) |
Verifiable-Log Layer¶
Merkle tree structure¶
Aevum implements a RFC 6962-style Merkle tree over the append-only sigchain using SHA3-256 (consistent with the chain hash algorithm).
Leaf hash: sha3_256(0x00 || bytes.fromhex(hash_event_for_chain(event)))
Node hash: sha3_256(0x01 || left_child_hash || right_child_hash)
Empty tree root: sha3_256(b"") (MTH of 0 entries, per RFC 6962)
Tree shape: complete binary tree; split point k = 1 << ((n-1).bit_length()-1)
(largest power of 2 less than n), per RFC 6962 §2.1.
Signed Tree Head (STH)¶
A Signed Tree Head commits to the full log state: tree_size, root_hash,
timestamp (wall-clock), signer_key_id, and key_scheme.
STHs use their own domain prefix b"aevum-sth-v1\x00" — intentionally distinct
from b"aevum-sigchain-v1\x00" — providing type-level cross-domain separation:
an entry signature cannot verify as an STH signature and vice versa.
Signing follows the same hybrid pattern as entry signing:
- Ed25519 signs sha3_256(b"aevum-sth-v1\x00" + rfc8785.dumps(sth_fields))
- ML-DSA-65 (if configured) signs the representative directly
Inclusion proofs¶
An inclusion proof for entry at index leaf_index in a tree of size tree_size
is a list of sibling hashes that allows a verifier to reconstruct the root hash
from the leaf hash alone, per RFC 6962 §2.1.1.
The verifier re-derives the leaf hash from hash_event_for_chain(event) and
walks the proof path to the root, checking that the computed root matches the
STH root.
Consistency proofs¶
A consistency proof between two tree states (old tree of size first, new tree
of size second) allows a verifier to confirm that the new tree is an extension
of the old tree with no edits, per RFC 6962 §2.1.2.
TSA anchoring¶
The Signed Tree Head carries an optional tsa_token (hex-encoded RFC 3161 DER
bytes) timestamped over the 32-byte Merkle root by an external Timestamp
Authority. This is independent of the self-asserted timestamp field — both
claims coexist in the STH.
TSA cert-chain validation: full validation of the TSA certificate chain
against a pinned TSA root is the responsibility of the standalone verifier.
The aevum-core library checks only that the message imprint in the token
matches the Merkle root — it does not perform cert-chain trust evaluation.
A TSA outage never blocks STH production: if the TSA request fails, the STH is
issued without a timestamp token and tsa_token is null.
GENESIS_HASH constant¶
import hashlib
GENESIS_HASH = hashlib.sha3_256(b"aevum:genesis").hexdigest()
# "391f6bd6d761cb9af9e924d015a6fc18e9d236c965c3e5deda1145a25e11cf5e"
Used as prior_hash for the first event in every chain.