Concrete example: a publisher opt-out, end to end
Walk one specific scenario from submission to lab-side acknowledgment, so a technical evaluator can see exactly what code paths run, what artifacts each step produces, and what's verifiable independently.
Framing — read this first: Part of what's described below (per-record Ed25519 signing, canonical-payload Arweave anchoring, brand-neutral core packages, public verify) already runs in production for Stelais today. Part of what's described (the publisher DNS challenge, the Merkle batcher, the lab-facing lookup endpoint, the opt-out-specific canonical schema) is the registry-layer extension built on top — designed, not shipped. The doc calls out which is which inline. The point of walking one concrete example is to make the boundary between substrate and extension visible, not to obscure it.
The scenario
example-publisher.com publishes long-form journalism. They want to register a domain-wide opt-out: "do not train any model on content fetched from this domain." They want the opt-out to be:
- Verifiable independently by any AI lab without trusting Akaeon.
- Provably timestamped — they need to point at a record predating a training cutoff.
- Auditable in both directions — the publisher wants proof their request was honored; the lab wants proof they checked the registry at training time.
A compliance engineer at examplelabs.ai will call the registry's lookup API during their data-ingestion pipeline. They need an answer they can put in their audit log that survives later challenge.
This document walks every step from publisher submission to lab acknowledgment.
Step 1 — Publisher submits the opt-out
[Registry extension. Not present in current codebase.]
The publisher's automation hits:
POST https://api.akaeon-registry.com/v1/optouts
Authorization: Bearer <publisher-api-key>
Content-Type: application/json
{
"domain": "example-publisher.com",
"policy": "no-training",
"scope": "domain",
"effective_from": "2026-05-11T00:00:00Z"
}
The handler at services/akaeon-registry/src/routes/optouts.ts (does not yet exist) does four things in order:
1. Validates the request shape. Rejects unknown policy values or malformed domains.
2. Issues a DNS challenge token: a random 32-byte nonce, base64-encoded, stored against the pending submission with a 24h TTL.
3. Persists the pending row in a pending_optouts table. Nothing is signed, nothing is anchored. The submission is provisional until DNS verifies.
4. Returns 202 Accepted with the challenge:
{
  "status": "pending_dns_verification",
  "submission_id": "01J9XW...",
  "challenge": {
    "record_name": "_akaeon-registry-challenge.example-publisher.com",
    "record_type": "TXT",
    "record_value": "akaeon-registry-v1=<32-byte-nonce-b64>",
    "expires_at": "2026-05-12T14:23:00Z"
  }
}
Where this lives architecturally: top band of the architecture diagram — the registry's API surface, peer to Stelais's API surface.
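Since the handler does not exist yet, the following is only a minimal sketch of what the validation and challenge-issuance steps might look like. The type names, the `KNOWN_POLICIES` set, the accepted `scope` value, and the deliberately conservative ASCII-only domain pattern are all assumptions made for illustration; only the record name, the `akaeon-registry-v1=` value prefix, the 32-byte nonce, and the 24h TTL come from the description above.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical shapes, inferred from the example request and response above.
type OptoutRequest = { domain: string; policy: string; scope: string; effective_from: string };
type DnsChallenge = { record_name: string; record_type: "TXT"; record_value: string; expires_at: string };

const KNOWN_POLICIES = new Set(["no-training"]); // assumption: the only policy shown in this doc
const CHALLENGE_TTL_MS = 24 * 60 * 60 * 1000; // 24h TTL from step 1

// Deliberately simplistic: lowercase ASCII labels only, no IDN handling.
const DOMAIN_RE = /^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$/;

// Returns a list of problems; an empty list means the request shape is acceptable.
function validateOptoutRequest(body: OptoutRequest): string[] {
  const errors: string[] = [];
  if (!DOMAIN_RE.test(body.domain)) errors.push("malformed domain");
  if (!KNOWN_POLICIES.has(body.policy)) errors.push("unknown policy");
  if (body.scope !== "domain") errors.push("unknown scope");
  if (Number.isNaN(Date.parse(body.effective_from))) errors.push("malformed effective_from");
  return errors;
}

// Builds the TXT challenge the publisher must place in DNS.
function issueDnsChallenge(domain: string, now: Date): DnsChallenge {
  const nonce = randomBytes(32).toString("base64");
  return {
    record_name: `_akaeon-registry-challenge.${domain}`,
    record_type: "TXT",
    record_value: `akaeon-registry-v1=${nonce}`,
    expires_at: new Date(now.getTime() + CHALLENGE_TTL_MS).toISOString(),
  };
}
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the expiry computation easy to test.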
Step 2 — DNS challenge verification
[Registry extension. Not present in current codebase.]
The publisher's ops team adds the TXT record at their DNS provider.
A worker in services/akaeon-registry/src/workers/dnsVerify.ts (does not yet exist) polls pending submissions at a fixed interval. For each pending row:
- Resolves the _akaeon-registry-challenge.<domain> TXT record against multiple resolvers (Cloudflare 1.1.1.1, Google 8.8.8.8, a third) to defeat single-resolver poisoning.
- Confirms each resolver returns a record matching the expected nonce.
- On match: marks the submission dns_verified and enqueues it for anchoring. Records the DNSSEC chain (where present) into the submission row.
- On no-match or partial-match: leaves the submission pending. On TTL expiry: deletes.
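The multi-resolver check could be sketched roughly as follows, using Node's built-in `node:dns/promises` resolver. This is an illustration under stated assumptions, not the worker itself: the function names are hypothetical, Quad9 (9.9.9.9) is an assumed stand-in for the unnamed third resolver, and DNSSEC chain capture is left out because the standard resolver API does not expose it.

```typescript
import { Resolver } from "node:dns/promises";

// Cloudflare and Google come from the step above; Quad9 as the third is an assumption.
const RESOLVER_IPS = ["1.1.1.1", "8.8.8.8", "9.9.9.9"];

// Query one resolver. TXT records arrive as arrays of chunks that must be joined.
async function resolveTxtVia(resolverIp: string, name: string): Promise<string[]> {
  const resolver = new Resolver();
  resolver.setServers([resolverIp]);
  try {
    const records = await resolver.resolveTxt(name);
    return records.map((chunks) => chunks.join(""));
  } catch {
    return []; // NXDOMAIN, timeout, etc. all count as "no record seen"
  }
}

// Pure decision step: every resolver must have seen the expected value.
// A partial match (some resolvers only) is treated as no match.
function allResolversMatch(perResolver: string[][], expected: string): boolean {
  return perResolver.length > 0 && perResolver.every((records) => records.includes(expected));
}

// Resolves the challenge record on all resolvers and applies the decision step.
async function checkChallenge(domain: string, expected: string): Promise<boolean> {
  const name = `_akaeon-registry-challenge.${domain}`;
  const perResolver = await Promise.all(RESOLVER_IPS.map((ip) => resolveTxtVia(ip, name)));
  return allResolversMatch(perResolver, expected);
}
```

Keeping the match decision in a pure function separates the network behavior from the rule that a partial match leaves the submission pending.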
Why DNS and not OAuth / certificate-based auth? DNS control is the existing, durable proof-of-domain-authority that every publisher already operates. OAuth ties the registry to a centralized identity provider; cert auth requires the publisher to manage a separate keypair the registry doesn't yet have a story for. DNS gets the job done with zero new infrastructure on the publisher side, and it is the same primitive that ACME DNS-01 challenges (as used by Let's Encrypt) and Google Search Console domain verification already rely on.
Step 3 — Build the canonical opt-out record
[Substrate primitive exists in production today. Registry-specific schema is the extension.]
Once dns_verified, the registry constructs the canonical record. This step is the first place the registry touches the brand-neutral core packages that Stelais already uses in production.
The opt-out canonical payload (new, registry-specific schema — likely buildOptoutCanonicalPayload added to core/arweave/src/canonicalPayload.ts alongside the existing buildCanonicalPayload and buildSnapshotCanonicalPayload):
{
"version": 1,
"type": "domain_optout",
"submission_id": "01J9XW...",
"domain": "example-publisher.com",
"policy": "no-training",
"scope": "domain",
"effective_from": "2026-05-11T00:00:00Z",
"submitted_at": "2026-05-11T14:23:00Z",
"dns_verified_at": "2026-05-11T14:31:00Z",
"dns_challenge_record_sha256": "<sha256-of-the-txt-record-value-as-resolved>",
"publisher_account_id": "01J9XW...",
"app": "akaeon-registry",
"network": "arweave"
}
Note the app and network fields. Same pattern as Stelais's existing buildCanonicalPayload — both are caller-supplied with no defaults. The conscious-choice rule makes this clean: the registry passes 'akaeon-registry' and 'arweave' deliberately; nothing else assumes a brand.
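A possible shape for the opt-out builder is sketched below. It is not the existing core code: the input type, the `dns_challenge_record_value` input field, and the sorted-key JSON serialization in `canonicalBytes` are assumptions made for illustration. What it is meant to show is the "caller-supplied, no defaults" treatment of `app` and `network`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape; output field names follow the example payload above.
type OptoutPayloadInput = {
  submission_id: string;
  domain: string;
  policy: string;
  scope: string;
  effective_from: string;
  submitted_at: string;
  dns_verified_at: string;
  dns_challenge_record_value: string; // the TXT value as resolved (hashed, not stored)
  publisher_account_id: string;
  app: string; // caller-supplied, deliberately no default
  network: string; // caller-supplied, deliberately no default
};

function buildOptoutCanonicalPayload(input: OptoutPayloadInput): Record<string, string | number> {
  if (!input.app || !input.network) {
    throw new Error("app and network must be supplied explicitly by the caller");
  }
  const { dns_challenge_record_value, ...rest } = input;
  return {
    version: 1,
    type: "domain_optout",
    ...rest,
    dns_challenge_record_sha256: createHash("sha256")
      .update(dns_challenge_record_value, "utf8")
      .digest("hex"),
  };
}

// Assumption: canonical bytes are UTF-8 JSON with top-level keys in sorted order.
// An array replacer makes JSON.stringify emit keys in the array's order (flat objects only).
function canonicalBytes(payload: Record<string, string | number>): Buffer {
  return Buffer.from(JSON.stringify(payload, Object.keys(payload).sort()), "utf8");
}
```

Whatever serialization the technical specification settles on, it has to be deterministic, because the hash of these bytes is what ends up as a Merkle leaf in step 5.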
Step 4 — Sign the record
[Substrate primitive exists in production today. Used as-is.]
The registry signs the canonical record with its own Ed25519 keypair. The signing flow is the same code Stelais runs in production — same createCanonicalMessageBuilder, same ed25519Sign, same key encryption shape.
The brand-coupling lives in the prefix the registry closes over:
// services/akaeon-registry/src/lib/registrySigning.ts (does not yet exist)
import { createCanonicalMessageBuilder } from '@akaeon/core-verification'
const builder = createCanonicalMessageBuilder({
prefix: 'akaeon-registry:optout:v1',
})
// For a domain opt-out, the (userId, fileHashHex) signature of the existing
// API doesn't quite fit. The registry adds a thin wrapper that builds an
// equivalent canonical message for the opt-out shape:
//
// akaeon-registry:optout:v1|<submission_id>|<domain>|<policy>
//
// either by extending the core builder or by calling the underlying
// `${prefix}|<a>|<b>|<c>` shape directly. Decision point: do we generalize
// the core builder to accept variable-arity components, or does the registry
// own this wrapper locally? See "Open architectural questions" below.
The signing key itself: a service keypair owned by the registry, not the publisher. (The publisher's identity is established by DNS in step 2; the signature here is the registry's attestation that it observed the verified submission at this moment.)
The signed canonical message and Ed25519 signature attach to the canonical record. Both are written into the bundle in step 5.
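To make the signing step concrete without depending on the core package, here is a rough sketch using only Node's built-in `crypto` module. It is not the `@akaeon/core-verification` code path: the function names are hypothetical, and the throwaway keypair stands in for the registry's encrypted service key. The message shape and the output field names follow the examples in this document.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

const PREFIX = "akaeon-registry:optout:v1";

// Builds akaeon-registry:optout:v1|<submission_id>|<domain>|<policy>
function buildOptoutMessage(submissionId: string, domain: string, policy: string): string {
  return [PREFIX, submissionId, domain, policy].join("|");
}

// Ed25519 takes no separate digest algorithm, hence the null first argument.
function signOptout(message: string, privateKey: KeyObject) {
  const signature = sign(null, Buffer.from(message, "utf8"), privateKey);
  return {
    canonical_message: message,
    signature: signature.toString("base64"),
    signature_scheme: "ed25519",
    version: "v1",
  };
}

// The raw 32-byte public key is the tail of the 44-byte DER SPKI encoding.
function rawPublicKeyBase64(publicKey: KeyObject): string {
  const der = publicKey.export({ format: "der", type: "spki" });
  return der.subarray(der.length - 32).toString("base64");
}

// Throwaway key for illustration only; a real service key would be loaded from storage.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const signed = signOptout(
  buildOptoutMessage("01J9XW...", "example-publisher.com", "no-training"),
  privateKey,
);
const roundTripOk = verify(
  null,
  Buffer.from(signed.canonical_message, "utf8"),
  publicKey,
  Buffer.from(signed.signature, "base64"),
);
```

Because `|` is the field separator, none of the components may contain it; domains and the policy enum satisfy that naturally, but a production builder would probably want to enforce it.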
A sharp prospect will ask: "Why isn't the publisher signing this?"
Answer: they could, and a future version may add it as an optional second signature. For v1, the registry's signature is the load-bearing one because the registry's promise to the lab is "we verified this submission via DNS at this time." The publisher's signature would prove "the publisher intended this opt-out" — useful, not strictly necessary, and requires the publisher to manage a keypair the DNS-based flow deliberately avoids.
Step 5 — Batch into the next anchor
[Registry extension. Not present in current codebase — but the underlying anchoring path is what Stelais uses today, with a Merkle layer added.]
Stelais today anchors one Arweave transaction per proof — the performAnchor path calls uploadProof or performTurboAnchor with a single canonical payload. That works for creator proofs because the volume is bounded by creator activity (each user has a daily quota).
The registry can't use the same one-tx-per-record model because publisher opt-outs arrive at much higher volume (every domain on the public internet is a potential submission). So the registry adds a batching layer:
- Accumulate verified opt-out canonical records in an in-memory or Redis-backed queue.
- At a fixed cadence (initial design: hourly, configurable), close the current batch.
- Build a Merkle tree over the canonical record hashes: SHA-256 of each canonical payload as the leaves, standard binary Merkle, leaf-pair concatenation with a 0x00 or 0x01 prefix byte to prevent second-preimage attacks (RFC 6962-style).
- Build the batch canonical payload: same shape pattern as the existing core builders, with a merkle_root_sha256_hex, the batch's started_at / closed_at, and the count of leaves. Importantly, the list of leaf hashes is not in the on-chain payload; only the root is. The leaves are stored in the registry's database, addressable by submission_id, and served to anyone who needs an inclusion proof.
- Anchor the root via the same performAnchor path Stelais uses, with the same preflight checks (kill switch, cost cap, daily/monthly budget). One Arweave transaction per batch, not per opt-out.
- Update each submission row with arweave_tx_id, leaf_index, merkle_proof (the sibling hashes the verifier needs to recompute the root from this leaf).
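The tree-building and proof-extraction steps above might be sketched like this. Several choices here are assumptions rather than settled design: the function names are hypothetical, the input is a list of already-computed leaf hashes, interior nodes use the 0x01 prefix that the step 8 verifier expects, and an unpaired node at the end of a level is paired with itself. That last choice is not how RFC 6962 handles unbalanced trees; it is used here only because it keeps the simple index-parity verifier in step 8 valid for every leaf.

```typescript
import { createHash } from "node:crypto";

const sha256 = (data: Buffer): Buffer => createHash("sha256").update(data).digest();

// Interior node hash with the 0x01 domain-separation prefix.
const nodeHash = (left: Buffer, right: Buffer): Buffer =>
  sha256(Buffer.concat([Buffer.from([0x01]), left, right]));

// Returns every level of the tree, leaves first, root last.
function buildLevels(leafHashes: Buffer[]): Buffer[][] {
  if (leafHashes.length === 0) throw new Error("cannot build a tree over zero leaves");
  const levels: Buffer[][] = [leafHashes];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next: Buffer[] = [];
    for (let i = 0; i < prev.length; i += 2) {
      // Assumption: an unpaired last node is hashed together with itself.
      const right = i + 1 < prev.length ? prev[i + 1] : prev[i];
      next.push(nodeHash(prev[i], right));
    }
    levels.push(next);
  }
  return levels;
}

// Sibling hashes (hex) from the leaf level up to, but not including, the root.
function inclusionProof(levels: Buffer[][], leafIndex: number): string[] {
  const proof: string[] = [];
  let index = leafIndex;
  for (const level of levels.slice(0, -1)) {
    const sibling = index ^ 1;
    proof.push((sibling < level.length ? level[sibling] : level[index]).toString("hex"));
    index >>= 1;
  }
  return proof;
}
```

Self-pairing makes two different leaf lists (one ending in a duplicated leaf) hash to the same root, which is one reason the leaf count belongs in the anchored batch payload alongside the root.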
A successful batch produces one Arweave transaction id that fixes the position of every opt-out in that batch in public, third-party-operated time. The publisher's submission is now provably anchored as of a public network timestamp — the property that makes it useful as evidence against a training cutoff.
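For illustration, the batch payload that carries the root on-chain could look something like the sketch below. The field names `merkle_root_sha256_hex`, `started_at`, `closed_at`, `app`, and `network` come from this document; `leaf_count` mirrors the webhook example in step 6; the `type` value, the `version`, and the argument names are assumptions.

```typescript
// Hypothetical batch payload shape. Note that it carries no leaf hashes.
type BatchCanonicalPayload = {
  version: 1;
  type: "optout_batch"; // assumed type tag
  merkle_root_sha256_hex: string;
  leaf_count: number;
  started_at: string;
  closed_at: string;
  app: string;
  network: string;
};

function buildBatchCanonicalPayload(args: {
  merkleRootHex: string;
  leafCount: number;
  startedAt: string;
  closedAt: string;
  app: string; // caller-supplied, no default
  network: string; // caller-supplied, no default
}): BatchCanonicalPayload {
  if (!/^[0-9a-f]{64}$/.test(args.merkleRootHex)) throw new Error("root must be 64 lowercase hex chars");
  if (!Number.isInteger(args.leafCount) || args.leafCount < 1) throw new Error("leaf count must be a positive integer");
  if (!(Date.parse(args.startedAt) <= Date.parse(args.closedAt))) throw new Error("batch must close at or after it starts");
  if (!args.app || !args.network) throw new Error("app and network must be supplied explicitly");
  return {
    version: 1,
    type: "optout_batch",
    merkle_root_sha256_hex: args.merkleRootHex,
    leaf_count: args.leafCount,
    started_at: args.startedAt,
    closed_at: args.closedAt,
    app: args.app,
    network: args.network,
  };
}
```

The payload size is constant regardless of how many opt-outs the batch holds, which is what makes one transaction per batch cheap.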
Note on Merkle vs. the snapshot anchor today. Stelais already has a parallel structure for infringement snapshots using buildSnapshotCanonicalPayload. That path is per-snapshot, not Merkle-batched, because the volume is bounded by user-initiated capture. The registry's higher volume is the reason to add Merkle; the substrate underneath is the same.
Step 6 — Registry returns acknowledgment to the publisher
[Registry extension. Not present in current codebase.]
A webhook fires (or the publisher polls) once the batch lands:
POST <publisher_webhook_url>
Content-Type: application/json
X-Akaeon-Signature: <hmac-sha256-over-body>
{
"submission_id": "01J9XW...",
"status": "anchored",
"anchored_at": "2026-05-11T15:00:00Z",
"arweave_tx_id": "ABC123...",
"arweave_url": "https://arweave.net/ABC123...",
"batch": {
"merkle_root_sha256_hex": "f3a9...",
"leaf_index": 142,
"leaf_count": 2814
},
"verify_url": "https://api.akaeon-registry.com/v1/optouts/01J9XW.../verify"
}
The publisher's audit log now contains a row pointing at a public Arweave transaction id. They can verify the registry's claim against the public network without trusting Akaeon further.
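On the publisher side, checking the `X-Akaeon-Signature` header before trusting the webhook might look like the sketch below. The header encoding (hex) and the existence of a per-publisher shared secret are assumptions, since the document only says the header is an HMAC-SHA256 over the body. The function name is hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies an HMAC-SHA256 over the raw request body.
// Assumptions: hex-encoded header value, shared secret provisioned out of band.
function verifyWebhookSignature(rawBody: string, headerValue: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(headerValue, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The HMAC has to be computed over the bytes exactly as received, before any JSON parsing, because re-serializing the body can change whitespace or key order and break the comparison.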
Step 7 — Lab calls the lookup endpoint
[Registry extension. Not present in current codebase.]
This is the moment the registry's value is realized. A compliance engineer at examplelabs.ai is about to ingest content from example-publisher.com. Their pipeline issues:
GET https://api.akaeon-registry.com/v1/lookup?domain=example-publisher.com
Authorization: Bearer <lab-api-key>
The registry responds with the full bundle the lab needs to put in their audit log:
{
"domain": "example-publisher.com",
"lookup_at": "2026-05-12T09:14:00Z",
"optouts": [
{
"submission_id": "01J9XW...",
"canonical_record": {
"version": 1,
"type": "domain_optout",
"domain": "example-publisher.com",
"policy": "no-training",
"scope": "domain",
"effective_from": "2026-05-11T00:00:00Z",
"submitted_at": "2026-05-11T14:23:00Z",
"dns_verified_at": "2026-05-11T14:31:00Z",
"dns_challenge_record_sha256": "...",
"app": "akaeon-registry",
"network": "arweave"
},
"registry_signature": {
"canonical_message": "akaeon-registry:optout:v1|01J9XW...|example-publisher.com|no-training",
"signature": "<base64-ed25519>",
"public_key": "<base64-32-byte-raw>",
"signature_scheme": "ed25519",
"version": "v1"
},
"merkle_inclusion": {
"leaf_hash": "<sha256-of-canonical-record>",
"leaf_index": 142,
"merkle_proof": [
"<sibling-hash-level-0>",
"<sibling-hash-level-1>",
"..."
],
"merkle_root": "f3a9...",
"arweave_tx_id": "ABC123...",
"arweave_url": "https://arweave.net/ABC123..."
}
}
]
}
The lab's pipeline records this entire response, verbatim, in their audit trail before deciding whether to ingest the URL.
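A lab-side sketch of "look up, then record verbatim" is shown below. The endpoint URL and bearer-token header come from the request above; the audit entry shape, the function names, and the `append` callback are hypothetical stand-ins for whatever audit store the lab already runs. It assumes a runtime with a global `fetch` (Node 18+).

```typescript
import { createHash } from "node:crypto";

const LOOKUP_BASE = "https://api.akaeon-registry.com/v1/lookup";

function buildLookupUrl(domain: string): string {
  const url = new URL(LOOKUP_BASE);
  url.searchParams.set("domain", domain);
  return url.toString();
}

// Hypothetical audit row: the exact response text plus a digest of it.
type AuditEntry = {
  recorded_at: string;
  request_url: string;
  response_body: string;
  response_sha256_hex: string;
};

function makeAuditEntry(requestUrl: string, rawBody: string, now: Date): AuditEntry {
  return {
    recorded_at: now.toISOString(),
    request_url: requestUrl,
    response_body: rawBody,
    response_sha256_hex: createHash("sha256").update(rawBody, "utf8").digest("hex"),
  };
}

// Fetches the lookup response and records it before any ingestion decision is made.
async function lookupAndRecord(
  domain: string,
  apiKey: string,
  append: (entry: AuditEntry) => Promise<void>,
): Promise<AuditEntry> {
  const requestUrl = buildLookupUrl(domain);
  const res = await fetch(requestUrl, { headers: { Authorization: `Bearer ${apiKey}` } });
  if (!res.ok) throw new Error(`lookup failed with HTTP ${res.status}`);
  const rawBody = await res.text(); // keep the exact text, not a re-serialized object
  const entry = makeAuditEntry(requestUrl, rawBody, new Date());
  await append(entry);
  return entry;
}
```

Storing the response text rather than a parsed object means the signature and Merkle checks in step 8 can later be rerun against exactly what the registry returned.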
Step 8 — Lab independently verifies the chain
[Substrate primitive exists in production today. Used as-is, plus Merkle proof check.]
This is the most important step in the entire walkthrough. Everything up to here could be a registry making things up. Step 8 is where the lab confirms it isn't.
The lab's audit verifier runs three independent checks, none of which call back to Akaeon:
// lab-side-verify.mjs: runs in the lab's environment, no Akaeon code.
// Usage: node lab-side-verify.mjs <path-to-recorded-lookup-response.json>
import crypto from 'node:crypto'
import { readFile } from 'node:fs/promises'
// The registry's lookup response, exactly as recorded in the audit log.
const lookup = JSON.parse(await readFile(process.argv[2], 'utf8'))
// The signature and inclusion proof live on each entry of `optouts`, not on
// the top-level response. This verifies the first entry; repeat for the rest.
const bundle = lookup.optouts[0]
// === Check 1: Ed25519 signature on the canonical record =================
//
// This is the exact same verification the Stelais public verify endpoint
// uses today: same 32-byte raw Ed25519 pubkey wrapped in DER SPKI, same
// crypto.verify call.
const SPKI_HEADER = Buffer.from('302a300506032b6570032100', 'hex')
const pubkeyDer = Buffer.concat([
  SPKI_HEADER,
  Buffer.from(bundle.registry_signature.public_key, 'base64'),
])
const pubKey = crypto.createPublicKey({ key: pubkeyDer, format: 'der', type: 'spki' })
const sigOk = crypto.verify(
  null, // Ed25519 takes no separate digest algorithm
  Buffer.from(bundle.registry_signature.canonical_message, 'utf8'),
  pubKey,
  Buffer.from(bundle.registry_signature.signature, 'base64'),
)
if (!sigOk) throw new Error('registry signature invalid')
// === Check 2: Merkle inclusion proof rolls up to the claimed root =======
//
// Interior nodes are SHA-256(0x01 || left || right). The index parity says
// whether the running hash is the left or the right child at each level.
// This simple walk assumes the proof supplies one sibling per level; an RFC
// 6962 tree with a non-power-of-two leaf count also needs the leaf count.
let computed = Buffer.from(bundle.merkle_inclusion.leaf_hash, 'hex')
let index = bundle.merkle_inclusion.leaf_index
for (const sibHex of bundle.merkle_inclusion.merkle_proof) {
  const sib = Buffer.from(sibHex, 'hex')
  const concat = (index & 1) === 0
    ? Buffer.concat([Buffer.from([0x01]), computed, sib])
    : Buffer.concat([Buffer.from([0x01]), sib, computed])
  computed = crypto.createHash('sha256').update(concat).digest()
  index >>= 1
}
if (computed.toString('hex') !== bundle.merkle_inclusion.merkle_root) {
  throw new Error('merkle proof does not reconstruct claimed root')
}
// === Check 3: The claimed root actually appears on Arweave ==============
//
// Uses the global fetch available in Node 18+ against arweave.net.
const arweaveRes = await fetch(bundle.merkle_inclusion.arweave_url)
if (!arweaveRes.ok) throw new Error(`arweave fetch failed: HTTP ${arweaveRes.status}`)
const arweaveBody = await arweaveRes.json()
if (arweaveBody.merkle_root_sha256_hex !== bundle.merkle_inclusion.merkle_root) {
  throw new Error('arweave-anchored root does not match claimed root')
}
console.log('VERIFIED: opt-out is signed by the registry, included in a batch, and that batch is anchored on Arweave')
The three checks correspond to the three trust claims:
| Check | Trust claim |
|---|---|
| 1. Ed25519 signature on canonical record | "The registry actually attested to this opt-out." |
| 2. Merkle inclusion proof reconstructs the claimed root | "This specific opt-out was part of the claimed batch." |
| 3. Arweave transaction body contains the claimed root | "The batch root was actually anchored, at the public-network timestamp." |
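One cross-check a cautious verifier may want in addition: the signature in check 1 covers canonical_message, so it is worth confirming that the signed message names the same submission, domain, and policy as the record it accompanies, and that the domain is the one the lab actually asked about. The sketch below assumes the pipe-delimited message shape from step 4. The type and function names are hypothetical. Recomputing leaf_hash from canonical_record is deliberately left out, because it depends on the canonical serialization defined in the technical specification.

```typescript
// Only the fields this cross-check reads from one entry of `optouts`.
type LookupEntry = {
  submission_id: string;
  canonical_record: { domain: string; policy: string };
  registry_signature: { canonical_message: string };
};

const EXPECTED_PREFIX = "akaeon-registry:optout:v1";

// True when the signed message, the record, and the requested domain all agree.
function messageMatchesRecord(entry: LookupEntry, requestedDomain: string): boolean {
  const parts = entry.registry_signature.canonical_message.split("|");
  if (parts.length !== 4) return false;
  const [prefix, submissionId, domain, policy] = parts;
  return (
    prefix === EXPECTED_PREFIX &&
    submissionId === entry.submission_id &&
    domain === entry.canonical_record.domain &&
    domain === requestedDomain &&
    policy === entry.canonical_record.policy
  );
}
```

Without a check along these lines, a validly signed message for one domain could be presented next to a record for another and still pass check 1.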
The lab's audit log now contains:
- A timestamped record of the lookup.
- A bundle whose every claim is independently checkable.
- An Arweave transaction id that fixes the publisher's opt-out in public time.
In a later compliance review, the lab can answer any challenge ("you trained on example-publisher.com after they opted out") by producing the Arweave tx and the inclusion proof. The burden of proof inverts: the challenger has to disprove a public-network timestamp.
The boundary between substrate and extension
The point of walking this in detail is to make explicit which parts of the chain are existing production infrastructure and which are registry-layer work yet to be built. The table below maps every step:
| Step | Substrate (exists today, Stelais runs it in production) | Registry extension (not yet built) |
|---|---|---|
| 1. Publisher submits | — | POST /v1/optouts endpoint, DNS challenge issuance |
| 2. DNS challenge verify | — | Multi-resolver DNS worker, challenge polling |
| 3. Canonical record | core-arweave payload pattern (app, network brand-neutral) | Opt-out-specific buildOptoutCanonicalPayload schema |
| 4. Sign | core-verification (createCanonicalMessageBuilder, ed25519Sign, key encryption) — production today | Registry's keypair + brand prefix (akaeon-registry:optout:v1) |
| 5. Batch + anchor | performAnchor path + preflight + spend logging — production today | Merkle tree builder; per-batch (not per-record) anchor cadence |
| 6. Publisher ack | — | Webhook + HMAC; or polling endpoint |
| 7. Lab lookup | — | GET /v1/lookup endpoint, bundle assembly |
| 8. Lab verifies | The Ed25519 verify path is identical to the existing Stelais public verify endpoint (same library, same RFC 8032) | The Merkle inclusion check is new; standard library, no special tooling |
What the table makes clear: the cryptographic spine — signing, canonical-payload anchoring, brand-neutral core packages, public Arweave trust root — is what Stelais runs every day in production. The registry's new code is the API surface (steps 1, 2, 6, 7) and the Merkle batching glue (step 5's tree builder). Those are the parts that need to be built; they sit on top of a substrate that already exists.
Open architectural questions
Things that need a call before any of the registry-extension code gets written. These are surfaced explicitly because the answers will shape the v1 codebase.
1. Canonical message arity. The current createCanonicalMessageBuilder takes a fixed (userId, fileHashHex) signature, producing ${prefix}|${userId}|${fileHashHex}. The opt-out flow wants three components after the prefix (submission_id, domain, policy). Decision: extend core to accept variable-arity components, or build a registry-local wrapper that calls a lower-level signing primitive directly? Either works; the trade-off is whether the registry's canonical-message shape lives in core (re-usable, but slightly couples core to the registry's needs) or in the registry workspace (clean separation, but two near-identical message builders).
2. Merkle batching primitive location. Does core-arweave grow a buildBatchCanonicalPayload + Merkle helper, or does the registry own its own batcher? Core today is brand-neutral and pure; adding Merkle doesn't violate that, but it's worth a deliberate call about whether batching is a substrate concern (yes if a second consumer will also need it; no if it's registry-specific for the foreseeable future).
3. Publisher keypair, optional or required. The v1 design above doesn't require the publisher to sign anything beyond the DNS challenge. Future versions could let publishers register their own Ed25519 public key alongside the domain and sign their submissions, which would produce a two-signature record (publisher's intent + registry's verification). Useful for higher-assurance opt-outs (e.g. opt-outs from rightsholders rather than domain owners). Out of scope for v1.
4. What gets anchored vs. what gets stored. As designed above, the registry would anchor only the Merkle root on-chain (the leaves stay in Postgres). This is cheap and fast but means the registry is the only party that can produce inclusion proofs. An alternative: anchor a manifest that includes all leaf hashes in the on-chain payload, so any third party with the Arweave tx can list every opt-out in the batch without calling the registry. The cost is larger on-chain payloads (still well under Arweave's per-tx limits for hourly batch sizes in the thousands). Trade-off: registry as essential mediator vs. registry as purely informational layer.
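To make question 1 more tangible, here is one way a variable-arity builder could look, whichever workspace it ends up in. This is a sketch of the option, not the existing `createCanonicalMessageBuilder`: the function name, the `arity` option, the separator check, and the example prefix for the two-component case are all hypothetical.

```typescript
// Returns a builder that joins a fixed number of components after the prefix.
function createVariadicMessageBuilder(opts: { prefix: string; arity: number }) {
  return (...components: string[]): string => {
    if (components.length !== opts.arity) {
      throw new Error(`expected ${opts.arity} components, got ${components.length}`);
    }
    for (const component of components) {
      // "|" is the field separator, so it must never appear inside a field.
      if (component.length === 0 || component.includes("|")) {
        throw new Error("components must be non-empty and must not contain '|'");
      }
    }
    return [opts.prefix, ...components].join("|");
  };
}

// Two-component shape, analogous to (userId, fileHashHex); the prefix is a placeholder.
const proofMessage = createVariadicMessageBuilder({ prefix: "example-app:proof:v1", arity: 2 });

// Three-component opt-out shape from step 4.
const optoutMessage = createVariadicMessageBuilder({ prefix: "akaeon-registry:optout:v1", arity: 3 });
```

Fixing the arity per builder keeps the existing two-component callers as strict as they are today while letting the registry declare its own three-component shape.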
These four questions are exactly the kind of architectural choice to surface explicitly before implementation. They're noted here so the next person who picks this up reads them before writing any code.
Companion documents
- Technical specification — the normative cryptographic spec the steps above reference.
- Lab integration runbook — the engineering reference for step 7 and step 8 from the lab's side.
- Standalone verifier — the copy-paste-ready code for step 8.
- Architecture — the single picture that makes the relationship between Stelais, the registry, the core packages, and Arweave visible at a glance.