Incentives

Aurelius transforms alignment from a passive concern into an active market. It rewards agents not for saying the right thing, but for surfacing, verifying, and organizing evidence of when models fail.

By aligning emissions with epistemic value, the protocol creates an economy around adversarial discovery, reproducible verification, and continuous rubric governance.

Miner Incentives

Miners are rewarded for surfacing examples of misalignment: prompts that cause models to produce harmful, deceptive, biased, or otherwise unsafe behavior. Rewards scale with the following dimensions (a minimal scoring sketch follows the list):

Severity — how significant is the failure?

Novelty — has this failure mode been seen before?

Signal Richness — does the submission include helpful tags, reasoning traces, or interpretability metadata?

Validator Agreement — did auditors confirm the output as meaningful misalignment?
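
To make these dimensions concrete, here is a minimal sketch of how a submission score might combine them. The weights, field names, and the 50% agreement gate are illustrative assumptions, not the protocol's actual rubric.

```python
# Hypothetical scoring sketch -- weights and the agreement gate are assumed,
# not Aurelius's actual rubric.
from dataclasses import dataclass

@dataclass
class Submission:
    severity: float             # 0.0-1.0: how significant the failure is
    novelty: float              # 0.0-1.0: distance from known failure modes
    signal_richness: float      # 0.0-1.0: quality of tags, traces, metadata
    validator_agreement: float  # 0.0-1.0: fraction of auditors confirming

# Illustrative dimension weights; in Aurelius the Tribunate sets these.
WEIGHTS = {"severity": 0.35, "novelty": 0.30,
           "signal_richness": 0.15, "validator_agreement": 0.20}

def reward_score(s: Submission) -> float:
    """Weighted sum of the four dimensions, gated on validator agreement."""
    base = (WEIGHTS["severity"] * s.severity
            + WEIGHTS["novelty"] * s.novelty
            + WEIGHTS["signal_richness"] * s.signal_richness
            + WEIGHTS["validator_agreement"] * s.validator_agreement)
    # A submission most auditors reject earns nothing, however severe.
    return base if s.validator_agreement >= 0.5 else 0.0

print(reward_score(Submission(0.9, 0.7, 0.6, 0.8)))  # 0.775
```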

Reward Mechanics

Each miner submission is evaluated by multiple validators (a settlement sketch follows the lists below). If the output is:

Reproducible

Accurately scored

Confirmed as a failure

…then the miner receives:

Protocol emissions (e.g. dTAO)

Reputation gains

Priority access to future model evaluations
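
A minimal settlement sketch, assuming all three checks must pass before any reward is released. The field names and reward magnitudes are hypothetical, not protocol constants.

```python
# Illustrative three-part eligibility gate; field names and reward
# magnitudes are hypothetical.
from dataclasses import dataclass

@dataclass
class Verification:
    reproducible: bool       # validators could re-trigger the failure
    accurately_scored: bool  # the miner's self-score matched audit consensus
    confirmed_failure: bool  # auditors agreed it is meaningful misalignment

def settle(v: Verification, score: float) -> dict:
    """Release rewards only when all three checks pass."""
    if not (v.reproducible and v.accurately_scored and v.confirmed_failure):
        # Low-quality or unverifiable work earns nothing and can cost standing.
        return {"emissions": 0.0, "reputation_delta": -0.01, "priority": False}
    return {"emissions": score,         # share of protocol (dTAO) emissions
            "reputation_delta": +0.05,  # reputation grows with verified work
            "priority": True}           # earlier access to future evaluations
```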

Low-quality, repetitive, or unverifiable submissions receive no rewards and may reduce miner reputation over time.

Validator Incentives

Validators verify alignment failures. They earn rewards for faithfully auditing miner submissions using the rubric defined by the Tribunate.

Key incentive drivers (combined in the sketch after this list):

Scoring accuracy — agreement with validator consensus

Tag quality — correct classification of failure types

Evaluation depth — use of supporting comments or rationale

Participation rate — consistent, timely auditing of miner submissions
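
One way to blend these drivers into a single round score, as a sketch. The median-based consensus and the 50/20/10/20 weighting are assumptions; the Tribunate defines the real scoring function.

```python
# Hypothetical per-round validator score; weights and the median consensus
# rule are assumptions, not the Tribunate's actual scoring function.
from statistics import median

def validator_round_score(my_scores: list[float],
                          all_scores: list[list[float]],  # one row per validator
                          tags_correct: float,            # 0.0-1.0 tag accuracy
                          wrote_rationale: bool,
                          rounds_participated: int,
                          rounds_total: int) -> float:
    """Blend scoring accuracy, tag quality, depth, and participation."""
    # Scoring accuracy: closeness to the per-submission consensus (median).
    consensus = [median(col) for col in zip(*all_scores)]
    mean_dev = sum(abs(m - c) for m, c in zip(my_scores, consensus)) / len(my_scores)
    accuracy = max(0.0, 1.0 - mean_dev)
    depth = 1.0 if wrote_rationale else 0.5
    participation = rounds_participated / rounds_total
    return 0.5 * accuracy + 0.2 * tags_correct + 0.1 * depth + 0.2 * participation
```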

Reputation and Slashing

Validators build reputation over time, which affects:

Their influence in consensus aggregation

Their payout multiplier

Their eligibility for rubric review roles

Validators who fail spot checks, miss clear failures, or deviate from consensus may face slashing.
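
A sketch of how reputation accrual and slashing could interact, under assumed thresholds and penalties; none of the constants below come from the protocol.

```python
# Illustrative reputation ledger; thresholds and penalties are assumed.
def update_reputation(rep: float,
                      passed_spot_check: bool,
                      consensus_deviation: float) -> float:
    """Slow accrual for faithful auditing, multiplicative slashing for failures."""
    if not passed_spot_check:
        return rep * 0.7             # slash: failed a spot check
    if consensus_deviation > 0.3:
        return rep * 0.9             # slash: far outside validator consensus
    return min(1.0, rep + 0.02)      # gradual gain, capped at 1.0

def consensus_weight(rep: float) -> float:
    """Reputation feeding into consensus influence (assumed convex shape)."""
    return rep ** 2
```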

Tribunate Incentives

The Tribunate defines the reward logic that governs the entire system. Initially curated by the core team, it will transition into a semi-decentralized body responsible for:

Maintaining scoring rubrics and dimension weights (a sample rubric appears below)

Defining validator scoring functions

Updating audit protocols and evaluation standards
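
To make these duties concrete, here is a hypothetical rubric expressed as versioned data the Tribunate could publish. Every name, weight, and threshold below is illustrative.

```python
# Hypothetical rubric artifact; all names, weights, and thresholds are
# illustrative, not Aurelius's published parameters.
RUBRIC = {
    "version": "2025-01",  # rotated regularly (see Integrity and Anti-Gaming)
    "dimensions": {        # dimension weights used in miner scoring
        "severity":            0.35,
        "novelty":             0.30,
        "signal_richness":     0.15,
        "validator_agreement": 0.20,
    },
    "audit": {
        "min_validators_per_submission": 3,
        "confirmation_quorum": 0.5,   # fraction of auditors needed to confirm
        "spot_check_rate": 0.05,      # share of audits that are re-audited
    },
}
```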

Tribunate contributors may receive:

A share of protocol emissions

Recognition in governance dashboards and academic outputs

Long-term authority over how alignment is measured

Their primary incentive is epistemic integrity: ensuring that the reward function tracks truth, not trend.

Emission Structure

Aurelius follows a dual-track emissions system:

dTAO emissions
Paid to miners and validators for high-value contributions

Delegated staking
Token holders may support top performers, earning yield while guiding emissions
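
A sketch of how the two tracks could flow in a single block, assuming a 70/30 split between contribution-weighted emissions and delegation-guided flow; the ratio and identifiers are illustrative, not protocol parameters.

```python
# Hypothetical dual-track split; the 70/30 ratio and names are assumed.
def split_emissions(block_emission: float,
                    contribution_scores: dict[str, float],  # miners/validators
                    delegated_stake: dict[str, float]) -> dict[str, float]:
    """Blend contribution-weighted and stake-weighted emission flows."""
    direct = 0.7 * block_emission      # dTAO to miners and validators
    guided = 0.3 * block_emission      # flow steered by delegated staking
    total_score = sum(contribution_scores.values()) or 1.0
    total_stake = sum(delegated_stake.values()) or 1.0
    keys = set(contribution_scores) | set(delegated_stake)
    return {k: direct * contribution_scores.get(k, 0.0) / total_score
             + guided * delegated_stake.get(k, 0.0) / total_stake
            for k in keys}
```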

Integrity and Anti-Gaming

Aurelius includes safeguards to protect against abuse:

Miner identity is hidden from validators

Prompt sampling prevents cherry-picking

Validator–miner collusion is flagged through statistical outlier detection (sketched below)

The Tribunate regularly rotates rubric logic and audit conditions

High-reputation agents are trusted more, but no one is immune to challenge.
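
As one illustration of the collusion screen above, a simple z-score outlier test over validator–miner score pairs; the protocol's actual detector is not specified here, and the threshold is assumed.

```python
# Illustrative collusion screen using a z-score outlier test; the actual
# detector and threshold are not specified by this document.
from statistics import mean, stdev

def flag_collusion(pair_scores: dict[tuple[str, str], float],
                   z_threshold: float = 3.0) -> list[tuple[str, str]]:
    """Flag validator-miner pairs scored anomalously high versus the field."""
    scores = list(pair_scores.values())
    if len(scores) < 2:
        return []
    mu, sigma = mean(scores), stdev(scores)
    if sigma == 0:
        return []
    return [pair for pair, s in pair_scores.items()
            if (s - mu) / sigma > z_threshold]
```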

Incentives as Alignment Mechanism

The protocol doesn’t reward output alone; it rewards impactful insight.

The system is designed to:

Direct effort toward high-risk failure domains

Encourage discovery of subtle or evolving failure modes

Sustain a robust validator class capable of critical judgment

Grow a high-signal dataset of validated alignment failures

Summary

Miners earn rewards for exposing model failures that are validated and reproducible

Validators earn rewards for confirming signal and maintaining evaluation quality

Tribunate members shape the incentive logic itself and are rewarded for maintaining alignment integrity

Reputation and slashing systems ensure long-term accountability

All rewards are linked to verifiable alignment signal: not popularity, not politics, not compliance

Aurelius turns alignment into an adversarial, decentralized market for truth, where incentives sharpen safety at scale.