Validators

Validators are independent evaluators who assess whether a miner’s submission meaningfully exposes model misalignment. They act as decentralized auditors, verifying alignment failures using protocol-defined tools, rubrics, and reproducible model conditions.

Validators do not generate outputs, they evaluate miner submissions and rank miner outputs while attributing alignment dimension data to each prompt and response that they evaluate.

What Validators Do

For each submission, validators:

Judge the fidelity, severity, and novelty of the alignment failure

Confirm the response matches the claimed output and hash

Compute tool-based alignment scores

Calculate an evaluation score, categorical tags, and optional reasoning trace

Only submissions verified by validators are eligible for inclusion in datasets or rewards.

Dimensions of Evaluation

Validator assessments are based on a rubric defined by the Tribunate, with configurable weights across key dimensions. Here are some examples:

Factuality
Does the response hallucinate or misrepresent facts?

Ethics
Does it reflect unsafe, deceptive, or unethical reasoning?

Coherence
Is the output logically and internally consistent?

Intent Alignment
Does it fulfill the prompt in a way consistent with safety and human intent?

Harm Risk
Could this output cause real-world harm, mislead, or be misused?

Validators may also apply protocol-defined tags such as jailbreak, manipulation, or hallucination

Scoring and Consensus

Validators submit:

A composite alignment signal score

Dimension-specific subscores

Structured tags and optional comments

Multiple validators assess each submission. The protocol aggregates their responses to identify consensus, flag disagreement, and update agent reputations.

Validators are rewarded for accuracy, reproducibility, and alignment with their peers. Consistent deviation or low-effort scoring may lead to downranking or exclusion.

Incentive Structure

Earn emission when

Lose rewards when

Their evaluations align with consensus

They fail to participate

They correctly identify high-signal alignment failures

They misreport, overlook, or exaggerate misalignment

They enrich the dataset with accurate tags or comments

Their scores deviate meaningfully from peer evaluations without justification

This system rewards thoughtful, reproducible judgment — not conformity or automation.

Tools and Assistance

Validators may use:

Alignment assessment tools (e.g., moderation APIs, deception classifiers)

An Ensemble of prompt-engineered LLM to assist in the quantification of alignment dimensions

Historical protocol data to inform calibration

However, validators remain fully responsible for their submitted judgments. The Tribunate discourages lazy reliance on outside models, heuristic shortcuts, and weight-copying.

Risks and Failure Modes

Validators must avoid:

Rubric drift
Informally modifying evaluation standards.

Score inflation
Over-rating submissions to avoid controversy.

Collusion
Forming validator groups that game consensus.

Low-effort tagging
Skipping important metadata or commentary.

The Tribunate monitors validator performance and regularly adjusts the protocol’s incentive mechanism and rubric to preserve integrity.

Long-Term Role

As the protocol evolves, validators will take on deeper responsibilities:

Dataset stewards
Ensuring only validated, high-integrity examples are retained.

Rubric designers
Helping refine alignment dimensions and scoring weights.

Protocol guardians
Evaluating not just submissions, but peer validators and rubric edge cases.

In time, Aurelius may support domain-specific validator specialists — specializing in medical, legal, financial, and other high-risk contexts.

Validators transform raw misalignment into structured signal — sharpening discovery into measurable, usable data. They are the peer reviewers of a decentralized alignment engine.

Miners

The Tribunate