Frequently Asked Questions

What is Aurelius?

Aurelius is a decentralized protocol for red-teaming AI models. It incentivizes miners to discover alignment failures and validators to verify and score them. The protocol produces structured, reproducible datasets that can be used to evaluate model risk, improve safety, and support alignment research.

What problem does Aurelius solve?

Today’s language models are often brittle under pressure. Most alignment testing is centralized, static, or narrow in scope. Aurelius creates an open, incentive-driven pipeline to continuously uncover, evaluate, and record misaligned behavior, especially in edge cases that escape traditional testing.

How are miners and validators rewarded?

Miners earn rewards for surfacing high-value failures: novel, severe, and clearly documented examples of misalignment. Validators earn rewards for accurate, consensus-aligned scoring and helpful annotations. All emissions are distributed according to contribution quality and reproducibility, following the Bittensor blockchain's token emission mechanics.
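
For intuition, the sketch below shows one way per-submission scores could be folded into normalized reward weights. The score dimensions (novelty, severity, reproducibility) and their weights are assumptions made for this illustration, not the protocol's actual formula.

```python
# Hypothetical illustration of emission weighting. The dimensions and
# weights below are assumptions for this sketch, not the protocol's
# actual reward formula.

def submission_score(novelty: float, severity: float, reproducibility: float) -> float:
    """Blend quality dimensions (each in [0, 1]) into a single score."""
    return 0.4 * novelty + 0.4 * severity + 0.2 * reproducibility

def emission_weights(scores: dict[str, float]) -> dict[str, float]:
    """Normalize per-miner scores so reward weights sum to 1.0."""
    total = sum(scores.values())
    if total == 0:
        return {miner: 0.0 for miner in scores}
    return {miner: s / total for miner, s in scores.items()}

scores = {
    "miner_a": submission_score(novelty=0.9, severity=0.8, reproducibility=1.0),
    "miner_b": submission_score(novelty=0.2, severity=0.5, reproducibility=0.7),
}
print(emission_weights(scores))  # {'miner_a': ~0.68, 'miner_b': ~0.32}
```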

What kinds of models can be evaluated?

Aurelius initially targets open-source LLMs. Over time, it will support secure audits of closed-source models and offer interfaces for model developers who wish to benchmark their own systems. Technically, any model is compatible with Aurelius from day one, but protocol improvements will be easiest to isolate on smaller, open-source models early on.

What is the Tribunate?

The Tribunate is the protocol’s governing logic layer. It defines scoring rubrics, configures incentive logic, monitors validator behavior, and evolves the rules of evaluation. Initially centralized, it will transition to a contributor-driven governance process as the protocol matures.
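
To make "scoring rubric" concrete, a rubric published by the Tribunate might look roughly like the sketch below. The field names, dimensions, weights, and tags are illustrative assumptions, not the Tribunate's actual schema.

```python
# Illustrative only: a rubric the Tribunate might publish for validators.
# Field names, dimensions, weights, and tags are assumptions of this sketch.

RUBRIC_V1 = {
    "version": "1.0",
    "dimensions": {
        "novelty":         {"range": (0, 10), "weight": 0.4},
        "severity":        {"range": (0, 10), "weight": 0.4},
        "reproducibility": {"range": (0, 10), "weight": 0.2},
    },
    "failure_tags": ["deception", "jailbreak", "harmful_content", "reward_hacking"],
    "required_fields": ["prompt", "response", "reasoning_trace"],
}
```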

What happens to the data?

Validated submissions, including prompts, responses, scores, tags, and reasoning traces, are compiled into structured, reproducible alignment datasets. These datasets support downstream use cases in research, evaluation, and model fine-tuning.
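
For a sense of the data's shape, a single validated record might look like the following; the field names here are assumptions for illustration, not the published schema.

```python
# Hypothetical shape of one validated record; field names are illustrative.
from dataclasses import dataclass, field

@dataclass
class AlignmentRecord:
    prompt: str                     # adversarial prompt submitted by a miner
    response: str                   # model output exhibiting the failure
    model_id: str                   # model and version under evaluation
    scores: dict                    # per-dimension validator scores
    tags: list = field(default_factory=list)  # failure-type tags
    reasoning_trace: str = ""       # validator annotation / rationale

record = AlignmentRecord(
    prompt="...",
    response="...",
    model_id="example-llm-7b",
    scores={"novelty": 0.9, "severity": 0.7, "reproducibility": 1.0},
    tags=["jailbreak"],
    reasoning_trace="Model complied after a role-play framing.",
)
```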

Can this improve models?

Yes. The protocol generates high-signal failure data that can be used to:

  • Retrain or fine-tune models for safety
  • Stress-test new releases
  • Benchmark alignment progress
  • Analyze failure types across model versions

Aurelius functions as an external, adversarial feedback loop for model refinement.
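
As one hedged example of that loop, high-severity records could be pulled into a safety regression set and re-run against each new release. The severity threshold and record fields below are assumptions for the sketch.

```python
# Illustrative downstream use: build a safety regression set from validated
# failures. The severity threshold and record fields are assumptions.
from typing import Callable

def build_regression_set(records: list[dict], min_severity: float = 0.8) -> list[dict]:
    """Keep failures severe enough to be worth re-testing on every release."""
    return [r for r in records if r["scores"].get("severity", 0.0) >= min_severity]

def pass_rate(regression_set: list[dict], still_fails: Callable[[dict], bool]) -> float:
    """Fraction of historical failures a new model now handles safely."""
    if not regression_set:
        return 1.0
    return sum(not still_fails(r) for r in regression_set) / len(regression_set)
```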

Is this only for chatbots?

The initial focus is on text-based, general reasoning models, but the architecture is flexible. Over time, Aurelius may support alignment evaluation for vision models, agents, or other generative systems.

Is Aurelius open source?

Yes. The protocol code, validator interface, rubric logic, and documentation are all open source. Long-term governance will also move toward public participation and transparency.

How can I participate?

  • Run a miner
    Check out our GitHub. Create adversarial prompts and surface misalignment.
  • Join as a validator
    Check out our GitHub. Evaluate submissions and help refine the scoring rubric.
  • Contribute to governance
    Contact aurelius.subnet@gmail.com to participate.
  • Collaborate on research
    Contact aurelius.subnet@gmail.com to use Aurelius data in your own alignment projects.