Data Access

Aurelius produces high-integrity alignment data by capturing adversarial prompts, model completions, and validator evaluations, all anchored by cryptographic hashes and enriched with structured metadata.

The resulting datasets are designed to support open research, reproducible evaluation, and practical alignment diagnostics.

What the Data Includes

Each validated record contains:

Prompt
The input used to elicit a model response.

Response
The model’s unaltered completion.

Validator Scores
Alignment evaluations across key dimensions.

Tags
Categorical labels (e.g., toxicity, bias, jailbreak).

Reasoning Traces
Optional justifications from miners and validators.

Mechanistic Metadata
When available, attention patterns, activation traces, or tool outputs.

Hash Commitments
SHA-256 checksums ensuring full reproducibility.

These artifacts form the foundation of the Aurelius Alignment Dataset, a living resource for alignment research and model fine-tuning.
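To make the record structure and hash commitments concrete, here is a minimal sketch in Python. The field names and values are illustrative assumptions, not the protocol's published schema; the point is that a canonical serialization lets anyone recompute the SHA-256 digest and confirm a record is unaltered.

```python
import hashlib
import json

# Hypothetical record shape; field names are illustrative only.
record = {
    "prompt": "Describe a safe way to dispose of household chemicals.",
    "response": "Take them to a certified hazardous-waste facility...",
    "validator_scores": {"toxicity": 0.02, "bias": 0.01, "jailbreak": 0.0},
    "tags": ["benign"],
    "rubric_version": "1.3.0",
}

def commitment(rec: dict) -> str:
    # Canonical serialization (sorted keys, fixed separators) so the
    # same content always produces the same digest.
    canonical = json.dumps(rec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def verify(rec: dict, expected: str) -> bool:
    # Anyone holding the record can recompute the digest and compare
    # it against the published commitment.
    return commitment(rec) == expected

published = commitment(record)
assert verify(record, published)                      # untampered record passes
assert not verify({**record, "response": "x"}, published)  # any edit fails
```

Note that reproducibility here depends on agreeing on one canonical serialization; a digest over differently ordered keys would not match.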

Access Methods

The protocol will support multiple modes of data access, tailored for different levels of technical and analytical use:

Public Dashboards

High-level summaries of alignment failures

Validator agreement trends

Dataset growth and category frequency over time

Dataset Releases

Versioned exports of validated prompt–response pairs

Available in formats suitable for ML workflows (e.g., JSONL, CSV, Parquet)

Includes rubric metadata and schema documentation
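A JSONL export of the kind described above (one validated record per line) can be consumed with nothing beyond the standard library. The records below are stand-ins, not real dataset contents:

```python
import io
import json

# Stand-in for a downloaded JSONL export: one record per line.
export = io.StringIO(
    '{"prompt": "p1", "response": "r1", "tags": ["bias"]}\n'
    '{"prompt": "p2", "response": "r2", "tags": ["jailbreak"]}\n'
)

# Parse every non-empty line into a dict.
records = [json.loads(line) for line in export if line.strip()]

# Filter to a category of interest before analysis or fine-tuning.
jailbreaks = [r for r in records if "jailbreak" in r["tags"]]
print(len(records), len(jailbreaks))  # 2 1
```

The same records load equally well from CSV or Parquet with the usual dataframe tooling; JSONL is shown here because it streams line by line without loading the whole export.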

Programmatic Interfaces (API)

Query access for prompt–response pairs

Filtered access by tag, dimension, or rubric version

Rubric history and validator consensus lookups
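Since the API is still planned, the sketch below only illustrates the kind of filtered query the bullets describe. The base URL and parameter names (`tag`, `dimension`, `rubric_version`) are hypothetical placeholders:

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the protocol has not yet published its API surface.
BASE_URL = "https://api.example.org/v1/records"

def build_query(tag=None, dimension=None, rubric_version=None):
    # Keep only the filters the caller actually supplied.
    params = {k: v for k, v in {
        "tag": tag,
        "dimension": dimension,
        "rubric_version": rubric_version,
    }.items() if v is not None}
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL

url = build_query(tag="jailbreak", rubric_version="1.3.0")
# e.g. https://api.example.org/v1/records?tag=jailbreak&rubric_version=1.3.0
```

Pinning `rubric_version` in queries like this is what makes results comparable over time, since scores are only meaningful relative to the rubric in force when they were collected.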

Attribution and Licensing

All public data will include:

Versioning identifiers for traceability

Attribution guidelines for academic or commercial use

Clearly marked rubric versions and scoring standards at time of collection

Where appropriate, the protocol may adopt open data licenses that preserve integrity and ensure attribution without restricting research use.

Privacy and Model Confidentiality

For models under private or restricted evaluation:

Prompts and outputs may be encrypted or obfuscated

Model names, endpoints, and weights will not be exposed

Validator access will be restricted to essential scoring information

Audits will be conducted in isolated or secured compute environments

These safeguards protect model confidentiality while still producing alignment-relevant insights.

Summary

Aurelius is building a high-signal, high-integrity alignment dataset, not only for research, but for long-term transparency and safety across the AI ecosystem.

All data is reproducible, cryptographically verified, and schema-consistent

Access methods are tailored for both human and programmatic use

Privacy protections are in place for sensitive or private model evaluations

The dataset evolves as adversarial discovery and rubric logic mature

Alignment is not just a score; it is a record. And that record belongs to the world.
