Human taste is the new training data. We provide the evaluations, benchmarks, and datasets that align AI to genuine creative judgment.
Trusted by teams building with creative AI
"Execution is free. Now, judgment is everything."
The new standard for creative AI evaluation
From benchmarks to datasets to live evaluations — everything you need to build AI that genuinely understands human creativity.
Head-to-head AI model evaluations judged by verified creative professionals. Real human taste, applied systematically. See exactly where your model excels and where it falls short.
The definitive benchmark for creative AI, built on judgments from 1.5M+ verified creatives across 400+ distinct skills. Compare models on the metrics that actually matter to practitioners.
Preference datasets generated by top creative professionals who label, rank, and critique AI outputs. Fine-tune and align your models with data that reflects genuine aesthetic judgment.
Creative AI collaborators designed to enhance expert workflows. Built for and with top creative professionals, they bridge the gap between raw AI capability and production-quality creative work.
Traditional benchmarks measure what AI can generate. We measure whether anyone with taste would actually want it.
Results broken down across 400+ specific creative skills — from typographic hierarchy to color theory to narrative pacing.
Every judgment comes from credentialed practitioners with demonstrated expertise in that specific creative domain.
Continuously updated as new models release and creative standards evolve. Always current, never stale.
Creative Skill Scores — Model A
Real creative professionals judge AI outputs head-to-head. No synthetic proxies, no automated metrics — just expert judgment, collected at the scale needed to train frontier models.
Partner with the lab building the infrastructure for human-aligned creative AI.