Rating system
Empirica Stars
Every piece of research we publish — and every submission we score — earns between zero and three Stars. Think of it as restaurant stars, but for research: a quick visual that sums up our 0–100 validator score so you don't have to do the math.
For the per-check rubric that produces the underlying number, see the scoring page →
The ladder
Four tiers, tight thresholds
Exceptional
Three Stars, score 90–100
Rare. Logic, empirical, and depth checks all passed at the highest band. Stands as a reference for the field.
Distinguished
Two Stars, score 80–89
Strong work across all three checks. Publishable, citeable, and good enough that we automatically generate a course lesson from it.
Notable
One Star, score 70–79
Validator-confident. The work clears the bar in every dimension, with minor improvements separating it from a higher tier.
Validated
No Stars, score 50–69
Passed our publication threshold for industry research, below the Empirica Stars threshold. The per-rubric breakdown shows what would lift it.
Scores below 50 don't publish at all — the validator rejects them and the submitter gets per-rubric feedback explaining why.
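The ladder above amounts to a small lookup from score to tier. Here is a minimal Python sketch of that lookup, assuming integer scores (this page doesn't say how fractional scores are rounded). The names Tier, TIERS, and tier_for are hypothetical, chosen for illustration, and are not part of any published Empirica API.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    stars: int
    min_score: int  # lowest score that still earns this tier


# Thresholds copied from the ladder, ordered from highest tier to lowest.
TIERS = [
    Tier("Exceptional", 3, 90),
    Tier("Distinguished", 2, 80),
    Tier("Notable", 1, 70),
    Tier("Validated", 0, 50),
]


def tier_for(score: int) -> Optional[Tier]:
    """Map a 0-100 validator score to its tier.

    Returns None for scores below 50, which are rejected rather than published.
    Assumption: scores are whole numbers; rounding rules are not stated here.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for tier in TIERS:
        if score >= tier.min_score:
            return tier
    return None
```

For example, tier_for(87) returns the Distinguished tier (two Stars), and tier_for(49) returns None because the submission is rejected rather than published.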
Why a tier system
A score of 87 means nothing at a glance
A reader scrolling Rankings or Notes shouldn't have to do arithmetic to decide whether something is worth reading. The number 87 doesn't tell you much without the rubric memorised. Two Stars tells you instantly: the validator was confident, expect strong work.
The 0–100 score is still on every page, and the per-rubric breakdown is one click away. The Stars are the shorthand, not a replacement.
Related — Guideline Paths
The tier rating is the signal. The Paths are the route.
Empirica Stars tell you which individual pieces of research are worth reading. Guideline Paths bundle the strongest pieces into a hand-curated sequence that takes you from zero to understanding on a topic. Every step in a Path surfaces its Star tier inline, so you see the quality signal as you read.
One way to think about it: this page tells you how we grade. Paths show you what to read first, second, third, with the grades attached.
Browse the Guideline Paths
Trustworthiness
The hard questions, answered honestly
Do authors pay to be graded?
No. The Empirica Stars brand depends on its independence. We don't accept payment from authors, institutions, or publishers in exchange for a grade. Submission is free; the score is the score. Future revenue will come from bulk API access for institutions and recommendation referrals, not from grading itself.
Who or what is the 'panel'?
An autonomous validation pipeline running against a fixed, published rubric. Three independent checks — logic, empirical, depth — then a final decision. The rubric and its thresholds are public, and the same bar applies to our own internal research as to external submissions. There's no human reviewer in the publish loop, by design: the bar is reproducible, the rubric is identical for every submission, and every score comes with a per-rubric breakdown so you can see exactly what the validator caught.
How can I trust the rating in a field you don't know well?
We won't pretend to coverage we don't have. Where we've scored hundreds of papers — agent economy, applied AI, quantitative strategy — the rating is calibrated and the thresholds are stable. Where coverage is thinner, the per-rubric breakdown is your best signal: a two-Star paper with a perfect empirical score and a weaker depth score tells you exactly what the validator caught and didn't. See the breakdown on any submission's status page.
What stops grade inflation?
The thresholds are strict on purpose: 70+ for one Star, 80+ for two, 90+ for three. Most published work earns one or none; two is meaningful; three is rare. We recalibrate against our own internal output distribution quarterly and publish the recalibration when it happens.
How do I know my submission was graded fairly?
Every submission gets a per-rubric breakdown (Logic / Empirical / Depth scores plus specific issues flagged), emailed to you and shown on a live status page. If you disagree, revise and resubmit: the new attempt is scored fresh, with no penalty for prior tries. The full rubric is on the scoring page.
Will the criteria change?
The rubric is intentionally stable, but it isn't frozen. Major changes (a new check, a threshold shift, a new content type) are announced and applied retroactively to existing scores so the historical ranking stays interpretable. Small drift (validator prompt tightening, new failure modes added to the rejection list) is documented in the public changelog.