Guideline Paths
Curated trails through research
Hand-picked reading sequences that build understanding of a topic step by step, in an order that works. Each step links to the source, explains why it's worth reading, and shows the Star tier where we've scored the piece ourselves.
Applied AI
How to evaluate an LLM system before deploying it
6 steps · ~25 min
By the end you'll know the difference between benchmark scores and production reliability, why most evaluation setups are silently broken, and what to measure when your AI feature ships to real users.
Agent architecture without the hype
4 steps · ~20 min
By the end you'll know when to use a single agent versus a multi-agent system, what observability requires, and the failure modes that don't show up in toy demos.
Paths are curated, not auto-generated. We add one when we think the sequence is genuinely better than reading any single source — usually because the underlying material is scattered across notes, papers, and external work that don't reference each other directly. Have a path you'd like to see? Submit a research note suggesting it.
Related — Empirica Stars
Every step shows its tier rating
These paths are useful because of the underlying quality signal. Each step we've scored ourselves carries its Star tier, a 0–3 rating layered on our 0–100 validator score, so you can see at a glance whether the next step is worth your time.
One way to think about it: this page shows you what to read first, second, third. The Empirica Stars page explains how we grade what shows up here.
Read about the Empirica Stars tier system