Guideline Paths

Curated trails through research

Hand-picked reading sequences that build understanding of a topic in a deliberate order. Each step links to the source, explains why it's worth reading, and, where we've scored the piece ourselves, shows its Star tier.

Applied AI

How to evaluate an LLM system before deploying it

6 steps · ~25 min

By the end you'll know the difference between benchmark scores and production reliability, why most evaluation setups are silently broken, and what to measure when your AI feature ships to real users.

Agent architecture without the hype

4 steps · ~20 min

By the end you'll know when to use a single agent versus a multi-agent system, what observability requires, and the failure modes that don't show up in toy demos.

Quantitative finance

Quantitative strategy without backtest overfitting

4 steps · ~30 min

By the end you'll know why most published Sharpe ratios are inflated, what a real walk-forward backtest looks like, and the discipline required before betting capital on a signal.
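As a taste of what "walk-forward" means here, the sketch below is a minimal illustration and is not taken from the path's sources. It splits a time series into rolling train/test windows so that a signal is always evaluated on data that comes after the data it was fitted on. The function name `walk_forward_splits` and its parameters are hypothetical, chosen only for this example.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_obs: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for a rolling walk-forward scheme.

    Hypothetical helper for illustration: each test window starts exactly
    where its training window ends, so no future data leaks into fitting.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")

    start = 0
    # Stop once a full train + test window no longer fits in the data.
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        # Roll forward by one test window so test periods never overlap.
        start += test_size


# Usage: 10 observations, fit on 4, evaluate on the next 2, then roll forward.
splits = list(walk_forward_splits(n_obs=10, train_size=4, test_size=2))
for train, test in splits:
    print(f"train {train.start}-{train.stop - 1} -> test {test.start}-{test.stop - 1}")
```

The point of the design is that performance is reported only on the concatenated test windows. A single in-sample fit over the whole history, by contrast, lets the strategy see the data it is later judged on.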

Paths are curated, not auto-generated. We add one when we think the sequence is genuinely better than reading any single source — usually because the underlying material is scattered across notes, papers, and external work that don't reference each other directly. Have a path you'd like to see? Submit a research note suggesting it.

Related — Empirica Stars

Every step shows its tier rating

These paths are useful because of the underlying quality signal. Each step we've scored ourselves carries its Star tier, a 0–3 rating layered on our 0–100 validator score, so you can see at a glance whether the next step is worth the time.

One way to think about it: this page shows you what to read first, second, third. The Empirica Stars page explains how we grade what shows up here.

Read about the Empirica Stars tier system