Empirica Score · Phase A complete · 2026-05-26

An open algorithm that matches about 95% of the top-100 picks of a paid-feed-equivalent stock screen, using only free data.

We built three open factor proxies (Relative Strength, Technical, Fundamental) on free inputs, blended them via regression on 800 paired observations, and validated head-to-head against a publicly-visible third-party scoring benchmark. Two out-of-sample snapshots both came in above 94% top-100 overlap. The algorithm is documented; the data inputs are free. This isn't a product yet — it's evidence that the firm's AI agents produce output at the level of paid incumbents.


Phase A validation

Two out-of-sample snapshots. Both came in above 94%.

We held out two daily snapshots from the regression and computed the top-100 overlap between Empirica Score and a publicly-visible third-party scoring benchmark. The pass threshold was set in advance at 60%. Both results comfortably cleared it.
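A minimal sketch of one way to compute a top-N overlap, assuming it means the share of names two score rankings have in common among their top N. The function name and the tie-breaking rule are illustrative assumptions, not the published implementation. The reported figures (95.8%, 94.9%) are not whole percentages, so the actual definition may treat ties or universe size differently.

```python
def top_n_overlap(scores_a, scores_b, n=100):
    """Share of the top-n tickers by scores_a that are also top-n by scores_b.

    scores_a, scores_b: dicts mapping ticker -> score (higher is better).
    Ties are broken alphabetically by ticker (an assumption for determinism).
    """
    if n <= 0:
        raise ValueError("n must be positive")

    def top(scores):
        ranked = sorted(scores, key=lambda ticker: (-scores[ticker], ticker))
        return set(ranked[:n])

    return len(top(scores_a) & top(scores_b)) / n


# Hypothetical four-ticker universe, comparing the top 2 of each ranking.
ours = {"AAA": 9.0, "BBB": 8.0, "CCC": 7.0, "DDD": 1.0}
benchmark = {"AAA": 9.0, "CCC": 8.0, "BBB": 2.0, "DDD": 1.0}
overlap = top_n_overlap(ours, benchmark, n=2)  # only AAA is shared
```

With n=100 and a pre-registered threshold of 0.60, a held-out snapshot passes when the returned value is at least 0.60.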

Top-100 overlap

95.8%

2026-05-22

Top-100 overlap

94.9%

2026-05-12

Regression R²

0.859

800 obs · 8 snapshots

Monthly data cost

$0

vs $2,000+ for Bloomberg

RS proxy

passed

Pearson 0.94

Relative Strength against the universe. The strongest single proxy — anchors the score.

TA proxy

ceiling

Pearson 0.53

Technical signals. Structural ceiling without intraday data — moderately predictive.

FA proxy

ceiling

Pearson 0.61

Fundamental signals from public filings. A paid feed would lift this further.
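The Pearson figures on the cards above measure linear correlation. The page does not say exactly which series were paired, so the sketch below simply assumes two equal-length lists (a proxy value and a benchmark value per stock); the function and variable names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy data: a proxy that tracks the benchmark perfectly, and one that inverts it.
proxy = [1.0, 2.0, 3.0, 4.0]
benchmark_same = [2.0, 4.0, 6.0, 8.0]
benchmark_flipped = [8.0, 6.0, 4.0, 2.0]
```

On Python 3.10 or later, `statistics.correlation` in the standard library computes the same quantity.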

What it would cost you another way

Same job. Different cost structures.

The Empirica Score is research evidence — not a product you can subscribe to today. The comparison below shows what it would cost to run an equivalent equity screen using the tools institutional buyers actually use.

Option	Cost	Timeline	Notes
Empirica Score	$0 / month data	Phase A complete · Phase B (shadow-trading + 5yr backtest) next	Open methodology, three factor proxies blended via regression. 95% top-100 overlap with a third-party scoring benchmark across two out-of-sample snapshots.
Bloomberg Terminal	$2,000 / month	Real-time + archive	Gold standard for live data + screening. Closed methodology. About $24,000/yr per seat, versus $0 in data for Empirica Score.
FactSet Workstation (institutional seat)	$1,000 – $2,000 / month per seat	Real-time	Institutional equity research workstation. Closed scoring. $12,000 – $24,000/yr per analyst seat, versus $0 in data.
Refinitiv Workspace	$1,800+ / month per seat	Real-time	Same category as FactSet. Closed methodology. ~$22,000/yr per seat.
Retail screening service (Finviz Elite, similar)	$25 / month	Daily	Closed scoring methodology. Low monthly cost, but you can't see how the score is built, audit it, or use it as a benchmark.
Build it yourself	$50,000+ engineering + paid data feed	3 – 6 months from spec to running	A senior quant + data engineer for a quarter, plus an ongoing feed subscription. The version we built took 6 weeks on free data.

Who this is for

Three buyer profiles, three different ways the math works out.

Independent researcher

Roughly $24,000/year of seat cost avoided.

The most common case. You get a transparent, auditable score; a paid terminal gives you a closed box for roughly $24,000 a year.

Wealth-management desk — daily top-100

$12,000 – $24,000/year saved per analyst seat.

Scales linearly with team size. A 5-analyst desk on FactSet costs $60,000 – $120,000/year just for the screener seats.

Quant team validating its own factor model

A defensible external benchmark you can actually cite.

The whole point of an open algorithm is replication. Closed scores can't be benchmarks for serious quant work.
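The seat-cost ranges quoted in the profiles above are monthly list prices multiplied out to a year and by headcount. A small sketch of that arithmetic, using the FactSet range from the comparison table (the function name is illustrative):

```python
def annual_seat_cost(monthly_low, monthly_high, seats=1):
    """Return the (low, high) annual cost in dollars for a number of seats."""
    months_per_year = 12
    return (
        monthly_low * months_per_year * seats,
        monthly_high * months_per_year * seats,
    )


# FactSet range from the table: $1,000 to $2,000 per seat per month.
single_seat = annual_seat_cost(1_000, 2_000)           # one analyst
five_seat_desk = annual_seat_cost(1_000, 2_000, seats=5)  # five analysts
```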

Methodology

How the score is built.

Three open factor proxies, Relative Strength (RS), Technical (TA), and Fundamental (FA), were built on free inputs (yfinance price history, public sector classifications, public fundamentals). We ran an OLS regression of a publicly-visible third-party scoring benchmark on these proxies, together with index-membership, defensive-sector, and market-cap terms, using 800 paired observations across 8 daily snapshots, and found R² = 0.859. The fitted weights define the Empirica Score. Two held-out snapshots gave 95.8% top-100 overlap on 2026-05-22 and 94.9% on 2026-05-12, both well above the 60% pass target set in advance. Phase B (next) adds 6 weeks of shadow-trading and a 5-year out-of-sample backtest, after which the production weights are locked.
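The fitting step described above can be sketched as an ordinary least-squares solve. This is a dependency-free illustration under stated assumptions: the function names, the toy data, and the use of the normal equations are illustrative, and the repository's regression may be implemented differently. In practice a library routine such as `numpy.linalg.lstsq` would be the usual choice.

```python
def ols_fit(X, y):
    """Return [intercept, b1, ..., bk] minimising squared error (normal equations)."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented system (A^T A | A^T y).
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("design matrix is rank-deficient")
        M[c], M[p] = M[p], M[c]
        for i in range(k):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


def r_squared(X, y, beta):
    """Coefficient of determination for fitted coefficients beta."""
    preds = [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y is constant")
    return 1.0 - ss_res / ss_tot


# Toy data generated exactly from y = 2 + 3*x1 - 1*x2 (not real market data).
X_toy = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y_toy = [2 + 3 * a - b for a, b in X_toy]
beta_toy = ols_fit(X_toy, y_toy)
```

For the real fit, each row of `X` would hold one stock-snapshot observation (the three proxies plus the dummy and market-cap terms) and `y` the benchmark score for that observation.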

The formula at a glance

Empirica Score = 0.221·RS
                + 0.190·TA × 9.9
                + 0.307·FA × 9.9
                + 1.311 if NASDAQ-100
                − 0.028 if S&P 500
                − 0.492 if Russell 2000
                − 6.914 if defensive sector
                + 0.220·log10(market cap)
                + 23.98

Sectors classified as defensive: Health Care,
Consumer Staples / Defensive, Utilities.

Weights fitted by OLS on 800 paired observations
across 8 daily snapshots (R² = 0.859, last refit
2026-05-26).

The full source is in scripts/empirica_score/ in our repository. The three proxies (rs_proxy.py, ta_proxy.py, fa_proxy.py) are independently testable — and the regression that combines them is reproducible from any 8 daily snapshots.
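As a reading aid, the formula above can be transcribed into a short function. This is a sketch, not the code in scripts/empirica_score/: the function signature, the sector label strings, the assumption that index memberships are independent additive flags, and the market-cap unit are all assumptions, since the page does not specify them. The coefficients are copied from the formula as published.

```python
import math

# Assumed label spellings for the sectors the page lists as defensive.
DEFENSIVE_SECTORS = {
    "Health Care", "Healthcare",
    "Consumer Staples", "Consumer Defensive",
    "Utilities",
}


def empirica_score(rs, ta, fa, market_cap, sector,
                   in_nasdaq100=False, in_sp500=False, in_russell2000=False):
    """Evaluate the published formula for one stock.

    rs, ta, fa: proxy values on whatever scale the proxies emit (unspecified here).
    market_cap: positive number; the unit is not stated on the page.
    """
    if market_cap <= 0:
        raise ValueError("market_cap must be positive for log10")
    score = 23.98                      # intercept
    score += 0.221 * rs
    score += 0.190 * ta * 9.9
    score += 0.307 * fa * 9.9
    if in_nasdaq100:
        score += 1.311
    if in_sp500:
        score -= 0.028
    if in_russell2000:
        score -= 0.492
    if sector in DEFENSIVE_SECTORS:
        score -= 6.914
    score += 0.220 * math.log10(market_cap)
    return score


# Hypothetical inputs: RS of 100, no TA/FA signal, market cap of 1e9, NASDAQ-100 member.
example = empirica_score(100, 0, 0, 1e9, "Technology", in_nasdaq100=True)
```

Writing it out this way makes the size of each term visible: the defensive-sector adjustment (−6.914) is several times larger than any index-membership adjustment.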

Phase B · next

Six weeks of shadow-trading. Then a 5-year out-of-sample backtest.

Phase A proves the algorithm reproduces a known benchmark. Phase B tests whether the same weights generalise across time and market regimes. Two things happen in parallel: a 6-week shadow-trading window on a paper account (live signals, no capital), and a 5-year historical backtest using the same scoring rules applied to data the model has never seen.

Only when both pass do the weights lock for production use. Until then, the Empirica Score is research evidence — not an investment product. We share the methodology so peers can replicate it; we don't offer trading recommendations.

The current phase is stabilisation: a minimum of 90 clean days of operation on internal capital before any public trading-track-record dashboard ships. Any external-capital structure remains gated on a formal review by an Australian financial-services lawyer. We err toward over-disclosure of methodology and under-disclosure of live state; the reverse is how most firms in this category get into trouble.

Want to read more of the firm's work?

The Empirica Score is one output of a fleet of always-on research agents. The rest — long-form publications, short notes, daily news synthesis, a sector-pulse macro signal — sits behind a $29/month subscription.
