Portfolio Discovery for Agent Service Markets: A Sieve Approach to API Allocation
1. Overview
Autonomous agents face a discovery problem structurally analogous to the one institutional allocators face when evaluating thousands of fund managers: a combinatorially large universe of API providers, research subscriptions, vector stores, and tool endpoints, each with opaque cost/quality tradeoffs and partial information disclosure. The 13F "sieve" methodology — enumerate combinations, filter on observable performance metrics, then concentrate budget on survivors — translates directly to agent service allocation, where the equivalent disclosures are vendor pricing pages, OpenAPI schemas, latency telemetry, and emerging agents.json-style capability manifests. This note develops that mapping and argues that publishing machine-readable, sieve-friendly service metadata is now a first-class go-to-market surface for any vendor selling into agent fleets.
2. Key findings
- Service universe is large but enumerable. A mid-sized agent stack today touches inference (OpenAI, Anthropic, Google, Mistral, DeepSeek, Together, Fireworks, Groq), retrieval (Exa, Tavily, Brave Search, You.com, Perplexity Sonar), embeddings (OpenAI, Voyage, Cohere), vector stores (Pinecone, Weaviate, Qdrant, Turbopuffer), and increasingly research/data subscriptions (Empirica, Polygon, Kaiko, AlphaSense). OpenRouter currently lists 300+ inference endpoints alone (https://openrouter.ai/models). The combinatorial space of "which provider for which subtask" is sieve-shaped: tractable to enumerate, expensive to exhaust.
- Pricing dispersion is wider than in fund management. For functionally similar 70B-class inference, observed list pricing spans roughly $0.20–$0.90 per million input tokens across Together, Fireworks, DeepInfra, and OpenRouter (https://openrouter.ai/models, https://www.together.ai/pricing), a 4.5× spread between the cheapest and most expensive quote. By comparison, active equity manager fee dispersion is typically <2×. Selection alpha for agents, here the cost saved by picking the right provider, is therefore mechanically larger than for 13F followers.
- Quality signals are observable but noisy. LMArena, Artificial Analysis (https://artificialanalysis.ai/), and HELM provide third-party latency, throughput, and quality benchmarks that play the role 13F filings play for funds: standardized, lagged, partial. Techniques like RouteLLM and FrugalGPT demonstrated that cascaded routing on these signals recovers 80–95% of frontier-model quality at 20–40% of cost — empirically validating the sieve premise for the inference layer.
- Service-oriented architecture principles still govern integration cost. Loose coupling, standardized contracts, and protocol independence [P4] reduce the switching cost that would otherwise lock agents into a single provider. The ESB pattern [P4] has a clean modern analogue in agent gateways (LiteLLM, Portkey, OpenRouter) that abstract provider differences behind a uniform interface — the precondition for any sieve strategy to be cheap to execute.
- Discovery latency is the binding constraint. Unlike a 13F filer who can wait a quarter, an agent making a sub-second routing decision cannot crawl vendor websites. The discovery layer must be pre-indexed, which is precisely what agents.json, llms.txt, and OpenAPI manifests are starting to standardize (https://llmstxt.org/, https://agents.json/).
- Digital-transformation literature predicts ecosystem consolidation around manifest standards. The systematic review in [P3] identifies "malleable organizational designs embedded in digital business ecosystems" as the dominant pattern; interpreted at the agent layer, this implies a small number of dominant manifest/registry standards will emerge, and vendors absent from them become undiscoverable.
- Edge and federated patterns shape where the sieve runs. [P2] and [P1] describe the architectural pull toward decentralised inference. For agent fleets, this means the sieve may evaluate not just which vendor but which deployment locus (edge cache vs centralized API vs federated peer), adding a dimension absent from 13F analogues.
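The cascaded routing mentioned in the quality-signals finding above can be illustrated with a minimal sketch. It is written in the spirit of a FrugalGPT-style cascade and is not the published algorithm: the `Tier` type, the toy `small_model` and `large_model` functions, the confidence scores, and the cost units are all assumptions made for illustration. A production cascade would replace the toy confidence with a learned or heuristic scorer.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Tier:
    """One backend in the cascade (all fields are illustrative)."""
    name: str
    cost_per_call: float
    answer: Callable[[str], Tuple[str, float]]  # returns (text, confidence in [0, 1])


def cascade(prompt: str, tiers: Sequence[Tier], threshold: float) -> Tuple[str, str, float]:
    """Call tiers in order and stop at the first sufficiently confident answer.

    Returns (tier name, answer text, total cost spent). The last tier's
    answer is accepted regardless of its confidence.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    spent = 0.0
    for index, tier in enumerate(tiers):
        text, confidence = tier.answer(prompt)
        spent += tier.cost_per_call  # every attempted tier is paid for
        is_last = index == len(tiers) - 1
        if confidence >= threshold or is_last:
            return tier.name, text, spent
    raise AssertionError("unreachable: the last tier always returns")


# Toy stand-ins for real model calls (hypothetical, not a vendor API).
def small_model(prompt: str) -> Tuple[str, float]:
    # Pretend the small model is only confident on short prompts.
    return ("small: " + prompt, 0.9 if len(prompt) < 20 else 0.3)


def large_model(prompt: str) -> Tuple[str, float]:
    return ("large: " + prompt, 0.95)


TIERS = [Tier("small", 1.0, small_model), Tier("large", 10.0, large_model)]

print(cascade("2 + 2?", TIERS, threshold=0.8))
print(cascade("Summarise this long filing in detail", TIERS, threshold=0.8))
```

The short prompt is served by the cheap tier alone, while the long prompt escalates and pays for both calls. That escalation overhead is why the threshold has to be tuned against the observed quality signals rather than fixed in advance.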
3. Agent service patterns — the sieve translated
3.1 The five-stage sieve, applied
| Stage | 13F equivalent | Agent-economy equivalent |
|---|---|---|
| Universe construction | All 13F filers with $100M+ AUM | All providers in registry (OpenRouter, MCP registry, agents.json index) |
| Coarse filter | Strategy fit, AUM, turnover | Capability match (does the endpoint support tool calls, JSON mode, embeddings of dim N?) |
| Performance filter | 3-yr alpha, Sharpe, drawdown | Benchmark scores (Artificial Analysis), p50/p99 latency, error rate |
| Cost filter | Expense ratio, soft-dollar burden | $/MTok, egress fees, minimum commitment, rate-limit ceilings |
| Concentration | Top-N portfolio, weight by conviction | Route table: weighted backend selection per request class |
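The stages in the table can be sketched as successive filters over a provider list. This is a minimal illustration under stated assumptions: the `Provider` fields, the thresholds, the provider names `a` to `e`, and all numbers are hypothetical, and weighting survivors inversely to price is one possible concentration rule among many, not a method taken from any gateway.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class Provider:
    """Hypothetical registry record; field names are assumptions."""
    name: str
    capabilities: FrozenSet[str]
    quality: float          # benchmark score, higher is better
    p99_latency_ms: float
    usd_per_mtok: float     # assumed strictly positive


def sieve(universe: Iterable[Provider], required: FrozenSet[str], min_quality: float,
          max_p99_ms: float, max_usd_per_mtok: float, top_n: int) -> Dict[str, float]:
    """Return a route table {provider name: weight} for one request class."""
    # Stage 1 (universe construction) is the `universe` argument itself.
    # Stage 2: coarse filter on capability match.
    survivors = [p for p in universe if required <= p.capabilities]
    # Stage 3: performance filter on quality and tail latency.
    survivors = [p for p in survivors
                 if p.quality >= min_quality and p.p99_latency_ms <= max_p99_ms]
    # Stage 4: cost filter.
    survivors = [p for p in survivors if p.usd_per_mtok <= max_usd_per_mtok]
    # Stage 5: concentration. Keep the top_n cheapest, weight inversely to price.
    chosen = sorted(survivors, key=lambda p: p.usd_per_mtok)[:top_n]
    inverse_price = {p.name: 1.0 / p.usd_per_mtok for p in chosen}
    total = sum(inverse_price.values())
    return {name: w / total for name, w in inverse_price.items()}


BOTH = frozenset({"tool_calls", "json_mode"})
UNIVERSE = [
    Provider("a", BOTH, 0.80, 900.0, 0.20),
    Provider("b", BOTH, 0.85, 1200.0, 0.60),
    Provider("c", frozenset({"json_mode"}), 0.90, 800.0, 0.30),  # lacks tool calls
    Provider("d", BOTH, 0.60, 700.0, 0.10),                      # below quality floor
    Provider("e", BOTH, 0.88, 1000.0, 0.90),                     # above cost ceiling
]

route_table = sieve(UNIVERSE, frozenset({"tool_calls"}), min_quality=0.75,
                    max_p99_ms=1500.0, max_usd_per_mtok=0.80, top_n=2)
print(route_table)
```

With these illustrative inputs only providers `a` and `b` survive all filters, and the cheaper of the two receives the larger share of traffic.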
This is not a metaphor — it is a literal control loop running inside production agent gateways. The Empirica fleet's own routing code can be specified as a constrained optimisation: minimise expected cost subject to quality-floor constraints, over the discoverable provider set.
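One way to write down a problem of this shape, offered as a sketch and not as the fleet's actual specification, is the following. All symbols are assumptions introduced here: $P$ is the discoverable provider set, $k$ indexes request classes, $w_{k,i}$ is the share of class-$k$ traffic routed to provider $i$, $c_{k,i}$ is the expected cost per request, $q_{k,i}$ is the expected quality, and $\bar{q}_k$ is the quality floor for class $k$.

$$
\min_{w} \; \sum_{k} \sum_{i \in P} w_{k,i}\, c_{k,i}
\quad \text{subject to} \quad
\sum_{i \in P} w_{k,i}\, q_{k,i} \ge \bar{q}_k, \qquad
\sum_{i \in P} w_{k,i} = 1, \qquad
w_{k,i} \ge 0 \quad \text{for all } k,\, i.
$$

In this form both the objective and the constraints are linear in $w$, so the problem is a linear program, and the optimal $w$ plays the role of the weighted route table in the concentration stage.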