autocomplete_multiplex.py: Routing keyword research across three API backends with automatic failover and credit tracking

The problem

Keyword autocomplete research at scale requires running hundreds of seed queries against Google’s suggestion endpoint. Running every query through a single API backend creates three failure modes: credit exhaustion, rate limiting, and a single point of failure for a critical research step.

The context here is a large-scale CDCP dental keyword research project that needs autocomplete data across multiple query clusters simultaneously, with different AI agents executing different portions of the workload.

The routing architecture

The multiplexer routes each incoming query to a backend based on which agent is making the request:

Agent	Primary Backend	Fallback
Claude / Opus	Serper¹	ScrapFly²
Haiku	ScrapFly²	Direct Google³
G1 (Zeus File Bridge)	Direct Google³	ScrapFly²
G3 / Qwen	ScrapFly²	Direct Google³

The routing logic is policy-driven, not dynamic. Each agent has a defined primary and a defined fallback. The multiplexer checks credit availability on the primary backend before routing; if the primary is exhausted or unavailable, the request goes to the fallback automatically.
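The policy described above can be sketched as a static lookup table plus one availability check. This is a minimal illustration, not the project's actual code: the names ROUTING_POLICY, Route, and select_backend, the backend identifiers, and the is_usable callback are all hypothetical.

```python
from typing import Callable, Dict, NamedTuple


class Route(NamedTuple):
    primary: str
    fallback: str


# Static policy table mirroring the agent/backend table above.
# Agent keys and backend identifiers are illustrative names.
ROUTING_POLICY: Dict[str, Route] = {
    "claude": Route("serper", "scrapfly"),
    "opus": Route("serper", "scrapfly"),
    "haiku": Route("scrapfly", "direct_google"),
    "g1": Route("direct_google", "scrapfly"),
    "g3": Route("scrapfly", "direct_google"),
    "qwen": Route("scrapfly", "direct_google"),
}


def select_backend(agent: str, is_usable: Callable[[str], bool]) -> str:
    """Return the agent's primary backend if it is usable, else its fallback.

    `is_usable` stands in for the credit/availability check (an assumption).
    """
    route = ROUTING_POLICY[agent.lower()]
    if is_usable(route.primary):
        return route.primary
    return route.fallback


# Example: Serper is exhausted, so Claude's request fails over to ScrapFly.
exhausted = {"serper"}
print(select_backend("Claude", lambda b: b not in exhausted))  # scrapfly
print(select_backend("G1", lambda b: b not in exhausted))      # direct_google
```

The sketch does not check whether the fallback itself is usable, because the text does not say what happens when both backends are down.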

Credit tracking

Each backend maintains a separate credit pool. For each backend, the multiplexer tracks four things: the number of requests sent in the session, the estimated credits consumed per request (based on that backend’s pricing model), a running total against the configured limit, and an alert threshold that triggers before the pool is exhausted. Every request is logged with its timestamp, agent, backend used, query, and credit cost.
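One way to picture the bookkeeping is a small per-backend pool object plus a request log. This is a sketch under assumptions: the class names, the 80% alert fraction, and the credit figures in the example are hypothetical, not the project's real limits or pricing.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class BackendPool:
    limit: float                 # configured credit limit (hypothetical value)
    cost_per_request: float      # estimated credits per request (hypothetical)
    alert_fraction: float = 0.8  # assumed alert threshold, not from the source
    requests: int = 0
    consumed: float = 0.0

    def has_credit(self) -> bool:
        """True if one more request still fits under the limit."""
        return self.consumed + self.cost_per_request <= self.limit

    def near_exhaustion(self) -> bool:
        """True once consumption reaches the alert threshold."""
        return self.consumed >= self.alert_fraction * self.limit


@dataclass
class CreditTracker:
    pools: Dict[str, BackendPool]
    log: List[dict] = field(default_factory=list)

    def record(self, agent: str, backend: str, query: str) -> dict:
        """Charge one request to the backend's pool and append a log entry."""
        pool = self.pools[backend]
        pool.requests += 1
        pool.consumed += pool.cost_per_request
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "backend": backend,
            "query": query,
            "credit_cost": pool.cost_per_request,
        }
        self.log.append(entry)
        return entry


tracker = CreditTracker({"serper": BackendPool(limit=5.0, cost_per_request=1.0)})
for seed in ["cdcp dentist", "cdcp coverage", "cdcp eligibility", "cdcp apply"]:
    tracker.record("claude", "serper", seed)
print(tracker.pools["serper"].near_exhaustion())  # True: 4 of 5 credits used
```

A has_credit() check of this kind could serve as the availability test the router runs before choosing the primary backend.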

Backend characteristics

Serper¹ — paid API, highest reliability, cleanest JSON response, rate limits apply per API key. Primary for Claude/Opus because Claude-originated research tasks are typically the highest-priority and most complex.

ScrapFly² — paid scraping proxy, higher latency than Serper, better suited for volume tasks where per-request cost matters more than speed.

Direct Google Autocomplete³ — zero cost, no API key required (endpoint confirmed working), latency variable, subject to rate limiting from Google’s side if request volume is high. Primary for G1 (Zeus) which runs against the infrastructure at high frequency.
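For the direct backend, the request and response handling might look like the sketch below. Because the endpoint is undocumented, the details here are assumptions: the client=firefox and hl parameters, and the response shape of a JSON array whose second element is the suggestion list, reflect commonly observed behavior and may change without notice. The function names are hypothetical, and no network call is made.

```python
import json
from typing import List
from urllib.parse import urlencode

# Endpoint path as named in the sources; the query parameters are assumptions.
ENDPOINT = "https://www.google.com/complete/search"


def build_url(seed: str, lang: str = "en") -> str:
    """Build a suggestion URL for one seed query (parameters are assumed)."""
    return f"{ENDPOINT}?{urlencode({'client': 'firefox', 'hl': lang, 'q': seed})}"


def parse_suggestions(body: str) -> List[str]:
    """Extract suggestions from an assumed [query, [suggestion, ...]] payload."""
    payload = json.loads(body)
    if not isinstance(payload, list) or len(payload) < 2:
        return []
    if not isinstance(payload[1], list):
        return []
    return [s for s in payload[1] if isinstance(s, str)]


# Parsing a hand-written sample payload in the assumed shape.
sample = '["dental implant", ["dental implant cost", "dental implant near me"]]'
print(build_url("dental implant"))
print(parse_suggestions(sample))
```

Keeping URL construction and parsing separate from the HTTP call makes it easier to add throttling around the call if Google starts rate limiting.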

Output structure

Outputs go to Auto Suggest Scraper V2/engine/ and outputs/. Each file contains: seed query, backend used, timestamp, raw autocomplete suggestions returned, and normalized suggestion list — compatible with downstream keyword clustering and intent classification pipelines.
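The per-file contents listed above map naturally onto a small record type. The sketch below is illustrative only: the field names, the JSON serialization, and the normalization rules (lowercasing, collapsing whitespace, dropping duplicates) are assumptions, since the text does not specify them.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


def normalize(suggestions: List[str]) -> List[str]:
    """Lowercase, collapse whitespace, and de-duplicate while keeping order."""
    seen = set()
    result = []
    for suggestion in suggestions:
        cleaned = " ".join(suggestion.lower().split())
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


@dataclass
class AutocompleteRecord:
    seed_query: str
    backend: str
    timestamp: str
    raw_suggestions: List[str]
    normalized_suggestions: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Derive the normalized list when the caller did not supply one.
        if not self.normalized_suggestions:
            self.normalized_suggestions = normalize(self.raw_suggestions)

    def to_json(self) -> str:
        """Serialize the record for downstream pipelines (format assumed)."""
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


record = AutocompleteRecord(
    seed_query="cdcp dentist",
    backend="serper",
    timestamp="2024-01-01T00:00:00+00:00",  # placeholder value
    raw_suggestions=["CDCP Dentist Near Me", "cdcp dentist  near me"],
)
print(record.to_json())
```

Storing both the raw and the normalized lists lets the clustering and intent classification steps re-run normalization later without re-querying a backend.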

Sources

¹ Serper. Google Search API. serper.dev. JSON API for Google Search and Autocomplete results. Used as primary backend for Claude/Opus agent queries.
² ScrapFly. Web Scraping API. scrapfly.io. Rotating proxy and scraping infrastructure used as primary backend for Haiku and G3/Qwen agent queries.
³ Google. Google Autocomplete (Search Suggestions). Undocumented public endpoint (google.com/complete/search). Zero cost, no authentication required. Used as primary backend for G1 (Zeus File Bridge) agent queries.