The problem
Keyword autocomplete research at scale requires running hundreds of seed queries against Google’s suggestion endpoint. Routing every query through a single API backend creates three failure modes: credit exhaustion, rate limiting, and a single point of failure for a critical research step.
The context is a large-scale CDCP dental keyword research project that needs autocomplete data across multiple query clusters simultaneously, with different AI agents executing different portions of the workload.
The routing architecture
The multiplexer routes each incoming query to a backend based on which agent is making the request:
| Agent | Primary Backend | Fallback |
|---|---|---|
| Claude / Opus | Serper [1] | ScrapFly [2] |
| Haiku | ScrapFly [2] | Direct Google [3] |
| G1 (Zeus File Bridge) | Direct Google [3] | ScrapFly [2] |
| G3 / Qwen | ScrapFly [2] | Direct Google [3] |
The routing logic is policy-driven, not dynamic. Each agent has a defined primary and a defined fallback. The multiplexer checks credit availability on the primary backend before routing; if the primary is exhausted or unavailable, the request goes to the fallback automatically.
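The primary-then-fallback rule can be sketched as a static lookup followed by an availability check. This is a minimal illustration, not the project's actual code: the agent and backend keys, the `BackendState` class, and the per-request costs are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Static routing policy mirroring the table: agent -> (primary, fallback).
# The string keys are hypothetical identifiers, not names from the project.
ROUTES = {
    "claude_opus": ("serper", "scrapfly"),
    "haiku": ("scrapfly", "google_direct"),
    "g1_zeus": ("google_direct", "scrapfly"),
    "g3_qwen": ("scrapfly", "google_direct"),
}


@dataclass
class BackendState:
    """Hypothetical per-backend status consulted before routing."""
    credits_left: float      # use float("inf") for the zero-cost endpoint
    available: bool = True   # set to False when the backend is down

    def usable(self, cost: float) -> bool:
        return self.available and self.credits_left >= cost


def pick_backend(agent: str,
                 states: dict[str, BackendState],
                 cost: dict[str, float]) -> str:
    """Return the agent's primary backend if usable, otherwise its fallback."""
    primary, fallback = ROUTES[agent]
    for name in (primary, fallback):
        if states[name].usable(cost[name]):
            return name
    raise RuntimeError(f"no usable backend for agent {agent!r}")


# Example: Serper is out of credits, so Claude/Opus traffic falls back.
states = {
    "serper": BackendState(credits_left=0),
    "scrapfly": BackendState(credits_left=100),
    "google_direct": BackendState(credits_left=float("inf")),
}
cost = {"serper": 1, "scrapfly": 1, "google_direct": 0}  # assumed costs
chosen = pick_backend("claude_opus", states, cost)
```

Keeping the policy in a plain dictionary means a routing change is a one-line edit, which fits the policy-driven design described above.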
Credit tracking
Each backend maintains a separate credit pool. The multiplexer tracks requests sent per backend per session, estimated credits consumed per request (by backend pricing model), a running total against configured per-backend limits, and an alert threshold before exhaustion. Usage is logged per-request with timestamp, agent, backend used, query, and credit cost.
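A per-session tracker covering those fields might look like the sketch below. The class name, the 80% alert ratio, and the limits and costs in the example are assumptions made for illustration; the source does not specify them.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CreditTracker:
    """Tracks per-backend usage for one session (illustrative values only)."""
    limits: dict[str, float]             # configured per-backend credit limit
    cost_per_request: dict[str, float]   # estimated credits per request
    alert_ratio: float = 0.8             # assumed: alert at 80% of the limit
    used: dict[str, float] = field(default_factory=dict)
    requests: dict[str, int] = field(default_factory=dict)
    log: list[dict] = field(default_factory=list)

    def record(self, agent: str, backend: str, query: str) -> bool:
        """Log one request; return True once the alert threshold is reached."""
        cost = self.cost_per_request[backend]
        self.used[backend] = self.used.get(backend, 0.0) + cost
        self.requests[backend] = self.requests.get(backend, 0) + 1
        self.log.append({
            "timestamp": time.time(),
            "agent": agent,
            "backend": backend,
            "query": query,
            "credit_cost": cost,
        })
        return self.used[backend] >= self.alert_ratio * self.limits[backend]

    def remaining(self, backend: str) -> float:
        """Credits left before the configured limit is hit."""
        return self.limits[backend] - self.used.get(backend, 0.0)


# Example with a hypothetical limit of 10 credits on one backend.
tracker = CreditTracker(limits={"serper": 10}, cost_per_request={"serper": 1})
first_alert = tracker.record("claude_opus", "serper", "dental implants cost")
```

The `remaining` value is what a router would compare against the request cost before choosing the primary backend.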
Backend characteristics
Serper [1]: paid API, highest reliability, cleanest JSON response, rate limits apply per API key. Primary for Claude/Opus because Claude-originated research tasks are typically the highest-priority and most complex.
ScrapFly [2]: paid scraping proxy, higher latency than Serper, better suited for volume tasks where per-request cost matters more than speed.
Direct Google Autocomplete [3]: zero cost, no API key required (endpoint confirmed working), variable latency, and subject to rate limiting on Google’s side if request volume is high. Primary for G1 (Zeus), which runs against the infrastructure at high frequency.
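For the direct backend, a request URL and response parser could be sketched as follows. Because the endpoint is undocumented, the `client=firefox` and `hl` parameters and the `["seed", [suggestions...]]` response shape are assumptions based on commonly observed behavior and may change without notice. The sketch deliberately makes no network call.

```python
import json
from urllib.parse import urlencode

# Host and path taken from the Sources entry for the direct endpoint.
BASE_URL = "https://www.google.com/complete/search"


def build_url(seed: str, lang: str = "en") -> str:
    """Build a suggestion URL; the query parameters are assumed, not documented."""
    params = {"client": "firefox", "hl": lang, "q": seed}
    return f"{BASE_URL}?{urlencode(params)}"


def parse_suggestions(body: str) -> list[str]:
    """Parse a body assumed to look like ["seed", ["suggestion 1", ...]]."""
    data = json.loads(body)
    return [s for s in data[1] if isinstance(s, str)]


# Offline example using a hand-written body in the assumed shape.
url = build_url("dental implants")
sample = '["cdcp", ["cdcp dentist", "cdcp coverage"]]'
suggestions = parse_suggestions(sample)
```

Any real client would also need request pacing, since the text notes that Google rate-limits high request volumes.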
Output structure
Outputs go to Auto Suggest Scraper V2/engine/ and outputs/. Each file contains the seed query, backend used, timestamp, raw autocomplete suggestions returned, and a normalized suggestion list, in a form compatible with downstream keyword clustering and intent classification pipelines.
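One possible shape for such an output record is sketched below. The field names, the JSON serialization, and the normalization rule (lowercase, collapse whitespace, drop duplicates) are assumptions, since the source does not define them.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


def normalize(suggestions: list[str]) -> list[str]:
    """Lowercase, collapse whitespace, and drop duplicates, keeping order."""
    seen: set[str] = set()
    out: list[str] = []
    for s in suggestions:
        key = " ".join(s.lower().split())
        if key and key not in seen:
            seen.add(key)
            out.append(key)
    return out


@dataclass
class OutputRecord:
    """Hypothetical record layout matching the fields listed above."""
    seed_query: str
    backend: str
    timestamp: str                       # ISO 8601, UTC
    raw_suggestions: list[str]
    normalized_suggestions: list[str]


def make_record(seed: str, backend: str, raw: list[str]) -> OutputRecord:
    now = datetime.now(timezone.utc).isoformat()
    return OutputRecord(seed, backend, now, raw, normalize(raw))


# Example: build one record and serialize it as JSON text.
record = make_record("cdcp", "google_direct",
                     ["CDCP  Dentist", "cdcp dentist ", "cdcp coverage"])
payload = json.dumps(asdict(record), indent=2)
```

Storing the raw list alongside the normalized one lets downstream clustering be re-run if the normalization rule changes.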
Sources
[1] Serper. Google Search API. serper.dev. JSON API for Google Search and Autocomplete results. Used as primary backend for Claude/Opus agent queries.
[2] ScrapFly. Web Scraping API. scrapfly.io. Rotating proxy and scraping infrastructure used as primary backend for Haiku and G3/Qwen agent queries.
[3] Google. Google Autocomplete (Search Suggestions). Undocumented public endpoint (google.com/complete/search). Zero cost, no authentication required. Used as primary backend for G1 (Zeus File Bridge) agent queries.