Local Business Scout v2 Build Summary
By Sean Weldon
Overview
Local Business Scout v2 is a production-grade local lead generation and website audit system. It systematically crawls, audits, scores, and prioritizes small business websites using a fully deterministic, resumable pipeline designed for scale, observability, and reliability.
Objectives
- Build a scalable, resumable audit pipeline for local business websites
- Produce deterministic, evidence-backed scores suitable for outreach
- Persist all artifacts for transparency, debugging, and reuse
- Eliminate duplicate processing via content-hash deduplication
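The content-hash deduplication objective can be illustrated with a minimal sketch. It assumes artifacts are keyed by the SHA-256 digest of their bytes; the `ArtifactStore` class and its in-memory dictionary are hypothetical stand-ins for the real object storage, not the project's actual code.

```python
import hashlib


class ArtifactStore:
    """Hypothetical in-memory stand-in for an object store keyed by content hash."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> tuple[str, bool]:
        """Store data under its SHA-256 digest.

        Returns (key, stored). stored is False when identical content
        was already present, so the write is skipped.
        """
        key = hashlib.sha256(data).hexdigest()
        if key in self._blobs:
            return key, False
        self._blobs[key] = data
        return key, True

    def __len__(self) -> int:
        return len(self._blobs)
```

Because the key is derived only from the bytes, two crawls that produce the same HTML or screenshot resolve to the same object and are stored once.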
Architecture Summary
Pipeline Flow:
Lead Created → Crawl → Extract → Audit → Score → Done
Core Stack:
- Next.js 14 (App Router) + TypeScript
- Python 3.11 async workers
- Playwright + Lighthouse CLI
- PostgreSQL (workflow + state machine)
- Object storage with content hashing (R2 / MinIO)
Key Architectural Decisions:
- Row-level locking (FOR UPDATE SKIP LOCKED) for concurrency safety
- JSONB storage for stage outputs and audit trail preservation
- Deterministic, versioned scoring rubric (v1.0.0)
- Artifact-first design (HTML, screenshots, Lighthouse JSON)
Core Capabilities
- Automated full-site crawl and rendering
- Structured data extraction (SEO, schema, contact signals)
- Lighthouse-based performance, SEO, and accessibility audits
- Deterministic multi-factor scoring and priority classification
- Secure artifact access via presigned URLs
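Deterministic scoring and priority classification might look like the sketch below. Only the rubric version (1.0.0) and the High / Medium / Low labels come from this summary; the weights, the thresholds, and the rule that a weaker site yields a higher outreach priority are invented for illustration.

```python
from dataclasses import dataclass

RUBRIC_VERSION = "1.0.0"  # version named in the summary

# Hypothetical integer weights (sum to 100); the real rubric is not published here.
WEIGHTS = {"performance": 40, "seo": 40, "accessibility": 20}


@dataclass(frozen=True)
class ScoreResult:
    rubric_version: str
    quality: int   # 0-100, higher means a healthier site
    priority: str  # "High", "Medium", or "Low"


def score_site(scores: dict[str, int]) -> ScoreResult:
    """Combine 0-100 category scores into a quality score and a priority.

    Integer arithmetic keeps the result identical across runs and machines.
    The thresholds (50 and 80) are assumptions.
    """
    quality = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS) // 100
    if quality < 50:
        priority = "High"    # weak site, strong outreach candidate
    elif quality < 80:
        priority = "Medium"
    else:
        priority = "Low"
    return ScoreResult(RUBRIC_VERSION, quality, priority)
```

Storing the rubric version alongside each result means a score can always be traced back to the exact rules that produced it, even after the rubric changes.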
Outcomes
- End-to-end pipeline validated under production conditions
- Typical audit duration: ~1–3 minutes per site
- Storage efficiency improved ~30–50% via deduplication
- Reliable prioritization of outreach targets (High / Medium / Low)
Limitations (Phase 1)
- No automated lead discovery (manual input only)
- No LLM-generated reports or outreach copy
- No scheduling or recurring audits
Next Steps
- Automated lead discovery (Maps / Yelp)
- LLM-powered report generation and outreach drafts
- Scheduling and recurring audits
- Authentication and multi-user support
Reflection
This build establishes a strong deterministic backbone for future intelligence layers. By solving reliability, observability, and scoring consistency first, the system creates a trustworthy substrate that future LLM-driven features can safely build on.