Krtek - B2B Lead Generation Pipeline

B2B lead generation pipeline for the Czech market. Pulls the CSO RES dump (1.5 GB CSV), filters by NACE codes, enriches via the ARES API (region, VAT), scrapes company sites for contacts. Bun + bun:sqlite for batch ops, Playwright for JS-rendered pages. **100K+ companies in the DB**, ready to expand to NACE 620/631 (IT/SaaS).

Bun, TypeScript, bun:sqlite, better-sqlite3, Playwright, Cheerio, ARES API, CSO open data, NACE filter

Goal

Autonomous pipeline for generating B2B leads from public open data. NACE 692xx (accounting/audit) as the first target segment.

Pipeline

Step  Action
1     Fetch CSO RES dump (1.5 GB CSV), filter by NACE codes
2     Find company websites (search + heuristics)
3     Scrape contacts from impressum/contact pages
4     Enrich via ARES API (region, VAT)
5     Segmentation + export
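
The NACE filter in step 1 can be sketched as a prefix match over CSV rows. This is a minimal illustration, not the pipeline's actual code: the column names ICO and NACE, the comma separator, and the absence of quoted fields are all assumptions, and the real RES dump layout may differ. The function name filterByNace is hypothetical.

```typescript
// Hypothetical record shape; the real RES dump has more columns.
interface Firm {
  ico: string;
  nace: string;
}

// Keep only rows whose NACE code starts with one of the given prefixes.
// Assumes a comma-separated input with a header row and no quoted fields.
function filterByNace(lines: Iterable<string>, prefixes: string[]): Firm[] {
  const out: Firm[] = [];
  let icoIdx = -1;
  let naceIdx = -1;
  for (const line of lines) {
    const cols = line.split(",");
    if (icoIdx < 0) {
      // First line is the header: locate the columns we need.
      icoIdx = cols.indexOf("ICO");
      naceIdx = cols.indexOf("NACE");
      if (icoIdx < 0 || naceIdx < 0) {
        throw new Error("missing ICO or NACE column");
      }
      continue;
    }
    const nace = (cols[naceIdx] ?? "").trim();
    if (prefixes.some((p) => nace.startsWith(p))) {
      out.push({ ico: (cols[icoIdx] ?? "").trim(), nace });
    }
  }
  return out;
}
```

Because the function accepts any iterable of lines, a 1.5 GB file would not need to be held in memory: a line-by-line reader could feed it, and the matches could be written to the database in batches. Passing ["692"] selects the accounting/audit segment; adding "620" and "631" would cover the planned IT/SaaS expansion.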

Stack choices

  • Bun over Node - faster startup, native SQLite (bun:sqlite)
  • better-sqlite3 + bun:sqlite - synchronous DB ops, no async overhead for batch jobs
  • Playwright for JS-rendered sites, raw fetch for static HTML
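
The split between raw fetch and Playwright could be decided per page along the following lines. This is a sketch under stated assumptions: the helper names (extractEmails, needsBrowser, chooseStrategy), the 200-character threshold, and the "scripts but almost no visible text" heuristic are all hypothetical and not taken from the project. The idea is to try the cheap static HTML first and fall back to a browser only when the static response looks like an empty JS shell.

```typescript
// Pull e-mail addresses out of raw HTML, including those inside mailto: links.
// A simple regex like this can yield false positives (e.g. "logo@2x.png").
function extractEmails(html: string): string[] {
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  const found = html.match(pattern) ?? [];
  // Lower-case and de-duplicate while preserving first-seen order.
  return Array.from(new Set(found.map((e) => e.toLowerCase())));
}

// Hypothetical heuristic: treat a page as JS-rendered when the static HTML
// contains scripts but almost no visible text.
function needsBrowser(html: string, minTextLength = 200): boolean {
  const hasScripts = /<script\b/i.test(html);
  const text = html
    .replace(/<script\b[\s\S]*?<\/script>/gi, " ")
    .replace(/<style\b[\s\S]*?<\/style>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
  return hasScripts && text.length < minTextLength;
}

type Strategy = "static" | "browser";

// Prefer the static result whenever it already contains a contact.
function chooseStrategy(html: string): Strategy {
  if (extractEmails(html).length > 0) return "static";
  return needsBrowser(html) ? "browser" : "static";
}
```

In this arrangement only the pages flagged "browser" would be queued for Playwright, which keeps the slower headless runs to a minority of sites.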

Outcome

  • 100K+ firms loaded (Step 1 done)
  • Steps 2–5 in incremental rollout
  • Queryable from dev terminal via bun:sqlite