Krtek - B2B Lead Generation Pipeline
B2B lead generation pipeline for the Czech market. It pulls the Czech Statistical Office (CSO) RES dump (1.5 GB CSV), filters by NACE codes, enriches via the ARES API (region, VAT), and scrapes company sites for contacts. Bun + bun:sqlite for batch ops, Playwright for JS-rendered pages. **100K+ companies in the DB**, ready to expand to NACE 620/631 (IT/SaaS).
Bun · TypeScript · bun:sqlite · better-sqlite3 · Playwright · Cheerio · ARES API · CSO open data · NACE filter
100K+ firms in B2B DB
Goal
Autonomous pipeline for generating B2B leads from public open data. NACE 692xx (accounting/audit) as the first target segment.
Pipeline
| Step | Action |
|---|---|
| 1 | Fetch CSO RES dump (1.5 GB CSV), filter by NACE codes |
| 2 | Find company websites (search + heuristics) |
| 3 | Scrape contacts from impressum/contact pages |
| 4 | Enrich via ARES API (region, VAT) |
| 5 | Segmentation + export |
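Step 1 boils down to a prefix match on the NACE column, applied one line at a time so the 1.5 GB file never has to sit in memory. A minimal sketch of that idea follows. The column names `ICO` and `NACE`, the comma delimiter, and the sample rows are assumptions for illustration, not the actual RES dump layout, and the naive `split` does not handle quoted fields.

```typescript
// Hypothetical row shape; the real RES dump has many more columns.
interface Firm {
  ico: string;
  nace: string;
}

// Builds a per-line filter from the CSV header.
// ASSUMPTION: the header contains columns named "ICO" and "NACE".
function makeNaceFilter(
  headerLine: string,
  prefixes: string[],
  delimiter = ",",
): (line: string) => Firm | null {
  const cols = headerLine.split(delimiter).map((c) => c.trim());
  const icoIdx = cols.indexOf("ICO");
  const naceIdx = cols.indexOf("NACE");
  if (icoIdx < 0 || naceIdx < 0) {
    throw new Error("ICO/NACE columns not found in header");
  }
  return (line) => {
    // Naive split: good enough for a sketch, wrong for quoted fields.
    const cells = line.split(delimiter);
    const ico = (cells[icoIdx] ?? "").trim();
    const nace = (cells[naceIdx] ?? "").trim();
    if (ico === "") return null;
    // Keep the row if its NACE code starts with any target prefix.
    return prefixes.some((p) => nace.startsWith(p)) ? { ico, nace } : null;
  };
}

// Made-up sample rows, for illustration only.
const header = "ICO,NAME,NACE";
const rows = [
  "00000001,Alfa s.r.o.,69200",
  "00000002,Beta a.s.,62010",
  "00000003,Gama s.r.o.,692",
];

const keep = makeNaceFilter(header, ["692"]);
const matched = rows.map(keep).filter((f): f is Firm => f !== null);
console.log(matched);
```

In a real run the same `keep` function would be fed lines from a streaming reader instead of an in-memory array, and expanding to NACE 620/631 would only mean passing more prefixes.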
Stack choices
- Bun over Node - faster startup, native SQLite (bun:sqlite)
- better-sqlite3 + Bun.sql - sync DB ops, no async overhead for batch
- Playwright for JS-rendered sites, raw fetch for static HTML
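The split between raw fetch and Playwright needs some rule for deciding when a page is "JS-rendered". The source does not say how that decision is made, so the following is only one possible heuristic: fetch the static HTML first, and fall back to a browser when almost no visible text comes back. The function names and the 200-character threshold are hypothetical.

```typescript
// Rough estimate of how much human-readable text a static HTML response holds.
function visibleTextLength(html: string): number {
  const withoutCode = html
    .replace(/<script[\s\S]*?<\/script>/gi, " ") // drop inline and external scripts
    .replace(/<style[\s\S]*?<\/style>/gi, " "); // drop stylesheets
  const text = withoutCode
    .replace(/<[^>]+>/g, " ") // strip remaining tags
    .replace(/\s+/g, " ") // collapse whitespace
    .trim();
  return text.length;
}

// ASSUMPTION: a page with fewer than `minTextLength` visible characters is
// probably an empty app shell that only fills in after JavaScript runs.
function needsBrowser(html: string, minTextLength = 200): boolean {
  return visibleTextLength(html) < minTextLength;
}

// An SPA shell: no text until the bundle executes.
const spaShell =
  '<html><body><div id="root"></div><script src="/app.js"></script></body></html>';

// A static contact page with plenty of text in the markup itself.
const staticPage =
  "<html><body><h1>Kontakt</h1><p>" +
  "Účetní kancelář, Praha 1. ".repeat(20) +
  "</p></body></html>";

console.log(needsBrowser(spaShell), needsBrowser(staticPage));
```

A `true` result would be the cue to reopen the URL in Playwright. A `false` result means the fetched HTML can go straight to the contact parser, which keeps the slower browser path for the pages that need it.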
Outcome
- 100K+ firms loaded (Step 1 done)
- Steps 2–5 in incremental rollout
- Queryable from dev terminal via bun:sqlite
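As an illustration of the synchronous batch style and of querying the loaded firms, here is a small sketch using `bun:sqlite`. It needs the Bun runtime. The `companies` table, its columns, and the sample rows are assumptions, not the project's actual schema, and an in-memory database stands in for the real file.

```typescript
import { Database } from "bun:sqlite";

// Hypothetical row shape for the sketch.
interface CompanyRow {
  ico: string;
  name: string;
  nace: string;
}

// In-memory DB for the example; a real run would open a file path instead.
const db = new Database(":memory:");
db.run(
  "CREATE TABLE companies (ico TEXT PRIMARY KEY, name TEXT NOT NULL, nace TEXT NOT NULL)",
);

// One prepared statement, reused for every row.
const insert = db.prepare(
  "INSERT OR REPLACE INTO companies (ico, name, nace) VALUES (?, ?, ?)",
);

// Wrapping the loop in a transaction commits the whole batch at once.
const insertMany = db.transaction((batch: CompanyRow[]) => {
  for (const row of batch) insert.run(row.ico, row.name, row.nace);
});

// Made-up sample data.
insertMany([
  { ico: "00000001", name: "Alfa s.r.o.", nace: "69200" },
  { ico: "00000002", name: "Beta a.s.", nace: "62010" },
  { ico: "00000003", name: "Gama s.r.o.", nace: "692" },
]);

// Synchronous queries: no await anywhere in the batch path.
const accounting = db
  .query("SELECT ico, name FROM companies WHERE nace LIKE ? ORDER BY ico")
  .all("692%") as { ico: string; name: string }[];
const total = db.query("SELECT COUNT(*) AS n FROM companies").get() as {
  n: number;
};

console.log(accounting, total.n);
```

Because every call returns directly, the same statements can be pasted into a short script and run with `bun` from the terminal for ad hoc checks.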