External attack-surface monitoring for small businesses, built on the premise that a scanner is an attack with a permission slip, and the permission slip should be enforced by the architecture.
"No active packet leaves our infrastructure toward a target until we've verified the customer controls that asset. Unauthorized active scanning isn't a feature we gate; it's a crime we make unreachable."
The problem cuts both ways
A small business has internet-facing assets it doesn't know about and can't audit. The obvious fix, a tool that scans them, is itself dangerous: scanning a target you don't own is an attack, and a platform that catalogs other people's weaknesses is a high-value target the day it exists. So the design has two adversaries at once: the exposure on the customer's perimeter, and the abuse of the scanner itself.
Consent is the control
An unverified domain gets a passive scan: public data only, not one connection to the target. The full active pipeline unlocks only after DNS-TXT ownership verification. The distinction is a security boundary, not a plan tier.
unverified · passive mode
public data only
· DNS + certificate-transparency lookups
· breach-exposure check (public data)
· email auth records (SPF / DMARC)
not one packet leaves our infrastructure toward the target
dns-txt ownership proof
verified · full mode
1. subdomain discovery (subfinder, active)
2. port scan on found hosts (naabu)
3. web + TLS probe (httpx, tlsx)
4. templated exposure checks (nuclei, constrained)
The mode isn't a plan tier; it's a legal and safety boundary. Active scanning without authorization is a Computer Fraud and Abuse Act (CFAA) problem, so the architecture makes it unreachable rather than merely discouraged.
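A minimal sketch of how that gate could look in code, under stated assumptions: the token scheme (an HMAC over customer and domain), the `perimeter-verify=` TXT prefix, and the names `expected_txt_value`, `is_verified`, and `select_mode` are all hypothetical and not taken from the project. TXT records are passed in as plain strings, so the logic can be exercised without a DNS lookup.

```python
import hashlib
import hmac
from enum import Enum

# Hypothetical TXT prefix; the real record format is not shown in the source.
TXT_PREFIX = "perimeter-verify="


class ScanMode(Enum):
    PASSIVE = "passive"  # public data only, no packets toward the target
    FULL = "full"        # active pipeline, only after ownership proof


def expected_txt_value(secret: bytes, customer_id: str, domain: str) -> str:
    """Derive the TXT value a customer must publish for this domain."""
    normalized = domain.lower().rstrip(".")
    msg = f"{customer_id}:{normalized}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return TXT_PREFIX + digest


def is_verified(txt_records: list[str], expected: str) -> bool:
    """True only if a published TXT record matches the expected value exactly."""
    return any(
        hmac.compare_digest(record.strip().encode(), expected.encode())
        for record in txt_records
    )


def select_mode(txt_records: list[str], expected: str) -> ScanMode:
    """Default to passive; full mode is reachable only through verification."""
    if is_verified(txt_records, expected):
        return ScanMode.FULL
    return ScanMode.PASSIVE
```

The design choice worth copying is the default: every path that does not prove ownership falls through to `PASSIVE`, so a bug in the lookup fails closed.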
The scanner pipeline
A Next.js frontend talks to a FastAPI service that enqueues work on Redis; a Python worker runs the scan pipeline and persists findings to Postgres. The pipeline is a pure package with no web or database dependencies, which is what makes its security-critical stages unit-testable in isolation.
system · request path
Next.js web → FastAPI api → Redis queue → scanner worker → Postgres
pipeline · every scan, in order
safety → discovery → probe → vuln → enrich → score → validate → report
safety runs first in every mode: no stage sees a target that hasn't passed the SSRF and validation gate. The pipeline itself is a pure package (no web, no database): inputs in, a scan result out, which is what makes the security-critical paths unit-testable.
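To make the ordering concrete, here is a small sketch of a pure pipeline runner. It is not the project's `run_scan`: the names `run_pipeline`, `ScanResult`, and `UnsafeTarget` are hypothetical, and the choice of which stages are active-only is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

# Stage names follow the order in the diagram above.
STAGE_ORDER = ["safety", "discovery", "probe", "vuln",
               "enrich", "score", "validate", "report"]
# Assumption for this sketch: these stages touch the target directly.
ACTIVE_ONLY = {"probe", "vuln"}


class UnsafeTarget(Exception):
    """Raised by the safety stage; nothing downstream runs."""


@dataclass
class ScanResult:
    target: str
    mode: str
    stages_run: list[str] = field(default_factory=list)


Stage = Callable[[ScanResult], None]


def run_pipeline(target: str, mode: str, stages: dict[str, Stage]) -> ScanResult:
    """Run stages in fixed order; safety is always first, in every mode."""
    result = ScanResult(target=target, mode=mode)
    for name in STAGE_ORDER:
        if mode != "full" and name in ACTIVE_ONLY:
            continue  # anything other than full mode skips the active stages
        stages[name](result)  # an exception from "safety" aborts the scan
        result.stages_run.append(name)
    return result
```

Because the runner takes its stages as plain callables and returns a value, a unit test can swap in a failing safety stage and assert that no later stage ran, with no web server or database involved.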
In the code
SSRF-by-DNS: every answer must be public
A scanner is the ideal SSRF pivot: point it at a public-looking name that resolves to 169.254.169.254 and it fetches cloud-metadata credentials for you. The guard resolves the name and requires that EVERY answer is globally public. Requiring all of them, not just one, is the point: it defeats a DNS-rebinding answer that mixes a public and a private record. This is the app-layer half of the defense; a kernel egress firewall is the other half.
packages/scanner/safety.py · resolve_safe()
import ipaddress
import socket
from typing import cast

# UnsafeTargetError and _BLOCKED come from elsewhere in the package (not shown).


def resolve_safe(host: str) -> list[str]:
    """Resolve host and assert EVERY answer is public. Returns the pinned IPs."""
    infos = socket.getaddrinfo(host, None)
    ips = sorted({cast("str", i[4][0]) for i in infos})
    if not ips:
        raise UnsafeTargetError("does not resolve")
    for ip in ips:
        if not _ip_is_public(ip):
            raise UnsafeTargetError(f"{host} resolves to non-public address {ip}")
    return ips


def _ip_is_public(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    if not addr.is_global:  # private / reserved / loopback / multicast
        return False
    # _BLOCKED is an explicit second pass, so one missed is_global flag
    # can't open an internal range (incl. IPv4-mapped IPv6).
    return not any(addr in net for net in _BLOCKED)
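The "all, not any" rule is easy to demonstrate with the DNS answers injected instead of resolved. This is a standalone sketch, not the project's code: the two-entry `_BLOCKED` list is a hypothetical subset, and `all_answers_public` and `any_answer_public` are illustrative names.

```python
import ipaddress

# Hypothetical subset of a blocklist; the real _BLOCKED is not shown.
_BLOCKED = [ipaddress.ip_network(n) for n in ("169.254.0.0/16", "10.0.0.0/8")]


def ip_is_public(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    if not addr.is_global:
        return False
    # Explicit second pass over the blocklist, same idea as the excerpt.
    return not any(
        addr.version == net.version and addr in net for net in _BLOCKED
    )


def all_answers_public(ips: list[str]) -> bool:
    """The strict rule: an empty answer or any non-public record fails."""
    return bool(ips) and all(ip_is_public(ip) for ip in ips)


def any_answer_public(ips: list[str]) -> bool:
    """The weak rule, shown only for contrast."""
    return any(ip_is_public(ip) for ip in ips)


# One public record mixed with the cloud-metadata address.
mixed = ["8.8.8.8", "169.254.169.254"]
print(any_answer_public(mixed), all_answers_public(mixed))
```

With the mixed answer, the weak rule accepts the name and the strict rule rejects it, which is exactly the gap a mixed-record rebinding answer relies on.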
A passive scan can't emit an active finding
Free/passive scans use public data only, so they can never legitimately produce an open-port or CVE finding. This frozenset lists the kinds only the active block emits, and it lives next to the code that emits them so it can't drift. The persister imports it and DROPS any full-only kind that shows up on a passive result, then audits the anomaly. A forged passive payload claiming active findings gets dropped, not stored.
# The kinds that ONLY the active (full-mode) block of run_scan emits.
# Defined next to the mappers that produce them so it cannot drift; the
# persister imports THIS set and drops any full-only kind on a passive scan.
FULL_ONLY_FINDING_KINDS: frozenset[str] = frozenset(
    {
        "open_port",  # naabu open admin/db port (active connect)
        "tls_expired",  # tlsx certificate hygiene (active handshake)
        "tls_self_signed",
        "tls_weak_cipher",
        "cve",  # nuclei CVE finding (active web probe)
        "exposure",  # nuclei non-CVE exposure finding
        "breach",  # owner-verified HIBP domain search
    }
)
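The persister side is not shown, so here is a hedged sketch of what the drop-and-audit step might look like. `Finding`, `filter_for_mode`, and the `spf_missing` kind used in the usage lines are hypothetical; the frozenset is copied from the excerpt only so the sketch runs on its own, where the real persister would import it.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("persister")

# Copied for self-containment; the real code imports this set.
FULL_ONLY_FINDING_KINDS = frozenset({
    "open_port", "tls_expired", "tls_self_signed", "tls_weak_cipher",
    "cve", "exposure", "breach",
})


@dataclass(frozen=True)
class Finding:
    kind: str
    detail: str


def filter_for_mode(
    mode: str, findings: list[Finding]
) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (kept, dropped). Only mode == "full" keeps everything."""
    if mode == "full":
        return list(findings), []
    # Fail closed: any mode other than "full" is treated as passive.
    kept = [f for f in findings if f.kind not in FULL_ONLY_FINDING_KINDS]
    dropped = [f for f in findings if f.kind in FULL_ONLY_FINDING_KINDS]
    for f in dropped:
        # Audit the anomaly: a passive result should never carry these kinds.
        logger.warning("dropped full-only finding %r on %s scan", f.kind, mode)
    return kept, dropped


forged = [Finding("spf_missing", "no SPF record"), Finding("open_port", "5432/tcp")]
kept, dropped = filter_for_mode("passive", forged)
print([f.kind for f in kept], [f.kind for f in dropped])
```

Treating every mode other than `"full"` as passive means a misspelled or missing mode drops active findings instead of storing them.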
The kernel backstop, kept in parity by a test
The app-layer guard can be defeated by DNS rebinding after its check, and CLI tools like nuclei won't honor an app-pinned IP. So scanner nodes carry a destination-deny egress firewall that drops the private/reserved/metadata ranges at the kernel. It can't be a port allow-list, because a port scanner legitimately hits arbitrary public ports. A parity test fails CI if any range in the app-layer blocklist isn't covered here.
infra/scanner-egress.rules
# Destination-DENY: the scanner must reach arbitrary public IPs on
# arbitrary ports (it port-scans authorized targets), so we can't
# allow-list ports. We DROP the dangerous destinations instead,
# mirroring safety._BLOCKED exactly. Parity enforced by
# tests/test_egress_rules.py.
*filter
-A OUTPUT -d 169.254.0.0/16 -j DROP  # link-local -> cloud metadata
-A OUTPUT -d 10.0.0.0/8 -j DROP  # RFC1918 private
-A OUTPUT -d 172.16.0.0/12 -j DROP  # RFC1918 private
-A OUTPUT -d 192.168.0.0/16 -j DROP  # RFC1918 private
-A OUTPUT -d 127.0.0.0/8 -j DROP  # loopback
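The parity test itself is not shown. One way such a check could work, sketched under assumptions: `BLOCKED` and `RULES_TEXT` below are hypothetical stand-ins for `safety._BLOCKED` and the rules file, and `dropped_networks` and `uncovered` are illustrative names. The check passes when every app-layer range is contained in some kernel DROP rule.

```python
import ipaddress
import re

# Hypothetical stand-ins for the app-layer blocklist and the rules file.
BLOCKED = ["169.254.0.0/16", "10.0.0.0/8", "172.16.0.0/12",
           "192.168.0.0/16", "127.0.0.0/8"]

RULES_TEXT = """\
*filter
-A OUTPUT -d 169.254.0.0/16 -j DROP
-A OUTPUT -d 10.0.0.0/8 -j DROP
-A OUTPUT -d 172.16.0.0/12 -j DROP
-A OUTPUT -d 192.168.0.0/16 -j DROP
-A OUTPUT -d 127.0.0.0/8 -j DROP
"""

_DROP_RULE = re.compile(r"^-A OUTPUT\s+-d\s+(\S+)\s+-j DROP\b")


def dropped_networks(rules_text: str) -> list:
    """Parse the destination networks of OUTPUT DROP rules."""
    nets = []
    for line in rules_text.splitlines():
        match = _DROP_RULE.match(line.strip())
        if match:
            nets.append(ipaddress.ip_network(match.group(1)))
    return nets


def uncovered(blocked: list[str], rules_text: str) -> list[str]:
    """Return blocked ranges that no kernel DROP rule contains."""
    rules = dropped_networks(rules_text)
    missing = []
    for cidr in blocked:
        net = ipaddress.ip_network(cidr)
        # subnet_of raises on mixed IP versions, so compare versions first.
        covered = any(
            net.version == rule.version and net.subnet_of(rule) for rule in rules
        )
        if not covered:
            missing.append(cidr)
    return missing


def test_egress_rules_cover_blocklist() -> None:
    assert uncovered(BLOCKED, RULES_TEXT) == []
```

Using containment rather than string equality lets a broader kernel rule cover a narrower app-layer range; a stricter test could require an exact match if the two lists are meant to mirror each other line for line.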
the restraint
Perimeter detects and identifies exposure. It never exploits, never brute-forces credentials, never runs DoS or fuzz templates: nuclei's intrusive tags are excluded. It finds the unlocked door; it doesn't walk in. That's not a limitation I ran out of time to remove; it's the product boundary. The moment a monitoring tool starts exploiting, it stops being one.
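As an illustration of enforcing that boundary in code and not by convention, a command builder can refuse to produce a nuclei invocation that lacks the exclusions. This is a sketch, not the project's wrapper: the tag list and the name `nuclei_argv` are assumptions, and it relies on nuclei's `-u` and `-exclude-tags` flags, which should be checked against the installed version.

```python
# Tags treated as intrusive in this sketch; the project's actual list is not shown.
EXCLUDED_TAGS = ("intrusive", "dos", "fuzz", "bruteforce")


def nuclei_argv(target_url: str, excluded: tuple[str, ...] = EXCLUDED_TAGS) -> list[str]:
    """Build a nuclei command line that always carries the exclusions."""
    if not excluded:
        # No code path may run nuclei with the intrusive templates enabled.
        raise ValueError("refusing to build a nuclei command without exclusions")
    return ["nuclei", "-u", target_url, "-exclude-tags", ",".join(excluded)]
```

Returning an argument list (for `subprocess.run` without a shell) also keeps a hostile target string from being interpreted as shell syntax.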
Private repository, in active development. Happy to walk through the code on request.