
Compliance Document Substrate

Kingsley Onoh · 6 min read

  • Project: Compliance Document Substrate
  • Proof type: Architecture proof
  • Best for: CTO / architect
  • Source: Private build

Architectural Brief: Compliance Document Substrate

Klevar Docs had to solve a narrow problem with wide consequences: every entity in the group needs legal documents, but none of those documents can borrow identity, numbering, banking data, or audit history from another entity. The result is a headless document substrate that treats rendering, invoice compliance, signing, retention, and event fanout as one transaction boundary, not as separate utilities.

System Topology

[Architecture diagram: system topology]

Infrastructure Decisions

  • Compute: Docker on a DigitalOcean VPS with Traefik. Chose over serverless because Puppeteer, Ghostscript, veraPDF, mustangproject, and EU DSS need stable binaries, local filesystem access, and Unix socket IPC. A short-lived function would hide too much of the failure surface.
  • API Layer: Fastify 5 with schema-first TypeScript. Chose over a heavier framework because the service is a headless internal API with 59 OpenAPI path groups, scoped API keys, idempotency, and typed error envelopes. The router needs to stay explicit.
  • Data Layer: PostgreSQL 16 with Drizzle. Chose over document storage because invoice state, document identity, sequence allocation, chain verification, intercompany pairing, payments, and retention rules all need relational constraints. JSONB is used for document bodies and snapshots, but the invariants live in tables.
  • Queue Layer: Redis 7 plus BullMQ. Chose over database-only jobs because hash-chain verification, TSA anchoring, audit packet generation, and certificate checks are retryable background work that needs a dedicated queue. Long-running scheduled work that belongs outside the service is delegated to the Workflow Engine.
  • Rendering: Handlebars plus Puppeteer. Chose over a PDF-only library because the system has 42 authored template bundles and needs browser-grade CSS, markdown sections, signature blocks, status banners, and shared partials. The tradeoff is operational: Chromium is SHA-pinned so a binary drift cannot silently change output.
  • Compliance Sidecars: mustangproject and EU DSS run as two Java processes behind Unix sockets. Chose over embedding Java libraries into Node because iText dependency versions conflict across the e-invoicing and signing stacks. The two-JVM form keeps each compliance tool in its own runtime while the TypeScript API owns orchestration.
  • Storage: Supabase Storage buckets for PDFs, XML, archives, and signing cert material. Chose over local disk because generated documents need signed URLs, archive movement, and stable retrieval after container replacement.
  • Ecosystem Boundary: Notification Hub, Webhook Engine, and Workflow Engine are external services. Chose over building email, webhook retries, and cron dashboards inside Klevar Docs because this service has one job: produce and prove documents. Delivery and orchestration are shared infrastructure.
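The brief mentions hash-chain rows and background chain verification without describing the scheme. The following is a minimal sketch of how such a chain could be built and checked, assuming SHA-256 and a hypothetical row shape. `ChainRow`, `appendRow`, `verifyChain`, and the `GENESIS` value are illustrative names, not taken from the codebase.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape; the real table and column names are not given in the brief.
interface ChainRow {
  entityId: string;
  payload: string;  // canonical serialization of the audited event
  prevHash: string; // hash of the previous row, or GENESIS for the first row
  hash: string;
}

// Assumed starting value for the first row of a chain.
const GENESIS = "0".repeat(64);

// Each hash covers the entity scope, the previous hash, and the payload.
function rowHash(entityId: string, prevHash: string, payload: string): string {
  return createHash("sha256")
    .update([entityId, prevHash, payload].join("\n"))
    .digest("hex");
}

// Builds the next row; the caller is responsible for persisting it.
function appendRow(chain: readonly ChainRow[], entityId: string, payload: string): ChainRow {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  return { entityId, payload, prevHash, hash: rowHash(entityId, prevHash, payload) };
}

// Returns the index of the first row that breaks the chain, or -1 if it is intact.
function verifyChain(chain: readonly ChainRow[]): number {
  let expectedPrev = GENESIS;
  return chain.findIndex((row) => {
    const linked = row.prevHash === expectedPrev;
    const intact = row.hash === rowHash(row.entityId, row.prevHash, row.payload);
    expectedPrev = row.hash;
    return !(linked && intact);
  });
}
```

Because every row commits to its predecessor, editing one stored payload invalidates that row's hash, and the verifier reports the first index where the chain stops matching. A queued job could run a check like this per entity and raise an audit event on any result other than -1.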

Constraints That Shaped the Design

  • Input: API requests from internal systems and the CLI, each carrying an entity scope, a document type, and a schema-gated body. Mutating calls require Idempotency-Key.
  • Output: Rendered PDFs, optional XML artifacts, payment links, audit events, hash-chain rows, and storage paths. The service does not expose a client-facing UI.
  • Scale Handled: The repo currently carries 41 schema files, 42 template bundles, 14 document types, and 59 OpenAPI path groups. The scale problem is correctness surface, not public traffic.
  • Hard Constraint: Every data-bearing row is entity-scoped except intentional global rows like vendors. The codebase uses entity_id, never tenant_id, because the business is one group with multiple legal entities, not a public multi-tenant SaaS.
  • Finalization Constraint: Document numbering cannot rely on PostgreSQL sequences because a value consumed by nextval is not returned when the transaction rolls back, which leaves silent gaps. The allocator uses pending reservations, documented gaps, advisory locks, and a reaper.
  • Compliance Constraint: Factur-X failures and PDF/A-3b failures halt invoice send. The service does not substitute a valid-looking PDF when the compliance artifact fails.
  • Operational Constraint: docs.klevar.ai responds, but the registry still marks the project as in development, so public frontmatter should not present it as a live demo.
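The finalization constraint above can be sketched as a small in-memory allocator. This is an illustration under stated assumptions, not the service's implementation: the class name `SequenceAllocator`, the status names, the TTL parameter, and the policy (released numbers are reused, expired reservations become documented gaps) are all hypothetical. In PostgreSQL the `reserve` step would presumably be serialized by the advisory locks the brief mentions, which a single-threaded sketch does not need.

```typescript
type ReservationStatus = "pending" | "final" | "released" | "gap";

interface Reservation {
  number: number;
  status: ReservationStatus;
  expiresAt: number;  // epoch milliseconds; only meaningful while pending
  gapReason?: string; // filled in when the reaper documents a gap
}

class SequenceAllocator {
  private readonly rows: Reservation[] = [];
  private readonly ttlMs: number;
  private nextNumber = 1;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Reserve a number: reuse the lowest released one, otherwise take the next.
  reserve(now: number): number {
    const reusable = this.rows.find((r) => r.status === "released");
    if (reusable) {
      reusable.status = "pending";
      reusable.expiresAt = now + this.ttlMs;
      return reusable.number;
    }
    const number = this.nextNumber++;
    this.rows.push({ number, status: "pending", expiresAt: now + this.ttlMs });
    return number;
  }

  // Finalization succeeded: the number is permanently used.
  finalize(number: number, now: number): void {
    this.requirePending(number, now).status = "final";
  }

  // Finalization rolled back cleanly: the number becomes reclaimable.
  release(number: number, now: number): void {
    this.requirePending(number, now).status = "released";
  }

  // Reaper: expired reservations become documented gaps, never silent ones.
  reap(now: number): number[] {
    const reaped: number[] = [];
    for (const row of this.rows) {
      if (row.status === "pending" && row.expiresAt <= now) {
        row.status = "gap";
        row.gapReason = "reservation expired";
        reaped.push(row.number);
      }
    }
    return reaped;
  }

  // All documented gaps, for audit export.
  gaps(): Reservation[] {
    return this.rows.filter((r) => r.status === "gap");
  }

  private requirePending(number: number, now: number): Reservation {
    const row = this.rows.find((r) => r.number === number);
    if (!row || row.status !== "pending" || row.expiresAt <= now) {
      throw new Error(`number ${number} is not an active reservation`);
    }
    return row;
  }
}
```

The point of the shape is that every number ever handed out ends in exactly one recorded state. A native sequence only knows the last value it issued, so it cannot say what happened to a number whose transaction rolled back.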

Data Model

[Architecture diagram: data model]

Decision Log

| Decision | Alternative Rejected | Why |
| --- | --- | --- |
| Entity snapshots on every rendered document | Reading the live entity row on re-render | Legal output must match the issuance state even after brand, address, banking, or officer data changes. |
| entity_id everywhere, no tenant_id | Generic tenant middleware | The business has one owner group and multiple legal entities. A tenant abstraction would hide the actual compliance boundary. |
| Pending allocation plus sequence gaps | PostgreSQL sequences | Sequence values are not returned when a transaction rolls back, so failed finalizations leave undocumented gaps. German-style gap accounting needs documented gaps and reclaimable reservations. |
| Private Handlebars instances per compose | Global Handlebars singleton | Global helper registration leaks across tests and modules. Private instances keep template behavior local to one render. |
| mustangproject and EU DSS sidecars | Native TypeScript implementations | EN 16931, XRechnung, and PAdES are compliance domains. Reimplementing them would create liability without improving control. |
| Unix socket IPC | Public HTTP ports between containers | The sidecars are internal compliance tools. Sockets on a named tmpfs volume keep them off the network surface. |
| Factur-X fail-closed send path | Ship plain PDF and warn later | A failed XML or PDF/A artifact is not a valid invoice package. The send operation must stop before the client receives it. |
| Workflow Engine for selected crons | Keep every schedule in BullMQ | Late fees, dunning, recurring invoices, and retention sweeps are operational workflows. The service keeps the function, the Workflow Engine owns the schedule. |
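The fail-closed send path from the decision log can be sketched as a gate that runs every compliance check before delivery and never falls back to a plain PDF. The names here (`sendInvoice`, `ComplianceChecks`, `SendResult`) are hypothetical, and the two validator functions stand in for whatever the sidecars and PDF/A tooling actually return.

```typescript
interface InvoicePackage {
  pdf: Uint8Array; // rendered PDF, expected to be PDF/A-3b
  xml: Uint8Array; // embedded e-invoice XML
}

// Each validator returns a list of violations; an empty list means it passed.
interface ComplianceChecks {
  validateFacturX(xml: Uint8Array): string[];
  validatePdfA3b(pdf: Uint8Array): string[];
}

type SendResult =
  | { status: "sent" }
  | { status: "halted"; failures: string[] };

function sendInvoice(
  pkg: InvoicePackage,
  checks: ComplianceChecks,
  deliver: (pkg: InvoicePackage) => void,
): SendResult {
  // Run both checks so the caller sees every failure, not only the first.
  const failures = [
    ...checks.validateFacturX(pkg.xml).map((e) => `factur-x: ${e}`),
    ...checks.validatePdfA3b(pkg.pdf).map((e) => `pdf/a-3b: ${e}`),
  ];

  // Fail closed: any violation stops the send before delivery is attempted.
  if (failures.length > 0) {
    return { status: "halted", failures };
  }

  deliver(pkg);
  return { status: "sent" };
}
```

There is deliberately no branch that delivers a degraded artifact. The only way to reach `deliver` is through an empty failure list.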

Operating Limits

The current design is shaped for internal Klevar volume: invoices, board records, engagement letters, vendor bills, receipts, reports, and audit packets across FZE, LLC, and Ltd. The first scaling pressure will not be request throughput. It will be conformance breadth: more jurisdictions, more e-invoice profiles, more external verification modes, and more document families.

If this ever becomes a public product, the first rebuild is not horizontal scaling. It is tenancy. The schema, auth, retention policy, event surfaces, and audit exports would need a real tenant boundary from day one. Until then, the narrower model is the right one because it makes the legal entity boundary visible everywhere the code writes data.
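One way to keep the entity boundary visible at every write is to make the scope a required step before any row can be touched. The sketch below assumes an in-memory table and hypothetical names (`DocumentTable`, `forEntity`, and the entity codes derived from the three legal forms named above). The real service enforces this in PostgreSQL, and its mechanism is not described in the brief.

```typescript
// Hypothetical entity codes; the real identifiers are not given in the brief.
type EntityId = "fze" | "llc" | "ltd";

interface DocumentRow {
  id: string;
  entityId: EntityId;
  number: string;
}

// The only handle callers get: every operation is already bound to one entity.
interface EntityScope {
  insert(doc: { id: string; number: string }): void;
  find(id: string): DocumentRow | undefined;
  list(): DocumentRow[];
}

class DocumentTable {
  private readonly rows: DocumentRow[] = [];

  // There is no unscoped read or write path; callers must name the entity first.
  forEntity(entityId: EntityId): EntityScope {
    const rows = this.rows;
    return {
      // The scope stamps entityId itself, so a caller cannot supply another one.
      insert: (doc) => {
        rows.push({ id: doc.id, number: doc.number, entityId });
      },
      // Lookups by id still filter on entity, so ids do not leak across scopes.
      find: (id) => rows.find((r) => r.id === id && r.entityId === entityId),
      list: () => rows.filter((r) => r.entityId === entityId),
    };
  }
}
```

A caller holding the `llc` scope cannot read a document inserted through the `fze` scope, even with the right id. That is the property the narrower model buys without a generic tenant layer.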

#typescript#fastify#postgresql#pdf-a#e-invoicing
