Architectural Brief: Compliance Document Substrate
Klevar Docs had to solve a narrow problem with wide consequences: every entity in the group needs legal documents, but none of those documents can borrow identity, numbering, banking data, or audit history from another entity. The result is a headless document substrate that treats rendering, invoice compliance, signing, retention, and event fanout as one transaction boundary, not as separate utilities.
System Topology
Infrastructure Decisions
- Compute: Docker on a DigitalOcean VPS with Traefik. Chose over serverless because Puppeteer, Ghostscript, veraPDF, mustangproject, and EU DSS need stable binaries, local filesystem access, and Unix socket IPC. A short-lived function would hide too much of the failure surface.
- API Layer: Fastify 5 with schema-first TypeScript. Chose over a heavier framework because the service is a headless internal API with 59 OpenAPI path groups, scoped API keys, idempotency, and typed error envelopes. The router needs to stay explicit.
- Data Layer: PostgreSQL 16 with Drizzle. Chose over document storage because invoice state, document identity, sequence allocation, chain verification, intercompany pairing, payments, and retention rules all need relational constraints. JSONB is used for document bodies and snapshots, but the invariants live in tables.
- Queue Layer: Redis 7 plus BullMQ. Chose over database-only jobs because hash-chain verification, TSA anchoring, audit packet generation, certificate checks, and retryable background work need a queue. Long-running scheduled work that belongs outside the service is delegated to the Workflow Engine.
- Rendering: Handlebars plus Puppeteer. Chose over a PDF-only library because the system has 42 authored template bundles and needs browser-grade CSS, markdown sections, signature blocks, status banners, and shared partials. The tradeoff is operational: Chromium is SHA-pinned so that binary drift cannot silently change output.
- Compliance Sidecars: mustangproject and EU DSS run as two Java processes behind Unix sockets. Chose over embedding Java libraries into Node because iText dependency versions conflict across the e-invoicing and signing stacks. The two-JVM form keeps each compliance tool in its own runtime while the TypeScript API owns orchestration.
- Storage: Supabase Storage buckets for PDFs, XML, archives, and signing cert material. Chose over local disk because generated documents need signed URLs, archive movement, and stable retrieval after container replacement.
- Ecosystem Boundary: Notification Hub, Webhook Engine, and Workflow Engine are external services. Chose over building email, webhook retries, and cron dashboards inside Klevar Docs because this service has one job: produce and prove documents. Delivery and orchestration are shared infrastructure.
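
The idempotency requirement on the API layer can be illustrated with a minimal in-memory sketch. It assumes that keys are scoped per entity and that reusing a key with a different body is rejected with a typed error envelope; this brief confirms neither rule. `IdempotencyStore`, `IDEMPOTENCY_KEY_REUSED`, and every other name here are hypothetical, and a real implementation would presumably persist entries in PostgreSQL or Redis rather than in a `Map`.

```typescript
// Hypothetical types: none of these names come from the Klevar Docs codebase.
type StoredResponse = { status: number; body: unknown };

type IdempotencyResult =
  | { kind: "fresh"; response: StoredResponse }
  | { kind: "replayed"; response: StoredResponse }
  | { kind: "conflict"; error: { code: "IDEMPOTENCY_KEY_REUSED"; message: string } };

class IdempotencyStore {
  // Assumption: slots are keyed by entity scope plus Idempotency-Key,
  // so the same key used by two entities never collides.
  private readonly entries = new Map<string, { fingerprint: string; response: StoredResponse }>();

  execute(
    entityId: string,
    idempotencyKey: string,
    requestBody: unknown,
    handler: () => StoredResponse,
  ): IdempotencyResult {
    const slot = `${entityId}:${idempotencyKey}`;
    // Naive fingerprint: JSON.stringify is sensitive to property order.
    // A real service would hash a canonical form of the body.
    const fingerprint = JSON.stringify(requestBody);
    const existing = this.entries.get(slot);

    if (existing) {
      if (existing.fingerprint !== fingerprint) {
        // Same key, different body: refuse instead of guessing which one was meant.
        return {
          kind: "conflict",
          error: {
            code: "IDEMPOTENCY_KEY_REUSED",
            message: "Idempotency-Key was already used with a different request body",
          },
        };
      }
      // Same key, same body: return the stored response without re-running the handler.
      return { kind: "replayed", response: existing.response };
    }

    const response = handler();
    this.entries.set(slot, { fingerprint, response });
    return { kind: "fresh", response };
  }
}
```

In this sketch a retried mutating call runs its handler at most once per entity and key, which is the property that makes client retries safe.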
Constraints That Shaped the Design
- Input: API requests from internal systems and the CLI, each carrying an entity scope, a document type, and a schema-gated body. Mutating calls require `Idempotency-Key`.
- Output: Rendered PDFs, optional XML artifacts, payment links, audit events, hash-chain rows, and storage paths. The service does not expose a client-facing UI.
- Scale Handled: The repo currently carries 41 schema files, 42 template bundles, 14 document types, and 59 OpenAPI path groups. The scale problem is correctness surface, not public traffic.
- Hard Constraint: Every data-bearing row is entity-scoped except intentional global rows like `vendors`. The codebase uses `entity_id`, never `tenant_id`, because the business is one group with multiple legal entities, not a public multi-tenant SaaS.
- Finalization Constraint: Document numbering cannot rely on PostgreSQL sequences because a sequence value consumed by a transaction that later rolls back is never returned. The allocator uses pending reservations, documented gaps, advisory locks, and a reaper.
- Compliance Constraint: Factur-X failures and PDF/A-3b failures halt invoice send. The service does not substitute a valid-looking PDF when the compliance artifact fails.
- Operational Constraint: `docs.klevar.ai` responds, but the registry still marks the project as in development, so public frontmatter should not present it as a live demo.
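
The finalization constraint can be modeled with a small in-memory sketch of pending reservations, documented gaps, and a reaper. It is illustrative only: `NumberAllocator` and its methods are hypothetical names. The advisory locks that serialize allocation in PostgreSQL are not modeled because the example is single-threaded. The sketch also assumes that an expired reservation becomes a documented gap rather than being reissued, and the real allocator may reclaim such numbers instead.

```typescript
// Hypothetical in-memory model; the real allocator lives in PostgreSQL tables.
type Reservation = { number: number; state: "pending" | "finalized"; reservedAt: number };
type Gap = { number: number; reason: string };

class NumberAllocator {
  private next = 1;
  private readonly reservations = new Map<number, Reservation>();
  // Every number that was handed out but never finalized is recorded here.
  readonly gaps: Gap[] = [];

  // Hand out the next number as a pending reservation.
  reserve(now: number): number {
    const number = this.next;
    this.next += 1;
    this.reservations.set(number, { number, state: "pending", reservedAt: now });
    return number;
  }

  // Promote a pending reservation once the document is actually finalized.
  finalize(number: number): void {
    const reservation = this.reservations.get(number);
    if (!reservation || reservation.state !== "pending") {
      throw new Error(`No pending reservation for number ${number}`);
    }
    reservation.state = "finalized";
  }

  // Reaper: expire pending reservations older than ttlMs and document each as a gap.
  reap(now: number, ttlMs: number): number[] {
    const reaped: number[] = [];
    for (const reservation of Array.from(this.reservations.values())) {
      if (reservation.state === "pending" && now - reservation.reservedAt > ttlMs) {
        this.reservations.delete(reservation.number);
        this.gaps.push({ number: reservation.number, reason: "reservation expired" });
        reaped.push(reservation.number);
      }
    }
    return reaped;
  }
}
```

Unlike a bare sequence, every number that fails to become a document leaves an explicit row explaining why, so an auditor sees a documented gap rather than a silent one.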
Data Model
Decision Log
| Decision | Alternative Rejected | Why |
|---|---|---|
| Entity snapshots on every rendered document | Reading the live entity row on re-render | Legal output must match the issuance state even after brand, address, banking, or officer data changes. |
| `entity_id` everywhere, no `tenant_id` | Generic tenant middleware | The business has one owner group and multiple legal entities. A tenant abstraction would hide the actual compliance boundary. |
| Pending allocation plus sequence gaps | PostgreSQL sequences | A sequence value consumed by a rolled-back transaction is never returned. German-style gap accounting needs documented gaps and reclaimable reservations. |
| Private Handlebars instances per compose | Global Handlebars singleton | Global helper registration leaks across tests and modules. Private instances keep template behavior local to one render. |
| mustangproject and EU DSS sidecars | Native TypeScript implementations | EN16931, XRechnung, and PAdES are compliance domains. Reimplementing them would create liability without improving control. |
| Unix socket IPC | Public HTTP ports between containers | The sidecars are internal compliance tools. Sockets on a named tmpfs volume keep them off the network surface. |
| Factur-X fail-closed send path | Ship plain PDF and warn later | A failed XML or PDF/A artifact is not a valid invoice package. The send operation must stop before the client receives it. |
| Workflow Engine for selected crons | Keep every schedule in BullMQ | Late fees, dunning, recurring invoices, and retention sweeps are operational workflows. The service keeps the function, the Workflow Engine owns the schedule. |
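
The fail-closed send decision can be sketched as a guard that runs both compliance checks before any delivery happens. This is a simplified, synchronous illustration: `ComplianceChecks`, `sendInvoice`, and the `COMPLIANCE_FAILED` code are hypothetical names. The mapping of checks to tools (mustangproject for Factur-X, veraPDF for PDF/A-3b) is an assumption based on the tool list in this brief. The real path would be asynchronous and would talk to the sidecars.

```typescript
// Hypothetical shapes; the real service calls sidecars and storage asynchronously.
type CheckResult = { ok: true } | { ok: false; reason: string };

interface InvoicePackage {
  pdf: Uint8Array;
  xml: string;
}

interface ComplianceChecks {
  validateFacturX(xml: string): CheckResult; // assumed wrapper around mustangproject
  validatePdfA3b(pdf: Uint8Array): CheckResult; // assumed wrapper around veraPDF
}

type SendOutcome =
  | { sent: true }
  | { sent: false; error: { code: "COMPLIANCE_FAILED"; failures: string[] } };

function sendInvoice(
  pkg: InvoicePackage,
  checks: ComplianceChecks,
  deliver: (pkg: InvoicePackage) => void,
): SendOutcome {
  const failures: string[] = [];

  const xmlResult = checks.validateFacturX(pkg.xml);
  if (xmlResult.ok === false) failures.push(`factur-x: ${xmlResult.reason}`);

  const pdfResult = checks.validatePdfA3b(pkg.pdf);
  if (pdfResult.ok === false) failures.push(`pdf/a-3b: ${pdfResult.reason}`);

  // Fail closed: any compliance failure stops the send. There is no
  // fallback branch that ships a plain PDF in place of the invoice package.
  if (failures.length > 0) {
    return { sent: false, error: { code: "COMPLIANCE_FAILED", failures } };
  }

  deliver(pkg);
  return { sent: true };
}
```

The point of the shape is that `deliver` is reachable only after both checks pass, so a "warn later" path cannot be introduced without changing the function's control flow.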
Operating Limits
The current design is shaped for internal Klevar volume: invoices, board records, engagement letters, vendor bills, receipts, reports, and audit packets across FZE, LLC, and Ltd. The first scaling pressure will not be request throughput. It will be conformance breadth: more jurisdictions, more e-invoice profiles, more external verification modes, and more document families.
If this ever becomes a public product, the first rebuild is not horizontal scaling. It is tenancy. The schema, auth, retention policy, event surfaces, and audit exports would need a real tenant boundary from day one. Until then, the narrower model is the right one because it makes the legal entity boundary visible everywhere the code writes data.