Architectural Brief: Inventory Allocation Simulator
Inventory planning fails when yesterday's stock position is treated as today's truth. This system separates mutable planning data from frozen simulation evidence so a planner can ask why a transfer was recommended after the warehouse, SKU, or lane data has already changed.
System Topology
Infrastructure Decisions
- Compute: Docker Compose with a Julia application container. Chose this over a split Python optimizer service because Genie, JuMP, and the worker lifecycle can ship as one deployable unit without a cross-language boundary.
- Data Layer: PostgreSQL 16 for tenant data, simulations, recommendations, decisions, local notifications, and outbox rows. Chose this over document storage because the domain depends on scoped relationships between warehouses, SKUs, lanes, policies, and audit records.
- Analytics Layer: DuckDB for backtests. Chose this over loading every historical analysis into PostgreSQL because CSV-heavy replay work should not compete with the operational API tables.
- Optimization: JuMP with HiGHS. Chose this over a black-box model because planners need binding constraints, service-level tail behavior, solver status, and net-value math they can inspect.
- Frontend: Server-rendered Genie views with HTMX. Chose this over a separate SPA because the console is form-heavy and operational, not a consumer app.
- Jobs: Redis plus persisted job and run tables. Chose this over purely in-memory tasks because imports, outbox dispatch, and simulations need restart-safe status.
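
The restart-safe status requirement for jobs can be illustrated with a minimal sketch. It is written in Python with SQLite standing in for the PostgreSQL job table, and it leaves Redis out entirely. The table layout, status labels, and function names are all assumptions made for illustration; the brief does not describe the actual schema or worker code.

```python
import sqlite3


def open_job_store():
    """Create a hypothetical job table (SQLite stands in for PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, status TEXT NOT NULL)"
    )
    return conn


def enqueue(conn, kind):
    """Persist the job before any worker sees it."""
    cur = conn.execute(
        "INSERT INTO jobs (kind, status) VALUES (?, 'queued')", (kind,)
    )
    conn.commit()
    return cur.lastrowid


def claim(conn, job_id):
    """Mark the job as running before work starts, so a crash leaves evidence."""
    conn.execute(
        "UPDATE jobs SET status = 'running' WHERE id = ? AND status = 'queued'",
        (job_id,),
    )
    conn.commit()


def recover_after_restart(conn):
    """Requeue jobs that were interrupted mid-run; returns how many were reset."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'queued' WHERE status = 'running'"
    )
    conn.commit()
    return cur.rowcount


def status_of(conn, job_id):
    """Read the persisted status of one job."""
    row = conn.execute(
        "SELECT status FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()
    return row[0]
```

With in-memory tasks alone, a job interrupted by a restart would simply vanish. With a persisted row, a startup hook such as `recover_after_restart` can find it and decide what to do with it.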
Constraints That Shaped the Design
- Input: The system accepts CSV and API data for warehouses, SKUs, inventory positions, demand history, transfer lanes, and allocation policies.
- Output: The system produces simulation runs, stored demand scenarios, allocation recommendations, decision audit rows, CSV exports, local notifications, and optional ecosystem outbox events.
- Scale Handled: The benchmark fixture covers 50 warehouses, 2,000 SKUs, and 100 demand scenarios. The live Batch 053 run completed in 17,928.4753 ms against a 600,000 ms target and produced 2,000 recommendations.
- Hard Constraints: `scenario_count` is capped at 100 in `src/planning/simulations_lifecycle.jl` even though configuration parsing accepts higher defaults. The cap prevents one API request from creating runaway worker and database load.
- Solver Limits: `SOLVER_TIMEOUT_SECONDS=120` and `MAX_SOLVER_GAP=0.05` come from `.env.example`. Time-limit incumbents are accepted only when the gap is at or below the configured ceiling.
- Audit Boundary: `capture_simulation_input_snapshot` reads up to `SNAPSHOT_MAX_ROWS = 1_000_000` for each planning surface and stores the snapshot before worker processing begins.
- Integration Boundary: Delivery Gateway, Notification Hub, and Workflow Engine are disabled by default. The simulator keeps CSV import, simulation, review, notification, and export paths alive without them.
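
The scenario cap and the solver-gap rule above can be sketched as two small checks. This is an illustrative Python sketch, not the Julia code in `src/planning/simulations_lifecycle.jl`. The names `SCENARIO_COUNT_CAP`, `effective_scenario_count`, `SolveResult`, and `accept_solution` are hypothetical, as are the status labels. The sketch also assumes an over-cap request is clamped rather than rejected, which the brief does not specify.

```python
from dataclasses import dataclass
from typing import Optional

SCENARIO_COUNT_CAP = 100  # cap stated in the brief
MAX_SOLVER_GAP = 0.05     # value the brief attributes to .env.example


def effective_scenario_count(requested: int) -> int:
    """Clamp a requested scenario count to the cap (assumed behavior)."""
    if requested < 1:
        raise ValueError("scenario_count must be at least 1")
    return min(requested, SCENARIO_COUNT_CAP)


@dataclass(frozen=True)
class SolveResult:
    status: str                    # hypothetical labels: "optimal", "time_limit", ...
    has_incumbent: bool            # whether the solver found any feasible solution
    relative_gap: Optional[float]  # None when no incumbent exists


def accept_solution(result: SolveResult, max_gap: float = MAX_SOLVER_GAP) -> bool:
    """Accept optimal solutions, and time-limit incumbents only within the gap ceiling."""
    if result.status == "optimal":
        return True
    if result.status == "time_limit":
        return (
            result.has_incumbent
            and result.relative_gap is not None
            and result.relative_gap <= max_gap
        )
    return False
```

Under these assumptions, a request for 250 scenarios runs with 100, and a time-limited solve with a 5.1% gap is rejected while one at exactly 5% is accepted.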
Decision Log
| Decision | Alternative Rejected | Why |
|---|---|---|
| Frozen `input_snapshot` on simulation creation | Reread live tables during worker processing | Completed runs must remain explainable after inventory, demand, lanes, or policy settings change. |
| Stockout-adjusted demand cleaning | Treat observed sales as demand | A zero-sales period during a stockout reflects unavailable inventory, not proof of low demand. |
| Shared `recommendation_net_value` | Recompute net value separately in API, UI, CSV, and notifications | One calculation path prevents a planner seeing one value in the console and another in the export. |
| Local notifications plus optional outbox | Direct external notifications only | Adapter failure cannot change recommendation truth or make the standalone console unusable. |
| PostgreSQL decision rows | Status changes without an audit record | Approve, reject, export, and expire actions need user, reason, time, and idempotency evidence. |
| Server-rendered console | React SPA | The UI is an operations workbench with tables, forms, and review screens; server rendering keeps deployment smaller. |
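
Two rows in the table, the frozen snapshot and stockout-adjusted demand cleaning, lend themselves to a short Python sketch. Both functions are hypothetical illustrations. In particular, the brief does not say how demand is cleaned, so filling stockout periods with the mean of in-stock periods is an assumed rule chosen for simplicity.

```python
import copy
from statistics import mean


def freeze_snapshot(live_tables: dict) -> dict:
    """Deep-copy planning inputs at simulation creation (hypothetical helper).

    Later edits to the live tables do not alter the returned snapshot.
    """
    return copy.deepcopy(live_tables)


def clean_demand(sales, stocked_out):
    """Replace sales observed during stockouts with the mean of in-stock periods.

    Assumed cleaning rule, for illustration only. Zero sales in a period that
    was in stock are kept, because they are genuine evidence of low demand.
    """
    in_stock = [s for s, out in zip(sales, stocked_out) if not out]
    if not in_stock:
        # Nothing to infer from; return the observations unchanged.
        return list(sales)
    fill = mean(in_stock)
    return [fill if out else s for s, out in zip(sales, stocked_out)]


# Usage: the live row changes after the snapshot is taken.
live = {"inventory": [{"sku": "A", "on_hand": 5}]}
snapshot = freeze_snapshot(live)
live["inventory"][0]["on_hand"] = 0  # the snapshot still records 5

# Period 2 was a stockout; period 4 was in stock with no sales.
cleaned = clean_demand([10, 0, 14, 0], [False, True, False, False])
```

In the usage lines, only the stockout period is replaced. The in-stock zero in the last period stays in the series, which is the distinction the decision row draws.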
Scaling Limits
The current solver benchmark proves the reference workload, not every network shape. A dense lane graph across every warehouse and SKU would create many more lane-SKU variables than the shipped benchmark. At that point the next design change would be partitioning by region, category, or policy rather than asking one optimization model to cover the full network at once.
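
A rough count shows why a dense lane graph strains a single model, and how a regional split might look. The Python sketch below assumes one decision variable per directed lane per SKU, which the brief does not confirm. The function names and the region mapping are hypothetical.

```python
from collections import defaultdict


def dense_variable_count(warehouses: int, skus: int) -> int:
    """Lane-SKU variables if every ordered warehouse pair is a lane (assumed model)."""
    return warehouses * (warehouses - 1) * skus


def partition_lanes(lanes, region_of):
    """Group lanes by region; lanes that cross regions are returned separately.

    `lanes` is a list of (origin, destination) pairs and `region_of` maps a
    warehouse to its region. Both are hypothetical inputs for this sketch.
    """
    by_region = defaultdict(list)
    cross_region = []
    for origin, dest in lanes:
        if region_of[origin] == region_of[dest]:
            by_region[region_of[origin]].append((origin, dest))
        else:
            cross_region.append((origin, dest))
    return dict(by_region), cross_region
```

For the benchmark dimensions, `dense_variable_count(50, 2000)` gives 4,900,000 variables under this assumption. Each regional group could be solved as its own smaller model, while cross-region lanes would need separate handling that this sketch does not attempt.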
The architecture already has the important boundary: simulations are frozen, jobs are persisted, and recommendations are audit records. Scaling the optimizer changes the worker strategy. It does not change the planner contract.