Architectural Brief: Centralized Property Intelligence Hub
Four microservices, four separate Supabase databases, one AI agent that needed to query all of them at once. The application scraped UK property listings, analyzed floorplans, scored images for condition defects, and assigned unique identifiers across all sources. Each service worked fine in isolation. But the RAG agent powering the frontend couldn't join data across four databases without making four separate API calls per question, and those calls were slow, inconsistent, and impossible to filter in combination.
This is the replication layer that consolidated everything into a single queryable surface.
System Topology
Infrastructure Decisions
- Replication Method: PostgreSQL Native Logical Replication (Publication/Subscription). Chose over CDC tools like Debezium or custom API-based sync because the source and target are both PostgreSQL on Supabase. Logical replication is built into the engine, requires no middleware, and streams changes in near-real-time. Debezium would have added a Kafka cluster and connector management for a problem that PostgreSQL solves natively.
- Data Layer: Supabase (managed PostgreSQL) for all five databases. Chose over self-hosted RDS because the client's existing microservices already ran on Supabase. Adding a fifth Supabase instance for the Central DB kept the operational tooling consistent. The Central DB runs on a Small instance (2GB RAM, 2 vCPU) with `max_worker_processes` tuned to 20.
- Consumption Layer: Materialized views in a dedicated `rag_api` schema. Chose over live views or direct table queries because the RAG agent needs pre-joined data with consistent read performance. Materialized views give indexed reads against a snapshot; live views would re-execute the 4-way join on every query. The two views (`unified_property_master`, `rag_property_images`) collapse 79 raw tables into two queryable surfaces.
- Refresh Strategy: `pg_cron` running `REFRESH MATERIALIZED VIEW CONCURRENTLY` every 10 minutes. Chose over webhook-triggered refreshes because the source services don't emit events on data change. Concurrent refresh keeps the views readable during the rebuild. The `rag_api.refresh_data()` function wraps both view refreshes in a single call for manual triggers.
- Schema Isolation: Each source service replicates into its own schema on the Central DB (`super_id`, `floorplan`, `rightmove`). Chose over dumping everything into `public` because the Image Condition service already uses `public` (Django default), and collisions would be inevitable. That one service required selective table replication to avoid pulling in Django framework tables (`auth_*`, `django_*`).
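The publication/subscription wiring above reduces to a few statements per service. A minimal sketch follows; the publication/subscription names, table names, and connection string are illustrative, not the project's actual values.

```sql
-- On a source database: publish only the application tables.
-- Shown for the Image Condition service, which shares public with
-- Django framework tables that must NOT be replicated.
CREATE PUBLICATION image_condition_pub
    FOR TABLE public.property_image, public.condition_result;  -- names assumed

-- On the Central DB: one subscription per source service.
-- Each active subscription permanently holds one background worker slot.
CREATE SUBSCRIPTION image_condition_sub
    CONNECTION 'host=<source-db-host> port=5432 dbname=postgres user=replicator'
    PUBLICATION image_condition_pub
    WITH (copy_data = true);  -- spawns temporary initial-sync workers
```

For services that replicate into a dedicated schema (e.g. `rightmove`), the matching tables must already exist on the Central DB before the subscription is created, since logical replication copies rows, not DDL.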
Constraints That Shaped the Design
- Input: Four PostgreSQL databases on separate Supabase instances, each with its own schema conventions. The Rightmove scraper alone has 58 tables. The Image Condition service shares the `public` schema with Django internals. Super ID provides the `super_id` UUID that links all four services.
- Output: Two materialized views. `unified_property_master` produces one row per property with address, price, condition score, and floorplan status. `rag_property_images` produces multiple rows per property with per-image condition analysis. Both are indexed, including a GIN index for full-text search on property descriptions.
- Scale Handled: ~5,000 properties in the unified view, linked from ~13,000 raw scraped records. The gap exists because properties without a `super_id` assignment are excluded by design. At 50,000 properties, the materialized view refresh would need partitioning or an incremental refresh strategy.
- Hard Constraints: `max_worker_processes = 20` on the Central DB. Each active subscription permanently consumes one background worker slot. Initial table copies for large schemas (58 tables) spawn temporary sync workers that compete for the same pool. The Micro instance tier (4-worker limit) was physically incapable of running 4 subscriptions. IPv6 connectivity to the Supabase source databases was unreachable from local development, so all admin operations required the Session Mode Pooler (IPv4).
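The consumption layer described above can be sketched as follows. Only the view names, the `rag_api` schema, the GIN full-text index, and the 10-minute cadence come from the brief; the source table names, join keys, and column list are assumptions for illustration.

```sql
-- One row per property, pre-joined across the replicated schemas.
-- Source table and column names are hypothetical.
CREATE MATERIALIZED VIEW rag_api.unified_property_master AS
SELECT p.super_id,
       r.address,
       r.price,
       r.description,
       ic.condition_score,
       fp.floorplan_status
FROM   super_id.properties            p
LEFT JOIN rightmove.listings          r  ON r.super_id  = p.super_id
LEFT JOIN public.condition_results    ic ON ic.super_id = p.super_id
LEFT JOIN floorplan.analyses          fp ON fp.super_id = p.super_id;

-- A unique index is mandatory for REFRESH ... CONCURRENTLY.
CREATE UNIQUE INDEX ON rag_api.unified_property_master (super_id);

-- GIN index for full-text search on descriptions.
CREATE INDEX ON rag_api.unified_property_master
    USING gin (to_tsvector('english', description));

-- pg_cron: refresh every 10 minutes without blocking readers.
SELECT cron.schedule('refresh_unified_property_master', '*/10 * * * *',
    'REFRESH MATERIALIZED VIEW CONCURRENTLY rag_api.unified_property_master');
```

Note that `REFRESH MATERIALIZED VIEW CONCURRENTLY` cannot run inside a transaction block, which is why the cron job issues it as a top-level statement rather than through a plpgsql wrapper.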
Decision Log
| Decision | Alternative Rejected | Why |
|---|---|---|
| PostgreSQL Logical Replication | Debezium CDC, custom ETL scripts | Source and target are both PostgreSQL. Logical replication is engine-native, requires zero middleware, and provides sub-second change streaming. Debezium would have introduced Kafka, Schema Registry, and connector management for a problem the database already solves. |
| Materialized Views over live views | Direct table queries, real-time views | The RAG agent doesn't need sub-second freshness. It needs fast, indexed, pre-joined reads. A live view re-executing a 4-way LEFT JOIN with LATERAL subqueries on every query would add 200-400ms per call. Materialized views serve indexed reads in single-digit milliseconds. |
| Selective table replication for Image Condition | Full schema replication | The Image Condition service uses Django's public schema, which includes 30+ framework tables (auth_user, django_session, etc.) irrelevant to the data layer. Only 15 application tables were published. Replicating everything would have polluted the Central DB with framework state and increased worker load. |
| "Traffic Control" protocol for large initial syncs | Increasing instance size, parallel sync | The Rightmove subscription (58 tables) exhausted all 6 worker slots during initial copy. Instead of upgrading the instance, I disabled the 3 existing subscriptions temporarily, let the large sync complete, and re-enabled them. Zero data loss, zero cost increase. Documented as a repeatable protocol. |
| Central DB First ordering for DDL changes | Source DB First, or automated migration sync | Logical replication doesn't replicate DDL. If a developer adds a column on the source before the Central DB, replication crashes immediately. The "Central DB First" protocol ensures the subscriber schema is always a superset of the publisher schema. The one exception: CREATE TABLE must happen on source first (to be added to the publication). |
| Small instance over Micro | Micro with manual worker management | Micro limits max_logical_replication_workers to 4. With 4 services, there is zero headroom for retries or initial syncs. The Small instance allows max_worker_processes = 20, leaving 16 slots for temporary sync workers and maintenance tasks. |
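The "Traffic Control" and "Central DB First" protocols from the table each reduce to a short command sequence. A hedged sketch, with subscription, table, and column names assumed:

```sql
-- Traffic Control: free worker slots before a large initial copy.
ALTER SUBSCRIPTION super_id_sub        DISABLE;
ALTER SUBSCRIPTION floorplan_sub       DISABLE;
ALTER SUBSCRIPTION image_condition_sub DISABLE;

-- Create the large subscription and let its 58-table copy finish,
-- watching progress in pg_stat_subscription / pg_subscription_rel.

-- Re-enable; each resumes from its replication slot's LSN, so no data is lost.
ALTER SUBSCRIPTION super_id_sub        ENABLE;
ALTER SUBSCRIPTION floorplan_sub       ENABLE;
ALTER SUBSCRIPTION image_condition_sub ENABLE;

-- Central DB First: DDL is not replicated, so widen the subscriber before
-- the publisher. Column name is hypothetical.
-- Step 1, on the Central DB:
ALTER TABLE rightmove.listings ADD COLUMN epc_rating text;
-- Step 2, on the source DB, only after step 1 succeeds:
ALTER TABLE public.listings ADD COLUMN epc_rating text;
```

Applying the source-side `ALTER TABLE` first would make incoming change records unmappable and halt the subscription, which is the failure mode the ordering rule exists to prevent.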