Journal

Why I Compute Every Appointment Slot from Scratch on Every Request

From the Clinical Scheduling Engine system

7 min read · Kingsley Onoh

The obvious approach to appointment scheduling is a lookup table. Run a job at midnight. Generate every possible slot for the next two weeks. Store them in a slots table. When a patient asks what's available, query the table. When someone books, mark the row as taken.

I built the opposite. The Clinical Scheduling Engine has no slot table. Every time a patient asks "what's available Thursday?", the system loads all constraints from the database, computes valid slots in real time, scores each one, and returns a sorted list. Nothing is pre-generated. Nothing is cached except availability windows (those get a 60-second TTL because they change weekly, not hourly).

The decision came down to one question: what happens when the underlying data changes?

The Sync Problem

A slot table looks simple until you try to keep it accurate. A booking at 9:15 AM doesn't just remove one slot. It shifts buffer times, changes gap scores for adjacent slots, and potentially opens or closes overbooking options depending on how many bookings the provider already has that day. A cancellation reverses all of that. An availability change (Dr. Chen is out Thursday afternoon) invalidates dozens of slots at once.

Each of these events needs to trigger a slot regeneration. Miss one, and the calendar shows a slot that doesn't exist or hides one that does. The first failure mode confuses patients. The second wastes clinic capacity. Neither produces an error message.

I counted the events that would require synchronization: booking created, booking cancelled, booking marked as no-show, provider availability added or modified, overbooking rule created or modified, appointment type duration changed. Six triggers at minimum, each with a different scope (single slot, single provider, or global). The regeneration logic itself would be roughly the same complexity as real-time computation, but now it runs reactively instead of on demand, and every missed trigger is a silent data integrity bug.

Real-time computation sidesteps all of this. There's nothing to sync because there's nothing stored.

Computing Slots from Scratch

The core of the system is find_available_slots in src/optimizer/engine.py. It takes a target date and appointment type, then runs a multi-stage pipeline for each active provider.

First, it loads availability windows. Provider schedules are stored as recurring weekly patterns (Monday 08:00 to 17:00, Tuesday 08:00 to 17:00) with valid_from and valid_until date bounds. The get_availability_windows function in src/optimizer/constraints.py filters these by day of week and date range. Results are cached in memory for 60 seconds because these patterns change perhaps once a week.
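
A minimal sketch of that first stage, assuming illustrative types (AvailabilityWindow and the cache shape here are hypothetical, not the engine's actual models):

```python
import time
from dataclasses import dataclass
from datetime import date, time as dtime

@dataclass(frozen=True)
class AvailabilityWindow:
    provider_id: str
    day_of_week: int  # 0 = Monday
    start: dtime
    end: dtime
    valid_from: date
    valid_until: date

_CACHE_TTL_SECONDS = 60
_cache: dict = {}

def get_availability_windows(all_windows, provider_id, target):
    """Filter recurring weekly windows by weekday and validity range,
    caching the result for 60 seconds."""
    key = f"{provider_id}:{target.isoformat()}"
    cached = _cache.get(key)
    if cached and time.monotonic() - cached[0] < _CACHE_TTL_SECONDS:
        return cached[1]  # serve the cached list while the TTL holds
    result = [
        w for w in all_windows
        if w.provider_id == provider_id
        and w.day_of_week == target.weekday()
        and w.valid_from <= target <= w.valid_until
    ]
    _cache[key] = (time.monotonic(), result)
    return result
```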

Second, it subtracts existing bookings. Every confirmed booking for this provider on the target date gets carved out of the availability windows. The subtract_bookings function uses interval arithmetic: for each booking, it splits the containing window into the parts before and after the booked period. A 9-hour window with three bookings becomes up to four smaller open intervals.
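
The interval arithmetic can be sketched like this, working on (start, end) minute tuples for brevity; the real subtract_bookings in src/optimizer/constraints.py operates on datetimes:

```python
def subtract_bookings(windows, bookings):
    """Carve each booked interval out of the open windows.

    Intervals are (start, end) pairs in minutes from midnight."""
    open_intervals = list(windows)
    for b_start, b_end in sorted(bookings):
        next_open = []
        for w_start, w_end in open_intervals:
            if b_end <= w_start or b_start >= w_end:
                next_open.append((w_start, w_end))  # no overlap, keep whole
                continue
            if w_start < b_start:
                next_open.append((w_start, b_start))  # part before the booking
            if b_end < w_end:
                next_open.append((b_end, w_end))  # part after the booking
        open_intervals = next_open
    return open_intervals
```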

Third, it applies buffer time. Each provider has a configurable buffer (default: 10 minutes) between appointments. The apply_buffer function in constraints.py trims the end of each open interval by the buffer duration. Intervals that shrink to zero get dropped.
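
In the same minute-tuple terms, the buffer step is a one-pass trim (a sketch, not the engine's exact signature):

```python
def apply_buffer(intervals, buffer_minutes=10):
    """Trim the end of each open interval by the buffer duration;
    drop any interval that shrinks to zero or less."""
    trimmed = []
    for start, end in intervals:
        new_end = end - buffer_minutes
        if new_end > start:
            trimmed.append((start, new_end))
    return trimmed
```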

Fourth, it filters compatible rooms. Not every room can host every appointment type. A physical exam needs blood pressure equipment. An ECG test needs ECG equipment. The filter_compatible_rooms function checks room type and equipment arrays against the appointment type's requirements. If no compatible room exists, the optimizer returns an empty list rather than suggesting an incompatible room.
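
The room check amounts to a type match plus a set-containment test on equipment. A sketch, assuming dict-shaped rooms and appointment types (the real models are richer):

```python
def filter_compatible_rooms(rooms, appointment_type):
    """Keep rooms of an allowed type whose equipment covers the requirements."""
    required = set(appointment_type["required_equipment"])
    return [
        room for room in rooms
        if room["room_type"] in appointment_type["allowed_room_types"]
        and required <= set(room["equipment"])  # subset: all equipment present
    ]
```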

Fifth, for each compatible room, it subtracts that room's existing bookings. This prevents room double-booking independently of provider double-booking.

Finally, it generates candidate slots at the configured increment (default: 15 minutes) within each remaining open interval, and scores every one.
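
Candidate generation is a sliding start time stepped by the increment; a sketch in the same minute-tuple terms:

```python
def generate_candidates(intervals, duration_minutes, increment_minutes=15):
    """Emit every slot of the given duration that fits inside an open
    interval, stepping the start time by the configured increment."""
    slots = []
    for start, end in intervals:
        t = start
        while t + duration_minutes <= end:
            slots.append((t, t + duration_minutes))
            t += increment_minutes
    return slots
```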

The scorer in src/optimizer/scorer.py combines four weighted components into a quality score between 0.0 and 1.0:

- Preference match (40% weight): how close is the slot to the patient's preferred time window? Inside the window scores 1.0; distance decays linearly, reaching 0.0 at 480 minutes away.
- Gap minimization (35%): how close is this slot to existing bookings? A slot immediately after the previous appointment scores 1.0 because it reduces dead time.
- Room switching penalty (15%): does this slot require the provider to move rooms? Staying in the same room scores 1.0; switching scores 0.3.
- Overbooking penalty (10%): overbooked slots get 0.0, pushing them to the bottom of the list without removing them.

# Weights from the PRD: preference 40%, gap 35%, room switching 15%, overbooking 10%.
_W_PREFERENCE = 0.40
_W_GAP = 0.35
_W_ROOM = 0.15
_W_OVERBOOK = 0.10

def score_slot(
    slot_start, slot_end, provider_bookings,
    preferred_start, preferred_end,
    provider_rooms_used, room_id, is_overbooked,
) -> float:
    pref = score_preference_match(slot_start, preferred_start, preferred_end)
    gap = score_gap_minimization(slot_start, provider_bookings)
    room = score_room_switching(room_id, provider_rooms_used)
    overbook = 0.0 if is_overbooked else 1.0  # hard penalty, not exclusion

    raw = (
        _W_PREFERENCE * pref
        + _W_GAP * gap
        + _W_ROOM * room
        + _W_OVERBOOK * overbook
    )
    return round(min(1.0, max(0.0, raw)), 4)
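
The preference component's linear decay could be sketched like this, working in minutes-from-midnight for brevity (the real helper takes datetimes; the 480-minute decay horizon comes from the description above):

```python
_DECAY_MINUTES = 480  # score reaches 0.0 this far outside the preferred window

def score_preference_match(slot_start, preferred_start, preferred_end):
    """1.0 inside the preferred window; linear decay to 0.0 at 480 min away."""
    if preferred_start <= slot_start <= preferred_end:
        return 1.0
    distance = (preferred_start - slot_start if slot_start < preferred_start
                else slot_start - preferred_end)
    return max(0.0, 1.0 - distance / _DECAY_MINUTES)
```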

The 40/35/15/10 split came from the PRD, not from tuning. It encodes a judgment: patient convenience matters most, but not at the expense of schedule compactness. I'd experiment with pushing the gap weight higher (to 40 or 45) if utilization data showed persistent scheduling holes.

The overbooking system was more complex than I expected. A single max_overbook integer isn't enough because different appointment types have different no-show rates. The OverbookingRule model supports three levels of specificity: global (no provider, no type), provider-only, and provider+type (the most specific). The _get_max_overbook function in engine.py resolves these with a priority hierarchy: if Dr. Chen has a global allowance of 1 extra slot but a specific rule for ECG appointments allowing 2, the ECG rule wins. When a provider is maxed on normal bookings but overbooking is allowed, the system generates a second pass of overbooked slots with lower quality scores.

The resolution function shows the priority hierarchy:

def _get_max_overbook(rules, provider_id, appointment_type_id) -> int:
    best = 0
    for rule in rules:
        if (rule.provider_id and str(rule.provider_id) == str(provider_id)
                and rule.appointment_type_id
                and str(rule.appointment_type_id) == str(appointment_type_id)):
            return rule.max_overbook  # most specific wins immediately
        if (rule.provider_id and str(rule.provider_id) == str(provider_id)
                and not rule.appointment_type_id):
            best = max(best, rule.max_overbook)  # provider-only rule
        if not rule.provider_id and not rule.appointment_type_id:
            best = max(best, rule.max_overbook)  # global fallback
    return best

Ten Patients, One Slot

Slot computation worked. But what happens when ten patients are looking at the same 9:15 AM slot with Dr. Chen and all ten click "book now" within the same second?

In most scheduling systems, this is where things break. Application-level availability checks see the slot as open for all ten requests because they all query the database before any INSERT completes. Two bookings get created. Someone at the front desk has to call a patient and apologize.

I didn't build a fix for this. I let PostgreSQL handle it.

The bookings table has UniqueConstraint("provider_id", "date", "start_time", name="uq_provider_slot"). The booking service in src/booking/service.py does an INSERT, and if it hits an IntegrityError, it catches the exception and raises an AppError with code SLOT_UNAVAILABLE and HTTP status 409.
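
The pattern can be demonstrated in a self-contained sketch; sqlite3 stands in here for portability, while the real service uses PostgreSQL via SQLAlchemy, and the 201/409 codes mirror the HTTP responses described above:

```python
import sqlite3

# In-memory database with the same unique constraint as the real table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bookings (
        provider_id TEXT, date TEXT, start_time TEXT, patient_id TEXT,
        CONSTRAINT uq_provider_slot UNIQUE (provider_id, date, start_time)
    )
""")

def try_book(patient_id):
    """Insert the booking; report 201 on success, 409 if the slot is taken."""
    try:
        conn.execute(
            "INSERT INTO bookings VALUES (?, ?, ?, ?)",
            ("chen", "2024-06-06", "09:15", patient_id),
        )
        return 201
    except sqlite3.IntegrityError:
        return 409  # the constraint, not the application, rejected the race loser

statuses = [try_book(f"patient-{i}") for i in range(10)]
```

Whatever the interleaving, the constraint guarantees exactly one insert succeeds; every other attempt surfaces as an IntegrityError to map onto a 409.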

The stress test in tests/unit/test_performance.py fires 10 concurrent requests for the same slot using asyncio.gather. The assertion: exactly 1 gets 201 Created, exactly 9 get 409 Conflict. No application-level locking. No Redis. No retry logic. The database is the arbiter.

What I was wrong about: I initially thought I'd need SELECT ... FOR UPDATE before the INSERT to prevent the race. There's no need. The UNIQUE constraint alone is sufficient because the INSERT either succeeds or it doesn't. There's no intermediate state where two rows exist temporarily.

The backfill system added another dimension. When a patient cancels, the find_backfill_candidates function in src/booking/backfill.py queries for other cancelled bookings with the same appointment type within a 7-day window (the _SEARCH_WINDOW_DAYS constant). These are patients who lost their own appointment and might want the freed slot. The proximity score decays linearly: 1.0 for same-day, 0.0 at 7 days. Candidates appear in the cancellation response, so the front desk sees recovery options immediately.
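
The proximity decay is the same linear shape as the preference score, over days instead of minutes. A sketch with an illustrative signature (the real function works on booking records, not bare dates):

```python
from datetime import date

_SEARCH_WINDOW_DAYS = 7

def proximity_score(cancelled_date, freed_date):
    """1.0 for a same-day cancellation, decaying linearly to 0.0 at 7 days."""
    days_apart = abs((freed_date - cancelled_date).days)
    return max(0.0, 1.0 - days_apart / _SEARCH_WINDOW_DAYS)
```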

What I'd Change

The availability cache uses a fixed 60-second TTL. If a clinic admin changes Thursday hours, the cache serves stale windows for up to a minute. Explicit invalidation on availability updates would eliminate that gap, but I chose the simpler approach because availability changes happen rarely and the 60-second window is acceptable.

The scorer weights are hardcoded at 40/35/15/10. A specialist clinic where room equipment is the real bottleneck might want 25/35/30/10 instead. Making the weights configurable per clinic is the next change worth making. Everything else has held up.

#scheduling #constraint-solving #scoring-algorithms #postgresql

