
Inselvolt Pricing Engine

6 min read · Kingsley Onoh

How a Solar Retailer Stopped Losing Sales to Stale Prices

The Situation

A German solar equipment retailer was running a Shopify store with over 4,000 products sourced from 13 different wholesale suppliers. Every morning, someone on the operations team would open supplier price lists, compare them against the current store catalog, update prices by hand, check competitor listings on Idealo, and adjust margins accordingly. The whole process took about 90 minutes on a good day. On a bad day, when suppliers changed formats or sent incomplete data, it took longer.

The problem wasn't just the time. By the time the morning update was done, some of the prices were already wrong. Suppliers update throughout the day. Competitors on Idealo reprice in real time. A store that refreshes once a day is always selling on yesterday's information, and in a market where customers comparison-shop across three tabs, being €5 too high on an inverter means losing the sale entirely.

The Cost of Doing Nothing

The 90-minute daily process amounts to roughly 375 hours per year. At the loaded cost of operations staff in Germany, that's approximately €11,000 in direct labor. But the real cost was invisible: mispriced products sitting on the storefront at stale margins, competitor undercuts going unnoticed for hours, and a pricing operation that couldn't scale beyond one person's capacity.

When the retailer began listing on Idealo (Germany's largest comparison shopping platform), the pricing gap became acute. Idealo shoppers see every seller's price on one screen. A listing that's €3 above the cheapest seller gets no clicks. The retailer was losing conversions not because the products were wrong, but because the prices were slow.

What I Built

A pricing pipeline that pulls supplier data from 13 sources, normalizes it, applies tiered markup rules, runs competition analysis against live Idealo data, syncs the final prices to Shopify via bulk GraphQL operations, and exports formatted feeds for Idealo, Solute, and Adcell. It runs on a scheduled container in AWS ECS Fargate.
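The stages can be sketched in miniature. Everything below is illustrative, not the production code: the function names, markup tier, margin floor, and sample numbers are assumptions made up for the example.

```python
# Minimal sketch of the pipeline stages: normalize -> markup -> competition.
# All rules and numbers are illustrative, not the client's real ones.

def normalize(row):
    """Stage 1: coerce a raw supplier row into numeric types."""
    return {
        "sku": row["sku"],
        "cost": float(str(row["cost"]).replace(",", ".")),  # German decimal comma
        "stock": int(float(row["stock"])),                  # stock arrives as text
    }

def apply_markup(row, markup=1.25):
    """Stage 2: tiered markup (a single tier shown here)."""
    row["price"] = round(row["cost"] * markup, 2)
    return row

def apply_competition(row, idealo_best, min_margin=1.10):
    """Stage 3: undercut the best Idealo price, but never below a margin floor."""
    floor = round(row["cost"] * min_margin, 2)
    row["price"] = max(floor, min(row["price"], round(idealo_best - 0.01, 2)))
    return row

raw = {"sku": "INV-5000", "cost": "812,40", "stock": "17"}
row = apply_competition(apply_markup(normalize(raw)), idealo_best=980.00)
# price: markup would give 1015.50, the Idealo undercut pulls it to 979.99,
# which is still above the 893.64 margin floor.
```

The real system runs these stages over DataFrames of the full catalog and then hands the result to the Shopify sync and feed exporters.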

The first version matched products to suppliers using the store's SKU. That covered about 85% of the catalog. The remaining 15% (roughly 600 products) had different SKUs in the supplier feeds. I had to run the entire pipeline twice: once matching on the product SKU, once matching on a secondary supplier SKU stored in Shopify metafields. The two-pass approach got coverage to 99.5%.
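The two-pass idea is simple enough to show with toy data. The structure below is a sketch of the matching logic only; the field names and prices are invented for the example.

```python
# Two deterministic matching passes: primary store SKU first, then the
# secondary supplier SKU stored in Shopify metafields. Illustrative data.

store = [
    {"sku": "A-100", "supplier_sku": None},
    {"sku": "B-200", "supplier_sku": "WS-77"},  # SKU differs from the feed
]
feed = {"A-100": 412.00, "WS-77": 89.50}        # supplier SKU -> net price

def two_pass_match(store, feed):
    matched, unmatched = {}, []
    for p in store:                              # pass 1: primary SKU
        if p["sku"] in feed:
            matched[p["sku"]] = feed[p["sku"]]
        else:
            unmatched.append(p)
    for p in unmatched:                          # pass 2: secondary supplier SKU
        alt = p.get("supplier_sku")
        if alt and alt in feed:
            matched[p["sku"]] = feed[alt]
    return matched

prices = two_pass_match(store, feed)
# {'A-100': 412.0, 'B-200': 89.5}
```

Because both passes join on exact keys, a product either matches or it doesn't; there is no similarity threshold that could pair the wrong price with the wrong part.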

The hardest part wasn't the pricing logic. It was the data cleaning. Each supplier sends price data in a different format. One uses German decimal notation (commas instead of periods). Another sends stock quantities as text strings instead of numbers. A third changes column names between exports. I wrote 11 dedicated cleaner functions, one per supplier format, before any pricing rules could even be applied.
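One such cleaner might look like the sketch below. The column names, alias table, and sample row are hypothetical; the point is the shape of the work: rename columns, repair locale-specific decimals, and coerce text to numbers before any pricing logic runs.

```python
# One illustrative per-supplier cleaner. Column names and formats are
# made up for the example, not taken from a real supplier feed.

COLUMN_ALIASES = {
    "Artikelnummer": "sku", "Art-Nr": "sku",   # same supplier, renamed columns
    "EK Preis": "cost", "Bestand": "stock",
}

def clean_supplier_row(raw):
    row = {COLUMN_ALIASES.get(k, k): v for k, v in raw.items()}
    # "1.234,56" -> drop thousands dot, turn decimal comma into a period
    row["cost"] = float(str(row["cost"]).replace(".", "").replace(",", "."))
    # stock arrives as padded text; default to 0 when the cell is empty
    row["stock"] = int(float(str(row["stock"]).strip() or 0))
    return row

cleaned = clean_supplier_row({"Art-Nr": "X1", "EK Preis": "1.234,56", "Bestand": " 12 "})
# {'sku': 'X1', 'cost': 1234.56, 'stock': 12}
```

Multiplied by one function per supplier format, this is unglamorous code, but each cleaner is small, testable, and immune to another supplier's quirks.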

System Flow

(diagram not reproduced in this text version)

Data Model

(diagram not reproduced in this text version)

Architecture Layers

(diagram not reproduced in this text version)

The Decision Log

Decision: Pandas DataFrames end-to-end
Rejected alternative: SQLAlchemy + Postgres
Why: This is a batch script, not an API, with no concurrent reads. Moving data between a managed database and ECS across the network was slower than keeping everything in memory.

Decision: Google Drive as data bus
Rejected alternative: S3 or a managed file store
Why: The client's operations team already managed supplier feeds and pricing rules in Google Sheets. Replacing their workflow would have added months.

Decision: Two-pass SKU matching
Rejected alternative: Single-pass fuzzy matching
Why: Fuzzy-match false positives on solar part numbers are dangerous (wrong price on the wrong product). Two deterministic passes on different key fields reached 99.5% coverage with zero false matches.

Decision: Shopify bulk GraphQL mutations
Rejected alternative: REST Admin API loops
Why: Updating 4,162 variants one by one via REST took over 30 minutes and hit rate limits. Bulk operations via JSONL upload completed in under 7 minutes.

Decision: Per-supplier cleaner functions
Rejected alternative: Generic data normalizer
Why: 13 suppliers, 13 different CSV/XLSX formats. A generic normalizer couldn't handle one supplier's comma decimals, another's text stock values, and a third's column renames in the same pass. One cleaner per supplier is ugly but correct.

Decision: Category surcharge at supplier price level
Rejected alternative: Surcharge at final price level
Why: Applying the category-specific surcharge before markup calculations keeps margins consistent. Applying it after would have cascaded the uplift through the competition pricing logic and created invisible margin drift.
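The surcharge-ordering decision is easiest to see with numbers. The figures below (5% surcharge, 1.25 markup, a €120.00 best competitor price) are invented for illustration; the competition clamp stands in for the real Idealo undercut logic.

```python
# Why the category surcharge goes on the supplier price, not the final price.
# All numbers are illustrative.

cost, surcharge, markup, competitor_best = 100.00, 0.05, 1.25, 120.00

def clamp(price):
    """Stand-in for the competition logic: undercut the best competitor."""
    return min(price, round(competitor_best - 0.01, 2))

# Chosen: surcharge the supplier price, then markup, then competition clamp.
chosen = clamp(cost * (1 + surcharge) * markup)           # lands at 119.99

# Rejected: clamp first, then surcharge the final price. The uplift is
# applied after the competition check, so the listing silently drifts
# above the competitive target.
rejected = round(clamp(cost * markup) * (1 + surcharge), 2)  # lands at 125.99
```

In the chosen ordering, the competition logic always sees the fully surcharged cost basis, so a clamped price is still a deliberate one; in the rejected ordering, every clamped price quietly exceeds the target by the surcharge.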

Results

Before the system went live, the operations team spent 90 minutes every morning on manual price updates. The store prices were stale by midday, and Idealo listings were always behind the competition.

After deployment, the full cycle (data pull, normalization, pricing, competition analysis, Shopify sync, and marketplace exports) runs in under 7 minutes. The operations team reviews an automated status report instead of building one. Product matching coverage went from roughly 85% to 99.5% after the two-pass SKU matching was implemented. The 20x speed improvement came primarily from replacing individual Shopify REST calls with bulk GraphQL mutations. At 10x scale (40,000 products), the current architecture would need partitioned bulk operations and a move from in-memory DataFrames to chunked processing, but the pricing logic and supplier pipeline wouldn't change.
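The bulk path works by uploading one JSON object per line (JSONL), each line becoming the input of one mutation run server-side. A sketch of building that payload is below; the `input.id`/`input.price` field shape follows my reading of Shopify's variant-update mutations and should be checked against the Admin API docs for your API version, and the sample GID is invented.

```python
import json

def build_jsonl(price_rows):
    """Serialize price updates as JSONL for a Shopify bulk operation:
    one JSON object per line, one mutation input per line. The field
    names are an assumption to verify against the Admin API reference."""
    lines = [
        json.dumps({"input": {"id": r["variant_gid"],
                              "price": f'{r["price"]:.2f}'}})
        for r in price_rows
    ]
    return "\n".join(lines)

payload = build_jsonl([
    {"variant_gid": "gid://shopify/ProductVariant/123", "price": 979.99},
    {"variant_gid": "gid://shopify/ProductVariant/456", "price": 89.50},
])
```

In Shopify's bulk workflow, a file like this is uploaded to a staged upload target and then executed with a single bulk-mutation call, which is what replaces thousands of individual rate-limited REST requests.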

#python #shopify #idealo #pricing-automation #e-commerce
