Why the landing page randomly returns 500
Effect Healthcare — root-cause summary, May 2026
The symptom
On Azure App Service, the website occasionally returns 500 errors on pages that have nothing to do with master data — the homepage, sign-in, pricing. Restarting fixes it for a while. Then it comes back.
The root cause, in one paragraph
The Node.js process that runs the whole website executes JavaScript on a single thread. Every request (homepage, sign-in, pricing, master data) shares that thread. When a PWA client opens and starts pulling master data, it calls 16 paginated endpoints in sequence. Each call runs through Payload's ORM, which holds the thread for hundreds of milliseconds to several seconds while it joins relations, runs hooks, and shapes the response. While the thread is busy with that work, every other request, including the landing page, waits in a queue. If a request waits longer than Azure's gateway timeout, the user sees a 500.
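The queueing effect described above can be sketched with a toy model. This is a simplification, not the production code path: it treats each request as one uninterrupted block of work on a single thread, and every name and number in it (`Task`, `simulateSingleThread`, the 2000 ms and 10 ms figures) is hypothetical.

```typescript
// One request that needs the single JavaScript thread.
interface Task {
  name: string;
  arrivalMs: number; // when the request reaches the process
  workMs: number; // how long it holds the thread once it starts
}

interface Outcome {
  name: string;
  latencyMs: number; // time from arrival to completion
}

// Serve tasks first-come, first-served on one thread and report each latency.
function simulateSingleThread(tasks: Task[]): Outcome[] {
  const queue = [...tasks].sort((a, b) => a.arrivalMs - b.arrivalMs);
  let threadFreeAt = 0;
  return queue.map((task) => {
    // A task cannot start before it arrives or before the thread is free.
    const start = Math.max(task.arrivalMs, threadFreeAt);
    threadFreeAt = start + task.workMs;
    return { name: task.name, latencyMs: threadFreeAt - task.arrivalMs };
  });
}

// Hypothetical numbers: a cheap homepage render, alone and behind one ORM call.
const idle = simulateSingleThread([
  { name: "homepage", arrivalMs: 0, workMs: 10 },
]);
const busy = simulateSingleThread([
  { name: "master-data page", arrivalMs: 0, workMs: 2000 },
  { name: "homepage", arrivalMs: 100, workMs: 10 },
]);

console.log(idle[0].latencyMs); // 10
console.log(busy[1].latencyMs); // 1910
```

The homepage needs the same 10 ms of work in both runs. The extra 1900 ms in the second run is time spent waiting for the thread.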
Proof — one PWA is enough to break the homepage
Tested over real HTTP against a production build (pnpm build && pnpm start). One PWA client doing a real sync; meanwhile, a probe hits the homepage repeatedly.
| Condition | Homepage avg | Homepage peak |
| --- | --- | --- |
| Idle (no PWA load) | 11 ms | 16 ms |
| 1 PWA syncing | 2 160 ms | 4 939 ms |
| 2 PWAs syncing in parallel | 4 059 ms | 11 892 ms |
Sign-in and pricing degrade the same way, because they all share the same thread. With two PWAs syncing, the slowest homepage response took almost 12 seconds (11 892 ms). Anything over Azure's gateway timeout becomes a 5xx.
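A probe of this kind can be sketched as follows. It is not the script used for the measurements above: the function names, the sample count, and the URL in the usage comment are assumptions, and it relies on the global `fetch` available in Node.js 18 and later.

```typescript
// Average and peak of a list of latencies in milliseconds.
function summarize(latenciesMs: number[]): { avgMs: number; peakMs: number } {
  if (latenciesMs.length === 0) throw new Error("no samples");
  const total = latenciesMs.reduce((sum, ms) => sum + ms, 0);
  return { avgMs: total / latenciesMs.length, peakMs: Math.max(...latenciesMs) };
}

// Time `samples` sequential GET requests to `url`.
async function probe(url: string, samples: number): Promise<number[]> {
  const latencies: number[] = [];
  for (let i = 0; i < samples; i++) {
    const start = Date.now();
    const res = await fetch(url);
    await res.arrayBuffer(); // read the body so the timing covers the full response
    latencies.push(Date.now() - start);
  }
  return latencies;
}

// Usage (assumed local URL and sample count):
// probe("http://localhost:3000/", 50).then((l) => console.log(summarize(l)));
```

Running the probe once while the site is idle and once while a PWA sync is in progress gives the two rows needed for a comparison like the one in the table.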
What it is NOT
- Not a memory leak. Heap stays flat.
- Not slow queries. The same data via direct database queries returns in 100–200 ms. The bottleneck is the ORM above the database, not the database itself.
- Not Azure-specific. Reproduced on localhost.
The fix
Replace the 16 paginated master-data endpoints with one cached bundle endpoint that returns everything in a single response, using direct database queries (no ORM in the hot path) and an in-memory cache invalidated by Payload hooks when admins edit data.
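The caching half of this design could look roughly like the sketch below. It is an illustration under assumptions, not the code on the branch: `BundleCache`, `MasterDataBundle`, and `loadBundleFromDb` are hypothetical names, and the loader is a stand-in for the direct database queries.

```typescript
// Hypothetical bundle shape: collection name -> rows.
type MasterDataBundle = Record<string, unknown[]>;

// Caches the in-flight or completed load so concurrent callers share one result.
class BundleCache<T> {
  private readonly load: () => Promise<T>;
  private cached: Promise<T> | undefined = undefined;

  constructor(load: () => Promise<T>) {
    this.load = load;
  }

  get(): Promise<T> {
    const hit = this.cached;
    if (hit !== undefined) return hit; // cache hit: no database work
    const pending = this.load();
    this.cached = pending;
    // Do not keep a failed load; the next caller retries.
    pending.catch(() => {
      if (this.cached === pending) this.cached = undefined;
    });
    return pending;
  }

  // Drop the cached bundle so the next get() reloads it.
  invalidate(): void {
    this.cached = undefined;
  }
}

// Stand-in loader (assumption): the real one would run the direct queries.
let loads = 0;
async function loadBundleFromDb(): Promise<MasterDataBundle> {
  loads += 1;
  return { exampleCollection: [{ id: 1 }] };
}

const bundleCache = new BundleCache(loadBundleFromDb);
```

In this sketch the bundle endpoint would return `await bundleCache.get()`, and the `afterChange` and `afterDelete` hooks on each master-data collection would call `bundleCache.invalidate()`. Caching the promise rather than the resolved value means several PWAs logging in at the same moment trigger one load between them.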
Net effect: instead of holding the Node thread for tens of seconds per PWA login, master-data work collapses to one short request — or a cache hit. The landing page stays responsive.
Already implemented on branch feature/master-data-bundle.