Ok, I'm feeling somewhat silly.
Google Analytics, of all things, seems to be the primary cause of the 503 errors. I've been sending anonymous "success / offline / error..." hits to Google Analytics to keep track of server usage, peak times and results, and it used to be the fastest part of the whole process (much faster than the WU or AcuRite interfaces, for example). As part of today's optimization, I disabled Google Analytics and the 503 failure rate dropped to 0...
Google, you've let me down.
In any case, I'll continue debugging the new backend, since it will help avoid cron-job load peaks in the future, but meanwhile the 503 issues appear to be solved.
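
For anyone who wants to keep usage tracking without risking the main request: the post doesn't establish why the analytics hits led to 503s, but one common precaution is to send them off the request path, with a short timeout, and to ignore any failure. Below is a minimal Python sketch of that idea, not the code this service actually runs. `ANALYTICS_URL`, `send_hit`, `HitQueue` and the `result` parameter are all hypothetical names made up for illustration.

```python
import queue
import threading
import urllib.parse
import urllib.request

# Placeholder endpoint (assumption); substitute whatever tracking URL you use.
ANALYTICS_URL = "https://analytics.example.invalid/collect"


def send_hit(params, timeout=2.0):
    """Send one hit with a short timeout; never raise on failure."""
    data = urllib.parse.urlencode(params).encode("ascii")
    try:
        with urllib.request.urlopen(ANALYTICS_URL, data=data, timeout=timeout):
            pass
    except Exception:
        pass  # tracking is best-effort and must not affect the real work


class HitQueue:
    """Hands hits to a background thread so callers never wait on analytics."""

    def __init__(self, sender=send_hit, max_pending=1000):
        self._sender = sender
        self._queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def track(self, result):
        """Enqueue a hit without blocking; return False if it was dropped."""
        try:
            self._queue.put_nowait({"result": result})
            return True
        except queue.Full:
            return False  # drop the hit rather than slow down the caller

    def _run(self):
        while True:
            params = self._queue.get()
            try:
                self._sender(params)
            except Exception:
                pass  # a failing sender must not kill the worker thread
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until all queued hits were attempted (tests or shutdown)."""
        self._queue.join()
```

In a request handler this would be used as `hits = HitQueue()` once at startup and `hits.track("success")` per request. The bounded queue is a deliberate choice: if the analytics endpoint is slow, hits get dropped instead of piling up in memory.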