Of Cicadas and cron jobs

August 21, 2018
Larry Garfield
Director of Developer Experience

The cicada is a flying insect found worldwide. It's loud but not particularly threatening. Its most famous attribute, though, is that many species of cicada (particularly in North America) are periodic, only emerging every 13 or 17 years depending on the species. When a brood does emerge, it is huge: it reaches maturity all at once, mates, lays eggs, and then dies. The eggs hatch and the offspring spend the next 13 or 17 years living deep underground and burrowing before repeating the cycle again.

But why 13 and 17 years? That's a rather odd set of numbers... And that's actually the point. Those lifespans are both prime numbers, that is, they are divisible only by themselves and one. Many cicada predators also have multi-year life cycles rather than emerging every year. So what are the odds of a large number of cicada predators emerging in the same year as a large number of cicadas?

Very low, in fact. Because a prime number is divisible only by 1 and itself, a shorter cycle shares no factors with it, so the two line up only once every product of their lengths. That is, a predator on a 4-year cycle and a 13-year cicada will emerge at the same time only every 4 * 13 = 52 years. If the cicada emerged every 12 years, however, every emergence would land on a predator year: every third generation of the 4-year predator would have a veritable buffet, and the cicadas would have a bad time every time.
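The arithmetic here is just a least common multiple. As a minimal sketch (the function name `years_between_overlaps` is made up for illustration), the two cases above can be checked like this:

```python
from math import gcd


def years_between_overlaps(cycle_a: int, cycle_b: int) -> int:
    """Return the least common multiple of two cycle lengths.

    That is the number of years between emergences that coincide,
    assuming both species start counting from the same year.
    """
    return cycle_a * cycle_b // gcd(cycle_a, cycle_b)


# 4-year predator vs. 13-year cicada: no shared factors.
print(years_between_overlaps(4, 13))  # 52

# 4-year predator vs. a hypothetical 12-year cicada: 4 divides 12.
print(years_between_overlaps(4, 12))  # 12
```

With a 12-year cycle the overlap interval equals the cicada's own cycle, which is why every single brood would meet the predator.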

Over time, evolutionary pressure weeded out the many-common-divisor periodic species of cicada, leaving only those that have overlapping generations every year and those that have a huge all-at-once generation on a prime-number schedule.

What can we learn from the little cicada? If you have two repeating events, and you want them to happen at the same time as rarely as possible, have them repeat on prime numbers.

But what does that have to do with web development?

A website frequently has background tasks that it needs to run from time to time; sometimes every few minutes, sometimes every few hours, sometimes every few days. Most often these are run using a cron task.

Generally speaking it's a bad idea to run more than one cron job at once. Even if they don't interfere with each other they may use a lot of CPU, and you don't want them to slam the system all at once. In fact, on Platform.sh we don't allow that to happen: If a cron task tries to start but there's another already running, we force the new one to pause and wait for the first to complete.

That can sometimes cause issues if, say, a nightly backup process wants to start while a routine every-few-minutes cron task is running. The snapshot will start but block while it waits for the other cron task to finish. If that task is long-running, the result could be a brief site outage while the snapshot waits its turn.

Avoiding predatory cron jobs

So how do we make sure one cron job runs at the same time as another as little as possible? The same way cicadas avoid predators: Prime numbers!

More specifically, say we have a cron task that runs normal system maintenance every 20 minutes. Then we have an import process that periodically reads data from an external system every 10 minutes, and another that runs every 5 minutes to send out pending emails.

The result will be that every 10 minutes we have two cron tasks competing to run at the same time, and every 20 minutes we have three cron tasks competing. That's no good at all!

Instead, let's set the system maintenance to run every 23 minutes, the import to run every 11 minutes, and the email runner every 7 minutes. It's almost the same schedule, but because the numbers are prime, any two tasks will only rarely start together. (The closest pair, the 7- and 11-minute tasks, coincide once every 7 * 11 = 77 minutes.) That spreads the load out far better and makes it much less likely that one process blocks on another.
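To get a rough feel for the difference, here is a small simulation of both schedules over one day. It is only a sketch under a simplifying assumption: every task fires at exact multiples of its interval, counted from a shared starting point. (Step values in a real cron minute field restart at the top of each hour, so actual numbers will differ.) The function name `contested_minutes` is hypothetical.

```python
from collections import Counter


def contested_minutes(intervals, horizon=24 * 60):
    """Count minutes in 1..horizon where two or more tasks start together.

    Assumes each task starts at exact multiples of its interval,
    all measured from the same minute 0.
    """
    starts = Counter()
    for interval in intervals:
        for minute in range(interval, horizon + 1, interval):
            starts[minute] += 1
    return sum(1 for count in starts.values() if count >= 2)


# Original schedule: maintenance, import, email.
print(contested_minutes([20, 10, 5]))  # 144

# Prime schedule: maintenance, import, email.
print(contested_minutes([23, 11, 7]))  # 31
```

Under that assumption the prime schedule does not remove collisions entirely, but it cuts them from once every 10 minutes to a few dozen per day, and never with all three tasks at once within the day.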

Now if we want to add a nightly backup, we can have it run at, say, 17 minutes past 4:00 am. It will be extremely rare for the other cron tasks to hit at the 17 minute mark exactly, so our snapshot will almost never need to block on another cron task and our site won't freeze while it waits.
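One way to sanity-check a backup minute is to list which periodic tasks could start at that same minute. The sketch below assumes the periodic tasks are written as `*/N` step values in the cron minute field, which match minute 0, N, 2N, and so on within each hour. The task names and the helper `clashing_tasks` are illustrative, not part of any Platform.sh API.

```python
def step_minutes(step: int) -> set:
    """Minutes of the hour matched by a cron minute field of '*/step'."""
    return set(range(0, 60, step))


def clashing_tasks(backup_minute: int, steps: dict) -> list:
    """Names of periodic tasks that can start at the backup's minute."""
    return [
        name
        for name, step in steps.items()
        if backup_minute in step_minutes(step)
    ]


# Hypothetical task names mapped to their step values in minutes.
steps = {"maintenance": 23, "import": 11, "email": 7}

print(clashing_tasks(17, steps))  # []
print(clashing_tasks(0, steps))   # ['maintenance', 'import', 'email']
```

Minute 17 is clear of all three tasks, while a backup scheduled on the hour would line up with every one of them.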

Isn't it nice when bugs end up helping your software run faster?
