Open-sourcing checker jobs

September 24, 2018 | Nicolas Zermati | 4-minute read

We’ve recently extracted the checker_jobs gem from our codebase. It’s a simple alerting tool with a very specific purpose which this article will explain.

Over time, we update the rules that our data has to comply with. Making sure our data is always what we expect it to be is hard, especially when old constraints change, new constraints come along, new fields are added, backfill isn’t always possible…

Even with a careful team behind it, the system can produce corrupted data for weeks, months, or years before anyone notices. By then, it may be too late or simply impossible to fix. Crashes, by comparison, are noticed and corrected quickly, whereas a data issue can spread and affect many parts of the system, making it far more expensive to fix.

The checker_jobs are here to make sure that when this sneaky data corruption happens, you notice it right away.

What was the problem?

Imagine we’ve got a users table with a terms_of_services_accepted_at column. This column could be set for new users but not for old ones. We need the user to accept the ToS before they can book a trip on our platform. Unfortunately, old trips aren’t subject to that rule since the column didn’t exist back then.

We’ll do the best we can to make sure that we update all our users’ paths to take that new requirement into account. Even with our nice test suite, we don’t cover all the code paths, especially given all the production data. That data isn’t fresh from a testing factory, but testing on legacy, old, and sparse data is a different topic!

So to get some peace of mind, we would like to be sure that there are no recent trips booked where the driver didn’t accept the ToS. What we could do is write a piece of code that verifies that no trip has a user whose users.terms_of_services_accepted_at is unset.
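To make the idea concrete, here is a minimal sketch of such a verification in plain Ruby, with no database involved. The `Booker` and `Booking` structs, the `booked_at` field, and the `recent_bookings_without_tos` helper are hypothetical stand-ins for illustration; only the `terms_of_services_accepted_at` column comes from the example above.

```ruby
# Hypothetical in-memory stand-ins for rows of the users and trips tables.
Booker  = Struct.new(:id, :terms_of_services_accepted_at)
Booking = Struct.new(:id, :driver, :booked_at)

# Returns the bookings made at or after `since` whose driver never accepted the ToS.
def recent_bookings_without_tos(bookings, since:)
  bookings.select do |booking|
    booking.booked_at >= since && booking.driver.terms_of_services_accepted_at.nil?
  end
end

old_user = Booker.new(1, nil)                      # signed up before the column existed
new_user = Booker.new(2, Time.utc(2018, 9, 1))     # accepted the ToS

bookings = [
  Booking.new(10, old_user, Time.utc(2016, 1, 5)),  # legacy trip, predates the rule
  Booking.new(11, new_user, Time.utc(2018, 9, 20)), # compliant trip
  Booking.new(12, old_user, Time.utc(2018, 9, 21)), # recent trip, ToS never accepted
]

offenders = recent_bookings_without_tos(bookings, since: Time.utc(2018, 9, 1))
puts offenders.map(&:id).inspect
```

An empty result means the rule holds for recent trips; anything else is worth an alert.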

How this gem is useful

The gem offers you a quick way to get alerted when this piece of code finds such a trip. You get notifications (emails, bug trackers, …) when a trip doesn’t honor the ToS rule.

It would look like this:

class TripChecker
  include CheckerJobs::Base

  notify :email, to: "oss@drivy.com"

  ensure_no :trip_without_users_terms_of_service_being_accepted do
    Trip.joins(:users).merge(User.terms_of_service_not_yet_accepted)
  end

  ensure_no :trip_with_deactivated_car do
    # ...
  end
end

Then you would have to enqueue that TripChecker as often as you want to run that verification. In our case, because we use a Ruby task scheduler and Sidekiq, it looks like this:

every(1.day, 'trip_checker', at: ['00:10', '12:05'], tz: "Paris") do
  Sidekiq::Client.enqueue(TripChecker)
end
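For readers curious about the mechanics, here is a toy sketch of what a checker conceptually does once it runs: evaluate each named block and notify when the result is not empty. `TinyChecker`, `SampleChecker`, and the lambda notifier are hypothetical names made up for this illustration; this is not the gem's actual implementation, which also handles job processors and real notifiers.

```ruby
# Toy illustration only: TinyChecker is a hypothetical module, not part of checker_jobs.
module TinyChecker
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Named checks registered on the class, in declaration order.
    def checks
      @checks ||= {}
    end

    # Register a block expected to return an empty collection (or zero).
    def ensure_no(name, &block)
      checks[name] = block
    end
  end

  # Run every check and hand each failing one to the notifier (any callable).
  def perform(notifier)
    self.class.checks.each do |name, block|
      result = block.call
      # Accept either a count or a collection responding to #size.
      count = result.is_a?(Integer) ? result : result.size
      notifier.call(name, count) if count > 0
    end
  end
end

class SampleChecker
  include TinyChecker

  ensure_no(:negative_price) { [12, -3, 7].select(&:negative?) }
  ensure_no(:missing_email)  { [] }
end

alerts = []
SampleChecker.new.perform(->(name, count) { alerts << [name, count] })
puts alerts.inspect
```

Only the `:negative_price` check reports something here, so the notifier is called once. In a real setup the notifier would send an email or open an issue in a bug tracker instead of appending to an array.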

Here is an example of what we see in Bugsnag when one of our checkers is triggered:

[Screenshot: CheckerJob error in Bugsnag]

What are the other ways of solving this?

There are other solutions to this issue, such as:

  • code that is more defensive and crashes if the preconditions aren’t met,
  • some database features such as triggers, foreign keys, or checks, or
  • a better test suite that can work with production data.
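As an illustration of the first option in the list above, a defensive version of a booking method might look like the sketch below. The `book_trip!` method, the `Driver` struct, and `TermsNotAcceptedError` are hypothetical names chosen for this example, not code from our codebase.

```ruby
# Hypothetical names throughout; a sketch of defensive code, not production code.
class TermsNotAcceptedError < StandardError; end

Driver = Struct.new(:id, :terms_of_services_accepted_at)

# Defensive booking: refuse to proceed when the precondition is not met.
def book_trip!(driver)
  if driver.terms_of_services_accepted_at.nil?
    raise TermsNotAcceptedError, "driver #{driver.id} has not accepted the ToS"
  end

  { driver_id: driver.id, status: :booked }
end

booking = book_trip!(Driver.new(2, Time.utc(2018, 9, 1)))
puts booking.inspect

begin
  book_trip!(Driver.new(1, nil))
rescue TermsNotAcceptedError => e
  puts "booking refused: #{e.message}"
end
```

The guard fails inside the booking request itself, so a forgotten code path turns into an error for the user rather than a silent data issue.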

We try to use those when it makes sense, and we advise you to do the same. Still, the checker_jobs are different from all of those solutions:

  • they are safer than defensive code because they don’t impact your production system,
  • they are cheaper to create, maintain, and, most of all, delete than database constraints, and
  • they are easier to set up than regression testing on production data.

Of course, they don’t provide the same guarantees as the other solutions, so the comparison isn’t entirely fair.

What’s next?

You could give checker_jobs a go: follow the instructions on GitHub and tell us how it went!

In the future, there are many things that we would like to see, things such as:

  • More job processors (ActiveJob is a good candidate),
  • More notifiers, such as PagerDuty, bug trackers, SMS, etc.,
  • checker_jobs-web, an extra gem that allows you to publish the results of the checks on a dedicated web UI, and of course
  • Contributions from the community!

We intend to extract and release more libraries of this kind, and we hope others will find them useful.
