Quick wins to deal with users' broken email addresses

July 05, 2018 – Jean Anquetil 2-minute read

This article was written before Drivy was acquired by Getaround and became Getaround EU. Some references to Drivy may therefore remain in the post.

If a user signs up to Drivy, we want to welcome them. If a driver has an upcoming booked trip, we would like to send them the needed information. If they want to reset their password, they need to receive a confirmation email and so on.

In other words, transactional emails are very important for a successful experience. So, how do we deal with broken email addresses?

Regex-ing the format

First of all, we decided to check the format of a new user's email address during the sign-up flow. To do so, we match it against a very simple regex.

\A\S+@\S+\.\S+\z

Here is our assumption. An email address must:

  • start with one or more non-whitespace characters,
  • followed by an at symbol,
  • followed by one or more non-whitespace characters,
  • followed by a dot,
  • and end with one or more non-whitespace characters.

And that’s it. We don’t want to define a complex pattern such as the RFC 5322 one, because each email provider has its own syntax rules and we don’t want to block potentially valid addresses.
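To make the behaviour of this pattern concrete, here is a minimal sketch of how it could be applied. The constant and method names are hypothetical (they are not taken from our codebase); only the regex itself comes from the article.

```ruby
# Hypothetical names; only the pattern itself comes from the article.
EMAIL_FORMAT = /\A\S+@\S+\.\S+\z/

# Returns true when the address matches the permissive format check.
def plausible_email?(address)
  EMAIL_FORMAT.match?(address.to_s)
end

plausible_email?("jean@example.com")   # => true
plausible_email?("jean@example")       # => false (no dot after the @)
plausible_email?("jean example@x.com") # => false (contains whitespace)
plausible_email?("a@b@c.d")            # => true (the check is deliberately permissive)
```

The last case shows the trade-off: the pattern rejects obvious typos but accepts some strings that are not deliverable addresses.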

Transliterating

Later down the line, we faced some delivery issues with email addresses containing special characters (e.g. àéèù…), so we decided to transliterate the email addresses of some specific domains.

For instance, we know that Gmail supports addresses with accents, but it doesn’t differentiate between an address with or without them: both are treated as the same address. We therefore decided to transliterate the email addresses from the following providers: Gmail, Outlook, Hotmail and Live.

Using a custom coercion with Virtus in our form object, this is done really smoothly. (However, Virtus is now deprecated, so if we were to start from scratch today, we would use something else.)
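The domain-specific transliteration described above could look roughly like the sketch below. It is not our actual Virtus coercion: the method names are hypothetical, the exact domain list is an assumption (the article only names the providers), and Unicode decomposition from Ruby's standard library is used as a stand-in for I18n.transliterate so that the example runs without any gem.

```ruby
# Assumption: the article names the providers, not the exact domains.
TRANSLITERATED_DOMAINS = %w[gmail.com outlook.com hotmail.com live.com].freeze

# Stand-in for I18n.transliterate: decompose accented letters (NFD) and drop
# the combining marks. Only covers characters that decompose this way.
def strip_accents(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

# Hypothetical helper: transliterates the local part for the listed domains only.
def normalize_email(address)
  local, _, domain = address.strip.rpartition("@")
  return address if local.empty? # no "@" or empty local part: leave untouched

  domain = domain.downcase
  return address unless TRANSLITERATED_DOMAINS.include?(domain)

  "#{strip_accents(local)}@#{domain}"
end

normalize_email("héloïse@gmail.com")   # => "heloise@gmail.com"
normalize_email("héloïse@example.org") # => "héloïse@example.org" (domain not listed)
```

Addresses on other domains are returned unchanged, since we cannot assume that their provider treats accented and unaccented addresses as equivalent.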

However, transliteration is not without its limits. The day we open in a country that does not use the Latin alphabet, we will no longer be able to transliterate the email addresses:

I18n.transliterate("日本語")
# => "???"

Using an external service

To go further, we could carry out many more checks using an external service. For instance, Mailgun released a library called Flanker.

It carries out the following checks:

  • DNS lookup: checks that the domain (the @domain.com part) exists.
  • MX check: checks that the domain has a Mail Exchange record. In other words, that the domain is configured to receive emails.
  • Validation rules: checks that the email address complies with general rules but also provider-specific ones. For instance, for @gmail.com they check that the address length is between 6 and 30 characters.
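To give an idea of what such checks involve, here is a minimal Ruby sketch of the MX check and of the Gmail length rule, using only the standard library's Resolv. This is not how Flanker is implemented: the method names are hypothetical, and we assume the length rule applies to the part before the @.

```ruby
require "resolv"

# Hypothetical helper: true when the domain publishes at least one MX record.
# `resolver` is anything responding to #getresources, so it can be stubbed.
def accepts_mail?(domain, resolver: Resolv::DNS.new)
  records = resolver.getresources(domain, Resolv::DNS::Resource::IN::MX)
  !records.empty?
rescue Resolv::ResolvError, Resolv::ResolvTimeout
  false
end

# Hypothetical helper: applies the 6..30 length rule to Gmail addresses only.
# Assumption: the length is measured on the local part (before the @).
def gmail_length_ok?(address)
  local, _, domain = address.rpartition("@")
  return true unless domain.casecmp?("gmail.com")

  local.length.between?(6, 30)
end

# accepts_mail?("example.com") performs a real DNS query when no resolver is passed.
gmail_length_ok?("abc@gmail.com")    # => false (local part shorter than 6)
gmail_length_ok?("abcdef@gmail.com") # => true
```

Passing the resolver as an argument keeps the DNS dependency out of unit tests, where a stub object can return canned records.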

Conclusion

There will always be many ways to prevent or sanitize broken email addresses, but it will remain difficult to handle every case. Maybe another way to fight this would be to rely less on emails: using the phone number to verify a profile and using push or browser notifications to talk to a user.
