Why email address validation requires a local SMTP server

Summary

Explains why you can reliably validate email addresses only if you’re not behind a proxy and you have all of the following:

  • A public IP address on your host.
  • A DNS PTR record for this IP address and an MX record for the host name it resolves to.
  • An SMTP service on this host.

Details

Email address validation (such as with MailBee.NET Address Validator) is typically thought of as the following process:

  1. Validate the email address syntax (abc@company.com is correct, abc@ is not).
  2. If the previous step succeeds, obtain the MX records for company.com via a DNS MX lookup. MX records specify which SMTP servers (if any) accept email for the given domain, so this tells us whether the domain part is fake. If this step succeeds, at least the domain part of the email address is OK.
  3. Connect to the SMTP MX server determined at the previous step to check that it is alive. If the domain has several MX servers, it is enough that at least one of them is alive.
  4. If the connection succeeds, use it to submit the email address in question and check whether the SMTP MX server accepts the given recipient. If it does, we consider the email address valid.
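The four steps above can be sketched as a small pipeline. This is a minimal Python sketch under stated assumptions, not how MailBee.NET implements it: the regular expression is a deliberately loose approximation of address syntax, and lookup_mx and probe_rcpt are hypothetical callables supplied by the caller (Python’s standard library has no MX resolver).

```python
import re

# Deliberately loose syntax check (an assumption of this sketch): exactly one "@",
# a non-empty local part and a dotted domain. Real address grammar is richer.
_ADDRESS_RE = re.compile(r"^[^@\s]+@([^@\s.]+(?:\.[^@\s.]+)+)$")


def validate_address(address, lookup_mx, probe_rcpt):
    """Run the four steps in order and report the first one that fails.

    lookup_mx(domain) -> list of MX host names (hypothetical callable).
    probe_rcpt(mx_host, address) -> True/False if the server answered RCPT TO,
        or None if the server could not be reached (hypothetical callable).
    """
    match = _ADDRESS_RE.match(address)
    if match is None:
        return "bad syntax"                       # step 1

    mx_hosts = lookup_mx(match.group(1))
    if not mx_hosts:
        return "no MX records"                    # step 2

    for mx_host in mx_hosts:
        accepted = probe_rcpt(mx_host, address)
        if accepted is None:
            continue                              # step 3: server down, try the next one
        return "valid" if accepted else "recipient rejected"   # step 4

    return "no server reachable"
```

Passing the DNS and SMTP parts in as callables keeps the control flow visible and lets each step be replaced or tested on its own.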

This description is greatly simplified. In reality, some SMTP MX servers accept all email addresses belonging to their domains, even non-existent ones (they do this to prevent spam bots from harvesting good addresses by brute force), so a positive reply does not always prove that the mailbox exists.

On the other hand, many servers will reply with an error even if you submit perfectly valid addresses to them. The reason is that by default they assume you’re a spam bot, and you need to convince them that you’re not. Let’s see what we can do about that.

Most spam nowadays is sent by infected home and office systems. Legitimate email, by contrast, is normally sent via SMTP relay servers; it’s uncommon to send email directly from a client to the end recipient’s MX server. So, whenever the target MX decides that your IP address does not correspond to a live SMTP server, you may get banned. What distinguishes a host running a full-fledged SMTP server from a host which does not look that solid?

A good and trusted SMTP server host must have:

  • An IP address which is not in any popular blacklist. A great many IP addresses are blacklisted nowadays. When you get new web hosting for your project, be sure to check whether the IP address you’ll be using is listed at http://mxtoolbox.com/blacklists.aspx. As for households, their IP addresses are usually blacklisted at the level of entire subnetworks.
  • A domain name assigned to the host. It does not need to be a second-level domain; a third-level one will do.
  • A PTR record for the host’s IP address which resolves to the domain name above.
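The blacklist and PTR requirements above can be checked with plain DNS lookups. The sketch below uses only Python’s standard library and follows the common DNSBL convention (reverse the IPv4 octets, append the blacklist zone, and see whether the name resolves). The zone name dnsbl.example.org used in the usage note is a placeholder, not a real blacklist, and real blacklists have their own usage policies and return codes.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the name a DNSBL is queried with: reversed IPv4 octets + zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the DNSBL zone has an address record for this IP.

    By DNSBL convention a listed address resolves (typically to 127.0.0.x)
    and an unlisted one fails to resolve.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except OSError:            # socket.gaierror is a subclass of OSError
        return False


def ptr_name(ip, reverse=socket.gethostbyaddr):
    """Return the host name the IP's PTR record points to, or None."""
    try:
        return reverse(ip)[0]
    except OSError:            # covers socket.herror and socket.gaierror
        return None
```

For the example host used later in this article, dnsbl_query_name("11.22.33.44", "dnsbl.example.org") builds the name 44.33.22.11.dnsbl.example.org, and ptr_name("11.22.33.44") would be expected to return myhost.myprovider.com once the PTR record is in place.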

The SMTP server for our particular purpose (email address validation) must meet one more requirement:

  • SMTP service on port 25 must be running on this host, and this service must accept our sender’s email address (on behalf of which we make our test connections) as a valid recipient.

For instance, suppose your IP address is 11.22.33.44 and the domain name of this host is myhost.myprovider.com. You must have:

  • An MX record for myhost.myprovider.com pointing to myhost.myprovider.com itself.
  • A PTR record for the IP address 11.22.33.44 pointing to myhost.myprovider.com.
  • An SMTP service on port 25 at myhost.myprovider.com (11.22.33.44) which accepts at least user@myhost.myprovider.com (or simply any email address) as a valid recipient when another server tries to send email to us. In short, it must reply with a 250 code to the RCPT TO command.

Our SMTP service does not actually need to accept email from anybody. During the SMTP session, it can reply with an error to any attempt to send actual message data (DATA commands follow RCPT TO commands but servers validating email addresses never go that far).
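To illustrate how little such a service has to do, here is a minimal sketch of a stub SMTP responder in Python (standard library only). It is an illustration under assumptions, not a production mail server and not part of MailBee.NET: it accepts any recipient in the example domain, answers RCPT TO with 250, and refuses DATA. The names reply_for, StubHandler and serve are made up for this sketch.

```python
import socketserver

ACCEPTED_DOMAIN = "myhost.myprovider.com"    # the example host name from this article


def reply_for(line, accepted_domain=ACCEPTED_DOMAIN):
    """Return the SMTP reply for one command line sent by the remote server."""
    text = line.strip()
    verb = text.upper()
    if verb.startswith(("HELO", "EHLO")):
        return "250 " + accepted_domain
    if verb.startswith("MAIL FROM:"):
        return "250 OK"
    if verb.startswith("RCPT TO:"):
        # Accept any recipient in our own domain, reject everything else.
        parts = text[len("RCPT TO:"):].split()
        address = parts[0].strip("<>").lower() if parts else ""
        if address.endswith("@" + accepted_domain):
            return "250 OK"
        return "550 No such user here"
    if verb.startswith("DATA"):
        return "554 This host does not accept mail"    # never take message data
    if verb.startswith(("RSET", "NOOP")):
        return "250 OK"
    if verb.startswith("QUIT"):
        return "221 Bye"
    return "502 Command not implemented"


class StubHandler(socketserver.StreamRequestHandler):
    """Line-by-line SMTP dialogue driven by reply_for()."""

    def handle(self):
        self.wfile.write(("220 %s ESMTP\r\n" % ACCEPTED_DOMAIN).encode("ascii"))
        for raw in self.rfile:
            reply = reply_for(raw.decode("ascii", "replace"))
            self.wfile.write((reply + "\r\n").encode("ascii"))
            if reply.startswith("221"):
                break


def serve(host="0.0.0.0", port=25):
    """Start the stub; binding to port 25 usually needs elevated privileges."""
    with socketserver.TCPServer((host, port), StubHandler) as server:
        server.serve_forever()
```

Calling serve() on the validation host would answer counter connections one at a time. A real deployment would more likely use an existing SMTP service configured to behave this way.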

So, the entire process of validating the abc@company.com address may look like this from the client’s point of view:

  1. The MailBee.NET client operating at myhost.myprovider.com/11.22.33.44 checks the syntax of the abc@company.com address. It’s OK.
  2. The MailBee.NET client makes a DNS MX query to get the list of SMTP MX servers. Let’s assume the domain is OK and we get a list containing a single entry, mx01.someserver.com.
  3. MailBee.NET connects to mx01.someserver.com on port 25 and says hello. Let’s assume this succeeds.
  4. The next step is to submit the sender. We send MAIL FROM:<user@myhost.myprovider.com>. Let’s assume this succeeds too.
  5. Now we submit the recipient: RCPT TO:<abc@company.com>. If the MX server replies positively, we inform the caller that the abc@company.com address is valid.
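Steps 3 to 5 of the client-side session can be sketched with Python’s standard smtplib module. This is not MailBee.NET code, only an illustration of the same SMTP dialogue; the function name probe_recipient and its smtp_factory parameter (there so the network part can be swapped out) are assumptions of this sketch.

```python
import smtplib


def probe_recipient(mx_host, sender, recipient, helo_name,
                    smtp_factory=smtplib.SMTP, timeout=30):
    """Return the reply code of RCPT TO, or None if the session failed earlier."""
    try:
        # Step 3: connect to the MX server on port 25.
        client = smtp_factory(mx_host, 25, local_hostname=helo_name, timeout=timeout)
    except (OSError, smtplib.SMTPException):
        return None
    try:
        client.ehlo_or_helo_if_needed()        # step 3: say hello
        code, _ = client.mail(sender)          # step 4: MAIL FROM
        if code != 250:
            return None
        code, _ = client.rcpt(recipient)       # step 5: RCPT TO
        return code
    except (OSError, smtplib.SMTPException):
        return None
    finally:
        try:
            client.quit()                      # end the session; DATA is never sent
        except (OSError, smtplib.SMTPException):
            pass
```

With the example names, the call would be probe_recipient("mx01.someserver.com", "user@myhost.myprovider.com", "abc@company.com", "myhost.myprovider.com"). A returned code of 250 (or 251) means the server accepted the recipient; a 5xx code means it rejected it.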

Now let’s explore the point of view of company.com MX server (mx01.someserver.com):

  1. We got an incoming SMTP connection from 11.22.33.44. OK.
  2. The client says hello. OK.
  3. The client says it wants to send us email from user@myhost.myprovider.com. OK, remember that.
  4. The client says it wants to send email to abc@company.com. Should we allow that? Let’s check what we know about the client so far:
    1. The domain in the email address is company.com. Do we know it? Yes, it’s a mail domain we host. OK.
    2. Is the client a spammer or not? Its IP address is 11.22.33.44. Look up this IP in some popular RBLs to see if it’s blacklisted. No matches? OK.
    3. Does this IP address have a domain name? Spam bots usually don’t have one, so make sure it does. To check this, we run a DNS PTR query for the address 11.22.33.44. OK, we found that this IP address is linked to a domain name (myhost.myprovider.com in this example). Good.
    4. Does the name myhost.myprovider.com have an MX record associated with it? Spam bot PCs usually don’t have one, while well-established SMTP servers have at least one. To check, we perform a DNS MX query for the myhost.myprovider.com domain. OK, we found an MX record which points to the myhost.myprovider.com host, and this is the same host we got from the DNS PTR lookup above. So, at least according to DNS, the client which is trying to send us email does look like an SMTP server of the myhost.myprovider.com domain.
    5. We checked DNS; now we want to check whether an SMTP service is really running on the host that the DNS lookups point to. Connect to port 25 at 11.22.33.44 and say hello. Did it succeed? If it did, the client trying to send us email is indeed an SMTP server itself, not just a spam bot operating on some home PC. Just to be clear: we made a counter connection to the client. That is, initially the client connected to us (we’re the server), and now we act as a client connecting to it. Two simultaneous connections in opposite directions are open at this moment.
    6. To be bulletproof, let’s also check whether this SMTP server accepts email for the addresses it sends from. The client told us earlier that it wanted to send from its user@myhost.myprovider.com address to our abc@company.com mailbox. Let’s see what it says if we try to send it email from our abc@company.com to its user@myhost.myprovider.com address (swapping the sender and the recipient). We send MAIL FROM:<abc@company.com> and RCPT TO:<user@myhost.myprovider.com> to 11.22.33.44 and check the reply. The server says OK? Fine, we now have final confirmation that the host which wants to send us email is indeed an SMTP server associated with the sender’s domain. Close the counter SMTP session.
    7. In the original SMTP session (initiated by 11.22.33.44), we respond positively to the client’s RCPT TO request, telling it that the abc@company.com recipient is fine.
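The receiving server’s reasoning in sub-steps 1 to 6 is essentially a chain of checks where the first failure ends the session. The Python sketch below shows only that decision logic; it is not taken from any real mail server. The function should_accept_rcpt and the four lookup callables passed to it are hypothetical hooks standing in for the RBL, DNS and counter-connection checks described above.

```python
def should_accept_rcpt(client_ip, sender, recipient, hosted_domains,
                       is_blacklisted, lookup_ptr, lookup_mx, callback_accepts):
    """Return (accepted, reason) for a RCPT TO command.

    Hypothetical hooks supplied by the caller:
      is_blacklisted(ip) -> bool
      lookup_ptr(ip) -> host name or None
      lookup_mx(domain) -> list of MX host names
      callback_accepts(ip, address) -> True if a counter SMTP session to
          ip:25 answered RCPT TO:<address> positively
    """
    recipient_domain = recipient.rpartition("@")[2].lower()
    if recipient_domain not in hosted_domains:
        return False, "recipient domain is not hosted here"      # sub-step 1
    if is_blacklisted(client_ip):
        return False, "client IP is blacklisted"                 # sub-step 2
    host = lookup_ptr(client_ip)
    if host is None:
        return False, "no PTR record"                            # sub-step 3
    mx_hosts = [mx.lower() for mx in lookup_mx(host)]
    if host.lower() not in mx_hosts:
        return False, "PTR host is not an MX for its own name"   # sub-step 4
    if not callback_accepts(client_ip, sender):
        return False, "counter connection failed or sender rejected"   # sub-steps 5-6
    return True, "ok"                                            # sub-step 7
```

Real servers differ in which of these checks they run and how strictly, so the order and the strictness here should be read as one possible policy.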

In practice, things are a bit more complex. For instance, an MX record is not strictly required (although still recommended) if the host accepting mail for a domain has that domain’s name. Also, the MailBee.NET machine and your SMTP service machine can be different; they just need to share the same public IP address. But the idea remains the same.
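The remark about the optional MX record reflects the “implicit MX” rule of RFC 5321 (section 5.1): when a domain has no MX records, the domain name itself is treated as the mail host. A minimal Python sketch of that rule follows; the function name mail_hosts and the (preference, host) tuple format are assumptions of this sketch, and it ignores details such as randomizing hosts of equal preference.

```python
def mail_hosts(domain, mx_records):
    """Return the hosts to try for a domain, best preference first.

    mx_records is a list of (preference, host) pairs from a DNS MX lookup.
    When it is empty, the domain name itself is used as the mail host
    (the implicit MX rule).
    """
    if not mx_records:
        return [domain]
    return [host for _, host in sorted(mx_records)]
```

So a lookup for myhost.myprovider.com that returns no MX records still yields myhost.myprovider.com as the host to contact, which is why the MX record is recommended rather than mandatory in that case.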

If you don’t have a local SMTP service on your server yet, you can install Windows Server’s built-in SMTP service for that.

Not all popular SMTP services make counter SMTP connections. Some don’t even perform DNS PTR checks, so you can validate addresses on them even from a home PC. Still, most services perform these checks at least to some extent.

One more thing. Some popular mail services like Yahoo may ban you if you send too many queries in an hour, even if your DNS is fine and your local SMTP service is present. The limit varies from service to service, but keeping the rate below 200 checks per hour per domain can be a good starting point for further adjustments.
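One simple way to stay under such a limit is a per-domain sliding-window counter. The Python sketch below is an illustration, not a MailBee.NET feature: the class name DomainRateLimiter is made up for this example, and the defaults of 200 checks per 3600 seconds simply mirror the starting point suggested above.

```python
import time
from collections import defaultdict, deque


class DomainRateLimiter:
    """Allow at most `limit` checks per `window` seconds for each domain."""

    def __init__(self, limit=200, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._history = defaultdict(deque)    # domain -> timestamps of recent checks

    def try_acquire(self, domain):
        """Record a check and return True, or return False if over the limit."""
        now = self.clock()
        recent = self._history[domain.lower()]
        while recent and now - recent[0] >= self.window:
            recent.popleft()                  # forget checks older than the window
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

Before each probe, the validator would call try_acquire with the recipient’s domain and postpone the check when it returns False.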

Conclusion

So, SMTP MX servers accepting email for the addresses we’re checking can often act as clients themselves and make counter connections to the host which performs the email validation (the MailBee.NET host). Because of that, you need to set up this host so that it looks like an SMTP server to other SMTP hosts.

This is not usually needed for normal email sending, where you just connect to an SMTP relay server. Relay servers don’t make counter connections, so you (the client) can be behind a proxy, have no public IP address, and so on.

Alternatives

If you don’t have the infrastructure required for a mass email validation facility, you can still validate single addresses using an SMTP relay server (as you normally do for sending email). For that, use the Smtp.TestSend method instead of the normal Smtp.Send (assuming that your SMTP relay has been added to the Smtp.SmtpServers collection). This method is less accurate because many relay servers don’t validate email addresses at the moment you submit email to them. They just accept the email from you, add it to their internal queue, reply with OK and close the session. Later, they relay your email to the destination MX server, and if that server rejects the recipient’s address, they send you a bounce message. With this approach, you’ll always need to monitor and process bounced emails, for example with the DeliveryStatusParser class.

There is also a special SMTP command, VRFY, for checking whether an email address exists. However, because spam bots abused this command for address harvesting, most popular email services no longer support it.
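For completeness, here is what a VRFY check could look like with Python’s standard smtplib module. This is a sketch only; the function names and the mapping of reply codes to verdicts are choices made for this example (the codes themselves come from RFC 5321). In line with the remark above, expect most public servers to answer 252 or 502, which this sketch reports as "unknown".

```python
import smtplib


def interpret_vrfy(code):
    """Map a VRFY reply code to a verdict."""
    if code in (250, 251):
        return "exists"
    if code == 550:
        return "does not exist"
    # 252 means "cannot verify, but will attempt delivery"; 502 means the
    # command is not implemented. Neither tells us anything about the mailbox.
    return "unknown"


def vrfy(mx_host, address, smtp_factory=smtplib.SMTP, timeout=30):
    """Ask the server about an address with VRFY; "unknown" on any failure."""
    try:
        with smtp_factory(mx_host, 25, timeout=timeout) as client:
            client.ehlo_or_helo_if_needed()
            code, _ = client.verify(address)
            return interpret_vrfy(code)
    except (OSError, smtplib.SMTPException):
        return "unknown"
```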
