#!perl -w

=head1 NAME

helo - validate the HELO message presented by a connecting host.

=head1 DESCRIPTION

Validate the HELO hostname. This plugin includes a suite of optional tests,
selectable by the I<policy> setting. The policy section details which tests
are enforced by each policy option.

It sets the connection notes helo_forward_match and helo_reverse_match when
I<policy rfc> or I<policy strict> are used.

Adds an X-HELO header with the HELO hostname to the message.

Using I<policy rfc> will reject a very large portion of the spam from hosts
that have yet to get blacklisted.

=head1 WHY IT WORKS

The reverse DNS of the zombie PCs is out of the spam operators control. Their
only way to get past these tests is to limit themselves to hosts with matching
forward and reverse DNS, and then use the proper HELO hostname when spamming.
At present, this presents a very high hurdle.

=head1 HELO VALIDATION TESTS

=over 4

=item is_in_badhelo

Matches in the I<badhelo> config file, including yahoo.com and aol.com, which
neither the real Yahoo or the real AOL use, but which spammers use a lot.

Like qmail with the qregex patch, the B<badhelo> file can also contain perl
regular expressions. In addition to normal regexp processing, a pattern can
start with a ! character, and get a negated (!~) match.

=item invalid_localhost

Assure that if a sender uses the 'localhost' hostname, they are coming from
the localhost IP.

=item is_plain_ip

Disallow plain IP addresses. They are neither a FQDN nor an address literal.

=item is_address_literal [N.N.N.N]

An address literal (an IP enclosed in brackets) is legal but rarely, if ever,
encountered from legit senders.

=item is_forged_literal

If a literal is presented, make sure it matches the senders IP.

=item is_not_fqdn

Makes sure the HELO hostname contains at least one dot and has only those
characters specifically allowed in domain names (RFC 1035).

=item no_forward_dns

Make sure the HELO hostname resolves.

=item no_reverse_dns

Make sure the senders IP address resolves to a hostname.

=item no_matching_dns

Make sure the HELO hostname has an A or AAAA record that matches the senders
IP address, and make sure that the senders IP has a PTR that resolves to the
HELO hostname.

Per RFC 5321 section 4.1.4, it is impermissible to block a message I<soley>
on the basis of the HELO hostname not matching the senders IP.

Since the dawn of SMTP, having matching DNS has been a minimum standard
expected and oft required of mail servers. While requiring matching DNS is
prudent, requiring an exact match will reject valid email. While testing this
plugin with rejection disabled, I noticed that mx0.slc.paypal.com sends email
from an IP that reverses to mx1.slc.paypal.com. While that's technically an
error, I believe it's an error to reject mail based on it. Especially since
SLD and TLD match.

To avoid snagging false positives, matches are extended to the first
3 octets of the IP and the last two labels of the FQDN. The following are
considered a match:

  192.0.1.2, 192.0.1.3

  foo.example.com, bar.example.com

This allows I<no_matching_dns> to be used without rejecting mail from orgs with
pools of servers where the HELO name and IP don't exactly match. This list
includes Yahoo, Gmail, PayPal, cheaptickets.com, exchange.microsoft.com, and
likely many more.

=back

=head1 CONFIGURATION

=head2 policy [ lenient | rfc | strict ]

Default: lenient

=head3 lenient

Runs the following tests: is_in_badhelo, invalid_localhost,
is_forged_literal, and is_plain_ip.

This setting is lenient enough not to cause problems for your Windows users.
It is comparable to running check_spamhelo, but with the addition of regexp
support, the prevention of forged localhost, forged IP literals, and plain
IPs.

=head3 rfc

Per RFC 2821, the HELO hostname is the FQDN of the sending server or an
address literal. When I<policy rfc> is selected, all the lenient checks and
the following are tested: is_not_fqdn, no_forward_dns, and no_reverse_dns.

If you have Windows users that send mail via your server, do not choose
I<policy rfc> without setting I<reject> to 0 or naughty.
Windows PCs often send unqualified HELO names and will have trouble
sending mail. The B<naughty> plugin defers the rejection, giving the user
the opportunity to authenticate and bypass the rejection.

=head3 strict

Strict includes all the RFC tests and the following: no_matching_dns, and
is_address_literal.

I have yet to see an address literal being used by a hammy sender. But I am
not certain that blocking them all is prudent.

It is recommended that I<policy strict> be used with <reject 0> and that you
examine your logs for false positives.

=head2 badhelo

Add domains, hostnames, or perl regexp patterns to the F<badhelo> config
file; one per line.

=head2 timeout [seconds]

Default: 5

The number of seconds before DNS queries timeout.

=head2 reject [ 0 | 1 | naughty ]

Default: 1

0: do not reject

1: reject

naughty: naughty plugin handles rejection

=head2 reject_type [ temp | perm | disconnect ]

Default: disconnect

What type of rejection should be sent? See docs/config.pod

=head2 loglevel

Adjust the quantity of logging for this plugin. See docs/logging.pod

=head1 RFC 2821

=head2 4.1.1.1

The HELO hostname "...contains the fully-qualified domain name of the SMTP
client if one is available.  In situations in which the SMTP client system
does not have a meaningful domain name (e.g., when its address is dynamically
allocated and no reverse mapping record is available), the client SHOULD send
an address literal (see section 4.1.3), optionally followed by information
that will help to identify the client system."

=head2 2.3.5

The domain name, as described in this document and in [22], is the
entire, fully-qualified name (often referred to as an "FQDN").  A domain name
that is not in FQDN form is no more than a local alias.  Local aliases MUST
NOT appear in any SMTP transaction.


=head1 RFC 5321

=head2 4.1.4

An SMTP server MAY verify that the domain name argument in the EHLO
command actually corresponds to the IP address of the client.
However, if the verification fails, the server MUST NOT refuse to
accept a message on that basis.  Information captured in the
verification attempt is for logging and tracing purposes.  Note that
this prohibition applies to the matching of the parameter to its IP
address only; see Section 7.9 for a more extensive discussion of
rejecting incoming connections or mail messages.

=head1 TODO

is_forged_literal, if the forged IP is an internal IP, it's likely one
of our clients that should have authenticated. Perhaps when we check back
later in data_post, if they have added relay_client, then give back the
karma.

=head1 AUTHOR

2012 - Matt Simerson

=head1 ACKNOWLEDGEMENTS

badhelo processing from check_badhelo plugin

badhelo regex processing idea from qregex patch

additional check ideas from Haraka helo plugin

=cut

use strict;
use warnings;

use Net::IP;

use Qpsmtpd::Constants;

sub register {
    my ($self, $qp) = (shift, shift);
    $self->{_args} = {@_};

    $self->{_args}{reject_type} = 'disconnect';
    $self->{_args}{policy} ||= 'lenient';
    $self->{_args}{dns_timeout} ||= $self->{_args}{timeout} || 5;

    if (!defined $self->{_args}{reject}) {
        $self->{_args}{reject} = 1;
    }
    $self->populate_tests();
    $self->get_resolver() or return;

    $self->register_hook('helo',      'helo_handler');
    $self->register_hook('ehlo',      'helo_handler');
    $self->register_hook('data_post', 'data_post_handler');
}

sub helo_handler {
    my ($self, $transaction, $host) = @_;

    if (!$host) {
        $self->log(LOGINFO, "fail, tolerated, no helo host");
        $self->adjust_karma(-2);
        return DECLINED;
    }

    return DECLINED if $self->is_immune();

    foreach my $test (@{$self->{_helo_tests}}) {
        my @err = $self->$test($host);
        if (scalar @err) {
            $self->adjust_karma(-1);
            return $self->get_reject(@err);
        }
    }

    $self->log(LOGINFO, "pass");
    return DECLINED;
}

sub data_post_handler {
    my ($self, $transaction) = @_;

    $transaction->header->delete('X-HELO');
    $transaction->header->add('X-HELO', $self->qp->connection->hello_host, 0);

    return DECLINED;
}

sub populate_tests {
    my $self = shift;

    my $policy = $self->{_args}{policy};
    @{$self->{_helo_tests}} =
      qw/ is_in_badhelo invalid_localhost is_forged_literal is_plain_ip /;

    if ($policy eq 'rfc' || $policy eq 'strict') {
        push @{$self->{_helo_tests}},
          qw/ is_not_fqdn no_forward_dns no_reverse_dns /;
    }

    if ($policy eq 'strict') {
        push @{$self->{_helo_tests}}, qw/ is_address_literal no_matching_dns /;
    }
}

sub is_in_badhelo {
    my ($self, $host) = @_;

    my $error = "I do not believe you are $host.";

    $host = lc $host;
    foreach my $bad ($self->qp->config('badhelo')) {
        if ($bad =~ /[\{\}\[\]\(\)\^\$\|\*\+\?\\\!]/) {    # it's a regexp
            return $self->is_regex_match($host, $bad);
        }
        if ($host eq lc $bad) {
            return $error, "in badhelo";
        }
    }
    return;
}

sub is_regex_match {
    my ($self, $host, $pattern) = @_;

    my $error = "Your HELO hostname is not allowed";

    #$self->log( LOGDEBUG, "is regex ($pattern)");
    if (substr($pattern, 0, 1) eq '!') {
        $pattern = substr $pattern, 1;
        if ($host !~ /$pattern/) {

            #$self->log( LOGDEBUG, "matched ($pattern)");
            return $error, "badhelo pattern match ($pattern)";
        }
        return;
    }
    if ($host =~ /$pattern/) {

        #$self->log( LOGDEBUG, "matched ($pattern)");
        return $error, "badhelo pattern match ($pattern)";
    }
    return;
}

sub invalid_localhost {
    my ($self, $host) = @_;
    if ($self->is_localhost($self->qp->connection->remote_ip)) {
        $self->log(LOGDEBUG, "pass, is localhost");
        return;
    }
    if ($host && lc $host eq 'localhost') {
        $self->log(LOGDEBUG, "pass, host is localhost");
        return;
    };

    #$self->log( LOGINFO, "fail, not localhost" );
    return "You are not localhost", "invalid localhost";
}

sub is_plain_ip {
    my ($self, $host) = @_;
    return if ! $self->is_valid_ip($host);

    $self->log(LOGDEBUG, "fail, plain IP");
    return "Plain IP is invalid HELO hostname (RFC 2821)", "plain IP";
}

sub is_address_literal {
    my ($self, $host) = @_;

    my ($ip) = $host =~ /^\[(.*)\]/;  # strip off any brackets
    return if !$ip;   # no brackets, not a literal

    return if ! $self->is_valid_ip($ip);

    $self->log(LOGDEBUG, "fail, bracketed IP");
    return "RFC 2821 allows an address literal, but we do not","bracketed IP";
}

sub is_forged_literal {
    my ($self, $host) = @_;
    return if ! $self->is_valid_ip($host);

  # should we add exceptions for reserved internal IP space? (192.168,10., etc)
    $host = substr $host, 1, -1;
    return if $host eq $self->qp->connection->remote_ip;
    return "Forged IPs not accepted here", "forged IP literal";
}

sub is_not_fqdn {
    my ($self, $host) = @_;
    return if $host =~ m/^\[(\d{1,3}\.){3}\d{1,3}\]$/;   # address literal, skip
    if ($host !~ /\./) {                                 # has no dots
        return "HELO name is not fully qualified. Read RFC 2821", "not FQDN";
    }
    if ($host =~ /[^a-zA-Z0-9\-\.]/) {
        return "HELO name contains invalid FQDN characters. Read RFC 1035","invalid FQDN chars";
    }
    return;
}

sub no_forward_dns {
    my ($self, $host) = @_;

    return if $self->is_address_literal($host);

    my $res = $self->get_resolver();

    $host = "$host." if $host !~ /\.$/;    # fully qualify name
    my $query = $res->query($host);

    if (!$query) {
        if ($res->errorstring eq 'NXDOMAIN') {
            return "HELO hostname does not exist", "no such host";
        }
        $self->log(LOGERROR, "skip, query failed (", $res->errorstring, ")");
        return;
    }
    my $hits = 0;
    foreach my $rr ($query->answer) {
        next unless $rr->type =~ /^(?:A|AAAA)$/;
        $self->check_ip_match($rr->address);
        $hits++;
        last if $self->connection->notes('helo_forward_match');
    }
    if ($hits) {
        $self->log(LOGDEBUG, "pass, forward DNS") if $hits;
        return;
    }
    return "HELO hostname did not resolve", "no forward DNS";
}

sub no_reverse_dns {
    my ($self, $host, $ip) = @_;

    my $res = $self->get_resolver();
    $ip ||= $self->qp->connection->remote_ip;

    my $query = $res->query($ip) or do {
        if ($res->errorstring eq 'NXDOMAIN') {
            return "no rDNS for $ip", "no rDNS";
        }
        $self->log(LOGINFO, $res->errorstring);
        return "error getting reverse DNS for $ip", "rDNS " . $res->errorstring;
    };

    my $hits = 0;
    for my $rr ($query->answer) {
        next if $rr->type ne 'PTR';
        $self->log(LOGDEBUG, "PTR: " . $rr->ptrdname);
        $self->check_name_match(lc $rr->ptrdname, lc $host);
        $hits++;
    }
    if ($hits) {
        $self->log(LOGDEBUG, "has rDNS");
        return;
    }
    return "no reverse DNS for $ip", "no rDNS";
}

sub no_matching_dns {
    my ($self, $host) = @_;

    # this is called iprev, or "Forward-confirmed reverse DNS" and is discussed
    # in RFC 5451. FCrDNS is done for the remote IP in the fcrdns plugin. Here
    # we do it on the HELO hostname.
    # consider adding status to Authentication-Results header

    if (   $self->connection->notes('helo_forward_match')
        && $self->connection->notes('helo_reverse_match'))
    {
        $self->log(LOGDEBUG, "foward and reverse match");
        $self->adjust_karma(1);    # a perfect match
        return;
    }

    if ($self->connection->notes('helo_forward_match')) {
        $self->log(LOGDEBUG, "name matches IP");
        return;
    }
    if ($self->connection->notes('helo_reverse_match')) {
        $self->log(LOGDEBUG, "reverse matches name");
        return;
    }

    $self->log(LOGINFO, "fail, no forward or reverse DNS match");
    return "That HELO hostname fails FCrDNS", "no matching DNS";
}

sub check_ip_match {
    my $self = shift;
    my $ip = shift or return;

    my $rip = $self->qp->connection->remote_ip;
    if ($ip eq $rip) {
        $self->log(LOGDEBUG, "forward ip match");
        $self->connection->notes('helo_forward_match', 1);
        return;
    }

    my ($dns_net, $rem_net);
    if ($ip =~ /:/) {
        if ($ip  =~ /::/) { $ip  = Net::IP::ip_expand_address($ip,  6); }
        if ($rip =~ /::/) { $rip = Net::IP::ip_expand_address($rip, 6); }

        $dns_net = join(':', (split(/:/, $ip ))[0, 1, 2, 3, 4, 5]);
        $rem_net = join(':', (split(/:/, $rip))[0, 1, 2, 3, 4, 5]);
    }
    else {
        $dns_net = join('.', (split(/\./, $ip))[0, 1, 2]);
        $rem_net = join('.', (split(/\./, $rip))[0, 1, 2]);
    }

    if ($dns_net eq $rem_net) {
        $self->log(LOGNOTICE, "forward network match");
        $self->connection->notes('helo_forward_match', 1);
    }
}

sub check_name_match {
    my $self = shift;
    my ($dns_name, $helo_name) = @_;

    return if !$dns_name;
    return if split(/\./, $dns_name) < 2;    # not a FQDN

    if ($dns_name eq $helo_name) {
        $self->log(LOGDEBUG, "reverse name match");
        $self->connection->notes('helo_reverse_match', 1);
        return;
    }

    my $dns_dom  = join('.', (split(/\./, $dns_name))[-2,  -1]);
    my $helo_dom = join('.', (split(/\./, $helo_name))[-2, -1]);

    if ($dns_dom eq $helo_dom) {
        $self->log(LOGNOTICE, "reverse domain match");
        $self->connection->notes('helo_reverse_match', 1);
    }
}