#!perl -w

=head1 NAME

headers - validate message headers

=head1 DESCRIPTION

Checks for missing or empty values in the From or Date headers.

Make sure no singular headers are duplicated. Singular headers are:

 Date From Sender Reply-To To Cc Bcc
 Message-Id In-Reply-To References Subject

Optionally test if the Date header is too many days in the past or future. If
I<future> or I<past> are not defined, they are not tested.

If the remote IP is whitelisted, header validation is skipped.

=head1 CONFIGURATION

The following optional settings exist:

=head2 require

 headers require [ From | Date | From,Date | From,Date,Subject,Message-ID,Received ]

A comma separated list of headers to require. 

Default: From

=head3 Requiring the Date header

As of 2012, requiring a valid date header will almost certainly cause the loss
of valid mail. The JavaMail sender used by some banks, photo processing
services, health insurance companies, bounce senders, and others do send
messages without a Date header. For this reason, and despite RFC 5322, the
default is not to require Date.

However, if the date header is present, and I<future> and/or I<past> are
defined, it will be validated.

=head2 future

The number of days in the future beyond which messages are invalid.

  headers [ future 1 ]

=head2 past

The number of days in the past beyond which a message is invalid. The Date header is added by the MUA, so there are many valid reasons a message may have an older date in the header. It could have been delayed by the client, the sending server, connectivity problems, recipient server problem, recipient server configuration, etc. The I<past> setting should take those factors into consideration.

I would be surprised if a valid message ever had a date header older than a week.

  headers [ past 5 ]

=head2 reject

Determine if the connection is denied. Use the I<reject 0> option when first enabling the plugin, and then watch your logs to see what would have been rejected. When you are no longer concerned that valid messages will be rejected, enable with I<reject 1>.

  headers reject [ 0 | 1 ]

Default: 1

=head2 reject_type

Whether to issue a permanent or temporary rejection. The default is permanent.

  headers reject_type [ temp | perm ]

Using a temporary rejection is a cautious way to enable rejections. It allows an administrator to watch for a trial period and assure no valid messages are rejected. If a deferral of valid mail is noticed, I<reject 0> can be set to permit the deferred message to be delivered.

Default: perm

=head2 loglevel

Adjust the quantity of logging for this plugin. See docs/logging.pod

=head1 TODO

=head1 SEE ALSO

https://tools.ietf.org/html/rfc5322

=head1 AUTHOR

2012 - Matt Simerson

=head1 ACKNOWLEDGEMENTS

based in part upon check_basicheaders by Jim Winstead Jr.

Singular headers idea from Haraka's data.rfc5322_header_checks.js by Steve Freegard

=cut

use strict;
use warnings;

use Qpsmtpd::Constants;

use Date::Parse qw(str2time);

my @required_headers = qw/ From /;  # <- to be RFC 5322 compliant, add Date here

#my @should_headers   = qw/ Message-ID /;
my @singular_headers = qw/ Date From Sender Reply-To To Cc Bcc
  Message-Id In-Reply-To References
  Subject /;

sub register {
    my ($self, $qp) = (shift, shift);

    $self->log(LOGWARN, "invalid arguments") if @_ % 2;
    $self->{_args} = {@_};

    $self->{_args}{reject_type} ||= 'perm';    # set default
    if (!defined $self->{_args}{reject}) {
        $self->{_args}{reject} = 1;            # set default
    }

    if ($self->{_args}{require}) {
        @required_headers = split /,/, $self->{_args}{require};
    }
}

sub hook_data_post {
    my ($self, $transaction) = @_;

    if ($transaction->data_size == 0) {
        return $self->get_reject("You must send some data first", "no data");
    }

    my $header = $transaction->header or do {
        return $self->get_reject("Headers are missing", "missing headers");
    };

    return DECLINED if $self->is_immune();

    my $errors  = $self->has_required_headers( $header ) || 0;
       $errors += $self->has_singular_headers( $header );

    my $err_msg = $self->invalid_date_range();
    if ($err_msg) {
        return $self->get_reject($err_msg, $err_msg);
    }

    if ( $errors ) {
        return $self->get_reject($self->get_reject_type(),
                "RFC 5322 validation errors" );
    };

    $self->log(LOGINFO, 'pass');
    return DECLINED;
}

sub has_required_headers {
    my ($self, $header) = @_;

    my $errors = 0;
    foreach my $h (@required_headers) {
        next if $header->get($h);
        $errors++;
        $self->adjust_karma(-1);
        $self->is_naughty(1) if $self->{args}{reject};
        $self->store_deferred_reject("We require a valid $h header");
        $self->log(LOGINFO, "fail, no $h header" );
    }
    return $errors;
}

sub has_singular_headers {
    my ($self, $header) = @_;

    my $errors = 0;
    foreach my $h (@singular_headers) {
        next if !$header->get($h);    # doesn't exist
        my @qty = $header->get($h);
        next if @qty == 1;            # only 1 header
        $errors++;
        $self->adjust_karma(-1);
        $self->is_naughty(1) if $self->{args}{reject};
        $self->store_deferred_reject(
                "Only one $h header allowed. See RFC 5322, Section 3.6",
                );
        $self->log(LOGINFO, "fail, too many $h headers" );
    }
    return $errors;
}

sub invalid_date_range {
    my $self = shift;

    return if !$self->transaction->header;
    my $date = shift || $self->transaction->header->get('Date') or return;
    chomp $date;

    my $msg_ts = str2time($date) or do {
        $self->log(LOGINFO, "skip, date not parseable ($date)");
        return;
    };

    my $past = $self->{_args}{past};
    if ($past && $msg_ts < time - ($past * 24 * 3600)) {
        $self->log(LOGINFO, "fail, date too old ($date)");
        $self->adjust_karma(-1);
        return "The Date header is too far in the past";
    }

    my $future = $self->{_args}{future};
    if ($future && $msg_ts > time + ($future * 24 * 3600)) {
        $self->log(LOGINFO, "fail, date in future ($date)");
        $self->adjust_karma(-1);
        return "The Date header is too far in the future";
    }

    return;
}

__END__

=head1 SMTP HEADERS

http://forum.unifiedemail.net/default.aspx?g=posts&t=68

=head2 From:

The eMail address, and optionally the name of the author(s). In many eMail clients not changeable except through changing account settings.

=head2 To:

The eMail address(es), and optionally name(s) of the message's recipient(s). Indicates primary recipients (multiple allowed), for secondary recipients see Cc: and Bcc: below.

=head2 Subject:

A brief summary of the topic of the message. Certain abbreviations are commonly used in the subject, including "RE:" and "FW:".

=head2 Date:

The local time and date when the message was written. Like the From: field, many email clients fill this in automatically when sending. The recipient's client may then display the time in the format and time zone local to him/her.

=head2 Message-ID:

Also an automatically generated field; used to prevent multiple delivery and for reference in In-Reply-To: (see below).

=head2 Bcc:

Blind Carbon Copy; addresses added to the SMTP delivery list but not (usually) listed in the message data, remaining invisible to other recipients.

=head2 Cc:

Carbon copy; Many eMail clients will mark eMail in your inbox differently depending on whether you are in the To: or Cc: list.

=head2 Content-Type:

Information about how the message is to be displayed, usually a MIME type.

=head2 In-Reply-To:

Message-ID of the message that this is a reply to. Used to link related messages together.

=head2 Precedence:

Commonly with values "bulk", "junk", or "list"; used to indicate that automated "vacation" or "out of office" responses should not be returned for this mail, e.g. to prevent vacation notices from being sent to all other subscribers of a mailinglist.

=head2 Received:

Tracking information generated by mail servers that have previously handled a message, in reverse order (last handler first).

=head2 References:

Message-ID of the message that this is a reply to, and the message-id of the message the previous was reply a reply to, etc.

=head2 Reply-To:

Address that should be used to reply to the message.

=head2 Sender:

Address of the actual sender acting on behalf of the author listed in the From: field (secretary, list manager, etc.).

=head2 Return-Path:

When the delivery SMTP server makes the "final delivery" of a message, it inserts a return-path line at the beginning of the mail data. Thisuse of return-path is required; mail systems MUST support it. The return-path line preserves the information in the from the MAIL command.

=head2 Error-To:

Indicates where error messages should be sent. In the absence of this line, they go to the Sender:, and absent that, the From: address.

=head2 X-*

No standard header field will ever begin with the characters "X-", so application developers are free to use them for their own purposes.

=cut