Keeping the message files around, and the message details in the database, is
useful for IMAP sessions that haven't seen/processed the removal of a message
yet and try to fetch it. Before, we would return errors. Similarly, a session
that has a mailbox selected that is removed can (at least in theory) still read
messages.
The mechanics for this require keeping removed mailboxes around too. JMAP
needs that anyway, so we now keep modseq/createseq/expunged history for
mailboxes too. And while we're at it, for annotations as well.
For future JMAP support, we now also keep the mailbox parent id around for a
mailbox, with an upgrade step to set the field for existing mailboxes and to
fix up potentially missing parents (which could have happened in an obscure
corner case that I doubt anyone ran into).
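As a sketch of what such an upgrade step can look like (the types, field names
and helpers are hypothetical, not the actual store schema): walk all mailboxes,
derive the parent name from the hierarchical name, create the parent if it is
missing, and record its id. The real upgrade also has to persist the changed
and newly created mailboxes, which is omitted here.

	package sketch

	import "strings"

	// Mailbox is an illustrative stand-in for the real store type; the field
	// names are hypothetical.
	type Mailbox struct {
		ID       int64
		Name     string // Hierarchical name, e.g. "Archive/2024".
		ParentID int64  // 0 means top-level.
	}

	// upgradeParentIDs sets ParentID on every mailbox, creating any missing
	// parents along the way (the obscure corner case mentioned above).
	func upgradeParentIDs(byName map[string]*Mailbox, nextID func() int64) {
		var ensure func(name string) *Mailbox
		ensure = func(name string) *Mailbox {
			mb := byName[name]
			if mb == nil {
				mb = &Mailbox{ID: nextID(), Name: name}
				byName[name] = mb
			}
			if i := strings.LastIndex(name, "/"); i >= 0 && mb.ParentID == 0 {
				mb.ParentID = ensure(name[:i]).ID
			}
			return mb
		}
		for name := range byName {
			ensure(name)
		}
	}
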
We normally recover from those situations, printing stack traces instead of
crashing the program. But during tests, we're not looking at the prometheus
metrics or all the output. Without these checks, such panics could go
unnoticed. Seems like a reasonable thing to add; so far, unhandled panics
haven't been encountered in tests.
DeliverMessage() is now MessageAdd(), and it takes a Mailbox object that it
modifies but doesn't write to the database (the caller must do it, and plenty
of times can do it more efficiently by doing it once for multiple messages).
The new AddOpts let the caller influence how many checks and how much of the
work MessageAdd() does. The zero-value AddOpts enables all checks and all the
work, but a caller can take responsibility for some of the checks/work if it
can do them more efficiently itself.
This simplifies the code in most places, and makes it more efficient. The
checks to update per-mailbox keywords are a bit simpler now too.
We are also more careful to close the junk filter without saving it in case of
errors.
Still part of more upcoming changes.
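A rough sketch of the pattern (type, field and function names here are
illustrative, not mox's actual API): the zero value of the options struct means
"do everything", and each field lets a caller opt out of work it already did,
or will do once for a whole batch.

	package sketch

	// Illustrative types; not the actual mox store schema.
	type Mailbox struct {
		MessageCount int64
		Size         int64
	}

	type Message struct {
		Size int64
	}

	// AddOpts: the zero value enables every check and all the work. A caller
	// that can do part of it more efficiently (e.g. one quota check for a
	// whole batch of messages) opts out per field.
	type AddOpts struct {
		SkipQuotaCheck bool // Caller already checked quota for the batch.
		SkipTraining   bool // Caller trains the junk filter itself.
		SkipDirSync    bool // Caller syncs the message directory once at the end.
	}

	// messageAdd updates mb in memory but does not write it to the database;
	// the caller saves mb once, possibly after adding many messages.
	func messageAdd(mb *Mailbox, m *Message, opts AddOpts) error {
		if !opts.SkipQuotaCheck {
			// ... check quota, return an error when over ...
		}
		mb.MessageCount++
		mb.Size += m.Size
		if !opts.SkipTraining {
			// ... train the junk filter with the new message ...
		}
		if !opts.SkipDirSync {
			// ... fsync the directory holding the message file ...
		}
		return nil
	}
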
In the common case, the delivery directory is the same as for the previous
delivery. That means we don't have to try to create the directory (fewer
syscalls) and that we can sync the dir to disk.
This also tweaks the defer handling in case of a late failure.
We effectively held the account write-locked by using a writable transaction
while processing the FETCH command. We did this because we may have to update
\Seen flags, for non-PEEK attribute fetches. This meant other FETCHes would
block, and other write access to the account too.
We now read the messages in a read-only transaction. We gather messages that
need marking as \Seen, and make that change in one (much shorter) database
transaction at the end of the FETCH command.
In practice, it doesn't seem too sensible to mark messages as seen
automatically. Most clients probably use the PEEK-variant of attribute fetches.
Related to issue #128.
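A sketch of the two-phase approach (the transaction and message types are
placeholders, not mox's actual API): write the FETCH responses from a read-only
transaction while collecting the UIDs that still need \Seen, then apply those
flag changes in one short write transaction.

	package sketch

	// Placeholder types; not the actual mox store API.
	type Message struct {
		UID  int64
		Seen bool
	}

	// ReadTx and WriteTx stand in for read-only and writable database
	// transactions.
	type ReadTx interface {
		Message(uid int64) (Message, error)
	}

	type WriteTx interface {
		SetSeen(uids []int64) error
	}

	// fetchPhase writes the FETCH responses from a read-only transaction and
	// collects the UIDs that still need \Seen (for non-PEEK attribute fetches).
	func fetchPhase(tx ReadTx, uids []int64, respond func(Message)) ([]int64, error) {
		var markSeen []int64
		for _, uid := range uids {
			m, err := tx.Message(uid)
			if err != nil {
				return nil, err
			}
			respond(m)
			if !m.Seen {
				markSeen = append(markSeen, uid)
			}
		}
		return markSeen, nil
	}

	// seenPhase runs after all responses have been written, in one short write
	// transaction, so other readers and writers are blocked only briefly.
	func seenPhase(tx WriteTx, markSeen []int64) error {
		if len(markSeen) == 0 {
			return nil
		}
		return tx.SetSeen(markSeen)
	}
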
The previous commit fixed an array out of bounds access that resulted in a
panic on an smtpserver connection. The panic is recovered and marked as
"unhandled panic" in metrics, and the connection is closed.
Fix up a test or two. Simplify the XOR logic for when we train the junk
filter: only if junk or nonjunk is set, but not when both (or neither) are set,
i.e. when the values aren't the same.
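In Go, that XOR of two booleans is just an inequality test; a trivial sketch
(the function name is made up):

	package sketch

	// shouldTrain reports whether the junk filter should be trained: only when
	// exactly one of the junk/nonjunk flags is set. For booleans, XOR is !=.
	func shouldTrain(junk, nonjunk bool) bool {
		return junk != nonjunk
	}
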
Locking the account when we do consistency checks prevents spurious test
failures that may have been introduced in the previous commit.
Writing to a connection goes through the flate library to compress. That writes
the compressed bytes to the underlying connection. But that underlying
connection is wrapped to raise a panic with an i/o error instead of returning a
normal error. Jumping out of flate leaves the internal state of the compressor
in an undefined state. So far so good. But as part of cleaning up the
connection, we could try to flush output again. Specifically: if we were
writing user data,
we had switched from tracing of protocol data to tracing of user data, and we
registered a defer that restored the tracing kind and flushed (to ensure data
was traced at the right level). That flush would cause a write into the
compressor again, which could panic with an out of bounds slice access due to
its inconsistent internal state.
This fix prevents that compressor panic in two ways:
1. We wrap the flate.Writer with a moxio.FlateWriter that keeps track of
whether a panic came out of an operation on it. If so, any further operation
raises the same panic. This prevents access to the inconsistent internal flate
state entirely.
2. Once we raise an i/o error, we mark the connection as broken and that makes
flushes a no-op.
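A sketch of the idea behind fix 1 (simplified, not the actual moxio.FlateWriter
code): remember any panic that escapes from an operation on the flate.Writer
and re-raise it on later use, so the inconsistent compressor state is never
touched again.

	package sketch

	import (
		"compress/flate"
		"io"
	)

	// flateWriter wraps a *flate.Writer and remembers whether a panic escaped
	// from it. After that, the compressor's internal state must be considered
	// inconsistent, so every later call re-raises the same panic instead of
	// touching the flate state again.
	type flateWriter struct {
		w        *flate.Writer
		panicked any // Non-nil after a panic escaped from an operation.
	}

	func newFlateWriter(dst io.Writer) (*flateWriter, error) {
		fw, err := flate.NewWriter(dst, flate.DefaultCompression)
		if err != nil {
			return nil, err
		}
		return &flateWriter{w: fw}, nil
	}

	// guard re-raises an earlier panic, and returns a deferred function that
	// records a new one.
	func (w *flateWriter) guard() func() {
		if w.panicked != nil {
			panic(w.panicked)
		}
		return func() {
			if x := recover(); x != nil {
				w.panicked = x
				panic(x)
			}
		}
	}

	func (w *flateWriter) Write(p []byte) (int, error) {
		defer w.guard()()
		return w.w.Write(p)
	}

	func (w *flateWriter) Flush() error {
		defer w.guard()()
		return w.w.Flush()
	}
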
REPLACE can be used to update draft messages as you are editing, instead of
requiring an APPEND, a STORE of \Deleted and an EXPUNGE. REPLACE works
atomically.
Its syntax is similar to APPEND's; it additionally lets you specify the message
in the currently selected mailbox to replace. The regular REPLACE command works
on a message sequence number, the UID REPLACE command on a UID. The destination
mailbox of the updated message can be different, for example to move a draft
message from the Drafts folder to the Sent folder.
We have to do quite a bit of bookkeeping, e.g. updating (message) counts for
the mailbox, checking quota, un/retraining the junk filter. During a
synchronizing literal, we check the parameters early and reject if the replace
would fail (e.g. over quota, bad destination mailbox).
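Roughly what an exchange looks like on the wire (tag, literal size and server
responses are illustrative), replacing the message with sequence number 3 and
storing the updated draft in Drafts:

	C: a1 REPLACE 3 Drafts {1205}
	S: + ready for literal
	C: ...1205 bytes of the updated message...
	S: * OK [APPENDUID 1718875199 42] new message
	S: * 3 EXPUNGE
	S: a1 OK replace completed
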
MULTIAPPEND modifies the existing APPEND command to allow multiple messages. it
is somewhat more involved than a regular append of a single message since the
operation (of adding multiple messages) must be atomic. either all are added,
or none are.
we check as early as possible that the messages won't cause an over-quota error.
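roughly what a multiappend looks like on the wire (sizes and responses are
illustrative): a single APPEND command carrying two messages, added atomically:

	C: a2 APPEND Archive (\Seen) {771}
	S: + ready for literal
	C: ...771 bytes of the first message...
	C:  (\Answered) {2738}
	S: + ready for literal
	C: ...2738 bytes of the second message...
	S: a2 OK append completed
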
our previous approach was to hope clients did the ID command right after the
AUTHENTICATE command. with more extensions implemented, that's a stretch:
clients are doing other commands in between.
the new approach is to allow more commands, but wait at most 1 second. clients
are still assumed to send the ID command soon after authenticate. we also still
ensure login attempts are logged on connection teardown, so we aren't missing
any logging, just may get it slightly delayed. seems reasonable.
we now also keep the useragent value around, and we use it when initializing
the login attempt, because the ID command can happen at any time, also before
the AUTHENTICATE command.
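one way to express that, as a sketch (names and structure are made up, not the
actual imapserver code): the command loop lazily finishes the pending login
attempt once it sees an ID command, or once more than a second has passed since
authentication:

	package sketch

	import "time"

	type conn struct {
		userAgent   string    // From an ID command, possibly sent before AUTHENTICATE.
		authTime    time.Time // When authentication completed.
		attemptOpen bool      // Login attempt not logged yet.
	}

	// command is called for each command read from the client.
	func (c *conn) command(name string, idParams map[string]string) {
		if name == "ID" {
			c.userAgent = idParams["name"] + " " + idParams["version"]
		}
		// Log the pending login attempt once ID arrived, or once we have
		// waited long enough; connection teardown logs it in any case.
		if c.attemptOpen && (name == "ID" || time.Since(c.authTime) > time.Second) {
			c.logLoginAttempt()
			c.attemptOpen = false
		}
		// ... handle the command itself ...
	}

	func (c *conn) logLoginAttempt() {
		// ... record the attempt, including c.userAgent if known ...
	}
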
only with "return" including "metadata". so clients can quickly get certain
metadata (eg for display, such as a color) for mailboxes.
this also adds a protocol token type "mailboxt" that properly encodes to utf7
if required.
not just /private. /shared/ is the more commonly implemented namespace, because
it is easier to implement: you don't need per-user/account storage of metadata.
i initially approached it from the other direction: we don't have a mechanism
to share metadata with other accounts, so everything is private, and i assumed
that would be what a user would prefer. but email clients make the decisions,
and they'll likely try the /shared/ namespace.
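for example, a client can read both namespaces with one getmetadata command
(mailbox, entry names and values are illustrative):

	C: a3 GETMETADATA "Support" (/shared/comment /private/comment)
	S: * METADATA "Support" (/shared/comment "shared note" /private/comment "my note")
	S: a3 OK getmetadata completed
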
ip addresses are invalid in server names. for ipv6 addresses, the
autocert.GetCertificate calls would return an error, which we logged and for
which we increased a metric. but the alerts for this situation aren't helpful.
so we now recognize ip addresses early. if we are lenient about unknown server
names (for
incoming smtp deliveries), we switch to the fallback hostname, otherwise we
return an error.
this was the error logged:
l=error m="requesting certificate" err="acme/autocert: server name component count invalid"
for ipv4 addresses, the name wouldn't be in our allowlist and should already
have caused us to switch to the fallback hostname.
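a sketch of the early check (simplified, not the actual mox code; acmeGet
stands in for autocert's GetCertificate):

	package sketch

	import (
		"crypto/tls"
		"errors"
		"net"
	)

	type getCertFunc = func(*tls.ClientHelloInfo) (*tls.Certificate, error)

	// wrapGetCertificate recognizes ip addresses in the SNI server name before
	// they ever reach ACME. For lenient listeners (incoming smtp deliveries)
	// we switch to the fallback hostname, otherwise we return an error.
	func wrapGetCertificate(fallbackHost string, lenient bool, acmeGet getCertFunc) getCertFunc {
		return func(hello *tls.ClientHelloInfo) (*tls.Certificate, error) {
			if net.ParseIP(hello.ServerName) == nil {
				return acmeGet(hello) // Regular server name.
			}
			if !lenient {
				return nil, errors.New("refusing certificate for ip address as server name")
			}
			h := *hello
			h.ServerName = fallbackHost
			return acmeGet(&h)
		}
	}
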
i added the metadata extension to the imapserver recently. then i wondered how
a client would efficiently find changed metadata. turns out the qresync rfc
mentions that metadata changes should set a new modseq on the mailbox.
shouldn't be hard, except that we were not explicitly keeping track of modseqs
per mailbox. we only kept them for messages, and we were just looking up the
latest message modseq when we needed the modseq (we keep db entries for
expunged messages, so this worked out fine). that approach isn't enough
anymore. so now we keep track of modseq & createseq for mailboxes, just as for
messages. and we also track modseq/createseq for annotations. there's a good
chance jmap is going to need it.
this also adds consistency checks for modseq/createseq on mailboxes and
annotations to the account storage. it helped spot cases i missed where the
values need to be updated.
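illustrative shape of the records (field names are approximate, not the actual
store schema): mailboxes and annotations now carry the same change-tracking
fields as messages, and a metadata change bumps both modseqs:

	package sketch

	// Field names are approximations, not the actual mox store schema.
	type Mailbox struct {
		ID        int64
		CreateSeq int64 // Modseq at which the mailbox was created.
		ModSeq    int64 // Assigned anew on every change, including metadata changes.
		Expunged  bool  // Kept around after removal, for qresync/jmap history.
	}

	type Annotation struct {
		MailboxID int64
		Key       string
		Value     []byte
		CreateSeq int64
		ModSeq    int64
		Expunged  bool
	}

	// setAnnotation shows the invariant the new consistency checks verify: a
	// metadata change assigns the next modseq to both the annotation and its
	// mailbox.
	func setAnnotation(mb *Mailbox, ann *Annotation, value []byte, nextModSeq int64) {
		ann.Value = value
		ann.ModSeq = nextModSeq
		if ann.CreateSeq == 0 {
			ann.CreateSeq = nextModSeq
		}
		mb.ModSeq = nextModSeq
	}
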
the tls resumption test was failing due to the switch from net.Pipe to unix
domain socket pairs. on bsds, they have an empty name (on linux it is "@"),
which prevents tls resumption from working.
to compress the entire IMAP connection. tested with thunderbird, meli, k9, ios
mail. the initial implementation had interoperability issues with some of these
clients: if they write the deflate stream and flush in "partial mode", the go
stdlib flate reader does not return any data (until there is an explicit
zero-length "sync flush" block, or until the history/sliding window is full),
blocking progress, resulting in clients closing the seemingly stuck connection
after considering the connection timed out. this includes a copy of the flate
package with a new reader that returns partially flushed blocks earlier.
this also adds imap trace logging to imapclient.Conn, which was useful for
debugging.
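the wiring itself is small; a minimal sketch with the stdlib (mox actually
uses its modified copy of compress/flate so that partially flushed blocks
become readable earlier):

	package sketch

	import (
		"compress/flate"
		"io"
		"net"
	)

	// wrapDeflate wraps both directions of the connection in raw deflate
	// streams, as after a successful COMPRESS DEFLATE command. Outgoing
	// protocol data must be flushed through the compressor, otherwise replies
	// can sit in its buffer while the client waits.
	func wrapDeflate(conn net.Conn) (io.ReadCloser, *flate.Writer, error) {
		w, err := flate.NewWriter(conn, flate.DefaultCompression)
		if err != nil {
			return nil, nil, err
		}
		r := flate.NewReader(conn)
		return r, w, nil
	}
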
so we can easily see the exact bytes on the wire, instead of having \n's
expanded as newlines. much easier to read. we had this in the past, but it must
have been lost in a refactor.
they are intended to be used by the server to automatically mark some messages
as important, based on server-defined heuristics. we don't have such heuristics
at the moment. perhaps in the future, but until then there are no plans.