<style>
pre.mermaid { display: flex !important; justify-content: center }
</style>
# Things we wish we knew before starting an IMAP library
Note:
- Presentation (30 min)
- Intro (5 min) (Simon) [10 min]
- Introduce ourselves
- Short IMAP Intro
- Layers
- Types + Syntax (8 min) (Damian)
- Flows (5 min) (Simon) [7 min]
- Operations/Semantics (5 min) (Simon)
- Extensions (2 min) (Simon)
- Q/A (5 min)
---
# Introduction
----
Simon (https://emersion.fr)
Damian (https://duesee.dev/about)
----
## Why IMAP?
- Fetch messages from a mail server
- Organize them in _mailboxes_
- Synchronize from multiple clients/devices
Note:
- Give examples for what is a "mailbox": inbox, archive, spam, drafts…
- "Folder"
- Give use-case for multi-client: laptop + mobile + workstation
- Start writing draft on laptop and finish+send it on mobile
- (Move message to archive mailbox on laptop, see the change reflected on workstation)
----
## Basic mode of operation
- Open TCP connection
- Write command
- Read response(s)
----
## Example command/response
```text [1|2]
C: cmd1 LOGIN "root@nsa.gov" "hunter1"
S: cmd1 OK You're in!
```
Note:
Explain "C:" and "S:"
Explain tags, explain OK response
Stuff after OK is a comment, ignored
----
## Example command/response
```text [1|2-3|4]
C: cmd2 FETCH 1:* (FLAGS ENVELOPE)
S: * 1 FETCH (FLAGS (\Seen \Important) ENVELOPE (…))
S: * 2 FETCH (FLAGS () ENVELOPE (…))
S: cmd2 OK …
```
Note:
- Explain how data is returned
- Note how tag is not included in data responses
- Note that server can notify you of arbitrary updates before end
- Explain basic parenthesized list stuff
- But not really consistent, see the syntax
- Explain what the command/responses mean, hand-wave seq sets for later
TODO: expand envelope?
----
## Referring to a message
32-bit unsigned integers, one of:
- UIDs
- Sequence numbers
<div class="fragment">
```text [1|1-3]
[4, 6, 12]
↓ delete UID 6
[4, 12]
```
</div>
Note:
- UIDs
- Don't change, except when UIDVALIDITY does
- Increase when a message is added to the mailbox
- Sequence numbers
- Ordinal number
- Start at 1
- Grow the same way as UIDs: ordered by when the message was added to the mailbox
- No gaps
- Many operations invalidate/reassign sequence numbers
- Indicated by server messages (EXPUNGE and EXISTS)
- Message data is immutable
----
## Referring to multiple messages
```text [1|2|3|4]
1
2:4
2:4,6:10
1:*
```
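
A minimal sketch of parsing the four forms above, assuming the input is already isolated from the rest of the command. The names `SeqNo` and `parse_sequence_set` are made up for this slide and come from neither go-imap nor imap-codec.

```rust
/// One end of a range: a concrete number, or `*` (the largest number in use).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SeqNo {
    Value(u32),
    Largest,
}

/// Parses a sequence set such as `2:4,6:10` into `(start, end)` pairs.
/// A single number `n` is returned as the pair `(n, n)`.
pub fn parse_sequence_set(input: &str) -> Option<Vec<(SeqNo, SeqNo)>> {
    fn seq_no(s: &str) -> Option<SeqNo> {
        if s == "*" {
            return Some(SeqNo::Largest);
        }
        // Numbers are non-zero and unsigned: reject "", "0", "01", "+1", ...
        if s.is_empty() || s.starts_with('0') || !s.bytes().all(|b| b.is_ascii_digit()) {
            return None;
        }
        // `parse` also rejects values that do not fit into 32 bits.
        s.parse::<u32>().ok().map(SeqNo::Value)
    }

    input
        .split(',')
        .map(|item| match item.split_once(':') {
            Some((a, b)) => Some((seq_no(a)?, seq_no(b)?)),
            None => seq_no(item).map(|n| (n, n)),
        })
        .collect()
}
```

`*` stays symbolic here because its value depends on the mailbox state at the time the command runs.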
---
# Agenda (Layers)
* Types
* Syntax
* Flows
* Operations
* Backend
Note:
go-imap
* imap (toplevel package, shared declarations between client and server)
* imapwire (core syntax)
* imapclient, imapserver
* backend
imap-codec
* imap-types
* imap-codec (syntax)
* ...
---
# Types
----
|Client |Server |
|------------------|-------------------|
|<p class="fragment" data-fragment-index="1">Command<br>(only serialization)</p>|<p class="fragment" data-fragment-index="2">Command<br>(only parsing)</p> |
|<p class="fragment" data-fragment-index="1">Response<br>(only parsing)</p> |<p class="fragment" data-fragment-index="2">Response<br>(only serialization)</p>|
----
### Overlap in Types, Rules, & Functionality
* `Tag`,
* `AString`, `Atom`, `IString`, ...
* `DataItem{Name,}`, ...
* ...
----
#### Structure your code such that your implementation can be easily expanded
* Start with `client` or `server` module
* Use `shared` module
* ...
----
### Advantage: Testing
```rust
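// Round trip 1: serialize, then parse, and expect the original message.
let message = /* random */;
assert_eq!(message, parse(&serialize(&message)));
// Round trip 2: parse, then serialize, and expect the original bytes.
let bytes = /* random */;
assert_eq!(bytes, serialize(&parse(&bytes)));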
```
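
To make the round-trip idea concrete, here is a small sketch for one shared type, the quoted string. `serialize_quoted` and `parse_quoted` are hypothetical helpers written for this slide, not functions from an existing library, and they assume the value contains no CR, LF, NUL, or non-ASCII bytes.

```rust
/// Serializes `s` as an IMAP quoted string, escaping `"` and `\`.
/// Assumption: `s` has no CR, LF, NUL, or non-ASCII bytes (those need a literal).
pub fn serialize_quoted(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        if c == '"' || c == '\\' {
            out.push('\\');
        }
        out.push(c);
    }
    out.push('"');
    out
}

/// Parses one complete quoted string; returns `None` on malformed input.
pub fn parse_quoted(input: &str) -> Option<String> {
    let inner = input.strip_prefix('"')?.strip_suffix('"')?;
    let mut out = String::new();
    let mut chars = inner.chars();
    while let Some(c) = chars.next() {
        match c {
            // A backslash must be followed by `"` or `\`.
            '\\' => match chars.next()? {
                esc @ ('"' | '\\') => out.push(esc),
                _ => return None,
            },
            // An unescaped quote inside the string is an error.
            '"' => return None,
            _ => out.push(c),
        }
    }
    Some(out)
}
```

The message-to-bytes-to-message direction holds for every valid value. The bytes-to-message-to-bytes direction only holds when the serializer picks the same spelling the input used.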

---
# Syntax
----
## State of mind
----

Note:
Quote from Mark Crispin. Count how many times I say "Formal Syntax".
"First and foremost, the Formal Syntax [...] should be your holy book.
If any part of [the standard] distracts you from the Formal Syntax, ignore it in favor of the Formal Syntax."
----

Note:
"Your jaw will drop when you first see the Formal Syntax."
----
Simplified graph of rule dependencies

Note:
"Your eyes will glaze over. You will start saying "no, no, no." Just work through that stage."
"It's a steep hill to climb, but once you make it to the top ..."
----

Note:
"... you will see everything with crystal clarity."
----

Note:
"[And] Whatever you do, DO NOT ATTEMPT TO IMPLEMENT ANY COMMAND OR RESPONSE BY LOOKING AT THE EXAMPLES!"
* Not in errata
* Obsoleted by RFC 7162
* Same mistake (but has errata now)
----
### What Mark said (+ extra)
* Use the Formal Syntax
* Learn ABNF
* Lexer -> Nope
* "arguments invalid" is confusing!
* Parser generator -> Wait for it ...
----
### "Layers" of Formal Syntax
* ABNF core rules
* IMAP strings
* IMAP messages
----
### IMAP strings
```abnf
; LOGIN command with all rules inlined.
command_login = tag SP "LOGIN" SP astring SP astring CRLF
; ^^^^^^^ ^^^^^^^
; | |
; Username Password
```
Note:
* Innocent looking `astring`
----

----
<div class="fragment">
#### Atom
```text
password
```
</div>
<div class="fragment">
#### Quoted
```text
"Let's use a \"passphrase\""
```
</div>
<div class="fragment">
#### Literal
```text
{56}
Dear Bob,
Here is your password ...
Best,
Alice
```
</div>
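
A sketch of how a serializer might choose between the three forms for an `astring`, following the character classes of the RFC 3501 Formal Syntax as we read them. `AStringForm` and `pick_form` are illustrative names only.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum AStringForm {
    Atom,
    Quoted,
    Literal,
}

/// Picks the simplest form that can carry `value`.
/// Note: a plain literal cannot carry NUL bytes either; that case is not handled here.
pub fn pick_form(value: &[u8]) -> AStringForm {
    // ASTRING-CHAR: printable ASCII except ( ) { SP % * " \
    let is_astring_char = |b: u8| {
        (0x21..=0x7e).contains(&b)
            && !matches!(b, b'(' | b')' | b'{' | b'%' | b'*' | b'"' | b'\\')
    };
    // TEXT-CHAR: 7-bit, non-NUL, no CR or LF. `"` and `\` are fine once escaped.
    let is_text_char = |b: u8| (0x01..=0x7f).contains(&b) && b != b'\r' && b != b'\n';

    if !value.is_empty() && value.iter().all(|&b| is_astring_char(b)) {
        AStringForm::Atom
    } else if value.iter().all(|&b| is_text_char(b)) {
        AStringForm::Quoted
    } else {
        AStringForm::Literal
    }
}
```

With the three examples above, `password` comes out as an atom, the passphrase as a quoted string, and the multi-line letter as a literal.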
----
## Ambiguities & Defects
* Defects
* https://github.com/modern-email/defects
* Interop
* https://dovecot.github.io/imaptest
----
## interruption for an announcement
IMAP knowledge is disappearing ...
----

Note:
* Great thread from mailing list
* IMAP knowledge is disappearing
----
https://meli-email.org/

---
# Flow/Framing
How do we split IMAP into separate commands/responses?
Note:
Context-dependent
----
```text=
C: a001 LOGIN simon secret
S: a001 OK ...
C: a002 SELECT INBOX
S: * 18 EXISTS
S: a002 OK ...
```
Note:
Can we split this into messages? Split on newline?
----
```text [|1,3|2]
C: a001 LOGIN simon {6}
S: + ...
C: secret
S: a001 OK ...
C: a002 SELECT INBOX
S: * 18 EXISTS
S: a002 OK ...
```
Note:
Lines 1 and 3 are same command
Server needs to acknowledge the literal
Client needs to wait
----
<!-- .slide: data-transition="slide-in none-out" -->
Looking at the client side only:
```text [|1,3,5]
C: a001 LOGIN simon {6}
S: + ...
C: secret
S: a001 OK ...
C: a002 NOOP
S: a002 OK ...
```
Note:
Do clients really need to care about the continuation?
----
<!-- .slide: data-transition="none" -->
Looking at the client side only:
```text=
C: a001 LOGIN simon {6}
C: secret
C: a002 NOOP
```
----
<!-- .slide: data-transition="none" -->
What about this one?
```text=
C: a001 LOGIN simon {6}
C: a002 NOOP
```
----
<!-- .slide: data-transition="none-in slide-out" -->
What about this one?
```text=
C: a001 LOGIN simon {6}
S: a001 NO ...
C: a002 NOOP
S: a002 OK ...
```
Note:
Server can reject literals
----
Sending literals without waiting
```text [|2-4]
C: a001 APPEND INBOX {4242}
C: inject001 STORE 1:* FLAGS (\Deleted)
C: inject002 EXPUNGE
C: ....
S: a001 NO ...
```
Note:
Waiting for server ACK is important for security
Injection potentially possible in replied-to message
----
#### Lessons learned
- Literals can appear anywhere
- Literals interrupt the regular syntax
- Cannot parse by looking at a single side of the connection
- Important to wait for server to accept literals
<!--
* *Apparently* some higher-level protocol is involved…
* But: Literals are tightly coupled to the syntax!
* You can't "just detect `{n}\r\n`"
* What about:
`* OK Hello, your ID is {1337}`
-->
Note:
* Literals interrupt the regular syntax
* Painful to implement zero-copy streaming (unlike say, shell here-documents)
* Are *all-over-the-place* (and people forget about them, e.g., in SEARCH)
* Parser needs to understand the semantics
* Proxy needs to understand the semantics
* -> Must filter unknown extensions!
<!--
```mermaid
stateDiagram-v2
[*] --> send_line
send_line --> send_line: line
send_line --> send_literal: line
send_literal --> send_line: continue / literal
send_literal --> send_literal: continue / literal + line
send_literal --> send_line: abort
```
-->
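
To illustrate why literals cannot be detected by pattern alone, here is a deliberately naive check. `naive_literal_len` is a hypothetical helper and is not meant as a recommended approach; it assumes the trailing CRLF was already stripped.

```rust
/// Naive check: does this line end in `{n}`?
/// Returns the announced literal length if so.
pub fn naive_literal_len(line: &str) -> Option<u32> {
    let rest = line.strip_suffix('}')?;
    let (_, digits) = rest.rsplit_once('{')?;
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok()
}
```

It gives the right answer for `a001 LOGIN simon {6}`, but it also reports a 1337-byte literal for the greeting `* OK Hello, your ID is {1337}`, where the braces are just human-readable text. Only a parser that knows which rule it is in can tell the two apart.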
----
## AUTHENTICATE
```text=
C: a001 AUTHENTICATE PLAIN
S: + ...
C: AA==
```
Note:
Explain what AUTHENTICATE is
Context-sensitive, "asdasdasdkajhasdjkh==" (AuthenticateData) is basically a special message
----
## IDLE
```text=
C: a001 IDLE
S: + ...
C: DONE
```
Note:
Context-sensitive, "DONE" is basically a special message
----
## STARTTLS/COMPRESS
Note:
Fabian will talk a lot about it already
----
## Flows: summary
IMAP *demands* that you conflate parsing with business logic
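
A toy sketch of what this means on the server side: the same client line is interpreted differently depending on what happened before. `Context`, `ClientLine`, and `classify` are names invented for this slide, and real command parsing is far more involved.

```rust
/// What the server is currently waiting for (simplified, literals left out).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Context {
    Command,
    Authenticate,
    Idle,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ClientLine<'a> {
    Command { tag: &'a str, rest: &'a str },
    AuthenticateData(&'a str),
    IdleDone,
    Invalid,
}

/// Classifies one client line (without CRLF) in the given context.
pub fn classify(ctx: Context, line: &str) -> ClientLine<'_> {
    match ctx {
        // Regular commands start with a tag, followed by a space.
        Context::Command => match line.split_once(' ') {
            Some((tag, rest)) if !tag.is_empty() && !rest.is_empty() => {
                ClientLine::Command { tag, rest }
            }
            _ => ClientLine::Invalid,
        },
        // After AUTHENTICATE and "+", the whole line is SASL data: no tag.
        Context::Authenticate => ClientLine::AuthenticateData(line),
        // After IDLE and "+", only DONE is acceptable: no tag either.
        Context::Idle if line.eq_ignore_ascii_case("DONE") => ClientLine::IdleDone,
        Context::Idle => ClientLine::Invalid,
    }
}
```

`DONE` is a valid message in the `Idle` context and a syntax error in the `Command` context, so the framing layer has to know which command is in flight.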
---
# Operations & Semantics
----
## Fetching messages
- `ENVELOPE`: from/to/subject/etc
- `BODYSTRUCTURE`: tree of MIME parts
- `BODY[]`: full message body
----
## Fetching messages
```http
From: <root@nsa.gov>
Subject: Hiya
Howdy?
```
```text [1|2|3]
BODY[]
BODY[HEADER]
BODY[TEXT]
```
----
## Fetching messages
```http
From: <root@nsa.gov>
Subject: Hiya
Howdy?
```
```text [1|2|3]
BODY[HEADER.FIELDS (From)]
BODY[HEADER.FIELDS.NOT (Subject)]
BODY[TEXT]<0.2>
```
<!--
### Fetching messages
<img src="https://hackmd.io/_uploads/SkPhzWe5a.png" style="height:600px;">
-->
----
## Fetching messages
```http [|1|4-6|8-10|]
Content-Type: multipart/mixed; boundary=foo
--foo
Content-Disposition: inline
Howdy? Attached is your new password.
--foo
Content-Disposition: attachment; filename="password.txt"
hunter2
--foo--
```
```text [|1|2|3]
BODY[1]
BODY[1.HEADER]
BODY[1.MIME]
```
Note:
- `BODY[1]` doesn't include the header (different from `BODY[]`!)
- `HEADER` only for toplevel, `MIME` only for child parts
----
## Fetching messages
```http=
Content-Type: multipart/mixed; boundary=foo
--foo
Content-Disposition: attachment; filename="previous.eml"
Content-Type: message/rfc822
From: <root@nsa.gov>
Subject: Hiya
Howdy?
--foo--
```
----
## Fetching messages
```
multipart/mixed
├─ multipart/alternative
│ ├─ text/plain
│ └─ text/html
└─ message/rfc822
└─ multipart/mixed
├─ text/plain
└─ image/png
```
```text=
BODY[1.2]
BODY[2.HEADER]
BODY[2.MIME]
BODY[2.1.TEXT]
```
Note:
Part 2 refers to both the `message/rfc822` and the `multipart/mixed`
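
A sketch of resolving numeric part paths against the tree above, assuming the message has already been parsed into a hypothetical `Part` tree. It covers only the numeric prefix, not `HEADER`, `MIME`, or `TEXT`.

```rust
#[derive(Debug)]
pub enum Part {
    Leaf(&'static str),
    Multipart(&'static str, Vec<Part>),
    /// A `message/rfc822` part wrapping another message.
    Message(Box<Part>),
}

impl Part {
    pub fn media_type(&self) -> &'static str {
        match self {
            Part::Leaf(t) | Part::Multipart(t, _) => *t,
            Part::Message(_) => "message/rfc822",
        }
    }
}

/// Resolves a numeric part path such as `[2, 1]` (from `BODY[2.1]`).
/// Lenient sketch: a redundant trailing `.1` on a leaf is accepted too.
pub fn resolve<'a>(root: &'a Part, path: &[usize]) -> Option<&'a Part> {
    let mut current = root;
    for &n in path {
        // message/rfc822 is transparent: numbering continues in the embedded message.
        let container = match current {
            Part::Message(inner) => &**inner,
            other => other,
        };
        current = match container {
            // Part numbers start at 1, so 0 is rejected by `checked_sub`.
            Part::Multipart(_, children) => children.get(n.checked_sub(1)?)?,
            // A non-multipart body only has part 1: itself.
            single if n == 1 => single,
            _ => return None,
        };
    }
    Some(current)
}
```

With the tree from the slide, `[2]` resolves to the `message/rfc822` part, while `[2, 1]` skips straight through it into the nested `multipart/mixed` and lands on its `text/plain` child.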
----
## Unilateral server data
```text
C: a001 FETCH 1 BODY[]
S: * 1 FETCH (BODY[] "Hello world!")
S: a001 OK FETCH completed
```
```text
C: a001 NOOP
S: * 1 FETCH (FLAGS (\Important))
S: a001 OK NOOP completed
```
<!-- .element: class="fragment" -->
Note:
Client didn't ask for data, but server sends it.
----
## Unilateral server data
```text
C: a001 FETCH 1 BODY[]
S: * 1 FETCH (BODY[] "Hello world!")
S: * 1 FETCH (FLAGS (\Important))
S: a001 OK FETCH completed
```
Note:
IMAP isn't really designed to be implemented as an RPC (because the response data doesn't include the request tag)
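
A sketch of the first dispatch step a client needs, assuming complete lines without CRLF and ignoring literals. `ResponseKind` and `classify_response` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ResponseKind<'a> {
    /// `* ...`: data or status that carries no command tag.
    Untagged(&'a str),
    /// `+ ...`: the server asks the client to continue.
    Continuation(&'a str),
    /// `<tag> ...`: completes the command that was sent with this tag.
    Tagged { tag: &'a str, rest: &'a str },
}

/// Splits off the first token and decides how the line must be routed.
pub fn classify_response(line: &str) -> Option<ResponseKind<'_>> {
    let (head, rest) = line.split_once(' ')?;
    match head {
        "" => None,
        "*" => Some(ResponseKind::Untagged(rest)),
        "+" => Some(ResponseKind::Continuation(rest)),
        tag => Some(ResponseKind::Tagged { tag, rest }),
    }
}
```

Only the `Tagged` variant can be matched to a pending command. Everything in `Untagged` has to go to code that updates the client's view of the mailbox, whether or not a command asked for it.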
----
# Extensions
Extensions are more like "amendments".
* Can fundamentally alter syntax, flows, operations, etc.
* IDLE, COMPRESS
* EXTENDED-LIST, ESEARCH, LITERAL+
Note:
IMAP is not modular, it's monolithic.
go-imap v1 design mistake
---
# Questions
----
Note:
TODO?
* https://paste.sr.ht/~emersion/a45ca9c3236e35bb79c64e08cb0bd6fc35424cf2
* IDLE doesn't *need* to exist, but fixes reality
* Allowed to change hierarchy separator?
* PREAUTH and STARTTLS conflict (Fabian will mention it in his STARTTLS talk)
* Servers "basically should ignore pipelining" (Mark Crispin on imap-protocol mailing list)
* APPEND is special
* OK/NO/BAD is confusing (or wrong)
* https://github.com/modern-email/questions_imap/issues/3
* Interoperability required
* LOGIN
* How does go-imap provide server interface?
https://github.com/emersion/go-imap/blob/5a52b99cd03a3f30be219cec1982b379f5257221/imapserver/session.go#L50
Helpers for seq num mapping:
https://github.com/emersion/go-imap/blob/5a52b99cd03a3f30be219cec1982b379f5257221/imapserver/tracker.go#L10
* Pipelining
----
# Backup
----
## States
```mermaid
graph TD
notauth(Not authenticated) --> auth(Authenticated)
auth --> selected(Selected)
selected --> auth
selected --> selected
```
Note:
Cannot fetch messages from an unselected mailbox.
In selected state, updates about the mailbox are received.
Cannot receive updates about multiple mailboxes over a single connection.
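
The diagram can be written down as a small transition check. This sketch covers only the three states and four edges shown above and leaves out the logout state; `State` and `can_transition` are illustrative names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    NotAuthenticated,
    Authenticated,
    Selected,
}

/// Returns true if the diagram has an edge from `from` to `to`.
pub fn can_transition(from: State, to: State) -> bool {
    use State::*;
    matches!(
        (from, to),
        (NotAuthenticated, Authenticated)   // successful login
            | (Authenticated, Selected)     // a mailbox is selected
            | (Selected, Authenticated)     // the mailbox is closed
            | (Selected, Selected)          // another mailbox is selected
    )
}
```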
----
## Recursive rules
----

----
## Ambiguities
----
```abnf
continue-req = "+" SP (resp-text / base64) CRLF
resp-text = ...
base64 = ...
```
```text
S: AA== // text or base64?
```
----
```text
S: * OK ...
S: * OK [CODE] ...
S: * OK [CODE] // What's this?
S: * OK [CODE // What's this?
```
* Good: Definitive (?) answer
* Bad: Want to fix your implementation?
Note:
Mark Crispin said "Yes" :P The IMAP specification says: "In the case of alternative or optional rules in which a later rule overlaps an earlier rule, the rule which is listed earlier MUST take priority."
imap-types: Need to forbid `[]` in `text`
----
## `()` vs. `NIL`
----
```text
// Dovecot
C: A ID ()
S: * ID ("name" "Dovecot")
S: A OK ID completed.
```
```text
// Outlook
C: A ID ()
S: A BAD ID failed
```
----
## `()()` vs. `() ()`
```abnf
env-cc = "(" 1*address ")" / nil
; ...
address = "("
addr-name SP
addr-adl SP
addr-mailbox SP
addr-host
")"
```
----
## Missing rules
----
Cancellation of SASL authentication, i.e., ...
`*\r\n`
... is not mentioned in ABNF.
----
## INBOX has 96 variants
```IMAP
inbox
...
INBOX
"inbox"
"INBOX"
{5}\r\ninbox
...
{5}\r\nINBOX
```
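
Where the 96 comes from: 2⁵ casings of the five letters, times three string forms. A small sketch that enumerates them; `inbox_variants` is a name made up for this slide.

```rust
/// Returns every spelling of INBOX as an atom, a quoted string, and a literal.
pub fn inbox_variants() -> Vec<String> {
    let mut variants = Vec::new();
    // Each of the 5 letters is independently upper- or lowercase: 2^5 = 32 casings.
    for mask in 0u32..32 {
        let name: String = "inbox"
            .chars()
            .enumerate()
            .map(|(i, c)| if mask & (1 << i) != 0 { c.to_ascii_uppercase() } else { c })
            .collect();
        variants.push(name.clone()); // atom
        variants.push(format!("\"{}\"", name)); // quoted
        variants.push(format!("{{5}}\r\n{}", name)); // literal
    }
    variants
}
```

A server has to map all of them to the same mailbox.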
----
## INBOX (is confusing)
```text
iNbOX/subfolder
iNbOX.subfolder
iNbOXxsubfolder
```
* Hierarchy separator is variable
* "(...) slash should be the One And Only True Hierarchy Delimiter. Almost everybody agrees that it was a mistake to allow others."
* Not (entirely) sure
* Likely: No special treatment
----
## Sequence numbers with multiple clients
```text
C₁: SELECT INBOX | C₂: SELECT INBOX
S₁: 42 EXISTS | S₂: 42 EXISTS
|
C₁: STORE 42 +FLAGS \Deleted | C₂: APPEND INBOX {1024}
C₁: EXPUNGE | C₂: …
S₁: 42 EXPUNGE | S₂: 43 EXISTS
```
What happens with the following commands?
```text
C₁: FETCH 42 | C₂: FETCH 42
C₁: FETCH 43 | C₂: FETCH 43
```
Note:
Either new visuals, or delete
----
## Sequence numbers with multiple clients
```text
C₁: SELECT INBOX | C₂: SELECT INBOX
S₁: 42 EXISTS | S₂: 42 EXISTS
|
C₁: STORE 42 +FLAGS \Deleted | C₂: APPEND INBOX {1024} …
C₁: EXPUNGE | S₂: 43 EXISTS
S₁: 42 EXPUNGE |
|
C₁: NOOP | C₂: NOOP
S₁: 42 EXISTS | S₂: 42 EXPUNGE
```
Note:
Server keeps a per-client view of message sequence numbers
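
A sketch of such a per-client view, assuming for illustration that the 42 messages have UIDs 1 to 42 and the appended one gets UID 43. `SessionView` is a hypothetical type, unrelated to the tracker in go-imap.

```rust
/// One connection's view of the selected mailbox: UIDs ordered by sequence number.
pub struct SessionView {
    uids: Vec<u32>,
}

impl SessionView {
    pub fn new(uids: Vec<u32>) -> Self {
        SessionView { uids }
    }

    /// Tells this session about a new message; returns the untagged response.
    pub fn deliver_append(&mut self, uid: u32) -> String {
        self.uids.push(uid);
        format!("* {} EXISTS", self.uids.len())
    }

    /// Tells this session about an expunged message, if it still knows it.
    pub fn deliver_expunge(&mut self, uid: u32) -> Option<String> {
        let index = self.uids.iter().position(|&u| u == uid)?;
        self.uids.remove(index);
        // Sequence numbers are 1-based positions in this session's own view.
        Some(format!("* {} EXPUNGE", index + 1))
    }
}
```

Delivering the expunge of UID 42 to the first session and the append of UID 43 to the second, then the remaining update to each at `NOOP`, reproduces the slide: the first session sees `42 EXPUNGE` then `42 EXISTS`, the second sees `43 EXISTS` then `42 EXPUNGE`.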