<style>
pre.mermaid {
  display: flex !important;
  justify-content: center
}
</style>

# Things we wish we knew before starting an IMAP library

Note:

- Presentation (30 min)
    - Intro (5 min) (Simon) [10 min]
        - Introduce ourselves
        - Short IMAP Intro
        - Layers
    - Types + Syntax (8 min) (Damian)
    - Flows (5 min) (Simon) [7 min]
    - Operations/Semantics (5 min) (Simon)
    - Extensions (2 min) (Simon)
    - Q/A (5 min)

---

# Introduction

----

Simon (https://emersion.fr)

Damian (https://duesee.dev/about)

----

## Why IMAP?

- Fetch messages from a mail server
- Organize them in _mailboxes_
- Synchronize from multiple clients/devices

Note:

- Give examples for what is a "mailbox": inbox, archive, spam, drafts…
    - "Folder"
- Give use-case for multi-client: laptop + mobile + workstation
    - Start writing draft on laptop and finish+send it on mobile
    - (Move message to archive mailbox on laptop, see the change reflected on workstation)

----

## Basic mode of operation

- Open TCP connection
- Write command
- Read response(s)

----

## Example command/response

```text [1|2]
C: cmd1 LOGIN "root@nsa.gov" "hunter1"
S: cmd1 OK You're in!
```

Note:

Explain "C:" and "S:"
Explain tags, explain OK response
Stuff after OK is a human-readable comment, ignored by the client

----

## Example command/response

```text [1|2-3|4]
C: cmd2 FETCH 1:* (FLAGS ENVELOPE)
S: * 1 FETCH (FLAGS (\Seen \Important) ENVELOPE (…))
S: * 2 FETCH (FLAGS () ENVELOPE (…))
S: cmd2 OK …
```

Note:

- Explain how data is returned
    - Note how tag is not included in data responses
    - Note that server can notify you of arbitrary updates before end
- Explain basic parenthesized list stuff
    - But not really consistent, see the syntax
- Explain what the command/responses mean, hand-wave seq sets for later

TODO: expand envelope?
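----

A minimal sketch of how a client might sort the response lines from the examples above into tagged, untagged, and continuation responses. `ResponseKind` and `classify` are hypothetical names for illustration, not taken from go-imap or imap-codec. The sketch assumes exactly one response per line, which stops being true as soon as literals are involved.

```rust
/// Coarse classification of one server response line.
/// Hypothetical type for illustration; not from any IMAP library.
#[derive(Debug, PartialEq)]
enum ResponseKind<'a> {
    /// `* ...`: untagged data, carries no command tag.
    Untagged(&'a str),
    /// `+ ...`: continuation request.
    Continuation(&'a str),
    /// `<tag> ...`: status response that completes the command with that tag.
    Tagged { tag: &'a str, rest: &'a str },
}

/// Classify a single line by its first space-separated token.
/// Assumption: the line contains no literals.
fn classify(line: &str) -> Option<ResponseKind<'_>> {
    // Strip the trailing CRLF, if present.
    let line = line.trim_end_matches(&['\r', '\n'][..]);
    let (head, rest) = line.split_once(' ')?;
    match head {
        "*" => Some(ResponseKind::Untagged(rest)),
        "+" => Some(ResponseKind::Continuation(rest)),
        "" => None,
        tag => Some(ResponseKind::Tagged { tag, rest }),
    }
}
```

With this, `* 1 FETCH (…)` is classified as untagged data, while `cmd2 OK …` completes the command tagged `cmd2`. Only the tagged line tells the client which command it belongs to.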
----

## Referring to a message

32-bit unsigned integers, one of:

- UIDs
- Sequence numbers

<div class="fragment">

```text [1|1-3]
[4, 6, 12]
    ↓ delete UID 6
[4, 12]
```

</div>

Note:

- UIDs
    - Don't change, except when UIDVALIDITY does
    - Increase when a message is added to the mailbox
- Sequence numbers
    - Ordinal number
    - Start at 1
    - Grow the same way as UIDs: sorted by date added in mailbox
    - No gaps
    - Many operations invalidate/reassign sequence numbers
        - Indicated by server messages (EXPUNGE and EXISTS)
- Message data is immutable

----

## Referring to multiple messages

```text [1|2|3|4]
1
2:4
2:4,6:10
1:*
```

---

# Agenda (Layers)

* Types
* Syntax
* Flows
* Operations
* Backend

Note:

go-imap

* imap (toplevel package, shared declarations between client and server)
* imapwire (core syntax)
* imapclient, imapserver
* backend

imap-codec

* imap-types
* imap-codec (syntax)
* ...

---

# Types

----

|Client |Server |
|------------------|-------------------|
|<p class="fragment" data-fragment-index="1">Command</br>(only serialization)</p>|<p class="fragment" data-fragment-index="2">Command</br>(only parsing)</p> |
|<p class="fragment" data-fragment-index="1">Response</br>(only parsing)</p> |<p class="fragment" data-fragment-index="2">Response</br>(only serialization)</p>|

----

### Overlap in Types, Rules, & Functionality

* `Tag`,
* `AString`, `Atom`, `IString`, ...
* `DataItem{Name,}`, ...
* ...

----

#### Structure your code such that your implementation can be easily expanded

* Start with `client` or `server` module
* Use `shared` module
* ...

----

### Advantage: Testing

```rust
// Round trip 1: parsing a serialized message yields the same message.
let message = /* random */;
assert_eq!(message, parse(&serialize(&message)));

// Round trip 2: serializing parsed bytes yields the same bytes.
let bytes = /* random */;
assert_eq!(bytes, serialize(&parse(&bytes)));
```

![upload_d46efd7440fedbc7d1e74b067fdf7a03](https://hackmd.io/_uploads/ry566FUqp.png)

---

# Syntax

----

## State of mind

----

![image](https://hackmd.io/_uploads/SkAV6-EuT.png)

Note:

Citation from Mark Crispin; Count how many times I say the "Formal Syntax".

"First and foremost, the Formal Syntax [...] should be your holy book. If any part of [the standard] distracts you from the Formal Syntax, ignore it in favor of the Formal Syntax."

----

![image](https://hackmd.io/_uploads/SyAtAZVup.png)

Note:

"Your jaw will drop when you first see the Formal Syntax."

----

Simplified graph of rule dependencies

![image](https://hackmd.io/_uploads/HJ4mBHJwa.png)

Note:

"Your eyes will glaze over. You will start saying "no, no, no." Just work through that stage."

"It's a steep hill to climb, but once you make it to the top ..."

----

![image](https://hackmd.io/_uploads/rkv_i7pt6.png)

Note:

"... you will see everything with crystal clarity."

----

![image](https://hackmd.io/_uploads/H1Ua3EEdT.png)

Note:

"[And] Whatever you do, DO NOT ATTEMPT TO IMPLEMENT ANY COMMAND OR RESPONSE BY LOOKING AT THE EXAMPLES!"

* Not in errata
* Obsoleted by RFC 7162
* Same mistake (but has errata now)

----

### What Mark said (+ extra)

* Use the Formal Syntax
* Learn ABNF
* Lexer -> Nope
    * "arguments invalid" is confusing!
* Parser generator -> Wait for it ...

----

### "Layers" of Formal Syntax

* ABNF core rules
* IMAP strings
* IMAP messages

----

### IMAP strings

```abnf
; LOGIN command with all rules inlined.
command_login = tag SP "LOGIN" SP astring SP astring CRLF
;                                 ^^^^^^^    ^^^^^^^
;                                 |          |
;                                 Username   Password
```

Note:

* Innocent looking `astring`

----

![image](https://hackmd.io/_uploads/rkv_i7pt6.png)

----

<div class="fragment">

#### Atom

```text
password
```

</div>

<div class="fragment">

#### Quoted

```text
"Let's use a \"passphrase\""
```

</div>

<div class="fragment">

#### Literal

```text
{56}
Dear Bob,

Here is your password ...

Best,
Alice
```

</div>

----

## Ambiguities & Defects

* Defects
    * https://github.com/modern-email/defects
* Interop
    * https://dovecot.github.io/imaptest

----

## interruption for an announcement

IMAP knowledge is disappearing ...
----

![image](https://hackmd.io/_uploads/HkZXF4EdT.png)

Note:

* Great thread from the mailing list
* IMAP knowledge is disappearing

----

https://meli-email.org/

![image](https://hackmd.io/_uploads/rJvJKUUq6.png)

---

# Flow/Framing

How do we split IMAP into separate commands/responses?

Note:

Context-dependent

----

```text=
C: a001 LOGIN simon secret
S: a001 OK ...
C: a002 SELECT INBOX
S: * 18 EXISTS
S: a002 OK ...
```

Note:

Can we split this into messages? Split on newline?

----

```text [|1,3|2]
C: a001 LOGIN simon {6}
S: + ...
C: secret
S: a001 OK ...
C: a002 SELECT INBOX
S: * 18 EXISTS
S: a002 OK ...
```

Note:

Lines 1 and 3 are the same command
Server needs to acknowledge the literal
Client needs to wait

----

<!-- .slide: data-transition="slide-in none-out" -->

Looking at the client side only:

```text [|1,3,5]
C: a001 LOGIN simon {6}
S: + ...
C: secret
S: a001 OK ...
C: a002 NOOP
S: a002 OK ...
```

Note:

Do clients really need to care about the continuation?

----

<!-- .slide: data-transition="none" -->

Looking at the client side only:

```text=
C: a001 LOGIN simon {6}
C: secret
C: a002 NOOP
```

----

<!-- .slide: data-transition="none" -->

What about this one?

```text=
C: a001 LOGIN simon {6}
C: a002 NOOP
```

----

<!-- .slide: data-transition="none-in slide-out" -->

What about this one?

```text=
C: a001 LOGIN simon {6}
S: a001 NO ...
C: a002 NOOP
S: a002 OK ...
```

Note:

Server can reject literals

----

Sending literals without waiting

```text [|2-4]
C: a001 APPEND INBOX {4242}
C: inject001 STORE 1:* FLAGS (\Deleted)
C: inject002 EXPUNGE
C: ....
S: a001 NO ...
```

Note:

Waiting for server ACK is important for security
Injection potentially possible in replied-to message

----

#### Lessons learned

- Literals can appear anywhere
- Literals interrupt the regular syntax
- Cannot parse by looking at a single side of the connection
- Important to wait for server to accept literals

<!--
* *Apparently* some higher-level protocol is involved…
* But: Literals are tightly coupled to the syntax!
    * You can't "just detect `{n}\r\n`"
    * What about: `* OK Hello, your ID is {1337}`
-->

Note:

* Literals interrupt the regular syntax
    * Painful to implement zero-copy streaming (unlike say, shell here-documents)
* Are *all-over-the-place* (and people forget about them, e.g., in SEARCH)
* Parser needs to understand the semantics
* Proxy needs to understand the semantics
    * -> Must filter unknown extensions!

<!--
```mermaid
stateDiagram-v2
    [*] --> send_line
    send_line --> send_line: line
    send_line --> send_literal: line
    send_literal --> send_line: continue / literal
    send_literal --> send_literal: continue / literal + line
    send_literal --> send_line: abort
```
-->

----

## AUTHENTICATE

```text=
C: a001 AUTHENTICATE PLAIN
S: + ...
C: AA==
```

Note:

Explain what AUTHENTICATE is
Context-sensitive, "asdasdasdkajhasdjkh==" (AuthenticateData) is basically a special message

----

## IDLE

```text=
C: a001 IDLE
S: + ...
C: DONE
```

Note:

Context-sensitive, "DONE" is basically a special message

----

## STARTTLS/COMPRESS

Note:

Fabian will talk a lot about it already

----

## Flows: summary

IMAP *demands* that you conflate parsing with business logic

---

# Operations & Semantics

----

## Fetching messages

- `ENVELOPE`: from/to/subject/etc
- `BODYSTRUCTURE`: tree of MIME parts
- `BODY[]`: full message body

----

## Fetching messages

```http
From: <root@nsa.gov>
Subject: Hiya

Howdy?
```

```text [1|2|3]
BODY[]
BODY[HEADER]
BODY[TEXT]
```

----

## Fetching messages

```http
From: <root@nsa.gov>
Subject: Hiya

Howdy?
```

```text [1|2|3]
BODY[HEADER.FIELDS (From)]
BODY[HEADER.FIELDS.NOT (Subject)]
BODY[TEXT]<0.2>
```

<!--
### Fetching messages

<img src="https://hackmd.io/_uploads/SkPhzWe5a.png" style="height:600px;">
-->

----

## Fetching messages

```http [|1|4-6|8-10|]
Content-Type: multipart/mixed; boundary=foo

--foo
Content-Disposition: inline

Howdy? Attached is your new password.
--foo
Content-Disposition: attachment; filename="password.txt"

hunter2
--foo--
```

```text [|1|2|3]
BODY[1]
BODY[1.HEADER]
BODY[1.MIME]
```

Note:

- `BODY[1]` doesn't include the header (different from `BODY[]`!)
- `HEADER` only for toplevel, `MIME` only for child parts

----

## Fetching messages

```http=
Content-Type: multipart/mixed; boundary=foo

--foo
Content-Disposition: attachment; filename="previous.eml"
Content-Type: message/rfc822

From: <root@nsa.gov>
Subject: Hiya

Howdy?
--foo--
```

----

## Fetching messages

```
multipart/mixed
├─ multipart/alternative
│  ├─ text/plain
│  └─ text/html
└─ message/rfc822
   └─ multipart/mixed
      ├─ text/plain
      └─ image/png
```

```text=
BODY[1.2]
BODY[2.HEADER]
BODY[2.MIME]
BODY[2.1.TEXT]
```

Note:

Part 2 refers to both the `message/rfc822` and the `multipart/mixed`

----

## Unilateral server data

```text
C: a001 FETCH 1 BODY[]
S: * 1 FETCH (BODY[] "Hello world!")
S: a001 OK FETCH completed
```

```text
C: a001 NOOP
S: * 1 FETCH (FLAGS (\Important))
S: a001 OK NOOP completed
```

<!-- .element: class="fragment" -->

Note:

Client didn't ask for data, but server sends it.

----

## Unilateral server data

```text
C: a001 FETCH 1 BODY[]
S: * 1 FETCH (BODY[] "Hello world!")
S: * 1 FETCH (FLAGS (\Important))
S: a001 OK FETCH completed
```

Note:

IMAP isn't really designed to be implemented as an RPC (because the response data doesn't include the request tag)

----

# Extensions

Extensions are more like "amendments".

* Can fundamentally alter syntax, flows, operations, etc.
    * IDLE, COMPRESS
    * LIST-EXTENDED, ESEARCH, LITERAL+

Note:

IMAP is not modular, it's monolithic.
go-imap v1 design mistake

---

# Questions

----

Note:

TODO?

* https://paste.sr.ht/~emersion/a45ca9c3236e35bb79c64e08cb0bd6fc35424cf2
* IDLE doesn't *need* to exist, but fixes reality
* Allowed to change hierarchy separator?
* PREAUTH and STARTTLS conflict (Fabian will mention it in his STARTTLS talk)
* Servers "basically should ignore pipelining" (Mark Crispin on imap-protocol mailing list)
* APPEND is special
* OK/NO/BAD is confusing (or wrong)
    * https://github.com/modern-email/questions_imap/issues/3
* Interoperability required
* LOGIN
* How does go-imap provide server interface?
  https://github.com/emersion/go-imap/blob/5a52b99cd03a3f30be219cec1982b379f5257221/imapserver/session.go#L50
  Helpers for seq num mapping:
  https://github.com/emersion/go-imap/blob/5a52b99cd03a3f30be219cec1982b379f5257221/imapserver/tracker.go#L10
* Pipelining

----

# Backup

----

## States

```mermaid
graph TD
    notauth(Not authenticated) --> auth(Authenticated)
    auth --> selected(Selected)
    selected --> auth
    selected --> selected
```

Note:

Cannot fetch messages from an unselected mailbox.
In selected state, updates about the mailbox are received.
Cannot receive updates about multiple mailboxes over a single connection.

----

## Recursive rules

----

![image](https://hackmd.io/_uploads/SJBWKxE_a.png)

----

## Ambiguities

----

```abnf
continue-req = "+" SP (resp-text / base64) CRLF
resp-text    = ...
base64       = ...
```

```text
S: AA== // text or base64?
```

----

```text
S: * OK ...
S: * OK [CODE] ...
S: * OK [CODE]     // What's this?
S: * OK [CODE      // What's this?
```

* Good: Definitive (?) answer
* Bad: Want to fix your implementation?

Note:

Mark Crispin said "Yes" :P

The IMAP standard says: "In the case of alternative or optional rules in which a later rule overlaps an earlier rule, the rule which is listed earlier MUST take priority"

imap-types: Need to forbid `[]` in `text`

----

## `()` vs. `NIL`

----

```text
// Dovecot
C: A ID ()
S: * ID ("name" "Dovecot")
S: A OK ID completed.
```

```text
// Outlook
C: A ID ()
S: A BAD ID failed
```

----

## `()()` vs. `() ()`

```abnf
env-cc = "(" 1*address ")" / nil
; ...
address = "(" addr-name SP addr-adl SP addr-mailbox SP addr-host ")"
```

----

## Missing rules

----

Cancellation of SASL authentication, i.e., ...

`*\r\n`

... is not mentioned in ABNF.

----

## INBOX has 96 variants

```IMAP
inbox
...
INBOX
"inbox"
"INBOX"
{5}\r\ninbox
...
{5}\r\nINBOX
```

----

## INBOX (is confusing)

```text
iNbOX/subfolder
iNbOX.subfolder
iNbOXxsubfolder
```

* Hierarchy separator is variable
    * "(...) slash should be the One And Only True Hierarchy Delimiter. Almost everybody agrees that it was a mistake to allow others."
* Not (entirely) sure
    * Likely: No special treatment

----

## Sequence numbers with multiple clients

```text
C₁: SELECT INBOX             | C₂: SELECT INBOX
S₁: 42 EXISTS                | S₂: 42 EXISTS
                             |
C₁: STORE 42 +FLAGS \Deleted | C₂: APPEND INBOX {1024}
C₁: EXPUNGE                  | C₂: …
S₁: 42 EXPUNGE               | S₂: 43 EXISTS
```

What happens with the following commands?

```text
C₁: FETCH 42 | C₂: FETCH 42
C₁: FETCH 43 | C₂: FETCH 43
```

Note:

Either new visuals, or delete

----

## Sequence numbers with multiple clients

```text
C₁: SELECT INBOX             | C₂: SELECT INBOX
S₁: 42 EXISTS                | S₂: 42 EXISTS
                             |
C₁: STORE 42 +FLAGS \Deleted | C₂: APPEND INBOX {1024} …
C₁: EXPUNGE                  | S₂: 43 EXISTS
S₁: 42 EXPUNGE               |
                             |
C₁: NOOP                     | C₂: NOOP
S₁: 42 EXISTS                | S₂: 42 EXPUNGE
```

Note:

Server keeps a per-client view of message sequence numbers
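----

A minimal sketch of such a per-client view, assuming the server stores, for each connection, the list of UIDs that connection has been told about. `SeqView` and its methods are hypothetical names for illustration; this is not how go-imap's tracker or any other library is implemented.

```rust
/// Hypothetical per-connection view of one mailbox.
/// UIDs are kept in mailbox order; sequence number = index + 1.
struct SeqView {
    uids: Vec<u32>,
}

impl SeqView {
    /// UID currently behind a sequence number, if any.
    fn uid_at(&self, seq: u32) -> Option<u32> {
        // Sequence numbers start at 1, so 0 is never valid.
        let idx = (seq as usize).checked_sub(1)?;
        self.uids.get(idx).copied()
    }

    /// Remove a UID from this view. Returns the sequence number to
    /// announce as `* <seq> EXPUNGE`; later messages shift down by one.
    fn expunge(&mut self, uid: u32) -> Option<u32> {
        let idx = self.uids.iter().position(|&u| u == uid)?;
        self.uids.remove(idx);
        Some(idx as u32 + 1)
    }

    /// Add a new UID to this view. Returns the new message count to
    /// announce as `* <n> EXISTS`.
    fn append(&mut self, uid: u32) -> u32 {
        self.uids.push(uid);
        self.uids.len() as u32
    }
}
```

Starting from UIDs `[4, 6, 12]`, expunging UID 6 yields sequence number 2, after which sequence number 2 refers to UID 12. Each connection would own its own `SeqView`, updated only when the server actually sends that client the `EXPUNGE` or `EXISTS` line.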