![Kagi Doggo Resting](https://assets.kagi.com/v1/kagi_assets/doggo/doggo_4.png)
# Kagi DB Practices
The following is a quick summary of the techniques we use to program
our database layer, which have allowed us to scale our DB code with minimal
issues as our app has grown in complexity and user base size.
## Tooling
### Cloud
We run on GCP, using Postgres under Cloud SQL, using one primary instance
and many zonal read-replicas.
### Libraries
We write in raw SQL using `crystal-db` and `crystal-pg`.
### Migrations
We use `micrate`. It has some rough edges, but it's a great & simple tool.
### psql
We make heavy use of `psql` for quick development. In particular, it becomes
very nice to work with once you use `\e` to open `$EDITOR` to edit queries.
Because we do not use any ORM, once a query works as we expect, it is mostly
just a matter of pasting it into our code and writing a datatype (if
relevant).
I can also recommend some lines from my `.psqlrc`:
```
\set QUIET 1
\x auto
\timing
\set COMP_KEYWORD_CASE upper
\set PROMPT1 '%[%033[1m%]%M %n@%/%R%[%033[0m%]%# '
\set PROMPT2 '[more] %R > '
\set VERBOSITY verbose
\set HISTFILE ~/.psql_history- :DBNAME
\set HISTCONTROL ignoredups
\pset null '(null)'
\pset pager off
\pset linestyle unicode
\unset QUIET
```
With options like `\timing`, we get an immediate sense if something could
be slow, and also have the chance to run the query planner to do analysis
before even integrating the query into code.
It also means we use the same tool locally as we do for remote debugging on
staging or production database instances - it's a great tool to become
familiar with.
## API Design
All queries are exposed as static methods in our `DB` module:
```crystal
module DB
  struct ReturnType
    include ::DB::Serializable
    include ::DB::Serializable::NonStrict

    getter col1 : Int32
    getter col2 : Bool
  end

  def self.table_query_name(...) : ReturnType
    # ..
  end
end
```
Queries are NOT written for arbitrary CRUD updates to the backing table.
Sometimes we do end up writing a plain CRUD layer, but that is not how we
approach writing the API.
Each query is written to serve a specific function that our app needs. In many
cases, there is no singular "backing table" - any query may read a subset of a
table, join multiple tables, or even return virtual columns that do not exist
on disk. In a sense, our DB layer is "feature oriented" rather than any attempt
at a generic "object relational" mapping.
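For illustration, here is a sketch of what one such feature-oriented query
might look like (the schema and names are made up, not our actual tables): it
joins two tables and returns `search_count`, a virtual column that exists
nowhere on disk.
```crystal
module DB
  struct TeamMemberOverview
    include ::DB::Serializable
    include ::DB::Serializable::NonStrict

    getter email : String
    getter search_count : Int64 # virtual column, computed by the query
  end

  # Serves exactly one app feature: the team admin's member overview.
  def self.team_member_overview(team_id : Int64, *, db = SQL_RR) : Array(TeamMemberOverview)
    TeamMemberOverview.from_rs(db.query(<<-SQL, team_id))
      SELECT
        u.email,
        COUNT(s.id) AS search_count
      FROM
        users u
        JOIN searches s ON s.user_id = u.id
      WHERE
        u.team_id = $1
      GROUP BY
        u.email
      SQL
  end
end
```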
Treating our DB code as a sort of FFI layer, this decoupling brings many
benefits.
If we alter or remove a query, compiler errors give us a list of all app
features that are affected.
We can also change the query underneath without affecting app code.
We can tune the performance of each query for its specific use case, rather
than attempting to preemptively optimize arbitrary table access.
Cloud SQL metrics will show you the raw SQL submitted to the database, and each
query's load on the database. Since we use raw SQL in our code, it is very easy
to trace an under-performing query back to our code base with a grep.
### Naming
Query methods are usually named "big endian": first the principal table that
they operate on - usually whatever table the query is `FROM`, even if there
are `JOIN`s involved - followed by a description of the action they perform.
This keeps names mostly predictable, and editors still autocomplete this just
fine, even without LSP.
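For illustration, a few made-up bindings following this naming scheme (these
are not our actual API):
```crystal
module DB
  # users table, "create" action
  def self.user_create(email : String, *, db = SQL)
    # ..
  end

  # sessions table (even though the query JOINs users), "get with user" action
  def self.session_get_with_user(token : String, *, db = SQL_RR)
    # ..
  end

  # billing_subscriptions table, "cancel" action
  def self.billing_subscription_cancel(id : Int64, *, db = SQL)
    # ..
  end
end
```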
### Heredocs
Almost all queries are written using Crystal's heredoc syntax:
```crystal
def self.table_action(param1, param2, *, db = SQL)
  db.exec(<<-SQL, param1, param2)
    UPDATE
      table
    SET
      param1 = $1,
      param2 = $2
    SQL
end
```
This allows formatting queries in a legible and unobtrusive way, without having
to deal with any escapes. It will also trim leading whitespace using the last
`SQL` as a guide, which makes logs easier to read as well. Finally, editors and
sites like GitHub will actually recognize this and enable SQL syntax
highlighting in that section.
Almost all queries use a static string. A select few build an SQL string
incrementally: in most cases, this is simply tacking on an extra `WHERE` clause.
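For illustration, a minimal sketch of the incremental case (the table and
struct here are made up):
```crystal
struct InvoiceRow
  include ::DB::Serializable
  include ::DB::Serializable::NonStrict

  getter id : Int64
  getter amount_cents : Int64
  getter paid : Bool
end

def self.invoice_list(user_id : Int64, unpaid_only : Bool = false, *, db = SQL_RR) : Array(InvoiceRow)
  # Static base query, plus one conditionally appended clause.
  sql = String.build do |s|
    s << "SELECT id, amount_cents, paid FROM invoices WHERE user_id = $1"
    s << " AND paid = FALSE" if unpaid_only
  end
  InvoiceRow.from_rs(db.query(sql, user_id))
end
```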
### Parameters
#### Objects vs IDs
API parameters accept only what they need. Generally, this means that if a
query operates on a user ID, it does not accept a `User`, just a `user_id`.
This gives a lot more flexibility by not forcing code to produce an entire
`User` object when that particular operation does not need any field other than
the user's ID.
We do not make overloads that accept `User`. That just adds noise and more
code compared to simply typing `user.id`.
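For illustration, a sketch of a binding in this style (the schema details are
made up):
```crystal
# Accepts only the ID the query actually needs.
def self.user_plan_update(user_id : Int64, plan : Int32, *, db = SQL) : Bool
  r = db.exec(<<-SQL, user_id, plan)
    UPDATE users SET plan = $2 WHERE id = $1
    SQL
  r.rows_affected == 1
end

# Call sites that happen to hold a full object just pass its ID:
#
#   DB.user_plan_update(user.id, plan)
```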
#### DB Parameter
*Every* query binding accepts an untyped, named `db` parameter. It has a
default value of a connection pool instance (`DB::Database`).
We have two connection pools in global constants:
- `SQL`, a read-write primary
- `SQL_RR`, a zonal read-only replica
> Weird naming, but `SQL` was a legacy decision when we started the codebase
> because we couldn't pick a better one. And `SQL_RR` was added later. You
> can probably pick better names.
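For reference, a minimal sketch of how two such pools can be created with
`crystal-db` (the env var names here are illustrative, not our actual config):
```crystal
require "db"
require "pg"

# Read-write primary
SQL = DB.open(ENV["DATABASE_URL"])

# Zonal read-only replica
SQL_RR = DB.open(ENV["DATABASE_RR_URL"])
```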
This has a few properties:
- At a glance, you can confirm or search for which queries go to our
read replica, and which need to perform writes.
- Switching a query between RW primary and RO replica means just changing
`SQL` -> `SQL_RR`, or vice versa.
```crystal
# targets primary
def self.table_create(..., *, db = SQL)
  # ..
end

# targets read-replica
def self.table_get(..., *, db = SQL_RR) : ReturnType
  # ..
end
```
> The bare `*` before `db` enforces that callers must pass `db` as an
> explicit `db:` named argument.
The untyped parameter allows for either a `DB::Database` **or** a
`DB::Connection` to be passed. This allows for trivially composing
multiple queries in a transaction with zero changes needed to the
API:
```crystal
# one-shot queries, no tx
table_update1(id, param1)
table_update2(id, param1)
table_update3(id, param1)

# wrap in a tx
DB.transaction do |tx|
  db = tx.connection
  table_update1(id, param1, db: db)
  table_update2(id, param1, db: db)
  table_update3(id, param1, db: db)
end
```
These transaction operations themselves will usually be wrapped in their
own static method that also takes a `db` arg at the top level, for
consistency with everything else.
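A sketch of such a wrapper, reusing the hypothetical `table_update*` bindings
from above (`db.transaction` works here because both `DB::Database` and
`DB::Connection` expose `#transaction`):
```crystal
def self.table_update_all(id : Int64, param1 : String, *, db = SQL) : Nil
  db.transaction do |tx|
    conn = tx.connection
    table_update1(id, param1, db: conn)
    table_update2(id, param1, db: conn)
    table_update3(id, param1, db: conn)
  end
end
```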
### Return Types
Most often, each query is given its own `Serializable` type bound to the return
type of that particular query. This way, each query is treated as its own
independent unit of code that can be freely moved around and changed without
concern of a rippling effect across many other queries when, e.g., adding a new
column without a default.
This also gives us a lot of control over bandwidth and memory size at the query
level - for example, some queries return only the *subset* of `users` table
columns needed for that particular feature.
Every type will `include ::DB::Serializable::NonStrict`. This is important for
zero-downtime updates to our app - adding new columns should not crash any
existing code.
We may place utility methods on these types, but they are *always* kept
simple: performing some summary of the data or state of the returned object,
and never, e.g., performing more queries.
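For illustration, a sketch of such a type (the column names are made up):
```crystal
struct UserContact
  include ::DB::Serializable
  include ::DB::Serializable::NonStrict

  getter id : Int64
  getter email : String
  getter email_verified : Bool

  # Simple summary of returned state; never runs further queries.
  def contactable? : Bool
    email_verified && !email.empty?
  end
end

# Reads only the columns this feature needs from `users`.
def self.user_get_contact(id : Int64, *, db = SQL_RR) : UserContact?
  UserContact.from_rs(db.query(<<-SQL, id)).first?
    SELECT id, email, email_verified FROM users WHERE id = $1
    SQL
end
```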
#### Enums
We prefer to store Crystal enums as ints in the DB, with explicitly declared
values. We felt that Postgres enums were not worth the investment, nor did we
prefer `TEXT`. Consider also that we have things other than Crystal interacting
with our tables (Python, Go, ...).
We use the following converters to override the default serialization, and
everything has worked out fine.
```crystal
# Converter: reads an INT column as the enum EnumT.
module EnumInt(EnumT)
  def self.from_rs(rs)
    value = rs.read(Int32)
    EnumT.new(value)
  end
end

# Same, for nullable INT columns.
module EnumNullInt(EnumT)
  def self.from_rs(rs)
    value = rs.read(Int32?)
    return unless value
    EnumT.new(value)
  end
end

# Expands to a `case` over the raw bytes, matching each member's underscored
# name; the block provides the fallback for unknown values.
macro enum_str_lookup(value, enum_t)
  case value
  {% for member in EnumT.constants %}
  when {{member.stringify.underscore}}.to_slice
    EnumT::{{member}}
  {% end %}
  else
    {{yield}}
  end
end

# Converter: reads a TEXT column as the enum EnumT, raising on unknown values.
module EnumStr(EnumT)
  def self.from_rs(rs)
    value = rs.read(Bytes)
    enum_str_lookup(value, EnumT) do
      raise "Unknown enum value for #{EnumT}: #{String.new(value)}"
    end
  end
end

# Same, for nullable TEXT columns; NULL and unknown values map to nil.
module EnumNullStr(EnumT)
  def self.from_rs(rs)
    value = rs.read(Bytes?)
    return unless value
    enum_str_lookup(value, EnumT) do
      nil
    end
  end
end
```
The `Str` variants were used to transition some columns away from using
strings.
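As a usage sketch (the enum and struct here are made up), a column stored as a
plain `INT` can then be mapped with the converter annotation from
`DB::Serializable`:
```crystal
enum UserPlan
  Free     = 0
  Standard = 1
  Ultimate = 2
end

struct UserPlanRow
  include ::DB::Serializable
  include ::DB::Serializable::NonStrict

  getter id : Int64

  # Stored as a plain INT column; decoded via the EnumInt converter above.
  @[DB::Field(converter: EnumInt(UserPlan))]
  getter plan : UserPlan
end
```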
### Update Confirmation
Minimally, our `UPDATE` or `DELETE` bindings will return a bool to confirm
whether the query had any effect, in case the caller needs to know this. The
majority operate directly on a PK, so there will always be 0 or 1 matches.
```crystal
def self.table_update(id : Int64, param : String, *, db = SQL) : Bool
  r = db.exec(<<-SQL, id, param)
    UPDATE table SET param = $2 WHERE id = $1
    SQL
  r.rows_affected == 1 # or > 0, if editing multiple
end
```
### Errors
None of our DB bindings raise errors (besides those propagated from
the `db` or `pg` shard itself). If the caller needs to handle some
case, we return an enum or union, and the caller uses exhaustive `case`.
Exceptions are unnecessary overhead 99% of the time, when all we need is to
inform the immediate caller of some logical state. Exceptions are left for
*actual* exceptions (protocol errors, unavailability, ...).
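For illustration, a sketch of the pattern (the table and names are made up):
```crystal
enum RedeemResult
  Ok
  NotFound
  AlreadyUsed
end

def self.code_redeem(code : String, user_id : Int64, *, db = SQL) : RedeemResult
  r = db.exec(<<-SQL, code, user_id)
    UPDATE codes SET redeemed_by = $2
    WHERE code = $1 AND redeemed_by IS NULL
    SQL
  return RedeemResult::Ok if r.rows_affected == 1

  # Distinguish "no such code" from "already redeemed" with a cheap follow-up read.
  exists = db.query_one(<<-SQL, code, as: Bool)
    SELECT EXISTS (SELECT 1 FROM codes WHERE code = $1)
    SQL
  exists ? RedeemResult::AlreadyUsed : RedeemResult::NotFound
end

# At the call site, the compiler enforces handling of every case:
#
#   case DB.code_redeem(code, user.id)
#   in .ok?           then # ..
#   in .not_found?    then # ..
#   in .already_used? then # ..
#   end
```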
## Addendums
### Placement of Static Methods
Initially, we started by attaching static methods that do queries directly
on the "model" that they primarily deal with.
For example:
```crystal
module DB
  struct User
    def self.create(email : String) : User
      # ..
    end

    def self.get(email : String, full : Bool) : User?
      # ..
    end
  end
end
```
It felt fine to do this at first, but it also stemmed from the fact that we
initially started our project with an ORM before removing it.
However, as our codebase grew, it became apparent that this was not a
great idea - more complex queries begged distracting questions as to
which "model" they should be placed on, so you wind up with a mess of
inconsistencies.
So I suggest: completely discard the idea of attaching the queries to
any "model" type; it is not useful, and does not scale.
All new queries that we write go directly as static methods on our `DB` module,
with the freedom to pick an expressive name, rather than congregating around
any one particular datatype or table.
Today we would write it as:
```crystal
module DB
  # Limited/essential data about a user (ID, email)
  struct UserLite
    # ..
  end

  # "All" user data (ID, email, + all settings)
  struct UserFull
    # ..
  end

  def self.user_create(email : String, *, db = SQL) : UserLite
    # ..
  end

  def self.user_get_lite(email : String, *, db = SQL_RR) : UserLite?
    # ..
  end

  def self.user_get_full(email : String, *, db = SQL_RR) : UserFull?
    # ..
  end
end
```