# Learn PostgreSQL
# Chapter 1: Introduction to PostgreSQL
## Brief introduction
PostgreSQL is incompatible between major versions due to the on-disk binary data format -> need to dump and restore data when upgrading.
EOL time for stable versions is 5 years.
## PostgreSQL terminology
A PostgreSQL instance (running daemon) = a cluster -> a cluster contains many different databases (isolated spaces) -> a database is organized into namespaces called **schemas**, schemas are flat.
Every database object (tables, funcs, triggers, ...) belongs to one schema (named after the creating user if not specified)
*Note: prior to version 15, the default was the **public** schema; now it is the personal schema*
Users are cluster-wide. Two types of users:
- Normal users: depend on privilege set
- Superusers: can do anything
**Catalogs**: special tables and views presenting PG internal data, status, ... in an SQL way. Like the information schema, but stronger.
**PGDATA**: directory containing user data and internal status (initialized once). Contains at least the WALs and the data storage
**postmaster**: bootstraps the instance, spawns the processes needed to manage the database. **backend process**: a fork of the postmaster that serves exactly one user connection.
## Installing PG
### What to install
- PG `server`: serve databases and store data
- PG `client`: library and client tool
- PG `contrib` package: extensions and utils
- PG `docs`: man pages doc
- PG PL/Perl, PL/Python, PL/Tcl: usage of these 3 langs within PG server
Recommended set: server, client and `contrib` modules
### Ways to install
- Install from binary packages: Debian-based, Fedora-based and BSD systems
- Install from source: build using C and libs
- Install with `pgenv`
### Install from binary packages
There are two sources: OS vendor and PGDG. On Linux-based systems, PGDG is better, OS vendor is usually outdated. On BSD systems, OS vendor is good.
# Chapter 2: Getting to Know Your Cluster
## Managing your cluster
### pg_ctl
Performs management operations on a cluster: `start`, `stop`, `restart`, `status`, `initdb/init` (initialize a new cluster), `reload`, `promote`.
`pg_ctl` interacts with the `postmaster` process -> which redirects commands to the other processes (the `postmaster` may appear as `postgres` on some systems; it's the first process launched within the cluster).
`postmaster` is run by a normal user, `postgres` (to prevent privilege escalation), owner of the `PGDATA` dir (set by env var or the `-D` flag)
Stopping a cluster may be problematic; three stop modes via `-m`:
- `smart`: wait for all clients to disconnect
- `fast`: disconnect every clients (**default**)
- `immediate`: abort every PG process, shut down the cluster in a dirty way; crash recovery runs at the next start
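A quick sketch of the three modes (the data directory path is an assumption; adjust to your installation):

```shell
# wait for all clients to disconnect before shutting down (can wait forever)
pg_ctl -D /postgres/pgdata stop -m smart

# default: terminate client connections, then shut down cleanly
pg_ctl -D /postgres/pgdata stop -m fast

# kill all processes without a clean shutdown; crash recovery runs at next start
pg_ctl -D /postgres/pgdata stop -m immediate
```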
### PG processes
In Docker, PID 1 is the `postmaster` process; its child processes:
- `postgres: checkpointer`: executing checkpoints, points in time where database ensures all data is actually stored persistently on disk
- `postgres: background writer`: push data out of memory to permanent storage
- `postgres: walwriter`: write the WALs
- `postgres: logical replication launcher`: handling logical replication
Some more processes (depends on configuration):
- `background workers`: customized by users to perform background tasks
- `WAL receiver and/or WAL sender`: receive/send data to another cluster in replication
`backend process`: a process handling exactly one client connection. PG uses processes instead of threads for isolation and portability.
## Connecting to the cluster
`initdb` -> builds the fs layout of `PGDATA` dir with 2 template dbs `template0` and `template1`
`postgres`: default common database, owned by the `postgres` user
`psql -l`: list dbs
### The template databases
`template1` is the first db created, then it's cloned into `template0` (a pristine backup; direct connections to it are not allowed). These dbs are not required, but creating a new db requires a template db.
`template1` is the base database for every new db
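For example (database names are illustrative):

```sql
-- clone template1 explicitly (same as a plain CREATE DATABASE)
CREATE DATABASE appdb TEMPLATE template1;

-- clone the pristine template0 instead, e.g. to pick a different encoding
CREATE DATABASE cleandb TEMPLATE template0 ENCODING 'UTF8';
```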
### The psql CLI
Command termination can be either `;` or `\g`. `\i` executes an SQL file, `\e` edits an SQL file or the query buffer with the default editor (`EDITOR` env var)
`libpq`: the underlying library every app uses to connect to PG. An example libpq connection string: `postgresql://username@host:port/database`
## Solving common connection problems
`connection refused`: network problem -> edit `postgresql.conf` to listen on all interfaces: `listen_addresses = '*'` -> restart
`no pg_hba.conf entry`: PG's built-in Host-Based Access control (HBA) does not permit the connection -> add an allow rule to `pg_hba.conf`, for example: `host all luca 127.0.0.1/32 trust` -> reload the config
## Exploring the disk layout of PGDATA
Main files:
- `postgresql.conf`: main config file
- `postgresql.auto.conf`: auto included config file, store dynamically changed settings via SQL
- `pg_hba.conf`: HBA
- `PG_VERSION`: text file contains the major version number
- `postmaster.pid`: PID of master process
Main directories:
- `base`: user's data, including dbs, tables and other objects
- `global`: cluster-wide objects
- `pg_wal`: WAL files
- `pg_stat` and `pg_stat_tmp`: permanent and temporary information about the status and health of the cluster
### Objects in PGDATA
The `base` dir contains files with numeric names. PG uses the catalogs to match a mnemonic name to its id. The identifier is the `OID` (object identifier); the on-disk file name is the `filenode` (initially equal to the OID)
`oid2name` is used to show mapping of OID to mnemonic names. To inspect a file: `oid2name -d <db> -f <filenode>`
If a file exceeds 1 GB -> the filenode is split into segments named `123`, `123.1`, `123.2`, ..., each capped at 1 GB
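A sketch of typical `oid2name` usage (the database name and filenode value are placeholders):

```shell
# without arguments: list all databases with their OIDs
oid2name

# map a filenode back to its table name inside the forumdb database
oid2name -d forumdb -f 16385
```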
### Tablespaces
`tablespace`: a dir that can live outside `PGDATA`, mapped into `PGDATA` by symbolic links in the `pg_tblspc` subdir. Used to exploit different kinds of storage performance
`oid2name -s`: list tablespaces
2 default tablespaces:
- `pg_default`: default tblspc for objects directly under `PGDATA`
- `pg_global`: system-wide objects
## Exploring configuration files and parameters
Issue the cluster a `SIGHUP` signal to reload `postgresql.conf`. Every param is associated with a `context`; depending on the context, the cluster needs a restart or just a reload. Available contexts:
- `internal`: compile time, cannot be changed at runtime
- `postmaster`: need to restart (the `postmaster` process)
- `sighup`: issue a reload signal, `SIGHUP` to `postmaster` process
- `backend` and `superuser-backend`: set at runtime, but applied only from the next normal or admin connection
- `user` and `superuser`: changed at runtime, immediately active for connections
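The context of any parameter can be checked from SQL; a minimal sketch (parameter names are just examples):

```sql
-- pg_settings shows the current value and the context governing how changes apply
SELECT name, setting, context
FROM pg_settings
WHERE name IN ('shared_buffers', 'listen_addresses', 'work_mem');
```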
`postgresql.auto.conf` has the same syntax, but is written automatically by dynamic SQL such as `ALTER SYSTEM`; its settings override `postgresql.conf`.
`pg_hba.conf`: firewall for PG
# Chapter 3: Managing Users and Connection
## Users and groups
`role`: a single user, a group of users, or both, but it should be only one of these (for easy management). It has a unique name
roles are cluster-level, permissions are database-level
## Managing roles
`CREATE ROLE`/`ALTER ROLE`/`DROP ROLE` for managing roles.
### Creating new roles
Syntax: `CREATE ROLE name [ [ WITH ] option [ ... ] ]`
Options come in a positive form that adds an ability and a negative form (with the `NO` prefix): e.g. `SUPERUSER` and `NOSUPERUSER`
### Passwords, connections, availability
Main authentication options:
- `PASSWORD` or `ENCRYPTED PASSWORD`: set a login password (the two keywords are equivalent)
- `PASSWORD NULL`: deny password-based authentication
- `CONNECTION LIMIT <n>`: max n connections
- `VALID UNTIL`: expiration time
Example: create a user with a password and login permission: `CREATE ROLE luca WITH PASSWORD 'xxx' LOGIN;` (default is `NOLOGIN`)
### Using a role as a group
Create a group and assign users to group:
```
CREATE ROLE book_authors WITH NOLOGIN; -- a group is a role without LOGIN
CREATE ROLE luca WITH LOGIN PASSWORD 'xxx' IN ROLE book_authors; -- IN ROLE assigns the new role to a group
GRANT book_authors TO luca; -- alternative: assign a role to a group with the GRANT command
```
Create a group with admin (able to add new members to group):
```
CREATE ROLE book_reviewers WITH NOLOGIN ADMIN luca; -- luca can add new members
GRANT book_reviewers TO luca WITH ADMIN OPTION;     -- equivalent form via GRANT
```
### Removing an existing role
Syntax: `DROP ROLE [ IF EXISTS ] name [, ...]`
Dropping a group: member roles keep existing but lose the permissions of the parent group (no cascading drop)
### Inspecting existing roles
`SELECT CURRENT_ROLE;`: get the current role
`\du` (in psql): list roles
`pg_roles`: catalog storing role information; `pg_authid` is similar but also shows the encrypted passwords (`SCRAM-SHA-256`)
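A minimal query against `pg_roles` (the filter just hides the built-in roles):

```sql
SELECT rolname, rolsuper, rolcanlogin, rolconnlimit, rolvaliduntil
FROM pg_roles
WHERE rolname NOT LIKE 'pg_%';
```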
## Managing incoming connections at the role level
HBA is checked before checking credentials.
Changes to `pg_hba.conf` need a reload signal to the cluster. Can be done from SQL with `SELECT pg_reload_conf();` (superuser only)
This file can also include sub-configuration files for a cleaner structure; use `include_file`/`include_if_exists`/`include_dir`
### pg_hba.conf syntax
Line: `<connection-type> <database> <role> <remote-machine> <auth-method>`
Meaning:
- `connection-type`: `local` (OS sockets), `host` (TCP/IP, with or without SSL), `hostssl` (SSL only), `hostnossl` (no SSL)
- `database`: name of db, `all`, `replication` (connection used to replicate data)
- `role`: name of user/group, `all`
- `remote-machine`: hostname, IP, subnet, `all`, `samehost`, `samenet`
- `auth-method`: `scram-sha-256`, `md5`, `reject`, `trust` (always accept)
### Order of rules
The rules are processed top-down; once one matches, the following ones are skipped
### Merging rules
Use commas to merge several values within a field, to simplify the rules
### Using groups instead of single roles
Use `+` to allow the members of a group, for example:
`host forumdb +book_authors all scram-sha-256`
### Using files instead of single roles
the `role` field accepts a text file (line- or comma-separated) referenced with `@`; the file location is relative to `PGDATA`
for example:
```
host forumdb @rejected_users.txt all reject
host forumdb @allowed_users.txt all scram-sha-256
```
### Inspecting the rules
Using the catalog `pg_hba_file_rules`
```
SELECT line_number, type,
database, user_name,
address, auth_method
FROM pg_hba_file_rules;
```
# Chapter 4: Basic Statements
## Managing databases
`CREATE DATABASE databasename;`: creates a db -> makes a physical copy of the template db `template1` -> assigns the new db name
`\l`: list databases
`\dt`: list tables
`create database forumdb2 template forumdb;`: make a copy of database
`drop table`/ `drop database`: drop tables/databases
### Schema
Normal users can't perform DDL in the `public` schema (since version 15), and can perform DML there only if allowed to
`search_path`: sequence of schemas that PG scans to resolve unqualified table names (default `"$user",public`)
### Confirming the db size
`\l+ <db name>`: view the size field via `psql`
`select pg_database_size('<db name>');`: via SQL query
`select pg_size_pretty(pg_database_size('forumdb'));`: human readable size
### Behind the scenes of db creation
PG copies the template db's directory inside the `base` data dir, naming the new directory after the new db's OID
## Managing tables
3 table types:
- **Temporary tables:** very fast tables, visible only to the session that created them
- **Unlogged tables:** very fast tables that skip the WAL; support tables common to all users
- **Logged tables:** normal tables
### Temporary tables
`create temp table ...`: creates a temp table, only visible inside the session that created it
`create temp table ... on commit drop`: executed inside a transaction -> the table is only visible within that transaction
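A sketch of a transaction-scoped temp table (table and column names are illustrative):

```sql
BEGIN;
CREATE TEMP TABLE session_notes (note text) ON COMMIT DROP;
INSERT INTO session_notes VALUES ('only visible inside this transaction');
SELECT * FROM session_notes;
COMMIT;  -- the table disappears here
```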
### Unlogged tables
Much faster than normal tables but not crash-safe (the table is truncated after a crash; data consistency is not guaranteed)
`create unlogged table...`: create unlogged table
## Basic DML
### NULL values
`\pset null NULL`: display NULL values as the string `NULL` in query output
`... WHERE <field> IS NULL`: filter for NULL values (or `IS NOT NULL`)
Default ascending sort puts NULLs last: `... ORDER BY <field>` equals `... ORDER BY <field> NULLS LAST;` (use `NULLS FIRST` to invert)
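For example, assuming a `posts` table with a nullable `last_edited` column (hypothetical names):

```sql
SELECT pk, title, last_edited
FROM posts
ORDER BY last_edited ASC NULLS FIRST;  -- default for ASC would be NULLS LAST
```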
### Copy tables
`create table temp_categories as select * from categories;`: copy table data and structure
### Delete table data
`TRUNCATE TABLE <table name>`: delete all data, much faster than `DELETE`
# Chapter 5: Advanced Statements
## SELECT statement
`coalesce`: given two or more params, returns the first non-NULL one
`IN`/`NOT IN`: filter a record set
`EXISTS`/`NOT EXISTS`: test whether a subquery returns any rows
### Joins
`JOIN` is preferred over `IN` or `EXISTS` due to better execution speed
`select <fields> from <table1> <JOIN> <table2> <WHERE>`
Type of joins:
- `cross join`: Cartesian product
- `inner join`: the intersection
- `left join`: intersection + all rows of the left table
- `right join`: intersection + all rows of the right table
- `full outer join`: left join + right join
- `lateral join`: join a table with a subquery evaluated once per row
## Aggregate functions
Used in conjunction with `group by`. Before grouping, PG internally sorts (or hashes) the data.
### UNION/UNION ALL
Combine 2 or more `SELECT`s -> they must return the same number of cols, with the same data types, in the same order
`UNION`: removes duplicates, `UNION ALL` doesn't
### EXCEPT/INTERSECT
`EXCEPT`: distinct rows of the left query that do not appear in the output of the right query
`INTERSECT`: intersect output
## UPSERT With PG
Syntax: `INSERT INTO table_name(column_list) VALUES(value_list) ON CONFLICT target action;`
Target action: `DO NOTHING` or `DO UPDATE SET...`
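A sketch, assuming a `categories` table with a unique constraint on `title` (hypothetical):

```sql
INSERT INTO categories (title, description)
VALUES ('databases', 'all about databases')
ON CONFLICT (title)
DO UPDATE SET description = EXCLUDED.description;  -- EXCLUDED = the row that failed to insert
```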
## RETURNING clause for INSERT
`RETURNING` returns the affected rows' fields after an insert/update/delete command
`MERGE` can insert/update/delete rows in a target table based on rows coming from another table
## Exploring CTEs
`CTE`: common table expression, a temporary result set derived from an SQL statement; its lifetime is the query
```
WITH cte_name (column_list) AS (
CTE_query_definition
)
statement;
```
Example:
```
with posts_author_1 as
(select p.* from posts p
inner join users u on p.author=u.pk
where username='enrico_pirozzi')
select pk,title from posts_author_1;
```
### Recursive CTEs
Recursive CTE made of:
- a non-recursive statement, for bootstrapping
- a recursive statement, referring to the bootstrap or to itself
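A minimal recursive CTE counting from 1 to 5:

```sql
WITH RECURSIVE counter (n) AS (
    SELECT 1            -- non-recursive bootstrap
    UNION ALL
    SELECT n + 1        -- recursive step, refers to counter itself
    FROM counter
    WHERE n < 5
)
SELECT n FROM counter;
```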
# Chapter 6: Window Functions
Window functions are similar to aggregate functions but don't group rows.
## Basic statement window functions
Window functions don't group; the aggregated result is replicated on every row. Add `DISTINCT` to get the same output as `GROUP BY`

```
select category, count(*) over (partition by category) from posts order by category;
```
# Chapter 7: Server-Side Programming
## Exploring data types
Boolean, numeric, character, date/time, nosql: `hstore`, `xml`, `json`, `jsonb`
Numeric: `integer`, `real` and `numeric` types; use `numeric` for money-related data due to rounding errors in the `real` type
Character: `char(n)` (fixed length, padded with blanks), `varchar(n)` (bounded max length), `varchar`/`text` (unlimited length)
### Date data types
View the date/time settings: `select * from pg_settings where name = 'DateStyle';`
Simple date conversion: `select to_date('31/12/2020','dd/mm/yyyy');`; the reverse is the `to_char()` function.
Timestamps come in 2 flavours: with or without time zone. `show timezone;`, `set timezone='CET';`
### NoSQL types
`hstore`: stores key-value pairs in a single value
`row_to_json()`: converts row data to JSON. `jsonb` stores JSON in binary form and is indexable.
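A small `jsonb` sketch (the literal is illustrative):

```sql
-- ->> extracts a key as text; jsonb also supports containment (@>) and GIN indexes
SELECT '{"title": "Learn PostgreSQL", "tags": ["sql", "db"]}'::jsonb ->> 'title';
-- returns: Learn PostgreSQL
```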
## Functions and languages
PG supports server-side code in `SQL` and `C`. It supports `PL/Python`, `PL/Perl`, `PL/Java` with additional libraries
```
CREATE FUNCTION function_name(p1 type, p2 type, p3 type, ..., pn type)
RETURNS type AS $$
BEGIN
-- function logic
END;
$$ LANGUAGE language_name;
```
`PL/pgSQL` can be used as a procedural language for programming functions. It has conditional statements, switch/case, for loops, and error handling
Function types:
- `VOLATILE` (default): can do anything
- `STABLE`: can't modify db, return same results in same transaction
- `IMMUTABLE`: can't modify db, return same results forever
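A minimal `PL/pgSQL` function (names are illustrative); `IMMUTABLE` fits because the result depends only on the arguments:

```sql
CREATE OR REPLACE FUNCTION add_numbers(a integer, b integer)
RETURNS integer AS $$
BEGIN
    RETURN a + b;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

SELECT add_numbers(2, 3);  -- returns 5
```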
### security definer
`security definer` allows the user to invoke a function as if they were its owner
# Chapter 8: Triggers and Rules
## Exploring rules in PG
Rules are simple event (**DML**) handlers. For a write operation (`INSERT`/`DELETE`/`UPDATE`), 3 possible actions:
- cancel
- perform another op instead of the original one
- execute it and another op in 1 transaction

```
CREATE [ OR REPLACE ] RULE name AS ON event
TO table [ WHERE condition ]
DO [ ALSO | INSTEAD ] { NOTHING | command | ( command ; command ... ) }
```
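A sketch of an `INSTEAD NOTHING` rule (the `posts_archive` table is hypothetical): deletes against the table are silently discarded:

```sql
CREATE RULE keep_archive AS ON DELETE
TO posts_archive
DO INSTEAD NOTHING;
```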
Rules are always executed before the event.
## Managing triggers in PG
Triggers are complex event (**DML**) handlers. They can handle I/D/U and `TRUNCATE`, and can be executed before or after the event.
Rules are executed before triggers if both exist on the same table
`TG_OP`: a variable telling which event fired the trigger -> `INSERT`, `UPDATE`, `DELETE` or `TRUNCATE`
## Event triggers
Rules/triggers are fired by data changes, not by changes to the data layout or table properties; `event triggers` cover **DDL** statements
`command`: `CREATE`, `ALTER`,...
`event`:
- `ddl_command_start`, `ddl_command_end`
- `sql_drop`: a `DROP` command is near completion
- `table_rewrite`: full table rewrite is about to begin
```
CREATE EVENT TRIGGER name
ON event
[ WHEN filter_variable IN (filter_value [, ... ]) [ AND ... ] ]
EXECUTE { FUNCTION | PROCEDURE } function_name()
```
Database-scoped, not attached to any table; must be created by a database administrator
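A sketch of an event trigger logging every completed DDL command (function and trigger names are illustrative):

```sql
CREATE OR REPLACE FUNCTION log_ddl()
RETURNS event_trigger AS $$
BEGIN
    -- TG_TAG holds the command tag, e.g. CREATE TABLE
    RAISE NOTICE 'DDL executed: %', tg_tag;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER log_ddl_commands
ON ddl_command_end
EXECUTE FUNCTION log_ddl();
```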
# Chapter 9: Partitioning
## Basic concepts
All databases grow in size. `shared_buffers`: portion of the server's RAM shared among all PG processes, used to manage the data present in tables.
1. Data is taken from hard disks.
2. Data is placed in shared buffers.
3. Data is processed in shared buffers.
4. Data is written back to disk.
`shared_buffers` = 1/3 or 1/4 total server RAM (in dedicated PG server)
Partitioning -> split a very large table into smaller tables. 2 ways:
- table inheritance (PG < 10)
- declarative partitioning (best way from PG >= 10)
- range
- list
- hash
### Range partitioning
Large tables whose data can be divided by time range (audit tables, log tables, ...)
### List partitioning
Data divided by a single field (city or state in customer list)
### Hash partitioning
A hash function on a field
### Table inheritance
Table A is the parent, table B the child -> all records in B can be accessed from A. PG propagates operations on the parent table to the child tables.
## Declarative partitioning
```
CREATE TABLE ... PARTITION BY LIST/RANGE (<field>);
```
Partition key must be part of primary key.
```
CREATE TABLE <part_table_name> PARTITION OF <original table> FOR VALUES IN (...) | FROM (...) TO (...);
```
Indexes are automatically propagated to child tables.
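A sketch of a range-partitioned table (names are illustrative); note the partition key inside the primary key:

```sql
CREATE TABLE logs (
    pk      int  NOT NULL,
    logged  date NOT NULL,
    message text,
    PRIMARY KEY (pk, logged)      -- must include the partition key
) PARTITION BY RANGE (logged);

CREATE TABLE logs_2023 PARTITION OF logs
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE logs_2024 PARTITION OF logs
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
```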
### Partition maintenance
Attach a partition: `CREATE TABLE <part_0> PARTITION OF <original table> FOR VALUES IN (...) | FROM (...) TO (...);`
Detach a partition: `ALTER TABLE <original table> DETACH PARTITION <part_table_name>;`