### 1. Semantic design change #### 1.1 Naming problem In 7X, partitions are first-class tables. There is no such thing as alias (partition name) in 7X, therefore `pg_partitons.partitionname` and `pg_partitons.parentpartitionname` no longer make sense and not available either. Take the following table `partrl` as an example: ``` -- Classic Syntax -- Multi-level range-list partitioned table CREATE TABLE partrl (a int, b int, c int) DISTRIBUTED BY (a) PARTITION BY range(b) SUBPARTITION BY list(c) ( PARTITION p1 START (10) END (20) EVERY (5) ( SUBPARTITION sp1 VALUES (1, 2) ), PARTITION p2 START (0) END (10) ( SUBPARTITION sp2 VALUES (3, 4), SUBPARTITION sp1 VALUES (1, 2), DEFAULT SUBPARTITION others ) ); ``` Below is the `pg_partitions` entry in 6X for leaf partition `partrl_1_prt_p2_2_prt_sp1`. ``` test=# select * from pg_partitions where partitiontablename = 'partrl_1_prt_p2_2_prt_sp1'; -[ RECORD 1 ]------------+------------------------------ schemaname | public tablename | partrl partitionschemaname | public partitiontablename | partrl_1_prt_p2_2_prt_sp1 partitionname | sp1 parentpartitiontablename | partrl_1_prt_p2 parentpartitionname | p2 partitiontype | list partitionlevel | 1 partitionrank | partitionposition | 2 partitionlistvalues | 1, 2 partitionrangestart | partitionstartinclusive | partitionrangeend | partitionendinclusive | partitioneveryclause | partitionisdefault | f partitionboundary | SUBPARTITION sp1 VALUES(1, 2) parenttablespace | pg_default partitiontablespace | pg_default ``` `partitionname = 'sp1'` and `parentpartitionname = 'p2'` are not applicable in 7X. In 7X, we'd only have `partitionschemaname = 'partrl_1_prt_p2'` and `parentpartitiontablename = 'partrl_1_prt_p2_2_prt_sp1'`. ##### 1.1.1 Why did we need `partitionname` and `parentpartitionname` in the first place? It looks like `parentpartitionname` has been used to identify subpartitions. In 7X, there are multiple ways to acquire such information: - Option 1: use `pg_class.relispartition = 't' and pg_class.relkind = 'p'` to identify a middle-level table. - Option 2: use ` pg_partition_ancestors()`. ```sql test=# SELECT * FROM pg_partition_ancestors('partrl_1_prt_p2_2_prt_sp1'); relid --------------------------- partrl_1_prt_p2_2_prt_sp1 partrl_1_prt_p2 partrl (3 rows) -- This tells that the given table has more than one partition ancestors, -- hence it is a "subpartition". test=# SELECT count(*) - 1 as num_of_ancestors FROM pg_partition_ancestors('partrl_1_prt_p2_2_prt_sp1'); num_of_ancestors ------------------ 2 (1 row) ``` #### 1.2 Partition hierachy: heterogeneous in 7X vs. homogeneous in 6X This design philosophy difference affects the usage of the `pg_partition_columns` view. The entires of the `pg_partition_columns` view in 6X for the example `partrl` table looks like below: ``` test=# select * from pg_partition_columns where tablename = 'partrl'; schemaname | tablename | columnname | partitionlevel | position_in_partition_key ------------+-----------+------------+----------------+--------------------------- public | partrl | b | 0 | 1 public | partrl | c | 1 | 1 (2 rows) ``` However, in 7X, middle level partitions on the same level may be partitioned w/ differernt strategies AND partition keys. So, in 7X, using a given root table and a level can not uniquely identify the partition strategy and key(s). It seems this view is mostly used to fetch column name of specified partition level. Example heterogeneous partition hierachy in 7X: ```sql -- Modern Syntax hetrogeneous example -- Multi-level range-<list|range> partitioned table CREATE TABLE partrlr (a int, b int, c int) DISTRIBUTED BY (a) PARTITION BY range(b); -- First child on level 1 is further partitioned by list on column c. CREATE TABLE partrlr_p1 PARTITION OF partrlr FOR VALUES FROM (10) TO (15) PARTITION BY list(c); CREATE TABLE partrlr_p1_sp1 PARTITION OF partrlr_p1 FOR VALUES IN (1, 2); -- Second child on level 1 is further partition by range on column (a). CREATE TABLE partrlr_p2 PARTITION OF partrlr FOR VALUES FROM (0) TO (10) PARTITION BY range(a); CREATE TABLE partrlr_p2_sp1 PARTITION OF partrlr_p2 FOR VALUES FROM (3) TO (4); CREATE TABLE partrlr_p2_sp2 PARTITION OF partrlr_p2 FOR VALUES FROM (1) TO (2); CREATE TABLE partrlr_p2_others PARTITION OF partrlr_p2 DEFAULT; ``` In 7X, had we have the `pg_partition_columns` view available, the entries for `partrlr` would look like below: ``` test=# select * from pg_partition_columns where tablename = 'partrlr'; schemaname | tablename | columnname | partitionlevel | position_in_partition_key ------------+-----------+------------+----------------+--------------------------- public | partrlr | b | 0 | 1 public | partrlr | c | 1 | 1 public | partrlr | a | 1 | 1 (3 rows) ``` This view seems unnecessary/misleading in 7X because: - column name can be easily fetched from `pg_partitioned_table` and `pg_attribute`. - w/ the flexibility in 7X, partition columns returned from this view may come from differerent partitions. Not sure if this is what users expect. - partition key can be expressions rather than columns ### 2. Syntatic change #### 2.1 `FOR RANK` is not valid syntax in 7X Instead use `FOR VALUE` in `ALTER TABLE ALTER PARTITION` clause to refer to a child partition. Because of the removal of the `FOR RANK` syntax, `pg_partitions.partitionrank` is not required by many user queries, hence neither calculated nor stored in the catalog. For range partition maintenance purposes, such as finding the highest or lowest partition boundries, we may still want to sort partitions by their partition boundaries, for those cases, we can evaluate the usage and create UDFs based on 7X catalog. #### 2.2 `pg_dump` and `pg_restore` use modern syntax Keeping `pg_partitions.partitionboundary` which stores partition boundary specification in classic syntax may be unnecessary and even error prone. ### 3. Catalog change #### 3.1 `pg_partitions.partitionposition` is not stored in the catalog For range partitions, `partitionposition` is same as `partitionrank`, they are not stored in the catalog. We can acquire them by writing custom queries or UDFs to sort the range partition boundaries. List partition has problem with it. For list partitions, `partitionposition` monotonically increases as new list partitions being created. This information we don't store in 7X's catalog, and we can't reconstruct either. For hash partitions, which is new in Greenplum, we don't have a definition for `partitionposition` with hash partitions. #### 3.2 Some fields would be dummy - `pg_partitions.partitionstartinclusive`: always true - `pg_partitions.partitionendinclusive`: always false - `pg_partitions.partitioneveryclause`: always NULL #### 3.3 `pg_class.relpartbound` stores partition boundaries in 7X Example in 7X: ``` test=# select relname, pg_get_expr(relpartbound, c.oid) as relpartbound from pg_class c join pg_partition_tree('partrl') p on c.oid = p.relid; relname | relpartbound ------------------------------+------------------------------ partrl | partrl_1_prt_p1_1 | FOR VALUES FROM (10) TO (15) partrl_1_prt_p1_2 | FOR VALUES FROM (15) TO (20) partrl_1_prt_p2 | FOR VALUES FROM (0) TO (10) partrl_1_prt_p1_1_2_prt_sp1 | FOR VALUES IN (1, 2) partrl_1_prt_p1_2_2_prt_sp1 | FOR VALUES IN (1, 2) partrl_1_prt_p2_2_prt_others | DEFAULT partrl_1_prt_p2_2_prt_sp2 | FOR VALUES IN (3, 4) partrl_1_prt_p2_2_prt_sp1 | FOR VALUES IN (1, 2) (9 rows) ``` from which `partitionrangestart`, `partitionrangeend`, `partitionlistvalues` can be parsed, if needed. #### 3.4 `pg_partitions.partitionlevel` is one off Taking `partrl` as an example: In 7X, root partition is at level 0 and leaf partitions are at level 2: ``` test=# select * from pg_partition_tree('partrl'); relid | parentrelid | isleaf | level ------------------------------+-------------------+--------+------- partrl | | f | 0 partrl_1_prt_p1_1 | partrl | f | 1 partrl_1_prt_p1_2 | partrl | f | 1 partrl_1_prt_p2 | partrl | f | 1 partrl_1_prt_p1_1_2_prt_sp1 | partrl_1_prt_p1_1 | t | 2 partrl_1_prt_p1_2_2_prt_sp1 | partrl_1_prt_p1_2 | t | 2 partrl_1_prt_p2_2_prt_others | partrl_1_prt_p2 | t | 2 partrl_1_prt_p2_2_prt_sp2 | partrl_1_prt_p2 | t | 2 partrl_1_prt_p2_2_prt_sp1 | partrl_1_prt_p2 | t | 2 (9 rows) ``` In 6X, `pg_partitions` doesn't have an entry for root table `partrl`, and `partitionlevel` starts from 0 for root table's direct children and 1 for the leaf partitions: ``` test=# select partitiontablename, parentpartitionname, tablename, partitionlevel from pg_partitions where partitiontablename in ('partrl_1_prt_p1_1', 'partrl_1_prt_p1_1_2_prt_sp1'); partitiontablename | parentpartitionname | tablename | partitionlevel -----------------------------+---------------------+-----------+---------------- partrl_1_prt_p1_1 | | partrl | 0 partrl_1_prt_p1_1_2_prt_sp1 | p1_1 | partrl | 1 (2 rows) ``` So `pg_partitions.partitionlevel` and `level` returned from `pg_partition_tree()` is one off, which is not a problem, but can be confusing. ### Summary #### pg_partition_columns schema of this view is problematic for - heterogeneous partition hierachy, AND - expression as partition key(s) #### pg_partitions Unavailable fields: - partitionname - parentpartitionname - partitionposition (List and Hash partitions) Dummy fields: - partitionstartinclusive - partitionendinclusive - partitioneveryclause Potential irrelavent field: - partitionboundary Potential misleading field: - partitionlevel ### [WIP] What do we really need? (What is currently missing in 7X) 1. Given a child table and its parent table, extract following fine-grained range partition information from `pg_class.relpartbound` for partition maintenance: - `partitionrank` - `partitionrangestart` - `partitionrangeend` - `partitionisdefault` 2. Looke more into what is needed for list partition maintenance, `pg_class.relpartbound` has them: - `partitionlistvalues` - `partitionisdefault` 3. Given a root table, maybe we can have a function which tells weather or not the partition hierachy is homogeneous. 4. Use pg_dump to get DDLs instead of pg_get_partition_def() 5. Get other trivial information, such as schema, tablespace, attributenames elsewhere. ### [TODO] Partition maintenance alternatives Need to look more into pg_partman. ### Appendix #### pg_partitions ``` test=# \d pg_partitions View "pg_catalog.pg_partitions" Column | Type | Modifiers --------------------------+----------+----------- schemaname | name | tablename | name | partitionschemaname | name | partitiontablename | name | partitionname | name | parentpartitiontablename | name | parentpartitionname | name | partitiontype | text | partitionlevel | smallint | partitionrank | bigint | partitionposition | smallint | partitionlistvalues | text | partitionrangestart | text | partitionstartinclusive | boolean | partitionrangeend | text | partitionendinclusive | boolean | partitioneveryclause | text | partitionisdefault | boolean | partitionboundary | text | parenttablespace | name | partitiontablespace | name | ``` Easy to get, naming convention may not be consistant w/ new catalog: `schemaname`: root table's schema name `tablename`: root table's table name `partitionschemaname`: this partition's schema name `partitionname`: this partitions table name `parentpartitiontablename`: parent table's schema name is parent is not root `parentpartitionname`: parent table's table name if parent is not root `partitiontype`: parent's partition by strategy `partitionlevel`: parent's partition level `partitionposition`: `partitionlistvalues`: parse text from list partition's relpartbound `partitionrangestart`: parse text from range partition's relpartbound `partrangestartinclusive`: if not null always true `partitionrangeend`: parse text from range partition's relpartbound `partitionrangeinclusive`: if not null always false `partitionisdefault`: parse from relpartbound `parenttablespce`: `partitiontablespace`: Missing in new catalog tables but can write UDF: `partitionrank`: for range partition only, starting from 1. index of this partition after ordering the range partition boundaries `partitionposition`: same as `partitionrank` for range partitions, monotonically increasing number for list partitions Need to understand usage: `partitionboundary`: convert pg_get_expr(relpartbound) to classic syntax #### pg_partition_columns ``` test=# \d pg_partition_columns View "pg_catalog.pg_partition_columns" Column | Type | Modifiers ---------------------------+----------+----------- schemaname | name | tablename | name | columnname | name | partitionlevel | smallint | position_in_partition_key | integer | ``` `schemaname`: root table's schema name `tablename`: root table's table name `columnname`: partition key `partitionlevel`: start from 0 `position_in_partition_key`: always 0 for range partition and can be greater than 0 for list partition. #### Test tables ```sql -- Classic Syntax -- Multi-level range-list partitioned table CREATE TABLE partrl (a int, b int, c int) DISTRIBUTED BY (a) PARTITION BY range(b) SUBPARTITION BY list(c) ( PARTITION p1 START (10) END (20) EVERY (5) ( SUBPARTITION sp1 VALUES (1, 2) ), PARTITION p2 START (0) END (10) ( SUBPARTITION sp2 VALUES (3, 4), SUBPARTITION sp1 VALUES (1, 2), DEFAULT SUBPARTITION others ) ); -- Modern Syntax hetrogeneous example -- Multi-level range-<list|range> partitioned table CREATE TABLE partrlr (a int, b int, c int) DISTRIBUTED BY (a) PARTITION BY range(b); -- First child on level 1 is further partitioned by list on column c. CREATE TABLE partrlr_p1 PARTITION OF partrlr FOR VALUES FROM (10) TO (15) PARTITION BY list(c); CREATE TABLE partrlr_p1_sp1 PARTITION OF partrlr_p1 FOR VALUES IN (1, 2); -- Second child on level 1 is further partition by range on column (a). CREATE TABLE partrlr_p2 PARTITION OF partrlr FOR VALUES FROM (0) TO (10) PARTITION BY range(a); CREATE TABLE partrlr_p2_sp1 PARTITION OF partrlr_p2 FOR VALUES FROM (3) TO (4); CREATE TABLE partrlr_p2_sp2 PARTITION OF partrlr_p2 FOR VALUES FROM (1) TO (2); CREATE TABLE partrlr_p2_others PARTITION OF partrlr_p2 DEFAULT; -- Classic Syntax -- Multi-level range-range partitioned table CREATE TABLE partrr (a int, b int, c int) DISTRIBUTED BY (a) PARTITION BY range(b) SUBPARTITION BY range(c) ( PARTITION p1 START (0) END (10) ( DEFAULT SUBPARTITION other_c ), PARTITION p2 START (10) END (20) ( SUBPARTITION sp2 START (300) END (600), SUBPARTITION sp1 START (100) END (300), DEFAULT SUBPARTITION other_c ) ); -- Sales CREATE TABLE sales (trans_id int, date date, amount decimal(9,2), region text) DISTRIBUTED BY (trans_id) PARTITION BY RANGE (date) SUBPARTITION BY LIST (region) SUBPARTITION TEMPLATE ( SUBPARTITION usa VALUES ('usa'), SUBPARTITION asia VALUES ('asia'), SUBPARTITION europe VALUES ('europe'), DEFAULT SUBPARTITION other_regions) (START (date '2023-01-01') INCLUSIVE END (date '2023-05-01') EXCLUSIVE, DEFAULT PARTITION other_date); ```