Reporte EPEC GC
===
###### tags: `epec_gc` `reporte`
:::info
- **Info:** Lista de los intentos para migrar la DB.
- **Date:** 27/08/2020
:::
## Estado de los planes
- :o: : abortada
- :x: : error
- :construction: en progreso
- :exclamation: : descartado
- :hourglass: : pendiente
- :ok: : implementado y operativo
|numero | plan |estado |
|-------|----------|--------|
|1 | mover la db al chasis nuevo | :exclamation: |
|2 | generar dump de la db sin la telemetria y exportarla en la nueva db | :o: |
|3 | generar dump de la db sin las tablas de mayor peso y exportarla en la nueva db | :x: |
|4 | hacer el restore de un backup existente | :o: |
|5 | borrar datos de la db vieja | :ok: |
|6 | arrancar con una db “limpia” | :o: |
|7 | cambiar la IP usada para conectarse a la DB | :x: |
|8 | hacer uqe "alguien" copie manualmente con un disco | :exclamation: |
|9 | pedir a epec que haga la copia del volumen entre servers | :construction: |
|10| hacer un restore selectivo de un backup | :x: |
|11| hacer la restauracion en una maquina | :o: |
|12| hacer snapshot de la VM | :exclamation: |
## Estado de la Base de datos
::: info
**Tamaño en disco**: 542 GB
:::spoiler resultado comando `du`
```bash
ascentio@tmov03:/opt/thingsboard-db$ sudo du -hs *
100G backups
20K changesets
542G data
4.0K docker-compose.yml
4.0K thingsboard-db.service
ascentio@tmov03:/opt/thingsboard-db$
```
:::
::: info
**tamaño de las tablas de la base de datos (las 4 mayores)**
|table_name |row_estimate|total |index |table |
|-----------------------|------------|----------|----------|-------|
|ts_kv |1.19997E+9 |397 GB |252 GB |145 GB |
|event |7.399E+7 |139 GB |18 GB |110 GB |
|ts_kv_latest |2.58909E+6 |2643 MB |1271 MB |1373 MB|
|attribute_kv |327984 |1659 MB |643 MB |1017 MB|
:::spoiler resultado completo
|table_name |row_estimate|total |index |table |
|-----------------------|------------|----------|----------|-------|
|ts_kv |1.19997E+9 |397 GB |252 GB |145 GB |
|event |7.399E+7 |139 GB |18 GB |110 GB |
|ts_kv_latest |2.58909E+6 |2643 MB |1271 MB |1373 MB|
|attribute_kv |327984 |1659 MB |643 MB |1017 MB|
|alarm |5835 |815 MB |24 MB |204 MB |
|relation |56013 |18 MB |10104 kB |8360 kB|
|audit_log |7271 |3568 kB |368 kB |3112 kB|
|pg_statistic |917 |2608 kB |136 kB |1840 kB|
|widget_type |71 |1888 kB |16 kB |136 kB |
|device_credentials |4037 |1416 kB |232 kB |1176 kB|
|pg_depend |6490 |1200 kB |712 kB |488 kB |
|device |4037 |1032 kB |240 kB |784 kB |
|pg_proc |2961 |992 kB |344 kB |640 kB |
|pg_attribute |3224 |720 kB |216 kB |504 kB |
|pg_rewrite |121 |616 kB |32 kB |128 kB |
|pg_description |3130 |544 kB |184 kB |352 kB |
|dashboard |11 |336 kB |16 kB |48 kB |
|component_descriptor |42 |304 kB |72 kB |168 kB |
|pg_collation |705 |296 kB |80 kB |216 kB |
|pg_class |441 |248 kB |120 kB |128 kB |
|pg_operator |788 |240 kB |88 kB |152 kB |
|pg_amop |722 |200 kB |112 kB |88 kB |
|pg_type |451 |192 kB |72 kB |120 kB |
|pg_amproc |458 |128 kB |64 kB |64 kB |
|pg_constraint |31 |128 kB |80 kB |40 kB |
|user_credentials |11 |112 kB |64 kB |40 kB |
|pg_index |187 |104 kB |32 kB |72 kB |
|pg_conversion |132 |104 kB |48 kB |56 kB |
|sql_features |671 |96 kB |0 bytes |88 kB |
|pg_opclass |134 |88 kB |32 kB |56 kB |
|tb_user |11 |80 kB |32 kB |40 kB |
|pg_cast |226 |80 kB |32 kB |48 kB |
|pg_ts_config_map |304 |80 kB |32 kB |48 kB |
|pg_opfamily |116 |80 kB |32 kB |48 kB |
|pg_language |4 |72 kB |32 kB |40 kB |
|pg_auth_members |3 |72 kB |32 kB |40 kB |
|pg_ts_template |5 |72 kB |32 kB |40 kB |
|pg_database |5 |72 kB |32 kB |40 kB |
|pg_tablespace |2 |72 kB |32 kB |40 kB |
|pg_authid |9 |72 kB |32 kB |40 kB |
|pg_namespace |6 |72 kB |32 kB |40 kB |
|pg_ts_parser |1 |72 kB |32 kB |40 kB |
|pg_ts_dict |16 |72 kB |32 kB |40 kB |
|pg_extension |1 |72 kB |32 kB |40 kB |
|pg_am |6 |72 kB |32 kB |40 kB |
|pg_ts_config |16 |72 kB |32 kB |40 kB |
|pg_shdepend |9 |72 kB |32 kB |40 kB |
|tenant |2 |64 kB |16 kB |40 kB |
|rule_node |23 |64 kB |16 kB |40 kB |
|rule_chain |5 |64 kB |16 kB |40 kB |
|widgets_bundle |12 |64 kB |16 kB |40 kB |
|pg_aggregate |138 |64 kB |16 kB |48 kB |
|customer |1 |64 kB |16 kB |40 kB |
|pg_init_privs |162 |64 kB |16 kB |48 kB |
|admin_settings |2 |64 kB |16 kB |40 kB |
|pg_shdescription |3 |64 kB |16 kB |40 kB |
|pg_range |6 |56 kB |16 kB |40 kB |
|pg_pltemplate |8 |56 kB |16 kB |40 kB |
|databasechangeloglock |1 |56 kB |16 kB |40 kB |
|databasechangelog |5 |48 kB |0 bytes |40 kB |
|sql_languages |4 |48 kB |0 bytes |40 kB |
|sql_implementation_info|12 |48 kB |0 bytes |40 kB |
|sql_sizing |23 |48 kB |0 bytes |40 kB |
|sql_parts |9 |48 kB |0 bytes |40 kB |
|sql_packages |10 |48 kB |0 bytes |40 kB |
|pg_trigger |0 |32 kB |24 kB |0 bytes|
|pg_statistic_ext |0 |32 kB |24 kB |0 bytes|
|pg_attrdef |0 |24 kB |16 kB |0 bytes|
|pg_enum |0 |24 kB |24 kB |0 bytes|
|pg_foreign_server |0 |16 kB |16 kB |0 bytes|
|asset |0 |16 kB |8192 bytes|0 bytes|
|entity_view |0 |16 kB |8192 bytes|0 bytes|
|pg_user_mapping |0 |16 kB |16 kB |0 bytes|
|pg_subscription |0 |16 kB |16 kB |0 bytes|
|pg_inherits |0 |16 kB |16 kB |0 bytes|
|pg_event_trigger |0 |16 kB |16 kB |0 bytes|
|pg_db_role_setting |0 |16 kB |8192 bytes|0 bytes|
|pg_foreign_data_wrapper|0 |16 kB |16 kB |0 bytes|
|pg_policy |0 |16 kB |16 kB |0 bytes|
|pg_replication_origin |0 |16 kB |16 kB |0 bytes|
|pg_default_acl |0 |16 kB |16 kB |0 bytes|
|pg_seclabel |0 |16 kB |8192 bytes|0 bytes|
|pg_shseclabel |0 |16 kB |8192 bytes|0 bytes|
|pg_transform |0 |16 kB |16 kB |0 bytes|
|pg_publication |0 |16 kB |16 kB |0 bytes|
|pg_publication_rel |0 |16 kB |16 kB |0 bytes|
|pg_subscription_rel |0 |8192 bytes|8192 bytes|0 bytes|
|pg_sequence |0 |8192 bytes|8192 bytes|0 bytes|
|pg_partitioned_table |0 |8192 bytes|8192 bytes|0 bytes|
|pg_largeobject_metadata|0 |8192 bytes|8192 bytes|0 bytes|
|pg_largeobject |0 |8192 bytes|8192 bytes|0 bytes|
|sql_sizing_profiles |0 |8192 bytes|0 bytes |0 bytes|
|pg_foreign_table |0 |8192 bytes|8192 bytes|0 bytes|
:::spoiler query sql
```sql
SELECT table_name, row_estimate, pg_size_pretty(total_bytes) AS total
, pg_size_pretty(index_bytes) AS index
, pg_size_pretty(table_bytes) AS table
FROM (
SELECT *, total_bytes-index_bytes-coalesce(toast_bytes,0) AS table_bytes FROM (
SELECT c.oid,nspname AS table_schema, relname AS table_name
, c.reltuples AS row_estimate
, pg_total_relation_size(c.oid) AS total_bytes
, pg_indexes_size(c.oid) AS index_bytes
, pg_total_relation_size(reltoastrelid) AS toast_bytes
FROM pg_class c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE relkind = 'r'
) a ORDER BY total_bytes Desc
) a;
```
:::
::: info
**Datos de los server**
- srv-db:
- ip: 192.168.209.28 y 10.1.45.28
- srv-tb:
- ip: 192.168.209.27 y 10.1.45.27
:::
## Plan 1: mover la db al chasis nuevo
- **tareas**:
- [ ] paso 1: crear la db en la vm
- [ ] paso 2: scp de la VM vieja a la vm nueva
- [ ] paso 3: verificación
- [ ] paso 4: iniciar la lectura de los perfiles
- **Problemas encontrados**:
- Se intento este camino en el primer dia, pero nos dimos cuenta que la tasa de tranferencia es de aprox 1.4MB/s (Megabyte/sec)
- el calculo para la tranferencia llevaria varios dias
::: danger
**Estado**: Este plan no se ejecuto.
:::
## Plan 2: generar dump de la db sin la telemetria y exportarla en la nueva db
- **tareas**:
- [ ] paso 1: generar el dump
- [ ] paso 2: crear la db en la vm
- [ ] paso 3: hacer el restore
- [ ] paso 4: verificación
- [ ] paso 5: iniciar la lectura de los perfiles
- **Problemas encontrados**:
- existen microcortes en la conexion ssh, por lo que se genero un job en jenkins para que se pueda ejecutar
- el proceso lleva 15hs ejecutandose y sigue en proceso
::: warning
**Estado**: En ejecución...
:::
## Plan 3: generar dump de la db sin las tablas de mayor peso y exportarla en la nueva db
- **tareas**:
- [ ] paso 1: generar el dump
- [ ] paso 2: crear la db en la vm
- [ ] paso 3: hacer el restore
- [ ] paso 4: verificación
- [ ] paso 5: iniciar la lectura de los perfiles
- **Problemas encontrados**:
- se realizo el proceso via ssh
- al realizar el dump se encontro un mensaje de error
- microcortes en el proceso de restore
::: danger
**Estado**: Error al hacer la restauracion del dump
::: spoiler detalle del error
```bash
ascentio@tmov04:/opt/thingsboard-db$ sudo docker exec --interactive thingsboard-db_postgres_1 pg_restore -F c -U thingsboard -v -d thingsboard < backups/thingsboard-dump_26_08_2020_local_2.sql.gz
pg_restore: connecting to database for restore
pg_restore: creating TABLE "public.admin_settings"
pg_restore: creating TABLE "public.alarm"
pg_restore: creating TABLE "public.asset"
pg_restore: creating TABLE "public.attribute_kv"
pg_restore: creating TABLE "public.audit_log"
pg_restore: creating TABLE "public.component_descriptor"
pg_restore: creating TABLE "public.customer"
pg_restore: creating TABLE "public.dashboard"
pg_restore: creating TABLE "public.databasechangelog"
pg_restore: creating TABLE "public.databasechangeloglock"
pg_restore: creating TABLE "public.device"
pg_restore: creating TABLE "public.device_credentials"
pg_restore: creating TABLE "public.entity_view"
pg_restore: creating TABLE "public.event"
pg_restore: creating TABLE "public.relation"
pg_restore: creating TABLE "public.rule_chain"
pg_restore: creating TABLE "public.rule_node"
pg_restore: creating TABLE "public.tb_user"
pg_restore: creating TABLE "public.tenant"
pg_restore: creating TABLE "public.user_credentials"
pg_restore: creating TABLE "public.widget_type"
pg_restore: creating TABLE "public.widgets_bundle"
pg_restore: processing data for table "public.admin_settings"
pg_restore: processing data for table "public.alarm"
pg_restore: processing data for table "public.asset"
pg_restore: processing data for table "public.attribute_kv"
pg_restore: processing data for table "public.audit_log"
pg_restore: processing data for table "public.component_descriptor"
pg_restore: processing data for table "public.customer"
pg_restore: processing data for table "public.dashboard"
pg_restore: processing data for table "public.databasechangelog"
pg_restore: processing data for table "public.databasechangeloglock"
pg_restore: processing data for table "public.device"
pg_restore: processing data for table "public.device_credentials"
pg_restore: processing data for table "public.entity_view"
pg_restore: processing data for table "public.event"
pg_restore: [custom archiver] could not read from input file: end of file
ascentio@tmov04:/opt/thingsboard-db$
```
:::
## Plan 4: hacer el restore de un backup existente
- **tareas**:
- [ ] paso 1: scp de la VM vieja a la vm nueva
- [ ] paso 2: restore de la db
- [ ] paso 3: verificación
- [ ] paso 4: iniciar la lectura de los perfiles
- **Problemas encontrados**:
- microcortes, por lo que la tarea se repitio un par de veces
- se genero un job de jenkins para poder completar el restore
- al verificar la restauración, se perdieron los accesos de usuarios
- problemas hasta poder definir el job de jenkins
::: danger
**Estado**: Este plan no se completo.
:::
## Plan 5: borrar datos de la db vieja
- **tareas**:
- [ ] paso 1: descargar backup del mes de julio
- [ ] paso 1: hacer la restauracion
- [ ] paso 2: eliminar los datos de las tablas `events`, `ts_kv` y `ts_kv_latest`
- [ ] paso 3: comprimir el volumen y pos scp subirlo con el nombre de archivo: `epec_gc_wthitout_tk.zip`
- [ ] paso 4: integrar el volumen
- [ ] paso 5: verificación
- [ ] paso 6: hacer una lectura de perfiles para chequear la integración
- [ ] paso 7: hacer una lectura de billing completa
- [ ] paso 6: iniciar la lectura de los perfiles
- **Problemas encontrados**:
- no hemos podido restaurar un backup, por lo que se podria hacer una copia del volumen
- solucion: restauramos ls db en una maquina personal
::: warning
**Estado**: pendiente
:::
## Plan 6: arrancar con una db "limpia"
- **tareas**:
- [X] paso 1: generar una db en ambiente de desarrollo
- [X] paso 2: hacer un backup del ambiente de desarrollo y ponerlo en producción
- [X] paso 3: verificación
- [ ] paso 4: integrar con los gateways
- [ ] paso 5: iniciar lectura de billing para que se carguen los medidores
- [ ] paso 6: migrar las vistas
- [ ] paso 7: iniciar lectura de perfiles desde el 1 de agosto
- **Problemas encontrados**:
-
::: danger
**Estado**: descartado ya que funciono el plan 5 y al querer replicar, notamos que no tenemos el detalle para replicar la VM tal cual esta en produccion. Ejemplo: no tenemos la contraseña del cliente que usa el sistema de facturación
:::
## Plan 7: cambiar la IP usada para conectarse a la DB
- **tareas**:
- [ ] paso 1: actualizar la configuración
- [ ] paso 2: verificación
- [ ] paso 3: iniciar la lectura de los perfiles
- **Problemas encontrados**:
- se repite el error por el que intentamos mover la db
:::spoiler
```bash
tb-gateway_1 | 2020-08-27 12:35:34,306 [nioEventLoopGroup-2-57] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway_1 | 2020-08-27 12:35:37,304 [nioEventLoopGroup-2-59] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway_1 | 2020-08-27 12:35:40,339 [nioEventLoopGroup-2-62] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-04_1 | 2020-08-27 12:35:40,487 [nioEventLoopGroup-2-48] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-06_1 | 2020-08-27 12:35:41,471 [nioEventLoopGroup-2-18] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-04_1 | 2020-08-27 12:35:41,452 [nioEventLoopGroup-2-50] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-07_1 | 2020-08-27 12:35:42,946 [nioEventLoopGroup-2-69] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway_1 | 2020-08-27 12:35:43,252 [nioEventLoopGroup-2-64] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-03_1 | 2020-08-27 12:35:42,686 [nioEventLoopGroup-2-52] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-04_1 | 2020-08-27 12:35:42,839 [nioEventLoopGroup-2-54] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-05_1 | 2020-08-27 12:35:43,228 [nioEventLoopGroup-2-94] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-06_1 | 2020-08-27 12:35:43,551 [nioEventLoopGroup-2-20] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-04_1 | 2020-08-27 12:35:44,530 [nioEventLoopGroup-2-55] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway_1 | 2020-08-27 12:35:45,312 [nioEventLoopGroup-2-66] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-03_1 | 2020-08-27 12:35:45,520 [nioEventLoopGroup-2-54] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-07_1 | 2020-08-27 12:35:45,062 [nioEventLoopGroup-2-97] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-06_1 | 2020-08-27 12:35:46,588 [nioEventLoopGroup-2-22] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-04_1 | 2020-08-27 12:35:47,054 [nioEventLoopGroup-2-59] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway_1 | 2020-08-27 12:35:48,342 [nioEventLoopGroup-2-4] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-03_1 | 2020-08-27 12:35:47,645 [nioEventLoopGroup-2-57] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-05_1 | 2020-08-27 12:35:49,016 [nioEventLoopGroup-2-95] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway_1 | 2020-08-27 12:35:51,424 [nioEventLoopGroup-2-6] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-03_1 | 2020-08-27 12:35:51,574 [nioEventLoopGroup-2-60] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-04_1 | 2020-08-27 12:35:51,052 [nioEventLoopGroup-2-62] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-07_1 | 2020-08-27 12:35:51,638 [nioEventLoopGroup-2-10] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-06_1 | 2020-08-27 12:35:51,864 [nioEventLoopGroup-2-24] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-04_1 | 2020-08-27 12:35:54,052 [nioEventLoopGroup-2-64] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-05_1 | 2020-08-27 12:35:54,624 [nioEventLoopGroup-2-96] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway-07_1 | 2020-08-27 12:35:54,971 [nioEventLoopGroup-2-12] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
tb-gateway_1 | 2020-08-27 12:35:55,314 [nioEventLoopGroup-2-8] WARN o.t.g.s.gateway.MqttGatewayService - Lost connection to ThingsBoard.
```
:::
::: danger
**Estado**: genera el mismo error de conexión
:::
## Plan 8: hacer que "alguien" copie manualmente con un disco
- **tareas**:
- [ ] paso 1: coordinar con alguien de EPEC para que haga una copia a disco de la carpeta y la copie en el segundo server
- [ ] paso 2: integrar la copia con el sistema
- [ ] paso 3: verificación
- [ ] paso 4: iniciar lectura de perfiles
- **Problemas encontrados**:
- dependemos de la capacidad del disco que tenga la gente de epec
-
::: danger
**Estado**: la copia se realizo via remota
:::
## Plan 9: pedir a epec que haga la copia del volumen entre servers
::: danger
**Estado**: se descarta
:::
## Plan 10: hacer un restore selectivo de un backup
::: danger
**Estado**: no se pudo realizar ya que el backup generado es un sql.
:::
## Plan 11: hacer la restauracion en una maquina
- **tareas**:
- [ ] paso 1: restaurar la db en una máquina
- [ ] paso 2: hacer backup del volumen
- [ ] paso 3: verificación
- [ ] paso 4: borrar los datos de telemetria
- [ ] paso 5: mover la DB a produccion
- [ ] paso 6: verificacion en produccion
- [ ] paso 7: iniciar lectura de perfiles desde el 1 de agosto
- **Problemas encontrados**:
-
-
::: danger
**Estado**: se realizo, pero la db restaurada se uso para poder ejecutar el plan 5, ya que al tener menos datos, la db comprimida ocupaba 500MB.
:::
## Plan 12: hacer snapshot de la VM
::: danger
**Estado**: descartado
:::
## Notas
<!-- Other important details discussed during the meeting can be entered here. -->
sss