# Debug tempest timeout in baremetal jobs #### Issue:- Tempest is timing out on fs001 running on real baremetal job in 16.2(component pipeline) #### Impacted jobs: * periodic-tripleo-ci-rhel-8-bm_envB-3ctlr_1comp-featureset001-baremetal-rhos-16.2 * periodic-tripleo-ci-rhel-8-bm_envC-3ctlr_1comp-featureset001-tripleo-rhos-16.2 This is happening from a while now, [Build history](https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/builds?job_name=periodic-tripleo-ci-rhel-8-bm_envB-3ctlr_1comp-featureset001-baremetal-rhos-16.2) , Baremetal jobs are currently not in 16.2 critera, Removed after they faced some infra issue last month, but we would like to add them back in criteria. https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-baremetal/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-bm_envB-3ctlr_1comp-featureset001-baremetal-rhos-16.2/03f88a7/job-output.txt ~~~ 2021-09-23 02:39:36.277080 | primary | TASK [os_tempest : Execute tempest tests] ************************************** 2021-09-23 02:39:36.277088 | primary | Thursday 23 September 2021 02:39:36 -0400 (0:00:00.053) 0:04:44.480 **** 2021-09-23 06:41:13.961303 | RUN END RESULT_TIMED_OUT: [untrusted : opendev.org/openstack/tripleo-ci/playbooks/tripleo-ci/run-v3.yaml@master] 2021-09-23 06:41:13.986879 | POST-RUN START: [trusted : code.engineering.redhat.com/openstack/tripleo-ci-internal-config/playbooks/dlrn/post-dlrn.yaml@master] ~~~ ##### Observation:- * We can see TASK [os_tempest : Execute tempest tests] * started at **2021-09-23 02:39:36** * Timed out after ~4 hours at **2021-09-23 06:41:13** But if we take a look at tempest.log we can see tempest run logs only started at **2021-09-23 06:39** (this may be because of timezone difference between nodepool node and undercloud node) https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-baremetal/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-bm_envB-3ctlr_1comp-featureset001-baremetal-rhos-16.2/03f88a7/logs/undercloud/home/zuul/tempest/tempest.log ~~~ 2021-09-23 06:39:27.856 535205 INFO tempest [-] Using tempest config file /home/zuul/tempest/etc/tempest.conf ~~~ No error observed in tempest_run.log, but we can see tempest_run was not completed till logs were collected:- https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-baremetal/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-bm_envB-3ctlr_1comp-featureset001-baremetal-rhos-16.2/03f88a7/logs/undercloud/var/log/tempest/tempest_run.log While checking directly on baremetal node, we can see that tempest run was actually completing successfully, and it didn't took much longer only ~15 mins. ~~~ [root@undercloud tempest]# cat tempest_run.log {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_provider_create [2.459328s] ... ok {1} keystone_tempest_plugin.tests.api.identity.v3.test_mapping_rules.MappingRulesTest.test_mapping_rule_update [2.964076s] ... ok {1} keystone_tempest_plugin.tests.api.identity.v3.test_mapping_rules.MappingRulesTest.test_mapping_rules_create [0.567365s] ... ok {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_provider_create_with_bad_attributes [1.314761s] ... ok {1} keystone_tempest_plugin.tests.api.identity.v3.test_mapping_rules.MappingRulesTest.test_mapping_rules_create_without_mandatory_attributes_fails [0.480817s] ... ok {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_provider_create_with_enabled_true [1.131471s] ... ok {1} keystone_tempest_plugin.tests.api.identity.v3.test_mapping_rules.MappingRulesTest.test_mapping_rules_get [1.129647s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_add_protocol_to_identity_provider [4.619346s] ... ok {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_provider_create_with_relay_state_prefix [0.911674s] ... ok {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_provider_create_without_mandatory_attributes [0.377334s] ... ok {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_provider_get [0.672732s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_add_protocol_to_identity_provider_unknown_mapping [2.303053s] ... ok {1} keystone_tempest_plugin.tests.api.identity.v3.test_mapping_rules.MappingRulesTest.test_mapping_rules_list [2.811598s] ... ok {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_provider_list [2.188552s] ... ok {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_provider_update [1.465505s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_get_protocol_from_identity_provider [3.710659s] ... ok {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_provider_update_with_bad_attributes_fails [0.999335s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_identity_provider_create [1.244634s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_identity_provider_create_with_enabled_true [1.864970s] ... ok {2} keystone_tempest_plugin.tests.api.identity.v3.test_service_providers.ServiceProvidersTest.test_service_providers_in_token [3.208134s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_identity_provider_create_with_remote_ids [1.399431s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_identity_provider_get [1.958122s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_identity_provider_list [5.111974s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_identity_provider_update [2.031424s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_list_protocols_from_identity_provider [5.307588s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_update_mapping_from_identity_provider_protocol [4.324410s] ... ok {3} keystone_tempest_plugin.tests.api.identity.v3.test_identity_providers.IndentityProvidersTest.test_update_protocol_from_identity_provider_unknown_mapping [4.544826s] ... ok {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_connectivity_between_vms_on_different_networks [173.841089s] ... ok {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_hotplug_nic [103.662164s] ... ok {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_mtu_sized_frames [86.420564s] ... ok {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops [167.981590s] ... ok {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_preserve_preexisting_port [58.848144s] ... ok {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_router_rescheduling ... SKIPPED: Skipped because network extension: l3_agent_scheduler is not enabled {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details [97.361447s] ... ok {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_instance_port_admin_state [119.400239s] ... ok {0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_router_admin_state [92.592431s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_authorize_request_token [2.217951s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_create_access_token [2.438565s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_create_and_show_consumer [0.982271s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_create_request_token [0.792249s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_delete_consumer [1.415031s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_list_access_tokens [4.302080s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_list_consumers [1.646343s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_list_roles_for_access_token [2.492090s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_revoke_access_token [2.693355s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_show_role_for_access_token [2.727612s] ... ok {0} keystone_tempest_plugin.tests.api.identity.v3.test_oauth1_tokens.OAUTH1TokensTest.test_update_consumer [1.164047s] ... ok ====== Totals ====== Ran: 47 tests in 936.9758 sec. - Passed: 46 - Skipped: 1 - Expected Fail: 0 - Unexpected Success: 0 - Failed: 0 Sum of execute time for each test: 984.0829 sec. ============== Worker Balance ============== - Worker 0 (20 tests) => 0:15:27.901148 - Worker 1 (5 tests) => 0:00:07.955639 - Worker 2 (10 tests) => 0:00:14.734064 - Worker 3 (12 tests) => 0:00:38.430176 ~~~ * Last modified date/time on var/log/tempest_run.log also confirm it took ~15 mins. ~~~ ls -al tempest_run.log -rw-r--r--. 1 root root 7029 Sep 23 06:55 tempest_run.log ~~~ * I have also executed tempest tests manually and it tool ~16 mins:- ~~~ $ tempest run --concurrency 4 --serial --exclude-list tempest_excludelist.txt --include-list tempest_includelist.txt . . ====== Totals ====== Ran: 47 tests in 970.7604 sec. - Passed: 46 - Skipped: 1 - Expected Fail: 0 - Unexpected Success: 0 - Failed: 0 Sum of execute time for each test: 963.2943 sec ~~~ # Comparision 1) If we compare BM job against OVB fs001 tempest run is starting as soon as TASK [os_tempest : Execute tempest tests] runs. https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-baremetal/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-ovb-3ctlr_1comp-featureset001-internal-baremetal-rhos-16.2/6255192/job-output.txt ~~~ 2021-09-23 00:59:52.158176 | primary | TASK [os_tempest : Execute tempest tests] ************************************** 2021-09-23 00:59:52.158183 | primary | Thursday 23 September 2021 00:59:52 -0400 (0:00:00.049) 3:30:58.155 **** 2021-09-23 01:16:14.304582 | primary | ok: [undercloud] ~~~ https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-baremetal/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-ovb-3ctlr_1comp-featureset001-internal-baremetal-rhos-16.2/6255192/logs/undercloud/home/zuul/tempest/tempest.log.txt.gz ~~~ 2021-09-23 00:59:45.221 595912 INFO tempest [-] Using tempest config file /home/zuul/tempest/etc/tempest.conf ~~~ 2) If we compare against 17 working BM job. We see same behavior that undercloud node time logging is 4 hours ahead, But TASK [os_tempest : Execute tempest tests] completes within ~15 mins. ~~~ 2021-09-23 02:15:47.945002 | primary | TASK [os_tempest : Execute tempest tests] ************************************** 2021-09-23 02:15:47.945011 | primary | Thursday 23 September 2021 02:15:47 -0400 (0:00:00.049) 0:04:35.483 **** 2021-09-23 02:31:55.413165 | primary | ok: [undercloud] ~~~ ~~~ 2021-09-23 06:15:39.762 540823 INFO tempest [-] Using tempest config file /home/zuul/tempest/etc/tempest.conf ~~~ # What's done:- * Confirmed manually tempest test run is not slow, completing within ~15mins * Possible explanation of 4 hours difference between "TASK [os_tempest : Execute tempest tests]" and tempest.log start time. * EDT and UTC have difference of 4 hours. * By default, the nodepool provider node we get have time in EDT * During undercloud/standalone we set timezone to UTC ~~~ 2021-09-23 03:51:18 | 2021-09-23 03:51:18.888076 | 00137eea-76f1-4447-45de-000000000606 | TASK | Set timezone to UTC 2021-09-23 03:51:19 | 2021-09-23 03:51:19.604711 | 00137eea-76f1-4447-45de-000000000606 | OK | Set timezone to UTC | undercloud ~~~