# ruck / rover README (info and links)

###### tags: `ruck_rover`

## CI-Config job maintenance

* ruck / rover are responsible for monitoring the periodic ci-config jobs.
* Monitoring:
    * http://dashboard-ci.tripleo.org/d/wb8HBhrWk/cockpit?orgId=1&fullscreen&panelId=381
* Responsibility:
    * report a launchpad bug
    * this bug should be put on the sprint board

## How often is $this tempest test failing

http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario

## INFO AND LINKS

How to find the job status per build:

* https://trunk.rdoproject.org/api-centos8-train/api/civotes_agg_detail.html?ref_hash=$aggregated_hash
* aggregated_hash = https://trunk.rdoproject.org/centos8-train/current-tripleo/delorean.repo.md5

ruck/rover primer: https://docs.openstack.org/tripleo-docs/latest/ci/ruck_rover_primer.html

Promoter Servers:

* master / ussuri: promoter.rdoproject.org
* other releases: http://38.102.83.109/

Cockpit: http://tripleo-cockpit.usersys.redhat.com/d/9DmvErfZz/cockpit?orgId=1

Internal Cockpit (WIP): http://tripleo-cockpit.usersys.redhat.com/?orgId=1

http://cistatus.tripleo.org/

https://trello.com/b/j4IcIomh/production-chain-escalation

http://rhos-release.virt.bos.redhat.com:3030/rhosp

Debugging Tools: https://docs.google.com/document/d/1VZhje7ZN9sk4E31fYVrPxpqMJGz5ZhHRfhte_RYMXxg/edit#

Review.rdoproject.org dashboard: https://review.rdoproject.org/grafana/?orgId=1&var-datasource=default&var-server=registry.rdoproject.org.rdocloud&var-inter=$__auto_interval_inter

CentOS pre-release rpm updates for minor releases: http://mirror.centos.org/centos/7/cr/x86_64/Packages/

hackmd.io rh-openstack-dev: https://hackmd.io/team/rh-openstack-ci?nav=overview

Internal software factory: https://sf.hosted.upshift.rdu2.redhat.com

Upstream rsync mirror logs: files.openstack.org/mirror/logs/rsync-mirrors/centos.log

Logs of all mirroring processes are available at: https://static.opendev.org/mirror/logs/

Mirroring is driven
by: https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mirror-update

RDO FTBFS (fails to build from source): https://review.rdoproject.org/r/q/topic:rdo-FTBFS

TRELLO RETROSPECTIVE: https://trello.com/b/0VFswmht/rdo-infra-retrospective?menu=filter&filter=label:UniSprint21

Internal Dashboard - https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/QE/view/OSP16/
OSP-10 - OSP-16

RHOS INFRA INFRARED ISSUES: https://projects.engineering.redhat.com/issues/?filter=34183

CIX escalation: https://mojo.redhat.com/docs/DOC-1098748#jive_content_id_CIX_Escalation_Automation_and_email_format

CIX board: https://trello.com/b/j4IcIomh/production-chain-escalation

Nodepool image logs: https://softwarefactory-project.io/nodepool-log/

Upstream Mirror Monitors: http://cacti.openstack.org/cacti/graph_view.php

Upstream zuul monitors: http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1

Upstream qcow2 nodepool images

#### https://nb01.opendev.org, https://nb02.opendev.org, https://nb04.opendev.org

#### currently centos8 is here: https://nb02.opendev.org/images/

Job info script:
ooocijobs https://github.com/marios/tripleo_ruck_job_tool

Components definitions: https://github.com/redhat-openstack/rdoinfo/blob/master/rdo.yml

Promotion Servers:

* vexxhost queens, stein, train: http://38.102.83.109/
* promoter.rdoproject.org: ussuri + master

Where's my patch: https://docs.google.com/document/d/1He49bqXcgAWTadB_HmiUWN-xwSAIkYMJHm142SW_N1E/edit#heading=h.7lwp6dyqupip

Outages: https://url.corp.redhat.com/internal-outtage

Engineer On Duty (infra): https://tree.taiga.io/project/morucci-software-factory/epic/1550

## saved infra bugs

https://url.corp.redhat.com/internal-container-registry-down

## INFRARED DOC

Infrared gerrit: https://review.gerrithub.io/q/project:redhat-openstack/infrared

Infrared doc: https://infrared.readthedocs.io/en/latest/

## Vexxhost monitoring

https://review.rdoproject.org/analytics/app/kibana#/visualize?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))

## Elastic recheck documentation

:::spoiler Definitive Documentation
https://docs.openstack.org/tripleo-docs/latest/ci/ruck_rover_primer.html#tools

Log Reduce Notes: https://drive.google.com/drive/u/1/folders/1TSiA940FOvaY_kEoy4WPKLAH-WcrwU0n

Example: http://logs.rdoproject.org/06/587306/8/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035/a470fce/report.html#

Sova:

* http://cistatus.tripleo.org/
* https://github.com/sshnaidm/sova/blob/master/tripleoci/data/patterns.yml

Consolidated Error File:

Example: https://logs.opendev.org/86/674686/4/check/tripleo-ci-centos-7-containers-multinode/6afce9c/logs/undercloud/var/log/extra/errors.txt.txt.gz

Look in $server/var/log/extra/errors.txt of any tripleo job if the above link has expired. Instructions are also in the logs dir of any job.

Elastic Recheck:

* Status: http://status.openstack.org/elastic-recheck/
* https://opendev.org/opendev/elastic-recheck/src/branch/master/queries

Elastic Recheck: http://status.openstack.org/elastic-recheck/

Check the top 5-10 issues for infra issues or tripleo related issues, e.g.

* Infra example: http://status.openstack.org/elastic-recheck/#1708704
* TQ example: http://status.openstack.org/elastic-recheck/#1686542

See the logstash link in the ER page.

Patterns in failures can be tracked and monitored using elastic recheck.

E.g. http://status.openstack.org/elastic-recheck/#1708832

Defined in link

How-To:

* `git clone https://github.com/openstack-infra/elastic-recheck` (see the doc)
* `virtualenv foo; source foo/bin/activate`
* `pip install -r requirements.txt; python setup.py install`
* `elastic-recheck-query queries/1708832.yaml` or other yaml file.

Note you can change the query message to any text to look for patterns in ci.

Example addition to recheck:

* http://pastebin.test.redhat.com/514075
* https://review.openstack.org/#/c/498766/
* https://review.openstack.org/#/c/493525/

Project that tracks upstream infra issues: https://review.opendev.org/q/project:opendev/elastic-recheck

* FOR TRIPLEO USE
    * https://review.opendev.org/q/project:openstack/tripleo-ci-health-queries

LogStash Example: http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22%5BErrno%20256%5D%20No%20more%20mirrors%20to%20try%5C%22%20AND%20tags:%5C%22console%5C%22&from=864000s

:::

### test project example

Test Project:

Clone testproject from rdo softwarefactory: https://review.rdoproject.org/r/testproject.git

`cd testproject`

`vi zuul.yaml`

    - project:
        check:
          jobs:
            - periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-stein:
                vars:
                  force_periodic: true
                  remove_ovb_after_job: false

Weshay ( current run ): `ssh zuul@38.145.33.185`

## Selinux denials

Sova is now checking for selinux denials, and we'll see them in the cockpit monitoring under failed_reason.
These denials are not actually failing jobs because upstream is permissive. It's a good thing that sova is warning us because bugs do need to be opened.

### How to find:

![](https://i.imgur.com/8oEPC0L.png)

#### log to view

tripleo-ci-centos-8-containers-multinode/dd8f53e/logs/$node/var/log/extra/denials.txt

#### how to report:

In bugzilla:

* Product: Red Hat Enterprise Linux 8 (match centos version)
* Component: selinux-policy or container-selinux

Example:

* https://bugzilla.redhat.com/show_bug.cgi?id=1883980
* https://bugzilla.redhat.com/show_bug.cgi?id=1883990

## How to get all the rpms installed on a list of containers

`sudo podman run --net=host ${IMAGE_ID} rpm -qa`

## Red Hat openstack organization contacts

https://source.redhat.com/groups/public/openstack/openstack_wiki/red_hat_open_stack_organization
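
Going back to the container rpm listing: the `podman run` command there covers a single image. A minimal sketch of running it over several images could look like the following. The function name `list_container_rpms` and the `PODMAN` override variable are hypothetical and not part of any existing tool, and `--rm` is an addition here so the stopped containers are cleaned up afterwards.

```shell
#!/bin/sh
# Hypothetical helper: print the installed rpms for every image passed as an argument.
# PODMAN is an assumed override variable; it defaults to the "sudo podman" used above.
list_container_rpms() {
    for image in "$@"; do
        # Header line so the output of each image can be told apart.
        echo "== ${image}"
        # Same invocation as the single-image command, plus --rm, with sorted output.
        # PODMAN is left unquoted on purpose so "sudo podman" splits into two words.
        ${PODMAN:-sudo podman} run --rm --net=host "${image}" rpm -qa | sort
    done
}
```

After sourcing the snippet, calling `list_container_rpms` with two or more image IDs or names prints one `== <image>` header per image followed by its sorted package list, which makes the output easy to diff between images.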