ruck_rover
The ruck and rover are responsible for monitoring the periodic ci-config jobs.
Monitoring:
Responsibility
How to find the job status per build:
* https://trunk.rdoproject.org/api-centos8-train/api/civotes_agg_detail.html?ref_hash=$aggregated_hash
* aggregated_hash = contents of https://trunk.rdoproject.org/centos8-train/current-tripleo/delorean.repo.md5
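A minimal shell sketch of the lookup above, assuming the delorean.repo.md5 file holds only the aggregated hash. The function names civotes_url and fetch_aggregated_hash are made up for illustration and are not part of any existing tool.

```shell
# Sketch: build the civotes detail URL for a release from its aggregated hash.
# civotes_url and fetch_aggregated_hash are hypothetical helper names.

# Print the civotes detail page URL for a release (e.g. centos8-train) and hash.
civotes_url() {
    release="$1"
    hash="$2"
    printf 'https://trunk.rdoproject.org/api-%s/api/civotes_agg_detail.html?ref_hash=%s\n' \
        "$release" "$hash"
}

# Print the hash currently published under current-tripleo for a release.
# Needs curl and network access to trunk.rdoproject.org.
fetch_aggregated_hash() {
    release="$1"
    curl -sSf "https://trunk.rdoproject.org/${release}/current-tripleo/delorean.repo.md5"
}

# Usage (needs network):
#   civotes_url centos8-train "$(fetch_aggregated_hash centos8-train)"
```

Other releases should work the same way if they follow the same URL layout as centos8-train, but that is an assumption to verify per release.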
ruck/rover primer: https://docs.openstack.org/tripleo-docs/latest/ci/ruck_rover_primer.html
Promoter Servers:
master / ussuri: promoter.rdoproject.org
other releases: http://38.102.83.109/
Cockpit: http://tripleo-cockpit.usersys.redhat.com/d/9DmvErfZz/cockpit?orgId=1
Internal Cockpit (WIP): http://tripleo-cockpit.usersys.redhat.com/?orgId=1
http://cistatus.tripleo.org/
https://trello.com/b/j4IcIomh/production-chain-escalation
http://rhos-release.virt.bos.redhat.com:3030/rhosp
Debugging Tools https://docs.google.com/document/d/1VZhje7ZN9sk4E31fYVrPxpqMJGz5ZhHRfhte_RYMXxg/edit#
Review.rdoproject.org dashboard: https://review.rdoproject.org/grafana/?orgId=1&var-datasource=default&var-server=registry.rdoproject.org.rdocloud&var-inter=$__auto_interval_inter
CentOS pre-release rpm updates for minor releases http://mirror.centos.org/centos/7/cr/x86_64/Packages/
hackmd.io rh-openstack-dev
https://hackmd.io/team/rh-openstack-ci?nav=overview
Internal software factory: https://sf.hosted.upshift.rdu2.redhat.com
upstream rsync mirror logs: files.openstack.org/mirror/logs/rsync-mirrors/centos.log
logs of all mirroring processes are available at:
https://static.opendev.org/mirror/logs/
mirroring driven by:
https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mirror-update
RDO FTBFS (fails to build from source):
https://review.rdoproject.org/r/q/topic:rdo-FTBFS
TRELLO RETROSPECTIVE https://trello.com/b/0VFswmht/rdo-infra-retrospective?menu=filter&filter=label:UniSprint21
Internal Dashboard - https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/QE/view/OSP16/ OSP-10 - OSP-16
RHOS INFRA INFRARED ISSUES https://projects.engineering.redhat.com/issues/?filter=34183
CIX escalation https://mojo.redhat.com/docs/DOC-1098748#jive_content_id_CIX_Escalation_Automation_and_email_format
CIX board https://trello.com/b/j4IcIomh/production-chain-escalation
Nodepool image logs: https://softwarefactory-project.io/nodepool-log/
Upstream Mirror Monitors
http://cacti.openstack.org/cacti/graph_view.php
Upstream zuul monitors
http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1
Upstream qcow2 nodepool images
https://nb02.opendev.org/images/
Job info script (ooocijobs): https://github.com/marios/tripleo_ruck_job_tool
Components definitions
https://github.com/redhat-openstack/rdoinfo/blob/master/rdo.yml
Where's my patch:
https://docs.google.com/document/d/1He49bqXcgAWTadB_HmiUWN-xwSAIkYMJHm142SW_N1E/edit#heading=h.7lwp6dyqupip
Outages:
https://url.corp.redhat.com/internal-outtage
Engineer On Duty (infra):
https://tree.taiga.io/project/morucci-software-factory/epic/1550
https://url.corp.redhat.com/internal-container-registry-down
Infrared gerrit: https://review.gerrithub.io/q/project:redhat-openstack/infrared
Infrared doc: https://infrared.readthedocs.io/en/latest/
Definitive Documentation
https://docs.openstack.org/tripleo-docs/latest/ci/ruck_rover_primer.html#tools
Log Reduce
Notes: https://drive.google.com/drive/u/1/folders/1TSiA940FOvaY_kEoy4WPKLAH-WcrwU0n
Example:
http://logs.rdoproject.org/06/587306/8/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035/a470fce/report.html#
Sova:
http://cistatus.tripleo.org/
https://github.com/sshnaidm/sova/blob/master/tripleoci/data/patterns.yml
Consolidated Error File:
Example:
https://logs.opendev.org/86/674686/4/check/tripleo-ci-centos-7-containers-multinode/6afce9c/logs/undercloud/var/log/extra/errors.txt.txt.gz
Look in $server/var/log/extra/errors.txt of any tripleo job if the above link has expired.
Instructions are also in the logs dir of any job.
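A small sketch for locating the consolidated error file of a job, assuming the log tree follows the same layout as the example link (logs/$node/var/log/extra/errors.txt.txt.gz). The function name errors_file_url is hypothetical, and the file suffix may differ between log servers.

```shell
# Sketch: derive the consolidated error file URL from a job's log base URL.
# errors_file_url is a hypothetical helper; the path layout is copied from
# the example link and may not hold for every job or log server.
errors_file_url() {
    base="${1%/}"              # drop one trailing slash, if present
    node="${2:-undercloud}"    # default to the undercloud node
    printf '%s/logs/%s/var/log/extra/errors.txt.txt.gz\n' "$base" "$node"
}

# Usage (needs network; --compressed asks curl to decode a compressed response):
#   curl -sSL --compressed "$(errors_file_url "$JOB_LOG_URL")" | less
```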
Elastic Recheck:
Status: http://status.openstack.org/elastic-recheck/
https://opendev.org/opendev/elastic-recheck/src/branch/master/queries
Check the top 5-10 issues for infra or tripleo-related issues
e.g.
Infra example http://status.openstack.org/elastic-recheck/#1708704
TQ example http://status.openstack.org/elastic-recheck/#1686542
See the logstash link in the ER page
Patterns in failures can be tracked and monitored using elastic recheck
E.g. http://status.openstack.org/elastic-recheck/#1708832
Each bug's query is defined in the queries directory linked above (e.g. queries/1708832.yaml)
How-To:
git clone https://github.com/openstack-infra/elastic-recheck
See doc
virtualenv foo; source foo/bin/activate
pip install -r requirements.txt; python setup.py install
elastic-recheck-query queries/1708832.yaml (or any other query yaml file)
Note: you can change the query message to any text to look for patterns in CI.
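A sketch of trying a custom message, assuming query files consist of a single `query:` key holding a Lucene-style string, as the files in the queries directory appear to. The helper name write_er_query and the output path are made up for illustration.

```shell
# Sketch: write a throwaway elastic-recheck query file for a custom message.
# write_er_query is a hypothetical helper. The message must not contain
# double quotes, since it is placed inside a quoted Lucene phrase.
write_er_query() {
    path="$1"
    message="$2"
    cat > "$path" <<EOF
query: >-
  message:"${message}" AND
  tags:"console"
EOF
}

# Usage (inside the virtualenv created in the steps above):
#   write_er_query /tmp/my-query.yaml "No more mirrors to try"
#   elastic-recheck-query /tmp/my-query.yaml
```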
Example addition to recheck
http://pastebin.test.redhat.com/514075
https://review.openstack.org/#/c/498766/
https://review.openstack.org/#/c/493525/
Project that tracks upstream infra issues:
https://review.opendev.org/q/project:opendev/elastic-recheck
LogStash
Example:
http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:"[Errno 256] No more mirrors to try" AND tags:"console"&from=864000s
Test Project:
Clone testproject from rdo softwarefactory
https://review.rdoproject.org/r/testproject.git
cd testproject
vi zuul.yaml
Weshay ( current run ) ssh zuul@38.145.33.185
Sova is now checking for SELinux denials, and we'll see them in the cockpit monitoring under failed_reason. These denials do not actually fail jobs because upstream runs SELinux in permissive mode. It's good that Sova warns us, because bugs still need to be opened.
tripleo-ci-centos-8-containers-multinode/dd8f53e/logs/$node/var/log/extra/denials.txt
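A rough sketch for triaging a downloaded denials.txt, assuming it holds raw audit AVC lines (with `avc:  denied` and a `scontext=user:role:type:level` field). The function name summarize_denials is hypothetical. Grouping by source type may help decide which component a bug belongs to, though that mapping is a judgment call.

```shell
# Sketch: count AVC denials per source context type, most frequent first.
# summarize_denials is a hypothetical helper; it assumes audit-style lines.
summarize_denials() {
    grep -E 'avc: +denied' "$1" \
        | sed -n 's/.*scontext=[^: ]*:[^: ]*:\([^: ]*\).*/\1/p' \
        | sort | uniq -c | sort -rn
}

# Usage:
#   summarize_denials denials.txt
```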
In bugzilla
Product: Red Hat Enterprise Linux 8 (match the CentOS version)
Component: selinux-policy or container-selinux
Example:
https://bugzilla.redhat.com/show_bug.cgi?id=1883980
https://bugzilla.redhat.com/show_bug.cgi?id=1883990
sudo podman run --net=host ${IMAGE_ID} rpm -qa
https://source.redhat.com/groups/public/openstack/openstack_wiki/red_hat_open_stack_organization