
ruck / rover README, (info and links)

tags: ruck_rover

CI-Config job maintenance

How often is $this tempest test failing?

http://status.openstack.org/openstack-health/#/test/tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario

How to find the job status per build:
* https://trunk.rdoproject.org/api-centos8-train/api/civotes_agg_detail.html?ref_hash=$aggregated_hash
* $aggregated_hash is the content of https://trunk.rdoproject.org/centos8-train/current-tripleo/delorean.repo.md5
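
The lookup above can be sketched in Python. This is a minimal sketch, assuming the delorean.repo.md5 file holds nothing but the hash as plain text. The function and constant names are hypothetical; the URLs are the ones listed above.

```python
from urllib.request import urlopen

# URLs copied from the list above (centos8-train); other releases would differ.
MD5_URL = "https://trunk.rdoproject.org/centos8-train/current-tripleo/delorean.repo.md5"
VOTES_URL = "https://trunk.rdoproject.org/api-centos8-train/api/civotes_agg_detail.html"


def civotes_url(aggregated_hash: str) -> str:
    """Build the per-build job status URL for an aggregated hash."""
    return f"{VOTES_URL}?ref_hash={aggregated_hash.strip()}"


def current_aggregated_hash(md5_url: str = MD5_URL) -> str:
    """Fetch the hash; assumes the whole file body is the hash."""
    with urlopen(md5_url, timeout=30) as response:
        return response.read().decode("utf-8").strip()
```

Calling civotes_url(current_aggregated_hash()) would then give the status page for the current-tripleo build, provided the server is reachable.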

ruck/rover primer: https://docs.openstack.org/tripleo-docs/latest/ci/ruck_rover_primer.html

Promoter Servers:
master / ussuri: promoter.rdoproject.org
other releases: http://38.102.83.109/

Cockpit: http://tripleo-cockpit.usersys.redhat.com/d/9DmvErfZz/cockpit?orgId=1

Internal Cockpit (WIP) http://tripleo-cockpit.usersys.redhat.com/?orgId=1
http://cistatus.tripleo.org/
https://trello.com/b/j4IcIomh/production-chain-escalation
http://rhos-release.virt.bos.redhat.com:3030/rhosp

Debugging Tools https://docs.google.com/document/d/1VZhje7ZN9sk4E31fYVrPxpqMJGz5ZhHRfhte_RYMXxg/edit#

Review.rdoproject.org dashboard: https://review.rdoproject.org/grafana/?orgId=1&var-datasource=default&var-server=registry.rdoproject.org.rdocloud&var-inter=$__auto_interval_inter

CentOS pre-release rpm updates for minor releases http://mirror.centos.org/centos/7/cr/x86_64/Packages/

hackmd.io rh-openstack-dev
https://hackmd.io/team/rh-openstack-ci?nav=overview

Internal software factory: https://sf.hosted.upshift.rdu2.redhat.com

upstream rsync mirror logs: files.openstack.org/mirror/logs/rsync-mirrors/centos.log

logs of all mirroring processes are available at:
https://static.opendev.org/mirror/logs/

mirroring driven by:
https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mirror-update

RDO FTBFS (fails to build from source) reviews:
https://review.rdoproject.org/r/q/topic:rdo-FTBFS

TRELLO RETROSPECTIVE https://trello.com/b/0VFswmht/rdo-infra-retrospective?menu=filter&filter=label:UniSprint21

Internal Dashboard - https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/QE/view/OSP16/ OSP-10 - OSP-16

RHOS INFRA INFRARED ISSUES https://projects.engineering.redhat.com/issues/?filter=34183

CIX escalation https://mojo.redhat.com/docs/DOC-1098748#jive_content_id_CIX_Escalation_Automation_and_email_format

CIX board https://trello.com/b/j4IcIomh/production-chain-escalation

Nodepool image logs: https://softwarefactory-project.io/nodepool-log/

Upstream Mirror Monitors
http://cacti.openstack.org/cacti/graph_view.php

Upstream zuul monitors
http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1

Upstream qcow2 nodepool images

https://nb01.opendev.org, https://nb02.opendev.org, https://nb04.opendev.org

currently centos8 is here:

https://nb02.opendev.org/images/

Job info script (ooocijobs): https://github.com/marios/tripleo_ruck_job_tool

Components definitions
https://github.com/redhat-openstack/rdoinfo/blob/master/rdo.yml


Where's my patch:
https://docs.google.com/document/d/1He49bqXcgAWTadB_HmiUWN-xwSAIkYMJHm142SW_N1E/edit#heading=h.7lwp6dyqupip

Outages:
https://url.corp.redhat.com/internal-outtage

Engineer On Duty (infra):
https://tree.taiga.io/project/morucci-software-factory/epic/1550

saved infra bugs

https://url.corp.redhat.com/internal-container-registry-down

INFRARED DOC

Infrared gerrit: https://review.gerrithub.io/q/project:redhat-openstack/infrared

Infrared doc: https://infrared.readthedocs.io/en/latest/

Vexxhost monitoring

https://review.rdoproject.org/analytics/app/kibana#/visualize?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-7d,mode:quick,to:now))

Elastic recheck documentation

Definitive Documentation
https://docs.openstack.org/tripleo-docs/latest/ci/ruck_rover_primer.html#tools
Log Reduce
Notes: https://drive.google.com/drive/u/1/folders/1TSiA940FOvaY_kEoy4WPKLAH-WcrwU0n
Example:
http://logs.rdoproject.org/06/587306/8/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035/a470fce/report.html#

Sova:
http://cistatus.tripleo.org/
https://github.com/sshnaidm/sova/blob/master/tripleoci/data/patterns.yml

Consolidated Error File:
Example:
https://logs.opendev.org/86/674686/4/check/tripleo-ci-centos-7-containers-multinode/6afce9c/logs/undercloud/var/log/extra/errors.txt.txt.gz

Look in $server/var/log/extra/errors.txt of any tripleo job if the above link has expired.
Instructions are also in the logs dir of any job.
Elastic Recheck:
Status: http://status.openstack.org/elastic-recheck/
https://opendev.org/opendev/elastic-recheck/src/branch/master/queries

Elastic Recheck: http://status.openstack.org/elastic-recheck/
Check the top 5-10 issues for infra or TripleO-related problems, e.g.
Infra example: http://status.openstack.org/elastic-recheck/#1708704
TQ example: http://status.openstack.org/elastic-recheck/#1686542
See the logstash link on the ER page.

Patterns in failures can be tracked and monitored using elastic recheck
E.g. http://status.openstack.org/elastic-recheck/#1708832
Defined in link
How-To:
git clone https://github.com/openstack-infra/elastic-recheck
See doc
virtualenv foo; source foo/bin/activate
pip install -r requirements.txt; python setup.py install
elastic-recheck-query queries/1708832.yaml (or any other YAML file in queries/)
Note: you can change the query message to any text to look for patterns in CI.
Example addition to recheck
http://pastebin.test.redhat.com/514075
https://review.openstack.org/#/c/498766/
https://review.openstack.org/#/c/493525/

Project that tracks upstream infra issues:
https://review.opendev.org/q/project:opendev/elastic-recheck

LogStash
Example:
http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:"[Errno 256] No more mirrors to try" AND tags:"console"&from=864000s
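
Search URLs like the example above can be built for any error message. The following is a minimal sketch, assuming the dashboard accepts a percent-encoded query; the function name and its parameters are hypothetical, and the default time window of 864000 seconds is taken from the example.

```python
from urllib.parse import quote

LOGSTASH_DASHBOARD = "http://logstash.openstack.org/#/dashboard/file/logstash.json"


def logstash_url(message: str, tag: str = "console", seconds: int = 864000) -> str:
    """Build a logstash search URL for an exact message within a log tag."""
    # Same query shape as the example: message:"..." AND tags:"..."
    query = f'message:"{message}" AND tags:"{tag}"'
    # Percent-encode spaces, quotes and brackets so the link survives pasting.
    return f"{LOGSTASH_DASHBOARD}?query={quote(query)}&from={seconds}s"


print(logstash_url("[Errno 256] No more mirrors to try"))
```

Percent-encoding is a choice made here so the link can be pasted into chat or a bug report without being cut at the first space.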

test project example

Test Project:
Clone testproject from the RDO Software Factory:
https://review.rdoproject.org/r/testproject.git
cd testproject
vi zuul.yaml

  - project:
      check:
        jobs:
          - periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-stein:
              vars:
                force_periodic: true
                remove_ovb_after_job: false

Weshay ( current run ) ssh zuul@38.145.33.185

Selinux denials

Sova now checks for SELinux denials, and they show up in the cockpit monitoring under failed_reason. These denials do not actually fail jobs, because upstream runs SELinux in permissive mode. It is still useful that Sova warns us, because bugs do need to be opened.

How to find:

log to view

tripleo-ci-centos-8-containers-multinode/dd8f53e/logs/$node/var/log/extra/denials.txt
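
When a denials.txt file is long, grouping the denials makes it easier to decide which bugs to open. The following is a minimal sketch, assuming the file contains raw audit AVC lines in the usual "avc: denied { ... } ... scontext=... tcontext=... tclass=..." form. The function names are hypothetical.

```python
import re
from collections import Counter

# Matches the permission set and the source/target contexts of an AVC denial.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{(?P<perms>[^}]*)\}.*?"
    r"scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)"
)


def selinux_type(context: str) -> str:
    """Return the type field of a user:role:type:level context."""
    parts = context.split(":")
    return parts[2] if len(parts) > 2 else context


def summarize_denials(text: str) -> Counter:
    """Count denials by (source type, target type, class, permissions)."""
    counts = Counter()
    for line in text.splitlines():
        match = AVC_RE.search(line)
        if match is None:
            continue  # not an AVC denial line
        perms = " ".join(match["perms"].split())
        key = (selinux_type(match["scontext"]), selinux_type(match["tcontext"]),
               match["tclass"], perms)
        counts[key] += 1
    return counts
```

Running summarize_denials(open("denials.txt").read()).most_common() on a downloaded log would list the most frequent denials first.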

how to report:

In bugzilla
Product: Red Hat Enterprise Linux 8 (match the CentOS version of the job)
Component: selinux-policy or container-selinux

Example:
https://bugzilla.redhat.com/show_bug.cgi?id=1883980
https://bugzilla.redhat.com/show_bug.cgi?id=1883990

How to get all the rpms installed on a list of containers

sudo podman run --net=host ${IMAGE_ID} rpm -qa
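
The command above handles one image; for a list of containers it can be wrapped in a loop. The following is a minimal sketch, assuming podman and sudo are available on the host and that each image lets rpm -qa run as its command. The function names are hypothetical, and the --rm flag is an addition here so the temporary containers are cleaned up.

```python
import subprocess
from typing import Dict, Iterable, List


def rpm_qa_command(image: str) -> List[str]:
    """Build the podman command that lists the RPMs inside one image."""
    # --rm is an addition to the original command: it removes the container on exit.
    return ["sudo", "podman", "run", "--rm", "--net=host", image, "rpm", "-qa"]


def installed_rpms(images: Iterable[str]) -> Dict[str, List[str]]:
    """Run rpm -qa in every image and return a sorted package list per image."""
    result = {}
    for image in images:
        proc = subprocess.run(rpm_qa_command(image), capture_output=True,
                              text=True, check=True)
        result[image] = sorted(proc.stdout.split())
    return result
```

Sorting the package lists makes it easy to diff two images, for example to see which container still carries an old package version.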

Red Hat openstack organization contacts

https://source.redhat.com/groups/public/openstack/openstack_wiki/red_hat_open_stack_organization
