Motivation
Currently, the default is to do a full build of all >110k configurations for every PR change. That takes 2-3 hours, which is bad for productivity(tm).
We need a solution that speeds up builds.
Getting more hardware is difficult: we'd need 128-256 fast cores just to cut those times in half.
Proposal
Definitions
tPR = time to get PR feedback
tMERGE = time to get to master after merge intent
kaspar030 changed 2 years ago
Details
When: 05.10.22 10h Berlin time
Where: https://meet.jit.si/RealFlowersLaughFlatly
Previous meeting notes: https://hackmd.io/I4G2FKZ8RmSlRI37VlYl2g
Participants
Tom
Alex
kaspar030 changed 2 years ago
Outline
The main idea is to set up a second instance and test it (using one worker), then on "migration day", stop the old instance and scale up the new one.
That way, the new instance gets tested, and only needs to be scaled up.
The old instance stays configured (and then disabled), so we can roll back if we have to.
Note: by default, the new instance will use 8 GB of RAM for the ccache tmpfs and another 2 GB for the first worker. If RAM is tight, running all old workers and the new instance at the same time might not be feasible. In that case, I suggest first reconfiguring the old instance to use fewer workers.
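The RAM check from the note can be sketched as a small calculation. This is only an illustration: the 8 GB and 2 GB figures come from the note, while the function name `max_old_workers`, the per-worker RAM of the old instance, and the reserved headroom are hypothetical inputs that would need to be measured on the actual host.

```python
# Figures taken from the note above.
NEW_CCACHE_TMPFS_GB = 8.0
NEW_FIRST_WORKER_GB = 2.0


def max_old_workers(total_ram_gb: float, old_worker_gb: float,
                    reserved_gb: float = 0.0) -> int:
    """Return how many old workers fit next to the new test instance.

    old_worker_gb and reserved_gb are assumptions (not given in the notes):
    the RAM one old worker needs, and headroom kept for the OS and services.
    """
    if old_worker_gb <= 0:
        raise ValueError("old_worker_gb must be positive")
    free_gb = (total_ram_gb - reserved_gb
               - NEW_CCACHE_TMPFS_GB - NEW_FIRST_WORKER_GB)
    if free_gb <= 0:
        return 0
    return int(free_gb // old_worker_gb)
```

If the result is lower than the number of old workers currently configured, the old instance would have to be reduced to that number before the new one is started.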
Steps
kaspar030 changed 2 years ago
Details
When: 01.09.22 11h Berlin time
Where: https://meet.jit.si/RealFlowersLaughFlatly
Previous meeting notes: https://hackmd.io/-3u4z5EiTzmpDnM5nfi9lQ
Participants
Kaspar
Alex
kaspar030 changed 2 years ago
Details
When: 20.06.22 10h Berlin time
Where: https://meet.jit.si/RealFlowersLaughFlatly
Previous meeting notes: https://hackmd.io/GOvgvGGeTkCpP7hwiGFImg
Participants
Kevin
Kaspar
kaspar030 changed 2 years ago
Latest info
27.09.22
ci migration:
ci-prod full build completed successfully, UI can handle it easily
murdock-cli deployed to clean up old builds
pifleet upgraded:
kaspar030 changed 2 years ago
Idea: if a build fully succeeded, cache its data hash similar to the test result cache. When starting a build, check if there's already a full build that succeeded with the same data. This should skip rebuilds when only changing metadata, e.g., all squash builds.
Things to watch out for:
take container/environment change into account
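A minimal sketch of the idea above, not the actual Murdock implementation. It assumes the "data hash" is something like the git tree hash of the build input (a squash changes the commit hash but not the tree hash) and that the build environment can be identified by a container digest. `build_key`, `BuildSuccessCache`, and the file-per-key layout are hypothetical names chosen for illustration.

```python
import hashlib
import json
from pathlib import Path


def build_key(tree_hash: str, container_digest: str) -> str:
    """Combine source data and build environment into one cache key.

    Including the container digest means a container/environment change
    produces a different key, so a full rebuild is not skipped by mistake.
    """
    payload = json.dumps(
        {"tree": tree_hash, "container": container_digest}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class BuildSuccessCache:
    """Remembers which keys belong to fully successful builds."""

    def __init__(self, cache_dir: Path) -> None:
        self.cache_dir = cache_dir
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def has_success(self, key: str) -> bool:
        # One empty marker file per successful build.
        return (self.cache_dir / key).exists()

    def record_success(self, key: str) -> None:
        (self.cache_dir / key).touch()
```

At build start, a script would compute the key and skip the build if `has_success(key)` is true. After a fully successful build it would call `record_success(key)`. Failed or cancelled builds are never recorded, so they are always rebuilt.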
progress
This should be trivial to add to the Murdock scripts. It is not trivial to test, for lack of either a test instance or access to production Murdock. => postponed
09.11.21: renewed interest, will try in close collaboration with @tom
kaspar030 changed 3 years ago
idea: use drone for PR handling, nightly handling, container handling
why drone?
both server and runner are open source and freely available
very easy to bundle complex build steps into parameterized "plugin" containers (example)
drone can handle per-build container change and arm64 (raspi) workers
Status: (09.11.21) too much work for getting hacked dwq support, waiting to test @aabadie's murdock rewrite
kaspar030 changed 3 years ago
Meetings
RIOT CI meeting 08.11.2021
RIOT CI meeting 25.10.2021
RIOT CI meeting 03/2021
RIOT CI Meeting 12/2020
Documentation
RIOT CI Infrastructure Maintainer Overview
RIOT CI current work
kaspar030 changed 3 years ago
notes:
haw needs to migrate from centos to ubuntu
(background on riotdocker dwq container migration)
Ask Oleg to add a ci-staging.riot-os.org
We will set up a separate test worker node
Should also handle worker key generation and adding public key to master
Should be deployed on another mobi (maybe mobi4)
kaspar030 changed 3 years ago
This page tries to give an overview of the components of RIOT's CI infra and who's in charge. It covers physical access to hardware and administration, not the underlying software.
Murdock -> RIOT's compilation testing
Contact: Kaspar
Components:
ci.riot-os.org (head node, Web UI) -> Admins: Cenk, Kevin
worker nodes:
Kevin Weiss changed 3 years ago
time & date: 01/03/21 10am CET https://meet.jit.si/riot-ci
Agenda
document infra, maintainers, reduce bus factors
consider using bors
cancel PR builds early. currently set to 500 builds. reduce to e.g., 20?
split multi dwq instances, e.g., "riotbuild" into "riotbuild0" .. "riotbuild7"
"final" location of these markdown documents?
Building a subset
riotdocker for Pi fleet?
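The early-cancel item in the agenda above can be illustrated with a short sketch. It assumes the configured number (500 today, perhaps 20) is a limit on failed jobs after which the rest of a PR build is cancelled. The notes do not spell this out, and `run_until_threshold` is a hypothetical helper, not part of Murdock or dwq.

```python
from typing import Iterable, Tuple


def run_until_threshold(results: Iterable[bool],
                        max_failures: int) -> Tuple[int, int, bool]:
    """Consume job results (True = passed) until too many jobs have failed.

    Returns (jobs_seen, failures, cancelled). Because results are consumed
    lazily, jobs after the cancellation point are never looked at.
    """
    seen = 0
    failures = 0
    for passed in results:
        seen += 1
        if not passed:
            failures += 1
            if failures >= max_failures:
                return seen, failures, True
    return seen, failures, False
```

With a lower limit, a PR that breaks many configurations gets cancelled after only a few jobs instead of occupying the workers for hundreds of them. The trade-off is that the author sees fewer of the failing configurations per run.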
kaspar030 changed 4 years ago
For example, this implements a complex custom clone step (involving merging and pushing to a remote repository):
- name: clone
  image: kaspar030/drone-clone
  settings:
    SSH_KEY:
      from_secret: ssh_key_clone
    mode: clone
    base_repo: ${DRONE_REPO_LINK}
    base_branch: ${DRONE_REPO_BRANCH}
kaspar030 changed 4 years ago