# ot journal - february 2024

## friday, february 16th, 2024

- implemented and documented:
  - jenkins configuration
  - sonarqube configuration
  - jenkins/sonarqube interactions and security configuration
- i hit a couple of potentially serious bottlenecks
  - we're using the latest available sonarqube community server container, which uses a recent version of java. when triggering the sonarqube scan, the build throws an error like so:

```
[ERROR] Failed to execute goal org.sonarsource.scanner.maven:sonar-maven-plugin:3.10.0.2594:sonar (default-cli) on project restservice: Execution default-cli of goal org.sonarsource.scanner.maven:sonar-maven-plugin:3.10.0.2594:sonar failed: An API incompatibility was encountered while executing org.sonarsource.scanner.maven:sonar-maven-plugin:3.10.0.2594:sonar: java.lang.UnsupportedClassVersionError: org/sonar/batch/bootstrapper/EnvironmentInformation has been compiled by a more recent version of the Java Runtime (class file version 61.0), this version of the Java Runtime only recognizes class file versions up to 55.0
```

there's a good explanation of why this happens at:

- https://stackoverflow.com/a/75029889/68115
- https://community.sonarsource.com/t/java-versions-in-sonarqube/67551

basically, the fix might be either to:

- downgrade sonarqube. however, i dislike this approach immensely: the intent here is to provide modern, sophisticated scoring based on the latest available metrics, and downgrading the sonarqube server would reintroduce older vulnerabilities as well as significantly reduce the value and reliability of any scores generated.
- or install a secondary jdk home on the build agent, which we don't use to compile (it probably wouldn't work for the majority of the older java applications), and redirect the sonar scan (only) to that modern jdk home.

the second approach is the one i'm hoping to take, but at the very least this compromises our ability to demonstrate a working scoring mechanism on the 22nd: we first need a mechanism to determine which java version each application targets, then a build agent selection mechanism based on what we determine for each app. further, we need to add jenkins agent configuration for each version of java we discover we need to target across the repositories. a rough sketch of the version detection piece is included below.
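as a first pass at that detection step, something like the following could read each repo's pom and map the declared java version to a jenkins agent label. this is a minimal typescript sketch under assumed names; the property patterns and agent labels are illustrative, not our actual configuration:

```typescript
// a sketch, not our implementation: guess which java version a maven project
// targets by reading its pom.xml, then map that to a jenkins agent label.
// the property patterns and agent labels below are illustrative assumptions.
import { readFileSync } from "node:fs";

function detectJavaVersion(pomPath: string): string | undefined {
  const pom = readFileSync(pomPath, "utf8");
  // look for the usual maven properties, e.g. <maven.compiler.source>1.8</maven.compiler.source>
  // or the spring boot style <java.version>11</java.version>
  const match =
    pom.match(/<maven\.compiler\.(?:source|release)>\s*([^<\s]+)\s*</) ??
    pom.match(/<java\.version>\s*([^<\s]+)\s*</);
  return match?.[1];
}

function agentLabelFor(javaVersion: string | undefined): string {
  // hypothetical agent labels - one per jdk we end up provisioning on the agents
  switch (javaVersion) {
    case "1.8":
    case "8":
      return "jdk8";
    case "11":
      return "jdk11";
    case "17":
      return "jdk17";
    default:
      return "jdk8"; // conservative default for the older applications
  }
}

console.log(agentLabelFor(detectJavaVersion("./pom.xml")));
```

defaulting to the oldest jdk label seems like the safer choice for repos that don't declare a compiler version at all, given how many of the older applications we expect to encounter.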
## thursday, february 15th, 2024

- switched jenkins client auth to use the service account api token, which doesn't expire, rather than the user-created tokens which expire regularly
- started [documentation](https://gitlab.omantel.om/alm/app-build/-/blob/main/readme.md) of the deployment topology
- fixed a bug in the build scheduler that was causing build triggers to fail
- modified the repo sync mechanism to make it easier to show the same folder structures on the dashboard as are used in gitlab
- started on the openapi spec for the workflow
- fixed the mongo-express pod configuration and got the connection to mongodb working correctly
- fixed the sonarqube pod configuration and got the connection to postgres working correctly (which gives us persistent score histories following sonarqube updates or crashes)

## wednesday, february 14th, 2024

- implemented an async error handler to better log and debug fetch timeouts and other auth exceptions
- implemented a sync mechanism for build history
- discovered (through trial and error) that editing the deployment yaml seems to be the best way to get openshift to deploy recent container images
- deployed a sonarqube server and a supporting postgres db server, using the yaml mechanism, to generate app scores

## tuesday, february 13th, 2024

- determined that the openshift tls proxy is misconfigured in that it fails to correctly serve the tls certificate chain. specifically, when examining certificates provisioned by openshift, the ca certificate is listed but not served. i.e.:
  - applications are served with the wildcard cert: *.apps.ocpprod.otg.om
  - the above wildcard cert lists its issuer as WAT-OTG-CCL-01, with a uri of http://crl.omantel.om/WAT-OTG-CCL-01(1).crl. unfortunately, crl.omantel.om refuses to serve the certificate at the identified uri. this means that the nodejs client is unable to validate ot certificates, resulting in tls exceptions when api connections are attempted.
- after spending time debugging the above issues, i was forced to revert the secure implementation of the api interactions to simply ignore all tls security. this is really unfortunate, and i feel that fixing the underlying issues in the openshift implementation deserves some attention when resources permit.
- implemented slightly better error handling when dealing with api fetch timeouts (sketched below). ot hq network performance is noticeably degraded during normal working hours, as evidenced by an increase in fetch timeouts as the morning progresses followed by a decrease as the afternoon progresses. it is fortunate that the alm nightly build workflow does most of its work between the end of one day and the start of the next, when network performance is distinctly better.
- added better handling of scenarios where repositories are deleted from gitlab or where access to the repository is revoked and we are no longer able to attempt the nightly build
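the fetch timeout handling mentioned in the february 13th/14th entries boils down to something like the following. a minimal typescript sketch, not the actual workflow code; the function name and timeout value are just illustrative:

```typescript
// a sketch of the timeout handling described above, not the actual workflow code.
// wraps fetch with an abort-based timeout so that slow ot hq network conditions
// surface as a distinct, loggable error rather than a hung request.
async function fetchWithTimeout(url: string, timeoutMs = 15_000): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { signal: controller.signal });
  } catch (err) {
    // abort errors are logged as timeouts so they can be told apart from
    // auth failures and other exceptions in the nightly build logs
    if ((err as Error)?.name === "AbortError") {
      console.error(`fetch timed out after ${timeoutMs}ms: ${url}`);
    }
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```

aborting via a controller keeps timeouts distinguishable from auth failures and other exceptions when reading the nightly logs, which is the main point of the change.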
## monday, february 12th, 2024

it rained again in muscat today and, as a result of the unstable weather, a national holiday was observed. there were very few of us in ot hq, but this meant that vpn, network and other infra were working exceptionally well. it felt like an easy day to make rapid progress, with network requests resolving quickly and my vpn connection dropping far less often than when the office is busy.

- fixed the nightly build scheduler logic to trigger all builds that haven't yet run in the current build window
- fixed the docker deployment to set the timezone for the scheduler workflow to Asia/Muscat
- added the workflow component of app-build to the openshift deployment so that it runs continuously on ot infra
- added caching of jenkins job and folder existence checks to reduce calls and load on the jenkins server. this fixed the timeout issues i was seeing with workflow api calls.

## sunday, february 11th, 2024

- implemented configurable build windows for the build scheduler to ensure that nightly builds are triggered after the last commits of the day have been received and before the following day's commits begin. the current build window configuration is after 21:00 and before 06:00 (a rough sketch of this check, together with the capacity rule from the 9th, is at the end of this journal).
- met with shahabudin, who gave me information that helped me resolve the issues i had with connecting to svn, and explained the nature of the older ant/java applications there and the challenges we will face in including these in ci/cd workflows

## saturday, february 10th, 2024

- implemented database persistence of repository discoveries, commit volumes and last build time

## friday, february 9th, 2024

- implemented build capacity rules to ensure that only a configurable number of nightly builds are allowed to run concurrently. this eliminated issues with saturating the jenkins server's capacity to manage multiple simultaneous builds.

## thursday, february 8th, 2024

worked from the hotel as ot hq was observing the national holiday

- implemented build queueing logic to schedule and trigger all nightly builds

## wednesday, february 7th, 2024

- implemented continuous automatic discovery of gitlab repositories, commit history and classification of those repositories (by default builder)

## tuesday, february 6th, 2024

- travelled to muscat. the trip was complicated and extended by a missed connection in doha, where i had to wait overnight for a later flight to mct. i reached the hotel just as the sun rose on wednesday morning.
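for reference, here's a rough sketch of the build window and build capacity rules described in the february 9th and 11th entries. this is illustrative only; the names and the concurrency value are assumptions, not the real scheduler configuration:

```typescript
// illustrative sketch only - not the real scheduler code. combines the two rules
// from the february 9th and 11th entries: a nightly build may start only inside
// the configured build window (21:00-06:00 asia/muscat) and only while fewer
// than maxConcurrent builds are already running on jenkins.
interface SchedulerConfig {
  windowStartHour: number; // e.g. 21
  windowEndHour: number;   // e.g. 6
  maxConcurrent: number;   // hypothetical value - the real limit is configurable
}

function hourInMuscat(now: Date): number {
  return Number(
    new Intl.DateTimeFormat("en-GB", {
      hour: "2-digit",
      hourCycle: "h23",
      timeZone: "Asia/Muscat",
    }).format(now),
  );
}

function canTriggerBuild(now: Date, runningBuilds: number, cfg: SchedulerConfig): boolean {
  const hour = hourInMuscat(now);
  // the window wraps midnight, so it is satisfied on either side of it
  const inWindow =
    cfg.windowStartHour > cfg.windowEndHour
      ? hour >= cfg.windowStartHour || hour < cfg.windowEndHour
      : hour >= cfg.windowStartHour && hour < cfg.windowEndHour;
  return inWindow && runningBuilds < cfg.maxConcurrent;
}

console.log(canTriggerBuild(new Date(), 3, { windowStartHour: 21, windowEndHour: 6, maxConcurrent: 5 }));
```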