# Metrics
## Common Ideas
* scrape/fetch/extract relevant info and put it somewhere
* needed for build time, 14 days retention
* needed for deploy frequency (only last ~10 provided)
* build dashboard from retrieved data
* a GitHub App could avoid polling (see the webhook sketch after this list)
* how/where to save that data
* which data we want to save
* Prometheus/Grafana -> NEXT
* random resources
* https://docs.github.com/en/rest/actions
* https://github.com/amirha97/github-actions-stats
* https://medium.com/@michmich112/get-detailed-analytics-and-statistics-from-your-github-actions-5ee43e056ff
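A minimal sketch of the no-polling idea, assuming a GitHub App subscribed to `workflow_run` events and that the payload carries the field names used below (to be verified against the webhook docs):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    """Receive GitHub webhook deliveries instead of polling the REST API."""

    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        payload = json.loads(body)
        # assumption: the App is subscribed to workflow_run events and we only
        # care about completed runs; field names follow the public payload docs
        if self.headers.get("X-GitHub-Event") == "workflow_run" and payload.get("action") == "completed":
            run = payload["workflow_run"]
            record = {
                "repo": payload["repository"]["full_name"],
                "workflow": run["name"],
                "started_at": run["run_started_at"],
                "completed_at": run["updated_at"],
                "conclusion": run["conclusion"],
            }
            print(record)  # here we would ship the record to S3 / Snowflake instead
        self.send_response(204)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), WebhookHandler).serve_forever()
```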
## Deploy Frequency
* helm -> script to get from helm itself :heavy_check_mark:
* argo cd -> explore if available api to get info :heavy_check_mark:
* don't care about direct kubectl apply
* should migrate to helm or argo
### ArgoCD
* provides some obscure API endpoints
* `GET /api/v1/applications`
* list of all projects (`$.items[*].metadata.name`)
* history of all project revisions, last 10 (`$.items[*].status.history`)
* only provides datetime, sequential id, and revision hash
* `GET /api/v1/applications/agata/revisions/32b14e1803ab55d3e395bdfa3aff34f20451015b/metadata`
* author, date, message: message can be used to get original repo commit ("`deploying agata from casavo/agata@8ca3545d8a1d27d03fd9dce419c9e0e3c62a9487`")
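A minimal sketch of collecting deploy events from those two endpoints, assuming an ArgoCD API token and that the history entries expose `id`, `deployedAt` and `revision` (names worth double-checking against the actual response):

```python
import requests

ARGOCD_URL = "https://argocd.example.com"      # assumption: our ArgoCD host
HEADERS = {"Authorization": "Bearer <token>"}  # assumption: an ArgoCD API token

def fetch_deploy_history() -> list[dict]:
    """Collect the (last ~10) deploy events of every ArgoCD application."""
    apps = requests.get(f"{ARGOCD_URL}/api/v1/applications", headers=HEADERS).json()
    deploys = []
    for app in apps.get("items", []):
        name = app["metadata"]["name"]
        # assumption: each history entry carries a sequential id, a deploy timestamp and a revision hash
        for entry in app.get("status", {}).get("history", []):
            meta = requests.get(
                f"{ARGOCD_URL}/api/v1/applications/{name}/revisions/{entry['revision']}/metadata",
                headers=HEADERS,
            ).json()
            deploys.append({
                "app": name,
                "serial": entry.get("id"),
                "timestamp": entry.get("deployedAt"),
                "revision": entry["revision"],
                # the commit message embeds the original repo commit, e.g.
                # "deploying agata from casavo/agata@8ca3545d..."
                "message": meta.get("message"),
            })
    return deploys
```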
### Helm
* use `helm` cli to get releases and release details
* must store the previous fetches (helm only keeps the latest 10 revisions)
* must specify the namespace to query (we also use helm to install third-party charts)
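A minimal sketch of the helm side, assuming `helm list`/`helm history` with `-o json` return the field names used below (worth verifying against the CLI version we run); each fetch is merged into a local file precisely because helm only keeps the latest ~10 revisions:

```python
import json
import subprocess

NAMESPACE = "default"        # assumption: the namespace where our own releases live
STATE_FILE = "deploys.json"  # assumption: where previous fetches are accumulated

def helm_json(*args: str):
    """Run a helm subcommand and parse its JSON output."""
    out = subprocess.run(["helm", *args, "-o", "json"], check=True, capture_output=True, text=True)
    return json.loads(out.stdout)

def collect() -> None:
    try:
        with open(STATE_FILE) as f:
            known = json.load(f)
    except FileNotFoundError:
        known = []
    seen = {(d["release"], d["serial"]) for d in known}

    for rel in helm_json("list", "-n", NAMESPACE):
        # helm only returns the latest ~10 revisions, hence the merge with the previous fetch
        for rev in helm_json("history", rel["name"], "-n", NAMESPACE):
            key = (rel["name"], rev["revision"])
            if key not in seen:
                seen.add(key)
                known.append({
                    "release": rel["name"],
                    "serial": rev["revision"],
                    "timestamp": rev["updated"],
                    "state": rev["status"],
                })

    with open(STATE_FILE, "w") as f:
        json.dump(known, f, indent=2)
```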
### Storing
* good format for data -> JSON
```json
[
  {
    "serial": 10,
    "timestamp": "2021-12-13T09:21:28Z",
    "state": "deployed"
  },
  {
    "serial": 11,
    "timestamp": "2021-12-16T11:25:45Z",
    "state": "deployed"
  }
]
```
* fetch and collect into an S3 bucket :clock10:
* import into Snowflake :clock1030:
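A minimal sketch of the shipping step, assuming `boto3` and a dedicated bucket (the name below is made up); Snowflake would then ingest the bucket, e.g. through an external stage:

```python
import json
from datetime import datetime, timezone

import boto3

BUCKET = "casavo-deploy-metrics"  # assumption: a dedicated metrics bucket

def ship_to_s3(deploys: list[dict]) -> str:
    """Upload one timestamped JSON snapshot per run, ready to be imported into Snowflake."""
    key = f"deploy-frequency/{datetime.now(timezone.utc):%Y-%m-%dT%H-%M-%S}.json"
    boto3.client("s3").put_object(
        Bucket=BUCKET,
        Key=key,
        Body=json.dumps(deploys).encode(),
        ContentType="application/json",
    )
    return key
```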
### Preview
* https://github.com/casavo/metrics-collector-deploy
## CI Build Time
* https://casavo.atlassian.net/browse/SELL-1098
### Roadmap / what's next
- Share with the Data CoP what we want to do, to gather feedback @DenisBellotti
- Explore the GH API for workflows and runs to better understand which calls we should make to GitHub to extract the data that is relevant to us (namely the build time data) @GioVisentini
- Develop a "tool" / script to execute and produce a JSON export of the last N runs of the actions
- the key is the name of the github repo
- should this also perform the IMPORT / store somewhere?
- should we use FiveTran to ease the importing from GitHub to Snowflake?
- The tool should then be extended to retrieve ALL the Casavo GitHub repos and iterate over them (see the sketch after this list)
- Target for the storing: Snowflake?
- The result should be "containerizable" so that we can use it flexibly in our infra
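A possible sketch of the iteration step from the roadmap above, assuming an org-wide token provided via `GITHUB_TOKEN`:

```python
import os

import requests

GITHUB_API = "https://api.github.com"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",  # assumption: token provided via the environment
    "Accept": "application/vnd.github+json",
}

def org_repos(org: str = "casavo") -> list[str]:
    """Page through every repository of the org so the collector can iterate over all of them."""
    names, page = [], 1
    while True:
        batch = requests.get(
            f"{GITHUB_API}/orgs/{org}/repos",
            headers=HEADERS,
            params={"per_page": 100, "page": page},
        ).json()
        if not batch:
            return names
        names += [repo["name"] for repo in batch]
        page += 1
```

The export keyed by repo name would then just be `{name: build_times("casavo", name, created=...) for name in org_repos()}`, where `build_times` is the per-repo fetch sketched under "APIs to call" below.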
### doubts
* how do we detect which workflows are the ones we want to monitor (e.g. only the "build+test" workflow)?
* should we just fetch ALL data as raw data and THEN make custom projections? Or should we instead restrict the fetch to only the data we are interested in?
* how do we "bind" the GitHub repos to their owning team?
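On the last doubt, one option (assuming team membership in the GitHub org actually mirrors ownership) is to build the mapping from the org's teams API; pagination is omitted for brevity:

```python
import os

import requests

GITHUB_API = "https://api.github.com"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",  # assumption: token with read:org scope
    "Accept": "application/vnd.github+json",
}

def repo_to_team(org: str = "casavo") -> dict[str, str]:
    """Map each repo to a team that can access it (first match wins when several teams share a repo)."""
    mapping: dict[str, str] = {}
    teams = requests.get(f"{GITHUB_API}/orgs/{org}/teams", headers=HEADERS).json()
    for team in teams:
        repos = requests.get(
            f"{GITHUB_API}/orgs/{org}/teams/{team['slug']}/repos",
            headers=HEADERS,
        ).json()
        for repo in repos:
            mapping.setdefault(repo["name"], team["slug"])
    return mapping
```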
### APIs to call
* `GET https://api.github.com/repos/OWNER/REPO/actions/runs/RUN_ID/timing` -> probably not, since it only works for GitHub-hosted runners
  * https://docs.github.com/en/rest/actions/workflow-runs#get-workflow-run-usage
* `GET https://api.github.com/repos/OWNER/REPO/actions/runs` -> gives the list of workflow runs -> only the starting time is there
  * we can pass a `created` parameter to get all the runs in a time range
  * https://docs.github.com/en/rest/actions/workflow-runs#list-workflow-runs-for-a-repository
* `GET /repos/{owner}/{repo}/actions/jobs/{job_id}` -> gives us all the steps with their start and completion times
  * https://docs.github.com/en/rest/actions/workflow-jobs#get-a-job-for-a-workflow-run

What we have to do:
- call `https://api.github.com/repos/OWNER/REPO/actions/runs` to get the list of workflow runs and extract the job ids?
- for each run call `GET /repos/{owner}/{repo}/actions/jobs/{job_id}` to get the duration
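A minimal sketch of that flow, assuming a token in `GITHUB_TOKEN` and filtering on the workflow name; it uses the "list jobs for a workflow run" variant of the job endpoint above, so there is no need to extract job ids by hand:

```python
import os
from datetime import datetime

import requests

GITHUB_API = "https://api.github.com"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",  # assumption: token provided via the environment
    "Accept": "application/vnd.github+json",
}

def _iso(ts: str) -> datetime:
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def build_times(owner: str, repo: str, created: str, workflow: str = "build+test") -> list[dict]:
    """Job durations for one workflow in a time range, e.g. created='2021-12-01..2021-12-31'."""
    runs = requests.get(
        f"{GITHUB_API}/repos/{owner}/{repo}/actions/runs",
        headers=HEADERS,
        params={"created": created, "per_page": 100},  # pagination omitted for brevity
    ).json()["workflow_runs"]

    results = []
    for run in runs:
        if run["name"] != workflow:  # open doubt above: how to pick the workflows we care about
            continue
        jobs = requests.get(
            f"{GITHUB_API}/repos/{owner}/{repo}/actions/runs/{run['id']}/jobs",
            headers=HEADERS,
        ).json()["jobs"]
        for job in jobs:
            if job.get("completed_at"):
                results.append({
                    "repo": f"{owner}/{repo}",
                    "run_id": run["id"],
                    "job": job["name"],
                    "started_at": job["started_at"],
                    "duration_s": (_iso(job["completed_at"]) - _iso(job["started_at"])).total_seconds(),
                })
    return results
```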