# 2025-10-02 OSMF OPS meeting
02 October 2025, 19:00 London time, unless rescheduled
[Time in your timezone](https://www.timeanddate.com/worldclock/fixedtime.html?msg=OSM+Foundation+OPS+meeting+-++Thursday+02+October+2025&iso=20251002T19&p1=136&ah=1)
[Countdown](https://www.timeanddate.com/countdown/generic?p0=136&iso=&msg=OSM%20Foundation%20OPS%20meeting%20-%20%20Thursday%2020251002T19)
[Online calendar](https://framagenda.org/apps/calendar/p/fce4xrpFGx7fMxz8)
[Subscription to future events](https://framagenda.org/remote.php/dav/public-calendars/fce4xrpFGx7fMxz8?export)
Frequency of meetings: every two weeks, on Thursday at 19:00 London time, unless rescheduled.
[Video room](https://osmvideo.cloud68.co/user/dor-x99-y3m)
## Participants
* Grant
* Tom
Apologies: Paul.
-----
## New action items from this meeting
* **Grant to discuss with Paul Norman and flesh out his suggestion and determine the practicalities (e.g. key revocation).** [Topic: AWS CA cert]
* **Grant to follow up with Paul.** [Topic: Serving vector tile styles]
* **Tom to do a check on Saturday.** [Topic: Upgrade to Postgres 17]
* **Grant to go ahead with the purchase of the Gen10 (second-hand) server for Nominatim in the US.** [Topic: Gen10 Nominatim purchase (USA)]
* **Grant to upgrade the baseboard management controller (BMC) before the PG17 upgrade.** [Topic: Upgrade to Postgres 17]
* **Grant to do a commit to alert manager, so that we get notified if there are failures in replication.** [Topic: AWS email]
-----
## Reportage
### snap-02 BIOS upgrade
[2025-09-18](https://hackmd.io/yDbLczVeSAWrLQbBFpxTzQ/edit) Done: Grant to test the BIOS upgrade against snap-02. Whenever suitable. [Topic: OSM DB upgrade to Postgres 17]
Done. See https://github.com/openstreetmap/operations/issues/1289 It was relatively easy, although the BIOS upgrade took a bit longer than expected.
Grant changed two settings on snap-02, one of them for power management (CPU scaling). It will now scale much higher when fewer cores are in use.
### South Africa potential hardware donation
2025-03-20 Grant to follow-up with the South African contact about the potential hardware donation from a mobile network. [Topic: New offers of Servers Australia and South Africa] #2025-09-18 parked
Grant recently emailed his contact in South Africa, who is checking whether we can get better hardware in a more modern data center.
-----
# Agenda
## AWS CA cert
We have a few machines whose backups are synced across to the backup server. Grant wants backups copied directly to S3, rather than sent to the backup server, which would then send them on to S3. This would remove one layer of potential failure (e.g. if the storage server is out of space). We want the copies of the backups to be independent of each other.
This will increase the number of keys we have.
We have keys for:
* Planet publishing
* Planet dumping
* Backups
* Tilelog processing, run by Paul Norman
* Rails
It is recommended to give each service its own key.
### On AWS
Each machine is given a role in AWS, which allows it to "sudo" (switch) into another role (e.g. rails, images, backups, log processing). Those roles hold all the access credentials they need. A machine's role can only switch to the roles it has been given permission for.
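As a rough illustration of the permission model described above, the sketch below builds the kind of IAM trust policy document that would let a set of machine roles switch into a service role. The account ID and role names are hypothetical placeholders, not the ones Ops uses, and the helper `build_trust_policy` is made up for this example.

```python
import json


def build_trust_policy(machine_role_arns):
    """Return a trust policy allowing the given machine roles to assume a service role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only the roles listed here may switch into the service role.
                "Principal": {"AWS": sorted(machine_role_arns)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Hypothetical account ID and role name, for illustration only.
ACCOUNT_ID = "123456789012"
backup_trust = build_trust_policy(
    [f"arn:aws:iam::{ACCOUNT_ID}:role/machine-example-01"]
)
print(json.dumps(backup_trust, indent=2))
```

A policy like this would be attached to the service role (e.g. the one used for backups), so adding a machine means adding its role ARN to the list.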
### On self-hosted systems
We can't use this method with self-hosted systems, but we could use Certificate Authority (CA) certificates: if a machine presents itself with a certificate signed by the CA, we can trust it because we trust the CA.
We wondered whether we could use the existing private key to make a Certificate Signing Request (CSR) against an internal CA and produce a certificate. The secret key would then never leave the individual server. E.g. Chef would run and create the CSR, and the CSR would then be copied to e.g. the ACNE system.
A CSR is not needed in this case. We could ignore the Chef keys completely and build our own system. We would need a central system, on which we create a self-signed certificate to act as our CA. Each machine would generate its own key pair and upload the public key to the server, which would sign it with the CA and send the resulting certificate back to the client.
If we want to simplify the process, we could extract the public key for each client and possibly sign those public keys using our own CA certificate, to create a certificate for that client. The Chef server has the public keys, which can be extracted with knife.
We would need to submit the root certificate to Amazon. Rails probably supports CA-based authentication to Amazon.
### On sudoing to another role vs current practice
<u>Advantage of proposed method</u>
* When a server gets added to the pool, it automatically gets an access key to AWS.
  * We would need to go to Terraform and tell AWS about what the server can do.
    * Create roles in Chef.
<u>Disadvantage of proposed method</u>
* Creation of extra steps when we add a new machine (run Chef, run OpenTofu).
<u>Suggestion</u>
* Taking the roles from Chef and injecting them in OpenTofu.
<u>Other points mentioned during discussion</u>
* Sudoing to another role gives credentials (such as an id and secret) which we could extract e.g. every 12 hours.
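Because the credentials obtained by switching roles are temporary, whatever consumes them needs to renew them periodically. The following is a minimal sketch of that idea, assuming a 12-hour lifetime as in the note above. The class and function names are hypothetical, and the fetcher is a stand-in that makes no AWS calls.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TemporaryCredentials:
    """Short-lived credentials, as obtained when switching to another role."""
    access_key_id: str
    secret_access_key: str
    expires_at: datetime


class CredentialCache:
    """Hold temporary credentials and fetch new ones shortly before they expire."""

    def __init__(self, fetch, margin=timedelta(minutes=5), now=None):
        self._fetch = fetch      # callable returning TemporaryCredentials
        self._margin = margin    # renew this long before the expiry time
        self._now = now or (lambda: datetime.now(timezone.utc))
        self._current = None

    def get(self):
        stale = (
            self._current is None
            or self._now() >= self._current.expires_at - self._margin
        )
        if stale:
            self._current = self._fetch()
        return self._current


# Demonstration with a fake clock and a fake fetcher.
clock = {"now": datetime(2025, 10, 2, 19, 0, tzinfo=timezone.utc)}
fetch_count = {"n": 0}


def fake_fetch():
    fetch_count["n"] += 1
    return TemporaryCredentials(
        access_key_id=f"EXAMPLEKEY{fetch_count['n']}",
        secret_access_key="example-secret",
        expires_at=clock["now"] + timedelta(hours=12),
    )


cache = CredentialCache(fake_fetch, now=lambda: clock["now"])
first = cache.get()
again = cache.get()                  # still fresh: the cached set is reused
clock["now"] += timedelta(hours=12)  # the assumed lifetime has elapsed
renewed = cache.get()                # stale: a new set is fetched
```

In a real setup, `fake_fetch` would be replaced by whatever performs the role switch; the cache only decides when to call it.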
### On storing keys
* Chef.pem is the private key.
* The Fastly configuration that we have now has no keys, as they already had a single public identity with Amazon that their customers are working with. This case is a bit different from ours, as arbitrary customers use the single identity.
<u>Other points mentioned during discussion</u>
* Grant can now share the Fastly OpenTofu code, as it no longer contains any secrets.
### On private CA managers
* Easy-RSA https://github.com/OpenVPN/easy-rsa
  * Not needed, as Chef has the necessary resources.
* Could use Let's Encrypt https://letsencrypt.org/
**Action item: Grant to discuss the proposed AWS CA cert method with Paul Norman, flesh out his suggestion and determine the practicalities (e.g. key revocation).**
-----
## AWS Setup automation
Paul is setting up the log processing. He uses some AWS services (Athena and Glue), which we're moving to a new account. Paul would like to have automation on AWS, like we do with Fastly.
There are some bugs related to OpenTofu, where some resources are forgotten and recreated.
Grant is trying to work out how to grant the necessary permissions.
-----
## Serving vector tile styles
Follow-up to the conversation in the 2025-08-07 meeting, and making a call on [#1263](https://github.com/openstreetmap/operations/issues/1263).
**Action item: Grant to follow up with Paul.**
-----
## Gen10 Nominatim purchase (USA)
Grant to OSUOSL. OSUOSL contract.
**Action item: Grant to go ahead with the purchase of the Gen10 (second-hand) server for Nominatim in the US.**
-----
## AWS email
AWS are changing some S3 operations.
Previously, objects that failed to replicate would still get deleted if a delete policy or life-cycle rule was set. Now AWS won't delete them.
<u>Other points mentioned during discussion</u>
* Grant has a script which checks whether any objects have missed replication, and replicates them if they have.
**Action item: Grant to do a commit to alert manager, so that we get notified if there are failures in replication.**
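The minutes do not describe how Grant's script works. As a hedged sketch of the kind of check involved, the function below picks out objects whose replication status is `FAILED`, assuming the per-object replication status that S3 reports has already been collected into a list. The object keys and the helper name are hypothetical.

```python
from typing import Iterable, List, Mapping


def keys_needing_replication(objects: Iterable[Mapping[str, str]]) -> List[str]:
    """Return the keys of objects whose replication status is FAILED."""
    return [
        obj["Key"]
        for obj in objects
        if obj.get("ReplicationStatus") == "FAILED"
    ]


# Hypothetical listing; real data would come from S3 object metadata.
sample = [
    {"Key": "backups/example-a.tar.zst", "ReplicationStatus": "COMPLETED"},
    {"Key": "backups/example-b.tar.zst", "ReplicationStatus": "FAILED"},
    {"Key": "backups/example-c.tar.zst", "ReplicationStatus": "PENDING"},
    {"Key": "backups/example-d.tar.zst"},  # no status reported for this object
]

failed = keys_needing_replication(sample)
print(failed)
```

The length of the resulting list is also the sort of number that an alert could be raised on.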
-----
## Upgrade to Postgres 17
Grant to update the baseboard management controller (BMC) beforehand, which he has already done for snap-02. The CPU load will spike briefly.
**Action items**
* **Tom to do a check on Saturday.**
* **Grant to upgrade the baseboard management controller (BMC) before the PG17 upgrade.**
-----
## Open Ops Tickets
Review open, what needs policy and what needs someone to help with...
https://github.com/openstreetmap/operations/issues
https://github.com/orgs/openstreetmap/projects/1
https://github.com/orgs/openstreetmap/projects/1/views/2?filterQuery=-is%3Aclosed
## Action items
* ~~[2025-09-18](https://hackmd.io/yDbLczVeSAWrLQbBFpxTzQ/edit) Done: Grant to test the BIOS upgrade against snap-02. Whenever suitable. [Topic: OSM DB upgrade to Postgres 17]~~
* ~~[2025-09-18](https://hackmd.io/yDbLczVeSAWrLQbBFpxTzQ/edit) Done: Grant to send an announcement about the upgrade tomorrow. [Topic: OSM DB upgrade to Postgres 17]~~
* [2025-09-18](https://hackmd.io/yDbLczVeSAWrLQbBFpxTzQ/edit) Paul to look at potential issues related to the collation of indexes - Debian Postgres upgrade. [Topic: OSM DB upgrade to Postgres 17]
* Not public: [2025-09-18](https://hackmd.io/yDbLczVeSAWrLQbBFpxTzQ/edit) Grant to disable 2 old accounts, old access keys removed. [Topic: AWS Security]
* Not public: [2025-09-18](https://hackmd.io/yDbLczVeSAWrLQbBFpxTzQ/edit) Grant to produce list of keys for Ops to review and rotate. [Topic: AWS Security]
* ~~[2025-09-18](https://hackmd.io/yDbLczVeSAWrLQbBFpxTzQ/edit) Done: Grant to get Gen10 quotes for Nominatim upgrade. [Topic: Hardware upgrade - Nominatim]~~
* ~~[2025-09-18](https://hackmd.io/yDbLczVeSAWrLQbBFpxTzQ/edit) Done: Grant to ask Sarah if she will be happy with a Gen10. [Topic: Hardware upgrade - Nominatim]~~
* ~~[2025-08-07](https://hackmd.io/et_yASB0QsCp7VIQXA6e6w/edit) Done: Grant to 1) create AWS account + S3 buckets, 2) start from what we log for raster tiles, and 3) set the logging compression to zstd. [Topic: Vector Tile Logging]~~
* [2025-07-24](https://hackmd.io/LZwDRuy7QHeYWNvEQkeeYw/edit) Grant to set-up a test for OWG's review [Topic: Switching www.osm.org to Fastly frontend]
* [2025-07-24](https://hackmd.io/LZwDRuy7QHeYWNvEQkeeYw/edit) Grant to do the Mailman 2 to 3 conversion [Topic: Mailing lists] - https://github.com/openstreetmap/operations/issues/1264
* [2025-05-01](https://hackmd.io/XM71u-YHS6WYugFA02lotw?edit) Grant to see if other USA University offers are still available and what hardware would be required. [Topic: OSUOSL funding / issues] #2025-09-18 parked
* [2025-03-20](https://hackmd.io/G2QZRHpkRyOQG_DC-Je2Tw/edit) Grant to follow-up with the South African contact about the potential hardware donation from a mobile network. [Topic: New offers of Servers Australia and South Africa] #2025-09-18 parked
* [2025-03-20](https://hackmd.io/G2QZRHpkRyOQG_DC-Je2Tw/edit) Grant to run an SQL query to identify more email providers used by spammers. [Topic: Spam] #2025-05-01 Grant has created a small list of disposable email providers. #2025-09-18 parked
## OPS pads for 2025 meetings
[2025-01-09](https://hackmd.io/LQexyX9iSSu8GG6JsYKyxA/edit)
[2025-01-23](https://hackmd.io/sXaKftrrRNOQsOgNPF1grg/edit)
[2025-02-06](https://hackmd.io/mpF7kUW3SBaiFV64KIxEoQ/edit)
[2025-02-20](https://hackmd.io/1tldw0TbT8-fMcHrK3FZbQ/edit)
[2025-03-06](https://hackmd.io/YbsJCyKMRji8xuSPqp5l4Q/edit)
[2025-03-20](https://hackmd.io/G2QZRHpkRyOQG_DC-Je2Tw/edit)
[2025-04-03](https://hackmd.io/_puAtHLWTtC0QY85S3f5Lw/edit)
[2025-04-17](https://hackmd.io/WMJtb4XKRD6KwqiaJBleaw/edit)
[2025-05-01](https://hackmd.io/XM71u-YHS6WYugFA02lotw/edit)
[2025-05-15](https://hackmd.io/6Y1ERM5YTsK3W_npgEfCwA/edit)
[2025-05-29](https://hackmd.io/W-vfoOMXT4GJHUWCgSEalg/edit)
[2025-06-12](https://hackmd.io/auqIBdUBTmufr0J28ILUFA/edit)
[2025-06-26](https://hackmd.io/RuUmi--vQyC9r8Dw9SV7QQ/edit)
[2025-07-10](https://hackmd.io/FOnb4lVDTouSKquJ6N_Cug/edit)
[2025-07-24](https://hackmd.io/LZwDRuy7QHeYWNvEQkeeYw/edit)
[2025-08-07](https://hackmd.io/et_yASB0QsCp7VIQXA6e6w/edit)
[2025-08-21](https://hackmd.io/NU41pU-hRxmCg_38_4VECQ/edit)
[2025-09-04](https://hackmd.io/VBUSEJwzQPyInb7zXiCvYQ/edit)
[2025-09-18](https://hackmd.io/yDbLczVeSAWrLQbBFpxTzQ/edit)
[2025-10-02](https://hackmd.io/frQWtzX8SQCaa0hvP7tW6A/edit)
[2025-10-16](https://hackmd.io/Pv21I7zsRnuZM595BD2Mzg/edit)
[2025-10-30](https://hackmd.io/j2yqXiX4SReufR30qO8wMQ/edit)
[2025-11-13](https://hackmd.io/YByQ6Lj3SaWzSa-JQQcbDQ/edit)
[2025-11-27](https://hackmd.io/ux4TpF3_S6axy9x1PKwuew/edit)
[2025-12-11](https://hackmd.io/7WV8uEE0T9y7iOVU7-rmkw/edit)
[2025-12-25](https://hackmd.io/RkiREgvpQ-aJqPwiQ5gLLw/edit)