## Key Features and Deliverables:
### Software Development
The core of vector tile generation is geometry manipulation and vector tile serialization. The former is particularly complex and error-prone. PostGIS' vector tile functions are used for this, as they are the most reliable implementation of geometry to vector tile logic. They are also becoming the industry standard, being used by Martin, pg_tileserv, Apache Baremaps, and several stacks internal to companies.
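As a rough illustration of how little code sits on top of PostGIS, the sketch below builds a single-layer tile query from `ST_TileEnvelope`, `ST_AsMVTGeom`, and `ST_AsMVT`. It is not Tilekiln's actual query: the helper name, the `planet_osm_polygon` table, and the `way` geometry column (stored in Web Mercator) are assumptions made for the example.

```python
def build_layer_query(table: str, layer: str, columns: list[str]) -> str:
    """Return a parameterized SQL query that produces one MVT layer.

    Hypothetical helper. Identifiers are interpolated directly, so they
    must come from trusted configuration, never from user input.
    The tile coordinates are left as %(z)s, %(x)s, %(y)s placeholders
    for the database driver to fill in.
    """
    cols = ", ".join(columns)
    return f"""
WITH bounds AS (
    -- Tile extent in Web Mercator
    SELECT ST_TileEnvelope(%(z)s, %(x)s, %(y)s) AS geom
),
mvtgeom AS (
    -- Clip and transform geometries into tile coordinate space.
    -- Assumes a geometry column named "way" in EPSG:3857.
    SELECT ST_AsMVTGeom(t.way, bounds.geom) AS geom, {cols}
    FROM {table} AS t, bounds
    WHERE t.way && bounds.geom
)
-- Serialize the rows into a single vector tile layer
SELECT ST_AsMVT(mvtgeom.*, '{layer}') FROM mvtgeom;
""".strip()


query = build_layer_query("planet_osm_polygon", "water", ["name"])
print(query)
```

All geometry clipping, simplification to the tile grid, and serialization happen inside PostGIS; the calling code only supplies the tile coordinates.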
To reuse existing work, osm2pgsql is used to load OSM data into a PostGIS backend. osm2pgsql can handle minutely updates and is currently deployed by the OSMF for the standard tile layer and Nominatim. In theory, Tilekiln would work with any PostGIS database, including those loaded with Imposm, or non-OSM data.
There are no existing components suitable for vector tile storage. Single file options like PMTiles and MBTiles do not work with minutely updates, and options which store the tiles on disk or in an object store do not scale to planet-wide datasets. Instead, compressed vector tiles are stored in a separate PostgreSQL database. This is a simple component of Tilekiln, with an implementation of about 100 SLOC. The database structure is similar to a common MBTiles implementation, but PostgreSQL can handle the concurrent updates that SQLite cannot.
PostgreSQL is chosen as storage because it has a low per-tile overhead of approximately 50 bytes, which is small compared to minimum effective file sizes. Additionally, the tile server already needs PostgreSQL, so reusing an existing component simplifies operations.
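A minimal sketch of what such a storage layer could look like is shown below. The table layout, column names, and helper functions are hypothetical and are not taken from Tilekiln; the layout is only loosely modelled on the MBTiles `tiles` table, with gzip assumed as the compression.

```python
import gzip

# Hypothetical table layout: one row per tile, keyed by zoom/x/y.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS tiles (
    zoom smallint NOT NULL,
    x integer NOT NULL,
    y integer NOT NULL,
    tile bytea NOT NULL,
    generated timestamptz DEFAULT now(),
    PRIMARY KEY (zoom, x, y)
);
"""

# An upsert lets a minutely update replace a single tile in place,
# which is what single-file formats cannot do.
UPSERT_TILE = """
INSERT INTO tiles (zoom, x, y, tile)
VALUES (%(zoom)s, %(x)s, %(y)s, %(tile)s)
ON CONFLICT (zoom, x, y)
DO UPDATE SET tile = EXCLUDED.tile, generated = now();
"""


def pack_tile(mvt: bytes) -> bytes:
    """Compress a serialized vector tile before storing it."""
    return gzip.compress(mvt, mtime=0)


def unpack_tile(stored: bytes) -> bytes:
    """Decompress a stored tile, for clients that do not accept gzip."""
    return gzip.decompress(stored)


raw = b"example tile bytes " * 20
stored = pack_tile(raw)
print(len(raw), len(stored))
```

Storing the tile already compressed means it can be sent as-is to clients that accept gzip encoding, so the serving path does no per-request compression work.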
Tilekiln is currently able to
- Generate vector tiles on demand for development
- Pre-generate tiles in a single-threaded process
- Serve pre-generated tiles without rendering new ones
- Serve pre-generated tiles while rendering missing tiles
It needs to
- Pre-generate tiles in parallel
- Emit metrics for monitoring
When combined with software capable of keeping an OpenStreetMap PostGIS database up to date, such as osm2pgsql, this means Tilekiln is capable of keeping a map up to date with minutely updates.
### Vector Tile Schema and Style
There is no such thing as a vector tile schema suitable for all uses. What to include and how to include it in the vector tiles needs to be driven by the cartography desired. For this project, a new style and schema will be developed, with schema changes driven by style changes. This style is named Street Spirit, and its aim is to be suitable
- for use as a locator map,
- to show off what can be done with OpenStreetMap data,
- to be up-to-date with the latest OpenStreetMap data, and
- for orienting viewers to the location they are at.
I considered the choice to go with a new style instead of basing the style off of OpenStreetMap Carto. OpenStreetMap Carto has been in development for over 15 years when the prior Mapnik XML style is included, and I do not believe its decisions are well-suited for a new style. The cartography it has developed is designed with the Mapnik rendering engine in mind, and to reuse its style would cause problems where the capabilities of Mapnik and MapLibre GL do not match up. Many techniques used with Mapnik are not possible with MapLibre GL, and it would not take advantage of the new capabilities of MapLibre GL.
### Infrastructure Development
The infrastructure required to create and serve vector tiles and styles has a lot in common with the existing Standard tile layer infrastructure. Instead of generating raster tiles and storing them, vector tiles will be generated and stored. The database and CDN infrastructure has the same requirements as OpenStreetMap Carto.
New tile rendering servers for the standard tile layer are the same as general-purpose servers, and the same type of server will work for a vector layer. Because vector tiles are more easily pre-generated than raster tiles, less capacity should be needed than the Standard tile layer. When in production, there should be two servers for redundancy purposes.
## Coexistence with raster tiles
At least initially the Standard tile layer will need to coexist with Street Spirit tiles. Any webpage users of the Standard tile layer will be able to switch to Street Spirit if they are using Leaflet or another library that supports MapLibre styles. Desktop and mobile apps may have more difficulty switching, but MapLibre is an option for all common mobile platforms.
It is impossible to predict how many users will be able to switch over, particularly since the OSMF has no way of contacting most of them. A vector to raster converter could be done as future work, or if there are no mapping-related uses for the standard layer, it could be shut down.
## Comparison to Existing Software and Stacks
There are other software stacks that can generate vector tiles. These can be broadly grouped into those reading OSM files directly and those which are not OSM specific and read from a database, relying on software like osm2pgsql to load the database.
### Planetiler
Planetiler can turn a planet file into vector tiles in under an hour on reasonable hardware. It has some flexibility, allowing the schema of the tiles to be altered by writing Java code. Although excellent for many uses, it is incompatible with minutely updates which are a requirement for any replacement to the Standard layer. Additionally, there are significant architectural reasons that make Planetiler poorly suited to minutely updates.
1. Planetiler's output is an 80GB MBTiles or PMTiles file and individual tiles cannot be updated. This is incompatible with minutely updates.
The PMTiles file format is not designed for updating individual tiles within an archive, and a completely new file needs to be written for every update. With an 80GB PMTiles file, this is not practical.
MBTiles on the surface appears updatable, but this cannot be done under real-world load. Normally when using MBTiles, one process generates the MBTiles file, and independent serving processes point at it to read from it. This would require updating the complete file, with the same problems as PMTiles.
In theory, because MBTiles is SQLite, reads and writes of tiles could be done in parallel, but SQLite is not designed for heavy concurrent IO by independent processes. Previous experiments with MapProxy and MBTiles have shown SQLite fails to work reliably under this type of workload. Instead, the generation and serving of tiles would have to all go through one process.
2. Planetiler does not store enough state to use minutely updates. Using a minutely update requires, at a minimum, storing all OSM objects in the planet with a method like osm2pgsql's slim tables or imposm's cache. Without this information, it is not possible to construct geometries for changed objects. Previous efforts like the imposm2 to imposm3 rewrite, Overpass API, and osm2pgsql slim tables have established that doing this is a significant investment.
3. Planetiler's vector tile generation code is not shared by any other software. One of the most error-prone parts of vector tile generation is ensuring the output is always valid and as expected after geometry transformation to the vector tile grid. Work by Tegola, Tilemaker, t-rex, and a number of internal stacks written by large companies has established that this is difficult, prone to non-obvious errors, and results in duplicating work.
### Tilemaker
Tilemaker uses the same approach as Planetiler, except it is written in C++ and has a Lua API which allows the user to specify how to transform data without having to recompile the software. It takes about a day to generate tiles for the planet, and has the same architectural reasons as Planetiler for not being suited to being a replacement for the Standard layer.
### Martin and pg_tileserv
Martin and pg_tileserv operate similarly: both read from a PostGIS database and generate the vector tiles in PostGIS itself, using the same functions as Tilekiln. This means that their output is well-tested, as PostGIS has excellent vector tile generation functions. They both suffer from the same drawbacks that prevent use as a replacement for the Standard layer.
1. There is no vector tile storage of generated tiles, and all tiles are generated on-demand. When combined with a content delivery network (CDN), this is only a minor issue for high zooms which are fast to generate, but it does not work for low and medium zooms.
A medium zoom tile could take 30 seconds or several minutes to generate, depending on data density and how much geometry simplification needs to be done. This means that the first user to request the tile will not get it before the connection times out. It is never acceptable for a user to wait minutes for a map to load. Other CDN factors, such as multiple POPs and cache eviction, will make this happen more often. Increasing the cache time will help with this problem, but not solve it, and any improvements come at the cost of making minutely updates not work.
2. Each layer must be specified as a PostGIS function returning a binary blob, and layers are combined together based on the URL. This is not a friendly format for style development, as it tends towards writing lengthy PostgreSQL queries with extensive duplication in them, and generally a poor experience when writing queries for a real-world complex basemap. These queries generally take the form of a large UNION ALL, where editing the schema at one zoom level requires edits at all zoom levels.
I considered a caching layer in front of Martin that solves the storage problem, but this would still leave the difficulties in writing complex PostgreSQL functions.
### Tegola with PostGIS generation
Tegola can use PostGIS to generate vector tiles, bringing the same advantage of robust vector tile generation Tilekiln, Martin, and pg_tileserv have. Additionally, it supports caching tiles. Unfortunately, it suffers from issues which prevent its usage with a worldwide basemap.
1. Tegola's token substitutions suffer similar problems to Martin and pg_tileserv functions, where queries tend towards lengthy UNION ALL statements, and there are a number of bugs.
2. Tegola only allows tiles to be saved to files on disk, Redis, or Amazon S3. Files on disk and S3 do not scale to offer adequate cost and performance for a planet-sized tileset, and Redis does not offer persistent storage.
3. The Tegola software project is no longer active.
### Apache Baremaps
Apache Baremaps uses PostGIS to generate vector tiles, bringing the same advantage of robust vector tile generation Tilekiln, Martin, and pg_tileserv have. Additionally, it can write tiles to disk.
1. Storing tiles as files on disk does not scale to offer adequate performance for a planet-sized tileset.
2. The JSON configuration for vector tiles does not offer advanced enough token substitution for complex basemaps, particularly if geometries need merging or simplification.
### Other options
t-rex, Tegola with built-in vector tile building, and node-mapnik were considered, but they are primarily unmaintained or legacy software, and/or offer nothing that the approaches above do not.
## Comparison to existing client-side styles
### [OSM OpenMapTiles](https://github.com/openmaptiles/openmaptiles/tree/master/style)
OSM OpenMapTiles is the official style of OpenMapTiles, and is based on a subset of OpenStreetMap Carto cartography. The generation of tiles is done with either a PostGIS ST_AsMVT generator, or a legacy node-mapnik toolchain that outputs vector tiles. By using ST_AsMVT, it uses the same MVT generation code as Tilekiln, Martin, pg_tileserv, Tegola, and Apache Baremaps. Neither toolchain is capable of minutely updates.
Because the style depends on what is included in OpenMapTiles, it is ill-suited for showing off what can be done with OpenStreetMap data and the range of features mappers demand from the default map on osm.org. This is an issue common to any general-purpose map style and schema, as the goals of a general-purpose style and schema do not align with what we want from a default layer. If we were to fork OpenMapTiles to add changes they are not interested in, we would lose any advantages of having a common schema.
Several potential users of OpenMapTiles have turned away from it because OpenMapTiles requests attribution for the schema itself, a requirement that is generally believed to have no legal basis. OpenMapTiles have [indicated they are changing the attribution requirements](https://github.com/openmaptiles/openmaptiles/issues/1416), but have taken no action since an initial press release.
### OpenStreetMap Americana
OpenStreetMap Americana is not a stand-alone style, existing as [a webpage](https://zelonewolf.github.io/openstreetmap-americana), with some work underway to turn it into a stand-alone reusable style. It is not known if this is possible, given the complexities of road shields the style deals with. It currently uses OpenMapTiles vector tiles, generated by Planetiler. It is likely that Americana will stop using OpenMapTiles as they have found themselves limited in what they can add to the style by what OpenMapTiles is willing to accept.
Besides technical issues, the style is focused on American cartography, a goal contradictory to a map with a worldwide audience.
### Other options
Shortbread was considered, but does not come with styles. Additionally, it is designed as a basic schema and not suited to complex basemaps.
## Expected Benefits:
* Enhanced map experience, especially on mobile devices, with smoother, more visually appealing maps.
* Faster style development for a variety of mapping needs.
* Minutely updates for real-time mapper feedback, enhancing the community engagement.
* Possible bandwidth savings when browsing areas at high zoom.
* Overcoming technical limitations of the current raster-based Standard Tile Layer.
Several benefits will become possible but are not part of the initial work, or are work that other people would do:
* Easier customization of displayed data, such as international boundaries and map language.
* Introduction of clickable points of interest for a richer, more interactive map experience.
## Key Milestones:
1. Software design and development phase.
1. Vector tile schema and style development.
1. Beta testing and user feedback collection.
1. Parallelism implementation.
1. Infrastructure setup and optimization.
1. Full-scale implementation and launch.
## Quality Assurance
Tilekiln has tests for most of its code, which are run automatically as part of CI. This will be maintained for new code. There are no good standard unit testing techniques for cartography, but tests will be added to check that the style is valid.
The style will be made available on a preview site to allow others to work on it.
## Project Timeline:
1. Software design and development phase: 2 weeks.
2. Vector tile schema and style development: 11 weeks
3. Beta testing and user feedback collection: 3 weeks
Most of the time during these three weeks will not be spent working on the project, but will be waiting for people to respond to the calls for comments.
4. Parallelism implementation: 2 weeks
5. Infrastructure setup and optimization: 4 weeks
This is based on an estimate of working 3-4 days per week on the project and the tasks described in the appendix, allowing for non-OSMF work and vacations. A total of 22 weeks is estimated, but an additional week of non-work should be added as I will be attending conferences not part of this proposal. It is impossible to predict where the conferences will take place in the schedule without knowing a start date.
If additional hours are required for work, I could likely decrease non-OSMF work done to spend more time on the project.
## Budget Estimate:
72.5 days of work corresponds to 507.5 hours of estimated work at 7 hours/day. 7 hours is chosen because a normal work day includes administration and other work not billable to the project.
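The hour figure can be cross-checked against the per-phase totals in the appendix. The short calculation below simply sums the "Total:" lines as written there and multiplies by the 7-hour day; the variable names are only for illustration.

```python
# Phase totals in days, copied from the "Total:" lines in the appendix.
phase_days = {
    "Software design and development": 7.5,
    "Vector tile schema and style development": 37.5,
    "Beta testing and user feedback collection": 4,
    "Parallelism implementation": 12,
    "Infrastructure setup and optimization": 8.5,
    "Full-scale implementation and launch": 3,
}
HOURS_PER_DAY = 7

total_days = sum(phase_days.values())
total_hours = total_days * HOURS_PER_DAY
print(f"{total_days} days x {HOURS_PER_DAY} h/day = {total_hours} hours")
# prints "72.5 days x 7 h/day = 507.5 hours"
```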
## Project scope
The scope of this project is the software and style to generate a vector-tile client-side rendered map with the features listed in the expected benefits section. Work outlined in follow-up work is explicitly out of scope. The cartography will be suitable
- for use as a locator map,
- to show off what can be done with OpenStreetMap data,
- to be up-to-date with the latest OpenStreetMap data, and
- for orienting viewers to the location they are at.
Cartographic features included will be those currently in Street Spirit, plus those listed in the style development phase.
## Stakeholders
The stakeholders are the
- current users of the standard layer;
- OWG, who are responsible for the servers which Street Spirit will run on;
- sysadmins who are responsible for the day-to-day running of the servers; and
- style developers interested in participating.
## Roles and Responsibilities
### Current users of the standard layer
The current users of the standard layer will be kept in the loop via diary entries and other communications tasks, but they have no explicit tasks. Feedback from users will help improve the style, but there is no way to require them to provide feedback.
### OWG
The OWG are responsible for budgeting and supplying servers for the style, deciding on featured layers for the website, determining usage policies, and responding to requests in a timely manner.
Some work, such as usage policies, will be done by Paul with an OWG hat on, while other work will be done as part of normal work. Due to conflict of interest reasons, Paul will not be able to participate in some OWG matters.
### Sysadmins
The sysadmins are responsible for reviewing Chef pull requests in a timely manner, documenting how to use the existing Chef setup, and the ongoing day-to-day running of the service once it is set up.
### Style developers
Paul will be reaching out to other style developers to try to get interest in participating in a volunteer capacity. There is a reasonable chance of attracting developers if the style becomes a featured layer.
### OSMF
The OSMF will be responsible for assigning a contact person for regular meetings, as well as a backup if the primary contact is away.
## Risk Management
The parts of the project with the highest risk are
- the parallelization of tilekiln,
- performance bottlenecks,
- running into the limitations of MapLibre, and
- community acceptance.
To reduce the risks of the parallelization of tilekiln, an initial prototyping and research stage is tasked out.
To reduce the risk of unexpected performance problems, one of the early tasks is to conduct tests to see the compute resources needed. If unfixable performance problems are found, it is possible to switch generation methods from pre-generation to a mix of pre-generation for low zooms and on-demand for high zooms.
There is little that can be done for the limitations of MapLibre. I am experienced in Mapbox GL and MapLibre style development and the limits of what is possible, and it is generally possible to work around its limits at the cost of performance and complexity, but in some cases desired cartography may not be possible.
To reduce the risk of the community not accepting the result of the project, even if it meets all of its goals, regular updates are provided in many tasks, along with a beta stage. Ultimate community acceptance is out of the scope of the project as I cannot control it.
## Change Management
Changes to scope will be documented by email with the OSMF contact person. Some scope changes are expected after collecting user feedback, in accordance with good design practices.
## Communication Plan
There will be a weekly meeting with a contact person decided by the OSMF, and they will be responsible for communications to other parties in the OSMF about the status of the project.
Communications with the community will be done by diary posts, Discourse, and mailing lists, and are built into the tasks. Communication frequency will be variable depending on the part of the project and how interesting that part is to the users, but will typically be semimonthly.
## Project Closure and long-term maintenance
After completion of the project both Tilekiln and Street Spirit will continue to exist as Open Source projects.
Tilekiln's complexity comes from PostGIS, which is externally maintained. Tilekiln itself is only 609 SLOC. While this will increase with the work to be done, it is not complex software. For comparison, mod_tile is about 9000 SLOC.
OpenStreetMap Carto has shown there are a limited number of cartographers volunteering their time on styles, but a project hosted on OSMF servers and publicly available can attract them. Street Spirit offers several advantages over OpenStreetMap Carto for attracting new developers
- no legacy code to work around;
- a newer rendering engine than Mapnik;
- easier reusability for other parties with the ability to replace layers with their own data; and
- much development can be done without any local data processing, using OSMF-served tiles.
## Possible follow-up work
### Documentation on extending the style for other purposes
Currently there are several projects that use OpenStreetMap Standard tiles and overlay additional OSM data on top. Two popular examples are OpenSeaMap and OpenRailwayMap. Work could be done to document how to replace one class of object in Street Spirit with detailed interest-specific data, avoiding the problems currently found when overlaying on top of a raster layer.
This would also cover documentation on how to replace OSM data with some other source, for example if you were legally required to present a specific set of boundaries without regards to what is in OSM.
### Localized labels
With client-side rendering, it is possible to include additional data and switch the displayed language of names. Once added, making use of the ability on osm.org would require additional front-end work.
Issues: [spirit/#10](https://github.com/pnorman/spirit/issues/10), [tilekiln/#3](https://github.com/pnorman/tilekiln/issues/3)
### Clickable POIs
Vector tiles can contain some information not used by the style, so it is possible to take actions when the map is clicked. A clickable POI feature would require product work to determine what capabilities are desired. It is not possible to make every feature shown clickable, nor can full OSM tags be included with each object. These would result in vector tiles that are too large to be usable.
### Tile downloads
Work could be done to prepare weekly dumps of all the tiles as PMTiles, allowing people to self-host the tiles and not be subject to OSMF usage policies.
### Raster tile generation
It is possible to generate raster tiles from vector tiles, and this would allow clients which cannot use vector tiles to use Street Spirit as a replacement to the Standard layer. This could include static image generation to replace the osm.org export functionality, which is very resource intensive when used.
## Appendix: Detailed tasking
### Software design and development phase
This milestone completes steps necessary for more efficient style development and collecting data that will be needed later on.
Total: 7.5 days
#### Test resource requirements
Set up a world-wide minutely updated server with current osm2pgsql, Tilekiln, and Street Spirit to establish
- number of dirty tiles per day
- time to generate a day's worth of dirty tiles with current Street Spirit.
The results will be published in a diary post.
Estimate: 2 days
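One way the dirty-tile count could be derived is sketched below. It assumes the expiry list is a text file of `z/x/y` lines at a single maximum zoom, which is the form osm2pgsql's tile expiry output takes; the function name and the choice of minimum zoom are hypothetical.

```python
from collections import Counter


def count_dirty_tiles(lines, min_zoom):
    """Count unique dirty tiles per zoom from 'z/x/y' lines.

    Each expired tile also dirties every ancestor tile that contains
    it, so the list is propagated up to min_zoom before counting.
    """
    dirty = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        z, x, y = (int(part) for part in line.split("/"))
        while z >= min_zoom:
            dirty.add((z, x, y))
            # Parent tile: one zoom lower, coordinates halved.
            z, x, y = z - 1, x // 2, y // 2
    return Counter(z for z, _, _ in dirty)


expired = ["14/8000/5000", "14/8001/5000", "14/8000/5000"]
per_zoom = count_dirty_tiles(expired, min_zoom=12)
print(dict(sorted(per_zoom.items())))
# prints "{12: 1, 13: 1, 14: 2}"
```

Using a set removes duplicates, which matters because the same tile is typically expired many times over a day of edits.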
#### Modernize + automate python packaging
Review current best practices for python packaging, implement them, and have them run automatically in CI
Issues: [tilekiln/#5](https://github.com/pnorman/tilekiln/issues/5), [tilekiln/#9](https://github.com/pnorman/tilekiln/issues/9)
Estimate: 1 day
#### Allow tilejson overrides
Allow overriding of tilejson tile URL, to allow serving behind a CDN
Issues: [tilekiln/#8](https://github.com/pnorman/tilekiln/issues/8)
Estimate: 0.5 day
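The sketch below shows the kind of override this task describes: the same TileJSON document is produced, but the tile URL can point at a CDN hostname instead of the server itself. The function, the URL layout, and the default address are assumptions for illustration, not Tilekiln's current behaviour.

```python
import json


def build_tilejson(style_id, minzoom, maxzoom,
                   base_url="http://127.0.0.1:8000"):
    """Build a minimal TileJSON document with an overridable tile URL.

    Hypothetical helper: the path layout is an assumption, and a real
    document would list the layers of the tileset in vector_layers.
    """
    return {
        "tilejson": "3.0.0",
        "tiles": [f"{base_url.rstrip('/')}/{style_id}/{{z}}/{{x}}/{{y}}.mvt"],
        "minzoom": minzoom,
        "maxzoom": maxzoom,
        "vector_layers": [],
    }


# Development: tiles come straight from the local server.
local = build_tilejson("spirit", 0, 14)

# Production: clients are pointed at the CDN instead.
cdn = build_tilejson("spirit", 0, 14, base_url="https://tiles.example.org/")
print(json.dumps(cdn, indent=2))
```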
#### Identify needed metrics
Identify needed metrics for monitoring tile generation and serving in production.
Estimate: 1 day
#### Gather tile storage size metrics
Create functionality in Tilekiln to get metrics on tile size in storage
Issues: [tilekiln/#7](https://github.com/pnorman/tilekiln/issues/7)
Estimate: 1 day
#### Publish tile storage metrics to prometheus
Establish the groundwork for publishing to prometheus, and publish the tile size storage metrics
Estimate: 2 days
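To make the target concrete, the sketch below renders per-zoom storage sizes in the Prometheus text exposition format using only the standard library. The metric name and the sample sizes are invented for the example, and a real implementation would more likely use a Prometheus client library than format the text by hand.

```python
def render_tile_size_metrics(sizes_by_zoom):
    """Render per-zoom stored tile sizes as Prometheus exposition text.

    The metric name tilekiln_stored_tile_bytes is hypothetical.
    """
    lines = [
        "# HELP tilekiln_stored_tile_bytes Total size of stored tiles by zoom.",
        "# TYPE tilekiln_stored_tile_bytes gauge",
    ]
    for zoom in sorted(sizes_by_zoom):
        size = sizes_by_zoom[zoom]
        lines.append(f'tilekiln_stored_tile_bytes{{zoom="{zoom}"}} {size}')
    return "\n".join(lines) + "\n"


# Made-up sizes, in bytes, purely to show the output shape.
body = render_tile_size_metrics({1: 98000, 0: 45000})
print(body, end="")
```

A gauge is used because total stored size can go down as well as up when tiles are regenerated.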
#### Implement real-time serving metrics
Implement metrics from real-time serving
Estimate: 2 days
### Vector tile schema and style development
This section will complete the style work that is currently known. Throughout development the latest version of the style will be deployed on a preview site, and diary posts will be posted after significant user-facing changes to allow users to review and comment on them.
Many of the cartography tasks will involve research, looking at other maps for ideas.
Total: 37.5 days
#### Investigate sprite building
Investigate methods for building the spritesheet, including if charites can handle sprite building.
Estimate: 1 day
#### Implement sprite building
Using previous investigation, implement working spritesheet building.
Estimate: 1 day
#### Investigate font building
Research how to build the font stack for Street Spirit, and create follow-up issues to implement it
Issues: [spirit/#22](https://github.com/pnorman/spirit/issues/22)
Estimate: 1 day
#### Implement font stack
Switch from go-spatial provided fontstacks to one built specifically for Street Spirit.
Estimate: 2 days
#### Implement transit lines
Implement rendering of transit lines
Issues: [spirit/#15](https://github.com/pnorman/spirit/issues/15)
Estimate: 2 days
#### Implement pedestrian and non-motorized road cartography
Implement cartography for pedestrian roads, as well as footpaths, bicycle paths, and similar non-motorized ways
Issues: [spirit/#13](https://github.com/pnorman/spirit/issues/13)
Estimate: 2 days
#### Implement tracks
Implement cartography for tracks that fits with other road and highway cartography
Estimate: 2 days
#### Implement road shields
Add shields for motorized roads, from relations and ref tags
Estimate: 5 days
#### Implement shop POIs without icons
Implement rendering of shop POIs as dots or an alternate means that does not have shop-specific icons. This will require consideration of jinja templates and how to best handle large lists of shops
Estimate: 2 days
#### Implement food POIs without icons
Remove current inconsistent icons from food POIs and render them the same.
Issues: [spirit/#18](https://github.com/pnorman/spirit/issues/18)
Estimate: 1 day
#### Identify missing high-zoom fills
Review high-zoom fills in OpenStreetMap Carto and identify any that are not in Street Spirit and should be added, and create issues for them.
Estimate: 1 day
#### Implement missing high-zoom fills
Implement missing fills previously identified
Estimate: 1 day
#### Consistent transit POI icons
Find or design a consistent set of transit POI icons
Issues: [spirit/#12](https://github.com/pnorman/spirit/issues/12)
Estimate: 3 days
#### Implement one-way road icons
Issues: [spirit/#17](https://github.com/pnorman/spirit/issues/17)
Estimate: 2 days
#### Extend roads to mid-zooms
Add mid-zoom roads to vector tiles and add styling
Estimate: 2 days
#### Extend roads to low zooms
Add low-zoom roads to vector tiles and add styling
Estimate: 1 day
#### Power lines
Add power lines and poles to tiles and style them
Issues: [spirit/#19](https://github.com/pnorman/spirit/issues/19)
Estimate: 1 day
#### Add individual trees
Add individual natural=tree to tiles and add rendering
Issues: [spirit/#20](https://github.com/pnorman/spirit/issues/20)
Estimate: 1 day
#### Add drinking water POIs
Add drinking water POIs to tiles and add rendering, including an icon
Issues: [spirit/#21](https://github.com/pnorman/spirit/issues/21)
Estimate: 1 day
#### Include osm2pgsql config with Street Spirit
Move away from OpenStreetMap Carto's flex PR and put the flex config in Street Spirit's repo
Estimate: 0.5 day
#### Add vegetation table with generalization
Add vegetation table to flex config, and set up generalized tables using osm2pgsql-gen
Estimate: 5 days
### Beta testing and user feedback collection
Total: 4 days
#### Publishing to community communications channels for feedback
Writing diary entries, Discourse posts, mailing list emails, and other communications to solicit user feedback, as well as responding to questions
Estimate: 1 day
#### Collecting user feedback
Collecting user feedback from beta testing, and turning feedback into issues.
Estimate: 1 day
#### Identifying scope changes
Working with the OSMF to identify any scope changes or new work from the beta test, as well as time estimates.
Estimate: 2 days
### Parallelism implementation
Total: 12 days
#### Identify parallelism requirements
Identify the requirements for parallelism, including
- size of tile queue
- independence of parallel work
- amount of parallelism required
Estimate: 1 day
#### Parallelism research and prototyping
Research ways to implement parallelism in python, given the requirements identified, and prototype to figure out what is best.
Estimate: 3 days
#### Implement parallel tile pre-generation
Implement pre-generating tiles for an area or a tile list in parallel
Estimate: 8 days
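The research task above will determine the actual design. Purely as an illustration of one candidate, the sketch below fans a tile list out over a thread pool, on the assumption that most of the work happens inside PostgreSQL so Python threads spend their time waiting on the database. `render_tile` is a stand-in, not a Tilekiln function.

```python
from concurrent.futures import ThreadPoolExecutor


def tiles_for_zoom(zoom):
    """Yield every (zoom, x, y) tile at one zoom level."""
    n = 2 ** zoom
    for x in range(n):
        for y in range(n):
            yield (zoom, x, y)


def render_tile(tile):
    """Stand-in for generating and storing one tile.

    A real worker would run the tile query and write the result to
    storage; here it only returns the tile's identifier.
    """
    zoom, x, y = tile
    return f"{zoom}/{x}/{y}"


def pregenerate(tiles, workers=4):
    """Render tiles concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Note: map() submits the whole iterable up front, so a
        # planet-sized tile list would need to be fed in batches.
        return list(pool.map(render_tile, tiles))


done = pregenerate(tiles_for_zoom(2))
print(len(done), done[:3])
```

The comment in `pregenerate` is why the size of the tile queue is one of the requirements to identify first: a naive approach holds the whole queue in memory.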
### Infrastructure setup and optimization
Total: 8.5 days
#### Request resources from OWG
Request server resources from OWG.
Estimate: 0.5 day
#### Cookbook writing
Write service-specific role for vector tile serving. Completion of this task requires that ops action items on documenting chef testing have been completed, and sysadmins can review PRs in a timely manner.
Estimate: 5 days
#### Initial import and pre-generation
Import the planet to the OSMF server and generate tiles, and document procedures in runbook
Estimate: 1 day
#### Setup of Fastly
Notify Fastly, set up CDN and logging
Estimate: 1 day
#### Dashboard setup
Set up backend dashboard in Prometheus
Estimate: 1 day
### Full-scale implementation and launch
Total: 3 days
#### Write webpage showing maps
Write a demo webpage showing the maps on OSMF hardware, and deploy it.
Estimate: 2 days
#### Submit proposal for featured layer
Submit proposal to add Street Spirit as a featured layer on osm.org.
Estimate: 1 day