# Revising GAP infrastructure: website, downloads, package distribution, releases
[TOC]
The following describes various plans for overhauling our existing GAP infrastructure. The goal is to become less reliant on servers hosted in a single place (currently the University of St Andrews) and on the admin work they require, and instead to work "in the cloud", with many people in different places able to help with administration.
The proposals below center on infrastructure provided by GitHub and other third-party providers. But of course we don't want to tie ourselves permanently to GitHub & co. Therefore, in each case, it should be possible to switch very quickly to alternative solutions if that ever becomes necessary. This seems quite achievable, and several concrete suggestions towards this end are described below. This concern should also be kept in mind when implementing these changes.
## Task 1: Switching website to Jekyll (DONE)
NOTE: Work on this has begun, see <https://github.com/gap-system/GapWWW/pull/121>; this also contains further notes on technical details.
This task proposes to migrate the [GAP website](https://www.gap-system.org/) (whose source code can be found at <https://github.com/gap-system/GapWWW/>) from our proprietary [Mixer](https://github.com/gap-system/Mixer) static website generation tool (see also [here](http://www.math.rwth-aachen.de/~Max.Neunhoeffer/Computer/Software/mixer.html)) to [Jekyll](https://jekyllrb.com). Benefits:
- many people are already familiar with Jekyll
- there is extensive documentation for it
- there is a vibrant community driving its development
- lots of themes, plugins, etc. are available
- it is super easy to host Jekyll on GitHub (though we may need to use something slightly more advanced in the end, see below)
There are of course many alternatives to Jekyll, see e.g. [here](https://www.staticgen.com) for many more. But Jekyll seems to be the most widely used; also, many of us are already familiar with it, and hosting it on GitHub is dead simple. A major advantage of certain alternatives such as [Hugo](https://gohugo.io) seems to be raw speed of the site generation, but I do not believe that this is an issue for us.
This task is purely about changing the way we generate the final static website from a set of source files. It does *not* involve updating the looks and layout of the site, its structure or content (even though all of those are also desirable, of course).
### How to migrate to Jekyll
At first, we will use Jekyll and Mixer *together*: First run Mixer, to generate
HTML files, then run Jekyll to copy those HTML files to its `_site/` dir.
Then, use the content of the `_site` directory as content of the website.
Then we add some `_includes` and `_layouts` files which reproduce the existing website "styles". With these available, we can start to gradually convert `.mixer` files to `.html` or `.md` files handled by Jekyll. Over time, we'll likely also add `_data` files, e.g. to handle the packages list and bibliography. Indeed, there are currently tools that produce `.mixer` files; these should be changed to produce inputs for Jekyll instead.
Note that we can do all this even with the existing hosting solution!
Once the conversion has begun, everybody has a chance to contribute to it, even people who know little about GAP or web development -- we could even hire students to help with it.
### Decisions to be made and steps to be taken
- Decide: do we want to switch to Jekyll?
- Decide: do this in the `master` branch of the [GapWWW repository](https://github.com/gap-system/GapWWW)? Or move it to an experimental branch or repository?
- ...
## Task 2: Separating text from file download (DONE)
Right now, we have genuine web content and binary downloads next to each other, under the same domain. E.g. we have both of these:
* <https://www.gap-system.org/Releases/index.html>
* <https://www.gap-system.org/pub/gap/gap-4.10/tar.gz/gap-4.10.1.tar.gz>
But for most alternative hosting, it makes sense to separate these.
### Suggested steps:
1. Adjust all download links to go through a Jekyll template (or possibly several) which turns a filename into a download link.
2. Move files to separate subdomain, e.g. files.gap-system.org, ftp.gap-system.org, downloads.gap-system.org, ...
3. Adjust the download link templates to generate the new URLs.
4. While at it, the new download site might once again allow browsing through all files, like it was possible with the FTP solution.
Note: to avoid breaking existing links (also from outside sources!) we should consider providing [301 redirects][301] from the old to the new locations. Unfortunately, this is currently not possible on GitHub! But it is available on Netlify and on our own servers, or with other solutions such as Amazon AWS S3 (more on this in Task 3).
However, future release tarballs might be hosted in a completely different way and location anyway (e.g. the GitHub release system, see Tasks 3 and 4).
So, what we also could do is this:
1. Provide the new file hosting, together with [301 redirects][301] in the old location; all still hosted in St Andrews
2. Search for all places that use the old file download URLs, e.g. `make bootstrap` targets in the GAP build system, and change them to use the new URLs
3. Disable the 301 redirects for a while to see whether we got all (and fix any broken links that turn up)
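Step 2 of this list can be partly mechanized. The following is a minimal sketch, not a finished tool: it assumes the new host is the example subdomain `files.gap-system.org` and that the path below `/pub/gap/` stays unchanged, both of which are still open decisions. The function names are hypothetical.

```shell
#!/bin/sh
# Old and new prefixes for download URLs. The new host is the example
# subdomain suggested above; the unchanged path layout is an assumption.
OLD_PREFIX='https://www.gap-system.org/pub/gap/'
NEW_PREFIX='https://files.gap-system.org/pub/gap/'

# List the files below directory $1 that still mention the old prefix.
find_old_urls() {
    grep -rlF -- "$OLD_PREFIX" "$1"
}

# Filter: copy stdin to stdout, rewriting old download URLs to the new prefix.
rewrite_urls() {
    # Escape the dots so that sed matches them literally.
    old_re=$(printf '%s' "$OLD_PREFIX" | sed 's/\./\\./g')
    sed "s|$old_re|$NEW_PREFIX|g"
}
```

For example, `find_old_urls .` run in a checkout of the GAP repository would list candidate files such as `Makefile.rules`, and `rewrite_urls < old_file > new_file` produces the updated version for review before it is committed.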
### Decisions to be made and steps to be taken
- Decide: do we want the 301 redirects?
- Add the subdomain for files, say `files.gap-system.org`
- Ensure downloads are available from the new subdomain
- Add the redirects (if so desired)
- Update all web pages, Wiki pages, GAP manuals etc. to use the new file download
URLs (also the `PKG_BOOTSTRAP_URL` variable in GAP's `Makefile.rules`)
- ...
## Task 3: Hosting website in the cloud
The proposal is to change the hosting of the main GAP website from a server in St Andrews over to [GitHub Pages](https://pages.github.com), which we already use for many packages.
This relies on Task 1, but only in its simplest form; i.e., once the website has been minimally converted to work with Jekyll, we can proceed with this task.
This task is *not* concerned with the hosting of big binary files. These take much more storage and bandwidth than the text content of the website, and are covered in a separate task.
### Pros and cons
Main advantages of such a move:
- uptime should improve (St Andrews had several multi-hour outages in 2018)
- we can forget about most maintenance needs, including security patches
- it is very easy to grant many people the right to deploy website updates; we can even automate it
Similar free hosting is available from e.g. [GitLab](https://gitlab.com) and [Netlify](https://www.netlify.com). Since the pages are ultimately static, any kind
of webserver can very easily and quickly be adapted for this.
However, the main advantage of GitHub Pages is that people already need to be familiar with GitHub for submitting issues or working on GAP, and thus may already have an account there. They are hence also able to contribute to the [GapWWW repository](https://github.com/gap-system/GapWWW).
One concern with GitHub Pages is that it does not support [301 redirects][301], which would be useful as described in Task 2. However, we can do without them; or we could use Netlify (which supports 301 redirects, but is less convenient for us to administer than GitHub Pages, at least at the free tier). Or if we are lucky, GitLab might soon add support for 301 redirects, see <https://gitlab.com/gitlab-org/gitlab-pages/issues/24>. We could also keep using hosting at a university, but that leaves concerns about uptime and maintenance (who will be able to access the machines if it becomes necessary?). We could also look into paid hosting solutions, such as Amazon AWS S3.
Deployment, on the other hand, is not a problem at all: for GitHub Pages it is of course automatic (unless we want to use Jekyll plugins); for all others, we can easily use Travis CI, CircleCI, Azure Pipelines, etc.
### Work plan
1. Decide in which repository we want the data for this to be hosted:
   - `gh-pages` branch of <https://github.com/gap-system/gap/>
     - fits with what GAP packages do; but perhaps it'd be better to retain a separate issue tracker for website issues?
   - `master` branch of <https://github.com/gap-system/GapWWW/>
     - least work, but may be difficult if we want to experiment with using GitHub Pages while at the same time keeping the existing hosting solution working, until we are ready to switch (then again, this might work without a hitch, too; hard to tell in advance)
   - `gh-pages` branch of <https://github.com/gap-system/GapWWW/>
     - perhaps a good compromise?
2. Meta discussion: perhaps we should rename `GapWWW`, e.g. to one of these:
   - `www`
   - `web`
   - `website`
3. Prepare for `www.gap-system.org` (or at first a new subdomain, like `gh-pages.gap-system.org` or `test.gap-system.org`) pointing at GitHub Pages via [these instructions](https://help.github.com/en/articles/using-a-custom-domain-with-github-pages)
   - this essentially amounts to adding a file `CNAME` with content `www.gap-system.org` to the repository
4. Test the result extensively, fix any broken links etc.
5. Change DNS setup for `gap-system.org` by adding a CNAME record pointing to GitHub
   - Q: Who actually has access to the DNS config for our domain?
   - A: Alex Konovalov knows whom to talk to
6. Profit!
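Step 3 of the work plan is small enough to sketch. The helper below only writes the `CNAME` file into a checkout. Its name is hypothetical, and which repository and branch it runs in depends on the decision in step 1.

```shell
#!/bin/sh
# Write the CNAME file that GitHub Pages reads to learn the custom domain.
#   $1: path to the repository checkout
#   $2: custom domain, e.g. www.gap-system.org or a test subdomain
write_cname() {
    printf '%s\n' "$2" > "$1/CNAME"
}
```

After `write_cname . www.gap-system.org`, the file still has to be committed and pushed. Once the DNS change from step 5 is in place, `dig +short www.gap-system.org CNAME` should report the GitHub Pages host of the organization (presumably `gap-system.github.io.`, assuming the site is served from the `gap-system` organization).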
#### Contingency plan if we need or want to move away from GitHub
Since this is just about static hosting, we can always go back to hosting at a university (St Andrews, Siegen, ...) or virtually any other kind of service offering.
### Decisions to be made and steps to be taken
- [ ] add a subdomain for experimenting with alternate hosting, say `test.gap-system.org`
TODO
## Task 4: Migration of file downloads to GitHub releases (DONE)
We could host all our big binary files in the GitHub release system. This way, it is easy for multiple people to maintain the data, and we also get to benefit from their CDN (content delivery network). As an example, I added a few files for the GAP 4.10.0 and GAP 4.10.1 releases, see here:
- <https://github.com/gap-system/gap/releases>
- <https://github.com/gap-system/gap/releases/tag/v4.10.0>
- <https://github.com/gap-system/gap/releases/tag/v4.10.1>
Note that this does not necessarily have to *replace* our regular file hosting; it could also supplement it. Even if we replace it, I recommend that we store backup copies of these files in an independent location. E.g. I would set up a cron job to back them up at the Uni Siegen, too.
Some tasks:
1. create upload scripts which take a bunch of tarballs and upload them to the
GitHub file release system, to the appropriate tag
- TODO: details; mention that we can lift some code from ReleaseTools or many other existing tools
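One possible shape for such an upload script is sketched below. It assumes the GitHub CLI (`gh`) is installed and authenticated and that the release for the tag already exists. The function name and the `DRY_RUN` and `REPO` variables are hypothetical conveniences, not part of any existing GAP tooling.

```shell
#!/bin/sh
# Repository that holds the releases; override via the environment if needed.
REPO=${REPO:-gap-system/gap}

# Upload files to the GitHub release belonging to a tag.
#   $1: release tag, e.g. v4.10.1
#   remaining arguments: files to upload
# Set DRY_RUN=1 to print the command instead of running it.
upload_release_files() {
    tag=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "usage: upload_release_files TAG FILE..." >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo gh release upload "$tag" "$@" --repo "$REPO"
    else
        gh release upload "$tag" "$@" --repo "$REPO"
    fi
}
```

A call such as `DRY_RUN=1 upload_release_files v4.10.1 gap-4.10.1.tar.gz` shows what would be executed without touching GitHub, which is useful while the release tooling is still being developed.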
#### Contingency plan if we need or want to move away from GitHub
We simply go back to the current situation and host at a university.
## Task 5: Revising the package distribution
[See this HackMD document](https://hackmd.io/EFifigvAQ32fYXd6XgbnQg).
#### Contingency plan if we need or want to move away from GitHub
TODO
## Task 6: Overhaul the *design* of the GAP website
TODO: let's find 1-2 people who are good at web design
and let them work on the look of our website, and on a nice landing page.
Alternatively, we could look for some good templates and adapt them; or host a contest for "designing the best GAP website" or so among all our users, advertised on the website itself.
However, I'd recommend to first switch to Jekyll.
## Task 7: Overhaul the *content* of the GAP website
TODO: we need a better landing page with less distraction and more focus on what people need; also, a lot of things are needlessly difficult to find, scattered over multiple pages etc.; other pages have valuable content but it's difficult to parse out because there is too much fluff text; etc.
However, I'd recommend to first switch to Jekyll.
### idea: versioned manuals
Another idea: it would be nice if we archived the documentation
for all GAP releases, and made it available simultaneously, with the GAP version in the URL. So <https://www.gap-system.org/Manuals/doc/ref/chap0.html> might become <https://www.gap-system.org/Manuals/4.10.2/doc/ref/chap0.html>, and so on. We would also provide a `latest` variant, symlinked to whatever is the latest version. And we could even consider having a `dev` variant, which shows the manual as it is in `master` (could be updated daily by a Travis cron job, or even as part of the regular `master` Travis jobs).
Bonus points for writing some JavaScript which injects a navbar into the generated manuals that allows switching between the available versions.
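The directory layout behind the versioned URLs could be maintained by a small helper like the one below. This is a sketch under assumptions: the function name is hypothetical, the web root is an ordinary directory tree on a host that serves symlinks, and the built manual directory contains the `doc/` subtree.

```shell
#!/bin/sh
# Publish a built manual under Manuals/<version>/ and repoint "latest".
#   $1: web root directory
#   $2: GAP version, e.g. 4.10.2
#   $3: directory containing the built manual (with its doc/ subtree)
publish_manual() {
    root=$1
    version=$2
    src=$3
    mkdir -p "$root/Manuals/$version"
    # Copy the contents of $src, not the directory itself.
    cp -R "$src/." "$root/Manuals/$version/"
    # -n replaces an existing "latest" symlink instead of descending into it.
    ln -sfn "$version" "$root/Manuals/latest"
}
```

With this layout, `Manuals/4.10.2/doc/ref/chap0.html` and `Manuals/latest/doc/ref/chap0.html` resolve to the same file right after 4.10.2 has been published. Note that `latest` simply follows the most recent call, so republishing an older version would need an extra check.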
### idea: versioned packages
The same as for manuals could be done for packages, so that one could browse the (sets of) package versions released with specific older GAP versions.
## Task 8: overhaul the GAP release process (DONE)
This is closely related to overhauling the website and package distribution.
Some random snippets and thoughts:
- the current process is documented [in the release checklist](https://github.com/gap-system/gap-distribution/blob/master/DistributionUpdate/RELEASE_CHECKLIST.md)
- see also [the stable branch checklist](https://github.com/gap-system/gap-distribution/blob/master/DistributionUpdate/STABLE_BRANCH_CHECKLIST.md)
- reduce work required to create release notes
  - just deliver them as Markdown or HTML, don't bother to create a GAPDoc manual with them (or if we do want to keep a GAPDoc manual, then make it as easy as possible to do so, ideally automated)
  - automate more of it by requiring that all PR titles be usable as release notes entries, and then write a script which pulls the merged PRs from GitHub and automatically generates (or updates) the release notes. This might include an intermediate machine-readable format for the release notes, e.g. a JSON file which contains PR numbers, titles, etc.
  - for the categories, make sure that GitHub issue labels match the release notes categories, so that we don't have to categorize things twice
- make sure Windows support in the distribution is fully automated
  - in particular, creating the Windows NSIS installer should not require human interaction
- revise how the stable branch is created, and how we inject versions into the release tarballs
  - instead of what we do now (see XXX), I suggest we actually put the GAP version into the repository, namely into `configure.ac`
  - then, all the files that currently contain placeholder strings (like `GAP.dev` or `of today`) which we replace when creating the release archives should instead be updated by the GAP build system, or retrieve the data from the central location
- full release builds should be made frequently (either as part of the regular Travis CI tests, or via a nightly/weekly job); those builds should be identical to a regular release except for the version number, both to ensure the release process works and to ensure we can cut a release at any time.
- ultimately, I envision that the process of making a GAP release such as, say, 4.12.0 should look like this:
```
# 1. Create source tarballs
cd gap.git
git checkout stable-4.12
vim configure.ac # change version from 4.12dev to 4.12.0; possibly insert a release date(?)
git commit -m "Version 4.12.0" configure.ac
git push
etc/release # run release script, similar to ReleaseTools; will also upload sources to GitHub, create tags
# 2. Creating Windows installer, updating Linux rsync distro, macOS bindist, etc.
# TODO: could be done semi- or fully-automatically by a GitHub action; see Julia package release process for inspiration
# ... or: require user to initiate process by calling a script (actual work would either happen remotely on a server; or else, we'd require the release manager to have all relevant (cross-)compilation tools installed)
etc/update_bindist
# 3. Update website:
# TODO: could also be done automatically by Travis/GitHub Actions/Jenkins
# ... or: require user to initiate process:
etc/update_website
```
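The idea of generating release notes from PR titles, mentioned in the list above, could start from something like the following sketch. It assumes the GitHub CLI (`gh`) is available. The function names, the tab-separated intermediate format and the limit of 200 PRs are illustrative choices, not an agreed design, and sorting entries into categories by label is left out.

```shell
#!/bin/sh
# Turn "number<TAB>title" lines (one merged PR per line) into Markdown bullets.
format_release_notes() {
    awk -F '\t' 'NF >= 2 { printf "- %s (#%s)\n", $2, $1 }'
}

# Fetch the merged PRs for a base branch and format them as release notes.
#   $1: base branch, e.g. stable-4.12
# Requires an installed and authenticated `gh`; the limit is arbitrary.
fetch_release_notes() {
    gh pr list --repo gap-system/gap --state merged --base "$1" --limit 200 \
        --json number,title --jq '.[] | [.number, .title] | @tsv' |
        format_release_notes
}
```

Keeping the formatting step separate from the fetching step means the intermediate data can be saved, edited by hand and re-rendered, which matches the suggestion of a machine-readable intermediate format.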
## Misc ideas, to be sorted
- it would be good to have previews for website updates; i.e., GapWWW PRs should result in a test website being uploaded somewhere which one can browse to check everything is fine (but beware of security implications; so e.g. it must not use the `www.gap-system.org` domain). Once we have converted the website to Jekyll and moved downloads to a separate location, this should not be too hard
- it would be good to have previews of the GAP manual for GAP PRs, so that one can preview changes to the manual. Note that we build the manual as part of the Travis CI tests anyway, so this is mostly a matter of deploying the resulting HTML files "somewhere" (e.g. PR #12345 could have its HTML uploaded to a hypothetical `preview.gap-system.org/github/pulls/12345`). Bonus points for automatically adding a link to that location to the PR (this is something GitHub Actions ought to make easy). More bonus points for ensuring that once a PR is closed, the corresponding website preview is deleted (if the PR is later re-opened, the preview can be regenerated by rebuilding the PR)
--------
[301]: https://university.webflow.com/article/intro-to-301-redirects