---
tags: outreachy
---
# Outreachy Summer 2021: OCaml
*Contact: Anil Madhavapeddy <mailto:anil@recoil.org>. Contributions: Gabriel Scherer, Patrick Ferris, Gemma Gordon, Bella Leandersson*
Outreachy is a program to connect interns with open source mentors. Outreachy internship projects may include programming, user experience, documentation, illustration, graphical design, data science, project marketing, user advocacy, or community event planning. See https://www.outreachy.org for more details.
OCaml is planning to take part as a community of mentors in 2021, and we need to get our:
- Community application in by **March 1st 2021**. @avsm will act as community coordinator this year.
- Individual mentorship applications by **March 7th 2021**.
- see https://www.outreachy.org/communities/cfp/ for details.
# Mentors
In previous years of Outreachy, we've had a variable approach to which projects take part. Some years it has been led by MirageOS, and in others (such as last year) by the core OCaml compiler (thanks to @gasche). There is now an [OCaml Outreachy Community](https://www.outreachy.org/may-2019-august-2019-outreachy-internships/communities/ocaml/) which we can use moving forward.
This year, given the pandemic and general time crunch resulting from it, we're going to have a go at assembling mentors in two phases:
- For the _immediate_ deadline (i.e. in March 2021 for starting internships in May 2021) we are going to get mentors for the OCaml.org website effort.
- For the _next_ Outreachy deadline (i.e. towards the end of 2021) we will begin preparing our projects over the summer so that they develop a culture of "first time issues", which will let us expand the program in upcoming years.
## What does mentorship involve?
Read the [Outreachy mentoring guide](https://www.outreachy.org/mentor/#mentor) -- you should already be contributing to some OCaml project in the community (see the list later in this doc). What you need to help with is:
- Outreachy ensures that each proposed project corresponds to a contribution to an established, open-community project. So when picking projects, 'tie them' to one of the headers in the list below (e.g. the OCaml compiler, or the opam package manager, or multicore OCaml) so that it is clear that it is more than a personal repository.
- The [Outreachy application process](https://www.outreachy.org/docs/applicant/#make-contributions) works by having potential candidates send a contribution (typically a Pull Request) to the project; only candidates who have made a small contribution can be considered for the internship. For each potential project, we have to decide which project will receive the contributions, prepare a list of easy issues, and set aside some time to review the contributions. Working through the potential contributions can take some time, so ideally each project should have some "first time issues" that guide newcomers towards areas where they can get a PR merged. Remember how you felt when your own first PR got merged into an open source project!
- The main difficulty we have had with Outreachy in the past is that, several times, we ended up with interns who were not qualified enough to complete the project as we had originally envisioned. We recommend aiming for very gentle projects (or being ready not to select any intern for a project for lack of qualified candidates, although we are sometimes pleasantly surprised and there definitely are a few already-skilled applicants), and being very careful not to overestimate applicant skills during the application process. Bear in mind that everyone has to start somewhere, and people learn at different paces and are under varying circumstances. Our goal is not to get everyone to become regular contributors, but to give every applicant a chance to do so.
Don't feel pressured to sign up -- only do so if you have some spare cycles. But equally, don't feel that you are too inexperienced to do so. If you've been contacted about OCaml mentoring, we feel you are more than capable of helping someone more inexperienced!
# How you can help
The immediate priority is to identify mentors and projects for ocaml.org so we can submit a community application. The big advantage of getting Outreachy interns to work on the website infrastructure is that in the event that OCaml itself doesn't work out, they may also be able to focus on web technologies like CSS or HTML that may be useful to them in other areas.
This effort is split into two:
- **ocaml.org:** there is the current live website which is being deprecated, but is the place to contribute to right now.
- **next.ocaml.org:** in parallel, the core maintainers of the current website are rebuilding it using modern tooling and web technologies. The new site is due to go live on next.ocaml.org in around March or April 2021. This is where we would like to focus the efforts of interns, since there is a fairly long (but worthwhile) path to porting all the existing content to the new infrastructure.
## Initial Contribution Phase
The first thing to do is to create a bunch of issues on the existing website so that potential interns can pass the contribution requirement.
- [Small Improvements for Docs Page](https://github.com/ocaml/ocaml.org/issues/1236) -- adding `<iframe>` titles and fixing layout of buttons on mobile for the [documentation](https://ocaml.org/docs) page.
- [Footer width on mobile](https://github.com/ocaml/ocaml.org/issues/1237) -- the footer should stretch across the bottom, currently it has thin margins on either side on mobile devices.
These will be marked with a label `good-first-issue` and we will explicitly ask contributors to try to solve these.
We will also invite them to Matrix/IRC/Discord to ask questions about setting up the build of the website, which can be a little tricky for a newcomer, and to contribute documentation to improve that process too.
## Projects for Internships
The forthcoming next.ocaml.org site is written in Next.js and ReScript, and is vastly simpler to contribute to and extend than the current infrastructure. However, content needs to be ported over, some of which can happen mechanically. The following people are potential mentors in this space. Bear in mind that it's totally fine to co-mentor, and we should probably do that so we can do a better job with fewer projects.
### Mentors
#### within next.ocaml.org
The next.ocaml.org team consists of the following people.
- Anil Madhavapeddy (ocaml.org core team)
- Ashish Agarwal (ocaml.org core team)
- Gemma Gordon (design / project oversight)
- Bella Leandersson (design / UX)
- Kanishka Azimi (development)
- Patrick Ferris (development)
#### from the OCaml community
There is also a group of people interested in general OCaml mentoring:
- Gabriel Radanne (@drup)
- Gargi Sharma (@gs0510)
- Jon Ludlam
- Sonja Heinze
- Craig Ferguson (@craigfe)
- *add your name here*
### Project List
This is the list of projects we can submit to Outreachy immediately, so that prospective interns can see what sort of things to apply for by Feb 22nd 2021 (which is the initial application deadline).
#### watch.ocaml.org population
OCaml has been around for a long time, and there are a number of media recordings available of talks about various aspects of the language. We would like to begin curating these and archiving them on self-hosted infrastructure on OCaml.org instead of relying on third-party hosting.
To this end, we have established watch.ocaml.org, which is an instance of the open source [Peertube](https://joinpeertube.org) software. This can import videos to self-host them on OCaml.org infrastructure, and also serve them using p2p techniques to reduce the need for a big central streaming setup.
##### Project Milestones
1. Locate and import as many OCaml videos from YouTube/Vimeo/etc as can be found (2-4 weeks)
2. Metadata manipulation scripts to curate content (2-4 weeks)
3. Integration with Discourse commenting (4-8 weeks)
###### Importing OCaml videos
The current site only has last year's OCaml Workshop videos uploaded. There are many, many more videos online on various other sites, which need to be imported into watch.ocaml.org.
Although the Peertube software can take care of the transcoding of the actual videos, there is some manual effort needed to ensure that the description, tags and other metadata are consistent throughout the site. This first milestone will see around 50-100 videos (or more!) imported with reasonable metadata.
###### Metadata manipulation to curate content
Once there are a number of videos in place, we need to write some scripts to help us manage that metadata and also link it to the next.ocaml.org site. This will involve using the [Peertube REST API](https://docs.joinpeertube.org/api-rest-reference.html) to write some OCaml code that will output the videos on watch.ocaml.org in Yaml format that can be interpreted by the OCaml website generator.
This will allow us to easily link the content in watch.ocaml.org as embeds directly within the OCaml website itself, in the 'talks' section.
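As a rough illustration of the second half of this milestone, here is a minimal sketch of turning video metadata into Yaml using only the OCaml standard library. The `video` record and its field names (`title`, `embed_url`, `tags`) are assumptions made for this example; they are not the Peertube API schema, nor the format the website generator expects. Fetching and decoding the API responses is deliberately left out.

```ocaml
(* Hypothetical subset of the metadata for one video. The real Peertube
   API returns many more fields, and the names here are illustrative. *)
type video = {
  title : string;
  embed_url : string;
  tags : string list;
}

(* Quote a string as a Yaml double-quoted scalar, escaping backslashes,
   double quotes and newlines so that titles cannot break the output. *)
let yaml_string s =
  let buf = Buffer.create (String.length s + 2) in
  Buffer.add_char buf '"';
  String.iter
    (function
      | '"' -> Buffer.add_string buf "\\\""
      | '\\' -> Buffer.add_string buf "\\\\"
      | '\n' -> Buffer.add_string buf "\\n"
      | c -> Buffer.add_char buf c)
    s;
  Buffer.add_char buf '"';
  Buffer.contents buf

(* Render one video as an item of a Yaml sequence. *)
let video_to_yaml v =
  let tags = String.concat ", " (List.map yaml_string v.tags) in
  Printf.sprintf "- title: %s\n  embed_url: %s\n  tags: [%s]\n"
    (yaml_string v.title) (yaml_string v.embed_url) tags

(* Render a whole list of videos as one Yaml document body. *)
let videos_to_yaml videos = String.concat "" (List.map video_to_yaml videos)
```

Calling `print_string (videos_to_yaml videos)` on a list of such records would print one sequence item per video. A real script would more likely use an existing Yaml library than hand-rolled quoting; the point here is only the shape of the data flow.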
###### Discourse integration
This is an advanced milestone that you may not fully complete, but is a good stretch goal in case you have time left. The discuss.ocaml.org site runs using the [Discourse](https://www.discourse.org) forum software. Ideally, comments on watch.ocaml.org should redirect the user to a thread on the discussion site.
This milestone involves writing a Peertube plugin that will replace the commenting area with a link to a Discourse forum thread. It can also create that forum thread using the Discourse API in case one doesn't already exist.
---
#### Opam Package Search & GraphQL Endpoint
The package search was initially two separate projects: a new client and a new GraphQL endpoint. It makes sense to combine these ideas into a single project, which may also cater better to the needs of the applicants.
The new web client for rendering output from the opam package database could use a [JSON endpoint](https://opam.ocaml.org/json/last10_updates.json) on opam.ocaml.org (see [this commit](https://github.com/ocaml-opam/opam2web/commit/12b62b176aa2a8ee499f88376fe3d3e2542b35ec)), which provides metadata about the packages. It could also use the new GraphQL endpoint designed in this project. Both parts could be done by one very competent and ambitious applicant.
Skills you will learn: JavaScript, OCaml, GraphQL
Difficulty: entry level
Applicants: 1
##### Project Milestones
The opam package search can be split up into three phases:
1. Generation of the data (2-3 weeks)
2. Implementation of the GraphQL server (4-8 weeks)
3. A GraphQL Client app (4-8 weeks)
The weeks are rough estimates. If an intern only wants to work on, say, the GraphQL client app, they could do that; they might also want to have a go at the server too, or work only on the server.
###### Generating the Data
The most likely method for doing this is using the [opam2web](https://github.com/ocaml-opam/opam2web) tool. This tool generates the current opam.ocaml.org/packages site. We won't need the HTML output, but will most likely need the underlying data-structures it builds using a checkout of the opam-repository.
It could dump these as JSON and from that we could implement the GraphQL server. Or it could all be wrapped into the same project with the data just being in-memory.
The simplest idea for what to generate could just be a list of packages with information like `description`, `reverse-deps` etc.
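To make the "list of packages" idea concrete, here is a minimal sketch using only the OCaml standard library. The `package` record and the `reverse_deps` function are hypothetical names for this example and are not taken from opam2web; real opam metadata carries far more than these three fields. The sketch shows how a `reverse-deps` index can be derived from each package's own dependency list.

```ocaml
module String_map = Map.Make (String)

(* Hypothetical minimal view of a package, for illustration only. *)
type package = {
  name : string;
  description : string;
  depends : string list;
}

(* Map each package name to the sorted, de-duplicated names of the
   packages that depend on it. *)
let reverse_deps packages =
  let add_edge acc ~dependency ~dependent =
    String_map.update dependency
      (function
        | None -> Some [ dependent ]
        | Some dependents -> Some (dependent :: dependents))
      acc
  in
  List.fold_left
    (fun acc pkg ->
      List.fold_left
        (fun acc dependency -> add_edge acc ~dependency ~dependent:pkg.name)
        acc pkg.depends)
    String_map.empty packages
  |> String_map.map (List.sort_uniq String.compare)

(* Look up the reverse dependencies of one package; a package that
   nothing depends on yields the empty list. *)
let find_reverse_deps index name =
  Option.value (String_map.find_opt name index) ~default:[]
```

An in-memory index like this could back either the JSON dump or the GraphQL resolvers discussed in the next phase.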
###### GraphQL Server
Once the data-structures are finalised from *generating the data*, the next phase is to wrap this in a GraphQL server probably using [ocaml-graphql-server](https://github.com/andreas/ocaml-graphql-server). This involves generating a `Schema` that follows the types we defined in the data.
Another option could be to "Irmin-ize" the data and use Irmin's ability to generate GraphQL servers from a store. This still needs some [work though](https://github.com/mirage/irmin/pull/771) and I'm not sure how easy it is to extend the Schema.
###### A GraphQL Client
There are quite a few possibilities here, but it would be nice to have a decent prototype search client. Eventually this could end up on OCaml.org, but I think there's enough here to keep it separate for the purposes of the project, with the goal of upstreaming it in the future.
A cool project would be to write a *jsoo* client. I (@patricoferris) have some WIP bindings using `Brr` to [apollo-graphql-client](https://www.apollographql.com/docs/react/api/core/ApolloClient/) and we can integrate the `graphql_ppx` to do typed queries using a schema :cool:.
---
#### Improve OCaml.org
There are three main subprojects:
1. Accessibility
2. Translations
3. Design
Each takes on a slightly different meaning depending on whether it is applied to ocaml.org or next.ocaml.org.
##### Accessibility testing for the ocaml.org website
The OCaml website is browsed by a variety of people with different accessibility needs. This project involves researching the various web accessibility standards, writing them up into a summary and checking our overall compliance. Once this research is complete, the remainder of the internship can be spent actually fixing some of the issues found.
Skills you will learn: HTML, CSS, OCaml
Difficulty: entry level
Applicants: 1+
###### Subproject Milestones
1. Research web accessibility (1-2 weeks)
2. Audit ocaml.org/next.ocaml.org (1 week)
3. Apply improvements from the audit (1-6 weeks)
4. Research common CI for enforcing accessibility standards (1 week)
5. Try deploying the research from (4) in ocaml.org/next.ocaml.org (1 week)
Step (3) is a little open-ended. A truly accessible site involves much more than good colour contrast and `alt` attributes. Accessible forms, search bars, maps etc. are much more complicated (and probably more interesting), and there are libraries like [reakit](https://reakit.io/) to help with them.
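As one example of a check that could be automated during the audit, here is a minimal sketch of the WCAG 2 contrast-ratio calculation in plain OCaml. The function names are made up for this example, colours are assumed to be 8-bit sRGB triples, and 4.5:1 is the WCAG AA threshold for normal-size text. It is not part of the existing site tooling.

```ocaml
(* Convert one 8-bit sRGB channel to its linear value, following the
   WCAG 2 definition of relative luminance. *)
let linearize channel =
  let c = float_of_int channel /. 255.0 in
  if c <= 0.03928 then c /. 12.92 else ((c +. 0.055) /. 1.055) ** 2.4

(* Relative luminance of an (r, g, b) colour: 0.0 is black, 1.0 is white. *)
let relative_luminance (r, g, b) =
  (0.2126 *. linearize r) +. (0.7152 *. linearize g) +. (0.0722 *. linearize b)

(* Contrast ratio between two colours, from 1.0 (identical) up to 21.0
   (black on white). The order of the arguments does not matter. *)
let contrast_ratio fg bg =
  let l1 = relative_luminance fg and l2 = relative_luminance bg in
  let lighter = Float.max l1 l2 and darker = Float.min l1 l2 in
  (lighter +. 0.05) /. (darker +. 0.05)

(* WCAG AA asks for a ratio of at least 4.5:1 for normal-size text. *)
let passes_aa_normal_text fg bg = contrast_ratio fg bg >= 4.5
```

For instance, `passes_aa_normal_text (118, 118, 118) (255, 255, 255)` holds while the slightly lighter grey `(119, 119, 119)` falls just short, which is the kind of borderline case that is easy to miss by eye.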
##### Translations of the website
Currently next.ocaml.org exists primarily in French and English, with some other translations such as Japanese. Adding more translations would be great, and in the new next.ocaml.org this should be much easier, as there is a clear delineation between content (stored as Yaml and Markdown) and code (ReScript, which consumes the Markdown and Yaml).
By nature of what needs to be translated, interns will also pick up lots of OCaml knowledge.
Skills you will learn: Yaml, Markdown, OCaml
Difficulty: entry level
Applicants: 1+
###### Subproject Milestones
This one would likely have to be combined with others to fill out the internship. But it would be nice to have more translations into whatever language the applicant speaks.
They would also learn a lot of OCaml by translating the tutorials.
---
#### Add templating to odoc library output
The current version of odoc emits HTML that is quite specific, and not embeddable into other pages. There is no way to add headers, footers, or customize the output in any way. This project is to add a mechanism to support customisation of the output to enable these features.
Applicants should have contributed to the ocaml.org website project before applying (there aren't many starter issues in odoc to work on today).
Skills you will learn: OCaml, HTML
Difficulty: moderate
Applicants: 1
#### Markdown output for odoc
Odoc has recently gained generic support for producing output in different formats. Currently, the supported formats are HTML, LaTeX, and man pages. This project is to add a new text-based output format: Markdown. The existing output generators will serve as templates for the new Markdown output.
#### Help us to grow Outreachy
### Machine requirements
This is the information on the Outreachy "machine requirements" section of the project proposal:
> Need a Linux, FreeBSD or macOS host. Windows is possible to get working, but is more trouble than the other operating systems at present. While most machine specifications are acceptable, if your machine is too slow we can provide you with remote access to a fat virtual machine for the duration of your internship.
### How to contribute
> The best way to contribute to this project is by making an improvement to the ocaml.org website. We have categorised a number of "good first issues" on https://github.com/ocaml/ocaml.org/issues. You can select one that you would like to fix, and then build the website locally following the instructions in the README file at https://github.com/ocaml/ocaml.org.
> If you are having trouble following the instructions or find them incomplete, feel free to post on https://discuss.ocaml.org to request help, noting that you are hoping to join the Outreachy program. It is very important that you follow up on any replies that you get when you ask for help. Do not worry if you do not understand the first reply, but if you do not communicate it is difficult to help you further.
# Preparing for future rounds after 2021
We need to identify mentors and projects for these areas, in anticipation of getting new contributors in future Outreachy cycles (_not_ for the 2021 cycle). By preparing now, we will be ready then. Even if you have no time to mentor now, queuing up your ideas for the future will be extremely valuable.
These are the general areas to figure out:
- Core OCaml compiler
- OCaml Platform
- Ecosystem Libraries
# Discuss Post to open out to the Community
see [here](https://hackmd.io/uup7cHwxTxSd6Z3V2apAEA) (separated out for Susan or Christine to have a look at it)
## OCaml Compiler
### Main compiler
TODO
### Multicore OCaml
- coding a few data structures using parallel domains
- creating synthetic GC benchmarks for specific cases
- fuzz testing
### flambda2
TODO
## OCaml Platform
This covers all the main tooling for OCaml such as:
- opam
- dune
- odoc
- lsp-server
These can be difficult codebases to get into, so identifying `good-first-issue` topics is critical to a successful internship.
## Ecosystem Libraries
- Lwt
- MirageOS
- Tezos
- Core / Async
Each one needs a few low hanging fruit for interns to contribute to ahead of the
### MirageOS: Deploy a webserver
Build and deploy a [MirageOS unikernel](https://github.com/mirage/mirage) to host a static web page over HTTP. The unikernel could use [CoHTTP](https://github.com/mirage/ocaml-cohttp) to host the webpage and be deployed either as a process on a cloud-based server or as a standalone VM.
### MirageOS: Build DHCP server and gateway/firewall
Build a [MirageOS unikernel](https://github.com/mirage/mirage) VM that runs a DHCP server and can forward IP traffic. The [charrua](https://github.com/mirage/charrua) library can be used to provide a DHCP server. The [MirageOS firewall for Qubes](https://github.com/mirage/qubes-mirage-firewall) can be used to provide the firewall. Optional: include IPv6 support