# Lessons from open source software development for GDSE
If you are interested in how to do globally distributed software engineering (GDSE) successfully, then this blog post is for you! If you are unfamiliar with GDSE, we recommend you first read [this Wikipedia article about GDSE](https://en.wikipedia.org/wiki/Distributed_development#Globally_Distributed_Software_Engineering).

GDSE has rapidly become more prevalent due to the COVID pandemic and will likely continue to grow in a post-COVID world. Working in a globally distributed setting brings many opportunities and challenges. One of the biggest challenges is the reduction in in-person communication, which is the most effective way of communicating.[[1]]
In this blog post we take a look at best practices from open source projects, which typically have hundreds of developers from all over the world working on them to deliver amazing software. What practices make their software engineering process function so smoothly? And are these practices valuable for GDSE? To answer these questions we investigated three open source projects: Kubernetes, the Apache Foundation, and Rust!

## Good documentation
Any large open source project offers a wide array of documentation: how to contribute, how to open issues, and how to interact with the community. These documents act as the onboarding resources for new collaborators. Let's see how each of the three open source projects handles documentation and how these techniques can improve GDSE practices.
### Kubernetes
The Kubernetes project on GitHub has a [separate repository](https://github.com/kubernetes/community/) for all documentation. The Kubernetes project is quite large and everything is rigorously documented. There is documentation for all technical aspects of the system, as well as for which committees oversee which parts of it. The high quality of the documentation and its easy access on GitHub lower the bar for understanding and contributing to the Kubernetes project. In the image below we can see this documentation in action: new contributors are quickly guided to the information they need.

### Apache Foundation
Each Apache project provides its own documentation, which is used to onboard new collaborators more easily. The documents contain information such as instructions on how to run the program, how to execute the tests, which rules to follow when adding new code, and how to communicate with the corresponding team. The Apache projects also keep extensive archives, so a new collaborator can read previous discussions.
### Rust
Rust has an extensive website called [Rust Forge] that guides developers who want to contribute to or use Rust. Here every little detail is explained, from the tools the project uses to the different ways to install Rust. Since Rust is made up of different components, the website also links to the different teams working on the project.
### Takeaways
The documentation for these open source projects is much more extensive than what is typically found in industry. This is likely because good documentation is necessary to make contributing easy for external contributors. Another benefit of good documentation is that developers can easily find what they are looking for. What companies doing GDSE should take away from these open source projects is that good documentation makes it easier to onboard new developers and reduces needless communication about where to find relevant information.
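
As a small illustration of the onboarding point, here is a minimal Python sketch that checks whether a repository contains the basic onboarding documents mentioned above. It is not taken from any of the three projects. The file names follow common GitHub conventions and are an assumption, as is the helper name `missing_onboarding_docs`.

```python
from pathlib import Path

# Assumed file names, following common GitHub conventions. A real project
# may use different names or keep these documents in another location.
ONBOARDING_DOCS = {
    "README.md": "what the project is and how to run it",
    "CONTRIBUTING.md": "how to contribute and open issues",
    "CODE_OF_CONDUCT.md": "how to interact with the community",
}


def missing_onboarding_docs(repo_root):
    """Return the expected onboarding files that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in ONBOARDING_DOCS if not (root / name).is_file()]


if __name__ == "__main__":
    # Report what a newcomer would not be able to find in the current directory.
    for name in missing_onboarding_docs("."):
        print(f"Missing {name}: explains {ONBOARDING_DOCS[name]}")
```

A check like this could run as part of a project's automated checks, so that a new repository never starts without its onboarding documents.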
## Easy to find and use communication channels
Communication plays an important role in GDSE, and the same naturally holds for open source projects, where everyone is distributed and working asynchronously by default. Open source projects have to ensure that communication channels are accessible and easy to use, otherwise people will find it difficult to contribute. Let's take a look at how communication is handled in open source projects.
### Kubernetes
Communication in the Kubernetes community happens through Slack, Google forums, and GitHub. Anyone can access these channels and see what others have written. On Slack there are over 420 channels, each with a different topic of discussion. The comments on GitHub issues and pull requests are, of course, all related to code, while the forums and Slack channels appear to be more focused on user questions about Kubernetes.
### Apache Foundation
The Apache Foundation communicates differently from the other open source projects: project communication is handled via mailing lists[[5]]. Mailing lists are easy to archive, but less accessible than instant messaging platforms like Slack and Discord. Contributors with questions about Apache projects are advised to ask them on Stack Overflow with the correct tag. Some communication also happens on pull requests on GitHub.
### Rust
Since some teams need more than just text channels, Rust uses two main tools: Discord and Zulip. Rust uses these for most of its informal communication, while using GitHub and e-mail for formal communication. In Discord and Zulip, developers can ask questions or request assistance. The different text channels keep the servers well structured. Rust uses GitHub mainly for version control and for communication specifically about version control, while e-mail is used for private communication.
### Takeaways
What the three projects have in common is that all communication is archived and made publicly available. This ensures that no information is lost and that decision-making steps can easily be traced back. Finding the correct communication channel is also quite easy, as this is specified in the documentation. This ensures that communication is routed to the correct people, reducing churn within the organization. Having easy-to-use and archived communication channels is especially useful in a globally distributed setting, where people work asynchronously and apart from each other, which typically makes communication more difficult than in an office.
## Strong sense of community
In any organization, creating a strong sense of community is beneficial for the end product.[[2]] A community can be defined as a unified body of individuals with common interests. Open source projects are well known for establishing and nurturing passionate communities. Let's take a look at how this is done.

### Kubernetes
The Kubernetes community has a code of conduct and set of values that serve as a basis for the community.
The 5 core values of Kubernetes are [[3]]:
1. Distribution is better than centralization
2. Community over product or company
3. Automation over process
4. Inclusive is better than exclusive
5. Evolution is better than stagnation

Community meetings and office hour meetings are also held once a month, keeping members of the community connected and facilitating knowledge transfer. In-person meetups that anyone can attend are organized multiple times a year.
#### Inclusion
The community value "Inclusive is better than exclusive" is especially helpful for community building. The values document puts it best: "Broadly successful and useful technologies require different perspectives and skill sets, which can only be heard in a welcoming and respectful environment."[[4]]
### Apache Foundation
Shane Curcuru, who serves as the Vice President of Brand Management for the ASF, published a [blog] post in which he explains the Apache Way. The Apache Way can be split up into 6 main concepts:
1. **Charity**: Apache’s mission is providing software for the public good.
2. **Community**: Many of us are more effective than some of us.
3. **Consensus**: Getting good enough consensus is often better than voting.
4. **Merit**: Those that have proven they can do, get to do more.
5. **Open**: Technical decisions are discussed, decided, and archived publicly.
6. **Pragmatic**: Apache projects use the broadly permissive Apache license.

All Apache projects follow these main concepts, even though each project has its own dedicated team. These teams decide for themselves how to nurture their community. All Apache projects encourage participation, which is why many of them organize meetups.
Besides these meetups, there are more ways in which community members actively help each other: they participate in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, training, and expo.
#### Inclusion
The Apache Foundation also actively promotes and studies diversity and inclusion. They have an active project called "[Apache Diversity and Inclusion]", which is dedicated to understanding and promoting diversity and inclusion within Apache projects.
### Rust
Rust has organized meetups in the past where teams could come together and meet each other in person. The purpose of the Rust Community team is to organize events that help the different teams connect with each other. This team organizes "meetups, conferences, barcamps, but also online events like global hackfests". The in-person events really enhance social cohesion in the Rust community.
#### Inclusion
Workshops called [RustBridge] are organized to help others take their first steps in learning and writing Rust. These workshops are aimed specifically at "underrepresented groups", to include them in the Rust community.

Rust also has inclusivity clearly specified in their code of conduct: "We are committed to providing a friendly, safe and welcoming environment for all, regardless of the level of experience, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, religion, nationality, or other similar characteristic." Making inclusivity a core value is likely one of the reasons the Rust community is so strong.
### Takeaways
The core takeaway from the open source projects is that their communities are super welcoming and foster an incredibly inclusive environment. The inclusive environment, combined with mentorship programs, consensus-based decision-making, and in-person meetups, creates a work environment where everyone is comfortable sharing their ideas and unique perspective, resulting in great software. For companies that want to implement GDSE, ensuring that their developers feel included and part of a community, just like in open source projects, will likely lead to better results. This sense of community can be created by organizing events, running mentorship programs, and making inclusivity a priority.
## Standardized workflows
In industry settings GDSE is often accompanied by distributed Scrum, but in open source settings with thousands of developers other workflows are used. The workflows in open source projects make it easy to see who is currently working on what, what is already done, and whether all tests still pass. Being aware of these things greatly reduces misunderstandings. Let's take a look at how each open source project manages its workflows.
### Kubernetes
The Kubernetes community performs triage on issues and pull requests. Important issues and pull requests are labelled according to their priority and the part of the system they concern. Kubernetes also has an entire subcommunity dedicated to its testing and release infrastructure. These structured workflows make the process of contributing to and building the Kubernetes project as hassle-free as possible. In the image below you can see a snapshot of open pull requests on the Kubernetes GitHub. Note how well tagged all pull requests are.

### Apache Foundation
Unlike Kubernetes, the Apache Foundation tracks issues in Jira and Bugzilla[[6]]. Communication between developers happens in the comment sections of these issues and of the pull requests. Apache teams use static code analysis tools and linters to ensure a baseline quality of the codebase. The CI/CD tool of choice for most Apache projects is GitHub CI.
### Rust
Rust uses different CI/CD tools for different parts, but most of Rust uses [Rust CI], a CI system built by the Rust community. Like the other projects, Rust works with GitHub pull requests combined with CI/CD tooling, so the workflow for contributing to the codebase is quite standardized. One cool thing about how Rust handles issues is that, when needed, the person who opens an issue is asked to explain the idea behind it in a meeting. This ensures that issues are understood and aligned with the project. Since there are many issues to work on, Rust has a specialized team that categorizes the issues into different [priority levels]. The general procedure is shown below.

<!-- meeting is held so that questions can be asked about the issue and to determine its relevance and priority for the Rust project. -->
### Takeaways
All three open source projects have impressive, standardized workflows that use GitHub and CI/CD tooling to the greatest extent possible. The infrastructure is automated and of high quality. A new contributor can easily see the state of an issue, a pull request, a discussion, and more. What teams doing GDSE should take away from these open source projects is that standardizing workflows and accurately labelling pull requests helps to reduce misunderstandings, leading to less rework. Workflows can only be standardized if CI/CD tooling is used, which should be a given for any company serious about GDSE.
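
To make the labelling idea concrete, here is a minimal Python sketch of how labels can be turned into the kind of status overview described above. It is not the tooling any of these projects use. The `PullRequest` class, the `triage_board` function, and the sample data are hypothetical, and the label names only imitate a Kubernetes-style `priority/...` and `sig/...` scheme.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed priority labels, highest priority first. The names imitate
# Kubernetes-style labels and are used here for illustration only.
PRIORITY_ORDER = [
    "priority/critical-urgent",
    "priority/important-soon",
    "priority/backlog",
]


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a pull request."""
    number: int
    title: str
    labels: List[str] = field(default_factory=list)


def priority_of(pr: PullRequest) -> str:
    """Return the PR's highest priority label, or 'needs-triage' if it has none."""
    for label in PRIORITY_ORDER:
        if label in pr.labels:
            return label
    return "needs-triage"


def owners_of(pr: PullRequest) -> List[str]:
    """Return the owning groups, taken from labels that start with 'sig/'."""
    return sorted(label for label in pr.labels if label.startswith("sig/"))


def triage_board(prs: List[PullRequest]) -> Dict[str, Dict[str, List[int]]]:
    """Group PR numbers first by priority and then by owning group."""
    board = defaultdict(lambda: defaultdict(list))
    for pr in prs:
        # A PR without an owner label is listed under 'unassigned'.
        for owner in owners_of(pr) or ["unassigned"]:
            board[priority_of(pr)][owner].append(pr.number)
    return {priority: dict(groups) for priority, groups in board.items()}


if __name__ == "__main__":
    sample = [
        PullRequest(101, "Fix flaky scheduler test", ["priority/important-soon", "sig/scheduling"]),
        PullRequest(102, "Update networking docs", ["sig/network", "sig/docs"]),
        PullRequest(103, "Patch crash on startup", ["priority/critical-urgent", "sig/node"]),
    ]
    for priority, groups in triage_board(sample).items():
        print(priority, groups)
```

The point of the sketch is that consistent labels let anyone, in any time zone, answer "what is urgent and who owns it?" without having to ask a colleague.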
## Conclusion
In conclusion, these three open source communities function so well because of:
1. The high quality of the documentation, allowing people to quickly find the information they require.
2. The easy-to-use and archived communication channels, allowing anyone to easily obtain information from previous questions and discussions.
3. The strong sense of community, obtained by having inclusivity as one of the core values and by creating a safe space for everyone to share their ideas and contribute.
4. The standardized workflows, which make sure work is organized and picked up by the right people.

Companies doing GDSE would certainly benefit from implementing these best practices, as they have been shown to result in high-quality software being delivered in a globally distributed setting.
<!-- ## References -->
[Apache Diversity and Inclusion]: http://diversity.apache.org/
[8]: https://whimsy.apache.org/board/minutes/Diversity_and_Inclusion.html
[Apache diversity mailing list thread]: https://lists.apache.org/thread.html/r0e076ffb20dbf80ee3e60c61f37cee69aa648c4860182ecfabb3abbf%40%3Cdiversity.apache.org%3E
[CloudStack pull request 4922]: https://github.com/apache/cloudstack/pull/4922
[Apache code of conduct]: https://www.apache.org/foundation/policies/conduct.html
[5]: https://www.apache.org/foundation/how-it-works.html#communication
[6]: https://issues.apache.org/
[10]: https://github.com/kubernetes/test-infra/blob/master/prow/README.md
[3]: https://github.com/kubernetes/community/blob/master/values.md
[4]: https://github.com/kubernetes/community/blob/master/values.md
[blog]: http://theapacheway.com/
[1]: https://www.educba.com/different-methods-of-communication/
[2]: https://www.kenzie.academy/blog/why-community-matters-in-the-workplace/
[Rust Forge]: https://forge.rust-lang.org/
[Travis CI, Gitlab CI and builds.sr.ht]: https://doc.rust-lang.org/cargo/guide/continuous-integration.html
[Rust CI]: https://forge.rust-lang.org/infra/docs/rustc-ci.html
[RustBridge]: https://rustbridge.com/
[Rust Community]: https://docs.google.com/document/d/1jH2Cz493ILQ79mTR1O8Msgf4v7UmaYp5Mc0UjTNmQ68/edit#
[priority levels]: https://forge.rust-lang.org/compiler/prioritization/priority-levels.html