# R Core Initiatives Kick-off Meeting
bit.ly/mine-zoom
## Present
R Core: Martyn Plummer, Luke Tierney, Michael Lawrence, Robert Gentleman
R Foundation: Gabriela de Queiroz, Jenny Bryan, Henrik Bengtsson
Forwards: Heather Turner, Ritwik Mitra, Emily Dodwell, Anna Vasylytsya, Emma Rand, Mine Çetinkaya-Rundel, Noa Tamir
R-Ladies: Laura Acion, Yanina Bellini Saibene, (& Noa Tamir), Saranjeet Kaur Bhogal
MiR:
General R community: Amelia McNamara, Kara Woo, Maya Gans, Jyoti Bhogal
Python Core: Carol Willing, Joannah Nanjekye
RUG Buenos Aires: German Rosati, Elio Campitelli
RConsortium Diversity & Inclusion Group: Samantha Toet, gwynn sturdevant, Hadley Wickham
## Agenda
1. [5 mins] Who's here?
2. [5 mins] What is the objective of this meeting? Ground rules.
3. [20 mins] What are the barriers to new people contributing to R, especially those from under-represented groups?
- What does contributing to R mean? Core R code base and documentation (not e.g. CRAN packages, bioconductor)
- (YBS) As underrepresented groups we need to be supported and safe. We want to contribute, but we need to know how.
- (MP) Being a core member is a binary thing; can we consider a different arrangement with wider group of people (and wider distinction between R Core and not-R core?)
- (JB) Hard to know how to make contribution to R (i.e. to know you haven't broken things -- are there tests and where?)
- It is fairly onerous to install R from source and check it
- Doesn't feel like it's written down anywhere or implemented the same way a lot of R packages are (i.e. Github pull or merge request); with the latter, things are automated such that you make a change and a build happens to decrease burden on everyone. Ultimately the difference between changes made automatically vs. artisanally.
- Broaden participation: identify small changes vs. big, move towards automation and standardization (that is, revisit tooling that R is using in development process)
- (HT) Themes: tooling, transparency of process
- (EC) Right now most of the community development is done using github and roxygen2. Is hard to move from creating packages using those tools to having to learn to send patches.
- (CW, from Python core and steering coucil for jupyter) I was invited to Python core language summit, which was previously only open to core developers (little diversity of race and gender) and unable to know supporters/blockers of efforts to increase diversity. So not welcoming to new developers (or process to become one), and Python was losing out on talent.
- People have limited time but want to contribute in meaningful way
- Over 4 years of slow changes, code base was moved to GitHub 1.5 years later which helped with tooling, processes, and putting in place automated tools (kept code quality high but made it easier to contribute)
- Less focus on "there need to be women on core," but rather benefits for Python of having diverse group of skill sets and people in core
- Risk of giving people privileges to push to core is mitigated by code review process and tooling
- (JN) Barrier to entry for Python is learning curve, so mentorship is important
- Recently more people have been encouraged to contribute, and more responsibility (i.e. triaging bugs, not necessarily core code permissions) gives people sense of purpose and ownership
- Frequent contributions with smaller tasks
- +1 from LA, GQ, YB all agree mentorship is key
- (HT) staged approach and pathway to contributions
- (WC) JN is one of our key devs for all GC (garbage collection). We are very grateful to have her skills and contribution.
- We created a triage team. This has been a good on-ramp for all core developers
- (MCR) For people from underrepresented groups, difficult not seeing people who look like themselves (role models)
- Processes take time, while there are people in the R community who are qualified and could start process tomorrow
- Two pronged approach is necessary to help inclusivity: Defining process to join R Core, and encouraging people from underrepresented groups
- (EC) Are there now more varied roles within R-Core that don't involve the tall order of having access to the SVN repository?
- (RG) Even within R Core, people would find it helpful to contribute and check -- transparency may be easy project with substantial benefit for getting group of people involved (and simpler)
- (ML) Is there documented process in Python community for joining core?
- (CW) Yes, the Python core development guide (https://devguide.python.org/, https://www.python.org/dev/core-mentorship/) was the first project moved onto github
- Spent a lot of time documenting the process for contributing: breaking down with core team to ask how are we promoting core developers (otherwise ad hoc random thing)
- Structure: mentorship, dev guides
- Brett and I focused on how to make the processes and workflows better for everyone.
- (EC) We need a "R Packages"-like book but for R itself
- (GR) I like that idea
- (KW) Other barriers that exist: because tooling and systems are not the ones some are used to, there is high barrier of fear to using those that exist. I once tried to submit a patch, and spent hours reading rules to do the right way -- checking and re-checking the wording of my report before working up the nerve to hit send.
- Mailing list has scary reputation, but otherwise need bugzilla account
- Suspect that many people have been turned away by fear of doing something the wrong way
- Need for transparency for process so they can check their own work (for fear of being shot down once brought public)
- (AM) Increase in diversity in R community is based on intentional mentorship, e.g. tidyverse dev days after rstudio::conf
- Low hanging fruit issues in tidyverse packages have been identified, so participants worked on improving documentation with group of mentors
- Walked through whole process, which is a good example of on-ramp support
- (LT) We hoped to do 1/2 day learning at useR, which hopefully we will be able to think about how to do again next year. There is now a dedicated email to request account.
- (CW) We did similar sprints at PyCon and Europython each year.
- (LA) Intention is key. Diversity won't happen if the door is not open from the R Core team and a safe space is provided
- (LT) have considered migration to GitHub/Lab. One drawback of GitHub is some wouldn't feel comfortable with product owned by Microsoft based on past experiences with them.
- GitHub is high profile operation, so it becomes a target (a lot of data is put there), e.g. previously Turkey considered blocking access because of data there, and it's similarily not hard to imagine China blocking access.
- gitlab: maybe not use gitlab.com
- Potential for similar but different thing? i.e. if people are used to GH and we use GL
- (CW) I believe that you can self-host GitLab
- (GS) Sounds like we someone should explore the options: GitHub, GitLab, etc…
- (JB) Re: studying GitHub vs GitLab, for example, I’d love to hear from the Python folks re: PEPs
- (GS) For anyone else that wonders what a PEP is: https://www.python.org/dev/peps/
- (NT) Q for Python: given how high the bar is right now, unless one is in academia and there are ways to support this work by yourself, is there funding in the community to support those who are ramping up to development team? Otherwise, contributions are limited to academics only or those who could support themselves
- (CW) All Python devs are volunteers; there are a small number who get some funding from companies and/or get release time to contribute
- Can do intros and sprints at conferences for onboarding to core language, which also helps others in the community who support wanting to bring in people's different skill sets
- Goes underlooked how much time (as a woman who is highly technical) you are asked to speak and mentor. This represents a time burden (even with access and ability to contribute), but also how do you chop 24 hours of day to handle different responsibilities/contributions?
- By making processes more straightforward (had same convos re GitHub/GitLab/other git repo), it helps bring in younger generations by giving tooling they are already comfortable with.
- Longterm goal should be strong contributions, and how we get there is up to us.
- (NT) I also organised a scikit-learn sprint for women+ and it was so much work. I can imagine those that get involved will have a lot of added work on top of ramping up.
- (CW) Thank you, NT. scikit-learn sprints are great.
- (NT) Yes. We had a few women become regular contributors now. It was a very positive experience for me to see for myself these changes can happen.
- (GB) One thing that is good to know/informative fact: There are companies that pay for people to contribute and become core developers. They do this as full-time work. It is part of the company goal. I’m only aware of big companies that do this.
- (CW) One idea that we had but didn't have funding for was to train a cohort
- (LA) Currently, how does the team renew? Or how does someone get onboarded to the R Core team? Does the team have any type of protocol for renewal?
- (ST) Idea of mentorship is fantastic, but given time commitments, how feasible is that and what does it look like?
- Similar to tidyverse dev day is issue tagging. These are specific areas where people could contribute/commit (experience required noted), which may help with onboarding in incremental tiers
- Adding the ability to fail is very helpful (i.e. sandbox where you break things and it's totally ok because it's a dedicated learning environment)
- (ML) Who can we talk to around tidydeveloper day or is there documentation on how run?
- (JB) Repo is run out of https://github.com/tidyverse/tidy-dev-day
(and I'm happy to be a resource)
- Issues document when something didn't work well
- (NT) Ran one for scikitlearn
- (HW) In general this has been helpful to reveal weaknesses/issues you've never thought about as you lead people through
- (JB) Highly motivating to get issues fixed quickly and thoroughly, and to be in the room with people working through
- (HT) From forwards POV, if we're going to recommend mentoring scheme or dev day for people from underrepresented groups, we need to ensure safe space with Code of Conduct
- Ideally CoC around entire R project (and policy for implementing the CoC)
- How to give credit to people? One disadvantage of current process is that if someone makes a suggestion and receives the ok of a core member who then implements that change, it is difficult to get credit for that free work
- (LA) And it makes it hard to have visibility into who's involved already, and therefore discoverability of mentors
- (HT) Language barrier is a big issue (also accessibility in general)
4. [20 mins] What are our goals?
- (HT) Not necessarily to train up specific individuals, but set a path with collaboration with members of R Core (and things only they know)
- (LA) To R Core: great community at hand, and there are a lot of people willing to help here with diversity efforts and to take things off your plate
- Need/request for transparency and credit (ideally money - just a little can do a lot); but if there is no money, at least credit
- Concerned if there aren't plans for renewal and future
- (JN) Also mentorship in Python is most times goal oriented. Core developers identify motivated individuals and guide them further to give them extra responsibility to make non trivial and more frequent contributions.
- +1 to recognizing contributions from people from YB, JN, CW. (JN) This is what makes people hang around -- credit for their contribution.
- +1 to the community from YB, GR, NT
- +1 that we have the knowledge from YB
- +1 that we want to help from YB, GR
- (LT) There is the issue of contribution credit: we've finally gotten to point where writing a package is credit, but even if you've worked on a big project on internal parts of R, how to capture?
- Help wanted in bugzilla, and trying to tag some of bugs as things that would be good to work on (and give people place to look for)
- (LA) This is why mentoring is so important
- (CW) Core developers have limited time and while skill sets might be very technical, don't necessarily want those people organizing event (there is a different set of skills for logistics)
- (LA) +1 you need a lot of different skills. organizing a dev day takes a ton of work from different skill sets, not only programming.
- (YB) +1 to more skills because R is more than a language
- (LA) But all types of skill must be weighted and credited the same
- (GR) +1 Even the "non-so.technical community" could help a lot in problems of usability, workflows, user-experience, etc...
- (EC) R-Core has been selected on the basis on knowing R, C (and SVN), not for mentoring, community creation.
- (CW) It helped with Python to bring in other people and therefore skill sets beyond just committing code
- When Gita retired, there was new governance where problem solving emphasized people and how to move forward as a group.
- Credit is important and not solved in Python
- Easier processes and workflows -- more benefit
- Easier configuration so it doesn't take someone 1.5 days to compile code
- (NT) Even in the python world credit is an issue. Especially for those that are contributing more organisationally rather than in code. We need to think outside of the box about recognition of time, efforts, dedication.
- (HT) Learn from RLadies about expanding and growing:
- LatAm community, learn something new then teach it
- Be realistic about what people can contribute, scales
- It is everyone's responsibility to join in mentoring and training
- Documentation: lots of things to document (ala python dev guide), process of being developer, and process of R core developer
- (CW) Quality processes result in a quality language. +1 from YB, SKB
- Automated tooling helps check people's pull requests and make it easier for people to contribute safely. The tooling provides tips on linting, style, and testing.
- Documentation comes first; mentoring succeeds through documentation.
- +1 Documentation from SKB, YB
- Work days
- Issue tagging: people who are interested don't know what need to do. Bugzilla is only a small portion of what's going on.
- (CW) R Ladies is our role model in the Python world
5. [20 mins] What ideas do we have to meet our goals?
- (MG) As per the documentation, super eager to learn, doesn't know much, would be cool to include solving a simple bug
- Mentorship: asynchronous mentorship, those more familar check the documentation making sure mental models are correct
- (LT) What is difficult about setting up R?
- (CW) I was building on a Mac which added complexity. Linux is definitely the easiest to build on. The process isn't hard but the process is not in one place. It's spread over the multiple places.
- (EC) Wouln't documentation of the current worklow grow stale quickly if the plan is to switch to git-something?
- (CW) Expert hand holding would be helpful.
- (JN) Doing it with several people would be good
- (MCR) Document process of R Core
- Documentation is easy to start with
- No diversity divide between R Core and those who documentate
- Other people who could start with R Core, I could not, so documentation
- Not conflate technical skill building with diversity
- (JB) So much this! Documentation is a gateway, yes, but I would hate to be creating two groups of people.
- (CW) I was able to show how much easier it was to contribute to Jupyter and numpy than core python. Even though the technical work was comparable.
- (GQ) Come up with documents about what we have to present to R Core
- Here is how we will execute with your help
- (HT) Should be thinking about this next
- Good to work together on GitHub/GitLab. Good to have experts and software engineers to see if it's useful (also others included)
- Share conversations with Python community
- (HB) Wish list from R Core - what they hope to see out of this project
- +1 to wish list from LA, YB, SKB, GR
- Expose their thoughts and what they think
- Show low hanging fruits and things we can start with
- Loosely in R Core heads, have they discussed this?
- (LA) Maybe a wish list from the individuals in R Core that would like to see change happen? Thank you to the individuals in R Core that are here today and for all the key work you do
- +1 thanks to R Core folks for participation today from AM
- (RG) R Core is not an organized structure - IMTJ side of things
- Assumes that e-mail communication is not necessary
- Very conflicting views of the world
- Bulk of R Core
- Thanks for all of the contributors
- Will put this ideas into action to move things along
- (MT) Process R core maintains base code, also shepard the vast code of CRAN packages
- Change R core: what did you break, and now there is a message that tells you what you broke
- Increase participation: need tools in place to allow people to see what broke
- Is it package builder or R Core?
- Cloud based solutions are an option, but not in place
- Must be able to break things without worrying
- (CW) We have a bunch of buildbots.
- (CW) Excellent points, Martyn. One page cheatsheet on how to build R on Linux, Mac and Windows.
- (NT) That goes to Jenny’s point from the beginning about tooling. It’s definitely at the heart of this issue.
- (JB) And how do you play with building R and patching R without breaking your “main” R installation?
6. [10 mins] What are our priorities?
- (HT) What angle do we want to attack this from, what approach should we take, coordinate and strategize
- Summarize all of this and report to R Core so can have internal discussions
- Reverse process of R Core's wish list
- Can't set out all priorities now - not commited for the next year, but a start
- (EC) What about adding a person to the R Core team whose role is specifically to think and do stuff around this issue? It would organise the disparate views among the rest of the members, and also officially recognise the importance of that work.
- (CW) Python Devguide - 1 page how do you build python linux, mac, and window
- Showed where the failures in our tooling are
- How can we help people bootstrap their development to help them
- Little things have a big impact
- Gives people ability to code not just documentation
- (HT) Simple guide to help people get started
- People who recently joined Forwards and are interested in documentation
- People that can help explain what the problems are, that R Core can't see
- (JB) Evaluating GitLab/GitHub - PEPs
- Have community-wide discussion of the big decision?
- Some members of R Core make development blog post
- Moving R to SVN to a modern git host is a big decision, community has a lot of expertise
- Should probably be analyzed
- Opportunity to create report about this
- Fundation for diversity initiatives
- (HT) Software engineering experts could weigh in on this
- (Mine) Perhaps this sort of a move can reveal holes in the system
- Muscle memory things could be rethought
- (CW) People can be subversion natives
- Git and subversion side-by-side
- Needs to be a win-win, if not will not be successful
- Unequal buy in is okay
- 4 years to become a developer
- Knew she could help b/c she had worked on Jupiter and there were things that were not being done
- Focus on win/win and more likely to be successful
- We spent a lot of time training core devs who were comfortable with mercurial to learn git.
- (LT) subversion and git are the same age
- One is not better than other
- Once thought of source force
- (EC) The best one is the one people use. Better != technically better.
- (LA) Because you need more people participating in the process.
- +1 from LA, GR, YB, NT
- (NT) The fact that essentially they aren’t different is all the more reason to move to an open one that is more inclusive
- (CW) Must look at what people are learning in school now
- Tooling for automated testing,... are supported by git, subversion is an afterthought
- (RG) How do you get to a win/win but by doing so we enable a bigger group of people to contribute?
- Everyone in R Core is super busy and changing something that you've done for 20 years is hard
- I have been doing it for 20 years and it is automatic still even though not contributing for awhile.
- Not as good a mentor with a new system b/c have to learn a new process
- (CW) Excellent points, Robert. +1 from YB and NT.
- (NT) Co-learning with peers can be better than learning from mentors. +1 from AM
- (AM) I think it’s incorrect to say that someone who is not an expert is a worse mentor. Research shows that people closest to the novice state are often the best teachers.
- (GM) Sorry - I did not mean that one can only learn from an expert
- (EC) There are lots of people who know git. Maybe they are the ones that become mentors to the r-core members during the transition. It will be painful, but the community is there and we can help (well, I don't, because I don't know so much about git). +1 from CW
- (ST) Happy to help with git wizardry :)
7. [10 mins] Where do we go from here?
- (HT) Meet in Sept
- August is an off month in Europe
- R Core discusses what has happened today
- R Core work on wish list
- Setup a repo on Forwards that HT will invite us to
- Create an issue for each idea and other ideas to discuss them further
- Groups formed
- September to assess which issues have volunteers and projects
- Feel free to get in touch with comments later
- Many thanks to HT for organizing from ST, AM, JB, EC, AV, RB, SKB, MCR, NT, GR, GQ, YB, CW, MG, ER
- (NT) one comment regarding forming groups "naturally" around topics. I think we need to be strategic about how we affect change. So some ownership around long term goals, and strategy around this should be specified by R Core / Forwards / or maybe any of the groups that see themselves as long term stakeholders?!
- (HT, post-meeting) Good point, I thought we might use Martha's Rules (https://third-bit.com/2019/06/13/marthas-rules.html) to discuss proposals and decide what to take forward. But that may not help with prioritising. We can also discuss meta issues like this on the GitHub once I've got it set up.
- For people that want to join link below, not necessary to join if don't want.
- (ML) - Thanks, thanks to RG and LT
People that want to join Forwards:
https://github.com/forwards/knowledgebase