## Google's Gender Gaps - Worse Than Advertised

*Where we can measure it, female participation in Google's engineering is much lower than expected.*

Google reports that 21.4% of their “tech” employees are now female, up from 20.2% last year[^1]. You would be mistaken, though, if you assumed that this was equivalent to the female representation in their software engineering roles, which a lawsuit revealed to be significantly lower, at 16% in 2017[^2].

Are there other reasons to be skeptical? Well, glancing over at Microsoft's diversity report, we notice that they shifted their data collection date from September to June this year[^3]. Allegedly the shift was to align with their financial calendar, but it is interesting that this date coincides with the rapidly expanding "Explore Microsoft" internship program, which has a [markedly different](https://twitter.com/donasarkar/status/999809666170290177?s=20) demographic makeup compared to the rest of the company. Facebook is also reporting 2018 data from June[^4], shifted from May[^5] in past years.

Google has a diverse summer internship program as well, and 2018 saw *"49% of Google’s global interns identifying as Black, Latinx, and/or women"*[^1]. Google doesn't provide a reporting date with their data, but I wouldn't bet against it being in June.

This possibly explains some of the discrepancy between these companies' PR announcements and their Equal Employment Opportunity filings, which show much lower numbers[^6][^7][^8]. Facebook warns us that:

> ...due to the way the U.S. government tracks EEO-1 data, the numbers reflected in the below form are representative of a point in time in December 2017, and not our current 2018 data. The EEO-1 data also reflects job groupings and categories that do not align with the way Facebook groups our roles and employees internally. we believe that the information present on this website is a far more accurate reflection of the progress we've made and the work that remains to be done.[^4]

If this seems difficult to disentangle, that's probably intentional. We may want to understand where representation is at its lowest, but tech companies are trying to make themselves look as good as they can. They have to compromise between appearing to be honest and forthcoming with their metrics and making sure those metrics stack up against the competition.

Ideally we would measure employee demographics and work patterns directly, without having them passed through the PR department. An enterprising Google employee could do this quite easily: they could compile user activity from their engineering systems, tag users with identity information from the internal mailing lists, and run the numbers[^10].

We don't have access to this data, but we can find a smaller approximation of it. The past few years have seen the rapid growth of [Google’s GitHub Organization](https://github.com/google). Here thousands of verified Googlers do their work in public on open source projects. Almost all their accounts can be linked to their real identities, making relatively accurate determination of their gender possible, albeit time-consuming, and GitHub provides an API for linking them to [contributions](https://blog.github.com/2013-01-07-introducing-contributions/).

As of December 2017, a scrape of the Google GitHub organization yielded:

- 1493 members
- 1048 repositories
- 60,969 contributions[^12]

The dataset, aggregated by user and supplemented with gender and each user's personal GitHub stats[^11], is available [here](https://pastebin.com/raw/gjCbeNbi). Since this scrape, the member list has grown to 2100 with significant turnover, so there will be a lot of value in updating it with new data. But for now, let's see what we have.

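As a rough illustration of how a per-user dataset like this can be turned into per-group summaries, here is a minimal sketch. The record fields (`gender`, `contribs`) and the sample values are hypothetical and are not taken from the published dataset; the function computes a member count, the share of members with any contributions, the average with and without zero-contribution members, and each group's share of all contributions.

```python
from collections import defaultdict

# Hypothetical per-member records: field names and values are illustrative
# only and do not come from the published dataset.
members = [
    {"gender": "male", "contribs": 120},
    {"gender": "male", "contribs": 0},
    {"gender": "female", "contribs": 30},
    {"gender": "female", "contribs": 0},
    {"gender": "unknown", "contribs": 10},
]


def summarize(records):
    """Return per-group counts, averages, and shares of all contributions."""
    groups = defaultdict(list)
    for record in records:
        groups[record["gender"]].append(record["contribs"])

    total = sum(record["contribs"] for record in records)
    summary = {}
    for gender, counts in groups.items():
        # Members with at least one contribution.
        active = [c for c in counts if c > 0]
        summary[gender] = {
            "count": len(counts),
            "pct_active": 100 * len(active) / len(counts),
            "avg": sum(counts) / len(counts),
            "avg_active": sum(active) / len(active) if active else 0.0,
            "pct_of_all": 100 * sum(counts) / total if total else 0.0,
        }
    return summary


for gender, row in summarize(members).items():
    print(gender, row)
```

Reporting the average both with and without zero-contribution members keeps the "joined but never contributed" population from being silently folded into the mean.
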
First up, member statistics by gender:

| Gender        |     Count | Avg #repos | Avg #followers | Avg #gists |
| :------------ | --------: | ---------: | -------------: | ---------: |
| Male          |      1378 |       7.26 |          147.9 |       8.18 |
| Female        |        69 |       5.29 |           84.7 |       2.64 |
| Unknown       |        46 |       3.29 |           20.8 |       0.76 |

The 46 users marked as “unknown” are mostly pseudonymous usernames that could not be linked to any social media presence. These accounts tended to have less activity, causing the low numbers for this group. A significant number of unknowns also came from real names that did not have a strong gender association and could not be reliably linked to a social media presence.

Looking at the male and female groups, two things stand out:

1. Only 4.8% of the gendered members are female
2. Females have significantly less activity than males, with:
    - 73% as many repositories
    - 57% as many followers
    - 32% as many gists

Now let's examine contributions to repositories owned by the Google organization:

| Gender        | Members with #contribs > 0 | Avg #contribs | Avg excluding 0 | % of all contribs |
| :------------ | -------------------------: | ------------: | --------------: | ----------------: |
| Male          |                        34% |            44 |             127 |             98.8% |
| Female        |                        25% |             8 |              32 |              0.9% |
| Unknown       |                        34% |             5 |              15 |              0.3% |

As shown in the first column, the majority of Googlers who join the GitHub org don't make any contributions (or only contributed to capsicum-linux[^12]). What would cause someone to join the org but not make any contributions is a potentially important unknown, so we calculate the average number of contributions both with and without these members.

Points of interest:

1. Females were 74% as likely to have any contributions as males
2. Females with contributions had 25% as many contributions as males
3. 98.8% of all contributions were from males

The contributions disparity between males and females was much larger than expected[^13].

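The percentages quoted in the two lists above follow directly from the table values. The short check below reproduces them; the dictionary keys are just labels chosen for this sketch, and the numbers are copied from the tables.

```python
# Figures copied from the two tables above; the key names are labels
# chosen for this sketch only.
male = {"count": 1378, "repos": 7.26, "followers": 147.9, "gists": 8.18,
        "pct_active": 34, "avg_active": 127}
female = {"count": 69, "repos": 5.29, "followers": 84.7, "gists": 2.64,
          "pct_active": 25, "avg_active": 32}


def pct(numerator, denominator):
    """Express numerator / denominator as a percentage."""
    return 100 * numerator / denominator


# Share of gendered (male + female) members who are female.
female_share = pct(female["count"], male["count"] + female["count"])

# Each female figure as a percentage of the corresponding male figure.
ratios = {
    key: pct(female[key], male[key])
    for key in ("repos", "followers", "gists", "pct_active", "avg_active")
}

print(f"female share of gendered members: {female_share:.1f}%")
for key, value in ratios.items():
    print(f"{key}: {value:.0f}% of the male figure")
```

Note that the "74% as likely" figure is a ratio of two percentages (25% / 34%), while the "25% as many contributions" figure compares the averages that exclude members with zero contributions (32 / 127).
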
With only 1.2% of the original member list being contributing females, we're down to just 17 members in this category, but we can still get a better idea of how the groups compare with a density plot:

![](https://i.imgur.com/8i7euDS.png)

Note that the contributions are shown here on a log2 scale, so members on the right edge of this graph are contributing 1000x as much as members on the left. What we observe is that a minority of the male contributors are contributing far more than most other members, and thus also driving the disparity in average contributions between male and female contributors. It's not so much the case that female members have unusually low contributions; rather, it's that all the members with unusually high contributions are male.

Given the surprising nature of these findings, we should revisit the assumption under which this data was collected. We started out aiming to understand the work patterns and demographics of software engineering at Google, but not being able to access this data directly, we settled for what was hosted on the Google GitHub organization. This excluded all closed source projects, and also excluded open source projects hosted elsewhere (such as Android and Chromium). This approach includes selection effects that could have skewed our sample, such as:

- Open source vs non-open source
- Small projects vs large projects
- Small teams vs large teams

Disparities don't come from nowhere, and if our data is significantly skewed from Google's internal workflows then that raises interesting questions in itself. For example, are small teams worse at accommodating women? Is Google's open source culture particularly male-dominated?

It should also be noted that the disparities we are looking at are not the same thing as employee performance.
In our data, "contribution" is a term from GitHub's API, and may be a poor proxy for an employee's actual contribution to the company (although again, a gender skew between GitHub contributions and overall performance would be interesting in itself).

Overall, three important conclusions appear to be warranted by this investigation. First, female participation in Google's GitHub org is, for whatever reason, much lower than expected. Second, for understanding gender disparities at Google, the diversity reports and press materials fall somewhere between "insufficient" and "misleading". While they probably don't give false information, anybody who takes them at face value will be very surprised by findings such as the gender disparity in their GitHub users. Finally, this investigation demonstrates that interesting gender disparities can be found by extracting data from engineering systems, and that open source provides an opportunity for interested third parties to conduct research.

[^1]: [Google Diversity Annual Report 2018](https://static.googleusercontent.com/media/diversity.google/en//static/pdf/Google_Diversity_annual_report_2018.pdf), page 17
[^2]: [JAMES DAMORE vs. GOOGLE, LLC Case #18CV321529](https://www.dhillonlaw.com/wp-content/uploads/2018/04/20180418-Damore-et-al.-v.-Google-FAC_Endorsed.pdf), page 62
[^3]: [Microsoft Workforce Demographic Report 2018](https://www.microsoft.com/en-us/diversity/inside-microsoft/default.aspx)
[^4]: [Facebook Diversity Update 2018](https://www.facebook.com/careers/diversity-report)
[^5]: [Driving Diversity at Facebook](https://newsroom.fb.com/news/2015/06/driving-diversity-at-facebook/)
[^6]: [Facebook's EEO report](https://www.facebook.com/careers/pdf/diversity-report)
[^7]: [Google's EEO report](https://diversity.google/static/pdf/Alphabet-Consolidated-EEO-1-Report-2017.pdf)
[^8]: [Microsoft's EEO report](https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RE2Hh94)
[^9]: Note: it’s important to do (2) before (3) and (4), to minimize the possibility of researcher bias when determining gender.
[^10]: It would also be fascinating to compare other groups, e.g. do furries create more bugs than basketball players?
[^11]: Method for collecting the data:
    1. Scrape the members list from Google’s GitHub org
    2. For each member account, attempt to determine gender[^9]:
        - Try determination from account name and photo
        - Try determination from contact details, personal website, email address, username reuse
        - Attempt to match all possibly ambiguous names to LinkedIn and social media accounts
        - For data-deficient names, record “Unknown”
        - Go with stated gender if it exists; if the member identifies as a gender other than male or female, record “Unknown”
    3. Scrape GitHub’s contribution statistics for every repository owned by the organization
    4. For each member, additionally scrape their personal account statistics
[^12]: Excluding contributions for the repository [capsicum-linux](https://github.com/google/capsicum-linux), as these were too large for the API
[^13]: Predictions recorded before running this experiment:
    - Only 8% of contributors will be female
    - Females will on average have 75% as many contributions as males
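
The aggregation implied by step 3 of the method in footnote 11 could be sketched roughly as follows. This is an assumption-laden illustration, not the script used for the original scrape: it assumes the per-repository data has the shape returned by GitHub's REST endpoint `GET /repos/{owner}/{repo}/contributors` (a list of objects with `login` and `contributions` fields), and the original scrape may have used a different endpoint. The function name, the sample logins, and the sample repositories are all made up, and no network calls are made.

```python
def contributions_by_member(member_logins, contributors_by_repo, skip_repos=()):
    """Sum each org member's contributions across the org's repositories.

    contributors_by_repo maps a repository name to a list of
    {"login": ..., "contributions": ...} entries (assumed response shape).
    Repositories named in skip_repos are ignored, mirroring the exclusion
    of capsicum-linux described in footnote 12.
    """
    # Start every member at zero so non-contributors stay in the result.
    totals = dict.fromkeys(member_logins, 0)
    for repo, contributors in contributors_by_repo.items():
        if repo in skip_repos:
            continue
        for entry in contributors:
            login = entry["login"]
            # Only count contributors who are members of the organization.
            if login in totals:
                totals[login] += entry["contributions"]
    return totals


# Made-up sample data for illustration only.
sample_members = ["alice", "bob", "carol"]
sample_repos = {
    "repo-a": [
        {"login": "alice", "contributions": 12},
        {"login": "outsider", "contributions": 40},
    ],
    "repo-b": [
        {"login": "alice", "contributions": 3},
        {"login": "bob", "contributions": 7},
    ],
    "capsicum-linux": [{"login": "bob", "contributions": 9000}],
}

totals = contributions_by_member(
    sample_members, sample_repos, skip_repos={"capsicum-linux"}
)
print(totals)
```

Keeping zero-contribution members in the result matters here, since the analysis reports averages both with and without them.
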
