# Fairlearn Community Call Meeting Notes (2020 through July 2021)
## 07/08/21
- Discuss upcoming SciPy tutorial logistics for 7/13
- 2 speakers (Miro and Manojit); others will monitor the AirMeet chat to answer questions and surface questions for the speakers
- Breakout rooms - AirMeet has "tables" which can be reserved but people have to leave the session (and speakers can't leave).
- Can post a link to a Google Meet for the breakout rooms
- How to handle technical questions (e.g., if they need video support)? There's a general conference Slack, and we should be present in our tutorial channel there, but it may be better to respond in the chat in case others have similar questions or could offer support themselves
- At the end of the tutorial, we can direct them to our Discord for future conversations
- Word cloud as icebreaker in the beginning (and again at the end)
- TODO:
- Dry run of Google Meet breakout room
- Asking for host permission for Miro
- Asking organizers to send out email to attendees about having a Google account for Meet and CoLab
- Set up word cloud
- Sprints
- We have a new sprints channel on Discord!
- Roman curated about 25 issues (see sprints channel on Discord)
- [SciPy](https://www.scipy2021.scipy.org/sprints) coming up on July 17 & 18 with mentored sprints on the 17th and general sprints on 18th
- [sprint instructions](https://github.com/romanlutz/fairlearn-scipy-sprint/blob/main/README.md)
- [EuroPython](https://ep2021.europython.eu/events/sprints/) happening on July 31 and August 1 - anyone willing to help?
- The contributor guide lacks sufficiently specific information about setting up the environment (e.g., package versions) --> Roman to add a non-sprint-specific version of the sprint instructions to the contributor guide
- Data Umbrella hosts sprints
- Announcements/updates
- Educational materials group meeting next Wednesday at 11am ET
- Release is out, but not yet updated on the website, pending a Readme PR
- People interested in plotting for MetricFrame should comment on Alex's updated PR
## 07/01/21
- Discuss issue [#756](https://github.com/fairlearn/fairlearn/issues/756)
- Alex updated [error bar PR](https://github.com/fairlearn/fairlearn/pull/857) with API suggestions - looking for comments and reviews
- Long discussion on issue 756 that seems to have ground to a halt - how to organize it to help people contribute?
- Can we inject support for error bars into MetricFrame? (see the sketch at the end of this discussion)
- Better way to have this discussion to move it forward? - the thread is getting too long
- Do we want an intermediate structure between MetricFrame and a DataFrame?
- Storing errors may be a separate conversation
- May need to create a separate issue for supporting error bars for derived metrics
- See [Seaborn](https://seaborn.pydata.org/) for exemplars here
- Action items:
- Richard will create a higher-level summary and open a new issue
- Need for review of Alex's PR
- (and don't forget Kas's PR)
- For future: PR authors may want to summarize on the calls
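As a concrete reference for the error-bar discussion above, here is a minimal sketch of bootstrapping per-group confidence intervals around a `MetricFrame` result. This is not the API proposed in PR #857 (which was still under review); it just shows the shape of the problem with today's public API:

```python
import numpy as np
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)
y_pred = rng.integers(0, 2, size=500)
group = rng.choice(["a", "b"], size=500)

mf = MetricFrame(metrics=accuracy_score, y_true=y_true, y_pred=y_pred,
                 sensitive_features=group)

# Bootstrap each group's metric to estimate a 95% confidence interval.
errs = {}
for g in mf.by_group.index:
    yt, yp = y_true[group == g], y_pred[group == g]
    stats = [accuracy_score(yt[i], yp[i])
             for i in (rng.integers(0, len(yt), len(yt)) for _ in range(200))]
    errs[g] = np.percentile(stats, [2.5, 97.5])

# Plot bars with asymmetric error bars derived from the bootstrap.
yerr = np.array([[mf.by_group[g] - errs[g][0] for g in mf.by_group.index],
                 [errs[g][1] - mf.by_group[g] for g in mf.by_group.index]])
mf.by_group.plot.bar(yerr=yerr, capsize=4)
```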
- Release (almost ready)!
- Will need to update tutorial notebook after it's out
- SciPy [sprints](https://www.scipy2021.scipy.org/sprints) - on 7/17 and 7/18 (following the tutorial on 7/13)
- Feel free to join! (starting at 5am CT?)
- Issue curation beforehand - Roman will take the lead, but open for input from others (or just open issues yourselves and tag with labels)
- For new contributors:
- Before opening a PR, start a conversation on the issue to check whether anyone else is working on it, or to raise questions about scope or need
## 06/24/21
- Update on next steps for educational resources
- See notes [here](https://hackmd.io/@STU6DFvcRo6VVk1dPGbTMQ/B1i-SMzhd)
- If you want to get involved in developing the "syllabus" for educational resources, please message Michael M and/or Hilde on Discord
- Action items:
- Hilde and Michael M will lead a separate working group with interested contributors in:
- Defining learning goals, their sequencing, and ways to integrate them into existing content (e.g., steps 1-3 in "what's next" in the [HackPad](https://hackmd.io/@STU6DFvcRo6VVk1dPGbTMQ/B1i-SMzhd)).
- Then opening issues to address those goals, in ways that help contributors understand how they fit in with the desired structure
- SciPy discussion of breakout groups during tutorial
- Discussion of Discord platform
- "bias towards action" rather than just discussion
## 06/17/21
- Update on abstraction traps PR
- Working through examples to use (e.g., for solutionist trap)
- Can add some examples if possible, but can also leave others for future PRs
- Discussion with Ayodele on educational goals and material for Fairlearn
- What might the community need for better tutorials?
- Current examples mostly cover binary classification, but we may want more complex examples
- Possibly more explanations of what fairness metrics mean and when to use them
- From Kas: not just teaching fairness separately from ML, but integrating the two
- Domain-specific fairness guidance
- How might users identify the groups to consider?
- How are people accessing the website / getting started with Fairlearn?
- Starting with quickstart, notebooks, API pages; copy-paste from simple examples
- How to integrate the social and technical in the content and not keep them siloed?
- How to add friction or pause points in the quickstart for users to consider sociotechnical factors or context?
- Other forms of content:
- Short videos
- Slideshows
- Educational materials
- Can create a channel in Discord for these conversations
- May have at least 1 meeting discussion on strategy? Potential for ongoing conversation in separate meetings with interested folks or dedicated community call meetings
- Considering the website design and user flow in addition to creating content
- **Action item:** Hilde and Michael (and others, if interested) will discuss concrete agenda for next week's meeting on Discord
## 06/03/21
- Website update - we have a front-end engineer to help with the website, but how do we best use their time, given Sphinx?
- Create new sphinx theme?
- or modify an existing one (in sphinx pydata, which barely works on mobile)
- May be a lot of work to maintain our own theme
- Define the right scope, but not create maintenance debt
- May want to talk to (someone who developed pydata - Uris?)
- Miro will follow up with maintainers, contractor, designer; and then go to Uris
- New website issues (#833-845)
- Goals: transparency of the process
- Vanessa may flag certain ones for contributors to comment on or take some action on
- New intern (Alex) working on plotting error bars
- SciPy
- Should fix the issue which makes us need to create a custom estimator in the tutorial
- Can pass the name of the function internally
- **Action item here?**
- Manojit giving tutorial to DataKind next week, will get some feedback then
- Feedback from new members: good to have contact with maintainers, liked opportunities for conceptual contributions
- Take a pass on issues to see which are assigned to people
- Make clear that even if someone else is working on it (e.g., assigned, with PRs, etc), it might be stale, so people should check before starting
- But this expectation should be made clear to new contributors
- **Add this to the How to Contribute Readme**
- **For "good first issue", add a line** asking contributors to comment on the issue before starting to work on it (link to contributor guide or [label explanations](https://github.com/fairlearn/fairlearn/labels))
- Discuss documentation re: abstraction traps, and including examples, code etc?
- Add examples for abstraction traps of what these look like in contexts
- Start with concrete examples first, then can consider code snippets, simulations later, maybe in separate issues or PRs
- **Laura will share an update in 2 weeks**
## 05/27/21
- Collaboration platforms
- Discord:
- Easier to hang out than Gitter (e.g., can see if people are online and willing to jump into a call)
- Would it work for large group video calls (like the weekly meetings)?
- Possibly?
- But may not have ability to schedule meetings
- Concern about replacing Gitter due to ease of access
- Has quite a few options for moderation:
- requiring accepting code of conduct before posting
- easy to remove people if they're violating CoC
- Has GitHub integration, to reference PRs like in Gitter
- Only allows up to 8 people on video in the free version - but this seems to have been raised to 25 recently?
- Suggestion to use Slack
- For threaded discussions, GitHub integration
- But not as transparent as Gitter (which you can view without having an account)
- Concern about limits on the free version
- Recommendations
- Use Discord for asynchronous chats, sprints, ad-hoc meetings (i.e., replacement for Gitter)
- Continue to have the majority of discussion happen on issues
- Continue to use Teams for weekly calls (for now)
- Highlighting contributions on Twitter (#[824](https://github.com/fairlearn/fairlearn/discussions/824))
- Giving credit for contributions in other areas via the All Contributors bot
- Currently manually creating Authors.md file, but this has issues with upkeep
- TBD on next steps
- Website update
- Working with illustrators to get inclusive illustrations
- Looking at getting a web developer contractor
- But this shouldn't be a blocker on improving the website design
- e.g., moving sections around and restructuring, but not, for example, re-doing Sphinx
- Would be good to have a clean-up before the SciPy tutorial in the middle of July
- Release of 0.7
- Should think about creating curated issues for SciPy
- Hilde will start with some website issues
- Chest x-ray use case update
- Goal - demonstrate fairness issues in the medical domain
- Audience - data/applied scientists in the medical domain (and possibly others, but would need some transfer)
- Challenge of scope - you could write a book about x-rays, so we don't want to cover all of this, but we do want to engage with the complexity of the context and task
## 05/20/21
- [PyTorch implementation of Fairlearn](https://github.com/wbawakate/fairtorch) - but Fairlearn already supports PyTorch functionality, so may be possible to sync rather than duplicate efforts
- We can be clearer about support for PyTorch, TensorFlow, etc
- Discussing contributing documentation - breaking into smaller segments, using concrete examples if possible
- Specifically for issue [782](https://github.com/fairlearn/fairlearn/issues/782) and PR [809](https://github.com/fairlearn/fairlearn/pull/809), will break into separate PRs for each abstraction trap, add examples
- For new contributors, see contributor guide [here](https://fairlearn.org/main/contributor_guide/index.html)
- Discussion of other options for synchronous working
- Possibly short office hours to help unblock contributors?
- Question around split functions (e.g., separate utils, visualization options, etc)
- Support for matrices and arrays
- May work with Roman to develop unit tests
- Issues on the radar
- About section - Miro made tweaks to reflect new governance
- May have broken the website?
- For SciPy, Miro will open a PR for cost-sensitive classification gradient loss to post-processing
- How do we handle PRs where the contributor hasn't updated in a while (how long is a while: a month? more?)?
- Email contributors reminding them that we want to support them in contributing, but we also want to resolve the PR (while still acknowledging their contributions)
- For the [website link PR](https://github.com/fairlearn/fairlearn/pull/744), we'll just merge and fix
- The [other PR](https://github.com/fairlearn/fairlearn/pull/732) may not have a fork anymore
## 05/13/21
- Discussion of how to incorporate error bars and confidence intervals in MetricFrame
- Discussion of counterfactual analyses ([#772](https://github.com/fairlearn/fairlearn/issues/772))
## 05/06/21
- Different versions of documentation (for different Fairlearn versions) may have dependency issues
- You can add dependencies, but can't get rid of them (with sphinx multiversion)
- Alternatives - scikit-learn's approach, but it is non-trivial to set up
- CircleCI has a 10-minute cap for the free tier (we're reaching this already with at least one build)
- This is being used for the website, which includes notebooks and datasets
- **Action item**
- Open an issue, and Adrin can follow up to help on the sk-learn side
- PyCon presentation is next Friday (5/14, at 3pm ET) and a later workshop with DataKind to use Fairlearn in project scoping phase
- How should we navigate the relationship between Microsoft and Fairlearn (now that it's open governance) at PyCon?
- Make it clear that the project is open governance
- PyCon has two types of sprints: "[development sprints](https://us.pycon.org/2021/events/development-sprints/)" (5/17-18) and "[mentored sprints](https://us.pycon.org/2021/summits/mentored-sprints/)" (5/15-16)
- If you're interested in helping with the mentored sprints, let Roman know
- **Action items:** mentors for sprints should label the issues they would be interested in mentoring on with "sprint spotlight"
- Model cards - Adrin has talked to lawyers about license issues, and moving the code to the Fairlearn org
- Roman/Miro/Rishabh found a better way to get model comparison plots ([issue #666](https://github.com/fairlearn/fairlearn/issues/666))
- **Question**: How do people feel about forcing keywords more often (e.g., more like pandas than scikit-learn)? (some conflation between keyword arguments and "default argument" here; see the sketch at the end of these notes)
- Fairlearn dashboard functionality moved to [Responsible AI widgets](https://github.com/microsoft/responsible-ai-widgets)
- New release coming out today
- Issue opened for a glossary of construct validity terms ([issue #769](https://github.com/fairlearn/fairlearn/issues/769))
- Incorporate this into a new [user guide](https://fairlearn.org/v0.6.1/user_guide/index.html) section 1.4 (Measurement modeling)
- **Idea:** Use weekly meeting(s) as one or a series of hackathons to implement/tackle some of the open issues
- Question: should each contributor of an implementation (e.g., of counterfactual fairness) write documentation on how to use it (probably yes) *and* on what it might look like to use it in a grounded context
- **Action item:** Open a small issue to start the conversation
- Suggested: using [xdoctest](https://pypi.org/project/xdoctest/)
- Platform for weekly meetings:
- Suggestion to use Discord (may be able to support up to 50 people, with LaTeX functionality, syntax highlighting, etc)
- **Action item:** Roman will summarize different platforms in a Discussion post
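Re: the forcing-keywords question above, a minimal sketch of what the pandas-style keyword-only convention looks like in Python. The signature shown is illustrative (Fairlearn's metric functions do use a bare `*` to make `sensitive_features` keyword-only):

```python
# The bare '*' makes everything after it keyword-only, so callers must
# write sensitive_features=... explicitly; positional calls raise TypeError.
def demographic_parity_difference(y_true, y_pred, *, sensitive_features,
                                  sample_weight=None):
    ...

# OK:    demographic_parity_difference(y_true, y_pred, sensitive_features=sf)
# Error: demographic_parity_difference(y_true, y_pred, sf)  -> TypeError
```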
## 04/29/21
- Updates:
- PyCon coming up: two types of sprints: "[development sprints](https://us.pycon.org/2021/events/development-sprints/)" (5/17-18) and "[mentored sprints](https://us.pycon.org/2021/summits/mentored-sprints/)" (5/15-16)
- We may want to use a new label for issues called "sprint spotlight" to focus attention on certain issues that might be good for sprints, for both PyCon and then SciPy later
- May want to refactor MetricFrame to enable new uses (e.g., dataset metrics; see the sketch below)
- Include it in the user guide, possibly the FAQ
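A sketch of the "dataset metrics" idea above, assuming the current `MetricFrame` contract: because the API requires both `y_true` and `y_pred`, a dataset-only statistic such as the per-group base rate has to be computed by passing the labels into both slots, which is the awkwardness a refactor could remove:

```python
import numpy as np
from fairlearn.metrics import MetricFrame, selection_rate

y = np.array([1, 0, 1, 1, 0, 0, 1, 0])
sensitive = np.array(["a", "a", "a", "b", "b", "b", "b", "a"])

# Dataset-only statistic: per-group base rate of the labels themselves.
# No model is involved, but the contract still demands a y_pred,
# so the labels are passed twice.
base_rates = MetricFrame(metrics=selection_rate,
                         y_true=y, y_pred=y,
                         sensitive_features=sensitive)
print(base_rates.by_group)
```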
- Presentation from Wesley Deng on research with practitioners using Fairlearn
## 04/22/21
- Synced on process for getting the use cases to GitHub, for the predictive policing / construct validity piece specifically
## 04/15/21
- Governance update is done
- Possibility for Eindhoven interns to work on Fairlearn (possibly mitigation techniques)
- Deadline for SciPy tutorial - June 18th
- May aim to present a practice talk at a Fairlearn community call
- Discussing how to teach construct validity (from Adrin's scenario)
- May want to go through the process described below (e.g., break it down into separate issues for each sub-type, with definitions, examples, etc)
- Getting on the same page re: "tags/labels vs. projects vs. milestones"
- Labels for issues: e.g., "good first issue", "easy" (which may be interpreted differently w.r.t sequence)
- Projects may be ad hoc, personally created (e.g., "Roman's pet project"), and may be stale
- May be used as a tool to manage workflow, but not to discover what to work on
- Often new contributors work on latest issues
- Concern about large number of tracking methods - issues, PRs, projects, discussion posts, milestones (mostly just used for releases for now, but may change as the velocity increases)
- Next time there's a release, get Richard to write down the steps
- Get access to the Fairlearn Azure release pipeline for non-MSFT employees (Adrin and Hilde)
- May need a separate org
- Label taxonomy (in order):
- Good first issue
- Discussion of whether this is good for newcomers to OSS or to Fairlearn
- Easy
- Intermediate
- Moderate
- Is it apparent that the community call is meant for any contributors to join?
- Question of discoverability for newcomers - are newcomers who want to join able to?
- There is an open [issue](https://github.com/fairlearn/fairlearn/issues/716) to update the info about the community call
- Current approach to the meeting link: reach out on Gitter or other forums, but _not_ posting an open link to the Teams meeting
- Possibly use an open platform where more than one person can approve new members
- Possibly share link on mailing list
- Could post the link (on ??) and open a lobby
- Meeting discussion topics
- Twitter account can tweet a link to the discussion topic a day in advance
- Add backlog of reading list to discuss if no other topic
- Or just open discussion of recent issues or a working meeting
## 04/01/21 and 04/08/21
- Updates, blockers, etc?
- Code of conduct
- Should we point to the Python CoC?
- Or, fork that and update that for us
- SciPy proposal accepted!
- Can use a tutorial for DataKind (mid-May) as a practice run for that
- Issue inventory (Roman, Hilde): with PyCon and SciPy sprints coming up we should compile a list of good first issues and other issues for which we want help from contributors. The list below also includes some existing issues whenever we encountered areas that might need some extra information/description or clarification. A lot of this is highly related to roadmap discussions that have happened over the last several months.
- general pieces of advice:
- issue descriptions need to be clear to new people, ideally with a proposed solution so that people can get started
- the smaller the better (large changes tend to get stuck), when possible break up into "finding sources/relevant literature" and "distilling information into user guide"
- Miro: every issue should have a maintainer who commits to working with contributors (review)
- option: "low priority" label
- respond saying that there's currently no time for that
- assignee is person working on PR (rather than maintainer)
- items can be tagged for the sprints
- **website**: Does it make sense to compare the status quo with the mocks to find content gaps and create issues for each gap?
- Miro: let's hold off on this since we have a front-end engineer joining soon
- Content changes could already be captured in issues (but well-defined and not open-ended)
- ~~About~~ PR #500
- **user guide**: shall we create issues for the following?
- ~~Abstraction traps~~ captured in #778-782
- ~~identify relevant literature~~ (Selbst et al., 2019)
- break down into different issues for different abstraction traps
- small issues with different output forms (e.g., define concepts, give example(s) of concept in a particular context, write code demonstrating concept, etc)
- Construct (and other types of) validity - #769 created for construct validity, #707 captures the larger discussion with other types
- ~~identify relevant literature~~ (Jacobs and Wallach, 2021)
- break down into different issues for different types of construct validity
- small issues with different output forms (e.g., define concepts, give example(s) of concept in a particular context, write code demonstrating concept, etc)
- How do I decide which fairness criterion to use? [A very common question for which our user guides don’t really offer any help] - update: Hilde created https://github.com/fairlearn/fairlearn/issues/721
- identify relevant literature
- concrete small steps rather than open-ended
- ~~Talking points on how to raise fairness issues / discuss fairness~~ #783
- identify relevant literature
- concrete small steps rather than open-ended
- ~~User guides for new mitigation techniques (see the corresponding list)~~ - should be included in PR that adds the new mitigation technique
- Datasets
- create user guide entry for fetching datasets
- create documentation for an existing dataset (e.g., datasheet?)
- identify sources for information about the datasets (i.e., find links to sources which can then be added to API reference)
- illustrate known fairness issues with existing datasets based on the sources found (and/or point to or summarize blog posts or papers about that dataset)
- ~~propose new datasets?~~
- Example: https://github.com/fairlearn/fairlearn/issues/507 (Roman to make clear exactly what's expected there to avoid open-ended scope)
- get rid of census dataset usage in our examples
- Other types of fairness (as mentioned in https://github.com/fairlearn/fairlearn/issues/244)
- Extending user guides based on questions:
- https://github.com/fairlearn/fairlearn/issues/611
- https://github.com/fairlearn/fairlearn/issues/486
- https://github.com/fairlearn/fairlearn/issues/460
- https://github.com/fairlearn/fairlearn/issues/444
- https://github.com/fairlearn/fairlearn/issues/439
- https://github.com/fairlearn/fairlearn/issues/243
- **API reference & code documentation**:
- Hilde already created two items for API reference: (break down into very small chunks)
- ~~https://github.com/fairlearn/fairlearn/issues/719 reference consistency~~
- https://github.com/fairlearn/fairlearn/issues/720 structure & overview tables - needs more discussion/investigation
- new issue: Create guideline for what "consistency" means
- Two existing items for code documentation (although we could certainly create more):
- https://github.com/fairlearn/fairlearn/issues/232 internals of reductions
- https://github.com/fairlearn/fairlearn/issues/96 constants in reductions
- **visualizations**: Rishabh is taking over the current PR to provide very basic visualizations directly through `MetricFrame`. Beyond that we already have items for
- [Model comparison](https://github.com/fairlearn/fairlearn/issues/666)
- ~~[Support for multiple metrics](https://github.com/fairlearn/fairlearn/issues/667)~~
- [Support for control features](https://github.com/fairlearn/fairlearn/issues/668)
- [AUC by group](https://github.com/fairlearn/fairlearn/issues/758)
- precision/recall by group
- **mitigation techniques**: requested by Hilde (to be added below) and Julia Stoyanovich (NYU, h/t Lucas!) to allow them to use only Fairlearn in their Responsible Data Science classes. Can we create issues for these?
- options: create a wrapper (soft dependency; definitely take a look at the sklearn submodule in AIF360 first) OR reimplement from scratch OR copy-paste (from AIF360 OR the original repos) - see the sketch after this list
- Adrin: prefer PyTorch over Tensorflow
- when creating issues let's make it clear that implementation option (see above) should be decided on first; also add motivation for adding techniques
- ~~F. Kamiran and T. Calders, “Data Preprocessing Techniques for Classification without Discrimination,” Knowledge and Information Systems, 2012. https://aif360.readthedocs.io/en/latest/modules/generated/aif360.algorithms.preprocessing.Reweighing.html~~ #784
- ~~B. H. Zhang, B. Lemoine, and M. Mitchell, “Mitigating Unwanted Biases with Adversarial Learning,” AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society, 2018.~~ #785
https://aif360.readthedocs.io/en/v0.2.3/modules/inprocessing.html
- G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, and K. Q. Weinberger, “On Fairness and Calibration,” Conference on Neural Information Processing Systems, 2017.
https://aif360.readthedocs.io/en/latest/modules/generated/aif360.sklearn.postprocessing.CalibratedEqualizedOdds.html#aif360.sklearn.postprocessing.CalibratedEqualizedOdds
- techniques mentioned in "Practical Fairness" (Aileen Nielsen, O'Reilly book)
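A hypothetical sketch of the "wrapper with soft dependency" option from the list above. It assumes AIF360's sklearn-compatible `Reweighing` class; the exact class name and import path should be verified against the AIF360 docs:

```python
def make_reweighing(prot_attr):
    """Lazily wrap AIF360's sklearn-style Reweighing as a soft dependency."""
    try:
        # Imported only on use, so 'aif360' never becomes a hard dependency.
        from aif360.sklearn.preprocessing import Reweighing
    except ImportError as exc:
        raise ImportError(
            "This wrapper requires the optional dependency 'aif360'; "
            "install it with `pip install aif360`."
        ) from exc
    return Reweighing(prot_attr=prot_attr)
```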
- **clustering**: Lucas/Roman to follow up, for more info see [this discussion](https://github.com/fairlearn/fairlearn/discussions/710)
- Adrin: is there a baseline to compare with?
- **metrics**: only [1 item](https://github.com/fairlearn/fairlearn/issues/676) at the moment (by Manojit), but others might be interesting (see below). Should we create issues for any of these?
- Calibration – mentioned a lot in the fairness literature, but we don't evaluate it anywhere. Obviously, this doesn't quite fit into the MetricFrame contract using `y_true` and `y_pred` (see the sketch after this list)
- Positive/negative predictive value – occurred to me based on reading some of Northpointe's critique of the ProPublica analysis.
- Multi-class classification – this was asked [on StackOverflow](https://stackoverflow.com/questions/66574745/fairness-metrics-for-multi-class-classification) recently
- conditional demographic parity (Wachter et al. paper)
- A meta point: we don’t show anywhere how large the buckets actually are. If we have fairness metrics that are based on a difference between two groups there’s an implicit assumption that these groups are sensible. Nowhere would we currently make the user aware that one group has 100k members and the other one is actually just a single person. Groups are only ignored when there are no members AFAIK. This is super important information and should probably be highlighted (programmatically).
- Adrin: add "count" to list of metrics (which would just automatically add this)
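Two of the points above in one sketch, assuming the `count` helper in `fairlearn.metrics` (added around the 0.7 release): a calibration-style metric can reuse the `y_pred` slot for scores, and including `count` makes the bucket sizes visible alongside every other metric:

```python
import numpy as np
from sklearn.metrics import brier_score_loss
from fairlearn.metrics import MetricFrame, count

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
y_score = rng.random(1000)                  # probabilities, not hard labels
group = rng.choice(["a", "b"], size=1000, p=[0.99, 0.01])  # one tiny bucket

mf = MetricFrame(
    metrics={"brier": brier_score_loss,     # calibration via the y_pred slot
             "count": count},               # surfaces how large each group is
    y_true=y_true, y_pred=y_score,
    sensitive_features=group,
)
print(mf.by_group)  # Brier score next to group size; tiny groups stand out
```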
- **Sklearn compatibility**: we’ve wanted this for a long time. Adrin is clearly the expert on this. Can we compile a full list of incompatibilities and discuss how these should be addressed so that we can create issues accordingly?
- A related issue: https://github.com/fairlearn/fairlearn/issues/665
- Another one: https://github.com/fairlearn/fairlearn/issues/222
- with the May sklearn release this will get easier
- **contrib guide**: Is anyone interested in creating little videos to explain how to get started contributing?
- Adrin: contrib guide is not necessarily the same as first time contributor instructions (How to create a PR? How do I find an issue that I can work on? etc.)
- **Other existing items that need follow-up actions**:
- ~~https://github.com/fairlearn/fairlearn/issues/703 OpenML dataset failure - can we close this?~~
- https://github.com/fairlearn/fairlearn/issues/698 face benchmarking request - the consensus seemed to be that this has nothing to do with Fairlearn and should live somewhere in the MSFT org instead. Can we close this?
- ~~https://github.com/fairlearn/fairlearn/issues/697 Richard fixed this. Closed.~~
- https://github.com/fairlearn/fairlearn/issues/683 The corresponding PR by Richard was closed. It seems like there’s a plan to move forward. Can we put a description of that into the issue? It seems related to https://github.com/fairlearn/fairlearn/issues/645
- https://github.com/fairlearn/fairlearn/issues/543 Should we revive this? (Miro)
- https://github.com/fairlearn/fairlearn/issues/498 Kevin’s original website issue – I feel like we can close this by referring to the discussion(s) that Vanessa opened. Thoughts?
- https://github.com/fairlearn/fairlearn/issues/473 We seem to have dropped the ball here. Miro do you want to follow up? Perhaps deltasun would like to take a stab at this?
- ~~https://github.com/fairlearn/fairlearn/issues/425 closed since it’s a dashboard feature request~~
- https://github.com/fairlearn/fairlearn/issues/316 testing - should we have more items for testing? I personally doubt that anyone will come in and want to write tests for existing functionality.
- https://github.com/fairlearn/fairlearn/issues/269 Richard since the dashboard lives in raiwidgets now, will the group metric set also move elsewhere or what’s the plan?
- https://github.com/fairlearn/fairlearn/issues/235 Can we add some extra info on this? (Miro)
- https://github.com/fairlearn/fairlearn/issues/65 Miro this is waiting on a response from you
- Whenever we finish the **governance** PR let’s also finish up with
- DCO https://github.com/fairlearn/fairlearn/issues/650
- Copyright https://github.com/fairlearn/fairlearn/issues/433 - this mentions SECURITY.md as well. Adrin’s suggestion was to replace it with a simple “please report security issues to <some email address>”, but we don’t have that email address yet. I could ask python.org whether they’re willing to give us another address in addition to fairlearn-internal and fairlearn-announce, or we could allow non-members to send email to fairlearn-internal. BTW these were meant to be used only once the governance is non-MSFT, otherwise they’re fully functional (except that nobody knows about them at the moment).
- SECURITY.md https://github.com/fairlearn/fairlearn/issues/290 - same as above
- When we **remove the widget module** in April let's also take care of
- documentation that refers to it
- this issue https://github.com/fairlearn/fairlearn/issues/608
- note: Roman has a PR practically ready, except that it doesn't have replacement visualizations yet (since those are in the corresponding PR)
- related topic: tags vs. projects vs. milestones
## 03/25/21
- Updates, blockers, or questions from community members
- Manojit presented on fairness harms to DataKind, and is scheduling a workshop on Fairlearn (similar to the talk to PyCon)
- Topics:
- Update on predictive policing case study (Michael A and Bruke)
- Opportunities to use this to teach construct validity
- Possibly using code snippets and/or Fairlearn elements to demonstrate concepts or the limitations of fairness metrics in cases of construct validity issues
- May split into two pieces - one on task definition and the other on construct validity
- Discuss issues
- Volunteers?
- Logistics?
## 03/18/21
- Intros, if needed
- Updates, blockers, or questions from community members
- NA
- Topics:
- Wesley and Manish presenting their pilot study on using fairness toolkits
- Vanessa presenting and getting feedback on current website designs
## 03/11/21
- Intros, if needed
- Updates, blockers, or questions from community members
- Multiple calendar invites - Roman will resolve this
- Abby Jacobs presented the Measurement and Fairness paper at FAccT - videos may be available to non-attendees
- Hanna and Miro presented an MSR [webinar](https://note.microsoft.com/MSR-Webinar-FairLearn-Registration-Live.html) on Fairlearn
- New website is under development, people should give feedback on new [website draft](https://github.com/fairlearn/fairlearn/pull/709#issuecomment-796819274) and [illustrations](https://github.com/fairlearn/fairlearn/discussions/690)
- Meeting topic from GitHub discussion [post](https://github.com/fairlearn/fairlearn/discussions/653#discussioncomment-465224)
- Miro presents his [paper](http://proceedings.mlr.press/v80/agarwal18a/agarwal18a.pdf) on reductions approach to classification
## 03/04/21
- Intros, if needed
- Updates, blockers, or questions from community members
- Fairlearn landing page has a pull request open - please contribute to the wording of content, give feedback, etc
- DataKind's volunteer scoping squad may decide to use Fairlearn
- Discuss implications of Measurement and Fairness paper for Fairlearn
- Given that fairness is contested, showcasing different ways of fairness being operationalized
- Low-hanging fruit: link the tutorial and paper on resource page
- Incorporate language and terminology into Fairlearn documentation etc
- Prompt people to think about these concepts, without jargon
- In documentation / user guide?
- Pop-up questions at different stages in process
- Could include as part of model cards
- Embed materials into end-to-end Fairlearn experience
- Developing materials to help people dive deeper if need be
- Example use case where users may not think there are issues, to demonstrate concepts
- R packages
- psych
- lavaan
- DeclareDesign for CSS research
- Wary of overloading data scientists with these concepts
- May be helpful just to ask them to ask "are we measuring what we think we're measuring?"
- Possible (negative) examples:
- facial recognition predicts political affiliation
- Zillow "walkability score"
- Next steps for construct validity and measurement:
- Open issue for educational material around measurement and construct validity
- Create PR for adding link to tutorial and paper to resources
- Add topics about construct validity and measurement to the roadmap and/or mind-map of educational topics
- Discuss fair clustering [proposal](https://github.com/fairlearn/fairlearn/discussions/710)
- The "what" is more clear here (e.g., based on peer-reviewed papers), but the "why" is less clear
- Would want a 1-paragraph scenario motivating how this might be used in a real context - articulating which construct of fairness this might operationalize, and what that looks like in context
- Discuss model cards
- Dependences may be TensorFlow heavy, which we may want to avoid
- Now on PyPI [here](https://pypi.org/project/model-card/)
- Would be good to have as a separate library under Fairlearn org
- Need to be careful about optics - we should reach out to Google's Model Card team
## 02/25/21
- Intros, if needed
- Updates, blockers, or questions from community members
- Weekly topic (based on GitHub Discussion board posts)
- Discuss Measurement and Fairness [paper](https://arxiv.org/abs/1912.05511) (Jacobs and Wallach, 2019)
- For next week, discuss implications for Fairlearn capabilities, resources, etc
## 02/18/21
- Note on meeting structure
- Intros, if needed
- Updates, blockers, or questions from community members
- Weekly content (based on GitHub Discussion board posts)
- Updates:
- SciPy and PyCon proposals
- Submitted, will hear back in April
- For SciPy tutorial:
- Should start working on developing those materials in the meantime
- Discuss whether there are gaps in Fairlearn tooling needed for the tutorial (or for later)
- Discuss whether notebooks for the tutorial can/should be used for FL later
- Create GitHub project for the tutorial? Or a repo?
- What should format of the tutorial be?
- CoLab? Work collaboratively on it, add comments, then the finished version, attendees can clone the repo
- Would we want slide materials as well?
- You can use presentation-mode so it looks more like a slide ([details here](https://medium.com/@mjspeck/presenting-code-using-jupyter-notebook-slides-a8a3c3b59d67))
- Move the current structure of proposal into CoLab (Triveni has a draft)
- Need to decide on dataset early on - needs to be publicly accessible
- Currently using NY SPARCS [dataset](https://health.data.ny.gov/dataset/Hospital-Inpatient-Discharges-SPARCS-De-Identified/22g3-z7e7)
- Possibly use the ML Failures [dataset](https://colab.research.google.com/drive/16qOk72rNADLgSQ9UMqVUORuG4qrnfK1W?usp=sharing)
- Adrin working on standardizing names of fairness metrics for his work - has that effort begun for Fairlearn?
- Miro working on something similar for metric API over the next few weeks
- How do we balance the complexity of fairness with giving data scientists the tools they need to get started? (e.g., balancing accessibility of developers and risks of "fairwashing")
- Differences between antidiscrimination risk mitigation and unfairness mitigation
- Worked on brainstorming topics to cover in educational materials:
- [Mural link here](https://app.mural.co/t/fate3199/m/fate3199/1613663752472/37ea06b86781eff694c734a0195c742627cdcc2f)
- Next steps:
- Continue this process of generating learning objectives and synthesizing them into topical clusters
- Discuss potential sequencing, as needed
- Convert each topic cluster into an issue or discussion thread
- Create tasks for developing materials for each
- If folks have good links to other educational resources from similar projects (not even necessarily fair-ML), please share links!
## 02/11/21
- Proposals:
- SciPy proposal: moving forward; email Triveni if you want to give feedback
- PyCon proposal:
- discussion of including companies, tools, resources around Responsible AI
- Shared updated proposal summary and outline
- For datasheets, choose some subset of questions to focus on that they can discuss in depth - make an explicit point about the importance of deliberate reflection and discussion for these questions
## 02/04/21
- Roadmap discussion (PR [here](https://github.com/fairlearn/fairlearn/pull/500))
- In Scikit-learn:
- Roadmap typically updated annually
- Roadmap included in documentation
- Priorities generally based on maintainer interest
- Sponsorship from organizations - with interest in particular features; people paid by e.g., a French consortium work on certain topics
- For us:
- Interest in translating docs on fairness metrics, for including in model cards
- Sponsorship not just monetarily
- Suggested topics for roadmap:
- Use cases and/or example notebooks that demonstrate and/or teach fairness concepts in sociotechnical contexts (Michael M, Michael A, Richard, Hilde)
- Including testimonials and/or scenarios of groups who have already used Fairlearn in their work - more detailed than a 1-paragraph summary, more accessible (e.g., notebook) than a white paper (Richard/Miro)
- Although 1-paragraph testimonials are still good, early on, for funding (Adrin)
- Hilde: more interested in educational materials, including:
- abstraction traps
- construct validity
- simulations of long-term fairness
- sub-group discovery
- Need to create a list of the topics we would want to teach in educational materials (Michael A)
- Need to create a scaffold or organizational structure, taxonomy etc, of topics, possibly sequenced curriculum or textbook (e.g., [fairmlbook](https://fairmlbook.org/)) (Hilde)
- API docs: e.g., what do the metrics mean, what are other names for it (Adrin, Hanna, Richard)
- Documentation is up-to-date, explains things well, connects to external resources that give context (Hanna)
- Features that are already in use in other fairness toolkits (e.g., IBM AI Fairness 360, scikit-fairness), e.g., re-labeling, re-weighting (Adrin, Hilde, Hanna)
- Miro: may need to first identify what the features *are* that other toolkits are using (e.g., as in the table in Lee and Singh). (Please add some suggestions to a Discussion post!)
- Approaches to explore construct validity and reliability; see Hanna and Abby Jacobs's paper [here](https://arxiv.org/abs/1912.05511) (Hanna)
- Toolkits to create datasheets, as in Google's model cards toolkit (Manojit)
- Strategy or path for creating learning objectives and learning materials for Fairlearn (Michael A)
- Adrin: Shouldn't need to limit people from creating content or tutorials for Fairlearn, even if it doesn't end up in the library
- Miro: May want some curation process for resources on the website, and governance may include guidance on how to talk about Fairlearn
- Add error bars to metric frame (Miro)
- Add counterfactual evaluations; e.g., with censored data in, e.g., recidivism, admissions, [child welfare screening](https://arxiv.org/pdf/1909.00066.pdf) (Miro)
- Including metrics for counterfactual fairness, potentially including causal modeling (Miro, Hilde)
- Website revision (Miro, Michael M)
- Next steps:
- Add this as a PR
- Priorities may not matter too much, since people may work on things of their own interest anyway
- metric_frame API
- Certain metrics don't take y_true and y_predicted
- Should all of our arguments be keyword only?
## 01/28/21
- Next release
- Richard will start a GitHub discussion thread to discuss these:
- 0.6 or 0.5.1
- Is changes.md complete?
- Viola's resampling change may not be
- Streaming metrics may not be
- Governance proposal should go in - Miro is working on this, but may or may not be ready for this release
- Switching to main branch (issue [#477](https://github.com/fairlearn/fairlearn/issues/477))?
- Roman and Richard will coordinate on this
- Each PR should have a maintainer responsible for following up, clarifying issues, making sure people who open PRs don't feel like they've fallen through the cracks
- Roman will assign these
- Discuss SciPy and PyCon proposals, and identify who wants to be involved
- Discussion [here](https://github.com/fairlearn/fairlearn/discussions/674)
- Goal is to have the website updated before the conference, so the audience can figure out where/how to contribute
- PyCon
- Discuss 3 types of harms
- Discuss Responsible AI pipeline, and how fairness assessment and mitigation fits in
- Discuss how other RAI tools (datasheets, fairness checklists (e.g., [Deon](https://deon.drivendata.org/), Microsoft's [fairness checklist](https://www.microsoft.com/en-us/ai/responsible-ai-resources?activetab=pivot1:primaryr4), etc) fit into the pipeline
- Use case for assessment and mitigation for quality of service
- Have we already run mitigation approaches for word error rate?
- Mention WER as a concrete example, but may need to change to a different example later
- Making sure to ground quality of service harms in particular social context and make clear how WER disparities may have real consequences for real people (e.g., accessibility, courtroom transcription), before widening the lens to the end
- Questions:
- Discuss Fairlearn's future?
- Invitation to contribute? (either as part of PyCon, or mention modes of contribution)
- May flag gaps and priority areas for functionality - but not do this in a way just tied to Fairlearn, but for other fairness toolkits too
- To what extent do we mention affiliation with Microsoft?
- Talk contributors:
- Not everyone is on the call, so if people are interested, talk to Manojit and Miro on Gitter and GitHub discussion forum
- SciPy [tutorial proposal](https://github.com/fairlearn/fairlearn/discussions/674#discussioncomment-306760)
- Classification or regression? (e.g., binning values into categories)
- Making sure that the use case is credible, and walks through each stage of the pipeline
- Triveni will add a more comprehensive description of the use case at the top of the proposal
- Manojit will discuss with Triveni about fleshing out some elements (e.g., datasheets)
- Triveni will create/share Google Doc link
- Question:
- What are the learning objectives, and is it worth making those explicit in the talk?
- Understanding how to ground concepts in use cases, and how to do that practically
- Triveni will write these out in the proposal
- Contributors:
- Triveni, Miro, Manojit, Michael, Lisa, possibly Roman?
## 01/21/21
- Manojit's PyCon talk proposal (talk length: 30 minutes):
- half the talk on sociotechnical aspects of fairness
- half the talk for a case study
- goal is to raise awareness of Fairlearn and tackling fairness responsibly, because there's a separate sprint at PyCon which is for (potential) contributors
- Triveni's SciPy tutorial proposal (Feb 9 deadline)
- options:
- [diabetes patients data](https://www.hindawi.com/journals/bmri/2014/781670/) -> does blood test data correlate with whether someone was readmitted within 30 days? The particular blood test measure varies across genders and ethnicities, which needs to be accounted for.
- something related to medical risk score lab (based on Obermeyer et al. paper)
- [NY SPARCS dataset](https://health.data.ny.gov/dataset/Hospital-Inpatient-Discharges-SPARCS-De-Identified/22g3-z7e7) on hospital admissions from 2017, demographic information seems to be limited; predict whether someone needs care in the future and potentially provide
- do we have a domain expert to validate our assumptions?
- [Colab](https://colab.research.google.com/drive/16qOk72rNADLgSQ9UMqVUORuG4qrnfK1W)
- Manojit: creating a datasheet for the datasets could be a great first step as well
- reminder for Roman to explore non-Teams meeting tech options
- Google Meet
## 01/14/21
- Next steps for conference:
- Spec out outline for talk and tutorial
- Practice talk and tutorial
- Discussion of governance model [PR](https://github.com/fairlearn/governance/pull/1)
- Desire to simplify lawyer jargon, but this may take more time (e.g., scheduling a consult from a lawyer) (Adrin)
- Another issue: the information that would clarify the workflow of contributions is largely missing from the governance documents (Adrin)
- Website update
- Maintainers + Manojit will meet with website designers
- Potential conferences to present at:
- PyCon US
- Data scientists, front-end, DevOps
- Potential agenda:
- Half: sociotechnical aspects of fairness (e.g., types of harms; abstraction traps, etc)
- Half: presenting example use case and/or notebooks
- SciPy
- Format: talk (25ish minutes) vs. tutorial (3 hours)
- Unclear whether we have sufficient content for a 3-hour tutorial
- Use cases should be 1) credible and 2) supported by Fairlearn
- Audiences: Data scientists and ML
- More focused on how to use Fairlearn (30/60 vs 50/50) than for PyCon
- Need concrete examples for a tutorial (maybe 2-3 case studies)
- Triveni has [tutorials](https://blog.dataiku.com/introducing-the-responsible-ai-in-practice-series-and-healthcare-use-case), but with a different tool, so would need to be adapted for Fairlearn
- Could be a good candidate to include on the website, even if it doesn't fit
- Deadlines:
- February 9th?
- Potential decision:
- Talk at PyCon, given the shorter timeline (May conference) and more general audience (DevOps, front-end)
- Tutorial at SciPy, given the longer timeline (July conference) and more data science-focused audience
- We have a current 90-minute tutorial
- Lending scenario, speech-to-text example
- But purely conceptual, with no hands-on content
## 01/7/21
- Current website structure:
- User Guide
- API Docs
- Contribute
- Currently proposed website structure (in wireframes):
- Fairlearn home page
- Use cases
- Get Started
- Documentation
- Quickstart
- User Guides
- API References
- Example Notebooks
- Contributor Guide
- Community
- Current content formats:
- Example notebooks:
- Smaller code snippets
- This should likely be in the user guide
- Use cases / scenarios:
- To teach or demonstrate higher-level concepts or ideas
- Challenges:
- Unclear structure for example notebooks, despite contributor guidelines
- Discoverability
- Multiple file formats:
- using small code snippets in the middle of larger text (e.g., .rst files)
- larger code examples (.py or .ipynb files) that do not go into docstrings
- Proposed format distinctions:
- User guide as a tutorial or course in *how* to use the library, including the fairness concepts included in the package
- Possibly starred sections to flag specific items
- API docs may need to be richer, but should not be tutorials
- Should have a clear definition of concepts, with links to user guide for more detailed explanations, and example notebooks to demonstrate
- Exemplars:
- [Spacy.io](https://spacy.io/)
- Look into the spaCy "universe" for other resources developed for or with spaCy
- [Pandas](https://pandas.pydata.org/)
- AI Fairness 360
- scikit-learn
- Aequitas Toolkit
## 12/17/20
- Feel free to suggest new discussion topics for future meetings on the Fairlearn GitHub Discussion board [here](https://github.com/fairlearn/fairlearn/discussions/653)
- DataKind is interested in potentially using Fairlearn as part of their [Center of Excellence's](https://www.datakind.org/blog/introducing-datakinds-center-of-excellence) strategy for evaluating projects for bias.
- TODO: follow-up conversation with DataKind
- Researchers at Columbia University's [SAFELab](https://safelab.socialwork.columbia.edu/) want to know if we can replicate the methodology and results of [Detecting and Reducing Bias in a High-Stakes Domain](https://www.aclweb.org/anthology/D19-1483/) using Fairlearn.
- classify tweets into aggression, grief, or "other"
- Potential need for:
- probability divergence metrics (e.g., KL divergence; see the sketch below)
- counterfactual analyses, to ground in existing use cases (e.g., Chouldechova)
- Potential risk of using complex ML methods for a task that might not need such an approach
- If people are interested in collaborating, open an issue on GitHub
- Potentially around the metrics needed, or a use case that demonstrates how to use that metric
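A sketch of the probability-divergence idea above (an assumed helper, not an existing Fairlearn metric): KL divergence between two groups' predicted class distributions for the aggression/grief/other task:

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes KL(p || q)

def groupwise_kl(pred_labels, groups, classes=("aggression", "grief", "other")):
    """KL divergence between the predicted class distributions of two groups."""
    dists = []
    for g in np.unique(groups):
        counts = np.array(
            [(pred_labels[groups == g] == c).sum() for c in classes],
            dtype=float)
        # Laplace smoothing so empty classes don't make the KL infinite.
        dists.append((counts + 1.0) / (counts.sum() + len(classes)))
    p, q = dists[:2]  # compare the first two groups
    return entropy(p, q)
```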
- [ML Failures Labs](https://daylight.berkeley.edu/mlfailures/)
- We discussed:
- value of articulating learning goals and outcomes (though this was for students, not practitioners)
- importance of contextualizing the dataset (which some of the lab notebooks do better than others)
- the difference in the [loan approval lab](https://colab.research.google.com/drive/1J9YZHEqRWNmNje9q8xmPxUWns15O86A9#scrollTo=gnb6oWTGebot) between the rich historical context about loans and housing in Berkeley, and the Kaggle dataset where the data is not necessarily from that context
- Think about how to (subtly) flag decision points or directions for further, deeper reading and thinking throughout the notebooks and scenarios, without overwhelming practitioners
- Think about some use cases where measurement is the focus, and mitigation may not be part of it
## 12/10/20
- invites - please delete the old invite if it's still on your calendar
- governance: need more clarity on development workflow/operations (who merges, etc.), who new contributors can reach out to; is it okay to continue someone's PR?
- MD: we have some of this information at various places, we need to make it more easily accessible
- DCO: developer certificate of origin has been a huge pain point for every contributor. With the changing governance we have the option of changing/removing this requirement.
- https://github.com/fairlearn/fairlearn/issues/650
- There is a new feature called "GitHub Discussions". This could be useful for anything that's not really an "issue" per se, and allows us to convert existing issues to discussions. https://docs.github.com/en/free-pro-team@latest/discussions
- Miro: what content should be on website vs. repo?
- can we transfer questions into an FAQ on the website?
- Adding streaming metrics: @Frederic raised this feature request. In Deep Learning predictions are batched, so metrics are updated with each batch. With Fairlearn that's not currently possible. Could we extend metric functionality to this scenario?
- Richard: can we follow an established pattern like we did for scikit-learn metrics?
- an `update` function could handle that (sketched below)
- Depending on the size of `y_true`, `y_pred` this may still work, but if those vectors are too big we need something more
- Frederic to open issue
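A hypothetical sketch of the `update` pattern discussed above (no such API exists in Fairlearn today): accumulate batches, then build a `MetricFrame` at the end. As noted in the meeting, this still buffers the full vectors, so very large streams would need true incremental aggregation:

```python
import numpy as np
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame

class StreamingGroupMetric:
    """Collects batched predictions for a single MetricFrame evaluation."""

    def __init__(self, metric):
        self._metric = metric
        self._y_true, self._y_pred, self._groups = [], [], []

    def update(self, y_true, y_pred, sensitive_features):
        # Buffer each batch; memory grows with the stream, which is the
        # limitation raised above for very large y_true / y_pred vectors.
        self._y_true.append(np.asarray(y_true))
        self._y_pred.append(np.asarray(y_pred))
        self._groups.append(np.asarray(sensitive_features))

    def result(self):
        return MetricFrame(metrics=self._metric,
                           y_true=np.concatenate(self._y_true),
                           y_pred=np.concatenate(self._y_pred),
                           sensitive_features=np.concatenate(self._groups))

# Usage: sm = StreamingGroupMetric(accuracy_score); call sm.update(...) once
# per batch during prediction, then read sm.result().by_group at the end.
```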
- upcoming release of Fairlearn: Let’s get on the same page w.r.t. what’s missing and find a target date
- mitigations for control features
- complete plotting PR(s), then do a release -> widget removal
- Adrin: wait a few months to fully remove
- Roman: correct the deprecation warning to say "going away after the next release"
- AIF360 recently started wrapping our mitigation techniques. We can discuss if there’s anything similar (or very different) we’d like to do.
- Miro: comparison between toolkits could be useful on website including metrics, mitigations (there are some differences)
- Miro: contributions should be motivated by a use case (regardless), if one comes up where a wrapper makes sense we can consider that
- Adrin: you need to create a dataset for that; the overall wrapping process is not simple, and they don't use the scikit-learn API
- Hilde: implementation from scratch not too hard
- website: we'll get a proposal on what the website might look like in the future soon
- Bruke: Can we improve (or even just measure) usability of the tool & the documentation? --> continue discussion next time
- pypi downloads
- #stars
- interactions on GitHub, Gitter
- ...
## 12/3/20
- Discussion of predictive policing and criminal justice scenarios (Michael Amoako, Bruke Kifle, Hilde Weerts)
- Goal of the predictive policing case study was originally to highlight risks of using data with historical biases to make decisions
- Goals of COMPAS tutorial:
- problem of fairness through unawareness;
- issues of construct validity;
- risks of abstraction traps, feedback loops;
- problems of making decisions about individuals based on group statistics
- biases introduced when humans make decisions on the basis of the model's output
- Potential revision to combine both the policing and COMPAS case studies into a larger discussion of risks in criminal justice
- Concerns about:
- generalizing the learning goals too much by combining them
- the risk of an abstraction trap, or losing the specificity of the sociotechnical context by combining these two
- Potential approach:
- Choose a scenario, then identify possible harms that may arise through it
- OR, choose a set of potential focus areas or learning goals, then choose a scenario to communicate those
- Comments/questions:
- May need a section in the tutorials that is explicit about the mapping between the two scenarios, and about the underlying ideology, for someone looking for a scenario about that ideology
- What is the intention behind these scenarios?
- One idea: a data scientist working in this domain might learn something about it
- Alternatively, teaching specific concepts in the fairness space, using this context as an example
- And the need for either of those to be sure to ground the write-up in what is known in the literature for that domain and the specific sociotechnical context
- The value of open-source is that it allows for corrections and revisions to the scenarios from experts
- Need to be explicit about who the audience is (not just data scientists, generally)
- Helpful to be explicit about the intended audience and intended purpose of the scenario (e.g., marketing, educational); if educational, be explicit about the prioritized learning objectives at the beginning of the pull request
- What is the purpose of this call?
- Possible: a synchronous space for members of the community to come together to discuss matters of shared concern, in addition to the asynchronous participation on GitHub and Gitter
- Could be a space for users to come ask questions on how to use it?
- Process for suggesting agenda topics for the future?
- Post on a GitHub wiki?
- Use a single GitHub issue for meeting agenda topics?
- Gitter? (but may get lost in the discussions)
- Types of meetings
- Discuss open GitHub issues
- Feedback sessions on work people have done
- High-level discussions about project (e.g., governance, roadmap, etc.)
- Hands-on activities/workshops
- Brainstorming framing traps in potential domains
- Opportunities for research presentation and feedback
- Open Q&A with users
## 11/18/20
- Discussion of governance structure, led by Miro Dudik
- Notes on the governance structure will be in a pull request
- Discussion of a tutorial on fairness in ML, led by Hilde JP Weerts
- Motivated by a need to provide training on ML fairness approaches to practitioners
- Choice of COMPAS problem framing - ubiquitous dataset and problem in fair ML
- Structure:
- Context for pre-trial risk assessment
- Initial pass on models and metrics
- Removing sensitive features
- **Should include why this is problematic**
- Elements to include
- Should highlight the real harms to real people
- Should emphasize the experience of people who go through the system
- Problematizing construct validity of data (e.g., arrests != crimes; how race is measured and constructed)
- Predictions of the model are not equivalent to the output of the system
- Asking whether the problem is appropriate for machine learning (may be critical to ask at the beginning, but it may not be best for data scientists to read that first)
- Question:
- How do we give practitioners these tools and insights beyond just "think harder"?
- How do we prevent practitioners from feeling helpless when they encounter these issues?
- Next steps:
- To create a set of separate tutorials, including modeling COMPAS scores
- Comments:
- Applicability for policing scenario [pull request](https://github.com/fairlearn/fairlearn/pull/634)
- [Critique](https://www.uscourts.gov/federal-probation-journal/2016/09/false-positives-false-negatives-and-false-analyses-rejoinder) of ProPublica task may be relevant
- What are the educational goals here?
- Introduction to fairness metrics
- Show issues arising when trying to mitigate
- Show how the focus should expand beyond the predictive model alone
- Teaching by analogy - demonstrate this process in one domain, while highlighting how practitioners might be able to apply it in their own domain
- Are the chosen features valid, are they arbitrary, etc
- Highlight generalizable elements:
- How features are derived, etc
- Raising the need for new features:
- Error bars
## 11/12/2020
Developer call
Scribbler: `Richard Edgar <https://github.com/riedgar-ms>`_
Attendees:
- `Michael Amoako <https://github.com/michaelamoako>`_
- `Miro Dudik <https://github.com/MiroDudik>`_
- `Richard Edgar <https://github.com/riedgar-ms>`_
- `Adrin Jalali <https://github.com/adrinjalali>`_
- `Michael Madaio <https://github.com/mmadaio>`_
- `Diondra Peck <https://github.com/diondrapeck>`_
- Alex Quach
- `Hilde Weerts <https://github.com/hildeweerts>`_
- `Viola Zhong <https://github.com/violazhong>`_
Agenda
- v0.5.0 Release
The v0.5.0 release was finally sent to PyPI earlier this week.
This has the new metrics API and various other enhancements, and deprecates
the widget. Our documentation should now match the version on PyPI
as well.
- Governance
Miro summarised a governance proposal based on those for InterpretML.
Basically, there would be a steering committee, and maintainers for
the various projects which make up Fairlearn.
This led to a lively discussion, with particular concerns being raised
by Adrin and Hilde.
Hilde was concerned that the proposal still looked very Microsoft
dominated, and that there would be little change in how things worked.
Would the community at least be asked to nominate people to the
steering committee?
Adrin brought perspectives from scikit-learn. Some key points were:
- The technical committee for scikit-learn is a subset of the
maintainers. Adrin feels that this helps build community
- Although the technical committee formally owns the trademarks etc.,
it is subordinate to the maintainers
Adrin also raised the point that the process for changing the governance
documents needs to be clear - although it may mechanically be a
pull request, the bar is going to be higher.
It was also unclear how the concepts discussed would map onto the
facilities provided by GitHub.
Miro is going to work on formal governance documents, and put out a PR.
Feedback is *really* encouraged for this - whether as PR comments, or
emailing Miro directly.
## 11/05/2020
- Discussion of paper on classifier of chest X-rays as a possible case study
- Use of insurance type as a proxy for SES - validity of this?
- No intersectional analyses
- TPR disparity
- How to raise or address larger sociotechnical issues?
- From a recent [blog post](https://lukeoakdenrayner.wordpress.com/2019/02/25/half-a-million-x-rays-first-impressions-of-the-stanford-and-mit-chest-x-ray-datasets/)
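To make the TPR disparity point concrete, here is a minimal sketch of computing per-group true positive rates with Fairlearn's `MetricFrame` (assuming the current all-keyword API, which was still settling at the time of this call, and toy data standing in for the paper's):

```python
import numpy as np
from sklearn.metrics import recall_score
from fairlearn.metrics import MetricFrame

# Toy stand-ins for real labels, predictions, and patient groups.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 1])
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

# TPR is recall on the positive class, so recall_score works as the metric.
tpr = MetricFrame(metrics=recall_score,
                  y_true=y_true,
                  y_pred=y_pred,
                  sensitive_features=group)
print(tpr.by_group)      # per-group TPR
print(tpr.difference())  # largest between-group TPR gap
```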
## 10/29/20
- Presentation of predictive policing scenario from Michael A and Bruke
- Delivery of scenario:
- For predictive policing - recent criminal justice reform hackathon, or folks from the MSFT criminal justice reform initiative (e.g., Merisa)
- Getting input from domain experts prior to publication (seems to be preferred), or with revisions via GitHub issues post publication
- Process proposal:
- PR to publish, with comments and issues in discussion there
- white paper going through review process, then PR for publishing on the website
- Separate meetings with small groups to work on scenarios (e.g., Michael and Bruke), using weekly meetings to:
- Get input on scenario details
- Solicit input about which SMEs to contact and get connections
- Delivery format:
- notebook or blog post?
- Both? e.g., markdown with code (e.g., Python blocks)
- Synthesizing code and text - need to think about goals of data scientists coming to Fairlearn
- Do we need to highlight Fairlearn in each use case? (possibly not, if it's not relevant)
## 10/22/2020
- Identify lessons learned from the scenarios for candidate screening and predictive policing, and what can we take away from those to inform future content or the contributor guidelines?
- Construct validity
- Importance of choice of dataset
- Datasets available in the Fairlearn package may not be appropriate for every use case
- How do we get users to think about whether their dataset is appropriate for their context?
- This issue has been raised before [here](https://github.com/fairlearn/fairlearn/issues/583)
- Talking points about each case study: see Kevin's discussion of this [here](https://hackmd.io/nDiDafJ6TMKi2cYDHnujtA#A-first-draft-format)
- Can't possibly enumerate all possible contexts in scenarios, so each scenario might exemplify certain broader issues
- Challenge of communicating about issues
- Both among our groups, as well as communicating with practitioners
- Need for a shared language about best practices
- Put the 2 scenarios that we have developed on the website, with language and caveats for concerns
- Concern about validity of entire dataset vs. validity of certain labels
- Goal of the case study:
- Highlight that the problem formulation is problematic irrespective of the dataset
- Importance of considering the dataset
- Who are we making these scenarios for, and why, and what impact might they have on the world?
- Making resources for the users may be very different than making resources for the Fairlearn contributor community to understand the problem space of fairness as sociotechnical
- Critical to recognize that, by analogy, data scientists in other domains may encounter a scenario in another domain and learn something about *sociotechnical fairness* and how they might identify those issues and address them in their own domain
- We may need to be explicit that this translation is the goal
- And create methods for evaluating the educational outcomes, and change on users' knowledge and behavior
- Discuss a process for contributing new content (e.g., scenarios like those, but also example notebooks, etc)
- Note: may want to put content on the website with caveats or stubs (in the Wikipedia sense) for a larger discussion of the complexity of the issue
- May want to start with the goal or point we're trying to make, and develop scenarios for those, vs. choosing a context or use case
- Who should be empowered and enabled to contribute content/resources (scenarios, use cases, example notebooks)?
- One approach: seed with some set of example scenarios or use cases from the community that model what "we" (who?) want to see from those, then allow anyone who encounters Fairlearn to contribute
- Concrete proposal (from Miro): bring back [the example notebooks](https://github.com/fairlearn/fairlearn/pull/513) that were generated
- Who should contribute to them?
- "We as a community should feel ownership" and create pull requests to modify existing use cases
- Who would review or vet the proposed use cases, and how?
- Given the complexity of the domains, the sociotechnical context, and the nuances of AI ethics
- One proposal: the community (of users, contributors, others, etc?) provides feedback to each other, help each other learn and grow
### Notes
- Potential for Fairlearn to scaffold thinking about operationalization, construct validity, reliability
- Need to make sure that contributors feel like they're using their time productively
- Creating a shared language, understanding
- How might we concretize these learnings and shared language and understanding for when new members join?
## 10/15/2020
- Presentation from Michael and Bruke about a predictive policing scenario
## 10/08/2020
Developer Call
Scribbler: `Richard Edgar <https://github.com/riedgar-ms>`_
Attendees:
- `Miro Dudik <https://github.com/MiroDudik>`_
- `Richard Edgar <https://github.com/riedgar-ms>`_
- `Adrin Jalali <https://github.com/adrinjalali>`_
- `Diondra Peck <https://github.com/diondrapeck>`_
- Alex Quach
- `Kevin Robinson <https://github.com/kevinrobinson>`_
- `Mehrnoosh Sameki <https://github.com/mesameki>`_
- `Hanna Wallach <https://www.microsoft.com/en-us/research/people/wallach/>`_
- `Hilde Weerts <https://github.com/hildeweerts>`_
Agenda:
- Requirements for next release
- Still need new metrics API
- Need to announce deprecation of ``FairlearnDashboard``
- Need to make sure Quickstart will actually work
(short term fix: put note about running `pip install` against GitHub at the top)
- Roman is writing up a user guide for ``ThresholdOptimizer`` (a minimal usage sketch follows below)
To clear up confusion over the ``FairlearnDashboard``: it has been updated with a
``flask`` backend, but we have rolled back the major UI change which had been put
in (after the ``v0.4.6`` release). We plan on deprecating the ``FairlearnDashboard``
within Fairlearn, since its code has been moved to an explicitly Microsoft-owned
repository (which will be open sourced Real Soon Now) due to
Azure-imposed requirements on that code. The Fairlearn community will then be
free to add any desired visualisations within Fairlearn in the future.
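Since the ``ThresholdOptimizer`` user guide was still in progress at this point, here is a minimal, hedged usage sketch with toy data; the exact constructor arguments have varied across releases, so treat this as illustrative rather than definitive:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.postprocessing import ThresholdOptimizer

# Toy data standing in for a real classification task.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)
A = rng.choice(["a", "b"], size=200)  # sensitive feature

# Post-process an already-fitted estimator so that group-specific
# decision thresholds satisfy the chosen fairness constraint.
base = LogisticRegression().fit(X, y)
postprocessor = ThresholdOptimizer(estimator=base,
                                   constraints="demographic_parity",
                                   prefit=True)
postprocessor.fit(X, y, sensitive_features=A)
y_adjusted = postprocessor.predict(X, sensitive_features=A)
```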
- Governance
An update from Miro. The governance documents probably won't be ready for the
October NumFocus application deadline, but Microsoft will hopefully separate
Fairlearn out into a neutral organisation during October. This will remove the
need for things like `Security.md`, which are not particularly open-source
friendly.
- Naming the metrics API
With the basic API for the new style metrics agreed, we need to decide on names for
everything. This is currently
`an open issue <https://github.com/fairlearn/fairlearn-proposals/issues/17>`_
and *really* needs to be brought to resolution.
We had a lengthy discussion. There was a feeling that `MetricFrame` would be better than
`GroupedMetric` for the whole class, but we did not agree on alternatives for
`conditional_features=`. At Kevin's suggestion to focus on how users will first encounter
the code, Richard will rewrite the example code
introducing the new metric API to be focussed on the sociotechnical side, rather than
just an API demonstration. We will also reach out to Solon Barocas.
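For later readers: `MetricFrame` is the name that shipped, and the concept debated here as `conditional_features=` appears in the released API as `control_features=`. A minimal sketch with toy data (assuming the released API, not the code as it stood at the time of this call):

```python
import numpy as np
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame

# Toy labels, predictions, a sensitive feature, and a control feature.
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 0, 0, 1, 1, 1, 1])
sex = np.array(["F", "M", "F", "M", "F", "M", "F", "M"])
region = np.array(["N", "N", "N", "N", "S", "S", "S", "S"])

# Accuracy disaggregated by sex, computed separately within each region.
mf = MetricFrame(metrics=accuracy_score,
                 y_true=y_true,
                 y_pred=y_pred,
                 sensitive_features=sex,
                 control_features=region)
print(mf.by_group)  # one accuracy value per (region, sex) cell
```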
## 10/01/2020
### Next Thursday
- Think about how we reach and get input from users on their needs (both for Fairlearn as well as the website)
- Decide on an initial place and process on the website for proposing new content and creating pull requests to edit content
- Identify takeaways from the [candidate screening scenario](https://hackmd.io/GMli82s7SxORABkabCgw8Q#3-Sociotechnical-context)
- Discuss scenario authorship logistics
- ...anything else?
### Notes
- Decision: Prioritizing and focusing on data scientists first
- Process:
- Getting data scientists to:
- provide input on what they need
- give input on prototypes
- Types of content:
- User guide and example notebooks
- How to use current tools
- Domain-specific scenarios
- Ways to think about fairness beyond tools
- Real world examples or case studies of:
- people building systems and how they mitigate risks
- organizational constraints they might or have run into
- Structure of website:
- Organized around data science process
- Given in bite-sized chunks
- Goals of content:
- Teach them:
- how to use Fairlearn 0.4.6
- Communicate:
- what value they might get from Fairlearn
- Shape:
- their current systems and design processes
- Framing of how we communicate fairness has implicit assumptions around what our goals are, for which audiences, etc [-Kevin]
- Intended audiences / personas
- Potential *consumers/users* of Fairlearn
- Data scientists and ML developers/engineers
- Data journalists, citizen auditors, regulators
- Customers of Azure ML
- **Comments:**
- These might be more appropriate from the AzureML website? [-Miro]
- Users may be coming from different platforms (e.g., GCP) [-Mehrnoosh]
- Business leaders
- **Comment:** Are these folks actually likely to use it? [-Miro]
- Researchers
- Cloud solution architects, consultants
- Potential *contributors* to Fairlearn as an OS project
- Potential needs of personas
- Data scientists creating reports for bosses
- Auditors, regulatory bodies, or data journalists creating reports or audits
- **Comments:** May be a distinction between model builders (e.g., data scientists), model "breakers" (e.g., auditors), and model consumers
- Documenting and storing models
#### Comments:
[-Richard E]: This from Hilde should be a big concern: "ah, that's just another single algorithm implemented solely for a particular paper, let's find a package that's more substantial"
- In other contexts, I've found myself looking at a piece of software and thinking "is this something substantial and supported, or is this just enough to let someone write up their thesis?"
- I suppose we need to consider that the experience is on github.com/fairlearn/fairlearn, as well as fairlearn.github.io. People might find the former first, and we need to make sure that the project page communicates enough to avoid the reaction Hilde describes. Same for the PyPI page (which is basically the same as the github.com/fairlearn/fairlearn page)
[-Roman L]: A place I’d really like to get to is that everyone knows where to put content that they want to add. One way to get there is to add a new section where each new case study or blog can go (sort of a table of contents) and then people can add PRs to suggest added content. That’s definitely not ideal yet but it’s a state from where we can iterate and people can actually share ideas and comments as opposed to wondering what it may look like eventually.
- FWIW I’m not a huge fan of the landing page but that doesn’t need to hold us back. Let’s define an initial place to add content to get people contributing. In parallel let’s rethink how the landing page can point folks to the right content and iterate.
[-Bruke K]: Are there currently any open communication channels with users (slack, community calls, etc.)? (gitter.im/fairlearn/community)
[-Vanessa M]: In working with DS on other projects, bite-size visuals for before/after in the given blog/article scenario really helps. Pending the article type - case study vs. use case because I think the content structure is very different, that interactive modules in context is also helpful
[-Lisa I]: Think about what the initial engagement is into the website and then how the value and engagement can change over time.
[-Hilde W]: The primary audience should be data scientists and I think creating useful content for just that audience is already quite a big task :) And in order to create such content we need more expertise than data science expertise alone. What I would love to have on the website is examples of how to use the tools/blog posts in which we dive deeper into sociotechnical issues/blog posts that cover systems people have actually built and how they mitigated risks in practice. Additionally I think it would be cool to structure this around the data science development process and make sure we add tags for specific domains (both technical domains and business domains)
- [-Miro D]: yes! i'm hoping that notebooks & worked-out scenarios (blog posts?) will be helpful in that regard, but there might be other avenues? (though this is an open question)
- Random other note: I do think the technical tools are important to 'lure' in data scientists
- Right now I think data scientists might skip fairlearn compared to other libraries such as aif360 because we have only a few mitigation techniques.
- something like that might be addressable in a FAQ? The basic issue is that AIF360 is even more technical and I have found it extremely hard to justify many of those algorithmic approaches from the socio-technical perspective. (-Miro)
- Honestly I doubt people will read the FAQ. E.g. when I first looked at the Fairlearn project I was like "ah, that's just another single algorithm implemented solely for a particular paper, let's find a package that's more substantial" (-Hilde)
- Also, at the moment our own examples/user guide do not really tell anything about in which scenario a particular technique would mitigate harms, do they? I'm definitely not in favor of just adding random algorithms, but I do think it's important just to lure people in honestly xD. Otherwise they won't hang around for our awesome blog posts
[-Kevin R]: Enormous list of stakeholders that surpasses our capacity to reach them. May be a distinction between people who add value (e.g., contributors) and those who find value in it (e.g., "users").
- being a github repo is a really unique strength vs. how other groups approach this
[-Abbey L]: How about customers, partners as intended audiences? What about our CSAs [cloud solution architects], they might not be data scientists, but would work on customers on projects that could benefit from fairlearn.
- Can we bring in a few data scientists next week and have them give their input on their needs?
[-Michael A]: How big is the audience of those who actually want to contribute to Fairlearn (since it's open source) and who does that tend to be? And how much of an emphasis should be placed on that stakeholder group?
[-Mehrnoosh S]: The paper I was referring to is: Human Factors in Model Interpretability: Industry Practices, Challenges, and Needs - https://arxiv.org/pdf/2004.11440.pdf. Again, heavier on the interpretability side, but it touches on fairness issues as well.
### Agenda
- Introductions
- Discuss how communicating fairness as sociotechnical might fit into the [website](https://fairlearn.github.io/contents.html) structure (e.g., scenarios)
- Identify next steps for the [candidate screening scenario](https://hackmd.io/GMli82s7SxORABkabCgw8Q#3-Sociotechnical-context), to turn that into something publishable on the website
- Discuss authorship logistics for scenarios
- ...anything else?
- Next steps
## 09/24/20 Meeting Notes
### Next steps
- For next week
- Identify next steps for the candidate screening scenario, to turn that into something publishable on the website
- Come with ideas on how sociotechnical scenarios might fit into the website structure
- Discuss authorship logistics for scenarios
- ...anything else?
- Directions to move forward on:
- Case studies / example notebooks / scenarios
- To what extent do we want notebooks to use the Fairlearn codebase? (possibly, but not exclusively?)
- Do we want notebooks or blog posts? (either/both?)
- Contributor guide (e.g., specifics on what we mean by treating fairness as sociotechnical; criteria for a good notebook)
- Documentation / user guides
- Website design / structure
### What does it mean to treat fairness as a sociotechnical challenge?
- Consider real harms affecting real people
- Lot to be learned from existing case studies (e.g., COMPAS, predictive policing) – “education is the best form of empowerment”
- Acknowledge examples where tech has “gone left”
- Models are a simplification of reality
- Can we get back to where the data comes from in the world? Talking to impacted people?
- Be aware of – and avoid – abstraction traps
- Be aware of context (from collecting data all the way to the deployment context)
- Ensuring there is a variety of diverse humans involved in the process
- Understand the specific actors and institutions of the social context, as well as the political, legal, and historical context
- Understand the power dynamics between different actors in that social context
- Thinking about your model as embedded in dynamic systems (why certain features are there and why they aren’t)
- this is the “pre” part; “post” is how it might get used
- What is the impact of false positives or false negatives? What is the “journey” of those?
- What is the impact on society – what are the implications for people they’re affecting?
- Redefining performance metrics – not just speed/accuracy, but expanded to include fairness (and privacy, equality, etc)
- Writing and talking about fairness is key – translational work to connect to people outside their role
### How might the Fairlearn project “model the value that fairness is sociotechnical”?
- Improve use cases
- Perception that existing example notebooks didn’t communicate the stated values of the project
- Could be shipped Python code, educational notebooks/materials, documentation
- Include contextualized data
- Include cases where you might not want to use Fairlearn, and how you might want to talk about fairness
- What might be missing within an application of Fairlearn? And what might they do instead?
- Notebooks that prompt action-oriented reflection on social context and power relations
- In ways that are actionable for data scientists with limited bandwidth
- That model how to explore that social context and document sociotechnical aspects of the problem space (including cautions, limitations, etc)
- Improve structure of the website
- Sociotechnical talking points
- To give data scientists the language (and tools?) to talk about how the social context shapes their model, data, definition of fairness, etc
- Or guidance to talk about what is out of scope for their role, but may be in scope of someone else (e.g., PM)
- Think about potential differences between:
- Communication / rhetorical materials to communicate value of FL
- Educational materials to teach concepts or features
- Improve toolkit / documentation / user guide
- To “tie in sociotechnical issues better”
- Consider who our users or audience are:
- Early adopters “ready to drink from the firehose”
- Broader mainstream
- **Comment**: Is this discussion only for example notebooks that communicate the value prop for Fairlearn 0.4.6?
- Response: No, we should consider proposing new features as well
- **Question**: What are the criteria for growing Fairlearn?
- Either for something new in the library or in the documentation, what should be the criteria for inclusion?
- Documentation may be educational material, resources, links to other media
### What have we learned from the deep dive into the candidate screening scenario that could inform sociotechnical approaches to Fairlearn?
- Categories and heuristics
- Skeleton and structure
- Concern about incomplete information – demonstrating the wider lens of the problem of filtering and screening candidates
- **Question**: Is this a concern for data scientists? How might Fairlearn provide value with use cases given incomplete information?
- Other example scenarios might not have issues with incompleteness (e.g., credit scoring?)
- How do we present nuances of this larger social context (e.g., avoiding the “framing trap”) without intimidating data scientists?
- Flagging concerns about construct validity
- The lessons from this scenario could be useful for framing case studies in general.
- First start with context for the situation (which we have gone over). Then dive into the following parts:
- A. The overall goals (costs, profit, etc..)
- B. Info on the underlying model in the situation
- C. What might Fairlearn show in this case if we apply it to the underlying model? How does this tie to A?
- D. Given the full socio-technical context, what might be missing with this application of Fairlearn?
- E. Given D, as Kevin (Guest) said, what should practitioners do or think about instead of or in addition to C?
- Could the issue be within A, as Bruke alluded to?
## 09/17/2020
Developer Call
These are collaborative notes.
Attendees:
- `Miro Dudik <https://github.com/MiroDudik>`_
- `Richard Edgar <https://github.com/riedgar-ms>`_
- `Adrin Jalali <https://github.com/adrinjalali>`_
- `Roman Lutz <https://github.com/romanlutz>`_
- `Andreas Mueller <https://github.com/amueller>`_
- `Kevin Robinson <https://github.com/kevinrobinson>`_
- `Mehrnoosh Sameki <https://github.com/mesameki>`_
- `Hilde Weerts <https://github.com/hildeweerts>`_
Agenda:
- Datasets module: keep or remove?
- not part of a release yet (so removing would be easy)
- reasons for including are technical:
- show how to use a function (create "minimal examples")
- benchmark algorithms, e.g., how well they optimize
fairness/performance trade-off under specific quantitative
definitions of fairness and performance
- but including datasets means we should address sociotechnical aspects
- we don't add real functionality, just use functionality from
`sklearn.datasets`, namely `fetch_openml`
- potential value prop: we can return sensitive features in addition
to X, y
- e.g., https://github.com/fairlearn/fairlearn/pull/519
- other options (other than keep/remove):
- don't spend more time on it
- remove module, but move content into examples
- Roman to summarize this info in an issue and get Vincent's feedback (a rough sketch of the sensitive-features value prop appears at the end of this section)
- Should we restructure our weekly sessions?
- current situation:
- developer call once a month (2nd Thursday of each month, today being
an exception)
- sociotechnical deep dive on all other Thursdays
- changes to this schedule are possible
- Feedback: get agenda out at least a couple of days before the call!
- Metrics API progress
- Proposal: https://github.com/fairlearn/fairlearn-proposals/pull/15
--> Richard started
`implementation <https://github.com/riedgar-ms/fairlearn/blob/riedgar-ms/return-of-the-GMR/test/unit/metrics/experimental/sample_code.py>`_
- https://hackmd.io/zMSG-_oWSxOs-AVbLwNywQ
- UI recap
- Proposal for existing widget:
https://github.com/fairlearn/fairlearn-proposals/pull/14
- PR to add matplotlib plots to Fairlearn:
https://github.com/fairlearn/fairlearn/pull/561
- if there's a need for other kinds of visualizations beyond matplotlib
we can add other repos within the Fairlearn org
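Returning to the datasets-module item above: the potential value proposition (returning sensitive features alongside X and y on top of `fetch_openml`) can be sketched roughly as below. The function and constant names here are hypothetical, not the actual `fairlearn.datasets` API:

```python
from sklearn.datasets import fetch_openml

# Hypothetical choice of sensitive columns for the OpenML 'adult' dataset.
SENSITIVE_COLUMNS = ["race", "sex"]

def fetch_adult_with_sensitive_features():
    """Fetch the OpenML 'adult' dataset and split out sensitive features."""
    bunch = fetch_openml(data_id=1590, as_frame=True)  # OpenML 'adult'
    frame = bunch.frame
    X = frame.drop(columns=["class"])
    y = (frame["class"] == ">50K").astype(int)
    sensitive_features = frame[SENSITIVE_COLUMNS]
    return X, y, sensitive_features

X, y, A = fetch_adult_with_sensitive_features()
```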
## 08/13/2020
Developer call
Notes taken by Roman Lutz.
Attendees:
- `Michael Amoako <https://github.com/michaelamoako>`_
- `Andrew Anderson <https://www.linkedin.com/in/andrewanderson05/>`_
- `Miro Dudik <https://github.com/MiroDudik>`_
- `Richard Edgar <https://github.com/riedgar-ms>`_
- `Adrin Jalali <https://github.com/adrinjalali>`_
- `Roman Lutz <https://github.com/romanlutz>`_
- `Kevin Robinson <https://github.com/kevinrobinson>`_
- `Hanna Wallach <https://www.microsoft.com/en-us/research/people/wallach/>`_
- `Ke Xu <https://github.com/KeXu444>`_
Agenda:
- UX refactoring
- `Proposal <https://github.com/fairlearn/fairlearn-proposals/pull/14>`_
published earlier this week.
- Feedback:
- user guides need to reflect the separation
- all steps required need to be outlined
- make it clear and easy how to contribute to UI
- communication around UI repo should also be open
- proposal needs to address community-driven UI development, not just
the future of the existing UI under a Microsoft repo.
- datasets module - skipped this since Hilde / Vincent should be involved
- webpage redesign:
- Vanessa has some capacity
- effort starting soon --> anyone interested please reach out
## 07/09/2020
Developer call
These are collaborative notes.
Attendees:
- `Miro Dudik <https://github.com/MiroDudik>`_
- `Richard Edgar <https://github.com/riedgar-ms>`_
- `Adrin Jalali <https://github.com/adrinjalali>`_
- `Roman Lutz <https://github.com/romanlutz>`_
- `Vanessa Milan <https://www.microsoft.com/en-us/research/people/vmilan/>`_
- `Andreas Mueller <https://github.com/amueller>`_
- `Mehrnoosh Sameki <https://github.com/mesameki>`_
- `Mark Soper <https://github.com/marksoper>`_
- `Hanna Wallach <https://www.microsoft.com/en-us/research/people/wallach/>`_
- `Hilde Weerts <https://github.com/hildeweerts>`_
Agenda:
- We now have project boards that show what tasks are currently in progress.
This should indicate to contributors whether a task can be picked up by
them. We're looking for feedback on whether this is clear or not, and will
adjust the process accordingly.
`[GitHub Projects] <https://github.com/fairlearn/fairlearn/projects>`_
Additionally the GitHub Issues marked with `help-wanted` have been cleaned
up to make sure they're up to date.
- We're still doing the weekly deep dives every Thursday (other than 2nd every
month) to capture the sociotechnical context of applications and hopefully
produce one or more appropriate example notebooks. If you want to join
just send a message to Roman on
`Gitter <https://gitter.im/fairlearn/community>`_
- Metrics proposal - Miro to follow up based on last week's conversation
with Hilde, Adrin, Roman, Richard
- UX: goal to enable an ecosystem of visualizations specific to the application
context; currently not as well documented as the rest of the code
- Governance:
- short-term: MSFT supported project
- long-term: hand-off to neutral entity (including trademark)
- waiting might discourage contributions, so we'll aim at getting this
done sooner rather than later. Microsoft folks will follow up with
legal.
- Outreach: So far the philosophy was to wait for the project to stabilize before
major outreach efforts. Most of the participants felt it's time to do
more on this (while ensuring we
`speak about fairness in the right way <https://fairlearn.github.io/contributor_guide/how_to_talk_about_fairness.html>`_).
- It's important to be clear on project values before outreach. For that
reason we'll double down on efforts for the mission/roadmap PR.
How we speak about fairness matters. What are cases where this is the
appropriate tool, what are cases where it's not? We need to capture that.
- Roman signed up for SciPy sprints this weekend. It's unclear whether we're
too late.
## 06/11/2020
Developer call
Notes were taken by various attendees.
Attendees:
- `Miro Dudik <https://github.com/MiroDudik>`_
- `Richard Edgar <https://github.com/riedgar-ms>`_
- `Parul Gupta <https://github.com/parul100495>`_
- `Adrin Jalali <https://github.com/adrinjalali>`_
- `Roman Lutz <https://github.com/romanlutz>`_
- `Vanessa Milan <https://www.microsoft.com/en-us/research/people/vmilan/>`_
- `Kevin Robinson <https://github.com/kevinrobinson>`_
- `Mehrnoosh Sameki <https://github.com/mesameki>`_
- `Mark Soper <https://github.com/marksoper>`_
- `Hanna Wallach <https://www.microsoft.com/en-us/research/people/wallach/>`_
- `Hilde Weerts <https://github.com/hildeweerts>`_
- `Vincent Xu <https://github.com/vingu>`_
Agenda:
- new website at https://fairlearn.github.io/
- please provide feedback and feel free to submit PRs to update/edit/add
content or styling
- content in the process of getting created, please help if you're
interested! This includes
- user guides
- roadmap, vision/mission, FAQ
- contributor guide
- acceptance criteria for contributions
- Governance
- copyright updated to "Copyright Microsoft Corporation and contributors"
- short-term:
- define how to become core dev
- define requirements for contributions to be accepted
- long-term:
- Would it help for Fairlearn to become a foundation / charity?
At which point? This will likely be something to keep revisiting.
- NumFocus may be an option?
- definitely need to transition to a different model if there are
outside institutional contributors (but possibly earlier, e.g., if we
reach a larger scale).
- Kevin in Gitter:
- *Who are the current users of Fairlearn, what do they love about it?*
- how do we find out?
- *What's the roadmap for the project in the next 3-6mos?*
- TODO: start a PR around a short-term roadmap
- *I'd also be excited to discover if there are any co-conspirators
interested in working towards addressing practitioner needs
(e.g., Holstein et al. 2019) or supporting practitioners facing the wider
scope of sociotechnical fairness work (e.g., Madaio et al. 2020;
Selbst et al. 2020).*
- TODO: meet and discuss (already planned!)
- Metrics: Simplicity/usability and more complex metric group objects with
richer functionality don't need to be mutually exclusive. We will iterate on
the current state.
- Notebooks
- development process needs to be established
- porting notebooks from `notebooks` over to `examples` as a good first
issue
- To work out proper examples with a focus on sociotechnical framing we'll
set up a weekly hour where everyone who wants to discuss and contribute
can do so. It'll be at the same time as the developer call
(Thursday 11am New York time). Reach out to Roman via Gitter to participate. All are
welcome!
- UX
- There's already a separate call scheduled (originally meant to be on
6/12, but had to be rescheduled to next week) to discuss various kinds
of potential UX research directions.
- Beyond that there's an immediate need to remove obstacles to contribute
to the UX. Clarifying the roadmap as mentioned above is one step in this
journey. Currently the contributor guide is focused entirely on Python
code, with basically no explanation on how to work with the typescript
based dashboard.
## 05/07/2020
Developer call
Scribe: `Roman Lutz <https://github.com/romanlutz>`_
Attendees:
- `Andrew Anderson <https://www.linkedin.com/in/andrewanderson05/>`_
- `Matthijs Brouns <https://github.com/mbrouns>`_
- `Miro Dudik <https://github.com/MiroDudik>`_
- `Richard Edgar <https://github.com/riedgar-ms>`_
- `Parul Gupta <https://github.com/parul100495>`_
- `Ken Holstein <https://kenholstein.myportfolio.com/>`_
- `Adrin Jalali <https://github.com/adrinjalali>`_
- `Abdul Hannan Kanji <https://github.com/hannanabdul55>`_
- `Roman Lutz <https://github.com/romanlutz>`_
- `Vanessa Milan <https://www.microsoft.com/en-us/research/people/vmilan/>`_
- `Kevin Robinson <https://github.com/kevinrobinson>`_
- `Mehrnoosh Sameki <https://github.com/mesameki>`_
- `Hanna Wallach <https://www.microsoft.com/en-us/research/people/wallach/>`_
Organizational updates
^^^^^^^^^^^^^^^^^^^^^^
#. Microsoft & Fairlearn:
- goal is to be an open source, community driven project
sociotechnical framing is central to the project; careful and precise
wording matters
- currently many core members are from Microsoft, but the project is an
independent open source project
- more focus on governance when the project matures
- The Azure ML team will try to make Fairlearn easily usable in Azure,
but Fairlearn is a standalone and independent project. Any
functionality that can exist in the Fairlearn repository should live
there. The Azure ML goals do not affect Fairlearn project decisions,
just like with any other open source project.
#. regular occurrence of developer call (2nd Thursday each month)
#. new newsletter through python.org, used to keep the community in the loop about major news/updates
#. university collaborations with UMass Amherst and Stanford are currently ongoing
#. documentation/website updates: overall proposal currently in
`PR <https://github.com/fairlearn/fairlearn-proposals/pull/8/files>`_,
landing page designed, to be added mid-May
Topics
^^^^^^
The rest of the call was meant to be about the UX roadmap and potential
collaboration with scikit-fairness. We knew ahead of time that this was
ambitious, especially with lots of new people on the call. A lot of good
points about the general direction of the project were made, which I'll try to
summarize below:
So far Fairlearn has focused on group fairness as opposed to individual or
counterfactual fairness. The covered tasks are classification & regression
only. Other settings are currently missing, e.g., partial observability
(example: we don't have data on those that didn't get an opportunity such as
getting the chance to pay back a loan).
Guiding question: **how to cover the sociotechnical frame properly?**
[summary of points made by various participants]
The sociotechnical nature of fairness needs to be at the center of the
project. User experience starts with broadly accessible educational material
to encourage people to think bigger, not necessarily visualizations (people
first, not technology first). Fairlearn and similar toolkits will not 'solve'
fairness because fairness is fundamentally sociotechnical. However, Fairlearn
may have a role to play in a slice of the things that need to be done. This
will require certain guardrails and lots of education. The project itself will
need to go beyond just the code through community, outreach, and education.
Accessibility serves as a source of inspiration, having faced some of the
same obstacles, including changing the way people talk about the topic.
UX-specific points made throughout the conversation:
* A lot of things depend on the specific context. To create domain-specific UX
flows we'll need to work with partners in those domains.
* For data scientists the job doesn't end with metrics. They need to interpret
disparities, so there's potential for interactions with model
interpretability to "debug" issues (e.g. InterpretML)
* We need to provide a process for how to iterate on new UX flows as well as
domain or case studies. To understand whether specific parts of the UX are
useful to data scientists we can add them and see how people use them, e.g.
through user studies. To some extent we've followed a similar approach with
our example notebooks. They are already ahead of the dashboard in terms of
context-specific visualizations (done with matplotlib). Those that resonate
with users could be added to the dashboard.
For both major topics (UX, scikit-fairness) we'll have to start follow-up
discussions.
For reference:
* UX roadmap - discussion related to this `proposal <https://github.com/fairlearn/fairlearn-proposals/issues/2>`_
* scikit-fairness - discussion related to this `issue <https://github.com/fairlearn/fairlearn/issues/406>`_