# Scikit-learn bi-weekly progress status (even weeks)

**Goal**: internal communication on recent work and short term planning

**Who**: people working on the maintenance of scikit-learn and related projects, in particular at probabl and Inria, and maybe others

**Frequency**: every other Monday at 15:00 CET/CEST, unless it happens on the same day as the Monthly meeting.

**Where**: https://meet.google.com/xdm-ozyn-pgj

**Meeting notes**: to be archived on the [scikit-learn org repo](https://github.com/scikit-learn/administrative/tree/master/biweekly_meetings)

## Next meeting templates

**Rules of the game:**

- No questions during the progress reports
- Add those questions to the discussion points

```
## 20XX-YY-ZZ

### Progress reports

- [name=Someone]
  - item
  - item

### Discussion points

- ...
```

## 2026-02-02

### Progress reports

(notes summarizing a few items that stand out)

- Callbacks are moving forward. Olivier wonders whether he has seen a bug in the progress bar in interactive use in the notebook mode of VS Code
- Array API support for logistic regression LBFGS is moving forward
- Funding for NASA ROSES is validated for this year, but next year we cannot rely on it
- Connection to skore hub: simplifying the connection

### Discussion points

## 2026-01-19

### Progress reports

- [name=Olivier]
  - Reviewing tree criterion/splitter refactoring to fix a bug with the Poisson loss and add missing value support for MAE: https://github.com/scikit-learn/scikit-learn/pull/32119
  - Reviewing the class-based design of array API support for loss functions used for LogisticRegression: https://github.com/scikit-learn/scikit-learn/pull/32644
  - Reviewed `DecisionBoundaryDisplay` multiclass colors fix: https://github.com/scikit-learn/scikit-learn/pull/33015 (second review in progress)
  - Final review and merge for sample_weight fix for RFs: https://github.com/scikit-learn/scikit-learn/pull/31529
  - JAX support review iteration: https://github.com/scikit-learn/scikit-learn/pull/29647 (most code changes
implemented, still need to investigate numerical discrepancies)
  - Iterated on GP gradient reproducer as a test: https://github.com/scikit-learn/scikit-learn/pull/31543
  - Quick iteration on the review of the draft PR to fix the numerical problems of RidgeCV: https://github.com/scikit-learn/scikit-learn/pull/33020
  - Some reviews to fix CI failures on scipy-dev and free-threading.
  - Triaging this week.
- [name=Stefanie]
  - PR [FEA Add array API support for average_precision_score](https://github.com/scikit-learn/scikit-learn/pull/32909)
  - PR [TargetEncoder into a metadata router and route groups to cv object](https://github.com/scikit-learn/scikit-learn/pull/33089)
    - only routes groups via params
  - PR [ENH Add zero division handling to cohen_kappa_score](https://github.com/scikit-learn/scikit-learn/pull/31172)
  - PR [DOC/MNT Little clean up around the splitting docs and error message](https://github.com/scikit-learn/scikit-learn/pull/33091) (merged)
  - reviewed
    - [Update AGENTS.md on selecting issues](https://github.com/scikit-learn/scikit-learn/pull/33066)
    - [FIX: Resolve precompute in enet_path when check_input is False](https://github.com/scikit-learn/scikit-learn/pull/33014)
- [name=Antoine]
  - continue [FIX instability of RidgeGCV](https://github.com/scikit-learn/scikit-learn/pull/33020)
  - continue [Fix sample weight handling in SAG(A)](https://github.com/scikit-learn/scikit-learn/pull/31675)
  - review [Added sample weight handling to BinMapper under HGBT](https://github.com/scikit-learn/scikit-learn/pull/29641)
- [name=François P]
  - Iterating on the [doc example for callback support in 3rd party estimators](https://github.com/jeremiedbb/scikit-learn/pull/27).
  - [Investigating a weird bug of the progressbar](https://github.com/jeremiedbb/scikit-learn/pull/26) only present in Linux `pymin_conda_forge_openblas_min_dependencies` for just one combination of params. Found the change that causes the bug but not why it happens.
  - Made a [quick fix](https://github.com/scikit-learn/scikit-learn/pull/33081) on unit-test GHA for Windows builds.
  - Waiting for a review on a [PR to update the random_state doc](https://github.com/scikit-learn/scikit-learn/pull/32844) to match the actual behaviour regarding clones.
- [name=Loïc]
  - scikit-learn
    - move towards GHA underway; merge main if you see a required check in pending state for a long time, see [Discord message](https://discord.com/channels/731163543038197871/1046822941586898974/1460970279475089521)
    - free-threaded: a bit of work for running free-threaded with default joblib backend as threading, also pytest-run-parallel. Did not find obvious issues so far.
    - fix scipy-dev build (reported a Cython regression and work-around for numpy 2.5.dev DeprecationWarning in joblib)
    - fix segmentation fault and failure in free-threaded by marking the relevant tests as thread-unsafe https://github.com/scikit-learn/scikit-learn/pull/33070 https://github.com/scikit-learn/scikit-learn/pull/33080
    - attempt to compile with address sanitizers, slow progress.
  - joblib
    - reviewed a few PRs by NanoRed4498 (Yoann): https://github.com/joblib/joblib/pulls?q=sort%3Aupdated-desc+is%3Apr+is%3Aopen+author%3ANanored4498
  - misc
    - attended free-threaded monthly documentation meeting https://py-free-threading.github.io/docs-meeting/
    - merged: cython fix issue with spaces in depfile paths (interaction with meson) https://github.com/cython/cython/pull/7423
    - pre-commit installed with free-threaded on conda-forge is broken. Bit of a rabbit hole, we will see what happens: https://github.com/conda-forge/identify-feedstock/pull/172 https://github.com/asottile/ukkonen/pull/132 https://github.com/conda-forge/ukkonen-feedstock/pull/12
- [name=Anne]
  - [class vs.
instance error message PR](https://github.com/scikit-learn/scikit-learn/pull/32565) still waiting for review
  - got feedback from Olivier and Lucy on [DecisionBoundaryDisplay bugfix](https://github.com/scikit-learn/scikit-learn/pull/33015), which is currently blocking some other issues and also opened up follow-up issues
  - reviewed [issue](https://github.com/scikit-learn/scikit-learn/issues/33093) that has come up in Display context as well and gave feedback on possible approach --> waiting for "NeedsTriage" to be removed
- [name=Jérémie]
  - Reviewed https://github.com/scikit-learn/scikit-learn/pull/32732 (adds decision threshold curve) from Lucy
  - Approved https://github.com/scikit-learn/scikit-learn/pull/31172. Needs second approval
  - Callbacks
    - https://github.com/jeremiedbb/scikit-learn/pull/26
    - https://github.com/jeremiedbb/scikit-learn/pull/27
  - Updated SLEP https://github.com/scikit-learn/enhancement_proposals/pull/90. Got feedback from Thomas
  - Need reviewer on https://github.com/scikit-learn/scikit-learn/pull/28760
- [name=Shruti]
  - Reviewed PRs:
    - https://github.com/scikit-learn/scikit-learn/pull/32956
    - https://github.com/scikit-learn/scikit-learn/pull/32964
    - https://github.com/scikit-learn/scikit-learn/pull/33014
  - Trying to finalise gradient test for GPR/GPC PR (derivative provides a strict assessment that is difficult to cheat, however approx_fprime seems easier to implement)
  - Opened a PR on adding sample weights to QuantileTransformer, https://github.com/scikit-learn/scikit-learn/issues/30707
  - Working on API testing for jax.numpy PR #29647
  - Looking into calibration curve feature for skore
- [name=Guillaume]
  - Review PRs from Gaetan and Auguste
  - Fix and extend permutation importance for the different Skore reports
  - Work on the onboarding on Skore Hub
- [name=Gaétan]
  - `ConfusionMatrixDisplay` is where we wanted it to be: supports decision thresholds, supports all types of reports, has `subplot_by`
  - wrapping up refactors using seaborn:
`PrecisionRecallCurveDisplay` is done, `PredictionErrorDisplay` is in review, `RocCurveDisplay` is next.
  - Once done, add the option to display train and test together in displays for `ComparisonReports`
  - general maintenance and reviews
- [name=Gael]
  - Rebooting the comms team: writing the way we work to onboard people
    - Guidelines for blog (should it be a SLEP?) https://github.com/scikit-learn/enhancement_proposals/pull/92
    - Brand guidelines https://github.com/scikit-learn/communication/pull/40

### Discussion points

- [name=Loïc] do we need a 1.8.1 release soonish or do we wait a bit more? [Milestone 1.8.1](https://github.com/scikit-learn/scikit-learn/issues?q=sort%3Aupdated-desc%20milestone%3A1.8.1)

## 2026-01-05

### Progress reports

- [name=Olivier]
  - off last week (and won't attend today's meeting)
  - some things I did previously:
    - did a pass of reviews on issues/PRs in the array API project board
    - iterating on the review of the deprecation of the Friedman MSE criterion as a first step towards merging GradientBoosting and HistGradientBoosting classes
      - https://github.com/scikit-learn/scikit-learn/pull/32708
    - opened a draft PR with a heuristic for the number of OpenMP threads to avoid slowdowns in HGB on small to medium datasets:
      - https://github.com/scikit-learn/scikit-learn/pull/32955
      - still WIP
    - first review of output features tables in estimator diagrams https://github.com/scikit-learn/scikit-learn/pull/31937 (still need to get back to it)
- [name=Guillaume]
  - off the last three weeks
  - going to release skrub 0.7.1
  - Documentation CI is failing because we use `fetch_california_housing`
- [name=Shruti]
  - Been focussing on fixing the floating-point errors in SAG(A), https://github.com/scikit-learn/scikit-learn/pull/31675
  - Made a board on the issues and PRs surrounding Gaussian Processes, https://github.com/orgs/scikit-learn/projects/29
  - Opened a PR on adding sample weights to QuantileTransformer, https://github.com/scikit-learn/scikit-learn/issues/30707
- 
  [name=Anne]
  - reviewed [pre-commit fixing PR](https://github.com/scikit-learn/scikit-learn/pull/32813), which looks ready to be merged
  - [class vs. instance error message PR](https://github.com/scikit-learn/scikit-learn/pull/32565) waiting for approval
  - spent a little more time with [pyrefly](https://github.com/scikit-learn/scikit-learn/pull/32737) and might look into some of the flagged errors at some point
  - will open PR for fixing [DecisionBoundaryDisplay color inconsistency with predict](https://github.com/scikit-learn/scikit-learn/issues/32872) this week (need to spend some more time on tests)
- [name=Antoine]
  - [FIX instability of gcv_mode="svd" in RidgeCV](https://github.com/scikit-learn/scikit-learn/pull/32506)
    - a big refactoring is needed to work for all `gcv_mode` and sparse/dense X
    - new draft PR coming soon
  - [Fix sample weight handling in SAG(A)](https://github.com/scikit-learn/scikit-learn/pull/31675)
    - need to review solver in the sparse case (math and code)
- [name=Dea]
  - In progress
    1. [ENH: Display the number of output features](https://github.com/scikit-learn/scikit-learn/pull/31937) **Status:** Waiting for feedback. Added total num of output features block, modified some tests.
    2. [ENH Display fitted attributes in HTML representation](https://github.com/scikit-learn/scikit-learn/pull/31442) **Status**: Waiting for feedback. [See example](https://output.circle-artifacts.com/output/job/33884ebc-7491-4c63-9cc1-bfe861f3f138/artifacts/0/doc/auto_examples/compose/plot_column_transformer_mixed_types.html). Question about memory footprint [comment](https://github.com/scikit-learn/scikit-learn/pull/31442#issuecomment-3213782243)
    3. [Remove CSS template substitution in estimators' HTML Display](https://github.com/scikit-learn/scikit-learn/pull/32839)
       - **Status**: Needs feedback. Thomas and Mathiew OK'ed it.
Towards [RFC: Potential improvement of HTML Display's css logic](https://github.com/scikit-learn/scikit-learn/issues/32834)
  - TODO
    - [ENH Display Methods in HTML representation](https://github.com/scikit-learn/scikit-learn/pull/31698) **Status**: Need to perform a survey in order to go ahead with this PR - or not.
    - Grant blog post
- [name=François P]
  - off the last two weeks
  - getting back to callbacks, working on a doc example showing how to implement an estimator that supports callbacks
  - fixed my progressbar not showing in notebook issue through a fresh re-install of jupyter ¯\\\_(ツ)_/¯
- [name=Stefanie]
  - on triage last week
  - working on WIP [PR FEA Add array API support for average_precision_score](https://github.com/scikit-learn/scikit-learn/pull/32909); yet to do: add handling for mixed inputs
  - [PR MNT/DOC Autoclose schedule doesn't run on forks and improved structuring](https://github.com/scikit-learn/scikit-learn/pull/32889#) (merged)
  - [PR MNT cleanup old numpy workaround in metrics functions](https://github.com/scikit-learn/scikit-learn/pull/32917) (merged)
  - reviewed
    - [TST Add common test for mixed array API inputs for metrics](https://github.com/scikit-learn/scikit-learn/pull/32755)
    - [DOC Add link to glossary "label indicator matrix" for classification metric docstrings](https://github.com/scikit-learn/scikit-learn/pull/32893)
    - [MNT Use consistent ruff version in pre-commit and linting](https://github.com/scikit-learn/scikit-learn/pull/32849)
    - [DOC: clarify verbose behavior in GridSearchCV and RandomizedSearchCV](https://github.com/scikit-learn/scikit-learn/pull/32968)
    - [FIX array API support when pos_label=None for brier score metrics](https://github.com/scikit-learn/scikit-learn/pull/32923)
  - waiting for reviews on
    - PR [ENH Add grouped splitters to be used in TargetEncoder cross fitting](https://github.com/scikit-learn/scikit-learn/pull/32843)
    - fixed test failure after merging main into PR [ENH Add zero division handling to
cohen_kappa_score](https://github.com/scikit-learn/scikit-learn/pull/31172/)
    - PR [FIX Error handling in ranking metrics supporting multiclass: average_precision_score, roc_auc_score and top_k_accuracy_score](https://github.com/scikit-learn/scikit-learn/pull/32912) for raising a more comprehensible error in ranking metrics if `y_score` has only shape `(n_classes, )` and `y_true` is multiclass
- [name=Jérémie]
  - off ~ half of last 2 weeks
  - included François' PRs in the callback PR.
    - public decorator to set up and tear down the callbacks
    - more tests
  - some maintenance
- [name=Gael]
  - Started rebooting the comms team:
    - discussions with Reshama, contacted Lauren
    - https://github.com/scikit-learn/communication/pull/40
- [name=Emily]
  - (finally) finished the [Nystroem Array API PR](https://github.com/scikit-learn/scikit-learn/pull/29661) (thank you Olivier)
  - Looking for new tasks to work on
    - [name=Gael] some array API stuff?
- [name=Adrin]
  - skops CI issues (no release seems to be needed)
- [name=Gaétan]
  - Waiting for feedback on [PR curve refactor](https://github.com/probabl-ai/skore/pull/2193) and extensions of `ConfusionMatrixDisplay` to [ComparisonReport](https://github.com/probabl-ai/skore/pull/2236) and [CrossValidationReport](https://github.com/probabl-ai/skore/pull/2221)
  - In the meantime, I have time I need to allocate

### Discussion points

- [name=Olivier] (not attending the meeting) We need to find a workaround for the figshare.com hosted datasets. A few possibilities discussed in: https://github.com/scikit-learn/scikit-learn/issues/32961
- [name=Adrin] Need a better way to handle such contributions: https://github.com/scikit-learn/scikit-learn/pull/32368 which improves latency at prediction time.
- [name=Adrin] Callbacks priority
  - having "it" for next release
  - resources?
- [name=Adrin] didn't Francois G have a slep or a PR explaining the guidelines?
  - I (Guillaume) drafted something and wanted that Francois G.
take ownership: https://github.com/scikit-learn/enhancement_proposals/pull/92
  - Gael's to take over
- [name=Adrin] pinging Charlie for Dea's PRs

## 2025-12-22

### Progress reports

- [name=François P.] I'll be off the next two weeks for the end of the year celebrations. Happy holidays and see you on January 5th :)
  - The solution chosen at the callback meeting is implemented, still working with Jérémie to finalize it.
  - Also working on a doc example to show how to make a custom estimator that supports callbacks. I'm making an example in `scikit-learn/miscellaneous`, mimicking `plot_metadata_routing.py`.
  - The progressbar callback has display issues in notebooks because it is done in a separate thread. I'm trying to fix it but for now the only way I found is to have the progress bar in the main thread.
- [name=Olivier]
  - Was off last week and broke my phone so I was truly offline for once. Catching up with issue tracker and notifications.
  - Investigated joblib pickling issue reported to cloudpickle with CPython 3.14 but the problem seems to be reproducible with CPython only:
    - https://github.com/python/cpython/issues/143067
  - Now investigating a crash observed on the doc CI of the lock file update:
    - https://github.com/scikit-learn/scikit-learn/pull/32902
  - Planned: review PRs and discussions related to gradient boosting fixes/improvement/refactoring next.
- [name=Dea]
  - In progress
    - Needs feedback [ENH: Display the number of output features](https://github.com/scikit-learn/scikit-learn/pull/31937) **Status:** Implemented Guillaume's latest feedback
    - [ENH Display fitted attributes in HTML representation](https://github.com/scikit-learn/scikit-learn/pull/31442) **Status**: Question about memory footprint [comment here](https://github.com/scikit-learn/scikit-learn/pull/31442#issuecomment-3213782243)
    - Needs feedback [FIX remainder parameter for column transformer visual block](https://github.com/scikit-learn/scikit-learn/pull/32713) **Status:** It has 1 approval
      - Towards [Unexpected behavior of the HTML repr of meta-estimators](https://github.com/scikit-learn/scikit-learn/issues/32146)
    - Needs feedback [Remove CSS template substitution in estimators' HTML Display](https://github.com/scikit-learn/scikit-learn/pull/32839)
      - **Status**: Thomas and Mathiew OK'ed it. Towards [RFC: Potential improvement of HTML Display's css logic](https://github.com/scikit-learn/scikit-learn/issues/32834)
    - [ENH Display Methods in HTML representation](https://github.com/scikit-learn/scikit-learn/pull/31698) **Status**: Need to perform a survey in order to go ahead with this PR - or not.
    - Got comment from Kaggle [PR unpinned scikit-learn version](https://github.com/Kaggle/docker-python/pull/1516) **Status**: [Waiting for the release date](https://github.com/Kaggle/docker-python/pull/1513#issuecomment-3643978036).
  - Done
    - [FIX: Tooltip position using CSS anchor positioning](https://github.com/scikit-learn/scikit-learn/pull/32887)
    - [ENH: Order user-set parameters before default parameters on HTML Display](https://github.com/scikit-learn/scikit-learn/pull/32802)
  - TODO
    - Grant blog post
    - Investigate shadow DOM

### Discussion points

- ...

## 2025-12-08

### Progress reports

- [name=Dea]
  - Ready for feedback [ENH: Display the number of output features](https://github.com/scikit-learn/scikit-learn/pull/31937)
  - Ready for feedback.
Split from PR above. This PR also corrects ColumnTransformer dotted-line. [FIX remainder parameter for column transformer visual block](https://github.com/scikit-learn/scikit-learn/pull/32713)
  - Ready for feedback. [ENH: Order user-set parameters before default parameters on HTML Display](https://github.com/scikit-learn/scikit-learn/pull/32802)
  - Ready for feedback. [Remove CSS template substitution in estimators' HTML Display](https://github.com/scikit-learn/scikit-learn/pull/32839)
- [name=François P.]
  - The callback meeting happened last Friday. It was decided that the callback context management (initialization and clean-up) will be achieved in the `_fit_context` decorator, using a separate context manager. A public decorator doing just the callback context management without parameter validation will be exposed for 3rd party devs who want to use callbacks.
  - Made a doc PR to update the doc on the RandomState description, to make it match what's actually happening during cloning. [DOC: Update the description of RandomState usage to match actual behaviour](https://github.com/scikit-learn/scikit-learn/pull/32844)
  - Still working on some CI tweaking: [CI: Make one tracking issue per build for the unit-tests](https://github.com/scikit-learn/scikit-learn/pull/32832)
- [name=Antoine]
  - [Add new default max_samples=None in Bagging estimators](https://github.com/scikit-learn/scikit-learn/pull/32825) merged
  - review [Fix sample weight handling in SAG(A)](https://github.com/scikit-learn/scikit-learn/pull/31675)
- [name=Gael]
  - Mostly been traveling (EurIPS) and busy with non opensource things
  - skrub: nearing release, integration of optuna
- [name=Adrin]
  - Spent time in Paris mostly in meetings
- [name=Stefanie]
  - mainly worked on PR [ENH Add grouped splitters to be used in TargetEncoder cross fitting](https://github.com/scikit-learn/scikit-learn/pull/32843)
    - adds `groups` param to fit_transform, which adds information to internally choose a suitable splitter for cross
fitting
    - adds more options to pass `cv` objects as init params
    - adds metadata routing (TargetEncoder becomes a router) to internal cross validation
- [name=Guillaume]
  - Participated in the callback API call
  - Submitted the Wellcome Trust annual report
    - Follow-up to come to request extension
  - Helping Riccardo to release `skrub`
  - Dedicate time to review Dea's PRs
- [name=Loïc]
  - in Paris until Thursday
  - scikit-learn 1.8 release
    - wip convergence on release highlights PR: https://github.com/scikit-learn/scikit-learn/pull/32809
    - still hope to release today; if not, tomorrow.
  - put more thought on blog post on free-threaded from scikit-learn perspective. The more people I tell about it, the more likely it is I will manage to actually write it :sweat_smile:
  - bot lint comment tweak by François P. (on lint success: no or remove comment) was merged (nobody complained so it's probably OK)
  - variation on loky issue with multiprocessing change in Python 3.13.10 and 3.14.1 https://github.com/joblib/loky/issues/475
  - get back to loky and joblib to do a release in both
- [name=Anne]
  - [PR](https://github.com/scikit-learn/scikit-learn/pull/32566) on Automated contributions policy in docs and PR template got merged (and is being used)
  - [Quickfix for CI failure due to Pandas4Warning](https://github.com/scikit-learn/scikit-learn/pull/32865)
- [name=Olivier]
  - Some bugfix reviews for the 1.8 (`max_samples` for `Bagging`...)
  - [Prepared an example notebook](https://colab.research.google.com/drive/1ztH8gUPv31hSjEeR_8pw20qShTwViGRx?usp=sharing) for the array API section in the release highlights and found bugs along the way:
    - https://github.com/scikit-learn/scikit-learn/issues/32836 (fixed by Tim and merged)
    - https://github.com/scikit-learn/scikit-learn/pull/32840 (merged)
    - https://github.com/scikit-learn/scikit-learn/pull/32846
    - I have a few more problems to report.
  - Participated in the callbacks API review meeting.
  - Some triaging
  - Plans to take next week off (but still plan to attend the monthly meeting).
- [name=Arturo]
  - Not much open source, don't mind me

### Discussion points

- [name=Adrin] Need somebody to implement GH pages move to cloudflare
  - GHA to deploy cloudflare: https://developers.cloudflare.com/workers/ci-cd/external-cicd/github-actions/
- [name=Loïc] budget for scikit-learn MOOC JupyterHub?
  - switch to jupyterlite sooner than anticipated?
  - alternative source of funding to investigate at Inria to buy us one more year of jupyterhub.
- [name=Guillaume] get a call with Antoine and/or Olivier regarding cross-validation analysis
  - today? tomorrow.
- [name=Guillaume] got asked regarding the expected time for `joblib` release
  - [name=Loïc] by whom?
  - [name=Guillaume] By Thomas because we worked at some point on `skore` compatibility with 3.14 but your comment earlier answers my question ;)
- [name=Gaël] keeping track of our priorities?
  - Olivier: started to use https://github.com/orgs/scikit-learn/projects/24/views/1
- [name=Adrin] removing / adding comments is probably going to break the bot on long history PRs
- [name=Loïc] crazy suggestion: write release highlights as we go? array API case when we discover problems kind of last minute. Also useful for marketing maybe: here is the cool stuff to look forward to?
  - Olivier: +1 Guillaume: +1 Jérémie: +1 (for ongoing RH)
- [name=Arturo] reuse MOOC videos for skolar? Loïc thought that it was non-commercial licence so that seemed a bit risky?
## 2025-11-24

### Progress reports

- [name=Stefanie] (on train with bad WiFi, as usual on trains)
  - mostly preparing talk ["OSS Community Building"](https://github.com/StefanieSenger/Talks/tree/main/2025_Building_an_OSS_Community) for Probability 1.0
  - finalised PR [CI Add "autoclose" workflow by label setting](https://github.com/scikit-learn/scikit-learn/pull/32504) which now can be used by setting the "autoclose" label
  - finalised PR [FIX classification metrics raise on empty input](https://github.com/scikit-learn/scikit-learn/pull/32549)
  - fixed merge conflicts in PR [ENH Add zero division handling to cohen_kappa_score](https://github.com/scikit-learn/scikit-learn/pull/31172) since `cohen_kappa_score` now has array api support
    - waiting for reviews
  - reviewed [DOC add paragraph on "AI usage disclosure" to Automated Contributions Policy and PR Template](https://github.com/scikit-learn/scikit-learn/pull/32566)
- [name=Olivier Grisel]
  - Reviewing deprecation related PRs for 1.8
    - logistic regression fitted attributes
    - logistic regression `penalty`
    - [`SVC` with `probability=True`](https://github.com/scikit-learn/scikit-learn/pull/32050)
  - Array API:
    - review and benchmarking for [LogisticRegression with LBFGS](https://github.com/scikit-learn/scikit-learn/pull/32644)
      - GPU speed-up can be significant on large enough problems with `n_samples >> n_features`.
      - But requires code duplication (Cython vs array API in the loss module)
  - Tree-based models:
    - started discussion on binning/histograms for RFs: https://github.com/scikit-learn/scikit-learn/issues/32704
      - Antoine confirmed that this can significantly improve the fit time/accuracy tradeoff on real customer data.
    - Back to investigating the threading overhead of HGBT on small data:
      - https://github.com/scikit-learn/scikit-learn/issues/14306
      - confirmed that xgboost has similar performance problems with openmp overhead on small datasets
      - working on a heuristic that seems to fix the problem on my Apple M4 CPU
      - TODO:
        - check if heuristic also works well on 32 core x86_64 CPU
        - profile why the scalability w.r.t. number of threads is not better with large number of cores
  - BLAS packaging:
    - compared AMD AOCL-BLAS from AMD pip packages with BLIS from conda-forge
    - reported discrepancy with default number of threads in BLIS https://github.com/conda-forge/blis-feedstock/issues/45
    - TODO: craft a minimal reproducer with OpenBLAS / OpenMP overhead.
- [name=Anne]
  - DOC discussions
    - [Lucy's PR](https://github.com/scikit-learn/scikit-learn/pull/32715) on issues for new contributors
    - [adding AI usage disclosure to guidelines and template](https://github.com/scikit-learn/scikit-learn/pull/32566)
  - [Draft PR](https://github.com/scikit-learn/scikit-learn/pull/32737) for testing pyrefly (to potentially replace mypy)
  - [Issue on passing class vs. instance](https://github.com/scikit-learn/scikit-learn/issues/32719) (occurred in Pipeline, currently not caught in all meta-estimators), related to [my other PR](https://github.com/scikit-learn/scikit-learn/pull/32565) (where a better error will be raised when calling `get_tags()`)
- [name=François]
  - Callbacks: organised a meeting next week to discuss alternatives for the implementation
  - RandomState: looking at the previous issues / SLEPs / discussions regarding the use of RandomState objects in estimators with the idea of restarting the discussion around supporting the Generators and either fixing the behaviour of cloned estimators or the doc, as they are in contradiction right now.
  - CI: working on some Azure-to-GHA migration issues with Loïc:
    - [adding doc tests to GHA](https://github.com/scikit-learn/scikit-learn/pull/32730)
    - [adding pytest soft dependency test to GHA](https://github.com/scikit-learn/scikit-learn/pull/32738)
    - [automate the labelling of PRs with failed linting](https://github.com/scikit-learn/scikit-learn/pull/32751)
- [name=Arturo]
  - Took over [Refactor make_classification](https://github.com/scikit-learn/scikit-learn/pull/32476)
- [name=Loïc]
  - scikit-learn
    - scikit-learn 1.8 release has never been closer :wink:
      - release candidate soonish: today or tomorrow? draft rc PR https://github.com/scikit-learn/scikit-learn/pull/32766. Interested to have a call with Jérémie to see what is missing.
      - one PR to decide on `LogisticRegression` + `LogisticRegressionCV` `penalty` deprecation https://github.com/scikit-learn/scikit-learn/pull/32659. Actually 2, Olivier added one more :sweat_smile:.
      - planning for 1.8 release at the beginning of December so we have a bit of time to adjust before Christmas break (just in case we need a quick bug-fix release)
    - Stefanie's PR: autoclose label is now posting a comment (that works) and closing after two weeks (tested in a fork), feedback in https://github.com/scikit-learn/scikit-learn/pull/32743
    - Windows arm64 segmentation fault has disappeared from the CI but can somehow still be reproduced https://github.com/scikit-learn/scikit-learn/pull/32754. TBC.
    - conda-lock install hang, unable to reproduce in https://github.com/scikit-learn/scikit-learn/pull/32643 so giving up for now.
    - probably a few other not so important things
- [name=Dea]
  - [Fixing issue with Column Transformer dotted lines](https://github.com/scikit-learn/scikit-learn/pull/32713) I needed to split [ENH: Display the number of output features](https://github.com/scikit-learn/scikit-learn/pull/31937#issuecomment-3497690242). Targeting the 1.8.1 release probably.
- [name=Guillaume]
  - [`skrub` presentation](https://docs.google.com/presentation/d/1ejclBruWTbSQfKk37DBGi_SlgQ1vc6wVWBrAZGP5HHQ/edit?usp=sharing) for AdoptAI
  - Feedback on some `skore` PRs related to display

### Discussion points

- [name=Stefanie] Can we ask people from the product team to review Dea's CSS PRs?
- [name=François] Looking for people's opinion on the RandomState situation. Should we just update the doc? Change the behaviour? Or take the opportunity to move to Generators and fix the behaviour while doing that?
- [name=Guillaume] did we look at `ty` from Astral: https://blog.edward-li.com/tech/comparing-pyrefly-vs-ty/
  - [name=Loïc] what are we trying to fix? The time taken by mypy in pre-commit? Seems like a distraction to me, especially because we are not using typing much and we don't really plan to in the foreseeable future ...
- [name=Olivier] conda-lock freeze happened 3 times to me this morning.
  - [name=Loïc] weird ... feel free to take a look at https://github.com/scikit-learn/scikit-learn/pull/32643 and make suggestions about what I am missing ...
maybe enabling verbose makes the bug disappear (only half-joking :wink:)
- [name=Dea]
  - Just FYI, this bug has been there for years (previous to the first HTML Display PR) [Fixing issue with Column Transformer dotted lines](https://github.com/scikit-learn/scikit-learn/pull/32713)

## 2025-11-10

### Progress reports

- [name=Dea]
  - Worked on PR - ready for feedback [ENH: Display the number of output features](https://github.com/scikit-learn/scikit-learn/pull/31937#issuecomment-3497690242)
  - Opened PR on Kaggle docker-python [Bump scikit-learn version from 1.2.2 to 1.7.2](https://github.com/Kaggle/docker-python/pull/1513)
  - [MAINT Cleaning up old scipy version mentions and code](https://github.com/scikit-learn/scikit-learn/pull/32685)
  - [CI Update ubuntu_atlas lock file](https://github.com/scikit-learn/scikit-learn/pull/32653)
  - [MAINT Clean up after Python 3.11 bump](https://github.com/scikit-learn/scikit-learn/pull/32646)
  - [MAINT Clean up after Scipy min version to 1.10](https://github.com/scikit-learn/scikit-learn/pull/32615)
  - Closed PR - couldn't remove old scipy version code [MAINT Clean up old scipy version code](https://github.com/scikit-learn/scikit-learn/pull/32654)
- [name=Loïc]
  - scientific Python developer summit
    - discussions on various topics
    - better handling with the repo activity https://github.com/scientific-python/summit-2025-nov/issues/6
      - [google/triage-party](https://github.com/google/triage-party#triage-party-in-production) temporary home, may be hosted somewhere by scientific-python: https://sklearn.triage-party.mriduls.com/s/home
      - example usage by kubernetes/minikube: https://teaparty-tts3vkcpgq-uc.a.run.app/s/daily
  - admin
    - the scikit-learn thanks.dev money (~240$) managed to find its way to our OpenCollective account after a bit more than a week
    - NASA ROSES invoice for October
  - scikit-learn
    - triage last week
    - took part in recovering the accidentally deleted scikit-learn.github.io repo :sweat_smile:
    - reviewed+merged sponsors page
reorg by François G: https://github.com/scikit-learn/scikit-learn/pull/32642 - merged final Python 3.10 -> 3.11 by Dea: https://github.com/scikit-learn/scikit-learn/issues/32650 - merged dependabot PR: https://github.com/scikit-learn/scikit-learn/pull/32629 - merged my own PR with one approval to add example dependencies to our bumping script: https://github.com/scikit-learn/scikit-learn/pull/32557 - macOS-15-intel Azure brownout https://github.com/scikit-learn/scikit-learn/pull/32649 - meson - shorter path for Cython generated files to avoid Windows MAX_PATH limitation: https://github.com/mesonbuild/meson/pull/15219 - seems like Cython has limitations with spaces in paths for dependency files https://github.com/mesonbuild/meson/issues/15227 - threadpoolctl - merged Pyodide fix with deprecated method in Pyodide 0.29 https://github.com/joblib/threadpoolctl/pull/201 - investigating main CI failures https://github.com/joblib/threadpoolctl/pull/203 - [name=Gael] - Gave a talk on skrub at dotAI, the gist being: we need better composable, reusable primitives in data-science - I did some AI-assisted live coding in a 20-minute talk, in front of a mixed audience of 600 people. Was scary/fun - Merged some PRs: - An example of defining selectors for columns with outliers https://skrub-data.org/dev/modules/multi_column_operations/advanced_selectors.html#custom-criteria-in-filter-example-of-selecting-columns-with-outliers - Running the TableReport on polars when pyarrow is not installed - Did a PR on selector docs: outline, formulation and see-alsos - https://github.com/skrub-data/skrub/pull/1745 (if people want to review :) ) - Will need to do a big-picture presentation on what is going on in scikit-learn - Happy to take input on what should be in - [name=Adrin] - Doc repo drama continues... 
- Talk @Open Science days @Max Planck - scikit-learn triage this week - [name=Arturo] - Scientific Python developer summit: [CI tool to test jupyterlite](https://github.com/scientific-python/summit-2025-nov/issues/13) - [name=Stefanie] - further work on Autoclose bot (PR [CI Add "autoclose" workflow by label setting](https://github.com/scikit-learn/scikit-learn/pull/32504)) - [DOC add information on 'needs triage' label in contribution guide](https://github.com/scikit-learn/scikit-learn/pull/32574) merged - scientific python summit - learning - followed [python packaging tutorial](https://packaging.python.org/en/latest/tutorials/packaging-projects/) - back on Linear Algebra (geometric interpretation of dot product, cosine similarity, projections) - [name=Guillaume] - Scientific Python Dev Summit - Thoughts around tutorials - TODO: - Write annual report for the Wellcome Trust grant - Follow-up Scientific Python admin - Review Dea and Anne PRs - [name=Anne] - Started on Displays by taking over PR on [setting custom `xlim`/`ylim` in DecisionBoundaryDisplay](https://github.com/scikit-learn/scikit-learn/pull/31693) - Follow-up PR from updating contributing documentation on [pre-commit instructions](https://github.com/scikit-learn/scikit-learn/pull/32664) ### Discussions - [name=Guillaume] Is it normal that I get notified that I got funded by thanks.dev? 
## 2025-10-27 ### Progress reports - [name=Stefanie] (on vacation) - finished work on PR [ENH Add zero division handling to cohen_kappa_score](https://github.com/scikit-learn/scikit-learn/pull/31172); happy to get this into 1.8 as planned (reviewed by Adrin, Jérémie and Virgil; needs final review) - PR [FIX classification metrics raise on empty input](https://github.com/scikit-learn/scikit-learn/pull/32549) (supersedes [#31187](https://github.com/scikit-learn/scikit-learn/pull/31187)) (needs reviews) - PR [MNT bump to Python 3.11 for pymin_conda_forge_openblas_min_dependencies](https://github.com/scikit-learn/scikit-learn/pull/32530) (merged with Loic) - took over PR [DOC: Clarify recommended usage of fit_transform() vs fit().transform() in TargetEncoder](https://github.com/scikit-learn/scikit-learn/pull/32347) (merged with Arturo) - from past weeks: - PR [DOC Simplify metadata routing example and add short example to docstrings](https://github.com/scikit-learn/scikit-learn/pull/32191) (reviewed by Antoine, awaiting second reviewer) - PR [CI Add "autoclose" workflow by label setting](https://github.com/scikit-learn/scikit-learn/pull/32504) ready for more reviews - reviews and other things: - reviewed [DOC Clearer linear "get your development environment" setup documentation](https://github.com/scikit-learn/scikit-learn/pull/32509) (looks nice, close to being merged) - reviewed [FIX Infer pos_label in Display method from_cv_results](https://github.com/scikit-learn/scikit-learn/pull/32372) to unplug blockage in Displays - unblocking reviews needed for some of Lucy's PRs, before Anne can start, especially [ENH add from_cv_results in PrecisionRecallDisplay (single Display)](https://github.com/scikit-learn/scikit-learn/pull/30508) - reviewed [MNT Add example dependencies to version bumping script](https://github.com/scikit-learn/scikit-learn/pull/32557) - reviewed [Add specific error message when users pass estimator class instead of instance to is_regressor() and 
co.](https://github.com/scikit-learn/scikit-learn/pull/32565) - found a few tasks for Anne - issue/discussion on bumping scipy-version for array api( [Array API test failure for probabilistic metrics with scipy==1.15.0](https://github.com/scikit-learn/scikit-learn/issues/32552)) - Birdaro training sessions on preparing collaborative work (no directly applicable learnings, but exchange) - [name=Antoine] - continued [FIX instability of gcv_mode="svd" in RidgeCV](https://github.com/scikit-learn/scikit-learn/pull/32506) - reviews - [FIX: Reduce bias of covariance.MinCovDet with consistency correction](https://github.com/scikit-learn/scikit-learn/pull/32117) - [FEA Add support for arbitrary metrics and informative initialization to MDS](https://github.com/scikit-learn/scikit-learn/pull/32229) - [name=Olivier] - many meetings at probabl..., including open source team priority and organization - array API reviews and merges! - followed-up on triaged PR from previous weeks (QDA, ...) - review of upstream fixes for CPython 3.14 free-threading in python-lz4 and impact on joblib and scikit-learn - lock file fixes and polars regression min reproducer - started to draft a skrub text embedding + PyTorch ridge classification on google colab: started with polars code: - [WIP] https://colab.research.google.com/drive/1S03Ry3726urs9I46iS4NowcVcD9V-3Oh?usp=sharing - WIP reviewing the MAE criterion PR: https://github.com/scikit-learn/scikit-learn/pull/32100 - [name=Anne] - refined [DOC Clearer linear "get your development environment" setup documentation](https://github.com/scikit-learn/scikit-learn/pull/32509) - joined conversation on [AI tools like Copilot Coding Agent don't know about / don't respect our Automated Contributions Policy](https://github.com/scikit-learn/scikit-learn/issues/31679#issuecomment-3450994191) and opened [DOC add paragraph on "AI usage disclosure" to Automated Contributions Policy and PR Template](https://github.com/scikit-learn/scikit-learn/pull/32566) - 
took over (probably AI generated) PR on [Add specific error message when users pass estimator class instead of instance to is_regressor() and co.](https://github.com/scikit-learn/scikit-learn/pull/32565) - [name=Guillaume] - Review PR from Dea regarding showing the number of features in the HTML representation - Reported a bug with `remainder="passthrough"` when displaying the HTML representation of a `ColumnTransformer` in a `Pipeline`: [issue](https://github.com/scikit-learn/scikit-learn/issues/32146#issuecomment-3450807154) - Review PR from Lucy regarding overwriting kwargs in `Display`: [PR](https://github.com/scikit-learn/scikit-learn/pull/32313) - Had a look at the list of PRs to take over regarding the displays - Many meetings as well - [name=Dea] - Worked on PR [ENH: Display the number of output features](https://github.com/scikit-learn/scikit-learn/pull/31937) - Spent some time with [Dark mode on documentation may not work as intended](https://github.com/scikit-learn/scikit-learn/issues/32354) - PR was merged [CI Remove python 3.10 wheels](https://github.com/scikit-learn/scikit-learn/pull/32522) - PR was merged [MNT Bump to Python 3.11 for remaining pymin CI builds](https://github.com/scikit-learn/scikit-learn/pull/32555) - [name=Arturo] - DOC Clarify decision trees complexity [#32583](https://github.com/scikit-learn/scikit-learn/pull/32583) ### Discussion points - [name=Olivier] start with focus task force this week or discuss it first at the monthly meeting? - First focus group (Antoine + Olivier): - tree-based model comparative investigation with xgboost/lightgbm/catboost on many class and thread-based parallelism problems. - [name=Arturo] Make make_classification draw from the same distribution regardless of n_samples [#32405](https://github.com/scikit-learn/scikit-learn/issues/32405)? - [name=Guillaume] Issue related to the HTML diagram regarding the feature count. - ... 
## 2025-10-13 ### Progress reports - [name=Olivier] - Addressed some pending reviews on `sample_weight` doc: - https://github.com/scikit-learn/scikit-learn/pull/30564 - Follow-up on `d2_log_loss_score` / `d2_brier_score` merges by registering named scorers and common tests + performance fix - https://github.com/scikit-learn/scikit-learn/pull/32356 - array API - reviews - explored testing against dpnp https://github.com/scikit-learn/scikit-learn/pull/32460 - mostly works on CPUs - probably need non-default driver to get Intel GPU working - Investigating build slow-down on the weekly lockfile update - Triaging + attending a 2-day Inria event this week - TODO: bluesky thread on estimating prediction intervals - [name=Arturo] - [DOC Clarify splitter criteria in Random Forest and Decision Tree #32416](https://github.com/scikit-learn/scikit-learn/pull/32416) - [Make make_classification draw from the same distribution regardless of n_samples #32405](https://github.com/scikit-learn/scikit-learn/issues/32405) - [DOC Expand description of random_state in make_classification #32406](https://github.com/scikit-learn/scikit-learn/pull/32406) - [name=Antoine] - reproducer [BUG RidgeCV with gcv_mode="svd" is unstable](https://github.com/scikit-learn/scikit-learn/issues/32459), originally observed in [FEAT Polynomial Chaos Expansions](https://github.com/scikit-learn/scikit-learn/pull/27842) - explore sample weight in SGD - [name=François] - explored 3 variations to manage the callback context: [private fit function](https://github.com/jeremiedbb/scikit-learn/pull/18), [decorator around fit](https://github.com/jeremiedbb/scikit-learn/pull/19), and [dynamically wrap the fit through a mixin](https://github.com/jeremiedbb/scikit-learn/pull/20). A meeting will decide between these 3 strategies. 
- working on tests to handle an estimator that does not support callbacks as a child of a meta-estimator that does and vice versa - Various PRs for deprecation clean-ups for 1.8: - [Deprecation of response_method=None in make_scorer](https://github.com/scikit-learn/scikit-learn/pull/32457) - [Deprecation of copy_X in TheilSenRegressor](https://github.com/scikit-learn/scikit-learn/pull/32456) - [Remove the positional arg deprecation warning for groups param in RFE](https://github.com/scikit-learn/scikit-learn/pull/32454) - [Rename force_all_finite to ensure_all_finite](https://github.com/scikit-learn/scikit-learn/pull/32452) - [name=Loïc] - scikit-learn - contributing doc restructuring - easiest thing seems to be a linear "get your dev environment" setup (I find [matplotlib's one](https://matplotlib.org/devdocs/devel/development_setup.html) has the right amount of details) https://github.com/scikit-learn/scikit-learn/issues/32475 - for more unstructured WIP thoughts look at https://hackmd.io/X8T3WmGBRj6g4DTUxhEaoA - fix weird build issues (modifying a .pxd would not rebuild the associated files; the work-around was to rebuild from scratch) https://github.com/scikit-learn/scikit-learn/pull/32420. Planning to open a meson issue longer-term because we ended up there because of our work-around for overly long generated paths on Windows. - opened a tracking issue (with raw brain dump on some aspects) for Azure -> GHA migration. Thomas S may be motivated to help. - check changelog has link to changelog instructions in the build log: https://github.com/scikit-learn/scikit-learn/pull/32464 - use absolute imports in Cython code meta-issue done: https://github.com/scikit-learn/scikit-learn/issues/32315. - cython-lint used for linting (PR by MarcoGorelli, tweaks and mistakes by me) - macOS arm64 CI on GHA (array API with PyTorch mps backend): https://github.com/scikit-learn/scikit-learn/pull/32349 - I commented on the ppc64le (IBM-specific) wheels and Adrin closed it. 
Main reason: numpy is not doing it so we won't do it either. - scipy - 1.16.2 hang on macOS Intel debugged further until an OpenBLAS C reproducer: https://github.com/scipy/scipy/issues/23686#issuecomment-3381958611 - [name=Jérémie] - reviewed PRs in preparation for 1.8 - finalized https://github.com/scikit-learn/scikit-learn/pull/30134 to make public a function to compute the confusion matrix terms at different thresholds. - Made comments in Lucy's PR which does the same for any metric to have common code. - testing different alternatives with François with the callbacks to not depend on public vs private fit in sklearn. - [name=Stefanie] - worked with Emily and Loic on displaying a link to the changelog instructions where contributors can find it - [CI Add link to changelog instructions when check-changelog fails](https://github.com/scikit-learn/scikit-learn/pull/32464) merged, alternative PR [CI Add link to changelog instructions](https://github.com/scikit-learn/scikit-learn/pull/31954) closed - working on CI action on setting autoclose label in [test repo](https://github.com/StefanieSenger/GitHub-Actions-test-repo/blob/main/.github/workflows/autoclose.yml) and discussed with Loic - issue [RFC: Proposal for autoclose option for non-compliant PRs](https://github.com/scikit-learn/scikit-learn/issues/32207) - please get involved with your suggestions - looking through [Displays and Visualisation project board](https://github.com/orgs/scikit-learn/projects/10/views/2) for getting an overview - reviewed [DOC: Clarify recommended usage of fit_transform() vs fit().transform() in TargetEncoder](https://github.com/scikit-learn/scikit-learn/pull/32347) (needs a second reviewer) - [name=Anne] - preparing a first draft for restructuring the contributing docs, starting with [linear description for setting up development environment](https://github.com/scikit-learn/scikit-learn/issues/32475) - [name=Gael] - skrub: adding an optional connection to optuna for 
hyper-parameter search on the DataOps - this week: P16 days: meeting on broader funding of open source in French academia. Projects present: MAPIE, AEON, tslearn, skrub.... - [name=Dea] - Worked on PR [ENH: Display the number of output features](https://github.com/scikit-learn/scikit-learn/pull/31937) - Opened issue about [Dark mode on documentation may not work as intended](https://github.com/scikit-learn/scikit-learn/issues/32354) - Tried to debug part of previous issue [https://github.com/scikit-learn/scikit-learn/pull/32458](https://github.com/scikit-learn/scikit-learn/pull/32458) - Commented on [MAINT add jupyter extension and pre-commit in devcontainer](https://github.com/scikit-learn/scikit-learn/pull/32342) - Commented on [FIX Guess theme based on estimator parent node color](https://github.com/scikit-learn/scikit-learn/pull/32477) - Commented on [DOC Clearer linear "get your development environment" setup documentation](https://github.com/scikit-learn/scikit-learn/issues/32475) ### Discussion points - Loïc: switching between sparse array and sparse matrix with scikit-learn config. What's the deprecation strategy on our side? https://github.com/scikit-learn/scikit-learn/pull/31177 - Olivier: `RidgeCV` bug: any potentially fixable root cause? ## 2025-09-15 ### Progress reports - [name=Olivier] - New feature idea: frequency encoding option for `OrdinalEncoder`: https://github.com/scikit-learn/scikit-learn/issues/32161 (please express opinion) - Got several unrelated pieces of feedback from users wanting to use ML surrogate models to approximate/accelerate and explain slow numerical simulation results. 
So I decided to get a bit more familiar with the sensitivity analysis literature and related open issues or PRs: - Explored the use of Sobol indices as an alternative to permutation importance (or SAGE): - https://github.com/scikit-learn/scikit-learn/issues/22453#issuecomment-3284608178 - PR on polynomial chaos expansion with analytical value of Sobol indices for that model - https://github.com/scikit-learn/scikit-learn/pull/27842 - Thinking about how to document the use of feature importance and the need for unbiased feature importance in RFs and co: - https://github.com/scikit-learn/scikit-learn/pull/31279 - good way to leverage RFECV (compared to MDI which is likely to fail pruning noisy high cardinality features) - more efficient than using Permutation Importance - [name=Gael] (not here, updating about skrub) - Reporting on skrub: we're mostly working on improving the documentation and having custom error messages that help users figure out what's wrong - [name=Arturo] - [DOC Rework Decision boundary of semi-supervised methods #32024](https://github.com/scikit-learn/scikit-learn/pull/32024) - [DOC Rework StackingRegressor example and add SuperLearner #32163](https://github.com/scikit-learn/scikit-learn/pull/32163) - Minor reviews - [name=Stefanie] - worked on PR [MNT Refactor and deprecate get_metadata_routing method in _MetadataRequester](https://github.com/scikit-learn/scikit-learn/pull/31695) after Adrin's and Antoine's reviews - reviewed - [FIX (SLEP6) descriptor shouldn't override method](https://github.com/scikit-learn/scikit-learn/pull/32111) - [DOC Rework Decision boundary of semi-supervised methods example](https://github.com/scikit-learn/scikit-learn/pull/32024) - [name=Jérémie du Boisberranger] - Finalized release 1.7.2 - freezes on conda-forge feedstock for Windows build - need to lower the CI timeout from 6h to maybe 2h - Working on Callbacks with François https://github.com/jeremiedbb/scikit-learn/pull/11 - Reviewed François's PRs to clean up 
deprecations for 1.8 - Reviewed Christian's PR to clean LR deprecation for 1.8 https://github.com/scikit-learn/scikit-learn/pull/32073#pullrequestreview-3211646078 - needs a second pair of eyes - Reviewed Guillaume's PR to improve HTML repr https://github.com/scikit-learn/scikit-learn/pull/31969 - need someone with better CSS knowledge to review if possible - reverted deprecation of public murmurhash3 - [name=Emily] - [Nystroem Array API compatibility PR](https://github.com/scikit-learn/scikit-learn/pull/29661) is ready for review. Certain utility functions are repeated from another function and I wonder if we can add them into the array API internal util file - [D2 Brier score User Guide doc rendering](https://github.com/scikit-learn/scikit-learn/issues/32174)... how to reproduce? (@Stefanie) - [name=Guillaume] - Mainly some HTML related PRs. - [name=Adrin] - mlflow <-> skops - some metadata routing work - blog post on AI contributions - couple of reviews - [name=Shruti] - Working on PR of [Deprecate use of probability=True in SVC and NuSVC](https://github.com/scikit-learn/scikit-learn/pull/32050) lots of tests using CustomSVC to adapt - Working on [SAG tests](https://github.com/scikit-learn/scikit-learn/pull/31675) PR, the weighted-regression-based convergence test is still not passing but weighted classifier tests are working - Finalising stricter gradient checks for Gaussian Processes [PR](https://github.com/scikit-learn/scikit-learn/pull/31543) - Opened [PR](https://github.com/scikit-learn/scikit-learn/pull/31888) raising ValueError for logistic regression with high values and liblinear - Review of LabelPropagation [PR](https://github.com/scikit-learn/scikit-learn/pull/31924) by dschult ### Discussion points - (Stefanie) What do you read for tech information? - (Stefanie) Re-open [Add links to examples from the docstrings and user guide](https://github.com/scikit-learn/scikit-learn/issues/30621) for sprint at PyData Paris? 
- (Adrin) Categorical kmeans-like clustering: https://github.com/scikit-learn/scikit-learn/issues/32115 - (Guillaume) Sprint Developer Summit Python in Copenhagen (Scientific Python) - (Adrin) Olivier's feature request on categorical encoding - (Olivier) conda-forge freeze: at build time or at test time? Would pytest's `faulthandler` help? - https://github.com/pytest-dev/pytest/pull/13679 (still not merged but 1 green review) - (Olivier) ping Charlie or Thomas for CSS reviews? - (Stefanie) reproduce Brier score section rendering issue - (Jérémie) to Olivier: François not in both invites?