Jannis Leidel
# Packaging Summit @ PyCon US 2026

**Date:** Friday, May 15, 2026, 1:45 PM - 5:45 PM
**Location:** Room 201A, Long Beach Convention and Entertainment Center
**Organizers:** Pradyun Gedam, C.A.M. Gerlach, Jannis Leidel

:::info
**How to use this document:** Everyone is welcome to take and add notes collaboratively. Please keep notes concise and attribute comments where possible.
:::

Speakers: Please add your slides to the talks below

## Schedule

**Also on:** https://us.pycon.org/2026/events/packaging-summit/ (might be outdated in case of ad-hoc changes)

| Time | Session |
|------|---------|
| 1:45 PM | Welcome, introductions |
| 1:55 PM | Revisiting Wheel 2.0 and Better Compression (Emma Smith) |
| 2:25 PM | Limiting vectors and incentives for abuse (Mike Fiedler) |
| 2:55 PM | Encouraging the community to view Packaging in terms of ecosystems, not tools (Mahe Iram Khan) |
| 3:25 PM | Break |
| 3:40 PM | Lightning talks |
| 4:10 PM | Roundtable discussions |
| 5:35 PM | Wrap-up |
| 5:45 PM | Summit ends |

---

## Revisiting Wheel 2.0 and Better Compression

*Emma Smith*

Slides: https://docs.google.com/presentation/d/1zh-3FkCg2cSMp3QD5oFji5sebeJlu3Hg_d-maYe7Pno/edit?usp=sharing

**Summary:**

**Notes:**

- Motivation
  - Want to make a lot of changes to wheel like better compression, better metadata format, etc.
  - However, making changes to wheel is hard
  - Wheel 2.0 is how we want to do that
- Emma wrote [PEP 777](https://hackmd.io/k0C-RAIVRnu-YXoDFgABvA?edit)
  - Motivation: If a new wheel release comes out, don't want to break everyone's CI
  - Solution: Add a new metadata entry for the wheel version and also change the wheel filename so existing installers don't install it
  - Issues
    - Silently stops updates on old installers
    - People felt the existing wheel schema was sufficient
    - Cannot handle local installation of wheels (`pip install ./wheel_file.whl`)
- New plan
  - Talking with Donald, we can't migrate the whole ecosystem coherently, too fragmented
  - Any approach to do so has a lot of downsides
  - Solution
    - Revision to PEP 777 to make more minimal changes rather than changing the wheel as much as possible
    - Wheels stay ZIP files, use existing metadata format
    - Only hard requirement with the existing schema is dist-info needs to contain wheel version and be compressed in the wheel
    - Left up to indices to decide when to switch
    - All future changes will be proposed individually as sub-PEPs. Each sub-PEP is individually motivated.
    - Action plan: PEP 777 and later sub-PEPs will be provisionally accepted, then the whole thing Accepted or Rejected
- Zstandard compression
  - Analyzed top 1000 projects on PyPI
  - Zstandard gets 25% smaller wheels
  - Saves 100 PB bandwidth for just the top 1000 PyPI projects
    - and 36 years decompression time
  - Would be nice!
    - Comment: (do zstd for only tensorflow and you'd get a lot :-)
  - Can't just happen immediately though
    - Big issue is tools (like pip) don't like taking C dependencies
    - Solved this by [adding](https://read.gov/aesop/003.html) Zstandard compression to Python stdlib
  - Inner archive `tar.zst` stored in ZIP but not compressed
  - Wheel metadata file contains a new key specifying the format
  - Could introduce xz, etc., though zstd seems good enough for now
    - Can be proposed in other PEPs
- Open Questions
  - Wheel 2.0
    - Should the PEP give guidance on adoption of Wheel 2.0?
    - How long should sub-PEPs be introduced/discussed?
      - Can take a year or more
  - Zstd
    - How to handle `data.tar.zst` with RECORD?
    - Should we allow other compression formats?
      - Emma leans no
    - Can/Should we use Zstandard dictionaries for better compression of Python files?
      - You train it on existing data that you know (e.g. Python) and would increase compression for small wheels

**Discussion:**

- So ZIP can immediately support zstd, why not just allow ZIPs with other compression methods?
  - Emma: Thought about it, a few issues with this. Per-file compression is worse than compressing all data files together
  - Emma: If we allow per-file zstd compression in the ZIP, there are issues with Python compatibility, e.g. using an older interpreter to install into an env for a new interpreter
  - Emma: For simplicity's sake, and compatibility, best to do an archive in an archive
- Potential savings (zstd) are amazing. You alluded to the conversation topic of how long to discuss, but even after provisional approval it can take up to a decade to be implemented. What timeline do you feel is appropriate?
  - Emma: We definitely will want a decision on that going into things, somewhere within a year or two from initial PEP writing to provisional approval
- Compression and side compression: Have had a slew of compression differential attacks that we've had to mitigate. What are the security implications of putting a TAR inside a ZIP?
  - Emma: Haven't given it a ton of thought; think we could analyze the files in a particular way, write the expected extracted size in the Zstandard header, and compare it with the actual size
- Currently some installers like uv have an optimization where they will store ZIP entries as stored rather than deflated so they can be read directly. How do you expect that to work with this new proposed format?
  - Emma: Metadata stays exactly as it is today; additional files require extracting from a TAR archive
- What do you think about inventing a new archive format?
  - Emma: Brought up during the original Wheel 2.0 discussion. Without strong motivation we can't change the format, and to be compatible with existing installers we also can't change the format. Could modify the header, but ZIP readers can be finicky
- Thought about restricting how much of the ZIP/zstd/TAR standards tools can use? Specify more tightly how the ZIP file, etc. looks? As part of introducing Wheel 2.0, be more conservative in what you produce and use a small subset of the ZIP format?
  - Emma: Yes, it is fine to be stronger in requirements, just need to be compatible with existing installers
- Also on the subject of sub-PEPs, is there a minimum delay after introducing something before it can be shipped?
  - Emma: It's a maximum window
- Other compression formats: Other ways of compressing, PAX/TAR or something instead of TAR?
  - Emma: Something instead of TAR; the key should be generic so if we want to change from TAR in the future we can do that
    - In the example I gave I said zstd but it should be zstd TAR
- On the topic of RECORD, would that be required to be in the ZIP, the TAR or some concrete split?
  - Emma: Question was whether you include the TAR-Zstd in RECORD or not? The TAR-Zstd could change depending on the metadata set in the TAR itself
- Stefano Rivera: We spoke about reasons that TAR inside ZIP is good, but what about the downsides? We cannot import wheels directly any more (without extracting anything with zipimport).
  - Emma: zipimport already doesn't work with things like C libraries and such, so it would still work with pure Python, but it's fine to generate without
- Like metadata at the root level because there is a lot of stuff that likes to read that out of the ZIP. What's your plan for parallel adoption of publishing both new and old wheels?
  - Emma: In the original PEP, the idea was to have 1.0 and 2.0 side by side, but we defer that for now in this PEP; it will be considered in the PEP that decides whether to change the name and to what
- Pradyun: When repacking wheels was mentioned, everyone with PyPI shook their heads no: security concerns
- Daniel Holth: Metadata is not compressed right now, including all filenames inside the ZIP (this is how ZIP works)
  - For conda packages we wrote a very efficient way to stream the contents out without a temporary file. Streaming is a reason to prefer `data.tar.zst` over `data.zip.zstd` (you would have to decompress an entire `data.zip.zstd` before you could do anything with it, but the `.tar` would stream out in the same way as its zstd container)
  - conda-package-streaming "[write tar directly into zip](https://github.com/conda/conda-package-streaming/blob/main/conda_package_streaming/create.py#L155)" code is clever. [Getting them out again](https://github.com/conda/conda-package-streaming/blob/main/conda_package_streaming/package_streaming.py#L147-L149) works by getting a file-like object from ZipFile without using a temporary file for the inner archive.
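
A minimal sketch of the "archive in an archive" layout and the streaming read described above, using only the standard library. It is an illustration under stated assumptions, not the PEP 777 format or the conda-package-streaming implementation: gzip stands in for Zstandard so the example runs on interpreters without stdlib zstd support, and the member names (`data.tar.gz`, `demo-1.0.dist-info/METADATA`) and both function names are hypothetical. Unlike the conda code linked above, the writer here buffers the inner archive in memory for brevity.

```python
import io
import tarfile
import zipfile

# Hypothetical name for the inner archive; gzip is a stand-in for zstd.
INNER_NAME = "data.tar.gz"


def build_nested_wheel(files: dict[str, bytes]) -> bytes:
    """Pack `files` into one compressed tar, then store that tar in a ZIP."""
    inner = io.BytesIO()
    with tarfile.open(fileobj=inner, mode="w:gz") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))

    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w") as zf:
        # The inner archive is already compressed, so the ZIP entry is
        # stored without a second layer of compression.
        zf.writestr(INNER_NAME, inner.getvalue(), compress_type=zipfile.ZIP_STORED)
        # Metadata stays a regular top-level ZIP member (illustrative content).
        zf.writestr(
            "demo-1.0.dist-info/METADATA",
            "Name: demo\nVersion: 1.0\n",
            compress_type=zipfile.ZIP_DEFLATED,
        )
    return outer.getvalue()


def stream_inner_files(wheel_bytes: bytes) -> dict[str, bytes]:
    """Read the inner tar through ZipFile.open(), with no temporary file."""
    out: dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as zf:
        with zf.open(INNER_NAME) as member:
            # "r|gz" is tarfile's forward-only stream mode: it only needs
            # read(), so the ZIP member is consumed as it is decompressed.
            with tarfile.open(fileobj=member, mode="r|gz") as tar:
                for info in tar:
                    if info.isfile():
                        out[info.name] = tar.extractfile(info).read()
    return out
```

Because the tar is read in stream mode, each member has to be consumed before moving on to the next one, which is the same constraint a streaming installer would face.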
- Pradyun: Could use the existing data directory name (to allow two wheels to be unpacked in the same directory with unlikely overlap, a current property of the format)
- Won't talk about symlinks, but right now when pip unpacks a ZIP file we only listen to the X bit and throw out every other permission. Something to think about if converting between the two formats (without losing information: ensure both formats can represent exactly the same information)
  - Emma: Absolutely, also wrote a PEP about symlink support but the new version of that will be focused more on Linux
- TARs have some support for symlinks. Do we want to think about adding support, and how do we ensure files don't escape the env?
  - Emma: Installers already have to think about escaping the env. Safest thing is to disallow symlinks; don't want to open a can of worms, but can consider
- Not familiar with zstd metadata, but in the interest of reproducibility, the same zstd version and compression level should give you the same bit-for-bit result. Is that stored in the metadata or in the file, which is hard to do since you've already compressed it?
  - Emma: There is a hash at the end of the format, but you can disable it and maybe you could include it in the RECORD instead
  - @dholth thinks reproducible builds are out-of-band for producing bit-identical wheels on two machines
- Do you envision this being a 5-year transition period where you can mandate it after 5 or 6 years for PyPI?
  - Emma: Great thing about this version of the PEP is that I don't have to make that call, but I do think 5 years is reasonable. Think there will be some early adopters and some folks will wait a long time, but that can be a decision made by indices
  - Comment: Our five-year mission, to boldly compress packages where no one has compressed them before
- Think we really shouldn't bother with dictionaries, not that much benefit
  - Emma: Yeah, I can see that
- @dholth notes the top 10?
  very few biggest wheels, esp. tensorflow, produce a huge portion of PyPI download bandwidth

---

## Limiting vectors and incentives for abuse

*Mike Fiedler*

Slides: [Link](https://www.dropbox.com/scl/fi/kg4fr4gc1t9c27sbbqvig/PyCon-US-2026-Packaging-Summit-Mike-Fiedler.pdf?rlkey=4u3v4n201ja82ugkdydf0jy6f&dl=0)

**Summary:**

Problem: PyPI is targeted for a variety of abuse patterns today. There are many things we could do; let's talk about some of the abuses, how we might work on prevention, and what makes sense to most folks in the room.

Decisions to make: Should we limit deletions, allow deletions, enforce deletions? Should we "seal" a release after some time? Should we reject compiled code with no source distribution? Probably more ideas here.

**Notes:**

- Work as a PyPI safety and security engineer, spend my day sorting through a lot of mess
- Here to start the conversation around some ideas to turn into implementations eventually
- PyPI is hockey-stick scaling dramatically, way faster than Mike's bandwidth
  - Something happened where people use the most popular language in the world to generate code in the most popular language in the world, then publish packages in the most popular language in the world, which drives people to attack the most popular language in the world
  - It's hard to react to the volume at the speed attacks and malicious packages are coming in
  - Used to get 8000 new packages a month, which was manageable with lots of security experts monitoring the new packages feed
  - Volume of 24,000 new packages a month is hammering us all
- New uploader accounts are very cheap, and once you have one you can exploit it
- Seeing more automation, more scripting; everyone wants more APIs, which accelerates publishing vulnerable packages faster than we can deal with them
- Trusted Publishing is great and has helped somewhat but has not kept pace with the number of increased package uploads
  - Keep plugging it!
- More rate limiting: users used to be able to publish 40 new projects an hour and dump 10 GB, now 4 new projects a day
- New lifecycle status to mark projects as archived; doesn't do much yet but serves as a signal
- On the admin side, the prohibited names list keeps growing. Right now there is no visibility into that; need to balance transparency with security
- Pending publishers: Can't just sit on names for more than 30 days, use it or lose it
- Want to talk about 3 things:
  - Deletion of persistent state
  - Open-ended releases
    - Who's using it, why are you using it
    - Open PR that could address this
  - Binary storage abuse
    - PyPI is an open CDN for the world now
  - Other one I don't want to talk about: namespaces, could talk about for hours
- State problem
  - PyPI is a warehouse; people shove stuff in the warehouse and it stays there forever, e.g. people pushing their homework
  - Deletions aren't really deletions: they live in cold storage, can be accessed by URL and live forever
  - PyPI currently has 36 TB of live storage and about the same amount unreferenced (cold storage)
  - People upload their MP3 collections to PyPI (yes they do)
  - Someone wrote a PyPI project to use it as Dropbox, basically
  - We delete it but the deletions are still there
  - Are we okay with deleting things forever?
    - I am
    - But should we?
    - If you've abused the system, should we actually delete it?
    - Should we have a middle ground where it is not user accessible, but is a secret side room that will slowly grow over time?
  - Some people accidentally upload proprietary code or secrets, and complain
    - Even if PyPI doesn't have it, mirrors still will have it
    - Shouldn't upload it in the first place
- Other thing: Tokens.
  People provision API tokens and never use them
  - Saw some hands up, thumbs up
  - I think we should delete them after notifying the user
  - Maybe also prompt users to use trusted publishers, and prompt those that do to delete their tokens
  - Question about that: Should we start to attack that problem by giving people a really easy button to delete their API tokens at the same time as setting up trusted publishing?
    - Mike: We definitely could. I'm not a UX designer, but it could be a bit icky asking users to delete something
    - Do have a background task today that reminds users to delete API tokens if they have trusted publishing
- Have over 1 million user accounts on PyPI. Part of them are spawned by people who can only do 4 uploads, so they create 20 accounts and dump 20 GB of files since there is no limit on that
- Pinning: If you pin by hash, you will always get the same file, but when pinning just by version someone can upload a malicious file to the same version with a post release, etc.
  - Maybe we should lock releases after e.g. 7 days and suggest users release post releases instead
  - Henry: One issue with post releases, though, is that if you use `==` and pin that release it will not get the post release. Had to delete non-post-release wheels due to PyPI limits and it broke a lot of people
    - Mike: Ouch, didn't know about that one.
      This is an installer semantic; do we force installers to behave better or leave this vector open?
  - I don't mind uploading things after the fact if there is a delay of a day between when the release is available to the public versus to malware scanners
    - Mike: We don't have that right now; there is a PEP out there that will make that possible, and I like that idea
    - "I could also pay you a dollar"
      - (Widespread laughter and some approval)
      - I will take your money, but not saying that is how the FOSS community should operate
  - Just want to clarify that this is a version spec semantic, not an installer semantic, so we would have to change that specification
  - With regard to install time, could pin to upload time, which cannot be spoofed as long as the index is reliable
    - Mike: Didn't even know that existed, new installer feature
  - Another thing: MarkupSafe built a wheel for a new version, but that does change behavior for users
  - Emma: I've also run into the post release issue. It is the same issue if we allow `==` to pull post releases, because you're still pulling a different file. So if we do decide not to allow new uploads, it must be the same thing that was uploaded in the original release. PEP 694 staged uploads would help a lot with this
  - Question: Do want to chime in on the subject of losing the ability to publish e.g. 3.15 wheels to existing releases; the ecosystem loses something significant if you can't backfill version compat
    - Mike: Can I offer you a middle ground? Right now uploading out of band works by default with your token; would be great if it could be opt-in temporarily with additional security checks
  - Hopping back on `==1.0.0`, it does technically allow local versions....
    - Everyone: Get out
  - Emma: If you want something new, then requiring users to update for that is reasonable, so if people want to upload a post1 supporting 3.15 I think that's an acceptable cost
  - Think we need to revisit the assumption that releases are open-ended; think it is acceptable to require users to bump to the next version
- Should PyPI reject uninstallable projects whose deps don't exist on PyPI? Why should PyPI host things that other users cannot install?
  - What does it mean that other people can't install it?
    - Mike: If my package foo depends on package bar, which does not exist on PyPI, how can anyone else install package foo?
    - Malware authors are pre-staging packages to handle their future attacks
  - Emma: You may upload foo before bar, and if that fails that would be unfortunate. If someone does this we should notify them at least, because this could prevent dependency confusion attacks
  - There is an idea of a coordinated suite of releases that we could potentially tie into PEP 694, like staging but for multiple packages, to address that need
  - Are circular dependencies a thing? That might disable some functionality that works because eventually everything becomes consistent? Also could produce a large amount of support requests when corporate indices are mixed with PyPI
- Are deletions okay?
  - Can you wait a few days and then delete? Eventual deletion seems a reasonable contract, just not immediately
  - Bernat: Would be good to have a way to mark things as to be deleted and then actually delete them 1 to 6 months later?
  - Mike: We are bound by law to not host certain things, e.g. people have uploaded terabytes of copyrighted e-books and we can't evaluate every one of those cases manually
  - Need to decide how we delete without breaking the world, in a way that also doesn't introduce security vulns like downloading bad files without checking if they actually exist
  - Do we do it with a mark and sweep?
    That's great but doesn't work for copyright and other illegal material.
  - Do we allow users to delete? How should this intersect with the quota?
  - We've seen attackers upload and delete immediately and then use the trusted domain to get people to install it anyway
- Question: What about setting tokens for default expiry?
  - For new API surfaces they will have default expiry. For existing tokens, we can add default expiry, but for folks who have never used theirs, it seems like we can just delete them immediately
  - But if we expire existing tokens, it will break existing use cases seemingly at random
- PyPI is turtles all the way down: we have a Rust binary that installs some Go code inside a Python package, people use PyPI to distribute Zig. Is that PyPI's intended use? I don't think so
  - Have piles of Java jars
  - Have node modules directories
  - This is the reason we have multiple terabytes. Is this a good thing? It's not helping the Python ecosystem, it's slowly harming it
- Lots of questions but no time now, can address them later during roundtable

**Discussion:**

-

---

## Encouraging the community to view Packaging in terms of ecosystems, not tools

*Mahe Iram Khan*

Slides: [TODO]

**Summary:**

**Notes:**

- SWE based in Berlin, been a maintainer of Conda for almost 4 years now
- A few questions I get asked a lot
  - What is the difference between Conda and Pip?
  - When should I use one or the other?
  - Can I mix them?
  - Should I use either of them?
- To answer these questions properly and with confidence, I had to learn about these projects myself
  - Read lots of blogs, watched conference talks, talked to old people
  - All that research led to a talk I gave last month at PyCon Germany called: how to mix conda and pip without causing environmental damage
- While working on that talk, the main idea that stood out was that Conda and Pip are two different ecosystems rather than just two tools
- To understand this, have to look at the history of Python packaging
  - Originally: distutils, bare bones and tightly coupled with the CPython release cycle as part of the Python stdlib
    - This resulted in it not evolving at the pace of Python packaging
  - This resulted in Setuptools, which included easy_install, the first real Python package manager
    - Some dep management, incomplete
    - Could not uninstall stuff
  - 2008, pip and virtualenv came out
    - Could uninstall stuff
    - Better error messages, better UX
    - Solved most problems at that time
  - But there was one community whose needs were not met by that: the scientific Python community
    - Pip could not handle binary deps, users had to handle them themselves
  - Conda born in 2012 as a language-agnostic package manager
    - Was both package and env manager in one
    - Conda emerged from the needs of the scientific community
    - Not trying to be a better pip, solved an entirely different problem
- Therefore, helpful to think of them as parallel ecosystems that evolved under different constraints rather than competing tools
  - As ecosystems develop, they develop their own conventions, infra, community
  - Tools like uv and poetry fall into the pip ecosystem
  - Tools like mamba and pixi fall into the conda ecosystem
  - This creates confusion when trying to compare them
- Most experts here already understand this, but users tend to experience packaging at the level of tools and don't understand
- Not a perfect model or a concrete theory, lines are blurry, but just my personal attempt at sense making
- Going to have a packaging council soon
  - Need to
    think about what steps we need to take to help packaging make sense to users
  - Lots of confusion and frustration at the user level and it doesn't have to be like that
-

**Discussion:**

- Bernat: Wasn't totally clear to me what you're proposing here. Last year we had someone proposing a dedicated website to solve this problem; what are you proposing that's different? Don't want this to be an [xkcd 927](https://xkcd.com/927/) situation
  - Mahe: The fact that people still don't
  - Perhaps thinking of https://pypackaging-native.github.io/
  - Or also [Scientific Python packaging guide](https://learn.scientific-python.org/development/tutorials/packaging/)
  - Or [PyOpenSci packaging guide](https://www.pyopensci.org/python-package-guide/index.html)
- Deb: Where do we as a community send newcomers to a vendor-neutral and official-feeling place? Would love to hire people to work on that.
- Scientific community member who felt like a 2nd class citizen at the language level: we've made a lot of progress with the new packaging council, but there are multiple packaging ecosystems, pip and conda but also Linux, robotics (pixi), etc., and there are reasons there are multiple ecosystems. The official message on the Python website pushes toward pip, which is good for many things but not everything, and everything else comes across as not official due to history, etc. So, as we said, making all the ecosystems have a place in official Python land is a key thing going forward
- I work on a project called Ecosystems. From a technical point of view, treat them as separate ecosystems: got Debian and all other distros, got Nix, Homebrew, many places Python packages get repackaged. Each will have different semantics and definitions of the metadata; they don't just call through to the builder but throw it away a lot of the time or, like Nix, wrap it up and script them all out across package managers.
  All open source, so can't stop people from repackaging things but can try to make things work better. Different versions of metadata formats confuse people, as do different versions of packages between them
  - The other reason this shows up is package managers like Spack, which have the compilers as part of the dependency tree. When there is a security advisory published on the PyPI package, that doesn't propagate to these other ecosystems, so folks maintaining them don't always detect them. Being able to translate and have mappings between ecosystems is very valuable for shared communication about what's happening with those projects, security advisories, etc.
  - Don't have any particular solutions (I am a consultant and available to suggest some (laughter)) but think there are things the Python community can do, and it can also blaze a trail ahead to create standards for other ecosystems to follow as well. With the new packaging council and the documentation around pip and PyPI as people look at those, the PEPs have been very influential across ecosystems, so we should do more of that
- Want to point out one interesting thing about Conda and PyPI: they are the only two ecosystems in all of software that support relocatable binaries that can run anywhere on the file system that you install them. Some ecosystems support relocatable source. Some of the pressure that goes on PyPI is that it supports that, unlike most other ecosystems. Conda may not exist now if PyPI had supported back then what it does now.
  One possible differentiator is that Conda intentionally supports other languages whereas PyPI isn't so interested in being the R package manager
  - Comment: This is a complete misunderstanding of the scientific community. The scientific community has to understand and use the other languages, as Python is just the glue over Fortran and C++
    - Response: Think there are a whole bunch of scientific users who rely on packages that do at some point use other languages but don't write those languages themselves
    - Response: Work at a company that deploys numpy, pytorch and tensorflow at big companies, and every other month something breaks with pip, whereas Conda is designed for that
    - See the blog post by Ralf Gommers about native binaries and packaging issues: an intersection of source code, community and technical issues that combine in pip, PyPI and wheels. Get told over and over again that wheels are the answer; could do all this in wheels but it would be a lot to get this to work
    - Response: I think the wheel ecosystem is open to solving these problems, I don't think they are solved now
    - Response: Your first statement was that you don't want to support multiple languages, which rules that out
      - Fair point
      - Deb: Think he was speaking for the PSF, not everyone in the room
- Pretty sure most people are aware of wheel next. Ralf's website pypackaging-native is the bible of how to get binary packages to work in wheels
  - One issue with wheels is if you ship compiled code: so many packages are compiled for Armv7, which is RPi [Only RPi]. With ARM7.1 with vectorized atomics you get a 50x improvement on vector instructions, so it is very critical to use wheel variants to handle things like this
  - We as a community need to find a way to make this happen
- I think there are almost 2 orthogonal issues I'm hearing discussed
  - 1 is support for different non-Python languages, with all the requirements and problems
  - 2 is multiple different ecosystems, e.g.
if I have a pure Python package and it gets repackaged into the AUR by someone else, that is a pure ecosystem problem
- Security element as well: if I fix something, it needs to be republished by someone else in order to be effective there
- What is the story for me as a publisher? If I'm creating a pure Python CLI tool, what is my experience as an author and publisher?
- Bundle it up as best I can, ship it to PyPI as an sdist and wheel, and I'm done
- Then it shows up on Homebrew 3 months later and I'm like, oh, that's interesting
- What is your on-ramp as a dev and maintainer to getting involved in the various places your work is published?
- Some of it is specialized; I don't know much about Debian packaging, for example, but I can learn
- Pradyun: These are two different ecosystems, but the user views themself as being in both
- Hearing there is a push and pull about whether we make this clear enough to users
- Tooling ecosystems, distribution ecosystems, user and community ecosystems. Feel the work being done on the wheel ecosystem is important; think the conda and pixi communities can also contribute some of their learnings to this, and there are a few people in the room who are pretty familiar with that. Conversion is a good thing, but we can't convert everything
- Wheel variants are a good way to bridge the pip world and the conda world; think they will make a difference here
- Jannis: Very excited to build bridges between ecosystems. The WheelNext proposal showed that it is possible to build a better world for users. The recent split of the PEPs related to variant builds was a big improvement. Will share this with Jonathan to try to make this happen
- Call to action: Help try to solve this together

---

## Lightning talks

### Quick update on PEP 772, Packaging Council - Barry Warsaw

- PEP 772, the packaging council PEP, has been fully approved (cheers!)
- Want to thank everyone who contributed the details over the last year; Deb and Pradyun have been fantastic
- Want to go over the next steps over the coming weeks and months
- Want to align the council election with the PSF board election in the fall
- Deb: Want to mirror both the voter verification process and the elections
- The election is the part you see on the public site, but we do a lot of work beforehand to notify voters
- The packaging council part of this is a bit of extra fuss since we haven't done it before
- Plan is to work with Jacob, our infra lead, to help with that, and with Barry and Pradyun as community liaisons
- TL;DR: 3 phases in the process, each one 2 weeks long
    - PSF members (if you are not one, you should be; contributing members don't have to pay), except for basic members
    - Nominations: Have to be a PSF member to nominate or be nominated, with some carve-outs for who can't run (SC, PSF staff)
    - Election process
- 2 cohorts, A and B, up for election on an alternating 2-year cycle
- Deb: Help spread the word!
Not everyone interested is in the room today, so please forward links and emails to everyone who might be interested
- Barry: Hope everyone in this room will consider running, as we all bring something unique to the table

### Mobile packaging update - Malcolm Smith

Slide link: [TODO]

- In the last year we've got all the major pieces of the ecosystem in place
    - Building packages: we have Android and iOS support in cibuildwheel
    - Hosting: we've enabled support on PyPI with new wheel tags
    - Installing: we now have official support
- Usage side
    - Seen an uptake in development tools
    - This time last year it was just BeeWare and a few others
    - Now there is support in Kivy and others
- Current situation on PyPI: package support is still pretty low, but packages are slowly but steadily starting to add support due to user requests
- The BeeWare team will be around Monday and Tuesday for sprints, with an Open Space on Saturday. If you have a package you want to add mobile support to, or want to contribute support for another package, we can help you
- Want to get to a point where Python on mobile is just as easy as any other platform

### Compile-to-Flit solution to bootstrapping problem (@jaraco)

Slide link: [TODO]

- Jason Coombs, been in the ecosystem a long time, Setuptools maintainer and many other things
- Issue: Build backends can't have dependencies as they can't bootstrap themselves
    - In a pure source environment you can't build a build backend, e.g.
Setuptools, without dependencies
    - Setuptools works around it by vendoring
    - Flit works around it by not having dependencies
    - Hatch does `backend-path` manipulation
- Have a new strategy: Compile to Flit ([example](https://github.com/coherent-oss/encutils))
- For example, the Coherent build system has 15 dependencies, which would be a nightmare to vendor
- When you compile to Flit, it doesn't use the original source but builds for the sdist, generating a pyproject.toml that uses flit_core and concretizing metadata that wasn't in the original package
- Because Flit has no deps, it doesn't have the bootstrapping problem
- This makes the sdists tiny
- This does mean you have to build from the sdist, not straight from GitHub
- Planning on making all Setuptools deps use this technique so Setuptools can have deps again

### More Variants, More Diversity for AI Accelerators (joongi; @achimnol)

Slide link: [[SpeakerDeck](https://speakerdeck.com/achimnol/pycon-us-2026-packaging-summit-lt-more-variant-more-diversity-for-ai-accelerators)]

- Extending the use cases of Python packaging ideas
- PEP 817 wheel variants by the WheelNext working group
- Want to make `pip install` auto-detect a feature, e.g.
GPU hardware, and map that to specific binary builds
- The same analogy applies to `docker pull`
- If you run `docker pull` on an arm64 machine, it automatically chooses which Docker image to pull, but there is also a need to support additional hardware features
- The container specification supports architecture but not hardware features
- Kubernetes recently introduced a dynamic resource allocation feature, which has the expressive power to determine which vendor-specific properties users want to use
- But it is a config headache for users, as they need to do this all themselves
- Want to simplify the experience for users
- Design sketch:
    - Each individual compute node reports its properties to the scheduler
    - The scheduler matches those with the container image labels to check variant compatibility
    - Need to solve for scaling
- Many questions like this, for example how to make a Docker container registry support this kind of concept
- For example, it runs a `pip install` command, tracks the wheel variant properties, and can convert them automatically to container labels
- Think the current specification is complete enough for what I am looking at here

### [how we used the deconstructed pip stack](https://dholth.github.io/presentation-conda-pypi-internals/#/) (unearth, pypa/installer, pypa/build in [conda-pypi](https://github.com/conda/conda-pypi))

- Daniel Holth, I think most of you know my work
- Been writing something called conda-pypi that takes many of the ideas we've been working on over time
- Want to control all dependencies from one place when we have a mix of PyPI and conda
- Back in the day we had setup.py, then pyproject.toml, then Flit, then anything building a wheel, and finally now we can represent editable installs in wheels
- With conda-pypi we can build a wheel and then immediately turn it into a Conda package and install it, allowing us to integrate editable installs and wheels into Conda environments
- (Shows demo)
- Note: Unlike pip, conda runs in a different Python environment than the target environment used to
build, install

### Sharing malware scanning results of PyPI from multiple providers (joongi; @achimnol)

Slide link: [[SpeakerDeck](https://speakerdeck.com/achimnol/pycon-us-2026-packaging-summit-lt-sharing-malware-scanning-results-of-pypi-from-multiple-providers)]

- One of the pip maintainers
- Pip broke my package install back in the day
- uv shows that pip can be much faster in package resolution
- Full package dependency resolver library in Python
- It has a resolver library that is based on PubGrub but fixes some bugs that uv has as a result of the pubgrub-rs dependency
- Performance is comparable to uv when caches aren't involved
- End goal is to get this into pip and have other tools use it

### (sign up on-site) Solving all remaining packaging problems in 5 minutes or less

- Sharing malware scanning results of PyPI from multiple providers (joongi; @achimnol)
    - We are mirroring all of PyPI to scan it (and running out of disk)
    - Malware on PyPI is increasing at a crazy rate
    - There was a survey on the PyPI experience
    - Need to balance open source freedom against validation of trusted packages
    - Come to the sprint if you want to help out with this
- jaraco: [Coherent System Triumphs and Challenges](https://docs.google.com/document/d/1yfSqbCH2VHVy5iViBATzpwkze-tJQr3ONvMM89BiN3o/edit?usp=sharing)
    - https://github.com/coherent-oss/system
    - Goal is to make project maintenance sustainable; maintains over 100 packages
    - Automatic dep inference, inferred from imports in your project; can be overridden but mostly works by default
    - Automatic author and version inference etc.
to fill in the metadata, so you only need a lightweight pyproject.toml
    - Build system uses Flit to build itself, so it is very lightweight
    - Dep inference uses a MongoDB database built by downloading everything from PyPI and figuring out what imports each package exposes
- notatallshaw: Nab - a new pure Python dependency resolver [nab](https://github.com/notatallshaw/nab)
    - Announcing the new project
    - Goal is to be about as good as uv in terms of speed, as a resolver library
- dharhas: nebi - solving environment management for teams and collaborators [nebi](https://github.com/nebari-dev/nebi)
    - Work at OpenTeams; when not complaining about wheels, do some useful things
    - Want to talk about software environment management
    - Released a new tool called Nebi, nebi.nebari.dev
    - Does some things you want to do in orgs, where your environments might not be stored with your code
    - Lock files have all the concrete dependencies; spec files have only the abstract ones
    - We version them together so you can roll them back and forward together
    - Also has role-based access control over who can edit them and who can't
    - Built something called conda-store previously; Nebi is a reimagining of these areas on a modern packaging ecosystem
    - Can do things like publish lock files to OCI registries, which is what organizations use, not just publishing to Git, which is locked down
    - Long-term goal is reproducible environments for science

---

## Roundtable discussions

### Topic: yoda conditions in PEP 508 markers (does it need a PEP?)
- Blocking marker tree public API in packaging
- We only see that the grammar is underspecified, with no examples in the dependency specifiers spec or PEP 508
- Decided that we should open an editorial PR for the spec
    - Grammar: `marker_expr = env_var wsp* marker_op wsp* python_str`
    - Add failing example with value on LHS, could be in tests
    - Need to mention that this is how packaging always works
    - Provide the use case of wanting the public method
    - Mention the complexities of the yoda expression
- uv doesn't have `'3.13.*' == python_full_version` implemented, `~=` is also confusing
- Forces always comparing an env_var with a value, right now you could compare two env_vars

### Topic: Being significantly less conservative in Python

### Topic: PEP 803 -- `abi3t`: Stable ABI for free-threaded Python (Petr)

- Q. What will the wheel filename be?
    - A. `mypackage-0.1-cp315-abi3.abi3t-manylinux1_x86_64.manylinux_2_5_x86_64.whl`
- Q. Filename tags?
    - Linux / Mac:
        - `.abi3t.so`
        - If you want 2 separate exts, `.abi3t.so` & `.abi3.so`
        - For 3.13-3.14, `.so` is a workaround
    - Windows: all `.pyd`
- `pip` now supports installing `abi3t` wheels
- Installers need to accept `abi3t` tags for FT CPython where they accept `abi3` for GIL now
- Build tools should implement support whenever they want to give this feature to their users

### Topic: Wheel 2.0 and Zstandard compression

- geofft: I think you should be much more restrictive in what you allow as a valid ZIP or tar file. Some of the parser differentials have come from GNU tar and POSIX tar being independent specs and it not being obvious what happens if you use both extensions at the same time. In practice unpackers support both but no standard says what to do and also no standard bans it. We should pick one and ban the other and specify exactly which extensions we support. Ideally, specify the exact binary format of the outer ZIP container since we know it contains exactly two regular files with specific compression styles.
Don't let people get creative about using features of the two formats
- We will need ZIP64 for large wheels. Can you insist on ZIP64 for all wheels, or is that disallowed if the files are too small?
- Emma: You are the third person to bring this up
- People need to be able to easily generate wheels / I don't want people to be unable to generate a wheel because some popular language does not let them set the flag
- Any requirements that we make need to be compatible with extracting things in current zip/tar utilities
- This should be a sub-PEP of PEP 777 defining what the stronger requirements are. As part of that PEP, the authors need to go through all the tools and confirm that you can generate and extract.
- But it is an excellent idea.
- BTW: no hard links, no device files, no resource forks/NTFS streams, etc., no xattrs, no restricted filenames like CON
- Petr: Don't restrict restricted filenames because there is no list and it depends on the Windows version
- Files that differ only by case? (Canonicalization form?)
- Should we require that filenames are valid UTF-8? Bidi marks etc.?
- Again, what do other tools do?
- jjhelmus: Right now we have to validate a ZIP parser and a tar parser. Can you just compress a ZIP and put it inside a ZIP? (I assume meaning you store things in the inner ZIP uncompressed, and then .zst the whole thing)
- There are also things you don't want in tar, e.g., the ability for a later file to replace an earlier one
- No new formats, unless 7 programming languages implement it
- I (Emma) am still on the fence about inner ZIP and inner tar
- One useful thing would be to go through unwanted tar behaviors vs. ZIP behaviors
- Does cpio support multiple... no cpio!
- Nix archive (NAR) might be an option: [spec](https://nix.dev/manual/nix/2.22/protocols/nix-archive) and [unofficial spec](https://gist.github.com/jbeda/5c79d2b1434f0018d693)
- More inclined to adopt something in the Python stdlib and otherwise widely adopted (= tar or zip)
- Emma: Do you think we should specify the container of the inner file in the outer metadata (i.e., "tar-zstd" instead of "zstd")?
    - Yes, inexpensive and self-describing
    - If we don't, we define "zstd" as meaning .tar.zst and can come up with "zip-zstd" or whatever later
- Emma: zstd supports multiple streams and we would probably not want to allow that, since there are differences in how people deal with multiple streams.
- We could do things like require that the zstd data has a hash at the end
- jjhelmus: Does zstd have pre- and post-filters? No. (This is why XZ/LZMA are the best for executables: there is a filter that rewrites JMPs so they compress better)
- The seekable extension to zstd has been at 0.9 for the past few years. I (Emma) have considered this but don't think it's a good idea. It's also not implemented in the stdlib
- jjhelmus: Is there a concern that getting a list of all the files involves either trusting RECORD (which you don't) or decompressing the inner file?
    - Yes, no good options....
- How would PyPI scan the files in the wheel? We just assume that decompressing into memory is fast enough and easy enough
- Discussion around compressing metadata. Emma ran tests and found only small differences between zstd and DEFLATE
- How many existing wheels use DEFLATE? Are there any that do not?
- The stdlib used to default to a lower compression level
- We should just get build backends to use better options, which is a solvable problem
- geofft: What are the error messages when you see a 2.0 wheel?
- Emma: I believe pip says "filename is not a valid wheel" or something, which is reasonable
    - Slightly older pip: `ERROR: attrs has an invalid wheel, attrs's Wheel-Version (2.0) is not compatible with this version of pip`
    - Current pip: `ERROR: attrs's Wheel-Version (2.0) is not compatible with this version of pip`
        - changed as a side effect of https://github.com/pypa/pip/pull/12579
        - open issue but not addressed: https://github.com/pypa/pip/issues/12723
    - Current uv:
      ```
      error: Failed to install: attrs-26.1.0-py3-none-any.whl (attrs==26.1.0 (from file:///tmp/attrs-26.1.0-py3-none-any.whl))
        Caused by: The wheel is invalid: Unsupported wheel major version (expected 1, got 2)
      ```
- What's on the table for Wheel 2.0?
    - Right now just zstd and your proposal for hardening the format
    - Symlinks? Would like to, don't have bandwidth
    - We would need to validate... there is a parser in the stdlib, which is the right place for it, but it keeps getting CVEs
    - We could error out if the stdlib is too old, but there is hesitation about coupling packaging features to Python versions (or just active Python versions; for EOL versions it is maybe less of a problem)
    - What if we restrict symlinks to just the current directory
    - PEP 491?
- @dholth notes afterwards
    - I looked at the threads from 2019, 2020 and [wheel-greater-compression prototype](https://github.com/dholth/wgc). We had a data.zip-within-wheel idea that is almost identical to this one and I'm happy that the topic is being pushed forward. At the time I proposed putting all files in `RECORD`, even the filenames that are inside the `.data.zip`. Today I would use a data.tar or `wheel-filename.data.tar.zstd` because we stream the `zstd` decompression and `tar` unpacking together, instead of having to unpack an entire inner zip before we can extract its members. I would include the single `data.tar.zstd` and not its members in RECORD so that it follows the existing rules for wheel.
If the inner archive is STORED (zip method meaning 'not compressed') inside the zip, then we can read it out efficiently without extracting the inner archive to a temporary file; see conda-package-streaming techniques linked earlier in this document. Although an inner ZIP would allow us to preserve bitwise metadata from what would have been in the outer ZIP, we don't have to worry about it since wheel installers now honor the "execute" bit from ZIP and not much else.
    - As for later tar-format filenames overwriting earlier ones, this is also possible in ZIP, though maybe not writable with Python's zipfile; the installer easily deals with it by keeping a set() of seen files and erroring on a repeat.
    - I also observed in 2020 that all of this seemingly extra work of building an inner archive and zstd-compressing it takes less time than DEFLATE-compressing (standard zip compression) all the individual files in a regular wheel.

### Topic: Cross-platform environments and wheel building

* Android, iOS and Emscripten have all worked out solutions for building wheels on one platform which will be used on another.
* These solutions have many things in common, so there is some duplication between them.
* They involve using a .pth file which monkey patches sysconfig and other stdlib modules to simulate running on the target platform, working around the fact that most Python build systems have no concept of cross compilation.
* Even on more well-established platforms, cross compilation may be useful for less common architectures which have limited availability of CI systems, e.g. this was the case for several years during the introduction of Apple Silicon, and is still the case to some extent for Windows on ARM.
* CMake has a good cross-compilation story; the need is to drive that compilation from a Python environment.
cibuildwheel can do this for platforms it recognizes
* "Nasty hack version" is to build the binary with C++ tooling, then inject the `.so` into a wheel
* Better option is to get build backends (setuptools, meson, hatch, etc.) to recognize target platforms
* xgboost's difficult platforms are Apple Silicon (before CI systems provided machines) and Linux ARM
* Android wheel for xgboost is in the review queue

### Topic: Improving security metadata across ecosystems

- communicating compatibility of CVEs to all repackaged Python projects
- pypi, conda, distros, homebrew, nix etc
- malware metadata
- CVEs (uncovered vulnerabilities) are different from malware
- There was [PEP 804](https://peps.python.org/pep-0804/) where a package can declare its dependencies on system dependencies
- in conda, allow a recipe to declare the upstream packageurl
- CVEs often don't have a mapping to any packages, especially in NVD, OSV and GHSA
- inspecting the output of the build system
- https://advisories.ecosyste.ms/ecosystems/conda/aiohttp
- connecting the dots via source repository url
- when a conda package is patched
- change in build recipe (mostly not source code)
- a patched build changes but keeps the same version number
- pypi packages can have build numbers incremented, most tools ignore them
- immutable versions -> source remains the same
- reusing versions with build numbers incremented -> only build recipe changes
- cooldown periods
- problem: what if everybody uses it?
- the delay for discovering security issues may be prolonged
- https://slsa.dev/
- carrying over provenance in the dependency chain

### Topic: Dependency resolution (and Nab)

---

## Action items

* [what]
    * [who]
* Interested in delving deep into packaging compiled and multi-language projects, or helping others learn how? Sign up for [SIMPLE-Py](https://scikit-build.org/events/simple-py/)!
Colocated at SciPy during the tutorials; can fund travel and hotel stay during this portion of the conference
    * Henry Schreiner, Scikit-Build
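
A footnote to the Wheel 2.0 roundtable: the installer-side check from the @dholth notes (keep a `set()` of seen files and error on a repeat) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the behavior of any existing installer. The names `iter_member_names`, `check_unique_members`, `build_tar` and `DuplicateMemberError` are hypothetical, and the inner archive is assumed to be a plain tar read as a forward-only stream.

```python
import io
import tarfile


class DuplicateMemberError(ValueError):
    """Hypothetical error for an archive that lists the same path twice."""


def iter_member_names(fileobj):
    """Yield member names from a tar stream without seeking backwards."""
    # "r|" opens the tar as a forward-only stream, as an installer might do
    # while decompressing the inner archive on the fly.
    with tarfile.open(fileobj=fileobj, mode="r|") as tar:
        for member in tar:
            yield member.name


def check_unique_members(names):
    """Return the set of names, raising on the first repeated one."""
    seen = set()
    for name in names:
        if name in seen:
            raise DuplicateMemberError(f"duplicate archive member: {name!r}")
        seen.add(name)
    return seen


def build_tar(names):
    """Build a small in-memory tar for demonstration (one byte per member)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            info = tarfile.TarInfo(name=name)
            info.size = 1
            tar.addfile(info, io.BytesIO(b"x"))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(sorted(check_unique_members(
        iter_member_names(build_tar(["pkg/a.py", "pkg/b.py"])))))
    try:
        check_unique_members(
            iter_member_names(build_tar(["pkg/a.py", "pkg/a.py"])))
    except DuplicateMemberError as exc:
        print("rejected:", exc)
```

The sketch compares names exactly as stored. A real installer would probably also need to normalize paths (for example `./pkg/a.py` versus `pkg/a.py`, or names that differ only by case) before the comparison, which ties into the open questions about case and canonicalization in the notes.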
