# Python Packages by Domain Stack
https://github.com/Quansight/scipy-2022-swag
### Astronomy
Scientific Python libraries for analyzing celestial bodies.
- Astropy: Common core package for Astronomy in Python.
GitHub: https://github.com/astropy/astropy
Website: https://www.astropy.org
- Poliastro: Interactive Astrodynamics and Orbital Mechanics, with a focus on ease of use, speed, and quick visualization.
GitHub: https://github.com/poliastro/poliastro
Website: https://docs.poliastro.space/en/stable/
- SunPy: A community-developed, free and open-source solar data analysis environment.
GitHub: https://github.com/sunpy/sunpy
Website: https://sunpy.org
- SkyPy: Package for simulating the astrophysical sky.
GitHub: https://github.com/skypyproject/skypy
Website: https://skypy.readthedocs.io/en/stable/
### Packaging
Scientific Python tools for packaging.
- conda-forge: A community-led collection of recipes, build infrastructure and distributions for the conda package manager.
GitHub: https://github.com/conda-forge
Website: https://conda-forge.org
- conda: Package, dependency and environment management for any language.
GitHub: https://github.com/conda/conda
Website: https://docs.conda.io/en/latest/
- scikit-build: The official adaptor for CMake in Python packaging to build binary extensions, along with Python packages for CMake, Ninja, and other tooling.
GitHub: https://github.com/scikit-build
Website: https://scikit-build.org
- cibuildwheel: The PyPA tool for building binary wheels for all supported platforms.
GitHub: https://github.com/pypa/cibuildwheel
Website: https://cibuildwheel.readthedocs.io
### Performance
Scientific Python packages for improving performance when coding in Python.
- Dask: A library that allows parallelism for analytics and performance at scale.
GitHub: https://github.com/dask/dask
Website: https://www.dask.org
- Cython: An optimizing static compiler. It makes writing C extensions as easy as Python itself.
GitHub: https://github.com/cython/cython
Website: https://cython.org
- Numba: JIT compiler that translates a subset of Python and NumPy code into fast machine code.
GitHub: https://github.com/numba/numba
Website: https://numba.pydata.org
- CuPy: Array Library compatible with NumPy and SciPy for GPU-accelerated Computing with Python
GitHub: https://github.com/cupy/cupy
Website: https://cupy.dev
- pybind11: A C++ API for CPython/PyPy.
GitHub: https://github.com/pybind/pybind11
Website: https://pybind11.readthedocs.io
### Visualization
Scientific Python libraries for visualization.
- matplotlib: Library for creating static, animated, and interactive visualizations.
GitHub: https://github.com/matplotlib/matplotlib
Website: https://matplotlib.org
- bokeh: A library for creating interactive visualizations for modern web browsers.
GitHub: https://github.com/bokeh/bokeh
Website: https://bokeh.org
- scikit-image: A collection of algorithms for image processing.
GitHub: https://github.com/scikit-image/scikit-image
Website: https://scikit-image.org
- napari: A multi-dimensional image viewer for Python.
GitHub: https://github.com/napari/napari
Website: https://napari.org/stable/
- holoviz: A set of high-level tools to simplify visualization in Python.
GitHub: https://github.com/holoviz/holoviz
Website: https://holoviz.org
### Support
Scientific Python organizations that provide support to open source Python projects.
- NumFocus: Non-profit organization that promotes open practices in research, data, and scientific computing.
GitHub: https://github.com/numfocus/numfocus
Website: https://numfocus.org
- SciPy: Annual Scientific Computing with Python conference.
GitHub: https://github.com/scipy-conference
Website: https://www.scipy2022.scipy.org
- OpenTeams: A network of enterprise solution architects using open source talent to build and maintain your software solutions.
GitHub:
Website: https://www.openteams.com
- Scientific Python: A community effort to better coordinate the Scientific Python ecosystem and grow the community.
GitHub: https://github.com/scientific-python/scientific-python.org
Website: https://scientific-python.org
### IDE
Scientific Python packages that contain developer tools.
- IPython: A command-line interface to Python, for interactive computing with an interactive shell and a kernel for Jupyter.
GitHub: https://github.com/ipython/ipython
Website: https://ipython.org
- Jupyter: A set of web services for interactive computing across all programming languages.
GitHub: https://github.com/jupyter/jupyter
Website: https://jupyter.org
- Spyder: A Scientific Python Development Environment.
GitHub: https://github.com/spyder-ide/spyder
Website: https://www.spyder-ide.org
### Mathematics
Scientific Python libraries with math tools.
- sympy: Library for symbolic mathematics.
GitHub: https://github.com/sympy/sympy
Website: https://www.sympy.org/en/index.html
- scipy: Collection of fundamental algorithms for scientific computing in Python
GitHub: https://github.com/scipy/scipy
Website: https://scipy.org
- mathjax: A JavaScript display engine for mathematics that works in all browsers.
GitHub: https://github.com/mathjax/MathJax
Website: https://www.mathjax.org
### Building blocks
Scientific Python packages that make it easier to build and work in blocks.
- numpy: Fundamental package for numerical computation. It defines the n-dimensional array data structure, the most common way of exchanging data within packages in the ecosystem.
GitHub: https://github.com/numpy/numpy
Website: https://numpy.org
- zarr: A library and a format for chunked, compressed and N-dimensional arrays storage.
GitHub: https://github.com/zarr-developers/zarr-python
Website: https://zarr.readthedocs.io/en/stable/
- xarray: A package that makes working with labeled multi-dimensional arrays simple, efficient, and fun.
GitHub: https://github.com/pydata/xarray
Website: https://docs.xarray.dev/en/stable/
- NetworkX: A package for the creation, manipulation, and study of complex networks.
GitHub: https://github.com/networkx/networkx
Website: https://networkx.org
- Awkward Array: Manipulate JSON-like data with NumPy-like idioms.
GitHub: https://github.com/scikit-hep/awkward
Website: https://awkward-array.org
### Machine Learning
Scientific Python packages for machine learning.
- scikit-learn: A set of simple and efficient tools for predictive data analysis.
GitHub: https://github.com/scikit-learn/scikit-learn
Website: https://scikit-learn.org/stable/
- PyTorch Ignite: High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
GitHub: https://github.com/pytorch/ignite
Website: https://pytorch.org/ignite/index.html
### Dataframes
Scientific Python libraries for working with DataFrames.
- Pandas: A package that provides fast, powerful, flexible, and easy to use data analysis and manipulation.
GitHub: https://github.com/pandas-dev/pandas
Website: https://pandas.pydata.org
- GeoPandas: An open source project to make working with geospatial data in Python easier.
GitHub: https://github.com/geopandas/geopandas
Website: https://geopandas.org/en/stable/
- Pandera: A Statistical Data Testing Toolkit that provides a flexible API for performing data validation on dataframes.
GitHub: https://github.com/pandera-dev/pytest-pandera
Website: https://pandera.readthedocs.io/en/stable/
### Testing
Scientific Python tools for writing tests.
- Pytest: A mature full-featured Python testing tool that helps you write better programs
GitHub: https://github.com/pytest-dev/pytest
Website: https://docs.pytest.org
- Hypothesis: A library for creating unit tests which are simpler to write and more powerful
GitHub: https://github.com/HypothesisWorks/hypothesis
Website: https://hypothesis.readthedocs.io/en/latest/
- Ghostwriter: A command-line interface which can write property-based tests for you.
GitHub: https://github.com/HypothesisWorks/hypothesis/tree/master/hypothesis-python/tests/ghostwriter
Website: https://hypothesis.readthedocs.io/en/latest/ghostwriter.html
### Communities
Communities around specific disciplines.
- scikit-hep: A community project from High Energy Physics building tools useful in HEP and beyond.
GitHub: https://github.com/scikit-hep
Website: https://scikit-hep.org
# MEETING 1
## Participants
Stéfan van der Walt (Scientific Python)
Juanita Gomez (Scientific Python)
Pamphile Roy (Scientific Python)
Paige Martin (Pangeo)
Ryan Abernathey (Pangeo)
Jim Pivarski (Scikit-HEP)
Henry Schreiner (Scikit-HEP)
Levi Wolf (Geography)
Martin Fleischmann (Geography)
Pey Lian Lim (Astropy)
Isaac Virshup (scverse)
## Meeting notes
- Initial idea: A place where new users of the Scientific Python ecosystem can find information about tools they can use for a specific domain.
Challenge: Linear structure is hard since there are packages that are used for a lot of domains.
Possible solution: Using tags
- Using Stacks
- Two different ideas: The landing page plus infrastructure (cookie-cutter)
- Assigning responsibilities in order to maintain this. People from each of the stacks. -> How do we decide on these people? Voting?
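The tagging idea raised under the first point above can be sketched as a small index. This is a minimal, hypothetical sketch: the `Package` record, the `build_tag_index` helper, and the tag names are assumptions made for illustration, not an agreed taxonomy or existing tooling.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Package:
    """One catalog entry; `tags` replaces a single fixed section."""
    name: str
    website: str
    tags: frozenset = field(default_factory=frozenset)


def build_tag_index(packages):
    """Map each tag to the sorted names of the packages that carry it."""
    index = defaultdict(list)
    for pkg in packages:
        for tag in pkg.tags:
            index[tag].append(pkg.name)
    return {tag: sorted(names) for tag, names in index.items()}


# Tag assignments are illustrative assumptions, not a decision from the meeting.
CATALOG = [
    Package("Astropy", "https://www.astropy.org", frozenset({"astronomy"})),
    Package("xarray", "https://docs.xarray.dev/en/stable/",
            frozenset({"building-blocks", "geoscience"})),
    Package("GeoPandas", "https://geopandas.org/en/stable/",
            frozenset({"dataframes", "geography"})),
    Package("Awkward Array", "https://awkward-array.org",
            frozenset({"building-blocks", "high-energy-physics"})),
]

TAG_INDEX = build_tag_index(CATALOG)
print(TAG_INDEX["building-blocks"])  # ['Awkward Array', 'xarray']
```

Because a package can carry several tags, it appears under every domain that uses it without being duplicated in the catalog itself.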
Is this made for newcomers or for people already involved in the ecosystem?
- It is useful to have information about who maintains which package, to know whom to reach out to depending on the need.
- It is useful for newcomers to get to know packages from the different domains.
For example Astropy has https://www.astropy.org/affiliated/
It makes sense to have a person in charge of keeping the information up to date, while still allowing people to contribute and having a review process for adding new packages.
Counterpoint: The ecosystem is amazing because of decentralization, a lot of people creating.
There are already many resources with information about Python packages, but we have the support of community members who maintain packages, so we should set a high bar for the packages that we include. -> On the topic of reviewing, it is essential to coordinate with PyOpenSci: https://www.pyopensci.org
Two possible concerns:
- People don't really look up packages in lists; they usually learn about them from tutorials.
- There is a security concern when including all the packages
We can make a distinction between "core packages" but having a list can be useful to teach people what is used by most of the people.
There is not a lot of documentation on how to use several packages together. -> This is the idea of the Pangeo initiative: https://projectpythia.org
- Where should this information live? -> Probably each domain should host information on its own domain, but we could share tooling within the community to make this easier. Example: https://discourse.pangeo.io/t/statement-of-need-integrating-jupyterbook-and-jupyterhubs-via-ci/2705
List of tools developed and used by several packages that we can help coordinate:
- Cookie-cutter
- Memory flags, DevOps https://github.com/astropy/astropy-project/issues/118 (Astropy)
- [Scikit-HEP cookie](https://github.com/scikit-hep/cookie), [developer pages](https://scikit-hep.org/developer), and [repo review](https://scikit-hep.org/developer/reporeview)
A need from the Scientific Python community: Links to legacy code. -> It would be useful to have a centralized place to point to this information.
Next steps?
- Organize a sprint to build the pages (we can work on some infrastructure before)
## Chat transcript
**08:06:34** From Henry Schreiner : Just figured out that I had no sound, sorry! Fixed.
**08:07:49** From pllim : Pey Lian
**08:10:08** From Juanita Gomez : I’ll be taking notes here under "Meeting 1”: https://hackmd.io/1wioifmCTY2UVmGT-jcZsg. Feel free to add anything there!
**08:10:32** From Isaac Virshup : Do we need to come back to jim?
**08:10:48** From Pamphile Roy : https://hackmd.io/1wioifmCTY2UVmGT-jcZsg
**08:10:55** From Pamphile Roy : Remove the dot :)
**08:11:06** From Juanita Gomez : oops
**08:12:54** From Henry Schreiner : It worked for me with the dot too, it was ignored when I clicked the link ¯\_(ツ)_/¯
**08:15:21** From Levi John Wolf (he/him) : my apologies for arriving in late! Levi Wolf w/ PySAL & Geopandas.
**08:16:52** From Jim Pivarski : "Stack" vs "framework"? (There's the VanderPlas layers diagram for illustrating stacks.)
**08:17:01** From pllim : https://www.astropy.org/affiliated/
**08:19:14** From Ryan’s iPhone : On the topic of reviewing, it is essential to coordinate with PyOpenSci.
**08:19:29** From Stefan van der Walt : Definitely; we will be talking to Leah.
**08:19:34** From Ryan’s iPhone : https://twitter.com/leahawasser/status/1565339596132560896?s=21&t=Ayow8xbhW4W-hvp2UB8-5Q
**08:29:55** From Pamphile Roy : Sorry I will have to jump out, something came out now. I might be able to jump back again, but in any case I will watch the recording. Thanks all, great to see everyone 😃
**08:36:29** From Ryan’s iPhone : Agree security is important. But also probably beyond the scope of what we can accomplish here. Companies like anaconda have big businesses about providing security guarantees for open source package. Hard to match that from a volunteer effort.
**08:39:58** From Ryan’s iPhone : https://projectpythia.org/
**08:43:15** From Ryan’s iPhone : On the topic of notebook build processes https://discourse.pangeo.io/t/statement-of-need-integrating-jupyterbook-and-jupyterhubs-via-ci/2705
**08:45:44** From pllim : How widely can this meeting's recording and notes be shared? Am I allowed to share this with Astropy Coordination Committee members (5 people)?
**08:46:45** From Stefan van der Walt : Sure, we'll post on YouTube so others can watch and comment.
**08:48:36** From Jim Pivarski : And much of that is broader than our HEP domain.
**08:51:18** From pllim : The diagram here gives you an idea on the Astropy side though it is a little outdated by now (e.g., we no longer use Travis CI) -- https://github.com/astropy/astropy-project/issues/118
**08:52:00** From Ryan’s iPhone : 👍 that cookie cutter sounds super interesting.
**08:52:06** From Isaac Virshup : Minor tangent: does anyone have a process for what happens when you update your cookie-cutter? E.g. how do users update to the newer standards?
**08:53:19** From Henry Schreiner : https://scikit-hep.org/developer & https://github.com/scikit-hep/cookie