CFEP Proposals

Here is a list of CFEP proposals that I wanted to note down. I am happy to receive any kind of suggestion; maybe some of these things aren't CFEP-worthy?

Formalize conda-forge build scripts

The conda-build mechanism with build.sh and bld.bat is very powerful and flexible. But that flexibility comes at a price: it is harder to be correct and to follow good practices.

This CFEP proposes implementing and maintaining a collection of build scripts for common build tools such as CMake. Instead of creating or copy-pasting a CMake build.sh file, a recipe could depend on the cf-buildscripts package in the build environment and run

script: cf-cmake -DMY_VAR=ON

This would allow us to set common configuration settings from the cf-buildscripts package.
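To make the idea concrete, here is a minimal sketch of what a cf-cmake wrapper might do internally. Everything in it is an assumption for illustration: the cf-buildscripts package does not exist yet, and the function name cf_cmake_command and the chosen default flags are hypothetical. The only things taken as given are the PREFIX variable that conda-build exports and the standard CMake cache variables.

```python
import os

# Hypothetical shared defaults that cf-buildscripts could maintain centrally.
COMMON_CMAKE_ARGS = [
    "-DCMAKE_BUILD_TYPE=Release",
    "-DCMAKE_INSTALL_LIBDIR=lib",
]


def cf_cmake_command(user_args, prefix=None, source_dir=".."):
    """Return the argv list a hypothetical cf-cmake wrapper would run."""
    # conda-build exports PREFIX in the build environment; fall back for local tests.
    prefix = prefix or os.environ.get("PREFIX", "/usr/local")
    return [
        "cmake",
        f"-DCMAKE_INSTALL_PREFIX={prefix}",
        *COMMON_CMAKE_ARGS,
        *user_args,  # recipe-specific flags follow the shared defaults
        source_dir,
    ]
```

With this sketch, cf_cmake_command(["-DMY_VAR=ON"]) yields a full cmake invocation in which the recipe only had to state the one flag it actually cares about.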

On top of this, it could be a first step towards generalizing more complicated things, such as distributing a split package where one output contains a small, optimized build and the other the debug symbols, or where header files and library files are split. For this case we would need to implement additional transforms on the recipe itself, though.

Bot-correctify and lint version constraints

Even though there has been a warning in conda-build for some time now, we still find version constraints such as

numpy 1.14

in metadata on conda-forge. This version constraint is ambiguous because it's unclear if the intent is to express 1.14.0 or 1.14.* as a version constraint.

This CFEP proposes to lint the version constraints in new PRs to make sure that, from now on, version constraints are properly specified. It would be nice to also formally back-fix the implicit convention (to use 1.14.*), but I am not sure whether a hot-fix on the repodata would be possible.
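A lint check along these lines could be quite small. The following is only a sketch under assumptions: the function name lint_requirement and the message wording are hypothetical and are not taken from the existing conda-forge linter. It flags a requirement that consists of a name followed by a bare version with no operator or wildcard.

```python
import re

# A bare version: digits and dots only, no operator, wildcard, or comma.
_BARE_VERSION = re.compile(r"^\d+(\.\d+)*$")


def lint_requirement(spec):
    """Return a warning string for an ambiguous requirement, else None."""
    parts = spec.split()
    # Only "name version" is checked; three-part specs (with a build string)
    # and unversioned requirements are left alone.
    if len(parts) == 2 and _BARE_VERSION.match(parts[1]):
        name, version = parts
        return (
            f"'{spec}' is ambiguous: write '{name} {version}.*' "
            f"or '{name} =={version}' instead"
        )
    return None
```

Here lint_requirement("numpy 1.14") returns a warning, while "numpy 1.14.*" and "numpy >=1.14" pass without complaint.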

Implement naming schemes and namespaces

conda-forge is growing quite rapidly, and one of the hardest things in the life of a developer is naming things! That's why we end up with packages from different languages claiming the same name. For example, tabulate exists for Python and for C++.

Sometimes there are also libraries that ship an optional Python binding (such as openimageio).

It is difficult to correctly name these packages:

Should the package get a suffix or a prefix (e.g. cpp-tabulate, py-tabulate)? Which package should get the suffix or prefix (e.g. py-openimageio and openimageio, or openimageio and openimageio-lib)?

Another issue is that growing channel sizes make conda slower and produce more traffic, which could be avoided. For example, I might not be interested in R packages at all, but I still receive metadata for all R packages in conda-forge.

Therefore I am proposing to

  1. come up with a consistent naming scheme that will guide packagers in choosing names
  2. promote the implementation of real namespaces based upon our consistent naming scheme

I am arguing that the implementation of namespaces is relatively straightforward as sub-channels of conda-forge. The default selected namespace (e.g. python) would then get the highest priority in a --strict-priority fashion, and the conda-forge:python:tabulate package would override the one available in conda-forge:cpp:tabulate.
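The priority rule can be illustrated with a toy resolver. This is a sketch, not how conda resolves packages: the function resolve, the in-memory index, and its contents are hypothetical. It only demonstrates the idea that the first namespace in the priority list that provides a name wins and lower-priority namespaces are not consulted for that name.

```python
def resolve(name, index, namespace_priority):
    """Return (namespace, name) from the highest-priority namespace that has it."""
    for namespace in namespace_priority:
        if name in index.get(namespace, set()):
            return namespace, name
    raise LookupError(f"{name} not found in any selected namespace")


# Toy index: both namespaces ship a package called "tabulate".
index = {
    "python": {"tabulate", "numpy"},
    "cpp": {"tabulate", "xtensor"},
}
```

With the priority list ["python", "cpp"], resolve("tabulate", index, ...) picks the python namespace, while a name that only exists in cpp, such as xtensor in this toy index, is still found there.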

As a first step, mandatory (linted) additional metadata should be added to the recipes, indicating whether the output belongs to one of several official namespaces. Some namespace names are already implemented (but unused) in the conda source code: https://github.com/conda
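The lint for this metadata might look roughly like the sketch below. All names are assumptions for illustration: the key "namespace" is not an existing conda-build field, the set of official namespaces has not been decided, and lint_namespace is a hypothetical helper.

```python
# Hypothetical set of official namespaces; the real list is not decided.
OFFICIAL_NAMESPACES = {"python", "cpp", "r"}


def lint_namespace(output_meta):
    """Return lint messages for one recipe output (a dict of its metadata)."""
    name = output_meta.get("name")
    # "namespace" is an assumed key name, not an existing conda-build field.
    namespace = output_meta.get("namespace")
    if namespace is None:
        return [f"output '{name}' does not declare a namespace"]
    if namespace not in OFFICIAL_NAMESPACES:
        return [f"output '{name}' uses unknown namespace '{namespace}'"]
    return []
```

An output such as {"name": "tabulate", "namespace": "python"} would pass, while one without the key would be rejected by the linter.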
