owned this note
owned this note
Published
Linked with GitHub
---
name: DIALS core meeting 2020-10-15
tags: core meeting
---
# DIALS core meeting 2020-10-15
[![hackmd-github-sync-badge](https://hackmd.io/W7hAAI5uQBax0ggYrVmgwQ/badge)](https://hackmd.io/W7hAAI5uQBax0ggYrVmgwQ)
## Previous Actions
* [ ] MG+ND: write a proposal for ['overall architecture discussion'](https://dials.github.io/kb/core/20200903) and ['new installer'](https://dials.github.io/kb/core/20200903)
* [ ] MG+ND: [Implement](https://dials.github.io/kb/core/20201001) [DC1 -- per-module `setup.py`](https://hackmd.io/@dials/B11SXgTND) in a pull request
* `setup.py` requires sources to be in a subdirectory, so to do this we need to change the source layout at the same time. We will therefore test this with xia2.
* [ ] MG+ND: Come up with a proposal to [move away from all code being in header files](https://dials.github.io/kb/core/20201001) and consolidate into a single library
* [ ] ND: update `CONTRIBUTING` in a (separate) pull request to include a soft-imposition of [stable master](https://dials.github.io/kb/core/20200903), [dials#1353](https://github.com/dials/dials/issues/1353).
* [ ] MG: organise a [typing intro lecture](https://dials.github.io/kb/core/20200917).
* [ ] GW: [move `cbflib_adaptbx`'s `compress` and `uncompress` functions](https://dials.github.io/kb/core/20201001) into dxtbx in a PR.
* broke building on Mac completely, and looked like it broke building on Linux but may have been a coincidence. ND looking at it.
* ASB to ask BKP whether fundamentally `scons` could run `cmake` under the hood for a module
## Agenda
### New folks in the room
Randy, Andrey, (maybe) Peter coming along? Intros; discussion of how things work at the moment; possibly schedule specific meetings for longer form stuff. Questions of degrees-of-freedom for individual reflections & lattice translocation raised recently.
Randy Read: translocation vs NCS, interested in running CryoEM-2D-classification algorithms on found spots to distinguish between fuzzy and non-fuzzy spots.
Peter Zwart: Interested in better likelihood targets, would like to know the DoF for each reflection. (As weighted sum of observations it is not N(obs)-1)
**AP:** JBE to set up meeting with Andrey/Randy/Peter with DIALS-E folks at a suitable date & time.
Peter Zwart: Have a look at [xicam](https://github.com/Xi-CAM/Xi-cam) regarding the image. https://xi-cam.readthedocs.io/en/latest/
**AP:** ASB to organise a meeting on the image viewer.
### cctbx conda-forge package
We may want to evaluate moving the dxtbx/DIALS Azure builds to use the [now available cctbx conda-forge package](https://github.com/conda-forge/cctbx-base-feedstock), once [issue #4](https://github.com/conda-forge/cctbx-base-feedstock/issues/4) is resolved, which prevents the package being used as a build platform.
* ABS: That would be rad.
* We haven't appreciated the fundamental perspective shift yet though.
* ASB: Will bring this up the next CCI-all meeting.
* GW: We should bring our heads round the idea that all different components become different conda packages.
#### Benefits
* Faster builds
* Making use of clearer API boundaries and versioning
#### Drawbacks
* API boundaries and versioning
* In other words: We would start treating cctbx as a software package. This restricts dxtbx/DIALS code to only use functionality that is part of the relevant cctbx release. From time to time, when cctbx releases new versions, we would update our dependencies, which would then allow the use of new cctbx features.
#### Outcome
* ASB: Was discussed at the CCI-all meeting, and the outcome was: this is fine. I need to be able to continue using cctbx as a development installation. Apart from that it is fine to treat cctbx as a versioned software package, ie. run Azure against the cctbx-base conda-forge package.
* The drawbacks noted above will thus apply.
* **AP:** MG: Make this happen
* ASB: Please don't wait for cbflib for a dxtbx conda-forge package.
### `dxtbx` → `xfel` dependencies
At some point we made a somewhat conscious decision that `dxtbx` should not depend on `dials`. While setting up some Azure testing of dxtbx ([#226](https://github.com/cctbx/dxtbx/pull/226)) without `dials`/`xia2` and their dependencies present some test failures were flagged because [`dxtbx` has dependencies into `xfel`](https://github.com/cctbx/dxtbx/commit/d25c9304601aea6725fa073855ff3c553840d184).
How should we deal with this? A few options:
* Disable the code so it stays in dxtbx but is not tested/missing dependencies do not trigger errors?
* Cleanest option would be to move xfel-dependent format classes into the xfel package
* Eliminate the dependencies
Discussion:
* GW: I have the opinion that we shouldn't have circular dependencies. Ever.
* DGW: has electron-diffraction format classes outside of dxtbx. Would love to put SMV classes into a separate box as they are too promiscuous.
* ASB: having format classes stashed together is an easy mental model
**Previous outcomes:**
* ASB needs to consider the future of the xfel module
* In principle the format classes could go into xfel
* The commit linked above has gone in as part of [dxtbx#226](https://github.com/cctbx/dxtbx/pull/226)
* Will discuss next time
There are a couple of places there dxtbx → xfel
* `format/FormatXTC.py`
* `format/FormatPYmultitile.py`
* `format/cbf_writer.py`
* `format/FormatXTCJungfrau.py`
* `format/FormatXTCRayonix.py`
* `format/FormatXTCCspad.py`
* `command_line/radial_average.py`
* `command_line/detector_superpose.py`
**AP**: ASB to create a ticket in dxtbx to discuss this so that we can get everyone's opinion.
### NIAC Nexus
Nexus as working format for DIALS?
DIALS has an existing data model (past: pickle, now: msgpack). Internal names in that model correspond to the names used inside DIALS for development, eg. `xyzobs.px.value` has a clear definition.
* internal names can be kept using soft/hardlinks
No objections for a `dials.export $file.refl format=nexus`, but do not use nexus as a format to interchange data within DIALS programs.
What is the underlying benefit of using Nexus?
* main benefit is putative interoperability
**AP**: ASB to create a transformation tool DIALS → Nexus → DIALS to demonstrate the feasibility of this approach, so that we have a concrete proposal to discuss.
**AP**: MG to put this into a ticket "*Demonstrate round trip of DIALS data to Nexus data with no loss of signal*" instead of meeting minutes
### Pull request *When file indices are provided,...*
*not discussed in this meeting*
ASB: This is [dxtbx#210](https://github.com/cctbx/dxtbx/pull/210). The FormatSaclaHDF5 class isn't lazy, which led to that problem. Will fix this in a separate PR, but the PR above is still a good PR independently of that.
In discussion we identified a new potential issue. We need to ensure during deserialisation this loop isn't hit for classes that don't have scans and goniometers.
### Definition of `trusted_range`
*not discussed in this meeting*
Discuss [Definition of `trusted_range` (dxtbx#182)](https://github.com/cctbx/dxtbx/issues/182)
### `image_range` vs. `array_range` baked in off-by-one errors
*not discussed in this meeting*
Discuss [`image_range` vs. `array_range` baked in off-by-one errors (dxtbx#186)](https://github.com/cctbx/dxtbx/issues/186)
* DW made a proposal and started work on that, but it proved messy
* There is a new, simpler, proposal but this version kills off the `image_range` concept rather than `array_range`
* Discuss, again, what we want here and whether the new proposal is useful
### DIALS documentation build
*not discussed in this meeting*
Should the DIALS documentation build outside a cctbx environment?
* Currently you can only build the DIALS documentation inside a cctbx environment. This means a remote documentation build needs to build cctbx first, so we can't easily run the documentation build on every commit. When things break they are difficult to fix. We can't use readthedocs.
* MG: I believe the main obstacles are three cctbx Sphinx plugins that may need to be extracted or removed from DIALS documentation, and some phil parsing logic.
* MG: the cctbx conda-forge package may be useful here
### dxtbx json/msgpack performance
*not discussed in this meeting*
ASB reports json being problematic when importing e.g. 50,000 experiments, as everything is stored in a single file, so everything needs to be read at once.
* GW: cf. [Load before heat death (dxtbx#118)](https://github.com/cctbx/dxtbx/pull/118). At the core of it are bad design choices in dxtbx.
* Future discussion topics:
1. treatment of large serial datasets
2. using an HDF5 backend for reflection files
3. [dials#1407](https://github.com/dials/dials/issues/1407) specific proposal
* ASB: I want `check_format` to go away.
Previous discussion outcomes:
* ASB:
* General consensus that this is needed.
* Make models and store them
* Load them in parallel with minimal memory requirements
* Version all models and rewind them
* This needs a lot of thought and at this time we don't have agreement here.
* Not need to copy data like shoeboxes when they aren't being changed, operate by reference
### Remove `tntbx` dependency
*not discussed in this meeting*
Appears not to be used anywhere outside of `cctbx_project/xfel`, but `xfel` does not declared is as a dependency. `xfel` only uses the [`svd()`](https://github.com/dials/tntbx/blob/master/tntbx/__init__.py#L7) function. Maybe change xfel to use [`numpy.linalg.svd`](https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html) instead and drop `tntbx` entirely?
Peter Zwart: My favorite SVD tools lives in scipy: scipy.sparse.linalg.svds (https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.svds.html#scipy.sparse.linalg.svds). This routine allows calculation of the top K vectors in an SVD, it is a massive time saver if you know the rank of the matrix. The code works for dense as well as sparse matrices.
### renaming `master` branch → `main`
*not discussed in this meeting*
* Nothing much will happen on this any time soon - either way we'll [wait for GitHub support on this one](https://github.com/github/renaming)
* General assent that this is useful
* from 1st of October new GitHub repositories will have `main` as the default
* **proposed outcome**: Agree to migrate to `main`. Turn this into dxtbx/dials/xia2 issues and move once GitHub provides a migration path. Use xia2 as a test-bed and move dials/dxtbx two weeks later.
### Next meeting
November 5th, 4pm UK time, 8am PDT.