---
title: dask-unyt arrays
tags: yt, dask
author: Chris Havlin
---

[**return to main post**](https://hackmd.io/@chavlin/r1E_o6lAv#experiments-in-daskifying-yt)

### 3. dask-unyt arrays

*yt* uses `unyt` to track and convert units, so if we use `dask` for IO and want to return delayed arrays, we need some level of `dask`-`unyt` support. In the notebook [working with unyt and dask](https://github.com/data-exp-lab/yt-dask-experiments/blob/master/unyt_dask/unyt_from_dask.ipynb), I demonstrate an initial prototype of a `dask`-`unyt` array. In this notebook, I create a custom dask collection by subclassing the core `dask.array.Array` class and adding some `unyt` functionality in hidden sidecar attributes. This custom class is handled automatically by the `dask` scheduler, so if we spin up a dask client, build a large dask array, and wrap it in our new `dask`-`unyt` array, e.g.:

```python
import dask.array as da
import unyt
from dask.distributed import Client

client = Client(threads_per_worker=2, n_workers=2)
x = da.random.random((10000, 10000), chunks=(1000, 1000))
x_da = unyt_from_dask(x, unyt.m)  # unyt_from_dask is defined in the linked notebook
```

then when we do operations like finding the minimum value across all the chunks:

```python
x_da.min().compute()
```

we are returned a standard `unyt_array`:

```
unyt_array(3.03822589e-09, 'm')
```

that was calculated by processing each chunk of the array separately.

The implementation keeps the `unyt` functionality in hidden sidecar attributes: those attributes track changes to units separately from the dask graph and apply any final conversion factors only after calls to `.compute()` (a rough sketch of this idea appears at the end of this post).

This notebook demonstrates a general and fairly straightforward way to build dask support into `unyt`, which can be used in conjunction with, for example, the prototype dask-enabled particle reader to return arrays with both dask and unyt functionality preserved.
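To make the sidecar-attribute idea a bit more concrete, here is a rough sketch of one way to subclass `dask.array.Array` and re-attach units at compute time. The names `unyt_dask_array` and `unyt_from_dask` mirror the notebook, but the body below is a simplified illustration rather than the notebook's actual code: it relies on dask's collection protocol (`__dask_postcompute__`) to wrap the finalized result in a `unyt_array`, and it only preserves units on a direct `.compute()` call, whereas the prototype in the notebook does extra bookkeeping so that derived arrays (like the `.min()` result above) and unit conversions keep their units as well.

```python
import dask.array as da
import unyt


class unyt_dask_array(da.Array):
    """Sketch: a dask array that carries a unit in a sidecar attribute.

    Simplified illustration only -- reductions like ``.min()`` return plain
    dask arrays here, so only a direct ``.compute()`` keeps its units.
    """

    def __dask_postcompute__(self):
        # grab dask's normal array finalizer, then wrap its output in a
        # unyt_array carrying this array's units
        finalize, extra_args = super().__dask_postcompute__()

        def finalize_with_units(results, *args):
            return unyt.unyt_array(finalize(results, *args), self.units)

        return finalize_with_units, extra_args


def unyt_from_dask(dask_array, units):
    """Wrap an existing dask array in the unit-aware subclass.

    ``units`` is assumed to be a unit string or ``unyt.Unit`` here; the
    notebook's helper also accepts unit objects like ``unyt.m``.
    """
    arr = da.asarray(dask_array)
    out = unyt_dask_array(arr.dask, arr.name, arr.chunks, dtype=arr.dtype)
    out.units = unyt.Unit(units)  # sidecar attribute, outside the dask graph
    return out


# example: .compute() hands back a unyt_array with units attached
x = da.random.random((1000, 1000), chunks=(100, 100))
x_m = unyt_from_dask(x, "m")
print(x_m.compute().units)  # -> m
```

The nice part of hooking into `__dask_postcompute__` is that the scheduler applies the finalizer for us, so the per-chunk computations never need to know about units and the conversion to a `unyt_array` happens only once, after `.compute()`.

[**return to main post**](https://hackmd.io/@chavlin/r1E_o6lAv#experiments-in-daskifying-yt)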