title: NumPy Dtype Requirements and Approaches
author: Sebastian Berg
tags: NumPy, Dtypes
*Note that I continued part of this in new documents which are not yet online, but this should still be relevant because it lists many things (including some smaller side notes).*
**Prepared Document for Meeting:** https://hackmd.io/B5TPPP-8QiKOFODPqmXtfA
## TODO Document restructuring/new document
I will probably try to split the document up into:
1. Requirements (possibly split into different sections)
2. Decisions that we probably need to make
3. Either in 2. or separately, some suggestions for it.
It may be interesting to have some python dummy implementation...
# NumPy Dtype Requirements and Approaches
This is a work in progress to try to structure my thoughts.
It may be a bit random and some of these are probably premature.
**Please feel free to change/edit this!**
Depending on how types/instances are created, things can be confusing, so I would suggest calling:
* **dtype**: a class (whether or not an actual type)
* **descriptor** (or **a dtype**): the object tagged onto the array (an instance of a dtype)
* **scalar type**: a type which, when instantiated, provides the scalar instance
* a **scalar**: an instance of a scalar type
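In current NumPy terms, the distinction looks like this (just illustrating the naming above, no new API involved):

```python
import numpy as np

arr = np.array([1.0, 2.0])
descr = arr.dtype            # descriptor: the instance tagged onto the array
scalar_type = descr.type     # scalar type (np.float64 here)
scalar = scalar_type(3.0)    # a scalar: instance of the scalar type

assert scalar_type is np.float64
assert isinstance(scalar, np.generic)  # all numpy scalars derive from np.generic
assert not isinstance(descr, type)     # the descriptor is *not* itself a type today
```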
Just to note, the names below are just working names, and they would all need an `__array_` or `__numpy_` prefix of course.
Please try to list all related documents here (in case I missed some):
* Matti's NEP, discusses the technical side of subclassing more from the side of `ArrFunctions`
* https://hackmd.io/ok21UoAQQmOtSVk6keaJhw and https://hackmd.io/s/ryTFaOPHE
* Thoughts by Matti and Eric about dtype subclassing
* (2019-04-30) Eric's is probably the furthest along for subclassing
* Discussion about the calling convention of ufuncs, but also
includes teardown/setup needs.
* https://github.com/BIDS-numpy/docs/blob/master/meetings/2018-11-30-dev-meeting.md and [NEP: high level data types and universal functions](https://hackmd.io/6YmDt_PgSVORRNRxHyPaNQ)
* BIDS meeting on November 30, 2018 and a document by Stephan Hoyer about what NumPy should provide and thoughts on how to get there. Meeting with Eric Wieser, Matti Picus, Charles Harris, and Travis Oliphant.
* Important summaries of use cases.
* [SciPy 2018 brainstorming session](https://github.com/numpy/numpy/wiki/Dtype-Brainstorming)
* Good list of user stories/use cases.
* Lists some requirements and some ideas on implementations
* [BIDS talk by Nathaniel](https://www.youtube.com/watch?v=fowHwlpGb34) from Oct. 2017. (Mostly historical; not much information on this topic.)
* [xnd-project](https://github.com/xnd-project) with ndtypes and gumath
* Does not implement promotion rules, instead loops are registered for all variations.
* Does it even allow (easy) definition of new dtypes?
* [`__array_ufunc__` NEP](https://www.numpy.org/neps/nep-0013-ufunc-overrides.html):
* May be interesting to keep APIs similar.
* [similar document to this](https://hackmd.io/ThZt5S7iSXWfcPd_l3E-nw), which I started
  * an attempt to make things a bit more structured based on what needs to happen when e.g. ufunc calls occur.
## Descriptor (dtype instance) Requirements
* Provides an immutable instance acting as descriptor tagged to the array
* Maybe for some dtypes with their own storage management this
  would not be true (though maybe it would, since if the storage
  changes it is probably some kind of shared storage belonging to
  the dtype type and not the descriptor)
* Dtype comparison and hash
* Allowing to override:
* slots (ArrFuncs) from python and C.
* Inner loop implementations
* Probably at instantiation time, so that a Python
  callback can be set at the "slot" level (I guess this is
  what Python does as well).
* Casting and promotion rules
* Decision needed: How do subclasses inherit these?
* Should a new type be allowed to make an impossible loop
possible easily? (thoughts below depend on this!)
* Say a unit dtype could define `timedelta * timedelta -> unit[timedelta**2]`;
  even though datetime itself does not know this, it may be plausible.
* Thinking more about this: probably not immediately, but if we
  allow registering arbitrary loops, it becomes possible (depends
  on who we ask).
* Is it OK if import order/loop-adding order can affect results?
* It would be nice to stay close to `__array_ufunc__`
when it comes to implementation.
* Assuming we go that way, we probably need caching?
* If we have a `unit[m]` is the casting handled on the
* Currently I do not think promotion rules really exist
aside from "`result_type`" (and the less its ideal scalar
* Promotion rules may be ufunc dependent, but `np.result_type`
  does try to handle them generically.
* While value-based promotion is problematic, it would be good to be
  able to support it?! (also affects possible caching)
* Promotion could be handled by the ufunc loop:
* Input dtypes may not realize a loop exists?
* Could ask `np.result_type` *common type* (promotion) on failure
(this is what Julia does I think)
* `np.result_type` like *common type* operation is also needed
* i.e. for `concatenate`
* Casting tables could solve this (a safe-casting graph of depth 1?)
* Ask all involved dtypes (but what if there is a more awesome one
which knows how to cast them all, say a unit which understands
ints and datetime)
* Ufunc hook:
* Dtypes should be able to reuse existing inner-loops.
* Do we need to expose them somehow (including to Python),
  or is it enough to have a way to do this for subclassing?
* But need to be able to run setup and teardown code?
* Reason: units need to check the inputs and find the output
  descriptor. However, after they have done this, they can
  simply use the normal numpy inner-loops.
* Currently, we use the type resolver ufunc
  functions for this. *It needs to be possible to
  inject logic here for custom dtypes!*
* Dtype hooks run after `__array_ufunc__`
  (just noting; this seems logical and pretty obvious).
* Storage of metadata (e.g. similar to units)
* Storage of additional data for ragged arrays/variable-length strings.
* "Reference counting"
* Some DTypes can require reference counting like hooks
on `PyArray_XDECREF`, etc.
([issue requesting such a feature](https://github.com/numpy/numpy/issues/10721))
* Associated with a scalar type, this is currently only done through `dtype.type`
(so already exists).
* Extensible `ArrFunctions`:
* Example: New sorting implementations
([current timsort hack](https://github.com/numpy/numpy/pull/12945)).
* We must be free to add new slots/methods to dtypes
(although sometimes we should likely move them to functions)
##### Other Possible Requirements:
* `dtype` parsing capability, for `dtype="MyType[unit]"`.
(Frankly, I am *not* convinced of this at all, but it can wait
in any case. Mentioned in https://github.com/numpy/numpy/wiki/Dtype-Brainstorming)
* "fused types" on the python level as mentioned by Matti and
* `np.array(b, dtype=np.blasable)` for `(s, d, c, z)`
* `np.array(b, dtype=np.floating)`
Are these like ABCs that types register to? Is this similar to
a flexible type?
They could also be used for casting...
* `np.load` and `np.save` would require pickling for user dtypes,
that seems very annoying.
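For comparison, NumPy's existing abstract scalar types (`np.floating`, `np.integer`, …) already form such a queryable hierarchy, though they cannot be used to *construct* arrays in the sense sketched above:

```python
import numpy as np

# The abstract hierarchy can be queried today:
assert np.issubdtype(np.float32, np.floating)
assert np.issubdtype(np.int64, np.integer)
assert not np.issubdtype(np.float32, np.integer)
```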
<details> <summary> <b> List of (current) ArrFunction slots/attributes </b> </summary>
* get- and setitem (Python/PyObject coercion casting)
* copyswap and copyswapn (copying dtype)
* compare (comparison function, like Python `__cmp__` slots)
* dot product
* scanfunc (parsing an ASCII file)
* fromstr (parse a single string)
* arange (FillFunc) implementation
* fill with scalar (fillwithscalar)
* Sorting and Argsorting implementations (fixed size)
* "Fast" functions:
* clip, putmask, take
###### Other slots:
* casting dictionary (castdict)
* Cast between user types
* Casting rules:
* ScalarKindFunc (same kind casting)
* can cast scalar kind to
* Item Refcount
* Has Object.
* Convert to list for pickling (TODO: What is this?)
* Is pointer: Item is a pointer (For extension types?)
* Needs Init (e.g. objects need initialization, no "empty")
* Needs PyAPI
* Use Getitem/Setitem for extraction 0-D array from scalar
* Rational/user types in general need to define this, I think?
* Object arrays use this
* (Aligned Struct)

</details>

##### Possible nice-to-haves?
* `isinstance(scalar, descriptor)` could be nice, but that
does not necessarily mean that `descriptor is scalar_type`?
* We already have `(dtype|descriptor).type` which may be
easier to think of in any case.
* I am also wondering if we can make such things as a
`decimal_dtype`, in a sense specialize python types
(which mostly would mean asserting the output type and
possibly allowing to expose some of its methods)
* A place to put dtype specific ufuncs?
* (Seb.) A convenient/standard place to put something like methods?
  ```python
  MyDate.normalize(arr)  # use dtype namespace?
  # could expose as a method-like (thought for later)
  ```
* Should also work for Operators/dunder methods.
*The specific proposals are outdated and need some thought.*
1. Let's not worry about making the descriptor a type (and thus the same as scalars). This *could* probably be added later, but I/we are not convinced it is a good idea.
2. The exact steps still need to be decided, but:
* We should aim to keep close to other APIs (`__array_ufunc__`)
Proposed API for subclassing numpy dtypes. These are really brainstorming right now:

```python
def __promote__(self, other):
    """Used by np.promote_types; identical to result_type,
    but without the scalar logic."""
    if isinstance(other, type(self)):
        # make sure to give priority to subclasses:
        self, other = other, self
    raise TypeError("cannot promote to common type.")

itemsize = 8
type = pyunit  # the associated scalar type

# what is specifically needed here?
def __item_unpack__(self, val):
    """How to do this as low-level loops?"""

def __item_pack__(self, obj):
    if not isinstance(obj, self.type):
        ...

@classmethod  # not sure?
def __promote__(cls, dt1, dt2):
    """By default, checks can_cast(dt1, dt2)."""
    return np.result_type(dt1, dt2)

def __can_cast__(cls, dt1, dt2, casting="safe"):
    return True or False  # or unit["m"], loop?

def __get_loop__(self, ufunc, in_types, out_types):
    ...
```

Alternatively, register loops and:
* check exact loops → promote → check loops again
* (the loop could still refuse)

```python
# probably largely C-slots, but could be filled from python
setup_loop = None      # e.g. FloatClearErr
teardown_loop = None   # e.g. FloatCheckErr
needs_api = False      # flag (or allow setup to set/override?)
identity = NotImplemented
# more flags to add

# Other or even extendable things?:
specialized_inner_loops = None  # contiguous (copy code), AVX?
```
Casting inner loops basically seem like ``"Type1->Type2"`` ufuncs,
so whatever API we end up using, unary ufunc calls and the final
casting call should likely look identical. In fact, they could
probably be identical.
#### Details about promotion
Promotion is necessary if there is no ufunc loop for the specific types.
Pushing it to the dtype objects may also allow hacking in the value-based
promotion that we are currently stuck with.
We do have some "special" promotion rules currently hacked in; the first
thing that comes to mind is integer addition using at least `long` precision.
*Should check the current `TypeResolution` functions for other special cases.*
The question is how to split it up. We currently have `np.result_type` and
it would be nice if that can just call the `__promote__` logic.
*This is also used elsewhere, i.e. in concatenate*
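For reference, the current common-type behavior that such a `__promote__` hook would have to reproduce (`np.promote_types` is the pure dtype-based part; `concatenate` effectively needs the same operation):

```python
import numpy as np

# pure dtype-based promotion (no value-based scalar logic):
assert np.promote_types(np.int8, np.uint8) == np.dtype(np.int16)
assert np.promote_types(np.int32, np.float32) == np.dtype(np.float64)

# concatenate relies on the same common-type operation:
a = np.array([1, 2], dtype=np.int32)
b = np.array([0.5], dtype=np.float64)
assert np.concatenate([a, b]).dtype == np.dtype(np.float64)
```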
If the promotion gets additional information, it could handle most
of the things done in setup (see below), making the default setup
basically a no-op.
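A rough Python sketch of how a `__promote__`-style protocol could slot under `np.promote_types` (the name `__promote__` and the `UnitDType` class are assumptions from the brainstorming above, not NumPy API):

```python
import numpy as np

class UnitDType:
    """Hypothetical unit dtype, purely for illustration."""
    def __init__(self, unit):
        self.unit = unit

    def __promote__(self, other):
        # a unit dtype that knows how to absorb plain numeric dtypes:
        if isinstance(other, np.dtype) and other.kind in "iuf":
            return UnitDType(self.unit)
        return NotImplemented

def promote(dt1, dt2):
    """Sketch of a promote_types-like function that asks the dtypes first,
    mirroring Python's binary-operator NotImplemented protocol."""
    for a, b in ((dt1, dt2), (dt2, dt1)):
        meth = getattr(a, "__promote__", None)
        if meth is not None:
            res = meth(b)
            if res is not NotImplemented:
                return res
    # fall back to the current generic rules:
    return np.promote_types(dt1, dt2)

m = UnitDType("m")
assert promote(m, np.dtype(np.float64)).unit == "m"
assert promote(np.dtype(np.int8), np.dtype(np.int16)) == np.dtype(np.int16)
```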
##### Other things to keep in mind:
* Flexible dtypes may be an issue. If promotion does not know about
the ufunc, it may have to return a generic string (or something like a
generic string user dtype).
* One possible thought: have `dtype=IntWithUnit` but the unit still
  flexible? Note that this is likely such a corner case that
  I am not sure we have to worry about it.
* Flexible types are an option to handle promotion though:
* `unit * unit -> unit`, and the ufunc setup time check
decides which unit.
* Special rule for addition reduction (sum)! This is hardcoded "promotion"
loop selection logic, which is hard to represent (unless you pass `method="reduce"`
to some dtype/loop related setup).
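This hardcoded reduction rule is visible today:

```python
import numpy as np

a = np.ones(3, dtype=np.int8)
# the element-wise loop stays at int8:
assert (a + a).dtype == np.dtype(np.int8)
# but the add-reduction (sum) is special-cased to use at least the
# default integer (platform long) precision:
assert a.sum().dtype.itemsize >= np.dtype(np.int_).itemsize
```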
#### Existing Implementations:
<details><summary> Julia and XND </summary>
* Generally use "exact" signature
* promotion rules can be registered `type1, type2 → typeX` with `promote`
* `promote` can promote any number of arguments of course
* Super type (`Number`) implementation of functions (math operators):
* only used if no more specific implementation is found
* calls `promote(*args)` and tries again
* Does not know about promotion as such.
* gumath seems to simply stuff promotion into the ufunc
loops, i.e. you register all loops that you may want to use.
  (I do not think there is casting involved at all in function calls?)
  For xnd, the interesting part is resolving the shape part of the
  signature.

</details>
#### Details about casting
We currently have the following casting levels:
* unsafe casting
* safe casting
* same kind casting
* equivalent (byte order changes)
* No casting
And python coercion (can be seen as unsafe casting to/from PyObject):
* *to PyObject* (a special type of casting used by `item`)
* *from PyObject* (maybe these are identical to unsafe casting)
I think it may be possible to get around `same_kind` casting, but
maybe there is also not much reason for it.
There is also the concept of *construction* which we may also need for
`np.array(..., dtype=dtype)` logic. [Julia's reasoning](https://docs.julialang.org/en/v1/manual/conversion-and-promotion/#Conversion-vs.-Construction-1); although possibly it can be seen as unsafe casting?
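These levels map directly onto today's `np.can_cast`:

```python
import numpy as np

assert np.can_cast(np.int16, np.float64, casting="safe")
assert np.can_cast(np.float64, np.float32, casting="same_kind")    # same kind, lossy
assert not np.can_cast(np.float64, np.int16, casting="same_kind")  # different kind
assert np.can_cast(np.float64, np.int16, casting="unsafe")
# "equiv" only permits byte-order changes:
assert np.can_cast(np.dtype("<i4"), np.dtype(">i4"), casting="equiv")
assert not np.can_cast(np.int32, np.int64, casting="equiv")
# "no" requires the identical dtype:
assert np.can_cast(np.int32, np.int32, casting="no")
```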
###### Other Notes:
* Should the inner loop dispatcher be allowed to override casting?
  This may be useful, e.g. to allow unsafe casting when the existing loop
  you want to reuse has, for example, a more precise output dtype.
* Inheritance/chained casting:
  * `MyInt(int)` could inherit casting from `int`, but often that
    probably does not make sense. However, unsafe casting may make sense.
  * Type1 → Type2 and Type2 → Type3 may be defined, but not Type1 → Type3.
    (Probably we should just not – accidentally – allow this chaining?)
* Dtype discovery for `np.array` (unsupported for user types?)
#### Details about setup/teardown
At some point, we have to discover which inner loop to use and give
the ufunc a chance to set other information:
* Discover the correct output dtype (to get there, promotion of the input
dtypes may be necessary!).
* Since some loops may be mixed and others are not (timedelta * float
is OK, but timedelta * timedelta is not), it seems like pushing
this into promotion may be simpler (promotion would know the ufunc!?)
* Return the inner loop function (in some form or another)
* It could be plausible to just return inner loop types, such as
`"f,f->f"` which together with the ufunc name defines the inner loop
* More likely: expose `np.add.loop["f,f->f"]` using some PyCapsule
style wrapper object.
* Set whether the inner loop requires the Py-API (maybe tagged onto the
  inner-loop object itself; e.g. Python-implemented ones always need it
  anyway, ours never will).
* Run additional setup code:
* Setup working memory
* Clear error flags
* Setup a teardown function to:
* Free working memory
* Check error flags and give warnings, or raise errors
* Kwargs to ufuncs, forwarded to inner-loops?
  * parameter-type arguments (e.g. a precision argument)
  * Resolved during ufunc setup?
* Broadcastable arguments seem difficult/out of scope
(Example: `np.clip(arr, minval=None, maxval=None)`)
* `arr.view(new_dtype)` is buggy if types use their own storage area:
* Could have a flag/reuse HASREF (probably have to?)
* (Other things to keep in mind?)
* Depending on type may or may not make sense?
(object → object is OK, but for a type with metadata and refs
it can probably go both ways)
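A Python toy of the setup/teardown flow sketched above (every name here — `LoopInfo`, `run_loop` — is made up for illustration):

```python
import numpy as np

class LoopInfo:
    """Hypothetical container for a resolved inner loop (made-up name)."""
    def __init__(self, inner, setup=None, teardown=None, needs_api=False):
        self.inner = inner
        self.setup = setup        # e.g. clear FPU flags, allocate working memory
        self.teardown = teardown  # e.g. check FPU flags and warn/raise, free memory
        self.needs_api = needs_api

def run_loop(info, *chunks):
    state = info.setup() if info.setup else None
    try:
        for chunk in chunks:      # stand-in for the buffered/outer iteration
            info.inner(chunk)
    finally:
        if info.teardown:
            info.teardown(state)

# usage with dummy hooks, just to show the call order:
log = []
info = LoopInfo(inner=lambda c: log.append(int(c.sum())),
                setup=lambda: log.append("setup"),
                teardown=lambda state: log.append("teardown"))
run_loop(info, np.arange(3), np.arange(3))
assert log == ["setup", 3, 3, "teardown"]
```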
### What needs to happen if we call a ufunc:
1. Ask dtypes if they implement a ufunc loop that should be used
* TypeResolution step returns the loop which should be used
and thus output dtype. Although that could still be cast in principle.
* *If none found*: Additional promotion step here (like Julia)?
2. Decide if casting input to the output can be handled.
3. Run the inner-loop:
   1. Allow for inner-loop specific setup:
      * e.g. FPU error clearing
   2. Run the inner loop until finished or a stop flag is given
   3. Allow for inner-loop specific teardown:
      * e.g. FPU error checking
→ What type of access should we give these steps? E.g. access to the values?
For comparison, what currently happens:
1. TypeResolver of the ufunc gets run (these currently do not really know about user types)
   * This typically calls `ResultType` to find the output type
   * (often) does a linear search over the existing loops
     (unless there is a specific TypeResolver for the function)
2. Loop selector is run; this is set on the ufunc object:
   * Finds the actual loop
   * Can force `needs_api` (otherwise the iterator will decide)
3. Ufunc machinery decides on what casting is necessary
4. Runs the loop:
   * Run the loop until it finishes (except breaking on PyErrors)
   * Check for floating point errors (when done)
*Main issue:* Everything is tagged onto the ufunc object, so the ufunc object would have to ask the dtypes specifically.
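As a toy sketch of the dispatch sequence above (exact loop lookup, promote on failure, then cast and run; all names invented):

```python
import numpy as np

def call_ufunc(loops, promote, cast, *arrays):
    """Toy dispatch: exact loop lookup -> promote -> retry -> cast -> run."""
    in_types = tuple(a.dtype for a in arrays)
    loop = loops.get(in_types)
    if loop is None:
        # no exact loop found: promote the inputs (Julia-style) and retry
        common = promote(*in_types)
        loop = loops.get((common,) * len(arrays))
        if loop is None:
            raise TypeError(f"no loop for {in_types}")
        arrays = [cast(a, common) for a in arrays]
    return loop(*arrays)

# a "ufunc" with a single float64,float64 loop:
loops = {(np.dtype(np.float64),) * 2: lambda a, b: a + b}
res = call_ufunc(loops, np.promote_types, lambda a, dt: a.astype(dt),
                 np.arange(3, dtype=np.int32), np.arange(3, dtype=np.float64))
assert res.dtype == np.dtype(np.float64)
```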
### C-API vs. Python API
* Make wrapping elementwise functions into inner-loops easy
  * Probably similar to `np.frompyfunc`, except it would return some
  * Type annotations could be nice to use here as well.
* Provide a way to register/give existing inner-loops (or say cython
defined inner loops easily).
* It should be OK to have a python object that implements a few
  fast loops in Cython and tags them on from python during instantiation
* Something like PyCapsule, or NpyInnerLoopCapsule?:
* Requires API flag?
* Leave room for other flags? (e.g. optimization hints
or even alignment requirements)
From the python side, it may be an alternative to simply view the array
and then call `np.add` again, OTOH that may force some consistency
checking/wrapping to make sure the python side cannot just return a wrong
dtype or shape.
* Inner-loop registration and capsuling should be similar.
* Need to make existing inner-loops "capsules" available probably?
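For the Python side, the closest existing tool is `np.frompyfunc`; it shows both the convenience and the limitation (everything goes through object dtype, with no fast typed inner loop):

```python
import numpy as np

# np.frompyfunc wraps a Python scalar function into a ufunc-like object:
hypot_py = np.frompyfunc(lambda a, b: (a * a + b * b) ** 0.5, 2, 1)
out = hypot_py(np.array([3.0, 5.0]), np.array([4.0, 12.0]))
assert out.dtype == object
assert float(out[0]) == 5.0 and float(out[1]) == 13.0
```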
### Arguments for and against making descriptors types
* Feels logical for simple scalars and `float32_arr.dtype(3)`
and `isinstance(float32_arr, float32_arr.dtype)` is nice.
* It is probably possible to change this later without a large compatibility break.
* Implementation needs Metaclasses, which are somewhat harder to reason about
(Although it is also not very hard probably).
* Say I want to create a dtype for `decimal.Decimal`. It is not
  possible to change `Decimal` to add numpy-specific information, so
  it has to be off-loaded into a dtype/descriptor with `npdecimal.type is Decimal`.
* (There is at least some discussion whether scalars are even a good idea.)
#### Ufunc properties
* Ufuncs have an `identity`; should this move to the loop implementation
  (or be overridable)? The loop implementation knows the correct output type.
  (It could live on the dtype, but that seems unnecessary/strange?)
#### Ufunc signatures
The ufunc inner loops are pretty limited right now. I am (personally) not
in favor of bloating the API too much, but we may want to add some things
to it while we are at it:
1. Possibly a return value to signal errors or a stop?
2. Better payload/metadata values to use for custom dtypes, these
may already fit into the current pointer we have. But are definitely
3. Possibly more fields related/in gufuncs?
Plausibly, the old ufuncs could receive a very lightweight wrapper.