Note that I continued part of this in new documents which are not yet online, but this document should still be relevant because it lists many things (including some smaller side notes).

Prepared Document for Meeting: https://hackmd.io/B5TPPP-8QiKOFODPqmXtfA

TODO Document restructuring/new document

I will probably try to split the document up into:

  1. Requirements (possibly split into different sections)
  2. Decisions that we probably need to make
  3. Either in 2. or separately, some suggestions for it.

It may be interesting to have some python dummy implementation.

NumPy Dtype Requirements and Approaches

This is a work in progress to try to structure my thoughts.
It may be a bit random and some of these are probably premature.

Please feel free to change/edit this!

Nomenclature

Depending on how types/instances are created, things can be confusing, so I would suggest the following terms:

  • dtype: a class (whether an actual Python type or not)
  • descriptor (or a dtype): the object tagged onto the array (an instance of a dtype)
  • scalar type: a type which, when instantiated, provides the scalar instance
  • a scalar: an instance of a scalar type.
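
For reference, a minimal sketch of how these terms map onto current NumPy:

import numpy as np

arr = np.arange(3, dtype="float64")
descr = arr.dtype                      # the descriptor tagged onto the array
assert isinstance(descr, np.dtype)     # np.dtype plays the role of the dtype class
scalar = arr[0]                        # a scalar
assert isinstance(scalar, descr.type)  # descr.type is the scalar type (np.float64)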

Just to note, the names below are just working names, and they would all need an __array_ or __numpy_ prefix of course.

Please try to list all related documents here (in case I missed some):

Descriptor (dtype instance) Requirements

  • Provides an immutable instance acting as the descriptor tagged to the array
    • Maybe for some dtypes with their own storage management
      this would not hold (then again, it probably would, since
      if the storage changes it is likely some kind of shared
      storage belonging to the dtype class and not to the descriptor)
  • Dtype comparison and hash
  • Allow overriding:
    • Representation
    • slots (ArrFuncs) from python and C.
    • Inner loop implementations
      • Probably at instantiation time, so that a python
        callback can be set at the "slot" level (I guess this is
        what python does as well).
  • Casting and promotion rules
    • Decision needed: How do subclasses inherit these?
    • Should a new type be allowed to make an impossible loop
      possible easily? (thoughts below depend on this!)
      • Say a unit dtype could define timedelta * timedelta -> unit[timedelta**2],
        even though datetime itself does not know that this is plausible.
      • Thinking more about this: probably not immediately, but
        if we allow registering arbitrary loops, it becomes
        possible (depends on who we ask).
    • Is it OK if import order/loop-adding order can affect results?
    • It would be nice to stay close to __array_ufunc__
      when it comes to implementation.
      • Assuming we go that way, we probably need caching?
      • If we have a unit[m], is the casting handled at the
        unit level?
    • Currently I do not think promotion rules really exist
      aside from "result_type" (and its less-than-ideal scalar
      rules).
      • Promotion rules may be ufunc dependent, but np.result_type
        does try to handle them generically.
      • While value-based promotion is painful, it would be good to be
        able to support it?! (this also affects possible caching)
      • Promotion could be handled by the ufunc loop:
        • Input dtypes may not realize a loop exists?
        • Could fall back to asking np.result_type for a common type
          (promotion) on failure (this is what Julia does, I think)
      • A np.result_type-like common type operation is also needed
        • i.e. for concatenate
        • Casting tables could solve this (safe-casting graph of depth 1?)
        • Ask all involved dtypes (but what if there is a more capable one
          which knows how to cast them all, say a unit which understands
          ints and datetime)? See the sketch after this list.
  • Ufunc hook:
    • Dtypes should be able to reuse existing inner-loops.
      • Do we need to expose them somehow (including to python),
        or is it enough to have a way to do this for subclassing?
    • But they need to be able to run setup and teardown code?
      • Reason: units need to inspect the inputs and find the
        output descriptor. However, after they have done this, they
        can simply use the normal inner-loops.
      • Currently, we use the type resolver ufunc
        functions for this. It needs to be possible to
        inject logic here for custom dtypes!
    • Dtype hooks run after __array_ufunc__
      (just noting seems logical and pretty obvious).
  • Storage of metadata (e.g. similar to units)
  • Storage of additional data for ragged arrays/variable-length strings.
  • "Reference counting"
  • Association with a scalar type; this is currently done through dtype.type
    (so it already exists).
  • Extensible ArrFunctions:
    • Example: New sorting implementations
      (current timsort hack).
    • We must be free to add new slots/methods to dtypes
      (although sometimes we should likely move them to functions)
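
A rough sketch of the "ask all involved dtypes" idea for the common type operation mentioned above (all names, in particular __common_dtype__, are hypothetical placeholders):

def common_dtype(*dtypes):
    # Hypothetical: give every involved dtype a chance to propose a common
    # type, so that a dtype which "understands" all the others (say a unit
    # dtype that knows ints and datetimes) can win.
    for dt in dtypes:
        result = dt.__common_dtype__(dtypes)
        if result is not NotImplemented:
            return result
    raise TypeError("no common dtype for {}".format(dtypes))
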
Other Possible Requirements:
  • dtype parsing capability, for dtype="MyType[unit]".
    (Frankly, I am not convinced of this at all, but it can wait
    in any case. Mentioned in https://github.com/numpy/numpy/wiki/Dtype-Brainstorming)
  • "fused types" on the python level as mentioned by Matti and
    this PR:
    * np.array(b, dtype=np.blasable) for (s, d, c, z)
    * np.array(b, dtype=np.floating)
    Are these like ABCs that types register to? Is this similar to
    a flexible type?
    They could also be used for casting
  • np.load and np.save would require pickling for user dtypes,
    that seems very annoying.
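
A minimal sketch of the ABC reading of "fused types", using Python's abc machinery (Floating here is a hypothetical python-level stand-in, not an existing NumPy object):

import abc
import numpy as np

class Floating(abc.ABC):
    """Hypothetical abstract dtype category on the python level."""

Floating.register(np.float32)
Floating.register(np.float64)

assert issubclass(np.float64, Floating)
# np.array(b, dtype=Floating) would then have to pick a concrete member.
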
List of (current) ArrFunction slots/attributes
  • get- and setitem (Python/PyObject coercion casting)
  • copyswap and copyswapn (copying dtype)
  • compare (comparison function, like Python __cmp__ slots)
  • argmax/argmin
  • dot product
  • scanfunc (parsing an ASCII file)
  • fromstr (parse a single string)
  • nonzero
  • arange (FillFunc) implementation
  • fill with scalar (fillwithscalar)
  • Sorting and Argsorting implementations (fixed size)
  • "Fast" functions:
    • clip, putmask, take
Other slots:
  • casting dictionary (castdict)
    • Cast between user types
  • Casting rules:
    • ScalarKindFunc (same kind casting)
    • can cast scalar kind to
Flags
  • Item Refcount
  • Has Object.
  • Convert to list for pickling (TODO: What is this?)
  • Is pointer: Item is a pointer (For extension types?)
  • Needs Init (e.g. objects need initialization, no "empty")
  • Needs PyAPI
  • Use Getitem/Setitem for extracting the 0-D array from a scalar
    • Rational/Usertypes in general need to define this I think?
    • Object arrays use this
  • (Aligned Struct)

Possibly nice to haves?

  • isinstance(scalar, descriptor) could be nice, but that
    does not necessarily mean that descriptor is scalar_type?
    • We already have (dtype|descriptor).type, which may be
      easier to think about in any case.
    • I am also wondering if we can make things such as a
      decimal_dtype, in a sense specializing python types
      (which would mostly mean asserting the output type and
      possibly exposing some of its methods)
  • A place to put dtype specific ufuncs?
  • (Seb.) A convenient/standard place to put something like methods?
    MyDate.normalize(arr)  # Use dtype namespace?
    # Could expose as a method-like (thought for later)
    arr.for_each_element.normalize()
    
    • Should also work for Operators/dunder methods.

Proposals

The specific proposals are outdated and need some more thought.

  1. Let's not worry about making the descriptor a type (and thus the same as scalars). This could probably be added later, but I/we are not convinced it is a good idea.
  2. The exact steps still need to be decided, but:
    • We should aim to keep close to other APIs (__array_ufunc__)
  3. (More?)

Proposed API for subclassing numpy dtypes. This is really brainstorming right now:


import numpy as np


class dtype(object):
    def __promote__(self, other):
        """Used by np.promote_types; identical to result_type
        but without the scalar logic.
        """
        if isinstance(other, type(self)):
            # make sure to give priority to subclasses:
            self, other = other, self

        if self.__can_cast__(other):
            return other
        if other.__can_cast__(self):
            return self

        raise TypeError("cannot promote to a common type.")


class unit(dtype):
    itemsize = 8
    type = pyunit  # placeholder: some python-level unit scalar type

    def __new__(cls):
        # what is specifically needed here?
        pass

    def __item_unpack__(self, val):
        """How to do this as low level loops?"""
        return self.type.from_bytes(val)

    def __item_pack__(self, obj):
        if not isinstance(obj, self.type):
            raise ValueError
        return obj.to_bytes()

    @classmethod  # not sure?
    def __promote__(cls, dt1, dt2):
        """By default, checks can_cast(dt1, dt2)."""
        return np.result_type(dt1, dt2)

    @classmethod
    def __can_cast__(cls, dt1, dt2, casting="safe"):
        return True  # or False, or unit["m"], or a loop?

    @classmethod
    def __get_loop__(cls, ufunc, in_types, out_types):
        """
        Alternatively, register loops and:
            * check exact loops → promote → check loops again
            * (loop could still refuse)
        """
        if not_possible:  # pseudocode condition
            return NotImplemented
        return UfuncLoop


class UfuncLoop:
    # probably largely C-slots, but could be filled from python
    inner_loop = None
    setup_loop = None     # e.g. FloatClearErr
    teardown_loop = None  # e.g. FloatCheckErr
    needs_api = False     # flag (or allow setup to set/override?)
    identity = NotImplemented
    # more flags to add

    # Other or even extendable things?:
    specialized_inner_loops = ()  # contiguous (copy code), AVX?

Casting inner loops basically look like "Type1->Type2" ufuncs,
so whatever API we end up using, unary ufunc calls and the final
casting call should likely look identical. In fact, they could
probably be identical.
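
A sketch of that symmetry (CastLoop is a hypothetical name, mirroring the UfuncLoop sketch above):

class CastLoop(UfuncLoop):
    # A cast is structurally a one-input/one-output loop, just like a
    # unary ufunc loop such as np.negative's "d->d".
    def __init__(self, from_dtype, to_dtype, inner_loop):
        self.signature = (from_dtype, to_dtype)
        self.inner_loop = inner_loop  # same calling convention as a ufunc loop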

Details about promotion

Promotion is necessary if there is no ufunc loop for the specific types.
Pushing it to the objects may also allow hacking in the value-based
promotion that we are currently stuck with.

We do have some "special" promotion rules currently hacked in, first
thing that comes to mind is integer addition using at least long precision.

Should check current TypeResolution functions for other special cases.

The question is how to split it up. We currently have np.result_type, and
it would be nice if that could just call the __promote__ logic.
This is also used elsewhere, e.g. in concatenate.

If the promotion gets additional information, it could handle most
of the things done in setup (see below), making the default setup
basically a no-op.

Other things to keep in mind:
  • Flexible dtypes may be an issue. If promotion does not know about
    the ufunc, it may have to return a generic string (or something like a
    generic string user dtype).
    • One possible thought: have dtype=IntWithUnit but the unit still
      is flexible? Note that this is likely such a corner case that
      I am not sure we have to worry about it.
    • Flexible types are an option to handle promotion though:
      • unit * unit -> unit, and the ufunc setup time check
        decides which unit.
  • Special rule for addition reduction (sum)! This is hardcoded "promotion"
    loop selection logic, which is hard to represent (unless you pass method="reduce"
    to some dtype/loop related setup; see the sketch below).
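
A sketch of ufunc-aware promotion that could absorb the hardcoded sum rule (promote_for_ufunc and its signature are hypothetical):

import numpy as np

def promote_for_ufunc(ufunc, method, dtypes):
    # Hypothetical hook: promotion that knows the ufunc and the call method,
    # so special rules need not be hardcoded in the ufunc machinery.
    if ufunc is np.add and method == "reduce":
        # simplified version of the current rule: bool/int sums use at
        # least the default integer precision (unsigned ints analogous)
        dtypes = [np.promote_types(dt, np.int_) if dt.kind in "bi" else dt
                  for dt in dtypes]
    result = dtypes[0]
    for dt in dtypes[1:]:
        result = np.promote_types(result, dt)
    return result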

Existing Implementations:

Julia and XND
Julia:
  • Generally uses "exact" signatures
  • promotion rules can be registered as type1, type2 → typeX with promote
    • promote can promote any number of arguments, of course
  • Supertype (Number) implementations of functions (math operators):
    • are only used if no more specific implementation is found
    • call promote(*args) and try again
xnd-project:
  • Does not know about promotion as such.
  • gumath seems to simply fold promotion into the ufunc
    loops, i.e. you register all loops that you may want to use.
    (I do not think there is casting involved in function calls at all?
    For xnd, the interesting part is probably resolving the shape part
    of the datashape.)
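
Julia's promote-and-retry fallback, sketched in Python (find_exact_loop and promote are placeholders for the registered lookup and promotion rules):

def call(op, *args):
    # try the exact signature first; otherwise promote all arguments
    # with the registered rules and try once more
    loop = find_exact_loop(op, tuple(type(a) for a in args))
    if loop is None:
        args = promote(*args)
        loop = find_exact_loop(op, tuple(type(a) for a in args))
    if loop is None:
        raise TypeError("no matching loop found")
    return loop(*args)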

Details about casting

Note the different levels of casting. We currently have:

  • unsafe casting
  • safe casting
  • same kind casting
  • equivalent (byte order changes)
  • no casting
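
For reference, how these levels behave with the current np.can_cast:

import numpy as np

assert np.can_cast("f8", "f4", casting="unsafe")
assert not np.can_cast("f8", "f4", casting="safe")   # would lose precision
assert np.can_cast("f8", "f4", casting="same_kind")  # both are floats
assert np.can_cast("<i4", ">i4", casting="equiv")    # byte order change only
assert not np.can_cast("<i4", ">i4", casting="no")   # not the identical dtype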

In addition, there is python coercion (which can be seen as unsafe casting to/from PyObject):

  • to PyObject (a special type of casting used by item)
  • from PyObject (maybe these are identical to unsafe casting)

I think it may be possible to get around same_kind casting, but
maybe there is also not much reason for it.

There is also the concept of construction, which we may also need for
np.array(..., dtype=dtype) logic (compare Julia's reasoning), although
possibly it can be seen as unsafe casting?

Other Notes:
  • Should the inner loop dispatcher be allowed to override casting?
    This may be useful, e.g. to allow unsafe casting when the existing loop
    you want to reuse has a more precise output dtype.
Issues/Discussion?:
  • Inheritance/chained casting:
    • MyInt(int) could inherit casting from int, but often that probably
      does not make sense. However, unsafe casting may make sense.
    • Type1 → Type2 and Type2 → Type3 may be defined, but not Type1 → Type3
      (probably we should just not – accidentally – allow chaining this?)
  • Dtype discovery for np.array (unsupported for user types?)

Details about setup/teardown

At some point, we have to discover which inner loop to use and give
the ufunc a chance to set other information:

  • Discover the correct output dtype (to get there, promotion of the input
    dtypes may be necessary!).

    • Since some loops may be mixed and others are not (timedelta * float
      is OK, but timedelta * timedelta is not), it seems like pushing
      this into promotion may be simpler (promotion would need to know
      the ufunc!? See the sketch after this list.)
  • Return the inner loop function (in some form or another)

    • It could be plausible to just return inner loop types, such as
      "f,f->f" which together with the ufunc name defines the inner loop
    • More likely: expose np.add.loop["f,f->f"] using some PyCapsule
      style wrapper object.
  • Set whether the inner loop requires the Py-API (maybe tagged onto the
    inner loop object itself; e.g. python-implemented ones always need it
    anyway, ours never will).

  • Run additional setup code:

    • Setup working memory
    • Clear error flags
  • Setup a teardown function to:

    • Free working memory
    • Check error flags and give warnings, or raise errors
  • Kwargs to ufuncs, forwarded to inner-loops?

    • parameter-type arguments (e.g. a precision argument)
    • Resolved during ufunc setup?
    • Broadcastable arguments seem difficult/out of scope
      (Example: np.clip(arr, minval=None, maxval=None))
  • arr.view(new_dtype) is buggy if types use their own storage area:

    • Could have a flag/reuse HASREF (probably have to?)
    • (Other things to keep in mind?)
    • Depending on type may or may not make sense?
      (object → object is OK, but for a type with metadata and refs
      it can probably go both ways)
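
As a concrete illustration of the output-dtype discovery above, a hypothetical setup hook for a unit dtype (unit_dtype is a made-up name; the loop lookup follows the np.add.loop["f,f->f"] idea mentioned in the list): it resolves the output descriptor from the input units and then reuses the plain float64 multiply loop.

import numpy as np

def multiply_setup(in_descrs):
    # compute the output unit from the inputs, then reuse the existing
    # float64 inner loop instead of writing a new one
    u1, u2 = in_descrs[0].unit, in_descrs[1].unit
    out_descr = unit_dtype(base=np.float64, unit=u1 * u2)
    inner_loop = np.multiply.loop["d,d->d"]  # hypothetical loop lookup API
    return out_descr, inner_loop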

API

What needs to happen if we call a ufunc:

  1. Ask dtypes if they implement a ufunc loop that should be used

    • The TypeResolution step returns the loop which should be used,
      and thus the output dtype (although that could still be cast in principle).
    • If none is found: an additional promotion step here (like Julia)?
  2. Decide if casting input to the output can be handled.

  3. Run the inner-loop:

    1. Allow for inner-loop specific setup:
      • e.g. FPU error clearing
    2. Run the inner loop until finished or a stop flag is given
    3. Allow for inner-loop specific teardown:
      • e.g. FPU error checking

    → What type of access should we give these steps? E.g. access to values?
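
The whole sequence as a Python sketch (resolve_loop is a placeholder; the loop attributes follow the hypothetical UfuncLoop above):

def call_ufunc(ufunc, *arrays):
    # 1. ask the dtypes for a loop (with a promotion fallback)
    loop = resolve_loop(ufunc, [arr.dtype for arr in arrays])
    # 2. the casting decision for inputs/outputs would go here
    # 3. run the loop, bracketed by setup and teardown
    state = loop.setup_loop() if loop.setup_loop else None  # e.g. clear FPU flags
    try:
        return loop.inner_loop(*arrays)
    finally:
        if loop.teardown_loop:
            loop.teardown_loop(state)  # e.g. check FPU flags, warn or raise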

Currently:

  1. The TypeResolver of the ufunc gets run (these currently do not really know about user types)
    • This typically calls ResultType to find the output type
    • (often) does a linear search over the existing loops
      (unless there is a specific TypeResolver for the function)
  2. Loop selector is run, this is set for the ufunc object:
    • Finds the actual loop
    • Can force needs_api (otherwise the iterator will decide)
  3. Ufunc machinery decides on what casting is necessary
  4. Runs the loop:
    • Run the loop until it finishes (except breaking on PyErrors)
    • Check for floating point errors (when done)

Main issue: everything is tagged onto the ufunc object, so the ufunc object would have to ask the dtypes specifically.

C-API vs. Python API

Some thoughts:

Py-API:

  • Make wrapping elementwise functions into inner-loops easy
    • Probably similar to np.frompyfunc, except it would return some
      capsule (see the example after this list).
    • Type annotations could be nice to use here as well.
  • Provide a way to register/pass in existing inner-loops (or, say, Cython
    defined inner loops) easily.
    • It should be OK to have a python object that implements a few
      fast loops in Cython and tags them on from python at instantiation
      time?
    • Something like PyCapsule, or NpyInnerLoopCapsule?:
      • Requires an API flag?
      • Leave room for other flags? (e.g. optimization hints
        or even alignment requirements)
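
For reference, the existing precedent: np.frompyfunc wraps a Python callable into an object-dtype ufunc; the proposal here would return a loop "capsule" instead.

import numpy as np

# wrap a Python callable taking 2 inputs and returning 1 output
add_pairs = np.frompyfunc(lambda a, b: a + b, 2, 1)
print(add_pairs(np.arange(3), np.arange(3)))  # object array: [0 2 4]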

From the python side, it may be an alternative to simply view the array
and then call np.add again; OTOH that may force some consistency
checking/wrapping to make sure the python side cannot just return a wrong
dtype or shape.

C-API:

  • Inner-loop registration/capsuling should mirror the Python API.
  • We probably need to make the existing inner-loops available as "capsules"?

Arguments for and against making descriptors types

Pro:

  • Feels logical for simple scalars and float32_arr.dtype(3)
    and isinstance(float32_arr[0], float32_arr.dtype) is nice.

Against:

  • It is probably possible to change later without a large compatibility
    break.
  • The implementation needs metaclasses, which are somewhat harder to
    reason about (although it is probably also not very hard).
  • Say I want to create a dtype for decimal.Decimal. It is not
    possible to change Decimal to add numpy specific information, so
    off-loading it into a dtype/descriptor with npdecimal.type is Decimal
    seems easier?
  • (There is at least some discussion, whether scalars are even a good idea
    within numpy.)
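
A minimal sketch of what the metaclass route would involve (DTypeMeta and float64_descr are hypothetical): descriptors become classes whose instance checks delegate to the scalar type.

class DTypeMeta(type):
    def __instancecheck__(cls, obj):
        # make isinstance(scalar, descriptor) delegate to the scalar type
        return isinstance(obj, cls.type)

class float64_descr(metaclass=DTypeMeta):
    type = float  # stand-in for the np.float64 scalar type

assert isinstance(3.0, float64_descr)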

Ufunc properties

  • Ufuncs have an identity; should this move to the loop implementation
    (or be overridable)? The loop implementation knows the correct output type.
    (It could live on the dtype, but that seems unnecessary/strange?)

Ufunc signatures

See: https://github.com/numpy/numpy/issues/12518

The ufunc inner loops are pretty limited right now. I am (personally) not
in favor of bloating the API too much, but we may want to add some things
to it while we are at it:

  1. Possibly a return value to signal:
    • StopIteration
    • Error
  2. Better payload/metadata values to use for custom dtypes; these
    may already fit into the current pointer we have, but are definitely
    necessary.
  3. Possibly more fields related to/in gufuncs?
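
A sketch of points 1 and 2, mirroring the current C loop signature loop(args, dimensions, steps, data) in Python (the status constants and the payload argument are hypothetical):

OK, STOP_ITERATION, ERROR = 0, 1, -1

def add_loop(args, dimensions, steps, payload):
    # same layout as the current C signature, plus a status return value
    # and a dtype-specific payload instead of the bare data pointer
    in1, in2, out = args
    s1, s2, so = steps  # strides, schematic (element steps rather than bytes)
    for i in range(dimensions[0]):
        out[i * so] = in1[i * s1] + in2[i * s2]
    return OK  # could return STOP_ITERATION or ERROR to signal early exit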

Plausibly, the old ufuncs could receive a very lightweight wrapper.
