Weekly Xarray Flexible Indexes Refactor Meeting Notes
Jan 11 – Stephan: do we have a meeting today?
Deepak: I guess not?
Stephan: Yes, I guess not. I emailed Benoit.
December 7, 2021
Refactor `concat` (`_dataset_concat` internals):
- `_calc_concat_dim_coord` […] `PandasIndex` instance too.
- `_calc_concat_over` […] `concat_over`, append to the latter all coordinate names related to the index (even coordinates that won't be concatenated). There's no need to merge coordinate variables that could possibly be created from the concatenated meta-index (i.e., returned by `Index.concat()`).
- […] `collect_variables_and_indexes` and `merge_collected` directly for convenience (not sure about the impact on performance…)
- […] `collect_variables_and_indexes` so that we can skip `concat_over`
- […] `Index.concat()` is implemented) and create new coordinate variables from the resulting index (if `Index.create_variables()` is implemented)
- `concat_vars()`
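The "concatenate the index if it can, create variables if it can, otherwise fall back" flow can be sketched roughly as below. This is a toy model under stated assumptions, not the code in `_dataset_concat`: `ToyIndex`, `ConcatenableIndex` and `concat_indexed_coord` are hypothetical names, and only `concat` and `create_variables` mirror method names from the notes.

```python
from typing import Dict, List, Optional, Sequence, Tuple


class ToyIndex:
    """Hypothetical base index: subclasses may implement either method."""

    def __init__(self, name: str, labels: Sequence):
        self.name = name
        self.labels = list(labels)

    @classmethod
    def concat(cls, indexes: Sequence["ToyIndex"]) -> "ToyIndex":
        raise NotImplementedError

    def create_variables(self) -> Dict[str, List]:
        raise NotImplementedError


class ConcatenableIndex(ToyIndex):
    """Hypothetical index that supports both concat and variable creation."""

    @classmethod
    def concat(cls, indexes):
        labels = [label for idx in indexes for label in idx.labels]
        return cls(indexes[0].name, labels)

    def create_variables(self):
        return {self.name: list(self.labels)}


def concat_indexed_coord(
    indexes: Sequence[ToyIndex],
) -> Tuple[Optional[ToyIndex], Dict[str, List]]:
    """Return (new index or None, coordinate variables) for one indexed coordinate."""
    name = indexes[0].name
    # Plain concatenation of the coordinate values, used whenever the index
    # cannot produce the variables itself.
    fallback = {name: [label for idx in indexes for label in idx.labels]}
    try:
        new_index = type(indexes[0]).concat(indexes)
    except NotImplementedError:
        # No concat support: the index is dropped, the coordinate is kept.
        return None, fallback
    try:
        return new_index, new_index.create_variables()
    except NotImplementedError:
        return new_index, fallback
```

With two `ConcatenableIndex` objects the result keeps an index; with two plain `ToyIndex` objects the index comes back as `None` and only the concatenated coordinate survives.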
November 23, 2021
Useful identity:
ds == concat([ds.isel(x=slice(5)), ds.isel(x=slice(5, None))], dim='x')
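The identity can be checked on a small dataset. A minimal sketch, assuming `numpy` and `xarray` are installed; the variable name `temperature` and the sizes are made up for illustration. It uses `identical()` rather than `==`, since `==` on datasets compares element-wise and returns a dataset.

```python
import numpy as np
import xarray as xr

# Small dataset with one indexed dimension coordinate "x".
ds = xr.Dataset(
    {"temperature": ("x", np.arange(10.0))},
    coords={"x": np.arange(10)},
)

# Split along x, then concatenate the two halves back together.
parts = [ds.isel(x=slice(5)), ds.isel(x=slice(5, None))]
roundtrip = xr.concat(parts, dim="x")

# The round trip should give back the same data, coordinates and index.
assert roundtrip.identical(ds)
```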
Do we need to refactor `concat` now? (IMO: yes)
- […] `ds.indexes`.
- […] `Dataset.__init__()`
- […] `concat` works with one variable at a time)
Refactor `concat`:
- […] `Index.concat` abstract class method, which basically does the same internally as `IndexVariable.concat`. The latter won't be needed anymore.
- […] `Index.concat` is not implemented, index is dropped in the resulting `Dataset` or `DataArray` and fallback to coordinate variable(s)
- […] `concat`.
- […] `merge` implementation, still loop through variables and cache `concat` of multi-coordinate indexes
- […] `concat`, so extracting the indexes to concat together should be relatively straightforward
November 09, 2021
Status: https://github.com/pydata/xarray/pull/5692
TODO:
- `concat` (+ dependents like `groupby` (combine) and `to_stacked_array`)
October 26, 2021
- `merge_core`: check for potential index conflicts when `prioritized is not None`?
- `create_index` option for `.stack()`
October 19, 2021
https://github.com/pydata/xarray/issues/5647
October 12, 2021
https://github.com/pydata/xarray/issues/5647#issuecomment-937865982
October 5, 2021
September 28, 2021
Status: https://github.com/pydata/xarray/pull/5692
TODO:
Remaining issues (70+ failing tests):
- […] `swap_dims`, `update`, etc.)
- […] `swap_dims`, `to_stacked_array`, `polyfit`, `isel`, etc.)
Implicit creation of coordinates from multi-index:
Should Xarray indexes provide optional implementation of `isel`?
- […] `Index.query` to `Index.sel`? And add `Index.isel`?
Assign / update coordinates: maybe drop (multi-coordinate) indexes
Allow non-1D variables with a name that matches one of their dimensions.
Remove `.level_coords`
September 14, 2021
Status: https://github.com/pydata/xarray/pull/5692
- `rename_*`
- `set_index`, `reset_index`, `reorder_levels`
(notebook showcase)
Small changes in behavior (no API change), mostly to get rid of "hacky" workarounds due to the limitations of the "index/dimension" coordinates concept (we don't need it anymore with the new data model). Is it ok to introduce those changes now? Or should we go through some smooth transition?
See https://github.com/pydata/xarray/issues/4825#issuecomment-916974087
September 7, 2021
Index.query API
https://github.com/pydata/xarray/pull/5692#issuecomment-914207743
August 31, 2021
Explicit indexes
https://github.com/pydata/xarray/pull/5692
Normalize label-based indexers before passing them to `Index.query`? https://github.com/pydata/xarray/issues/5697
(Benoît): probably better to let indexes take care of normalization.
- […] `Variable` or `DataArray`)
- A lot of flexibility but also a lot of responsibility for xarray `Index`
If repetitive patterns emerge in custom index implementations, we could eventually implement a convenience layer via `Query` and `QueryResult` classes used as the interface to `Index.query`.
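A minimal sketch of what "let indexes take care of normalization" could look like for a toy 1-d index. `ToyLabelIndex` and the fields of `QueryResult` below are hypothetical and are not the API from the PR; only the names `query` and `QueryResult` are borrowed from the notes.

```python
from dataclasses import dataclass
from typing import Any, Dict, Hashable, List, Union

Positions = Union[int, slice, List[int]]


@dataclass
class QueryResult:
    """Hypothetical result container: positional indexers keyed by dimension."""

    dim_indexers: Dict[Hashable, Positions]


class ToyLabelIndex:
    """Toy 1-d index over unique labels (not xarray's implementation)."""

    def __init__(self, dim: Hashable, labels):
        self.dim = dim
        self.labels = list(labels)
        self._pos = {label: i for i, label in enumerate(self.labels)}

    def query(self, labels: Dict[Hashable, Any]) -> QueryResult:
        # This index handles exactly one coordinate.
        ((_, label),) = labels.items()
        # The index itself normalizes the accepted label types
        # (slice steps are ignored in this sketch).
        if isinstance(label, slice):
            start = 0 if label.start is None else self._pos[label.start]
            # Label-based slices include the stop label.
            stop = len(self.labels) if label.stop is None else self._pos[label.stop] + 1
            indexer: Positions = slice(start, stop)
        elif isinstance(label, (list, tuple)):
            indexer = [self._pos[item] for item in label]
        else:
            indexer = self._pos[label]
        return QueryResult({self.dim: indexer})
```

Here xarray would only have to apply the returned positional indexers, which is where the "lot of responsibility" for each index comes from.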
August 24, 2021
Explicit indexes, the big PR: https://github.com/pydata/xarray/pull/5692
- […] `PandasIndex` and `PandasMultiIndex` classes
This refactoring has many impacts; fixing the broken tests will probably require updating and/or refactoring things like:
- `rename` vars / dims
- `set_index` / `reset_index`
- `stack` / `unstack`
- `sel`
At this point, I'm wondering whether we should deprecate a couple of things now or later:
Currently, level coordinates are implicitly created from the multi-index. Suggested behavior: treat the index as an array-like (single coordinate) by default and encourage passing it more explicitly, e.g., something like
Maybe OK?
Existing code (not ideal):
- `IndexVariable.level_names` and `IndexVariable.get_level_variable`
Maybe #5732 too?
August 10, 2021
https://discourse.pangeo.io/t/handling-slicing-with-circular-longitude-coordinates-in-xarray/1608
Possible "explosion" of custom indexes? Very complex "meta-indexes"? Index overlapping goals & features. How best to avoid the development of a messy ecosystem built on top of xarray.Index?
An example of geo-aware indexes:
Alternatives? Some sort of protocol? Data model allowing coordinates with multiple indexes?
August 3, 2021
Xarray index vs. variables API
Option A
Every operation that returns a new `xarray.Index` (e.g., set a new index from variables, selection, join) may (or should?) also return new `IndexVariable` object(s). If no variable is returned, the input variables or the original index variables will be reused (maybe converted into `IndexVariable` objects).
- […] `copy` and `__getitem__`? Seems not conventional to return anything other than a new `Index` object. But if we copy indexes and variables independently, we could unsync them, i.e., […]
- […] `rename`? If we don't rename the underlying index in-place, e.g., `pd.Index.rename(inplace=False)`, we may need to return new variables as well.
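Option A could look roughly like this for the "set a new index from variables" case. A sketch under stated assumptions: `ToyIndex`, `CaseInsensitiveToyIndex` and `set_index_from` are hypothetical, and plain lists stand in for `IndexVariable` objects.

```python
from typing import Dict, List, Optional, Tuple, Type

Variables = Dict[str, List]  # toy stand-in for coordinate variables


class ToyIndex:
    """Hypothetical index that keeps the input variables untouched."""

    def __init__(self, labels):
        self.labels = list(labels)

    @classmethod
    def from_variables(cls, variables: Variables) -> Tuple["ToyIndex", Optional[Variables]]:
        ((_, data),) = variables.items()
        # Returning None means "reuse the input variables".
        return cls(data), None


class CaseInsensitiveToyIndex(ToyIndex):
    """Hypothetical index that normalizes labels, so it returns new variables."""

    @classmethod
    def from_variables(cls, variables):
        ((name, data),) = variables.items()
        index = cls([label.lower() for label in data])
        return index, {name: list(index.labels)}


def set_index_from(index_cls: Type[ToyIndex], variables: Variables):
    """Build an index and decide which variables end up as coordinates."""
    index, new_variables = index_cls.from_variables(variables)
    if new_variables is None:
        new_variables = dict(variables)  # reuse the inputs
    return index, new_variables
```

The caller never has to inspect the index to find its coordinates, which is the separation of concerns that Option B makes less clear.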
Option B
Alternatively, use an `xarray.Index.coords` property? But API more confusing, separation of concerns is less clear, how to check that new variables are returned, etc.
July 27, 2021
https://github.com/pydata/xarray/pull/5636
July 20, 2021
https://github.com/pydata/xarray/issues/5553
June 29, 2021
- `xarray.Index` <-> a set of coordinates
- […] `xarray.Index` instance be shared among different xarray `DataArray` / `Dataset` objects?
- […] `xarray.Index` internally needs information like the coordinate name(s), dimension(s), shape(s), etc. (e.g., `xoak` use case: back and forth transformation of n-d coordinate arrays from/to a 2-d array of shape `(npoints, ncoordinates)`)
- […] `xarray.Index` should be tightly coupled to a `DataArray` / `Dataset`?
- […] `xarray.Index` anymore into an `xarray.IndexVariable`
- `xarray.Index`'s `.from_variables` and `.to_variables` API?
- […] `set_index(..., append=True)`)? Using a `.from_variables` class method may not be the best option… Stick with `__init__`?
Option A: `from_variables` / `to_variables` API
Option B: constructor signature + `coords` attribute
June 22, 2021
- `set_index` API? See notes
- […] `assign_indexes` that looks like `assign_coords` (accepts a mapping)
- `set_xindex`? (like `.indexes` vs `.xindexes` properties)
- `.xindexes[("x", "y")] = CRSIndex(...)`
- `Dataset` / `DataArray` constructors: `coords`?
June 8, 2021
Current status: all index-based operations go through `xarray.Index`
- `pandas.Index` -> `xarray.PandasIndex` and `pandas.MultiIndex` -> `xarray.PandasMultiIndex`
- `xarray.Index`'s `equal`, `union` and `intersection` methods for alignment
- `xarray.Index.query` method for label-based selection
- `xarray.Index.to_pandas_index()` for all other operations in Xarray that still rely exclusively on `pandas.Index`
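As an illustration of how the three alignment methods could be used together, here is a toy sketch. `ToyIndex` and `joined_index` are hypothetical; the method names follow the notes, and none of this is the code from the PR.

```python
from functools import reduce
from typing import Sequence


class ToyIndex:
    """Toy 1-d index exposing the three alignment methods named in the notes."""

    def __init__(self, labels):
        self.labels = list(labels)

    def equal(self, other: "ToyIndex") -> bool:
        return self.labels == other.labels

    def union(self, other: "ToyIndex") -> "ToyIndex":
        seen = set(self.labels)
        extra = [label for label in other.labels if label not in seen]
        return ToyIndex(self.labels + extra)

    def intersection(self, other: "ToyIndex") -> "ToyIndex":
        keep = set(other.labels)
        return ToyIndex([label for label in self.labels if label in keep])


def joined_index(indexes: Sequence[ToyIndex], join: str = "inner") -> ToyIndex:
    """Pick the index that all objects would be aligned to."""
    first = indexes[0]
    if all(first.equal(other) for other in indexes[1:]):
        return first  # already aligned, nothing to reindex
    if join == "inner":
        return reduce(ToyIndex.intersection, indexes)
    if join == "outer":
        return reduce(ToyIndex.union, indexes)
    raise ValueError(f"unsupported join: {join!r}")
```

The equality check first is a design choice in this sketch: it lets the common case of already-aligned objects skip any set operation.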
Feedback on https://github.com/pydata/xarray/pull/5322 ?
Next step: update the data model (index <1:many> coordinates and/or dimensions)
- […] `PandasIndex` and `PandasMultiIndex` for now
- […] `IndexVariable` for now
- […] `pandas.Index`)
- […] `set_index`?
- […] `equal`, `union`, `intersection`, `query`?
- […] `Dataset.xindexes` (or `Dataset.indexes`) to return one index per dimension