Zarr-Python Developer Meeting Notes
formerly Zarr-Python Refactor Meeting Notes
April 4, 2025
Agenda
Minutes
March 21, 2025
Minutes
Core topic for the day: https://github.com/zarr-developers/zarr-python/pull/2874
March 7, 2025
Minutes
Davis' Dtypes (https://github.com/zarr-developers/zarr-python/pull/2874)
Versioning policy (https://zarr.readthedocs.io/en/latest/developers/contributing.html#compatibility-and-versioning-policies)
SciPy abstracts that went in
February 28, 2025
Minutes
February 21, 2025
Minutes
February 14, 2025
Agenda
February 7, 2025
Agenda
codec/numcodecs issue: https://github.com/zarr-developers/zarr-python/issues/2800
Need reviews on:
store hypothesis tests (Max ? for Deepak)
next steps for ObjectStore (https://github.com/zarr-developers/zarr-python/pull/1661)
January 31, 2025
Agenda
January 24, 2025
Agenda
January 10, 2025
Agenda
January 3, 2025
Agenda
3.0 release schedule update (Joe)
a. 3.0.0-rc.1 went out yesterday
b. we will publish and socialize the migration guide today
c. we will make the full 3.0.0 release on Thursday Jan 9 at 10a ET
-> during this time, we will focus on documentation and bug fixes (no major feature additions)
release announcement
a. Joe has written a blog post. The full zarr-dev team has comment access here: https://www.notion.so/earthmover/Zarr-Python-3-Release-Blogpost-14b492ee309f80d28af3ebfdeedf96f7
b. sanket will prepare a social media thread
shape of array after the addition of filters/compressors to top-level api
Norman will write a docs section on sharding
`create_array` API design notes
We are struggling to find a user-facing API for creating new arrays. We have decided to create a new top-level API function (`create_array`) to handle this, but questions remain about how to provide a simple / intuitive API that covers both v2 and v3 arrays, and sharded/non-sharded arrays, in one API. This short design note lays out the goals and options we are considering.
goals
non goals
current proposal
This function signature includes parameters that fall into the following categories:
Store parameters
store
storage_options
Runtime parameters
order
overwrite
config
data
V3-only parameters
dimension_names
shard_shape
chunk_key_encoding
Generic parameters
name
shape
dtype
chunk_shape
filters
compression
fill_value
attributes
**Note 1: the focus of this document is on the parameters that control how the core array metadata is configured:**
shape
dtype
chunk_shape
shard_shape
compression
filters
Note 2: it may be worth grouping the parameters in `create_array` using a similar framework to above. This will help users navigate this fairly large parameter space.
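One way to picture the grouping suggested in Note 2 is a keyword-only signature whose arguments are ordered by category. The sketch below is purely illustrative: `create_array_sketch`, its default values, and the grouped dictionary it returns are assumptions made for this note, not the zarr-python implementation.

```python
from __future__ import annotations

from typing import Any, Literal

ShapeLike = tuple[int, ...]


def create_array_sketch(
    # --- Store parameters ---
    store: Any,
    storage_options: dict[str, Any] | None = None,
    *,
    # --- Generic parameters ---
    name: str | None = None,
    shape: ShapeLike,
    dtype: str,
    chunk_shape: ShapeLike | Literal["auto"] = "auto",
    filters: tuple[Any, ...] = (),
    compression: tuple[Any, ...] = (),
    fill_value: Any = 0,
    attributes: dict[str, Any] | None = None,
    # --- V3-only parameters ---
    dimension_names: tuple[str, ...] | None = None,
    shard_shape: ShapeLike | Literal["auto"] | None = None,
    chunk_key_encoding: Any = None,
    # --- Runtime parameters ---
    order: Literal["C", "F"] = "C",
    overwrite: bool = False,
    config: dict[str, Any] | None = None,
    data: Any = None,
) -> dict[str, dict[str, Any]]:
    """Hypothetical helper: return the arguments grouped by category."""
    return {
        "store": {"store": store, "storage_options": storage_options},
        "generic": {
            "name": name,
            "shape": shape,
            "dtype": dtype,
            "chunk_shape": chunk_shape,
            "filters": filters,
            "compression": compression,
            "fill_value": fill_value,
            "attributes": attributes or {},
        },
        "v3_only": {
            "dimension_names": dimension_names,
            "shard_shape": shard_shape,
            "chunk_key_encoding": chunk_key_encoding,
        },
        "runtime": {
            "order": order,
            "overwrite": overwrite,
            "config": config,
            "data": data,
        },
    }


grouped = create_array_sketch("memory://", shape=(100, 100), dtype="uint8")
```

Keeping everything after the store arguments keyword-only means the categories can be reordered in the documentation without breaking callers.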
Usage examples
minimal example w/o sharding:
this creates an array using default / inferred parameters for zarr_format, chunk_shape, etc., etc.
create sharded array
this creates a sharded array where chunks are compressed with Zstd
questions
what is the value/justification for providing arguments for `filters` and `compressors` instead of a single `codecs` parameter? Will we enforce that all `filters` are `array->array` codecs and all `compressors` are `bytes->bytes`?
@d-v-b -> this seems like the question we need to answer first. How will we des
(d-v-b): `filters` and `compressors` map on to the two types of variadic codecs allowed in the v3 `codecs` attribute. This makes those parameters simple to parse: `filters` must resolve to `tuple[ArrayArrayCodec, ...]`, and `compressors` must resolve to `tuple[BytesBytesCodec, ...]`. I think we could have just 1 `codecs` parameter, but it would need to take a form that allowed separably specifying the `ArrayArray` and `BytesBytes` codecs. Something like this:
any missing keys would resolve to the defaults set in the config. But if `codecs` was `tuple[Codec, ...]` then users would be confused, and parsing it would be a headache.
(NR): I like `filters` and `compressors` because imo they better convey what the codecs are used for instead of "array->array" or "bytes->bytes" codecs. We should enforce that only the right type is used for both kwargs.
what is the correct type for the `filters` / `compressors` argument? Options include:
a. list of strings, e.g. `['gzip']`
b. list of dicts, e.g. `[{"name": 'gzip', "configuration": {"level": 4}}]`
c. list of objects, e.g. `[GzipCodec(level=4)]`
(b) and (c) seem like a reasonable choice.
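To make options (a) to (c) concrete, the sketch below normalizes an argument that mixes dict specs (b) and codec objects (c), and treats bare strings (a) as an error. `GzipSketch`, `REGISTRY` and `parse_codecs` are hypothetical stand-ins defined here for illustration; they are not zarr-python classes or functions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class GzipSketch:
    """Hypothetical stand-in for a bytes->bytes codec class."""

    level: int = 5


# Hypothetical name -> class registry used to resolve dict specs.
REGISTRY: dict[str, type] = {"gzip": GzipSketch}


def parse_codecs(specs: list[Any]) -> tuple[Any, ...]:
    """Normalize dict specs and codec instances to a tuple of instances."""
    parsed = []
    for spec in specs:
        if isinstance(spec, str):
            # Option (a): a bare name does not pin down a configuration.
            raise TypeError(
                f"ambiguous codec spec {spec!r}; pass a dict or an instance"
            )
        if isinstance(spec, dict):
            # Option (b): {"name": ..., "configuration": {...}}
            cls = REGISTRY[spec["name"]]
            parsed.append(cls(**spec.get("configuration", {})))
        else:
            # Option (c): already a codec instance, keep it as given.
            parsed.append(spec)
    return tuple(parsed)


from_dicts = parse_codecs([{"name": "gzip", "configuration": {"level": 4}}])
from_objects = parse_codecs([GzipSketch(level=4)])
```

Under these assumptions both calls produce the same tuple, so the dict form and the object form can be accepted side by side.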
(d-v-b): IMO the only option here is something that unambiguously represents a codec instance, which rules out (a). If we can make constructing the dict representation of the codecs ergonomic (i.e., autocomplete), then I think (b) is a pretty nice option, because users don't need to import a bunch of classes to use the create function. But we should also accept the complete codec class instances as well, so (c).
(NR): I like (c) best, but also fine with (b). Agree that (a) is too ambiguous. I also cleaned that up for the default codecs: https://github.com/zarr-developers/zarr-python/commit/5cb6dd8f62ad6ed5391a08535dc05ef9ac50bbad
How do we want to parametrize the partitioning of the array? Right now the PR in question takes two parameters, `chunk_shape: tuple[int, ...] | Literal["auto"]` and `shard_shape: tuple[int, ...] | Literal["auto"] | None`. In the interest of backwards compatibility and brevity I would support the names `chunks` and `shards`. An alternative API would be to have a single parameter, e.g., `chunking`, that takes:
`tuple[int, ...]` (no sharding, regular chunking),
`{"chunks": tuple[int, ...] | Literal["auto"], "shards": tuple[int, ...] | Literal["auto"]}`
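The two API shapes discussed here can be related by a small normalizer. The sketch below maps the single-parameter `chunking` form onto a `(chunks, shards)` pair, with `shards=None` meaning no sharding. `normalize_chunking` and `_as_shape` are hypothetical helpers written for this note; choosing concrete shapes for `"auto"` is out of scope, so that value is passed through unchanged.

```python
def _as_shape(value):
    """Validate one entry: a tuple of positive ints, or the string "auto"."""
    if value == "auto":
        return "auto"
    shape = tuple(value)
    if not all(isinstance(n, int) and n > 0 for n in shape):
        raise ValueError(f"invalid shape: {value!r}")
    return shape


def normalize_chunking(chunking):
    """Return (chunks, shards); shards is None when the array is not sharded."""
    if isinstance(chunking, dict):
        unknown = set(chunking) - {"chunks", "shards"}
        if unknown:
            raise ValueError(f"unexpected keys: {sorted(unknown)}")
        chunks = chunking.get("chunks", "auto")
        shards = chunking.get("shards")
    else:
        # A bare tuple means regular chunking without sharding.
        chunks, shards = chunking, None
    return _as_shape(chunks), (None if shards is None else _as_shape(shards))


plain = normalize_chunking((10, 10))
sharded = normalize_chunking({"chunks": (10, 10), "shards": (100, 100)})
automatic = normalize_chunking({"chunks": "auto", "shards": "auto"})
```

A normalizer like this would let either spelling be offered to users while the rest of the code only ever sees the two-value form.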
(NR): I'd prefer `chunks` and `shards`. `chunking: tuple[int, ...] | {"chunks": tuple[int, ...] | Literal["auto"], "shards": tuple[int, ...] | None | Literal["auto"]}` would also be fine. Not really a fan of `auto`, though.
(JH): would it help reduce scope to remove auto chunking / chunk/shard alignment from this first version?
(DVB): I don't think the auto chunking / sharding adds a lot of complexity here, and I think it's a big win for usability to have some defaults that "just work" (whether the defaults in my PR actually "just work" is another question). As for `auto`, we need some way of expressing "pick chunks / shards automatically". Often we use `None` to mean "default", but if we are using `shards=None` to denote "no sharding", `None` can't mean "default" anymore, and we need to pick another value. I think `auto` is short and literate but I'd be up for alternatives.
December 20, 2024
Release topics:
December 13, 2024
Notes:
`Array.__iter__` is slower compared to v2 because v2 loaded the entire array in memory upfront. Not a release blocker.
December 6, 2024
Notes:
Discussion points for today:
November 29, 2024
Notes:
`members`
`Array` and `Group` class
November 22, 2024
Notes:
`obstore`-based Store be in zarr-python or its own package? https://github.com/zarr-developers/zarr-python/pull/1661
`obstore` an optional dep
`zarr.open` etc.
`shards` kwargs +10
November 15, 2024
Notes:
November 8, 2024
Notes:
Discussion points
November 1, 2024
October 25, 2024
Notes:
- Updates from Tom - working on `info`, `size` and `tree` properties
- Joe - hopefully wrapping store mode refactor up today
October 18, 2024
Notes:
Topics:
`mode` was added to the store
`clear()` is happening on reopen
`zarr.open`, e.g. `zarr.open(LocalStore("...", mode="a") / "testdata.zarr")`
October 11, 2024
Notes:
`np<2` had no notion of varlen str
`np>2` (breaking from zarr v2); decoding to object if `np<2`
October 4, 2024
Notes:
September 20, 2024
Notes:
`chunk_shape` kwarg
September 13, 2024
Attendees
Notes
Updates
on people's minds:
September 6, 2024
Attendees
Notes
https://github.com/orgs/zarr-developers/projects/5/views/2 big lifts?
_set_item_nosync
time-permitting (Josh)
August 30, 2024
Attendees
Agenda
discussion about cache consistency and invalidation
August 23, 2024
Attendees
Agenda
https://github.com/zarr-developers/zarr-python/pull/2102
`v2` module asap
alpha release frequency
consolidated metadata
RemoteStore
open_array(s3://...)
`sync` in user code
store = await MyStore.open('s3://foo')
store = sync(MyStore.open('s3://foo'))
store = MyStore.open_sync('s3://foo', loop=...)
sync_store = SyncWrapper(MyStore, 's3://foo')
sync_store.set(filename, bytes)
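The variants listed above differ mainly in where the event loop lives. As a rough sketch of the `sync(...)` variant, the helper below runs a coroutine on a background event loop and blocks the calling thread until it finishes. `MyStore` here is a toy in-memory class and `sync` is a hypothetical helper, both written for this note; neither is the zarr-python implementation.

```python
from __future__ import annotations

import asyncio
import threading


class MyStore:
    """Toy async store backed by a dict (illustrative only)."""

    def __init__(self, url: str) -> None:
        self.url = url
        self._data: dict[str, bytes] = {}

    @classmethod
    async def open(cls, url: str) -> MyStore:
        await asyncio.sleep(0)  # stand-in for real async I/O
        return cls(url)

    async def set(self, key: str, value: bytes) -> None:
        self._data[key] = value

    async def get(self, key: str) -> bytes:
        return self._data[key]


# One long-lived event loop running in a daemon thread.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()


def sync(coro, timeout: float | None = None):
    """Run `coro` on the background loop and block until it returns."""
    future = asyncio.run_coroutine_threadsafe(coro, _loop)
    return future.result(timeout)


# Synchronous user code never touches `await` directly.
store = sync(MyStore.open("s3://foo"))
sync(store.set("chunk/0.0", b"123"))
value = sync(store.get("chunk/0.0"))
```

Because the loop runs in its own thread, this pattern also works when the caller is already inside another running event loop, which is one of the usual motivations for a dedicated `sync` helper.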
GPU array progress
August 9, 2024
Attendees
Agenda
https://github.com/zarr-developers/zarr-python/issues/2008
July 26, 2024
Attendees
Agenda
Notes:
`.chunks` attribute on Arrays
Deprecate in 2.18.3
TODOs from this meeting
July 12, 2024
Attendees
Agenda
June 6, 2024
Attendees
May 30, 2024
Attendees
Agenda
Quick topics:
`out` kwarg
Notes
May 23, 2024
Attendees
Agenda
May 17, 2024
Attendees
Agenda
Notes
May 8, 2024
Attendees
Notes:
`HybridCodecPipeline` (interleaved with configurable batch size) needs a review: #1670
`NDBuffer` implementation: zarr-python#1826
Notes:
Major updates
TODOs:
April 24, 2024
Attendees
Notes:
April 22, 2024
Attendees
Excused: Juan Nuñez-Iglesias
Goals
Agenda
`support/v2` branch may be useful (JM)
`open()` which keeps things working. JH: zarr.open has a version flag. None could mean do both.
`tifffile` of a missing default flag (size parameter?)
Notes
April 10, 2024
Active work
`members`: https://github.com/zarr-developers/zarr-python/pull/1782/files#r1558820360
Apr 5, 2024
Todo list
p0 - must happen now
p1 - must happen before alpha release (target first week of May)
p2 - must happen before 3.0 release (target early June)
p3 - nice to have, can happen after 3.0 release
Arrays
Groups
Store
Tests
Docs
Misc
`TypedDict` for all typed dictionaries
[p2] Add logging throughout
Migration
March 27, 2024
Meeting notes:
Sanket: Bi-weekly meeting ends on May 1st, 2024 - shall we continue after that?
DB: Fleshing out the group API in v3 https://github.com/zarr-developers/zarr-python/pull/1726
NR: We need to find a common understanding of what we still need to work on for beta release. NR will create a tracking issue.
Akshay: Generalized array support
__array_interface__
__cuda_array_interface__
March 13, 2024
Meeting notes:
`**v2_kwargs` and translate where possible, checking for conflicts with v3 kwargs if provided
February 28, 2024
Meeting notes:
February 14, 2024
Meeting notes:
January 31, 2024
Attendees
Meeting notes:
January 17, 2024
Attendees
Meeting notes:
`raise KeyError` or `return None`
`getsize`, `move`, `tree`, `rmdir`, `open`, `close`
`Store.list_*` could change to return async generators
`evolve` and `validate` - check if the codec matches
Discussion
What goes in the beta release
January 3, 2024
Attendees
Meeting notes:
`zarr.foo*` api
December 20, 2023
Attendees
Meeting notes:
December 6, 2023
Attendees
Meeting notes:
`fsspec` (for now). Very convinced that having an async API is the right call for Zarr-Python. Less convinced that the `fsspec` way of doing things will be the long-term solution.
`zarr.open()`). Most people use that top-level API.
Looking for a champion on:
Agenda
November 22, 2023
Attendees
Agenda
Meeting notes:
`get` more efficient, you need to wrap it in something - mostly users are getting multiple chunks at a time
November 8, 2023
Attendees
Agenda
November 1, 2023
Attendees
Agenda
Now that we are starting to work on implementing v3, we're faced with the question of what to do with the existing API
Observation: the current v2/v3 polymorphism is unsustainable (and incorrectly prioritizes v2 internally)
Proposal - we create a v3 namespace within zarr-python where we can develop in an isolated space toward a complete v3-spec implementation
zarr.v3.{Array,Group,Store}
zarr.create(shape=..., dtype=..., compressor=...) -> zarr.create(shape=..., data_type=..., codecs=..., attributes=...)
`create` or `open` a v2 dataset
`Group` or `Array` does not need to be backward compatible though.
`main` branch in zarr-python
Alternative proposal
`v3` namespace and instead take over the primary namespace in a development branch (e.g. `v3`)
`v3` branch is complete, we merge to main and make a `3.0` release
Ideas
Sanket Notes
October 18, 2023
Attendees
Agenda
September 20, 2023
Attendees
Agenda
September 6, 2023
Attendees
Agenda
Scoping V3 update (by @jhamman)
Written by @jhamman on September 5, 2023
In the Winter and Spring of 2022, while the V3 spec was still under development, an experimental V3 implementation was added to the Zarr-Python codebase (#898). This implementation followed the spec, as it was written at the time. However, in the months following these developments, major changes to the spec were made. This has left Zarr-Python out of sync with the V3 specification.
Summary of current status
`zarr_version=3` and `ZARR_V3_EXPERIMENTAL_API=1`).
`zarr._storage.v3`.
Major changes to the spec since the experimental implementation include:
`zarr.json`) is no longer required
`meta/foo/bar.group.json -> /foo/bar/zarr.json`)
`zarr.json` and a `node_type` field is included in all documents)
`format_version` → `int`
`dimension_names`
`chunk_memory_layout` (in favor of transpose codec)
`codecs` now includes a list of codecs that was previously split between the `filters` and `compressor` fields
Open questions:
Actions
https://github.com/orgs/zarr-developers/projects/5/views/1
Zarr refactor meeting
Aug 16, 2023
Attendees
Discussion
Timeline
Goal: by the end of the year, have a fully-functional implementation of V3 in Zarr Python
Oct-Dec: Integration and interop testing
TODOs: