# `Structs`: Proposal for a new Base module

### **TL;DR:**

* **What:** A new Base module called Structs providing convenience constructor macros, standardized field defaults and tags, and utilities for programmatically constructing and accessing objects; this will dramatically simplify various serialization/deserialization efforts across the ecosystem
* **Why in Base:** Consolidation/standardization of a few ecosystem efforts, parity with comparable offerings in other languages, functionality that is "close" to structs/struct definitions
* **Current Code**: The current Structs code exists in an [unregistered repo](https://github.com/quinnj/Structs.jl)

### Problems and Proposals

* **Constructor reflection**: Default constructors require all fields to be passed in definition order. The presence of inner or outer constructors can change that, with no simple way to know the default constructor isn't available. The `@kwdef` macro provides a standardized way to construct a struct from fields passed as keyword arguments, but there is no simple programmatic way to know that `@kwdef` is supported. Another pattern, more common in other languages like Java, is defining a "no-arg" or empty constructor for a mutable struct and then setting fields separately. This can easily be achieved in Julia by defining an empty constructor on a mutable struct, but again, there's no signal that the struct supports this "no-arg" construction.
  * **Proposal**: Structs becomes the "home" for the `@kwdef` macro and an additional `@noarg` macro, which would generate an empty inner constructor, allow field defaults (like `@kwdef`), and set default fields appropriately in the empty constructor. In addition to these macros, `Structs.kwdef(::Type{<:T}) = true` or `Structs.noarg(::Type{<:T}) = true` would also be defined when the macros are used, thus providing programmatic inspection of whether a certain kind of construction method is supported. For the non-empty-non-kwdef immutable struct, it can then be assumed that the default fallback construction method is supported.
* **Programmatic construction**: With standard and kwdef/noarg construction types, we can then provide a way to programmatically construct types. This currently doesn't exist in Julia, apart from attempting a naive `T(fields...)`, which can quickly run into a number of issues: field default values are not accounted for, non-default constructors can thwart the call, and it is awkward when not all fields can be provided.
  * **Proposal**: Structs provides a `Structs.make(T, source)` function for programmatically constructing an instance of type `T`, given an applicable `source`. `make` accounts for whether `T` supports keyword arg or "noarg" construction, and otherwise falls back to the default struct constructor. Field defaults are accounted for, and lifting/lowering for various inter-domain conversions is easily supported.
* **Field Defaults**: As mentioned, currently only the `@kwdef` macro allows specifying default values for fields.
  * **Proposal**: Structs allows declaring default field values for `@kwdef` and `@noarg`, and provides a `@defaults` macro to allow declaring default values on regular structs. These would all be supported in the `Structs.make` construction method. For non-kwdef-noarg structs, default values would be required for all "trailing" fields, i.e. the first N fields of a struct wouldn't need default values, but any field following a field with a default value would also require a default value. This matches Julia's current behavior for default positional function arguments.
* **Field Tags**: The ability to attach arbitrary metadata or "field tags" to individual fields of a struct is crucial for serialization/deserialization, among other reflection-heavy use cases. Currently, there's no formal mechanism to support this, though this is very common in other languages.
  * **Proposal**: Allow specifying field tags in the `@kwdef`, `@noarg`, and `@defaults` macros, with the following syntax
    * `field::FieldType &(tag1="hey", tag2=3)`
    * That is, an `&` followed by a `NamedTuple`
    * An additional macro `@tags` would allow for field tags to be parsed/added to fields without requiring `@kwdef`, `@noarg`, or `@defaults`. Like the other macros, `@tags` would be applied to the struct definition.
    * The `&` provides a parsing token preceding the `NamedTuple` to signal the presence of a field tag (similar to the single backtick `` ` `` in Go), while introducing minimal ambiguity with other meanings (the one limitation being that a field default value _expression_ cannot include `&`)
* **Domain Conversion Support**: Values in different domains are physically represented differently, like `"1"` and `1`. The challenge of serialization/deserialization is the conversion of equivalent values between different domain representations. Providing common interfaces to conveniently support this is challenging, yet incredibly powerful.
  * **Proposal**: Structs provides the `Structs.lower` and `Structs.lift` functions for "lowering" Julia values into a desired domain (via custom Style overloads), and "lifting" a domain value back to a Julia value, respectively. These functions gain access to field tags when applicable, with a standard set of field tags automatically supported by `Structs.make` (like field ignoring, renaming, type selection, etc.).
* **Non-Concrete Type Construction Support**: Another common deserialization challenge is the need, at runtime, to select a concrete type to deserialize from a superset of types (either an abstract type or a Union).
  * **Proposal**: Structs provides the `Structs.choosetype(T, x)` interface, where `T` is an abstract type or Union, and `x` is the runtime source object. `choosetype` can then use whatever algorithm and runtime information from `x` is necessary to choose the concrete type that is actually constructed when `Structs.make` is called. In the case of an abstract or union field type, any field tags are also available as an optional 3rd argument.
* **Type Stability and Field Processing**: Julia's type system and performance optimizations rely heavily on type stability. There's a need for a programmatic means to process struct fields that is both type-stable and user-friendly, facilitating more efficient and effective struct manipulation.
  * **Proposal**: Structs defines the `Structs.applyeach(f, x)` function for applying a function `f` of the form `f(key, val)` to each field-value pair in a struct, index-value pair in a collection, or key-value pair in a dictionary. The functional style lets the implementation for the target `x` branch as needed so that `f` can be applied in a type-stable way (as opposed to the inverse, where type-unstable field values are handed to the body of a for-loop).
* **Flexible Struct Settings Without Type Piracy**: Finally, the ability to apply different settings or overloads to structs in various contexts, without resorting to type piracy, is an ergonomic challenge. There are legitimate use-cases where constructors, field settings, and domain conversions should behave differently in different contexts. There are also cases where the right interface overload may not exist for a type not owned by the calling context.
  * **Proposal**: Structs provides the `StructStyle` trait that can be subtyped by custom styles and used to overload any Structs interface for any type. This gives a type the flexibility to behave differently in the context of different "styles", while also allowing overloads for non-owned types by providing an owned style subtype.
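
As a rough illustration of how the construction-kind signals, `make`, and `applyeach` described above could fit together, here is a minimal, self-contained sketch in plain Julia. It does not use the Structs package: every function body below, and the example types `Point`, `Options`, and `Span`, are assumptions made for illustration only. Field defaults for plain structs, field tags, styles, and `lift`/`lower` are deliberately left out.

```julia
# Hypothetical sketch: these definitions are NOT the Structs implementation.

# Construction-kind signals: false by default, opted into per type.
noarg(::Type) = false
kwdef(::Type) = false

# applyeach: call f(key, value) for each key-value pair of a dictionary...
function applyeach(f, d::AbstractDict)
    for (k, v) in d
        f(k, v)
    end
    return nothing
end

# ...or for each (field name, value) pair of a struct. `ntuple` with a `Val`
# length is used here so that the per-field calls can be unrolled.
function applyeach(f, x::T) where {T}
    ntuple(Val(fieldcount(T))) do i
        isdefined(x, i) && f(fieldname(T, i), getfield(x, i))
        nothing
    end
    return nothing
end

# make: pick a construction strategy based on the signals above.
function make(::Type{T}, source::AbstractDict) where {T}
    vals = Dict(Symbol(k) => v for (k, v) in source)
    if noarg(T)
        x = T()                                   # empty constructor, then set fields
        applyeach((k, v) -> setproperty!(x, k, v), vals)
        return x
    elseif kwdef(T)
        return T(; vals...)                       # keyword constructor applies defaults
    else
        return T((vals[name] for name in fieldnames(T))...)  # positional fallback
    end
end

# A hand-written "noarg" type (stands in for what `@noarg` would generate).
mutable struct Point
    x::Int
    y::Int
    Point() = new()
end
noarg(::Type{Point}) = true

# A keyword-constructible type, with the signal declared by hand.
Base.@kwdef struct Options
    verbose::Bool = false
    retries::Int = 3
end
kwdef(::Type{Options}) = true

# A plain struct that only has the default positional constructor.
struct Span
    lo::Int
    hi::Int
end

p = make(Point, Dict("x" => 1, "y" => 2))
o = make(Options, Dict("retries" => 5))
s = make(Span, Dict("lo" => 1, "hi" => 9))

# Collect the field-value pairs of `s` in field order.
seen = Pair{Symbol,Int}[]
applyeach((k, v) -> push!(seen, k => v), s)
```

In this sketch `Options` picks up its `verbose` default because construction goes through the keyword constructor, while `Span` requires every field to be present in the source. The point of the design is that callers of `make` never need to know which of the three construction paths a type supports.
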
#### Summary of Proposals via Example Code

```julia
using Structs, Dates, Test

@noarg mutable struct A
    a::Int = 0
    b::Float64 = 0.0
    c::String
end

@test Structs.noarg(A)
a = A()
@test a.a == 0
@test a.b == 0.0
@test !isdefined(a, :c)

# construct instance of A from dictionary of appropriate key-value pairs
a2 = Structs.make(A, Dict("a" => 1, "b" => 2.0, "c" => "3"))
@test a2.a == 1
@test a2.b == 2.0
@test a2.c == "3"

# define standardized fieldtags that automatically affect serialize/deserialize behavior
@tags struct B
    id::Int
    first_name::String &(name=:firstName,)
    last_name::String &(name=:lastName,)
    birthday::Date &(dateformat=dateformat"yyyy.mm.dd",)
    ssn::String &(ignore=true,)
end

b = Structs.make(B, (id=1, firstName="Jacob", lastName="Quinn", birthday="1986.12.27", ssn="123-45-6789"))
# Date was automatically parsed according to field dateformat
@test b.birthday == Date(1986, 12, 27)

# "output" `b` instance to a dictionary
d = Structs.make(Dict, b)
# `first_name` field was output as `firstName` according to `name` fieldtag
@test haskey(d, :firstName)
# date field was formatted according to `dateformat` fieldtag
@test d[:birthday] == "1986.12.27"
# `ssn` field wasn't output since it was marked with fieldtag `ignore`
@test !haskey(d, :ssn)

# define a custom style and override existing fieldtags
struct RawStyle <: Structs.StructStyle end
Structs.fieldtags(::RawStyle, ::Type{B}) = (;)

# make another dict from `b` but using our RawStyle instead of the default style
d2 = Structs.make(RawStyle(), Dict, b)
# note no automatic transforms were applied on output
@test haskey(d2, :first_name)
@test d2[:birthday] == Date(1986, 12, 27)
@test haskey(d2, :ssn)

# we know `x` will be an `A` or `B`; define `choosetype` to pick the appropriate one
Structs.choosetype(::Type{Union{A, B}}, x) = haskey(x, "id") ? B : A

a3 = Structs.make(Union{A, B}, Dict("a" => 1, "b" => 2.0, "c" => "3"))
@test a3.a == 1
b2 = Structs.make(Union{A, B}, Dict("id" => 1, "firstName" => "Jacob", "lastName" => "Quinn", "birthday" => "1986.12.27", "ssn" => "123-45-6789"))
@test b2.id == 1
```

### Why in Base?

The proposal for Structs is based on the need for:

1. **Ecosystem Consolidation and Standardization**: Structs aims to standardize struct management functionality across the ecosystem, enhancing compatibility and adoption of best practices.
2. **Parity with Other Languages**: Brings Julia to parity with languages that offer standardized solutions for struct metadata and construction.
3. **Filling Existing Gaps**: Structs addresses gaps in programmable construction, field defaults, and metadata handling, providing a comprehensive framework for struct management.
4. **Supporting Advanced Use Cases**: Essential for advanced scenarios like serialization/deserialization, dynamic object construction, and reflection-heavy use-cases.
5. **Base-like Functionality**: The code around constructors, field defaults and tags, and struct accessors is all very "core" functionality, so keeping the code as "close to Base" as possible can help as the language evolves.
6. **Low code maintenance**: It's not expected that the Structs code itself will need to change or evolve much going forward. It defines the interfaces and API boundaries, and most usage and overloads will exist in outside contexts.
7. **Easy Compatibility**: By being a Base module, it's easy to provide ecosystem compatibility by having a registered `Structs.jl` package that just exports the Base module if defined (similar to the new-ish ScopedValues.jl Base module)

### Comparison with other language solutions

* Java
  * Supports class/method/field/argument annotations via `@[annotation]` syntax for attaching arbitrary metadata to objects
  * Language has standardized on common patterns like no-arg constructors w/ setters, or builder classes for "final"/sealed classes
* Go
  * Supports [struct tags](https://www.digitalocean.com/community/tutorials/how-to-use-struct-tags-in-go), which allow attaching metadata to struct fields and are used widely in reflection and serialization/deserialization
  * Primary inspiration for the "field tags" proposed in Structs
* Rust
  * The `serde` serialization/deserialization library supports [field attributes](https://serde.rs/field-attrs.html), a fixed set of attributes (handled by serde's derive macros) that provide functionality similar to that proposed in Structs: field defaults, field renaming, skipping, etc.
* Swift
  * Supports [attributes](https://docs.swift.org/swift-book/documentation/the-swift-programming-language/attributes/), which are similar to, though more limited than, Java's annotations for attaching arbitrary metadata to struct definitions and methods/fields
  * Has a formal concept of [initializers](https://docs.swift.org/swift-book/documentation/the-swift-programming-language/initialization/) for struct/class construction where field defaults are applied and constraints can be enforced (like Julia's inner constructors). Initializers are automatically called when constructing structs/classes.

### Existing Julia Packages with Related Functionality

* [FieldMetadata.jl](https://github.com/rafaqz/FieldMetadata.jl)
  * Not actively developed anymore; incorporated into the ModelParameters.jl package
* [ConstructionBase.jl](https://github.com/JuliaObjects/ConstructionBase.jl)
  * Very simple way to get a type's constructor; used in conjunction w/ Setfield.jl, Accessors.jl, and BangBang.jl, which focus on programmatic construction of immutable structs
* [StructTypes.jl](https://github.com/JuliaData/StructTypes.jl)
  * Closest cousin of proposed functionality (by same author 🙋)
  * `StructType` too restrictive, as opposed to the categorization traits proposed in Structs
  * Field properties were hard-coded and not fully supported
  * `StructTypes.construct` initially started as an internal function that reluctantly became public; it wasn't thoroughly evolved to do what it really should do
  * Support for abstract type/custom type "lowering" was also too rigid and hard-coded
  * Overall, StructTypes functionality was too tied to JSON3.jl and didn't apply to working with structs generically
* [ArrowTypes.jl](https://github.com/apache/arrow-julia/tree/main/src/ArrowTypes)
  * Focused on the arrow data specification of types and type categories
  * Provides similar means, however, for categorizing structs and interfaces for constructing and accessing

### Extended Examples

* [JSONBase.jl](https://github.com/quinnj/JSONBase.jl/blob/main/src/materialize.jl)
* [Postgres.jl](https://github.com/quinnj/Postgres.jl/blob/main/src/execute.jl#L115)
* Models.jl: package for recursive struct diffing
* Selectors.jl