# Defining vtable interfaces

## Reasoning

For performance. Simply that. Only that.

## Sample definition:

```julia=
@nospecialize abstract type Function
    _(args...)
end # this is what we do now (in essence)

@nospecialize abstract type IO
    read(_, Type{UInt8})::UInt8
    (read(_, Type{<:A})::A) where A<:Array
    write(_, UInt8)::Int
    write(_, @specialize Array)::Int
    write(_, @nospecialize Any)::Int
    unsafe_write(_, Ptr{UInt8}, UInt)::UInt
    unsafe_read(_, Ptr{UInt8}, UInt)::UInt
    eof(_)::Bool
    isopen(_)::Bool
    isreadable(_)::Bool
    iswritable(_)::Bool
    get(_, Symbol, Any)
    write(_, _)::Int # this means _ is the same type here in the vtable
    write(_, IO)::Int
    (convert(::Type{IO}, T)::T) where T<:_
    (convert(::Type{<:T}, T)::T) where T<:_
    Type{_}(args...)
end # this is what we would do for IO

# and then add some more
@interface IO read(_, Type{UInt8})::UInt8
# @interface IO +(Int, Int)::Int
@interface IO write(_, ::IOBuffer)::Int

# and implement some
get(::IO, ::Symbol, default::Any) = default
```

Notes:

- This lists the codegen signatures that will be used (typeintersected with the declared signatures for the implementations).
- The `_` field denotes where the argument will be passed. It does not have a type constraint on it, but `::IO` is effectively implied.
- The return type is optional, and may be inferred, but will be enforced for the calling convention.
- The entire expression is evaluated immediately, and placeholder Method objects are created.
- The return type is evaluated immediately (unlike usual) and stored in the placeholder Method object.
- If this method is later defined (e.g. as a fallback option), it will silently merge with the fallback, copying it.
- The `::` is dropped here in the call syntax preceding the types. This is primarily because that will let us use `...` to mean `tuple_type_cat`. It also reduces line noise and is easier to copy to/from places like `code_typed` and `which`. But if we used `::` it would appear more like a call signature.
  And then the parameters could optionally be named (for documentation benefit), though names will be ignored.

## Sample usage:

```julia=
eof(::Any) = error("not an IO")

struct IOBuffer <: IO
    buf::Vector{UInt8}
    tell::Int
end
eof(iob::IOBuffer) = (iob.tell == length(iob.buf))
```

Notes:

- Just like normal, there are no changes here!
- The runtime will detect that `eof(::IOBuffer)` conforms to the interface `eof(::IO)` and set a flag in the Method accordingly (for nospecialize and return type enforcement).

## Implementation details:

- All `IO` subtypes become `@nospecialize` by default when used in a signature.
- The use of the `vtable` is _optional_, but it may be used whenever codegen sees a `:call` that was not converted to an `:invoke` head, and has a `vtable` available in the function arguments that satisfies this call.
- The inner ABI for calling an arbitrary function with a declared `IO` parameter will pass along an opaque `vtable` of function pointers for each of those values as follows:
  ```julia=
  opaque struct vtable
      next::vtable
      min_world::UInt # maximum over fptrs
      max_world::UInt # minimum over fptrs
      ### function pointer fields
      # (for def, invoke, and specsig fields)
      read::CodeInstance
      write::CodeInstance
      etc.
  end
  ```
- The list of current `vtables` will be added to the (concrete) `DataType` in a linked list chain, similar to `CodeInstance`, but currently opaque to Julia (as `Ptr{Cvoid}`).
- The outer ABI call for the arbitrary function (the invoke generated in codegen) will need to look up the `vtable` for each `@unspecialized` argument that has a `vtable`.
- The inner ABI for the interface Method will take the `_` slot argument by reference, and all other arguments will be handled as normal, except that the MethodInstance will also be appended to the end (in case we need to box everything and call the interpreter or calling convention 3).
- The codegen and interpreter will add a typeassert before the return that ensures it meets the declared interface (n.b.
  no `convert` call is inserted). This will be particularly tricky since it must be enforced on all Methods which intersect the interface declaration, but only when called with the interface types.
- Inference will be allowed to improve the return type of interface methods from what was declared, and to inline them or devirtualize calls to them, as usual.

## Other comments

- Without `@nospecialize`, this would serve only to document the expected interface (e.g. for analysis tooling), and would not cause any of the other functional changes, except adding the hidden type-assert.
- It remains to be decided how to deal with methods that satisfy multiple interfaces. For example, to continue the `IO` example, suppose someone defined a hypothetical type-erased object for serialization (this seems unlikely, but the behavior still needs to be defined anyway):
  ```julia=
  @nospecialize abstract type TypeErasedSerializer{T}
      # read(IO, ::_)::T
      write(IO, _)
      # read(IOBuffer, Type{_})::T
      write(IOBuffer, _)
      # read(Any, Type{_})::T
      write(Any, _)
  end

  struct Serialized{T} <: TypeErasedSerializer{T}
      value::T
  end
  read(io::IO, ::Type{Serialized{T}}) where {T} = Serialized{T}(read(io, T))
  write(io::IO, s::Serialized{T}) where {T} = write(io, s.value)
  ```
  - The read methods cannot be defined, since they are dispatching on the Type instead of the value. Perhaps we could make TypeKinds special, so that `::_` or `Type{_}` is permitted here to get a `vtable` out of the TypeKind in some way?
  - There are roughly 2-4 cases here to consider, depending on how they are classified:
    - the new interface references the old one (i.e. `IO` or `Union{IO,Nothing}`)
    - the new interface subtypes the old one (i.e. `IOBuffer <: IO` or `Union{IOBuffer,Nothing}`)
    - the new interface supertypes the old one (i.e.
      `Any`)
  - In the cases where there is a morespecific ordering of the interfaces (or at least any explicit mention of the interface `IO`), we could implicitly use the [visitor pattern](https://en.wikipedia.org/wiki/Visitor_pattern) for double-dispatch, where the `vtable` inside the least specific match points to a second level dispatch method which is specialized on both arguments. This is because we know the compiler must have seen the first interface `IO` before the second interface `TypeErasedSerializer`.
    - Q: how do we store and find the second level table here?
  - In cases where there is not a morespecific ordering to the interfaces (such as `write(Any,_)`), we don't know which interface the compiler will see first, so we don't know which one to prefer first. But we know all instances of this will see both. So perhaps we end up with N different orders, where some code might dispatch in IO->TypeErasedSerializer order (because it had only IO defined) and other code might dispatch in TypeErasedSerializer->IO order (because it had only TypeErasedSerializer defined). And then, for the intersection of these, what do we do?
    - Q: what do we do for the discovered intersection of interfaces at runtime?
    - Q: is it acceptable (initially) to populate this `vtable` entry with a dynamic dispatch?

## Use as FunctionWrapper replacement?

```julia=
abstract type FunctionWrapper{AT<:Tuple, RT}
    _(AT...)::RT
    Base.invoke(_, Type{<:AT}, args...)::RT
end

struct FunctionWrapperImpl{T, AT, RT} <: FunctionWrapper{AT, RT}
    f::T
end
function FunctionWrapper{AT, RT}(f) where {AT, RT}
    return FunctionWrapperImpl{typeof(f), AT, RT}(f)
end
(f::FunctionWrapperImpl)(args...) = f.f(args...)
Base.convert(::Type{FW}, f::FW) where {FW<:FunctionWrapper} = f
Base.convert(::Type{FW}, f) where {FW<:FunctionWrapper} = FW(f)
# write to the given `io` rather than to stdout
Base.show(io::IO, fw::FunctionWrapperImpl) = print(io, "FunctionWrapper(", fw.f, ")")
```

```julia=
const MathBinaryOp{T} = FunctionWrapper{Tuple{T, T}, T}
for f = MathBinaryOp{Int}[+, -, *, /, ÷]
    @show f, f(1, 2)
end
```

## Use as interface specification

This is a possible weird future side-effect of the above constraints, where we can cause it to define a function contract. These might not quite work in the current design above, because the return type `T where T` may become `Any` right away, causing the constraint from the signature to be discarded. And there might be other issues with the syntax too.

For example, this interface might enforce that `convert` to `T` must return `T`:

```julia=
abstract type AbstractConvert <: Function
    (_(Type{T}, Any)::T) where T
end
struct Convert <: AbstractConvert end
const convert = Convert()
```

And this one might express that number constructors must return something of the given type, assuming `T<:_` was something also defined and allowed:

```julia=
abstract type Number
    (Type{T}(Tuple...)::T) where T<:_
end
primitive type Int8 <: Number 8 end
```

Or perhaps instead that could be expressed for all types, with a `vtable` on all TypeKinds:

```julia=
@nospecialize abstract type Type{T}
    _(Tuple...)::T
end
```

OR

```julia=
@nospecialize abstract type Type
    _(Tuple...)::_
end
```

Which perhaps declares that all constructors must return something of the specified type (with a single vtable entry for `Any...`), while not specializing code on the return type (which would still force dispatch and boxing there)?
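
The return-type contract from the two sections above can be approximated in current Julia with an explicit typeassert. The following is a minimal sketch, not the proposed implementation: `TypedCallable` and `IntBinaryOp` are hypothetical names, the call goes through ordinary dynamic dispatch rather than a `vtable`, and the only thing it mirrors is the rule that the declared return type is asserted with no `convert` inserted.

```julia
# Hypothetical stand-in for the proposed FunctionWrapper{AT, RT}; this name
# does not exist in Base or in FunctionWrappers.jl.
struct TypedCallable{AT<:Tuple, RT, F}
    f::F
end

# Convenience constructor so callers only spell out the contract (AT, RT).
TypedCallable{AT, RT}(f::F) where {AT<:Tuple, RT, F} =
    TypedCallable{AT, RT, F}(f)

function (w::TypedCallable{AT, RT})(args...) where {AT<:Tuple, RT}
    # Reject argument tuples that fall outside the declared signature.
    args isa AT || throw(ArgumentError("expected arguments of type $AT"))
    # Typeassert only: a result of the wrong type is an error, never converted.
    return w.f(args...)::RT
end

const IntBinaryOp = TypedCallable{Tuple{Int, Int}, Int}

add = IntBinaryOp(+)
quo = IntBinaryOp(/)   # Int / Int yields a Float64, which violates RT = Int

add(1, 2)              # satisfies the contract
# quo(1, 2)            # would throw a TypeError from the typeassert
```

Under this reading, putting `/` into a `MathBinaryOp{Int}` list would construct fine and only fail at call time, which may or may not be the behavior the real design wants.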
# Alternative

Would it be sufficient just to hoist the invoke lookup when we can see that the information is constant:

```julia=
function readarray(io::IO, ::Type{T}) where {T}
    @nospecialize
    m = lookup_applicable(read, io, T)
    output = T[]
    while !eof(io)
        push!(output, invoke(m, io, T)::T)
    end
    return output
end
```

# Chris R. interface check

Does this work? It seems rather awkward.

```julia=
abstract type DiffEqNumber <: Number
    +(_, _)::DiffEqNumber
    +(Number, _)::DiffEqNumber
    (+(T, T)::T) where T<:_
end
@interface DiffEqNumber +(_, Number)::DiffEqNumber

#macro aqua_interface(block)
#    block = esc(Expr(:quote, block))
#    isdefined(__module__, :__aqua_interface__) ||
#        setglobal!(__module__, :__aqua_interface__, [])
#    return :(append!($(esc(:__aqua_interface__)), $block))
#end
#@aqua_interface begin
#end

# struct DiffEqNumberImpl{T} <: DiffEqNumber
#     value::T
# end
# +(a::DiffEqNumberImpl, b::DiffEqNumberImpl) =
#     DiffEqNumberImpl(a.value + b.value)

# in Aqua.jl
apply_interface(DiffEqNumber, Int) ->
    Tuple{typeof(+), Int, Int} => DiffEqNumber # !Int
    Tuple{typeof(+), Number, Int} => DiffEqNumber # !Number
    Tuple{typeof(+), Int, Number} => DiffEqNumber # !Number
    Tuple{typeof(+), T, T} where T<:Int => Int # =Int
```

```julia=
+(args...) = Base.:(+)(args...)
@interface typeof(+) (_(::T...)::T) where {T}
@interface typeof(+) _(::Number...)::Number

<(a, b) = Base.:(<)(a, b) #::Bool
@interface typeof(<) _(a, b)::Bool
@interface typeof(<) _(a::Number, b::Number)::Bool

for S in subtypes(Number), T in subtypes(Number)
    @test_interface <(S, T)
end
@test_interface <(::Int, ::Int)
@test_interface <(::Number, ::Number)

# @interface +(Int, Int)::Int
```
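
One way to picture what `@test_interface` might check, using only reflection that exists today, is to ask whether a method applies to the signature and whether inference can prove the declared return type. This is a rough sketch under stated assumptions: `satisfies_interface` and `int_only_add` are hypothetical names (not part of Base or Aqua.jl), and because the check relies on `Base.return_types`, a `false` result can simply mean that inference was imprecise rather than that the contract is violated.

```julia
# Hypothetical helper: true when `f` has a method for `argtypes` and every
# inferred return type is a subtype of the declared `rettype`.
function satisfies_interface(f, argtypes::Type{<:Tuple}, rettype::Type)
    hasmethod(f, argtypes) || return false
    inferred = Base.return_types(f, argtypes)
    return !isempty(inferred) && all(rt -> rt <: rettype, inferred)
end

# Illustrative function defined only for this example.
int_only_add(a::Int, b::Int) = a + b

satisfies_interface(<, Tuple{Int, Int}, Bool)                  # declared ::Bool
satisfies_interface(int_only_add, Tuple{Int, Int}, Int)        # declared ::Int
satisfies_interface(int_only_add, Tuple{Int, Int}, Bool)       # wrong return type
satisfies_interface(int_only_add, Tuple{String, String}, Int)  # no applicable method
```

A check like this works best on concrete argument types; for abstract signatures such as `<(::Number, ::Number)` inference will often widen the result, so a real tool would probably need the declared interface itself rather than inference alone.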