```swift
trait Eq {
  fun eq(_ other: Self) -> Bool
  fun ne(_ other: Self) -> Bool
}

extension Eq {
  public fun eq(_ other: Self) -> Bool { !ne(other) }
  public fun ne(_ other: Self) -> Bool { !eq(other) }
}

type A : Eq, Deinitializable {
  let x: Int
  public memberwise init
  public fun ne(_ other: A) -> Bool { self.x != other.x }
}

type B : Eq, Deinitializable {
  let x: Int
  public memberwise init
  public fun eq(_ other: B) -> Bool { self.x == other.x }
}

public fun main() {
  precondition(A(x: 1).eq(A(x: 1)))
  precondition(A(x: 1).ne(A(x: 2)))
  precondition(B(x: 1).eq(B(x: 1)))
  precondition(B(x: 1).ne(B(x: 2)))
}
// compiles, runs: https://godbolt.org/z/1GqGnf3Md
```

Problem: this is a footgun. If you forget to implement at least one of the two functions, the defaults call each other and you get infinite recursion.

## Another shot...

```swift
trait Eq {
  public fun eq(_ b: Self) -> Bool
}

trait Ne {
  public fun ne(_ b: Self) -> Bool
}

conformance Eq : Ne {
  public fun ne(_ b: Self) -> Bool { !eq(b) }
}

conformance Ne : Eq {
  public fun eq(_ b: Self) -> Bool { !ne(b) }
}

type A : Ne {
  let x: Int
  public memberwise init
  public fun ne(_ b: A) -> Bool { self.x != b.x }
}

type B : Eq {
  let x: Int
  public memberwise init
  public fun eq(_ b: B) -> Bool { self.x == b.x }
}

public fun main() {
  precondition(A(x: 1).eq(A(x: 1)))
  precondition(A(x: 1).ne(A(x: 2)))
}
```

This doesn't work, probably for the wrong reason: `A` cannot be passed to `eq`. See logs at: https://godbolt.org/z/a3aTe753G

Even once that works, we would probably hit the same recursion issue as before.
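The same cycle can be reproduced with Swift's protocol extensions. Below is a minimal sketch (the `MyEq` protocol is a made-up stand-in, not the standard library's `Equatable`): a type that implements either requirement works, while a type that implements neither still compiles and only fails at runtime with a stack overflow.

```swift
// A hypothetical Eq-like protocol with mutually recursive defaults.
protocol MyEq {
    func eq(_ other: Self) -> Bool
    func ne(_ other: Self) -> Bool
}

extension MyEq {
    // Each default is defined in terms of the other requirement.
    func eq(_ other: Self) -> Bool { !ne(other) }
    func ne(_ other: Self) -> Bool { !eq(other) }
}

// Implements only `ne`; `eq` comes from the default. Works.
struct A: MyEq {
    let x: Int
    func ne(_ other: A) -> Bool { x != other.x }
}

// Implements neither requirement. Still compiles, because both
// defaults satisfy the protocol -- but calling eq/ne never terminates.
struct Broken: MyEq {
    let x: Int
}

print(A(x: 1).eq(A(x: 1)))  // prints "true"
print(A(x: 1).ne(A(x: 2)))  // prints "true"
// Broken(x: 1).eq(Broken(x: 1))  // would crash with a stack overflow
```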
## Haskell solution: Implement at least 1 of these

## Use Cases

### Conversion between coordinate systems

```swift
protocol Coordinate {
    func cartesian() -> (x: Double, y: Double)
    func polar() -> (r: Double, theta: Double)
}

extension Coordinate {
    func cartesian() -> (x: Double, y: Double) {
        let (r, theta) = polar()
        return (r * cos(theta), r * sin(theta))
    }

    func polar() -> (r: Double, theta: Double) {
        let (x, y) = cartesian()
        return (sqrt(x*x + y*y), atan2(y, x))
    }
}
```

#### The Trade-off

- **Scenario A (UI/Grid Systems):** A bitmap image or a chess board is inherently Cartesian. If you implement `polar` and derive `cartesian`, every pixel rendering requires expensive trigonometric operations (`sin`/`cos`), killing your framerate.
- **Scenario B (Robotics/Radar):** A LIDAR sensor naturally returns "distance and angle" (polar). If you implement `cartesian` and derive `polar`, you are converting raw sensor data to $x,y$ just to immediately convert it back to check whether an object is within a certain angle.

### Get many vs get one

```swift
protocol DataStore {
    func get(key: String) -> Data?
    func getMany(keys: [String]) -> [String: Data?]
}

extension DataStore {
    func get(key: String) -> Data? {
        return getMany(keys: [key])[key] ?? nil
    }

    func getMany(keys: [String]) -> [String: Data?] {
        var result = [String: Data?]()
        for key in keys {
            result[key] = get(key: key)
        }
        return result
    }
}
```

#### The Trade-off

- **Scenario A (Network API):** If you implement `get` and rely on the default `getMany`, fetching 100 items results in **100 separate HTTP requests** (the "N+1 query problem"). This is a disaster for latency. You _must_ implement `getMany` natively.
- **Scenario B (In-Memory HashMap):** If you implement `getMany` and rely on the default `get`, fetching a single item forces the allocation of an Array wrapper and a Dictionary result, creating unnecessary garbage collection pressure for a simple $O(1)$ lookup.

### Graph Theory: Adjacency Matrix vs. Adjacency List

When representing networks (social graphs, maps), the data structure determines which query is cheap.

- **The Protocol:**
  - `hasEdge(from: Int, to: Int) -> Bool`
  - `neighbors(of: Int) -> [Int]`
- **The Circular Defaults:**
  - `hasEdge`: calls `neighbors(of: from)` and checks whether the array contains `to`.
  - `neighbors`: iterates _all possible nodes_ and calls `hasEdge` for each.

```swift
protocol Graph {
    var nodeCount: Int { get }
    func hasEdge(from: Int, to: Int) -> Bool
    func neighbors(of node: Int) -> [Int]
}

extension Graph {
    // O(degree of node) - Linear search
    func hasEdge(from: Int, to: Int) -> Bool {
        return neighbors(of: from).contains(to)
    }

    // O(total nodes) - Very slow for sparse graphs
    func neighbors(of node: Int) -> [Int] {
        var results = [Int]()
        for i in 0..<nodeCount {
            if hasEdge(from: node, to: i) {
                results.append(i)
            }
        }
        return results
    }
}
```

#### The Trade-off

- **Scenario A (Dense Matrix):** If you represent the graph as a 2D boolean array (matrix), `hasEdge` is an instant $O(1)$ array lookup. Implementing `neighbors` via the default is acceptable (it scans the row).
- **Scenario B (Sparse List):** Most real-world graphs (friend lists) are sparse. You store a list of connections, so `neighbors` is instant $O(1)$ (return the list).
- **The Trap:** If you implement `hasEdge` natively and rely on the default `neighbors`, getting a user's friends requires checking _every single user in the database_ to see whether they are connected. This turns a millisecond operation into an hour-long one.

Resources to read:

- GHC does not enforce minimal complete definitions: https://www.reddit.com/r/haskell/comments/1zshcu/til_the_hard_way_that_ghc_does_not_enforce_the/
- Haskell minimal complete definition (the `MINIMAL` pragma): https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/pragmas.html#minimal-pragma
- In Swift, they had to add a warning where they used this pattern: https://developer.apple.com/documentation/swift/strideable

A trait is supposed to represent an efficient set of basis operations according to EOP (Elements of Programming).
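To make the trap above concrete, here is a sketch of a sparse, adjacency-list-backed conforming type (the `SparseGraph` name is invented for illustration; the protocol and its defaults are repeated so the snippet is self-contained): `neighbors` is a cheap native lookup, and only `hasEdge` is left to the default.

```swift
// Graph protocol with circular defaults, as in the section above.
protocol Graph {
    var nodeCount: Int { get }
    func hasEdge(from: Int, to: Int) -> Bool
    func neighbors(of node: Int) -> [Int]
}

extension Graph {
    func hasEdge(from: Int, to: Int) -> Bool {
        return neighbors(of: from).contains(to)
    }
    func neighbors(of node: Int) -> [Int] {
        var results = [Int]()
        for i in 0..<nodeCount where hasEdge(from: node, to: i) {
            results.append(i)
        }
        return results
    }
}

// Hypothetical sparse graph: stores adjacency lists directly.
struct SparseGraph: Graph {
    let adjacency: [[Int]]
    var nodeCount: Int { adjacency.count }

    // Native O(1): just return the stored list.
    func neighbors(of node: Int) -> [Int] { adjacency[node] }
    // `hasEdge` falls through to the default: an O(degree) scan of the
    // list, which is acceptable. Wiring it the other way (native
    // `hasEdge`, derived `neighbors`) would probe every node per query.
}

let g = SparseGraph(adjacency: [[1, 2], [0], [0]])
print(g.neighbors(of: 0))         // prints "[1, 2]"
print(g.hasEdge(from: 0, to: 2))  // prints "true"
```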
However, there are cases where, depending on how the data is stored, or in general on what the concrete struct is doing, different subsets are more efficient than others, despite each being implementable in terms of the other from an information-theory perspective. For these cases, we have 3 options:

1. require all those different operations in our trait
2. require all the operations but provide default implementations, so that the type implementer can choose to implement the subset of basis operations that they consider most efficient, and automatically satisfy every other requirement that can be derived from them
3. settle with a not maximally efficient basis

For `Equatable`, I'm quite certain that the best trade-off is to settle with a not maximally efficient basis, since the lost efficiency is only a double negation, which is probably insignificant compared to the cost of a deep equality check, and even more insignificant compared to the cost of an extra entry in the witness table of such a commonly used trait (I'm not sure whether it stays out of the witness table if we add `!=` as an extension, though).

However, there are a number of use cases that would benefit from multiple possible basis sets. Consider a `DataStore` trait that requires two accessors:

- `get(key: String) -> Data?`
- `getMany(keys: [String]) -> [String: Data?]`

Each can be implemented in terms of the other, but which derivation direction is efficient differs between concrete types:

- **Scenario A (Network API):** If we implement `get` and rely on the default `getMany`, fetching 100 items results in 100 separate HTTP requests. This is bad for latency. We must implement `getMany` ourselves.
- **Scenario B (In-Memory HashMap):** If we implement `getMany` and rely on the default `get`, fetching a single item forces the allocation of an Array wrapper and a Dictionary result, which are unnecessary heap allocations.
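Scenario B can be sketched with a dictionary-backed store (the `MemoryStore` name is invented for illustration; the protocol is repeated so the snippet is self-contained): `getMany` is native, and the derived `get` is correct but routes every single-key lookup through a temporary array and dictionary.

```swift
import Foundation

// DataStore protocol with circular defaults, as in the use case above.
protocol DataStore {
    func get(key: String) -> Data?
    func getMany(keys: [String]) -> [String: Data?]
}

extension DataStore {
    func get(key: String) -> Data? {
        // Wraps the key in an array, then unwraps the resulting dictionary.
        return getMany(keys: [key])[key] ?? nil
    }
    func getMany(keys: [String]) -> [String: Data?] {
        var result = [String: Data?]()
        for key in keys { result[key] = get(key: key) }
        return result
    }
}

// Hypothetical in-memory store: `getMany` is native, `get` is derived.
struct MemoryStore: DataStore {
    let storage: [String: Data]

    func getMany(keys: [String]) -> [String: Data?] {
        var result = [String: Data?]()
        for key in keys { result[key] = storage[key] }
        return result
    }
    // The derived `get` gives the right answers, but every single-key
    // lookup pays for an array, a dictionary, and double-optional
    // unwrapping -- the unnecessary heap traffic described above.
}

let store = MemoryStore(storage: ["a": Data([1]), "b": Data([2])])
print(store.get(key: "a") != nil)  // prints "true"
print(store.get(key: "z") == nil)  // prints "true"
```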
We can think of more examples along these lines; the essence of the idea is that there need to be different basis sets, but it depends on the concrete type which one is the efficient subset.

Haskell has a language feature to help avoid this infinite-recursion issue. It allows a type class (trait) to declare its minimal complete definitions: a set of alternative requirement sets, at least one of which must be implemented for each concrete type, without which the type declaration is considered incomplete. I would like to research this a bit further, to see what implications this approach has and what other languages have proposed to solve this problem. I think Haskell's design could work for Hylo. For both declaration-site and post-hoc conformances, we should be required to implement the minimal complete definition at the conformance declaration site and not assume it may come from anywhere else.

> Dave: IMO this is probably not worth spending any language complexity on. Default implementations of traits are a (very powerful) convenience, and these kinds of cycles are very rare. Instead of creating a cycle you can expose what would be a default implementation as a different method, and have models forward to it explicitly.

It is mostly a convenience, but I think this type of mistake can occur in practice, causing infinite recursion, and it would be great if we could at least mitigate that. One way could be to always require implementing all requirements in a conformance/struct declaration, and for the ones that are implemented by a default implementation, say so explicitly:

```swift
struct K: Equatable {
  let x: Int
  fun operator== (other: Self) -> Bool { self.x == other.x }
  operator!= (other: Self) -> Bool = default
}
```

While we are not sure what `default` refers to, it's a bit harder to accidentally implement everything with `= default` than to forget to fill out the implementation of a protocol requirement.
We could also use the Go to Definition action on `default`, asking the compiler to list the implementations that are available (this list may not be exhaustive, though).

(My inspiration was that the C++ Core Guidelines recommend annotating the special member functions as `=delete` or `=default` explicitly whenever any special member function is present, due to the tricky rules for when they get deleted or synthesized implicitly.)

What is your opinion about such a rule (apart from it being additional complexity, which I agree it is)?

> It's less convenient and I'm not sure it's worth the complexity. I could be wrong, but IME this is not a very serious problem in practice. In any case, we can add a solution later if we discover otherwise.

On the note of a special language feature for declaring minimal sets: I agree that this is additional complexity. Note, though, that it only surfaces in the language for those special traits that have overlapping basis sets of operations. Annotating a trait with its minimal complete definition sets would be an optional, globally non-enforceable but recommended API design practice that one could use to prevent recursion footguns for one's conformance implementers.

From the examples I've seen, it seems that the footgun extensions are rooted close to each other (probably in the same extension declaration), so static analysis could detect the majority of such problems (which is still not a proof of the absence of infinite recursion).

Asking trait authors not to provide default implementations, but only helper methods that models can explicitly use to express their derived implementations, has a similar learning experience: it's an optional, non-enforceable guideline. It may help if we require all requirement implementations to be stated explicitly within conformance declarations.
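The helper-method guideline can be sketched in Swift (the `ExplicitEq`, `defaultEq`, `defaultNe`, and `Point` names are all invented for illustration): the trait declares no defaults for its requirements, so forgetting one is a compile error rather than a runtime cycle, and the chosen derivation direction is spelled out at the conformance site.

```swift
// No default implementations for the requirements themselves:
// omitting either one is a compile-time error, never a runtime cycle.
protocol ExplicitEq {
    func eq(_ other: Self) -> Bool
    func ne(_ other: Self) -> Bool
}

extension ExplicitEq {
    // Helpers with distinct names; models opt in to them explicitly.
    func defaultEq(_ other: Self) -> Bool { !ne(other) }
    func defaultNe(_ other: Self) -> Bool { !eq(other) }
}

struct Point: ExplicitEq {
    let x: Int
    func eq(_ other: Point) -> Bool { x == other.x }
    // Explicit forwarding: the derivation direction is visible here.
    func ne(_ other: Point) -> Bool { defaultNe(other) }
}

print(Point(x: 1).eq(Point(x: 1)))  // prints "true"
print(Point(x: 1).ne(Point(x: 2)))  // prints "true"
```

The cost relative to default implementations is one line of boilerplate per derived requirement; the benefit is that an incomplete conformance can no longer slip through.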