# Rationale: Animation-hook-up Pointer Direction

The [animation data model from the 2023 workshop](https://code.blender.org/2023/07/animation-workshop-june-2023/#Data-Model) proposed that `Animation` datablocks and `ID`s get hooked up with a bidirectional pointer relationship (`ID <-> Animation`). The `ID` would point to an `Animation` datablock to use it, and the `Animation` datablock's outputs would point back at `ID`s to determine which output an `ID` uses when attached.

Sybren, Nathan, and Bastien recently revisited this, and came to the conclusion that a unidirectional approach with just the `ID` pointing to the `Animation` datablock (`ID -> Animation`) would probably be better. (There is also a third option, unidirectional `Animation -> ID`. That was not explored.)

Below is a summary of the two approaches we discussed, the trade offs we identified, and our rationale for choosing the unidirectional `ID -> Animation` model.

## The Two Models

:::info
Note: the diagrams below only show the minimum parts relevant to this discussion, and do not represent the complete data model. See the [workshop blog post for the full data model](https://code.blender.org/2023/07/animation-workshop-june-2023/#Data-Model).
:::

### 1. Bidirectional: ID <-> Animation

#### ID pointers as output identifiers:

```mermaid
classDiagram
    direction LR
    class ID {
        Animation* anim
        ...
    }
    class Animation {
        list~OutputIdentifier~ outputs
        list~Layer~ layers
    }
    class Layer {
        list strips
    }
    class KeyframeStrip {
        map[OutputIdentifier → array~AnimChannel~] channels
    }
    class AnimStrip {
        Animation *reference;
        map[OutputIdentifier → OutputIdentifier] output_mapping
    }
    class OutputIdentifier {
        ID * id_pointer
        string fallback_string
    }
    ID --> Animation
    Animation --> ID
    Animation --> Layer
    Layer --> KeyframeStrip
    Layer --> AnimStrip
    KeyframeStrip --> ID
    AnimStrip --> ID
```

The bidirectional model uses `ID` pointers as the primary identifier for animation outputs.
The original model from the workshop put these `ID` pointers only at the top level, in the list of outputs in the `Animation` datablock. For everything else internally, it then used indices into that list.

However, during prototyping it was discovered that using indices everywhere internally isn't feasible, because when the output list changes, the indices change. For example, removing the first output in the list would offset all the other output indices by one. Within the `Animation` datablock itself that wouldn't be an issue: the code that removes outputs can just update the indices in all the strips as well. But output indices in that design are also used to map outputs between `Animation` datablocks in `AnimStrip`s, which can't be so easily found and updated. And in some library linking situations it would be *impossible* to update them.

So to make the bidirectional model work, `ID` pointers must be used essentially everywhere that outputs are identified. This is reflected in the diagram above.

Additionally, a fallback string is necessary to identify outputs when there is no associated `ID` yet. This could happen with imported animations, for example. So the complete `{ID, fallback_string}` pair needs to be carried around everywhere, and kept up-to-date everywhere. This is also reflected in the diagram.

(Another solution would be to still use indices in `KeyframeStrip`s, but use full `OutputIdentifier`s at data interface boundaries like `AnimStrip`s. This doesn't affect the trade offs discussed further below, however, and would mean that different strip types identify outputs in different ways.)

### 2. Unidirectional: ID -> Animation

#### Stable indices as output identifiers:

```mermaid
classDiagram
    direction LR
    class ID {
        Animation* anim
        OutputIdentifier anim_output
        ...
    }
    class Animation {
        int stable_index_counter
        set~OutputIdentifier~ outputs
        list~Layer~ layers
    }
    class Layer {
        list strips
    }
    class KeyframeStrip {
        map[output_stable_index → array~AnimChannel~] channels
    }
    class AnimStrip {
        Animation *reference;
        map[output_stable_index → output_stable_index] output_mapping
    }
    class OutputIdentifier {
        int stable_index
        String name
    }
    ID --> Animation
    Animation --> Layer
    Layer --> KeyframeStrip
    Layer --> AnimStrip
```

This variant of the unidirectional model was born from the observation that the above `ID`-pointer-based model actually *can* use simple indices (including at data interface boundaries) if those indices are *stable*.

The idea of stable indices is as follows: store the top-level output list as a map instead of an array, with integers as the keys. Those integers are handed out by an incrementing counter in the `Animation` datablock. No integer is ever reused within the same `Animation` datablock, and the integer associated with an output never changes. Those integers are the "stable output indices".

With that design, using indices into the output list as originally planned works fine. However, once you've gone that far, **the indices themselves become a stable identifier of the outputs**, and can just be used directly for everything. To make that work, `ID`s also store which output index they're using in addition to a pointer to the `Animation` datablock they're using. This model is the result of doing that.

## Commonalities

Both models assume that all outputs also have a name or fallback string. In both models, these strings should be unique within the outputs of the `Animation` datablock. In the case of the bidirectional model, this is because the string serves as a backup identifier if the `ID` pointer is null (e.g. imported animation that isn't hooked up to anything yet).
In the case of the unidirectional model, this is because it serves as the *display* identifier in the UI (and possibly the *Python API* identifier as well): much like raw `ID` pointer values, the stable indices won't be meaningful to the user.

During discussion we also realized that these strings could be used strategically by productions if `ID`s also contain a corresponding user-settable animation output string. For example, both a proxy and full character rig could store the string "Frankie", and all `Animation` datablocks would set the string for that character's output to "Frankie" as well. Then the animation can be automatically hooked up correctly regardless of whether it's the proxy or full rig. This would work for both the bidirectional and unidirectional models.

## Trade Offs

### Swapping `Animation` Datablocks

A handy feature of the current `Action` system is that it's easy to swap out `Action`s on an `ID`. This is relevant to both animator workflows and to production pipelines. Both the bidirectional and unidirectional model support this. However, the new data model supports animation for multiple `ID`s in a single `Animation` datablock, which introduces the new challenge of swapping out an `Animation` datablock on multiple `ID`s at once.

Observations:

* In the bidirectional model, outputs know what `ID` they're for, which makes it easier to swap out `Animation` datablocks on `ID`s en masse, without having to manually reconnect each `ID` to the right output.
* However, this doesn't help when the `ID`s involved aren't the original `ID`s. This comes up in production when the models in the animation file are proxies, but the models in the lighting file are the full-res versions.
* As discussed in the "Commonalities" section above, by using the output name/fallback strings, we get most of the benefits of `ID`-based reconnecting while also being more flexible for production.
* Swapping is also used by animators for managing different "takes".
  However, for that use case we plan to have a takes system within the `Animation` datablock anyway, which is both more flexible and makes it unnecessary to swap `Animation` datablocks.

In practice this seems like a toss-up, since the (apparently) best solutions to this problem are independent of the bidirectional vs unidirectional decision.

### Fancy Features

Eventually we want to have features like animation-level constraints, mid-animation-swappable control rigs, simulation layers, etc.

Observations:

* In the bidirectional model, the `Animation` datablock has insight into what `ID`s it's animating. This would be valuable in implementing such fancy features, which in general need to know this kind of information.
* However, for the unidirectional model this information could be provided by a run-time data structure. And run-time data structures will likely be necessary for fancy features even with the bidirectional model, for example for maintaining evaluation dependencies.

The specifics of how this will play out are still quite fuzzy and unknown. But it seems likely that there are ways forward regardless of the bidirectional vs unidirectional decision, and that the primary challenges won't be due to this decision.

### Output Connection Validation

When outputs and/or `ID`s are removed, the connections between them need to be invalidated and updated.

Observations:

* In the bidirectional model we can take advantage of the existing `ID` management infrastructure for this, because `ID` pointers are used as the output identifiers, whereas in the unidirectional model we need to manage this ourselves with new code.
* The unidirectional model doesn't have to deal with the `ID` removal case *at all*, since nothing in the `Animation` datablock references `ID`s, and all connection data is stored in the `ID`. The bidirectional model, by contrast, has to deal with both `ID` and output removal.

The unidirectional model is conceptually simpler here (fewer cases that need validation in the first place), but may also require writing more custom code.

### Connecting Library-linked `Animation` and `ID`s

In production workflows, `Animation` datablocks are often linked from animation blend files, and hooked up to sometimes different `ID`s (e.g. full-res characters and props) in the lighting file. This means that both the `Animation` and the `ID` are library-linked assets, and need to be hooked up.

Observations:

* In the bidirectional model, both the `ID` *and* `Animation` datablock have to be overridden to connect them to each other, because the `ID` pointers in the `Animation` datablock also have to be changed.
* In the unidirectional model, only the `ID` needs to be overridden, and it's very straightforward/low-complexity.

This is a clear advantage in favor of the unidirectional model.

### Connection Stability

In the same library-linked `Animation` and `ID` situation described above, what happens to the animation in the lighting file when the outputs in the animation file change?

Observations:

* Example: an animator completely deletes the output used by a character or prop to start over.
    * In the bidirectional model, even if an output is completely removed and recreated in the animation file, the connection in the lighting file remains stable because the `ID` connected to it remains the same.
    * In the unidirectional model, the connection in the lighting file is lost because the original output no longer exists, and it's the stable index of that output that defined the connection.
* Example: an animator swaps out a temporary stand-in prop with the final one.
    * In the bidirectional model, the connection in the lighting file is lost because the `ID` pointer of the output changes.
    * In the unidirectional model, the connection in the lighting file remains stable because the output's stable index doesn't change.
* With the output name/fallback string set appropriately, automated re-hooking up of broken connections is possible with both the bidirectional and unidirectional models.

The trade off here is basically that in the bidirectional model *what the output is attached to* in the animation file determines the attachment in the lighting file, whereas in the unidirectional model *the output itself* is stable and determines the connection. Both approaches have failure cases, but the latter seems *probably* easier to understand and more straightforward from a UX perspective. And we need to provide good tooling/solutions for the failure cases either way.

### Connecting Multiple `ID`s to an Output

With `Action`s it's trivial to connect multiple `ID`s to the same animation data: you just assign the same `Action` to multiple `ID`s.

The unidirectional model keeps this very simple: you just assign the same `Animation` datablock and output index to multiple `ID`s.

The bidirectional model, on the other hand, makes this more complex. Outputs need to store a *list* of `ID` pointers, corresponding to all connected `ID`s. And that, in turn, raises further questions that need to be answered. For example, which `ID` pointer (or do all of them?) identifies the output when library linking and overrides are involved? These aren't unanswerable questions, but they become design tasks themselves, with their own trade offs to make.

Arguably, *directly* connecting multiple `ID`s to the same `Animation` datablock output (as opposed to doing so via e.g. an intermediate `AnimStrip`) might not be super useful. But it is a capability of the current animation system that would be good to preserve if we reasonably can, even if for no other reason than having an easy upgrade path for old animation data.

### Suzanne Principle

As far as we know, there is nowhere else in Blender that uses bidirectional `ID` pointers to specify a single connection between `ID`s.
There is a clear unidirectional flow of all `ID` connections in Blender. This makes the bidirectional model a departure from the current norm in Blender, whereas the unidirectional model is consistent with the existing data models already present in Blender.

This doesn't mean we should be afraid to break from current Blender norms. But we *should* have a good justification for it if we do.

## Decision

We have decided to use the `ID -> Animation` unidirectional model with stable indices.

For all of the trade offs we identified and explored, the unidirectional model was either better than or (nearly) on par with the bidirectional model. Additionally, it seems to generally be simpler and provide more straightforward answers to secondary design questions.
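To make the chosen model a bit more concrete, here is a minimal Python sketch of the stable-index idea. It is only an illustration and not Blender code: the field names `stable_index_counter`, `outputs`, `anim`, `stable_index` and `name` follow the diagram for the unidirectional model, while `anim_output_index` and the methods `add_output`, `remove_output`, `find_output_by_name`, `assign` and `resolve_output` are hypothetical names made up for this example. The sketch assumes the `ID` stores only the stable index of its output, and it leaves out layers and strips entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class OutputIdentifier:
    stable_index: int  # never changes, never reused within one Animation
    name: str          # display name, unique within one Animation


@dataclass
class Animation:
    # Only ever incremented, so a handed-out index is never handed out again.
    stable_index_counter: int = 0
    # Outputs stored as a map keyed by stable index, not as an array.
    outputs: Dict[int, OutputIdentifier] = field(default_factory=dict)

    def add_output(self, name: str) -> OutputIdentifier:
        """Hypothetical helper: create an output with a fresh stable index."""
        if self.find_output_by_name(name) is not None:
            raise ValueError(f"output name {name!r} is already in use")
        self.stable_index_counter += 1
        output = OutputIdentifier(self.stable_index_counter, name)
        self.outputs[output.stable_index] = output
        return output

    def remove_output(self, stable_index: int) -> None:
        """Hypothetical helper: other outputs keep their indices."""
        del self.outputs[stable_index]

    def find_output_by_name(self, name: str) -> Optional[OutputIdentifier]:
        """Hypothetical helper for name-based (re)connection."""
        for output in self.outputs.values():
            if output.name == name:
                return output
        return None


@dataclass
class ID:
    name: str
    anim: Optional[Animation] = None
    # Assumption: the ID stores the stable index of the output it uses.
    anim_output_index: Optional[int] = None

    def assign(self, anim: Animation, output: OutputIdentifier) -> None:
        self.anim = anim
        self.anim_output_index = output.stable_index

    def resolve_output(self) -> Optional[OutputIdentifier]:
        # Returns None when the ID is unassigned or the output was removed.
        if self.anim is None or self.anim_output_index is None:
            return None
        return self.anim.outputs.get(self.anim_output_index)


anim = Animation()
prop = anim.add_output("Prop")        # stable index 1
frankie = anim.add_output("Frankie")  # stable index 2

# Several IDs can share one output: each stores the same datablock and index.
full_rig = ID("Frankie_full")
proxy_rig = ID("Frankie_proxy")
full_rig.assign(anim, frankie)
proxy_rig.assign(anim, frankie)

# Removing and recreating an output does not disturb the other outputs.
anim.remove_output(prop.stable_index)
new_prop = anim.add_output("Prop")    # stable index 3; index 1 is not reused
```

In this sketch nothing in `Animation` refers back to an `ID`, so removing an `ID` requires no cleanup on the `Animation` side. An `ID` that still stores the index of the removed "Prop" output would resolve to nothing, which is the lost-connection case described under "Connection Stability"; a lookup like `find_output_by_name` is one way the name-based re-hooking could be done.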