# Notes on Object Files and Schemata: Factorizing Declarative and Procedural Knowledge in Dynamical Systems

#### Author: [Sharath Chandra](https://sharathraparthy.github.io/)

## [Paper Link](https://arxiv.org/pdf/2006.16225.pdf)

## Outline

1. This paper focuses on knowledge factorization: it combines *declarative knowledge* (which keeps track of the objects in an environment) with *procedural knowledge* (which predicts the dynamics of the environment) while keeping the two separate.
2. The terms *Object Files* (OF) and *Schemata* are borrowed from the cognitive science literature, where an *OF* is a short-term memory that keeps track of an object's location, properties, and history, and a *schema* encapsulates abstract knowledge about an object's dynamics and behavior.
3. The terms *OF* and *schemata* are easy to understand through the lens of object-oriented programming (OOP). Each *OF* can be viewed as an object in OOP, that is, an instance of a class (the schema). Once the object is instantiated, its internal state belongs to that particular *OF* and is not shared with other *OFs*, while the methods are accessible to all and only the objects to which they are applicable.
4. The authors show that factorizing declarative and procedural knowledge in this way enables systematicity and improves accuracy in next-state prediction models.

## SCOFF model

The SCOFF model can be described as follows. It takes a sequential input, which is typically encoded by a neural encoder to obtain a deep embedding. These embeddings are then passed to the SCOFF model, which performs the following three steps:

1. OFs are modelled using GRUs or LSTMs. Multiple OFs are instantiated, and they are activated depending on the input. This "activation" is decided by attention-based soft competition among the OFs: the OFs whose queries are most compatible with the keys (generated from the input state) are regarded as winners.
2. Each activated OF performs a one-step update of its state (a layer of GRU/LSTM units), and this update is itself selective: the OF "selects" a schema. Concretely, the weight parameters needed for the update are factorized into sets, where the set $\theta_j$ corresponds to schema $j$. The selection of schemas is again based on attention scores, but now the scores are "hard" (implemented with a Gumbel-softmax).
3. OFs can communicate with each other; this communication step again uses a soft attention mechanism.

The algorithm is summarized in the following figure:

![](https://i.imgur.com/q4LDrUI.png)

## Experiments

The experiments are geared towards assessing SCOFF's ability to factorize knowledge and how this modularity helps systematicity.

**Toy setup**

The authors present a neat toy setup in which a clear factorization of knowledge helps achieve better results. The model is shown a video sequence in which a single object (a ball) moves in one of the following ways:

1. Accelerating in one particular direction.
2. Moving with a constant velocity in one particular direction.
3. Performing a random walk with a constant velocity.

Here we can treat these three dynamics as three schemas and the object (ball) as an OF. The SCOFF model is shown to specialize, choosing among these schemas based on the input.

![](https://i.imgur.com/QRerwTz.png)

The authors also analyze how well the SCOFF model switches dynamics on the fly. The setup is the same, but now there are two schemas (dynamics): vertical and horizontal motion (shown below).

![](https://i.imgur.com/bGCbQC3.png)

A visual marker in the video sequence indicates which dynamics to follow. The authors show that the SCOFF model learns this context-dependent selection of dynamics on the fly, yielding high-precision prediction of the next frame.
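
To make the OF/schema split in the toy setup concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' code: the `BallFile` class, the three `schema_*` functions, the 2-D state layout, and the acceleration factor of 1.1 are all hypothetical. In SCOFF the schemas are learned sets of parameters and the choice is made by attention; here they are hand-written functions and the choice is passed in by name.

```python
import random
from dataclasses import dataclass, field


@dataclass
class BallFile:
    """Hypothetical object file: the declarative state of one ball."""
    pos: tuple = (0.0, 0.0)
    vel: tuple = (1.0, 0.0)
    history: list = field(default_factory=list)


def _advance(pos, vel):
    # Move one time step along the velocity.
    return (pos[0] + vel[0], pos[1] + vel[1])


def schema_accelerate(ball, rng):
    # Speed up along the current direction (factor 1.1 is an assumption).
    vel = (ball.vel[0] * 1.1, ball.vel[1] * 1.1)
    return _advance(ball.pos, vel), vel


def schema_constant(ball, rng):
    # Keep the velocity unchanged.
    return _advance(ball.pos, ball.vel), ball.vel


def schema_random_walk(ball, rng):
    # Keep the speed, pick a new axis-aligned direction at random.
    speed = (ball.vel[0] ** 2 + ball.vel[1] ** 2) ** 0.5
    dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    vel = (dx * speed, dy * speed)
    return _advance(ball.pos, vel), vel


# Procedural knowledge lives here and is shared by every object file.
SCHEMAS = {
    "accelerate": schema_accelerate,
    "constant": schema_constant,
    "random_walk": schema_random_walk,
}


def step(ball, schema_name, rng):
    # Record the old position, then apply the selected schema to this OF.
    ball.history.append(ball.pos)
    ball.pos, ball.vel = SCHEMAS[schema_name](ball, rng)


rng = random.Random(0)
ball = BallFile()
for _ in range(3):
    step(ball, "constant", rng)
print(ball.pos)  # (3.0, 0.0)
```

In this picture, switching dynamics on a visual marker amounts to changing `schema_name` between calls to `step`, while the ball's state in `BallFile` carries over untouched.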
**Multiple objects and schemas**

Here the authors consider the same bouncing-balls task, but the input now contains multiple balls, each following billiard-ball dynamics that are shared across all balls. The SCOFF model can thus treat the balls as different OFs that share a common set of schemas, selectively activating OFs and then schemas based on the soft and hard attention scores, respectively. The experiments show that the proposed model predicts the next 10/30 steps more accurately than GRUs and RIMs (Goyal et al., 2019).

![](https://i.imgur.com/oU5dWNn.png)
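
The three steps of the model (soft competition for the input, hard schema selection, soft communication) can be sketched for several OFs sharing one set of schemas. This is a deliberately simplified illustration, not the authors' implementation. Assumptions: OF states and the input are plain vectors of equal size, queries and keys are the raw vectors (no learned projections), each schema is a single weight matrix followed by `tanh` instead of a GRU/LSTM update, and the hard choice uses Gumbel-argmax, which is the forward pass of a hard Gumbel-softmax. The names `scoff_step` and `gumbel_argmax` are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    return [dot(row, v) for row in W]


def gumbel_argmax(logits, rng):
    # Add Gumbel(0, 1) noise to each logit and return the index of the largest.
    noisy = [
        logit - math.log(-math.log(max(rng.random(), 1e-12)))
        for logit in logits
    ]
    return max(range(len(noisy)), key=noisy.__getitem__)


def scoff_step(ofs, x, schemas, rng):
    # Step 1: OFs compete softly for the input (query = OF state, key = input).
    share = softmax([dot(h, x) for h in ofs])
    reads = [[a * xi for xi in x] for a in share]

    # Step 2: each OF makes a hard choice of one schema and updates with it.
    updated, chosen = [], []
    for h, r in zip(ofs, reads):
        inp = [hi + ri for hi, ri in zip(h, r)]
        candidates = [[math.tanh(v) for v in matvec(W, inp)] for W in schemas]
        j = gumbel_argmax([dot(h, c) for c in candidates], rng)
        chosen.append(j)
        updated.append(candidates[j])

    # Step 3: OFs exchange information via soft attention (residual update).
    out = []
    for h in updated:
        attn = softmax([dot(h, other) for other in updated])
        msg = [
            sum(a * other[i] for a, other in zip(attn, updated))
            for i in range(len(h))
        ]
        out.append([hi + mi for hi, mi in zip(h, msg)])
    return out, chosen


rng = random.Random(0)
dim, n_ofs, n_schemas = 4, 3, 2
ofs = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_ofs)]
schemas = [
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
    for _ in range(n_schemas)
]
x = [rng.uniform(-1, 1) for _ in range(dim)]
new_ofs, chosen = scoff_step(ofs, x, schemas, rng)
print(chosen)  # one schema index per OF
```

Note that `schemas` is a single list shared by all OFs, while each entry of `ofs` keeps its own state; this mirrors the idea that several balls can follow the same dynamics without sharing their positions or histories.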