# Outcome Design Spec

:::warning
Keep in mind that this document is still very much a work in progress.
:::

[TOC]

# Quick overview

Outcome is an architecture, or a kind of recipe if you will, describing an approach for creating and running civilization-scale simulation models from simple json/yaml files organized into modules. These input files can be parsed and a simulation instance can be spawned using that data. Data from the module files provides both the initial states of the simulated entities and the instructions necessary for processing the sim data at runtime.

The created simulation models' behavior is intended to mirror the behavior of real-world systems. That said, it's still entirely possible to create all kinds of fictional models as well. The models can be designed at varying levels of abstraction. The main intention is to focus on relatively high levels of abstraction, mostly because working at the lower levels of abstraction can prove to just be too difficult.

One of the core ideas for the simulation architecture here is having a limited set of entity types based on similarities in behavior and functional characteristics. We recognize 4 basic entity types: *region*, *organization*, *global* and *universal*. We will look into what they are supposed to represent, and how, in a later chapter.

Most of the actual processing for the simulation is designed using simple state machines, collectively called *elements*. A lot of core functionality involves using the element construct and having a set of abstractions derived from this element base, each with a specific purpose in mind. One example of that would be the modifier element, which is a sort of proxy for modifying entity properties in an orderly fashion. Another example would be the policy element, which encapsulates organization entities' internal operational rules. Yet another example would be the guide element, which helps organization entities arrive at decisions based on their current state and the current states of other entities. We'll cover all the elements later in this spec.

Element states contain *evaluations* and *executions*. Evaluation of states, and execution if the evaluation goals were met, is performed mostly on the basis of *tick events*, where ticks signify the passage of simulation time (the base tick is one sim hour). There are also ways of directly invoking states by their name on selected elements. Once again, we'll dive into the details of element processing later on.
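To give a rough feel for how these pieces fit together, here is a minimal sketch of a module file. This is purely illustrative: the actual declaration format is covered (and still debated) later in this document, so every key name below (`props`, `elements`, `states`, `eval`, `exec`) is an assumption, not the spec's final schema.

```yaml
# hypothetical module file sketch (illustrative only, not the final schema)
props:
  # register a property for all region entities
  - entity_type: region
    name: population
    value: 0

elements:
  # a modifier element that adjusts a region prop on a tick basis
  - type: modifier
    name: population_growth
    entity_type: region
    states:
      - name: grow
        eval: []                          # no conditions, always passes
        exec:
          - oper prop/population * 1.001  # cmds are described later in this spec
```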
# Challenges and trade-offs

There are many challenges to creating large-scale simulations involving both social and physical systems. One of the first is probably the question: is this actually worth pursuing? And if so, to what degree could it be useful? For us the answer to the first question is a definite yes, for many different reasons which we won't go into here. But what about the second question?

The thing to understand is that this architecture and its implementations can only go so far. The high-level-abstraction approach is itself a sort of trade-off, in the sense that you can't expect the result to be as good as the real-world systems, because you're not even trying to mimic them at a sufficiently low level. There is something to be said about generating a sufficient amount of complexity through modularity and extensibility, though. Our idea for reaching high levels of complexity is to spread the model creation process across a large number of contributors, preferably thousands of people. This implies the need for ease of use and built-in modularity.

## What is needed?

We need a simulation system that is flexible and approachable. Adding new functionality to existing simulation models, and indeed creating new simulation models altogether, should be fairly straightforward. We need it to allow for almost drag-and-drop-like modularity, and we need a low barrier to entry so that non-tech-savvy people are able to contribute. At the same time we need the computation to be fast and stable, allowing for quick iteration. We also need to plan for scalability, spreading the computation across multiple threads and preferably multiple machines.

## Hardcode everything

Hardcoding simulation models into the program itself is definitely the most efficient way to proceed, in the sense that the result will be highly optimized and will run fast. While great in terms of operational (computational) efficiency, this approach doesn't solve any of the problems related to ease of use. Raw code (especially in lower-level languages), compilers and all that jazz are not something a non-tech-savvy person can deal with. That's not to say the approach can't be used. With the Rust language implementation being openly distributed, for example, it's entirely possible for someone to take on transcribing already existing simulation modules to Rust code. An established simulation content base will probably need to exist first for this to happen, though.

## Scripting languages

Using a scripting language as an intermediary between the user and the main process is generally a good idea. It allows for a lot of flexibility at the user end, mostly because you basically have a feature-packed programming language at your disposal, with access to things like functional and object-oriented programming concepts. Including a scripting language in the picture, however, can lead to significant overhead. This is to be expected, as we're ultimately importing a lot of functionality into the system. This may or may not be what you want, depending on what kind of functionality you really need for your system.

Building the original prototype interface for Anthropocene (the game implementation) using C# and Lua was a moderate success. It was quite easy to create a working API, and the whole thing was quite flexible at the user (content creator) end, allowing for writing actual scripts that could be executed at runtime. In the end, though, this solution didn't work out as well as was needed. Over time, mostly because of pressures related to performance, it started morphing into the system that is in use today (see below).

## Simpler data structures and custom parsing

Another approach is to use simple structured file formats like json and yaml to build the simulation parts. The initialization procedure parses the files and uses them to instantiate simulation parts. This is quite neat because at the architecture implementation level you can use basically any language, as most of them support json/yaml out of the box. This way we don't have to rely on external libraries, for example for running Lua scripts. We end up with simple state machines with a closed set of possible processing instructions, all parsed before running the simulation and initialized into objects containing simple if..else..then flow control statements that can be understood by the program implementing the simulation runner.

On the user end, writing sim models is less flexible with this approach. There is a limited set of instructions you can use, and there are more restrictions on what's possible. On the one hand this is not ideal, but on the other hand it means learning to write sim models can be easier. This approach seems to be a good trade-off between ease of use and performance, and it is the one currently used. The following pages are mostly a description of this particular approach (a rough sketch of such flow control data follows below).
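As an illustration of what "simple if..else..then flow control statements" could look like once expressed in yaml, here is a minimal hypothetical sketch. The key names (`if`, `then`) and the condition syntax are assumptions made for illustration only; the actual instruction set is described in the sections on evaluations, executions and commands.

```yaml
# hypothetical flow-control instruction (illustrative only)
- if: prop/population > prop/population_capacity
  then:
    - log "region is over capacity"
```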
# Design with specific implementations in mind

Before diving into details, it's important to note that parts of the architecture are already designed with some specific uses and implementations in mind. Choices for those concrete uses and implementations are driven by the project's overall goals and motivations, which will not be discussed here. Currently there are plans for 2 applications that will make use of the architecture:

- game "Anthropocene" (C#, Unity3D)
- command line tool (Rust)

While modules written for use with any of those applications should work with any other application conforming to this spec, there can be aspects that are relevant to one implementation while not at all relevant to another. A good example of this are images specified for org choices, which are relevant only for implementations that include some graphical interface, like the game implementation. Such images can be totally ignored by implementations without a GUI to show them in, like the command line tool.

# Simulation Core

Let's dive into the sim core. We'll take a higher-level perspective here; for more concrete implementation details you can check the public code repositories.

> [color=#0094d9] Why is it useful to know how the simulation works? Well, most importantly, it's easier to create (and debug) content if you know what actually happens at the simulation level. Getting a good grip on what kinds of objects are involved and how they relate to each other is recommended before getting into writing module files.

Let's see what kinds of concepts and processes are involved here.

## Declarations

Everything that is to exist within the simulation needs to be declared in a module file. The module files used for initialization are really just lists of declarations. Declarations can vary in size and content, depending on the thing being declared.

:::danger
TODO: Give some examples of declarations.
:::

### Variable Permanence

The simulation depends on a database structure for the storage of all the data it uses throughout the process. Variables' addresses stored in the database can't be changed or deleted once initialized. Also, there is no creation of variables at runtime; they all have to be properly defined and initialized ahead of time.

:::danger
DEBATED: Is this the right approach? What if we wanted to create things like orgs at runtime?
:::

## Unified Addressing

All variables can be referenced using unix-style paths.

```
# a few examples of different addresses
/region/e_01001/prop/immigration_rate_daily
/org/GER/policy/immigration_policy_1/current_state
prop/population
/org/{prop/master_org}/prop/
```

A slash at the beginning indicates an absolute path, while no leading slash indicates a relative path. A relative path means that the first part of the path is omitted because it's the same as the first part of the path of the entity where the current execution is happening. For example, we might currently be doing an execution on some region entity "m_01012" and specify a path like "prop/population" somewhere. In such a case the path will be implicitly taken to be "/region/m_01012/prop/population".

Curly braces can be used to insert some other variable's value into the path (by referencing the path to that other variable).
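For instance, the bracketed example above might resolve along these lines, assuming a hypothetical `master_org` prop whose value is the org id `EU` (the prop value and the `budget` target are made up for illustration):

```
# hypothetical resolution of a curly-brace address,
# evaluated in the context of entity /org/GER
/org/{prop/master_org}/prop/budget
  -> prop/master_org expands to /org/GER/prop/master_org, whose value is "EU"
  -> final address: /org/EU/prop/budget
```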
All paths are checked for correctness at initialization. If a path is incorrect, an error message is written to the debug output and the evaluation/execution it was used in is disabled. Checking whether a path is correct involves checking all parts of the address. For example, if some path specifies a region "/region/12345/(..)" but no such region was declared in the first place, then the path is incorrect.

:::danger
TODO: Define how exactly the addresses are built up and why.
:::

### Address Synonyms

There are some useful synonyms defined that can be used. They are mostly shorter versions of the address parts. Examples:

`region` -> `reg`
`organization` -> `org`
`property` -> `prop`

The addresses are parsed at initialization and any recognized synonyms are rewritten into the basic (longer) variants.

## Basic concepts for the sim core

There are a few basic concepts underlying the design:

- entities (entity)
- properties (prop)
- property maps (propmap)
- elements (element)
- element states (state)
- element state modes (mode)
- evaluations (eval)
- executions (exec)
- commands (cmd)

Let's go through all of those in detail.

## Entities

Entities are divided into 4 types:

- region (reg)
- organization (org)
- global (glo)
- universal (uni)

You can think of these types of entities as different "simulation levels", each entity existing on one of these 4 levels. Entities of each type are always given the same set of instructions and the same properties. Following the "simulation level" analogy, we could say that for example all region entities exist on the same "simulation level" (we could call it the "region simulation level"). Having the same set of instructions means their behavior will be appropriate for what they actually represent (e.g. a region).

:::info
Note: This doesn't mean they will all behave the same way - given different inputs and some amount of randomness the actual behaviour will be quite different.
:::

### Region

Regions are the smallest parts of the puzzle. Each region entity represents an actual geographic region as defined on the *region canvas*. Regions contain much of the actual simulation data. Most of the social and physical phenomena that can be divided into smaller (geographic) parts exist on this region level.

:::info
Note: Perhaps the most important thing to remember about region entities (and other entity types as well, really) is that all region entities contain the same set of instructions. It's the initial values and later random circumstances that make the regions different from one another. The base set of instructions stays the same during the whole simulation run. This basically means that under the right circumstances any region could become something that resembles any other region.
:::
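As a purely hypothetical illustration of the "same instructions, different initial values" point, region entities might be declared in a module file along these lines. The key names and structure here are assumptions, not the spec's actual declaration format:

```yaml
# hypothetical region entity declarations (illustrative only)
# both regions run the same instruction set; only starting values differ
regions:
  - id: e_01001
    props:
      population: 52000
      forest_coverage: 0.41
  - id: m_01012
    props:
      population: 480
      forest_coverage: 0.02
```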
### Organization

The organization entity type defines an entity capable of self-management in some sense - organizations are the *agents* within the simulation. There is a special system of decisions available to organization entities, through which they are presented with choices during the simulation run. This approach allows for easy switching of the organization's input interface, so to speak. Because the decisions don't have to be bound to the "computational schemes" (guides), and because they involve really simple input, it's quite easy to substitute any organization's default input with an external one. This is something that's done for the game implementation of this spec - the player is simply "inserted" into the simulation as the "choice machine" for an organization of their choice, and they get to make all the choices which would normally be made by the appropriate guides. Of course there are more things involved in making this kind of approach playable, such as "outsourcing" some of the low-level decisions to the default guides, but the basic mechanism remains simple.

:::info
Note: This interface can be used to create AI players that would learn how to make decisions by "playing" as some organization entity. For the command line tool implementation there is a plan to include tools for easier organization input substitution. This will probably be implemented as a "player mode" where a sort of API is provided for getting the required information into and out of the simulation.
:::

### ?? Global ??

The global level is basically the planetary level. In practice it can also encompass smaller celestial bodies, like moons. The global level is relevant for traits that are shared by different planetary/moon-scale bodies. One example of such a thing would be an atmosphere. Of course we're talking mostly about things that are relevant to human operations, and so we'll focus on global entities being places humans can possibly inhabit. Earth, the Moon and Mars are the chief examples, and indeed the three main globals we'll usually focus on in the context of our simulations.

Property maps exist on the global level. A property map maps region-level values taken from multiple regions onto a global map. In that sense the variables and their corresponding values making up the property map are themselves not global, but the property map as a whole is.

:::danger
**Global sim level is still debated. It could get implemented in the future. Right now the focus is on the Earth system.**
:::

### Universal

For any simulation instance there is only one universal entity. This means that anything that needs to "exist only once", as opposed to being duplicated for each instance of an entity as is the case with all other entity types, will exist on the universal level. Things that are not replicable in any of the regional or global contexts exist on the universal level. For example, sun activity is modeled on this level. A less obvious example is a killer asteroid hitting the Earth. Here "the line gets blurry", as we could just as well have asteroid events replicated for all the planets, since the odds of such an event could be comparable. Other than the above, the universal level can be used to define very specific things we want to include in the simulation as "singular instances".

## Vars (variables)

Vars are key-value pairs that are persistent throughout the simulation. They are stored inside the Database object. The value of a var can be retrieved using its address (key). Vars need to be registered at initialization. Var registration registers a var for a specific entity type, rather than for a specific entity. For example, when registering a new var for the region level, each region entity will receive an instance of this new var and will be able to use it (see the sketch below).
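A minimal sketch of what registering a var for an entity type could look like in a module file. The registration syntax is not pinned down by this document, so the keys below are assumptions; only the semantics (one instance of the var per entity of the given type) and the `num` value type come from the surrounding text.

```yaml
# hypothetical var registration (illustrative only)
vars:
  - entity_type: region    # every region entity gets its own instance of this var
    name: forest_coverage
    type: num              # base value classes: string, num, bool
    default: 0.0
```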
### Element vars

### Props (properties/prop vars)

Properties are a subset of vars, specific in that they are not element based. You could say these are the entities' properties - values that define the baseline states of the entities.

Example of a prop address: `/reg/e_01001/prop/forest_coverage`

### Var map

A var map is a collection of values of some variable over a set of entities.

## Elements

Basically speaking, elements are based on the concept of Finite State Machines (FSM). Each element has a set of states specified, and at any moment only one of the element's states can be "active". This idea of a state machine is useful here because the simulation invokes execution of the element states on a time-tick basis, which makes it easy to have, for example, an element wait for a certain condition or amount of simulation time before moving on.

There are a number of specific element types, all designed to handle specific things. All of the elements are designed using the same base structure of states. Indeed, there is a generic element type which is the default element type. Other element types build on top of that, differing mostly in terms of how they are declared. Let's have a look at all the element types we can use to create our sim models.

### Generic

wip

### Event

wip

### Modifier

wip

### Policy

wip

### Tech

wip

### Agreement

wip

### Guide

wip

### Decision

wip

## Element States

wip

## Evaluations

Evaluations are the condition checks attached to element states. They are performed mostly on a tick basis, and when an evaluation's goals are met the state's executions are run.

## Executions

Executions are collections of commands that are parsed and executed. Executions are the instructions that specify what modifications to the entities' prop tables should happen and when they should happen. They rely on time ticks to fire their triggers, and their actions run if the required checks are passed.

## Commands

Commands (cmds) are the micro-programs that are invoked by the element states. There are simple commands (cmd) and complex commands (ccmd). Complex commands are parsed and cut into multiple simpler commands, and stored this way for later processing. Commands are hardcoded (though there is also a user_cmd cmd allowing for user-defined commands??). Commands can do many different things and take varying numbers of arguments. ??Commands return some value (base value class, so either string, num or bool) as their return value.

Let's look at all the available cmds.

### Simple

#### set

`set [var_address_1] [var_address_2]`

Sets the value at `var_address_1` to the value of `var_address_2`.

#### log

`log [log_message]`

Logs a message to the default log store.

### Complex

#### oper

`oper [var_address_1] [operation_sign] [(calc)]`

Operates on `var_address_1`, converting the contents of the third argument into separate calc steps.
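To tie the above together, here is a hypothetical sketch of an execution block using these cmds inside an element state declaration. The surrounding yaml structure (`states`, `exec`) and the exact argument forms are assumptions; only the `set`/`log`/`oper` signatures come from the definitions above.

```yaml
# hypothetical element state using the cmds above (illustrative only)
states:
  - name: harvest
    exec:
      # simple cmds
      - set prop/forest_coverage_last prop/forest_coverage
      - log "harvest pass finished"
      # complex cmd: parsed into simpler calc steps at initialization
      - oper prop/forest_coverage - (prop/harvest_rate)
```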
# Data Management

Data Management is about handling, organising and distributing the data that is used to create and run simulation models. The main features relevant here are mods, scenarios and snapshots. In one sentence: mods are collections of instructions, scenarios are collections of mods, and snapshots are already-initialized scenarios containing saved state (like a game save). Let's go through those three levels of data and see how they relate to each other.

## Module

Modules, mods for short, are, simply put, collections of data that can be gathered together and used to spawn a simulation process. It's helpful to think of mods as packages - mods allow for modularity, in the sense that we can put different collections of mods together and achieve a working simulation model that can be run. This ability to combine multiple mods is the key here. For managing multiple mods within one "environment" we move into the domain of scenarios.

### Flexibility within the mod

There is a degree of flexibility to the organization of files inside a mod. Certain guidelines for structuring files/content within a mod exist, but they are not mandatory - there are multiple ways to tackle file organisation inside a mod.

:::warning
TODO: Define the different ways of organizing files within mods.
:::

Mods always exist in the dedicated *mods* directory, which itself always resides inside the *scenario* directory.

## Scenario

A scenario wraps a collection of mods into one *simulation environment*, so to speak. A simulation instance is always spawned using a single scenario as input. This is also true when initiating a simulation instance using a snapshot - in that case the snapshot points to a scenario to be used.

## Snapshot

A snapshot holds state data which can be used to load a specific state into a simulation instance. A snapshot still needs to point to a scenario so a simulation instance can be spawned; the snapshot only provides data which is loaded onto an already existing simulation.

A snapshot must contain:

- a serialized copy of the database object
- snapshot metadata

:::warning
TODO: Snapshot could also optionally contain a collection of archived states for past simulation ticks.
:::

Keep in mind that snapshot data is loaded only after the simulation instance has been spawned. This means that if there is any data in the snapshot that corresponds to an item that doesn't exist in the initialized simulation instance, it will not be loaded. The other way around, if the snapshot doesn't contain data for something that is defined and exists in the simulation instance, the default value from the declaration will be used.

# Initialization process

Initialization is where the raw input data gets converted into program-readable data structures.

## Read the files

The first step is to read all the input files. Files are read in the order of the mods they belong to, and organized into mod-based groups. So we get the mods' directories for the requested scenario and, one by one, get their files and read them.

## Register all the declared items first

This is an intermediate stage which ends with a complete list of all the declarations from the read files. This includes the deletions.

## Deletions

Deletion is a tool that allows for "deregistering" some item that has been declared. It's possible to have some declaration in one mod, and then have a deletion declared in another mod, effectively deleting the thing that was registered by the earlier declaration. The deletion phase is always processed after the main registration stage, because we need to know about all the registered items before we can start deleting them.
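A hypothetical sketch of how a deletion might look in a module file. The spec does not define the deletion syntax at this point, so the key name and address form below are pure assumption; only the mechanism (one mod deregistering an item declared by another) comes from the text above.

```yaml
# hypothetical deletion declaration (illustrative only)
# placed in a mod loaded alongside the mod that declared the item
deletions:
  - /reg/prop/forest_coverage    # deregister a region-level prop declared elsewhere
```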
## Initialization

There is a specific order in which different types of things are initialized. This is important since some things need to exist first for other things to be properly initialized. The order of initialization is as follows:

- entities
- elements
-

# Simulation tick process

wip

# Proofs

One of the best ways to determine connections between certain elements of large systems is to focus on a few of those elements and test system output with the elements changed slightly. That's exactly what proofs are about.

A proof runs multiple simulation instances, each of which can have some properties' values changed based on the proof manifest file. For each simulation run the results are recorded. The results we want saved are also defined in the manifest file. Usually the results will include simple pieces of information, like the values of interest at the starting and ending points of the simulation. It's possible, though, to use an option to save entire snapshots for the start and end sim points, to specify simulation times for saving snapshots, or to set a time interval for saving snapshots. Indeed, if needed, snapshots could be made for each simulation tick. Having more data from all the different simulation runs can help arrive at better conclusions when analysing the data afterwards.

## Structure of a proof manifest

The whole idea is to provide a **simple way** to do such tests, so there are only a few things a proof needs:

- initial state for the simulation (either scenario or snapshot)
- start date (optional, defaults to the start date of the initial state)
- end date
- save snapshots mode (optional, needs a list of sim dates and/or time intervals)
- a list of variants

For the properties of interest, each declared property can specify its own additional requirements, such as:

- variants for the initial state (optional, by default there are no variants)
- saving time intervals (optional, defaults to only saving the values of the property for the start and end simulation tick)
-

```
# example of a simple proof.yaml
start_date: default
end_date: "02-02-2019"
save_snapshots: false
runs: 10
properties:
  # is this needed?? probably don't need to specify
  # all the props beforehand
  - /reg/e_01001/prop/population
  - /reg/e_01001/prop/population_growth
  - /reg/e_01001/prop/forest_coverage
variants:
  - name: Higher population variant
    properties:
      - addr: /reg/e_01001/prop/population
        value: 1000
      - addr: /reg/e_01001/prop/population_growth
        value: 0.7
  - name: Lower population variant
    properties:
      # modified+inspected props
      - addr: /reg/e_01001/prop/population
        value: 666
      - addr: /reg/e_01001/prop/population_growth
        value: 0.6
      # inspected props
      - addr: /reg/e_01001/prop/forest_coverage
```

# Multi-thread considerations

Multithreading is almost a necessity if we want to use all the available resources to speed up our simulation runs. With some implementations the task of multithreading can be easier than with others. For example, the Rust programming language offers relatively easy solutions to most common problems with multi-threaded applications, but really this is beyond the scope of this doc. The takeaway here should be that using all available threads will not always be possible. The approach to take for spreading the computation load across multiple processor threads is not dictated by the specification; rather, it should be considered on a per-implementation basis.

That said, there are multiple ways of implementing multiple-thread use, some easier than others. Probably the easiest would be to "outsource" the lowest-level simple tasks to other threads, figuring out what the necessary data for a task would be and replicating it, along with the task instruction, to the destination thread. The level of simulation objects we're talking about here would basically be element states. Doing this with higher-level tasks such as whole elements (multiple states) can be hard because

# Multi-process considerations

It would be really nice to create a program capable of running an outcome sim using multiple instances of itself (processes). This would enable quite straightforward scaling with larger and computationally-heavier scenarios - just add more machines and you're set!

## Distributing single simulation instance

There are a few problems with using multiple processes to collaboratively run a single simulation instance. The biggest ones are sharing resources and timely coordination.
Sharing resources can get messy quickly. If we had a process

Because of those problems this approach will not be pursued.

## Distributing proofs

Another approach to scaling up the processing power for running simulations is to run multiple simulation instances in parallel. Proofs are based on running multiple simulation instances built from the same scenario, with only slight changes between them. Distributing this kind of work is easier because we just tell processes to run simulation instances on their own. The only coordination needed is telling the worker processes what to work on and then having them report the results back to the main process.