
# Moneta

Moneta is the memory access visualizer tool built by students who took the first version of this class and who wanted a better way to understand how their programs were accessing memory.

Moneta has two parts:

The first is a *binary instrumenter* that modifies your executable to record a "trace" of the memory accesses it performs. This is based on a very cool tool called [PIN](https://software.intel.com/content/www/us/en/develop/articles/pin-a-dynamic-binary-instrumentation-tool.html). This piece of instrumentation is called a "pin tool". The pin tool does two things:

1. It records memory accesses.
2. It simulates a simple cache hierarchy so it can label memory accesses as hits and misses.

It records all this information in a file called a trace.

The second piece of Moneta is a trace viewer built on Jupyter Notebook and a collection of tools for visualizing large data sets. It lets you quickly load and explore a trace file. The visualizer displays a graph with relative address on the vertical axis and memory access number on the horizontal axis. This gives you a visual depiction of how the processor is accessing memory over time.

Using Moneta has three steps: adding some special functions to your code to tell Moneta what you want to trace, collecting a trace, and then visualizing it to learn something.

## Instrumenting your Code to Collect a Moneta Trace

Your program accesses a lot of memory -- the stack, the heap, all your data structures, a bunch of stuff from the standard libraries, etc. This can make it very hard to find what you are looking for when you are trying to optimize a particular function. In addition, if you recorded all the memory accesses a large program performed, it would take many, many GBs of storage and probably hours to process.

To avoid these problems, Moneta provides two facilities: marking regions of memory with 'tags' so you can find them easily, and turning tracing on and off.
Both these mechanisms work by inserting calls to special functions that Moneta can identify. You'll need to `#include <pin_tags.h>` to use these:

* `START_TRACE()` -- Turns on tracing. If you don't call `START_TRACE`, _nothing_ will be recorded.
* `STOP_TRACE()` -- Turns off tracing. Nothing will be recorded until you call `START_TRACE` again.
* `DUMP_START(const char* tag, const void* begin, const void* end, bool create_new)` -- Opens a _tag_ which will label accesses between `begin` and `end` with the `tag`. All operations in that range until `DUMP_STOP` is called with the same tag will be part of the tag. `DUMP_START` takes four parameters:
    1. `tag`: A string name to identify the trace.
    2. `begin`: The lower bound of the memory address range to trace (array/vector example: `&arr[0]`).
    3. `end`: The upper bound of the memory address range to trace (array/vector example: `&arr[arr.size()-1]`).
    4. `create_new`: If the `tag` name has not been used before, `create_new` is ignored. If the tag name has been used before and `create_new` is `true`, the tags will be given an index: `tag0`, `tag1`, ... If `create_new` is `false`, the new accesses are added to the most recent tag with that name.
* `DUMP_STOP(const char* tag)` -- Closes a tag.
* `DUMP_START_ALL(const char* tag)` -- A wrapper around `DUMP_START` that creates a tag that tracks _all_ memory accesses. This is useful for tagging all the memory accesses that occur during a period of time. You can close the tag with `DUMP_STOP`. For instance, you can use this to tag all the memory operations that occur in a function.

**Note:** Once you create a tag, you can't create a new tag with the same name that covers a different address range. If you do, you'll get an error like "Error: Tag redefined - Tag can't map to different ranges".

**Note:** Due to memory limitations on `dsmlp`, we can only reliably record 10 million memory accesses. This is not that many.
You'll need to carefully choose where and when to enable tracing. A good practice is to call `START_TRACE()` right before the code you want to trace and `STOP_TRACE()` when you're done. The program will stop running after it traces 10M memory operations.

## Collecting Traces

**You can only generate Moneta traces inside the class Docker image on `dsmlp` (or your own machine). Moneta never runs in the cloud.**

The simplest way to collect a trace is at the command line using the lab's `Makefile`. For example,

```
make traceme_trace
```

will run the code in `traceme.cpp` and generate a trace (try it!). You'll have these files:

```
meta_data_traceme.txt
tag_map_traceme.csv
trace_traceme.hdf5
```

Together, these make up the "trace" of the program's execution. Likewise,

```
make code_trace
```

will run your code with the same command line arguments as `make code.csv`. After you run it, you'll find three files:

```
meta_data_code.txt
tag_map_code.csv
trace_code.hdf5
```

## Launching Moneta

Moneta runs inside Jupyter Notebook and you will access it via a web browser. **To access Moneta, you must be connected to the campus VPN**.

After you log into dsmlp-login.ucsd.edu and run `launch-142` as usual, you'll see something like this:

```
ucsdnvsl/cse141pp:sp21.150 is now active.
Please connect to: http://dsmlp-login.ucsd.edu:19589/?token=a4da2a4c6d82c31d9525ba51b3c734fd2e748d3ea929eafa675b3166e4b10a
Connected to sjswanson-32617; type 'exit' to terminate pod/processes and close Jupyter notebooks.
/course/CSE141pp-Config ~
```

Visit the URL provided after "Please connect to:". This should open a window showing the contents of your login directory on `dsmlp-login` like this:

![jupyter notebook](img/jupyter-start-screen.png)

From there, you can navigate to the `Moneta.ipynb` in your lab repo. At the top is a box that says

```
%run /home/jovyan/work/moneta/moneta/main
```

Click on the text and press return.
That should drop you into Moneta:

![Moneta](img/moneta-home-screen.png)

At top left are some text fields **THAT YOU SHOULD IGNORE AND NOT USE**. The only parts of the UI you'll need in this lab are the list of traces in the box on the right and the "Load Trace" button.

In that box, you should see "traceme". That's the trace you just built. Click "Load Trace" to load it. Click on "Tags" (right hand side) and then click the magnifying glass next to "both". It should zoom into the trace of `main` in `traceme.cpp`. Take a moment to figure out which part of the program each of the lines you see represents.

Check out the [video demo](https://youtu.be/s2lRgt2P_kU).

## Moneta Important Notes

1. **We have included an example usage of the Moneta functions described above in the starter code.** You may need to make minimal changes to do the same for other data structures.
2. Once you are in the Jupyter notebook (`Moneta.ipynb`), enter something random/gibberish in the "Function to start trace at" field if you are using the `START_TRACE()` function. This is because Moneta will begin tracing at either the location of `START_TRACE()` or the function named in this field, whichever comes first. We want it to start tracing at `START_TRACE()`, or it may hit the maximum number of memory accesses before reaching that point.
3. To view the trace for a "tag", like "weights" in the example Moneta code, de-select the "stack" tag and leave the tags for heap and "weights" selected. Then click the small magnifying glass for "weights".
4. You will need to pass the command line options in `config.env` to `code.exe` when running the Jupyter notebook. By default, `code.exe` will run a different dataset. For instance, the "Executable Path" should be `code.exe --dataset cifar100`. This will make sure you are running on cifar100.