2022-02-03 Pernos.co demo

---
title: 2022-02-03 Pernos.co demo
tags: pernos.co
---

# Pernos.co demo

This document was used to drive a demonstration of pernos.co. You can see that demo on YouTube, here: https://youtu.be/uTc7KCBbVFI?t=87

[toc]

## Links to pernos.co sessions

{%hackmd gWG5XO-XRK6GdyVkMBqV4A %}

## Basics

### Point and click

Here is the initial presentation for Pernos.co:

```
┌---------------┬----------┐
|               |          |
| stdout/stderr |          |
|               |  Source  |
├---------------┤   Code   |
|  call stack   |          |
├---------------┤          |
|    alerts     |          |
└---------------┴----------┘
```

Scroll up to output of interest in stdout/stderr. Click on a character in it.

You should see several changes:

```
┌---------------┬----------┐
| stdout/stderr |          |
├---------------┤          |
|  call stack   |  Source  |
├---------------┤   Code   |
|    alerts     |          |
├---------------┤          |
|   dataflow    |          |
└---------------┴----------┘
```

* Some portion of the stdout/stderr gets highlighted.
* The source code and call stack panels change their focus. This reflects the point in the control flow where that output was emitted.
* A new panel pops up, "Dataflow for to `$C`" (for whatever output character `$C` you clicked)

### Tool panel maintenance

I have found pernos.co can sometimes be quick to inject new panels on the left side, which can quickly become overcrowded. So, familiarize yourself with the "Add Column" and "Remove Column" buttons on the right toolstrip.

We can drag the Dataflow panel over to the right.

### The Dataflow Tool

Each step in the dataflow is made up of a pair of lines: the even lines are the source locations where the memory update occurred, and the odd lines are the actual variable (i.e. place) that was updated, and the new and old values. (The old value is rendered as ~~struck out~~.)

As we click through steps in the Dataflow, we see the source view adjust to each step we find.

But, oh golly, now have we lost our place? Where were we before in the source code?

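
To make the idea of a chain of dataflow steps concrete, here is a minimal sketch of a program whose output character passes through several memory updates before it is printed. Everything in it (`copy_chain`, the variable names, the character `Q`) is hypothetical and not taken from the demo sessions. It assumes an unoptimized build, since an optimizer is free to remove these copies, and it makes no claim about exactly how Pernos.co would render each step.

```rust
/// Copies `value` through a chain of places; each assignment is one
/// memory update of the kind a dataflow step would point at.
/// (Hypothetical example; assumes a debug build so the copies survive.)
fn copy_chain(value: u8) -> u8 {
    let x = value; // initialize: x = <value>
    let mut a = [0u8; 4];
    let i = 2;
    a[i] = x; // copy: a[i] = x
    let mut slot = 0u8;
    let p = &mut slot;
    *p = a[i]; // copy: *p = a[i]
    let z = *p; // copy: z = *p
    z
}

fn main() {
    let z = copy_chain(b'Q');
    // The character written here is the kind of output you would
    // click on in the stdout/stderr panel.
    println!("{}", z as char);
}
```

Clicking the printed character and stepping backwards through the dataflow would, under those assumptions, walk from `z` back through `*p` and `a[i]` to `x`.
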
### Browser Interaction

Luckily, pernos.co largely tries to behave like a proper browser: The back button usually does the right thing, and takes you back to where you just were, in terms of the current source code flow.

(However, I have found it won't revert time in terms of what data you have focused in the dataflow view, for better or for worse. You have to reconstruct that by hand.)

### Multiple Models of Time

It's also good to keep in mind the two directions you may want to go in: backwards in time itself, according to the raw dataflow of *where* memory updates occurred, and backwards *up the call stack* at each of those points in time.

You can visualize it something like this:

```mermaid
graph LR
  main["main function"]
  main --->|invoke| a0
  main --> b0
  main --> c0
  main --> d0
  mem0["x = <value>;"]
  mem1["a[i] = x;"]
  mem2["*p = a[i];"]
  mem3["z = *p;"]
  value((value)) -.->|initialize| mem0 -.->|copy| mem1 -.->|copy| mem2 -.->|copy| mem3
  subgraph Loading and Initialization
  a0 --> a1 --> a2 --> mem0
  end
  subgraph Manipulation
  b0 --> b1 --> b2 --> mem1
  end
  subgraph More Manipulation
  c0 --> c1 --> c2 --> mem2
  end
  subgraph Output Generation
  d0 --> d1 --> d2 --> mem3
  end
```

(One could imagine further elaborating this picture to include the outputs to stdout/stderr, which is yet another series of events ordered in time that we can navigate.)

The memory updates happen at the "mem" points, written in the form `A = B;`. But those points of execution are at the "top" (hot end) of a call stack, and the actual level of detail that you the programmer care about is likely further *up* the stack.

So it can be useful to *scan* the call stack with your eyes as you step backward through the data flow chain formed by the dotted lines.

For example, we should start by scanning the call stack for the very last memory update, and see if we can find a function call that seems relevant to what we're investigating.

It's also entirely possible to track back "too far" if you're looking just at the dataflow.
In the picture above, the kinds of bugs we're concerned with usually arise in the middle, during *Manipulation* and *More Manipulation*. The parts where strings are initialized (e.g. due to lexing the program) are unlikely to tell us much about bugs in name resolution.

### Notebook Tool

When traversing an unfamiliar source base in pernos.co, I find it useful to use the *Notebook* tool to log control-flow points of interest (along with any attached notes on them).

Let's open that now.

The current focus is presented in the Notebook, dimly. If you want to save the current focus as an entry, take the mouse and click its checkbox.

This will preserve that focus, like a bookmark: If we browse elsewhere in the control flow, we can always click this notebook entry to return to this focus point (in terms of what point in the control flow of the program we are focusing on; it won't reset the Pernos.co UI to its state at the time we saved this focus entry).

Try scrolling back through the source code to see if there's another source location worth clicking.

### Callees Tool

If you want to see all the immediate subroutines that were called from here and what they returned, use the Callees tool.

### Local Variables Tool

If there's local state you want to inspect, you can bring up the Local Variables tool.

### Search Box

Much like modern web browsers, Pernos.co has a single search box, sitting at the top of the interface. You can click it, or type Ctrl-S to switch the cursor to it.

You put in text, and it tries to figure out what you were asking for: function names, source file names, pernos.co tool names, or expressions to evaluate in the current focus (which you can then select to add them to the Notebook tool).

### Dataflow vs Computation

Note that the dataflow view is only going to handle direct copies at most. As soon as there's any interesting logic, you'll need to make that connection yourself.
(E.g., how `report_ambiguity_error` converts a given `AmbiguityError` into a `DiagnosticBuilder` that can be emitted.)

**But**: you can get Dataflow views on those intermediate values; it is not limited to characters in the stdout/stderr tool.

### Data Structure Rendering

Also: Note that when state is boxed, you need to take care in understanding the difference between dataflow for `s: Box<Struct>` versus the dataflow for some field `s.x`: quite often, you'll see totally different chains of memory accesses for each.

This should not surprise us: Part of the whole point of boxing is to avoid memory traffic. But this means that if you jump into looking at just the chain of dataflow operations for `s.x`, you may miss insights garnered from how `s` itself was moved around.

* In Pernos.co's interface, this distinction between `s` and `s.x` is shown in something like:

    ```
    ▶︎ ambiguity_error … ::AmbiguityError* @0x7fca2c050e60=0x7fca2c050e60
    ```

    and you can click the triangle to expand it into

    ```
    ▼ ambiguity_error … ::AmbiguityError* @0x7fca2c050e60=0x7fca2c050e60
      ▶︎ *0x7fca2c050e60={b1=0x7fca2c217a18, b2=0x7fca2c217a60, ident={name= ({private=1470}), span={base_or_index=214, len_or_tag=3, ctxt_or_zero=0}}, kind=GlobVsGlob, misc1=None, misc2=None}
    ```

    and so on (continuing to click ▶︎ to expand the hidden state).

### Customizing Notebook Entries

Each entry in the Notebook can have a custom note attached.

It can also get a custom color, which may be useful when you want to visualize where that focus falls in the control flow presentation of other tools.

### Source Code View

Each line of source code has a color in the leftmost column and a line number. If that line is executed in the current focus, the leftmost color will be non-white.

* If you click that non-white leftmost color, you get a tool pop up, "Executions of file:line *in this call*" (emphasis added).
* If you click the line number instead, you get a tool pop up, "Executions of file:line", which is (implicitly) *across the whole program execution*.

### Merging Views

As mentioned above, several of the tools present a projection of events in the forward flow of time, identifying certain points of interest. The "Callees" tool is one example of this. The "stdout/stderr" tool is another.

One advanced feature of Pernos.co that I am myself still trying to understand is *merging* such views: You can take the "Callees" tool and drop it onto the "stdout/stderr" tool, and you get a hybrid view in one panel, ordered according to the total ordering of the event history captured in the trace.

### gdb tool

If you find Pernos.co's renderings of local variables weak, or if you are a gdb expert, you may want to try the `gdb` tool, which runs an actual gdb attached to a process stopped at the current focus.

So you can do the usual stuff from the `(pernosco)` prompt in the `gdb` tool, like printing local variables with `print local`.

* This is especially important when it comes to interpreting things like pointers to data arrays, which Pernos.co cannot automatically determine how to render in the general case, and so to inspect such things you'll really want to use the `gdb` tool.
* (I would think that control flow operations like single stepping or `reverse-finish` would work, and the `gdb` tool *accepts* them, but they did not work smoothly for me. So, maybe avoid those for now. I'm going to check with the pernos.co developer about the gap between my expectations and reality there.)
* Also, support for Rust expression evaluation in gdb is somewhat limited today (e.g. you can only invoke already monomorphized things, and have to explicitly write out how any generics need to be instantiated). This complicates having as much fun with `(pernosco) call EXPR` here as I was hoping.
* However, there are a lot of interesting extensions documented in the Pernosco Tips, which are accessed via the button that has a circled question mark.
    * The one that caught my eye was: "In some contexts expressions can use the return value of a function call via `$ret`"; that's the sort of thing that is only possible with an oracle, which is what Pernos.co is acting as here.
    * It also says "gdb and python scripts associated with the project and system libraries are automatically imported into gdb."
        * but I need to double-check that; my first look at printing a vec shows `alloc::vec::Vec<alloc::string::String, alloc::alloc::Global> {buf: alloc::raw_vec::RawVec<alloc::string::String, alloc::alloc::Global> {ptr: core::ptr::unique::Unique<alloc::string::String> {pointer: 0x7fca2c0c0ef0, _marker: core::marker::PhantomData<alloc::string::String>}, cap: 4, alloc: alloc::alloc::Global}, len: 1}`, and I thought our gdb extensions do a better job for `Vec` than that.
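
As an aid to reading the raw `Vec` output quoted above: for a `Vec<String>`, the `len` and `cap` fields in that dump correspond to what `Vec::len` and `Vec::capacity` report from safe code. The sketch below is a hypothetical stand-alone program, not code from the demo. The helper name `len_and_capacity` and the pushed string are made up for illustration, and the capacity after one push depends on the standard library's allocation strategy, so no particular value is promised.

```rust
/// Returns (len, capacity) for a vector of strings, i.e. the two numbers
/// that appear as `len` and `cap` in the raw debugger rendering.
/// (Hypothetical helper for illustration only.)
fn len_and_capacity(v: &Vec<String>) -> (usize, usize) {
    (v.len(), v.capacity())
}

fn main() {
    let mut names: Vec<String> = Vec::new();
    names.push(String::from("example"));
    let (len, cap) = len_and_capacity(&names);
    // `len` counts initialized elements; `cap` is how many fit before
    // the buffer has to be reallocated.
    println!("len = {len}, capacity = {cap}");
}
```

So a dump showing `len: 1` with a larger `cap` describes a vector holding one string, with spare room in its buffer.
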