# Summer Status Updates
###### tags: `minutes`

## Week 1 (June 2nd, 2022)

Hi Dr. Valvano,

Over the past week we have:
- written [documentation](https://github.com/ut-utp/tm4c/blob/kludge/README.md#development) for setting up a [developer environment](https://github.com/ut-utp/.github/wiki/Dev-Environment-Setup) and made changes to the `tm4c` repo to streamline the build process
    - this has been tested on macOS and Linux; still polishing the Windows flow a bit
- done some work to improve installation and usage for students
    - most of this is in the planning phase: [1](https://github.com/ut-utp/core/issues/149), [2](https://github.com/ut-utp/tui/issues/11), [3](https://github.com/ut-utp/tui/issues/8), [4](https://github.com/ut-utp/tui/issues/7), [5](https://github.com/ut-utp/tui/issues/6), [6](https://github.com/ut-utp/tm4c/issues/7), [7](https://github.com/ut-utp/tui/issues/10)
    - but we have also written better install documentation and [packaged lm4flash](https://github.com/ut-utp/.github/wiki/lm4flash-Binaries) for Windows/macOS/Linux (lm4flash offers an easier way to flash the board the first time than installing openocd)
- [started planning](https://hackmd.io/ulCH3pjVSX-9aOBgOjpuFw) how a flash based memory solution will fit in with the rest of the software stack
    - Pranav made a [prototype](https://github.com/ut-utp/embedded-experiment) of a TM4C flash solution
- Pranav has also [begun work](https://github.com/ut-utp/core/tree/slip_encoding_rpc) on making our board communication solution more robust
- David Gipson (another student on the original senior design team) has been working on a [rewrite of the assembler](https://github.com/ut-utp/assembler/tree/chumsky) that is better structured and performs much better than our initial version; in the past week he's reached the point where this rewrite passes our entire assembler test suite

Over the next week I'm planning to:
- spend some time cleaning up the `core` repo
    - we have some abstractions that, in retrospect,
      can be simplified and there's much that could do with some better documentation
- work with Pranav on adding fault tolerance to our board communication protocol
- begin exploring how tractable adding ICDI support to [probe-rs](https://github.com/probe-rs/probe-rs) will be
    - this is not high priority but I want to begin looking at this because being able to leverage the probe-rs ecosystem will give us:
        - a better user experience (students will not need a separate program to flash the TM4C the first time)
        - a better developer experience (despite our best efforts, openocd + gdb have never been particularly reliable)
        - the ability to run [automated tests](https://crates.io/crates/defmt-test) on the TM4C

We're at the point where you can now run LC3 programs on the TM4C with the caveat that some peripheral support is still missing/bug-prone and board communication is flaky (tends to crash after a few minutes of use). As mentioned, we're working to address this but we thought that it'd be good to get a somewhat working solution in your hands right away so that you can start to experiment with it.

I think Pranav will not be able to make the Zoom meeting today, but I am still available at 7PM if you have questions or want to get set up on macOS (or anything else).

Thank you,
Rahul Butani

# Week 2 (June 9th, 2022)

## Status Update

Didn't send out the status report this week but, just for posterity, I think it would have been:

### Rahul

- Some improvements to the LC3 testing infra (running with an OS actually runs OS_INIT, letting us ensure that tests work in user mode and don't ACV; misc improvements to the macros)
- Some clean up for the peripheral traits (still in flight, but I'm increasingly convinced that not threading the AtomicBool interrupt flags through the interpreter is a good change to make; needing interrupt state to come from the Interpreter is kind of a holdover from the callback based design and is just confusing and cumbersome at this point; removing it from the peripheral traits (and letting individual peripheral impls that need it manage it themselves) seems good because it'll let us get rid of our one lifetime param most everywhere)
- Some fixes/test cases around I/O (the stuff I was yelling about in #lc3-programs)
- An embedded_hal based GPIO impl (decided against using GATs for this, it's [increasingly questionable](https://github.com/rust-lang/rust/pull/96709) what the stabilization story for type GATs is and I'd like to stay on stable. It also doesn't buy us much, with the [macros we have](https://github.com/ut-utp/tm4c/blob/e734f5890e02b2eabc1dc5139d9c181918dc6e86/src/main.rs#L131-L167) it's ergonomic enough for me)
    + happy to have a design discussion about ^ if anyone is interested; I think there are some interesting things around coherence/assuming things about HAL impls
- Random QoL fixes in the TUI (reset terminal even when crashing, hide some panic messages, etc)
- Staring at the RPC impl and the Plan for fault tolerance in...
  horror, wondering what I was thinking
- Staring at the state machine for futures and this issue, thinking about how to simplify/whether we need to
- Didn't do much in the way of actually concrete changes for RPC but I did do some testing:
    + the failure mode for communication with the board _once you launch a session successfully_ is the futures state machine issue described above. This crashes the TUI. Remembering to pause/halt the LC-3 interpreter before exiting the TUI is the workaround for now
    + I don't get crashes anymore _once I have a successful connection_. I ran multiple hour-long sessions with the board just running a program that jumps around in memory
    + This corroborates Pranav's assertion that USB-UART is, in fact, extremely reliable and not prone to dropped bytes/flipped bits as I mistakenly believed

For the coming week, I plan to:
- finish up the dev env fixes / figure out how to get openocd to run right on Windows
- do the benchmarking thing
- clean up the changes I have for peripheral cleanup/tui cleanup and commit them
- finish off the generic gpio impl (interrupts are the remaining bit) and open a PR for that
- work with Pranav on RPC debug
    - as mentioned above, I'd like to get more data on the prefix 0 bytes and decide how we want to mitigate
- do clean up for the other peripheral traits (`interrupt_occured`, etc)
- some codebase cleanup so we don't get a million warnings on every build
- get CI working
- start on other generic peripheral impls (ADC next I think!)

### Pranav

- Finished up a [working Flash implementation](https://github.com/ut-utp/embedded-experiment/commit/1f51e85998ffc071aef2b86d3f12a92c39681ccd) for the TM4C (!)
- Put together some self-contained tests for board communication, to try to figure out what the failure modes are
    + Pranav made a very interesting discovery: the TM4C seems to, sporadically, send a few extra 0 bytes over UART (perhaps on first startup)
    + We're not really sure why this happens (theories are: synchronization between the CTS/RTS lines on the debugger chip and the main chip or something about the UART fifos?)
        * not yet sure if it's just on reset
        * or if it affects all boards or just some
        * or if the host computer is a factor at all
        * anecdotally, I (macOS, TM4C not lm4f) cannot reproduce this issue but Dr. Valvano (on Windows) seems to be able to and so can Pranav (Linux)
        * I'd love to get more data about this; if anyone else still has their TM4C and is willing to test please reach out
    + This is a great find on Pranav's part and it explains the weird "sometimes you have to reset a few times before the TUI will successfully connect" behavior that we're seeing
    + When we met, Pranav and I agreed that mitigating this in software is probably the only tractable solution
        * we discussed some fault tolerant layer ideas, at my urging, in the vein of [this rant](https://github.com/ut-utp/core/blob/2655e618862cac4f131015b4cd05fc048090b60d/device-support/src/rpc/mod.rs)
        * but, seeing that USB UART does seem to be very reliable once you get it working, I'm wondering if we can get away with just special casing a little bit of logic that clears out some zeros on transport startup
            - unsure, definitely something to discuss further
- Put together a [prototype DMA implementation](https://github.com/ut-utp/embedded-experiment/commit/ef6a47ba91a13f2ea208daa6df6325e2906d320e) (!!) and got the board to run all the way up to [4M baud](https://github.com/ut-utp/embedded-experiment/commit/2b8b59cf7dea800ad821ce84cc6235103a161cfa) (!!!)
    + there are still some kinks to be worked out (receive and not send, IIRC; hangs with the TUI) but this is very impressive

For the coming week:
- (not totally sure but I assume RPC, maybe some finishing touches on the flash impl?) @pranav12321
    + entirely up to you but if you're looking to work on flash a bit this week, I'd love to talk a bit around what the embedded-storage/generic plumbing [might look like](https://hackmd.io/ulCH3pjVSX-9aOBgOjpuFw); I think it'd be nice to have a design review at some point; not a priority though
- @pranav12321 I'd love to get your feedback on the generic gpio impl; I can tag you as a reviewer once I open the PR
    + @gipsond too if you're interested; looking for feedback on like, the API/the experience of using it as a person looking to support a new board

### David Gipson

I've reviewed the errors the old assembler produces and I think I've covered everything except:
1) the case where a file contains nothing to assemble (no content, or missing some .orig/.end)
2) The new assembler is also less specific about the reasons that certain tokens are invalid (e.g., where the old assembler expected a number literal, it would point out if you aren't using a valid base prefix [`b`, `#`, `x`]).

My plan for the next week or two is:
1) Address #1 above (should be quick)
2) Set up tests for the error cases, something which was missing from the original.
3) Clean up the new library API so it's tailored to how it's used in the TUI (and other crates? I forget).
4) Go ahead and try merging this branch
5) Make any necessary changes to the other crates.

Addressing #2 (from the top list) would be nice but it just amounts to making the errors nicer; the new assembler still produces errors for any invalid tokens. So even though it's a significant regression, I want to try and merge this branch, then address it soon after.

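
To make the "less specific errors" point concrete, here is a rough sketch of the kind of diagnostic the old assembler gave for number literals: it separates "no valid base prefix" from "prefix fine, digits bad". This is not code from either assembler; `parse_literal` and `LiteralError` are made-up names for illustration only.

```rust
/// Hypothetical error type (not taken from the assembler crate).
#[derive(Debug, PartialEq)]
enum LiteralError {
    /// The token was empty.
    Empty,
    /// The token does not start with one of `b`, `#`, or `x`.
    InvalidBasePrefix(char),
    /// The prefix was valid but the rest did not parse in that base.
    InvalidDigits { radix: u32 },
}

/// Hypothetical helper: parses literals such as `x3000`, `#-5`, or `b101`.
fn parse_literal(token: &str) -> Result<i32, LiteralError> {
    let mut chars = token.chars();
    let prefix = chars.next().ok_or(LiteralError::Empty)?;

    // Map the base prefix to a radix, reporting the offending character
    // when it is not one of the three accepted prefixes.
    let radix = match prefix.to_ascii_lowercase() {
        'b' => 2,
        '#' => 10,
        'x' => 16,
        other => return Err(LiteralError::InvalidBasePrefix(other)),
    };

    // Everything after the prefix must be a valid number in that base.
    i32::from_str_radix(chars.as_str(), radix)
        .map_err(|_| LiteralError::InvalidDigits { radix })
}
```

With this shape, a token like `3000` reports `InvalidBasePrefix('3')` rather than a generic "invalid token", which is roughly the difference described in item 2 of the first list.
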
The plan after merging is:
1) Enhance errors with more specific analysis to match old assembler (#2 from top)
2) Document thoroughly for users, then maintainers
3) Support .EXTERNAL as proposed in Patt Ch. 7.4.2 (stretch goal, but should be fairly quick)
4) Switch to working on the LSP (stretch goal, if I've still got the courage to embark on that)

and I guess I'll fix the ariadne `<unknown>` thing before the PR too. Priority #2.5

## Meeting Summary

I just met with Dr. Valvano and wanted to type up notes while it's still fresh:
- Earlier today I finished up my generic embedded_hal based GPIO impl (well, not interrupts yet but the rest of it) and... it Just Worked:tm:, first try
- I [merged](https://github.com/ut-utp/core/pull/154) in `feat-wasm-support` so we can have the tui build from the same core repos as everything else (@Pranav Rama I'm not sure if this was your issue but I got bitten by the wire format mismatch issues described in [this issue](https://github.com/ut-utp/core/issues/149); it certainly manifests in the same way, just a blank screen, with decode errors in the logs)
- I didn't have time to properly clean up our stuff and make the commits I wanted to so I made `snapshot-06-09` branches on all the repos with a gross commit; I'll delete this later
- I demoed both of ^ for Dr. Valvano and after a little fumbling we got all of ^ (running the TUI, flashing the TM4C, connecting to the board, running blinky) working on Windows :tada:
- Dr. Valvano installed a git client, should make sharing code easier :stuck_out_tongue:

I also showed him:
- how to build and look at the TRAP docs
- how to use the TUI's built in assembler, the file watcher stuff, etc.

All in all, I think it was a good meeting; I think he's warming to the TUI and at this point he has enough to write/debug/flash TM4C programs

He mentioned that he told his 319K students that his plan for the summer is to "learn Rust" :crab:

His one ask was execution speed of the interpreter when running on the TM4C; he's concerned about the compute requirements of the solar panel lab. I showed him a quick demo that demonstrates that execution speed is _at least_ 50KIPS (with the TUI still updating at 5Hz) and I'm, frankly, not concerned about execution speed for what the 302 students will be doing but I _will_ throw together some quick benchmarks (using the clock peripheral). I don't anticipate needing to optimize for perf but if we do need to there's tons of low hanging fruit.

Ultimately, I'm: _cautiously optimistic_

# Week 3 (June 16th, 2022)

## Status Update

### Rahul

Slower week for me, had some other obligations.

- adding some logging functionality for RPC (for debugging)
- wrote some one-off fixes for RPC:
    + specifically:
        * retry sending on decode error (existing)
        * retry sending on timeout
        * ignore wrong response (instead of panicking)
    + the above don't constitute a super principled story for error resilient communication but they seem to work well enough in practice
    + still need to write up an issue/make a PR but it's on the snapshot branch for now
- modified `Transport` to have a `blocking_get` method
    + this is used in the `Controller` only in contexts when we _know_ we're expecting data (everywhere but `tick`)
    + it's a default method, not a breaking change
    + lets us save some CPU cycles when we can shell out to an implementation that leans on OS primitives that block with a timeout instead of spinning
- found a "bug": the interpreter doesn't reset the PSR's user/supervisor bit explicitly and relies on `Memory::reset` doing it
    + this fails on the TM4C since the memory impl's `reset` doesn't actually do anything yet...
- fixed some TUI bugs/annoyances:
    + still need to make PRs for these but they're on the snapshot branch
- investigated binary size/codegen for the generic GPIO impl's `set_state` function
    + I've been looking at running on other boards, specifically AVR parts; one of the things that stood out as a blocker is that some of these MCUs have far less flash/SRAM than the TM4C (for example the atmega328p has 2KB SRAM, 32KB flash)
    + as it stands, `.text` on the TM4C is ~44KB with `-O3`, ~33KB with `-Oz`
    + one of the things that stood out to me is that `Gpio::set_state` weighs in at a hefty 4.5KB
        * adding some `unreachable_unchecked()` calls brings this down a bunch
        * but it's still a few KB on Thumb2 ARM
        * codegen produces flattened jump tables (even with `-Oz` and `-Os`); this means 8 pins * 5 starting states * 4 ending states = 160 entries
            - even on thumb where we have [an instruction](https://developer.arm.com/documentation/dui0473/m/arm-and-thumb-instructions/tbb-and-tbh) that accepts a table of byte sized PC-relative pointers to use as a jump table this still ends up using a decent amount of space
            - this is great for latency (pretty much every branch ends up being <4 instructions executed) but isn't great for code size
            - `-Oz` does some outlining but it still feels like the compiler should be able to collapse cases that are isomorphic
    + indeed, when the generic GPIO impl is instantiated with "type erased" pins rather than 8 unique types the jump tables produced shrink commensurately
        * extracted minimal example is [here](https://rust.godbolt.org/z/4aodn39dY) if anyone wants to play with it
        * ultimately the above example ends up being about 760 bytes on AVR so, for the time being, I'm leaving it as is (on platforms where space is at a premium we can see if using type erased pins helps)
- investigated the weird behavior we're seeing RE: baud rates and the TM4C
    + in short: the moment my machine issues the `ioctl`s to set the ICDI chip to baud rates above 1.5MBaud, I get
      nothing back from it (regardless of what baud rate the main TM4C is actually communicating with the ICDI chip at)
    + need to do further testing with other boards/host machines though
        * 2MBaud+ seems to work fine for Pranav
- found a bug where issuing Load API commands _while the interpreter is running_ crashes, but only when running on the board
    + still need to debug this
    + for now the tui pauses before starting a load (which we may want to do anyways)
- fixed up the blinky program
    + the starting SP we were using was outside what the partial memory impl had mapped
- looked at the Web Serial API a bit
    + definitely suitable for what we want to do but it will require creating `async` versions of everything from the `Transport` and `Control` trait on up (including `Widget::update`)

### Pranav

- Got DMA-based board communication more or less working with the TM4C at 2MBaud (!)
    + send _and_ receive

### David Gipson

- Finished rewriting assembler to not crash under as many circumstances as possible, and instead provide formal error messages
- Wrote automated tests for many basic error cases
- _Next, will:_
    - Expand automated tests to cover a few more complex cases
    - Integrate the new assembler with the rest of the project
    - Document the assembler for maintainers and students
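
For anyone skimming later: the "`blocking_get` is a default method, not a breaking change" item in Rahul's Week 3 notes above can be sketched roughly as below. This is a simplified stand-in, not the real `Transport` trait from `core`; the `get` signature, the `u8` message type, the `timeout` parameter, and `QueueTransport` are all assumptions made for illustration.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical, pared-down stand-in for the real `Transport` trait.
trait Transport {
    /// Non-blocking poll: returns a message if one is ready.
    fn get(&mut self) -> Option<u8>;

    /// Default method: spins on `get` until a message arrives or the
    /// timeout elapses. Because it has a default body, existing impls keep
    /// compiling unchanged; an impl backed by an OS primitive that can
    /// block with a timeout may override it instead of spinning.
    fn blocking_get(&mut self, timeout: Duration) -> Option<u8> {
        let deadline = Instant::now() + timeout;
        loop {
            if let Some(msg) = self.get() {
                return Some(msg);
            }
            if Instant::now() >= deadline {
                return None;
            }
            std::hint::spin_loop();
        }
    }
}

/// Toy transport that only implements `get`; it inherits `blocking_get`.
struct QueueTransport {
    inbox: VecDeque<u8>,
}

impl Transport for QueueTransport {
    fn get(&mut self) -> Option<u8> {
        self.inbox.pop_front()
    }
}
```

In this sketch a caller that knows a reply is coming would use `blocking_get` with a timeout, while something like `tick` would keep using the non-blocking `get`.
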