# CS 591 Sys-Net

[![hackmd-github-sync-badge](https://hackmd.io/RnBY_-2BSSmSsNYsOmIG7A/badge)](https://hackmd.io/RnBY_-2BSSmSsNYsOmIG7A)

## Week 2: Some interesting papers from ATC/OSDI 2022 - Xudong Sun

### Automatic Reliability Testing For Cluster Management Controllers

Fuzz test

### DuoAI

Ivy vs Coq

- What are the advantages of Ivy compared to other verification languages like Coq? I have heard of Coq, but Ivy is new to me.

### RESIN: Memory Leak

#### Previous Tech

- static
- dynamic

"Artificial stupidity" (a pun on artificial intelligence)

Fair idea

- First, if memory usage climbs slowly, something is probably wrong. Second, if the same program runs fine on other machines, the problem must be on your side.
- The false-positive rate is somewhat high.

### Debugging the OmniTable Way

Idea: replay

Insight: Lazy materialization

> SteamDrill introduces lazy materialization as a solution. Rather than materializing an OmniTable during execution, SteamDrill uses deterministic record and replay [10] to capture a log of non-deterministic inputs to the execution. The system uses the log to generate OmniTable state on-demand by instrumenting and re-executing the original execution as necessary to resolve debugging queries. Delaying OmniTable materialization allows SteamDrill to filter OmniTable data before extracting state instead of afterwards.

## Week 4: OSDI / ATC

## Week 5: VLDB

Very Large DataBases

- Engines
- Graphs
- ML, AI

4 Papers Today

### Netherite: Efficient Execution of Serverless Workflows

Write Buffer

### DBOS: a DBMS-oriented operating system

Distributed OS by DBMS

everything is a file -> everything is a table

Database Operating System

- straw - crude test
- wood - run one app
- brick - final stage

scheduler: context switch

file system: page changes, pointers, metadata, etc.

scheduler: FIFO with SQL: `order by`

On par with that Excel operating system from a few years ago.

![](https://i.imgur.com/aBZaqTk.png)

### Sancus: Staleness-Aware Communication-Avoiding Full-Graph Decentralized Training in Large-Scale Graph Neural Networks

It's common to use a graph database for training, for example NebulaGraph for GNNs. The innovation of this paper should focus on how to avoid communication.
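
To make "avoid communication" concrete, here is a minimal sketch of one generic way to trade freshness for fewer broadcasts: peers keep using a cached copy of a worker's embeddings until that copy gets too old. The class name `BroadcastSkipper`, the epoch-based bound `max_staleness_epochs`, and the list-of-floats embeddings are all assumptions made for illustration. This is not the paper's algorithm, only the general idea the title suggests.

```python
class BroadcastSkipper:
    """Hypothetical helper: decide whether to re-broadcast or let peers reuse a stale copy."""

    def __init__(self, max_staleness_epochs):
        # Assumed policy: the cached copy may be reused for at most this many epochs.
        self.max_staleness_epochs = max_staleness_epochs
        self.cached = None          # last embeddings actually sent to peers
        self.epochs_since_send = 0  # age of the cached copy, in epochs

    def step(self, embeddings):
        """Return (embeddings the peers will use, whether a broadcast happened)."""
        too_old = self.epochs_since_send >= self.max_staleness_epochs
        if self.cached is None or too_old:
            # Pay the communication cost and refresh the cache.
            self.cached = list(embeddings)
            self.epochs_since_send = 0
            return self.cached, True
        # Skip the broadcast; peers keep training on the stale copy.
        self.epochs_since_send += 1
        return self.cached, False


if __name__ == "__main__":
    skipper = BroadcastSkipper(max_staleness_epochs=2)
    for epoch in range(5):
        used, sent = skipper.step([float(epoch)])
        print(epoch, used, "broadcast" if sent else "skipped")
```

With a bound of 2, only every third epoch triggers a broadcast in this sketch. A real system would also need a rule for when stale embeddings hurt convergence too much, which this example does not model.
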
The paper states the challenge it targets: “Despite the promising performance, the major challenge that limits the adoption of GNNs to large-scale graphs lies in the inability to utilize all data in finite time and the scalability of the algorithm itself.” ([Peng et al., 2022, p. 1937](zotero://select/library/items/QVR4CWIF)) ([pdf](zotero://open-pdf/library/items/U43KUQTT?page=1&annotation=K5ZBHVC8))

## Week 6: WISC Storage

From Prof. Andrea Arpaci-Dusseau and Prof. Remzi Arpaci-Dusseau (ADSL Lab)

### Scale and Performance in a Filesystem Semi-Microkernel

uFS

![](https://i.imgur.com/mxfqLDJ.png)

### Can Applications Recover from fsync Failures?

No

## Week 7:

### Starvation in End-to-End Congestion Control

- Linux TCP: loss-based congestion control
- Delay-convergent congestion control
  - delay bias leads to starvation
  - unfair resource allocation causes starvation
- Starvation phenomenon: unfair bandwidth shares come from flows responding differently
- Delay
  - APs and Wi-Fi add non-congestive delay
- Delay-convergent CCAs have similar delays for different link rates
- Since similar delays correspond to different link rates, RTT alone cannot be used for congestion control
- Fixes
  - use ECN
  - specify the link rate, like QoS (set net speed)

### NeuroScaler: Neural Video Enhancement at Scale

#### Current Solution:

- Scale down to low resolution
- Super-resolution on the low-resolution stream

1. How to accelerate neural enhancement
2. How to schedule it on a cluster of GPU instances

#### Observation

- Choose good frames (key frames) by resolving dependencies
- Maximize the impact of the residual
- Good for super-resolution

### RF-Protect: Privacy against Device-Free Human Tracking

#### Indoor Radar-tracking

- FMCW for distance
- Antenna array for angle (phase)

#### {Distance,Angle} Spoofing

Let a ghost do the spoofing

#### ML Part

How to create reflection?
Move in a realistic way

![](https://i.imgur.com/oXzJgiS.png)

## Week 8: Mobisys

### AutoCast: scalable infrastructure-less cooperative perception for distributed collaborative driving

## Week 9

### Determining non-deterministic events for better idle state prediction

#### Background

##### Idle states

- Target residency
- Exit latency

Choosing the right idle state saves energy and preserves performance.

##### Ticked systems & Tickless systems

If a process is waiting for IO, then sleep it.

- TEO Governor
  - recompute idle duration
  - measure the accuracy (Hit / Miss / Early Hit)

#### Is more history better?

No.

What is the prediction state?

For different systems or user behaviour, do we need to adjust the learning rate as well?

#### Results

Being less wrong is better than being more wrong

#### Challenges

- Learning rate
- Variance across architectures
  - This is based on IBM PowerPC. x86 has reported no significant improvement.
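
The idle-state notes above (target residency, exit latency, learning rate, hit accounting) can be tied together in a small sketch. Everything concrete here is an assumption for illustration: the three-entry state table, the exponential-moving-average predictor, and the definition of a "hit" as "the predicted state equals the state that the actual idle time would have selected". It is not the TEO governor's code, and it does not reproduce the Hit / Miss / Early Hit definitions from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    target_residency_us: int  # minimum idle time for the state to pay off
    exit_latency_us: int      # time needed to wake up from the state


# Hypothetical state table, ordered from shallowest to deepest.
STATES = [
    IdleState("shallow", target_residency_us=1, exit_latency_us=1),
    IdleState("medium", target_residency_us=100, exit_latency_us=20),
    IdleState("deep", target_residency_us=1000, exit_latency_us=200),
]


def select_state(idle_us, latency_limit_us, states=STATES):
    """Return the deepest state that fits the idle time and the latency limit."""
    chosen = states[0]  # fall back to the shallowest state
    for state in states:
        fits_idle = state.target_residency_us <= idle_us
        fits_latency = state.exit_latency_us <= latency_limit_us
        if fits_idle and fits_latency:
            chosen = state
    return chosen


def update_prediction(previous_us, observed_us, learning_rate):
    """Exponential moving average: a higher learning rate forgets history faster."""
    return (1.0 - learning_rate) * previous_us + learning_rate * observed_us


def hit_rate(observed_idle_us, learning_rate, latency_limit_us=1_000_000):
    """Fraction of idle periods where the predicted state matches the ideal one."""
    prediction = float(observed_idle_us[0])
    hits = 0
    for actual in observed_idle_us:
        predicted_state = select_state(prediction, latency_limit_us)
        ideal_state = select_state(actual, latency_limit_us)
        hits += predicted_state == ideal_state
        prediction = update_prediction(prediction, actual, learning_rate)
    return hits / len(observed_idle_us)


if __name__ == "__main__":
    trace = [2000, 50, 2000, 50]  # made-up idle durations in microseconds
    for lr in (0.1, 0.5, 1.0):
        print(f"learning_rate={lr}: hit rate {hit_rate(trace, lr):.2f}")
```

Running the sketch on different made-up traces is a quick way to see why the learning rate is listed as a challenge: the value that works for one wake-up pattern can do poorly on another.
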