# CSE141L Lab 3 Caching Optimizations Worksheet2

Name: __________________________ Student ID: ____________________

# Instructions

* Complete this worksheet while reading and working through the lab write-up. The worksheet doesn't make sense without the lab.
* The point values are listed for each question. Altering the size of the cells will cost you 1 point. The write-up portion of the lab is 30% of your total points for the lab, as shown in the lab's README.md.

## Tier 2: Optimizing calc_grads

#### P1 (4pt)
Change the order of loops from `b i n` to `b n i` in the triply-nested loop in `fc_layer_t::calc_grads` and report the speedup.

Speedup after loop reordering : _______________

#### P2 (4pt)
Block loop `n` in the triply-nested loop in `fc_layer_t::calc_grads` with different step sizes and fill out the following table.

| Function   | Step size | Base implementation time | Blocked implementation time | Speedup |
|------------|-----------|--------------------------|-----------------------------|---------|
| calc_grads | _________ | ________________________ | ___________________________ | _______ |
| calc_grads | _________ | ________________________ | ___________________________ | _______ |
| calc_grads | _________ | ________________________ | ___________________________ | _______ |
| calc_grads | _________ | ________________________ | ___________________________ | _______ |
| calc_grads | _________ | ________________________ | ___________________________ | _______ |

#### P3 (4pt)
In a single line plot, plot performance vs. block size for blocking the loop `n` in the triply-nested loop in `fc_layer_t::calc_grads` and report the block size that gives the maximum speedup. Block size is the independent variable.

```
Your graph here
```

Best block size : _____________________

## Tier 3: Applying More Optimizations

#### P1 (5pt)
Give a brief description of two additional loops you tried blocking. Report the speedup you achieved for each one.

```
Your answer here
```

#### P2 (5pt)
Give a brief description of an additional optimization you implemented to speed up training.

```
Your answer here
```

#### P3 (2pt)
Illustrate the effect of one of your tier 3 optimizations with a screen capture from moneta.

```
Your answer here
```

#### P4 (1pt)
Were there any differences in the miss rate observed using the performance counters and moneta? What could contribute to the differences? (A brief answer is fine)

```
Your answer here
```
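
As a reference for the Tier 2 loop transformations, the sketch below shows the `b i n` order, the `b n i` order, and a version that blocks loop `n`. It is only an illustration under stated assumptions, not the lab's code: the flat arrays, their layout, the function names (`calc_grads_bin`, `calc_grads_bni`, `calc_grads_blocked`), and the `block` parameter are all hypothetical, and the real `fc_layer_t::calc_grads` works on the lab's own tensor types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed (hypothetical) layouts, chosen so that `i` is the contiguous index:
//   act_grad[b * N + n], weights[n * I + i], grads_out[b * I + i]

// Original order b, i, n: the inner loop strides through weights by I floats.
void calc_grads_bin(const std::vector<float>& act_grad,
                    const std::vector<float>& weights,
                    std::vector<float>& grads_out,
                    std::size_t B, std::size_t I, std::size_t N) {
    for (std::size_t b = 0; b < B; ++b)
        for (std::size_t i = 0; i < I; ++i)
            for (std::size_t n = 0; n < N; ++n)
                grads_out[b * I + i] += act_grad[b * N + n] * weights[n * I + i];
}

// Reordered b, n, i: the inner loop walks weights and grads_out contiguously.
void calc_grads_bni(const std::vector<float>& act_grad,
                    const std::vector<float>& weights,
                    std::vector<float>& grads_out,
                    std::size_t B, std::size_t I, std::size_t N) {
    for (std::size_t b = 0; b < B; ++b)
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t i = 0; i < I; ++i)
                grads_out[b * I + i] += act_grad[b * N + n] * weights[n * I + i];
}

// Loop n blocked with step `block`: each tile of weights rows
// [nn, n_end) is reused across every b before moving to the next tile.
void calc_grads_blocked(const std::vector<float>& act_grad,
                        const std::vector<float>& weights,
                        std::vector<float>& grads_out,
                        std::size_t B, std::size_t I, std::size_t N,
                        std::size_t block) {
    assert(block > 0);  // a zero step would never advance the tile loop
    for (std::size_t nn = 0; nn < N; nn += block) {
        const std::size_t n_end = std::min(nn + block, N);  // handle a partial last tile
        for (std::size_t b = 0; b < B; ++b)
            for (std::size_t n = nn; n < n_end; ++n)
                for (std::size_t i = 0; i < I; ++i)
                    grads_out[b * I + i] += act_grad[b * N + n] * weights[n * I + i];
    }
}
```

All three versions add the same terms into each `grads_out` element in the same `n` order, so they should produce the same result; only the memory access pattern changes. Which step size works best depends on the cache sizes and the layer dimensions, which is what the Tier 2 measurements are meant to show.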