# Boosting Performance: Optimizing Directory Extent Block Access ## Legacy design The current design: * New file/ directory add into parent directory find the latest emply file sector and store it * Remove a file/ directory Set the inode to 0 and move the following sector forward. ## New design Inspired by ext4, we added a nr_blk field to record the number of contiguous blocks within the file. ### [Ext4 dir entry info](https://elixir.bootlin.com/linux/v6.14.8/source/fs/ext4/ext4.h#L2328) * __le32 inode; /* Inode number */ * __le16 rec_len; /* Directory entry length */ * __u8 name_len; /* Name length */ * __u8 file_type; /* See file type macros EXT4_FT_* below */ :::info e.g. ``` 01091000 02 00 00 00 0c 00 01 02 2e 00 00 00 02 00 00 00 |................| 01091010 0c 00 02 02 2e 2e 00 00 0b 00 00 00 14 00 0a 02 |................| 01091020 6c 6f 73 74 2b 66 6f 75 6e 64 00 00 0c 00 00 00 |lost+found......| 01091030 10 00 05 02 74 65 73 74 31 00 00 00 01 20 00 00 |....test1.... ..| 01091040 10 00 05 02 74 65 73 74 32 00 00 00 02 20 00 00 |....test2.... ..| 01091050 10 00 05 02 74 65 73 74 33 00 00 00 03 20 00 00 |....test3.... ..| 01091060 98 0f 05 02 74 65 73 74 34 00 00 00 00 00 00 00 |....test4.......| 01091070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| ``` lost+found: ``` __le32 inode =0x00 00 00 0b __le16 rec_len = 0x0014 __u8 name_len = 0x0a(10) __u8 file_type = 0x02 file_name = 6c 6f 73 74 2b 66 6f 75 6e 64 ``` -> empty space = 0x00 00 (len = 2) empty len = 0x14 - 0x04(inode) - 0x02(rec_len) - 0x01(name_len) - 0x01(file_type) - 0x0a(file_name) = 0x02 test1: ``` __le32 inode =0x00 00 20 01 __le16 rec_len = 0x0010 __u8 name_len = 0x05(16) __u8 file_type = 0x02 file_name = 74 65 73 74 31 ``` -> empty space = 0x00 00 00 (len = 3) empty len = 0x10 - 0x04(inode) - 0x02(rec_len) - 0x01(name_len) - 0x01(file_type) - 0x05(file_name) = 0x03 ::: When removing a file or directory in Ext4, the filesystem merges the freed space with the previous to reduce fragmentation. This way, the space left behind can be efficiently reused. Later, when a new file or directory is created, Ext4 scans the directory entries from the beginning — a first-fit strategy — to find available space. If it encounters a free segment that is larger than the requested size, Ext4 will sperate from this segment. ### simplefs modify ```diff struct simplefs_file { uint32_t inode; + uint32_t nr_blk; char filename[SIMPLEFS_FILENAME_LEN]; }; struct simplefs_dir_block { + uint32_t nr_files; struct simplefs_file files[SIMPLEFS_FILES_PER_BLOCK]; }; ``` * simplefs_file->nr_blk * record the number of blocks by following this file * e.g * nr_blk = 1 -> only this file * nr_blk = 3 -> this + 2 empty simplefs_file * simplefs_dir_block->nr_files * number of files in the ==simplefs_file[]== ## limitation 1. The maximum number of files allowed per directory is reduced to 30,600 (for 200 M image file) 2. Listing operation is slower compared to the legacy method 3. Creating operation is slower compared to the legacy method ## Modify code ### code structure ``` (simplefs_file_ei_block) (simplefs_extent) eblock---- nr_files |-- nr_files (NEW) | | |-- simplefs_extent extents[ei] ---- ee_block | |-- ee_len | (point to simplefs_dir_block) (simplefs_file) |-- ee_start ---- nr_files (NEW) |-- nr_blk (NEW) | | |-- simplefs_file files[fi] ---- inode | |-- filename ``` ### simplefs_create flow ```flow st=>start: simplefs_create e=>end: end get_inode=>operation: get new inode fd_free_in_fi_ei_blk=>operation: free blk in "file_ei_block.extents" chk_alloc_new_dir_blk=>condition: alloc new simplefs_dir_block alloc_new_dir_blk=>operation: alloc a new one find_free_dir_blk=>operation: free bi in simplefs_dir_block find_free_simplefs_file=>operation: free fi in simplefs_file fillin=>operation: simplefs_set_file_into_dir first_free=>operation: first nr_blk > 1 set_nxt=>operation: set next(fi + 1) nr_blk = previous - 1 set_cur=>operation: set current(fi ) nr_blk = 1 st->get_inode->fd_free_in_fi_ei_blk->chk_alloc_new_dir_blk chk_alloc_new_dir_blk(yes)->alloc_new_dir_blk->find_free_dir_blk->find_free_simplefs_file->fillin->first_free->set_nxt->set_cur->e chk_alloc_new_dir_blk(no)->find_free_dir_blk ``` ### simplefs_remove_from_dir ```flow st=>start: simplefs_remove_from_dir e=>end: end find_file=>operation: find fi in simplefs_file clear_blk=>operation: set inode and merge into previous blk st->find_file->clear_blk->e ``` ## Test the performance * Remove 30600 files: * legacy method ``` 22062.16 msec task-clock # 0.996 CPUs utilized 1795 context-switches # 81.361 /sec 7 cpu-migrations # 0.317 /sec 4747 page-faults # 215.165 /sec 51268295368 cycles # 2.324 GHz 45654883366 instructions # 0.89 insn per cycle 10488643855 branches # 475.413 M/sec 15951697 branch-misses # 0.15% of all branches 22.157212315 seconds time elapsed 0.147259000 seconds user 21.875822000 seconds sys ``` * use block record ``` 14749.43 msec task-clock # 0.985 CPUs utilized 3707 context-switches # 251.332 /sec 8 cpu-migrations # 0.542 /sec 4725 page-faults # 320.351 /sec 35578980636 cycles # 2.412 GHz 41977131884 instructions # 1.18 insn per cycle 9800763267 branches # 664.484 M/sec 6998037 branch-misses # 0.07% of all branches 14.968838411 seconds time elapsed 0.101744000 seconds user 14.571970000 seconds sys ``` * create file * legacy ``` 102618.47 msec task-clock # 0.742 CPUs utilized 135243 context-switches # 1.318 K/sec 48370 cpu-migrations # 471.358 /sec 3855991 page-faults # 37.576 K/sec 190008162916 cycles # 1.852 GHz 137070979481 instructions # 0.72 insn per cycle 28270105922 branches # 275.488 M/sec 561816180 branch-misses # 1.99% of all branches 138.242576340 seconds time elapsed 20.828307000 seconds user 85.203551000 seconds sys ``` * use block record ``` 112542.76 msec task-clock # 0.759 CPUs utilized 133884 context-switches # 1.190 K/sec 50032 cpu-migrations # 444.560 /sec 3856118 page-faults # 34.264 K/sec 212785219291 cycles # 1.891 GHz 141859239946 instructions # 0.67 insn per cycle 28871184080 branches # 256.535 M/sec 569827259 branch-misses # 1.97% of all branches 148.324364965 seconds time elapsed 20.976661000 seconds user 94.894073000 seconds sys ``` * list all files * legacy ``` 165.21 msec task-clock # 0.968 CPUs utilized 15 context-switches # 90.796 /sec 3 cpu-migrations # 18.159 /sec 2220 page-faults # 13.438 K/sec 389961636 cycles # 2.360 GHz 577849909 instructions # 1.48 insn per cycle 108154959 branches # 654.666 M/sec 315500 branch-misses # 0.29% of all branches 0.170592822 seconds time elapsed 0.073521000 seconds user 0.092742000 seconds sys ``` * use block record ``` 162.95 msec task-clock # 0.966 CPUs utilized 17 context-switches # 104.326 /sec 3 cpu-migrations # 18.410 /sec 2215 page-faults # 13.593 K/sec 377643105 cycles # 2.318 GHz 570636788 instructions # 1.51 insn per cycle 106418855 branches # 653.075 M/sec 302796 branch-misses # 0.28% of all branches 0.168633385 seconds time elapsed 0.081990000 seconds user 0.081889000 seconds sys ``` * Non-linear file removal, including recreating the same number of files * legacy ``` Performance counter stats for 'bash ../ut_script/random_remove_and_create.sh': 245107.46 msec task-clock # 0.930 CPUs utilized 235366 context-switches # 960.256 /sec 3848 cpu-migrations # 15.699 /sec 7676129 page-faults # 31.317 K/sec 447994787202 cycles # 1.828 GHz 311853053406 instructions # 0.70 insn per cycle 64490868435 branches # 263.113 M/sec 1103054966 branch-misses # 1.71% of all branches 263.676798169 seconds time elapsed 41.648814000 seconds user 209.926699000 seconds sys ``` * use block record ``` 198258.26 msec task-clock # 0.941 CPUs utilized 208116 context-switches # 1.050 K/sec 3557 cpu-migrations # 17.941 /sec 7737611 page-faults # 39.028 K/sec 383803751737 cycles # 1.936 GHz 269416592342 instructions # 0.70 insn per cycle 54192972144 branches # 273.345 M/sec 1046335847 branch-misses # 1.93% of all branches 210.784186080 seconds time elapsed 39.877310000 seconds user 164.927539000 seconds sys ```