# MySQL Server + Benchbase ## BenchBase https://github.com/cmu-db/benchbase ``` git clone --depth 1 https://github.com/cmu-db/benchbase.git cd benchbase ``` Modify src/main/resources/benchmarks/auctionmark/ddl-mysql.sql to fix a potential error in the auctionmark benchmark that occurs in newer versions of MySQL. ```diff diff --git a/src/main/resources/benchmarks/auctionmark/ddl-mysql.sql b/src/main/resources/benchmarks/auctionmark/ddl-mysql.sql index 110bfdf5..0ca47c64 100644 --- a/src/main/resources/benchmarks/auctionmark/ddl-mysql.sql +++ b/src/main/resources/benchmarks/auctionmark/ddl-mysql.sql @@ -215,8 +215,8 @@ CREATE TABLE item ( i_num_images bigint, i_num_global_attrs bigint, i_num_comments bigint, - i_start_date timestamp DEFAULT '1970-01-01 00:00:01', - i_end_date timestamp DEFAULT '1970-01-01 00:00:01', + i_start_date timestamp DEFAULT '2000-01-01 00:00:01', + i_end_date timestamp DEFAULT '2000-01-01 00:00:01', i_status int DEFAULT 0, i_created timestamp DEFAULT CURRENT_TIMESTAMP, i_updated timestamp DEFAULT CURRENT_TIMESTAMP, ``` ```sh ./mvnw clean # ./mvnw package -P postgres ./mvnw package -P mysql ``` ```sh cd target # tar xvzf benchbase-postgres.tgz tar xvzf benchbase-mysql.tgz ``` ## MySQL Download MySQL ([8.0.39](https://github.com/mysql/mysql-server/archive/refs/tags/mysql-8.0.39.tar.gz)) and extract contents. ``` wget https://github.com/mysql/mysql-server/archive/refs/tags/mysql-8.0.39.tar.gz tar -xzf mysql-8.0.39.tar.gz cd mysql-server-mysql-8.0.39 ``` ### Prerequisites (Package Manager) Tested on Ubuntu 24.04. ``` sudo apt install -y build-essential make bison libssl-dev libudev-dev libtirpc-dev pkg-config ``` ### Prerequisites (Build from source) This may be necessary for building on a supercomputer cluster, or in an environment where you do not have root access. At least `make`, `gcc`/`clang` and `m4` should be installed. Or you will need to build them yourself as well. ``` sudo apt install -y build-essential make m4 ``` (In mysql-server-mysql-8.0.39 directory) ``` mkdir downloads cd downloads mkdir usr ``` (In downloads directory) #### Bison ``` wget https://ftp.gnu.org/gnu/bison/bison-3.8.2.tar.gz tar -xzvf bison-3.8.2.tar.gz cd bison-3.8.2 ./configure --prefix=$(pwd)/../usr make -j`nproc` make install cd .. ``` #### OpenSSL ``` wget https://github.com/openssl/openssl/releases/download/openssl-3.0.13/openssl-3.0.13.tar.gz tar -xzvf openssl-3.0.13.tar.gz cd openssl-3.0.13 ./Configure --prefix=$(pwd)/../usr make -j`nproc` make install cd .. ``` #### libtirpc ``` wget -O libtirpc-1.3.5.tar.bz2 https://sourceforge.net/projects/libtirpc/files/libtirpc/1.3.5/libtirpc-1.3.5.tar.bz2/download tar -xf libtirpc-1.3.5.tar.bz2 cd libtirpc-1.3.5 ./configure --prefix=$(pwd)/../usr --disable-gssapi make -j`nproc` make install cd .. ``` #### patchelf ``` wget https://github.com/NixOS/patchelf/releases/download/0.18.0/patchelf-0.18.0.tar.gz tar -xzvf patchelf-0.18.0.tar.gz cd patchelf-0.18.0 ./configure --prefix=$(pwd)/../usr make -j`nproc` make install cd .. ``` #### ncurses ``` wget https://ftp.gnu.org/gnu/ncurses/ncurses-6.3.tar.gz tar -xzvf ncurses-6.3.tar.gz cd ncurses-6.3 ./configure --prefix=$(pwd)/../usr make -j`nproc` make install cd .. ``` ### CMake First, modify CMakeLists.txt as follows, so that the TSan build is optimized with O2. This is important for benchmarking if we want to compare performance against mysqld compiled without TSan, which is configured to be compiled with O2. ``` 1110,1111c1110,1111 < STRING_APPEND(CMAKE_C_FLAGS " -O1 -fno-inline") < STRING_APPEND(CMAKE_CXX_FLAGS " -O1 -fno-inline") --- > # STRING_APPEND(CMAKE_C_FLAGS " -O1 -fno-inline") > # STRING_APPEND(CMAKE_CXX_FLAGS " -O1 -fno-inline") ``` In both methods below, set `-DCMAKE_INSTALL_PREFIX` to the directory that you want to install MySQL in. I prefer setting it to be `dist-tsan` within the MySQL source directory. When building MySQL with difference modifications to TSan, I name them accordingly, e.g. `dist-tsan-treeclocks` or `dist-tsan-sampling` (or without TSan, `dist-no-tsan`). Flags `-DWITH_UNIT_TESTS=OFF` and `-DINSTALL_MYSQLTESTDIR=` disable building and installing tests so that the process can be faster. #### Prerequisites from package manager ``` cmake -S . -B build-tsan -DCMAKE_INSTALL_PREFIX=dist-tsan -DWITH_TSAN=ON -DCMAKE_C_COMPILER=$HOME/llvm-project/build/bin/clang -DCMAKE_CXX_COMPILER=$HOME/llvm-project/build/bin/clang++ -DDOWNLOAD_BOOST=1 -DWITH_BOOST=downloads -DWITH_UNIT_TESTS=OFF -DINSTALL_MYSQLTESTDIR= ``` #### Prerequisites built from source ``` cmake -S . -B build-tsan -DCMAKE_INSTALL_PREFIX=dist-tsan -DWITH_TSAN=ON -DCMAKE_C_COMPILER=$HOME/llvm-project/build/bin/clang -DCMAKE_CXX_COMPILER=$HOME/llvm-project/build/bin/clang++ -DDOWNLOAD_BOOST=1 -DWITH_BOOST=downloads -DWITH_UNIT_TESTS=OFF -DINSTALL_MYSQLTESTDIR= -DCMAKE_PREFIX_PATH=$(pwd)/downloads/usr -DWITH_SSL=$(pwd)/downloads/usr ``` After running the previous command, run it again but with the additional flags below appended. It is necessary to run this separately because somehow with these flags CMake will fail to detect that TSan is supported by the compiler. ``` -DCMAKE_C_FLAGS="-I$(pwd)/downloads/usr/include -L$(pwd)/downloads/usr/lib -L$(pwd)/downloads/usr/lib64" -DCMAKE_CXX_FLAGS="-I$(pwd)/downloads/usr/include -L$(pwd)/downloads/usr/lib -L$(pwd)/downloads/usr/lib64" ``` Convenience script (I name it `setup`) of doing the above: ``` #!/bin/bash set -x rm -r $1 rm -r $2 cmake -S . -B $1 -DCMAKE_INSTALL_PREFIX=$2 -DWITH_TSAN=ON -DCMAKE_C_COMPILER=$HOME/llvm-project/build/bin/clang -DCMAKE_CXX_COMPILER=$HOME/llvm-project/build/bin/clang++ -DDOWNLOAD_BOOST=1 -DWITH_BOOST=downloads -DWITH_UNIT_TESTS=OFF -DINSTALL_MYSQLTESTDIR= -DCMAKE_PREFIX_PATH=$(pwd)/downloads/usr -DWITH_SSL=$(pwd)/downloads/usr cmake -S . -B $1 -DCMAKE_INSTALL_PREFIX=$2 -DWITH_TSAN=ON -DCMAKE_C_COMPILER=$HOME/llvm-project/build/bin/clang -DCMAKE_CXX_COMPILER=$HOME/llvm-project/build/bin/clang++ -DDOWNLOAD_BOOST=1 -DWITH_BOOST=downloads -DWITH_UNIT_TESTS=OFF -DINSTALL_MYSQLTESTDIR= -DCMAKE_PREFIX_PATH=$(pwd)/downloads/usr -DWITH_SSL=$(pwd)/downloads/usr -DCMAKE_C_FLAGS="-I$(pwd)/downloads/usr/include -L$(pwd)/downloads/usr/lib -L$(pwd)/downloads/usr/lib64" -DCMAKE_CXX_FLAGS="-I$(pwd)/downloads/usr/include -L$(pwd)/downloads/usr/lib -L$(pwd)/downloads/usr/lib64" ``` Simply call `./setup build-tsan dist-tsan`. Note: Sometimes CMake will fail because it tries to determine certain properties by looking at the output. If your modification to TSan prints something to standard output, remove that printing first and you can put it back when CMake configuration is done. ### Build and Install ``` cmake --build build-tsan cmake --install build-tsan --prefix dist-tsan ``` Actually, better to use run with `-t mysqld` because otherwise CMake builds a lot of things that we don't actually need. ``` cmake --build build-tsan -t mysqld ``` However with this `cmake --install` wouldn't work, so in this case we will just run mysqld from the build directory. :::warning If you get an error saying "undefined function tgoto", run CMake with `-DHAVE_TERM_H=1` and build again. ::: ### After rebuilding compiler-rt If you only modified the source code in compiler-rt, you don't need to rebuild every MySQL source file. You just need to remove `bin/mysqld` and link it again with the new compiler-rt library. Convenience script (I name it `relink`) of just relinking mysqld with the latest TSan runtime: ``` #!/bin/bash set -x rm -r $1 rm build-tsan/bin/mysqld cmake --build build-tsan cmake --install build-tsan --prefix $1 ``` Simply call `./relink dist-tsan` to relink mysqld in dist-tsan with the latest TSan runtime. ### After rebuilding LLVM If you modified the compiler infrastructure/instrumentation code in LLVM, say ThreadSanitizer.cpp, you obviously will need to rebuild the whole codebase. But you don't need to remove everything too. Just run the following script. ``` #!/bin/bash set -x find sql -name "*.o" -exec rm {} \; cmake --build . -j`nproc` ``` Honestly not sure if this is good, because many files wont be rebuilt, maybe just do it for quick incremental development tests. I noticed that this is not so good because some libraries will remain with the old instrumentation. Better to run the following, `make clean` to clear everything, and passing `-t mysqld` to CMake so that it only builds `mysqld` and ignores many other things that we don't actually need, e.g. tests. ``` #!/bin/bash set -x make clean cmake --build . -j`nproc` -t mysqld # the server cmake --build . -j`nproc` -t mysql # the client ``` ## PostgreSQL Download and extract [Postgres 16.3](https://ftp.postgresql.org/pub/source/v16.3/postgresql-16.3.tar.gz). ``` ./configure --prefix=$(pwd)/../dist-tsan --without-icu --without-readline CC=~/llvm-project/build/bin/clang CXX=~/llvm-project/build/bin/clang++ CFLAGS="-fsanitize=thread" CXXFLAGS="-fsanitize=thread" ``` ``` make -j`nproc` make install ``` There will be an error when building with TSan. Make the following change. ``` diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile index bfcc7cdde9..083ca6f4cc 100644 --- a/src/interfaces/libpq/Makefile +++ b/src/interfaces/libpq/Makefile @@ -105,7 +105,9 @@ backend_src = $(top_srcdir)/src/backend # build toolchains insert abort() calls, e.g. to implement assert().) # If nm doesn't exist or doesn't work on shlibs, this test will do nothing, # which is fine. The exclusion of __cxa_atexit is necessary on OpenBSD, -# which seems to insert references to that even in pure C code. +# which seems to insert references to that even in pure C code. Excluding +# __tsan_func_exit is necessary when using ThreadSanitizer data race detector +# which use this function for instrumentation of function exit. # Skip the test when profiling, as gcc may insert exit() calls for that. # Also skip the test on platforms where libpq infrastructure may be provided # by statically-linked libraries, as we can't expect them to honor this @@ -113,7 +115,7 @@ backend_src = $(top_srcdir)/src/backend libpq-refs-stamp: $(shlib) ifneq ($(enable_coverage), yes) ifeq (,$(filter aix solaris,$(PORTNAME))) - @if nm -A -u $< 2>/dev/null | grep -v __cxa_atexit | grep exit; then \ + @if nm -A -u $< 2>/dev/null | grep -v -e __cxa_atexit -e __tsan_func_exit | grep exit; then \ echo 'libpq must not be calling any function which invokes exit'; exit 1; \ fi endif ``` Following which, the postgres server will segfault at startup because postgres defined its own `ubsan_default_options`, which makes TSan instrument it, causing some TSan operations to be performed before thread state is initialized. To fix this, open ./src/backend/main/main.c and scroll to the end and delete the `ubsan_default_options` definition. ## Running At the moment, I have only made scripts for MySQL. However it should be quite straightforward to port them to other DBMS's. ``` git clone https://github.com/focs-lab/benchbase-experiments cd benchbase-experiments cd scripts ``` ### Configure Setup Modify config.yaml (example below). ``` benchbase: /home/users/nus/dws.lim/scratch/benchbase/benchbase mysql: /home/users/nus/dws.lim/scratch/mysql/mysql-server-mysql-8.0.37 mysql-dist: dist-tsan-sampling-uclocks mysql-dist-no-tsan: dist-no-tsan workspace: /home/users/nus/dws.lim/scratch/benchbase-experiments/workspace seed: 1234 benchmark: epinions ``` `benchbase` is just the path to where you cloned BenchBase. `mysql-dist` and `mysql-dist-no-tsan` are the paths relative to the path given in the `mysql` parameter, for the MySQL build to be tested and MySQL build without TSan respectively. (I initially thought doing the DB initialization steps without TSan would be faster. But actually there is not much difference.) `seed` was used by me for sampling in TSan (only relevant if you are [doing something like this](https://github.com/focs-lab/llvm-project/commit/7c311d27727e461df8a4881fe66c84e4cfe6ef23)). MySQL must store its database files somewhere. Create a new folder and specify the path in `workspace`. `benchmark` indicates the benchmark in BenchBase to run under. For now, check the [files](https://github.com/focs-lab/benchbase-experiments/tree/5a3e18718c04cdb56c752ff9cac165974848394d/scripts) that end in `_config.xml` for the available benchmark names (e.g. `ycsb` for `ycsb_config.xml`). ### Configure Benchmark You may want to change the benchmark's workload [settings in the corresponding `_config.xml` file](https://github.com/focs-lab/benchbase-experiments/blob/5a3e18718c04cdb56c752ff9cac165974848394d/scripts/ycsb_config.xml#L21-L29). ```xml <!-- The workload --> <terminals>60</terminals> <works> <work> <time>3600</time> <rate>10000000</rate> <weights>50,5,15,10,10,10</weights> </work> </works> ``` `terminals` is the number of workers that query the SQL server. `time` is the duration in seconds. `rate` is the maximum number of requests per minute. `weights` are for the distribution of the different types of queries, which is specific to each benchmark (you will likely just ignore this). ### Run Run the benchmark according to the configuration above ``` python3 runner.py ``` The script does the following: 1. Initializes a new database in a randomly-named folder in `workspace`. 2. Starts MySQL Server to listen for requests. 3. Run benchbase to send many requests to the MySQL Server instance. [runner2.py](https://github.com/focs-lab/benchbase-experiments/blob/5a3e18718c04cdb56c752ff9cac165974848394d/scripts/runner2.py) and [config2.yaml](https://github.com/focs-lab/benchbase-experiments/blob/5a3e18718c04cdb56c752ff9cac165974848394d/scripts/config2.yaml) runs the benchmark on multiple TSan builds. :::info These scripts were made for my POPL25 experiments. There is room for improvements. :::