This is the benchmarking code used to get the results presented in the short paper "Should I Hide My Duck in the Lake?". When cloning the repo, you have to initialize the submodules with:
git submodule update --init --recursive
To build the benchmark:
mkdir build && cd build
cmake ..
make -j 64
cd ..
To build the DuckDB ViewRewriter extension:
cd extension
make
To generate TPC-H data at scale factor 30 as Parquet files in the directory data/tpch-30 with the default TPC-H ordering, execute:
./generate -b tpch -s 30
To get more options, including how to generate TPC-DS data, execute with -h. The ClickBench Parquet file can be downloaded with ./download_clickbench.sh.
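Before moving on, it can help to confirm that generation actually produced Parquet files and to see how much space they take. The following is only a sketch: count_parquet is a hypothetical helper that is not part of this repo, and it assumes the files end in .parquet and live somewhere under data/tpch-30 (the exact layout inside that directory is not specified here, so the search is recursive).

```shell
# Hypothetical helper (not part of this repo): count the *.parquet files
# found anywhere below the directory given as the first argument.
count_parquet() {
  find "$1" -type f -name '*.parquet' | wc -l | tr -d ' '
}

# Assumed output directory of the generate step above.
data_dir="data/tpch-30"
if [ -d "$data_dir" ]; then
  echo "Parquet files: $(count_parquet "$data_dir")"
  # Total on-disk size of the generated data.
  du -sh "$data_dir"
else
  echo "missing directory: $data_dir" >&2
fi
```

A count of zero or a missing directory suggests that the generate step did not finish or wrote somewhere else.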
To put data into tmpfs on the HACC cluster (the benchmarks were run on hacc-box-01 of the HACC cluster):
sudo tmpfs-create -s 32 -n 0
cp -r data/tpch-30 /mnt/ramdisk
To run experiments and plot the results (for now, this requires manually moving the measurements to the correct folders):
./run_all.sh
python3 plot.py