# Tuplex starter projects

### 1. Redesign local thread pool

Currently, there are some issues with the design of the local thread pool used to execute queries. In this project, a better design should be implemented. To understand the terminology, let's first briefly define a couple of terms:

- **Executor**: In Tuplex, an executor (`Executor.cc/.h`) is a thread together with a managed memory region (`BitmapAllocator.h`). Memory in Tuplex is managed in blocks called partitions (`Partition.cc/.h`). Each partition can be spilled to disk if necessary, using an LRU (least-recently-used) policy. Any executor can read any executor's partition, but can only write to a partition that comes from its own memory region (single-writer/multiple-reader lock).
- **WorkQueue**: A WorkQueue is a thread-safe queue to which tasks (derived from `IExecutorTask.cc/.h`) can be added. To make an executor work tasks off a WorkQueue, it can be attached to/detached from that queue (cf. `Executor.cc/.h`).
- **LocalEngine**: Not perfectly named, the LocalEngine is a singleton which manages the thread pool. Whenever a new Context is created, either an existing thread/executor is reused or a new one is created if no matching one exists in the current pool.

In particular, the current design has the following drawbacks:

1. Whenever a Context is created, the driver may get reused. This could lead to issues when using multiple Context objects. A better design would switch execution between contexts.
2. Threads keep running the whole time after a Context object is created, which keeps the machine busy with spin work. Better would be to create the threads, put them to sleep, and wake them up on demand (e.g., via a notify/observer pattern); see the sketch below.

In this project, the LocalEngine should be improved to allow switching between contexts (only one Context active at a time) and to put threads to sleep when Tuplex is idle (i.e., not executing anything).
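To illustrate the sleep/wake idea from drawback 2, here is a minimal sketch of how a condition variable lets worker threads block without burning CPU until a task is enqueued or shutdown is requested. The `SleepingWorkQueue` and `Task` names are illustrative placeholders, not Tuplex's actual `Executor`/`WorkQueue`/`IExecutorTask` API.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Task stands in for Tuplex's IExecutorTask in this sketch.
using Task = std::function<void()>;

class SleepingWorkQueue {
public:
    explicit SleepingWorkQueue(size_t numThreads) {
        for (size_t i = 0; i < numThreads; ++i)
            _workers.emplace_back([this] { workerLoop(); });
    }

    ~SleepingWorkQueue() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _done = true;
        }
        _cv.notify_all();            // wake all sleeping workers so they can exit
        for (auto &t : _workers)
            t.join();
    }

    void addTask(Task t) {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _tasks.push_back(std::move(t));
        }
        _cv.notify_one();            // wake exactly one sleeping worker
    }

private:
    void workerLoop() {
        for (;;) {
            Task task;
            {
                std::unique_lock<std::mutex> lock(_mutex);
                // block (no CPU use) until there is work or shutdown is requested
                _cv.wait(lock, [this] { return _done || !_tasks.empty(); });
                if (_done && _tasks.empty())
                    return;
                task = std::move(_tasks.front());
                _tasks.pop_front();
            }
            task();                  // run the task outside the lock
        }
    }

    std::mutex _mutex;
    std::condition_variable _cv;
    std::deque<Task> _tasks;
    std::vector<std::thread> _workers;
    bool _done = false;
};
```

The same pattern could plausibly be combined with context switching: the LocalEngine would keep a single pool of sleeping workers and only feed them the queue of the currently active Context.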
### 2. Add progress bar to WebUI

Currently the WebUI does not feature any fancy progress bar displaying the progress of a job. In this starter project, the WebUI should be improved by adding a better overview of the running jobs. Code for the WebUI can be found in `https://github.com/tuplex/tuplex/tree/master/tuplex/historyserver`.

### 3. Add Valgrind testing infrastructure to CI

Tuplex possibly has a good amount of memory leaks at the moment. In this project, Valgrind leak checks should be added to the CI. A subsequent project could be to track down the various leaks and fix them.

### 4. Add numpy conversions

Some users may desire automatic conversion from/to NumPy arrays. In this project, this feature should be added. Cf. https://www.javaer101.com/en/article/18072453.html for some tricks on how to make NumPy work with the Boost Python module.

### 5. Apache Arrow integration

Reading/writing Apache Arrow may be a desirable feature, as it would allow integrating Tuplex into Arrow-based stacks. As a bonus, we get conversion to/from various formats for free.

### 6. Add ord/chr/oct builtins

These are simple functions to add, which makes this a good starter project for getting involved with the compiler infrastructure.

### 7. Add ctx.uncache(...) to uncache a cached dataset

I.e., this will free all the data and make the cached dataset unusable afterwards.

### 8. Add round builtin function

https://docs.python.org/3/library/functions.html#round

### 9. Upgrade to LLVM13

Make Tuplex work with LLVM13. Bonus: we get Apple Silicon support!

### 10. Add print(...) functionality

[Improvement] Support print(...) for easier debugging. Compiling print(...) as well might make certain bugs easier to catch, and would at least give users debugging-by-printing if something goes wrong in the code generation.

### 11. Peephole optimization for x in [], x in (), ...

Check that expressions like `x in <empty seq>` are reduced to `False`. ==> Potentially a whole peephole-optimization system for ASTs could be added.

### 12. Rule-based optimization

Right now optimization rules are handwritten. A better way could be a mechanism that automatically applies rules to trees. This is not trivial though, because multiple rules can interact with each other. I.e., effectively we need to introduce a time-capped fixpoint iteration here.

### 13. Add rightJoin

This would complete innerJoin + leftJoin and allow for correct optimization of the build side. To make this fast, an atomic bitset is needed. One way to achieve this is to use atomic OR operations to set bits and to disallow clearing them, cf. https://stackoverflow.com/questions/44751700/single-bit-manipulations-with-guaranteed-atomicity (a sketch is given at the end of this document).

### 14. Add support for flatMap

This requires rewriting some infrastructure, yet it would then allow the traditional word-count application. In addition, because of the similarities we could add mini-batching/mini-vectorization as a feature :)
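To make the flatMap contract concrete, here is a small sketch in plain C++ using standard-library types rather than Tuplex's internal Row and Partition classes: the UDF may return zero or more output rows per input row, and the surrounding task flattens the per-row results (the classic word count).

```cpp
#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// an output row of the word-count example: (word, 1)
using WordCountRow = std::pair<std::string, int64_t>;

// flatMap-style UDF: one input row may yield zero or more output rows
std::vector<WordCountRow> splitIntoWords(const std::string &line) {
    std::vector<WordCountRow> out;
    std::istringstream iss(line);
    std::string word;
    while (iss >> word)
        out.emplace_back(word, 1);
    return out;
}

int main() {
    std::vector<std::string> input{"hello world", "", "hello tuplex"};

    // the "flatten" step: concatenate all per-row outputs into one result set
    std::vector<WordCountRow> result;
    for (const auto &line : input)
        for (auto &&row : splitIntoWords(line))
            result.push_back(std::move(row));

    for (const auto &[w, c] : result)
        std::cout << w << " " << c << "\n";
}
```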
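Regarding project 13, a set-only atomic bitset along the lines of the linked Stack Overflow discussion could look roughly like the sketch below: bits are only ever set via `fetch_or`, never cleared, so every operation is a single atomic read-modify-write and no lock is needed. The `AtomicBitset` name is a placeholder, not an existing Tuplex class.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Set-only atomic bitset: concurrent set() calls are safe because each one
// is a single fetch_or on the containing 64-bit word; bits cannot be cleared.
class AtomicBitset {
public:
    explicit AtomicBitset(size_t numBits) : _words((numBits + 63) / 64) {
        for (auto &w : _words)
            w.store(0, std::memory_order_relaxed);
    }

    // mark bit i; safe to call from multiple threads concurrently
    void set(size_t i) {
        _words[i / 64].fetch_or(uint64_t(1) << (i % 64), std::memory_order_relaxed);
    }

    bool test(size_t i) const {
        return (_words[i / 64].load(std::memory_order_relaxed) >> (i % 64)) & 1;
    }

private:
    std::vector<std::atomic<uint64_t>> _words;
};
```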