# FOSDEM 2023

## Track: Fast and Streaming Data devroom

https://fosdem.org/2023/schedule/track/fast_and_streaming_data/

### Session: Running Real-time Stream Processing Analytics On Traces

Scale is the bottleneck. Goal: produce alerts or trends about the data.

Real-time data landscape (slide). Hard to decide which tools to use for an application:

* stream processing engines
* Hazelcast platform: open source https://hazelcast.com/open-source-projects/

### Session: CDC Stream Processing with Apache Flink

Building blocks of stream processing:

- streams
- time
- state
- snapshots

Distinct because:

- parallelization
- optional state storage
- snapshots of data pipeline topologies

APIs:

- DataStream
- Table
- built on the dataflow runtime
- working with Flink SQL: use dynamic tables (those are considered streams)

### Session: Ingesting over a million rows per second on a single instance

- specializes in time series, nothing else
- does OK for geospatial data
- optimizes inserting data into the DB without compromising reads
- 1 million rows per second
- keywords for grouping by time
- special joins to join based on time (ASOF JOIN)
- benchmarks for time series: timescale/tsbs

### Session: Building A Real-Time Analytics Dashboard with Streamlit, Apache Pinot, and Apache Pulsar

- real-time analytics (an application area for my PhD)
- of interest to: analysts, management, users
- photo
- properties of real-time analytics systems: photo
- demo: source code on GitHub

## Track: Continuous Integration and Continuous Deployment

https://fosdem.org/2023/schedule/track/continuous_integration_and_continuous_deployment/

### Session: How To Automate Documentation Workflow For Developers

- Cohesion in documentation: use a voice chart
- Voice and tone: put them in a style guide

## Track: HPC, Big Data and Data Science

https://fosdem.org/2023/schedule/track/hpc_big_data_and_data_science/

### Session: Simplifying the creation of Slurm client environments

- working with Slurm + containers
- Straw: Python-based tool to manage authentication between Slurm and external containers.
- https://github.com/pllopis/straw

### Session: Troika: Submit, monitor, and interrupt jobs on any HPC system with the same interface

- Project Destination Earth
- Troika: Python package to handle connections to a remote system, and to prepare and submit jobs for multiple HPC facilities. Makes it easier to port workflows to different HPC facilities: https://pypi.org/project/troika/
- https://fosdem.org/2023/schedule/event/troika_hpc_jobs/

### Session: Reproducibility and performance: why choose? CPU tuning in GNU Guix

- https://hpc.guix.info/about/
- Multiversion packaging to optimize for HPC hardware architectures

### Session: LIBRSB: Universal Sparse BLAS Library

- https://fosdem.org/2023/schedule/event/librsb/
- dealing with sparse matrices
- for iterative methods

### Session: numba-mpi

- makes it possible to couple Numba and MPI
- https://pypi.org/project/numba-mpi/
- https://fosdem.org/2023/schedule/event/numba_mpi/

## Track: Python

https://fosdem.org/2023/schedule/track/python/

### Session: How to build an event-driven application in Python

- Memphis.dev
- define events -> define actions
- https://github.com/yanivbh1/python-event-driven-app
- https://fosdem.org/2023/schedule/event/python_build_event_driven_application/

### Session: Continuous Documentation for Your Code

- How-to guides, tutorials: https://diataxis.fr/

## Other Sessions of Interest

- https://fosdem.org/2023/schedule/event/python_build_event_driven_application/

## Tools of Interest

- https://fosdem.org/2023/schedule/event/openresearch_pimmi/
- https://github.com/questdb/questdb
- https://streamlit.io/
- check slides of Pulsar
- clickhouse-local
- https://vale.sh/
- haystack python
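
Side note on the ASOF JOIN mentioned in the time-series ingestion session: as I understand it, it pairs each row with the most recent row from another table at or before its timestamp. A minimal sketch in plain Python, assuming both inputs are already sorted by timestamp. The function name `asof_join` and the trade/quote sample data are made up for illustration, and this is not how any particular database implements it:

```python
from bisect import bisect_right


def asof_join(left, right):
    """Attach to each left row the latest right value with right_ts <= left_ts.

    Both inputs are lists of (timestamp, value) tuples sorted by timestamp.
    Left rows with no earlier-or-equal right row get None.
    """
    right_ts = [ts for ts, _ in right]
    joined = []
    for ts, value in left:
        # bisect_right gives the number of right rows with timestamp <= ts
        idx = bisect_right(right_ts, ts)
        match = right[idx - 1][1] if idx > 0 else None
        joined.append((ts, value, match))
    return joined


# Hypothetical sample data: (timestamp, payload)
trades = [(1, "t1"), (5, "t2"), (9, "t3")]
quotes = [(2, 10.0), (4, 10.5), (9, 11.0)]

result = asof_join(trades, quotes)
for row in result:
    print(row)
```

The binary search keeps each lookup at O(log n). A merge-style scan over both sorted inputs would also work and avoids building the separate timestamp list.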