# Jonata's Summer 2021 Internships

## BioDynaMo: A new plugin for simulation software

*Cosupervisor: Manuel Mazzara*

BioDynaMo (https://biodynamo.org/) is open-source software used in medical simulations. The objective of this internship is to extend the software (written in C++) to simulate the interactions between people in a city or their global flow through airports. The tool would be used to visualize the spread of viruses and to simulate scenarios such as blocking particular movements (e.g. closing an airport). This would enable identifying minimal actions with maximal impact.

<!-- ![permalink setting demo](https://biodynamo.org/images/cells.png) -->
<figure class="image">
  <img src="https://biodynamo.org/images/cells.png" alt="{{ include.description }}">
  <figcaption><i>Viewing cells in BioDynaMo</i></figcaption>
</figure>

There are similar but simpler simulation tools, for example NetLogo (http://www.netlogoweb.org/). It is simple software that enables the simulation and customisation of many scenarios for many types of problems.

<figure class="image">
  <img src="https://3.bp.blogspot.com/-bn3G2U_Q8jA/VZSksUbEyHI/AAAAAAAABlg/-XlqPXfMTkg/s1600/Virus%2Bview.png" alt="{{Virus spreading in Netlogo}}">
  <figcaption><i>Virus spreading in NetLogo</i></figcaption>
</figure>

The objective is to run a simulation in BioDynaMo, but instead of cells there should be millions of dots on a 2D map interacting with and reacting to each other. Eventually, the project should answer the question: Is it possible to use Lanchester's laws (https://en.wikipedia.org/wiki/Lanchester%27s_laws) to identify the actions with minimum effort and maximum effect?
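
To make the "dots on a 2D map" idea concrete, here is a minimal sketch in plain Python. It does not use the BioDynaMo API, and the real plugin would be written in C++. Every name and parameter (`simulate`, `radius`, `p_infect`, the vertical barrier at the middle of the map) is an assumption chosen for illustration. Agents perform a random walk, infection spreads on contact, and the `blocked` flag stands in for an intervention such as closing an airport.

```python
import random


def simulate(n_agents=200, steps=50, size=100.0, step_len=2.0,
             radius=3.0, p_infect=0.5, blocked=False, seed=0):
    """Random-walk agents on a square map; returns the infected count per step.

    If `blocked` is True, agents may not cross the vertical line x = size / 2
    (a hypothetical stand-in for closing an airport or a road).
    """
    rng = random.Random(seed)
    half = size / 2
    pos = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n_agents)]
    infected = [False] * n_agents
    infected[0] = True  # patient zero
    history = []
    for _ in range(steps):
        # 1) Movement: a bounded random step, clamped to the map.
        new_pos = []
        for x, y in pos:
            nx = min(max(x + rng.uniform(-step_len, step_len), 0.0), size)
            ny = min(max(y + rng.uniform(-step_len, step_len), 0.0), size)
            if blocked and (x < half) != (nx < half):
                nx = x  # crossing the barrier is refused
            new_pos.append((nx, ny))
        pos = new_pos
        # 2) Infection: healthy agents near an agent that was already
        #    infected at the start of this step may become infected.
        sick = [i for i in range(n_agents) if infected[i]]
        for i in range(n_agents):
            if infected[i]:
                continue
            for j in sick:
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                if dx * dx + dy * dy <= radius * radius and rng.random() < p_infect:
                    infected[i] = True
                    break
        history.append(sum(infected))
    return history


if __name__ == "__main__":
    print("open map:   ", simulate()[-1], "infected")
    print("barrier on: ", simulate(blocked=True)[-1], "infected")
```

Comparing the two runs for different barrier positions is the kind of "minimal action, maximal impact" experiment the project aims at. The pairwise distance check is quadratic in the number of agents, which is one reason a dedicated engine is needed for millions of dots.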

### Technology stack

- C++
- ParaView (https://www.paraview.org/)
- NetLogo (http://www.netlogoweb.org/)
- Trello

### Output

- Code generation
- Improved visualization module

### What you will learn

- Project management using Trello
- Practice with C++ development
- Computer simulation using a tool developed and used by CERN (https://home.cern/)

### Who can participate and pre-requisites

- 1-2 students
- On-site or online
- Medium or advanced level in C++

<!--- CANDIDATES (in order of application) -->

---

## Innopolis Data Warehouse: A tool for analysing students' performance

Data warehouses are data-intensive systems that serve as a source for data analysis, including data mining, business intelligence and business analytics. However, this type of system requires collecting and processing large amounts of data from heterogeneous sources.

The objective of this internship is to implement ETL processes that collect and store data on the performance of students from Moodle and, eventually, from other systems. One project direction uses Pentaho and the other uses Python libraries. In the end, we should compare the two approaches and adopt one of them.

![permalink setting demo](https://data-warehouses.net/images/datawarehousingenvironment.gif)

This tool would allow faculty members to analyse the performance of individual students from their admission until their last semester. This would give the full picture of their performance, weaknesses and strengths in a single, customisable report. But first, it is necessary to implement a database and the ETL processes that feed a data lake serving as a staging area.

### Technology stack

- Data warehouse
- Python libraries
- PostgreSQL
- Pentaho
- Trello

![permalink setting demo](https://miro.medium.com/max/1024/1*61kHiw9au4qRIIVJY7ndbQ.jpeg)

### Steps

1. Set up a virtual machine with the IT department
2. Create and schedule a data notebook that runs queries and exports the results to a PostgreSQL database
3. Create a multidimensional database using design techniques from Business Intelligence
4. Select and deploy a tool to access this data in the form of a report
5. Test the data pipeline

### Expected outcomes

- Design of ETL processes
- PostgreSQL database
- Code

### What you will learn

- Data warehouse concepts and tools
- How to build ETL (extract, transform, load) processes
- Data analysis tools (e.g. Pentaho, PostgreSQL, Python)
- Trello for project management

### Who can participate and pre-requisites

- 4-5 students
- On-site
- Students with knowledge of databases

<!--- CANDIDATES (in order of application) -->

### More information

https://www.coursera.org/specializations/data-warehousing

---

## A reinforcement learning environment for the Diplomacy game

The objective is to extend existing software that visualizes games (stored as JSON files) so that it can serve as a toolbox for reinforcement learning.

![permalink setting demo](https://www.backstabbr.com/images/new-map.png)

Random moves should be generated, eventually creating a database that can be exploited by machine learning models and reinforcement learning techniques. Then, a bot capable of playing Diplomacy should be tested against human players, although this last part is a long shot that might not be finished during this internship.

![permalink setting demo](https://previews.123rf.com/images/studiostoks/studiostoks1907/studiostoks190700039/127875603-policy-diplomacy-and-negotiations-fight-opponents-man-versus-robot-new-technologies-and-progress-con.jpg)

Starting point for this project:

1. Video with instructions: https://youtu.be/rctHHEGFMgc
2. GitHub repository: https://github.com/AlexLGr/Diplomacy_AI (private repo)
3. Diplomacy Engine: https://diplomacy.readthedocs.io/en/stable/api/diplomacy.engine.html

### Technology stack

- Python
- Trello
- GitHub
- Natural Language Processing libraries

### Expected outcomes

- A game implemented according to the specifications
- Online material for the community

### Who can participate

- 3-4 students
- On-site or remotely
- Students of any year

<!--- CANDIDATES (in order of application) -->

---

## Algorithm selection vs. hyper-heuristics for single-objective optimisation functions

Algorithm selection is a technique for choosing the best algorithm for a given problem instance. It feeds characteristics of the input problem to machine learning classification models, which predict the algorithm expected to perform best.

![permalink setting demo](https://www.researchgate.net/profile/Joaquin-Vanschoren/publication/289380311/figure/fig4/AS:669437887795200@1536617838182/Rices-framework-for-algorithm-selection-Adapted-from-Smith-Miles-2008a.png)

For this internship, we will investigate which algorithms work better for single-objective optimisation functions such as the Rastrigin function.

![permalink setting demo](https://upload.wikimedia.org/wikipedia/commons/8/8b/Rastrigin_function.png)

### Technology stack

- Java and Python
- scikit-learn or any other ML library

### Expected outcomes

- Code and testing results
- Data notebooks with the detailed process

### What you will learn

- Basic optimisation algorithms
- Classification models in machine learning

### Who can participate

- 3-4 students
- On-site or remotely
- Students of any year willing to learn

<!--- CANDIDATES (in order of application) -->

---

## Online algorithm selection and parameter tuning in search

Algorithm selection and parameter tuning are computationally expensive operations that represent a major bottleneck when testing ML models and selecting the best algorithms for diverse optimisation problems. This work aims to develop online methods for expensive optimisation problems.
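
As an illustration of what "online" means here, the sketch below uses a classic textbook scheme, the one-fifth success rule in a (1+1) evolution strategy, to adapt the mutation step size `sigma` while the search is running. It is applied to the Rastrigin function used as an example in the previous project. This is not the method the project will adopt; the function names, the adaptation factor 1.5 and the iteration budget are assumptions chosen for illustration.

```python
import math
import random


def rastrigin(x):
    """Rastrigin function; its global minimum is 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def one_plus_one_es(f, dim=2, iters=2000, sigma=1.0, factor=1.5, seed=0):
    """(1+1)-ES whose step size is tuned online by the one-fifth success rule.

    On success sigma grows by `factor`; on failure it shrinks by
    factor ** (-1/4), so sigma is stationary when about 1 in 5 steps succeed.
    """
    rng = random.Random(seed)
    x = [rng.uniform(-5.12, 5.12) for _ in range(dim)]
    fx = f(x)
    for _ in range(iters):
        y = [xi + rng.gauss(0, sigma) for xi in x]  # mutate the parent
        fy = f(y)
        if fy < fx:
            x, fx = y, fy            # keep the better child
            sigma *= factor          # success: take bolder steps
        else:
            sigma *= factor ** -0.25  # failure: take more careful steps
    return x, fx, sigma


if __name__ == "__main__":
    best_x, best_f, final_sigma = one_plus_one_es(rastrigin)
    print("best value found:", best_f, "final sigma:", final_sigma)
```

No separate tuning run is needed: the parameter is adjusted using only the evaluations the search performs anyway, which is the property that matters when each evaluation is expensive.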

![permalink setting demo](https://us.123rf.com/450wm/saamxvr/saamxvr1911/saamxvr191101214/135755677-hands-of-dj-mixing-tracks-on-professional-sound-mixer-fashionable-rings-on-fingers-of-girl-disc-jock.jpg?ver=6)

The tuning of parameters should be online, that is, 'on the fly'. In other words, the algorithm should learn how to fly during the flight. See the video below ;) https://youtu.be/1VQ_3sBZEm0

![permalink setting demo](https://i.pinimg.com/originals/dc/6e/05/dc6e05d653787c965826ba1bcb011a4a.gif)

<iframe width="560" height="315" src="https://www.youtube.com/embed/1VQ_3sBZEm0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

### Technology stack

- Java or Python
- scikit-learn
- XML and JSON

### Expected outcomes

- Dataset with results of tests
- Standard for representing hyperparameters
- Integration of optimisation algorithms and code that calls ML models
- Academic paper

### Who can participate

- 3-4 students
- On-site or remotely
- Students of years 2-3 or MSc

<!--- CANDIDATES (in order of application) -->

- - -

## Combining mathematical models with machine learning for microgrid management

A microgrid is a small-scale power grid that can operate independently or collaboratively with other small power grids. The practice of using microgrids is known as distributed, dispersed, decentralized, district or embedded energy production.

![permalink setting demo](https://www.pewtrusts.org/-/media/post-launch-images/2016/02/brief3_figure1v2.jpg?la=en&hash=05123A358E2028BADF9EDC8B7C25C4CF10D5CEB7)

The objective of this internship is to develop a framework that recommends to users the best time and duration for using electric appliances, with minimal impact on their routines. The results for the system balance will then be simulated and analysed in different scenarios (users cooperate, users don't cooperate, variable percentage of cooperative users).
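
The cooperation scenarios can be sketched with a toy simulation. In the code below a greedy heuristic stands in for the integer programming model that the project would build in a solver. All names and numbers are assumptions made for illustration: `simulate_load`, one unit of load per appliance-hour, preferred start times between 17:00 and 20:00, and a tolerance of three hours around the preferred start. Cooperative users accept the recommended start time inside that tolerance; the others start at their preferred time.

```python
import random

HOURS = 24


def simulate_load(n_users=100, coop_fraction=0.5, duration=2, seed=0):
    """Return the hourly load profile for one day.

    Each user runs one appliance for `duration` hours. A cooperative user
    takes the start time, within +/- 3 hours of the preferred one, whose
    busiest hour is currently the least loaded (greedy recommendation).
    """
    rng = random.Random(seed)
    load = [0] * HOURS
    for _ in range(n_users):
        preferred = rng.randint(17, 20)          # evening start, inclusive
        cooperative = rng.random() < coop_fraction
        if cooperative:
            candidates = [(preferred + shift) % HOURS for shift in range(-3, 4)]
            start = min(
                candidates,
                key=lambda s: max(load[(s + h) % HOURS] for h in range(duration)),
            )
        else:
            start = preferred
        for h in range(duration):
            load[(start + h) % HOURS] += 1       # one unit per appliance-hour
    return load


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        profile = simulate_load(coop_fraction=fraction)
        print(f"cooperative users: {fraction:.0%}  peak load: {max(profile)}")
```

Sweeping `coop_fraction` shows how the peak load responds to the share of cooperative users. An actual study would replace the greedy step with the optimisation model and add the machine learning component that predicts user behaviour.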

![permalink setting demo](https://www.researchgate.net/profile/Benjamin_Davies9/publication/323497450/figure/fig3/AS:599505535770632@1519944666807/User-interface-of-HMODEL-simulation-in-NetLogo-60-Full-color-version-available-online.png)

### Technology stack

- Integer Programming Models in LINDO, Gurobi or CPLEX
- scikit-learn
- LaTeX

### Expected outcomes

- Development of machine learning models
- Development of mathematical models in the adopted solver (LINDO, Gurobi or CPLEX)
- Academic paper

### What you will learn

- Linear optimization
- Linear programming
- Solvers (e.g. Gurobi and CPLEX)
- Computer simulation

### Who can participate

- 2-3 students
- On-site or remotely
- Students of any year

<!--- CANDIDATES (in order of application) -->

- - -

## Improving existing code for simulating the surface roughness of 3D printed objects

The objective of this internship is to improve existing code written in Java that simulates the total surface roughness of a 3D object. See: https://drive.google.com/file/d/1wptnzshZSVFhBbdH8cFyBZ8WYuenUCEJ/view?usp=sharing

The goal is to evaluate the expected roughness of a 3D printed piece. The existing code hasn't been updated for a long time; it requires considerable improvement, and the calculation of the roughness still has to be added. Another group of students will work on migrating the Java code to C++, since STL models are large and computationally expensive to process.

<iframe width="560" height="315" src="https://www.youtube.com/embed/hjgKKXPIG2Y" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

### Technology stack

- Java
- C++
- Mesh models for 3D printing

### Expected outcomes

- Java and C++ code

### What you will learn

- Project management using Trello
- Mesh models
- Practice with Java and C++

### Who can participate

- 2-4 students
- On-site or remotely
- Bachelor students of any year
- Knowledge of C++

<!--- CANDIDATES (in order of application)
Andrey Vagin - @KKroliKK - BS1
-->

- - -

## Improving existing code for calculating the complexity of 3D printed objects

The objective of this internship is to convert existing code written in Java to C++. Currently, the program has two layers: one written in C++ that voxelizes mesh objects, and another written in Java that calculates the complexity. The objective is to port the latter to C++ and improve the code to increase efficiency.

My original work, which will serve as the basis: https://www.researchgate.net/publication/305508242_A_part_complexity_measurement_method_supporting_3D_Printing

<figure class="image">
  <img src="https://www.forceflow.be/wp-content/uploads/2012/10/difference_voxelisation3.png" alt="{{ include.description }}">
  <figcaption><i>A mesh (of a duck) voxelised in two different resolutions</i></figcaption>
</figure>

### Technology stack

- Java
- C++
- Mesh models for 3D printing

### Expected outcomes

- C++ code

### What you will learn

- Project management using Trello
- Mesh models
- Practice with Java and C++

### Who can participate

- 1 student
- On-site or remotely
- Bachelor students of any year
- Medium to advanced C++

<!--- CANDIDATES (in order of application) -->