
GSoC 2025 - ML Forecasting Tina

tags: aeon-gsoc

Contributor: Tina Jin
GSoC page: https://summerofcode.withgoogle.com/organizations/numfocus/projects/details/MPYRSOTi
Project: aeon - Implementing and Evaluating Machine Learning Forecasters
Project length: 12 weeks
Mentors: Matthew Middlehurst, Tony Bagnall
Mid-project evaluation: July 14
Final evaluation: September 1
Blog link: https://medium.com/@jintina48/list/gsoc25-blog-11a0081fc6e2
Regular meeting time: 9:30 Monday UTC
Meeting time availability: 8:00 - 14:00 UTC

Project Summary

This project will investigate algorithms for forecasting based on traditional machine learning (tree-based) and time series machine learning (transformation-based). It will involve helping develop the aeon framework to apply both standard ML and extrinsic regression algorithms to forecasting. This includes evaluating regression algorithms already in aeon, as well as scikit-learn regressors, on forecasting problems. The tree-based SETAR-Tree and SETAR-Forest algorithms will also be implemented in the forecasting module.
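
The evaluation side of the project rests on the usual reduction from forecasting to regression. Below is a minimal sketch of that idea, assuming a fixed window length and an off-the-shelf scikit-learn regressor; both are illustrative choices and not part of aeon's API.

```python
# Turn a univariate series into (lag window, next value) pairs and fit a
# standard scikit-learn regressor on them. Window length and the choice of
# regressor are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def make_windows(y, window=12):
    """Build a lagged feature matrix X and one-step-ahead targets t from series y."""
    X = np.array([y[i:i + window] for i in range(len(y) - window)])
    t = y[window:]
    return X, t

rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 20, 200)) + 0.1 * rng.standard_normal(200)

X, t = make_windows(y, window=12)
reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, t)

# One-step-ahead forecast from the last observed window.
forecast = reg.predict(y[-12:].reshape(1, -1))
```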

Wish list of algorithms

SETAR-Tree/SETAR-Forest

Project Timeline

Issues: #2816 (SETAR-Tree/SETAR-Forest)

June 2nd: Week 1-2

  • Implement the basic SETAR-Tree class structure (a rough skeleton follows this list).
  • If viable, create a separate TAR/SETAR forecaster.
  • Add testing and documentation as the class develops.
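
As a starting point for the first item above, here is a rough skeleton of how the SETAR-Tree class could be structured. The `_fit`/`_predict` hooks follow the conventions of aeon's forecasting module, but the base-class constructor arguments, parameter names and method bodies are assumptions for illustration, not a final design.

```python
# Provisional sketch of a SETAR-Tree forecaster skeleton; the exact
# BaseForecaster signature and the constructor parameters are assumed.
from aeon.forecasting.base import BaseForecaster

class SETARTree(BaseForecaster):
    """Self-Exciting Threshold AutoRegressive tree forecaster (sketch)."""

    def __init__(self, lags=10, max_depth=10, significance=0.05, horizon=1):
        self.lags = lags
        self.max_depth = max_depth
        self.significance = significance
        # The base-class constructor arguments are an assumption here.
        super().__init__(horizon=horizon, axis=1)

    def _fit(self, y, exog=None):
        # Build a lagged embedding of y, then grow the tree by recursively
        # choosing (lag, threshold) splits that pass a significance /
        # error-reduction test; each leaf stores a pooled linear AR model.
        ...
        return self

    def _predict(self, y, exog=None):
        # Route the most recent window of y to a leaf via the learned
        # thresholds and return that leaf's linear AR forecast.
        ...
```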

June 16th: Week 3-4

  • Finish SETAR-Tree implementation
  • Implement the SETAR-Forest extension (a sketch follows this list)
  • Add to API and ensure broad testing coverage
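
Building on the SETARTree skeleton sketched above, the forest extension is essentially bagging: fit several trees with randomised settings and average their forecasts. The class below is a hypothetical placeholder showing the shape of the extension, not aeon API.

```python
# Minimal SETAR-Forest sketch: an ensemble of SETAR-Trees with randomised
# split-significance levels (as in the SETAR-Forest paper), forecasts averaged.
import numpy as np

class SETARForest:
    def __init__(self, n_trees=10, lags=10, random_state=None):
        self.n_trees = n_trees
        self.lags = lags
        self.random_state = random_state

    def fit(self, y):
        rng = np.random.default_rng(self.random_state)
        # Each member tree gets its own randomised significance threshold.
        self.trees_ = [
            SETARTree(lags=self.lags, significance=rng.uniform(0.01, 0.1)).fit(y)
            for _ in range(self.n_trees)
        ]
        return self

    def predict(self, y):
        # Average the per-tree forecasts.
        return np.mean([tree.predict(y) for tree in self.trees_], axis=0)
```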

June 30th: Week 5-6

  • Run SETAR-Tree/Forest on datasets from the publication and ensure the results are comparable (a sketch of the evaluation check follows this list)
  • Write or extend the forecasting notebook to include usage of models such as SETAR-Tree and ETS (either as a new notebook or by extending the current one)
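
For the comparison against the published results, the check will likely use the same error metric as the SETAR-Tree paper (MASE). A hedged sketch of what that might look like follows; the dataset loader and the forecaster call are placeholders, since the API is still being designed.

```python
# MASE (mean absolute scaled error) with naive(m) scaling, plus a commented
# outline of the per-dataset comparison loop.
import numpy as np

def mase(y_true, y_pred, y_train, m=1):
    """Mean absolute error scaled by the in-sample naive(m) forecast error."""
    scale = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return np.mean(np.abs(y_true - y_pred)) / scale

# for name, series, horizon in load_benchmark_series():   # hypothetical loader
#     train, test = series[:-horizon], series[-horizon:]
#     preds = SETARTree().fit(train).predict(...)          # API to be decided
#     print(name, mase(test, preds, train))
```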

Mid-project Deliverables

  • Implementation of SETAR-Tree with documentation and testing
  • Implementation of SETAR-Forest with documentation and testing
  • Notebook (or notebook section) on forecasting for SETAR-Tree

(Preliminary plan for the second half; an MLP forecaster is a possible addition.)

Week 7-8

  • Implement core functions of the pipeline forecasting module.
  • Write unit tests and API documentation (docstrings) for the core functionality.
  • Implement data loading and preprocessing scripts for the M5 (possibly M4) dataset.
  • Set up the benchmarking framework and run baseline models on the M5 data (a rough sketch of the benchmarking loop follows this list).
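
A rough sketch of the benchmarking loop envisaged here: window each series, fit a regressor-based forecaster, and record its error next to a naive baseline. The M5 loader is a hypothetical placeholder and the final pipeline API is still to be decided.

```python
# Evaluate a windowed scikit-learn regressor against a last-value naive
# baseline on the final observation of a single series.
import numpy as np
from sklearn.linear_model import Ridge

def one_step_eval(y, window=28, make_regressor=Ridge):
    train, test = y[:-1], y[-1]
    X = np.array([train[i:i + window] for i in range(len(train) - window)])
    t = train[window:]
    reg = make_regressor().fit(X, t)
    pred = reg.predict(train[-window:].reshape(1, -1))[0]
    naive = train[-1]  # naive baseline: repeat the last observed value
    return abs(test - pred), abs(test - naive)

# results = [one_step_eval(load_m5_series(i)) for i in range(n_series)]  # hypothetical loader
```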

Week 9-10

  • Run the selected regression-based forecasters on the preprocessed M5 data.
  • Collect and store results, debug if necessary.
  • Analyze the results of the experiments.
  • Draft the technical report for the evaluation task.

Week 11-12

  • Complete the technical report.
  • Review the implementation of the pipeline forecasting module.
  • Prepare a report for the final evaluation.

Final Deliverables

  • todo

Community Bonding Period

  • Introduce yourself in the community Slack channels.
  • Go through the contributor guide on the aeon website (https://www.aeon-toolkit.org/en/stable/contributing.html).
  • Set up a development environment, including pytest and pre-commit dependencies. This will make development a lot easier for you, as you must pass the PR tests to have your code merged (https://www.aeon-toolkit.org/en/stable/developer_guide/dev_installation.html).
  • Review some of the important dependencies for developing aeon at a basic level:
    • pytest for unit testing. Any code added will have to be covered by tests.
    • sphinx/myst for documentation. New functions and classes will have to be added to the API docs.
    • numba for writing efficient functions (a small example follows this list).
  • Make some basic Pull Requests (PRs) to gain experience with contributing to aeon through GitHub.
  • Read up on the subject of your project (machine learning forecasters). We will provide some literature, but we encourage you to go beyond that and ask any questions you have.
  • Decide on a project length. 12 weeks is the default but can be extended if you will be unable to work for some periods during the summer.
  • Refine the project timeline and deliverables with the project mentors. Agree on some milestones for both mid-project and final evaluations.
  • Update the GSoC webpage project to better match any new directions after discussions with mentors.
  • Select a tracking/blogging medium to write down and track progress made on the project. Agree on a frequency of updates.
  • Set up regular meeting days and times to discuss the project and track progress.
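
As a small example of the numba usage pattern mentioned in the dependencies list above, the function below compiles a plain numerical loop with @njit. The function itself is illustrative and not part of aeon.

```python
import numpy as np
from numba import njit

@njit(cache=True, fastmath=True)
def naive_sse(y, lag=1):
    """Sum of squared errors of a lag-`lag` naive forecast over series y."""
    total = 0.0
    for i in range(lag, len(y)):
        err = y[i] - y[i - lag]
        total += err * err
    return total

print(naive_sse(np.random.rand(1000)))
```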

Week 1:

Week 2:

Week 3:

Week 4:

Week 5:

Week 6:

Week 7:

Week 8:

Week 9:

Week 10:

Week 11:

Week 12:
