# Regression corrections

## Draft cover letter

Dear Sir/Madam,

We kindly submit our paper titled “Unsupervised Feature Based Algorithms for Time Series Extrinsic Regression” for your consideration for publication in your journal.

This survey paper builds on the work of a 2021 paper in Data Mining and Knowledge Discovery (https://link.springer.com/article/10.1007/s10618-021-00745-9). That work defined a new area of research called “time series extrinsic regression” (TSER), introduced 19 problems (archived at https://tseregression.org/) and compared some standard algorithms. We extend this work by:

1. introducing 44 new problems that will form part of a revised archive;
2. reproducing the results from the original paper;
3. introducing several time series regressors adapted from time series classification; and
4. finding a new state-of-the-art approach for this recent field.

We find that two feature-based pipelines, DrCIF and FreshPRINCE, outperform previously proposed approaches. We use the aeon open-source toolkit to conduct our experiments, and we endeavour to make recreating our results and conclusions quick and simple by publishing all code and the results presented in the paper.

For potential reviewers, we think any of the following would be a suitable choice:

1. Geoff Webb
2. Eamonn Keogh
3. Anthony Bagnall

## ORCID IDs

- Tony: 0000-0003-2360-8994
- Matthew: 0000-0002-3293-8779
- David: 0000-0002-8035-4057
- Diego: 0000-0001-6007-0471
- Guilherme: 0000-0002-2101-4028
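The cover letter claims that recreating our results with aeon is quick and simple; the sketch below is the kind of minimal script we could point readers to. It assumes aeon's regression API (`FreshPRINCERegressor` and the `load_covid_3month` loader are the names in recent aeon releases, but exact names and defaults may differ between versions). Covid3Month is one of the 19 problems from the original archive.

```python
# Minimal sketch only: class/loader names and defaults may differ between
# aeon versions. Covid3Month is a TSER archive problem bundled with aeon.
from sklearn.metrics import mean_squared_error

from aeon.datasets import load_covid_3month
from aeon.regression.feature_based import FreshPRINCERegressor

X_train, y_train = load_covid_3month(split="train")
X_test, y_test = load_covid_3month(split="test")

# FreshPRINCE: tsfresh feature extraction followed by a rotation forest.
reg = FreshPRINCERegressor(random_state=0)
reg.fit(X_train, y_train)

rmse = mean_squared_error(y_test, reg.predict(X_test)) ** 0.5
print(f"FreshPRINCE RMSE on Covid3Month: {rmse:.4f}")
```

DrCIF could be swapped in the same way; in recent aeon versions it lives under `aeon.regression.interval_based`.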
## R1:

~~**1. Including an example of a TSER problem in the Introduction Section enhances the clarity of the paper. However, I believe that the presented example in the Introduction Section should be replaced by a purely time series regression problem, where the values are measured over time.**~~

~~Tony: I think we can ignore this, unless there is a better example we can think of?~~

~~Diego: I think that is a stupid claim. However, we shouldn't ignore that we may find another reviewer with the same thoughts, mainly because this is a relatively simple change to make. An example in health (like ECG and/or PPG) could be good and intuitive for most readers.~~

~~Guilherme: I added an example from the BIDMC32HR dataset (blue text). Not sure if we should remove the previous (soil spectrogram) example; I think it's really good and illustrative.~~

~~**3. The addition of a flowchart or pseudocode for the FreshPRINCE approach would aid in understanding its implementation.**~~

~~Matt?~~

~~**4. Time Series Classification generally compares univariate and multivariate datasets separately. However, TSER groups them. Could you please provide a paragraph in this regard or include a visual comparison apart from Table 3?**~~

~~David: Done. Red text to be checked on page 18 (Section 6.2):~~

~~Summarising, it can be said that FreshPRINCE is better when dealing with univariate problems, whereas DrCIF is, in general, better when applied to multivariate datasets.~~

~~5. What could be the reasons behind the better results of the RotF approach over some time series specific approaches such as ROCKET or MultiROCKET?~~

~~I think just ignore for now. We can make a case (spoke to Geoff about it), but maybe only if asked.~~

~~6. While the authors mentioned that no interesting trend was observed in breaking down RMSE per problem type, it would still be informative to provide some numbers regarding this breakdown.~~

~~I vote ignore.~~

~~**7. A comparison of the standard deviations obtained by the different approaches would be a valuable addition to the results.**~~

~~Could be good?~~

~~Diego: I think so. For instance, it took me a while to find out that ROCKET usually presents a high standard deviation (so it is "less safe" to use).~~

~~**8. Does Fig. 8 use the same statistical set-up as the CDDs? If not, please specify.**~~

~~Clarify.~~

~~9. Exploring the applicability of these approaches to forecasting datasets and discussing their potential benefits in this context would be an interesting avenue for practitioners in the field.~~

~~Future work.~~

~~Diego: Agreed. That is COMPLETELY out of scope.~~

## R2:

~~1. The paper's structure appears more aligned with a technical report than a research paper. Suggest restructuring it, such as dedicating Section II to background and creating a separate section for related work, enhancing readability.~~

~~Don't agree with this.~~

~~2. Activating hyperlinks in the PDF would improve navigation and user-friendliness.~~

~~Should do this.~~

~~David: Done.~~

~~**3. While the methodology for extending the TSER archive and evaluating the proposed algorithms is fairly explained, additional clarification is needed for Algorithm 1. The function "transform" in line 3 should be explained more clearly.**~~

~~Look at this.~~

~~**4. To provide a more comprehensive assessment of the algorithms, consider including additional evaluation metrics beyond root mean square error, such as mean absolute error, even if they are not used in ref [5].**~~

~~Don't think it's necessary.~~

## R3:

~~1. This paper lacks enough novelty although it added many datasets and regressors for TSER. These works promote the development of TSER, but they could not provide new deep insights.~~

~~Ignore.~~

~~**2. In subsection II.A, kernel/convolution-based models employ the convolution and pooling operations, and so does deep learning. The difference between kernel/convolution-based models and deep learning models should be clearly distinguished.**~~

~~I don't think this is necessary.~~

~~Diego: Maybe a note that kernel-based methods do not "cascade" the results of convolutions or adjust the kernels via a learning procedure?~~

~~David: Done. Red text to be checked on page 6 (Section 2.1):~~

~~These kernel/convolution-based methodologies diverge from conventional deep learning practices by abstaining from adjusting the kernels using a learning procedure. Instead, they primarily engage in the computation of features over the convolution outputs.~~

(A toy sketch of this distinction is at the end of these notes.)

~~3. FCN is used twice for different terms, fully convolutional neural networks and fully connected networks.~~

~~David: Corrected!~~

~~**4. Why replace the Nemenyi test with the Wilcoxon test in the critical difference diagram? The reason and their difference should be explained.**~~

~~Could add in the stock text.~~ (A draft sketch of the pairwise Wilcoxon procedure is at the end of these notes.)

~~5. Is it fair that the hyper-parameter is optimized for some models, while it is fixed for others? In addition, the hyper-parameter optimization is missing.~~

~~Ignore.~~

~~6. Other problems: confusing hyper-parameter space in Table II in the supplementary material, grid-SVR, Kernel Coefficient ∈ {1·10⁻⁴, 1·10⁻¹, ..., 1·10⁻¹}.~~

~~David: Done!~~

~~7. Inconsistent terms: kNN-dtw, kNN-DTW.~~

~~David: Done!~~
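For R3.2: if the red text in Section 2.1 needs backing up, a toy numpy sketch of the distinction might help when drafting. This is illustrative only, not the paper's ROCKET/MultiROCKET implementation; the kernel lengths and the two pooled statistics (max and proportion of positive values) are simplified.

```python
# Toy illustration of the Section 2.1 point: kernel/convolution regressors
# fix random kernels and only fit a linear model on features computed from
# the convolution outputs; deep learning would instead adjust the kernels
# themselves by gradient descent. Not the paper's implementation.
import numpy as np
from sklearn.linear_model import RidgeCV


def make_random_kernels(n_kernels=100, seed=0):
    """Generate fixed random (weights, bias) pairs; these are never trained."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(size=rng.choice([7, 9, 11])), rng.uniform(-1, 1))
            for _ in range(n_kernels)]


def kernel_features(X, kernels):
    """Convolve each series with every kernel; keep the max and the
    proportion of positive values (PPV) as two features per kernel."""
    feats = np.zeros((len(X), 2 * len(kernels)))
    for k, (weights, bias) in enumerate(kernels):
        for i, series in enumerate(X):
            conv = np.convolve(series, weights, mode="valid") + bias
            feats[i, 2 * k] = conv.max()
            feats[i, 2 * k + 1] = (conv > 0).mean()
    return feats


# Toy data: 20 univariate series of length 100 with random targets.
rng = np.random.default_rng(1)
X_train, y_train = rng.normal(size=(20, 100)), rng.normal(size=20)

kernels = make_random_kernels()
# Only the linear regressor is fitted; the kernels stay fixed throughout.
reg = RidgeCV().fit(kernel_features(X_train, kernels), y_train)
```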
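For R3.4, a draft for the stock text: the Nemenyi post-hoc test compares mean ranks against a single critical difference, and its conclusion about any pair depends on which other regressors are included; the CDDs instead form cliques from pairwise Wilcoxon signed-rank tests with a correction for multiple testing (Holm in the sketch below). A rough sketch of that pairwise procedure follows; the results matrix is a made-up placeholder (per-dataset RMSE would come from our published results files), and `scipy.stats.wilcoxon` is the only library call assumed.

```python
# Rough sketch of the pairwise Wilcoxon signed-rank tests with a Holm
# correction used in place of the Nemenyi post-hoc test. The RMSE matrix
# below is a placeholder, not our real results.
from itertools import combinations

import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
regressors = ["FreshPRINCE", "DrCIF", "RotF", "ROCKET"]
# rows = datasets (63 in the extended archive), columns = regressors
rmse = rng.uniform(0.1, 1.0, size=(63, len(regressors)))

# One Wilcoxon signed-rank test per pair of regressors.
pairs = list(combinations(range(len(regressors)), 2))
p_values = np.array([wilcoxon(rmse[:, i], rmse[:, j]).pvalue for i, j in pairs])

# Holm correction: step down through the ordered p-values.
order = np.argsort(p_values)
alpha, significant = 0.05, np.zeros(len(pairs), dtype=bool)
for rank, idx in enumerate(order):
    if p_values[idx] > alpha / (len(pairs) - rank):
        break
    significant[idx] = True

for (i, j), sig in zip(pairs, significant):
    print(regressors[i], "vs", regressors[j], "significant" if sig else "n.s.")
```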