# GASP Reviews ECCV
## Reviewer 1
We thank the reviewer for their helpful comments.
> The technical innovation is limited. The proposed framework is a simple extension of the traditional bottom-up hierarchical agglomerative clustering. Many of the properties among different linkage criteria, including robustness, have been discussed in the traditional settings [1], and there seems not many differences for the signed graph here.
>
Although we agree with the reviewer that the proposed approach builds on several existing algorithms for unsigned graphs, we would like to stress that a comparison between agglomerative algorithms has so far never been performed in the context of signed graphs. Furthermore, one of the new algorithms derived from the framework (GASP with average linkage) outperforms the existing alternatives. As stated in the introduction of the paper, the innovative aspect of GASP is its theoretical generalization of the existing algorithms, which provides a powerful tool for studying these methods.
We will also extend the introduction to stress that another contribution of the paper is the theoretical proof that the Mutex Watershed is a special case of GASP -- an insight which, we argue, is nontrivial.
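To make the unified view concrete, the following minimal sketch (illustrative only: the function name, the list-based pooling of parallel edges, and the quadratic search for the best pair are simplifications, not the efficient implementation from the paper) shows how the different linkage criteria plug into a single greedy contraction loop on a signed graph:

```python
from collections import defaultdict

# Linkage criteria: reduce the multiset of original signed edge weights
# between two clusters to a single interaction value.
LINKAGE = {
    "sum":     lambda ws: sum(ws),              # GAEC-style sum linkage
    "average": lambda ws: sum(ws) / len(ws),    # average linkage
    "absmax":  lambda ws: max(ws, key=abs),     # sign of the strongest edge
}

def gasp(num_nodes, edges, linkage="average"):
    """Toy agglomerative clustering on a signed graph (no constraints).

    edges: iterable of (u, v, w) with w > 0 attractive, w < 0 repulsive.
    Returns one cluster label per node.
    """
    crit = LINKAGE[linkage]
    # weights[u][v]: list of original edge weights between clusters u and v
    weights = defaultdict(dict)
    for u, v, w in edges:
        weights[u].setdefault(v, []).append(w)
        weights[v].setdefault(u, []).append(w)

    label = list(range(num_nodes))
    alive = set(weights)
    while True:
        # greedily pick the cluster pair with the highest interaction
        best = max(
            ((crit(ws), u, v) for u in alive for v, ws in weights[u].items()),
            default=None,
        )
        if best is None or best[0] <= 0:  # only repulsive interactions left
            break
        _, u, v = best
        # contract v into u, pooling the parallel edges of both clusters
        nbrs = weights.pop(v)
        del weights[u][v]
        alive.discard(v)
        for t, ws in nbrs.items():
            if t == u:
                continue
            del weights[t][v]
            merged = weights[u].get(t, []) + ws
            weights[u][t] = weights[t][u] = merged
        label = [u if lbl == v else lbl for lbl in label]
    return label
```

With the `sum` linkage this loop behaves like GAEC-style greedy edge contraction, while `absmax` combined with cannot-link constraints roughly corresponds to the Mutex Watershed -- the special case proven in the paper.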
> There is little discussion regarding the large performance difference between the "without constraint" and "with constraint" versions. For example, the GASP Average + Constraints has a CREMI-Score more than doubled compared to that of the GASP Average in Table 2, but the AP scores of these two in Table 4 are roughly the same. This difference also show up in Fig 4.
>
We will add an explanation to make this point clearer:
The CREMI dataset is 3D, and the predictions made by a CNN along the depth direction are often noisy and biased towards over-clustering. GASP Average with constraints performs worse on the CREMI dataset than GASP Average without constraints because constraints are often enforced along the depth direction, resulting in over-clustering mistakes (cf. Fig. 3).
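As an illustration of this mechanism (a hypothetical sketch with made-up helper names, not the paper's implementation), the constrained variant visits cluster pairs in order of *absolute* interaction strength: an attractive interaction triggers a merge, while a repulsive one introduces a cannot-link constraint that permanently prevents the pair from merging. On anisotropic 3D data such as CREMI, many of these constraints fall on the noisy depth-direction edges:

```python
def constrained_step(interactions, constraints):
    """One decision step of GASP with cannot-link constraints (sketch).

    interactions: dict mapping a canonical cluster pair (u, v), u < v,
                  to its current signed interaction value.
    constraints:  set of cluster pairs that must never be merged.
    """
    if not interactions:
        return None                   # nothing left to decide
    # visit pairs in order of *absolute* interaction strength
    pair, w = max(interactions.items(), key=lambda kv: abs(kv[1]))
    del interactions[pair]
    if pair in constraints:
        return ("skip", pair)         # already constrained, never merged
    if w > 0:
        return ("merge", pair)        # attractive: contract the two clusters
    constraints.add(pair)             # repulsive: forbid this merge for good
    return ("constrain", pair)
```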
> It is unclear if the average-linkage based method is always preferred. From Table 6, it does not seems to be the case since all three methods perform comparably. Then what property of the datasets makes the it work better?
>
We will add a paragraph with the following explanation:
Table 6 compares the performance of GASP with spectral clustering methods. Since the spectral clustering methods are computationally expensive, we randomly selected a small subset of the CREMI 2016 dataset for this comparison. In the full dataset, neuron segments stretch across the entire volume, so local mistakes can have a global impact on the segmentation score. This is not the case for the small subset, which represents only a small section of the brain; therefore, the small subset cannot be used to compare the different versions of GASP. Indeed, as Table 6 shows, all evaluated GASP versions perform similarly and significantly outperform the spectral clustering methods.
In summary: on a small dataset such as the one used in Table 6, all GASP versions perform similarly and much better than the spectral clustering approaches; on large datasets, average linkage emerges as the clear winner.
## Reviewer 2
We thank the reviewer for their helpful comments.
> The squared complexity and running time of the most proposed algorithms limit the applicability of the new approach. Application to the large dataset requires preprocessing reducing the data. The "Abs Max" variant can scale almost linearly but it is very closely resembles prior work on watershed. The authors also conclude that the "Abs Max" is a viable option even despite slightly worse performance than Average. That may reduce the significance of the proposed new algorithms.
>
Although GASP Average is indeed slower than Abs Max, it still has lower computational complexity than other approximate solvers for signed graph clustering. Moreover, as shown in Table 2, on the CREMI 2016 dataset the difference between GASP Average and Abs Max is consistent and significant (0.1 in CREMI score).
> The results are worse when compared to the other state-of-the-art methods. The author suggest that this is due to the usage of smaller neural network
>
Although the proposed method indeed performs worse than other state-of-the-art methods, the difference in CREMI score is small (around 0.02), especially on the CREMI 2016 dataset; thus, it is fair to claim that we are competitive with the state of the art. Furthermore, the advantage of GASP Average lies in its simplicity: unlike the state-of-the-art methods, it requires neither tuning the hyperparameters of a superpixel generation procedure nor training more complex deep learning models.
> It is worth noting that the generalized algorithm frameworks (at least for positive graphs) are common and was used as early as in Day, W. H., & Edelsbrunner, H. (1984). Efficient algorithms for agglomerative hierarchical clustering methods. Journal of classification, 1(1), 7-24.
>
Thank you for pointing out this reference! We will add it to our related work section.
> Table 7 does not provide runtime. In table 8 the time unit for the runtime is not specified.
>
Probably the reviewer is referring to Table 6 rather than Table 7; we will add the runtime information to Table 6. We will also revise Table 8 to specify the time unit.
## Reviewer 3
We thank the reviewer for their helpful comments.
> The proposed solvers can be understood as greedy solvers for the minimum cost multicut problem. Yet, a problem definition is not given in the paper.
>
We will add the definition of the multicut problem. We agree that some of the algorithms (GAEC, GreedyFixation, GASP Average) can be understood as greedy solvers. However, this is not the case for all of them: those based on Max or Min linkage do not necessarily decrease the multicut objective at every iteration. Only the Mutex Watershed has been shown to optimize an objective closely related to the multicut problem [89, Wolf et al., 2020].
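For reference, the definition we plan to add is the standard minimum cost multicut (correlation clustering) formulation; the exact notation in the paper may differ. Given a graph $G=(V,E)$ with signed edge costs $w_e$, one seeks a binary edge labeling $y$, where $y_e = 1$ means that edge $e$ is cut:

```latex
\min_{y \in \{0,1\}^{E}} \sum_{e \in E} w_e \, y_e
\qquad \text{s.t.} \quad
y_e \le \sum_{e' \in C \setminus \{e\}} y_{e'}
\quad \forall \text{ cycles } C \subseteq E, \; \forall e \in C
```

The cycle inequalities guarantee that the set of cut edges consistently decomposes the graph into connected components.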
> The proposed algorithm GASP is very similar to GAEC proposed in Keuper et al., ICCV'2015, while it proposed to compare different greedy strategies. Previous follow-up works of Keuper et al. also tested alternative strategies. Especially the balanced edge contraction strategy in Kardoost et al, ACCV2018 is very similar to the proposed maximum average strategy. Yet, a comparison is missing.
>
We agree that this would be a meaningful comparison, but given the one-week rebuttal period, we were not able to run the corresponding experiment in time.
**TODO: test balanced edge contraction on Cityscapes (and CREMI?)**
> A comparison to other greedy strategies such as the fusion moves approach (Beier et al.,ECCV 2016) or KLj (Keuper et al., ICCV 2015) is also missing and desirable, especially on the Cityscapes Instance segmentation example.
>
We are grateful for the suggestion. However, the fusion moves approach and KLj are usually applied to smaller images or superpixel graphs because they are computationally expensive. Given the time available for the rebuttal, running this comparison on the Cityscapes dataset is not feasible.
> Decomposition problems on signed graphs can be evaluated by the cost of the resulting cut. These numbers should be reported in tables 2 and 4, in addition to the application driven segmentation metrics.
>
We will add the numbers to Tables 2 and 4.
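To be explicit about the quantity in question (a minimal sketch, reusing the `edges` and `label` conventions of the GASP sketch above): the cut cost of a final clustering is simply the sum of the signed weights of the edges it cuts.

```python
def multicut_cost(edges, label):
    """Signed cost of the cut induced by a clustering: the sum of the
    weights of all edges whose endpoints lie in different clusters."""
    return sum(w for u, v, w in edges if label[u] != label[v])
```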
> In general, I wonder why these methods are not evaluated on seminal benchmarks such as the OpenGM benchmark (http://hciweb2.iwr.uni-heidelberg.de/opengm/index.php?l0=benchmark).
>
The focus of our paper is on pixel grid graphs from images. An additional comparison on the OpenGM benchmark would be possible, but it seems to be out of the scope of this study; we will consider it for future work.