# Response to R4
>First of all, you need to make this zero padding scheme clear in the paper. Figure 1 clearly shows a subset of the spectrum feature vector is used to train each model. Readers are not omnipotent, cannot figure out you "imply the input feature vector is " from the statements such as "each detector leverages a subset of the frequency components", "splitting the frequency components amongst multiple detector models", and later in the Experimental Setup section, "ensemble that partition frequency components into 2 or 4 disjoint subsets".
We agree that we should have made the zero-masking scheme clear in the paper. We will make the necessary changes to Figure 1, and add an explanation and example to our methodology sections.
>Secondly, can simply padding a bunch of zeros to replace the masked frequency trick the adversary output orthogonally defeating and ? If so, what's the point to split the frequency into disjoint subsets? Why wouldn't a random split work? Say, if and are partially redundant from a random split, the "zero transferability" defense in your response wouldn't break, would it?
>As a follow-up to the above discussion, I think a fair comparison should not be between your ensemble and a single detector, but your ensemble vs. an ensemble with random frequency split.
A disjoint random split would indeed work and would still satisfy the zero-transferability argument in our response; we report experiments on exactly this setting under the names D3-R(2) and D3-R(4) in our paper. A non-disjoint random split, on the other hand, would not satisfy the zero-transferability argument, because the individual classifiers would then share a common set of frequencies, possibly allowing a perturbation to transfer from model $F_A$ to $F_B$.
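To make this intuition concrete, below is a minimal, purely illustrative numpy sketch (the spectrum length and masks are placeholders, not the configuration used in the paper) showing that a perturbation confined to one model's frequencies is erased entirely by the other model's zero-mask when the subsets are disjoint, but partially survives when the subsets overlap:

```python
import numpy as np

rng = np.random.default_rng(0)
n_freq = 128  # illustrative spectrum length (assumption, not the paper's value)

# Disjoint random split: every frequency index goes to exactly one model.
perm = rng.permutation(n_freq)
idx_a, idx_b = perm[: n_freq // 2], perm[n_freq // 2:]
mask_a = np.zeros(n_freq)
mask_a[idx_a] = 1.0   # F_A zero-masks everything outside idx_a
mask_b = np.zeros(n_freq)
mask_b[idx_b] = 1.0   # F_B zero-masks everything outside idx_b

# A perturbation confined to F_A's frequencies...
delta_a = rng.normal(size=n_freq) * mask_a
# ...is erased entirely by F_B's zero-mask: nothing can transfer.
print(np.linalg.norm(delta_a * mask_b))   # 0.0

# Overlapping split: each model independently samples 70% of the indices,
# so some frequencies are shared between the two models.
idx_a2 = rng.choice(n_freq, size=int(0.7 * n_freq), replace=False)
idx_b2 = rng.choice(n_freq, size=int(0.7 * n_freq), replace=False)
mask_a2 = np.zeros(n_freq)
mask_a2[idx_a2] = 1.0
mask_b2 = np.zeros(n_freq)
mask_b2[idx_b2] = 1.0

# Part of a perturbation confined to F_A's frequencies now survives F_B's
# zero-mask, so it can influence F_B as well (non-zero transferability).
delta_a2 = rng.normal(size=n_freq) * mask_a2
print(np.linalg.norm(delta_a2 * mask_b2))  # > 0
```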
To demonstrate this, we have performed additional experiments on two new settings, Overlap-70(2) and Overlap-50(2), where the frequency splits are random but not disjoint. Specifically, in Overlap-50(2) both models receive 50% of the frequencies, sampled with replacement (so the two subsets can overlap); in Overlap-70(2) both models receive 70% of the frequencies, sampled with replacement. The latter setting should have more common frequencies, causing more transferability and thus less robustness. For ease of comparison, we repeat the performance of D3-R(2) (random and disjoint, i.e., sampled without replacement) and D3-S(2), both of which should exhibit more robustness due to "zero transferability". We used APGD (50 steps) with the CE loss to evaluate all four settings.
|Epsilon ($L_2$)|Overlap-70(2)|Overlap-50(2)|D3-R(2)|D3-S(2)|
|-|-|-|-|-|
|1 | 85.1| 97.0 | 97.2 | 100.0 |
|5 | 0.0| 3.0 | 96.0 | 100.0 |
|10 | 0.0| 0.0 | 1.2 | 76.2 |
The numbers above show that the Overlap-50(2) and Overlap-70(2) settings, where some frequencies are shared between the models, are less robust than the D3 ensembles. Moreover, since Overlap-70(2) shares more frequencies than Overlap-50(2) (see the sketch below), it exhibits more transferability and thus less robustness.
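For intuition on how much the two models' views overlap in each setting, here is a small sketch (illustrative spectrum length, and interpreting "sampled with replacement" as two independent draws across the models; this is not the evaluation code behind the table) estimating the expected fraction of shared frequencies:

```python
import numpy as np

rng = np.random.default_rng(0)
n_freq, trials = 128, 1000  # illustrative values (assumptions)

def mean_shared_fraction(frac):
    """Average fraction of the spectrum visible to *both* models when each
    model independently samples `frac` of the frequency indices."""
    k = int(frac * n_freq)
    shared = []
    for _ in range(trials):
        a = set(rng.choice(n_freq, size=k, replace=False))
        b = set(rng.choice(n_freq, size=k, replace=False))
        shared.append(len(a & b) / n_freq)
    return float(np.mean(shared))

print(mean_shared_fraction(0.5))  # ~0.25: Overlap-50(2) shares ~25% of the spectrum
print(mean_shared_fraction(0.7))  # ~0.49: Overlap-70(2) shares ~49% of the spectrum
# A disjoint split (D3-R, D3-S) shares exactly 0% by construction.
```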
>Finally, assuming the adversary would produce orthogonal is naive, unless your experiment is set up for the adversary to attack and sequentially, one at a time. If the adversary is attacking the ensemble, there is no guarantee that , because attacking the whole is not as simple as attacking each summed. It could very well be , because the ensemble reveals a combo, a much more accurate decision region than an individual model. If and are correlated, the ensemble would output the mixture of the individual decision boundaries.
We would like to emphasize that we do not make any assumptions regarding how the adversary produces perturbations, i.e., we make no assumptions about the attack methodology. Regardless of whether $\delta_i$ and $\delta_j$ are produced by attacking the individual models $i$ and $j$ sequentially or jointly, the overall perturbation vector is still $\delta = [\delta_i; \delta_j]$, since $\delta_i$ and $\delta_j$ represent the perturbation contributions on two disjoint subsets of frequencies. Because the two contributions occupy disjoint coordinates, the $L_2$ norm decomposes as $\|\delta\|_2^2 = \|\delta_i\|_2^2 + \|\delta_j\|_2^2$.
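For completeness, writing $S_i$ and $S_j$ for the two disjoint frequency index sets (notation introduced here only for this derivation), the decomposition is simply the sum over the two index sets:

$$\|\delta\|_2^2 = \sum_{k \in S_i \cup S_j} \delta_k^2 = \sum_{k \in S_i} \delta_k^2 + \sum_{k \in S_j} \delta_k^2 = \|\delta_i\|_2^2 + \|\delta_j\|_2^2, \qquad \text{since } S_i \cap S_j = \emptyset.$$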