# Passive Fingerprinting Surface #3101
Thanks for bringing these concerns up!
We're on-board with looking out for ways to reduce what fingerprinting we enable by default as UAs, and some UAs are working on ways to pare back the information reported by older "more revealing" APIs (e.g. WebGL, Canvas2D).
I hope it is reassuring that, of the avenues for consideration you've given us here, only "limit these capabilities to frames that have received an activation" has not (if my memory serves me right) been discussed yet in the group. It's an interesting idea that is worth considering as part of a passive fingerprinting mitigation toolkit for our UAs. Thank you!
Unfortunately, I think it is very difficult to get consensus here on specifically how (and to what degree) to constrain what we reveal in webgpu. In my mind, it's a very UA-centric question, without good closed-form solutions.
However, I think there are some very important subtleties here, including that, for example, GPUSupportedLimits does not in practice reveal `bits(maxTextureDimension1D) + bits(maxTextureDimension2D) + bits(maxTextureDimension3D) + bits(maxTextureArrayLayers) + ...`. Instead, what's revealed here in practice is likely only ~3-5 bits across all limits combined. I actually expect this to be better in practice than WebGL (e.g. there are no webgpu devices with less than 8k texture size support).
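To make the counting concrete, here is a purely illustrative sketch of the kind of limit-bucketing a UA could do. The tier values, the `LimitTier` shape, and the `bucketLimits` helper are all invented for this example (they are not from the spec or any UA), but they show why reported limits collapse into a handful of buckets rather than summing bit-for-bit.

```ts
// Hypothetical illustration only: a UA buckets adapter limits into a small
// number of tiers so that GPUSupportedLimits reveals few bits overall.
// The tier values below are invented for the example.
interface LimitTier {
  maxTextureDimension2D: number;
  maxBufferSize: number;
  maxComputeWorkgroupStorageSize: number;
}

const TIERS: LimitTier[] = [
  { maxTextureDimension2D: 8192, maxBufferSize: 256 * 1024 * 1024, maxComputeWorkgroupStorageSize: 16384 },
  { maxTextureDimension2D: 16384, maxBufferSize: 2 * 1024 * 1024 * 1024, maxComputeWorkgroupStorageSize: 32768 },
];

// Pick the highest tier the real hardware can satisfy, and report only that.
// Every device then lands in one of |TIERS| buckets (~1 bit here), no matter
// how many distinct raw limit combinations exist in the wild.
function bucketLimits(raw: LimitTier): LimitTier {
  for (let i = TIERS.length - 1; i >= 0; i--) {
    const t = TIERS[i];
    if (raw.maxTextureDimension2D >= t.maxTextureDimension2D &&
        raw.maxBufferSize >= t.maxBufferSize &&
        raw.maxComputeWorkgroupStorageSize >= t.maxComputeWorkgroupStorageSize) {
      return t;
    }
  }
  return TIERS[0]; // fall back to the base tier
}
```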
Another subtlety is that certain math and rasterization artifacts that reveal information about which GPU is being used are in practice inescapable without reverting to CPU rendering for the implementation. As such, it's unfortunately just not viable for privacy-absolute (e.g. Tor) users to use hardware rendering for canvas2d/webgl/webgpu, and they *must* rely on software rendering. However, even these artifacts reveal very few bits. While it's possible to trivially observe whether you're running on e.g. an Intel vs an Nvidia gpu via rasterization artifacts alone, you generally can't tell at all which e.g. Intel gpu you're running on, not even which specific generation it's from.
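For anyone unfamiliar with how those artifacts are observed, the following sketch (assuming an ordinary browser canvas2d context; the probe content is arbitrary) shows the standard technique: render antialiased text and geometry, read the pixels back, and hash them. Different GPU/driver/font stacks produce slightly different pixel values, but as noted above, the resulting hash mostly separates vendors, not individual device models.

```ts
// Sketch of how rasterization differences become observable bits:
// draw antialiased content, read it back, and hash the pixel data.
async function canvasRenderHash(): Promise<string> {
  const canvas = document.createElement("canvas");
  canvas.width = 256;
  canvas.height = 64;
  const ctx = canvas.getContext("2d")!;
  ctx.font = "18px sans-serif";
  ctx.fillText("fingerprint probe", 4, 32);  // text rasterization
  ctx.beginPath();
  ctx.arc(200, 32, 20, 0, Math.PI * 1.7);    // antialiased arc
  ctx.stroke();
  const pixels = ctx.getImageData(0, 0, 256, 64).data;
  const digest = await crypto.subtle.digest("SHA-256", pixels);
  return Array.from(new Uint8Array(digest), b => b.toString(16).padStart(2, "0")).join("");
}
```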
I know our implementations are each quite concerned with privacy, but we simply disagree on specifically how to address such concerns with these APIs today (and in many ways, we're still experimenting!). I also believe that we simply lack the knowledge of precisely how to compromise here.
I understand that, absent background info here, it's very scary to see so many theoretically-orthogonal limits (and other additive bits of information), but in practice these bits of info are extremely non-orthogonal, and what differences there *are* are sometimes (and unpredictably) infeasible to eliminate without compromises so extreme they would eliminate the API's usefulness.
---
I think the short answer is "we're working on it", but on a broader scale than right here in WebGPU. For the foreseeable future, it's not viable to ship webgpu without webgl/canvas2d, and so in order for a full shipping UA to move the needle on privacy, webgpu is not the limiting factor; rather, the UA must also have mitigations in place for webgl (and canvas2d), or there's no meaningful progress on fingerprinting. Unfortunately this will be true for many years.
I believe our current feeling (average, if not consensus) is that we aren't exposing users to any major new risks that they do not currently endure, and that this will be true for quite some time. But this is not to say that we're satisfied with the status quo. We are actively investigating UA-level approaches, working within too-flexible specs and experimenting in ways that have no clear immediate path towards consensus. We are making design decisions that preserve and extend our ability as UAs to experiment with and implement fingerprinting mitigations. (Additionally, webgpu is much more capable in its minimum-limits form, so we expect it will generally be easier for UAs to expose very few bits of information via webgpu.) These efforts are earnest and ongoing, and in the end, there's no way to handcuff a defecting UA into not exposing such information, except by our mutual trust and consent.
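As a small illustration of the minimum-limits point (assuming WebGPU type definitions such as `@webgpu/types` are available for TypeScript), an application that requests a device without any `requiredLimits` gets the spec-default limits regardless of the underlying hardware, so it never needs to consult `adapter.limits` at all:

```ts
// Sketch: apps that can live within the default (minimum) limits never have
// to read adapter.limits, which is part of why webgpu's minimum-limits form
// makes it easier for UAs to expose very few bits.
async function requestDefaultDevice(): Promise<GPUDevice> {
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) throw new Error("WebGPU not available");
  // No requiredLimits / requiredFeatures: the device is created with the
  // spec's default limits (e.g. maxTextureDimension2D === 8192), no matter
  // how capable the hardware actually is.
  return adapter.requestDevice();
}
```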
---
I will say that, if it helps satisfy any concerns, I think we could add open-ended mandates directionally towards making things better; I just don't believe we have a path towards consensus on precise or specific mitigations.
For example:
> MM: we can use "must" but not specify what gets bucketed how. "A browser must make similar devices appear as though they're identical" or something like that.
While I worry that this kind of spec wording is not directly actionable, testable, or enforceable, I believe there's room for consensus if we feel it's productive or otherwise important. I do trust UAs will do better than a pessimal implementation of the spec, though, and I want to focus on areas where I see more concerns blocking us from a shipping v1.