# DrawnApart (2022Feb Brief)
Public paper: https://orenlab.sise.bgu.ac.il/p/DrawnApart.pdf
Public repo: https://github.com/drawnapart/drawnapart
Excerpts:
> Our technique, which we call DRAWNAPART, is a
new GPU fingerprinting technique that identifies a device from
the unique properties of its GPU stack. Specifically, **we show
that variations in speed among the multiple execution units
that comprise a GPU can serve as a reliable and robust device
signature**, which can be collected using unprivileged JavaScript.
> DRAWNAPART makes two contributions to the state of the
art in browser fingerprinting. On the conceptual front, it is the
first work that explores the manufacturing differences between
identical GPUs and the first to exploit these differences in a
privacy context. **On the practical front, it demonstrates a robust
technique for distinguishing between machines with identical
hardware and software configurations**, a technique that delivers
practical accuracy gains in a realistic setting.
> In the current implementation of WebGL, a single call
to drawArrays() generates multiple drawing operations in
the underlying graphics API, which **appear to assign vertices
to EUs in a deterministic order during vertex processing**.
The operations are differentiated by a global variable, named
gl_VertexID. This special variable is an integer index for
the current vertex, intrinsically generated by the hardware in all
of the graphics APIs used to implement WebGL as it executes
gl.drawArrays. **We created a vertex shader in GLSL that
examines the gl_VertexID identifier, and executes a computationally
intensive stall function only if it matches an input
variable named shader_stalled_point_id** provided by
the JavaScript code running on the CPU.
> **By executing this parallel drawing
operation multiple times, each with a different value for
shader_stalled_point_id, we iterate over the different
EUs and measure the relative performance of each.** The output
is a trace of multiple timing measurements, corresponding to
the time taken by the targeted EU to draw the scene.
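
To make the shape of the measurement loop concrete, here is a toy simulation of the idea in the excerpt above. It is not the paper's implementation and it does not touch a GPU: the per-EU speed factors, the round-robin vertex-to-EU mapping, and the names `simulate_draw` and `collect_trace` are all assumptions made for illustration.

```python
def simulate_draw(eu_speeds, num_vertices, stalled_point_id,
                  stall_work=1000.0, base_work=1.0):
    """Model the duration of one draw call (hypothetical cost model).

    Assumes vertex i always runs on EU (i % len(eu_speeds)), and that only
    the vertex whose id equals stalled_point_id runs the expensive stall.
    """
    eu_time = [0.0] * len(eu_speeds)
    for vertex_id in range(num_vertices):
        eu = vertex_id % len(eu_speeds)
        work = stall_work if vertex_id == stalled_point_id else base_work
        eu_time[eu] += work / eu_speeds[eu]
    # The draw finishes when the slowest EU finishes.
    return max(eu_time)


def collect_trace(eu_speeds, num_vertices):
    """One modelled timing per stalled vertex id, i.e. per targeted EU."""
    return [simulate_draw(eu_speeds, num_vertices, point_id)
            for point_id in range(num_vertices)]


# Hypothetical device: four EUs with slightly different relative speeds.
device_a = [1.00, 0.98, 1.03, 1.01]
trace_a = collect_trace(device_a, num_vertices=4)
print(trace_a)
```

In this model the trace is simply the stall cost divided by each EU's speed, so the slowest EU yields the largest entry. Whether real hardware behaves anything like this is exactly what the IHV feedback below disputes.
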
TABLE I. ACCURACY GAINS ACHIEVED UNDER LAB CONDITIONS
| Device Type |GPU Device|Count |Timer |BaseRate (%) |Accuracy (%) |Gain|
|-|-|-|-|-|-|-|
|Intel i5-3470 (GEN 3 Ivy Bridge)| Intel HD Graphics 2500 |10 |Onscreen |10.0 |93.0±0.3 |9.3|
||||Offscreen |10.0 |36.3±1.6|3.6|
|Intel i5-4590 (GEN 4 Haswell) |Intel HD Graphics 4600 |23|Onscreen |4.3 |32.7±0.3 |7.6|
||||Offscreen |4.3 |63.7±0.6 |14.7|
||||GPU |4.3 |15.2±0.5 |3.5|
|Intel i5-8500 (GEN 8 Coffee Lake) |Intel UHD Graphics 630 |15|Onscreen |6.7 |42.2±0.7 |6.3|
||||Offscreen |6.7 |55.5±0.8 |8.3|
||||GPU |6.7 |53.5±0.8 |8.0|
|Intel i5-10500 (GEN 10 Comet Lake) |Nvidia GTX1650 |10 |Offscreen |10.0 |70.0±0.5 |7.0|
||||GPU |10.0 |95.8±0.9 |9.6|
|Apple Mac mini M1 |Apple M1 |4 |Offscreen |25.0 |46.9±0.4 |1.9|
||||GPU |25.0 |73.1±0.7 |2.9|
|Samsung Galaxy S8/S8+| Mali-G71 MP20| 6 |Onscreen |16.7 |36.7±2.7 |2.2|
|Samsung Galaxy S9/S9+ |Mali-G72 MP18 |6 |Onscreen |16.7 |54.3±5.5 |3.3|
|Samsung Galaxy S10e/S10/S10+ |Mali-G76 MP12 |8 |Onscreen |12.5 |54.1±1.5 |4.3|
|Samsung Galaxy S20/S20 Ultra |Mali-G77 MP11 |6 |Onscreen |16.7 |92.7±1.8 |5.6|
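
The BaseRate and Gain columns are consistent with a simple reading: the base rate is the chance of guessing the right device among `Count` identical ones (100 / Count), and the gain is accuracy divided by that base rate. The sketch below checks this against three rows of the table; the function names are ours, not the paper's.

```python
def base_rate_percent(device_count):
    """Chance (in %) of picking the right device by uniform guessing."""
    return 100.0 / device_count


def gain(accuracy_percent, device_count):
    """How many times better than uniform guessing the classifier is."""
    return accuracy_percent / base_rate_percent(device_count)


# (GPU, timer, device count, accuracy %) copied from Table I above.
rows = [
    ("Intel HD Graphics 2500", "Onscreen", 10, 93.0),
    ("Intel HD Graphics 4600", "Offscreen", 23, 63.7),
    ("Apple M1", "GPU", 4, 73.1),
]

for gpu, timer, count, accuracy in rows:
    print(f"{gpu} ({timer}): base rate {base_rate_percent(count):.1f}%, "
          f"gain {gain(accuracy, count):.1f}")
```

Note that gain is bounded by the device count, so the small corpora (for example, 4 Mac minis) cannot show large gains even at high accuracy.
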
> [WebGPU and WebGL 2.0 Compute] introduce compute shaders, a form of computational
pipeline that coexists with the existing graphics
pipeline. One significant feature offered to compute shaders
is the ability to synchronize among different work units, by
using atomic functions, message queueing or shared memory.
We used this synchronization primitive to prototype a faster
fingerprinting technique for WebGL 2.0 Compute. **In our prototype,
all workers race to acquire a mutex, and we record the
order in which the different work units were granted the mutex.**
We tested this fingerprinting technique on our GEN 3 corpus,
after enabling WebGL 2.0 Compute support in Chrome through
a command-line parameter. **This compute-based fingerprint
delivered a near-perfect classification accuracy of 98%, while
taking only 150 milliseconds to run**, much faster than the
onscreen fingerprint which took a median time of 8 seconds
to collect. **We believe that a similar method can also be found
for the WebGPU API once it becomes generally available.** The
effects of accelerated compute APIs on user privacy should be
considered before they are enabled globally.
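
For readers unfamiliar with the mutex-race idea, the sketch below shows the kind of data such a fingerprint records, using ordinary CPU threads. This is only an analogy under stated assumptions: it runs under Python's thread scheduler, not on GPU work units, and `race_for_mutex` is a hypothetical name. It says nothing about whether the ordering is stable or device-specific on real GPUs.

```python
import threading


def race_for_mutex(num_workers):
    """Release all workers at once and record the order they get the lock."""
    mutex = threading.Lock()
    start_line = threading.Barrier(num_workers)
    order = []

    def worker(worker_id):
        start_line.wait()            # all workers start racing together
        with mutex:                  # only one worker holds the mutex at a time
            order.append(worker_id)  # record when this worker was granted it

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return order


acquisition_order = race_for_mutex(8)
print(acquisition_order)
```

The output is a permutation of the worker ids. In the paper's prototype, the claim is that the analogous permutation from GPU work units carries enough device-specific signal to classify on.
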
> **The silicon-based physically
unclonable function (PUF) concept is based on the idea
that, even if a set of several integrated circuits is created
through an identical manufacturing process, each circuit is actually
slightly different due to normal manufacturing variability.**
This variability can be used as a unique device fingerprint
based on hardware. Examples of silicon PUF sources include
logic race conditions [44, 71], Rowhammer behavior [21], and
SRAM initialization data [45, 46]. Rührmair et al. [64] defined
a fingerprint as “a small, fixed set of unique analog properties”,
and explain that the fingerprint should be measured quickly and
preferably by an inexpensive device. In this work the GPU is
used as a PUF, and our challenge is how to successfully capture
the PUF behavior while using the limited APIs available to a
web browser.
## Status: Big If True
I would describe the (prior) consensus in the WebGL group as "tentatively unconvinced".
We received technical feedback from two IHVs (unnamed here). One's position was "if the approach is successfully measuring something, it isn't what they claim they're measuring" and "execution units don't work that way". Last I heard, the other was somewhat more credulous and was looking into it. (KR, have we heard more? (Not since --KR))
More superficial investigations haven't demonstrated a clear signal here. The approach seems to hinge on opaque ML training and classification, which makes it hard for our own (Graphics) teams to reproduce.
Note that a citation added at the request of the WebGL working group may help explain the differences between specific chips:
J. von Kistowski, H. Block, J. Beckett, C. Spradling, K.-D. Lange, and S. Kounev, “Variations in CPU power consumption,” in Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering, ser. ICPE ’16. New York, NY, USA: Association for Computing Machinery, 2016, pp. 147–158. Online: https://doi.org/10.1145/2851553.2851567
## Next Steps
We need to spin up our own reproductions and evaluate this proposed attack vector. We have some theories about possible mitigations for some classes of operations in WebGL, but we can't hope to test solutions without being able to reproduce the problem. Engagement with ML experts would help as well. The paper has passed peer review as a prerequisite for publication, so the responsibility for investigation is now on us.