# DrawnApart (2022Feb Brief)

Public paper: https://orenlab.sise.bgu.ac.il/p/DrawnApart.pdf

Public repo: https://github.com/drawnapart/drawnapart

Excerpts:

> Our technique, which we call DRAWNAPART, is a new GPU fingerprinting technique that identifies a device from the unique properties of its GPU stack. Specifically, **we show that variations in speed among the multiple execution units that comprise a GPU can serve as a reliable and robust device signature**, which can be collected using unprivileged JavaScript.

> DRAWNAPART makes two contributions to the state of the art in browser fingerprinting. On the conceptual front, it is the first work that explores the manufacturing differences between identical GPUs and the first to exploit these differences in a privacy context. **On the practical front, it demonstrates a robust technique for distinguishing between machines with identical hardware and software configurations**, a technique that delivers practical accuracy gains in a realistic setting.

> In the current implementation of WebGL, a single call to drawArrays() generates multiple drawing operations in the underlying graphics API, which **appear to assign vertices to EUs in a deterministic order during vertex processing**. The operations are differentiated by a global variable, named gl_VertexID. This special variable is an integer index for the current vertex, intrinsically generated by the hardware in all of the graphics APIs used to implement WebGL as it executes gl.drawArrays. **We created a vertex shader in GLSL that examines the gl_VertexID identifier, and executes a computationally intensive stall function only if it matches an input variable named shader_stalled_point_id** provided by the JavaScript code running on the CPU.

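
For anyone attempting a reproduction, the stall mechanism quoted above might be sketched roughly as follows. This is not the paper's code: the stall body, the iteration count, the `GlLike` interface, and the `collectTrace` helper are all hypothetical, and the shader assumes GLSL ES 3.00 (WebGL 2) so that `gl_VertexID` is available. The `gl.finish()` call is only a simple stand-in; the paper's onscreen, offscreen, and GPU timers are not reproduced here.

```typescript
// Hypothetical GLSL ES 3.00 vertex shader: only the vertex whose gl_VertexID
// equals shader_stalled_point_id runs the expensive loop.
const STALL_VERTEX_SHADER = `#version 300 es
uniform int shader_stalled_point_id;

// Placeholder stall; the math and the iteration count are assumptions.
float stall() {
  float acc = 1.0;
  for (int i = 0; i < 50000; i++) {
    acc = sin(acc) + 1.5;
  }
  return acc;
}

void main() {
  float sink = 0.0;
  if (gl_VertexID == shader_stalled_point_id) {
    sink = stall();
  }
  gl_PointSize = 1.0;
  // Fold the result into the position so the compiler cannot drop the loop.
  gl_Position = vec4(sink * 1e-9, 0.0, 0.0, 1.0);
}
`;

// Minimal subset of a WebGL2 context used below, declared locally so the
// sketch also type-checks outside a browser.
interface GlLike {
  readonly POINTS: number;
  getUniformLocation(program: unknown, name: string): unknown;
  uniform1i(location: unknown, value: number): void;
  drawArrays(mode: number, first: number, count: number): void;
  finish(): void;
}

// Draws pointCount points once per candidate vertex id, stalling a different
// vertex each time, and returns one elapsed-time sample per stalled id.
// Assumes `program` is already linked and made current with gl.useProgram().
function collectTrace(
  gl: GlLike,
  program: unknown,
  pointCount: number,
  now: () => number,
): number[] {
  const location = gl.getUniformLocation(program, "shader_stalled_point_id");
  const trace: number[] = [];
  for (let stalledId = 0; stalledId < pointCount; stalledId++) {
    gl.uniform1i(location, stalledId);
    const start = now();
    gl.drawArrays(gl.POINTS, 0, pointCount);
    gl.finish(); // stand-in for waiting until the draw has completed
    trace.push(now() - start);
  }
  return trace;
}
```

In a browser, `gl` would be a `WebGL2RenderingContext` and `now` could be `() => performance.now()`. Injecting the clock keeps the loop testable without a GPU.
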

> **By executing this parallel drawing operation multiple times, each with a different value for shader_stalled_point_id, we iterate over the different EUs and measure the relative performance of each.** The output is a trace of multiple timing measurements, corresponding to the time taken by the targeted EU to draw the scene.

TABLE I. ACCURACY GAINS ACHIEVED UNDER LAB CONDITIONS

| Device Type | GPU Device | Count | Timer | Base Rate (%) | Accuracy (%) | Gain |
|-|-|-|-|-|-|-|
| Intel i5-3470 (GEN 3 Ivy Bridge) | Intel HD Graphics 2500 | 10 | Onscreen | 10.0 | 93.0±0.3 | 9.3 |
| | | | Offscreen | 10.0 | 36.3±1.6 | 3.6 |
| Intel i5-4590 (GEN 4 Haswell) | Intel HD Graphics 4600 | 23 | Onscreen | 4.3 | 32.7±0.3 | 7.6 |
| | | | Offscreen | 4.3 | 63.7±0.6 | 14.7 |
| | | | GPU | 4.3 | 15.2±0.5 | 3.5 |
| Intel i5-8500 (GEN 8 Coffee Lake) | Intel UHD Graphics 630 | 15 | Onscreen | 6.7 | 42.2±0.7 | 6.3 |
| | | | Offscreen | 6.7 | 55.5±0.8 | 8.3 |
| | | | GPU | 6.7 | 53.5±0.8 | 8.0 |
| Intel i5-10500 (GEN 10 Comet Lake) | Nvidia GTX1650 | 10 | Offscreen | 10.0 | 70.0±0.5 | 7.0 |
| | | | GPU | 10.0 | 95.8±0.9 | 9.6 |
| Apple Mac mini M1 | Apple M1 | 4 | Offscreen | 25.0 | 46.9±0.4 | 1.9 |
| | | | GPU | 25.0 | 73.1±0.7 | 2.9 |
| Samsung Galaxy S8/S8+ | Mali-G71 MP20 | 6 | Onscreen | 16.7 | 36.7±2.7 | 2.2 |
| Samsung Galaxy S9/S9+ | Mali-G72 MP18 | 6 | Onscreen | 16.7 | 54.3±5.5 | 3.3 |
| Samsung Galaxy S10e/S10/S10+ | Mali-G76 MP12 | 8 | Onscreen | 12.5 | 54.1±1.5 | 4.3 |
| Samsung Galaxy S20/S20 Ultra | Mali-G77 MP11 | 6 | Onscreen | 16.7 | 92.7±1.8 | 5.6 |

> [WebGPU and WebGL 2.0 Compute] introduce compute shaders, a form of computational pipeline that coexists with the existing graphics pipeline. One significant feature offered to compute shaders is the ability to synchronize among different work units, by using atomic functions, message queueing or shared memory. We used this synchronization primitive to prototype a faster fingerprinting technique for WebGL 2.0 Compute.
> **In our prototype, all workers race to acquire a mutex, and we record the order in which the different work units were granted the mutex.** We tested this fingerprinting technique on our GEN 3 corpus, after enabling WebGL 2.0 Compute support in Chrome through a command-line parameter. **This compute-based fingerprint delivered a near-perfect classification accuracy of 98%, while taking only 150 milliseconds to run**, much faster than the onscreen fingerprint which took a median time of 8 seconds to collect. **We believe that a similar method can also be found for the WebGPU API once it becomes generally available.** The effects of accelerated compute APIs on user privacy should be considered before they are enabled globally.

> **The silicon-based physically unclonable function (PUF) concept is based on the idea that, even if a set of several integrated circuits is created through an identical manufacturing process, each circuit is actually slightly different due to normal manufacturing variability.** This variability can be used as a unique device fingerprint based on hardware. Examples of silicon PUF sources include logic race conditions [44, 71], Rowhammer behavior [21], and SRAM initialization data [45, 46]. Ruhrmair et al. [64] defined a fingerprint as “a small, fixed set of unique analog properties”, and explain that the fingerprint should be measured quickly and preferably by an inexpensive device. In this work the GPU is used as a PUF, and our challenge is how to successfully capture the PUF behavior while using the limited APIs available to a web browser.

## Status: Big If True

I would describe the (prior) consensus in the WebGL group as "tentatively unconvinced".

We received technical feedback from two (unnamed here) IHVs. One's position was "if the approach is successfully measuring something, it isn't what they claim they're measuring." "Execution units don't work that way".
Last I heard, the other was somewhat more credulous, and looking into it. (KR, have we heard more? (Not since --KR))

More superficial investigations haven't demonstrated a clear signal here. The approach seems to hinge on the opaque ML training and classification, which makes it hard to reproduce in our own (Graphics) teams.

Note that a citation added at the request of the WebGL working group may help explain the differences between specific chips:

J. von Kistowski, H. Block, J. Beckett, C. Spradling, K.-D. Lange, and S. Kounev, “Variations in CPU power consumption,” in Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering, ser. ICPE ’16. New York, NY, USA: Association for Computing Machinery, 2016, pp. 147–158. Online: https://doi.org/10.1145/2851553.2851567

## Next Steps

We need to spin up our own reproductions and evaluate this proposed attack vector. We have some ideas for possible mitigations for some classes of WebGL operations, but we can't hope to test solutions without being able to reproduce the problem. Engagement with ML experts would help as well.

The paper has gone through a peer review process as a prerequisite for publication, so the responsibility for investigation is on us now.
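
Since the opaque classification step is what makes the result hard to check, an in-house reproduction could start from a deliberately transparent baseline. The sketch below uses a nearest-centroid classifier, which is not the paper's classifier and is chosen only because every step can be inspected. It also computes gain the way Table I's numbers imply (base rate = 100 / device count, gain = accuracy / base rate). The `LabeledTrace` type and all function names are hypothetical.

```typescript
// One timing trace captured from a known device (hypothetical shape).
type LabeledTrace = { device: string; trace: number[] };

// Averages the training traces of each device into a per-device centroid.
function buildCentroids(training: LabeledTrace[]): Map<string, number[]> {
  const sums = new Map<string, { total: number[]; count: number }>();
  for (const { device, trace } of training) {
    let entry = sums.get(device);
    if (entry === undefined) {
      entry = { total: new Array<number>(trace.length).fill(0), count: 0 };
      sums.set(device, entry);
    }
    for (let i = 0; i < trace.length; i++) {
      entry.total[i] += trace[i];
    }
    entry.count += 1;
  }
  const centroids = new Map<string, number[]>();
  sums.forEach((entry, device) => {
    centroids.set(device, entry.total.map((v) => v / entry.count));
  });
  return centroids;
}

// Returns the device whose centroid is closest in squared Euclidean distance.
function classify(trace: number[], centroids: Map<string, number[]>): string {
  let best = "";
  let bestDistance = Infinity;
  centroids.forEach((centroid, device) => {
    const distance = centroid.reduce(
      (acc, c, i) => acc + (c - trace[i]) ** 2,
      0,
    );
    if (distance < bestDistance) {
      bestDistance = distance;
      best = device;
    }
  });
  return best;
}

// Gain over random guessing among deviceCount identical devices.
function gain(accuracyPercent: number, deviceCount: number): number {
  const baseRatePercent = 100 / deviceCount;
  return accuracyPercent / baseRatePercent;
}
```

If a baseline this simple shows no gain over the base rate on our own devices while the paper's pipeline does, that would at least narrow down where the signal is coming from.
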
