On Privacy Notions in Anonymous Communication - 2019

# On Privacy Notions in Anonymous Communication - 2019 ###### tags: `Tag(HashCloak - Validator Privacy)` Authors: Christiane Kuhn*, Martin Beck, Stefan Schiffner, Eduard Jorswieck, and Thorsten Strufe Paper: https://petsymposium.org/2019/files/papers/issue2/popets-2019-0022.pdf Definitions: * Following the standard in cryptology, we use the term **privacy notion**, to describe such a formalized privacy goal that defines properties to be hidden from the adversary. ### Table of Contents [toc] :::info >Abstract: Many anonymous communication networks (ACNs) with different privacy goals have been developed. Still, there are no accepted formal definitions of privacy goals, and ACNs often define their goals ad hoc. ::: ## 1 Introduction > To protect metadata from state and industrial surveillance, a broad variety of anonymous communication networks (ACNs) has emerged; one of the most deployed is Tor, but also others, e.g. I2P or Freenet, are readily available. Additionally, many conceptual systems, like MixNets, DC-Nets, Loopix and Crowds have been published. The published ACNs address a variety of privacy goals. However, many definitions of privacy goals are ad hoc and created for a particular use case. In general, comparing privacy goals is difficult since their formalization is often incompatible and their naming confusing. This has contributed to a situation where existing informal comparisons disagree: e.g., Sender Unlinkablity of Hevia and Micciancio’s framework and Sender Anonymity of AnoA are both claimed to be equivalent to Sender Anonymity of Pfitzmann and Hansen’s terminology, but significantly differ in the protection they actually ;provide. These naming issues further complicate understanding of privacy goals and hence analysis of ACNs. For every pair of notions it must be clear if one is stronger or weaker than the other, or if they have no direct relationship. Such a comparison has already been made for the notions of semantic security In summary, our main contributions are: * the mapping of practitioners’ intuitions to gamebased proofs, * the definition of building blocks for privacy notions, * the selection and unified definition of notions, * a complete hierarchy of privacy notions, which simplifies comparison of ACNs, and * the resolution of inconsistencies and revision of mistakes in previous (frame)works. ## 2 Overview We start with an example of a use case and the corresponding implicit privacy goal, to then introduce the idea of the related indistinguishability game. * Example: Alice is a citizen of a repressive regime and engaged with a resistance group. Despite the regime’s sanctions on distributing critical content, Alice wants to publish her latest critical findings. * A vanilla encryption scheme would reduce Alice’s potential audience and thus does not solve her problem. * Hence, she needs to hide the link between the message and herself as the sender. * We call this goal sender-message unlinkability. 2. We show how such a game works and what it means for a protocol to be secure according to this goal. 3. By adopting the game we sketch how privacy goals can be formalized as notions and provide an intuition for the relations between different goals. ### First Attempt We start by presenting an easy game, that at first glance looks like the correct formalization for the goal of the example, but turns out to model an even stronger goal. ![](https://i.imgur.com/XUJT2nA.png) * The adversary wins the game if she guesses the correct scenario. * If she can devise a strategy that allows her to win the game repeatedly with a probability higher than random guessing: * She must have learned some information that is supposed to be protected, here the sender (e.g. that Alice is more probable the sender of the message than Charlie), since everything else was identical in both scenarios. * Hence, we say that, if the adversary can find such a strategy, we do not consider the analyzed protocol secure regarding the respective privacy goal. ### Why This is Too Strong As an example, consider the following two scenarios: (1) Alice and Bob send messages (2) Charlie and Dave send messages. * If the adversary can learn the active senders, she can distinguish the scenarios and win the game. * However, if she only learns the set of active senders, she may still not know who of the two active senders in the played scenario actually sent the regimecritical content. * Thus, a protocol hiding the information of who sent a message within a set of active senders is good enough for the given example. * Yet, it is considered insecure regarding the above game, since an adversary can learn the active senders. * Hence, the game defines a goal stronger than the required sender-message unlinkability. ### Correcting the Formalization However, the game of Fig. 1 can be adjusted to model sender-message unlinkability. * We desire that the only information about the communications that differs between the scenarios is who is sending which message. * Thus, we allow the adversary to pick scenarios that differ in the senders, but not in the activity of the senders, i.e. the number of messages each active sender sends. * This means, we change what the adversary is allowed to submit in step 1 and what the challenger checks in step 2. * So, if the adversary now wants to use Alice and Charlie, she has to use both in both scenarios. * Hence, given an ACN where this game cannot be won, the adversary is not able to distinguish whether Alice or another active user sent the regime-critical message. * The adversary might learn, e.g. that someone sent a regime-critical message and the identities of all active senders (here that Alice and Charlie are active senders). * However, since none of this is sanctioned in the above example, Alice is safe, and we say such an ACN provides sender-message unlinkability. ### Lessons Learned Depending on the formalized privacy goal (e.g. sender unobservability) the scenarios are allowed to differ in certain properties of the communications (e.g. the active senders) as we have illustrated in two example games. * We explain and define this general game in Section 3. We then define the properties (e.g. that the set of active senders can change) in Section 4 and build notions (e.g. for sender unobservability) upon them in Section 5. * Every protocol achieving sender unobservability also achieves sender-message unlinkability. * Intuitively, if whether Alice is an active sender or not is hidden, whether she sent a certain message or not is also hidden. ## 3. Our Game model Our goal in formalizing the notions as a game is to analyze a given ACN protocol w.r.t. to a notion, i.e. the game is a tool to investigate if an adversary can distinguish two self-chosen, notion-compliant scenarios. * Scenarios are sequences of communications. * A communication is described by its sender, receiver, message and auxiliary information (e.g. session identifiers) or the empty communication, signaling that nobody wants to communicate at this point. * Some protocols might restrict the information flow to the adversary to only happen at specific points in the execution of the protocol. * Therefore, we introduce batches as a sequence of communications, which is processed as a unit before the adversary observes anything. For a given ACN and notion, our general game is simply: * Instantiated with a model of the ACN, which we call the protocol model, and the notion. * The protocol model accepts a sequence of communications as input. * Similar to the real implementations the outputs of the protocol model are the observations the real adversary can make. In the simplest version of the game, the adversary constructs two scenarios, which are just two batches of communications and sends them to the challenger. * The challenger checks that the batches are compliant with the notion. * If so, the challenger tosses a fair coin to randomly decide which of the two batches it executes with the protocol model. * The protocol model’s output is returned to the game adversary. * Based on this information, the game adversary makes a guess about the outcome of the coin toss. > We extend this simple version of the game, to allow the game adversary to send multiple times two batches to the challenger. > To unfetter our general game from the concrete adversary model, we allow the adversary to send protocol queries. ### Formalization In this subsection, we formalize the game model to conform to the above explanation. * We use $Π$ to denote the analyzed ACN protocol model. * $Ch$ for the challenger. * $A$ for the adversary. * Which is a probabilistic polynomial time algorithm. * Additionally, we use $X$ as a placeholder for the specific notion if we explain or define something for all the notions. * A $communication$ $r$ in $Π$ is represented by a tuple $(u, u', m, aux)$ with * A sender $u$. * A receiver $u'$. * A message $m$. * And auxiliary information $aux$ (e.g. session identifiers). * Further, we use $\lozenge$ instead of the communication tuple $(u, u' , m, aux)$ to represent that no communication occurs. Communications are clustered into $batches$ * $\underline{r} _b = (r_ {b_{1}} , . . . , r_{b_{l}} )$ * with $r_{b_{i}}$ being the $i$-th communication of batch $\underline{r}_b$. > Note that we use $\underline{r}$ (underlined) to identify batches and $r$ (no underline) for single communications. * Batches in turn are clustered into scenarios * the first scenario is $(\underline{r} _{0_{1}} , . . . , \underline{r} _{0_{k}}$ )$. * A challenge is defined as the tuple of two scenarios $((\underline{r} _{0_{1}} , . . . , \underline{r} _{0_{k}}), (\underline{r} _{1_{1}} , . . . , \underline{r} _{1_{k}}))$. ![](https://i.imgur.com/LxWcr6Y.png) **Achieving Notion X.** Intuitively, a protocol $Π$ achieves a notion $X$ if any possible adversary has at most negligible advantage in winning the game. To formalize the informal understanding of $Π$ achieving goal $X$, we need the following denotation: * $Pr[g = A || Ch(Π, X, b)i]$ describes the probability that $A$ outputs $g$, when $Ch$ is instantiated with $Π$ and $X$ and the challenge bit was chosen to be $b$. * With this probability, achieving a notion translates to Definition 1 ![](https://i.imgur.com/Bp7cwIo.png) ## 4 Protected Properties Properties are defined to specify which information about the communication is allowed to be disclosed to the adversary, and which must be protected to achieve a privacy notion, as mentioned in Section 2. ### 4.1 Simple Properties ![](https://i.imgur.com/Mjxwnfp.png) ### 4.2 Complex Properties Various simple properties to protect senders, messages, receivers, their activity, frequency and the grouping of messages are mentioned in Section 4.1. However, this is not sufficient to formalize several relevant privacy goals, and we must hence introduce complex properties. **Learning Sender and Receiver.** Consider that one aims to hide which sender is communicating with which receiver. * An intuitive solution may be to model this goal by allowing the adversary to pick different senders and receivers $(E_{SR})$ in both scenarios (see Fig. 2 (a) for an example). * This, however, does not actually model the privacy goal. * By identifying only the sender or only the receiver of the communication, the game adversary could tell which scenario was chosen by the challenger. * We hence must extend the simple properties and introduce scenario instances to model dependencies. Scenario Instances: * We now require the adversary to give alternative instances for both scenarios (Fig. 2 (b)). * The challenger chooses the scenario according * To the challenge bit. * Which is picked randomly for every game * And the instance according to the instance bit. * Which is picked randomly for every challenge. ![](https://i.imgur.com/QnguT88.png) * This allows us to model the goal that the adversary is not allowed to learn the sender and receiver: * We allow the adversary to pick two sender-receiver pairs. * Which she uses as instances for the first scenario. * The mixed sender-receiver pairs must then be provided as instances for the second scenario (see Fig. 2 (b)). * We thus force the game adversary to provide alternative assignments for each scenario. * This way she cannot abuse the model to win the game by identifying only the sender or the receiver. * We call this property Random Sender Receiver $R_{SR}$. We further restrict the adversary picking instances where both senders and both receivers are active by defining the property Mix Sender Receiver $M_{SR}$. * Here, the adversary picks two instances for $b = 0$ where her chosen sender-receiver pairs communicate, and two for $b = 1$ where the mixed senderreceiver pairs communicate. The two instances simply swap the order in which the pairs communicate (Fig. 2 (c )). * This way, we force the adversary to provide alternative assignments for each scenario where both suspected senders and both suspected receivers are active. * This combination prevents the adversary from winning the game without learning the information that the real system is actually supposed to protect, i.e. the sender-receiver pair. ![](https://i.imgur.com/vE65MEd.png) **Defining Complex Properties.** To simplify the formal definition of complex properties, we introduce $challenge rows$. * . A challenge row is a pair of communications with the same index that differ in the two scenarios. * (e.g. $r_{0_{j}},r_1{_{j}}$ with index $j$). * For complex properties, the challenger only checks the differences of the challenge rows in the two scenarios. ![](https://i.imgur.com/UpmLWmp.png) ![](https://i.imgur.com/rgsng9y.png) **Linking Message Senders.** A final common privacy goal that still cannot be covered is the unlinkability of senders over a pair of messages (Twice Sender Unlinkability). * Assume a real world adversary that can determine that the sender of two messages is the same entity. * If subsequently she discovers the identity of the sender of one of the messages through a side channel, she can also link the second message to the same individual. * To model this goal, we need two scenarios: 1. Both messages are sent by the same sender. 2. Each message is sent by a different sender. * The concept of $stages$ is introduced: * Only one sender sends in the challenge rows of stage 1. * In stage 2: * Either the same sender continues sending (b = 0) * Or another sender sends those messages (b = 1). * This behavior is specified as the property $Twice Sender T_S$. ![](https://i.imgur.com/hmiFiE4.png) * Hence, we need to facilitate distinct stages for notions with the complex properties TS or TR. * Precisely, in step 2 of the game, the adversary is additionally allowed to switch the stages. ## 5 Privacy Notions Given the properties above (Section 4), we can now set out to express intuitive privacy goals as formal privacy notions. We start by specifying sender unobservability as an example leading to a general definition of our privacy notions. * Recall the first game we defined in Section 2, which corresponds to sender unobservability $(S\overline{O} = S(ender) \lnot O(bservability))$. * There, in both scenarios we need to specify that sending nothing is not allowed: ![](https://i.imgur.com/SR1WNK7.png) * we also need the property that everything but the senders is equal: $E_S$. * Hence, we define sender unobservability as $S\overline{O} :=$![](https://i.imgur.com/iJniYiU.png) $∧E_S$. We define all other notions in the same way: ![](https://i.imgur.com/f5INRSR.png) ![](https://i.imgur.com/jGb7wpf.png) Modeling the notions as a game, the respective challenger will check all aspects of the adversary’s queries. ## 6 On the Choice of Notions The space of possible combinations of properties, and hence of conceivable privacy notions, is naturally large. Due to this, we verify our selection of privacy goals by finding exemple use cases. Additionally, we demonstrate the choice and the applicability of our definition by analyzing the privacy goals of Loopix, an ACN that was recently published. ## 6.1 6.1 Example Use Cases for the Notions We illustrate our notions by continuing the example of an activist group trying to communicate in a repressive regime, although our notions are generally applicable. ### 6.1.1 Impartial Privacy Notions These notions treat senders and receivers equally. **Message Observability.** The content of messages can be learned in notions of this group, as messages are not considered confidential. * Because the real world adversary can learn the content, we must prevent her from winning the game trivially by choosing different content. * Hence, such notions use the property that the scenarios are identical except for the senders and receivers $(E_{SR})$. * This ensures that the messages are equal in both scenarios. **Sender-Receiver Linkability (Message Confidentiality).** Senders and receivers can be learned in notions of this group, because they are not considered private. * Hence, such notions include the property that the scenarios are identical, except for the messages (EM) * This ensures that the sender-receiver pairs are equal in both scenarios. **Both-Side Unobservability.** Even the activity of a certain sender or receiver is hidden in notions of this group. * In this case, nothing is disclosed about senders, receivers, messages or their combination. * However, the adversary can learn the total number of communications happening in the ACN. * Modeling this, we need to assure that for every communication in the first scenario, there exists one in the second. > Note that this is the only notion where the existence of a communication is hidden. All other notions include ![](https://i.imgur.com/pGcit2L.png) and hence do not allow for the use of the empty communication. #### 6.1.2 Sender (and Receiver) Privacy Notions These notions allow a greater freedom in picking the senders (or receivers: analogous notions are defined for receivers.). **Receiver-Message Linkability.** The receiver-message relation can be disclosed in notions of this group. Hence, such notions include the property that the scenarios are identical except for the senders $(E_S)$ to ensure the receiver-message relations are equal in both scenarios. * In $Sender-Message$ $Unlinkability$ $(S M \overline{L})$: the total number of communications and how often each user sends can be additionally learned. * However, who sends which message is hidden. * In $Sender-Frequency$ $Unlinkability$ $(S F \overline{L})$: the set of users and the total number of communications can be additionally disclosed. * However, how often a certain user sends is hidden, since it can vary between the two scenarios. * In $Sender Unobservability$ $(S\overline{O})$: the total number of communications can additionally be disclosed. * However, especially the set of active senders $U_b$ is hidden. If a notion further includes the following abbreviations, the following information can be disclosed as well: * With User Number Leak $(−|U|)$: the number of senders that send something in the scenario. * With Histogram Leak $(−H)$: the histogram of how many senders send how often. * With Pseudonym Leak $(−P)$: which messages are sent from the same user. ### 6.2 Analyzing Loopix’s Privacy Goals > To check if we include currently-used privacy goals, we decide on a current ACN that has defined its goals based on an existing analytical framework and which has already been analyzed: the Loopix anonymity system \[15\]. In this section, we show that the privacy goals of Loopix map to notions we have defined (although the naming differs). * Loopix aims for: * Sender-Receiver ThirdParty Unlinkability. * Sender online Unobservability. * And Receiver Unobservability. **Sender-Receiver Third-Party Unlinkability.** Sender-Receiver Third-Party Unlinkability means that an adversary cannot distinguish scenarios where two receivers are switched: > “The senders and receivers should be unlinkable by any unauthorized party. Thus, we consider an adversary that wants to infer whether two users are communicating. We define sender-receiver third party unlinkability as the inability of the adversary to distinguish whether $\{S_1 → R_1, S_2 → R_2\}$ or $\{S_1 → R_2, S_2 → R_1\}$ for any concurrently online honest senders $S_1, S_2$ and honest receivers $R_1, R_2$ of the adversary’s choice.” \[15\] The definition in Loopix allows the two scenarios to be distinguished by learning the first receiver. * We interpret the notion such that it is only broken if the adversary learns a sender-receiver-pair, which we assume is what is meant in \[15\]. * This means that the sender and receiver of a communication must be learned and is exactly the goal that motivated our introduction of complex properties: $(SR)\overline{L}$. **Unobservability.** In sender online unobservability the adversary cannot distinguish whether an adversarychosen sender communicates $(\{S →\})$ or not $({S \nrightarrow})$: > “Whether or not senders are communicating should be hidden from an unauthorized third party. We define sender online unobservability as the inability of an adversary to decide whether a specific sender $S$ is communicating with any receiver $\{S →\}$ or not $({S \nrightarrow})$, for any concurrently online honest sender $S$ of the adversary’s choice.” \[15\] * Receiver unobservability is defined analogously. * Those definitions are open to interpretation. * On the one hand, $\{S \nrightarrow \}$ can mean that there is no corresponding communication in the other scenario. * When a sender is not sending in one of the two scenarios, this means that there will be a receiver receiving in the other, but not in this scenario. * On the other hand, $\{S \nrightarrow \}$ can mean that sender u does not send anything in this challenge. * In this case, the receivers can experience the same behavior in both scenarios and the notions differ. **Remark.** We do not claim that the Loopix system achieves or does not achieve any of these notions, since we based our analysis on the definitions of their goals, which were not sufficient to unambiguously derive the corresponding notions. ### 6.3 Relation to Existing Analysis Frameworks In this section, we briefly introduce the existing frameworks based on indistinguishability games. We argue that our summary of notions includes all their notions and therefore allows a comparison along this dimension. ##### AnonA Framework AnoA \[1\] builds its privacy notions on $(\epsilon, δ)$ differential privacy and compares them to their interpretation of the terminology paper of Pfitzmann and Hansen \[14\]. * Conceptually our model differs from AnoA’s model in the definition of achieving a notion, batch queries, and the use of notions instead of anonymity functions. * AnoA’s definition of achieving a notion can be easily included (see Appendix B), if needed. * In AnoA, the adversary gets information after every communication. * This is equivalent to multiple batches of size one in our case. ##### Bohli’s Framework Bohli and Pashalidis \[3\] build a hierarchy of applicationindependent privacy notions based on what they define as “interesting properties”, that the adversary is or is not allowed to learn. * Additionally, they compare their notions to Hevia’s, which we introduce next, and find equivalences. ##### Hevia’s Framework Hevia and Micciancio \[12\] define scenarios based on message matrices. * Those message matrices specify who sends what message to whom. * Notions restrict different communication properties like the number or set of sent/received messages per fixed user, or the number of total messages. * Further, they construct a hierarchy of their notions and give optimal ACN protocol transformations that, when applied, lead from weaker to stronger notions. In contrast, our model considers the order of communications. * Analyzing protocol models that ignore the order will lead to identical results * However, protocol models that consider the order do not achieve a notion. * Although they would in Hevia’s framework, if an attack based on the order exists. ##### Gelernter’s Framework Gelernter and Herzberg \[10\] extend Hevia’s framework to include corrupted participants. * Additionally, they show that under this strong adversary an ACN protocol achieving the strongest notions exists * However, they prove that any ACN protocol with this strength has to be inefficient, i.e. the message overhead is at least linear in the number of honest senders. * Further, they introduce relaxed privacy notions that can be efficiently achieved. ## 7 Hierarchy > Next, we want to compare all notions and establish their hierarchy. For any pair of notions we analyze which one is stronger than, i.e. implies, the other. * Any ACN achieving the stronger notion also achieves the weaker (implied) one. * Our result is shown in Figure 3, where all arrow types represent implications. ![](https://i.imgur.com/5Nmb0Uw.png) ## 8 Discussion In this section, we present the lessons learned while creating our framework. **Learning about privacy goals.** The need for formal definitions is emphasized by the mapping of Loopix’s privacy goals to notions as example that less formal alternatives leave room for interpretation. Further, a result like our hierarchy would be much harder to achieve without formal definitions. * These definitions allow us to point out the relation of privacy and confidentiality $(M\overline{O} − |M|)$. * Note that any privacy notion implying $M\overline{O} − |M|$ can be broken by distinguishing the message’s content. * Further, nearly all those notions also imply $M\overline{O}$ and hence, all such notions can be broken by learning the message length. We present a broad selection of privacy notions. We are aware that understanding them all in detail might be a challenging task, so we want to provide some intuitions and preferences, based on what we know and conjecture: * We expect the lower part of the hierarchy to be more important for ACNs as \[10\] already includes an inefficiency result for $S\overline{O}$ and thus for all notions implying $S\overline{O}$. * As a first guess, we think $S\overline{O}$, if higher overhead is manageable, $SF\overline{L}, SM\overline{L}, (SM)\overline{L}$ (and receiver counterparts), $M\overline{O} − |M|$ and $(SR)\overline{L}$ are the most popular notions for ACNs. * Further, we want to add some results concerning two well-known systems to ease intuition. * \[1\]’s analysis of Tor results in a small, but non-negligible probability to break $S\overline{O}$ and thus Tor does not achieve $S\overline{O}$ with our strict definition. * Classical DC-Nets, on the other hand, do achieve at least $S\overline{O} −P$ \[10\]. > Personal Note: Seems like a good direction in terms of VP regardless of latency issues in mix-net systems. **Correcting Inconsistencies.** While the above similarities most likely stem from the influence of prior informal work on privacy goals, attempts to provide concrete mappings have led to contradictions. * We belive that AnoA’s sender anonymity should be called sender unobservability, which is also our name for the corresponding notion. * This follows the naming proposal of Pfitzmann and Hansen and their mapping to Hevia. * It is also more suitable because AnoA’s sender anonymity can be broken by learning whether a certain sender is active. * i.e. sends a real message, in the system $(u ∈ U_b)$. * In order to achieve this notion, all senders have to be unobservable. ## 9 Conclusion and Future Work We formally specified privacy goals from ACNs and sorted them into a proven hierarchy, according to their strength. * This means, for every pair of notions, we know which one is the stronger; or if they do not imply each other. * As a result, we resolved inconsistencies between existing analytical frameworks and built the foundations to understand the strengths and weaknesses of ACNs better. * Which helps analyzing and building improved ACNs. **Future Work.** As we mentioned in the discussion, providing more intuitions and understanding the significance of notions is necessary. * Therefore, analogous to the analysis of Loopix’s privacy goals, more current ACNs can be analyzed to understand which parts of the hierarchy they cover. * This can also identify gaps in research; privacy goals for which ACNs are currently missing. * Further, a survey of goals in greater depth would be useful to identify the most important notions in the hierarchy and to provide intuitions and thus ease deciding on the correct notions for practitioners. * Beyond that, an investigation of the applicability of our notions and hierarchy to other areas, like. * e.g. anonymous payment channels, would be interesting.

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.