Fundamental Limits of Invisible Flow Fingerprinting

Ramin Soltani∗, Dennis Goeckel∗, Don Towsley†, and Amir Houmansadr†
∗Electrical and Computer Engineering Department, University of Massachusetts, Amherst, {soltani, goeckel}@ecs.umass.edu
†College of Information and Computer Sciences, University of Massachusetts, Amherst, {towsley, amir}@cs.umass.edu

Abstract

Network flow fingerprinting can be used to de-anonymize communications on anonymity systems such as Tor by linking the ingress and egress segments of anonymized connections. Assume Alice and Bob have access to the input and the output links of an anonymous network, respectively, and they wish to collaboratively reveal the connections between the input and the output links without being detected by Willie, who protects the network. Alice generates a codebook of fingerprints, where each fingerprint corresponds to a unique sequence of inter-packet delays, and shares it only with Bob. For each input flow, she selects a fingerprint from the codebook and embeds it in the flow, i.e., changes the packet timings of the flow to follow the packet timings suggested by the fingerprint, and Bob extracts the fingerprints from the output flows. We model the network as parallel $M/M/1$ queues where each queue is shared by a flow from Alice to Bob and other flows independent of the flow from Alice to Bob. The timings of the flows are governed by independent Poisson point processes. Assuming all input flows have equal rates and that Bob observes only flows with fingerprints, we first present two scenarios: 1) Alice fingerprints all the flows; 2) Alice fingerprints a subset of the flows, unknown to Willie. Then, we extend the construction and analysis to the case where flow rates are arbitrary as well as the case where not all the flows that Bob observes have a fingerprint. For each scenario, we derive the number of flows that Alice can fingerprint and Bob can trace by fingerprinting.
This work has been supported by the National Science Foundation under grants CNS-1564067 and CNS-1525642. The preliminary version of this work has been presented at the 51st Annual Asilomar Conference on Signals, Systems, and Computers, November 2017 [1]. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Keywords: Network De-anonymization, Flow Fingerprinting, Anonymity Networks, Privacy and Security, Queueing Theory, Timing Channel, Bits Through Queues, Covert Communication, Network Security, Information Theoretic Security, Covert Bits Through Queues.

I. INTRODUCTION

Given the presence of communication systems in daily life and their rapid growth, e.g., cellular networks, internet of things, etc., security and privacy have emerged as a vital area of research and development [2]–[9]. For every communication system, security involves not only allowing authorized users to communicate a message in a way that the message content is protected from unauthorized users, but also preventing access by malicious users. Hence, breaking the anonymity of users in anonymous networks such as Tor, Bitblinder, and Darknet plays a major role in preventing malicious use of technology.

Even if the messages are encrypted, traffic analysis can be used to infer sensitive information from packet characteristics such as timing patterns, sizes, and packet rates. For instance, packet timings can reveal information about passwords sent over SSH channels [10]. Also, traffic analysis can discover stepping stone attacks where malicious users employ compromised computers to relay their traffic [11], [12]. Furthermore, it can be used to find correlations between input and output links of a network to reveal connections between the links [13].
Unlike passive traffic analysis, which involves only recording traffic characteristics such as packet timings, active traffic analysis involves both recording and modifying traffic characteristics to embed information in them. For instance, in flow watermarking [14]–[16], watermarks are embedded into flows by changing their packet timings according to a unique secret pattern. Therefore, each flow contains one bit of information indicating whether it contains the watermark. However, in flow fingerprinting, the embedded patterns are used to communicate information such as the identity of the party that performed fingerprinting [17], the location of the flow in the network where it was fingerprinted [1], and the time when the fingerprint was embedded. Thus, a fingerprint necessarily conveys more than one bit of information.

Active traffic analysis has emerged as a vibrant area of research recently. In [18], the authors propose detecting stepping stones using flow watermarking. Peng et al. [19] show that this method is detectable and propose attacks on it. Wang et al. [20] show that the anonymity of VoIP calls made over an anonymity network can be broken using watermarking methods. Kiyavash et al. [21] propose a multi-flow attack on interval-based watermarking methods, which delay packets of specific intervals based on the value of the watermarks. Houmansadr et al. propose RAINBOW watermarking [14] and SWIRL [15], which is a scalable traffic analysis method resilient against aggregated-flows attacks. They also study the capacity of flow watermarking [22] and propose a flow fingerprinting scheme allowing fingerprinting of millions of flows by perturbing the packet timings of relatively short lengths of flows [23]. Rezaei et al. [24] introduce an active fingerprinting method called TagIt that works by slightly delaying packets into secret time intervals.
In [25], [26], the authors consider watermarking and analyze invisibility and error probability of watermarking schemes in practice.

Previous active traffic analysis methods do not offer theoretical guarantees on the trade-off between performance (number of the flows) and invisibility, i.e., altering the packet timings so that the outcome is statistically indistinguishable from intact packet timings. When the traffic analyzer is the warden of the network who protects the links from being traced by anonymous users (e.g., for de-anonymization), invisibility of traffic analysis is important since attackers (anonymous users) can evade analysis if they are aware of the fingerprinting process. Even when the traffic analyzer is not the network warden, the invisibility of the traffic analysis is crucial in order to hide from the network warden. In this paper, we consider invisible fingerprinting to trace the input and output links of a network in the presence of a network warden.

Consider an anonymous network where connections between input and output links are unknown. We model the network as $M$ parallel work-conserving queues with Poisson arrivals and exponential service times ($M/M/1$ queues) and First In First Out (FIFO) discipline. Queues are independent and each queue is shared by a flow from the input of the network to the output of the network and other flows independent of the flow from Alice to Bob (see Fig. 1a). Alice has access to the input flows and she can buffer and release packets when she desires. On the other side of the network, Bob has access to the output flows so he can read the packet timings of the flows. Alice and Bob wish to perform fingerprinting to infer the connections between input links and output links, without being detected by Willie, whose goal is to discover flow fingerprints.
We consider the following problem: in a time interval of length $T$, can Alice and Bob perform fingerprinting to link input and output flows of the network without being detected by Willie, and if yes, how can they do so and what is the maximum number $m$ of flows that they can link reliably? For the case where the packet timings of each flow are an independent instantiation of a Poisson process, we present the construction and analysis, and calculate the asymptotic expression for $m$ as a function of $T$. We first assume flow packet rates are equal and that Bob observes only flows with fingerprints and consider two main scenarios: 1) Alice fingerprints all flows she observes; 2) Alice fingerprints a subset of the flows, and the subset is unknown to Willie. Then, we present the extensions to arbitrary flow rates as well as the case where Bob observes a set of flows in which not all flows are fingerprinted.

Fig. 1: (a) Setting 1: The network is modeled as independent parallel $M/M/1$ queues where each queue is shared between a flow from Alice to Bob (main flow) and other interfering flows that are independent of the main flow. (b) Setting 2: The network is modeled as independent parallel $M/M/1$ queues with single input/output where each queue conveys a flow from Alice to Bob. Alice may fingerprint the flows, and Bob receives the fingerprinted flows after they pass through the network, which adds timing noise to the fingerprints. Willie, who is the warden of the network, protects the links from being traced; he wishes to determine whether Alice has fingerprinted flows.

The contributions of this work relative to the conference version in [1] are:
• For the case where Alice fingerprints all flows, we present more details of the analysis for both the reliability and the number of possible fingerprinted flows.
• For the case where Alice fingerprints a subset of the flows, in [1, Theorem 2], we presented a scenario where Alice fingerprints each flow independently with probability $q$.
Here, we present a slightly different variation of this scheme where, instead of a probabilistic selection of flows for fingerprinting, Alice fingerprints a subset of the flows which is known to both Alice and Bob (see Theorem 2). Furthermore, the results of [1, Theorem 2] were applicable only for specific values of $q$ and the total number of flows that yield a close-to-maximal number of traceable flows. Here, we present the results for arbitrary $q$ and number of flows (see Theorem 4.3).
• The extension to the case of arbitrary flow rates was discussed in [1, Section V.B] briefly. Here, we present the full construction and analysis (see Theorems 3.1 and 3.2).
• We analyze the case where Bob observes a set of flows in which some of them are not fingerprinted. We present a construction where Bob uses a detector to determine if a flow is fingerprinted (see Theorems 4.1 and 4.2).
• We present simulation results for Willie's probability of error, the probability that Alice runs out of packets, Bob's probability of error, and robustness of our scheme against changes in processing time of queues.

The remainder of the paper is organized as follows. We present the system model, definitions, and invisibility and reliability metrics employed in this paper in Section II. Then, in Sections III and IV, we present constructions and analyses for the two main fingerprinting scenarios. In Section V, we present the extensions of the main scenarios to arbitrary flow rates, and in Section VI, we present the extensions of the main scenarios to the case where Bob observes flows with and without fingerprints. Section VIII discusses the results, and Section IX discusses future work. We conclude in Section X.

II. SYSTEM MODEL, DEFINITIONS, AND METRICS

A. System Model

We consider a set of $M$ flows between $M$ pairs of input and output links. We assume the links are known but not the pairings.
Also present are two parties, Alice and Bob, whose goal is to identify some or all of the pairings by fingerprinting, without a third party, Willie, detecting this identification. Moreover, Alice and Bob wish to do so within the time interval $[0, T]$. Alice, Bob, and Willie know that all packet timings are governed by Poisson processes and they know the rate of each flow that they observe.

Alice has access to a subset of the input links where each link conveys a packet flow $f^{(A)}_i \in \mathcal{F}_A = \{f^{(A)}_1, f^{(A)}_2, \ldots, f^{(A)}_M\}$. She is allowed to buffer packets and release them from her buffer but no other operations (e.g., inserting packets, changing packet ordering).

Willie is located between Alice and the network, and he watchfully observes all of the input links accessed by Alice ($\mathcal{F}_A$) to detect whether or not Alice is fingerprinting flows (see Fig. 1). Willie is able to verify the sources and the order of the packets. Therefore, if Alice inserts a packet of her own or re-orders the packets on any of the links to transmit information to Bob, Willie will detect her immediately.

Bob observes a subset of the output links where each link conveys a packet flow $f^{(B)}_j \in \mathcal{F}_B = \{f^{(B)}_1, f^{(B)}_2, \ldots, f^{(B)}_{M_b}\}$. He is only allowed to observe the arrival time of each of the packets in each flow. Bob and Willie cannot manipulate the flows (e.g., change packet timings, remove packets, insert packets, change packet ordering).

Prior to fingerprinting, Alice generates a codebook of fingerprints and shares it with Bob. The codebook is secret, and thus Willie does not have access to it. On the other side of the network, Bob uses the codebook to extract the fingerprints and identify the flows. Each fingerprint (codeword) of the codebook corresponds to a sequence of inter-packet delays, which plays the role of a unique flow identifier.
Alice embeds a unique fingerprint in each flow, i.e., she buffers packets of each flow and releases them according to timings associated with a fingerprint. We denote by $\mathcal{F}_f \subset \mathcal{F}_A$ the set of flows with fingerprints. In general, not every fingerprinted flow is observed by Bob. However, since our goal is to calculate the maximum number of flows that can be traced by Alice and Bob, we assume Bob observes all fingerprinted flows, i.e., $\mathcal{F}_f \subset \mathcal{F}_B$.

As Willie is only able to read the channel, he cannot change packet timings; however, packet timings change after they pass through the network. Nevertheless, we present a construction where Bob can successfully identify the flows.

We model the network as $M$ parallel First In First Out (FIFO) queues with exponential service times ($M/M/1$ queues). We consider two settings for the network:
1) Setting 1: each $M/M/1$ queue is shared by the flow Alice and Bob are monitoring, which we refer to as the "main flow", and other flows independent of the main flow, which we refer to as "interfering flows" (see Fig. 1a).
2) Setting 2: each $M/M/1$ queue conveys just the flow Alice and Bob are monitoring (see Fig. 1b).

Denote by $q_i$ the $i$th queue, and by $\mu_i$, $\lambda_i$, and $\lambda'_i$ the service rate, the input rate, and the sum of the rates of the interfering flows at $q_i$, respectively. We term $\mu'_i = \mu_i - \lambda'_i$ the effective service rate [27] of $q_i$ and we assume Alice knows the effective service time of all queues $q_1, \ldots, q_M$. The queues are stable, i.e., $\lambda_i + \lambda'_i < \mu_i$.

First, we consider Setting 1 (shown in Fig. 1a). Assuming the flow rates of the flows observed by Alice and Bob are the same ($\lambda_i = \lambda$) and that Bob observes only the set of fingerprinted flows ($\mathcal{F}_B = \mathcal{F}_f$), we present two scenarios:
• Scenario 1 (analyzed in Section III): Alice fingerprints all flows to which she has access ($\mathcal{F}_f = \mathcal{F}_A$).
• Scenario 2 (analyzed in Section IV): Alice fingerprints a subset of the flows to which she has access ($\mathcal{F}_f \subset \mathcal{F}_A$).

Then, considering the same setting for the network (Setting 1 shown in Fig. 1a), we present Scenarios 3 and 4, which are extensions of Scenarios 1 and 2, respectively, to the case that flow rates are arbitrary. Scenarios 3 and 4 are analyzed in Sections V-A and V-B, respectively.

Next, we consider Setting 2 (shown in Fig. 1b) and present Scenarios 5 and 6, which are extensions of Scenarios 1 and 2, respectively, to the case that Bob observes fingerprinted flows as well as other flows that are not fingerprinted ($\mathcal{F}_f \subset \mathcal{F}_B$). If Bob observes a flow $f^{(B)}_i$ that is not fingerprinted, the flow can be either coming from Alice ($f^{(B)}_i \in \mathcal{F}_A$) or other inputs of the network ($f^{(B)}_i \notin \mathcal{F}_A$). Scenarios 5 and 6 are analyzed in Sections VI-A and VI-B, respectively.

We show that in each scenario Alice can fingerprint the flows in a way that is invisible to Willie but distinguishable by Bob. In addition, we determine the number of flows that Alice and Bob can invisibly and reliably trace by fingerprinting. Next, we present definitions and describe invisibility and reliability metrics.

B. Definitions

Willie uses hypothesis testing to detect whether Alice is fingerprinting:
• $H_0$: Alice is not fingerprinting.
• $H_1$: Alice is fingerprinting.

Denote $P_{FA}$ as the false alarm probability of rejecting $H_0$ when Alice is not fingerprinting (type I error), and $P_{MD}$ as the missed detection probability of rejecting $H_1$ when Alice is fingerprinting (type II error). To give more power to Willie, we assume he knows the probability that Alice is fingerprinting, $P(H_1) = 1 - P(H_0)$. Similar to the definition of covertness [28]–[32], we define invisibility [1]:

Definition 1. (Invisibility) Alice's fingerprinting is invisible (covert) if and only if she can lower bound Willie's probability of error, $P_e^{(w)} = \frac{P_{FA} + P_{MD}}{2}$, by $\frac{1}{2} - \epsilon$ for any $\epsilon > 0$, as $T \to \infty$.
We term $\epsilon$ the invisibility parameter.

Definition 2. (Reliability) Alice's fingerprinting is reliable if and only if for any $\zeta > 0$ and any flow, the probability of the failure event satisfies $P_f \le \zeta$ as $T \to \infty$. We term $\zeta$ the reliability parameter.

For a flow with a fingerprint, the failure event occurs when one of the following events occurs:
• Alice cannot successfully fingerprint the flow since she does not have a packet available to release when she needs one. We denote by $P_{f_1}$ the probability of this event.
• Bob cannot extract the fingerprint successfully. We denote by $P_{f_2}$ the probability of this event.

For a flow without a fingerprint, the failure event occurs when Bob detects a fingerprint. We denote by $P_{f_3}$ the probability of this event. Note that both $P_{f_3}$ and $P_{FA}$ refer to the (erroneous) detection of fingerprints when flows are not fingerprinted; however, the former refers to detection by Bob for each flow, and the latter refers to detection by Willie after observing all the flows.

Definition 3. (Lambert-W function) The Lambert-W function is the inverse function of $f(W) = W e^{W}$.

We present results under the assumption that $P(H_0) = P(H_1) = 1/2$. We show in Appendix A that this results in invisibility for the general case where $P(H_0) \ne P(H_1)$. In this paper, we use standard Big-O, Little-o, Big-Omega, Little-omega, and Big-Theta notations [33, Ch. 3].

III. SCENARIO 1: ALL FLOWS ARE FINGERPRINTED, SETTING 1

Consider Scenario 1: Alice fingerprints all flows she observes ($\mathcal{F}_f = \mathcal{F}_A$), and Bob observes only the fingerprinted flows ($\mathcal{F}_B = \mathcal{F}_f$). All flow rates are equal ($\lambda_i = \lambda$). We consider Setting 1 (see Fig. 1a), i.e., $M$ parallel $M/M/1$ queues where each queue is shared by a fingerprinted flow and other interfering flows independent of the fingerprinted flow.
Alice fingerprints the input flows during the time interval $[0, T]$, and Bob extracts the fingerprints from the flows on the output links of the network to infer the connections between input and output flows. Alice buffers packets and releases them according to a fingerprint. She uses a secret codebook where each codeword (fingerprint) is a unique flow identifier consisting of a sequence of inter-packet delays.

Because the timings of packets that Alice receives as well as the codewords are random, Alice will face a causality problem: the need to send a packet before she receives it. We give an example of when Alice cannot successfully fingerprint a flow in Fig. 2. Consider a flow and assume the inter-arrival times of this flow before Alice makes any changes are $[10\,\mu s, 2\,\mu s, \ldots]$. Also assume Alice selects a fingerprint $C(W) = [5\,\mu s, 3\,\mu s, \ldots]$ from her codebook. Note that the inter-arrival time between the first and second packets of the flow is $10\,\mu s$ but Alice has to alter the packet timings of the flow to achieve an inter-arrival time of $5\,\mu s$ between the first and the second packets. In other words, she has to send the second packet before she receives it.

Fig. 2: An example of when Alice cannot successfully fingerprint a flow: the packet timings of the flow received by Alice and the packet timings suggested by the selected fingerprint are $[10\,\mu s, 2\,\mu s, \ldots]$ and $[5\,\mu s, 3\,\mu s, \ldots]$, respectively. Alice faces a causality problem when she needs to send the second packet since she has to send it before she receives it.

To account for this, prior to fingerprinting, Alice invisibly slows down the flow in order to buffer packets [29, Section IV]. This ensures she will have a packet in her buffer to transmit at the appropriate times and can fingerprint the flow successfully. We calculate the number of flows $m = M$ that Alice and Bob can trace by fingerprinting using this scheme, asymptotically as a function of $T$.

Theorem 1. Consider Setting 1 (see Fig. 1a).
If Alice fingerprints all $M$ input flows ($\mathcal{F}_f = \mathcal{F}_A$) whose rates are equal ($\lambda$) and Bob only observes fingerprinted flows ($\mathcal{F}_B = \mathcal{F}_f$), then Alice and Bob can invisibly and reliably trace $m = M = O(T/\log T)$ flows in a time interval of length $T$.

Proof. Construction: Per above, Alice uses a scheme consisting of two phases of lengths $T_1$ and $T_2$, and employs a codebook of fingerprints to embed in the flows. The codebook construction is similar to the one adopted in [1], [29], [30]. In particular, Alice generates $m$ independent instantiations of a Poisson process with parameter $\lambda T_2$, where $T_2$ is the length of the second phase, as follows. To generate the $l$th codeword ($1 \le l \le m$), first a number $n_l$ is generated according to a Poisson distribution with mean $\lambda T_2$, and then $n_l$ points are distributed randomly and uniformly in a time interval of length $T_2$ [34] (see Fig. 4). Therefore, the codebook contains $m$ fingerprints (codewords) $\{C(W_l)\}_{l=1}^{m}$. Alice selects a fingerprint for each flow and applies the inter-packet delays of the chosen fingerprint to the packets of the flow. The codebook is shared with Bob and is not known to Willie.

Fig. 3: Alice divides the time interval of length $T$ into two phases: a buffering phase of length $T_1$ where packets of each flow are slowed down, and a fingerprinting phase of length $T_2 = T - T_1$ where Alice fingerprints the flows.

Alice divides the time interval of length $T$ into two phases (see Fig. 3):
• Phase 1 (buffering phase) of length $T_1$: Alice slows each flow from rate $\lambda$ to rate $\lambda - \Delta$ to buffer packets, i.e., if she receives a packet at time $\tau$, she transmits it at time $\tau \frac{\lambda}{\lambda - \Delta}$. This allows her to build up a backlog of packets in her buffer which ensures that she will be able to fingerprint each flow during the next phase successfully.
• Phase 2 (fingerprinting phase) of length $T_2 = T - T_1$: for each flow, she selects a fingerprint from her codebook and then alters the packet timings of the flow according to the selected fingerprint.

The lengths of the two phases are
$$T_1 = \frac{T m \alpha}{1 + m\alpha}, \tag{1}$$
$$T_2 = T - T_1 = \frac{T}{1 + m\alpha}, \tag{2}$$
where $\alpha$ is a constant defined later, and $m$ is the number of flows to be fingerprinted.

Analysis: (Invisibility) Similar to the analysis of covertness in [29, Theorem 2], we can show that Alice's fingerprinting is invisible. Consider the first phase. We can show that for all $\epsilon \in (0, \frac{1}{2})$, Alice can slow down the flows from rate $\lambda$ to rate $\lambda - \epsilon\sqrt{2\lambda/(mT_1)}$, and achieve (see the proof in Appendix B)
$$P_e^{(w)} > \frac{1}{2} - \epsilon, \tag{3}$$
where $P_e^{(w)}$ is Willie's error probability. Thus, her buffering is invisible. In the second phase, the packet timings for each flow are an instantiation of a Poisson process with rate $\lambda$ and hence the traffic pattern is indistinguishable from the pattern that Willie expects to observe. Hence, the scheme is invisible.

Fig. 4: Codebook generation: Alice generates a codebook whose codewords (fingerprints) specify the sequence of inter-packet delays to be embedded in the flows. Each codeword is an instantiation of a Poisson process of rate $\lambda_{\min} = \min(\lambda_1, \ldots, \lambda_m)$ in a time interval of length $T_2$. For each codeword, first a random variable $N$ is generated according to the Poisson distribution with parameter $\lambda T_2$. Then $N$ points are placed uniformly and randomly in the time interval of length $T_2$. The codebook is shared with Bob, but it is unknown to Willie.

(Reliability) Now, we show that Alice's fingerprinting satisfies all of the conditions in Definition 2, and thus is reliable. Note that all flows have fingerprints. By the union bound:
$$P_f \le P_{f_1} + P_{f_2}. \tag{4}$$
Thus, to show the fingerprinting is reliable, it suffices to show that $P_{f_1} + P_{f_2} \le \zeta$ for all $\zeta > 0$.
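The construction above (codebook generation, the two-phase split of (1) and (2), and the Phase 1 slowdown) can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the function names `poisson_sample`, `make_codebook`, `phase_lengths`, `slow_down`, and `is_causal` are hypothetical, the constant `alpha` is passed in rather than derived, and the causality check simply restates the problem illustrated in Fig. 2.

```python
import random

def poisson_sample(mean, rng):
    """Draw N ~ Poisson(mean) by counting unit-rate exponential gaps in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t <= mean:
        n += 1
        t += rng.expovariate(1.0)
    return n

def make_codebook(m, lam, T2, rng):
    """Return m fingerprints; each is a sorted list of release times in [0, T2].

    Follows the recipe in the text: draw n_l ~ Poisson(lam * T2), then place
    n_l points uniformly at random in an interval of length T2.
    """
    book = []
    for _ in range(m):
        n = poisson_sample(lam * T2, rng)
        book.append(sorted(rng.uniform(0.0, T2) for _ in range(n)))
    return book

def phase_lengths(T, m, alpha):
    """Phase lengths from Eqs. (1) and (2); alpha is treated as a given constant."""
    T1 = T * m * alpha / (1 + m * alpha)
    return T1, T - T1

def slow_down(arrivals, lam, delta):
    """Phase 1: a packet received at time tau is released at tau * lam / (lam - delta)."""
    return [tau * lam / (lam - delta) for tau in arrivals]

def is_causal(arrivals, releases):
    """True if every scheduled release happens no earlier than the matching arrival.

    Assumption of this sketch: packets are released in arrival order, and a
    schedule that needs more packets than have arrived counts as a failure.
    """
    if len(releases) > len(arrivals):
        return False
    return all(r >= a for a, r in zip(sorted(arrivals), releases))
```

For the example of Fig. 2, arrival times `[0, 10, 12]` and a schedule `[0, 5, 8]` give `is_causal(...) == False`, which is the situation the buffering phase is meant to make unlikely.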
First, we show that $P_{f_2} \to 0$ as $T \to \infty$ for each flow, i.e., Bob can successfully extract a fingerprint from each flow. Recall that Alice fingerprints all $m$ flows that she observes and Bob observes only the flows fingerprinted by Alice ($\mathcal{F}_A = \mathcal{F}_f = \mathcal{F}_B$). Therefore, $m = |\mathcal{F}_A| = |\mathcal{F}_f| = |\mathcal{F}_B|$, where $|\cdot|$ denotes the cardinality of a set. Without loss of generality, we assume that flow $f^{(B)}_i$ passes through the $i$th queue ($q_i$). Denote by $C_i$ the capacity of $q_i$ for the transmission of information via packet timings. Recall that $q_i$ is an $M/M/1$ queue with multiple inputs and outputs and that Alice establishes a timing channel on each input flow to send a fingerprint to Bob. Therefore, we use the bound on the capacity of the timing channel for a shared $M/M/1$ queue [27, Proposition 1]:
$$C_i \ge \lambda \log\left(\frac{\mu_i - \lambda'_i}{\lambda}\right), \tag{5}$$
where $\lambda'_i$ is the sum of rates of the interfering flows passing through $q_i$, and $\mu_i$ is the service rate of $q_i$. Note that (5) implies that although $q_i$ changes the packet timings of the flow and thus the embedded fingerprint, Bob is able to successfully decode at least $C_i$ nats per second from the packet timings of the flow and thus extract Alice's fingerprint. From [34, Definition 1], the rate of the codebook is $\frac{\log m}{T_2}$, and by [34, Definition 2], (5) implies that all transmission rates smaller than $\lambda \log\left((\mu_i - \lambda'_i)/\lambda\right)$ result in a decoding error probability that tends to zero as $T_2 \to \infty$. Therefore, we require
$$\frac{\log m}{T_2} < \lambda \log\left(\frac{\mu_i - \lambda'_i}{\lambda}\right) \tag{6}$$
for Bob to successfully extract the fingerprint from $f^{(B)}_i$. Note that (6) must hold for all $1 \le i \le m$. Hence, as long as
$$\frac{\log m}{T_2} < C, \tag{7}$$
where
$$C = \lambda \log\left(\frac{\min_i \{\mu_i - \lambda'_i\}}{\lambda}\right), \tag{8}$$
for each flow $P_{f_2} \to 0$ as $T_2 \to \infty$. Note that (2) implies that $T_1, T_2 \to \infty$ as $T \to \infty$. Therefore,
$$P_{f_2} \to 0 \text{ as } T \to \infty. \tag{9}$$

Next, we show that $P_{f_1} \le \zeta$, i.e., Alice can successfully fingerprint the flows. Recall that Alice accounts for the causality problem by buffering packets before she starts fingerprinting. Since in the first phase Alice slows down the packet rate from rate $\lambda$ to rate $\lambda - \epsilon\sqrt{2\lambda/(mT_1)}$, on average she can buffer $\epsilon\sqrt{2\lambda T_1/m}$ packets. Consequently, we can apply the weak law of large numbers (WLLN) to show that the probability that Alice buffers more than $\epsilon\sqrt{\lambda T_1/m}$ packets tends to one as $T$ tends to infinity.

Now, we have to answer this question: given that Alice has $\epsilon\sqrt{\lambda T_1/m}$ packets in her buffer, what is the probability that Alice cannot successfully fingerprint $f^{(A)}_i$? Because Alice receives and transmits packets on each flow according to two independent Poisson processes of rate $\lambda$, and the Poisson process is memoryless, we model the process as a symmetric random walk on a 1-D grid to answer this question [29]. The location of the walker corresponds to the number of packets in Alice's buffer. The walker goes from location $z$ to $z + 1$ when Alice receives a packet, and goes from location $z$ to $z - 1$ when Alice transmits a packet. Denote by $P_{k,t}$ the probability of the event that the walker starting from the location $z = k$ reaches the point $z = 0$ at least once during the time $[0, t]$. Then [29, Eq. (27)]:
$$\lim_{t \to \infty} P_{k,t} \le 1 - \lim_{t \to \infty} \mathrm{erf}\left(\frac{k}{\sqrt{8\lambda t}}\right). \tag{10}$$
Since Alice fingerprints the flows in the second phase, $t = T_2$. Recall that the probability that Alice buffers more than $\epsilon\sqrt{\lambda T_1/m}$ packets tends to one as $T \to \infty$. Therefore, we let $k = \epsilon\sqrt{\lambda T_1/m}$. By (10), the probability that Alice runs out of packets for flow $f^{(A)}_i$ satisfies:
$$\lim_{T \to \infty} P_{f_1} \le 1 - \lim_{T \to \infty} \mathrm{erf}\left(\frac{\epsilon}{2}\sqrt{\frac{T_1}{2 m T_2}}\right) = 1 - \mathrm{erf}\left(\epsilon\sqrt{\frac{\alpha}{8}}\right), \tag{11}$$
where the equality holds since $T_1/T_2 = m\alpha$, following from (1) and (2). Note that (11) is independent of $i$ (the index of the flow), and holds for all flows $f^{(A)}_i$, $1 \le i \le m$. Let
$$\alpha = \frac{8}{\epsilon^2}\left(\mathrm{erf}^{-1}(1 - \zeta)\right)^2. \tag{12}$$
By (12), (11) yields
$$P_{f_1} \le \zeta \text{ as } T \to \infty. \tag{13}$$
Consequently, by (4), (9), and (13), $P_f \le \zeta$ for all $\zeta > 0$ when $T \to \infty$, and thus Alice and Bob's fingerprinting is reliable.

(Number of flows) By (7) and (2), we require
$$\frac{\log m}{T_2} = \frac{(1 + m\alpha)\log m}{T} < C \tag{14}$$
as $T_2 \to \infty$ ($T \to \infty$). In Appendix C we show that we can achieve (14) as long as
$$m = \frac{1}{2}\min\left\{\alpha^{-1}\left(\frac{TC}{W(TC)} - 1\right), \frac{TC}{W(TC)}\right\}, \tag{15}$$
where $W(\cdot)$ is the Lambert-W function. Since for $T > e$, $W(T) \le \ln(T)$, Alice and Bob can invisibly and reliably pair the end points of every flow, and thus break the anonymity of a network (Setting 1 shown in Fig. 1a) with $m = O(T/\log T)$ flows. ∎

IV. SCENARIO 2: ALICE FINGERPRINTS A SUBSET OF THE FLOWS, SETTING 1

In Scenario 1, Willie is certain that if $H_1$ is true, i.e., Alice fingerprints, then all flows are slowed down in the first phase. In Scenario 2, we add uncertainty to Willie's knowledge under $H_1$: Alice fingerprints a subset $\mathcal{F}_f$ of the flows, and $\mathcal{F}_f$ is unknown to Willie. Therefore, Willie has to investigate a large set of flows to detect if some are slowed down in the first phase as required for fingerprinting. We show that Willie's uncertainty allows Alice to fingerprint more flows without being visible.

Alice fingerprints a subset of the flows she observes ($\mathcal{F}_f \subset \mathcal{F}_A$). For each flow, she selects a unique fingerprint from her codebook and alters the timings of that flow according to it. Similar to Scenario 1, Alice has $T$ units of time which she divides into two phases: a buffering phase of length $T_1$, which ensures Alice can successfully fingerprint, and a fingerprinting phase of length $T_2 = T - T_1$. Bob, who has access to the fingerprint codebook and observes the set of fingerprinted flows ($\mathcal{F}_B = \mathcal{F}_f$), extracts the fingerprints from the flows. The fingerprint codebook is secret and Willie does not have access to it.
The network is modeled by $M$ parallel $M/M/1$ queues with each queue shared by a flow from Alice to Bob (main flow) as well as other interfering flows independent of the main flow (Setting 1 shown in Fig. 1a). We calculate the number of flows ($m$) that Alice can fingerprint using this scheme, asymptotically as a function of $T$.

Theorem 2. Consider Setting 1 (see Fig. 1a). In a set $\mathcal{F}_A$ containing $M$ flows with equal rates ($\lambda$), if Bob observes only the fingerprinted flows ($\mathcal{F}_B = \mathcal{F}_f$), Alice and Bob can invisibly and reliably trace $m$ flows in a time interval of length $T$, where
$$m = \begin{cases} M, & M = O(1) \\ o\left(\min\{\sqrt{M}, e^{TC_1}\}\right), & M = \omega(1) \text{ and } M = O(e^{2TC}) \\ \Theta\left(e^{TC_2}\right), & M = \omega(e^{2TC}) \end{cases} \tag{16}$$
$C$ is given in (8), and $C_1, C_2 \in (0, C)$ are arbitrary constants.

A more accurate characterization of $m$ with respect to $M$ is presented in (26) in the proof below.

Proof. Construction: The construction is similar to that of Scenario 1 except that Alice fingerprints a subset of the flows that she observes. Recall that all flows observed by Bob are also observed by Alice ($\mathcal{F}_B \subset \mathcal{F}_A$). Alice knows which set of her flows will be observed by Bob, and chooses them for fingerprinting ($\mathcal{F}_f = \mathcal{F}_B$). Note that Willie does not know which subset of $\mathcal{F}_A$ is $\mathcal{F}_B$. Alice generates a codebook of $m$ fingerprints (similar to Scenario 1) and shares it with Bob prior to fingerprinting, where $m$ is given in (16). Recall that we calculate the maximum number of flows that Alice and Bob can trace; therefore, we only consider the case $|\mathcal{F}_B| = m$, which can be extended to $|\mathcal{F}_B| \le m$ trivially. Alice's scheme consists of two phases, a buffering phase of length $T_1$ and a fingerprinting phase of length $T_2 = T - T_1$, where
$$T_1 = \frac{T\alpha'}{\ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right) + \alpha'}, \tag{17}$$
$$T_2 = T - T_1 = \frac{T}{1 + \alpha' / \ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)}, \tag{18}$$
$$\alpha' = \frac{\alpha}{2}. \tag{19}$$
Recall that $\alpha$ and $C$ are given in (12) and (8), respectively, and $\epsilon$ is the invisibility parameter.
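To get a feel for how (17)–(19) split the interval, the following sketch evaluates the phase lengths numerically. It is a hedged illustration only: the name `scenario2_phases` is hypothetical, the argument of the logarithm is read here as $1 + \epsilon^2 M/(2m^2)$, and the inverse error function is obtained from the standard normal quantile via the identity $\mathrm{erf}^{-1}(y) = \Phi^{-1}((1+y)/2)/\sqrt{2}$ rather than from anything in the paper.

```python
import math
from statistics import NormalDist

def erfinv(y):
    """Inverse error function via the standard normal quantile (valid for -1 < y < 1)."""
    return NormalDist().inv_cdf((1.0 + y) / 2.0) / math.sqrt(2.0)

def scenario2_phases(T, M, m, eps, zeta):
    """Return (alpha, alpha_prime, T1, T2) for Scenario 2.

    Assumption of this sketch: the log term is ln(1 + eps^2 * M / (2 m^2)).
    """
    alpha = (8.0 / eps**2) * erfinv(1.0 - zeta) ** 2      # Eq. (12)
    alpha_p = alpha / 2.0                                 # Eq. (19)
    log_term = math.log1p(eps**2 * M / (2.0 * m**2))
    T1 = T * alpha_p / (log_term + alpha_p)               # Eq. (17)
    T2 = T - T1                                           # Eq. (18)
    return alpha, alpha_p, T1, T2

if __name__ == "__main__":
    alpha, alpha_p, T1, T2 = scenario2_phases(T=1e6, M=10_000, m=5, eps=0.1, zeta=0.05)
    print(f"alpha={alpha:.1f}, T1={T1:.1f}, T2={T2:.1f}")
```

Under this reading, a small invisibility parameter makes $\alpha'$ large, so most of the interval is spent buffering; increasing $M$ relative to $m^2$ enlarges the logarithm and shifts time toward the fingerprinting phase.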
Alice generates fingerprints for her codebook analogous to Scenario 1. The number of fingerprints in her codebook is $m$.

Analysis: (Invisibility) For each phase, we show that all operations Alice performs on the flows are invisible. Consider the first phase $[0, T_1]$, where Alice slows down each flow from rate $\lambda$ to rate $\lambda - \Delta$ with
\[ \Delta = \sqrt{\frac{\lambda}{T_1}\ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)}. \tag{20} \]
From Willie's perspective, the number of packets in time $[0, T_1]$ is a sufficient statistic to detect Alice [29]. If Alice does not fingerprint ($H_0$), then the joint probability mass function (pmf) of Willie's observations is $P_0 = \prod_{i=1}^{M} P_\lambda(n_i)$, where $P_\lambda(n)$ is the pmf of a Poisson random variable with mean $\lambda$. Note that Willie knows that $m$ out of the $M$ flows observed by Alice are selected to be fingerprinted, but he does not know which set is selected. Therefore, from Willie's point of view, if Alice chooses to fingerprint flows ($H_1$), then each flow will contain a fingerprint with probability
\[ p = \frac{m}{M}. \tag{21} \]
Thus, the joint pmf of Willie's observations when Alice fingerprints ($H_1$) is
\[ P_1 = \prod_{i=1}^{M} \left( p\,P_{\lambda - \Delta}(n_i) + (1 - p)\,P_\lambda(n_i) \right), \]
where $\Delta$ is the change in flow rate. Note that the change of rate differs from the one in Scenario 1. Suppose that Willie applies an optimal hypothesis test to minimize his probability of error $P_e^{(w)}$. Then, we can obtain a lower bound on his probability of error [31, Eq. 1]:
\[ P_e^{(w)} \ge \frac{1}{2} - \sqrt{\frac{1}{8} D(P_1 \| P_0)}, \tag{22} \]
where $D(P_1 \| P_0)$ is the Kullback-Leibler divergence (relative entropy) between $P_1$ and $P_0$. Alice's scheme is invisible as long as she can make Willie's detector operate as close as desired to the detector that disregards Willie's observations and results in $P_e^{(w)} = 1/2$ (see Definition 1). In Appendix D, we show that for $\epsilon > 0$,
\[ \sqrt{\frac{1}{8} D(P_1 \| P_0)} \le \epsilon. \tag{23} \]
Thus, (22) yields $P_e^{(w)} \ge \frac{1}{2} - \epsilon$ as $T \to \infty$, and thus Alice's buffering is invisible.
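The bound in (22) and (23) can be checked numerically for small examples. The sketch below is an illustration under stated assumptions, not the derivation of Appendix D: it takes Willie's per-flow observation to be a Poisson packet count with mean equal to rate times $T_1$, reads the logarithm's argument in (20) as $1 + \epsilon^2 M/(2m^2)$, and truncates the sum over counts at a hypothetical `n_max`. All function names and the numerical values are hypothetical.

```python
import math


def poisson_log_pmf(n, mean):
    """Log of the Poisson pmf, computed in the log domain for stability."""
    return n * math.log(mean) - mean - math.lgamma(n + 1)


def kl_mixture_vs_null(lam, delta, T1, p, n_max):
    """Per-flow KL divergence between the H1 mixture and the H0 Poisson count.

    Assumption: counts over [0, T1] have mean rate * T1.
    """
    mean0 = lam * T1
    mean1 = (lam - delta) * T1
    kl = 0.0
    for n in range(n_max + 1):
        log_p0 = poisson_log_pmf(n, mean0)
        q = p * math.exp(poisson_log_pmf(n, mean1)) + (1.0 - p) * math.exp(log_p0)
        if q > 0.0:
            kl += q * (math.log(q) - log_p0)
    return kl


def willie_error_lower_bound(lam, T1, M, m, eps, n_max=5000):
    """Return (lower bound on Willie's error from (22), total divergence D)."""
    log_term = math.log(1.0 + eps ** 2 * M / (2.0 * m ** 2))
    delta = math.sqrt(lam / T1 * log_term)            # rate reduction, (20)
    # Flows are independent, so the divergence adds over the M flows.
    D = M * kl_mixture_vs_null(lam, delta, T1, m / M, n_max)
    return 0.5 - math.sqrt(D / 8.0), D


# Hypothetical parameters, for illustration only.
bound, D = willie_error_lower_bound(lam=5.0, T1=100.0, M=100, m=5, eps=0.1)
print(f"D(P1||P0) = {D:.5f}, lower bound on Willie's error = {bound:.4f}")
```

For these hypothetical values the computed divergence stays small enough that the right-hand side of (22) remains within $\epsilon$ of $1/2$.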
The second phase is invisible because the fingerprints are samples of Poisson processes with rate $\lambda$. Combined with the invisibility of the first phase, Alice and Bob's scheme is invisible.

(Reliability) The analysis is similar to that of Scenario 1. Since all flows observed by Bob are fingerprinted ($P_{f_3} = 0$), to show that Alice and Bob's scheme is reliable, it suffices to show that for each flow $P_{f_1} + P_{f_2} \le \zeta$ for all $\zeta > 0$. Similar to Scenario 1, in Appendix E we show that
\[ P_{f_1} \le \zeta \quad \text{as } T \to \infty. \tag{24} \]
Now, consider Bob's decoding error for each flow, $P_{f_2}$. By (17) and (18), $T_1, T_2 \to \infty$ as $T \to \infty$. In order for Bob to be able to successfully extract the fingerprint from each flow, we require
\[ \frac{\log m}{T_2} < C \tag{25} \]
as $T_2 \to \infty$ ($T \to \infty$). Substituting $T_2$ from (18) and rearranging yields
\[ m \le \exp\left( \frac{TC}{1 + \alpha' / \ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)} \right). \tag{26} \]
We show in Appendix F that (26) holds asymptotically as $T \to \infty$, given the value of $m$ provided in (16). Consequently,
\[ P_{f_2} \to 0 \quad \text{as } T \to \infty. \tag{27} \]
By (4), (24), and (27), $P_f \to 0$ as $T \to \infty$. Thus, if $M = \omega(1)$, Alice can invisibly and reliably fingerprint $o\left(\min\{\sqrt{M},\ e^{TC}\}\right)$ flows in a time interval of length $T$, and Bob can successfully extract the fingerprints, where $C$ is given in (8); and if $M = O(1)$, Alice can invisibly and reliably fingerprint all $M$ flows in a time interval of length $T$, and Bob can successfully extract the fingerprints. □

In Scenario 2, we assumed that all flows observed by Bob are also observed by Alice and chosen for fingerprinting ($\mathcal{F}_f = \mathcal{F}_B \subset \mathcal{F}_A$). Although this is applicable in many schemes, we present results for the case where this assumption is relaxed in Section VI, i.e., Bob observes flows with and without fingerprints.

V. EXTENSION TO ARBITRARY RATES

In this section, we extend Theorems 1 and 2 to the case that the flow rates are arbitrary.

A.
Scenario 3: All flows are fingerprinted and flow rates are arbitrary, Setting 1

Consider Scenario 3, which is the extension of Scenario 1 to arbitrary rates: Alice fingerprints all of the flows she observes ($\mathcal{F}_f = \mathcal{F}_A$), and Bob observes only the fingerprinted flows ($\mathcal{F}_B = \mathcal{F}_f$). We consider Setting 1 (see Fig. 1a), i.e., $M$ parallel $M/M/1$ queues with multiple inputs and outputs, where each queue is shared between a flow from Alice to Bob (main flow) as well as other interfering flows independent of the main flow. Here the flow rates $\lambda_1, \ldots, \lambda_M$ can be arbitrary, and the main flow passing through the $i$th queue ($q_i$) has rate $\lambda_i$. Alice fingerprints the input flows of the network in the time interval $[0, T]$, and Bob extracts the fingerprints from the flows on the output links of the network to infer the connections between input and output flows.

Similar to Scenario 1, for each flow Alice selects a codeword (fingerprint) from her codebook and embeds it in the flow by changing the packet timings of the flow. She builds her codebook based on the minimum rate of the flows, $\lambda_{\min} = \min(\lambda_1, \ldots, \lambda_M)$, and to embed a fingerprint (of rate $\lambda_{\min}$) in a flow of rate $\lambda_i$, she scales the fingerprint by a factor of $\lambda_{\min}/\lambda_i$ to obtain a modified fingerprint of rate $\lambda_i$, and then embeds it in the flow. In addition, she uses a two-phase (buffering-fingerprinting) scheme similar to those of Scenarios 1 and 2. We calculate the number of flows ($m = M$) that Alice and Bob can trace by fingerprinting using this scheme, asymptotically as a function of $T$.

Theorem 3.1. Consider Setting 1 (see Fig. 1a). If Alice fingerprints all $M$ input flows ($\mathcal{F}_f = \mathcal{F}_A$) whose rates $\lambda_1, \ldots, \lambda_M$ are arbitrary and Bob observes only the set of fingerprinted flows ($\mathcal{F}_B = \mathcal{F}_f$), then Alice and Bob can invisibly and reliably trace $m = M = O(T/\log T)$ flows in a time interval of length $T$.

Proof.
Construction: Per above, Alice employs a two-phase scheme: a buffering phase of length $T_1$ and a fingerprinting phase of length $T_2 = T - T_1$ (see Fig. 3), where $T_1$ and $T_2$ are given in (1) and (2). The codebook construction is similar to Scenario 1, but the rate of the fingerprints (codewords) is $\lambda_{\min} = \min(\lambda_1, \ldots, \lambda_M)$. To embed a fingerprint in a flow of rate $\lambda_i$, Alice selects a fingerprint $(\tau_1, \ldots, \tau_N)$ and scales it by a factor $\lambda_{\min}/\lambda_i$ to generate a modified fingerprint of rate $\lambda_i$, $\left(\frac{\lambda_{\min}\tau_1}{\lambda_i}, \ldots, \frac{\lambda_{\min}\tau_N}{\lambda_i}\right)$. Since fingerprints are instantiations of a Poisson process of parameter $\lambda_{\min}$ (i.e., the inter-arrival times are instantiations of an exponential random variable of mean $1/\lambda_{\min}$), the modified fingerprint is an instantiation of a Poisson process of parameter $\lambda_i$. Next, Alice applies the inter-packet delays given by the modified fingerprint to each flow. Recall that Bob knows the rate of each flow. Upon observing $f_i^{(B)}$, the flow with packet timings $\bar{t}_i = (t_i^{(1)}, t_i^{(2)}, \ldots, t_i^{(N)})$ and rate $\lambda_i$, Bob seeks to answer the following question:

Question 1: Given that Alice used the codebook $\{C(W_l)\}_{l=1}^{m}$ whose fingerprints are of rate $\lambda_{\min}$, what is the index of the fingerprint that was selected by Alice, scaled to rate $\lambda_i$, and transmitted through $q_i$ to produce the output packet timings $\bar{t}_i$?

Analysis: (Invisibility) Similar to Scenario 1, we analyze the invisibility of the first and second phases separately. In the first phase, Alice slows down each flow of rate $\lambda_i$ to rate $\lambda_i - \epsilon\sqrt{2\lambda_i/(mT_1)}$. Using arguments similar to those of Theorem 1, we can show that [29, Theorem 2]
\[ P_e^{(w)} > \frac{1}{2} - \epsilon, \]
where $P_e^{(w)}$ is Willie's error probability. Thus, this phase is invisible to Willie.
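The scaling step in the construction above can be sketched in a few lines. This is a minimal illustration, assuming a fingerprint is stored as a list of inter-packet delays drawn from an exponential distribution of rate $\lambda_{\min}$; the function names `generate_fingerprint` and `scale_fingerprint` and the numerical rates are hypothetical.

```python
import random


def generate_fingerprint(rate_min, n_delays, rng):
    """Draw inter-packet delays of a Poisson process with rate rate_min."""
    return [rng.expovariate(rate_min) for _ in range(n_delays)]


def scale_fingerprint(fingerprint, rate_min, rate_i):
    """Scale every delay by rate_min / rate_i to obtain a fingerprint of rate rate_i."""
    factor = rate_min / rate_i
    return [factor * tau for tau in fingerprint]


# Hypothetical rates: lambda_min = 2 and lambda_i = 5 packets/second.
rng = random.Random(0)
fingerprint = generate_fingerprint(2.0, 50_000, rng)
scaled = scale_fingerprint(fingerprint, 2.0, 5.0)
mean_delay = sum(scaled) / len(scaled)
print(f"mean scaled delay = {mean_delay:.4f} s (target 1/lambda_i = 0.2 s)")
```

Scaling an exponential random variable of mean $1/\lambda_{\min}$ by $\lambda_{\min}/\lambda_i$ gives an exponential random variable of mean $1/\lambda_i$, which is why the scaled delays match the statistics of a flow of rate $\lambda_i$.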
In the second phase, since Alice embeds a modified fingerprint of rate $\lambda_i$ in a flow of rate $\lambda_i$, the traffic pattern remains Poisson with rate $\lambda_i$, indistinguishable from the pattern that Willie expects to observe. Hence, the scheme is invisible.

(Reliability) Similar to the reliability analysis in Scenario 1, we upper bound $P_{f_1} + P_{f_2}$ by $\zeta$, for all $\zeta > 0$. Recall that upon observing $f_i^{(B)}$, Bob seeks the answer to Question 1. Note that the answer to this question is the same as the answer to the following question:

Question 2: Given that Alice used the codebook $\{C'(W_l)\}_{l=1}^{m} = \frac{\lambda_{\min}}{\lambda_i}\{C(W_l)\}_{l=1}^{m}$, what is the index of the fingerprint that was selected by Alice and transmitted through $q_i$ to produce the output packet timings $\bar{t}_i$?

In other words, although Alice generates a codebook whose fingerprints are of rate $\lambda_{\min}$ and then scales each fingerprint to adjust to the rate $\lambda_i$ of the flow, Bob's decoding of each flow is equivalent to the case where Alice uses a codebook whose fingerprints are of rate $\lambda_i$ and she does not scale the fingerprints; the only differences are in the number of fingerprints (codewords) and the time to transmit the fingerprint, as we will explain later. Therefore, from (6), Bob can successfully extract the fingerprint from the flow of rate $\lambda_i$ as long as $T_2$ is large and
\[ \frac{\log m}{T_2^{(i)}} < \lambda_i \log\left(\frac{\mu_i - \lambda'_i}{\lambda_i}\right), \tag{28} \]
where $T_2^{(i)} = T_2\lambda_{\min}/\lambda_i$ is the time of the transmission of the fingerprint embedded in the flow of rate $\lambda_i$. Therefore,
\[ \frac{\log m}{T_2} < \lambda_{\min} \log\left(\frac{\mu_i - \lambda'_i}{\lambda_i}\right). \tag{29} \]
Since the size of the codebook is $m$, fingerprinting the flow $f_i$ corresponds to the transmission of $\log m$ nats of information through the inter-packet delays of the flow $f_i$. Note that scaling a fingerprint of rate $\lambda_{\min}$ to rate $\lambda_i$ results in the transmission of $\log m$ nats of information at a higher rate but in a shorter time.
Since (29) holds for all $1 \le i \le m$, we require
\[ \frac{\log m}{T_2} < C', \tag{30} \]
where
\[ C' = \lambda_{\min} \min_i \left\{ \log\left(\frac{\mu_i - \lambda'_i}{\lambda_i}\right) \right\}, \tag{31} \]
to achieve $P_{f_2} \to 0$ as $T_2 \to \infty$ for each flow. Note that (2) implies that $T_1, T_2 \to \infty$ as $T \to \infty$. Therefore,
\[ P_{f_2} \to 0 \quad \text{as } T \to \infty. \tag{32} \]
Now, consider $P_{f_1}$. In the second phase, on each link Alice receives and transmits the packets according to two independent Poisson processes of equal rate. Thus, we employ a random walk analysis similar to that of Scenario 1 to show that
\[ P_{f_1} \le 1 - \operatorname{erf}\left( \frac{\epsilon}{2}\sqrt{\frac{T_1}{2mT_2}} \right) \le \zeta \quad \text{as } T \to \infty. \tag{33} \]
Consequently, by (4), (32), and (33), $P_f \le \zeta$ for all $\zeta > 0$, and thus Alice and Bob's fingerprinting is reliable.

(Number of flows) The analysis is similar to that of Scenario 1. As $T \to \infty$, we require
\[ \frac{\log m}{T_2} = \frac{(1 + m\alpha)\log m}{T} < C', \tag{34} \]
which we can achieve as long as
\[ m = \frac{1}{2}\min\left\{ \alpha^{-1}\left(\frac{TC'}{W(TC')} - 1\right),\ \frac{TC'}{W(TC')} \right\}. \tag{35} \]
Since $W(T) \le \ln(T)$ for $T > e$, Alice and Bob can invisibly and reliably break the anonymity of a network (Setting 1 shown in Fig. 1a) with $m = O(T/\log T)$ flows. □

B. Scenario 4: Alice fingerprints a subset of the flows, Setting 1

Consider Scenario 4, which is the extension of Scenario 2 to arbitrary rates: Alice fingerprints a subset $\mathcal{F}_f$ of the flows, and $\mathcal{F}_f$ is unknown to Willie. Similar to Scenario 2, since Willie has to investigate a large set of flows to detect whether some are slowed down in the first phase as required for fingerprinting, Alice can make more fingerprinted flows invisible. For each flow in $\mathcal{F}_f$, she selects a unique fingerprint from her codebook and alters the timings of that flow according to the fingerprint. We consider Setting 1 (see Fig. 1a), i.e., $M$ parallel $M/M/1$ queues with multiple inputs and outputs, where each queue is shared between a flow from Alice to Bob (main flow) as well as other interfering flows independent of the main flow. The flow rates $\lambda_1, \ldots, \lambda_M$ can be arbitrary, and the main flow passing through the $i$th queue ($q_i$) has rate $\lambda_i$. Alice fingerprints the input flows of the network in the time interval $[0, T]$, and Bob extracts the fingerprints from the flows on the output links of the network to infer the connections between input and output flows.

For each selected flow, Alice selects a codeword from her codebook and embeds it in the flow by changing its packet timings according to the selected fingerprint. Since flow rates are arbitrary, similar to Scenario 3, she builds her codebook based on the minimum rate of the flows to be fingerprinted and scales each fingerprint based on the rate of the flow to be fingerprinted. Also, she uses a two-phase (buffering-fingerprinting) scheme. We calculate the number of flows ($m$) that Alice fingerprints using this scheme, asymptotically as a function of $T$.

Theorem 3.2. Consider Setting 1 (see Fig. 1a). In a set $\mathcal{F}_A$ containing $M$ flows with rates $\lambda_1, \ldots, \lambda_M$, if Bob observes only the fingerprinted flows ($\mathcal{F}_B = \mathcal{F}_f$), Alice and Bob can invisibly and reliably trace $m$ flows in a time interval of length $T$, where $m$ is given in (16), with $C$ replaced by $C'$, which is given in (31).

Proof. The construction and analysis follow from those of Scenario 2 with modifications due to arbitrary rates. The extension to arbitrary rates follows from that of Scenario 3. □

VI. MIXING FLOWS WITH AND WITHOUT FINGERPRINTS

We have previously assumed that Bob only observes the set of fingerprinted flows, i.e., $\mathcal{F}_B = \mathcal{F}_f$. In practice, however, Bob might observe a set of flows in which some of the flows are not fingerprinted, and therefore he must be able to detect whether a flow contains a fingerprint. In this section, we consider Setting 2 (see Fig. 1b) and present Scenarios 5 and 6, which are extensions of Scenarios 1 and 2, respectively, to the case where Bob observes a set of flows in which some are not fingerprinted.
We present a detector for Bob that is able to detect whether a flow is fingerprinted.

A. Scenario 5: All flows are fingerprinted and Bob observes flows with and without fingerprints, Setting 2

Consider Scenario 5, which is the extension of Scenario 1 to the case where Bob observes flows with and without fingerprints ($\mathcal{F}_f \subset \mathcal{F}_B$): Alice fingerprints all of the flows she observes ($\mathcal{F}_f = \mathcal{F}_A$), flow rates are equal ($\lambda$), and Bob observes flows with and without fingerprints. We consider Setting 2 (see Fig. 1b), i.e., $M$ parallel $M/M/1$ queues with single input and output. Alice fingerprints the input flows of the network in the time interval $[0, T]$, and Bob extracts the fingerprints from the flows on the output links of the network to infer the connections between input and output flows. In contrast to Scenarios 1-4, Bob uses a detector to determine whether a flow is fingerprinted. We calculate the number of flows ($m$) that Alice and Bob can trace by fingerprinting using this scheme, asymptotically as a function of $T$.

Theorem 4.1. Consider Setting 2 (see Fig. 1b). If Alice fingerprints all $M$ input flows ($\mathcal{F}_f = \mathcal{F}_A$) whose rates are equal ($\lambda$) and Bob observes a set of flows with and without fingerprints ($\mathcal{F}_f \subset \mathcal{F}_B$), then Alice and Bob can invisibly and reliably trace $m = M = O(T/\log T)$ flows in a time interval of length $T$.

Proof. Construction: The only difference between the constructions of Scenarios 1 and 5 is that, for Scenario 5, Bob must use a detector which detects whether a flow contains a fingerprint. Here, Bob's decoder is different from the maximum likelihood decoder proposed in [34, p. 9], which for each codeword calculates the service times that yield $\bar{D}_i$, removes the codewords that result in negative values of service times, and finally finds a unique codeword that corresponds to the minimum sum of service times.
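The maximum likelihood decoder of [34] summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: each codeword is taken to be a list of inter-arrival times, the observation is taken to be a list of absolute departure times from a first-in-first-out queue that starts empty, and the function names are hypothetical.

```python
def service_times(inter_arrivals, departures):
    """Service times implied by a candidate codeword and the observed departures.

    Service of a packet starts at the later of its own arrival and the
    previous departure (FIFO queue, assumed empty at time 0).
    """
    times = []
    arrival = 0.0
    prev_departure = 0.0
    for gap, departure in zip(inter_arrivals, departures):
        arrival += gap
        start = max(arrival, prev_departure)
        times.append(departure - start)
        prev_departure = departure
    return times


def min_service_decode(codebook, departures):
    """Return the index of the unique feasible codeword with the smallest
    total service time, or None if there is no such unique codeword."""
    best_index, best_total, tie = None, float("inf"), False
    for index, codeword in enumerate(codebook):
        candidate = service_times(codeword, departures)
        if any(s < 0 for s in candidate):
            continue  # a negative service time is impossible: discard codeword
        total = sum(candidate)
        if total < best_total:
            best_index, best_total, tie = index, total, False
        elif total == best_total:
            tie = True
    return None if tie else best_index


# Hypothetical two-codeword codebook and observation.
codebook = [[1.0, 1.0, 1.0], [0.5, 2.5, 0.5]]
departures = [1.2, 2.3, 3.4]
print(min_service_decode(codebook, departures))
```

In this toy example the second codeword would require a packet to depart before it arrives, so it is discarded and the first codeword is selected.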
Instead, Bob's decoder selects a threshold $\beta = \log(\mu_i/\lambda)$, applies a function to each codeword, and finds a unique codeword that generates an output for the function that is larger than $\beta$. Next, we describe Bob's decoder in detail [35, p. 12]. For $\bar{x} = (x_1, \ldots, x_n) \in \mathbb{R}_+^{n}$ and $\bar{y} = (y_0, y_1, \ldots, y_n) \in \mathbb{R}_+^{n+1}$, if $\bar{x}$ is the sequence of packet timings before the flow passes through $q_i$ (inter-arrival times), then the pdf of the observed packet timings $\bar{y}$ (inter-departure times) is
\[ P(\bar{y} \mid \bar{x}) = e_{\mu - \lambda}(y_0) \prod_{k=1}^{n} e_{\mu}(y_k - w_k), \]
where $e_u(x) = u e^{-ux}$ is the exponential pdf with mean $1/u$, and $w_k = \max\left\{0,\ \sum_{i=1}^{k} x_i - \sum_{i=0}^{k-1} y_i\right\}$ is the $k$th waiting time, the amount of time that the queue waits until it receives the $k$th packet. Since the packet timings of the fingerprinted flow are an instantiation of a Poisson process of rate $\lambda$, the joint pdf of the inter-arrival times is $\prod_{k=1}^{n} e_\lambda(x_k)$. Consequently, the pdf of $\bar{y}$ is
\[ P(\bar{y}) = \int_{\mathbb{R}_+^{n}} P(\bar{y} \mid \bar{x}) \prod_{k=1}^{n} e_\lambda(x_k)\, d\bar{x}. \tag{36} \]
Bob's decoder finds a unique fingerprint (codeword) $W_l$ from $\{W_l\}_{l=1}^{m}$ that satisfies $\frac{P(\bar{y} \mid W_l)}{P(\bar{y})} > \beta$; if such a unique codeword does not exist, it outputs "flow not fingerprinted".

Analysis: The analysis follows from that of Scenario 1. The only differences appear in the analysis of Bob's decoding error probability. The auxiliary threshold decoder used in the analysis of the mismatched decoder in [35, p. 413-417] provides what we need for our application. If Bob uses this detector, the decoding error probability of a fingerprinted flow satisfies
\[ P_{f_2} \to 0 \quad \text{as } T \to \infty, \tag{37} \]
which implies that if we generate $m$ independent instantiations $W_1, \ldots, W_m$ of a Poisson process of rate $\lambda$ on a time interval of length $T_2$, select one of them, $W_l$, and send a packet stream whose packet timings follow $W_l$ over the network, then the probability that at least one $W_k \ne W_l$ satisfies $\frac{P(\bar{y} \mid W_k)}{P(\bar{y})} > \beta$ tends to zero, i.e.,
\[ P\left( \exists\, W_{k \ne l} : \frac{P(\bar{y} \mid W_k)}{P(\bar{y})} > \beta \,\middle|\, W_l \text{ sent} \right) \to 0 \quad \text{as } T \to \infty. \tag{38} \]
Consider the case where Bob observes a flow that is not fingerprinted. Recall that the packet timings of all the flows follow a Poisson process of rate $\lambda$. Denote by $Z^{\star}$ an instantiation of a Poisson process that corresponds to the packet timings of this flow before it passes through the network. If Bob detects a fingerprint, it must be that one of the fingerprints $W_l$ in the codebook resulted in $\frac{P(\bar{y} \mid W_l)}{P(\bar{y})} > \beta$. Hence,
\[ P_{f_3} = P\left( \exists\, W_k : \frac{P(\bar{y} \mid W_k)}{P(\bar{y})} > \beta \,\middle|\, Z^{\star} \text{ sent} \right). \tag{39} \]
Recalling that $W_1, \ldots, W_m$ and $Z^{\star}$ are independent instantiations of a Poisson process of rate $\lambda$, (37) and (38) yield $P_{f_3} \to 0$ as $T \to \infty$. Thus, Alice and Bob's fingerprinting is reliable. □

B. Scenario 6: Alice fingerprints a subset of the flows and Bob observes flows with and without fingerprints, Setting 2

Consider Scenario 6, which is the extension of Scenario 2 to the case where Bob observes flows with and without fingerprints ($\mathcal{F}_f \subset \mathcal{F}_B$): Alice fingerprints a subset of the flows she observes ($\mathcal{F}_f \subset \mathcal{F}_A$), flow rates are equal ($\lambda$), and Bob observes flows with and without fingerprints. We consider Setting 2 (see Fig. 1b), i.e., $M$ parallel $M/M/1$ queues with single input and output. Alice fingerprints the input flows of the network in the time interval $[0, T]$, and Bob extracts the fingerprints from the flows on the output links of the network to infer the connections between input and output flows. Similar to Scenario 5, Bob's detector is able to distinguish whether a flow is fingerprinted. We calculate the number of flows ($m$) that Alice and Bob can trace by fingerprinting, asymptotically as a function of $T$.

Theorem 4.2. Consider Setting 2 (see Fig.
1b). In a set $\mathcal{F}_A$ containing $M$ flows with equal rates ($\lambda$), if Bob observes flows with and without fingerprints ($\mathcal{F}_f \subset \mathcal{F}_B$), Alice and Bob can invisibly and reliably trace $m$ flows in a time interval of length $T$, where $m$ is given in (16), with $C$ replaced by
\[ C'' = \lambda \log\left( \frac{\min_i\{\mu_i\}}{\lambda} \right). \tag{40} \]
Note that the replacement of $C$ with $C''$ is necessary since here we consider Setting 2, which implies that the rates of the interfering flows $\lambda'_i$ are zero.

Proof. The construction and analysis follow from those of Scenario 2 with modifications due to the change of Bob's detector to detect whether a flow is fingerprinted or not. In addition, Alice does not need to know which subset of the flows she observes is observed by Bob in order to fingerprint them; instead, she chooses an arbitrary subset of flows and fingerprints them. In general, not every fingerprinted flow will be observed by Bob. However, since we determine the maximum number of flows that can be traced, we assume that each fingerprinted flow will be observed by Bob. The analysis for Bob's detector follows from that of Scenario 5. □

In Theorem 4.2, Alice's selection of the subset might be due to the preference of Alice and Bob. If there is no such preference, Alice can choose the flows to fingerprint randomly and independently. Next, we present Theorem 4.3 to address this case.

Theorem 4.3. Consider Setting 2 (see Fig. 1b). In a set $\mathcal{F}_A$ containing $M$ flows, if Alice fingerprints each flow independently with probability $q$, each flow has rate $\lambda$, and Bob observes a set of flows that contains flows with and without fingerprints ($\mathcal{F}_f \subset \mathcal{F}_B$), then Alice and Bob can invisibly and reliably trace
\[ m = O\left( \min\left\{ Mq,\ \exp\left( \frac{TC}{1 + \alpha' / \ln\left(1 + \frac{\epsilon^2}{2Mq^2}\right)} \right) \right\} \right) \tag{41} \]
flows in a time interval of length $T$, where $\epsilon$ is the invisibility parameter, and $\alpha'$ and $C''$ are given in (19) and (40), respectively.

Proof.
The construction and analysis follow those of Theorem 4.2 with modifications due to the random selection of the flows. Alice builds a fingerprint codebook of size $m$, where
\[ m = \exp\left( \frac{TC}{1 + \alpha' / \ln\left(1 + \frac{\epsilon^2}{2Mq^2}\right)} \right). \tag{42} \]
She selects the flow $f_i^{(A)} \in \mathcal{F}_A$ to be fingerprinted with probability $q$, independent of other flows. For each flow $f_i^{(A)}$ she generates an independent Bernoulli random variable $X_i$ with $P(X_i = 1) = q$; she selects a unique (unused) fingerprint from her codebook and embeds it in flow $f_i^{(A)}$ if and only if $X_i = 1$. Similar to the analysis of Scenario 2, we can show that for reliable fingerprinting we require
\[ m \le \exp\left( \frac{TC}{1 + \alpha' / \ln\left(1 + \frac{\epsilon^2}{2Mq^2}\right)} \right), \tag{43} \]
which is satisfied by (42). Next, we show that $N_s = \sum_{i=1}^{M} X_i = O(Mq)$. Consider the random variables $Y_i = X_i/q$, $i = 1, \ldots, M$. Since $E[Y_i] = E[X_i]/q = 1$, the weak law of large numbers (WLLN) yields $\lim_{T \to \infty} P\left( \frac{1}{M}\sum_{i=1}^{M} Y_i > 1/2 \right) = 1$. Let $\gamma = 1/2$ and $X_i = qY_i$. Thus, $\lim_{T \to \infty} P\left( \sum_{i=1}^{M} X_i > Mq/2 \right) = 1$. Since Alice fingerprints $\min\{m, N_s\}$ flows, the number of flows that Alice and Bob can invisibly and reliably trace is given by (41). In [1, Theorem 2], we presented values of $q$ and $M$ that yield a close-to-maximal number of flows that can be traced. □

VII. SIMULATION RESULTS

A. Willie's error probability

First, we consider Scenario 1 and present the results of the simulation for Willie's detection. Then, we discuss how similar results apply to all of the scenarios with slight modifications.

Consider Scenario 1. Recall that when $H_0$ is true (Alice is not fingerprinting), Willie observes flows where each flow's packet timing is governed by a Poisson process of rate $\lambda$. When $H_1$ is true (Alice is fingerprinting), the packet timing of each flow observed by Willie is governed by a Poisson process of rate $\lambda - \epsilon\sqrt{2\lambda/(mT_1)}$ in the first phase and a Poisson process of rate $\lambda$ in the second phase.
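The random selection step in the proof of Theorem 4.3 can be illustrated with a short simulation. This is a sketch only: the values of $M$, $q$, and the codebook size `m` below are hypothetical, and the function name `select_flows` is not from the paper.

```python
import random


def select_flows(M, q, rng):
    """Indices of flows chosen for fingerprinting: each flow is selected
    independently with probability q (one Bernoulli draw X_i per flow)."""
    return [i for i in range(M) if rng.random() < q]


# Hypothetical values, for illustration only.
rng = random.Random(1)
M, q, m = 10_000, 0.1, 5_000
selected = select_flows(M, q, rng)
n_selected = len(selected)            # N_s, the sum of the X_i
n_fingerprinted = min(m, n_selected)  # Alice fingerprints min{m, N_s} flows
print(n_selected, n_selected > M * q / 2, n_fingerprinted)
```

For large $M$ the count $N_s$ concentrates around $Mq$, so the event $N_s > Mq/2$ used in the proof occurs with probability close to one.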
Since the statistical properties of the flows are the same for $H_0$ and $H_1$ in the second phase, he uses the information obtained from his observations in the first phase to test whether Alice is fingerprinting. Note that when $H_1$ is true, Willie observes $m$ flows, each of whose packet rates is $\lambda - \epsilon\sqrt{2\lambda/(mT_1)}$ in the first phase. Similar to [29], we can show that a packet counter is an optimal detector for Willie. He counts the total number of packets $S$ in the first phase for all $m$ flows, and sets a threshold $U$. If $S < \lambda T m - U$, he selects $H_1$; otherwise, he selects $H_0$. We consider $P(H_0) = P(H_1) = 0.5$.

The simulation parameters are $\lambda = 7.36$ packets/second, $\min_i\{\mu_i - \lambda'_i\} = 20$ packets/second, $C = 7.36$ nats/second (see (8)), $T = 3600 \times 11$ seconds, $\zeta = 0.01$, $\epsilon = 0.1$, $m = 10$ (see (15)), and $U \in [0.07, 100]\sqrt{\lambda m T} \approx [80, 120000]$. Alice reduces the rate of each packet stream from $\lambda$ to $\lambda - r\epsilon\sqrt{2\lambda/(mT_1)}$, and we plot receiver operating characteristic (ROC) curves for Willie for $r \in [0.1, 9]$ (see Fig. 5). Note that the x-axis and y-axis of this figure are Willie's probability of false alarm ($P_{FA}$) and true detection ($1 - P_{MD}$), respectively. The number of trials is 8000. According to Theorem 1, $r = 1$ corresponds to the case which yields covertness, as verified by the ROC curve. Note that large values of $r$, which correspond to more slowdown of the packets by Alice in the first phase, lead to detection by Willie with high probability.

Fig. 5: The receiver operating characteristic (ROC) curve for Willie's detection. Alice reduces the rate of each packet stream from $\lambda$ to $\lambda - r\epsilon\sqrt{2\lambda/(mT_1)}$, and we draw ROC curves for $r \in [0.1, 9]$.

Next, we discuss why these results apply to other scenarios. Note that Willie's detection differs across scenarios since he observes a different number of flows. However, in all scenarios Willie's optimal detector is a packet counter.
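The packet-counting test can be sketched with a small Monte Carlo experiment. This is an illustration under stated assumptions, not the simulation code behind Fig. 5: the count is taken over the first phase only (length `T1`, whose value below is hypothetical), Poisson counts with large means are replaced by their normal approximation, and all names are hypothetical.

```python
import math
import random


def simulate_counter(lam, T1, m, eps, r, U, trials, rng):
    """Estimate Willie's false-alarm and missed-detection probabilities
    for the test 'decide H1 if S < lam*T1*m - U'.

    Assumption: Poisson counts are approximated by Gaussians with equal
    mean and variance, which is reasonable only for large means.
    """
    mean0 = lam * T1 * m
    reduced_rate = lam - r * eps * math.sqrt(2.0 * lam / (m * T1))
    mean1 = reduced_rate * T1 * m
    false_alarms = missed = 0
    for _ in range(trials):
        s0 = rng.gauss(mean0, math.sqrt(mean0))   # total count under H0
        s1 = rng.gauss(mean1, math.sqrt(mean1))   # total count under H1
        if s0 < mean0 - U:
            false_alarms += 1
        if s1 >= mean0 - U:
            missed += 1
    return false_alarms / trials, missed / trials


# Hypothetical T1; lam, m, and eps follow the values quoted above.
rng = random.Random(7)
lam, T1, m, eps = 7.36, 30_000.0, 10, 0.1
sigma = math.sqrt(lam * T1 * m)
p_fa, p_md = simulate_counter(lam, T1, m, eps, r=1.0, U=0.07 * sigma, trials=20_000, rng=rng)
print(f"r = 1: P_FA = {p_fa:.3f}, P_MD = {p_md:.3f}, sum = {p_fa + p_md:.3f}")
```

With $r = 1$ the mean count shifts by only $\epsilon\sqrt{2}$ standard deviations, so the sum of the two error probabilities stays close to one; a much larger $r$ separates the hypotheses and lets the counter detect Alice.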
Since all of the links are governed by independent Poisson processes and the sum of independent Poisson random variables (with distinct parameters) is another Poisson random variable, Willie's detection problem differs only slightly.

B. Probability that Alice runs out of packets

Recall that in all scenarios Alice slightly slows down the packet rate of each flow so as to buffer packets. She does this to ensure that, with high probability, she does not run out of packets in the second phase. We denoted the probability that Alice runs out of packets by $P_{f_1}$, and recall that we want to achieve $P_{f_1} < \zeta$.

We consider a single link and plot the probability that Alice runs out of packets when she reduces the rate from $\lambda$ to $\lambda - r'\epsilon\sqrt{2\lambda/(mT_1)}$ (see Fig. 6), where $r' \in [0, 0.5]$ is variable. We term $100r'$ the percentage of ideal rate reduction. According to Theorem 1, $r' = 1$ corresponds to the rate reduction that yields $P_{f_1} < \zeta$. The simulation parameters are $\lambda = 20$ packets/second, $\min_i\{\mu_i - \lambda'_i\} = 25$ packets/second, $C = 4.46$ nats/second (see (8)), $T = 3600 \times 2$ seconds, $\zeta = 0.1$, $\epsilon = 0.1$, and $m = 9$ (see (15)). The number of trials is 10,000. As expected, larger values of $r'$ yield a smaller probability of failure for Alice. Although only the value of $m$ is tied to Scenario 1, this result applies to all scenarios with minor modifications.

Note that the ideal reduced rate in the first phase ($\lambda - r'\epsilon\sqrt{2\lambda/(mT_1)}$ with $r' = 1$) is expected to achieve $P_f \le \zeta = 0.1$. Although the simulation for $r' \in [0.6, 1]$ has not been done due to processing time limits, Fig. 6 shows that even with less rate reduction ($r' = 0.5$), and hence less buffering, we achieve a much smaller probability of failure ($P_{f_1} \le 2 \times 10^{-4}$). So, our rate reduction in the first phase is conservative.
That leads to allocating a large portion of $T$ to the first phase, and a small portion to the second phase. The plot shows that in practice we can reduce the rate in the first phase less, and allocate a smaller portion of $T$ to the first phase.

Fig. 6: The probability that Alice runs out of packets for a single link when, in the buffering phase, she reduces the packet rate from $\lambda$ to $\lambda - r'\epsilon\sqrt{2\lambda/(mT_1)}$, where $100r'$ is the percentage of ideal rate reduction.

C. Bob's decoding error probability

We consider a single link and plot Bob's error probability ($P_{f_2}$), i.e., the probability that Bob extracts a wrong fingerprint from the link. Although the simulation results presented here use the number of links $m$ derived from Scenario 1, this result also applies to all scenarios with minor modifications. The simulation parameters are $\lambda = 5.485$ packets/second, $\min_i\{\mu_i - \lambda'_i\} = 5.5$ packets/second, $C = 0.015$ nats/second (see (8)), $T = 3600 \times 40$ seconds, $\zeta = 0.15$, $\epsilon = 0.1$, and $m = 13$ (see (15)). The number of trials is 2,000.

The maximum allowable size of the codebook is $m = 13$. For the simulation, we let the size of the codebook be $\lfloor r'' \times m \rfloor$, where $r'' \in [0.002, 1.2]$ (see Fig. 7). The x-axis is $r''$. According to Theorem 1, $r'' = 1$ corresponds to the ideal codebook size that results in an arbitrarily small error probability for Bob. Note that the results of Theorem 1 are based on Shannon's random coding, which relies on large $T$. If we consider larger values of $T$, we expect to see small error probabilities for Bob's decoding when the size of the codebook is ideal or smaller ($r'' \le 1$). Currently, because of processing time limits, we observe $P_{f_2} = 0.02$ when $r'' = 1$.

Although $T = 3600 \times 40$ seconds, the length of the second phase is only 173 seconds. In other words, if Alice and Bob are given 40 hours, they only use about 3 minutes of that time to embed and extract the fingerprints, and Alice uses the rest of the time to buffer packets in the first phase to ensure her fingerprinting will be successful. As stated in Section VII-B, this is because the parameters for packet buffering are conservative. Improving the parameters and reducing the amount of time needed for buffering lies beyond the scope of this work, since our primary goal is to establish the fundamental limits. Noting that using only 3 minutes for embedding and extracting the fingerprints results in a decoding error probability of $P_{f_2} = 0.02$, we can state that our current scheme is efficient in this respect.

Fig. 7: The probability that Bob extracts a wrong fingerprint from a flow. Bob looks at the packet timings of the flow and extracts the fingerprint from it according to the codebook shared with Alice. The size of the codebook is $r'' \times m$, where $m$ is the ideal codebook size according to Theorem 1.

D. Robustness against processing time of queues

Bob's detector relies on the fact that the queues are $M/M/1$, which implies that the processing times of the queues are i.i.d. exponential random variables. Here, we consider $M/G/1$ queues, whose processing times are i.i.d. samples of non-exponential random variables, and plot Bob's decoding error probability. We let the processing times of the queue be i.i.d. instantiations of a Weibull distribution with shape parameters 1, 2, 3, and 4, with the same processing rate $\mu$. Note that shape parameter 1 corresponds to an exponential random variable. Similar to Section VII-C, we consider a single link and plot Bob's error probability ($P_{f_2}$), i.e., the probability that Bob extracts a wrong fingerprint from the link (see Fig. 8).
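Service times for the $M/G/1$ experiment can be drawn as in the sketch below. It assumes that "the same processing rate $\mu$" means every Weibull distribution is scaled to have mean service time $1/\mu$; that interpretation, the function name, and the sample size are assumptions, not details given in the paper. The value $\mu = 5.5$ matches the service rate quoted in Section VII-C for the case of no interfering flows.

```python
import math
import random


def weibull_service_time(mu, shape, rng):
    """Draw one service time from a Weibull distribution with the given shape,
    scaled so that its mean equals 1/mu (assumed meaning of 'same rate').

    A Weibull variable with scale s and shape k has mean s * Gamma(1 + 1/k).
    """
    scale = 1.0 / (mu * math.gamma(1.0 + 1.0 / shape))
    # random.weibullvariate takes (scale, shape) in that order.
    return rng.weibullvariate(scale, shape)


rng = random.Random(3)
mu = 5.5
sample_means = {}
for shape in (1, 2, 3, 4):
    draws = [weibull_service_time(mu, shape, rng) for _ in range(50_000)]
    sample_means[shape] = sum(draws) / len(draws)
    print(f"shape {shape}: sample mean = {sample_means[shape]:.4f}, target = {1 / mu:.4f}")
```

With shape 1 the scale reduces to $1/\mu$ and the draws are exponential, which recovers the $M/M/1$ case; larger shapes keep the mean but reduce the variance of the service time.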
Although the simulation results presented here use the number of links $m$ derived from Scenario 1, this result also applies to all scenarios with minor modifications. The simulation parameters are the same as those of Section VII-C. According to Fig. 8, the change of distribution does not yield a major change in Bob's error probability, and thus Bob's decoder is robust against this change, i.e., if the distribution of the processing times of the queue changes from Weibull with shape parameter 1 to Weibull with shape parameter 2, 3, or 4.

Fig. 8: The probability that Bob extracts a wrong fingerprint from a flow when the service times of the queue are i.i.d. instantiations of an exponential distribution and of Weibull distributions with shape parameters 2, 3, and 4. Bob looks at the packet timings of the flow and extracts the fingerprint from it according to the codebook shared with Alice. The size of the codebook is $r'' \times m$, where $m$ is the ideal codebook size when the processing times are instantiations of an exponential random variable, according to Theorem 1.

VIII. DISCUSSION

A. Source of the gain in Scenarios 2, 4, and 6

Comparing the results of Scenarios 1, 3, and 5 (Alice fingerprints all flows she observes) with those of Scenarios 2, 4, and 6 (Alice fingerprints a subset of the flows she observes), we notice a large gain in the number of flows that can be fingerprinted when Alice fingerprints the flows with a small probability. Intuitively, if $H_1$ is true, in Scenarios 1, 3, and 5, Willie is certain that there is only one possibility: all flows are slowed down by Alice in the first phase. However, if $H_1$ is true, the number of possible sets of flows that might have been slowed down by Alice in the first phase is $\binom{M}{m}$ for Scenarios 2 and 4, and $2^M$ for Scenario 6, where sets whose cardinality is about $Mq$ are more probable.
Since a small portion of the flows is fingerprinted in Scenarios 2, 4, and 6, Willie needs to investigate a large number of flows to look for a decrease in the flow rates of a relatively (very) small random subset of those flows. This makes invisibility much easier to achieve and leads to the significant gains observed.

B. Alternative characterization of m with respect to M for Scenarios 2, 4, and 6

Consider Scenario 2 (Theorem 2). An alternative way to show the relation between the maximum number of flows $m$ that can be traced and the size $M$ of the set of flows observed by Alice is:
• If there exists a constant $\xi < C$ such that $M = O(e^{T\xi})$, then $m = O(\sqrt{M})$.
• If for all $\xi < C$, $M = \omega(e^{\xi T})$, then $m = O(e^{TC_5}\sqrt{M})$, for all $C_5 \in (0, C)$.
This applies to Scenario 4 (Theorem 3.2) and Scenario 6 (Theorem 4.2), replacing $C$ with $C'$ and $C''$, respectively.

C. Alice's knowledge about effective service times of the queues

We presented results assuming Alice knows the effective service rates of the queues, i.e., $\mu_i - \lambda'_i$ for all $1 \leq i \leq M$. We can show that if Alice does not know the effective service rates, but she knows a positive lower bound on each of them, then we achieve the same big-O results for the number of flows that Alice and Bob can trace. Furthermore, if she does not know the lower bounds, our big-O results achieved for Scenarios 1, 3, and 5 will change to little-o results.

D. Sharing the fingerprinting codebook

The use of a secret pre-shared key has been widely addressed in security and cryptography [36], [37]. In practice, secret keys can be distributed by a face-to-face meeting, by a trusted courier, or by sending the key through an existing encrypted channel. In many scenarios a secure low-throughput channel is available that the parties can use to share the key. Also, Diffie-Hellman key exchange (DH) can be used to share such a key over a public channel [38].

E. Delay performance

Our fingerprinting scheme requires that Alice first buffers packets, which increases the end-to-end delay of the network. We can show that the average packet delay in Scenarios 1, 3, and 5 is $O(\sqrt{\log T})$, and in Scenarios 2, 4, and 6 is $O(\sqrt{T})$. We have shown in the reliability analyses that the packet delay does not impact Bob's decoding, and he can extract fingerprints with arbitrarily small error probability. This is true because Bob extracts the fingerprints from inter-packet delays. Furthermore, the delay does not help Willie's detection. In other words, in the invisibility analysis we have shown that although packets experience delays, Willie cannot detect Alice and Bob's fingerprinting. This is true because Willie does not have access to the original packet timings; rather, he only knows the statistics of the packet timings, which change only slightly and in a way that is undetectable to him. Now consider the users of the network. Although this delay is not tolerable in applications such as voice over IP, there are many applications, such as file transfer, that allow for it.

F. Unwinding packets in Alice's buffer

Note that Alice's fingerprinting requires that she first buffers packets. We can show that Alice will have $O(\sqrt{T})$ packets in her buffer after the second phase ends at $t = T$. To unwind the packets, after $t = T$, Alice relays all the flows she receives at the rate she receives them, and inserts packets from her buffer according to a Poisson process of rate $\Delta$. Similar to the arguments where we showed that the change of rate from $\lambda$ to $\lambda - \Delta$ is undetectable to Willie, we can show that the change of rate from $\lambda$ to $\lambda + \Delta$ is undetectable to Willie, and thus Alice's unwinding is invisible. Similar analyses have been presented in our previous works [29], [30].

IX. FUTURE WORK

Future work consists of considering alternative network models and extending the current network model.
We will consider the cases where 1) packets are dropped; 2) packets are duplicated; 3) the order of the packets changes; 4) packets are fragmented; and 5) flows are re-packetized. In addition, we will apply the results of [30] to extend our results to $G/M/1$ queues, and we will consider other queuing models. Furthermore, we will apply the work of [39] to consider a network of parallel links where each link contains a set of $M/M/1$ single input/output queues in tandem, and then we will extend this to tandem queues shared between a flow between Alice and Bob (main flow) and independent interfering flows. We will also use [40, Corollary 3.3] to relax the condition of independent interference for queues on each route. Moreover, we will extend our model to a feedforward multiclass product-form network [41] containing parallel links where each link consists of multiple $M/M/1$ queues in tandem shared between a flow between Alice and Bob (main flow) as well as interfering flows.

X. CONCLUSION

We have presented the construction and analysis for invisible fingerprinting of flows to infer the connections between input and output links of a network that is modeled as $M$ independent, parallel, and work-conserving $M/M/1$ queues with background traffic. In a setting where flows whose packet timings are governed by Poisson processes visit Alice, Willie, the network, and Bob, respectively, we have presented a construction where Alice fingerprints flows in a time interval of length $T$ by manipulating the packet timings of the flows according to a fingerprint codebook shared with Bob and unknown to Willie. In particular, each codeword (fingerprint) of the codebook is a unique flow identifier which corresponds to a sequence of inter-packet delays.
If flow rates are equal, Bob observes only flows with fingerprints, and Alice chooses to fingerprint all $M$ input flows of the network that she observes, Alice and Bob can invisibly trace the fingerprinted flows as long as $m = M = O(T/\log T)$. But, if she fingerprints a subset of the flows $\mathcal{F}_f$, Alice and Bob can invisibly trace the fingerprinted flows as long as $m = |\mathcal{F}_f| = o(\min\{\sqrt{M}, e^{TC_1}\})$, for all $C_1 \in (0, C)$, with more accurate characterizations of $m$ with respect to $M$ presented in (16) and (26). Similar results hold for arbitrary flow rates as well as for the case where Bob observes flows with and without fingerprints, with minor modifications.

APPENDIX

A. Applicability of covertness metric when $P(H_0) \neq P(H_1)$:

Definition 1 implies that when $P(H_0) = P(H_1) = 1/2$, Alice can make Willie's detector operate as close as desired to a detector that disregards Willie's observations, e.g., tosses a fair coin to decide whether Alice is fingerprinting. For $P(H_0) \neq P(H_1)$, if Alice's scheme satisfies the invisibility metric in Definition 1, she can also make Willie's detector operate as close as desired to a detector that disregards Willie's observations, as follows. Recall that $P_e^{(w)} = \frac{P_{FA} + P_{MD}}{2}$ is Willie's error probability when the prior probabilities are equal, $P(H_0) = P(H_1) = 0.5$. Denote by ${P'}_e^{(w)}$ Willie's error probability when the prior probabilities are not equal. Then:

$$
\begin{aligned}
{P'}_e^{(w)} &= (1 - P(H_1))\,P_{FA} + P(H_1)\,P_{MD}, \\
&\geq 2\min(P(H_1), 1 - P(H_1))\,\frac{P_{FA} + P_{MD}}{2}, \\
&\geq 2\min(P(H_1), 1 - P(H_1))\,P_e^{(w)}.
\end{aligned}
\tag{44}
$$

By Definition 1, if Alice's fingerprinting is invisible, then for large enough $T$ she can achieve $P_e^{(w)} > \frac{1}{2} - \epsilon$, for all $\epsilon > 0$. Hence, (44) yields:

$$
\begin{aligned}
{P'}_e^{(w)} &\geq \min(P(H_1), 1 - P(H_1))\,(1 - 2\epsilon), \\
&\geq \min(P(H_1), 1 - P(H_1)) - \epsilon',
\end{aligned}
\tag{45}
$$

where $\epsilon' = 2\epsilon\min(P(H_1), 1 - P(H_1))$.
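The first inequality in (44) can be checked numerically. The sketch below is only a sanity check under illustrative assumptions: the function names and the grid resolution are hypothetical, and the check does not replace the derivation.

```python
def willie_error(p_h1: float, p_fa: float, p_md: float) -> float:
    """Willie's error probability for prior P(H1) = p_h1 (first line of (44))."""
    return (1.0 - p_h1) * p_fa + p_h1 * p_md


def bound_44(p_h1: float, p_fa: float, p_md: float) -> float:
    """Lower bound in (44): 2 * min(P(H1), 1 - P(H1)) * (P_FA + P_MD) / 2."""
    equal_prior_error = (p_fa + p_md) / 2.0
    return 2.0 * min(p_h1, 1.0 - p_h1) * equal_prior_error


def check_grid(steps: int = 20) -> bool:
    """Check the bound on a grid of priors and error probabilities in [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return all(
        willie_error(p, fa, md) >= bound_44(p, fa, md) - 1e-12
        for p in grid
        for fa in grid
        for md in grid
    )
```

For equal priors the two functions coincide, which matches the definition of $P_e^{(w)}$.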
Consider a detector that disregards Willie's observations: if $P(H_1) > 0.5$, Willie always decides that Alice is fingerprinting; otherwise, Willie decides that she is not. Using this detector, Willie achieves ${P'}_e^{(w)} = \min(P(H_1), 1 - P(H_1))$. From (45), Alice can make Willie's detector operate as close as desired to this detector.

B. Proof of (3):

Denote by $P_0$ the joint pdf of Willie's observations in the first phase under the null hypothesis $H_0$ (Alice is not fingerprinting), and by $P_1$ the joint pdf of the corresponding observations under the hypothesis $H_1$ (Alice is fingerprinting). Note that under $H_1$, Alice in the first phase slows down the flow $f_i$ from rate $\lambda_i$ to $\lambda_i - \Delta_i$, for $1 \leq i \leq m$. Since the number of observed packets for Poisson processes is a sufficient statistic for hypothesis testing [29],

$$
P_0 = \prod_{i=1}^{m} P_{\lambda_i}(n_i), \qquad P_1 = \prod_{i=1}^{m} P_{\lambda_i - \Delta_i}(n_i),
$$

where $P_\lambda(n)$ is the probability mass function (pmf) of the number of packets in time $T_1$ for a flow whose packet timings are governed by a Poisson process with rate $\lambda$, and $T_1$ is the length of the first phase. Observe

$$
D\left(P_{\lambda_i - \Delta_i}(n_i)\,\|\,P_{\lambda_i}(n_i)\right) = \Delta_i T_1 - (\lambda_i - \Delta_i)\,T_1 \log\frac{\lambda_i}{\lambda_i - \Delta_i} \leq \frac{T_1 \Delta_i^2}{2(\lambda_i - \Delta_i)},
\tag{46}
$$

where the last step follows from the inequality $\ln(1 + x) \geq x - x^2/2$ for $x \geq 0$. Thus,

$$
D(P_1\,\|\,P_0) = \sum_{i=1}^{m} D\left(P_{\lambda_i - \Delta_i}(n_i)\,\|\,P_{\lambda_i}(n_i)\right) \leq \sum_{i=1}^{m} \frac{T_1 \Delta_i^2}{2(\lambda_i - \Delta_i)}.
$$

Let $\Delta_i = \epsilon\sqrt{\frac{2\lambda_i}{mT_1}}$, where $\epsilon > 0$. Therefore,

$$
D(P_1\,\|\,P_0) \leq \frac{\epsilon^2}{m} \sum_{i=1}^{m} \frac{\lambda_i}{\lambda_i - \sqrt{2\lambda_i/T_1}}.
$$

For large enough $T_1$, $\frac{\lambda_i}{\lambda_i - \sqrt{2\lambda_i/T_1}} \leq 2$, and thus $D(P_1\,\|\,P_0) \leq 2\epsilon^2$ as $T_1 \to \infty$. Combining with (22), $P_e^{(w)} \geq \frac{1}{2} - \frac{\epsilon}{2} \geq \frac{1}{2} - \epsilon$. Consequently, the first phase is invisible.

C. Proof of (14):

Consider the following fact:

Fact 1. For $x, y > 0$, if $x < y/W(y)$, then $x \log x < y$.

Proof. Let $x_0 = y/W(y)$. First, we show that $x_0 \log x_0 = y$.
From the definition of the Lambert-W function, $W(y)e^{W(y)} = y$. Therefore, $W(y) = \log\frac{y}{W(y)}$. Consequently,

$$
x_0 \log x_0 = \frac{y}{W(y)} \log\frac{y}{W(y)} = \frac{y}{W(y)}\,W(y) = y.
\tag{47}
$$

Since $x_0 = y/W(y)$, $x < y/W(y)$ implies that $x < x_0$. Because $x \log x$ is an increasing function of $x$, $x \log x < x_0 \log x_0 = y$, and the proof is complete. $\blacksquare$

Next, for both cases $m \geq 1 + m\alpha$ and $m < 1 + m\alpha$, we show that $TC > (1 + m\alpha)\log m$, which implies (14). Consider $m \geq 1 + m\alpha$. Note that (15) implies $m < \frac{TC}{W(TC)}$. Therefore, Fact 1 yields:

$$
TC > m \log m.
$$

Since $m \geq 1 + m\alpha$,

$$
TC > m \log m > (1 + m\alpha)\log m.
$$

Now, consider $m < 1 + m\alpha$. Note that (15) implies that $m < \alpha^{-1}\left(\frac{TC}{W(TC)} - 1\right)$, which implies

$$
1 + m\alpha < \frac{TC}{W(TC)}.
$$

Hence, Fact 1 yields

$$
TC > (1 + m\alpha)\log(1 + m\alpha) \geq (1 + m\alpha)\log m,
$$

where the last inequality follows from $m < 1 + m\alpha$. Consequently, (15) satisfies (14).

D. Proof of (23):

Observe:

$$
\begin{aligned}
D(P_1\,\|\,P_0) &\overset{(a)}{=} M\,D\left(p\,P_{\lambda - \Delta}(n) + (1 - p)\,P_\lambda(n)\,\|\,P_\lambda(n)\right), \\
&\overset{(b)}{=} M\,E_1\left[\ln\left(\frac{p\,P_{\lambda - \Delta}(n) + (1 - p)\,P_\lambda(n)}{P_\lambda(n)}\right)\right], \\
&= M\,E_1\left[\ln\left(p\,e^{\Delta T_1}\left(\frac{\lambda - \Delta}{\lambda}\right)^{n} + (1 - p)\right)\right], \\
&\overset{(c)}{\leq} M p\,E_1\left[e^{\Delta T_1}\left(\frac{\lambda - \Delta}{\lambda}\right)^{n}\right] - M p, \\
&\overset{(d)}{=} M p^2\left(e^{\Delta^2 T_1/\lambda} - 1\right) \overset{(e)}{=} \epsilon^2/2,
\end{aligned}
\tag{48}
$$

where $(a)$ follows from the chain rule for relative entropy [42, Eq. (2.67)], $E_1[\cdot]$ denotes the expected value with respect to the pdf $\left(p\,P_{\lambda - \Delta}(n_i) + (1 - p)\,P_\lambda(n_i)\right)$, $(b)$ follows from the definition of the Kullback–Leibler divergence, $(c)$ is true since $\ln(1 + x) \leq x$, $(d)$ is true since $E_1\left[\left(\frac{\lambda - \Delta}{\lambda}\right)^{n}\right] = e^{-\Delta T_1}\left(p\,e^{\Delta^2 T_1/\lambda} + (1 - p)\right)$, and $(e)$ follows from substituting the values of $\Delta$, $p$, and $T_1$ given in (20), (21), and (17), respectively.

E. Proof of (24):

The first difference is in the number of packets that Alice can buffer from each flow in the first phase.
Here, since Alice slows down each flow from rate $\lambda$ to $\lambda - \Delta$, where $\Delta$ is given in (20), the probability that Alice can buffer more than $\Delta T_1/2$ packets in the second phase tends to one as $T \to \infty$. Therefore, letting $t = T_1$ and $k = \Delta T_1/2 = \sqrt{\lambda T_1 \ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)/4}$ in (10) yields:

$$
\lim_{T \to \infty} P_{f_1} \leq 1 - \lim_{T \to \infty} \operatorname{erf}\left(\sqrt{\frac{T_1 \ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)}{8T_2}}\right).
\tag{49}
$$

The second difference in the analysis of $P_{f_1}$ is due to differences in the expressions for $T_1$ and $T_2$. By (17) and (18), $T_1/T_2 = \alpha'/\ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)$. Therefore, (49) yields:

$$
\lim_{T \to \infty} P_{f_1} \leq 1 - \operatorname{erf}\left(\sqrt{\frac{\alpha'}{8}}\right) = 1 - \operatorname{erf}\left(\epsilon\sqrt{\frac{\alpha}{8}}\right),
\tag{50}
$$

where the last step is true since $\alpha' = \epsilon^2\alpha$. By (12),

F. Proof of (26):

If $M = O(1)$, by (16), the left hand side (LHS) of (26) is $m = M = O(1)$. Now, consider the right hand side (RHS) of (26). Since $m = O(1)$, there exists $\rho$ such that for large enough $T$, $m \leq \rho$. Consequently, the RHS of (26) is $\Omega\left(e^{\frac{TC}{1 + \rho'}}\right)$, where $\rho' = \alpha'/\ln\left(1 + \frac{\epsilon^2}{2\rho}\right)$. Thus, (26) is satisfied.

If $M = \omega(1)$ and $M = O(e^{2TC})$, $m = o(\min\{\sqrt{M}, e^{TC_1}\})$, where $C_1 \in (0, C)$. Thus, the LHS of (26) is $o(e^{TC_1})$. To show that (26) is satisfied, it suffices to show that there exists a constant $C'_1 \in (C_1, C)$ that makes the RHS of (26) $\Omega(e^{TC'_1})$, which is true since

$$
\exp\left(\frac{TC}{1 + \alpha'/\ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)}\right) \geq e^{TC\left(1 - \alpha'/\ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)\right)},
\tag{51}
$$

which holds because $1/(1 + x) \geq 1 - x$ for all $x > 0$. Note that $m = o(\min\{\sqrt{M}, e^{TC_1}\})$ implies that $m = o(\sqrt{M})$, and thus $\alpha'/\ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)$ in the RHS of (51) gets as small as desired.

If $M = \omega(e^{2TC})$, $m = \Theta(e^{TC_2})$ for any $C_2 \in (0, C)$, and thus the LHS of (26) is $\Theta(e^{TC_2})$. Now, consider the RHS of (26). Since $m = \Theta(e^{TC_2})$ and $M = \omega(e^{2TC})$, $m = o(\sqrt{M})$, and thus $\alpha'/\ln\left(1 + \frac{\epsilon^2 M}{2m^2}\right)$ in the RHS of (51) gets as small as desired. Consequently, there exists a constant $C'_2 \in (C_2, C)$ such that the RHS of (26) is $\Omega(e^{TC'_2})$.
Hence, (26) is satisfied.

REFERENCES

[1] R. Soltani, D. Goeckel, D. Towsley, and A. Houmansadr, "Towards provably invisible network flow fingerprints," in 2017 51st Asilomar Conference on Signals, Systems, and Computers, pp. 258–262, Oct 2017.
[2] J. López and J. Zhou, Wireless sensor network security, vol. 1. Ios Press, 2008.
[3] N. Takbiri, A. Houmansadr, D. L. Goeckel, and H. Pishro-Nik, "Limits of location privacy under anonymization and obfuscation," in International Symposium on Information Theory (ISIT), (Aachen, Germany), pp. 764–768, IEEE, 2017.
[4] N. Takbiri, A. Houmansadr, D. L. Goeckel, and H. Pishro-Nik, "Privacy against statistical matching: Inter-user correlation," in International Symposium on Information Theory (ISIT), (Vail, Colorado, USA), 2018.
[5] M. Hadian, X. Liang, T. Altuwaiyan, and M. M. Mahmoud, "Privacy-preserving mhealth data release with pattern consistency," in Global Communications Conference (GLOBECOM), 2016 IEEE, pp. 1–6, IEEE, 2016.
[6] N. Takbiri, R. Soltani, D. L. Goeckel, A. Houmansadr, and H. Pishro-Nik, "Asymptotic loss in privacy due to dependency in gaussian traces," arXiv preprint arXiv:1809.10289, 2018.
[7] M. Hadian, T. Altuwaiyan, X. Liang, and W. Li, "Privacy-preserving voice-based search over mhealth data," Smart Health, 2018.
[8] R. K. Nichols, P. Lekkas, and P. C. Lekkas, Wireless security. McGraw-Hill Professional Publishing, 2001.
[9] A. Naghizadeh, S. Berenjian, E. Meamari, and R. E. Atani, "Structural-based tunneling: preserving mutual anonymity for circular p2p networks," International Journal of Communication Systems, vol. 29, no. 3, pp. 602–619, 2016.
[10] D. X. Song, D. Wagner, and X. Tian, "Timing analysis of keystrokes and timing attacks on ssh," in USENIX Security Symposium, vol. 2001, 2001.
[11] S. Staniford-Chen and L. T. Heberlein, "Holding intruders accountable on the internet," in Security and Privacy, 1995. Proceedings., 1995 IEEE Symposium on, pp. 39–49, IEEE, 1995.
[12] Y. Zhang and V. Paxson, "Detecting stepping stones," in USENIX Security Symposium, vol. 171, p. 184, 2000.
[13] P. Syverson, G. Tsudik, M. Reed, and C. Landwehr, "Towards an analysis of onion routing security," in Designing Privacy Enhancing Technologies, pp. 96–114, Springer, 2001.
[14] A. Houmansadr, N. Kiyavash, and N. Borisov, "Rainbow: A robust and invisible non-blind watermark for network flows," in NDSS, 2009.
[15] A. Houmansadr and N. Borisov, "Swirl: A scalable watermark to detect correlated network flows," in NDSS, 2011.
[16] A. Houmansadr, N. Kiyavash, and N. Borisov, "Multi-flow attack resistant watermarks for network flows," 2009.
[17] A. Houmansadr, Design, analysis, and implementation of effective network flow watermarking schemes. PhD thesis, University of Illinois at Urbana-Champaign, 2012.
[18] X. Wang and D. S. Reeves, "Robust correlation of encrypted attack traffic through stepping stones by manipulation of interpacket delays," in Proceedings of the 10th ACM conference on Computer and communications security, pp. 20–29, ACM, 2003.
[19] P. Peng, P. Ning, and D. S. Reeves, "On the secrecy of timing-based active watermarking trace-back techniques," in Security and Privacy, 2006 IEEE Symposium on, pp. 15–pp, IEEE, 2006.
[20] X. Wang, S. Chen, and S. Jajodia, "Tracking anonymous peer-to-peer voip calls on the internet," in Proceedings of the 12th ACM conference on Computer and communications security, pp. 81–91, ACM, 2005.
[21] N. Kiyavash, A. Houmansadr, and N. Borisov, "Multi-flow attacks against network flow watermarking schemes," in USENIX security symposium, pp. 307–320, 2008.
[22] A. Houmansadr, T. Coleman, N. Kiyavash, and N. Borisov, "On the channel capacity of network flow watermarking," in Proceedings of 16th ACM conference on computer and communications security (CCS 09), 2009.
[23] A. Houmansadr and N. Borisov, "The need for flow fingerprints to link correlated network flows," in International Symposium on Privacy Enhancing Technologies Symposium, pp. 205–224, Springer, 2013.
[24] F. Rezaei and A. Houmansadr, "Tagit: Tagging network flows using blind fingerprints," Proceedings on Privacy Enhancing Technologies, vol. 2017, no. 4, pp. 290–307, 2017.
[25] X. Wang, S. Chen, and S. Jajodia, "Network flow watermarking attack on low-latency anonymous communication systems," in 2007 IEEE Symposium on Security and Privacy (SP), pp. 116–130, IEEE, 2007.
[26] W. Yu, X. Fu, S. Graham, D. Xuan, and W. Zhao, "Dsss-based flow marking technique for invisible traceback," in Security and Privacy, 2007. SP'07. IEEE Symposium on, pp. 18–32, IEEE, 2007.
[27] X. Liu and R. Srikant, "The timing capacity of single-server queues with multiple input and output terminals,"
[28] R. Soltani, B. Bash, D. Goeckel, S. Guha, and D. Towsley, "Covert single-hop communication in a wireless network with distributed artificial noise generation," in Communication, Control, and Computing (Allerton), 2014 52nd Annual Allerton Conference on, pp. 1078–1085, IEEE, 2014.
[29] R. Soltani, D. Goeckel, D. Towsley, and A. Houmansadr, "Covert communications on poisson packet channels," in 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1046–1052, IEEE, 2015.
[30] R. Soltani, D. Goeckel, D. Towsley, and A. Houmansadr, "Covert communications on renewal packet channels," in 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), IEEE, 2016.
[31] R. Soltani, D. Goeckel, D. Towsley, B. Bash, and S. Guha, "Covert wireless communication with artificial noise generation," IEEE Transactions on Wireless Communications, pp. 1–1, 2018.
[32] R. Soltani, D. Goeckel, D. Towsley, and A. Houmansadr, "Fundamental limits of covert bit insertion in packets," in 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), IEEE, 2018.
[33] T. H. Cormen, Introduction to algorithms. MIT press, 2009.
[34] V. Anantharam and S. Verdu, "Bits through queues," Information Theory, IEEE Transactions on, vol. 42, no. 1, pp. 4–18, 1996.
[35] R. Sundaresan and S. Verdú, "Robust decoding for timing channels," IEEE Transactions on Information Theory, vol. 46, no. 2, pp. 405–419, 2000.
[36] J. Katz, A. J. Menezes, P. C. Van Oorschot, and S. A. Vanstone, Handbook of applied cryptography. CRC press, 1996.
[37] D. R. Stinson, Cryptography: theory and practice. CRC press, 2005.
[38] M. Steiner, G. Tsudik, and M. Waidner, "Diffie-hellman key distribution extended to group communication," in Proceedings of the 3rd ACM conference on Computer and communications security, pp. 31–37, ACM, 1996.
[39] P. Mimcilovic, "Mismatch decoding of a compound timing channel," in Forty-Fourth Annual Allerton Conference on Communication, Control, and Computing, 2006.
[40] F. P. Kelly, Reversibility and stochastic networks. Cambridge University Press, 2011.
[41] F. Baskett, K. M. Chandy, R. R. Muntz, and F. G. Palacios, "Open, closed, and mixed networks of queues with different classes of customers," Journal of the ACM (JACM), vol. 22, no. 2, pp. 248–260, 1975.
[42] T. M. Cover and J. A. Thomas, Elements of information theory. John Wiley & Sons, 2012.
