Secret Sharing over Fast-Fading MIMO Wiretap Channels


Authors: Tan F. Wong, Matthieu Bloch, John M. Shea

Abstract: Secret sharing over the fast-fading MIMO wiretap channel is considered. A source and a destination try to share secret information over a fast-fading MIMO channel in the presence of an eavesdropper who also makes channel observations that are different from but correlated to those made by the destination. An interactive, authenticated public channel with unlimited capacity is available to the source and destination for the secret-sharing process. This situation is a special case of the "channel model with wiretapper" considered by Ahlswede and Csiszár. An extension of their result to continuous channel alphabets is employed to evaluate the key capacity of the fast-fading MIMO wiretap channel. The effects of spatial dimensionality provided by the use of multiple antennas at the source, destination, and eavesdropper are then investigated.

I. INTRODUCTION

The wiretap channel considered in the seminal paper [1] is the first example that demonstrates the possibility of secure communications at the physical layer. It is shown in [1] that a source can transmit a message at a positive (secrecy) rate to a destination in such a way that an eavesdropper only gathers information at a negligible rate, when the source-to-eavesdropper channel is a degraded version of the source-to-destination channel. A similar result for the Gaussian wiretap channel is provided in [2]. The work in [3] further removes the degraded wiretap channel restriction, showing that positive secrecy capacity is possible if the destination channel is "more capable" ("less noisy" for a full extension of the rate region in [1]) than the eavesdropper's channel.
Recently, there has been a flurry of interest in extending these early results to more sophisticated channel models, including fading wiretap channels, multi-input multi-output (MIMO) wiretap channels, multiple-access wiretap channels, broadcast wiretap channels, relay wiretap channels, etc. We do not attempt to provide a comprehensive summary of all recent developments, and highlight only results that are most relevant to the present work. We refer interested readers to the introduction and reference list of [4] for a concise and extensive overview of recent works.

[Affiliations: Tan F. Wong and John M. Shea are with the Wireless Information Networking Group, University of Florida, Gainesville, FL 32611-6130, USA. Matthieu Bloch is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA, and with the GT-CNRS UMI 2958, 2-3 Rue Marconi, 57070 Metz, France.]

[Footnote: The source-to-eavesdropper and source-to-destination channels will hereafter be referred to as the eavesdropper and destination channels, respectively.]

When the destination and eavesdropper channels experience independent fading, the strict requirement of having a more capable destination channel for positive secrecy capacity can be loosened. This is due to the simple observation that the destination channel may be more capable than the eavesdropper's channel under some fading realizations, even if the destination channel is not more capable than the eavesdropper's on average. Hence, if the channel state information (CSI) of both the destination and eavesdropper channels is available at the source, it is shown in [4], [5] that a positive secrecy capacity can be achieved by means of appropriate power control at the source. The key idea is to opportunistically transmit only during those fading realizations for which the destination channel is more capable [6].
For block-ergodic fading, it is also shown in [5] (see also [7]) that a positive secrecy capacity can be achieved with a variable-rate transmission scheme without any eavesdropper CSI available at the source. When the source, destination, and eavesdropper have multiple antennas, the resulting channel is known as a MIMO wiretap channel (see [8], [9], [10], [11], [12]), which may also have positive secrecy capacity. Since the MIMO wiretap channel is not degraded, the characterization of its secrecy capacity is not straightforward. For instance, the secrecy capacity of the MIMO wiretap channel is characterized in [9] as the saddle point of a minimax problem, while an alternative characterization based on a recent result for multi-antenna broadcast channels is provided in [11]. Interestingly, all characterizations point to the fact that the capacity-achieving scheme is one that transmits only in the directions in which the destination channel is more capable than the eavesdropper's channel. Obviously, this is only possible when the destination and eavesdropper CSI is available at the source. It is shown in [9] that if the individual channels from antennas to antennas suffer from independent Rayleigh fading, and the respective ratios of the numbers of source and destination antennas to the number of eavesdropper antennas are larger than certain fixed values, then the secrecy capacity is positive with probability one as the numbers of source, destination, and eavesdropper antennas become very large. As discussed above, the availability of destination (and eavesdropper) CSI at the source is an implicit requirement for positive secrecy capacity in the fading and MIMO wiretap channels. Thus, an authenticated feedback channel is needed to send the CSI from the destination back to the source. In [5], [7], this feedback channel is assumed to be public, and hence the destination CSI is also available to the eavesdropper.
In addition, it is assumed that the eavesdropper knows its own CSI. With the availability of a feedback channel, if the objective of having the source send secret information to the destination is relaxed to distilling a secret key shared between the source and destination, it is shown in [13] that a positive key rate is achievable when the destination and eavesdropper channels are two conditionally independent (given the source input symbols) memoryless binary channels, even if the destination channel is not more capable than the eavesdropper's channel. This notion of secret sharing is formalized in [14] based on the concept of common randomness between the source and destination. Assuming the availability of an interactive, authenticated public channel with unlimited capacity between the source and destination, [14] suggests two different system models, called the "source model with wiretapper" (SW) and the "channel model with wiretapper" (CW). The CW model is similar to the (discrete memoryless) wiretap channel model that we have discussed before. The SW model differs in that the random symbols observed at the source, destination, and eavesdropper are realizations of a discrete memoryless source with multiple components. Both SW and CW models have been extended to the case of secret sharing among multiple terminals, with the possibility of some terminals acting as helpers [15], [16], [17]. Key capacities have been obtained for the two special cases in which the eavesdropper's channel is a degraded version of the destination channel and in which the destination and eavesdropper channels are conditionally independent [14], [13]. Similar results have been derived for multi-terminal secret sharing [16], [17], with the two special cases above subsumed by the more general condition that the terminal symbols form a Markov chain on a tree.
Authentication of the public channel can be achieved by the use of an initial short key and then a small portion of the subsequent shared secret message [18]. A detailed study of secret sharing over an unauthenticated public channel is given in [19], [20], [21]. Other approaches that employ feedback have also been recently considered [22], [23], [24]. In particular, it is shown in [22] that positive secrecy capacity can be achieved for the modulo-additive discrete memoryless wiretap channel and the modulo-Λ channel if the destination is allowed to send signals back to the source over the same wiretap channel and both terminals can operate in a full-duplex manner. In fact, for the former channel, the secrecy capacity is the same as the capacity of such a channel in the absence of the eavesdropper. In this paper, we consider secret sharing over a fast-fading MIMO wiretap channel. Thus, we are interested in the CW model of [14] with memoryless, conditionally independent destination and eavesdropper channels and continuous channel alphabets. We provide an extension of the key capacity result in [14] to include continuous channel alphabets (Theorem 2.1). Using this result, we obtain the key capacity of the fast-fading MIMO wiretap channel (Section III). Our result indicates that the key capacity is always positive, no matter how large the gain of the eavesdropper's channel is; moreover, this holds even if the destination and eavesdropper CSI is available at the destination and eavesdropper, respectively. Of course, the availability of the public channel implies that the destination CSI could be fed back to the source. However, due to the restrictions imposed on the secret-sharing strategies (see Section II), only causal feedback is allowed, and thus any destination CSI available at the source is "outdated".
This does not turn out to be a problem since, unlike the approaches mentioned above, the source does not use the CSI to avoid sending secret information when the destination channel is not more capable than the eavesdropper's channel. As a matter of fact, the fading process of the destination channel provides a significant part of the common randomness from which the source and the destination distill a secret key. This fact is readily obtained from the alternative achievability proof given in Section IV. We note that [25], [26] consider the problem of key generation from common randomness over wiretap channels and exploit a Wyner-Ziv coding scheme to limit the amount of information conveyed from the source to the destination via the wiretap channel. Unlike these previous works, we only employ Wyner-Ziv coding to quantize the destination channel outputs; our code construction still relies on a public channel with unlimited capacity to achieve the key capacity. Finally, we also investigate the limiting value of the key capacity under three asymptotic scenarios. In the first scenario, the transmission power of the source becomes asymptotically high (Corollary 3.1). In the second scenario, the destination and eavesdropper have a large number of antennas (Corollary 3.2). In the third scenario, the gain advantage of the eavesdropper's channel becomes asymptotically large (Corollary 3.3). These three scenarios reveal two different effects of spatial dimensionality upon key capacity. In the first scenario, we show that the key capacity levels off as the power increases if the eavesdropper has no fewer antennas than the source. On the other hand, when the source has more antennas, the key capacity can increase without bound with the source power. In the second scenario, we show that the spatial dimensionality advantage that the eavesdropper has over the destination has exactly the same effect as the channel gain advantage of the eavesdropper.
In the third scenario, we show that the limiting key capacity is positive only if the eavesdropper has fewer antennas than the source. The results in these scenarios confirm that spatial dimensionality can be used to combat the eavesdropper's gain advantage, as was already observed for the MIMO wiretap channel. Perhaps more surprisingly, this is achieved with neither the source nor destination needing any eavesdropper CSI.

II. SECRET SHARING AND KEY CAPACITY

We consider the CW model of [14], and we recall its characteristics for completeness. We consider three terminals, namely a source, a destination, and an eavesdropper. The source sends symbols from an alphabet $\mathcal{X}$. The destination and eavesdropper observe symbols belonging to alphabets $\mathcal{Y}$ and $\mathcal{Z}$, respectively. Unlike in [14], $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{Z}$ need not be discrete; in fact, in Section III we will assume they are multi-dimensional vector spaces over the complex field. The channel from the source to the destination and eavesdropper is assumed memoryless. A generic symbol sent by the source is denoted by $X$, and the corresponding symbols observed by the destination and eavesdropper are denoted by $Y$ and $Z$, respectively. For notational convenience (and without loss of generality), we assume that $(X, Y, Z)$ are jointly continuous, and the channel is specified by the conditional probability density function (pdf) $p_{Y,Z|X}(y,z|x)$. In addition, we restrict ourselves to cases in which $Y$ and $Z$ are conditionally independent given $X$, i.e., $p_{Y,Z|X}(y,z|x) = p_{Y|X}(y|x)\,p_{Z|X}(z|x)$, which is a reasonable model for symbols broadcast in a wireless medium. Hereafter, we drop the subscripts in pdfs whenever the concerned symbols are well specified by the arguments of the pdfs. We assume that an interactive, authenticated public channel with unlimited capacity is also available for communication between the source and destination.
Here, interactive means that the channel is two-way and can be used multiple times; unlimited capacity means that it is noiseless and has infinite capacity; and public and authenticated mean that the eavesdropper can perfectly observe all communications over this channel but cannot tamper with the messages transmitted. We consider the class of permissible secret-sharing strategies suggested in [14]. Consider $k$ time instants labeled by $1, 2, \ldots, k$. The $(X, Y, Z)$ channel is used $n$ times during these $k$ time instants, at times $i_1 < i_2 < \cdots < i_n$. Set $i_{n+1} = k$. The public channel is used for the other $k - n$ time instants. Before the secret-sharing process starts, the source and destination generate, respectively, independent random variables $M_X$ and $M_Y$. To simplify the notation, let $a^i$ represent a sequence of messages/symbols $a_1, a_2, \ldots, a_i$. Then a permissible strategy proceeds as follows:

• At time instant $0 < i < i_1$, the source sends the message $\Phi_i = \Phi_i(M_X, \Psi^{i-1})$ to the destination, and the destination sends the message $\Psi_i = \Psi_i(M_Y, \Phi^{i-1})$ to the source. Both transmissions are carried over the public channel.

• At time instant $i = i_j$ for $j = 1, 2, \ldots, n$, the source sends the symbol $X_j = X_j(M_X, \Psi^{i_j - 1})$ over the $(X, Y, Z)$ channel. The destination and eavesdropper observe the corresponding symbols $Y_j$ and $Z_j$. There is no message exchange via the public channel, i.e., $\Phi_i$ and $\Psi_i$ are both null.

• At time instant $i_j < i < i_{j+1}$ for $j = 1, 2, \ldots, n$, the source sends the message $\Phi_i = \Phi_i(M_X, \Psi^{i-1})$ to the destination, and the destination sends the message $\Psi_i = \Psi_i(M_Y, Y^j, \Phi^{i-1})$ to the source. Both transmissions are carried over the public channel.
At the end of the $k$ time instants, the source generates its secret key $K = K(M_X, \Psi^k)$, and the destination generates its secret key $L = L(M_Y, Y^n, \Phi^k)$, where $K$ and $L$ take values from the same finite set $\mathcal{K}$. According to [14], $R$ is an achievable key rate through the channel $(X, Y, Z)$ if for every $\varepsilon > 0$ there exists a permissible secret-sharing strategy of the form described above such that

1) $\Pr\{K \neq L\} < \varepsilon$,
2) $\frac{1}{n} I(K; Z^n, \Phi^k, \Psi^k) < \varepsilon$,
3) $\frac{1}{n} H(K) > R - \varepsilon$, and
4) $\frac{1}{n} \log |\mathcal{K}| < \frac{1}{n} H(K) + \varepsilon$,

for sufficiently large $n$. The key capacity of the channel $(X, Y, Z)$ is the largest achievable key rate through the channel. We are interested in finding the key capacity. For the case of continuous channel alphabets considered here, we also add the following power constraint on the symbol sequence $X^n$ sent out by the source:
$$
\frac{1}{n} \sum_{j=1}^{n} |X_j|^2 \leq P \tag{1}
$$
with probability one (w.p.1) for sufficiently large $n$.

Theorem 2.1: The key capacity of a CW model $(X, Y, Z)$ with conditional pdf $p(y,z|x) = p(y|x)\,p(z|x)$ is given by
$$
\max_{X :\, E[|X|^2] \leq P} \left[ I(X; Y) - I(Y; Z) \right].
$$

Proof: The case with discrete channel alphabets is established in [14, Corollary 2 of Theorem 2], whose achievability proof (like the ones in [16], [17]) does not readily extend to continuous channel alphabets. Nevertheless, the same single-backward-message strategy suggested in [14] is still applicable for continuous alphabets. That strategy uses $k = n+1$ time instants with $i_j = j$ for $j = 1, 2, \ldots, n$. That is, the source first sends $n$ symbols through the $(X, Y, Z)$ channel; after receiving these $n$ symbols, the destination feeds back a single message at the last time instant to the source over the public channel. A carefully structured Wyner-Ziv code can be employed to support this secret-sharing strategy. The detailed arguments are provided in the alternative achievability proof in Section IV.
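To make the objective of Theorem 2.1 concrete, the following sketch evaluates $I(X;Y) - I(Y;Z)$ in closed form for a toy scalar complex-Gaussian wiretap channel (not the MIMO model of Section III); the function names and parameter values are ours, chosen for illustration only.

```python
import math

# Illustrative closed-form evaluation of the Theorem 2.1 objective
# I(X;Y) - I(Y;Z) for a scalar channel Y = X + N_D, Z = X + N_W with
# X ~ CN(0, P); all names and values here are our own assumptions.

def key_rate(P, var_d, var_w):
    """I(X;Y) - I(Y;Z) in nats for the scalar complex-Gaussian case."""
    var_y = P + var_d                   # variance of Y
    var_z = P + var_w                   # variance of Z
    i_xy = math.log(1.0 + P / var_d)    # I(X;Y)
    i_yz = math.log(var_y * var_z / (var_y * var_z - P * P))  # I(Y;Z)
    return i_xy - i_yz

def key_rate_alt(P, var_d, var_w):
    """Equivalent form h(Y|Z) - h(Y|X) = I(X;Y|Z) under conditional independence."""
    var_y = P + var_d
    var_z = P + var_w
    return math.log((var_y - P * P / var_z) / var_d)

P, var_d, var_w = 10.0, 1.0, 0.5   # eavesdropper has the *less* noisy channel
r = key_rate(P, var_d, var_w)
assert abs(r - key_rate_alt(P, var_d, var_w)) < 1e-12
print(r)   # positive even though the eavesdropper's channel is better
```

Note that the rate stays positive for any noise advantage of the eavesdropper, consistent with the discussion in the introduction.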
Here we outline an achievability argument based on the consideration of a conceptual wiretap channel from the destination back to the source and eavesdropper, as suggested in [13, Theorem 3]. First, assume the source sends a sequence of i.i.d. symbols $X^n$, each distributed according to $p(x)$, over the wiretap channel. Suppose that $E[|X|^2] \leq P$. Because of the law of large numbers, we can assume that $X^n$ satisfies the power constraint (1) without loss of generality. Let $Y^n$ and $Z^n$ be the observations of the destination and eavesdropper, respectively. To transmit a sequence $U^n$ of symbols independent of $(X^n, Y^n, Z^n)$, the destination sends $U^n + Y^n$ back to the source via the public channel. This creates a conceptual memoryless wiretap channel from the destination, with input symbol $U$, to the source in the presence of the eavesdropper, where the source observes $(U+Y, X)$ while the eavesdropper observes $(U+Y, Z)$. Employing the continuous-alphabet extension of the well-known result in [3], the secrecy capacity of the conceptual wiretap channel (and hence the key capacity of the original channel) is lower bounded by $\max_U [I(U; U+Y, X) - I(U; U+Y, Z)]$. Note that the input symbol $U$ has no power constraint since the public channel has infinite capacity.
But
$$
\begin{aligned}
I(U; U+Y, X) - I(U; U+Y, Z)
&= I(U; X) + I(U; U+Y \mid X) - \left[ I(U; Z) + I(U; U+Y \mid Z) \right] \\
&= h(U) - h(U|X) + h(U+Y|X) - h(U+Y|U,X) \\
&\qquad - h(U) + h(U|Z) - h(U+Y|Z) + h(U+Y|U,Z) \\
&= h(Y|Z) - h(Y|X) + \left[ h(U+Y|X) - h(U|X) \right] - \left[ h(U+Y|Z) - h(U|Z) \right] \\
&\geq h(Y|Z) - h(Y|X) - \left[ h(U+Y|Z) - h(U|Z) \right] \\
&\geq h(Y|Z) - h(Y|X) - \left[ h(U+Y) - h(U) \right]
\end{aligned} \tag{2}
$$
where the equality on the fourth line results from $h(U+Y|U,X) = h(Y|U,X) = h(Y|X)$ and $h(U+Y|U,Z) = h(Y|U,Z) = h(Y|Z)$, due to the independence of $U$ and $(X, Y, Z)$; the inequality on the fifth line follows from the fact that $h(U+Y|X) - h(U|X) \geq h(U+Y|X,Y) - h(U|X) = h(U|X,Y) - h(U|X) = 0$, which is again due to this independence; and the inequality on the last line follows from $h(U+Y|Z) - h(U|Z) = h(U+Y|Z) - h(U) \leq h(U+Y) - h(U)$. Without loss of generality and for notational simplicity, assume that $Y$ and $U$ are both one-dimensional real random variables. Now choose $U$ to be Gaussian distributed with mean $0$ and variance $\sigma_U^2$. Then
$$
h(U+Y) - h(U) \leq \tfrac{1}{2}\log\left(2\pi e\, \mathrm{var}(U+Y)\right) - \tfrac{1}{2}\log\left(2\pi e \sigma_U^2\right) = \tfrac{1}{2}\log\left(\frac{\sigma_U^2 + \mathrm{var}(Y)}{\sigma_U^2}\right) \tag{3}
$$
where the inequality follows from [27, Theorem 8.6.5] and the equality is due to the independence between $Y$ and $U$. Combining (2) and (3), for every $\varepsilon > 0$ we can choose $\sigma_U^2$ large enough such that
$$
I(U; U+Y, X) - I(U; U+Y, Z) \geq h(Y|Z) - h(Y|X) - \varepsilon = I(X;Y) - I(Y;Z) - \varepsilon.
$$
Since $\varepsilon$ is arbitrary, the key capacity is lower bounded by $\max_{E[|X|^2] \leq P} [I(X;Y) - I(Y;Z)]$. The converse proof in [14] is directly applicable to continuous channel alphabets, provided the average power constraint (1) can be incorporated into the arguments in [14, pp. 1129-1130].
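The right-hand side of (3) can be tabulated numerically to see how quickly the penalty $h(U+Y) - h(U)$ vanishes as $\sigma_U^2$ grows; the sketch below uses an arbitrary illustrative value for $\mathrm{var}(Y)$.

```python
import math

# Numerical look at the bound (3): with U ~ N(0, s2u) independent of Y,
# h(U+Y) - h(U) <= 0.5*log((s2u + var(Y))/s2u), which vanishes as s2u grows.
# var_y below is an arbitrary illustrative value.

def gap_bound(s2u, var_y):
    """Right-hand side of (3), in nats."""
    return 0.5 * math.log((s2u + var_y) / s2u)

var_y = 4.0
bounds = [gap_bound(s2u, var_y) for s2u in (1e0, 1e2, 1e4, 1e6)]
# The bound decreases in s2u and can be made smaller than any eps > 0,
# which is exactly how eps is driven to zero in the achievability argument.
assert all(b1 > b2 for b1, b2 in zip(bounds, bounds[1:]))
print(bounds)
```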
This latter requirement is simplified by the additive and symmetric nature of the average power constraint [28, Section 3.6]. To avoid too much repetition, we outline below only the steps of the proof that are not directly available in [14, pp. 1129-1130]. For every permissible strategy with achievable key rate $R$, we have
$$
\begin{aligned}
\frac{1}{n} I(K; L) &= \frac{1}{n} H(K) - \frac{1}{n} H(K|L) \\
&\geq \frac{1}{n} H(K) - \frac{1}{n}\left[1 + \Pr\{K \neq L\} \cdot \log|\mathcal{K}|\right] \\
&> \frac{1}{n} H(K) - \frac{1}{n} - \varepsilon\left(\frac{1}{n} H(K) + \varepsilon\right) \\
&> (1-\varepsilon)(R-\varepsilon) - \frac{1}{n} - \varepsilon^2
\end{aligned} \tag{4}
$$
where the second line follows from Fano's inequality, the third line results from conditions 1) and 4) in the definition of achievable key rate, and the last line is due to condition 3). Thus it suffices to upper bound $I(K;L)$. From condition 2) in the definition of achievable key rate and the chain rule, we have
$$
\frac{1}{n} I(K; L) < \frac{1}{n} I(K; L \mid Z^n, \Phi^k, \Psi^k) + \varepsilon \leq \frac{1}{n} I(M_X; M_Y, Y^n \mid Z^n, \Phi^k, \Psi^k) + \varepsilon \tag{5}
$$
where the second inequality is due to the fact that $K = K(M_X, \Psi^k)$ and $L = L(M_Y, Y^n, \Phi^k)$. By repeated uses of the chain rule, the construction of permissible strategies, and the memoryless nature of the $(X, Y, Z)$ channel, it is shown in [14, pp. 1129-1130] that
$$
\frac{1}{n} I(M_X; M_Y, Y^n \mid Z^n, \Phi^k, \Psi^k) \leq \frac{1}{n} \sum_{j=1}^{n} I(X_j; Y_j \mid Z_j). \tag{6}
$$
Now let $Q$ be a uniform random variable that takes values from $\{1, 2, \ldots, n\}$ and is independent of all other random quantities. Define $(\tilde{X}, \tilde{Y}, \tilde{Z}) = (X_j, Y_j, Z_j)$ if $Q = j$. Then it is obvious that $p_{\tilde{Y},\tilde{Z}|\tilde{X}}(\tilde{y}, \tilde{z} \mid \tilde{x}) = p_{Y,Z|X}(\tilde{y}, \tilde{z} \mid \tilde{x})$, and (6) can be rewritten as
$$
\frac{1}{n} I(M_X; M_Y, Y^n \mid Z^n, \Phi^k, \Psi^k) \leq I(\tilde{X}; \tilde{Y} \mid \tilde{Z}, Q) \leq I(\tilde{X}; \tilde{Y} \mid \tilde{Z}) \tag{7}
$$
where the second inequality is due to the fact that $Q \to \tilde{X} \to (\tilde{Y}, \tilde{Z})$ forms a Markov chain. On the other hand, the power constraint (1) implies that
$$
E[|\tilde{X}|^2] = \frac{1}{n} \sum_{j=1}^{n} E[|X_j|^2] \leq P. \tag{8}
$$
Combining (4), (5), and (7), we obtain
$$
R < \frac{1}{1-\varepsilon}\left[ I(\tilde{X}; \tilde{Y} \mid \tilde{Z}) + 2\varepsilon + \frac{1}{n} \right]. \tag{9}
$$
Since $\varepsilon$ can be arbitrarily small when $n$ is sufficiently large, (9), together with (8), gives
$$
R \leq I(\tilde{X}; \tilde{Y} \mid \tilde{Z}) \leq \max_{X :\, E[|X|^2] \leq P} I(X; Y \mid Z) = \max_{X :\, E[|X|^2] \leq P} \left[ I(X;Y) - I(Y;Z) \right]
$$
where the last equality is due to the fact that $p(y,z|x) = p(y|x)\,p(z|x)$.

III. KEY CAPACITY OF FAST-FADING MIMO WIRETAP CHANNEL

Consider that the source, destination, and eavesdropper have $m_S$, $m_D$, and $m_W$ antennas, respectively. The antennas in each node are separated by at least a few wavelengths, and hence the fading processes of the channels across the transmit and receive antennas are independent. We use the complex baseband representation of the bandpass channel model:
$$
\begin{aligned}
Y_D &= H_D X + N_D \\
Y_W &= \alpha H_W X + N_W
\end{aligned} \tag{10}
$$
where

• $X$ is the $m_S \times 1$ complex-valued symbol vector transmitted by the source,

• $Y_D$ is the $m_D \times 1$ complex-valued symbol vector received at the destination,

• $Y_W$ is the $m_W \times 1$ complex-valued symbol vector received at the eavesdropper,

• $N_D$ is the $m_D \times 1$ noise vector with independent, identically distributed (i.i.d.) zero-mean, circular-symmetric complex Gaussian elements of variance $\sigma_D^2$ (i.e., the real and imaginary parts of each element are independent zero-mean Gaussian random variables with the same variance),

• $N_W$ is the $m_W \times 1$ noise vector with i.i.d. zero-mean, circular-symmetric complex Gaussian elements of variance $\sigma_W^2$,

• $H_D$ is the $m_D \times m_S$ channel matrix from the source to destination with i.i.d. zero-mean, circular-symmetric complex Gaussian elements of unit variance,

• $H_W$ is the $m_W \times m_S$ channel matrix from the source to eavesdropper with i.i.d. zero-mean, circular-symmetric complex Gaussian elements of unit variance, and

• $\alpha > 0$ models the gain advantage of the eavesdropper over the destination.
Note that $H_D$, $H_W$, $N_D$, and $N_W$ are independent. The wireless channel modeled by (10) is used $n$ times as the $(X, Y, Z)$ channel described in Section II, with $Y = [Y_D, H_D]$ and $Z = [Y_W, H_W]$. We assume that the $n$ uses of the wireless channel in (10) are i.i.d., so that the memoryless requirement of the $(X, Y, Z)$ channel is satisfied. Since $H_D$ and $H_W$ are included in the respective channel symbols observable by the destination and eavesdropper (i.e., $Y$ and $Z$, respectively), this model also implicitly assumes that the destination and eavesdropper have perfect CSI of their respective channels from the source. In practice, we can separate adjacent uses of the wireless channel by more than the coherence time of the channel to approximately ensure the i.i.d. channel-use assumption. Training (known) symbols can be sent by the source right before or after each use (within the channel coherence period) so that the destination can acquire the required CSI. The eavesdropper may also use these training symbols to acquire the CSI of its own channel. If the CSI required at the destination is obtained in the way just described, then a unit of channel use includes the symbol $X$ together with the associated training symbols. However, as in [29], we do not count the power required to send the training symbols (cf. (1)). Moreover, we note that the source (and also the eavesdropper) may get some information about the outdated CSI of the destination channel, because information about the destination channel CSI, up to the previous use, may be fed back to the source from the destination via the public channel. More specifically, at time instant $i_j$, the source symbol $X_j$ is a function of the feedback message $\Psi^{i_j - 1}$, which is in turn some function of the realizations of $H_D$ at times $i_1, i_2, \ldots, i_{j-1}$. We also note that neither the source nor destination has any eavesdropper CSI.
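A minimal simulation sketch of one i.i.d. use of the channel model (10) may help fix the dimensions and statistics; all parameter values and helper names below are our own illustrative choices.

```python
import numpy as np

# One i.i.d. use of model (10): the destination observes Y = (Y_D, H_D),
# the eavesdropper observes Z = (Y_W, H_W). Illustrative parameters only.

rng = np.random.default_rng(0)

def cn(rng, *shape, var=1.0):
    """i.i.d. zero-mean circular-symmetric complex Gaussian entries."""
    s = np.sqrt(var / 2.0)
    return s * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

m_s, m_d, m_w = 2, 2, 1                 # antenna counts (illustrative)
P, var_d, var_w, alpha = 10.0, 1.0, 1.0, 1.5

H_D = cn(rng, m_d, m_s)                 # destination channel matrix
H_W = cn(rng, m_w, m_s)                 # eavesdropper channel matrix
X   = cn(rng, m_s, var=P / m_s)         # white Gaussian input, E[|X|^2] = P
Y_D = H_D @ X + cn(rng, m_d, var=var_d)
Y_W = alpha * (H_W @ X) + cn(rng, m_w, var=var_w)
print(Y_D.shape, Y_W.shape)
```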
Referring back to (10), these two facts imply that $X$ is independent of $H_D$, $H_W$, $N_D$, and $N_W$, i.e., the current source symbol $X$ is independent of the current channel state. Since the fading MIMO wiretap channel model in (10) is a special case of the CW model considered in Section II, the key capacity $C_K$ is given by Theorem 2.1 as
$$
C_K = \max_{X :\, E[|X|^2] \leq P} \left[ I(X; Y_D, H_D) - I(Y_D, H_D; Y_W, H_W) \right]. \tag{11}
$$
Note that
$$
\begin{aligned}
I(X; Y_D, H_D) - I(Y_D, H_D; Y_W, H_W) &= I(X; Y_D \mid H_D) - I(Y_D; Y_W \mid H_D, H_W) \\
&= h(Y_D \mid Y_W, H_D, H_W) - h(Y_D \mid X, H_D) \\
&= h(Y_D \mid Y_W, H_D, H_W) - m_D \log(\pi e \sigma_D^2).
\end{aligned} \tag{12}
$$
Substituting this back into (11), we get
$$
C_K = \max_{X :\, E[|X|^2] \leq P} h(Y_D \mid Y_W, H_D, H_W) - m_D \log(\pi e \sigma_D^2). \tag{13}
$$
As a result, the key capacity of the fast-fading wiretap channel described by (10) can be obtained by maximizing the conditional entropy $h(Y_D \mid Y_W, H_D, H_W)$. This maximization problem is solved below.

Theorem 3.1:
$$
C_K = E\left[ \log \frac{\det\left( I_{m_S} + \frac{\alpha^2 P}{m_S \sigma_W^2} H_W^\dagger H_W + \frac{P}{m_S \sigma_D^2} H_D^\dagger H_D \right)}{\det\left( I_{m_S} + \frac{\alpha^2 P}{m_S \sigma_W^2} H_W^\dagger H_W \right)} \right]
$$
where $\dagger$ denotes conjugate transpose.

Proof: To determine the key capacity, we need the following upper bound on the conditional entropy $h(U|V)$.

Lemma 3.1: Let $U$ and $V$ be two jointly distributed complex random vectors of dimensions $m_U$ and $m_V$, respectively. Let $K_U$, $K_V$, and $K_{UV}$ be the covariance of $U$, the covariance of $V$, and the cross-covariance of $U$ and $V$, respectively. If $K_V$ is invertible, then
$$
h(U|V) \leq \log \det\left( K_U - K_{UV} K_V^{-1} K_{VU} \right) + m_U \log(\pi e).
$$
The upper bound is achieved when $[U^T\ V^T]^T$ is a circular-symmetric complex Gaussian random vector.

Proof: We can assume that both $U$ and $V$ have zero means without loss of generality. Also assume the existence of all unconditional and conditional covariances stated below.
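The expectation in Theorem 3.1 has no simple closed form, but it is straightforward to estimate by Monte Carlo; the following sketch does so with illustrative parameters of our choosing.

```python
import numpy as np

# Monte Carlo sketch of the key capacity formula of Theorem 3.1:
# C_K = E[ log det(I + a*W + b*D) - log det(I + a*W) ], with
# W = H_W^† H_W, D = H_D^† H_D, a = alpha^2*P/(m_s*s2w), b = P/(m_s*s2d).
# All parameter values are illustrative, not taken from the paper.

rng = np.random.default_rng(1)

def cn(shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def key_capacity(m_s, m_d, m_w, P, s2d=1.0, s2w=1.0, alpha=1.0, trials=20000):
    a = alpha**2 * P / (m_s * s2w)
    b = P / (m_s * s2d)
    I = np.eye(m_s)
    total = 0.0
    for _ in range(trials):
        H_D, H_W = cn((m_d, m_s)), cn((m_w, m_s))
        W = H_W.conj().T @ H_W
        D = H_D.conj().T @ H_D
        _, ld_num = np.linalg.slogdet(I + a * W + b * D)
        _, ld_den = np.linalg.slogdet(I + a * W)
        total += ld_num - ld_den
    return total / trials   # nats per channel use

c = key_capacity(2, 2, 2, P=10.0, alpha=2.0)
print(c)   # positive despite the eavesdropper's 6 dB gain advantage
```

Each Monte Carlo term is nonnegative (adding the positive semidefinite term $bD$ cannot decrease the determinant), mirroring the claim that $C_K$ is always positive.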
For each $v$,
$$
h(U \mid V = v) \leq \log\left( (\pi e)^{m_U} \det(K_{U|v}) \right) \tag{14}
$$
where $K_{U|v}$ is the covariance of $U$ with respect to the conditional density $p_{U|V}(u|v)$ [29, Lemma 2]. This implies
$$
\begin{aligned}
h(U|V) &\leq E_V\left[ \log\left( (\pi e)^{m_U} \det(K_{U|V}) \right) \right] \\
&\leq \log \det\left( E_V[K_{U|V}] \right) + m_U \log(\pi e) \\
&\leq \log \det\left( K_U - K_{UV} K_V^{-1} K_{VU} \right) + m_U \log(\pi e).
\end{aligned} \tag{15}
$$
The second inequality above is due to the concavity of the function $\log\det$ over the set of positive definite symmetric matrices [30, 7.6.7] and Jensen's inequality. To get the third inequality, observe that $E_V[K_{U|V}]$ can be interpreted as the error covariance of estimating $U$ by the conditional-mean estimator $E[U|V]$. On the other hand, $K_U - K_{UV} K_V^{-1} K_{VU}$ is the error covariance of using the linear minimum mean-squared-error (LMMSE) estimator $K_{UV} K_V^{-1} V$ instead. The inequality results from the fact that $K_U - K_{UV} K_V^{-1} K_{VU} \geq E_V[K_{U|V}]$ (i.e., $[K_U - K_{UV} K_V^{-1} K_{VU}] - E_V[K_{U|V}]$ is positive semidefinite) [31] and the fact that $\det(A) \geq \det(B)$ if $A$ and $B$ are positive definite and $A \geq B$ [30, 7.7.4].

Suppose that $[U^T\ V^T]^T$ is a circular-symmetric complex Gaussian random vector. For each $v$, the conditional covariance of $U$, conditioned on $V = v$, is the same as the (unconditional) covariance of $U - K_{UV} K_V^{-1} V$. Since $U - K_{UV} K_V^{-1} V$ is a circular-symmetric complex Gaussian random vector [29, Lemma 3], so is $U$ conditioned on $V = v$. Hence, by [29, Lemma 2], the upper bound in (14) is achieved with $K_{U|v} = K_U - K_{UV} K_V^{-1} K_{VU}$, which also gives the upper bound in (15).

To prove the theorem, we first obtain an upper bound on $C_K$ and then show that the upper bound is achievable.
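The estimator argument in the proof of Lemma 3.1 can be checked empirically: the error covariance of the LMMSE estimator $K_{UV} K_V^{-1} V$ matches the Schur complement $K_U - K_{UV} K_V^{-1} K_{VU}$. A sketch, with arbitrary dimensions and a randomly generated joint covariance:

```python
import numpy as np

# Empirical check of the LMMSE error covariance used in Lemma 3.1.
# Dimensions and the joint covariance are illustrative choices of ours.

rng = np.random.default_rng(2)
m_u, m_v, n = 2, 3, 200000

A = rng.standard_normal((m_u + m_v, m_u + m_v))
S = A @ A.T                               # random joint covariance of (U, V)
samples = rng.multivariate_normal(np.zeros(m_u + m_v), S, size=n)
U, V = samples[:, :m_u], samples[:, m_u:]

K_U, K_V = S[:m_u, :m_u], S[m_u:, m_u:]
K_UV = S[:m_u, m_u:]
schur = K_U - K_UV @ np.linalg.inv(K_V) @ K_UV.T   # LMMSE error covariance

err = U - V @ np.linalg.inv(K_V) @ K_UV.T          # U - K_UV K_V^{-1} V per sample
emp = err.T @ err / n                              # empirical error covariance
assert np.allclose(emp, schur, atol=0.1)
print(np.round(schur, 3))
```

The agreement holds for any distribution with these first two moments, which is why the LMMSE error covariance upper-bounds the conditional-mean error covariance in (15).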
Using Lemma 3.1, we have
$$
h(Y_D \mid Y_W, H_D, H_W) - m_D \log(\pi e \sigma_D^2) \leq E\left[ \log \det\left( K_{Y_D} - K_{Y_D Y_W} K_{Y_W}^{-1} K_{Y_W Y_D} \right) \right] - m_D \log \sigma_D^2 \tag{16}
$$
where $K_{Y_D}$ and $K_{Y_W}$ are respectively the conditional covariances of $Y_D$ and $Y_W$ given $H_D$ and $H_W$, and $K_{Y_D Y_W}$ and $K_{Y_W Y_D}$ are the corresponding conditional cross-covariances. Substituting (16) into (13), an upper bound on $C_K$ is
$$
\max_{X :\, E[|X|^2] \leq P} E\left[ \log \det\left( K_{Y_D} - K_{Y_D Y_W} K_{Y_W}^{-1} K_{Y_W Y_D} \right) \right] - m_D \log \sigma_D^2. \tag{17}
$$
Thus we need to solve the maximization problem (17). To do so, let $\lambda_1, \lambda_2, \ldots, \lambda_{m_S}$ be the (nonnegative) eigenvalues of $K_X$. Since the distributions of both $H_D$ and $H_W$ are invariant to any unitary transformation [29, Lemma 5], we can without any ambiguity define
$$
f(\lambda_1, \lambda_2, \ldots, \lambda_{m_S}) = E\left[ \log \det\left( I_{m_D} + \frac{1}{\sigma_D^2} H_D K_X^{1/2} \left( I_{m_S} + \frac{\alpha^2}{\sigma_W^2} K_X^{1/2} H_W^\dagger H_W K_X^{1/2} \right)^{-1} K_X^{1/2} H_D^\dagger \right) \right]. \tag{18}
$$
That is, we can assume $K_X = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_{m_S})$ with no loss of generality. Then we have the following lemma, which shows that the objective function in (17) is a concave function depending only on the eigenvalues of the covariance of $X$.

Lemma 3.2: Suppose that $X$ has an arbitrary covariance $K_X$, whose (nonnegative) eigenvalues are $\lambda_1, \lambda_2, \ldots, \lambda_{m_S}$. Then
$$
E\left[ \log \det\left( K_{Y_D} - K_{Y_D Y_W} K_{Y_W}^{-1} K_{Y_W Y_D} \right) \right] - m_D \log \sigma_D^2 = f(\lambda_1, \lambda_2, \ldots, \lambda_{m_S}) \tag{19}
$$
is concave in $\Lambda = \{\lambda_i \geq 0 \text{ for } i = 1, 2, \ldots, m_S\}$.

Proof: First write $A_D = H_D K_X^{1/2}$ and $A_W = \alpha H_W K_X^{1/2}$. It is easy to see from (10) that $K_{Y_D} = A_D A_D^\dagger + \sigma_D^2 I_{m_D}$, $K_{Y_W} = A_W A_W^\dagger + \sigma_W^2 I_{m_W}$, and $K_{Y_D Y_W} = A_D A_W^\dagger$. Then
$$
\begin{aligned}
K_{Y_D} - K_{Y_D Y_W} K_{Y_W}^{-1} K_{Y_W Y_D}
&= \sigma_D^2 \left( I_{m_D} + \frac{1}{\sigma_D^2} A_D \left[ I_{m_S} - A_W^\dagger \left( A_W A_W^\dagger + \sigma_W^2 I_{m_W} \right)^{-1} A_W \right] A_D^\dagger \right) \\
&= \sigma_D^2 \left( I_{m_D} + \frac{1}{\sigma_D^2} A_D \left( I_{m_S} + \frac{1}{\sigma_W^2} A_W^\dagger A_W \right)^{-1} A_D^\dagger \right)
\end{aligned} \tag{20}
$$
where the last equality is due to the matrix inversion formula.
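The matrix-inversion step in (20) can be spot-checked numerically; the sketch below verifies the underlying identity for a random instance (sizes and the noise variance are arbitrary).

```python
import numpy as np

# Numerical spot-check of the inversion step in (20):
# I - A_W^†(A_W A_W^† + s2 I)^{-1} A_W == (I + A_W^† A_W / s2)^{-1}.
# Matrix sizes and s2 are arbitrary illustrative choices.

rng = np.random.default_rng(3)
m_w, m_s, s2 = 3, 4, 0.7
A_W = (rng.standard_normal((m_w, m_s)) + 1j * rng.standard_normal((m_w, m_s))) / np.sqrt(2)

lhs = np.eye(m_s) - A_W.conj().T @ np.linalg.inv(A_W @ A_W.conj().T + s2 * np.eye(m_w)) @ A_W
rhs = np.linalg.inv(np.eye(m_s) + A_W.conj().T @ A_W / s2)
assert np.allclose(lhs, rhs)
print(np.max(np.abs(lhs - rhs)))
```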
Substituting this result into the left-hand side of (19), we obtain the right-hand side of (18), and hence (19). To show concavity of $f$, it suffices to consider only diagonal $K_X = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_{m_S})$ in $\Lambda$. Note that the mapping
$$H : K_X \mapsto \begin{bmatrix} K_{Y_D} & K_{Y_D Y_W} \\ K_{Y_W Y_D} & K_{Y_W} \end{bmatrix}$$
is linear in $\Lambda$. Also the mapping
$$F : \begin{bmatrix} K_{Y_D} & K_{Y_D Y_W} \\ K_{Y_W Y_D} & K_{Y_W} \end{bmatrix} \mapsto K_{Y_D} - K_{Y_D Y_W} K_{Y_W}^{-1} K_{Y_W Y_D}$$
is matrix-concave on $H(\Lambda)$ [32, Ex. 3.58]. Thus the composition theorem [32] gives that the mapping $G : K_X \mapsto K_{Y_D} - K_{Y_D Y_W} K_{Y_W}^{-1} K_{Y_W Y_D}$ is matrix-concave in $\Lambda$, since $G = F \circ H$. Another use of the composition theorem, together with the concavity of the function $\log\det$ mentioned in the proof of Lemma 3.1, shows that $\log\det G$ is concave in $\Lambda$. Thus (19) implies that $f$ is also concave in $\Lambda$.

Hence it suffices to consider only those $X$ with zero mean in (17). Now define the constraint set $\Lambda_P = \{\lambda_i \ge 0 \text{ for } i = 1, 2, \ldots, m_S \text{ and } \sum_{i=1}^{m_S} \lambda_i \le P\}$. Lemma 3.2 implies that we can find the upper bound on $C_K$ by calculating $\max_{\Lambda_P} f(\lambda_1, \lambda_2, \ldots, \lambda_{m_S})$, whose value is given by the next lemma:

Lemma 3.3: $\max_{\Lambda_P} f(\lambda_1, \lambda_2, \ldots, \lambda_{m_S}) = f\left(\frac{P}{m_S}, \frac{P}{m_S}, \ldots, \frac{P}{m_S}\right)$.

Proof: Since the elements of both $H_D$ and $H_W$ are i.i.d., $f$ is invariant to any permutation of its arguments. This means that $f$ is a symmetric function. By Lemma 3.2, $f$ is also concave on $\Lambda_P$. Thus it is Schur-concave [33]. Hence a Schur-minimal element (an element majorized by any other element) in $\Lambda_P$ maximizes $f$. It is easy to check that $\left(\frac{P}{m_S}, \frac{P}{m_S}, \ldots, \frac{P}{m_S}\right)$ is Schur-minimal in $\Lambda_P$. Hence $\max_{\Lambda_P} f(\lambda_1, \lambda_2, \ldots, \lambda_{m_S}) = f\left(\frac{P}{m_S}, \frac{P}{m_S}, \ldots, \frac{P}{m_S}\right)$.
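The concavity in Lemma 3.2 holds for every fixed channel realization; the expectation in (18) only averages it. The sketch below is ours (unit noise variances and $\alpha^2 = 1$ assumed for illustration) and checks midpoint concavity of the $\log\det$ inside (18), as a function of the eigenvalues $(\lambda_1, \ldots, \lambda_{m_S})$, for one random draw of $H_D$ and $H_W$:

```python
import numpy as np

rng = np.random.default_rng(2)
mS, mD, mW = 3, 2, 2
sigD2, sigW2, alpha2 = 1.0, 1.0, 1.0

HD = rng.standard_normal((mD, mS)) + 1j * rng.standard_normal((mD, mS))
HW = rng.standard_normal((mW, mS)) + 1j * rng.standard_normal((mW, mS))

def g(lams):
    """log det of the matrix inside (18) for one fixed channel realization."""
    Kx_half = np.diag(np.sqrt(lams))
    AD = HD @ Kx_half
    inner = np.linalg.inv(np.eye(mS) + (alpha2 / sigW2)
                          * Kx_half @ HW.conj().T @ HW @ Kx_half)
    M = np.eye(mD) + (1.0 / sigD2) * AD @ inner @ AD.conj().T
    return np.log(np.linalg.det(M).real)

# Midpoint concavity in the eigenvalue vector, per realization.
x = rng.uniform(0, 5, mS)
y = rng.uniform(0, 5, mS)
assert g((x + y) / 2) >= 0.5 * (g(x) + g(y)) - 1e-9
```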
Combining the results in (17), (18), Lemmas 3.2 and 3.3, we obtain the upper bound on the key capacity as
$$C_K \le E\left[\log\det\left(I_{m_D} + \frac{P}{m_S \sigma_D^2} H_D \left(I_{m_S} + \frac{\alpha^2 P}{m_S \sigma_W^2} H_W^\dagger H_W\right)^{-1} H_D^\dagger\right)\right] = E\left[\log \frac{\det\left(I_{m_S} + \frac{\alpha^2 P}{m_S \sigma_W^2} H_W^\dagger H_W + \frac{P}{m_S \sigma_D^2} H_D^\dagger H_D\right)}{\det\left(I_{m_S} + \frac{\alpha^2 P}{m_S \sigma_W^2} H_W^\dagger H_W\right)}\right] \qquad (21)$$
where the identity $\det(I + U V^{-1} U^\dagger) = \det(V + U^\dagger U)/\det(V)$ for invertible $V$ [34, Theorem 18.1.1] has been used.

On the other hand, consider choosing $X$ to have i.i.d. zero-mean, circular-symmetric complex Gaussian-distributed elements of variance $\frac{P}{m_S}$. Then, conditioned on $H_D$ and $H_W$, $[Y_D^T\ Y_W^T]^T$ is a circular-symmetric complex Gaussian random vector, by applying [29, Lemmas 3 and 4] to the linear model of (10). Hence Lemma 3.1 gives
$$h(Y_D \mid Y_W, H_D, H_W) = E\left[\log\det\left(K_{Y_D} - K_{Y_D Y_W} K_{Y_W}^{-1} K_{Y_W Y_D}\right)\right] + m_D \log(\pi e)$$
where $K_{Y_D} = \frac{P}{m_S} H_D H_D^\dagger + \sigma_D^2 I_{m_D}$, $K_{Y_W} = \frac{\alpha^2 P}{m_S} H_W H_W^\dagger + \sigma_W^2 I_{m_W}$, and $K_{Y_D Y_W} = \frac{\alpha P}{m_S} H_D H_W^\dagger$. Substituting this back into (12) and using the matrix inversion formula to simplify the resulting expression, we obtain the same expression as on the first line of (21) for $I(X; Y_D, H_D) - I(Y_D, H_D; Y_W, H_W)$. Thus the upper bound in (21) is achievable with this choice of $X$; hence it is in fact the key capacity.

In Fig. 1, the key capacities of several fast-fading MIMO channels with different numbers of source, destination, and eavesdropper antennas are plotted against the source signal-to-noise ratio (SNR) $P/\sigma^2$, where $\sigma_D^2 = \sigma_W^2 = \sigma^2$. The channel gain advantage of the eavesdropper is set to $\alpha^2 = 1$. We observe that the key capacity levels off as $P/\sigma^2$ increases in three of the four channels considered in Fig. 1, the exception being the case of $(m_S, m_D, m_W) = (2, 1, 1)$. It appears that the relative antenna dimensions determine the asymptotic behavior of the key capacity when the SNR is large.
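The key-capacity expression (21) is straightforward to estimate by Monte Carlo. The sketch below is ours, not the paper's code; it assumes i.i.d. unit-variance circular-symmetric complex Gaussian channel entries and $\sigma_D^2 = \sigma_W^2 = 1$, and evaluates the second form of (21) so that the determinant identity is exercised directly:

```python
import numpy as np

rng = np.random.default_rng(3)

def key_capacity_mc(mS, mD, mW, snr, alpha2=1.0, trials=2000):
    """Monte Carlo estimate of the right-hand side of (21), in bits per
    channel symbol, with sigma_D^2 = sigma_W^2 = 1 and P = snr."""
    cW, cD = alpha2 * snr / mS, snr / mS
    total = 0.0
    for _ in range(trials):
        HD = (rng.standard_normal((mD, mS))
              + 1j * rng.standard_normal((mD, mS))) / np.sqrt(2)
        HW = (rng.standard_normal((mW, mS))
              + 1j * rng.standard_normal((mW, mS))) / np.sqrt(2)
        A = np.eye(mS) + cW * HW.conj().T @ HW
        num = np.linalg.det(A + cD * HD.conj().T @ HD).real
        den = np.linalg.det(A).real
        total += np.log2(num / den)
    return total / trials

# e.g., the (1,1,1) channel at 10 dB SNR
ck = key_capacity_mc(1, 1, 1, snr=10.0)
assert ck > 0
```

The exact numbers depend on the channel normalization, which this excerpt does not pin down, so the estimate should be read qualitatively (e.g., leveling off in $P$ for $m_W \ge m_S$) rather than compared digit-for-digit with Fig. 1.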
To study this behavior more precisely, we evaluate the limiting value of $C_K$ as the input power $P$ of the source becomes very large. To highlight the dependence of $C_K$ on $P$, we use the notation $C_K(P)$.

Corollary 3.1:
1) If $m_W \ge m_S$, then
$$\lim_{P \to \infty} C_K(P) = E\left[\log \frac{\det\left(H_W^\dagger H_W + \frac{\sigma_W^2}{\alpha^2 \sigma_D^2} H_D^\dagger H_D\right)}{\det\left(H_W^\dagger H_W\right)}\right].$$
2) Suppose that $m_W < m_S$. Define
$$C_\infty(P) = E\left[\log\det\left(I_{m_D} + \frac{P}{m_S \sigma_D^2} H_D \left(I_{m_S} - H_W^\dagger \left(H_W H_W^\dagger\right)^{-1} H_W\right) H_D^\dagger\right)\right].$$
Then $\lim_{P \to \infty} C_K(P)/C_\infty(P) = 1$.

[Fig. 1. Key capacities $C_K$ (bits/channel symbol) of fast-fading MIMO wiretap channels with different numbers of source, destination, and eavesdropper antennas, plotted against $P/\sigma^2$ (dB) for $(m_S, m_D, m_W) = (1,1,1)$, $(2,1,1)$, $(2,2,2)$, and $(1,10,10)$. The eavesdropper's channel gain $\alpha^2 = 0$ dB, and $\sigma_D^2 = \sigma_W^2 = \sigma^2$.]

Proof: First fix $(\lambda_1, \lambda_2, \ldots, \lambda_{m_S}) = \left(\frac{P}{m_S}, \ldots, \frac{P}{m_S}\right)$, or equivalently $K_X = \frac{P}{m_S} I_{m_S}$, and consider the mapping $G$ defined in the proof of Lemma 3.2 as a function of $P$. Also define
$$\hat{f}(P) = \log\det\left(I_{m_D} + \frac{P}{m_S \sigma_D^2} H_D \left(I_{m_S} + \frac{\alpha^2 P}{m_S \sigma_W^2} H_W^\dagger H_W\right)^{-1} H_D^\dagger\right).$$
Thus $C_K(P) = E[\hat{f}(P)]$. It is not hard to check that for any $P < \tilde{P}$, $G(\tilde{P}) \ge G(P)$, which implies that $\det(G(\tilde{P})) \ge \det(G(P))$. Hence $\hat{f}$ is increasing in $P$. Since the elements of $H_W$ are i.i.d. continuous random variables, $\mathrm{rank}(H_W^\dagger H_W) = \mathrm{rank}(H_W H_W^\dagger) = \mathrm{rank}(H_W) = \min(m_S, m_W)$ w.p.1. Thus the matrix $H_W^\dagger H_W$ (resp. $H_W H_W^\dagger$) is invertible w.p.1 when $m_W \ge m_S$ (resp. $m_W < m_S$).

Now, consider the case of $m_W \ge m_S$. As in (21), we have
$$\hat{f}(P) = \log \frac{\det\left(\frac{m_S \sigma_W^2}{\alpha^2 P} I_{m_S} + H_W^\dagger H_W + \frac{\sigma_W^2}{\alpha^2 \sigma_D^2} H_D^\dagger H_D\right)}{\det\left(\frac{m_S \sigma_W^2}{\alpha^2 P} I_{m_S} + H_W^\dagger H_W\right)}.$$
Since $H_W^\dagger H_W$ is invertible w.p.1,
$$\lim_{P \to \infty} \hat{f}(P) = \log \frac{\det\left(H_W^\dagger H_W + \frac{\sigma_W^2}{\alpha^2 \sigma_D^2} H_D^\dagger H_D\right)}{\det\left(H_W^\dagger H_W\right)} \quad \text{w.p.1}.$$
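Part 1) of Corollary 3.1 can be checked per channel realization: $\hat f(P)$ is increasing in $P$ and converges to the stated $\log\det$ ratio. A NumPy sketch (ours, with $m_W \ge m_S$ and unit noise variances assumed):

```python
import numpy as np

rng = np.random.default_rng(4)
mS, mD, mW = 2, 2, 3          # m_W >= m_S, the setting of Part 1)
sigD2, sigW2, alpha2 = 1.0, 1.0, 1.0

HD = rng.standard_normal((mD, mS)) + 1j * rng.standard_normal((mD, mS))
HW = rng.standard_normal((mW, mS)) + 1j * rng.standard_normal((mW, mS))

def f_hat(P):
    """hat-f(P) for one fixed realization of (H_D, H_W)."""
    inner = np.linalg.inv(np.eye(mS)
                          + (alpha2 * P / (mS * sigW2)) * HW.conj().T @ HW)
    M = np.eye(mD) + (P / (mS * sigD2)) * HD @ inner @ HD.conj().T
    return np.log(np.linalg.det(M).real)

GW = HW.conj().T @ HW
limit = np.log((np.linalg.det(GW + (sigW2 / (alpha2 * sigD2))
                              * HD.conj().T @ HD)
                / np.linalg.det(GW)).real)

# Monotone in P, bounded by (and converging to) the limit of Part 1).
assert f_hat(1e4) <= f_hat(1e8) <= limit + 1e-6
assert abs(f_hat(1e8) - limit) < 1e-3
```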
Hence Part 1) of the corollary results from monotone convergence.

For the case of $m_W < m_S$, the matrix inversion formula allows us to instead write
$$\hat{f}(P) = \log\det\left(I_{m_D} + \frac{P}{m_S \sigma_D^2} H_D \left[I_{m_S} - H_W^\dagger \left(\frac{m_S \sigma_W^2}{\alpha^2 P} I_{m_W} + H_W H_W^\dagger\right)^{-1} H_W\right] H_D^\dagger\right).$$
Since $H_W H_W^\dagger$ is invertible w.p.1, we can also define
$$\hat{f}_\infty(P) = \log\det\left(I_{m_D} + \frac{P}{m_S \sigma_D^2} H_D \left[I_{m_S} - H_W^\dagger \left(H_W H_W^\dagger\right)^{-1} H_W\right] H_D^\dagger\right).$$
Note that $C_\infty(P) = E[\hat{f}_\infty(P)]$. Since $H_W$ has rank $m_W$ w.p.1, it has the singular value decomposition $H_W = U_W [S_W\ 0_{m_W \times (m_S - m_W)}] V_W^\dagger$, where $S_W = \mathrm{diag}(s_1, s_2, \ldots, s_{m_W})$ is a diagonal matrix whose diagonal elements are the positive singular values of $H_W$. Also let $V_W = [\tilde{V}_W\ \hat{V}_W]$, i.e., $\tilde{V}_W$ and $\hat{V}_W$ consist respectively of the first $m_W$ and the last $m_S - m_W$ columns of $V_W$. Employing the unitary property of $U_W$ and $V_W$, it is not hard to verify that
$$\hat{f}(P) = \log\det\left(I_{m_D} + \frac{P}{m_S \sigma_D^2} H_D \hat{V}_W \hat{V}_W^\dagger H_D^\dagger + H_D \tilde{V}_W \Lambda_W(P) \tilde{V}_W^\dagger H_D^\dagger\right) \qquad (22)$$
$$\hat{f}_\infty(P) = \log\det\left(I_{m_D} + \frac{P}{m_S \sigma_D^2} H_D \hat{V}_W \hat{V}_W^\dagger H_D^\dagger\right) \qquad (23)$$
where $\Lambda_W(P) = \frac{\sigma_W^2}{\alpha^2 \sigma_D^2}\left(\frac{m_S \sigma_W^2}{\alpha^2 P} I_{m_W} + S_W^2\right)^{-1}$. From (22) and (23), it is clear that $\hat{f}_\infty(P) \le \hat{f}(P)$. Further let $t(P) = \mathrm{tr}\left(H_D \tilde{V}_W \Lambda_W(P) \tilde{V}_W^\dagger H_D^\dagger\right)$. Since $t(P) I_{m_D} \ge H_D \tilde{V}_W \Lambda_W(P) \tilde{V}_W^\dagger H_D^\dagger$,
$$\hat{f}(P) \le \log\det\left([1 + t(P)] I_{m_D} + \frac{P}{m_S \sigma_D^2} H_D \hat{V}_W \hat{V}_W^\dagger H_D^\dagger\right) = m_D \log(1 + t(P)) + \log\det\left(I_{m_D} + \frac{P}{m_S \sigma_D^2 [1 + t(P)]} H_D \hat{V}_W \hat{V}_W^\dagger H_D^\dagger\right). \qquad (24)$$
Let $\mu_1, \mu_2, \ldots, \mu_j$ be the positive eigenvalues of $H_D \hat{V}_W \hat{V}_W^\dagger H_D^\dagger$. Note that $1 \le j \le \min(m_D, m_S - m_W)$, because the elements of $H_D$ are i.i.d. continuous random variables and are independent of the elements of $H_W$.
Hence, from (23), (24) and the fact that $\hat{f}_\infty(P) \le \hat{f}(P)$, we have
$$0 \le \hat{f}(P) - \hat{f}_\infty(P) \le m_D \log(1 + t(P)) + \log\left[\frac{\prod_{i=1}^{j}\left(1 + \frac{P \mu_i}{m_S \sigma_D^2 (1 + t(P))}\right)}{\prod_{i=1}^{j}\left(1 + \frac{P \mu_i}{m_S \sigma_D^2}\right)}\right] = m_D \log(1 + t(P)) + \sum_{i=1}^{j} \log\left[\frac{\frac{1}{1 + t(P)} + \frac{m_S \sigma_D^2}{P \mu_i}}{1 + \frac{m_S \sigma_D^2}{P \mu_i}}\right]. \qquad (25)$$
Now note that
$$\lim_{P \to \infty} t(P) = \frac{\sigma_W^2}{\alpha^2 \sigma_D^2} \mathrm{tr}\left(H_D \tilde{V}_W S_W^{-2} \tilde{V}_W^\dagger H_D^\dagger\right) = \frac{\sigma_W^2}{\alpha^2 \sigma_D^2} \mathrm{tr}\left([H_W^{-1} H_D^\dagger]^\dagger H_W^{-1} H_D^\dagger\right)$$
where $H_W^{-1}$ denotes the Moore–Penrose pseudo-inverse of $H_W$. Then (25) implies that
$$0 \le \liminf_{P \to \infty} [\hat{f}(P) - \hat{f}_\infty(P)] \le \limsup_{P \to \infty} [\hat{f}(P) - \hat{f}_\infty(P)] \le (m_D - j) \log\left(1 + \frac{\sigma_W^2}{\alpha^2 \sigma_D^2} \mathrm{tr}\left([H_W^{-1} H_D^\dagger]^\dagger H_W^{-1} H_D^\dagger\right)\right) \quad \text{w.p.1}.$$
Hence by Fatou's lemma, we get
$$0 \le \liminf_{P \to \infty} [C_K(P) - C_\infty(P)] \le \limsup_{P \to \infty} [C_K(P) - C_\infty(P)] \le E\left[(m_D - j) \log\left(1 + \frac{\sigma_W^2}{\alpha^2 \sigma_D^2} \mathrm{tr}\left([H_W^{-1} H_D^\dagger]^\dagger H_W^{-1} H_D^\dagger\right)\right)\right]. \qquad (26)$$
From (23), it is clear that $\hat{f}_\infty(P)$ increases without bound in $P$ w.p.1; hence $C_\infty(P)$ also increases without bound. Combining this fact with (26), we arrive at the conclusion of Part 2) of the corollary.

Part 1) of the corollary verifies the observation from Fig. 1 that the key capacity levels off as the SNR increases if the number of source antennas is no larger than the number of eavesdropper antennas. When the source has more antennas, Part 2) of the corollary suggests that the key capacity can grow without bound as $P$ increases, similarly to a MIMO fading channel with capacity $C_\infty(P)$. Note that the matrix $I_{m_S} - H_W^\dagger \left(H_W H_W^\dagger\right)^{-1} H_W$ in the expression that defines $C_\infty(P)$ is a projection matrix onto the orthogonal complement of the column space of $H_W^\dagger$, i.e., onto the null space of $H_W$. Thus $C_\infty(P)$ has the physical interpretation that the secret information is passed across the dimensions not observable by the eavesdropper.
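The projection interpretation is easy to verify numerically. The sketch below (ours) checks that $I_{m_S} - H_W^\dagger(H_W H_W^\dagger)^{-1}H_W$ is an orthogonal projection that annihilates $H_W$ and retains exactly $m_S - m_W$ dimensions:

```python
import numpy as np

rng = np.random.default_rng(5)
mS, mW = 4, 2                 # m_W < m_S, so H_W has a nontrivial null space

HW = rng.standard_normal((mW, mS)) + 1j * rng.standard_normal((mW, mS))
Pn = np.eye(mS) - HW.conj().T @ np.linalg.inv(HW @ HW.conj().T) @ HW

# Idempotent and Hermitian: an orthogonal projection.
assert np.allclose(Pn @ Pn, Pn)
assert np.allclose(Pn, Pn.conj().T)
# It annihilates the directions observed by the eavesdropper ...
assert np.allclose(HW @ Pn, 0)
# ... and keeps m_S - m_W dimensions for the secret.
assert round(np.trace(Pn).real) == mS - mW
```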
The most interesting aspect is that this mode of operation can be achieved even if neither the source nor the destination knows the channel matrix $H_W$. We note that the asymptotic behavior of the key capacity in the high-SNR regime summarized in Corollary 3.1 is similar to the idea of secrecy degrees of freedom introduced in [35]. The subtle difference here is that no up-to-date CSI of the destination channel is needed at the source.

Another interesting observation from Fig. 1 is that for the case of $(m_S, m_D, m_W) = (1, 10, 10)$, the source power $P$ seems to have little effect on the key capacity. A small amount of source power is enough to get close to the limiting key capacity of about 1 bit per channel use. This observation is generalized below by Corollary 3.2, which characterizes the effect of spatial dimensionality of the destination and eavesdropper on the key capacity when both have a large number of antennas.

[Fig. 2. Key capacities $C_K$ (bits/channel symbol) of fast-fading MIMO wiretap channels with different numbers of source, destination, and eavesdropper antennas, plotted against $\alpha^2$ (dB) for $(m_S, m_D, m_W) = (1,1,1)$, $(2,1,1)$, and $(2,2,2)$. The source signal-to-noise ratio $P/\sigma^2 = 10$ dB, where $\sigma_D^2 = \sigma_W^2 = \sigma^2$.]

Corollary 3.2: When $m_D$ and $m_W$ approach infinity in such a way that $\lim_{m_D, m_W \to \infty} \frac{m_W}{m_D} = \beta$,
$$C_K \to m_S \log\left(1 + \frac{\sigma_W^2}{\beta \alpha^2 \sigma_D^2}\right).$$
Proof: This corollary is a direct consequence of the fact that $\frac{1}{m_D} H_D^\dagger H_D \to I_{m_S}$ and $\frac{1}{m_W} H_W^\dagger H_W \to I_{m_S}$ w.p.1, which is in turn due to the strong law of large numbers.

Note that we can interpret the ratio $\beta$ as the spatial dimensionality advantage of the eavesdropper over the destination. The expression for the limiting $C_K$ in the corollary clearly indicates that this spatial dimensionality advantage affects the key capacity in the same way as the channel gain advantage $\alpha^2$. In Fig.
2, the key capacities of several fast-fading MIMO channels with different numbers of source, destination, and eavesdropper antennas are plotted against the eavesdropper's channel gain advantage $\alpha^2$, with $P/\sigma^2 = 10$ dB. The results in Fig. 2 show the other effect of spatial dimensionality. We observe that the key capacity decreases almost reciprocally with $\alpha^2$ in the channels with $(m_S, m_D, m_W) = (1, 1, 1)$ and $(m_S, m_D, m_W) = (2, 2, 2)$, but stays almost constant for the channel with $(m_S, m_D, m_W) = (2, 1, 1)$. It seems that the relative numbers of source and eavesdropper antennas again play the main role in differentiating these two behaviors of the key capacity. To verify that, we evaluate the limiting value of $C_K$ as the gain advantage $\alpha^2$ of the eavesdropper becomes very large. To highlight the dependence of $C_K$ on $\alpha^2$, we use the notation $C_K(\alpha^2)$.

Corollary 3.3:
$$\lim_{\alpha \to \infty} C_K(\alpha^2) = \begin{cases} 0 & \text{if } m_W \ge m_S \\ C_\infty(P) & \text{if } m_W < m_S. \end{cases}$$
Proof: Similar to the proof of Corollary 3.1.

As in the case of large SNR, when the number of source antennas is larger than the number of eavesdropper antennas, secret information can be passed across the dimensions not observable by the eavesdropper. This can be achieved with neither the source nor the destination knowing the channel matrix $H_W$.

IV. ALTERNATIVE ACHIEVABILITY OF KEY CAPACITY

In this section, we provide an alternative proof of achievability for the key capacity, which does not require the transmission of continuous symbols over the public channel. We derive the result from "first principles," which provides more insight into the desirable structure of a practical key agreement scheme. The main steps of the key agreement procedure are the following: 1) the source sends a sequence of i.i.d.
symbols $X^n$; 2) the destination "quantizes" its received sequence $Y^n$ into $\hat{Y}^n$ with a Wyner–Ziv compression scheme; 3) the destination uses a binning scheme on the quantized symbol sequences to determine the secret key and the information to feed back to the source over the public channel; 4) the source exploits the information sent by the destination to reconstruct the destination's quantized sequence $\hat{Y}^n$ and uses the same binning scheme to generate its secret key. The secrecy of the resulting key is established by carefully structuring the binning scheme.

For the memoryless wiretap channel $(X, Y, Z)$ specified by the joint pdf $p(y|x) p(z|x) p(x)$, consider the quadruple $(X, Y, \hat{Y}, Z)$ defined by the joint pdf $p(x, y, \hat{y}, z) = p(\hat{y}|y) p(y|x) p(z|x) p(x)$, with $p(\hat{y}|y)$ to be specified later. We assume that $\hat{Y}$ takes values in the alphabet $\mathcal{Y}$. Given a sequence of $n$ elements $x^n = (x_1, x_2, \ldots, x_n)$, $p(x^n) = \prod_{j=1}^{n} p(x_j)$ unless otherwise specified. Similar notation and conventions apply to all other sequences, as well as their corresponding pdfs and conditional pdfs considered hereafter.

A. Random Code Generation

Choose $p(\hat{y}|y)$ such that $I(X; \hat{Y}) - I(\hat{Y}; Z) > 0$ and $I(\hat{Y}; Z) > 0$, and let $p(\hat{y})$ denote the corresponding marginal. Note that the existence of such a $p(\hat{y}|y)$ can be assumed without loss of generality if $I(X; Y) - I(Y; Z) > 0$ and $I(Y; Z) > 0$. If $I(X; Y) - I(Y; Z) = 0$, there is nothing to prove. Similarly, if $I(Y; Z) = 0$, the construction below can be trivially modified to show that $I(X; Y)$ is an achievable key rate. Fix a small $\varepsilon > 0$ (small enough that the various rate definitions and bounds on probabilities below make sense and are nontrivial).
Let us define
$$R_1 \triangleq I(Y; \hat{Y}) + 4\varepsilon, \quad R_2 \triangleq I(Y; \hat{Y}) - I(X; \hat{Y}) + 22\varepsilon, \quad R_3 \triangleq I(X; \hat{Y}) - I(\hat{Y}; Z) - \varepsilon, \quad R_4 \triangleq I(\hat{Y}; Z) - 17\varepsilon. \qquad (27)$$
For each $j = 1, 2, \ldots, 2^{nR_2}$ and $l = 1, 2, \ldots, 2^{nR_3}$, generate $2^{nR_4}$ codewords $\hat{Y}^n(j, l, 1), \hat{Y}^n(j, l, 2), \ldots, \hat{Y}^n(j, l, 2^{nR_4})$ according to $p(\hat{y}^n)$. The set of codewords $\{\hat{Y}^n(j, l, k)\}$ with $k = 1, \ldots, 2^{nR_4}$ forms a subcode denoted by $\mathcal{C}(j, l)$. The union of all subcodes $\mathcal{C}(j, l)$ for $j = 1, 2, \ldots, 2^{nR_2}$ and $l = 1, 2, \ldots, 2^{nR_3}$ forms the code $\mathcal{C}$. For convenience, we denote the $2^{nR_1}$ codewords in $\mathcal{C}$ as $\hat{Y}^n(1), \hat{Y}^n(2), \ldots, \hat{Y}^n(2^{nR_1})$, where $\hat{Y}^n(j + (l-1)2^{nR_2} + (w-1)2^{n(R_2 + R_3)}) = \hat{Y}^n(j, l, w)$ for $j = 1, 2, \ldots, 2^{nR_2}$, $l = 1, 2, \ldots, 2^{nR_3}$, and $w = 1, 2, \ldots, 2^{nR_4}$. The code $\mathcal{C}$ and its subcodes $\mathcal{C}(j, l)$ are revealed to the source, destination, and eavesdropper. In the following, we refer to a codeword and its index in $\mathcal{C}$ interchangeably. Under this convention, the subcode $\mathcal{C}(j, l)$ is also the set that contains all the indices of its codewords. Denote $\hat{\mathcal{C}}(j) = \bigcup_{l=1}^{2^{nR_3}} \mathcal{C}(j, l)$ and $\tilde{\mathcal{C}}(l) = \bigcup_{j=1}^{2^{nR_2}} \mathcal{C}(j, l)$.

B. Secret Sharing Procedure

For convenience, we define the joint typicality indicator function $T_\varepsilon(\cdot)$ that takes a number of sequences as its arguments. The value of $T_\varepsilon(\cdot)$ is 1 if the sequences are $\varepsilon$-jointly typical, and 0 otherwise. Further define the indicator function for the sequence pair $(y^n, \hat{y}^n)$:
$$S_\varepsilon(y^n, \hat{y}^n) = \begin{cases} 1 & \text{if } \Pr\{T_\varepsilon(X^n, y^n, \hat{y}^n, Z^n) = 1\} \ge 1 - \varepsilon \\ 0 & \text{otherwise} \end{cases}$$
where $(X^n, Z^n)$ is distributed according to $p(x^n, z^n \mid y^n, \hat{y}^n)$ in the definition above.

The source generates a random sequence $X^n$ distributed according to $p(x^n)$. If $X^n$ satisfies the average power constraint (1), the source sends $X^n$ through the $(X, Y, Z)$ channel.
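The double-binning bookkeeping above is a mixed-radix index: since $R_2 + R_3 + R_4 = R_1$ by (27), the triple $(j, l, w)$ and the flat codeword index $m$ are in one-to-one correspondence. A small sketch (ours; tiny illustrative sizes stand in for $2^{nR_2}$, $2^{nR_3}$, $2^{nR_4}$):

```python
# Hypothetical small bin counts standing in for 2^{nR2}, 2^{nR3}, 2^{nR4}.
N2, N3, N4 = 4, 8, 3

def to_flat(j, l, w):
    """Flat codeword index m for codeword (j, l, w), 1-indexed as in the text."""
    return j + (l - 1) * N2 + (w - 1) * N2 * N3

def from_flat(m):
    """Recover (feedback bin j, key index l, refinement w) from m."""
    m0 = m - 1
    j = m0 % N2 + 1
    l = (m0 // N2) % N3 + 1
    w = m0 // (N2 * N3) + 1
    return j, l, w

# The map is a bijection between {1..N2}x{1..N3}x{1..N4} and {1..N2*N3*N4},
# mirroring 2^{nR1} = 2^{nR2} * 2^{nR3} * 2^{nR4}, i.e., R1 = R2 + R3 + R4.
seen = set()
for j in range(1, N2 + 1):
    for l in range(1, N3 + 1):
        for w in range(1, N4 + 1):
            m = to_flat(j, l, w)
            assert from_flat(m) == (j, l, w)
            seen.add(m)
assert seen == set(range(1, N2 * N3 * N4 + 1))
```

In the protocol, the destination publishes only the $j$ coordinate of its quantized codeword and keeps the $l$ coordinate as the key; the $w$ coordinate is the extra randomization that protects the key from the eavesdropper.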
Otherwise, it ends the secret-sharing process. Since $p(x)$ satisfies $E[|X|^2] \le P$, the law of large numbers implies that the probability of the latter event can be made arbitrarily small by increasing $n$. Hence we can assume below, with no loss of generality, that $X^n$ satisfies (1) and is sent by the source. This assumption helps to make the probability calculations in Section IV-C less tedious.

Upon reception of the sequence $Y^n$, the destination tries to quantize the received sequence. Let $M$ be the output of its quantizer. Specifically, if there is a unique sequence $\hat{Y}^n(m) \in \mathcal{C}$ for some $m \in \{1, 2, \ldots, 2^{nR_1}\}$ such that $S_\varepsilon(Y^n, \hat{Y}^n(m)) = 1$, then it sets the output of the quantizer to $M = m$. If there is more than one such sequence, $M$ is set to the smallest such sequence index $m$. If there is no such sequence, it sets $M = 0$. Let $L$ and $J$ be the unique indices such that $\hat{Y}^n(M) \in \mathcal{C}(J, L)$. The index $L$ will be used as the key, while the index $J$ is fed back to the source over the public channel, i.e., $\Psi_k = J$. If $M = 0$, set $J = 0$ and choose $L$ uniformly at random over $\{1, 2, \ldots, 2^{nR_3}\}$.

After receiving the feedback information $J$ via the public channel, the source attempts to find a unique $\hat{Y}^n(m) \in \mathcal{C}$ such that $T_\varepsilon(X^n, \hat{Y}^n(m)) = 1$ and $m \in \hat{\mathcal{C}}(J)$. If there is such a unique $\hat{Y}^n(m)$, the source decodes $\hat{M} = m$. If there is no such sequence or more than one such sequence, the source sets $\hat{M} = 0$. If $J = 0$, it sets $\hat{M} = 0$. Finally, if $\hat{M} > 0$, the source generates its key $K = k$ such that $\hat{M} \in \mathcal{C}(J, k)$. If $\hat{M} = 0$, it sets $K = 0$.

We also consider a fictitious receiver who observes the sequence $Z^n$ and obtains both indices $J$ and $L$ via the public channel. This receiver sets $\tilde{M} = 0$ if $J = 0$. Otherwise, it attempts to find a unique $\hat{Y}^n(m) \in \mathcal{C}$ such that $T_\varepsilon(\hat{Y}^n(m), Z^n) = 1$ and $m \in \mathcal{C}(J, L)$.
If there is such a unique $\hat{Y}^n(m)$, the receiver decodes $\tilde{M} = m$. If there is no such sequence or more than one such sequence, the receiver sets $\tilde{M} = 0$.

C. Analysis of Probability of Error

We use a random coding argument to establish the existence of a code with rates given by (27) such that $\Pr\{K \neq L\}$ and $\Pr\{M \neq \tilde{M}\}$ vanish in the limit of large block length $n$. We note that the probabilities of the events below, except where otherwise stated, are over the joint distribution of the codebook $\mathcal{C}$, codewords, and all other random quantities involved.

Before we proceed, we introduce the following lemma regarding the indicator function $S_\varepsilon$.

Lemma 4.1:
1) If $(Y^n, \hat{Y}^n)$ is distributed according to $p(y^n, \hat{y}^n)$, then $\Pr\{S_\varepsilon(Y^n, \hat{Y}^n) = 1\} > 1 - \varepsilon$ for sufficiently large $n$.
2) If $\hat{Y}^n$ is distributed according to $p(\hat{y}^n)$, then $\Pr\{S_\varepsilon(y^n, \hat{Y}^n) = 1\} \le \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon}$ for all $y^n$.
3) If $Y^n$ is distributed according to $p(y^n)$, then $\Pr\{S_\varepsilon(Y^n, \hat{y}^n) = 1\} \le \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon}$ for all $\hat{y}^n$.
4) If $(Y^n, \hat{Y}^n)$ is distributed according to $p(y^n) p(\hat{y}^n)$, then $\Pr\{S_\varepsilon(Y^n, \hat{Y}^n) = 1\} > (1 - \varepsilon) \cdot 2^{-n(R_1 - \varepsilon)}$ for sufficiently large $n$.

Proof:
1) This claim is shown in [36]. We briefly sketch the proof here, using our notation, for completeness and easy reference. By the reverse Markov inequality [36],
$$\Pr\{S_\varepsilon(Y^n, \hat{Y}^n) = 1\} \ge 1 - \frac{1 - \Pr\{T_\varepsilon(X^n, Y^n, \hat{Y}^n, Z^n) = 1\}}{1 - (1 - \varepsilon)} > 1 - \varepsilon$$
where the second inequality is due to the fact that $\Pr\{T_\varepsilon(X^n, Y^n, \hat{Y}^n, Z^n) = 1\} > 1 - \varepsilon^2$ for sufficiently large $n$.
2) First, we only need to consider typical $y^n$, since the bound is trivial when $y^n$ is not typical.
Notice that for any such $y^n$,
$$1 \ge \int T_\varepsilon(x^n, y^n, \hat{y}^n, z^n)\, p(x^n, \hat{y}^n, z^n \mid y^n)\, dx^n\, dz^n\, d\hat{y}^n = \int \Pr\{T_\varepsilon(X^n, y^n, \hat{y}^n, Z^n) = 1\} \cdot \frac{p(y^n, \hat{y}^n)}{p(y^n)}\, d\hat{y}^n \ge \int \Pr\{T_\varepsilon(X^n, y^n, \hat{y}^n, Z^n) = 1\} \cdot \frac{2^{-n(h(Y, \hat{Y}) + \varepsilon)}}{2^{-n(h(Y) - \varepsilon)}}\, d\hat{y}^n = 2^{-n(h(\hat{Y}|Y) + 2\varepsilon)} \int \Pr\{T_\varepsilon(X^n, y^n, \hat{y}^n, Z^n) = 1\}\, d\hat{y}^n.$$
Hence
$$2^{n(h(\hat{Y}|Y) + 2\varepsilon)} \ge \int \Pr\{T_\varepsilon(X^n, y^n, \hat{y}^n, Z^n) = 1\}\, d\hat{y}^n \ge \int S_\varepsilon(y^n, \hat{y}^n) \cdot \Pr\{T_\varepsilon(X^n, y^n, \hat{y}^n, Z^n) = 1\}\, d\hat{y}^n \ge (1 - \varepsilon) \int S_\varepsilon(y^n, \hat{y}^n)\, d\hat{y}^n. \qquad (28)$$
Now
$$\Pr\{S_\varepsilon(y^n, \hat{Y}^n) = 1\} = \int S_\varepsilon(y^n, \hat{y}^n)\, p(\hat{y}^n)\, d\hat{y}^n \le \int S_\varepsilon(y^n, \hat{y}^n)\, 2^{-n(h(\hat{Y}) - \varepsilon)}\, d\hat{y}^n \le \frac{2^{-n(I(Y; \hat{Y}) - 3\varepsilon)}}{1 - \varepsilon},$$
where the last inequality is due to (28).
3) Same as Part 2), interchanging the roles of $y^n$ and $\hat{y}^n$.
4) From Part 1), we get
$$1 - \varepsilon < \int S_\varepsilon(y^n, \hat{y}^n)\, p(y^n, \hat{y}^n)\, dy^n\, d\hat{y}^n = \int S_\varepsilon(y^n, \hat{y}^n)\, \frac{p(y^n, \hat{y}^n)}{p(y^n) p(\hat{y}^n)}\, p(y^n) p(\hat{y}^n)\, dy^n\, d\hat{y}^n \le \int S_\varepsilon(y^n, \hat{y}^n) \cdot \frac{2^{-n(h(Y, \hat{Y}) - \varepsilon)}}{2^{-n(h(Y) + \varepsilon)} \cdot 2^{-n(h(\hat{Y}) + \varepsilon)}} \cdot p(y^n) p(\hat{y}^n)\, dy^n\, d\hat{y}^n = 2^{n(I(Y; \hat{Y}) + 3\varepsilon)} \Pr\{S_\varepsilon(Y^n, \hat{Y}^n) = 1\}.$$

Moreover, we need to bound the probabilities of the following events pertaining to $M$.

Lemma 4.2:
1) $\Pr\{M = 0\} < 2\varepsilon$ for sufficiently large $n$.
2) For $m = 1, 2, \ldots, 2^{nR_1}$, $\Pr\{M = m\} \le \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon}$.
3) When $n$ is sufficiently large, $\Pr\{M = m\} \ge \left[1 - \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon}\right]^{m-1} \cdot (1 - \varepsilon) 2^{-n(R_1 - \varepsilon)}$ uniformly for all $m = 1, 2, \ldots, 2^{nR_1}$.
4) When $n$ is sufficiently large, $\Pr\{J = j, L = l\} > (1 - \varepsilon)^4 \cdot 2^{-n(R_1 - R_4 + 6\varepsilon)}$ uniformly for all $j = 1, 2, \ldots, 2^{nR_2}$ and $l = 1, 2, \ldots, 2^{nR_3}$.

Proof:
1) We will use an argument similar to the one in the achievability proof of the rate distortion function in [27, Section 10.5] to bound $\Pr\{M = 0\}$.
First note that $\{M = 0\}$ is the event that $S_\varepsilon(Y^n, \hat{Y}^n(m)) = 0$ for all $m \in \{1, 2, \ldots, 2^{nR_1}\}$, and hence
$$\Pr\{M = 0\} = \Pr\left\{\bigcap_{m=1}^{2^{nR_1}} \{S_\varepsilon(Y^n, \hat{Y}^n(m)) = 0\}\right\} = \int \left[\Pr\{S_\varepsilon(y^n, \hat{Y}^n(1)) = 0\}\right]^{2^{nR_1}} p(y^n)\, dy^n, \qquad (29)$$
where the second equality is due to the fact that $\hat{Y}^n(1), \ldots, \hat{Y}^n(2^{nR_1})$ are i.i.d. given each fixed $y^n$. But
$$\left[\Pr\{S_\varepsilon(y^n, \hat{Y}^n(1)) = 0\}\right]^{2^{nR_1}} = \left[1 - \int S_\varepsilon(y^n, \hat{y}^n)\, p(\hat{y}^n)\, d\hat{y}^n\right]^{2^{nR_1}} = \left[1 - \int S_\varepsilon(y^n, \hat{y}^n)\, p(\hat{y}^n \mid y^n)\, \frac{p(y^n) p(\hat{y}^n)}{p(y^n, \hat{y}^n)}\, d\hat{y}^n\right]^{2^{nR_1}} \le \left[1 - \int S_\varepsilon(y^n, \hat{y}^n)\, p(\hat{y}^n \mid y^n)\, \frac{2^{-n(h(Y) + \varepsilon)} \cdot 2^{-n(h(\hat{Y}) + \varepsilon)}}{2^{-n(h(Y, \hat{Y}) - \varepsilon)}}\, d\hat{y}^n\right]^{2^{nR_1}} = \left[1 - 2^{-n(I(Y; \hat{Y}) + 3\varepsilon)} \int S_\varepsilon(y^n, \hat{y}^n)\, p(\hat{y}^n \mid y^n)\, d\hat{y}^n\right]^{2^{nR_1}} \le 1 - \int S_\varepsilon(y^n, \hat{y}^n)\, p(\hat{y}^n \mid y^n)\, d\hat{y}^n + \exp(-2^{n\varepsilon}), \qquad (30)$$
where the inequality on the third line is due to the fact that $S_\varepsilon(y^n, \hat{y}^n) = 1$ implies $T_\varepsilon(y^n, \hat{y}^n) = 1$, and the last line results from the inequality $(1 - xy)^k \le 1 - x + e^{-ky}$ for all $0 \le x, y \le 1$ and positive integers $k$ [27, Lemma 10.5.3]. Substituting (30) back into (29) and using Lemma 4.1 Part 1), we get
$$\Pr\{M = 0\} \le 1 - \Pr\{S_\varepsilon(Y^n, \hat{Y}^n) = 1\} + \exp(-2^{n\varepsilon}) < \varepsilon + \varepsilon = 2\varepsilon$$
for sufficiently large $n$.
2) Notice that for $m = 1, 2, \ldots, 2^{nR_1}$,
$$\Pr\{M = m\} = \Pr\{S_\varepsilon(Y^n, \hat{Y}^n(m)) = 1, S_\varepsilon(Y^n, \hat{Y}^n(m-1)) = 0, \ldots, S_\varepsilon(Y^n, \hat{Y}^n(1)) = 0\} = \int \Pr\{S_\varepsilon(y^n, \hat{Y}^n(1)) = 1\} \left[\Pr\{S_\varepsilon(y^n, \hat{Y}^n(1)) = 0\}\right]^{m-1} p(y^n)\, dy^n \qquad (31)$$
where the second equality results from the i.i.d. nature of $\hat{Y}^n(1), \ldots, \hat{Y}^n(m)$. Thus we have
$$\Pr\{M = m\} \le \Pr\{S_\varepsilon(Y^n, \hat{Y}^n(1)) = 1\} \le \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon},$$
where the last inequality is due to Part 2) of Lemma 4.1, since $Y^n$ and $\hat{Y}^n(1)$ are independent.
3) From (31), we have the lower bound
$$\Pr\{M = m\} \ge \left[1 - \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon}\right]^{m-1} \Pr\{S_\varepsilon(Y^n, \hat{Y}^n(1)) = 1\} \ge \left[1 - \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon}\right]^{m-1} \cdot (1 - \varepsilon) 2^{-n(R_1 - \varepsilon)}$$
where the first inequality is due to Part 2) of Lemma 4.1, and the second inequality is from Part 4) of Lemma 4.1 when $n$ is sufficiently large. Note that the same sufficiently large $n$ is enough to guarantee the validity of the lower bound above for all $m = 1, 2, \ldots, 2^{nR_1}$.
4) First note that, for $j = 1, 2, \ldots, 2^{nR_2}$ and $l = 1, 2, \ldots, 2^{nR_3}$,
$$\Pr\{J = j, L = l\} = \sum_{m \in \mathcal{C}(j,l)} \Pr\{M = m\} = \sum_{w=1}^{2^{nR_4}} \Pr\left\{M = j + (l-1)2^{nR_2} + (w-1)2^{n(R_2 + R_3)}\right\}.$$
Thus applying Part 3) of the lemma, we get
$$\Pr\{J = j, L = l\} \ge (1 - \varepsilon) 2^{-n(R_1 - \varepsilon)} \sum_{w=1}^{2^{nR_4}} \left[1 - \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon}\right]^{j - 1 + (l-1)2^{nR_2} + (w-1)2^{n(R_2 + R_3)}} \ge (1 - \varepsilon) 2^{-n(R_1 - \varepsilon)} \left[1 - \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon}\right]^{2^{n(R_2 + R_3)}} \frac{1 - \left[1 - 2^{-n(R_1 - 7\varepsilon)}/(1 - \varepsilon)\right]^{2^{nR_1}}}{1 - \left[1 - 2^{-n(R_1 - 7\varepsilon)}/(1 - \varepsilon)\right]^{2^{n(R_2 + R_3)}}} \ge (1 - \varepsilon) 2^{-n(R_1 - \varepsilon)} \left[1 - \frac{2^{-n(R_4 - 7\varepsilon)}}{1 - \varepsilon}\right] \cdot \frac{1 - \left[1 - 2^{-n(R_1 - 7\varepsilon)}/(1 - \varepsilon)\right]^{2^{nR_1}}}{1 - \left[1 - 2^{-n(R_4 - 7\varepsilon)}/(1 - \varepsilon)\right]} \ge (1 - \varepsilon)^2 \cdot 2^{-n(R_1 - R_4 + 6\varepsilon)} \left[1 - \frac{2^{-n(R_4 - 7\varepsilon)}}{1 - \varepsilon}\right] \left[1 - \frac{\exp(-2^{7n\varepsilon})}{1 - \varepsilon}\right] > (1 - \varepsilon)^4 \cdot 2^{-n(R_1 - R_4 + 6\varepsilon)} \qquad (32)$$
uniformly for all $j = 1, 2, \ldots, 2^{nR_2}$ and $l = 1, 2, \ldots, 2^{nR_3}$, when $n$ is sufficiently large. In (32), the bound introducing the factor $1 - \frac{2^{-n(R_4 - 7\varepsilon)}}{1 - \varepsilon}$ is obtained from the inequality $(1 - x)^k \ge 1 - kx$ for any $0 \le x \le 1$ and positive integer $k$, while the bound introducing the factor $1 - \frac{\exp(-2^{7n\varepsilon})}{1 - \varepsilon}$ is in turn based on the inequality $(1 - x)^k \le e^{-kx}$ for $0 \le x \le 1$ and positive integer $k$.

We first consider the error event $\{K \neq L\}$.
Note that
$$\Pr\{K \neq L\} = \Pr\{M = 0\} + \Pr\{M > 0, K \neq L\} = \Pr\{M = 0\} + \sum_{m=1}^{2^{nR_1}} \Pr\{\tilde{\mathcal{E}}_m \cup \mathcal{E}_m, M = m\} \le \Pr\{M = 0\} + \sum_{m=1}^{2^{nR_1}} \Pr\{\tilde{\mathcal{E}}_m, M = m\} + \sum_{m=1}^{2^{nR_1}} \Pr\{\mathcal{E}_m, M = m\} \qquad (33)$$
where $\tilde{\mathcal{E}}_m$ is the event $\{T_\varepsilon(X^n, \hat{Y}^n(m)) = 0\}$, and $\mathcal{E}_m$ is the event that there is an $m' \in \hat{\mathcal{C}}(j)$ such that $m \in \hat{\mathcal{C}}(j)$, $m' \neq m$, and $T_\varepsilon(X^n, \hat{Y}^n(m')) = 1$. From (31), we have
$$\Pr\{\tilde{\mathcal{E}}_m, M = m\} = \Pr\{T_\varepsilon(X^n, \hat{Y}^n(m)) = 0, S_\varepsilon(Y^n, \hat{Y}^n(m)) = 1, S_\varepsilon(Y^n, \hat{Y}^n(m-1)) = 0, \ldots, S_\varepsilon(Y^n, \hat{Y}^n(1)) = 0\} \le \Pr\{T_\varepsilon(X^n, Y^n, \hat{Y}^n(m), Z^n) = 0, S_\varepsilon(Y^n, \hat{Y}^n(m)) = 1, S_\varepsilon(Y^n, \hat{Y}^n(m-1)) = 0, \ldots, S_\varepsilon(Y^n, \hat{Y}^n(1)) = 0\} = \int \left[\int \Pr\{T_\varepsilon(x^n, y^n, \hat{Y}^n(m), z^n) = 0, S_\varepsilon(y^n, \hat{Y}^n(m)) = 1\}\, p(x^n, z^n \mid y^n)\, dx^n\, dz^n\right] \prod_{m'=1}^{m-1} \Pr\{S_\varepsilon(y^n, \hat{Y}^n(m')) = 0\}\, p(y^n)\, dy^n = \int \left[\int \left(\int [1 - T_\varepsilon(x^n, y^n, \hat{y}^n, z^n)]\, p(x^n, z^n \mid y^n, \hat{y}^n)\, dx^n\, dz^n\right) S_\varepsilon(y^n, \hat{y}^n)\, p(\hat{y}^n)\, d\hat{y}^n\right] \prod_{m'=1}^{m-1} \Pr\{S_\varepsilon(y^n, \hat{Y}^n(m')) = 0\}\, p(y^n)\, dy^n \le \varepsilon \cdot \Pr\{S_\varepsilon(Y^n, \hat{Y}^n(m)) = 1, S_\varepsilon(Y^n, \hat{Y}^n(m-1)) = 0, \ldots, S_\varepsilon(Y^n, \hat{Y}^n(1)) = 0\} = \varepsilon \cdot \Pr\{M = m\}, \qquad (34)$$
where the equality on the fourth line is due to the i.i.d. nature of $\hat{Y}^n(1), \ldots, \hat{Y}^n(2^{nR_1})$, the equality on the fifth line results from the fact that $p(x^n, z^n \mid y^n) = p(x^n, z^n \mid y^n, \hat{y}^n)$ (since $(X, Z) \to Y \to \hat{Y}$), and the inequality on the second-to-last line follows from the definition of the indicator function $S_\varepsilon$.
Similarly, assuming $m \in \hat{\mathcal{C}}(j)$, we have from (31)
$$\Pr\{\mathcal{E}_m, M = m\} \le \sum_{\substack{m' \in \hat{\mathcal{C}}(j) \\ m' \neq m}} \Pr\{T_\varepsilon(X^n, \hat{Y}^n(m')) = 1, S_\varepsilon(Y^n, \hat{Y}^n(m)) = 1\} = \sum_{\substack{m' \in \hat{\mathcal{C}}(j) \\ m' \neq m}} \int \Pr\{T_\varepsilon(x^n, \hat{Y}^n(m')) = 1\} \cdot \Pr\{S_\varepsilon(y^n, \hat{Y}^n(m)) = 1\}\, p(x^n, y^n)\, dx^n\, dy^n \le 2^{n(R_1 - R_2)} \cdot 2^{-n(I(X; \hat{Y}) - 3\varepsilon)} \cdot \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon} = \frac{2^{-n(R_1 + 8\varepsilon)}}{1 - \varepsilon}, \qquad (35)$$
where the equality on the second line is due to the independence of $\hat{Y}^n(m')$ and $\hat{Y}^n(m)$, and the last inequality results from Part 2) of Lemma 4.1 and the bound $\Pr\{T_\varepsilon(x^n, \hat{Y}^n(m')) = 1\} \le 2^{-n(I(X; \hat{Y}) - 3\varepsilon)}$, which is a direct result of [27, Theorem 15.2.2]. Hence, substituting the bounds in (34) and (35) back into (33) and using Part 1) of Lemma 4.2, we obtain
$$\Pr\{K \neq L\} \le 2\varepsilon + \varepsilon \sum_{m=1}^{2^{nR_1}} \Pr\{M = m\} + \sum_{m=1}^{2^{nR_1}} \frac{2^{-n(R_1 + 8\varepsilon)}}{1 - \varepsilon} \le 2\varepsilon + \varepsilon + \frac{2^{-8n\varepsilon}}{1 - \varepsilon} < 4\varepsilon \qquad (36)$$
for sufficiently large $n$.

Next we consider the event $\{M \neq \tilde{M}\}$. Define $\tilde{\mathcal{F}}_m$ as the event $\{T_\varepsilon(\hat{Y}^n(m), Z^n) = 0\}$ and $\mathcal{F}_m$ as the event that there is an $m' \in \mathcal{C}(j, l)$ such that $m \in \mathcal{C}(j, l)$, $m' \neq m$, and $T_\varepsilon(\hat{Y}^n(m'), Z^n) = 1$. Then we have, when $n$ is sufficiently large, uniformly for all $j = 1, 2, \ldots, 2^{nR_2}$ and $l = 1, 2, \ldots, 2^{nR_3}$,
$$\Pr\{\tilde{M} \neq M \mid J = j, L = l\} \le \sum_{m \in \mathcal{C}(j,l)} \Pr\{\tilde{\mathcal{F}}_m, M = m \mid J = j, L = l\} + \sum_{m \in \mathcal{C}(j,l)} \Pr\{\mathcal{F}_m, M = m \mid J = j, L = l\} \le \sum_{m \in \mathcal{C}(j,l)} \varepsilon \cdot \Pr\{M = m \mid J = j, L = l\} + \sum_{m \in \mathcal{C}(j,l)} \frac{2^{-n(R_1 + 7\varepsilon)}}{1 - \varepsilon} \cdot \frac{1}{\Pr\{J = j, L = l\}} \le \varepsilon + \frac{2^{-n(R_1 + 7\varepsilon)}}{1 - \varepsilon} \cdot \frac{2^{nR_4}}{(1 - \varepsilon)^4 \cdot 2^{-n(R_1 - R_4 + 6\varepsilon)}} = \varepsilon + \frac{2^{-n\varepsilon}}{(1 - \varepsilon)^5} < 2\varepsilon. \qquad (37)$$
Note that the inequality on the third line of (37) results from upper bounds on $\Pr\{\tilde{\mathcal{F}}_m, M = m\}$ and $\Pr\{\mathcal{F}_m, M = m\}$, which can be obtained in ways almost identical to the derivations in (34) and (35), respectively.
The inequality on the fourth line is, on the other hand, due to Part 4) of Lemma 4.2. By expurgating the random code ensemble, we obtain the following lemma.

Lemma 4.3: For any $\varepsilon > 0$ and $n$ sufficiently large, there exists a code $\mathcal{C}_n$ with the rates $R_1$, $R_2$, $R_3$, and $R_4$ given by (27) such that
1) $\Pr\{K \neq L \mid \mathcal{C} = \mathcal{C}_n\} < 8\varepsilon$,
2) $\Pr\{M \neq \tilde{M} \mid \mathcal{C} = \mathcal{C}_n\} < 8\varepsilon$,
3) $\Pr\{M = m \mid \mathcal{C} = \mathcal{C}_n\} \le \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon}$ for all $m = 1, 2, \ldots, 2^{nR_1}$, and
4) $\Pr\{L = l \mid \mathcal{C} = \mathcal{C}_n\} < 2^{-n(R_3 - 8\varepsilon)}$ for all $l = 1, 2, \ldots, 2^{nR_3}$.

Proof: Combining Part 1) of Lemma 4.2, (36), and (37), we have $\Pr\{M = 0\} + \Pr\{K \neq L\} + \Pr\{M \neq \tilde{M}\} < 8\varepsilon$ for sufficiently large $n$. This implies that there must exist a $\mathcal{C}_n$ satisfying $\Pr\{K \neq L \mid \mathcal{C} = \mathcal{C}_n\} < 8\varepsilon$, $\Pr\{M \neq \tilde{M} \mid \mathcal{C} = \mathcal{C}_n\} < 8\varepsilon$, and $\Pr\{M = 0 \mid \mathcal{C} = \mathcal{C}_n\} < 8\varepsilon$. Thus, Parts 1) and 2) are proved.

Now, fix this $\mathcal{C}_n$. For $m = 1, 2, \ldots, 2^{nR_1}$, let $\hat{y}^n(m)$ be the $m$th codeword of $\mathcal{C}_n$. Then, by Part 3) of Lemma 4.1,
$$\Pr\{M = m \mid \mathcal{C} = \mathcal{C}_n\} \le \Pr\{S_\varepsilon(Y^n, \hat{y}^n(m)) = 1\} \le \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon};$$
hence, Part 3) results. Note that, for $l = 1, 2, \ldots, 2^{nR_3}$,
$$\Pr\{L = l \mid \mathcal{C} = \mathcal{C}_n\} = \Pr\{L = l \mid M = 0, \mathcal{C} = \mathcal{C}_n\} \Pr\{M = 0 \mid \mathcal{C} = \mathcal{C}_n\} + \Pr\{L = l, M > 0 \mid \mathcal{C} = \mathcal{C}_n\}. \qquad (38)$$
We know from the discussion above that $\Pr\{L = l \mid M = 0, \mathcal{C} = \mathcal{C}_n\} \Pr\{M = 0 \mid \mathcal{C} = \mathcal{C}_n\} < 2^{-nR_3} \cdot 8\varepsilon$. Also from Part 3) of the lemma,
$$\Pr\{L = l, M > 0 \mid \mathcal{C} = \mathcal{C}_n\} = \sum_{m \in \tilde{\mathcal{C}}_n(l)} \Pr\{M = m \mid \mathcal{C} = \mathcal{C}_n\} \le 2^{n(R_1 - R_3)} \cdot \frac{2^{-n(R_1 - 7\varepsilon)}}{1 - \varepsilon} = \frac{2^{-n(R_3 - 7\varepsilon)}}{1 - \varepsilon}.$$
Putting these back into (38), we get
$$\Pr\{L = l \mid \mathcal{C} = \mathcal{C}_n\} < 2^{-n(R_3 - 7\varepsilon)} \left[8\varepsilon \cdot 2^{-7n\varepsilon} + \frac{1}{1 - \varepsilon}\right] < 2^{-n(R_3 - 8\varepsilon)}$$
for sufficiently large $n$. Thus, Part 4) is proved.

In the remainder of the paper, we use a fixed code $\mathcal{C}_n$ identified by Lemma 4.3. For convenience, we drop the conditioning on $\mathcal{C}_n$.

D. Secrecy Analysis

First we proceed to bound $H(K)$.
Note that
\[
H(K) = H(L) + H(K \mid L) - H(L \mid K) \ge H(L) - H(L \mid K).
\tag{39}
\]
Using Part 1) of Lemma 4.3 together with Fano's inequality gives $H(L \mid K) \le 1 + 8n\varepsilon R_3$. Moreover, Part 4) of Lemma 4.3 implies that $H(L) > n(R_3 - 8\varepsilon)$. Putting these bounds back into (39), we have
\[
R_3 - (8R_3 + 8)\varepsilon - \frac{1}{n} < \frac{1}{n} H(K) \le R_3.
\tag{40}
\]
Next we bound $I(K; Z^n, J)$. Note that
\[
\begin{aligned}
I(K; Z^n, J) &= I(L; Z^n, J) + I(K; Z^n, J \mid L) - I(L; Z^n, J \mid K) \\
&\le I(L; Z^n, J) + I(K; Z^n, J \mid L) \\
&\le I(L; Z^n, J) + H(K \mid L) \\
&\le I(L; Z^n, J) + 8n\varepsilon R_3 + 1,
\end{aligned}
\tag{41}
\]
where the last inequality is obtained from Part 1) of Lemma 4.3 and Fano's inequality as before. In addition, it holds that
\[
\begin{aligned}
I(L; Z^n, J) &= H(L) - H(L \mid Z^n, J) \\
&= H(L) - H(L, J \mid Z^n) + H(J \mid Z^n) \\
&= H(L) + H(J \mid Z^n) - H(L, J, M \mid Z^n) + H(M \mid Z^n, L, J) \\
&\le H(L) + H(J) - H(M \mid Z^n) - H(L, J \mid M, Z^n) + H(M \mid Z^n, L, J) \\
&\le H(L) + H(J) + I(M; Z^n) - H(M) + 8nR_1\varepsilon + 1,
\end{aligned}
\]
where the second-to-last inequality follows from $H(J \mid Z^n) \le H(J)$, and the last inequality follows from $H(L, J \mid M, Z^n) = 0$ (by the definitions of $J$ and $L$) and $H(M \mid Z^n, L, J) \le 1 + 8nR_1\varepsilon$ (by Fano's inequality applied to the fictitious receiver). By construction of the code $\mathcal{C}_n$, it holds that $H(L) \le nR_3$ and $H(J) \le nR_2$. In addition, Part 3) of Lemma 4.3 implies $H(M) \ge n(R_1 - 8\varepsilon)$. Finally, note that $I(M; Z^n) \le I(Y^n; Z^n) = nI(Y; Z)$ by the data-processing inequality applied to the Markov chain $\hat{Y}^n \to Y^n \to Z^n$ and the memoryless property of the channel between $Y^n$ and $Z^n$. Combining these observations and substituting the values of $R_1$, $R_2$, and $R_3$ given by (27) back into (41), we obtain
\[
\frac{1}{n} I(K; Z^n, J) \le R_2 + R_3 - R_1 + I(Y; Z) + (8R_1 + 8R_3 + 8)\varepsilon + \frac{2}{n}
\le I(Y; Z) - I(\hat{Y}; Z) + (8R_1 + 8R_3 + 9)\varepsilon
\]
when $n$ is sufficiently large.
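The lower bound in (40) comes from the regrouping $n(R_3 - 8\varepsilon) - (1 + 8n\varepsilon R_3) = n\bigl(R_3 - (8R_3 + 8)\varepsilon\bigr) - 1$. This identity can be spot-checked numerically; the parameter triples below are arbitrary samples chosen only to exercise the algebra.

```python
# Check that n*(R3 - 8*eps) - (1 + 8*n*eps*R3) regroups to
# n*(R3 - (8*R3 + 8)*eps) - 1, as used to pass from (39) to (40).

def lhs(n: float, R3: float, eps: float) -> float:
    # H(L) lower bound minus the Fano term, before regrouping
    return n * (R3 - 8 * eps) - (1 + 8 * n * eps * R3)

def rhs(n: float, R3: float, eps: float) -> float:
    # the regrouped form appearing in (40), scaled by n
    return n * (R3 - (8 * R3 + 8) * eps) - 1

for n, R3, eps in [(10, 0.5, 0.01), (1000, 2.0, 1e-4), (37, 1.3, 0.05)]:
    assert abs(lhs(n, R3, eps) - rhs(n, R3, eps)) < 1e-9
```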
Without any rate limitation on the public channel, we can choose the transition probability $p(\hat{y} \mid y)$ such that $I(Y; Z) - I(\hat{Y}; Z) \le \varepsilon$; therefore,
\[
\frac{1}{n} I(K; Z^n, J) \le (8R_1 + 8R_3 + 9)\varepsilon.
\tag{42}
\]
Since $\varepsilon > 0$ can be chosen arbitrarily, Part 1) of Lemma 4.3, (40), and (42) establish the achievability of the secret key rate $I(Y; X) - I(Y; Z)$.

V. Conclusion

We evaluated the key capacity of the fast-fading MIMO wiretap channel. We found that the spatial dimensionality provided by the use of multiple antennas at the source and destination can be employed to combat a channel-gain advantage of the eavesdropper over the destination. In particular, if the source has more antennas than the eavesdropper, then the channel-gain advantage of the eavesdropper can be completely overcome, in the sense that the key capacity does not vanish as the eavesdropper's channel-gain advantage becomes asymptotically large. This is the most interesting observation of this paper, as no eavesdropper CSI is needed at the source or destination to achieve the non-vanishing key capacity.

Acknowledgment

This work was supported by the National Science Foundation under grant number CNS-0626863 and by the Air Force Office of Scientific Research under grant number FA9550-07-10456. We would also like to thank Dr. Shlomo Shamai and the anonymous reviewers for their detailed comments and thoughtful suggestions. We are grateful to the reviewer who pointed out a significant oversight in the proof of Theorem 2.1 in the original version of the paper. We are also indebted to another reviewer who suggested the concavity argument in the proof of Lemma 3.2, which is much more elegant than our original one.

References

[1] A. Wyner, "The wire-tap channel," Bell Syst. Tech. J., vol. 54, pp. 1355–1387, Oct. 1975.
[2] S. Leung-Yan-Cheong and M. Hellman, "The Gaussian wire-tap channel," IEEE Trans. Inform. Theory, vol. 24, no. 4, pp. 451–456, Jul. 1978.
[3] I. Csiszár and J. Körner, "Broadcast channels with confidential messages," IEEE Trans. Inform. Theory, vol. 24, no. 3, pp. 339–348, May 1978.
[4] Y. Liang, H. Poor, and S. Shamai, "Secure communication over fading channels," IEEE Trans. Inform. Theory, vol. 54, no. 6, pp. 2470–2492, Jun. 2008.
[5] P. Gopala, L. Lai, and H. El Gamal, "On the secrecy capacity of fading channels," IEEE Trans. Inform. Theory, vol. 54, no. 10, pp. 4687–4698, Oct. 2008.
[6] M. Bloch, J. Barros, M. Rodrigues, and S. McLaughlin, "Wireless information-theoretic security," IEEE Trans. Inform. Theory, vol. 54, no. 6, pp. 2515–2534, Jun. 2008.
[7] A. Khisti, A. Tchamkerten, and G. Wornell, "Secure broadcasting over fading channels," IEEE Trans. Inform. Theory, vol. 54, no. 6, pp. 2453–2469, Jun. 2008.
[8] S. Shafiee, N. Liu, and S. Ulukus, "Towards the secrecy capacity of the Gaussian MIMO wire-tap channel: The 2-2-1 channel," IEEE Trans. Inform. Theory, vol. 55, no. 9, pp. 4033–4039, Sep. 2009.
[9] A. Khisti and G. Wornell, "The MIMOME channel," arXiv preprint arXiv:0710.1325, 2007.
[10] F. Oggier and B. Hassibi, "The secrecy capacity of the MIMO wiretap channel," in Proc. 45th Allerton Conf. Communication, Control and Computing, Sep. 2007, pp. 848–855.
[11] T. Liu and S. Shamai, "A note on the secrecy capacity of the multi-antenna wiretap channel," IEEE Trans. Inform. Theory, vol. 55, no. 6, pp. 2547–2553, Jun. 2009.
[12] R. Bustin, R. Liu, H. V. Poor, and S. Shamai, "An MMSE approach to the secrecy capacity of the MIMO Gaussian wiretap channel," in Proc. IEEE Int. Symp. Inform. Theory (ISIT 2009), Jul. 2009, pp. 2602–2606.
[13] U. M. Maurer, "Secret key agreement by public discussion from common information," IEEE Trans. Inform. Theory, vol. 39, no. 3, pp. 733–742, May 1993.
[14] R. Ahlswede and I. Csiszár, "Common randomness in information theory and cryptography. I. Secret sharing," IEEE Trans. Inform. Theory, vol. 39, no. 4, pp. 1121–1132, Jul. 1993.
[15] I. Csiszár and P. Narayan, "Common randomness and secret key generation with a helper," IEEE Trans. Inform. Theory, vol. 46, no. 2, pp. 344–366, Mar. 2000.
[16] ——, "Secrecy capacities for multiple terminals," IEEE Trans. Inform. Theory, vol. 50, no. 12, pp. 3047–3061, Dec. 2004.
[17] ——, "Secrecy capacities for multiterminal channel models," IEEE Trans. Inform. Theory, vol. 54, no. 6, pp. 2437–2452, Jun. 2008.
[18] C. H. Bennett, G. Brassard, C. Crépeau, and U. M. Maurer, "Generalized privacy amplification," IEEE Trans. Inform. Theory, vol. 41, no. 6, pp. 1915–1923, Nov. 1995.
[19] U. Maurer and S. Wolf, "Secret-key agreement over unauthenticated public channels. I. Definitions and a completeness result," IEEE Trans. Inform. Theory, vol. 49, no. 4, pp. 822–831, Apr. 2003.
[20] ——, "Secret-key agreement over unauthenticated public channels. II. The simulatability condition," IEEE Trans. Inform. Theory, vol. 49, no. 4, pp. 832–838, Apr. 2003.
[21] ——, "Secret-key agreement over unauthenticated public channels. III. Privacy amplification," IEEE Trans. Inform. Theory, vol. 49, no. 4, pp. 839–851, Apr. 2003.
[22] L. Lai, H. El Gamal, and H. Poor, "The wiretap channel with feedback: Encryption over the channel," IEEE Trans. Inform. Theory, vol. 54, no. 11, pp. 5059–5067, Nov. 2008.
[23] E. Tekin and A. Yener, "The general Gaussian multiple-access and two-way wiretap channels: Achievable rates and cooperative jamming," IEEE Trans. Inform. Theory, vol. 54, no. 6, pp. 2735–2751, Jun. 2008.
[24] ——, "Effects of cooperation on the secrecy of multiple access channels with generalized feedback," in Proc. Conf. Inform. Sciences and Systems, Princeton, NJ, Mar. 2008.
[25] A. Khisti, S. Diggavi, and G. Wornell, "Secret-key generation with correlated sources and noisy channels," in Proc. IEEE Int. Symp. Inform. Theory (ISIT 2008), Jul. 2008, pp. 1005–1009.
[26] V. Prabhakaran, K. Eswaran, and K. Ramchandran, "Secrecy via sources and channels — a secret key–secret message rate tradeoff region," in Proc. IEEE Int. Symp. Inform. Theory (ISIT 2008), Jul. 2008, pp. 1010–1014.
[27] T. Cover and J. Thomas, Elements of Information Theory, 2nd ed. New York: Wiley-Interscience, 2006.
[28] T. Han, Information-Spectrum Methods in Information Theory. Berlin: Springer-Verlag, 2003.
[29] E. Telatar, "Capacity of multi-antenna Gaussian channels," European Transactions on Telecommunications, vol. 10, no. 6, pp. 585–595, 1999.
[30] R. Horn and C. Johnson, Matrix Analysis. Cambridge University Press, 1985.
[31] L. L. Scharf, Statistical Signal Processing: Detection, Estimation, and Time Series Analysis. New York: Addison-Wesley, 1990.
[32] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[33] A. Marshall and I. Olkin, Inequalities: Theory of Majorization and Its Applications. Academic Press, 1979.
[34] D. Harville, Matrix Algebra from a Statistician's Perspective. New York: Springer-Verlag, 1997.
[35] A. Khisti, G. Wornell, A. Wiesel, and Y. Eldar, "On the Gaussian MIMO wiretap channel," in Proc. IEEE Int. Symp. Inform. Theory (ISIT 2007), 2007, pp. 2471–2475.
[36] Y. Oohama, "Gaussian multiterminal source coding," IEEE Trans. Inform. Theory, vol. 43, no. 6, pp. 1912–1923, Nov. 1997.
