Gaussian Random Number Generator: Implemented in FPGA for Quantum Key Distribution

Original Article

Yue Hu 1 | Yan Wu 1 | Yi Chen 1 | Guo Chun Wan 1 | Mei Song Tong 1

1 College of Electronics and Information Engineering, Tongji University, Shanghai, 201804, China

Correspondence: Yue Hu, College of Electronics and Information Engineering, Tongji University, Shanghai, 201804, China. Email: tjhuyue@tongji.edu.cn

Present address: † College of Electronics and Information Engineering, Tongji University, Shanghai, 201804, China

Funding information: National Undergraduate Innovation Program 2016, College of Electronics and Information Engineering, Tongji University

Quantum Key Distribution is the process of using quantum communication to establish a shared key between two parties. It has been demonstrated that the unconditional security and effective communication of a quantum communication system can be guaranteed by an excellent Gaussian random number generator with high speed and an extended random period. In this paper, we propose to construct the Gaussian random number generator using a Field-Programmable Gate Array (FPGA), which is able to process large amounts of data at high speed. We also compare three algorithms for GRN generation: the Box-Muller algorithm, the polarization decision algorithm, and the central limit algorithm. We demonstrate, through null hypothesis tests, that the polarization decision algorithm implemented in FPGA requires fewer computing resources and also produces high-quality Gaussian random numbers.

KEYWORDS: Gaussian Random Numbers, Quantum Key Distribution, Field Programmable Gate Array, Numerical Modeling

Abbreviations: FPGA: Field Programmable Gate Array; QKD: Quantum Key Distribution; GRNs: Gaussian Random Numbers; URNs: Uniform Random Numbers; LFSR: Linear Feedback Shift Register; MSRG: Multi-return Shift Register Generator.

Y. HU ET AL.
1 | INTRODUCTION

As modern physics suggests, quantum mechanics has a significant influence on engineering, for example in quantum computing [13, 10] and quantum tunneling electronic devices [4]. At the same time, it opens a new perspective by offering interesting new protocols at the intersection of computer science [13], information theory [3], and quantum cryptographic keys (QCK) [14, 6, 8]. The best known and most developed application of quantum cryptography is quantum key distribution (QKD) [31, 29], which is the process of using quantum communication to establish a shared key between two parties without a third party learning anything about that key, even if all communication is eavesdropped [31]. The absolute security of QCK is guaranteed by intrinsically quantum mechanical properties [39], i.e., Heisenberg's uncertainty principle [7]. In practice, the unconditional security is achieved by using quantum vacuum fluctuations [39, 25], the phase noise of lasers [21], and amplified spontaneous emission (ASE) [9, 15]. The vacuum fluctuation method is based on shot-noise measurement [11]. The phase noise method uses laser phase noise [16], and the spontaneous emission method utilizes fluctuations in ASE noise [37][1]. Recently there has been a large number of proposals, experiments, improvements, and exciting theoretical results in randomness extraction and randomness certification for dealing with QCK.

Unfortunately, direct fluctuation measurement is not technologically feasible for optical signals, especially considering the problems arising from sampling and digitization. However, it is possible to obtain a pseudo-random quantum signal from a set of high-quality Gaussian random numbers [34]. This requires that the period of the Gaussian random numbers (GRNs) be long enough to guarantee the absolute security of QCK. On the other hand, the generation speed is also critical for effective communication between the two parties [20].
Hence, an adequate Gaussian random source is economical and vital for continuous-variable quantum cryptography communication [36]. However, the study of Gaussian random number generators in continuous-variable QCK systems is considerably more complicated. An excellent survey of Gaussian random number generators (GRNGs) from the algorithmic perspective is given by Thomas et al. [33], who compared the computational requirements of the algorithms and examined the quality of their output. In this work, we choose three conventional algorithms for generating GRNs: the Box-Muller algorithm [5, 33], the polarization decision algorithm [2, 33], and the central limit algorithm [18, 33]. The Box-Muller transform is one of the earliest exact transformation methods [33]. It produces a pair of Gaussian random numbers from a pair of uniform numbers. The polar method is an exact method related to the Box-Muller transform and has a closely related two-dimensional graphical interpretation, but uses a different approach to obtain the 2D Gaussian distribution [2, 33]. The central limit algorithm relies on the fact that the probability density function describing the sum of multiple uniform random numbers is obtained by convolving the constituent probability density functions. Thus, by the central limit theorem, the probability density function of the sum of K uniform random numbers over the range (0, 1) approximates a Gaussian distribution. Furthermore, Malik and Hemani [23] provide a useful summary of hardware Gaussian random number generator architectures.

In this work, we analyze and compare these three algorithms for generating Gaussian random numbers for a continuous-variable quantum cryptography communication system. More importantly, we choose an FPGA as the hardware on which to construct GRNG architectures using the Box-Muller algorithm, the polarization decision algorithm, and the central limit algorithm. Generally, a CPU executes instructions at extremely high speed.
However, for a particular kind of instruction (i.e., instructions for GRN and QCK), it is possible to achieve a higher speed with an appropriately designed architecture [17, 32, 35].

In what follows, in Section 2, we briefly describe how Gaussian random numbers are used in a QCK system with the reverse reconciliation protocol. Then in Section 3, we introduce the Box-Muller algorithm, the polarization decision algorithm, and the central limit algorithm. In Section 4, we present the architectures designed for the FPGA circuit and the implementation of the three algorithms. In Section 5, we describe the hardware structure used to realize our design. In Section 6, we estimate the accuracy and quality of the generated GRNs and compare the three algorithms. In Section 7, we discuss how our GRN generator can be implemented in a continuous-variable QCK system. In Section 8, we give our conclusions.

2 | APPLICATION OF GAUSSIAN RANDOM NUMBER IN QUANTUM KEY DISTRIBUTION SYSTEM

In the field of quantum key distribution, randomness is an essential requirement. Even if the communication channel is eavesdropped by others, communication on this channel remains safe because of the randomness. Here we introduce a coherent-state QKD protocol, whose security relies on the distribution of a Gaussian key obtained by continuously modulating the phase and amplitude with GRNs [8] at Alice's (emitter) side, and subsequently detected at Bob's (receiver) side [12].

The protocol runs as follows [20]. Alice prepares displaced coherent states with quadrature components q and p that are realizations of two independent and identically distributed random variables Q and P. The random variables Q and P obey the same zero-centered normal distribution:

$$Q \sim P \sim \mathcal{N}(0, V) \tag{1}$$

where V is referred to as the variance. The displaced coherent states $|\alpha_1\rangle, \ldots, |\alpha_j\rangle, \ldots, |\alpha_n\rangle$ are expressed as:

$$|\alpha_j\rangle = |q_j + i p_j\rangle \tag{2}$$

The coherent states obey the usual eigenvalue equation:

$$\frac{1}{2}\left(\hat{q} + i\hat{p}\right)|\alpha_j\rangle = \left(q_j + i p_j\right)|\alpha_j\rangle \tag{3}$$

where $\hat{q}$ and $\hat{p}$ are the quadrature operators, defined in the framework of shot-noise units [22]. After preparing each coherent state, Alice transmits $|\alpha_j\rangle$ to Bob through a Gaussian quantum channel. Bob uses heterodyne detection to measure the eigenvalue of either one or both of the quadrature operators. In the last step, Bob sends the correction data to Alice, and Alice then corrects her own data so that it has the same values as Bob's.

As illustrated above, to obtain useful key elements, Gaussian random numbers are first needed to modulate the amplitude and phase information. On the other hand, the noise that the Gaussian random source introduces into the transmission system degrades the security. Thus, a high-quality Gaussian random number generator plays a significant role in the QKD system. Secondly, high transmission speed is another advantage of quantum communication over a classical communication system. To improve the transmission speed further, a faster GRN generator is required.

In what follows, we propose a Gaussian random source with high output speed and low quantization noise to efficiently generate a secure quantum key sequence.

3 | ALGORITHM FOR GENERATING GAUSSIAN RANDOM NUMBER

3.1 | Box-Muller Algorithm

The Box-Muller transform, proposed by Box and Muller [5], is one of the exact algorithms for obtaining Gaussian random numbers. The Box-Muller algorithm is based on a property of a two-dimensional Cartesian system in which the X and Y coordinates are described by two independent and normally distributed random variables, i.e. $f_X(x) = \frac{1}{\sqrt{2\pi}\,\delta} e^{-\frac{x^2}{2\delta^2}}$ and $f_Y(y) = \frac{1}{\sqrt{2\pi}\,\delta} e^{-\frac{y^2}{2\delta^2}}$.
If X and Y are transformed to the corresponding polar coordinate variables r and θ, the random variables r and θ are also independent and can be expressed as $f_R(r) = \int_0^{2\pi} \frac{r}{2\pi\delta^2} e^{-\frac{r^2}{2\delta^2}}\, d\theta$, $0 \le r < \infty$, and $f_\Theta(\theta) = \int_0^{\infty} \frac{r}{2\pi\delta^2} e^{-\frac{r^2}{2\delta^2}}\, dr$, $0 \le \theta \le 2\pi$, where R obeys the Rayleigh distribution and Θ obeys the uniform distribution. Their joint probability density factorizes as $f_{R\Theta}(r, \theta) = f_R(r) \times f_\Theta(\theta)$, which shows that R and Θ are statistically independent. The corresponding distribution functions of R and Θ are:

$$F_R(r) = \int_0^{r} \frac{r'}{\delta^2} e^{-\frac{r'^2}{2\delta^2}}\, dr', \qquad F_\Theta(\theta) = \int_0^{\theta} \frac{1}{2\pi}\, d\theta' = \frac{\theta}{2\pi} \tag{4}$$

Fortunately, the distribution functions $F_R(r)$ and $F_\Theta(\theta)$ are in closed form. Hence the Gaussian random variables can be generated by the inverse transformation method [33]. Since $F_R(r) \in [0, 1]$, as does $F_\Theta(\theta)$, the Gaussian random variables X and Y in Cartesian coordinates can be obtained through a transformation of two sets of uniformly and independently distributed random numbers.

In practice, the Box-Muller algorithm samples two uniform distributions on the interval (0, 1) and maps them to two standard Gaussian distributed samples with zero expectation and unit variance. The algorithm is implemented as follows:

1. Generate a pair of uniformly and independently distributed random numbers on the interval (0, 1), denoted $U_1$ and $U_2$ respectively.
2. Map the random point to the Cartesian coordinate axes through the transformation:

$$\alpha(U_1, U_2) = \sqrt{-2\ln(U_1)}\,\sin(2\pi U_2), \qquad \beta(U_1, U_2) = \sqrt{-2\ln(U_1)}\,\cos(2\pi U_2) \tag{5}$$

where $\alpha(U_1, U_2)$ and $\beta(U_1, U_2)$ are random numbers that each follow a Gaussian distribution. The amplitude of $\alpha(U_1, U_2)$ and $\beta(U_1, U_2)$ depends on the uniform random number $U_1$ through $\sqrt{-2\ln(U_1)}$. Their phase equals the product of $U_2$ and the constant $2\pi$.
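The two steps above can be sketched in software as follows. This is a minimal floating-point illustration of Eq. (5), not the FPGA design described later; the function names `box_muller` and `sample_pairs` are hypothetical, and Python's built-in generator stands in for the hardware uniform source.

```python
import math
import random
from typing import List, Tuple


def box_muller(u1: float, u2: float) -> Tuple[float, float]:
    """Map two uniforms (u1 in (0, 1], u2 in [0, 1)) to the pair (alpha, beta) of Eq. (5)."""
    radius = math.sqrt(-2.0 * math.log(u1))  # amplitude, set by U1 only
    phase = 2.0 * math.pi * u2               # phase, set by U2 only
    return radius * math.sin(phase), radius * math.cos(phase)


def sample_pairs(n: int, seed: int = 0) -> List[Tuple[float, float]]:
    """Draw n (alpha, beta) pairs; the software RNG is an assumption of this sketch."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u1 = 1.0 - rng.random()  # random() is in [0, 1); shift to (0, 1] so log(u1) is finite
        u2 = rng.random()
        pairs.append(box_muller(u1, u2))
    return pairs


if __name__ == "__main__":
    alphas = [a for a, _ in sample_pairs(100_000, seed=1)]
    mean = sum(alphas) / len(alphas)
    var = sum((a - mean) ** 2 for a in alphas) / len(alphas)
    print("mean:", mean, "variance:", var)
```

The sample mean and variance printed at the end should be close to 0 and 1 respectively, which is a quick sanity check rather than a test of Gaussianity.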
3.2 | Polarization Decision Algorithm

The polarization decision algorithm proposed by Bell [2] is also an exact approach for obtaining the two-dimensional Gaussian distribution. The polar algorithm is related to the Box-Muller transform but avoids evaluating trigonometric functions, which makes it cheaper to implement.

Theoretically, we consider two independent and normally distributed random variables X and Y in Cartesian coordinates, i.e. $f_X(x) = \frac{1}{\sqrt{2\pi}\,\delta} e^{-\frac{x^2}{2\delta^2}}$ and $f_Y(y) = \frac{1}{\sqrt{2\pi}\,\delta} e^{-\frac{y^2}{2\delta^2}}$. The integrals of the two probability density functions are $F = \int_{-\infty}^{+\infty} f_X(x)\, dx = \int_{-\infty}^{+\infty} f_Y(y)\, dy$. Thus the square of F transformed to polar coordinates is:

$$F^2 = \frac{1}{2\pi\delta^2} \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-\frac{x^2 + y^2}{2\delta^2}}\, dx\, dy = \frac{1}{2\pi\delta^2} \int_0^{2\pi}\int_0^{+\infty} r\, e^{-\frac{r^2}{2\delta^2}}\, dr\, d\theta \tag{6}$$

As in the Box-Muller method, the transformation to polar coordinates makes θ uniformly distributed from 0 to 2π. The normalized distribution function of the radial distance r is:

$$P(r < a) = \int_0^{a} r\, e^{-\frac{r^2}{2}}\, dr \tag{7}$$

The uniform random number U is also used here. Since U is uniformly distributed on the interval (0, 1), the point $(\cos(2\pi U), \sin(2\pi U))$ is uniformly distributed on the unit circumference $x^2 + y^2 = 1$. A new point is generated by multiplying that point by the radial distance r: $(r\cos(2\pi U), r\sin(2\pi U))$. Finally, by the inverse transform, one obtains two jointly distributed variables which are independent standard normal random variables.

In practice, the polar method is realized by the rejection approach. Assume $y = f(x)$ is a function with finite integral, C is a set of points $(x, y)$, and Z is a superset of C. Random points $(x, y)$ are then selected uniformly from Z until a point $(x, y)$ falls into C. The selected point $(x, y)$ is returned as the random number [18]. The set C here is the unit circle: $C = \{(x, y) : x^2 + y^2 < 1\}$, where $-1 < x < 1$, $-1 < y < 1$.
Then random points $(U_1, U_2)$ are selected until $U_1^2 + U_2^2 < 1$. A pair of normal random variables is then obtained as [33]:

$$\alpha = U_1 \sqrt{\frac{-2\ln(U_1^2 + U_2^2)}{U_1^2 + U_2^2}}, \qquad \beta = U_2 \sqrt{\frac{-2\ln(U_1^2 + U_2^2)}{U_1^2 + U_2^2}} \tag{8}$$

3.3 | Central Limit Algorithm

The central limit algorithm is based on the central limit theorem, which states that when a sufficiently large number of samples is drawn from independent random variables (e.g., uniform distributions), their arithmetic mean is approximately normally distributed. Thus, the central limit algorithm is an extremely efficient method for GRN generation, since it simply sums a sufficient number of independent and identically distributed uniform samples.

More formally, assume there are n independent and identically distributed uniform numbers $U_i \sim U(0, 1)$ and form their sum $S = \sum_{i=1}^{n} U_i$. The cumulative distribution function of S can be approximated as:

$$F_S(s) \approx \Phi\left(\frac{s - n\mu}{\sqrt{n\sigma^2}}\right) \tag{9}$$

where Φ represents the cumulative distribution function of a standard Gaussian distribution. For a uniform random variable $U_i \sim U(0, 1)$, the mean μ and variance $\sigma^2$ are $\frac{1}{2}$ and $\frac{1}{12}$ respectively. Thus, if we choose the variable $z = \frac{s - n/2}{\sqrt{n/12}}$, the distribution of z is approximately Gaussian:

$$f_Z(z) \approx \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}} \tag{10}$$

That is, after this normalization a standard Gaussian distribution is obtained. The central limit algorithm can be used to transform uniform random numbers to Gaussian random numbers with a very low hardware cost. However, the error in the tail regions of the distribution is inversely proportional to the number of $U_i \sim U(0, 1)$ that are added. As a result, the GRNs produced by the central limit algorithm are not highly accurate in the tail region [33].

4 | HARDWARE ARCHITECTURE OF GRN AND URN ALGORITHM

4.1 | Uniform Random Number Generator

FIGURE 1 General architecture of the multi-return shift register generator. $c_0, c_1, \ldots, c_{n-1}, c_n$ are the feedback coefficients and $a_1, a_2, a_3, \ldots, a_n$ are the output values.

Since uniform random numbers (URNs) U(0, 1) are essential for all three algorithms introduced in Sec. 3, an efficient and robust URN generator is indispensable in the whole system. Here we choose to use the Multi-return Shift Register Generator (MSRG) [41] in our design. The MSRG is one type of Linear Feedback Shift Register (LFSR) [38], which is one of the most effective and simple ways to obtain uniform random numbers.

The basic architecture of the MSRG is shown in Fig. 1. The MSRG is composed of a shift register and a feedback function, which can be represented as a polynomial in the variable x, referred to as the characteristic polynomial:

$$f(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0 \tag{11}$$

where $c_0, c_1, \ldots, c_n$ are the feedback coefficients. The feedback coefficients are selected by the multiplexer, which is controlled by the control signal. When the multiplexer selects the output signal $a_i$ directly instead of the 'XOR' function, the corresponding coefficient $c_i$ is regarded as 0 in the characteristic polynomial. The input bit is a linear function of the current state, and the next state of an MSRG is uniquely determined from the previous one by the feedback network. The initial value of the register is called the seed, and the sequence produced is completely determined by this initial state [26]. Because the register has a finite number of possible states, the sequence repeats after a certain period. The period of an MSRG of order n is no more than $2^n - 1$. Only if the feedback coefficients are properly chosen is the output sequence the m-sequence, which has the longest period $2^n - 1$ [40]. In this paper, the primitive polynomial we choose is:
$$f(x) = x^{32} + x^{8} + x^{5} + x^{2} + 1 \tag{12}$$

After obtaining the m-sequence with a period of $2^{32} - 1$, the uniform random number is obtained by dividing the m-sequence value by $2^n - 1$ (here n = 32) using a divider.

With the uniform random number generator, which is the essential part of the Gaussian random number generator, described above, we now show the hardware architecture designs for implementing the Box-Muller, polarization decision, and central limit algorithms.

4.2 | Hardware Architecture Design for Box-Muller Algorithm

FIGURE 2 The integral structural design diagram of the Gaussian random generator using the Box-Muller algorithm. $U_1$ and $U_2$ are the uniform random numbers generated by the MSRG. α and β are the output Gaussian random numbers. Single-precision floating-point numbers are used in the design.

Fig. 2 shows the component structural design diagram of the Gaussian random generator using the Box-Muller algorithm. The two uniform random numbers $U_1$ and $U_2$ are generated by the MSRG module, which was introduced in Sec. 4.1. All the random numbers $U_1$, $U_2$ and the necessary constants are converted into single-precision floating-point numbers. The logarithm of $U_1$ is computed by the 'LOG' module and the square root of $-2\ln(U_1)$ is calculated by the 'SQRT' module (see Fig. 2). The trigonometric functions $\sin(2\pi U_2)$ and $\cos(2\pi U_2)$ are computed by the 'SIN' and 'COS' modules. Four external multipliers are also used in the design. As a result, two sets of Gaussian random numbers are obtained, i.e., α and β.

4.3 | Hardware Architecture Design for Polarization Decision Algorithm

Fig. 3 shows the component structural design diagram of the Gaussian random generator using the polarization decision algorithm. The two uniform random numbers $U_1$ and $U_2$ are generated by the MSRG module, which was introduced in Sec. 4.1. All the random numbers $U_1$, $U_2$ and the necessary constants are converted into single-precision floating-point numbers.
Following Eq. (8), the logarithm of $U_1^2 + U_2^2$ is computed by the 'LOG' module and the square root is calculated by the 'SQRT' module (see Fig. 3). The division is performed by the 'DIV' module. Five external multipliers and one adder are also used in the design. As a result, two sets of Gaussian random numbers are obtained (see footnote 1), i.e., α and β.

FIGURE 3 The integral structural design diagram of the Gaussian random generator using the polarization decision algorithm. $U_1$ and $U_2$ are the uniform random numbers generated by the MSRG. α and β are the output Gaussian random numbers. Single-precision floating-point numbers are used in the design.

4.4 | Hardware Architecture Design for Central Limit Algorithm

Fig. 4 shows the component structural design diagram of the Gaussian random generator using the central limit algorithm. The uniform random numbers $U_1, U_2, \ldots, U_n$ are generated by the MSRG module, which was introduced in Sec. 4.1. All the random numbers $U_1, U_2, \ldots, U_n$ and the necessary constants are converted into single-precision floating-point numbers. The new variable $S = \sum_{i=1}^{n} U_i$ is obtained by an adder and the square root is calculated by the 'SQRT' module (see Fig. 4). The division is performed by the 'DIV' module. Two external multipliers and one adder are used in the design. As a result, one set of Gaussian random numbers is obtained, i.e., α.

5 | HARDWARE REALIZATION OF ALGORITHM FUNCTIONS

5.1 | Field-Programmable Gate Array

The Field-Programmable Gate Array (FPGA), which integrates programmable logic blocks and soft-core or hard-core processors, has become more and more common as a core technology for building electronic systems. In most FPGAs, logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory. The FPGA configuration is generally specified using a hardware description language, such as the one we use in this work: Verilog HDL.
Footnote 1: The output is always larger than zero when $U_1, U_2 \in (0, 1)$. To get the bilateral Gaussian distribution, two additional sets of uniform random numbers $U_3, U_4 \in (0, 1)$ are indispensable. The key point is to change the sign bit of $U_3, U_4$ to negative after they are converted into floating-point numbers, and then to combine the results from $U_1, U_2, U_3, U_4$.

FIGURE 4 The integral structural design diagram of the Gaussian random generator using the central limit algorithm. $U_1, U_2, \ldots, U_n$ are the uniform random numbers generated by the MSRG. n is the number of uniform random number sets. α is the output Gaussian random number. Single-precision floating-point numbers are used in the design.

The main and most significant difference between a micro-controller and an FPGA is that the FPGA does not have a fixed hardware structure. On the contrary, an FPGA is programmable according to user applications. Processors, however, have a fixed hardware structure, which means that all the transistors, memory, peripheral structures, and connections are constant. The processor predefines the operations (addition, multiplication, I/O control, etc.), and users make the processor perform these operations sequentially by means of software.

The hardware structure in the FPGA is not fixed but defined by the user. Although the logic cells in an FPGA are fixed, the functions they perform and the interconnections between them are determined by the user. The operations that an FPGA can perform are therefore not predefined. Users can have the processes carried out according to the written HDL code "in parallel", which means simultaneously. The ability to process in parallel is one of the most critical features that separate the FPGA from the processor and make it superior in many areas.

The FPGA is generally also useful for routine control of particular circuits. For example, an FPGA can be used for simple functions such as checking the quantum key signals from communication.
This process can also be done quickly with many conventional micro-controllers (PIC series, etc.). However, an FPGA solution is more reasonable if users want to achieve highly efficient communication, because QCK processing requires processing large amounts of data at high speed, which makes this type of application very suitable for an FPGA capable of parallel processing.

Since the user can determine the hardware structure of the FPGA, the FPGA can be programmed to process more extensive data in a few clock cycles. It is not possible to achieve this performance with a processor, because the data flow is limited by the processor bus (16-bit, 32-bit, etc.) and the processing speed. As a result, the FPGA has come to the fore for applications that require more performance, such as intensive data processing, in addition to routine control operations.

Nevertheless, micro-controllers can be embedded into the FPGA, since they are in fact logic circuits. Thus it is possible to define and use a processor and user-specific hardware functions on a single chip by using an FPGA. This solution makes it possible to control the hardware with high flexibility. Users can modify and update the whole design (the processor on the FPGA and other logic circuits) by changing only the code on the FPGA, without any change to the circuit board layout. In this way, users can add different functions, improve performance, and keep the design up to date without having to redesign the boards.

For the Gaussian random number generation algorithms described above, the FPGA chip Altera Cyclone IV E EP4CE115F29I8L is chosen to realize our design. This chip provides 528 I/O ports, 114,480 logic elements, and 7155 logic array blocks. It is able to achieve a maximum operating frequency of 200 MHz.

FIGURE 5 The input and output signals of the FPGA floating-point IP cores. LOG is the ALTFP_LOG IP core. SIN/COS is the ALTFP_SINCOS IP core. DIV is the ALTFP_DIV IP core. SQRT is the ALTFP_SQRT IP core.
5.2 | FPGA Floating-Point IP Cores

Intellectual property (IP) cores are standalone modules that can be used in any field programmable gate array, and their source code can be ported across various FPGA platforms. They are developed using HDL languages such as VHDL, Verilog, and SystemVerilog. In this work we use soft IP cores to implement our design. Soft IP cores are completely flexible and do not depend on vendor technology. Hence, the IPs can be modified according to the user's particular application and easily integrated with other modules.

5.2.1 | Logarithm IP Core

The logarithm calculation is performed by the ALTFP_LOG IP core, which computes the natural logarithm of single-precision format numbers. Fig. 5 shows the input and output signals of the ALTFP_LOG IP core. The function of each port is defined as:

• clock : Clock input to the IP core.
• clk_en : Clock enable. When the clk_en port is asserted high, a natural logarithm operation takes place.
• aclr : Asynchronous clear. When the aclr port is asserted high, the function is asynchronously cleared.
• data[ ] : Floating-point input data.
• result[ ] : The natural logarithm of the value on input data.
• zero : Zero exception output. This occurs when the actual input value is 1.
• nan : NaN exception output. This occurs when the input is a negative number or NaN.

5.2.2 | Trigonometric IP Core

The trigonometric calculation is performed by the ALTFP_SINCOS IP core, which computes the trigonometric sine and cosine functions of single-precision format numbers. Fig. 5 shows the input and output signals of the ALTFP_SINCOS IP core. The function of each port is defined as:

• clock : Clock input to the IP core.
• clk_en : Clock enable. When the clk_en port is asserted high, a sine or cosine operation takes place.
• aclr : Asynchronous clear. When the aclr port is asserted high, the function is asynchronously cleared.
• data[ ] : Floating-point input data.
• result[ ] : The trigonometric result of the data[] input port in floating-point format. The widths of the result[] output port and the data[] input port are the same.

5.2.3 | Division IP Core

The division is performed by the ALTFP_DIV IP core, which carries out the floating-point division operation. Fig. 5 shows the input and output signals of the ALTFP_DIV IP core. The function of each port is defined as:

• clock : Clock input to the IP core.
• clk_en : Clock enable to the floating-point divider. This port enables division.
• aclr : Asynchronous clear. When the aclr port is asserted high, the function is asynchronously cleared.
• dataa[ ] : Numerator data input.
• datab[ ] : Denominator data input.
• result[ ] : Divider output port. The division result.
• overflow : Overflow port for the divider. Asserted when the result of the division exceeds or reaches infinity.
• underflow : Underflow port for the divider. Asserted when the result of the division is zero even though neither of the inputs to the divider is zero, or when the result is a denormalized number.
• zero : Zero port for the divider. Asserted when the value of result[] is zero.
• nan : NaN port. Asserted when an invalid division occurs, such as infinity divided by infinity or zero divided by zero.

5.2.4 | Square Root Calculation IP Core

The square root calculation is performed by the ALTFP_SQRT IP core. This IP core computes the square root of the input provided. Fig. 5 shows the input and output signals of the ALTFP_SQRT IP core. The function of each port is defined as:

• clock : Clock input to the IP core.
• clk_en : Clock enable that allows square root operations when the port is asserted high.
• aclr : Asynchronous clear. When the aclr port is asserted high, the function is asynchronously cleared.
• data[ ] : Floating-point input data.
• result[ ] : Square root output port for the floating-point result.
• zero : Zero port. Asserted when the value of the result[] port is 0.
• nan : NaN port. Asserted when an invalid square root occurs, such as negative numbers or NaN inputs.
• overflow : Overflow port. Asserted when the result of the square root exceeds or reaches infinity.

6 | THE RESULT OF SIMULATION AND TESTING

6.1 | FPGA Resource Usage Summary

| Algorithm | <= 2 input functions | 3 input functions | 4 input functions | I/O pins |
|---|---|---|---|---|
| Box-Muller | 1990 | 6768 | 4335 | 131 |
| Polarization Decision | 1579 | 2859 | 1469 | 131 |
| Central Limit | 1453 | 2873 | 2889 | 420 |

TABLE 1 Logic element usage by number of LUT inputs and I/O pins. The table shows the LUT resource usage for realizing the Box-Muller algorithm, polarization decision algorithm, and central limit algorithm respectively.

Based on the designs of the three algorithms above, we implement them on the FPGA. Here we show the resource utilization of LUTs and logic elements and the fan-out analysis for each design. Note that we choose 12 sets of uniform random numbers as inputs to the central limit algorithm.

Tab. 1 presents the logic element usage by the number of LUT inputs and I/O pins. The design of the Box-Muller algorithm requires more LUT resources than the others because of its complex mathematical operations. Moreover, the polarization decision algorithm uses fewer LUT resources, in particular four-input functions, and also fewer I/O pins than the central limit algorithm.

Tab. 2 shows the logic element usage by normal mode and arithmetic mode. The design of the central limit algorithm requires fewer arithmetic resources than the others because it does not involve logarithms or trigonometric functions. Moreover, the Box-Muller algorithm again uses many more resources.

| Algorithm | Logic Elements by Normal Mode | Logic Elements by Arithmetic Mode |
|---|---|---|
| Box-Muller | 7803 | 5209 |
| Polarization Decision | 3602 | 2305 |
| Central Limit | 6142 | 1073 |

TABLE 2 Logic element usage by mode.
The table shows the logic element resource usage for realizing the Box-Muller algorithm, polarization decision algorithm, and central limit algorithm respectively.

Tab. 3 gives the results of the fan-out test. The maximum fan-out and total fan-out of the Box-Muller algorithm are much larger than those of the others. The polarization decision and central limit algorithms give similar maximum fan-out and total fan-out values. However, the central limit algorithm shows a larger average fan-out than the polarization decision algorithm.

| Algorithm | Maximum Fan-Out | Total Fan-Out | Average Fan-Out |
|---|---|---|---|
| Box-Muller | 8790 | 62386 | 2.81 |
| Polarization Decision | 5262 | 30332 | 2.63 |
| Central Limit | 4534 | 35411 | 2.81 |

TABLE 3 The table shows the results of the fan-out test using the Box-Muller algorithm, polarization decision algorithm, and central limit algorithm respectively.

6.2 | Statistical Analysis

To verify our design of the three algorithms, we import the generated random numbers into MATLAB for statistical analysis and compare the results with the random numbers generated by the MATLAB function 'randn' [28]. The number of sampled random numbers is 1,000,000 in all cases.

Fig. 6 shows the histograms of the random numbers generated by the Box-Muller, polarization decision, and central limit algorithms. It is evident that all sets of random numbers follow the Gaussian profile. The Box-Muller and polarization decision methods generate two sets of random numbers simultaneously, i.e., the α set and the β set. The quantity of random numbers generated by the polarization decision algorithm is smaller than for the others, since the two uniform random numbers $U_1$ and $U_2$ are rejected if $U_1^2 + U_2^2 > 1$.

A more robust way to estimate the accuracy of Gaussian random numbers is the null hypothesis test, which is used to determine what outcomes of a study would lead to a rejection of the null hypothesis for a pre-specified level of significance [27].
We show the results of three null hypothesis tests in the following, i.e., the Chi-Square goodness of fit test, the Anderson-Darling test, and the Kolmogorov-Smirnov test.

FIGURE 6 The histograms of the Gaussian random numbers generated by the Box-Muller, polarization decision, and central limit algorithms using FPGA, as well as the random numbers generated by the MATLAB software function 'randn'. Panels: (a) Box-Muller: α set; (b) Box-Muller: β set; (c) Polarization decision: α set; (d) Polarization decision: β set; (e) Central Limit; (f) MATLAB.

6.3 | The Null Hypothesis Test

The Chi-squared test is a statistical test whose null hypothesis states that the frequency distribution of specific events observed in a sample is consistent with a particular theoretical distribution. Here the theoretical distribution is the Gaussian distribution, and the test statistic is compared against the chi-squared distribution. This test is suitable for unpaired data from large samples [19]. The significance level is set to 5% here.

The results are presented in Tab.4. The null hypothesis is rejected for the central limit method. This indicates that its approximation of the Gaussian is poor, especially in the tails [33]. To improve the accuracy of the central limit method, one has to increase the number of uniform random numbers used for the approximation (see Sec.4). However, a large number of uniform random numbers also constitutes a computational challenge. Thus, the central limit method is not ideal for contemporary GRN generation.

The Box-Muller method and the polarization method both pass the Chi-squared test. The β set of Box-Muller shows a smaller P value than the MATLAB sample (0.1238 versus 0.1938), while its α set and both polarization decision sets show larger ones. Since all of these values lie above the 5% significance level, the test gives no evidence that the GRNs generated in the FPGA are of lower quality than those generated by MATLAB software.

| Algorithm | Null Hypothesis | Calculated Probability (P Value) | Test Statistic |
|---|---|---|---|
| Box-Muller | Non-rejected | α: 0.6636, β: 0.1238 | α: 4.9290, β: 11.3567 |
| Polarization Decision | Non-rejected | α: 0.7312, β: 0.8725 | α: 4.4130, β: 3.1322 |
| Central Limit | Rejected | NaN | NaN |
| MATLAB | Non-rejected | 0.1938 | 9.9094 |

TABLE 4 The table shows the results of the Chi-Square goodness of fit test for the Gaussian random numbers generated by the Box-Muller algorithm, the polarization decision algorithm, the central limit algorithm, and MATLAB software, respectively.

Furthermore, the Anderson-Darling test and the Kolmogorov-Smirnov test are also applied here. The Anderson-Darling test is a statistical test of whether a given sample of data is drawn from a given probability distribution [30], while the Kolmogorov-Smirnov test is a nonparametric test of the equality of continuous probability distributions that can be used to compare a sample with a reference probability distribution [24]. When applied to test whether a normal distribution adequately describes a set of data, they are among the most powerful statistical tools for detecting most departures from normality.

The results are shown in Tab.5 and Tab.6. The central limit method is again rejected by these two tests. More importantly, the GRNs generated by the MATLAB function 'randn' do not pass the Kolmogorov-Smirnov test, while the Box-Muller method and the polarization method both pass it. These results indicate that our design for GRNs using the Box-Muller and polarization decision algorithms is a robust way to generate high-quality GRNs.

| Algorithm | Null Hypothesis | Calculated Probability (P Value) | Test Statistic |
|---|---|---|---|
| Box-Muller | Non-rejected | α: 0.7663, β: 0.0704 | α: 0.2449, β: 0.6922 |
| Polarization Decision | Non-rejected | α: 0.0862, β: 0.9399 | α: 0.6569, β: 0.1736 |
| Central Limit | Rejected | NaN | NaN |
| MATLAB | Non-rejected | 0.3044 | 0.4339 |

TABLE 5 The table shows the results of the Anderson-Darling test for the Gaussian random numbers generated by the Box-Muller algorithm, the polarization decision algorithm, the central limit algorithm, and MATLAB software, respectively.

| Algorithm | Null Hypothesis | Calculated Probability (P Value) | Test Statistic |
|---|---|---|---|
| Box-Muller | Non-rejected | α: 0.5481, β: 0.3308 | α: 0.0008, β: 0.0010 |
| Polarization Decision | Non-rejected | α: 0.1778, β: 0.4333 | α: 0.0012, β: 0.0010 |
| Central Limit | Rejected | NaN | NaN |
| MATLAB | Rejected | NaN | NaN |

TABLE 6 The table shows the results of the one-sample Kolmogorov-Smirnov test for the Gaussian random numbers generated by the Box-Muller algorithm, the polarization decision algorithm, the central limit algorithm, and MATLAB software, respectively.

7 | DISCUSSION

The two groups of Gaussian random numbers generated by the Box-Muller and polarization decision algorithms have been shown to be well distributed and of high quality. To apply our GRN design to a QKD system, one additional electrical modulator is required. The modulator modulates the phase and magnitude of the signal so that they follow the Gaussian profile. The resulting signal is regarded as a set of pseudo quantum states, which are usually generated by expensive optical devices.

A QKD system requires high-speed and efficient communication between two parties. The parallel processing ability of the FPGA is an advantage in this case. Since the user can determine the hardware structure of the FPGA, it can be programmed to process larger amounts of data in few clock cycles, and high-speed communication is thereby achieved.

To guarantee the security of communication, a truly random signal is usually required. However, the GRNs generated by the FPGA can be regarded as true GRNs when their period is long enough. Because of the extensive and flexible resources of the FPGA, the period of the GRNs can be extremely long. Of course, an extended period requires a higher-end FPGA, and hence more investment.

8 | CONCLUSION

Quantum Key Distribution is the process of using quantum communication to establish a shared key between two parties.
The absolute security and effective communication of a quantum communication system can be guaranteed by a good Gaussian random number generator with high speed and a long random period. In this paper, we propose a possible scheme whose results prove satisfactory; this work forms the foundation for subsequent work. We conclude that:

1. The reconfigurable hardware structure of the FPGA gives users a parallel processing solution and makes the FPGA superior to the microprocessor in many areas. The FPGA is an ideal solution for QKD processing, which requires processing large amounts of data at high speed.
2. Among the three conventional GRN algorithms, the Gaussian random numbers generated by the polarization decision algorithm show higher quality than the others.
3. FPGA floating-point IP cores can be easily modified and integrated with other modules. They are appropriate choices for implementing complex mathematical operations in hardware.

ACKNOWLEDGEMENT

AL acknowledges the support of National Undergraduate Innovation Program 2016. We are grateful to Prof. Guochun Wan and Prof. Meisong Tong, in the Department of Electronics and Information Engineering, Tongji University. We are extremely thankful and indebted to them for sharing their expertise, and for the sincere and valuable guidance and encouragement extended to us.

REFERENCES

[1] Argyris A, Pikasis E, Deligiannidis S, Syvridis D. Sub-Tb/s Physical Random Bit Generators Based on Direct Detection of Amplified Spontaneous Emission Signals. Journal of Lightwave Technology 2012;30(9):1329–1334.
[2] Bell JR. Algorithm 334: Normal Random Deviates. Commun ACM 1968 Jul;11(7):498–. http://doi.acm.org/10.1145/363397.363547.
[3] Bennett CH, Shor PW. Quantum information theory. to appear 1998;.
[4] Bouchiat V, Vion D, Joyez P, Esteve D, Devoret M. Quantum coherence with a single Cooper pair. Physica Scripta 1998;1998(T76):165.
[5] Box GEP, Muller ME. A Note on the Generation of Random Normal Deviates. Ann Math Statist 1958 06;29(2):610–611. https://doi.org/10.1214/aoms/1177706645.
[6] Braunstein SL, Van Loock P. Quantum information with continuous variables. Reviews of Modern Physics 2005;77(2):513.
[7] Busch P, Heinonen T, Lahti P. Heisenberg's uncertainty principle. Physics Reports 2007;452(6):155–176.
[8] Cerf NJ, Levy M, Van Assche G. Quantum distribution of Gaussian keys using squeezed states. Physical Review A 2001;63(5):052311.
[9] Chapuran T, Toliver P, Peters N, Jackel J, Goodman M, Runser R, et al. Optical networking for quantum key distribution and quantum communications. New Journal of Physics 2009;11(10):105001.
[10] Deutsch D. Quantum theory, the Church–Turing principle and the universal quantum computer. Proc R Soc Lond A 1985;400(1818):97–117.
[11] Gabriel C, Wittmann C, Sych D, Dong R, Mauerer W, Andersen UL, et al. A generator for unique quantum random numbers based on vacuum states. Nature Photonics 2010;4(10):711–715.
[12] Grosshans F, Assche GV, Wenger J, Brouri R, Cerf NJ, Grangier P. Quantum key distribution using gaussian-modulated coherent states. Nature 2003;421(6920):238.
[13] Gruska J. Quantum computing, vol. 2005. McGraw-Hill London; 1999.
[14] Herrerocollantes M, Garciaescartin JC. Quantum Random Number Generators. Review of Modern Physics 2017;(1).
[15] Hu Y, Yuen KH, Lazarian A. Improving the accuracy of magnetic field tracing by velocity gradients: principal component analysis. Monthly Notices of the Royal Astronomical Society 2018;480:1333.
[16] Jofre M, Curty M, Steinlechner F, Anzolin G, Torres JP, Mitchell MW, et al. True random numbers from amplified quantum vacuum. Optics Express 2011;19(21):20665.
[17] Kestur S, Davis JD, Williams O. Blas comparison on fpga, cpu and gpu. In: VLSI (ISVLSI), 2010 IEEE computer society annual symposium on IEEE; 2010. p. 288–293.
[18] Knuth DE. The Art of Computer Programming, Volume 2 (3rd Ed.): Seminumerical Algorithms. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc.; 1997.
[19] Lani D, Chi-Square goodness of fit test. Retrieved january; 2011.
[20] Laudenbach F, Pacher C, Fung CHF, Poppe A, Peev M, Schrenk B, et al. Continuous-Variable Quantum Key Distribution with Gaussian Modulation-The Theory of Practical Implementations. arXiv preprint arXiv:170309278 2017;.
[21] Lo HK, Curty M, Tamaki K. Secure quantum key distribution. Nature Photonics 2014;8(8):595.
[22] Madsen LS, Usenko VC, Lassen M, Filip R, Andersen UL. Continuous variable quantum key distribution with modulated entangled states. Nature communications 2012;3:1083.
[23] Malik JS, Hemani A. Gaussian Random Number Generation: A Survey on Hardware Architectures. ACM Comput Surv 2016 Nov;49(3):53:1–53:37. http://doi.acm.org/10.1145/2980052.
[24] Massey Jr FJ. The Kolmogorov-Smirnov test for goodness of fit. Journal of the American statistical Association 1951;46(253):68–78.
[25] Milonni PW. The quantum vacuum: an introduction to quantum electrodynamics. Academic press; 2013.
[26] Mioc MA. A complete analyze of using Shift Registers in Cryptosystems for Grade 4, 8 and 16 Irreducible Polynomials???.
[27] Moore DS, Craig BA, McCabe GP. Introduction to the Practice of Statistics. WH Freeman; 2012.
[28] Nadler B. Design flaws in the implementation of the Ziggurat and Monty Python methods (and some remarks on Matlab randn). arXiv preprint math/0603058 2006;.
[29] Scarani V, Bechmann-Pasquinucci H, Cerf NJ, Dušek M, Lütkenhaus N, Peev M. The security of practical quantum key distribution. Reviews of modern physics 2009;81(3):1301.
[30] Scholz FW, Stephens MA. K-sample Anderson–Darling tests. Journal of the American Statistical Association 1987;82(399):918–924.
[31] Shor PW, Preskill J. Simple proof of security of the BB84 quantum key distribution protocol. Physical review letters 2000;85(2):441.
[32] Sidhu R, Prasanna VK. Fast regular expression matching using FPGAs. In: Field-Programmable Custom Computing Machines, 2001. FCCM'01. The 9th Annual IEEE Symposium on IEEE; 2001. p. 227–238.
[33] Thomas DB, Luk W, Leong PH, Villasenor JD. Gaussian random number generators. ACM Computing Surveys (CSUR) 2007;39(4):11.
[34] Thomas DB, Howes L, Luk W. A comparison of CPUs, GPUs, FPGAs, and massively parallel processor arrays for random number generation. In: Acm/sigda International Symposium on Field Programmable Gate Arrays; 2009. p. 63–72.
[35] Underwood K. FPGAs vs. CPUs: trends in peak floating-point performance. In: Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays ACM; 2004. p. 171–180.
[36] Wang LZ. The Generation of Gaussian random number with FPGA and its Application on Quantum Crytography. In: The Generation of Gaussian random number with FPGA and its Application on Quantum Crytography; 2006. .
[37] Williams CRS, Salevan JC, Li X, Roy R, Murphy TE. Fast physical random number generator using amplified spontaneous emission. Optics Express 2010;18(23):23584.
[38] Xiao-Chen GU, Zhang MX. Multi-output Fibonacci Type LFSR Based Uniform Random Number Generator:Design and Analysis. Computer Engineering and Science 2009;.
[39] Zeng G. Quantum private communication. Springer Publishing Company, Incorporated; 2010.
[40] Zhang Z, Liu X, Duan X. Algorithm comparison on Gaussian random number generators. Journal of Henan Institute of Science and Technology 2014;.
[41] Zhou DM, Zhang ZB, Liu YH. Study of m Sequence Generator SSRG and MSRG. Journal of Lanzhou Jiaotong University 2010;.
