FPGA Implementation of pipeline Digit-Slicing Multiplier-Less Radix 2 power of 2 DIF SDF Butterfly for Fourier Transform Structure
The need for wireless communication has driven the communication systems to high performance. However, the main bottleneck that affects the communication capability is the Fast Fourier Transform (FFT), which is the core of most modulators. This paper…
Authors: Yazan Samir Algnabi, Rozita Teymourzadeh, Masuri Othman
FPGA Implementation of Pipeline Digit-Slicing Multiplier-Less Radix-2² DIF SDF Butterfly for Fast Fourier Transform Structure

Yazan Samir Algnabi (1), Rozita Teymourzadeh (1, 2), Masuri Othman (1), Md Shabiul Islam (1)
(1) Institute of MicroEngineering and Nanoelectronics (IMEN), VLSI Design Department, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
(2) Faculty of Engineering, Architecture and Built Environment, Electrical & Electronic Engineering Department, UCSI University, Kuala Lumpur, Malaysia
yazansamir@yahoo.com, rozita@ucsi.edu.my, masuri.othman@mimos.my, shabiul@ukm.my

Abstract: The need for wireless communication has driven communication systems towards high performance. However, the main bottleneck that affects the communication capability is the Fast Fourier Transform (FFT), which is the core of most modulators. This paper presents the FPGA implementation of a pipeline digit-slicing multiplier-less Radix-2² DIF (Decimation In Frequency) SDF (Single-path Delay Feedback) butterfly for the FFT structure. To reduce the computational complexity of the butterfly multiplier, the digit-slicing multiplier-less technique was used in the critical path of the pipeline Radix-2² DIF SDF FFT structure. The proposed design focuses on the trade-off between speed and active silicon area for the chip implementation. The multiplier input data were sliced into four blocks of four bits each, which are processed in parallel. The new architecture was investigated and simulated with MATLAB. Verilog HDL code was written in the Xilinx ISE environment to describe the FFT butterfly functionality and was downloaded to a Virtex-II FPGA board. The Virtex-II FG456 Proto board was then used to implement and test the design on real hardware.
The synthesis report indicates a maximum clock frequency of 555.75 MHz with a total equivalent gate count of 32,146, a marked improvement over the conventional Radix-2² DIF SDF FFT butterfly. By comparison, the conventional butterfly architecture can only run at a maximum clock frequency of 200.102 MHz, and the conventional multiplier at a maximum clock frequency of 221.140 MHz. It can be concluded that the on-chip implementation of the pipeline digit-slicing multiplier-less butterfly for the FFT structure is an enabler in solving problems that affect communication capability in the FFT, and it has considerable potential for future related work and research.

Keywords: Pipelined digit-slicing multiplier-less; Fast Fourier Transform (FFT); Verilog HDL; Xilinx; Radix-2² DIF SDF FFT.

I. INTRODUCTION

The FFT is a significant block in several digital signal processing (DSP) applications such as biomedical, sonar, communication systems, radar, and image processing. It is an efficient algorithm for computing the discrete Fourier transform (DFT). The DFT is the main procedure in data analysis, system design, and implementation [1]. Many modules have been designed and implemented on different platforms in order to reduce the computational complexity of the FFT algorithm. These modules focus on the radix order or the twiddle factors to obtain a simple and efficient algorithm; they include the higher-radix FFT [2], the mixed-radix FFT [3], the prime-factor FFT [4], the recursive FFT [5], the low-memory-reference FFT [6], the multiplier-less based FFT [7, 8] and Application-Specific Integrated Circuit (ASIC) systems [9, 10]. A special class of FFT architecture which can compute the FFT in a sequential manner is the pipeline FFT.
Pipelined architectures are characterized by real-time, non-stop processing and offer smaller latency with low power consumption [11], which makes them suitable for most DSP applications. There are two common types of pipelined architectures: single-path architectures and multi-path architectures. Several different architectures have been investigated, such as the Radix-2 Multi-path Delay Commutator (R2MDC) [12], the Radix-2 Single-Path Delay Feedback (R2SDF) [13], the Radix-4 Single-Path Delay Commutator (R4SDC) [14], and the Radix-2² Single-Path Delay Feedback (R2²SDF) [15]. Studies of the listed architectures show that the delay feedback architecture is more efficient than the delay commutator architectures in terms of memory utilization. Radix-2² has the same simple butterfly as radix-2 and the same multiplicative complexity as the radix-4 algorithm [16, 17]. This makes the Radix-2² single-path delay feedback an attractive architecture for DSP implementation.

The digit-slicing technique has been studied in [18-20] for digital filters. The design and implementation of a digit-slicing FFT has been discussed in [21]. This paper proposes an idea similar to the one put forth in [21], but it uses a different algorithm, a different structure and a different platform, which helps to improve the performance and achieve a higher clock frequency. Recently, the Field Programmable Gate Array (FPGA) has become a viable option for direct hardware implementation in real-time applications. In this paper, the digit-slicing architecture is used to design the pipeline digit-slicing multiplier-less Radix-2² SDF butterfly. The FFT butterfly multiplication is the most critical contributor to the delay in the computation of the FFT.
Since the twiddle factors in the FFT processor are known in advance, we propose to use the pipeline digit-slicing multiplier-less butterfly to replace the traditional butterfly in the FFT.

II. RADIX-2² SDF FFT ALGORITHM

The delay feedback architecture is the more efficient one in terms of memory utilization. Single-path architectures based on the radix-4 algorithm have higher multiplier utilization; however, architectures based on the radix-2 algorithm have simpler butterflies and control logic. The radix-2² FFT algorithm has the same multiplicative complexity as radix-4 but retains the butterfly structure of the radix-2 algorithm [15]. That makes the R2²SDF FFT algorithm the best choice for VLSI implementation. In this algorithm, the first two steps of the decomposition of the radix-2 DIF FFT are analysed together, and the common factor algorithm is used to illustrate it. The DFT of size N is defined as:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1$$ (1)

where x[n] and X[k] are complex numbers and $W_N^{kn} = e^{-j 2\pi kn / N}$ is the twiddle factor. In Equation (1) the indices n and k are decomposed as:

$$n = \left\langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3 \right\rangle_N$$ (2)

$$k = \left\langle k_1 + 2 k_2 + 4 k_3 \right\rangle_N$$ (3)

Both n and k take N values in total. When the above substitutions are applied to Equation (1), the DFT definition can be written as:

$$X[k_1 + 2k_2 + 4k_3] = \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1} \sum_{n_1=0}^{1} x\!\left[\tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3\right] W_N^{\left(\frac{N}{2} n_1 + \frac{N}{4} n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)}$$ (4)

$$X[k_1 + 2k_2 + 4k_3] = \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1} \left\{ B_{N/2}^{k_1}\!\left[\tfrac{N}{4} n_2 + n_3\right] W_N^{\left(\frac{N}{4} n_2 + n_3\right) k_1} \right\} W_N^{\left(\frac{N}{4} n_2 + n_3\right)(2k_2 + 4k_3)}$$ (5)

where:

$$B_{N/2}^{k_1}\!\left[\tfrac{N}{4} n_2 + n_3\right] = x\!\left[\tfrac{N}{4} n_2 + n_3\right] + (-1)^{k_1}\, x\!\left[\tfrac{N}{4} n_2 + n_3 + \tfrac{N}{2}\right]$$ (6)

For normal radix-2 DIF FFT algorithms, the expression in the braces in Equation (5) is computed first, as the first stage. However, in the radix-2² FFT algorithm, the main idea is to reconstruct the first-stage and second-stage twiddle factors, as shown in Equation (8) and as described in [15].
$$W_N^{\left(\frac{N}{4} n_2 + n_3\right) k_1}\, W_N^{\left(\frac{N}{4} n_2 + n_3\right)(2k_2 + 4k_3)} = W_N^{N n_2 k_3}\, W_N^{\frac{N}{4} n_2 (k_1 + 2k_2)}\, W_N^{n_3 (k_1 + 2k_2)}\, W_N^{4 n_3 k_3}$$ (7)

$$W_N^{\left(\frac{N}{4} n_2 + n_3\right) k_1}\, W_N^{\left(\frac{N}{4} n_2 + n_3\right)(2k_2 + 4k_3)} = (-j)^{n_2 (k_1 + 2k_2)}\, W_N^{n_3 (k_1 + 2k_2)}\, W_N^{4 n_3 k_3}$$ (8)

Observe that the last twiddle factor in Equation (8) can be rewritten as:

$$W_N^{4 n_3 k_3} = e^{-j \frac{2\pi}{N} 4 n_3 k_3} = e^{-j \frac{2\pi}{N/4} n_3 k_3} = W_{N/4}^{n_3 k_3}$$ (9)

By applying Equations (8) and (9) to Equation (5) and expanding the summation over n_2, the result is a DFT definition with a four times shorter FFT length:

$$X[k_1 + 2k_2 + 4k_3] = \sum_{n_3=0}^{N/4-1} \left[ H(k_1, k_2, n_3)\, W_N^{n_3 (k_1 + 2k_2)} \right] W_{N/4}^{n_3 k_3}$$ (10)

where:

$$H(k_1, k_2, n_3) = \left[ x[n_3] + (-1)^{k_1} x\!\left[n_3 + \tfrac{N}{2}\right] \right] + (-j)^{(k_1 + 2k_2)} \left[ x\!\left[n_3 + \tfrac{N}{4}\right] + (-1)^{k_1} x\!\left[n_3 + \tfrac{3N}{4}\right] \right]$$ (11)

Each bracketed term in Equation (11) represents a radix-2 butterfly (Butterfly I), while the whole expression represents a second radix-2 butterfly (Butterfly II) with the trivial multiplication by -j. Equation (10) is known as the radix-2² SDF FFT algorithm. Fig. 1 shows the butterfly signal flow graph for the radix-2² FFT algorithm. Fig. 2 shows the 16-point R2²SDF FFT signal flow graph.

Fig. 1 The butterfly structure for the radix-2² DIF FFT. [Signal flow graph: inputs x[n], x[n+N/4], x[n+N/2], x[n+3N/4]; outputs X[n], X[n+N/4], X[n+N/2], X[n+3N/4]; twiddle factors W_N^0, W_N^n, W_N^{2n}, W_N^{3n}; trivial multiplication by -j.]

Fig. 2 16-point R2²SDF DIF FFT signal flow graph.

III. RADIX-2² SDF BUTTERFLY STRUCTURE

From Equation (10), each stage in the R2²SDF FFT consists of Butterfly I, Butterfly II, and complex multipliers with twiddle factors. Butterfly I processes the input data flow, Butterfly II processes the output data flow from Butterfly I, and the output data from Butterfly II are then multiplied by the twiddle factors to obtain the result of the current stage. Fig.
3 shows the structure of the 16-point R2²SDF FFT.

Fig. 3 16-point R2²SDF DIF FFT structure. [Block chain: Butterfly I (8), Butterfly II (4), twiddle factor multiplier W, Butterfly I (2), Butterfly II (1); each Butterfly II applies -j.]

A. Butterfly I Structure

Fig. 4 shows the Butterfly I structure. The inputs A_r, A_i of this butterfly come from the previous component, which is the twiddle factor multiplier, except in the first stage, where they come from the FFT input data. The output data B_r, B_i go to the next stage, which is normally Butterfly II. The control signal C1 has two settings. With C1 = 0, the multiplexers direct the input data to the feedback registers until they are filled. With C1 = 1, the multiplexers select the outputs of the adders and subtracters. Butterfly I stores the first half of the N-point input series in the feedback registers and then performs the butterfly calculation when the second half of the data arrives. The results of the butterfly are B_r, B_i, D_r and D_i: B_r and B_i are fed to the output of Butterfly I, while D_r and D_i go to the feedback registers.

B. Butterfly II Structure

Fig. 5 shows the Butterfly II structure. The input data B_r, B_i come from the previous component, Butterfly I. The output data from Butterfly II are E_r, E_i, F_r and F_i. E_r and E_i are fed to the next component, normally the twiddle factor multiplier, while F_r and F_i go to the feedback registers. The multiplication by -j involves swapping the real and imaginary parts and a sign inversion. The swapping is handled efficiently by the Swap-MUX multiplexers, and the sign inversion is handled by switching between the adding and subtracting operations by means of the Swap-MUX. The control signals C1 and C2 are both one when a multiplication by -j is needed; the real and imaginary data are then swapped and the adding and subtracting operations are switched.
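The swap-and-negate treatment of -j and the two-butterfly form of Equation (11) can be checked numerically. The following Python sketch is only an illustration, not the paper's Verilog design: the function names (`mul_neg_j`, `butterfly_1`, `butterfly_2`, `radix22_dft`, `direct_dft`) are hypothetical, floating-point arithmetic is assumed instead of the 16-bit fixed-point data path, and Equation (10) is evaluated directly rather than as a pipelined SDF structure.

```python
import cmath

def mul_neg_j(z):
    """Multiply by -j without a multiplier: (a + jb)(-j) = b - ja (swap, then negate)."""
    return complex(z.imag, -z.real)

def butterfly_1(x, n3, k1):
    """Bracketed terms of Eq. (11): radix-2 butterflies on samples N/2 apart."""
    N = len(x)
    sign = -1 if k1 else 1
    upper = x[n3] + sign * x[n3 + N // 2]
    lower = x[n3 + N // 4] + sign * x[n3 + 3 * N // 4]
    return upper, lower

def butterfly_2(upper, lower, k1, k2):
    """H of Eq. (11), using (-j)^(k1 + 2*k2) = (-j)^k1 * (-1)^k2."""
    rotated = mul_neg_j(lower) if k1 else lower       # optional swap for -j
    return upper - rotated if k2 else upper + rotated  # add/subtract switch

def radix22_dft(x):
    """Evaluate Eq. (10) directly; len(x) must be a multiple of 4."""
    N = len(x)
    X = [0j] * N
    for k1 in (0, 1):
        for k2 in (0, 1):
            for k3 in range(N // 4):
                acc = 0j
                for n3 in range(N // 4):
                    h = butterfly_2(*butterfly_1(x, n3, k1), k1, k2)
                    w_n = cmath.exp(-2j * cmath.pi * n3 * (k1 + 2 * k2) / N)
                    w_n4 = cmath.exp(-2j * cmath.pi * n3 * k3 / (N // 4))
                    acc += h * w_n * w_n4
                X[k1 + 2 * k2 + 4 * k3] = acc
    return X

def direct_dft(x):
    """Reference DFT straight from Eq. (1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

if __name__ == "__main__":
    samples = [complex(n % 5, (3 * n) % 7) for n in range(16)]
    err = max(abs(a - b) for a, b in zip(radix22_dft(samples), direct_dft(samples)))
    print("largest deviation from the direct DFT:", err)
```

Only the factors W_N^{n3(k1+2k2)} and W_{N/4}^{n3 k3} need a true complex multiplication here; the two butterflies use additions, subtractions and the real/imaginary swap, which mirrors the role of the Swap-MUX described above.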
To avoid losing precision, a divide-by-2 scaling is used where the word length would otherwise grow as the data pass through the adder, subtracter and multiplier operations. Rounding is also applied to reduce the scaling errors.

Fig. 4 The Butterfly I structure. [Inputs A_r, A_i; outputs B_r, B_i; feedback values D_r, D_i; real and imaginary feedback registers; adders, subtracters and 2-to-1 multiplexers controlled by C1.]

Fig. 5 The Butterfly II structure. [Inputs B_r, B_i; outputs E_r, E_i; feedback values F_r, F_i; real and imaginary feedback registers; Swap-MUX controlled by C1; adders, subtracters and 2-to-1 multiplexers controlled by C2.]

C. Complex Multiplier

Normally the complex multiplier is realized with four real multipliers, one adder and one subtracter, as shown in Fig. 6. This complex multiplier structure occupies a large chip area in a VLSI implementation.

$$(a_r + j a_i)(b_r + j b_i) = (a_r b_r - a_i b_i) + j\,(a_i b_r + a_r b_i)$$ (12)

Fig. 6 Complex multiplier with four real multipliers. [Products a_r × b_r, a_i × b_i, a_r × b_i, a_i × b_r; real part (a_r × b_r) - (a_i × b_i); imaginary part (a_r × b_i) + (a_i × b_r).]

The complex multiplier can instead be realized with only three real multipliers and five real adders/subtracters, based on Equation (13); this saves a considerable amount of area in the hardware implementation, as shown in Fig. 7.

$$(a_r + j a_i)(b_r + j b_i) = \{ b_r (a_r - a_i) + a_i (b_r - b_i) \} + j \{ b_i (a_r + a_i) + a_i (b_r - b_i) \}$$ (13)

Fig. 7 Complex multiplier with three real multipliers. [Sums a_r - a_i, a_r + a_i, b_r - b_i; products b_r(a_r - a_i), a_i(b_r - b_i), b_i(a_r + a_i); real part b_r(a_r - a_i) + a_i(b_r - b_i); imaginary part b_i(a_r + a_i) + a_i(b_r - b_i).]

IV.
FPGA IMPLEMENTATION OF PIPELINE DIGIT-SLICING MULTIPLIER-LESS RADIX-2² DIF SDF BUTTERFLY

The previous section explained in detail the conventional structure of the R2²SDF butterfly. This section discusses how to apply the digit-slicing technique to the R2²SDF butterfly components in order to reduce the computational complexity and enhance the throughput. The digit-slicing multiplier-less R2²SDF butterfly uses the same components as the conventional structure, except for the complex multiplier, which is replaced by the digit-slicing multiplier-less unit.

Multiplication is regarded as the most important operation in most signal processing systems, but it is a complex and expensive operation. Many techniques have been introduced for reducing the size and improving the speed of multipliers. In this paper we propose a digit-slicing multiplier-less unit to improve the speed of the multiplication. The digit-slicing complex multiplier was first designed in MATLAB to prove that the algorithm works, and the design was then improved to become the digit-slicing multiplier-less unit.

The concept behind the digit-slicing architecture is that any binary number can be sliced into a few blocks of shorter binary numbers, with each block carrying a different weight [22]. In this paper, 16-bit fixed-point 2's complement arithmetic has been chosen to represent the input data and the twiddle factors, which are signed numbers with absolute value less than one.
Let us consider the absolute value of the complex multiplier input data (the output of Butterfly II), x, with a length of B = 16 bits, represented in 2's complement as:

$$x = \sum_{j=0}^{B-1} x_j\, 2^{j}$$ (14)

The sliced data are then represented by the fundamental slicing algorithm as follows:

$$x = 2^{-(pb-1)} \sum_{k=0}^{b-1} 2^{pk} X_k, \qquad X_k = \sum_{j=0}^{p-1} 2^{j} X_{k,j}$$ (15)

where x is sliced into b blocks, p is the bit width per block, and the X_{k,j} are all either one or zero, except for X_{k=b-1, j=p-1}, which is zero or minus one. The digit-slicing architecture has been applied to the complex multiplier input data (the output of Butterfly II) to slice the data into four groups of four bits each, as shown in Fig. 8 and Fig. 9.

Fig. 8 Digit-slicing structure.

Fig. 9 Digit slicing for the input E_r.

The complex multiplier is realized with three real multipliers, as mentioned in the previous section. Digit slicing is applied to the input data of each real multiplier so that the multiplication with the 16-bit twiddle factor is carried out in parallel, as shown in Fig. 10. The processing time is therefore reduced. To understand and prove the digit-slicing algorithm, MATLAB designs of the complex multiplier and of the digit-slicing multiplier were made and their results were compared, as shown in Fig. 11 and Fig. 12.

Fig. 10 Digit-slicing multiplier structure. [The 16-bit input data pass through the digit-slicing unit (16 bits to 4 bits) into four 4-bit blocks; each block is multiplied by the 16-bit twiddle factor in a real multiplier (16 bits × 4 bits); the partial products are left-shifted by 12, 8 and 4 bits, added, and right-shifted by 15 to give the 16-bit output.]

Fig. 11 Design of the complex multiplier in MATLAB.

Fig. 12 Design of the digit-slicing complex multiplier in MATLAB.
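The slicing, shift and add steps of the digit-slicing multiplier can be illustrated with integer arithmetic. The Python sketch below is an assumption-laden illustration rather than the authors' MATLAB or Verilog design: the helper names (`to_q15`, `slice_digits`, `digit_slicing_multiply`, `direct_multiply`) are hypothetical, the 16-bit words are treated as Q15 integers, and the final right shift by 15 truncates instead of rounding.

```python
def to_q15(value):
    """Quantise a real number in [-1, 1) to a 16-bit two's-complement Q15 integer."""
    q = int(round(value * (1 << 15)))
    return max(-(1 << 15), min((1 << 15) - 1, q))

def slice_digits(x, blocks=4, width=4):
    """Split a signed 16-bit integer into `blocks` digits of `width` bits each.

    The lower digits are unsigned (0..15). The top digit keeps the sign
    (-8..7), because only the most significant bit carries negative weight.
    """
    mask = (1 << width) - 1
    digits = [(x >> (width * k)) & mask for k in range(blocks - 1)]
    digits.append(x >> (width * (blocks - 1)))  # arithmetic shift keeps the sign
    return digits

def digit_slicing_multiply(x, w):
    """Multiply Q15 values x and w via four 4-bit partial products."""
    partial_products = [d * w for d in slice_digits(x)]          # 4 narrow multiplies
    acc = sum(p << (4 * k) for k, p in enumerate(partial_products))  # shifts by 0, 4, 8, 12
    return acc >> 15                                             # back to a 16-bit Q15 result

def direct_multiply(x, w):
    """Reference: one full 16 x 16 multiply followed by the same right shift."""
    return (x * w) >> 15

if __name__ == "__main__":
    x = to_q15(0.5)
    w = to_q15(-0.70710678)
    print(slice_digits(x), digit_slicing_multiply(x, w), direct_multiply(x, w))
```

Because the four weighted partial products add up to exactly x·w before the final shift, the sliced result matches the direct product bit for bit; the benefit claimed in the text is that the four narrow products can be formed in parallel.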
Since the twiddle factors in the FFT are known in advance, the product of a 16-bit twiddle factor and a 4-bit input block can take only 16 possible values, which can be stored in one RAM for each twiddle factor. This turns the digit-slicing multiplier into a digit-slicing multiplier-less unit, which replaces the conventional multiplier as shown in Fig. 13.

Fig. 13 16-point R2²SDF FFT structure with digit-slicing multiplier. [Block chain: Butterfly I (8), Butterfly II (4), digit-slicing multiplier-less unit, Butterfly I (2), Butterfly II (1); each Butterfly II applies -j.]

The digit-slicing multiplier-less design consists of one lookup table (ROM), shifters and an adder to produce the output, as shown in Fig. 14 and Fig. 15. To generate the lookup table data (the 16 possible multiplication results), a dedicated MATLAB program was written that applies the digit-slicing algorithm to all possible values of the 4-bit input data, from "0000" to "1111". Storing all these possibilities in one ROM allows the design to perform the multiplication without any real multiplier.

Fig. 14 Digit-slicing multiplier-less structure. [The 16-bit input data pass through the digit-slicing unit (16 bits to 4 bits) into four 4-bit blocks; each block addresses a ROM holding all possible products of a 4-bit input with the 16-bit twiddle factor W; the ROM outputs are left-shifted by 12, 8 and 4 bits, added, and right-shifted by 15 to give the 16-bit output.]

Fig. 15 Design of the digit-slicing multiplier-less unit in MATLAB.

Verilog HDL code was written in the Xilinx ISE environment to describe the functionality of the pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly and was downloaded to a Virtex-II FPGA board. The Virtex-II FG456 Proto board was then used to implement and test the design on real hardware.

V. RESULT

Two different modules were implemented for the R2²SDF DIF FFT butterfly.
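As a software analogue of the two kinds of multiplication being compared, and only as a sketch, the Python below contrasts a conventional multiply with a ROM-based digit-slicing multiplier-less version. The names `build_rom`, `lut_multiply` and `conventional_multiply` are hypothetical, Q15 integers with a truncating right shift are assumed, and the handling of negative inputs (subtracting the twiddle factor shifted left by 16 when the sign bit is set) is an assumption of this sketch, since the text does not spell out how the sign is treated.

```python
def build_rom(w):
    """Offline step: the 16 products of a 4-bit pattern (0..15) with twiddle w."""
    return [d * w for d in range(16)]

def lut_multiply(x, rom):
    """Multiply a signed 16-bit x by the twiddle baked into `rom`.

    Only table lookups, shifts and additions are used at run time.
    """
    acc = 0
    for k in range(4):
        pattern = (x >> (4 * k)) & 0xF   # raw 4-bit block, used as the ROM address
        acc += rom[pattern] << (4 * k)   # weights 2^0, 2^4, 2^8, 2^12
    if x < 0:
        # Assumed sign handling: the unsigned 16-bit pattern equals x + 2^16,
        # so remove the extra w * 2^16 with one shift and one subtraction.
        acc -= rom[1] << 16
    return acc >> 15                     # back to a 16-bit Q15 result

def conventional_multiply(x, w):
    """Reference: a full 16 x 16 multiply followed by the same right shift."""
    return (x * w) >> 15

if __name__ == "__main__":
    twiddle = 30274                      # roughly cos(2*pi/16) in Q15
    rom = build_rom(twiddle)
    for sample in (16384, -16384, 12345, -1):
        print(sample, lut_multiply(sample, rom), conventional_multiply(sample, twiddle))
```

One table is needed per twiddle factor, so the scheme trades multipliers for memory, which is in line with the larger gate count and higher clock frequency the paper reports for the multiplier-less design.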
The first module uses the conventional architecture for the butterfly, where the twiddle factors are stored in ROM and called by the butterfly to be multiplied with the inputs by means of the dedicated high-speed multipliers available in the Virtex-II FPGA. The other module uses the pipelined digit-slicing multiplier-less architecture to perform the multiplication with the twiddle factor. Both modules were built and tested in MATLAB as indicated in the previous section, then coded in Verilog and synthesized using the XST (Xilinx Synthesis Technology) tool. The target FPGA was the Xilinx Virtex-II XC2V500-6-FG456.

The ModelSim simulation result of the pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly is shown in Fig. 16, while the synthesis results for the two models are presented in Table 1, which gives the hardware specifications of the design. It indicates a maximum clock frequency of 555.75 MHz for the pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly, and a maximum clock frequency of 609.980 MHz for the single pipelined digit-slicing multiplier-less unit used in the butterfly. Fig. 17 shows the RTL schematic of the pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly.

Fig. 16 ModelSim simulation result of the pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly.

Fig. 17 RTL schematic of the pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly.

Table 1: Hardware specifications of the digit-slicing butterfly (Xilinx Virtex-II FPGA XC2V250-6FG456)

Design | Total equivalent gate count | Maximum frequency (MHz)
Conventional butterfly | 18,408 | 200.102
Pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly | 32,146 | 555.75
Conventional 16-bit multiplier | 4,131 | 220.160
Pipeline digit-slicing multiplier-less, 16 bits | 6,483 | 609.980

VI. CONCLUSION

This study presented the FPGA implementation of a pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly for the FFT structure. The implementation was coded in the Verilog hardware description language and tested on a Xilinx Virtex-II XC2V500-6-FG456 prototyping FPGA board. A maximum clock frequency of 555.75 MHz was obtained from the synthesis report for the pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly, which is 2.8 times faster than the conventional butterfly. It can be concluded that the FPGA implementation of the pipeline digit-slicing multiplier-less Radix-2² DIF SDF butterfly for the FFT structure is an enabler in solving problems that affect communication capability in the FFT, and it has considerable potential for future related work and research.

REFERENCES

[1] A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-Time Signal Processing, 2nd ed. Upper Saddle River, N.J.: Prentice Hall, 1999.
[2] G. D. Bergland, "A radix-eight fast-Fourier transform subroutine for real-valued series," IEEE Trans. Audio Electroacoust., vol. 17, pp. 138-144, 1969.
[3] R. C. Singleton, "An algorithm for computing the mixed radix fast Fourier transform," IEEE Transactions on Audio and Electroacoustics, vol. 17, pp. 93-103, 1969.
[4] D. P. Kolba and T. W. Parks, "A prime factor FFT algorithm using high-speed convolution," IEEE Trans. Acoust., Speech, Signal Process., vol. 25, pp. 281-294, 1977.
[5] A. R. Varkonyi-Koczy, "A recursive Fast Fourier Transform algorithm," IEEE Trans. Circuits Syst., vol. 42, pp. 614-616, 1995.
[6] Y. Wang, Y. , Y. J. Tang, J. G. Chung, and S. S.
Song, "Novel memory reference reduction methods for FFT implementation on DSP processors," IEEE Trans. Signal Process., vol. 55, pp. 2338-2349, 2007.
[7] Y. Zhou, J. M. Noras, and S. J. Shephend, "Novel design of multiplier-less FFT processors," Signal Proc., vol. 87, pp. 1402-1407, 2007.
[8] B. Mahmud and M. Othman, "FPGA implementation of a canonical signed digit multiplier-less based FFT Processor for wireless communication applications," in ICSE2006 Proc., Kuala Lumpur, Malaysia, 2006, pp. 641-645.
[9] B. M. Baas, "An approach to low-power, high-performance fast Fourier transform processor design," in Electrical Engineering, vol. Ph.D.: Stanford University, 1999, p. 169.
[10] Y. P. Hsu and S. Y. Lin, "Parallel-computing approach for FFT implementation on Digital Signal Processor (DSP)," World Acad. Sci., Eng. Technol., vol. 42, pp. 587-591, 2008.
[11] T. Sansaloni, A. Pérez-Pascual, V. Torres, and J. Valls, "Efficient pipeline FFT processors for WLAN MIMO-OFDM systems," Electronics Letters, vol. 41, pp. 1043-1044, 2005.
[12] L. R. Rabiner and B. Gold, Theory and Application of Digital Signal Processing. Englewood Cliffs, N.J.: Prentice-Hall, 1975.
[13] E. H. Wold and A. M. Despain, "Pipeline and parallel-pipeline FFT processors for VLSI implementation," IEEE Transactions on Computers, vol. 33, pp. 414-426, 1984.
[14] G. Bi and E. V. Jones, "A pipelined FFT processor for word-sequential data," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, pp. 1982-1985, 1989.
[15] S. He and M. Torkelson, "A new approach to pipeline FFT processor," in Parallel Processing Symposium, Proceedings of IPPS '96, The 10th International, Honolulu, HI, USA, 1996, pp. 766-770.
[16] L. Yang, K. Zhang, H. Liu, J. Huang, and S. Huang, "An Efficient Locally Pipelined FFT Processor," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 53, pp. 585-589, 2006.
[17] S. He and M. Torkelson, "Designing pipeline FFT processor for OFDM (de)modulation," in International Symposium on Signals, Systems, and Electronics (ISSSE'98), 1998, pp. 257-262.
[18] M. A. B. Nun and M. E. Woodward, "A modular approach to the hardware implementation of digital filters," Radio and Electronic Engineer, vol. 46, pp. 393-400, 1976.
[19] A. Peled and B. Liu, Digital Signal Processing: Theory, Design, and Implementation. New York: Wiley, 1976.
[20] Z. A. M. Sharrif, "Digit slicing architecture for real time digital filters," vol. Ph.D. UK: Loughborough University, 1980.
[21] S. A. Samad, A. Ragoub, M. Othman, and Z. A. M. Shariff, "Implementation of a high speed Fast Fourier Transform VLSI chip," Microelectronics Journal, vol. 29, pp. 881-887, 1998.
[22] Yazan Samir and T. Rozita, "The Effect Of The Digit Slicing Architecture On The FFT Butterfly," in 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010), Kuala Lumpur, Malaysia, 2010, pp. 802-205.