# INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 # COMPRESSORS BASED RNS-TO-BINARY CONVERTER FOR MODULI-SET {2<sup>k</sup>-1, 2<sup>k</sup>, 2<sup>k</sup>+1} C. Praise Amulya<sup>1</sup>, K. Neelima<sup>2</sup> <sup>1</sup>PG Student, VLSI, Sree Vidyanikethan Engineering College, Tirupati, Chittoor, A.P, India. **Abstract:** - RNS based signal processing is an efficient way alternate to the conventional methods due to its parallel arithmetic operations and small data size. This paper presents the design of the Compressor based RNS to Binary reverse converter which is the bottleneck in an RNS based signal processing system. Specifically, it describes a pipelined parallel prefix based modular adder which is used to implement the reverse converter for the $\{2^k-1, 2^k, 2^k+1\}$ moduli set based RNS system. Two modulo adders and their performances are also discussed. Keywords: CPA modular adder, Compressor, Ling Adder, Area, Delay, Area-Delay Product. #### 1. Introduction Nowadays many consumer electronics products and devices (e.g. Tablet) use extensive DSP techniques to implement operations such as digital filtering and image processing. So it is important to have efficient signal processing techniques, and Residue Number System (RNS) is one efficient alternative method. The main advantages of RNS lies in its small data size representation, and arithmetic operations can be performed in a parallel manner. For implementing the RNS to binary converter, the moduli set choice is important, as the speed and complexity of the resulting conversion algorithm depend on the chosen moduli set. This paper is analyzed for {2<sup>k</sup>-1, 2<sup>k</sup>, 2<sup>k</sup>+1} moduli set. This special moduli set enables the reverse conversion to be performed with reduced hardware complexity. In this paper, we examine the efficient implementation of the modulo adders and compressor used for this class of moduli set. Section II describes RNS representation and its uses. Section III describes a theorem which is theoretical explanation of converting of RNS to Binary. Section IV represents existing system. Similarly section V describes proposed system where area-delay product will be less than existing system. Section VI describes Implementation of proposed system and section VII represents results. # 2. RNS Representation RNS representation [1] of a number is derived based on a set of relative prime numbers known collectively as a moduli set $\{m_1, m_2, ..., m_n\}$ . For example, using the 3-moduli set $\{7,8,9\}$ , the RNS representation of the decimal number X=169 is $<1,1,7>_{7,8,9}$ which is derived from the modulo operations: $|169|_7 = 1$ ; $|169|_8 = 1$ ; $|169|_9 = 7$ . Similarly, decimal number Y=254 in RNS is equal to $<4,0,1.>_{7,8,9}$ . The operation (169+424) = 593 can be equally performed in RNS as $<1+4,1+0,7+1>_{7,8,9} = <5,1,8>_{7,8,9}$ . Advantages of RNS representation are operations can be done parallel, carry free operations and fault tolerant. <sup>&</sup>lt;sup>2</sup>Assistant Professor, ECE Dept., Sree Vidyanikethan Engineering College, Tirupati, Chittoor, A.P, India. # 3. New Chinese Remainder Theorem (CRT) The statement of the Chinese Remainder Theorem (CRT) [3] is as follows Given a set of pair-wise relatively prime moduli $\{m_1, m_2, ..., m_n\}$ and a residue representation $\{r_1, r_2, ..., r_n\}$ in that system of some number X i.e. $r_i = |X|_{m_i}$ that number and its residues are related by the equation: $$|X|_{M} = \left| \sum_{i=1}^{n} r_{i} |M_{i}^{-1}|_{m_{i}} M_{i} \right|_{M}$$ Where M is the product of the $m_i$ 's, and $M_i = M/m_i$ . If the values involved are constrained so that the final value of X is within the dynamic range, then the modular reduction on the left-hand side can be omitted. Relation between the number $r_i$ and its inverse $r_i^{-1}$ is as follows $$(r_i \times r_i^{-1}) mod \ m_i = 1$$ Then $||M_i^{-1}|_{m_i} M_i|_{m_i} = 1$ Since $all m_i$ ,'s are relatively prime, the inverses exist $$\begin{split} \bar{X}_i &= |M_i^{-1}|_{m_i} M_i \\ X_i &= r_i \bar{X}_i = r_i |M_i^{-1}|_{m_i} M_i \\ X &= \sum_{i=1}^n X_i = \sum_{i=1}^n r_i |M_i^{-1}|_{m_i} M_i \end{split}$$ Example Consider the moduli-set $\{3,4,5\}$ . To find the conventional representation of the residue-set $\{0,2,3\}$ with respect to the given moduli-set using the CRT, we first determine $M_i$ 's: $$M_1 = \frac{M}{m_1} = \frac{3 \times 4 \times 5}{3} = 20$$ $$M_2 = \frac{M}{m_2} = \frac{3 \times 4 \times 5}{4} = 15$$ $$M_3 = \frac{M}{m_3} = \frac{3 \times 4 \times 5}{5} = 12$$ and their inverses $$\begin{split} |M_1 \times M_1^{-1}|_3 &= 1 \\ |20 \times M_1^{-1}|_3 &= 1 \\ M_1^{-1} &= 2 \\ |M_2 \times M_2^{-1}|_4 &= 1 \\ |15 \times M_2^{-1}|_4 &= 1 \\ M_2^{-1} &= 3 \end{split}$$ $$\begin{aligned} |M_3 \times M_3^{-1}|_5 &= 1 \\ |12 \times M_3^{-1}|_5 &= 1 \\ M_5^{-1} &= 3 \end{aligned}$$ **Using Equation** $$X = \left| \sum_{i=1}^{3} r_i |M_i^{-1}|_{m_i} M_i \right|_{60} = \left| 0 \times 20 \times 2 + 2 \times 15 \times 3 + 3 \times 12 \times 3 \right|_{60} = 18$$ We notice from Equation that implementing the CRT requires three main steps: - 1. Obtaining $M_i$ 's and their inverses $M_i^{-1}$ 's. - Multiply-and-Accumulate operations - 3. Modular reduction. ### 4. Existing System Figure 1: Existing System[4] The important stages of the converter consists of Bit organizing module [6], 2k parallel full adders (FAs) that form a 2k-bit Carry-save adder (CSA), one 2k-bit modular adder based on a 2k-bit Carry Propagate adder (CPA) [5], and one module to remove redundant '0's value representation [2] output by the conversion circuit. Compressors [7] are used to minimize delay and area which leads to increase the performance of the overall system. So in the place of full adder 3:2compressor can be used. The main delay of the converter lies in the CPA modular adder, specifically, due to the latency in the generation of the carry-out bit (Cout) within the CPA adder. As shown in figure. 1, the Cout bit is needed to be looped back to the CPA's carry-in (Cin) in order to perform the modular operation. So, it is important to reduce the propagation delay of the carry bit of the modular adder. This paper presents optimization of 2k parallel full adders (FAs) and CPA modular adder to address this issue, with the aim to get better area delay product #### 5. Proposed System The proposed circuit replaces CPA modular adder and Full adders with parallel prefix Ling adder [6] where the carry bit can be pre-calculated using a propagate (pi) bit and a generate (gi) bit, which are in turn generated directly from its input bits (ai and bi) without the need to wait for the completion of the sum (si) and 3:2 compressors [7] generally designed by XOR gates and multiplexers. A pipeline mechanism is added for reducing the delay of the system. Among the parallel prefix adders, the Ling parallel prefix adder is the most efficient for the modulo addition operation. In the Ling adder, its inputs ai and bi bits are the sum bit and the carry bit output by the preceding CSA stage (like the arrangement shown in Figure 1). Figure 2 shows the connections of the Ling adder to perform the modulo addition function required for the converter. Figure 2: Ling's parallel prefix based Modular Adder A 3-2 compressor has three inputs X1, X2, X3 and generates two outputs, the sum and the carry bits. The block diagram of 3-2 compressor is shown in figure 3. Figure 3: (a) A 3-2 Compressor (b) 3-2 Compressor Truth table The internal architecture of 3-2 Compressor using XOR and Multiplexer is shown in figure 4 Figure 4: Internal Architecture of compressor The Internal architectures of 3-2 compressor shown in figure 2(a), it has two XOR gates in the critical path. The sum output is generated by the second XOR and carry output is generated by the multiplexer (MUX). The equations governing the conventional 3-2 compressor outputs are shown below: $$Sum = X1 \oplus X2 \oplus X3$$ $$Carry = (X1 \oplus X2) \cdot X3 + \overline{(X1 \oplus X2)} \cdot X1$$ # 6. Implementation The proposed circuit consists of bit organizing module, 2k parallel 3-2 compressors, ling adder and zero redundancy removal module with pipeline mechanism. This is shown in the figure 5: Figure 5: Proposed system # 7. Results And Discussion The Proposed system is simulated and verified using Verilog HDL in Xlinix ISE 10.1 for the target device xc3s500e-5fg320. The below simulation result shows the output of the system. Figure 6: Simulation Result of Proposed system The following Table represents the comparison of various methods in terms of area, delay and area-delay product. | TABLE- 1: Comparison | of Different Methods of RN | NS to Binary Reverse Converter | |----------------------|----------------------------|--------------------------------| | | | | | Parameters / Methods | Slices | LUT's | IO's | Bonded<br>IOB's | Delay (ns) | Area-Delay<br>Product | | |-----------------------------|--------|-------|------|-----------------|------------|-----------------------|--| | With Full adders (FA) | | | | | | | | | CPA modular adder | 24 | 42 | 24 | 16 | 12.686 | 304.464 | | | Ling Adder with Pipeline | 10 | 17 | 26 | 17 | 8.157 | 85.17 | | | With Compressors | | | | | | | | | CPA modular adder | 24 | 43 | 24 | 16 | 12.686 | 304.464 | | | Ling Adder with<br>Pipeline | 8 | 14 | 26 | 17 | 9.483 | 75.864 | | From the synthesis results, compressors based Ling adder with pipeline has less area-delay product when compared to CPA modular adder. #### 8. Conclusion This paper analyzes the RNS to Binary converter implementation issues for the $\{2^k-1, 2^k, 2^k+1\}$ moduli set, which can be used for RNS based digital signal processing purposes. The CPA modulo adder bottleneck incurred in the architecture of the existing design is identified. Compressors and Ling parallel-prefix based modulo adder with pipeline mechanism are then proposed to overcome the limitations. Simulation result and synthesis results confirm that the proposed system has better area- delay product than existing system. #### REFERENCES - [1] K.M. Karthik and C. H. Vun, "A High Performance Hardware Based RNS-to-Binary Converter" 2014 IEEE International Conference on Consumer Electronics (ICCE). - [2] Z. Wang, G.A. Jullien and W.C. Miller, "An improved Residue to binary converter" *Circuits and Systems I: Fundamental Theory and Applications, IEEE Transactions on*, vol.47, no.9, pp.1437-1440, Sep 2000. - [3] Omar Abdelfattah "Data Conversion in Residue Number System" internal document McGill University Montreal, Canada January 2011. - [4] K. M. Karthik, C.H. Vun, "Efficient Reverse Converters Design for RNS based Digital Signal Processing System", Internal document, SCE,NTU. - [5] G. Dimitrakopoulos, D.G. Nikolos, H.T.Vergos, D. Nikolos, "New Architecture for Modulo 2N-1 Adders". IEEE International conference on components circuits devices and systems, 2005, pp.1 - [6] Andraos and H. Ahmed, "A new efficient Memoryless residue to binary converter", IEEE trans. Circuit Syst., vol. CAS-35, pp. 1157-1158,1988. - [7] Sanjeev Kumar and Manoj Kumar," Low Power High Speed 3-2 Compressor" International journal of Electrical, Electronics and Mechanical controls ISSN (online): 2319-7509 - **C. Praise Amulya**: She is currently Studying M.Tech-VLSI Sree Vidyanikethan Engineering College, Tirupati. Her areas of interest are Digital System Design, VLSI Design. - **K. Neelima**: She is currently working as an Assistant Professor in ECE department of Sree Vidyanikethan Engineering College, Tirupati. She has completed M.Tech in VLSI Design, in Satyabhama University. Her research areas are RFIC Design, Digital Design, and VLSI Signal Processing.