-
Computers & Electrical Engineering. 01/2011; 37:275-284.
-
ABSTRACT: In this work, a fast digital device is defined, which is customized to implement an artificial neuron. Its high computational speed is obtained by mapping data from floating point to integer residue representation, and by computing neuron functions through residue arithmetic operations, with the use of table look-up techniques. Specifically, the logic design of a residue neuron is described, and complexity figures for the area occupancy and time consumption of the proposed device are derived. The approach was applied to the logic design of a residue neuron with 12 inputs and with a Residue Number System (RNS) defined so as to attain an accuracy better than or equal to that of a 20-bit floating point system. The proposed design (NEUROM) exploits the RNS carry independence property to speed up computations; in addition, it is well suited to the use of look-up tables. The response time of the device is about 8 × T(ACC), where T(ACC) is the ROM access time. With a value of T(ACC) close to the 10 ns allowed by current ROM technology, the proposed neuron responds within 80 ns; among the neuron devices proposed in the literature, NEUROM is therefore the one that allows the maximum throughput. Moreover, when a pipeline mode of operation is adopted, the pipeline delay can be as low as about 14 ns. In the case study considered, the total amount of ROM is about 5.55 Mbits. Thus, using current technology, it is possible to integrate several residue neurons into a single VLSI chip, thereby enhancing chip throughput. The paper also discusses how this amount of memory could be reduced, at the expense of the response time.
Neural Networks 04/2005; 18(2):179-89. · 2.18 Impact Factor
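The residue-arithmetic idea in the abstract above can be illustrated with a minimal Python sketch of a neuron's weighted sum computed digit by digit, with multiplications read from precomputed tables. The moduli, the names `MODULI`, `MUL_TABLES`, `to_rns`, `from_rns` and `weighted_sum_rns`, and the restriction to small non-negative integers are all assumptions made for illustration; this is not the NEUROM design, and it omits signed values, scaling and the activation function.

```python
from math import prod

MODULI = (7, 11, 13)      # hypothetical pairwise coprime moduli
M = prod(MODULI)          # dynamic range of the system: 1001

# One multiplication table per modulus, standing in for a ROM look-up.
MUL_TABLES = [
    [[(a * b) % m for b in range(m)] for a in range(m)]
    for m in MODULI
]


def to_rns(value):
    """Map a non-negative integer to its tuple of residue digits."""
    return tuple(value % m for m in MODULI)


def from_rns(digits):
    """Convert residue digits back to an integer in [0, M) via the CRT."""
    total = 0
    for r, m in zip(digits, MODULI):
        partial = M // m
        total += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return total % M


def weighted_sum_rns(inputs, weights):
    """Accumulate sum(x * w) independently in each residue channel."""
    acc = [0] * len(MODULI)
    for x, w in zip(inputs, weights):
        xr, wr = to_rns(x), to_rns(w)
        for k, m in enumerate(MODULI):
            # No carries cross between channels: each digit uses only its own table.
            acc[k] = (acc[k] + MUL_TABLES[k][xr[k]][wr[k]]) % m
    return tuple(acc)


digits = weighted_sum_rns([3, 5, 2], [4, 6, 10])
print(from_rns(digits))  # 3*4 + 5*6 + 2*10 = 62
```

The result is only meaningful while the true sum stays below `M`; a real design would choose the moduli so that the dynamic range covers the required accuracy.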
-
Comput. J. 01/1998; 41:45-51.
-
IEEE Trans. Computers. 01/1993; 42:962-967.
-
IEEE Trans. Computers. 01/1991; 40:873-878.
-
ABSTRACT: High computing speed and modularity have long made RNS-based arithmetic processors attractive, especially in signal processing, where additions and multiplications are very frequent. VLSI technology has renewed this interest because RNS-based circuits are becoming more feasible; however, intermodular operations degrade their performance, and considerable effort has been devoted to this topic. In this paper, we deal with the problem of performing the basic operation X (mod m), that is, the remainder of the integer division X/m, for large values of the integer X, following an approximating and correcting approach, which guarantees the correctness of the result. We also define a structure to compute X (mod m) by means of a few fast VLSI binary multipliers, which is exemplified for 32-bit long numbers, obtaining a total response time lower than 200 ns. Furthermore, such a structure is evaluated in terms of VLSI complexity, and area and time figures A = [log n, √n] are derived. Finally, a simple positional-to-residue converter based on this structure is presented; it improves some complexity results previously obtained by the authors.
Journal of VLSI Signal Processing 03/1990; 1(4):257-264. · 0.73 Impact Factor
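An approximate-then-correct computation of X (mod m) of the kind described in the abstract above can be sketched in Python with a Barrett-style reduction. This is a swapped-in, well-known technique used for illustration only; it is not claimed to be the structure defined in the paper. The names `make_reducer`, `n_bits` and `mu` are hypothetical.

```python
def make_reducer(m, n_bits=32):
    """Return a function computing x mod m for 0 <= x < 2**n_bits."""
    if m <= 0:
        raise ValueError("modulus must be positive")
    # Precomputed integer approximation of 2**n_bits / m.
    mu = (1 << n_bits) // m

    def reduce(x):
        if not 0 <= x < (1 << n_bits):
            raise ValueError("x out of range")
        # First multiplication: approximate quotient, never too large
        # and at most one unit too small for x in the stated range.
        q = (x * mu) >> n_bits
        # Second multiplication: approximate remainder, in [0, 2m).
        r = x - q * m
        # Correction step: guarantees the exact remainder.
        while r >= m:
            r -= m
        return r

    return reduce


mod_65537 = make_reducer(65537)
print(mod_65537(123456789) == 123456789 % 65537)  # True
```

The division by `m` happens once, when the reducer is built; each later reduction costs two multiplications, a shift, and at most one corrective subtraction.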
-
VLSI Signal Processing. 01/1990; 1:257-264.
-
Comput. J. 01/1990; 33:473-475.
-
Computers & Graphics. 01/1986; 10:27-36.
-
Inf. Process. Lett. 01/1984; 18:141-145.
-
ABSTRACT: In many problems, modular exponentiation |x^b|_m is a basic computation, often responsible for the overall time performance, as in some cryptosystems, since its implementation requires a large number of multiplications. It is known that |x^b|_m = |x^(|b|_ϕ(m))|_m for any x in [1, m−1] if m is prime; in this case the number of multiplications depends on ϕ(m) instead of on b. It was also stated that the previous relation holds in the case m = pq, with p and q prime; this case occurs in the RSA method. In this paper it is proved that such a relation holds in general for any x in [1, m−1] when m is a product of any number n of distinct primes, and that it does not hold in the other cases for the whole range [1, m−1]. Moreover, a general method is given to compute |x^b|_m without any hypothesis on m, for any x in [1, m−1], with a number of modular multiplications not exceeding those required when m is a product of primes. Next, it is shown that representing x in a residue number system (RNS) with proper moduli m_i allows |x^b|_m to be computed by n modular exponentiations |x_i^b|_(m_i) in parallel and, in turn, allows b to be replaced by |b|_ϕ(m_i) in the worst case, thus executing a very low number of multiplications, namely ⌈log_2 m_i⌉ for each residue digit. A general architecture is also proposed and evaluated as a possible implementation of the proposed method for modular exponentiation.
Journal of Systems Architecture.
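The exponent reduction and the per-digit RNS exponentiation mentioned in the abstract above can be tried out with the following Python sketch. It assumes b ≥ 1 and a modulus that is a product of distinct primes, and it maps a zero reduced exponent to ϕ(m) so that values of x sharing a factor with m are handled; that convention, and the names `phi`, `reduced_pow` and `rns_pow`, are assumptions of this sketch rather than details taken from the paper.

```python
def phi(m):
    """Euler's totient by trial division (adequate for small m)."""
    result, n, p = m, m, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result


def reduced_pow(x, b, m):
    """x**b mod m with the exponent reduced modulo phi(m).

    Assumes b >= 1 and m is a product of distinct primes.
    """
    t = phi(m)
    e = b % t or t          # assumption: keep the exponent in [1, phi(m)]
    return pow(x, e, m)


def rns_pow(x, b, prime_moduli):
    """x**b mod M, M = product of distinct primes, one residue digit at a time."""
    big_m = 1
    for mi in prime_moduli:
        big_m *= mi
    result = 0
    for mi in prime_moduli:
        e = b % (mi - 1) or (mi - 1)     # phi(mi) = mi - 1 for prime mi
        digit = pow(x % mi, e, mi)       # independent per-digit exponentiation
        partial = big_m // mi
        result += digit * partial * pow(partial, -1, mi)  # CRT recombination
    return result % big_m


print(reduced_pow(2, 1000, 35) == pow(2, 1000, 35))       # True
print(rns_pow(10, 77, (3, 5, 7)) == pow(10, 77, 105))     # True
```

In hardware the per-digit exponentiations would run in parallel; the loop here only shows that each digit is computed without reference to the others.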
-
ABSTRACT: The parallelism of computation that characterizes some operations in residue number systems (RNS) is heavily reduced in operations such as division, magnitude comparison and sign detection, since numbers must be converted to the weighted system, which reduces efficiency in spite of the efforts to speed up the conversion. In this work the problem of detecting the sign of numbers represented in RNS is considered, and a procedure is devised which keeps numbers in residue notation and requires a redundant modulus m_(p+1) ⩾ 2. A sign detecting circuit is also designed that, merely to speed up the operation, exploits a further redundant modulus m_r ⩾ p in the signed number representation. Circuit response time is evaluated, both from the complexity point of view and in a finite case, where 50 gate delays are estimated for a range [−2^64, 2^64 − 1].
Journal of Systems Architecture.
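For context, the conversion-based sign detection that the abstract above describes as inefficient can be sketched in Python as follows. This is the conventional baseline (reconstruct the weighted value, then compare it with half the range), not the redundant-modulus procedure proposed in the paper; the moduli and the names `MODULI`, `encode` and `is_negative` are hypothetical.

```python
MODULI = (3, 5, 7)        # hypothetical pairwise coprime moduli
M = 3 * 5 * 7             # 105, so signed values lie in [-52, 52]


def encode(x):
    """Encode a signed integer as residues; negatives are stored as x + M."""
    if not -(M // 2) <= x <= M // 2:
        raise ValueError("value outside the representable range")
    return tuple(x % m for m in MODULI)   # Python's % is non-negative here


def is_negative(digits):
    """Detect the sign by full conversion to the weighted system (CRT)."""
    value = 0
    for r, m in zip(digits, MODULI):
        partial = M // m
        value += r * partial * pow(partial, -1, m)
    value %= M
    # The upper half of [0, M) holds the negative numbers.
    return value > M // 2


print(is_negative(encode(-17)))  # True
print(is_negative(encode(17)))   # False
```

The CRT sum couples all residue digits, which is exactly the loss of parallelism that motivates keeping the sign test in residue notation.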