# VLSI DELTA-SIGMA CELLULAR NEURAL NETWORK FOR ANALOG RANDOM VECTOR GENERATION

Gert Cauwenberghs

Department of Electrical and Computer Engineering The Johns Hopkins University, Baltimore, MD 21218-2686 E-mail: gert@bach.ece.jhu.edu

# ABSTRACT

We present a cellular neural network architecture for parallel analog random vector generation, including experimental results from an analog VLSI prototype with 64 channels. Nearest-neighbor coupling between cells produces parallel channels of uniformly distributed random analog values, with statistics that are truly uncorrelated across channels and over time. The cell for each random channel emulates an integrating nonlinearity essentially implementing a delta-sigma modulator, and measures 100  $\mu$ m × 120  $\mu$ m in 2  $\mu$ m CMOS technology. Applications include analog encryption and secure communications, analog built-in self-test, stochastic neural networks, and simulated annealing optimization and learning.

# 1. INTRODUCTION

On-line random analog signal generation is an essential component in many of today's analog VLSI systems for signal or information processing. An on-line supply of random analog vectors comes in handy, for instance, to support testing and characterization of the hardware, or as part of the implemented algorithms. Examples of applications include encryption and secure communications [1], analog VLSI built-in self-test [2], neural computation [3, 4], simulated annealing optimization [5], and stochastic model-free learning [6, 7].

Most commonly used in VLSI are arrays of random binary sources implemented with linear feedback shift registers (LFSR) [8, 9] or cellular automata (CA) [10, 11], which yield compact and scalable parallel VLSI architectures [12]. High-bandwidth, low-power analog noise generators in VLSI are obtained by means of chaotic oscillators [13, 14], or through recursion of a nonlinear map such as the logistic map or a linear congruential map [15, 16].

In this paper, we consider cellular arrays of cascaded delta-sigma modulators for the purpose of random analog vector generation. The particular form of nonlinear coupling between cells not only avoids correlations across cells, but in addition produces a truly random sequence in the sense that the outcome of a cell at a given time is statistically independent of its history. The interactions are nearest-neighbor as in cellular automata, and permit a simple, scalable and parallel VLSI architecture. Our study of these structures is motivated by the remarkable noise-shaping properties observed in "MASH" cascade structures of delta-sigma modulators [17, 18, 19], as used for stable higher-order oversampled A/D conversion [20, 21]. The following section introduces the basic cellular architecture and its variants, and relates delta-sigma modulation to a linear congruential analog version of additive cellular automata. Section 3 presents a compact analog VLSI implementation, and Section 4 includes experimental results from a 64-channel (and 65-channel) CMOS prototype. Finally, Section 5 concludes the paper.

# 2. NONLINEAR NOISE-SHAPING AND CELLULAR ARCHITECTURE

The general structure we consider combines additive cellular automata [10] and cellular neural networks [11], together with linear congruential (*i.e.*, modulo residue) maps or, as shown to be equivalent [18], delta-sigma modulation [21]. The interactions between cells are of the form

$$x_i(k+1) = f(\alpha + \beta \sum_{j \in \mathcal{N}(i)} x_j(k)) \tag{1}$$

where $\mathcal{N}(i)$ defines a neighborhood of cells interacting with cell *i*, including itself, and where f(.) defines a nonlinear map. Besides careful choice of the constants $\alpha$ and $\beta$, the form of f(.) is critical to the randomness properties of the sequence $x_i(k)$. The map f(.) is defined as the quantization residue in single-bit delta-sigma modulation:

$$f(x) = x - \text{sign}(x) = \begin{cases} x - 1 & \text{if } x > 0 \\ x + 1 & \text{if } x < 0 \end{cases} \tag{2}$$

It can be shown [17, 18] that this map is functionally equivalent to a modulo operation, which can be analyzed with the standard rules of residue arithmetic.
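The equivalence with a modulo operation can be checked numerically. The sketch below is illustrative only: it assumes arguments restricted to $-2 \le x < 2$ (the range of a sum of two states in $[-1, 1)$), and it takes sign(0) = +1, a convention that (2) leaves open. The name `f_mod` is hypothetical.

```python
def f(x):
    """Quantization residue of Eq. (2); sign(0) is taken as +1 (assumption)."""
    return x - 1.0 if x >= 0.0 else x + 1.0


def f_mod(x):
    """Modulo form: reduce x into [0, 2), then shift into [-1, 1)."""
    # Python's % returns a result with the sign of the divisor,
    # so negative arguments wrap into [0, 2) as required here.
    return (x % 2.0) - 1.0


# Dyadic test points inside [-2, 2), so the float arithmetic is exact.
samples = [-1.75, -1.0, -0.5, -0.25, 0.0, 0.25, 0.5, 1.0, 1.75]
pairs = [(f(s), f_mod(s)) for s in samples]

for s, (a, b) in zip(samples, pairs):
    print(f"x = {s:+.2f}  f(x) = {a:+.2f}  modulo form = {b:+.2f}")
```

Under these assumptions the two forms agree on every test point, which is why the dynamics can be analyzed with residue arithmetic.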

## 2.1. Cascade and Cellular Structures

The general template of nearest-neighbor interactions (1) allows cellular networks of various topologies to be formulated. The simplest case is a neighborhood of two cells: a cell and one of its neighbors. With $\alpha = 0$ and $\beta = 1$ we obtain

$$x_i(k+1) = f(x_i(k) + x_{i-1}(k)) . \tag{3}$$

This is functionally equivalent to a "MASH" cascade of first-order, single-bit delta-sigma modulators [22], where the quantization "noise" of the integrator of one stage feeds into the next [20]. Cascaded structures of the MASH type are attractive for stable higher-order oversampled A/D conversion since the modulators do not overload and the "noise" does not appear to correlate with the input, at least for constant and sinusoidal inputs [17] and iid random inputs [19].

This work was supported by NSF Career Award MIP-9702346 and ARPA/ONR MURI N00014-95-1-0409. Chip fabrication was provided through MOSIS.

Figure 1: Array of 64 MASH random generating cells. Linear cascaded chain or ring topology implemented on a 2-D grid.

We consider two special cases of boundary conditions for the cascade structure of N cells $x_1 \cdots x_N$: a "chain" topology with a constant input supplied to the first element $x_1$, and a "ring" topology with cyclic boundary conditions where the output of the last element $x_N$ feeds into the input of the first $x_1$ ($x_0 \equiv x_N$ in (3)). The ring structure is preferable because its symmetry provides more uniform random noise properties across the array, although stability of noise-shaping in the feedback loop is an issue which will be addressed below.

The linear cascaded chain and ring topologies can be implemented in scalable cellular VLSI architectures on either a 1-D or a 2-D grid. To realize the chain and ring topologies on a 2-D grid, as shown in Figure 1, two sets of linear cascade segments are interleaved in opposing directions, and external connections at the periphery of the array span no more than two adjacent cell spacings on the grid.

Theory on the statistical and dynamical randomness properties of both topologies is presented in [23]. This paper focuses on circuit architectures and VLSI implementation.
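The chain and ring boundary conditions for (3) can be sketched in a few lines of software. This is an idealized model, not the circuit: it assumes noiseless double-precision states, a synchronous update of all cells, random initial states, and sign(0) = +1. The function names, the seed, and the run length are illustrative choices.

```python
import random


def f(x):
    """Quantization residue of Eq. (2); sign(0) is taken as +1 (assumption)."""
    return x - 1.0 if x >= 0.0 else x + 1.0


def step(x, topology="ring", chain_input=0.0):
    """One synchronous update of Eq. (3) for all cells."""
    new = []
    for i in range(len(x)):
        if i == 0:
            # Ring: x_0 is identified with x_N. Chain: constant input.
            prev = x[-1] if topology == "ring" else chain_input
        else:
            prev = x[i - 1]
        new.append(f(x[i] + prev))
    return new


def simulate(n_cells, n_steps, topology="ring", chain_input=0.0, seed=0):
    """Return the list of state vectors x(0), x(1), ..., x(n_steps)."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n_cells)]
    history = [x]
    for _ in range(n_steps):
        x = step(x, topology, chain_input)
        history.append(x)
    return history


ring = simulate(65, 200, topology="ring")
chain = simulate(65, 200, topology="chain", chain_input=0.0)
print("ring, cell 1, first states: ", [round(row[0], 3) for row in ring[:6]])
print("chain, cell 1, first states:", [round(row[0], 3) for row in chain[:6]])
```

With a zero-level input, the first cell of the chain simply alternates between two values, since (3) reduces to $x_1(k+1) = f(x_1(k))$. Floating-point arithmetic is only a stand-in for the analog state, so long runs of this model need not reproduce measured statistics.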

# 3. IMPLEMENTATION

Analog VLSI implementation offers potentially higher integration density and higher energy efficiency than equivalent digital VLSI implementations. Unlike more conventional analog designs, where high precision and high noise rejection are primary design constraints, the circuits can be implemented with minimum-size, noisy and imprecise components biased at lower currents.

## 3.1. Architecture

We have adopted an implementation style using low-power, high-density switched-capacitor circuits, producing a voltage output format, $V_i(k)$. In principle, compact alternative realizations using current-mode technology, of the type used in [16], can be derived as well.

The switched-capacitor architecture implementing the MASH cell is shown in Figure 2 (a), and the corresponding signal timing diagram is given in Figure 2 (b). The state $V_i(k)$, corresponding to $x_i(k)$, is stored across capacitor $C_2$. To save power and silicon real estate, the amplifier A serves the dual purpose of accumulator $\frac{1}{z-1}$ and quantizer f, controlled by the COMP signal. When COMP is active high, amplifier A compares $V_i$ with zero, and presents the result (sign of $V_i$) to the accumulator input through the inverting one-bit D/A converter I. When COMP is inactive (low), $V_i(k)$ is presented to the output OUT. The accumulator functions as a standard switched-capacitor non-inverting integrator [21], where in the sampling phase the capacitor $C_1$ is precharged to the input, and in the accumulate phase this charge is transferred onto capacitor $C_2$. This operation is controlled with signals ACC, $\phi_1$ and $\phi_2$, repeated twice as shown in Figure 2 (b) (Phases *a-b*, and *c-d*). The input to the accumulator is controlled by the SEL signal, which first selects the output from the preceding stage $V_{i-1}(k)$ presented to IN, and then the output from the comparator. The four-phase operation is summarized as follows:

Figure 2: Switched-capacitor MASH modulator cell. (a) Simplified circuit diagram. (b) Timing diagram.

- *a*: Sample input $V_{i-1}(k)$ from previous stage;
- *b*: Accumulate;
- *c*: Compare with zero, and sample inverted result;
- *d*: Accumulate, yielding $V_i(k+1)$.

Functionally, the first accumulate produces  $x_i(k) + x_{i-1}(k)$ , and the second accumulate subtracts the sign of the first. The net operation thus yields  $f(x_i(k) + x_{i-1}(k))$  as desired.
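The four phases can be traced in a short behavioral sketch, written in normalized units where the state is $x_i$ rather than a voltage. It assumes ideal unity-gain charge transfer and a comparator that resolves a zero input as positive. The function name `mash_cell_cycle` is hypothetical.

```python
def mash_cell_cycle(x_i, x_prev):
    """Behavioral model of one a-b-c-d cycle of the MASH cell."""
    # Phase a: sample the input from the previous stage.
    sampled = x_prev
    # Phase b: accumulate the sampled input onto the stored state.
    acc = x_i + sampled
    # Phase c: compare with zero and sample the inverted one-bit result.
    sign = 1.0 if acc >= 0.0 else -1.0  # zero resolved as positive (assumption)
    sampled = -sign
    # Phase d: accumulate again, yielding the next state.
    return acc + sampled


print(mash_cell_cycle(0.5, 0.25))    # sum is 0.75, the sign is subtracted
print(mash_cell_cycle(-0.5, -0.25))  # sum is -0.75, the sign is subtracted
```

Reusing one accumulator for both steps mirrors the dual use of the amplifier described above. The net result equals $f(x_i(k) + x_{i-1}(k))$ for any sum in $[-2, 2)$.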

## 3.2. CMOS Implementation

The transistor-level circuit diagram of the MASH cell is shown in Figure 3. For low-power operation and compatibility with digital interface circuitry, the circuit uses a single supply  $V_{dd}$ , set to 5 V for the experiments. The signal ground level is set to  $V_m = 2$  V, and the signal range is  $\pm 1$  V as determined by the D/A levels,  $V_{ref}^- = 1$  V and  $V_{ref}^+ = 3$  V, symmetric around  $V_m$ . Thus,

$$V_i(k) = V_m + V_{\text{range}} x_i(k) \tag{4}$$

where  $V_{\text{range}} = 1$  V.
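As a small worked example of (4), the sketch below converts between the normalized state and the cell voltage using the values quoted above ($V_m = 2$ V, $V_{\text{range}} = 1$ V). The function names are illustrative.

```python
V_M = 2.0      # signal ground level, in volts
V_RANGE = 1.0  # half-width of the signal range, in volts


def state_to_volts(x):
    """Eq. (4): map a normalized state x to a cell voltage."""
    return V_M + V_RANGE * x


def volts_to_state(v):
    """Inverse of Eq. (4): map a cell voltage back to a normalized state."""
    return (v - V_M) / V_RANGE


for x in (-1.0, 0.0, 1.0):
    print(f"x = {x:+.1f}  ->  V = {state_to_volts(x):.1f} V")
```

The end points $x = \pm 1$ map onto the D/A reference levels of 1 V and 3 V.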

The amplifier A is implemented as a single, non-cascoded pseudo-nMOS inverter M1-M2. The relatively low gain of this design is adequate for the purpose of a random generator, where linearity and gain errors are less important than power dissipation and size. The virtual ground voltage of the amplifier, used for the precharge in the sampling phase of the accumulator, is obtained from circuit M3-M4, of which the $V_{cn}$ bias is generated from $V_{bp}$. The reason for not precharging directly from the unity-gain connected amplifier is that the accumulator output is needed simultaneously to precharge the next cell, occupying the amplifier. This introduces 1/f noise in the accumulator which otherwise would have been cancelled by a correlated double-sampling technique. Finally, the capacitances $C_1$ and $C_2$ are 0.2 pF in 2 $\mu$m technology, enough to provide adequate matching, and to avoid excessive switch-injection and clock-feedthrough noise contributed by the switches $\phi_1$ and $\phi_2$.

Figure 3: CMOS switched-capacitor circuit diagram of MASH modulator cell.

# 4. EXPERIMENTAL RESULTS

Figure 4 shows a micrograph of the Tiny (2.22 mm $\times$ 2.25 mm) double-metal, double-poly 2 $\mu$m CMOS chip prototyped through MOSIS, which integrates a two-dimensional array of 64 MASH cells configured as shown in Figure 1, plus one extra MASH cell and additional test circuitry. The cell for each random channel measures 100 $\mu$m $\times$ 120 $\mu$m in 2 $\mu$m technology.

All experimental results reported in this paper were obtained from this chip. Experiments were performed on chain and ring topologies with 64 and 65 cells, using the array of  $8 \times 8$  modulators plus the extra modulator.

Figure 4: Micrograph of the 64(+1)-channel VLSI parallel random analog vector generator, including an $8 \times 8$ array of MASH modulators plus one extra modulator. Dimensions are 2.22 mm × 2.25 mm in 2 $\mu$m CMOS.

Figure 5: Transfer characteristic of a single MASH modulator cell.

- *Transfer Characteristics:* The measured transfer characteristic of a single MASH modulator cell, implementing (3), is shown in Figure 5, for a spectrum of input and output voltages in the range of the $[V_{ref}^-, V_{ref}^+]$ interval. The combined gain errors are on the order of 5%, and their effect on the random statistics is evaluated next.
- *Statistics:* The hypothesis of statistical independence $p_{i,j}^{k,l}$ across channels and over time was tested experimentally, as illustrated in Figure 6 for two concurrent neighboring outputs in the ring topology. The scatter-plot shows $x_i(k)$ vs. $x_{i-1}(k)$, corresponding to the joint probability density, which is clearly uniform as expected. For the chain topology, the obtained results were qualitatively similar, except for the first few channels, which showed systematic correlations due to transient effects studied next.
- *Dynamics:* We extensively studied the effect of chain and ring topologies on the transient and steady-state modulation dynamics. Spectral analysis of experimental data over a 1024-point time window, shown in Figure 7, reveals that effects of limit cycle oscillations or other colored spectral features are limited to no more than the first three stages of the chain, and are absent in the ring.
- *System-Level Issues:* The operation of the chip has been verified over a range of speeds from 2 ksamples/s to 50 ksamples/s per channel. The maximum of 50 ksamples/s per channel that we obtained is limited by external capacitive loading of the (multiplexed) output, which has not been buffered. Measurements of supply currents yield power dissipation levels ranging from 16 $\mu$W to 245 $\mu$W per cell, corresponding to 6 nJ of energy dissipated per sample.
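The energy-per-sample figure can be cross-checked against the quoted power and rate ranges. The sketch below assumes that the lowest power level goes with the lowest sample rate and the highest with the highest. That pairing is not stated in the text, so the result is only a bracket around the quoted value.

```python
def energy_per_sample(power_watts, samples_per_second):
    """Energy dissipated per sample, in joules."""
    return power_watts / samples_per_second


# Assumed pairing of the extremes of the power and rate ranges.
slow = energy_per_sample(16e-6, 2e3)    # 16 uW at 2 ksamples/s
fast = energy_per_sample(245e-6, 50e3)  # 245 uW at 50 ksamples/s

print(f"slow end: {slow * 1e9:.1f} nJ per sample")
print(f"fast end: {fast * 1e9:.1f} nJ per sample")
```

Under this assumption the two ends come out near 8 nJ and 5 nJ per sample, on either side of the quoted 6 nJ.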



Figure 6: Oscilloscope X-Y plot showing outputs from two neighboring channels in the 65-channel ring.



Figure 7: Fourier spectra of cell sequences. (a) First cell in the 65-cell chain, with zero-level  $(V_m)$  input. (b) Second Cell. (c) Fourth Cell. (d) 65-cell ring.

# 5. CONCLUSION

Delta-sigma modulation and residue arithmetic have been combined into a cellular network architecture for parallel analog random vector generation, with statistical properties that are unique to the functional form of interactions across cells and not found in arrays of separate analog random generators: statistical independence over time, and across channels. We have presented cellular architectures, and experimentally verified the results on a 65-channel VLSI prototype.

The cell layout supports the integration of over half a million random generators on a 1 cm<sup>2</sup> die in 0.2  $\mu$ m CMOS technology. Besides the excellent statistical properties, the small size and low energy consumption of the random cell make it particularly well suited for large-scale integrated applications of parallel distributed analog signal and information processing, where an on-line supply of random values is embedded locally with each processing element.
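As a rough check on this density estimate, assume that the cell area scales ideally with the square of the feature size. This is an assumption: it ignores wiring, capacitor and design-rule constraints.

$$A_{\text{cell}} \approx (100\,\mu\text{m} \times 120\,\mu\text{m}) \left(\frac{0.2\,\mu\text{m}}{2\,\mu\text{m}}\right)^2 = 120\,\mu\text{m}^2, \qquad \frac{1\,\text{cm}^2}{A_{\text{cell}}} = \frac{10^8\,\mu\text{m}^2}{120\,\mu\text{m}^2} \approx 8.3 \times 10^5 \text{ cells},$$

which is consistent with the quoted figure of over half a million generators.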

# 6. REFERENCES

- [1] L.J. Kocarev, K.S. Halle, K. Eckert, U. Parlitz and L.O. Chua, "Experimental Demonstration of Secure Communications via Chaotic Synchronization," *Int. J. Bifurcations and Chaos*, vol. 2, pp 709-713, 1992.
- [2] V.D. Agrawal, "Testing in a Mixed-Signal World," in Proc. 9th IEEE Int. ASIC Conf., IEEE Press, 1996, pp 241-244.
- [3] J. Alspector, B. Gupta, and R.B. Allen, "Performance of a Stochastic Learning Microchip," in *Advances in Neural Information Processing Systems*, San Mateo, CA: Morgan Kaufmann, vol. 1, pp 748-760, 1989.
- [4] C.A. Mead and M. Ismail, Eds., Analog VLSI Implementation of Neural Systems, Norwell, MA: Kluwer, 1989.
- [5] B.W. Lee and B.J. Sheu, "Hardware Annealing in Electronic Neural Networks," *IEEE T. Circuits and Systems*, vol. 38 (1), pp 134-137, 1991.
- [6] M. Jabri and B. Flower, "Weight Perturbation: An Optimal Architecture and Learning Technique for Analog VLSI Feedforward and Recurrent Multilayered Networks," *IEEE Transactions on Neural Networks*, vol. 3 (1), pp 154-157, 1992.
- [7] G. Cauwenberghs, "An Analog VLSI Recurrent Neural Network Learning a Continuous-Time Trajectory," *IEEE Transactions on Neural Networks*, vol. 7 (2), March 1996.
- [8] R.C. Tausworthe, "Random Numbers Generated by Linear Recurrence Modulo Two," *Math. Computation*, vol. 19, pp 201-209, 1965.
- [9] S.W. Golomb, *Shift Register Sequences*, San Francisco, CA: Holden-Day, 1967.
- [10] S. Wolfram, "Statistical Mechanics of Cellular Automata," *Rev. Modern Phys.*, vol. 55, pp 601-644, 1983.
- [11] P.L. Venetianer, P. Szolgay, K.R. Crounse, T. Roska and L.O. Chua, "Analog Combinatorics and Cellular Automata – Key Algorithms and Layout Design," *Int. J. Circuit Theory and Applications*, vol. 24 (1), pp 145-164, 1996.
- [12] J. Alspector, J.W. Gannett, S. Haber, M.B. Parker and R. Chu, "A VLSI-Efficient Technique for Generating Multiple Uncorrelated Noise Sources and Its Application to Stochastic Neural Networks," *IEEE T. Circ. Syst.*, vol. 38 (1), pp 109-123, 1991.
- [13] A. Rodriguez-Vazquez and M. Delgado-Restituto, "CMOS Design of Chaotic Oscillators Using State Variables – a Monolithic Chua Circuit," *IEEE T. Circuits and Systems II: Analog and Digital Signal Processing*, vol. 40 (10), pp 596-613, 1993.
- [14] L.O. Chua, "Chua's Circuit 10 Years Later," Int. J. Circuit Theory and Applications, 4, pp 279-305, 1994.
- [15] A. Rodriguez-Vazquez, M. Delgado, S. Espejo and J.L. Huertas, "Switched-Capacitor Broad-Band Noise Generator for CMOS VLSI," *Electronics Letters*, vol. 27 (21), pp 1913-1915, 1991.
- [16] M. Delgado-Restituto, F. Medeiro and A. Rodriguez-Vazquez, "Nonlinear Switched-Current CMOS Ic for Random Signal Generation," *Electronics Letters*, vol. 29 (25), pp 2190-2191, 1993.
- [17] W. Chou, P.W. Wong and R.M. Gray, "Multi-Stage Delta-Sigma Modulation," *IEEE T. Info. Theory*, vol. 35 (7), pp 784-796, 1988.
- [18] W. Chou and R.M. Gray, "Modulo Sigma-Delta Modulation," *IEEE T. Commun.*, vol. 40 (8), pp 1388-1395, Aug 1992.
- [19] I. Galton, "Granular Quantization Noise in a Class of Delta-Sigma Modulators," *IEEE T. Info. Theory*, vol. 40 (3), pp 848-859, May 1994.
- [20] T. Hayashi, Y. Inabe, K. Uchimura, T. Kimura, "A Multistage Delta-Sigma Modulator without Double Integration Loop," *ISSCC Tech. Dig. Pap.*, vol. 39, pp 182-183, 1986.
- [21] J.C. Candy and G.C. Temes, "Oversampled Methods for A/D and D/A Conversion," in *Oversampled Delta-Sigma Data Converters*, IEEE Press, pp 1-29, 1992.
- [22] O.J.A.P. Nys and E. Dijkstra, "On Configurable Oversampled A/D Converters," *IEEE J. Solid-State Circuits*, vol. 28 (7), pp 736-742, July 1993.
- [23] G. Cauwenberghs, "VLSI Cellular Array of Coupled Delta-Sigma Modulators for Random Analog Vector Generation," Proc. 31st Asilomar Conf. on Signals, Systems and Computers, Asilomar CA, Nov. 2-5, 1997.