# **Learning on Silicon: Overview**

## **Gert Cauwenberghs**

Johns Hopkins University gert@jhu.edu

520.776 Learning on Silicon http://bach.ece.jhu.edu/gert/courses/776


## **Learning on Silicon: Overview**

### Adaptive Microsystems

- Mixed-signal parallel VLSI
- Kernel machines

### Learning Architecture

- Adaptation, learning and generalization
- Outer-product incremental learning

### Technology

- Memory and adaptation
  - Dynamic analog memory
  - Floating gate memory
- Technology directions
  - Silicon on Sapphire
- System Examples

## **Massively Parallel Distributed VLSI Computation**



*Example: VLSI analog-to-digital vector quantizer (Cauwenberghs and Pedroni, 1997)*

## Neuromorphic

- distributed representation
- local memory and adaptation
- sensory interface
- physical computation
- internally analog, externally digital

## Scalable

- throughput scales linearly with silicon area

## Ultra Low-Power

- a factor of 100 to 10,000 less energy than a CPU or DSP

# **Learning on Silicon**



## **Adaptation:**

- necessary for robust performance under variable and unpredictable conditions
- also compensates for imprecisions in the computation
- avoids ad-hoc programming, tuning, and manual parameter adjustment

## Learning:

- generalization of output to previously unknown, although similar, stimuli
- system identification to extract relevant environmental parameters

## **Adaptive Elements**

### **Adaptation:**

- Autozeroing (high-pass filtering): outputs
- Offset correction: outputs
  - e.g. image non-uniformity correction: inputs, outputs
- Equalization / deconvolution: inputs, outputs
  - e.g. source separation; adaptive beamforming: inputs, outputs

### **Learning:**

- Unsupervised learning (e.g. Adaptive Resonance; LVQ; Kohonen): inputs, outputs
- Supervised learning (e.g. Least Mean Squares; Backprop): inputs, outputs, targets
- Reinforcement learning: reward/punishment

## **Example: Learning Vector Quantization (LVQ)**




## **Incremental Outer-Product Learning in Neural Nets**

![](_page_6_Figure_1.jpeg)

![](_page_6_Figure_2.jpeg)

**Multi-Layer Perceptron:**

$$x_i = f\Big(\sum_j p_{ij} x_j\Big)$$

**Outer-Product Learning Update:**

$$\Delta p_{ij} = \eta \, x_j \cdot e_i$$

- Hebbian (Hebb, 1949): $e_i = x_i$
- LMS Rule (Widrow-Hoff, 1960): $e_i = f'_i \cdot (x_i^{\text{target}} - x_i)$
- Backpropagation (*Werbos, Rumelhart, LeCun*): $e_j = f'_j \sum_i p_{ij} e_i$
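The three rules share one update form and differ only in how the error term $e_i$ is produced. The following is a minimal software sketch of that idea, not a model of the analog circuits. It assumes tanh units (so $f' = 1 - x^2$), and the function names, learning rate, and training pair are illustrative choices that do not come from the slides.

```python
import math

def forward(p, x_in, f=math.tanh):
    """x_i = f(sum_j p_ij * x_j) for one layer; p is a list of weight rows."""
    return [f(sum(p_ij * x_j for p_ij, x_j in zip(row, x_in))) for row in p]

def outer_product_update(p, x_in, e, eta):
    """Delta p_ij = eta * x_j * e_i, applied in place."""
    for i, e_i in enumerate(e):
        for j, x_j in enumerate(x_in):
            p[i][j] += eta * x_j * e_i

def hebbian_error(x_out):
    """Hebbian rule: e_i = x_i."""
    return list(x_out)

def lms_error(x_out, target):
    """LMS rule: e_i = f'_i * (target_i - x_i), with f' = 1 - x^2 for tanh."""
    return [(1.0 - x * x) * (t - x) for x, t in zip(x_out, target)]

def backprop_error(p, e_out, x_hidden):
    """Backpropagation: e_j = f'_j * sum_i p_ij * e_i (tanh hidden units)."""
    return [(1.0 - x_j * x_j) * sum(p[i][j] * e_out[i] for i in range(len(p)))
            for j, x_j in enumerate(x_hidden)]

# Train a single tanh layer with the LMS rule on one input/target pair
# (hypothetical values chosen only for illustration).
p = [[0.0, 0.0], [0.0, 0.0]]
x_in = [1.0, -0.5]
target = [0.5, -0.5]
for _ in range(2000):
    x_out = forward(p, x_in)
    outer_product_update(p, x_in, lms_error(x_out, target), eta=0.5)
final = forward(p, x_in)
```

Swapping `lms_error` for `hebbian_error` or `backprop_error` changes the learning rule while `outer_product_update` stays the same, which is the point of the outer-product formulation.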


# Technology

## **Incremental Adaptation:**

- Continuous-Time:

$$C \frac{\mathrm{d}}{\mathrm{d}t} V_{\mathrm{stored}} = I_{\mathrm{adapt}}$$

- Discrete-Time:

$$C \Delta V_{\text{stored}} = Q_{\text{adapt}}$$

![](_page_7_Figure_6.jpeg)

## Storage:

- Volatile capacitive storage (incremental refresh)
- Non-volatile storage (floating gate)

## **Precision:**

- Only polarity of the increments is critical (not amplitude).
- Adaptation compensates for inaccuracies in the analog implementation of the system.
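To illustrate why polarity matters more than amplitude, the sketch below adapts a stored value using only the sign of the error, with deliberately mismatched up and down step sizes. It is a toy model under stated assumptions: the integer units, step sizes, and function name are hypothetical and are not taken from the slides.

```python
def adapt_by_polarity(stored, target, step_up, step_down, n_steps):
    """Adapt `stored` toward `target` using only the polarity of the error.

    The up and down step amplitudes may differ, modeling an imprecise
    analog increment/decrement device.
    """
    for _ in range(n_steps):
        error = target - stored
        if error > 0:
            stored += step_up
        elif error < 0:
            stored -= step_down
    return stored

# Integer units (think microvolts) keep the arithmetic exact.
TARGET = 1234
mismatched = adapt_by_polarity(0, TARGET, step_up=50, step_down=40, n_steps=200)
fine = adapt_by_polarity(0, TARGET, step_up=5, step_down=5, n_steps=1000)
```

Despite the 25% amplitude mismatch, the stored value settles to within one step of the target. The residual error is set by the step size, not by how accurately the step is calibrated.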

### **Floating-Gate Non-Volatile Memory and Adaptation**

Paul Hasler, Chris Diorio, Carver Mead, ...

![](_page_8_Figure_2.jpeg)

#### Hot electron injection

- 'Hot' electrons injected from drain onto floating gate of M1.
- Injection current is proportional to drain current and exponential in floating-gate to drain voltage (~5V).

#### Tunneling

- Electrons tunnel through thin gate oxide from floating gate onto high-voltage (~30V) n-well.
- Tunneling voltage decreases with decreasing gate oxide thickness.

#### Source degeneration

- Short-channel M2 improves stability of closed-loop adaptation (Vd open-circuit).
- M2 is not required if adaptation is regulated (Vd driven).

#### Current scaling

- In subthreshold, Iout is exponential both in the floating-gate charge and in the control voltage Vg.

# Dynamic Analog Memory Using Quantization and Refresh

**Autonomous Active Refresh Using A/D/A Quantization:** 

![](_page_9_Figure_2.jpeg)

- Allows for an excursion margin around discrete quantization levels, provided the rate of refresh is sufficiently fast.
- Supports digital format for external access
- Trades analog depth for storage stability
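The excursion margin can be made concrete with a small sketch of a full A/D/A refresh: the stored value is converted to the nearest digital code and regenerated at that code's level. The uniform level spacing, integer units, and names below are assumptions made for illustration.

```python
def ada_refresh(value, spacing):
    """Quantize `value` to the nearest level and regenerate it.

    Returns (code, level): the digital code from the A/D step and the
    analog level produced by the D/A step.
    """
    code = (value + spacing // 2) // spacing   # A/D: nearest digital code
    return code, code * spacing                # D/A: regenerated analog level

SPACING = 100           # hypothetical level spacing, arbitrary integer units
stored = 7 * SPACING    # value originally written at code 7

# Drift smaller than half the spacing between refreshes is fully undone.
code, restored = ada_refresh(stored - 30, SPACING)

# Drift beyond the margin is refreshed to the neighboring level instead.
lost_code, _ = ada_refresh(stored - 60, SPACING)
```

The digital `code` is what makes external access possible, and the second call shows why the refresh rate must be fast enough to keep drift inside the margin.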

## **Binary Quantization and Partial Incremental Refresh**

### **Problems with Standard Refresh Schemes:**

- Systematic offsets in the A/D/A loop
- Switch charge injection (clock feedthrough) during refresh
- Random errors in the A/D/A quantization

#### **Binary Quantization:**

- Avoids errors due to analog refresh
- Uses a charge pump with precisely controlled *polarity* of increments

#### **Partial Incremental Refresh:**

- Partial increments avoid catastrophic loss of information in the presence of random errors and noise in the quantization
- Robustness to noise and errors increases with smaller increment amplitudes

## **Binary Quantization and Partial Incremental Refresh**

![](_page_11_Figure_1.jpeg)

- Resolution $\Delta$
- Increment size $\delta$
- Worst-case drift rate ($|\mathrm{d}p/\mathrm{d}t|$) $r$
- Period of refresh cycle $T$

$$rT < \delta \ll \Delta$$
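A small simulation can show how the condition on $r$, $T$, $\delta$ and the resolution plays out. The sketch below assumes that the binary quantizer takes its decision from the least significant bit of a conversion with one extra bit (bins of half the resolution), so the stable points sit where an increment bin meets a decrement bin. The numeric values, the error model, and all names are hypothetical.

```python
import random

def binary_quantize(p, resolution):
    """Return +1 (increment) or -1 (decrement) from the LSB of a conversion
    with one extra bit, i.e. with bins of width resolution/2."""
    lsb = (p // (resolution // 2)) & 1
    return 1 if lsb == 0 else -1

def partial_refresh(p, resolution, increment, drift, cycles,
                    error_prob=0.0, seed=0):
    """Simulate refresh cycles; returns the stored value after each cycle."""
    rng = random.Random(seed)
    trace = []
    for _ in range(cycles):
        p += drift                          # worst-case drift r*T per period
        q = binary_quantize(p, resolution)
        if rng.random() < error_prob:       # occasional wrong quantizer decision
            q = -q
        p += increment * q                  # partial increment of fixed size
        trace.append(p)
    return trace

RESOLUTION = 1000   # resolution, arbitrary integer units
INCREMENT = 20      # increment size
DRIFT = 5           # r*T, smaller than the increment
LEVEL = 3 * RESOLUTION + RESOLUTION // 2   # a stable point of the quantizer

clean = partial_refresh(LEVEL + 7, RESOLUTION, INCREMENT, DRIFT, cycles=10_000)
noisy = partial_refresh(LEVEL + 7, RESOLUTION, INCREMENT, DRIFT,
                        cycles=10_000, error_prob=0.05)
```

With an error-free quantizer the value stays within one increment of the stable point. With 5% wrong decisions it wanders further but remains well inside its level, because each wrong decision costs only one small increment.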

### **Functional Diagram of Partial Incremental Refresh**

![](_page_12_Figure_1.jpeg)

- Similar in function and structure to the technique of delta-sigma modulation
- Supports efficient and robust analog VLSI implementation, using binary controlled charge pump

### **Analog VLSI Implementation Architectures**

![](_page_13_Figure_1.jpeg)

- An increment/decrement device I/D is provided for every memory cell, serving refresh increments locally.
- The binary quantizer Q is more elaborate to implement, and one instance can be time-multiplexed among several memory cells

## **Charge Pump Implementation of the I/D Device**

![](_page_14_Figure_1.jpeg)

### **Binary controlled polarity of increment/decrement**

- INCR/DECR controls polarity of current

#### Accurate amplitude over wide dynamic range of increments

- EN controls duration of current
- $V_{b \text{ INCR}}$  and  $V_{b \text{ DECR}}$  control amplitude of subthreshold current
- No clock feedthrough charge injection (gates at constant potentials)
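The control signals map onto the discrete-time relation $C \Delta V_{\text{stored}} = Q_{\text{adapt}}$ as follows: INCR/DECR sets the sign, the bias voltage sets the current amplitude, and EN sets the pulse duration. The sketch below is a back-of-the-envelope calculation; the operating point (20 fA, 1 ms, 1 pF) is a hypothetical example and not a value given in the slides.

```python
import math

def charge_pump_step(incr, i_bias, t_enable, capacitance):
    """Voltage step on the storage capacitor for one enabled pulse.

    incr        -- True for INCR, False for DECR (sets polarity only)
    i_bias      -- current amplitude in amperes, set by the bias voltage
    t_enable    -- duration of the EN pulse in seconds
    capacitance -- storage capacitance in farads
    """
    charge = i_bias * t_enable                # Q_adapt = I * t
    polarity = 1.0 if incr else -1.0
    return polarity * charge / capacitance    # Delta V = Q_adapt / C

# Hypothetical operating point: 20 fA for 1 ms onto 1 pF.
step_up = charge_pump_step(True, 20e-15, 1e-3, 1e-12)
step_down = charge_pump_step(False, 20e-15, 1e-3, 1e-12)
```

Under these assumed values the step is about 20 μV. Because the subthreshold current and the pulse duration can each be varied over orders of magnitude, the product covers a wide dynamic range of increments.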

## **Dynamic Memory and Incremental Adaptation**

![](_page_15_Figure_1.jpeg)


### A/D/A Quantizer for Digital Write and Read Access

![](_page_16_Picture_1.jpeg)

#### Integrated bit-serial (MSB-first) D/A and successive-approximation (SA) A/D converter:

- **Partial Refresh:** Q(.) from LSB of (*n*+1)-bit A/D conversion
- **Digital Read Access:** *n*-bit A/D conversion
- **Digital Write Access:** *n*-bit D/A; WR; Q(.) from COMP

### **Dynamic Analog Memory Retention**

![](_page_17_Figure_1.jpeg)

- 10<sup>9</sup> cycles mean time between failures
- 8-bit effective resolution
- 20 μV increments/decrements
- 200 μm × 32 μm in 2 μm CMOS

## Silicon on Sapphire: Peregrine UTSi Process

![](_page_18_Figure_1.jpeg)

- Higher integration density
- Drastically reduced bulk leakage
  - Improved analog memory retention
- Transparent substrate
  - Adaptive optics applications

## The Credit Assignment Problem or How to Learn from Delayed Rewards

![](_page_19_Figure_1.jpeg)

External, discontinuous reinforcement signal r(t).

Adaptive Critics:

- Heuristic Dynamic Programming (Werbos, 1977)
- Reinforcement Learning (Sutton and Barto, 1983)
- TD( $\lambda$ ) (Sutton, 1988)
- Q-Learning (Watkins, 1989)
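As a minimal illustration of learning from a delayed reward, the sketch below runs TD(0), the $\lambda = 0$ case of TD($\lambda$), on a deterministic chain of states where only the final transition is rewarded. The chain task, discount factor, and learning rate are assumptions chosen for clarity and do not describe the hardware systems in this lecture.

```python
def td0_chain(n_states=5, gamma=0.9, alpha=0.5, episodes=200):
    """TD(0) value learning on a deterministic chain with one delayed reward.

    States 0..n_states-1 are visited in order; the transition out of the
    last state yields reward 1 and ends the episode.
    """
    value = [0.0] * (n_states + 1)            # last entry is the terminal state
    for _ in range(episodes):
        for s in range(n_states):
            s_next = s + 1
            reward = 1.0 if s_next == n_states else 0.0
            td_error = reward + gamma * value[s_next] - value[s]
            value[s] += alpha * td_error      # move estimate toward the TD target
    return value[:n_states]

values = td0_chain()
```

The learned values decay geometrically with distance from the reward, so early states receive credit for a reward that arrives several steps later.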

## **Reinforcement Learning Classifier for Binary Control**

![](_page_20_Figure_1.jpeg)

### **Adaptive Optical Wavefront Correction**

with Marc Cohen, Tim Edwards and Mikhail Vorontsov

![](_page_21_Figure_1.jpeg)


### **Gradient Flow Source Localization and Separation**

with Milutin Stanacevic and George Zweig

![](_page_22_Figure_2.jpeg)


# The Kerneltron: Support Vector "Machine" in Silicon

Genov and Cauwenberghs, 2001

![](_page_23_Figure_2.jpeg)