520.490 Analog and Digital VLSI Systems

RATIO SPECTRUM FOR SPEECH FEATURE EXTRACTION

Yunbin Deng and Deneah Sanders

yunbin@jhu.edu ; dds1@jhunix.hcf.jhu.edu

Shantanu Chakrabartty, Graduate Advisor

shantanu@bach.ece.jhu.edu

 

 

Six-fold ratio spectrum processor. Layout of the chip (top view).

 

Objectives

This chip extracts a feature of the speech signal, the ratio spectrum, which may be used for speech recognition. In the final implementation, the input is passed through low-pass filters with different cutoff frequencies, and the chip measures the power of each filtered signal and of the original unfiltered signal. Given a ratio, it outputs the corresponding frequency.

 

Specifications

 

The ratio spectrum is defined as:

 

The ratio spectrum is formed by computing, for each filter cutoff frequency, the ratio of the power of the low-pass filtered signal to the power of the original unfiltered signal. It offers several advantages over the traditional power spectrum for efficient implementation. It has been shown that the derivative of the ratio spectrum is exactly the power spectrum when an ideal low-pass filter is used in the construction [1].
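In symbols, the definition in words above can be sketched as follows. The notation is an assumption introduced here, not taken from the original page: S(f) is the one-sided power spectral density of the input, f_c is the filter cutoff frequency, and the low-pass filter is taken to be ideal.

```latex
R(f_c) \;=\; \frac{P_{\mathrm{LP}}(f_c)}{P_{\mathrm{total}}}
       \;=\; \frac{\int_{0}^{f_c} S(f)\,df}{\int_{0}^{\infty} S(f)\,df},
\qquad
\frac{dR}{df_c} \;=\; \frac{S(f_c)}{P_{\mathrm{total}}}
```

Under these assumptions R(f_c) rises monotonically from 0 to 1, so each ratio corresponds to a frequency, and its derivative is the power spectrum scaled by the constant total power.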

Our approach to computing the ratio spectrum departs from the usual approach, which builds an adaptive filter that tracks a target ratio by adapting its cut-off frequency. Using a filterbank with a set of fixed cut-off frequencies, and interpolating the ratio spectrum frequencies from the output powers, gives a faster transient response and intrinsic stability. The system diagram for the interpolating approach is as follows:

 

 

 

 

 

 

Padframe Pinout

 

The input speech signal is represented as a voltage with amplitude in the range 0.6 V to 1.4 V and frequency content in the range 100 Hz to 8 kHz.

The outputs are six frequencies corresponding to six ratios.

 

Pin      Signal
1        Input speech signal
2~17     Vbias1 to Vbias16 (used to tune the cut-off frequencies of the low-pass filters)
18       Vsat (voltage bias for translinear loop, 1 V)
19       Vb2 (voltage bias for translinear loop, 3.96 V)
20       Vref (reference voltage for power computation, 1 V)
21       Vb (comparator voltage bias, 1 V)
22       Vb1 (translinear voltage bias, 3 V)
23       Vb3 (interpolation voltage bias, 550 mV)
24~30    Output frequency channels 1 to 6

 

Results

  1. Low Pass Filter:

The cut-off frequency is tunable from 50 Hz to 10 kHz.

    1. Cadence simulation result:

 

  

 

 

    2. Vbias and cut-off frequency table:

 

Vbias (mV)    Cut-off frequency (Hz)
100           106
140           213.8
160           301.7
180           425
190           505.5
200           598.6
210           708.8
220           843.8
226           934.2
250           1402
280           2337
300           3280
320           4599
330           5448
340           6437
350           7846
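The table can also serve as a lookup for choosing a bias setting. In the tabulated values the cut-off frequency roughly doubles for every 40 mV of Vbias, so the sketch below interpolates linearly in log-frequency. The interpolation scheme and the function name `vbias_for_cutoff` are assumptions made for illustration; they are not part of the original design flow.

```python
import numpy as np

# Simulated values copied from the Vbias / cut-off frequency table.
VBIAS_MV = np.array([100, 140, 160, 180, 190, 200, 210, 220,
                     226, 250, 280, 300, 320, 330, 340, 350], dtype=float)
CUTOFF_HZ = np.array([106, 213.8, 301.7, 425, 505.5, 598.6, 708.8, 843.8,
                      934.2, 1402, 2337, 3280, 4599, 5448, 6437, 7846])

def vbias_for_cutoff(target_hz):
    """Estimate the bias voltage (mV) for a target cut-off frequency.

    Hypothetical helper: interpolates between table rows, linearly in
    log(frequency), and refuses to extrapolate beyond the table.
    """
    if not (CUTOFF_HZ[0] <= target_hz <= CUTOFF_HZ[-1]):
        raise ValueError("target cut-off is outside the tabulated range")
    return float(np.interp(np.log(target_hz), np.log(CUTOFF_HZ), VBIAS_MV))

# Example: bias estimate for a 1 kHz cut-off (lies between the 226 mV and 250 mV rows).
vbias_1k = vbias_for_cutoff(1000.0)
```

Because the helper only interpolates between simulated points, its estimates are a starting point for tuning rather than a calibrated value.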

 

 

 

  2. Absolute Value Simulation (bias voltage: 1 V)

 

 

 

 

 

  3. Power Simulation (switched capacitor integrator)

 

 

  4. Voltage Divider Simulation (six equally spaced "ratios" of total power input)

 

 

 

 

  5. Power Select Circuit Simulation (selection of the two channels with power nearest to the ratio power)

Note: the frequency select circuit is almost identical to the power selection circuit.
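The selection step can be sketched in software as follows. This is only an illustration under stated assumptions: "nearest" is read as the pair of adjacent channels whose powers bracket the reference power, the channel powers are ordered by increasing cut-off frequency and are non-decreasing, and the function name `select_bracketing_channels` is hypothetical.

```python
def select_bracketing_channels(channel_powers, reference_power):
    """Return indices (lo, hi) of adjacent channels bracketing reference_power.

    Assumes channel_powers is ordered by increasing cut-off frequency, so
    the low-pass output powers are non-decreasing.
    """
    for i in range(len(channel_powers) - 1):
        if channel_powers[i] <= reference_power <= channel_powers[i + 1]:
            return i, i + 1
    raise ValueError("reference power is not bracketed by the channel powers")

# Example: four channels, reference power between channels 1 and 2.
powers = [0.1, 0.3, 0.6, 0.9]
lo, hi = select_bracketing_channels(powers, 0.5)
```

The same pair of indices can then pick the two corresponding cut-off frequencies, which mirrors the note that the frequency select circuit is almost identical to the power select circuit.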



 

 

  6. Interpolation Circuit Simulation (linear interpolation, between two given frequencies, of the frequency that corresponds to a reference power lying between the two given power values)
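The interpolation itself is ordinary two-point linear interpolation. A minimal numerical sketch follows; the function and parameter names are hypothetical and the code models only the arithmetic, not the analog circuit.

```python
def interpolate_frequency(f_lo, f_hi, p_lo, p_hi, p_ref):
    """Linearly interpolate the frequency at which the power equals p_ref.

    (f_lo, p_lo) and (f_hi, p_hi) are the cut-off frequency and output
    power of the two selected channels; p_ref is the reference power.
    """
    if p_hi == p_lo:
        raise ValueError("the two channel powers must differ")
    fraction = (p_ref - p_lo) / (p_hi - p_lo)   # position of p_ref in [p_lo, p_hi]
    return f_lo + fraction * (f_hi - f_lo)

# Example: reference power halfway between the two channel powers.
f_mid = interpolate_frequency(1000.0, 2000.0, 0.2, 0.6, 0.4)
```

With the reference power halfway between the two channel powers, the result lies halfway between the two frequencies, about 1500 Hz in this example.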

 

 

 

  7. Interpolated Ratio Spectrum: Top-Level Simulation (combines the outputs of the power selection, frequency selection, and interpolation circuits to yield the ratio spectrum from a spectrum sampled by a filterbank with fixed cut-off frequencies)
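The whole chain can be mimicked offline to see what the top-level output represents. The sketch below is a software stand-in under loud assumptions: an ideal brick-wall low-pass filter implemented with an FFT replaces the chip's analog filters, mean-square value replaces the rectifier and integrator, and `np.interp` performs the channel selection and linear interpolation in one call. All function names and the test signal are hypothetical.

```python
import numpy as np

def lowpass_power(signal, fs, cutoff_hz):
    """Mean power of `signal` after an ideal (brick-wall) low-pass filter."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0              # discard content above the cutoff
    filtered = np.fft.irfft(spectrum, n=len(signal))
    return float(np.mean(filtered ** 2))

def ratio_spectrum_frequencies(signal, fs, cutoffs_hz, target_ratios):
    """Frequencies at which the ratio spectrum reaches each target ratio.

    cutoffs_hz must be increasing, and the resulting ratios are assumed
    to be strictly increasing so that np.interp is well defined.
    """
    signal = np.asarray(signal, dtype=float)
    total_power = float(np.mean(signal ** 2))
    ratios = np.array([lowpass_power(signal, fs, fc) for fc in cutoffs_hz])
    ratios = ratios / total_power
    # Interpolate cut-off frequency as a function of power ratio.
    return np.interp(target_ratios, ratios, cutoffs_hz)

# Example: two equal-amplitude tones at 500 Hz and 2 kHz, one second at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 2000 * t)
freqs_out = ratio_spectrum_frequencies(x, fs, [250.0, 1000.0, 4000.0], [0.25, 0.75])
```

For this test signal the three filterbank ratios are close to 0, 0.5, and 1, so the interpolated frequencies for ratios 0.25 and 0.75 come out near 625 Hz and 2500 Hz. With so few fixed cut-offs the interpolation is coarse; a denser filterbank places the interpolated frequencies closer to the actual tone frequencies.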

 

 

References

1. J. G. Harris and Shao-Jen Lim, "An analog front-end speech processor using the ratio spectrum," Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS 2000), Geneva, 2000.

2. T. Serrano-Gotarredona, B. Linares-Barranco, and A. G. Andreou, "A general translinear principle for subthreshold MOS transistors," IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications.