Short Course

    Fundamentals of Speech Recognition (Summer 2008)

Lectures:

Lecture 1: Overview of Automatic Speech Recognition: (Lecture1.pdf) -- 6 to a page: (Lecture1_6tp.pdf)

 

Lecture 2: Speech Production in Humans: (Lecture2.pdf) -- 6 to a page: (Lecture2_6tp.pdf)

 

Lecture 3: Speech Perception: (Lecture3.pdf) -- 6 to a page: (Lecture3_6tp.pdf)

 

Lecture 4: Fundamentals of Pattern Recognition: (Lecture4.pdf) -- 6 to a page: (Lecture4_6tp.pdf)

 

Lecture 5: Linear Classifiers, Neural Networks: (Lecture5.pdf) -- 6 to a page: (Lecture5_6tp.pdf)

 

Lecture 6: Time and Frequency Domain Processing Methods: (Lecture6.pdf) -- 6 to a page: (Lecture6_6tp.pdf)

 

Lecture 7: Distortion Measures: (Lecture7.pdf) -- 6 to a page: (Lecture7_6tp.pdf)

 

Lecture 8: Time Alignment and Normalization: (Lecture8.pdf) -- 6 to a page: (Lecture8_6tp.pdf)

 

Lecture 9: The Hidden Markov Model: (Lecture9.pdf) -- 6 to a page: (Lecture9_6tp.pdf)

 

Lecture 10: Connected Word Models: (Lecture10.pdf) -- 6 to a page: (Lecture10_6tp.pdf)

 

Lecture 11: Large Vocabulary Speech Recognition: (Lecture11.pdf) -- 6 to a page: (Lecture11_6tp.pdf)

 



Speech Files:

 

test_16k.wav: (test_16.wav)

ah.wav: (ah.wav)

should.wav: (should.wav)

s3.wav: (s3.wav)

s5.wav: (s5.wav)

we_were: (we were away a year ago_lrr.wav)

s3_pitch_file: (pp3.mat)

s5_pitch_file: (pp5.mat)

s1.wav: (s1.wav)

s1_pitch_file: (pp1.mat)

s2.wav: (s2.wav)

s2_pitch_file: (pp2.mat)

s4.wav: (s4.wav)

s4_pitch_file: (pp4.mat)

s6.wav: (s6.wav)

s6_pitch_file: (pp6.mat)

 

tidigits files:

1:(1A.waV), (1B.waV); 2:(2A.waV), (2B.waV); 3:(3A.waV), (3B.waV); 4:(4A.waV), (4B.waV); 5:(5A.waV), (5B.waV)

6:(6A.waV), (6B.waV); 7:(7A.waV), (7B.waV); 8:(8A.waV), (8B.waV); 9:(9A.waV), (9B.waV); oh:(OA.waV), (OB.waV)

zero:(ZA.waV), (ZB.waV)

tidigits training set, endpointed: (isolated_digits_ti_train_endpt.zip)

tidigits testing set, endpointed: (isolated_digits_ti_test_endpt.zip)

 

cepstral coefficient files:

1:(cc_tidig_endpt_1.mat), 2:(cc_tidig_endpt_2.mat), 3:(cc_tidig_endpt_3.mat), 4:(cc_tidig_endpt_4.mat)

5:(cc_tidig_endpt_5.mat), 6:(cc_tidig_endpt_6.mat), 7:(cc_tidig_endpt_7.mat), 8:(cc_tidig_endpt_8.mat)

9:(cc_tidig_endpt_9.mat), oh:(cc_tidig_endpt_O.mat), zero:(cc_tidig_endpt_Z.mat)

 

DTW train and test files:

(train.mat), (test.mat)

 

template files for 11 digits (1-9,oh,zero):

1:(template_isodig_1.mat), 2:(template_isodig_2.mat), 3:(template_isodig_3.mat), 4:(template_isodig_4.mat)

5:(template_isodig_5.mat), 6:(template_isodig_6.mat), 7:(template_isodig_7.mat), 8:(template_isodig_8.mat)

9:(template_isodig_9.mat), oh:(template_isodig_10.mat), zero:(template_isodig_11.mat)

 

lrr digit training and testing files (endpointed files):

    training set: (train.zip); testing set: (test.zip)

    list of training files: (files_lrrdig_isodig_train_endpt.mat)

    list of testing files: (files_lrrdig_isodig_test_endpt.mat)

 

lrr digit training and testing files (unendpointed files):

    training set: (digits_lrr_train_orig.zip)

    testing set: (digits_lrr_test_orig.zip)

 


Matlab Files:

 

loadwav.m: (m file)

savewav.m: (m file)

loadraw.m: (m file)

saveraw.m: (m file)

grayscale.m: (m file)

fxquant.m: (m file)

pspect.m: (m file)

play_file.m: (m file)

 

HW7 LPC Analysis code: (test_lpc.m)

autocorrelation method code: (autolpc.m)

Durbin solution code: (durbin.m)

Cholesky solution code: (cholesky_full.m), (cholesky.m)

lattice solution code: (lattice.m)