Fundamentals of Speech Recognition Course (Winter 2005)

Lectures:

Basics: (basic course material.pdf): 6 charts to-a-page: (basic course material_6tp.pdf)

Lecture 1: Introduction/Overview of Automatic Speech Recognition: (Lecture 1.pdf): 6 charts to-a-page: (Lecture 1_6tp.pdf)

Lecture 2: Speech Production--acoustic phonetics, articulatory models: (Lecture 2.pdf): 6-to-a-page: (Lecture 2_6tp.pdf)

Lecture 3: Speech Perception--ear models, hearing models, perception models: (Lecture 3.pdf): 6-to-a-page: (Lecture 3_6tp.pdf)

Lecture 4: Fundamentals of Pattern Recognition: (Lecture 4.pdf): 6-to-a-page: (Lecture 4_6tp.pdf)

Lecture 5: Linear and Non-Linear Classifiers: (Lecture 5.pdf): 6-to-a-page: (Lecture 5_6tp.pdf)

Lecture 6: Signal Processing Approaches--acoustic-phonetic methods, pattern recognition methods, statistical methods, artificial neural networks: (Lecture 6.pdf): 6-to-a-page: (Lecture 6_6tp.pdf)

Lecture 7: Signal Processing Methods--temporal, spectral domains: (Lecture 7.pdf): 6-to-a-page: (Lecture 7_6tp.pdf)

Lecture 8: Signal Processing Methods--spectral, cepstral, LPC: (Lecture 8.pdf): 6-to-a-page: (Lecture 8_6tp.pdf)

Lecture 9: Signal Processing Methods--LPC, Vector Quantization, auditory models: (Lecture 9.pdf): 6-to-a-page: (Lecture 9_6tp.pdf)

Lecture 10: Pattern Recognition Applied to ASR--speech detection, distortion measures: (Lecture 10.pdf): 6-to-a-page: (Lecture 10_6tp.pdf)

Lecture 11: Time Alignment and Normalization, Dynamic Time Warping: (Lecture 11.pdf): 6-to-a-page: (Lecture 11_6tp.pdf)

Lectures 12-13: Hidden Markov Model (HMM) Fundamentals: (Lectures 12-13.pdf): 6-to-a-page: (Lectures 12-13_6tp.pdf)

Lectures 14-15: Speech System Design Issues--source coding, template training, discriminative methods: (Lectures 14-15.pdf): 6-to-a-page: (Lectures 14-15_6tp.pdf)

Lecture 16: Connected Word Models--dynamic programming, level building, one-pass methods: (Lecture 16.pdf): 6-to-a-page: (Lecture 16_6tp.pdf)

Lecture 17: Large Vocabulary Speech Recognition--training, basics, language models, perplexity: (Lecture 17.pdf): 6-to-a-page: (Lecture 17_6tp.pdf)

Lecture 18: Flexible Speech Understanding: (Lecture 18.pdf): 6-to-a-page: (Lecture 18_6tp.pdf)


Homework:

Problem Set 1: (PS1s_asr.pdf); PS1 solution: (PS1s_asr_soln.pdf)

Problem Set 2: (PS2s_asr.pdf); PS2 solution: (PS2s_asr_soln.pdf)

Problem Set 3: (PS3s_asr.pdf); PS3 solution: (PS3s_asr_soln.pdf)

Problem Set 4: (PS4_asr.pdf); PS4 solution: (PS4_asr_soln.pdf)

Problem Set 5: (PS5_asr.pdf); PS5 solution: (PS5_asr_soln.pdf)

Problem Set 6: (PS6_asr.pdf); PS6 solution: (PS6_asr_soln.pdf)

Problem Set 7: (PS7_asr.pdf); PS7 solution: (PS7_asr_soln.pdf)


Speech Files:

test_16k.wav: (test_16k.wav)

ah.wav: (ah.wav)

ah_lrr.wav: (ah_lrr.wav)

vowel_ah_100Hz.wav: (vowel_ah_100Hz.wav)

should.wav: (should.wav)

s5.wav: (s5.wav)

we_were: (we were away a year ago_lrr.wav)

s5_pitch_file: (pp5.mat)

isolated digit training files for HW7: (digits_train.zip)

isolated digit testing files for HW7: (digits_test.zip)

isolated digit training files (raw, no endpoints marked): (digits_train_raw.zip)

isolated digit testing files (raw, no endpoints marked): (digits_test_raw.zip)

isolated TI digits files, small subset (unendpointed): (tidigits_isolated_unendpointed.ZIP)

digit cepstral coefficients for VQ: (cepstral_coefficients_digits.ZIP)

train and test files for DTW alignment: (train.mat) and (test.mat)

templates for isolated digits: (templates.ZIP)

single speaker digit training files: (files_lrrdig_isodig_train_endpt.mat)

single speaker digit testing files: (files_lrrdig_isodig_test_endpt.mat)

digit templates for HW6: (templates_digits.zip)

isolated TI digits training files, 8 kHz sampled, endpointed: (isolated_digits_ti_train_endpt.zip)

isolated TI digits testing files, 8 kHz sampled, endpointed: (isolated_digits_ti_test_endpt.zip)

isolated TI digits training files, 16 kHz sampled, unendpointed: (isolated_digits_ti_train.zip)

isolated TI digits testing files, 16 kHz sampled, unendpointed: (isolated_digits_ti_test.zip)


MATLAB Files:

loadwav.m: (loadwav.m)

savewav.m: (savewav.m)

loadraw.m: (loadraw.m)

saveraw.m: (saveraw.m)

grayscale.m: (grayscale.m)

fxquant.m: (fxquant.m)

pspect.m: (pspect.m)

gaussian.m: (gaussian.m)

Project Suggestions:

            General Project Suggestions: (term projects.pdf)

            HMM Model Estimation Project: (HMM Project.pdf)

            HMM Project .mat files (ergodic models): (hmm_observations_ergodic_random.mat), (hmm_observations_ergodic_skewed.mat)

            HMM Project .mat files (left-right models): (hmm_observations_left-rt_random.mat), (hmm_observations_left-rt_skewed.mat)

            Project Schedule (UCSB-2005):