ListenSemester2201112 (30 Jun 2012, Main.aghoshal)

2011-12 Semester 2
18.01.12 Planning meeting
23.01.12

Discriminative training of long-span LMs (Arnab)

A. Rastrow, M. Dredze, S. Khudanpur, "Efficient Discriminative Training of Long-span Language Models"

Their earlier paper will be useful background reading: A. Rastrow, et al., "Hill climbing on speech lattices: A new rescoring framework"

30.01.12 CSTR's training setup (Mike and Peter)
06.02.12

Reverberant VTS + background (Liang)

Mark Gales: Gales_hscma_2011.pdf

Their ASRU 2011 paper: Wang_RVTS_ASRU11.pdf

13.02.12

Derivative kernels for noise-robust ASR (Liang)

00119.pdf: Ragni_ASRU_2011

and whispered speech recognition (Cheng yu)

20.02.12

Bottleneck features (Steve):

00042.pdf: 'Convolutive Bottleneck Network Features for LVCSR' - Vesely et al, ASRU 2011

00359.pdf: 'Study of Probabilistic and Bottle-Neck Features in Multilingual Environment' - Grezl et al, ASRU 2011

27.02.12

Feature engineering in deep NNs (Pawel)

seide2011_deep.pdf: 'Feature Engineering in Context-Dependent Deep Neural Networks for Conversational Speech Transcription' - Seide et al, ASRU 2011

05.03.12 Postponed: iVectors background (TBC)
12.03.12

iVectors background (Arnab)

Front-End Factor Analysis for Speaker Verification, Dehak et al., IEEE Trans. ASLP 2011.

i-vectors are often used with probabilistic LDA (PLDA), so if we have time we may also read some background on PLDA:

Probabilistic Linear Discriminant Analysis for Inferences About Identity, Prince & Elder, ICCV 2007.

19.03.12 ICASSP runthrough (Erich) + iVector-based discriminative adaptation (TBC)
Break
30.04.12
Planning meeting
07.05.12 Pronunciation modelling (Arnab)

Subword-based Automatic Lexicon Learning for Speech Recognition, Mertens & Seneff, ASRU 2011.

A chunk-based phonetic score for mobile voice search, Prabhavalkar & Droppo, ICASSP 2012.

Learning non-parametric models of pronunciation, Hutchinson & Droppo, ICASSP 2011.

14.05.12

RDLTs (Peter)

RDLTs in multilingual ASR, Karafiat et al, ICASSP 2012

Background reading: Zhang et al

21.05.12

Neural networks 1 (Pawel)

vinyals_2012.pdf: Revisiting Recurrent Neural Networks for Robust ASR, Vinyals et al, ICASSP 2012

Background on Hessian Free optimisation: opt2011_vinyals.pdf

28.05.12

Neural networks 2 (Steve)

Auto-encoder bottleneck features, Sainath et al, ICASSP 2012

Understanding how DBNs perform acoustic modelling, Mohamed et al, ICASSP 2012

04.06.12 No meeting
11.06.12

Noise robustness (Liang)

Feature space VTS: FVTS_ICASSP12.pdf

background on noise adaptive training: Ozlem_ICASSP09_final.pdf

18.06.12

Select diarization papers from ICASSP 2012 (Mark)

Speech Overlap Detection and Attribution Using Convolutive Non-Negative Sparse Coding (priority)

Low-Latency Speaker Diarization Based on Bayesian Information Criterion with Multiple Phoneme Classes

25.06.12 Question time with Dong Yu
02.07.12
Sparse filtering (Arnab):

Sparse Filtering, Ngiam, Koh, Chen, Bhaskar and Ng, NIPS 2011.

09.07.12
Gillick, Gillick and Wegmann papers (Peter):

Don’t Multiply Lightly: Quantifying Problems with the Acoustic Model Assumptions in Speech Recognition, Gillick, Gillick and Wegmann, ASRU 2011.

Discriminative Training for Speech Recognition is Compensating for Statistical Dependence in the HMM Framework, Gillick, Wegmann & Gillick, ICASSP 2012.

16.07.12
The KL-HMM (TBC)
Topic attachments
| *Attachment* | *Size* | *Date* | *Who* | *Comment* |
| 00042.pdf | 205.7 K | 15 Feb 2012 - 20:55 | SteveRenals | 'Convolutive Bottleneck Network Features for LVCSR' - Vesely et al, ASRU 2011 |
| 0004745.pdf | 78.5 K | 30 Jun 2012 - 16:09 | Main.aghoshal | Gillick, Wegmann & Gillick, "Discriminative Training for Speech Recognition is Compensating for Statistical Dependence in the HMM Framework", ICASSP 2012 |
| 0005032.pdf | 136.5 K | 18 Jan 2012 - 15:59 | Main.aghoshal | A. Rastrow, et al. "Hill climbing on speech lattices: A new rescoring framework" |
| 00071.pdf | 134.8 K | 30 Jun 2012 - 15:58 | Main.aghoshal | Gillick, Gillick and Wegmann, "Don’t Multiply Lightly: Quantifying Problems with the Acoustic Model Assumptions in Speech Recognition", ASRU 2011 |
| 00119.pdf | 221.1 K | 09 Feb 2012 - 14:43 | Main.llu | Ragni_ASRU_2011 |
| 00214.pdf | 183.0 K | 18 Jan 2012 - 15:54 | Main.aghoshal | A. Rastrow, M. Dredze, S. Khudanpur, "Efficient Discriminative Training of Long-span Language Models" |
| 00359.pdf | 222.6 K | 15 Feb 2012 - 20:56 | SteveRenals | 'Study of Probabilistic and Bottle-Neck Features in Multilingual Environment', Grezl et al, ASRU 2011 |
| FVTS_ICASSP12.pdf | 233.1 K | 06 Jun 2012 - 09:02 | Main.llu | |
| Gales_hscma_2011.pdf | 308.6 K | 31 Jan 2012 - 11:20 | Main.llu | |
| Ozlem_ICASSP09_final.pdf | 161.4 K | 06 Jun 2012 - 09:03 | Main.llu | |
| Wang_RVTS_ASRU11.pdf | 196.2 K | 31 Jan 2012 - 11:24 | Main.llu | |
| dehak-aslp11-front_end_fa.pdf | 1176.5 K | 06 Mar 2012 - 14:17 | Main.aghoshal | Front-End Factor Analysis for Speaker Verification, Dehak, et al. |
| hutchinson-icassp11-non_param_pron.pdf | 115.4 K | 02 May 2012 - 11:12 | Main.aghoshal | "Learning non-parametric models of pronunciation," Hutchinson & Droppo, ICASSP 2011. |
| mertens-asru12-subword_lex_learn.pdf | 346.1 K | 02 May 2012 - 11:27 | Main.aghoshal | "Subword-based Automatic Lexicon Learning for Speech Recognition," Mertens & Seneff, ASRU 2011. |
| opt2011_vinyals.pdf | 82.1 K | 15 May 2012 - 14:13 | Main.s1136550 | Krylov Subspace Descent |
| prabhavalkar-icassp11-chunk_based.pdf | 117.4 K | 02 May 2012 - 11:19 | Main.aghoshal | "A chunk-based phonetic score for mobile voice search," Prabhavalkar & Droppo, ICASSP 2012 |
| prince-iccv07-plda.pdf | 526.6 K | 06 Mar 2012 - 14:23 | Main.aghoshal | Probabilistic Linear Discriminant Analysis for Inferences About Identity, Prince & Elder, ICCV 2007 |
| seide2011_deep.pdf | 197.1 K | 21 Feb 2012 - 14:58 | Main.s1136550 | Feature Engineering in CD-DNNs for CST |
| vinyals_2012.pdf | 143.9 K | 15 May 2012 - 13:02 | Main.s1136550 | Revisiting Recurrent Neural Networks for Robust ASR |
Topic revision: r21 - 30 Jun 2012 - 20:54:22 - Main.aghoshal