
Speak! speech synthesis meeting.

The purpose of these regular informal meetings is to discuss and share progress in speech synthesis (audio and visual) research, both within CSTR and in the field more generally. Talks are intended to be short and informal, with an emphasis on discussion, interaction and feedback. Relevant references should be circulated in advance so that everyone can contribute.

Everybody with an interest in speech synthesis (audio and visual) research is welcome.

  • Meetings will typically be held in the Instrumented Meeting Room ("IMR", room 3.07) on Level 3 of the Informatics Forum building, though the venue may occasionally vary.
  • At present, the standard time for these meetings is Thursdays at 2-3pm.

Speak! synthesis meetings schedule for 2015/2016

17.09.15 Schedule planning meeting IS15-SpeechSynthesis
24.09.15 no meeting
01.10.15 Felipe - Phase Perception & Source Filter Separation in the Phase Domain
08.10.15 Simon - Fluent Personalized Speech
15.10.15 Gustav - Random Forests for Statistical Speech Synthesis
22.10.15 Srikanth - Thesis proposal / first year review
29.10.15 Sam - Word Embeddings for RNN-based TTS: Wang et al. (2015). Optional: Zhu et al. (2015). Also optional (a very general introduction, if you find it interesting): Olah's blog post (2014).

05.11.15 Korin - Articulatory-based conversion of foreign accent using DNNs
12.11.15 Zhizheng - context clustering for DNN synthesis http://www.isca-speech.org/archive/interspeech_2015/i15_2212.html
19.11.15 Mirjam - How to compare TTS systems & Objective Intelligibility Assessment of TTS
26.11.15 Christophe - Vowel Enhancement for Esophageal Speech & Individuality-Preserving Spectrum Modification for Articulation Disorders using Phone Selective Synthesis
03.12.15 Srikanth - DNN-based speech synthesis for Indian languages from ASCII text & G2P conversion using LSTM networks
10.12.15 No meeting (Christmas lunch)
14.01.16 New year's planning meeting
21.01.16 Mirjam - Vocoding Challenge evaluation
28.01.16 Gustav - TTS with RVMs (Hong et al., 2015)
04.02.16 No meeting (NST in Sheffield)
11.02.16 Kazuhiro Kobayashi - Statistical singing voice conversion based on Gaussian mixture model
18.02.16 Zhizheng - DBN-based TTS features (Hu and Ling, 2016)
25.02.16 Takenori - Predicting Blizzard Challenge naturalness scores using CNNs
03.03.16 Oliver - Evaluating synthetic speech using EEG - Subjective quality ratings and physiological correlates of synthesized speech (Arndt et al. 2013) & EEG oscillations reflect task effects for the change detection in vocal emotion (Chen et al. 2015)
10.03.16 ICASSP practice session (Rasmus, Tom, Zhizheng, Korin, Gustav)
17.03.16 No meeting (ICASSP)
24.03.16 No meeting (ICASSP)
31.03.16 Post-ICASSP planning meeting
07.04.16 Post-ICASSP planning meeting redux
14.04.16 Joachim - Speech intelligibility: Weighted STOI and twin-HMM in resynthesis for STOI
21.04.16 Gustav - GANs + their application to image generation
28.04.16 Korin - Postfiltering by Ling (using DBNs & Modulation Spectrum)
05.05.16 No meeting - (SSW deadline)
12.05.16 Cassia - High-pitched excitation generation for glottal vocoding
19.05.16 Srikanth - Word embeddings for prosody
26.05.16 Felipe - Modelling unvoiced and voiced waveforms with NNs (suggested: Modelling speech waveforms with NNs)
02.06.16 No meeting (No volunteer)
09.06.16 Rasmus - A Deep Auto-Encoder Based Low-Dimensional Feature Extraction from FFT Spectral Envelopes for Statistical Parametric Speech Synthesis
16.06.16 No meeting (Tom's viva + Google Speech Summit)
23.06.16 No meeting
30.06.16 Christophe - Objective assessment of speech intelligibility for pathological voices: Combining phonological and acoustic ASR-free features for pathological speech intelligibility assessment & Towards an ASR-free objective analysis of pathological speech (optional: Evaluation and assessment of speech intelligibility on pathologic voices based upon acoustic speaker models)
07.07.16  
14.07.16 No meeting
21.07.16 No meeting
28.07.16 Oliver - Speaker and language factorisation (Fan et al., 2016)
04.08.16 Srikanth - 2nd Year Review Presentation

Details of suggested papers to read:

Acoustic modelling etc:

  • Chunwijitra, Nose & Kobayashi. A speech parameter generation algorithm using local variance for HMM-based speech synthesis. In Proc. Interspeech, 2012

Other:

  • Eyben et al.: Unsupervised Clustering of Emotion and Voice Styles for Expressive TTS. In Proc. ICASSP, 2012

Scratchpad for other suggestions for meeting topics:

  • Catherine - Talk on quotation work
  • Stas Tiomkin, David Malah (Technion IIT, Israel). Statistical Text-to-Speech Synthesis with Improved Dynamics. In Proc. Interspeech, 2008
  • Tomoki Toda's work on voice conversion using less than one sentence of speech
  • Carlos Busso, Shrikanth S. Narayanan (University of Southern California, USA). The Expression and Perception of Emotions: Comparing Assessments of Self versus Others. In Proc. Interspeech, 2008
  • Carlos Busso, Shrikanth S. Narayanan (University of Southern California, USA). Scripted Dialogs versus Improvisation: Lessons Learned About Emotional Elicitation Techniques from the IEMOCAP Database. In Proc. Interspeech, 2008
  • ZZT transform (Dutoit's student thesis) - IEEE journal paper (ACTION ON Matthew to find this paper)

-- Main.korin - 12 Sep 2013

Speak! meeting schedules

Topic attachments
Attachment            Size      Date                  Who             Comment
Brognaux_IS14.pdf     266.8 K   25 Sep 2014 - 14:07   Main.mwester    pronunciation variation in TTS #2
Kolluru_IS04.pdf      270.2 K   25 Sep 2014 - 14:05   Main.mwester    pronunciation variation TTS #1
Lecumberri_IS14.pdf   434.8 K   25 Sep 2014 - 14:08   Main.mwester    pronunciation variation in TTS #3
collobert-2011.pdf    726.8 K   11 Jun 2012 - 09:09   Main.s0676515
Topic revision: r37 - 17 Sep 2015 - 09:56:55 - Main.cvbotinh
 