Speak! speech synthesis meeting.
The purpose of these regular informal meetings is to discuss and share progress relating to speech synthesis (audio and visual) research - within CSTR specifically as well as in the field generally. Talks are intended to be short and informal, with an emphasis on discussion, interaction and feedback. Relevant references should be sent round in advance to encourage everyone to contribute.
Everybody with an interest in speech synthesis (audio and visual) research is welcome.
- Meetings will typically be held in the Instrumented Meeting Room ("IMR" - room 3.07), on Level 3 of the Informatics Forum building (though this may occasionally vary).
- At present, the standard time for these meetings is Thursdays at 2-3pm.
Speak! synthesis meetings schedule for 2014/2015
| 25.09.14 | Schedule planning meeting (bring ideas + be ready to volunteer!) |
| 02.10.14 | Mirjam - Pronunciation variation for TTS (Kolluru et al, Brognaux et al, Lecumberri et al) |
| 09.10.14 | Rob/Cassie - Talker variability (Bailly & Martin, Luan et al) |
| 16.10.14 | Gustav/Cassia - Postfilter (DNN postfilter, GV) |
| 23.10.14 | Zhizheng - Sequence-based training for DNN (LSTM TTS, Sequence error DNN for VC) |
| 30.10.14 | Evaluation for next Blizzard challenge - child evaluation Example audiobook data - evaluation guidelines (document) |
| 06.11.14 | Simon - waveform synthesis IS140893.PDF (background: IS080193.PDF) |
| 13.11.14 | Shinnosuke Takamichi - Modulation spectrum-based approach to high-quality statistical parametric speech synthesis |
| 20.11.14 | Evaluation guidelines |
| 27.11.14 | no meeting (Christmas lunch) |
| 04.12.14 | Rosie - Spanish evaluation |
| 11.12.14 | Gustav - New loss functions and distributions for speech synthesis |
| 18.12.14 | no meeting |
| 08.01.15 | no meeting |
| 15.01.15 | Planning meeting |
| 22.01.15 | Ruben |
| 29.01.15 | no meeting - NST meeting |
| 05.02.15 | Felipe - vocoder journal paper (here) |
| 12.02.15 | Cassia - Restoring high frequency components from low-sampling-rate speech (paper here) |
| 19.02.15 | Sam - CWT Perceptual Experiments + MOS-MUSHRA Discussion: (notes here) |
| 26.02.15 | Mirjam - A trio of random interesting papers from SLT (Lara Martin et al., Gina-Anne Levow et al., Verena Venek et al.) |
| 05.03.15 | Qiong - ICASSP: Vocaine vocoder paper here & Fusion vocoder; Simon - speech pre-enhancement (paper here, samples here) |
| 12.03.15 | Tom - Interspeech paper |
| 19.03.15 | Rob - Prosody discussion |
| 26.03.15 | Rasmus - presentation of work during Google |
| 02.04.15 | no meeting - Easter |
| 09.04.15 | Gustav - Quality prediction for TTS (1st priority: journal paper, 2nd priority: KLD Interspeech paper) |
| 16.04.15 | No meeting |
| 23.04.15 | Srikanth presentation |
| 30.04.15 | No meeting |
| 07.05.15 | Oliver - hybrid synthesis and speech enhancement |
| 14.05.15 | Cassia - Modelling the waveform using DNNs (paper here) + Planning - ICASSP papers |
| 21.05.15 | A Script for Machine Synthesis |
| 28.05.15 | No meeting (NST meeting) |
| 04.06.15 | Rob - Prosody papers ICASSP |
| 11.06.15 | Korin |
| 18.06.15 | Simon - A Mouth Full Of Words: Visually Consistent Acoustic Redubbing (paper + demo) - just because it's amusing |
| 25.06.15 | No meeting |
| 02.07.15 | No meeting (UK speech) |
| 09.07.15 | Tom |
| 16.07.15 | Zhizheng - the effects of DNN in SPSS (paper) |
| 23.07.15 | no meeting |
| 30.07.15 | Gustav - Waveform-level probabilistic modelling (Achan et al.) |
| 06.08.15 | Summer school summary |
| 13.08.15 | no meeting |
| 20.08.15 | Qiong 3rd year review (10am) |
| 27.08.15 | no meeting |
| 03.09.15 | Interspeech practice talks/posters |
- Recurrent latent variable model for sequential data: paper
Details of suggested papers to read:
Acoustic modelling etc:
- Chunwijitra, Nose & Kobayashi. A speech parameter generation algorithm using local variance for HMM-based speech synthesis. In Proc. Interspeech, 2012
Other:
- Eyben et al.: Unsupervised Clustering of Emotion and Voice Styles for Expressive TTS. In Proc. ICASSP, 2012
Scratchpad for other suggestions for meeting topics:
* Catherine - Talk on quotation work
* Statistical Text-to-Speech Synthesis with Improved Dynamics. Stas Tiomkin, David Malah; Technion IIT, Israel. Proc Interspeech 2008
* Tomoki Toda's work on voice conversion using less than one sentence of speech
* The Expression and Perception of Emotions: Comparing Assessments of Self versus Others. Carlos Busso, Shrikanth S. Narayanan; University of Southern California, USA. In Proc. Interspeech 2008
* Scripted Dialogs versus Improvisation: Lessons Learned About Emotional Elicitation Techniques from the IEMOCAP Database. Carlos Busso, Shrikanth S. Narayanan; University of Southern California, USA. In Proc. Interspeech 2008
* ZZT transform (Dutoit's student thesis) - IEEE journal paper (ACTION ON Matthew to find this paper)
-- Main.korin - 12 Sep 2013
Topic revision: r36 - 25 Sep 2014 - 14:08:05 - Main.mwester