Automatic Target Cost and Database Design for Unit-selection Speech Synthesis -- Project Summary

The databases used in unit-selection speech synthesis cannot contain multiple examples of every unit (e.g. diphone) in every possible context; some units will always be missing. We are not yet able to determine automatically which missing units will cause problems and which will not. A missing unit is not a problem if a perceptually equivalent unit exists elsewhere in the database and we know how to select it based on its linguistic features. Once such cases can be identified automatically, the text-selection algorithm, target cost and back-off strategy can exploit this knowledge in a consistent way. Because these components are so closely related, we will develop a method for jointly selecting the text to be recorded, formulating an optimal target cost for the resulting database, and devising a back-off strategy.

