-- VolkerStrom - 12 Jun 2007

Festival Development

From the proposal:

Festival will be enhanced (by Strom and Clark) to handle half-phones as well as diphone units. We will also be able to perform the "Slot Machine" type synthesis mentioned in the proposal, where certain units are specified as being required or forbidden in certain target positions. Integration with the new accent-independent multi-pronunciation lexicon will also be undertaken.
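
As a rough illustration of the "Slot Machine" idea quoted above, the sketch below filters the candidate units for each target position by a required set and a forbidden set. It is not Festival code: the names `SlotConstraint`, `filter_candidates` and `constrain_lattice`, the unit IDs, and the representation of candidates as one list per target position are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SlotConstraint:
    """Hypothetical per-position constraint: unit IDs that must or must not be used."""
    required: set = field(default_factory=set)
    forbidden: set = field(default_factory=set)


def filter_candidates(candidates, constraint):
    """Return the candidates that are allowed in one target position."""
    # Drop every unit that is explicitly forbidden in this position.
    allowed = [u for u in candidates if u not in constraint.forbidden]
    # If some units are required, only those may remain.
    if constraint.required:
        allowed = [u for u in allowed if u in constraint.required]
    return allowed


def constrain_lattice(lattice, constraints):
    """Apply constraints to a list of per-position candidate lists.

    `constraints` maps a target position (index) to a SlotConstraint;
    positions without an entry are left unconstrained.
    """
    return [
        filter_candidates(cands, constraints.get(i, SlotConstraint()))
        for i, cands in enumerate(lattice)
    ]


# Usage with made-up unit IDs: forbid "u2" in position 0, require "u5" in position 1.
lattice = [["u1", "u2", "u3"], ["u4", "u5"]]
constraints = {
    0: SlotConstraint(forbidden={"u2"}),
    1: SlotConstraint(required={"u5"}),
}
pruned = constrain_lattice(lattice, constraints)
```

In a real unit-selection system the pruned candidate lists would then go through the usual target-cost and join-cost search. An empty list for some position would mean the constraints cannot be satisfied there.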

I am no longer sure whether we should use half-phones within Festival. I discussed with Matthew his approach to creating diphones from half-phones offline. He says that the phone boundaries coming from HTK are not good enough. The approach therefore requires heuristics for finding good cutting points, and these heuristics depend on the (half-)phone types. In the AT&T synthesis group, Yeon Jun Kim spent almost all his time on improving the phone alignment, not within HTK but with a postprocessor. The details of that postprocessor were secret, but I suspect it was a similar set of heuristics, which are crucial for the half-phone approach. Matthew's approach has the further advantage that one can throw away parts of a phone transition (which is not possible in the AT&T system). -- VolkerStrom - 13 Jun 2007

Topic revision: r2 - 13 Jun 2007 - 12:23:48 - VolkerStrom
 