-- Main.matthewa - 18 Aug 2006

Evaluation of states v phones v silence

Ergodic model with 45 states, seeded with means from k-means computed over the frames regarded as speech by SLPA. Parameterised as 13-dimensional MFCC vectors (12 MFCCs + 1 energy).
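As a rough illustration of the seeding step, here is a minimal sketch of plain Lloyd's k-means producing one mean vector per state, plus a fully connected transition matrix. The function names, the random initialisation, and the uniform starting transition probabilities are assumptions made for the example; the notes above do not say which k-means variant or which initial transitions were actually used.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k mean vectors as lists of floats."""
    rng = random.Random(seed)
    # Assumption: initialise the means from k randomly chosen data points.
    means = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assign each point to its nearest mean (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, means[i])),
            )
            clusters[nearest].append(p)
        # Recompute each mean; keep the old mean if a cluster is empty.
        for i, members in enumerate(clusters):
            if members:
                means[i] = [sum(col) / len(members) for col in zip(*members)]
    return means

def ergodic_transitions(n_states):
    """Fully connected transition matrix; uniform start is an assumption."""
    return [[1.0 / n_states] * n_states for _ in range(n_states)]
```

For the setup described here, `points` would be the 13-dimensional speech frames and `k` would be 45, giving one seed mean per state.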

Used 110 nina utterances (Cereproc nin_x0001_*).

Applied the SLPA tools to this set.

Used the Cereproc HTK alignment of the data for the phone labels.

The results below are for 10 ms frames, broken down by phone. The first two columns show the proportion of frames (0 to 1) that SLPA categorised as speech (S) and as silence (#). SLPA does much better on initial and final pauses (sil, 0.99 of frames categorised as silence) than on short pauses (CPRCsp, 0.87).
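The speech and silence columns can be thought of as a per-phone frame count. The sketch below assumes a phone alignment given as `(start_sec, end_sec, phone)` tuples and one SLPA speech/silence decision per 10 ms frame; both input formats and the function name are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

FRAME_SEC = 0.010  # 10 ms frames, as in the results

def speech_silence_by_phone(alignment, slpa_is_speech):
    """Return {phone: (speech_proportion, silence_proportion)}.

    alignment: list of (start_sec, end_sec, phone) tuples (assumed format).
    slpa_is_speech: list of bools, one per frame (assumed format).
    """
    counts = defaultdict(lambda: [0, 0])  # phone -> [speech, silence]
    for start, end, phone in alignment:
        first = round(start / FRAME_SEC)
        last = min(round(end / FRAME_SEC), len(slpa_is_speech))
        for t in range(first, last):
            counts[phone][0 if slpa_is_speech[t] else 1] += 1
    return {
        phone: (s / (s + n), n / (s + n))
        for phone, (s, n) in counts.items()
        if s + n > 0
    }
```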

The next column is the proportion of the phone's frames accounted for by the four states most frequently associated with it. The remaining four columns break this total down, listing those four states by name in descending order of frequency, each with its own proportion.

For example, sil and CPRCsp are both most closely associated with s24 (0.86 and 0.43 of their frames respectively).
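The four-state total and its breakdown amount to a frequency count over the state labels of one phone's frames. A minimal sketch, assuming the decoded state for each frame of the phone is available as a list of state names (the function name and input format are hypothetical):

```python
from collections import Counter

def top_state_summary(state_labels, n_top=4):
    """Return (total_proportion, [(state, proportion), ...]) for the
    n_top most frequent states among one phone's frames."""
    counts = Counter(state_labels)
    total = sum(counts.values())
    top = [(state, c / total) for state, c in counts.most_common(n_top)]
    return sum(p for _, p in top), top
```

Applied to every phone in turn, this would yield one table row per phone: the total first, then the four states in descending order of frequency.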

phone SLPA_S SLPA_#   four_state_tot state_1 state_2 state_3 state_4
@ 1.00 0.00   0.60 s11:0.33 s20:0.14 s31:0.07 s17:0.06
@@ 1.00 0.00   0.68 s20:0.28 s10:0.24 s11:0.09 s16:0.07
CPRCsp 0.13 0.87   0.84 s24:0.43 s12:0.20 s08:0.16 s04:0.06
a 1.00 0.00   0.83 s16:0.41 s10:0.19 s20:0.16 s34:0.08
aa 0.99 0.01   0.86 s33:0.61 s35:0.16 s38:0.05 s18:0.04
ai 0.99 0.01   0.78 s33:0.28 s10:0.23 s20:0.20 s31:0.07
au 1.00 0.00   0.84 s16:0.40 s10:0.22 s35:0.13 s33:0.10
b 0.76 0.24   0.60 s45:0.27 s43:0.16 s46:0.10 s24:0.08
ch 0.94 0.06   0.83 s07:0.67 s45:0.07 s43:0.05 s46:0.04
d 0.92 0.08   0.69 s46:0.28 s45:0.20 s05:0.12 s11:0.08
dh 0.88 0.12   0.60 s46:0.26 s11:0.18 s45:0.10 s43:0.07
e 1.00 0.00   0.73 s20:0.49 s31:0.09 s42:0.08 s16:0.07
e@ 1.00 0.00   0.88 s20:0.65 s16:0.15 s34:0.04 s10:0.04
ei 1.00 0.00   0.95 s23:0.41 s31:0.41 s20:0.11 s42:0.02
f 1.00 0.00   0.82 s43:0.44 s46:0.23 s34:0.12 s21:0.03
g 0.86 0.14   0.56 s45:0.25 s46:0.15 s14:0.10 s23:0.06
h 0.91 0.09   0.65 s34:0.38 s23:0.14 s07:0.07 s16:0.07
i 0.98 0.02   0.86 s23:0.37 s31:0.24 s11:0.20 s42:0.06
i@ 1.00 0.00   0.87 s31:0.53 s23:0.18 s20:0.12 s11:0.05
ii 1.00 0.00   0.93 s23:0.83 s31:0.05 s46:0.04 s45:0.02
jh 0.95 0.05   0.82 s07:0.56 s46:0.11 s11:0.08 s45:0.08
k 0.85 0.15   0.55 s14:0.18 s45:0.15 s43:0.15 s21:0.08
l 1.00 0.00   0.63 s44:0.25 s18:0.15 s11:0.13 s17:0.09
m 0.98 0.02   0.80 s26:0.56 s11:0.10 s17:0.09 s41:0.05
n 1.00 0.00   0.86 s41:0.53 s26:0.15 s27:0.10 s46:0.07
ng 1.00 0.00   0.93 s26:0.75 s23:0.13 s36:0.03 s46:0.03
o 0.99 0.01   0.75 s33:0.36 s18:0.23 s38:0.09 s44:0.06
oi 1.00 0.00   0.66 s06:0.26 s18:0.19 s17:0.11 s44:0.10
oo 1.00 0.00   0.92 s06:0.56 s44:0.25 s17:0.08 s18:0.04
ou 1.00 0.00   0.78 s20:0.31 s31:0.19 s42:0.15 s23:0.12
p 0.84 0.16   0.60 s43:0.26 s45:0.20 s21:0.08 s04:0.07
r 0.99 0.01   0.69 s17:0.37 s11:0.16 s20:0.13 s10:0.03
s 0.94 0.06   0.90 s37:0.64 s46:0.18 s05:0.04 s21:0.03
sh 1.00 0.00   0.97 s07:0.89 s23:0.04 s11:0.03 s43:0.01
sil 0.01 0.99   0.95 s24:0.86 s12:0.04 s08:0.03 s04:0.02
t 0.87 0.13   0.56 s05:0.23 s46:0.14 s45:0.10 s21:0.09
th 0.98 0.02   0.77 s46:0.31 s43:0.22 s21:0.16 s34:0.08
u 1.00 0.00   0.55 s11:0.17 s44:0.14 s17:0.12 s42:0.12
u@ 1.00 0.00   0.88 s11:0.43 s18:0.22 s17:0.12 s06:0.11
uh 0.99 0.01   0.71 s33:0.32 s10:0.19 s18:0.12 s38:0.08
uu 1.00 0.00   0.89 s23:0.66 s11:0.12 s17:0.07 s46:0.04
v 0.98 0.02   0.74 s46:0.48 s11:0.10 s45:0.08 s43:0.08
w 0.92 0.08   0.57 s17:0.19 s44:0.14 s06:0.13 s18:0.11
y 0.97 0.03   0.93 s23:0.79 s46:0.06 s11:0.04 s07:0.04
z 0.97 0.03   0.89 s46:0.51 s37:0.33 s21:0.03 s15:0.02
zh 1.00 0.00   0.98 s07:0.69 s23:0.15 s11:0.11 s46:0.02
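To query the table, for instance to find which phones are dominated by a given state, the rows can be parsed as whitespace-separated fields. This is a small sketch; `parse_row`, `phones_led_by`, and the dictionary keys are names invented for the example, and the sample rows are copied from the table above.

```python
SAMPLE_ROWS = [
    "CPRCsp 0.13 0.87   0.84 s24:0.43 s12:0.20 s08:0.16 s04:0.06",
    "sh 1.00 0.00   0.97 s07:0.89 s23:0.04 s11:0.03 s43:0.01",
    "sil 0.01 0.99   0.95 s24:0.86 s12:0.04 s08:0.03 s04:0.02",
]

def parse_row(line):
    """Split one table row into phone, SLPA proportions, and top states."""
    fields = line.split()
    states = []
    for field in fields[4:]:
        name, proportion = field.split(":")
        states.append((name, float(proportion)))
    return {
        "phone": fields[0],
        "speech": float(fields[1]),
        "silence": float(fields[2]),
        "top4": float(fields[3]),
        "states": states,
    }

def phones_led_by(rows, state):
    """Phones whose most frequent associated state is `state`."""
    return [r["phone"] for r in rows if r["states"][0][0] == state]

rows = [parse_row(line) for line in SAMPLE_ROWS]
print(phones_led_by(rows, "s24"))
```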


Topic revision: r1 - 18 Aug 2006 - 14:55:35 - Main.matthewa
 