The scripts are in the gmtk_tools CVS tree under gmtk_tools/scripts/triangulateGA_SGE/. You'll need to set an environment variable GMTK_TOOLS to point to the root of your local copy of that tree.
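
Setting the variable might look like the sketch below. The checkout location $HOME/gmtk_tools is an assumption made for illustration, so substitute the root of your own copy; the directory check only confirms that the scripts path mentioned above exists under that root.

```shell
# Hypothetical checkout location: replace with the root of your own copy of the tree.
export GMTK_TOOLS="$HOME/gmtk_tools"

# Report whether the scripts directory named above sits under that root.
if [ -d "$GMTK_TOOLS/scripts/triangulateGA_SGE" ]; then
  echo "scripts found under $GMTK_TOOLS"
else
  echo "scripts not found under $GMTK_TOOLS" >&2
fi
```

Putting the export line in your shell start-up file saves retyping it in every session.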

An example call:

  -strFile PARAMS/timit_training.with_gender.str
  -timingExportLine "qrsh -b y -cwd "
  -timingScript "$GMTK_TOOLS/bin/gmtkTime -probE -fmt1 htk -nf1 39 -ni1 0 -of1 DATA/training_observations.scp -inputMasterFile PARAMS/nonTrainable.with_gender.master -strFile PARAMS/timit_training.with_gender.str -inputTrainableParameters PARAMS/gender_sensitive_models/model_5.gmp"
  -iswp1 T -nf1 39 -ni1 0 -fmt1 htk -of1 DATA/training_observations.scp
  -inputMasterFile PARAMS/nonTrainable.with_gender.master
  -inputTrainableParameters PARAMS/gender_sensitive_models/model_5.gmp
  -outputDirectory genderTrainingTriangulations/
  -parallelism 40
  -seconds 30
  genderTrainingTriangulations/timit_training.with_gender.str.trifile > LOGS/triangulateGA/genderTrainer.stdout 2> LOGS/triangulateGA/genderTrainer.stderr
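
The example redirects stdout and stderr into LOGS/triangulateGA/, and the shell will refuse to start the command if that directory does not exist. A minimal preparation step, using only the directory names from the example call, could be the following. Whether the script creates its -outputDirectory itself is not something I have checked, so creating it up front is simply the cautious choice.

```shell
# Directory names are taken from the example call; adjust them to match your own run.
# mkdir -p creates missing parents and does nothing if the directories already exist.
mkdir -p LOGS/triangulateGA genderTrainingTriangulations
```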

The arguments mean...

  • A lot of the options have the same meaning as they do under gmtk.
  • -seconds tells it how many seconds to allow each timing run to last. Too short and it won't get through many chunk frames or reach the epilogue; too long and the whole search takes too long. 30 seconds made sense for my structure, in that the speed metric (partitions/sec) levelled out there.
  • -parallelism 40 tells it to create 20 timing threads when parallelising the benchmarking: the script divides the number you give by two. I'm not sure why it does that.
  • -long sets a bunch of internal parameters indicating how long the run should take (longer means potentially better). The other options are -medium and -short (the default).
  • You can optionally provide a triangulation to start from, such as genderTrainingTriangulations/timit_training.with_gender.str.trifile in the example above. This could be the output of a previous run.
  • Unless you state -useExistingBoundaries it'll start with a boundary search. The boundary search is also distributed using SGE. You'll need to do a boundary search the first time round but after that you can use the -useExistingBoundaries option.
  • The output turns up in the directory you ran from, named strfile_name.best.trifile. It is updated with the current best triangulation as the script goes on, so you can use one produced halfway through the run if you're feeling impatient.
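
If you do want to grab a triangulation partway through, copying the best trifile to a timestamped name keeps it from being overwritten by later updates. The helper below is only a sketch: snapshot_best and the snapshot directory are names made up for this example, and the path to the .best.trifile is passed in as an argument because its exact name depends on your str file. I have not checked how the script rewrites the file, so a copy taken at the moment of an update could be incomplete.

```shell
# snapshot_best (hypothetical helper): copy the in-progress best triangulation
# to a timestamped file so later updates do not overwrite it.
#   $1 = path to the .best.trifile that the script keeps updating
#   $2 = directory to keep snapshots in (any writable directory)
snapshot_best() {
  best="$1"
  dest="$2"
  # Refuse to snapshot a missing or empty file.
  if [ ! -s "$best" ]; then
    echo "no best trifile yet at $best" >&2
    return 1
  fi
  mkdir -p "$dest" || return 1
  stamp=$(date +%Y%m%d-%H%M%S)
  snap="$dest/$(basename "$best").$stamp"
  # Print the snapshot path on success so callers can use it.
  cp "$best" "$snap" && echo "$snap"
}
```

Calling snapshot_best with the path to your best trifile and a directory of your choice prints the path of the copy, which can then be given to a later run as its starting triangulation.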

-- Main.s0565860 - 19 Jun 2006

Topic revision: r4 - 11 Jul 2006 - 20:00:16 - Main.s0565860