CMU Sphinx Training and Testing

This setup is designed to train CMU Sphinx with the different language models obtained from SRILM; however, the data are not sufficient for this. During experiments with the Sphinx training procedure, we discovered that Maximum Mutual Information Estimation (MMIE) training is the only configuration that takes the language model into account during training. In every other configuration, only decoding (testing) uses the language model.
MMIE training requires a minimum amount of training data, such as more than 1... If there isn't enough training data, the model may perform well on the training data but poorly on the testing data.

The following packages are required for training:

    sphinxbase-0...
    SphinxTrain-0...

Inside the (an4) directory, you first have to run the setup step.
At this stage only two of the folders above are present; the others are created during the training process. Run the setup as follows:

    On Linux:
        sphinxtrain -t an4 setup
    On Windows:
        python ./sphinxtrain/scripts/sphinxtrain -t an4 setup

The setup step copies two files inside an4 (feat.params and the configuration file sphinx_train.cfg).

The most crucial part of training is setting up the configuration file. This file contains all the parameters that can be changed to suit the intended application and the form of the database. Using the configuration file you should do the following:

    1. Define the audio format.
    2. Configure the feature parameters.
    3. Configure vocal tract normalization.
    4. Configure the paths to files.
    5. Configure the HMM type and parameters.
    6. Configure Viterbi alignment and beam width.
    7. Finally, set the decoding options for testing the model.
After putting the required settings in sphinx_train.cfg, run the training:

    On Linux:
        sphinxtrain run
    On Windows (if you are still inside the database folder):
        python ./sphinxtrain/scripts/sphinxtrain run
This command runs both the training and the testing ("decoding") phases. The testing phase is controlled by the variables whose names start with "DEC" inside the configuration file. The following code shows the decoding phase configuration:

    $DEC_CFG_VERBOSE = 1;  # Determines how much goes to the screen.
    # These are filled in at configuration time
    # Name of the decoding script to use (psdecode...
    $DEC_CFG_SCRIPT = 'psdecode...
    $DEC_CFG_EXPTNAME = "$CFG_EXPTNAME";
    $DEC_CFG_JOBNAME = "$CFG_EXPTNAME"."_job";
    # Models to use.
    $DEC_CFG_MODEL_NAME = "$CFG_EXPTNAME...${CFG_DIRLABEL}_${CFG_N_TIED_STATES}";
    $DEC_CFG_FEATFILES_DIR = "$CFG_BASE_DIR/feat";
    $DEC_CFG_FEATFILE_EXTENSION = '...
    $DEC_CFG_VECTOR_LENGTH = $CFG_VECTOR_LENGTH;
    $DEC_CFG_AGC = $CFG_AGC;
    $DEC_CFG_CMN = $CFG_CMN;
    $DEC_CFG_VARNORM = $CFG_VARNORM;
    $DEC_CFG_QMGR_DIR = "$CFG_BASE_DIR/qmanager";
    $DEC_CFG_LOG_DIR = "$CFG_BASE_DIR/logdir";
    $DEC_CFG_MODEL_DIR = "$CFG_MODEL_DIR";
    $DEC_CFG_DICTIONARY = "$CFG_BASE_DIR/etc/$CFG_DB_NAME1...
    $DEC_CFG_FILLERDICT = "$CFG_BASE_DIR/etc/$CFG_DB_NAME1...
    $DEC_CFG_LISTOFFILES = "$CFG_BASE_DIR/etc/${CFG_DB_NAME1}_test...
    $DEC_CFG_TRANSCRIPTFILE = "$CFG_BASE_DIR/etc/${CFG_DB_NAME1}_test...
    $DEC_CFG_RESULT_DIR = "$CFG_BASE_DIR/result";
    # These variables, used by the decoder, have to be user defined, and
    # may affect the decoder output
    $DEC_CFG_LANGUAGEMODEL = "$CFG_BASE_DIR/etc/${CFG_DB_NAME1}...DMP";
    $DEC_CFG_LANGUAGEWEIGHT = "1...
    $DEC_CFG_BEAMWIDTH = "1e-8...
    $DEC_CFG_WORDBEAM = "1e-4...
    $DEC_CFG_ALIGN = "builtin";
    $DEC_CFG_NPART = 1;  # Define how many pieces to split decode in.

Training is done only once, without a language model, using three states per HMM.
Then the unigram, bigram and trigram language models are tested using the following command inside the database directory:

    On Linux:
        sphinxtrain -s decode run
    On Windows (if you are still inside the database folder):
        python ./sphinxtrain/scripts/sphinxtrain -s decode run

The language model is switched for each run by editing the configuration file sphinx_train.cfg. For example, the following line is changed:

    $DEC_CFG_RESULT_DIR = "$CFG_BASE_DIR/new_result";

This setting changes the result folder to "new_result" rather than "result". The output of this process is used to build the acoustic model for CMU Sphinx.
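Switching the language model and the result folder between decoding runs can be scripted instead of edited by hand. The following is only a sketch of one way to do that, not part of SphinxTrain: it rewrites Perl-style assignments in the text of the configuration file. The helper name set_cfg_value, the sample configuration text and the model file names are hypothetical.

```python
import re


def set_cfg_value(cfg_text, variable, new_value):
    """Return cfg_text with the assignment `$variable = ...;` replaced.

    Only the value up to the first semicolon on the line is replaced, so a
    trailing comment is kept. Raises KeyError if the variable is never assigned.
    """
    pattern = re.compile(
        r"^(\s*\$" + re.escape(variable) + r"\s*=\s*)[^;\n]*;", re.MULTILINE
    )

    def replacement(match):
        # Built with a function so "$" and "\" in new_value are taken literally.
        return match.group(1) + '"' + new_value + '";'

    new_text, count = pattern.subn(replacement, cfg_text)
    if count == 0:
        raise KeyError(variable)
    return new_text


# Hypothetical excerpt of a configuration file (file names are assumptions).
cfg = (
    '$DEC_CFG_RESULT_DIR = "$CFG_BASE_DIR/result";\n'
    '$DEC_CFG_LANGUAGEMODEL = "$CFG_BASE_DIR/etc/model.lm.DMP";\n'
)

for order in ("unigram", "bigram", "trigram"):
    run_cfg = set_cfg_value(
        cfg, "DEC_CFG_LANGUAGEMODEL", "$CFG_BASE_DIR/etc/%s.lm.DMP" % order
    )
    run_cfg = set_cfg_value(
        run_cfg, "DEC_CFG_RESULT_DIR", "$CFG_BASE_DIR/result_%s" % order
    )
    # run_cfg would now be written back to the configuration file before
    # running the decode command shown above.
```

Keeping a separate result folder per language model means the results of one run are not overwritten by the next.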
Sphinx-4 Configuration Management

Managing the Sphinx Configuration

The Sphinx-4 configuration manager system has two primary purposes:

- Determining which components are to be used in the system. The Sphinx-4 system is designed to be extremely flexible. At runtime, just about any component can be replaced with another. For example, in Sphinx-4 the FrontEnd component provides acoustic features.
Typically, Sphinx-4 is configured with a FrontEnd that produces Mel frequency cepstral coefficients (MFCCs); however, it is possible to reconfigure Sphinx-4 to use a different FrontEnd that, for instance, produces Perceptual Linear Prediction (PLP) coefficients. The Sphinx-4 configuration manager is used to configure the system in this way.

- Determining the detailed configuration of each of these components. The Sphinx-4 system is like most speech recognition systems in that it has a large number of parameters that control how it behaves. For instance, a beam width is sometimes used to limit the number of active search paths. A larger value for this beam width can improve accuracy at the cost of a slower search. The Sphinx-4 configuration manager is used to set such parameters.

The Configuration File

The configuration of a particular Sphinx-4 system is determined by a configuration file. This configuration file defines the following:

- The names and types of the components of the system
- The connectivity of these components - that is, which components talk to each other
- The detailed configuration for each of these components
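To make the three things a configuration file defines more concrete, here is a minimal sketch of how they could be modelled as data. It is an illustration only, not how Sphinx-4 stores its configuration: the class ComponentSpec, the function connections, and the component names and types in the example are all hypothetical. Connectivity is assumed to be expressed by a property whose value is the name of another component.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentSpec:
    """One configured component: its name, its type, and its settings."""
    name: str
    type: str
    properties: dict = field(default_factory=dict)


def connections(specs):
    """List (component, property, target) triples where a property value
    is the name of another configured component."""
    names = {spec.name for spec in specs}
    links = []
    for spec in specs:
        for prop, value in spec.properties.items():
            if value in names and value != spec.name:
                links.append((spec.name, prop, value))
    return links


# Hypothetical system: a decoder that uses a front end.
specs = [
    ComponentSpec("decoder", "example.Decoder",
                  {"frontEnd": "frontEnd", "beamWidth": "100"}),
    ComponentSpec("frontEnd", "example.FrontEnd", {"sampleRate": "16000"}),
]

print(connections(specs))
```

Here the decoder's frontEnd property is the only value that names another component, so it is the only connection reported; beamWidth and sampleRate are plain detailed settings.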
Let's take a look at a simple configuration file:

    <config>
        <component name="...SampleComponent" type="edu...MyComponent"/>
    </config>

Some things to note about this configuration file:

- The format of the file is XML.
- This configuration file defines a single component called ...SampleComponent.
- The type of this component is edu...MyComponent, which must implement the Configurable interface.

Defining components

Now let's look at a somewhat more complex configuration file:

    <config>
        <component name="...SampleComponent" type="edu...MyComponent"/>
        <component name="anotherComponent" type="edu...MyComponent"/>
        <component name="aDifferentComponent" type="edu...YourComponent"/>
    </config>

This configuration file defines three components, two of the type MyComponent, and one of the type YourComponent. The two components with the same type are distinguished by their names. The data types MyComponent and YourComponent are, of course, Java classes.

Now let's look at a section from a real configuration file:

    <component name="..." type="...DiscreteCosineTransform"/>
    <component name="batchCMN" type="edu.cmu...BatchCMN"/>
    <component name="liveCMN" type="edu.cmu...LiveCMN"/>
    <component name="featureExtraction" type="edu...DeltasFeatureExtractor"/>

Here we see some of the components used in the front end of Sphinx-4.

Defining configuration data

So far we've shown how to define new components in the configuration file. Now let's take a look at how we define the detailed configuration data for a component:

    <config>
        <component name="...DataSource" type="edu...ConcatFileDataSource">
            <property name="sampleRate" value="160..."/>
            <property name="...File" value="reference..."/>
            <property name="...File" value="/lab/speech/sphinx..."/>
            <property name="...PerRead" value="3..."/>
            <property name="...File" value="tidigits..."/>
            <property name="...RandomSilence" value="true"/>
        </component>
    </config>

In Sphinx-4, we call the configuration data for a component its properties. Here, we are defining six properties for the ...DataSource component. Properties are simple name/value pairs. The properties that can be defined for a component vary based on the type of the component. The API documentation for a component describes the properties it accepts.
For example, a description of the properties used by ConcatFileDataSource can be found on the ConcatFileDataSource page. If a property is omitted from the configuration file, the default value for that property is used.

Configuration data types

Sphinx-4 supports several simple property types, including:

- String - a sequence of characters
- Component - the name of a Sphinx-4 component (more on this ...)

Here are some examples of property values:

    "Twas brillig and slithey toves"
    "...Pruner"

In addition to these simple property types, there are two list types:

- String list - a list of strings
- Component list - a list of components

Lists are defined in a propertylist element.
Each item in a list is defined with an item element. Here's an example:

    <component name="...Manager" type="edu...FileManager">
        <propertylist name="fileNames">
            <item>file...</item>
            ...
        </propertylist>
    </component>

Property lists of components are defined similarly:

    <component name="...LiveFrontEnd" type="edu...FrontEnd">
        <propertylist name="pipeline">
            <item>concatDataSource</item>
            <item>speechClassifier</item>
            <item>speechMarker</item>
            <item>nonSpeechDataFilter</item>
            <item>preemphasizer</item>
            <item>windower</item>
            <item>fft</item>
            <item>melFilterBank</item>
            <item>dct</item>
            <item>liveCMN</item>
            <item>featureExtraction</item>
        </propertylist>
    </component>

Error Checking

When a configuration file is loaded, the configuration manager checks it for errors.
Some of the errors that are detected are:

- Invalid XML - the configuration file is not a well-formed XML file
- Unknown XML elements
- Missing, extra or unknown XML attributes - an element has been given the wrong attributes
- Multiply defined properties
- Bad data type for a property - ...
- Multiply defined components - ...
- Out-of-range data for a property

The Elements

The following table details the elements of the configuration file.
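Element        | Attributes  | Sub-elements                            | Description
-------------- | ----------- | --------------------------------------- | -----------
<config>       | none        | <component>, <property>, <propertylist> | The top level element. It has no attributes. It can have any number of the listed sub-elements.
<component>    | name, type  | <property>, <propertylist>              | Defines an instance of a component. This element must always have the name and type attributes.
<property>     | name, value | none                                    | Used to define a single property. This element must always have the name and value attributes.
<propertylist> | name        | <item>                                  | Used to define a list of strings or components. This element must always have the name attribute. It can have any number of item sub-elements.
<item>         | none        | none                                    | The contents of this element are the value of one list entry.

Global Properties

You may have noticed that property elements can appear directly under the config element.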
These are called global properties. Here's an example:

    <config>
        <property name="...Beam" value="100..."/>
        <property name="relativeBeam" value="1E-1..."/>
    </config>

These global variables can then be used in the property values of components. A variable is referenced by wrapping its name in ${ and }. To reference the two global properties above, use ${...Beam} and ${relativeBeam}.

Here's an example of using global properties in a config file:

    <config>
        <property name="sampleRate" value="160..."/>

        <component name="...DataSource" type="edu...ConcatFileDataSource">
            <property name="sampleRate" value="${sampleRate}"/>
        </component>

        <component name="microphone" type="edu...Microphone">
            <property name="sampleRate" value="${sampleRate}"/>
        </component>

        <component name="streamDataSource" type="edu...StreamDataSource">
            <property name="sampleRate" value="${sampleRate}"/>
        </component>
    </config>

In this example we have three components, all of which need the sampleRate. We could explicitly set each of the sampleRate properties, but if we decided to change the sample rate at a later time, we would have to change it in three places. Using a global property allows us to have a single point where the sample rate is defined.
To change the sample rate, we only have to change the global property.

Global properties are also useful to highlight important tunable parameters. Often in large configuration files, important properties are buried deep in the file. Using global properties, these important, frequently tuned properties can be placed together at the top of the file. Here's an example from the tidigits configuration file:

    <property name="absoluteBeamWidth" value="-1"/>
    <property name="relativeBeamWidth" value="1E-200"/>
    <property name="wordInsertionProbability" value="1E-36"/>
    <property name="languageWeight" value="8"/>
    <property name="silenceInsertionProbability" value="1"/>
    <property name="skip" value="0"/>

    <!-- ******************************************************** -->
    <!-- Components                                               -->
    <!-- ******************************************************** -->

    <component name="batch" type=".."> .. </component>
    <!-- more omitted .. -->

Global variables can be substituted for all property value attributes. They can also be used in propertylist item statements.
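The substitution rule described above can be sketched in a few lines. This is an illustration of the idea only, not the Sphinx-4 implementation: the function name resolve_globals is hypothetical, and the values in the example table (16000 and liveCMN) are assumed for the sake of the example.

```python
import re

# Matches a reference of the form ${name}.
_REFERENCE = re.compile(r"\$\{(\w+)\}")


def resolve_globals(value, global_properties):
    """Replace every ${name} in value with the matching global property.

    Raises KeyError if a referenced global property is not defined.
    """
    def lookup(match):
        name = match.group(1)
        if name not in global_properties:
            raise KeyError("undefined global property: " + name)
        return global_properties[name]

    return _REFERENCE.sub(lookup, value)


# Assumed global properties, for illustration only.
global_properties = {"sampleRate": "16000", "cmn": "liveCMN"}

# A property value attribute and a propertylist item are resolved the same way.
print(resolve_globals("${sampleRate}", global_properties))
print(resolve_globals("${cmn}", global_properties))
```

Values without a reference pass through unchanged, so the same function can be applied to every property value and list item in a file.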
Here's an example of a FrontEnd pipeline that uses a global property to select the CMN component:

    <config>
        <property name="cmn" value="...CMN"/>

        <component name="mfcFrontEnd" type="edu...FrontEnd">
            <propertylist name="pipeline">
                <item>streamDataSource</item>
                <item>preemphasizer</item>
                <item>windower</item>
                <item>fft</item>
                <item>melFilterBank</item>
                <item>dct</item>
                <item>${cmn}</item>
                <item>featureExtraction</item>
            </propertylist>
        </component>
    </config>

Note that you can not use global properties anywhere else, such as in the attributes of a component element. Thus, this is illegal:

    <config>
        <property name="cmn" value="...CMN"/>
        <!-- illegal! ... -->
        <component name="..." type="...CepstralMeanNormalizer">
        </component>
    </config>

Setting properties from the Java command line

Sometimes it is desirable to set component properties from the command line rather than in the configuration file. This is often done from an ant build file.
This allows a single configuration file to be used to support multiple configurations. The syntax for setting a component property from the command line is:

    -DcomponentName[propertyName]=value

For example, to set the sampleRate property for the microphone component, invoke Java like this:

    java -Dmicrophone[sampleRate]=44100 edu...LiveModeRecognizer tidigits...

The syntax for global properties is:

    -DglobalProperty=value

Here's an example of setting multiple properties, some global and some not:

    java -Dmicrophone[sampleRate]=441000 -DabsoluteBeamWidth=200... \
         -DwordInsertionProbability=.01 \
         edu...LiveModeRecognizer tidigits...

Of course, ant has its own syntax for setting such things. Here's an example of an ant target that sets several properties:

    <target ... description="Live mode TIDIGITS test.">
        <java classpath="${classes_dir}" classname="${live_main}">
            <sysproperty key="live[skip]" value="1"/>
            <sysproperty key="speedTracker[showResponseTime]" value="true"/>
            <sysproperty key="frontend" value="mfcLiveFrontEnd"/>
            <arg value="${config}"/>
        </java>
    </target>

Note that currently, it is not possible to set the value of a propertylist from the command line.
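The two command-line forms described in this section can be told apart mechanically. The sketch below shows one way to split such an argument into its parts; it is an illustration of the syntax, not the parser Sphinx-4 uses, and the function name parse_define is hypothetical.

```python
import re

# -Dcomponent[property]=value sets a property on one component.
_COMPONENT_PROPERTY = re.compile(r"^-D(\w+)\[(\w+)\]=(.*)$")
# -Dproperty=value sets a global property.
_GLOBAL_PROPERTY = re.compile(r"^-D(\w+)=(.*)$")


def parse_define(argument):
    """Split one -D argument into (component, property, value).

    component is None when the argument sets a global property.
    Raises ValueError for anything that is not a -D property definition.
    """
    match = _COMPONENT_PROPERTY.match(argument)
    if match:
        return match.group(1), match.group(2), match.group(3)
    match = _GLOBAL_PROPERTY.match(argument)
    if match:
        return None, match.group(1), match.group(2)
    raise ValueError("not a -D property definition: " + argument)


print(parse_define("-Dmicrophone[sampleRate]=44100"))
print(parse_define("-DwordInsertionProbability=.01"))
```

The values are kept as strings here; converting them to the data type a property expects would be a separate step.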
Debugging your configuration

Here are some tips for developing a configuration and getting it to work:

- When a configuration has an error, the configuration manager throws a PropertyException that details the cause of the error. These exceptions are reported by all of the main Sphinx-4 programs and utilities. Use these messages to track down the problem.
- There is a special global property called showCreations. If this property is set to true, each component is reported as it is created. This can sometimes help track down missing or misconfigured components:

    java -DshowCreations=true edu...BatchModeRecognizer \
         tidigits...

    Creating: batch
    Creating: connectedDigitsRecognizer
    Creating: digitsDecoder
    Creating: searchManager
    Creating: logMath
    Creating: flatLinguist
    Creating: wordListGrammar
    Creating: dictionary
    Creating: acousticModel
    Creating: sphinxLoader
    Creating: trivialPruner
    Creating: threadedScorer
    Creating: mfcFrontEnd
    Creating: streamDataSource
    Creating: preemphasizer
    Creating: windower
    Creating: fft
    Creating: mel...