In this paper, we present two voice-to-phoneme conversion algorithms that extract voice-tag representations for speaker-independent voice-tag applications on embedded platforms, which are highly sensitive to memory and CPU consumption. In the first approach, voice-to-phoneme conversion in batch mode accomplishes this task by preserving the commonality among the input feature vectors of multiple example utterances of a voice-tag. Given these utterances, a feature-combination strategy produces an "average" utterance, which is converted into phonetic strings via a speaker-independent phonetic decoder to serve as the voice-tag representation. In the second approach, a sequential voice-to-phoneme conversion algorithm uncovers the hierarchy of phonetic consensus embedded in the multiple phonetic hypotheses generated by a speaker-independent phonetic decoder from multiple example utterances of a voice-tag. The most representative phonetic hypotheses are then selected to represent the voice-tag. In speech recognition experiments, the voice-tag representations obtained by these two voice-to-phoneme conversion algorithms are compared with reference phonetic transcriptions prepared by an expert phonetician. Both algorithms perform comparably to, or significantly better than, the manual transcription approach; we conclude that both algorithms are highly effective for the targeted applications.
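The abstract does not specify either algorithm's internals, but the two ideas can be illustrated with a minimal sketch under assumed simplifications: the "average" utterance is approximated by aligning each example utterance's feature-vector sequence to a reference utterance with classic dynamic time warping (DTW) and averaging aligned frames, and the "most representative" hypotheses are approximated by ranking each decoder hypothesis by its total Levenshtein distance to the others. All function names and the DTW/edit-distance choices here are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def dtw_path(ref, utt):
    # Classic DTW alignment (steps: diagonal, vertical, horizontal)
    # between two feature-vector sequences; illustrative assumption,
    # not necessarily the paper's alignment scheme.
    n, m = len(ref), len(utt)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(ref[i - 1] - utt[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the end to recover the frame-to-frame alignment.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def average_utterance(utterances):
    # Sketch of the batch-mode idea: align every utterance to the first
    # one and average the aligned feature vectors frame by frame.
    ref = np.array(utterances[0], dtype=float)
    sums = ref.copy()
    counts = np.ones(len(ref))
    for utt in utterances[1:]:
        for i, j in dtw_path(ref, np.asarray(utt, dtype=float)):
            sums[i] += utt[j]
            counts[i] += 1
    return sums / counts[:, None]

def edit_distance(a, b):
    # Levenshtein distance between two phone sequences (one-row DP).
    n = len(b)
    d = list(range(n + 1))
    for i in range(1, len(a) + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            cur = d[j]
            d[j] = min(d[j] + 1, d[j - 1] + 1, prev + (a[i - 1] != b[j - 1]))
            prev = cur
    return d[n]

def most_representative(hypotheses, k=2):
    # Sketch of the consensus idea: keep the k hypotheses with the
    # smallest total edit distance to all other hypotheses.
    scores = [sum(edit_distance(h, o) for o in hypotheses) for h in hypotheses]
    order = sorted(range(len(hypotheses)), key=scores.__getitem__)
    return [hypotheses[i] for i in order[:k]]
```

For example, averaging two identical utterances returns the utterance itself, and among the hypotheses `[k ae t]`, `[k ae t]`, `[k aa t]`, `[g ae d]` the consensus pick is `[k ae t]`.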