Myanmar Continuous Speech to Isolated Word Segmentation

Authors

  • Taryar Myo Tun, University of Computer Studies, Mandalay, Myanmar
  • Khin Thida Lynn  

Keywords:

Myanmar language, Energy based VAD, ZCR, Myanmar Tone Length

Abstract

This paper proposes a word segmentation method for Myanmar continuous speech. The system detects the segment boundaries of isolated words using zero crossing rate, duration and energy techniques. Inaccurate segment boundaries are a major cause of errors in automatic speech recognition, so a pre-processing stage that segments the speech signal into periods of speech and non-speech is invaluable for improving recognition accuracy. We propose a combination of three audio features for speech/non-speech detection: energy-based voice activity detection (VAD), zero crossing rate (ZCR) and duration length. Each feature has unique properties that help differentiate speech from non-speech segments. We evaluate the method by dividing each speech sample into segments and using the zero crossing rate, energy-based VAD and Myanmar tone length to separate the speech portions. The algorithm is tested on speech samples recorded as sentences of Myanmar speech. The results show that the algorithm segments almost 98.5% of the Myanmar words across all recorded sentences.
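
The combination of the three features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no frame size, thresholds or decision rule, so the function names (`frame_features`, `segment_speech`), the 20 ms frame, the energy and ZCR thresholds, and the 100 ms minimum duration (a rough stand-in for the Myanmar tone length criterion) are all assumptions chosen for illustration.

```python
def frame_features(samples, frame_len):
    """Return a list of (short-time energy, zero crossing rate) per frame.

    Frames are non-overlapping; a trailing partial frame is ignored.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of the squared samples in the frame.
        energy = sum(x * x for x in frame) / frame_len
        # ZCR: fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def segment_speech(samples, sample_rate, frame_ms=20,
                   energy_thresh=0.01, zcr_max=0.3, min_dur_ms=100):
    """Return (start_sec, end_sec) pairs for detected speech segments.

    All default values are illustrative assumptions, not values from
    the paper. A frame counts as speech when its energy is at least
    energy_thresh and its ZCR is at most zcr_max; runs of speech
    frames shorter than min_dur_ms are discarded.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    feats = frame_features(samples, frame_len)
    segments = []
    start = None
    # A silent sentinel frame closes a segment that runs to the end.
    for i, (energy, zcr) in enumerate(feats + [(0.0, 0.0)]):
        is_speech = energy >= energy_thresh and zcr <= zcr_max
        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            # Duration check: keep only sufficiently long runs.
            if (i - start) * frame_ms >= min_dur_ms:
                segments.append((start * frame_len / sample_rate,
                                 i * frame_len / sample_rate))
            start = None
    return segments
```

Calling `segment_speech(samples, 8000)` on a list of float samples in the range [-1, 1] returns the time spans judged to be speech. In this simplified rule, low-energy frames are treated as silence, high-ZCR frames as noise, and short bursts are rejected by the duration check; a practical system would need thresholds tuned to the recording conditions.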

Published

2015-06-25

Section

Research Articles

How to Cite

[1]
Taryar Myo Tun, Khin Thida Lynn, "Myanmar Continuous Speech to Isolated Word Segmentation", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 1, Issue 3, pp. 178-182, May-June 2015.