Converting System of Phonetics Transcriptions to Myanmar Text Using N-grams Language Models


Kyaw Kyaw Maung
This paper presents a system for converting between phonetic transcriptions and Myanmar text. A phonetic transcription is based on the pronunciation of the language, whereas Myanmar text is based on the written language. One phonetic symbol can correspond to many possible written forms, which leads to a word-sense ambiguity problem. A second problem is that neither phonetic transcriptions nor Myanmar text uses spaces to mark syllable and word boundaries; this is a segmentation problem for matching and mapping between the two. To address the word-sense ambiguity problem, the research builds n-gram language models from correct training data in the Myanmar language, and the system uses these trained models to convert phonetic transcriptions to Myanmar text. Instead of computing probabilities over the trained n-gram data, the system matches the input against the entries of the trained n-gram models. The system builds unigram, bigram, trigram, 4-gram, and 5-gram models to train and convert between phonetic transcriptions and Myanmar text. To address the segmentation problem, the system breaks the input text into individual tokens, each of which may represent a consonant, a consonant cluster, or a vowel. To segment the input correctly, whether it is Myanmar text or a phonetic transcription, the proposed system uses Unicode fonts for both.
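The two steps described in the abstract, segmentation into tokens and matching against n-gram tables without computing probabilities, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the romanized phonetic tokens, the toy table `NGRAMS`, the greedy longest-match strategy, and the function names `segment` and `convert` are hypothetical and are not taken from the paper, which does not specify its matching order or data format here.

```python
# Toy n-gram table (hypothetical): keys are tuples of phonetic tokens,
# values are the Myanmar text they map to. The romanized tokens are
# made-up placeholders, not the paper's transcription scheme or data.
NGRAMS = {
    ("mja", "n", "ma"): "မြန်မာ",  # trigram entry
    ("sa",): "စာ",                 # unigram entries
    ("mja",): "မြ",
    ("n",): "န်",
    ("ma",): "မ",
}
MAX_N = 5  # the abstract mentions models up to 5-grams


def segment(text, inventory):
    """Split unspaced text into tokens by greedy longest match.

    `inventory` is an assumed set of known token strings.
    """
    longest = max(len(token) for token in inventory)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in inventory:
                tokens.append(piece)
                i += size
                break
        else:
            # Unknown character: keep it as a single-character token.
            tokens.append(text[i])
            i += 1
    return tokens


def convert(tokens, table, max_n=MAX_N):
    """Convert tokens by matching the longest n-gram found in `table`.

    No probabilities are computed; an n-gram either matches or it does not.
    Tokens with no matching entry are passed through unchanged.
    """
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in table:
                out.append(table[key])
                i += n
                break
        else:
            out.append(tokens[i])
            i += 1
    return "".join(out)


tokens = segment("mjanmasa", {"mja", "n", "ma", "sa"})
result = convert(tokens, NGRAMS)
```

In this sketch the trigram entry is preferred over the three unigram entries because longer matches are tried first; that ordering is one plausible way to let context resolve ambiguity and is a design choice of the example, not a claim about the published system.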


N-grams, Unigram, Bi-grams, Trigrams, 4-grams, 5-grams, Phonetics Transcriptions, Myanmar Text

Publication Details

Published in : Volume 1 | Issue 3 | May-June - 2015
Date of Publication : 2015-06-25
Print ISSN : 2395-1990
Online ISSN : 2394-4099
Page(s) : 260-264
Manuscript Number : IJSRSET151356
Publisher : Technoscience Academy

Cite This Article

Kyaw Kyaw Maung, "Converting System of Phonetics Transcriptions to Myanmar Text Using N-grams Language Models", International Journal of Scientific Research in Science, Engineering and Technology (IJSRSET), Print ISSN: 2395-1990, Online ISSN: 2394-4099, Volume 1, Issue 3, pp. 260-264, May-June 2015.
