search
for
 About Bioline  All Journals  Testimonials  Membership  News


Journal of Applied Sciences and Environmental Management
World Bank assisted National Agricultural Research Project (NARP) - University of Port Harcourt
ISSN: 1119-8362
Vol. 23, No. 5, 2019, pp. 895-901
Bioline Code: ja19135
Full paper language: English
Document type: Research Article
Document available free of charge

Journal of Applied Sciences and Environmental Management, Vol. 23, No. 5, 2019, pp. 895-901

 en Standard Yorùbá Context Dependent Tone Identification Using Multi-Class Support Vector Machine (MSVM)
SOSIMI, AA.; ADEGBOLA, T. & FAKINLEDE, OA.

Abstract

Most state-of-the-art large vocabulary continuous speech recognition systems employ context dependent (CD) phone units, however, the CD phone units are not efficient in capturing long-term spectral dependencies of tone in most tone languages. The Standard Yorùbá (SY) is a language composed of syllable with tones and requires different method for the acoustic modeling. In this paper, a context dependent tone acoustic model was developed. Tone unit is assumed as syllables, amplitude magnified difference function (AMDF) was used to derive the utterance wide F0 contour, followed by automatic syllabification and tri-syllable forced alignment with speech phonetization alignment and syllabification SPPAS tool. For classification of the context dependent (CD) tone, slope and intercept of F0 values were extracted from each segmented unit. Supervised clustering scheme was utilized to partition CD tri-tone based on category and normalized based on some statistics to derive the acoustic feature vectors. Multi-class support vector machine (MSVM) was used for tri-tone training. From the experimental results, it was observed that the word recognition accuracy obtained from the MSVM tri-tone system based on dynamic programming tone embedded features was comparable with phone features. A best parameter tuning was obtained for 10-fold cross validation and overall accuracy was 97.5678%. In term of word error rate (WER), the MSVM CD tri-tone system outperforms the hidden Markov model tri-phone system with WER of 44.47%.

Keywords
Syllabification; Standard Yorùbá; Context Dependent Tone; Tri-tone Recognition

 
© Copyright 2019 - Sosimi et al.

Home Faq Resources Email Bioline
© Bioline International, 1989 - 2024, Site last up-dated on 01-Sep-2022.
Site created and maintained by the Reference Center on Environmental Information, CRIA, Brazil
System hosted by the Google Cloud Platform, GCP, Brazil