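Thesis Advisor: 蒋同海
Degree Grantor: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Beijing
Degree Discipline: Computer Application Technology
Keyword: speech recognition; hidden Markov model; Mel-frequency cepstral coefficients; Gaussian mixture model; adaptation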
Other Abstract: With the rapid development of Xinjiang's economy, communication between Xinjiang and inland China is becoming increasingly frequent and wide-ranging. For local ethnic minority people, however, language is a barrier to this communication. The best way to address the problem is to raise the spoken Chinese level of minority teachers and students, and pronunciation is the most important aspect of language learning. Speech recognition technology helps learners achieve accurate Mandarin pronunciation by scoring how accurate it is, which in turn improves their pronunciation. Tone recognition checks the tones learners produce and tells them whether each tone is correct, since Mandarin tones are the most difficult point for minority students. This thesis first presents the main ideas and methods of speech recognition from a theoretical standpoint. It then uses hidden Markov models (HMMs), Gaussian mixture models (GMMs), and context-dependent triphone models to propose an HTK-based framework and to build a recognition system on the 863 speech corpus. A comparison of monophone and triphone models shows that triphones improve the recognition rate markedly. We also test the effect of the number of Gaussian mixture components and find that adding components improves the recognition rate, but beyond a certain point the gains slow down, while training time keeps growing with the number of components. We therefore identify a balance point between training time and the number of mixture components. We build a dedicated speech corpus of Chinese spoken by Uygur speakers, part of which has been labeled. When tested on this Uygur-accented speech, the recognition system built on the 863 speech corpus does not perform well.
For this reason we introduce an adaptation technique to improve performance. We use the labeled data from the Uygur speech corpus to adapt the models of the recognition system built on the 863 speech corpus, and find that the recognition rate rises significantly.
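The abstract's comparison of single-phoneme and context-dependent three-phoneme (triphone) models can be illustrated with a small sketch. The code below expands a monophone sequence into labels written in the `left-centre+right` notation that HTK uses for triphones. It is not taken from the thesis: the function name `to_triphones`, the word-internal treatment of silence, and the pinyin-like phone symbols are all assumptions made for illustration only.

```python
def to_triphones(phones, boundary="sil"):
    """Expand monophones into triphone labels of the form left-centre+right.

    Assumption: the boundary symbol (silence) stays context-independent and
    does not act as context for its neighbours.
    """
    labels = []
    for i, phone in enumerate(phones):
        if phone == boundary:
            labels.append(phone)
            continue
        # Look up the neighbours, ignoring the sequence ends and silence.
        left = phones[i - 1] if i > 0 and phones[i - 1] != boundary else None
        right = (
            phones[i + 1]
            if i + 1 < len(phones) and phones[i + 1] != boundary
            else None
        )
        label = phone
        if left is not None:
            label = f"{left}-{label}"
        if right is not None:
            label = f"{label}+{right}"
        labels.append(label)
    return labels


# Hypothetical phone sequence; the symbols are illustrative, not the
# thesis's actual phone set.
example = ["sil", "n", "i", "h", "ao", "sil"]
print(to_triphones(example))
# ['sil', 'n+i', 'n-i+h', 'i-h+ao', 'h-ao', 'sil']
```

Each triphone label gets its own model, so the same phone is modeled differently depending on its neighbours. This is one plausible reason the abstract reports a higher recognition rate for triphones than for monophones, at the cost of many more models to train.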
Document Type: Thesis
Recommended Citation
GB/T 7714
李凯. 语音识别在新疆“双语”教学软件中的应用[D]. 北京. 中国科学院研究生院,2009.
Files in This Item:
File Name/Size: 李凯硕士论文.pdf (1579 KB); DocType: Thesis; Access: Open Access; License: CC BY-NC-SA