Co-occurrence degree based word alignment: A case study on Uyghur-Chinese
Mi, Chenggang (1); Yang, Yating (1); Zhou, Xi (1); Li, Xiao (1); Osman, Turghun (1)
Source Publication: Lecture Notes in Computer Science
Abstract: Most widely used word alignment models are based on word co-occurrence counts in a parallel corpus. However, data sparseness during training means that the co-occurrence counts from a Uyghur-Chinese parallel corpus cannot effectively indicate associations between source and target words. In this paper, we propose a Uyghur-Chinese word alignment method based on word co-occurrence degree to alleviate the data sparseness problem. Our approach combines co-occurrence counts and fuzzy co-occurrence weights into a word co-occurrence degree. The fuzzy co-occurrence weights are obtained by searching for fuzzy co-occurrence word pairs and computing the differences in length between the current Uyghur word and the other Uyghur words in those pairs. Experiments show that the co-occurrence degree based word alignment model outperforms the baseline word alignment model on Uyghur-Chinese word alignment, and that the quality of Uyghur-Chinese machine translation also improves.
Keywords: Uyghur-Chinese Word Alignment; Co-occurrence Degree; Co-occurrence Count; Agglutinative Language; Data Sparseness
Indexed By: EI
Document Type: Journal article
Affiliation: 1. Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi, Xinjiang, China
2. University of Chinese Academy of Sciences, Beijing, China
Recommended Citation
GB/T 7714: Mi, Chenggang, Yang, Yating, Zhou, Xi, et al. Co-occurrence degree based word alignment: A case study on Uyghur-Chinese[J]. Lecture Notes in Computer Science, 2014, 8801(1): 259-268.
APA: Mi, Chenggang, Yang, Yating, Zhou, Xi, Li, Xiao, & Osman, Turghun. (2014). Co-occurrence degree based word alignment: A case study on Uyghur-Chinese. Lecture Notes in Computer Science, 8801(1), 259-268.
MLA: Mi, Chenggang, et al. "Co-occurrence degree based word alignment: A case study on Uyghur-Chinese." Lecture Notes in Computer Science 8801.1 (2014): 259-268.
Files in This Item:
File Name/Size: Co-occurrence degree (444KB)
DocType: Journal article
Version: Author's accepted manuscript
Access: Open access
License: CC BY-NC-SA
File name: Co-occurrence degree based word alignment A case study on Uyghur-Chinese.pdf
Format: Adobe PDF

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.