Uyghur word segmentation using a combination of rules and statistics
Xue, Huajian; Yang, Yong; Turghun, Osman; Li, Xiao; Zhang, Ronghui
2011
Journal: Advances in Information Sciences and Service Sciences
ISSN: 1976-3700
Volume: 3, Issue: 11
Abstract

The rich morphology of Uyghur produces a very large number of word forms and leads to high out-of-vocabulary (OOV) rates, which cause many errors in Uyghur natural language processing (NLP). Morphological word segmentation is a key component for overcoming this problem. This paper describes a set of morphological rules derived from an analysis of the general structure of Uyghur words and presents a partly supervised word segmentation method. In this method, a suffix corpus is used to generate all possible morphological segmentations of a word, from which the optimal segmentation is selected by a MAP-based (maximum a posteriori) model. In addition, a cascaded language model is used to improve segmentation accuracy. A test set of 5000 words was collected and segmented by hand. Experimental results on this test set show that the proposed method is more effective.
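
The two steps named in the abstract (enumerating candidate segmentations from a suffix inventory, then choosing one by a MAP criterion) can be illustrated with a minimal sketch. This is not the authors' implementation: the suffix set, the probabilities, the unknown-stem floor, and the function names `candidate_segmentations` and `best_segmentation` are all hypothetical, the romanized example word is a toy, and the cascaded language model is not modeled. Because every candidate concatenates back to the same word, the likelihood term is constant and the MAP choice reduces here to maximizing an assumed prior over stem and suffix sequences.

```python
import math
from typing import Dict, List, Set

# Hypothetical toy resources; a real system would estimate these from corpora.
SUFFIXES: Set[str] = {"lar", "da", "a"}
STEM_PROB: Dict[str, float] = {"kitab": 0.01}
SUFFIX_PROB: Dict[str, float] = {"lar": 0.2, "da": 0.2, "a": 0.05}
UNKNOWN_STEM_PROB = 1e-6  # assumed floor for stems missing from STEM_PROB


def candidate_segmentations(word: str, suffixes: Set[str]) -> List[List[str]]:
    """Enumerate every split of `word` into a stem plus zero or more known suffixes."""
    candidates: List[List[str]] = []

    def peel(stem: str, tail: List[str]) -> None:
        # The current stem with the suffixes peeled so far is one candidate.
        candidates.append([stem] + tail)
        # Try to peel one more suffix from the right, keeping a non-empty stem.
        for i in range(1, len(stem)):
            if stem[i:] in suffixes:
                peel(stem[:i], [stem[i:]] + tail)

    peel(word, [])
    return candidates


def log_prior(segmentation: List[str]) -> float:
    """Log prior of a segmentation under an independence assumption (a simplification)."""
    stem, *suffix_seq = segmentation
    score = math.log(STEM_PROB.get(stem, UNKNOWN_STEM_PROB))
    for suffix in suffix_seq:
        score += math.log(SUFFIX_PROB[suffix])
    return score


def best_segmentation(word: str) -> List[str]:
    """Pick the candidate with the highest log prior (the MAP choice in this toy setup)."""
    return max(candidate_segmentations(word, SUFFIXES), key=log_prior)


if __name__ == "__main__":
    for seg in candidate_segmentations("kitablarda", SUFFIXES):
        print(seg, round(log_prior(seg), 2))
    print("best:", best_segmentation("kitablarda"))
```

With these toy numbers, the known stem `kitab` outweighs the cost of two extra suffixes, so the three-part split is preferred over leaving the word unsegmented. A more faithful model would condition each suffix on its predecessor rather than treating suffixes as independent.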

DOI: 10.4156/AISS.vol3.issue11.13
Indexed by: EI
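Document type: Journal article
Identifier: http://ir.xjipc.cas.cn/handle/365002/4160
Collection: Multilingual Information Technology Research Laboratory
Affiliation: Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, China
Recommended citation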
GB/T 7714: Xue, Huajian, Yang, Yong, Turghun, Osman, et al. Uyghur word segmentation using a combination of rules and statistics[J]. Advances in Information Sciences and Service Sciences, 2011, 3(11).
APA: Xue, Huajian, Yang, Yong, Turghun, Osman, Li, Xiao, & Zhang, Ronghui. (2011). Uyghur word segmentation using a combination of rules and statistics. Advances in Information Sciences and Service Sciences, 3(11).
MLA: Xue, Huajian, et al. "Uyghur word segmentation using a combination of rules and statistics." Advances in Information Sciences and Service Sciences 3.11 (2011).