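Thesis Advisor: 张岩
Degree Grantor: University of Chinese Academy of Sciences (中国科学院大学)
Place of Conferral: Beijing
Degree Discipline: Computer Technology
Keywords: Slavic Kazakh script; current Kazakh script; conversion rules; conversion tool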

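In today's information age, every element of society is ever more closely tied to information technology, and this holds even for languages and scripts that people have used for centuries. Kazakh, the communication tool and cultural carrier of the Kazakh people, is likewise speeding up its informatization and standardization to meet the demands of the information age. However, as information technology develops rapidly and is applied more deeply, and as exchanges between Kazakh people in China and abroad grow closer, the "one language, multiple scripts" character of Kazakh keeps raising new problems for its information processing technology and standards. The conversion rules for coded characters between the differently written forms of Kazakh, and the methods for implementing them, are one such topic worth studying.

In the course of its development, Kazakh has arrived at the unusual situation of one language with two scripts worldwide: Slavic Kazakh, based on the Cyrillic alphabet and used abroad, and current Kazakh, based on the Arabic alphabet and used in China. Because the two scripts are written in completely different ways, Kazakh people in China cannot read the Slavic Kazakh used abroad when communicating internationally, which causes considerable inconvenience and does not meet the needs of worldwide exchange and cooperation. Current Kazakh and Slavic Kazakh are pronounced essentially the same, and both are scripts in which one sound corresponds to one letter, so the two can be converted into each other by rule. So far, however, no national or local standard describes the conversion rules fully and clearly, and no reasonably complete coded-character conversion tool exists.

This thesis proposes conversion rules for the coded characters of the two scripts and, on the basis of those rules, designs and implements a conversion method. First, through a comprehensive and systematic study and analysis of the coded characters of current Kazakh and Slavic Kazakh, it formulates complete conversion rules, specifies in detail the rule for every coded character that may occur in either script, lists an example for each conversion case, and gives a conversion comparison table for the two sets of coded characters. Second, taking these rules as the theoretical basis, it divides text into syntactic word segments and combines dictionary-based conversion with rule-based conversion to convert coded characters between the two scripts in both directions.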

In the information age, every part of society is increasingly tied to information technology, and languages and scripts that have been in use for centuries are no exception. Kazakh, the communication tool and cultural carrier of the Kazakh people, is likewise accelerating its informatization and standardization to meet the demands of the information age. However, with the rapid development and deepening application of information technology, and with increasingly close exchanges between Kazakh people in China and abroad, the "one language, multiple scripts" character of Kazakh keeps raising new problems for its information processing technology and standards. In the course of its development, Kazakh has come to be written in two different scripts worldwide: Slavic Kazakh, which is based on the Cyrillic alphabet, and Arabic-based Kazakh, which is also called current Kazakh. Because the two scripts are written in completely different ways, Kazakh people in China cannot read the Slavic Kazakh used abroad, which causes considerable inconvenience and hinders worldwide exchange and cooperation. The two scripts have essentially the same pronunciation, and in both of them each sound corresponds to one letter, so conversion between them by rule is feasible. So far, however, no national or regional standard describes the conversion rules clearly, and no reasonably complete coded-character conversion tool exists. Based on a study of Slavic Kazakh and Arabic-based Kazakh, this thesis therefore proposes conversion rules between the coded characters of the two scripts and, on the basis of those rules, develops a text conversion tool.
This thesis proposes conversion rules for the coded characters of the two scripts, and designs and implements a conversion method based on those rules. First, through a comprehensive and systematic study and analysis of the coded characters of current Kazakh and Slavic Kazakh, it formulates complete conversion rules that specify in detail how every coded character that may occur in either script is converted, gives an example for each conversion case, and provides a conversion comparison table for the two sets of coded characters. Second, taking these rules as the theoretical basis, it divides text into syntactic word segments and combines dictionary-based conversion with rule-based conversion to convert coded characters between the two scripts in both directions.
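
The combination of dictionary-based and rule-based conversion described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's tool. The letter table is a small illustrative subset assumed for this example (the thesis's full comparison table is not reproduced in this record), the dictionary entry is a placeholder, and the names RULE_TABLE, WORD_DICT, convert_word and convert_text are hypothetical. The sketch covers only the Cyrillic-to-Arabic direction and ignores context-dependent spelling rules.

```python
import re

# ASSUMPTION: a tiny illustrative subset of Cyrillic-to-Arabic letter pairs,
# not the full comparison table defined in the thesis.
RULE_TABLE = {
    "а": "ا", "б": "ب", "з": "ز", "қ": "ق",
    "л": "ل", "н": "ن", "с": "س", "т": "ت",
}

# ASSUMPTION: hypothetical whole-word dictionary, consulted before the
# letter rules. A real tool would hold words the rules cannot handle.
WORD_DICT = {"қазақстан": "قازاقستان"}

# Words are maximal runs of letters, digits and underscores.
WORD_RE = re.compile(r"\w+")


def convert_word(word, word_dict=None, rules=None):
    """Convert one word: dictionary lookup first, letter rules second."""
    word_dict = WORD_DICT if word_dict is None else word_dict
    rules = RULE_TABLE if rules is None else rules
    key = word.lower()  # the Arabic script has no letter case
    if key in word_dict:
        return word_dict[key]
    # Characters without a rule (digits, unknown letters) pass through.
    return "".join(rules.get(ch, ch) for ch in key)


def convert_text(text, word_dict=None, rules=None):
    """Convert every word in text; spaces and punctuation are kept as-is."""
    return WORD_RE.sub(
        lambda m: convert_word(m.group(0), word_dict, rules), text
    )


if __name__ == "__main__":
    print(convert_text("қазақ бала"))
```

Checking the dictionary before the letter rules lets whole-word entries override the regular mapping, which is one plausible way to combine the two strategies. The order actually used in the thesis is not stated in this record.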

Document Type: Degree thesis
Recommended Citation
GB/T 7714
刘金龙. 哈萨克文编码字符转换的研究和实现[D]. 北京. 中国科学院大学,2014.
Files in This Item:
File Name/Size: 哈萨克文编码字符转换的研究和实现.pdf (1206KB)
DocType: Degree thesis
Access: Open access
License: CC BY-NC-SA
