Institutional Repository of the Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences
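Title: Research and Implementation of a Role-Based Site Search Engine (基于角色的站内搜索引擎的研究和实现)
Author: 刘彦峰 (Liu Yanfeng)
Defense date: 2007-06-13
Advisor: 李晓 (Li Xiao)
Major: Computer Application Technology
Degree-granting institution: Graduate University of the Chinese Academy of Sciences
Place of conferral: Beijing
Degree: Master's
Keywords: Full-text retrieval; Search engine; RBAC; Lucene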
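Abstract: As online information resources grow rapidly, people are increasingly concerned with how to extract latent, valuable information from massive volumes of web information quickly and effectively to meet their needs. Full-text retrieval is an important technology in information processing; it is a powerful tool for handling unstructured data and one of the core technologies of search engines. For full-text indexing, this thesis introduces an improved inverted index structure that, compared with traditional index structures, is easier to build, maintain, and update, and it designs an optimized query strategy based on the characteristics of that structure. The thesis also studies and analyzes access control technology, in particular the role-based access control (RBAC) model, and combines full-text retrieval with role-based access control. The focus is on the application of full-text retrieval: how to use new techniques, improve the structure of the retrieval system, raise its performance and efficiency, speed up retrieval, and keep pace with the growth of web information. Using Lucene, a Java-based full-text indexing engine package, the thesis describes in detail the development process and methods of a role-based site search system. As open-source software, Lucene offers an excellent opportunity to learn the core technologies of search engines, and analyzing it and extending it through secondary development is work of real practical value. On the application side, the main work is the design and implementation of a site-wide full-text database. Building on document data processing, information extraction, and classification, its retrieval subsystem completes the design of the indexer and the searcher and ultimately implements full-text retrieval.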
English abstract: With the rapid growth of information on the Web, it has become more difficult for Web users to retrieve useful information that meets their needs from the enormous amount available. Full-text retrieval (FTR) is a primary technology for processing information. It is a powerful tool for dealing with unstructured data and one of the key technologies of search engines. In the field of full-text indexing, an improved word-based Chinese inverted index structure is proposed, which performs better than traditional approaches and is convenient for constructing, maintaining, and updating the index. The thesis also analyzes access control technology, in particular the role-based access control model, and combines it with full-text retrieval technology. The thesis concentrates on the application of full-text retrieval technologies. It discusses how to use new techniques, optimize the structure of the retrieval system, improve performance and efficiency, increase speed, and adapt to the development of the current Web. Lucene, a full-text search toolkit, is used in this work; it performs well and is compact and capable, which makes it convenient to embed in applications. The thesis describes the system development process and methods in detail. As open-source software, Lucene offers an excellent opportunity to study the key technologies of search engines, and it is worthwhile to analyze it and carry out secondary development on it. On the application side, the thesis mainly covers the design and implementation of the full-text database. Building on related work such as document data processing, information extraction, and classification, its retrieval subsystem constructs the indexer and designs the searcher, and finally achieves the design target.
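The record does not describe the thesis's actual index structure or how it integrates RBAC with Lucene, so the following is only a minimal sketch of the general idea the abstract names: an inverted index whose search results are filtered by the roles of the requesting user. It is plain Python rather than Lucene, the class and method names (`RoleAwareIndex`, `add`, `search`) and the sample documents are hypothetical, and whitespace tokenization is an assumption that would not work for Chinese text without a word segmenter.

```python
from collections import defaultdict


class RoleAwareIndex:
    """Toy inverted index whose documents carry the roles allowed to read them.

    Illustrative only: not the thesis's index structure and not Lucene's API.
    """

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.allowed_roles = {}           # document id -> set of role names

    def add(self, doc_id, text, roles):
        """Index a document and record which roles may see it."""
        self.allowed_roles[doc_id] = set(roles)
        for term in text.lower().split():  # assumption: whitespace tokenization
            self.postings[term].add(doc_id)

    def search(self, query, user_roles):
        """Return sorted ids of documents matching all terms and visible to the user."""
        terms = query.lower().split()
        if not terms:
            return []
        # AND semantics: intersect the posting sets of every query term.
        hits = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self.postings.get(term, set())
        # RBAC filter: keep a document only if the user holds a permitted role.
        roles = set(user_roles)
        return sorted(d for d in hits if self.allowed_roles[d] & roles)


index = RoleAwareIndex()
index.add("d1", "annual budget report", {"finance", "admin"})
index.add("d2", "public budget summary", {"guest", "finance", "admin"})
index.add("d3", "lab safety report", {"staff", "admin"})

guest_hits = index.search("budget", {"guest"})
admin_hits = index.search("report", {"admin"})
```

In this sketch the role check runs after term matching, which keeps the index itself role-agnostic; a real system might instead store role identifiers as an indexed field and add them to the query, a design choice this record does not settle.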
Content type: Thesis
URI: http://ir.xjipc.cas.cn/handle/365002/3517
Appears in Collections: Multilingual Information Technology Research Laboratory_Theses

Files in This Item:
File name / file size: 刘彦峰硕士论文.pdf (1065KB)
Content type: Thesis
Version: --
Access: Not yet open; contact to obtain the full text

Author affiliation: Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences

Recommended Citation:
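刘彦峰 (Liu Yanfeng). Research and Implementation of a Role-Based Site Search Engine (基于角色的站内搜索引擎的研究和实现) [D]. Beijing. Graduate University of the Chinese Academy of Sciences. 2007.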
File name: 刘彦峰硕士论文.pdf
Format: Adobe PDF

Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 
