XJIPC OpenIR > Multilingual Information Technology Research Laboratory
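Design and Implementation of an ELK-Based Log Analysis and Massive Data Retrieval System (基于 ELK 的日志分析与海量数据检索系统的设计与实现)
Thesis Advisor: 马玉鹏 (Ma Yupeng)
Degree Grantor: University of Chinese Academy of Sciences (中国科学院大学)
Place of Conferral: Beijing
Degree Discipline: Computer Application Technology
Keyword: log processing; information retrieval; server monitoring; ELK; Elasticsearch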


Abstract
With the rapid development of network and information processing technology, society has entered the era of big data. Data volumes are growing exponentially, and every industry faces the pressure of processing massive data. In the gasoline sales information collection and supervision system, application logs are still investigated manually: log processing is inefficient, and there is no centralized processing or analysis. In addition, analyzing large amounts of data in relational databases takes a long time, and the problem of monitoring many servers needs to be solved urgently.

To achieve distributed collection and real-time analysis of application logs and server metrics, and to improve the efficiency of massive data analysis, this thesis proposes a distributed real-time data processing and analysis solution based on the ELK stack (Elasticsearch, Logstash, and Kibana). A log analysis system and a massive data retrieval system are designed and implemented according to the actual requirements. The specific research contents are as follows:

1. Big data processing technology is surveyed: the characteristics and applications of the major data processing systems are compared, with a focus on the core principles of distributed search engines. The system architecture, technical principles, and usage of the emerging ELK stack are studied and put into practice.
2. To address the lack of centralized processing and analysis of the system's application logs, a distributed log analysis system is designed and implemented, covering distributed collection, parsing, storage, and visual analysis of the application logs. It remedies the shortcomings of traditional log processing: inefficiency, long processing time, and no visual analysis.
3. To address the lack of effective monitoring of IoT servers, distributed collection and real-time monitoring of server metrics are implemented, which reduces the burden on engineers and operators and helps safeguard the stable operation of the servers.
4. Based on the distributed search engine Elasticsearch, a massive data retrieval system is designed and implemented, which addresses the long query times and missing full-text search of relational databases in large-scale data retrieval. Because Elasticsearch's built-in analyzers, the IK analyzer, and the Mmseg analyzer cannot meet the needs of Chinese address segmentation, an address segmentation method that combines address element levels with rules is adopted.

At present, the log analysis system and the massive data retrieval system have been completed and put into use. For log analysis and server monitoring, real-time collection and analysis of application logs are implemented, which significantly reduces the burden on engineers and system administrators and helps keep the servers running stably. The massive data retrieval system provides high-performance search, scales linearly, and processes data efficiently, which effectively improves the efficiency of data analysis.
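The abstract mentions an address segmentation method that combines address element levels with rules, but gives no details. As an illustration only, the following Python sketch shows one way such an approach could work: each level (province, city, district, and so on) is recognized by the suffix characters that typically end it, and the levels are matched from coarsest to finest. The suffix table `LEVEL_RULES`, the function `segment_address`, and the sample address are all assumptions made for this sketch and are not taken from the thesis.

```python
import re

# Hypothetical rule table (not from the thesis): each address element level
# is paired with the suffixes that typically terminate an element of that level.
LEVEL_RULES = [
    ("province", "省|自治区"),
    ("city", "市|自治州|地区"),
    ("district", "区|县|旗"),
    ("town", "街道|镇|乡"),
    ("road", "路|街|巷|大道"),
    ("number", "号"),
]


def segment_address(address):
    """Split an address into (level, text) pairs, coarsest level first.

    Levels are tried in order; a level whose suffix does not occur in the
    remaining text is skipped. Any unmatched tail is returned as "detail".
    """
    parts = []
    rest = address
    for level, suffixes in LEVEL_RULES:
        # Shortest prefix of the remaining text that ends with a level suffix.
        match = re.match(rf"(.+?(?:{suffixes}))", rest)
        if match:
            parts.append((level, match.group(1)))
            rest = rest[match.end():]
    if rest:
        parts.append(("detail", rest))
    return parts


if __name__ == "__main__":
    for level, text in segment_address("新疆维吾尔自治区乌鲁木齐市新市区北京南路40号"):
        print(level, text)
```

A purely suffix-driven pass like this is deliberately naive: it can mis-split names that contain a suffix character and it ignores dictionary lookups, so it should be read as a picture of the general idea of level-plus-rule segmentation rather than as the method the thesis implements.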
Document Type: Thesis
Recommended Citation
GB/T 7714
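姚攀 (Yao Pan). Design and Implementation of an ELK-Based Log Analysis and Massive Data Retrieval System (基于 ELK 的日志分析与海量数据检索系统的设计与实现) [D]. Beijing: University of Chinese Academy of Sciences, 2018.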
Files in This Item:
File Name/Size | DocType | Version | Access | License
基于ELK的日志分析与海量数据检索系统的 (2986KB) | Thesis | | Open Access | CC BY-NC-SA
File name: 基于ELK的日志分析与海量数据检索系统的设计与实现.pdf
Format: Adobe PDF

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.