Date of Submission
Fall 12-18-2020
Degree Type
Thesis
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
Committee Chair/First Advisor
Dr. Ahyoung Lee
Track
Big Data
Chair
Dr. Ahyoung Lee
Committee Member
Dr. Ahyoung Lee
Committee Member
Dr. Donghyun Kim
Committee Member
Dr. Yan Yuang
Abstract
Nowadays, in almost every computer system, log files are used to keep records of occurring events. Those log files are then used for analyzing and debugging system failures. Due to this important utility, researchers have worked on finding fast and efficient ways to detect anomalies in a computer system by analyzing its log records. Research in log-based anomaly detection can be divided into two main categories: batch log-based anomaly detection and streaming logbased anomaly detection. Batch log-based anomaly detection is computationally heavy and does not allow us to instantaneously detect anomalies. On the other hand, streaming anomaly detection allows for immediate alert. However, current streaming approaches are mainly supervised. In this work, we propose a fully unsupervised framework which can detect anomalies in real time. We test our framework on hdfs log files and successfully detect anomalies with an F- 1 score of 83%.