Date of Submission
Spring 4-28-2020
Degree Type
Thesis
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
Committee Chair/First Advisor
Dr. Dan Chia-Tien Lo
Track
Big Data
Chair
Dr. Coskun Cetinkaya
Committee Member
Dr. Donghyun Kim
Committee Member
Dr. Kai Qian
Abstract
The exponential growth of malware poses a significant threat to daily life, which relies heavily on computers running all kinds of software. Malware writers produce new variants, new innovations, new infections, and increasingly obfuscated malware using techniques such as packing and encryption. Malware classification and detection therefore play an important role in cyber security research and remain a major challenge for it. Because of the increasing rate of false alarms, accurate classification and detection of malware is a pressing problem. In this research, samples from eight malware families are classified by family. Four feature selection algorithms are applied to select the best features for this multiclass classification problem. The algorithms are compared, and the top 100 features they identify are selected for performance evaluation. Five machine learning algorithms are then compared to find the best model. The frequency distribution of the features is then obtained from the feature ranking of the best model. The research concludes that the frequency distribution of every character of the API call sequence can be used to classify malware families.
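As a rough illustration of the closing claim, the sketch below turns an API call sequence into a character frequency vector. It is not the thesis's implementation. It assumes each API call is encoded as a single character, and the function name `char_frequency`, the four-letter alphabet, and the sample sequence are all hypothetical.

```python
from collections import Counter


def char_frequency(sequence, alphabet):
    """Return the relative frequency of each alphabet character in sequence.

    Assumption: each character of `sequence` stands for one API call.
    Characters outside `alphabet` are ignored.
    """
    counts = Counter(sequence)
    total = sum(counts[ch] for ch in alphabet)
    if total == 0:
        # Empty sequence, or no character from the alphabet present.
        return [0.0] * len(alphabet)
    return [counts[ch] / total for ch in alphabet]


# Hypothetical encoding: four API calls mapped to the letters A-D.
ALPHABET = "ABCD"
features = char_frequency("AABAC", ALPHABET)
print(features)
```

A vector of this kind, one entry per character, could serve as the input to a feature selection step and a multiclass classifier of the sort the abstract describes.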