Date of Submission

Spring 4-28-2020

Degree Type


Degree Name

Master of Science in Computer Science (MSCS)


Computer Science

Committee Chair/First Advisor

Dr. Dan Chia-Tien Lo


Big Data


Dr. Coskun Cetinkaya

Committee Member

Dr. Donghyun Kim

Committee Member

Dr. Kai Qian


The exponential growth of malware poses a significant threat to daily life, which relies heavily on computers running all kinds of software. Malware writers continually produce new variants, innovations, and infections, as well as increasingly obfuscated malware, using techniques such as packing and encryption. Malware classification and detection therefore play an important role in cyber security research and remain a major challenge. Because of the increasing rate of false alarms, accurate classification and detection of malware is a pressing problem. In this research, samples from eight malware families are classified by family. Four feature selection algorithms are compared to select the best features for this multiclass classification problem, and the top 100 features chosen by these algorithms are used for performance evaluation. Five machine learning algorithms are then compared to find the best model. The frequency distribution of the features is then obtained from the feature ranking of the best model. The research concludes that the frequency distribution of each character of the API call sequence can be used to classify malware families.
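As a rough illustration of the closing idea, the sketch below computes a character frequency distribution for a single API call sequence. It assumes that each API call has already been encoded as one character; the abstract does not specify the encoding, and the function name `char_frequency`, the alphabet, and the sample string are hypothetical rather than taken from the thesis.

```python
from collections import Counter


def char_frequency(sequence, alphabet):
    """Return the relative frequency of each alphabet character in sequence.

    Characters outside the alphabet are ignored. An empty or non-matching
    sequence yields all zeros.
    """
    counts = Counter(sequence)
    total = sum(counts[ch] for ch in alphabet)
    if total == 0:
        return {ch: 0.0 for ch in alphabet}
    return {ch: counts[ch] / total for ch in alphabet}


# Hypothetical encoded API call sequence: one character per API call.
sample = "AABCA"
features = char_frequency(sample, "ABCD")
print(features)
```

Each resulting dictionary could serve as one row of a feature matrix, with one column per character, before feature selection and classification are applied.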