Disciplines
Digital Communications and Networking | Hardware Systems
Abstract (300 words maximum)
This research project aims to detect malicious packets inside encrypted network traffic while preserving user privacy. As encryption has become less expensive to implement, more and more network traffic is encrypted. Currently, 90% of all network traffic is encrypted, and this share is expected to grow. Malware creators employ various methods to ensure delivery of their malware, including encryption. One proposed countermeasure applies machine learning to packet attributes to determine whether a packet contains malware without inspecting its contents. These attributes may include the packet's type, size, sender and receiver addresses, resemblance to other packets, and trusted root certificate. Algorithms employed for this purpose include XGBoost, SVM, neural networks, and Random Forest. Analyzing the trusted certificates associated with packets also shows promise. Although certificate analysis alone has a low success rate of around 70%, it has been observed that employing a less successful analysis early in the machine learning process can improve overall effectiveness. Unfortunately, no one has yet produced an implementation suitable for real-world use. Current methods tend to be around 97% effective, and even a very high success rate of 99% is not acceptable when billions of packets are being analyzed: a combined false positive and false negative rate of 1% still amounts to many millions of errors. As such, more research is needed to stay ahead of attackers and strengthen cybersecurity.
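The scale argument in the abstract can be made concrete with a short calculation. The sketch below is illustrative only: the volume of one billion packets is a hypothetical round figure, the function name `expected_errors` is invented for this example, and "effectiveness" is assumed to mean overall classification accuracy, so that the remainder counts false positives and false negatives together.

```python
def expected_errors(total_packets: int, accuracy_percent: int) -> int:
    """Return the expected number of misclassified packets.

    Assumes accuracy_percent is the overall accuracy in whole percent,
    so (100 - accuracy_percent) covers false positives and false
    negatives combined. Integer arithmetic avoids rounding surprises.
    """
    return total_packets * (100 - accuracy_percent) // 100


# Hypothetical traffic volume: one billion packets.
PACKETS = 1_000_000_000

for accuracy in (97, 99):
    errors = expected_errors(PACKETS, accuracy)
    print(f"{accuracy}% accurate over {PACKETS:,} packets: {errors:,} errors")
```

Under these assumptions, a classifier that is 99% accurate still misclassifies ten million packets per billion, and one that is 97% accurate misclassifies thirty million.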
Academic department under which the project should be listed
CCSE - Information Technology
Primary Investigator (PI) Name
Liang Zhao
Encrypted Malicious Network Traffic Detection Using Machine Learning