Machine Learning Techniques for Malware Detection
DOI:
https://doi.org/10.32628/IJSRSET21858
Keywords:
Malware Classification, Encrypted Traffic, Feature Selection, Random Forest, Support Vector Machine
Abstract
The introduction of Transport Layer Security (TLS) has been one of the most important contributors to the privacy and security of internet communications during the last decade. Malware authors have followed suit, using TLS to hide potentially dangerous network connections. Because of the growing use of encryption and other evasion measures, traditional content-based classification of network traffic is becoming more challenging. In this paper, we present a malware classification technique that applies machine learning algorithms to packet information. We first eliminate features that are highly correlated with one another. From the remaining features, we then use the Random Forest method to select the 10 best. After feature selection, we train several classification algorithms, including support vector machine and random forest, and evaluate their performance. The random forest algorithm performed exceptionally well in our experiments, achieving an accuracy of over 0.99.
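The three stages named in the abstract (correlation filtering, Random Forest feature selection, and classifier comparison) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the synthetic data from `make_classification`, the 0.9 correlation threshold, the 70/30 split, the RBF kernel, and the helper names `drop_correlated`, `top_k_by_forest`, and `run_pipeline` are all assumptions made for the example. Only the choice of 10 features and the two classifier families come from the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def drop_correlated(X, threshold=0.9):
    """Return indices of columns kept after removing one column from each
    pair whose absolute Pearson correlation exceeds the threshold
    (threshold value is an assumption, not taken from the paper)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        # Keep column j only if it is not highly correlated with a kept column.
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return np.array(keep)


def top_k_by_forest(X, y, k=10, seed=0):
    """Return (sorted) indices of the k columns with the highest
    Random Forest impurity-based importance."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    order = np.argsort(forest.feature_importances_)[::-1]
    return np.sort(order[:k])


def run_pipeline(seed=0):
    # Synthetic stand-in for per-flow packet features (hypothetical data).
    X, y = make_classification(n_samples=600, n_features=30, n_informative=8,
                               n_redundant=10, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    # Both selection steps are fitted on the training split only.
    kept = drop_correlated(X_train, threshold=0.9)
    top = kept[top_k_by_forest(X_train[:, kept], y_train, k=10, seed=seed)]

    models = {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        # SVMs are scale-sensitive, so standardize before fitting.
        "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train[:, top], y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test[:, top]))
    return top, scores


if __name__ == "__main__":
    selected, scores = run_pipeline()
    print("selected feature indices:", selected)
    for name, acc in scores.items():
        print(f"{name}: accuracy = {acc:.3f}")
```

Fitting the correlation filter and the importance ranking on the training split alone keeps the test split out of feature selection. Accuracy on this synthetic data says nothing about the figures reported in the abstract.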
License
Copyright (c) IJSRSET

This work is licensed under a Creative Commons Attribution 4.0 International License.