Malware classification based on API calls and behaviour analysis

This study presents a runtime behaviour-based classification procedure for Windows malware. Runtime behaviours are extracted with a particular focus on identifying malicious sequences of application programming interface (API) calls, in addition to file, network and registry activities. N-gram mining and searching over API call sequences is introduced to discover episodes that represent the behaviour-based features of a malware sample. The Voting Experts algorithm is used to extract malicious API patterns from the API call sequences. The classification model is built by applying online machine learning algorithms and is compared with baseline classifiers. The model is trained and tested on a set of 17,400 malware samples belonging to 60 distinct families and 532 benign samples. The malware classification accuracy reaches 98%.
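
To illustrate the idea of mining n-grams over API call sequences, the following is a minimal sketch and not the authors' implementation. The function names `api_ngrams` and `ngram_counts`, the example trace and the choice of n = 2 are assumptions made purely for illustration; the abstract does not specify how the n-grams are extracted or which n is used.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def api_ngrams(calls: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous run of n API calls in one trace (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A trace shorter than n yields an empty list.
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def ngram_counts(traces: Iterable[List[str]], n: int) -> Counter:
    """Count n-gram occurrences across several traces (hypothetical helper)."""
    counts: Counter = Counter()
    for trace in traces:
        counts.update(api_ngrams(trace, n))
    return counts


# Illustrative trace only; not taken from the study's data set.
trace = [
    "CreateFileW", "WriteFile", "CloseHandle",
    "RegSetValueExW", "CreateFileW", "WriteFile",
]
counts = ngram_counts([trace], 2)
```

In a setting like the one described, such counts could serve as candidate behaviour-based features, with frequently recurring n-grams in malicious traces examined as possible episodes. How the study selects or weights them is not stated in the abstract.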


