Inline high-bandwidth network analysis using a robust stream clustering algorithm

High-bandwidth network analysis is challenging, resource-consuming, and prone to inaccuracy because of the high volume, velocity, and variety of network traffic. The unbounded stream of incoming traffic forms a dynamic environment with unexpected changes, so analysis approaches must meet the requirements of high-bandwidth processing: incremental learning, inline processing, and outlier handling. This study proposes an inline stream clustering algorithm for high-bandwidth networks, designed to incrementally mine large amounts of continuously transmitted network traffic in settings where some outliers can be dropped before the traffic behaviour is determined. The algorithm maintains extended-meta-events, abstracting data structures kept over a sliding window, which enable it to address these challenges. Evaluation of the algorithm indicates its robustness, efficiency, and accuracy in analysing high-bandwidth networks.
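The abstract does not define extended-meta-events, so the following is only a minimal sketch of the general idea it describes: compact cluster summaries maintained incrementally over a sliding window, with sparse summaries treated as droppable outliers. It is not the authors' algorithm. The names `Summary` and `SlidingWindowClusterer` and the parameters `radius`, `window`, and `min_points` are hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Summary:
    """Hypothetical stand-in for an abstracting summary structure:
    point count, per-feature linear sum, and time of last update."""
    n: int
    linear_sum: list
    last_seen: int

    def centroid(self):
        return [s / self.n for s in self.linear_sum]


class SlidingWindowClusterer:
    """Illustrative one-pass clusterer (assumed design, not from the paper)."""

    def __init__(self, radius, window, min_points):
        self.radius = radius          # max distance to absorb a point
        self.window = window          # sliding-window length in time units
        self.min_points = min_points  # summaries below this are outliers
        self.summaries = []

    def insert(self, t, point):
        # Expire summaries that were not updated within the window.
        self.summaries = [s for s in self.summaries
                          if t - s.last_seen < self.window]
        # Find the nearest surviving summary by centroid distance.
        best, best_d = None, math.inf
        for s in self.summaries:
            d = math.dist(s.centroid(), point)
            if d < best_d:
                best, best_d = s, d
        if best is not None and best_d <= self.radius:
            # Incremental update: no raw points are stored.
            best.n += 1
            best.linear_sum = [a + b for a, b in zip(best.linear_sum, point)]
            best.last_seen = t
        else:
            self.summaries.append(Summary(1, list(point), t))

    def clusters(self):
        # Sparse summaries are dropped as outliers when reporting behaviour.
        return [s.centroid() for s in self.summaries
                if s.n >= self.min_points]


# Toy stream of (time, feature vector) pairs with one isolated point.
stream = [(0, (0.0, 0.0)), (1, (0.2, 0.1)), (2, (10.0, 10.0)),
          (3, (50.0, 50.0)), (4, (10.2, 9.8)), (5, (0.1, -0.1))]
model = SlidingWindowClusterer(radius=1.5, window=100, min_points=2)
for t, p in stream:
    model.insert(t, p)
print(model.clusters())
```

In this toy run the isolated point at (50, 50) forms a summary of its own that never reaches `min_points`, so it is excluded from the reported clusters. Each arriving point touches only the summaries, which is what makes this kind of processing incremental and single-pass.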


