The categorisation of network packets according to multiple parameters such as sender and receiver addresses is called packet classification. Packet classification lies at the core of Software-Defined Networking (SDN)-based network applications. Due to the increasing speed of network traffic, there is an urgent need for packet classification at higher speeds. Although it is possible to accelerate packet classification algorithms through hardware implementation, this solution imposes high costs and offers limited development capacity. On the other hand, current software methods to solve this problem are relatively slow. A practical solution to this problem is to parallelise packet classification using multi-core processors. In this study, the Thread, parallel patterns library (PPL), open multi-processing (OpenMP), and threading building blocks (TBB) libraries are examined and implemented to parallelise three packet classification algorithms, i.e. tuple space search, tuple pruning search, and hierarchical tree. According to the results, the type of algorithm and rulesets may influence the performance of parallelisation libraries. In general, the TBB-based method shows the best performance among parallelisation libraries due to using a theft mechanism and can accelerate the classification process up to 8.3 times on a system with a quad-core processor.

References

1. 1)
  - 7. Zhou, S., Singapura, S.G., Prasanna, V.K.: ‘High-performance packet classification on Gpu’. High Performance Extreme Computing Conf. (HPEC), MA, USA, 2014.
2. 2)
  - 2. Dobrescu, M., Egi, N., Argyraki, K., et al: ‘Routebricks: exploiting parallelism to scale software routers’. Proc. of the ACM SIGOPS 22nd symp. on Operating Systems Principles, MT, USA, 2009.
3. 3)
  - 12. Li, T.-H., Chu, H.-M., Wang, P.-C.: ‘Ip address lookup using Gpu’. IEEE 14th Int. Conf. on High Performance Switching and Routing (HPSR), Taipei, Taiwan, 2013.
4. 4)
  - 30. Zhou, S., Qu, Y.R., Prasanna, V.K.: ‘Multi-core implementation of decomposition-based packet classification algorithms’. Int. Conf. on Parallel Computing Technologies, St. Petersburg, Russia, 2013.
5. 5)
  - 9. Zhao, Y., Chen, L., Xie, G., et al: ‘Gpu implementation of a cellular genetic algorithm for scheduling dependent tasks of physical system simulation programs’, J. Comb. Optim., 2018, 35, (1), pp. 293–317.
6. 6)
  - 11. Maghazeh, A., Bordoloi, U.D., Dastgeer, U., et al: ‘Latency-aware packet processing on Cpu-Gpu heterogeneous systems’. 2017 54th ACM/EDAC/IEEE Design Automation Conf. (DAC), TX, USA, 2017.
7. 7)
  - 6. Paulino, H., Marques, E.: ‘Heterogeneous programming with single operation multiple data’, J. Comput. Syst. Sci., 2015, 81, (1), pp. 16–37.
8. 8)
  - 10. Nottingham, A., Irwin, B.: ‘Gpu packet classification using opencl: a consideration of viable classification methods’. Proc. of the 2009 Annual Research Conf. of the South African Institute of Computer Scientists and Information Technologists, Vanderbijlpark, South Africa, 2009.
9. 9)
  - 32. Taylor, D.E., Turner, J.S.: ‘Classbench: a packet classification benchmark’, IEEE/ACM Trans. Netw. (ton), 2007, 15, (3), pp. 499–511.
10. 10)
  - 14. Deng, Y., Jiao, X., Mu, S., et al: ‘Npgpu: network processing on graphics processing units’, in ‘Theoretical and mathematical foundations of computer science’ (Springer, Germany, 2011), pp. 313–321.
11. 11)
  - 1. Mythrei, S., Dharmaraj, R.: ‘Packet classification based on standard access control list’, Int. J. Adv. Res. Comput. Eng. Technol. (IJARCET), 2014, 3, (1), pp. 172–175.
12. 12)
  - 3. Pao, D., Zhou, P., Liu, B., et al: ‘Enhanced prefix inclusion coding filter-encoding algorithm for packet classification with ternary content addressable memory’, IET Comput. Digit. Tech., 2007, 1, (5), pp. 572–580.
13. 13)
  - 29. Pong, F., Tzeng, N.-F.: ‘Harp: rapid packet classification via hashing round-down prefixes’, IEEE Trans. Parallel Distrib. Syst., 2011, 22, (7), pp. 1105–1119.
14. 14)
  - 17. Srinivasan, V., Suri, S., Varghese, G.: ‘Packet classification using tuple space search’. ACM SIGCOMM Computer Communication Review, NY, USA, 1999.
15. 15)
  - 28. Zhang, Y., Wang, H., Zhang, Y., et al: ‘An improved ant system algorithm based on PPL’. 2010 2nd Int. Conf. on Information Engineering and Computer Science, Hangzhou, China, 2010.
16. 16)
  - 26. Refsnes, P.R.: ‘Comparison of Openmp and threading building blocks for expressing parallelism on shared-memory systems’. The University of Bergen, 2011.
17. 17)
  - 16. Abbasi, M., Rafiee, M.: ‘A calibrated asymptotic framework for analyzing packet classification algorithms on Gpus’, J. Supercomput., 2019, 75, (10), pp. 6574–6611.
18. 18)
  - 4. Taylor, D.E.: ‘Survey and taxonomy of packet classification techniques’, ACM Comput. Surveys (CSUR), 2005, 37, (3), pp. 238–275.
19. 19)
  - 22. Available at http://msdn.microsoft.com/, Date Accessed 2019.
20. 20)
  - 24. Iordan, A.C., Jahre, M., Natvig, L.: ‘Tuning the victim selection policy of intel Tbb’, J. Syst. Archit., 2015, 61, (10), pp. 584–591.
21. 21)
  - 18. Wang, P.-C., Chan, C.-T., Tseng, W.-C., et al: ‘A fast packet classification by using enhanced tuple pruning’, in ‘Protocols for high speed networks’ (Springer, Germany, 2002), pp. 180–191.
22. 22)
  - 33. Azarkhish, E., Loi, I., Benini, L.: ‘A case for three-dimensional stacking of tightly coupled data memories over multi-core clusters using low-latency interconnects’, IET Comput. Digit. Tech., 2013, 7, (5), pp. 191–199.
23. 23)
  - 15. Beyeler, M., Oros, N., Dutt, N., et al: ‘A Gpu-accelerated cortical neural network model for visually guided robot navigation’, Neural Netw., 2015, 22, pp. 75–87.
24. 24)
  - 21. Stroustrup, B.: ‘A tour of C + +’ (Addison-Wesley Professional, England, 2013).
25. 25)
  - 23. Wolf, F., Psaroudakis, I., May, N., et al: ‘Extending database task schedulers for multi-threaded application code’. Proc. of the 27th Int. Conf. on Scientific and Statistical Database Management, CA, USA, 2015.
26. 26)
  - 20. Batty, M., Memarian, K., Owens, S., et al: ‘Clarifying and compiling C/C + + concurrency: from C + + 11 to power’, ACM SIGPLAN Notices, 2012, 47, (1), pp. 509–520.
27. 27)
  - 31. Qu, Y.R., Zhang, H.H., Zhou, S., et al: ‘Optimizing many-field packet classification on Fpga, multi-core general purpose processor, and Gpu’. 2015 ACM/IEEE Symp. on Architectures for Networking and Communications Systems (ANCS), Oakland, CA, USA, 2015.
28. 28)
  - 27. Salehian, S., Liu, J., Yan, Y.: ‘Comparison of threading programming models’. 2017 IEEE Int. Parallel and Distributed Processing Symp. Workshops (IPDPSW), FL, USA, 2017.
29. 29)
  - 13. Kang, K., Deng, Y.S.: ‘Scalable packet classification via Gpu metaprogramming’. Design, Automation & Test in Europe Conf. & Exhibition (DATE), Grenoble, France, 2011.
30. 30)
  - 5. Machado, R.S., Almeida, R.B., Jardim, A.D., et al: ‘Comparing performance of C compilers optimizations on different multicore architectures’. 2017 Int. Symp. on Computer Architecture and High Performance Computing Workshops (SBAC-PADW), Campinas, Brazil, 2017.
31. 31)
  - 8. Zheng, J., Zhang, D., Li, Y., et al: ‘Accelerate packet classification using Gpu: a case study on hicuts’, in ‘Computer science and its applications’ (Springer, Germany, 2015), pp. 231–238.
32. 32)
  - 25. Kim, C.G., Kim, J.G., Lee, D.H.: ‘Optimizing image processing on multi-core Cpus with intel parallel programming technologies’, Multimedia Tools Appl., 2014, 68, (2), pp. 237–251.
33. 33)
  - 19. Lim, H., Lee, S., Swartzlander, JrE.E.: ‘A new hierarchical packet classification algorithm’, Comput. Netw., 2012, 56, (13), pp. 3010–3022.

Efficient parallelisation of the packet classification algorithms on multi-core central processing units using multi-threading application program interfaces

References

Related content