© The Institution of Engineering and Technology
Machine learning (ML) techniques have been widely applied in recent traffic classification. However, flow-level statistics tend to improve the accuracy for some applications while reducing it for others. To address this problem, the authors propose a cascaded traffic classifier composed of several binary sub-classifiers and a multiclass sub-classifier. The authors first present theorems that show how to build an optimal cascade of sub-classifiers, and then design a cascaded classification algorithm that improves the accuracy of flow-level traffic classification. In addition, to improve classification speed, the authors propose a parallel scheme for the cascaded classifier. The authors evaluate their approaches on traces captured from entirely different networks. Compared with previous multiclass traffic classifiers built in a one-time training process, the cascaded classifier is superior in terms of both the overall accuracy and the accuracy for each application.
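The cascade structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the feature names (`mean_pkt_size`, `pkt_count`), the class labels, the threshold rules and the stage order are all hypothetical stand-ins, whereas in the paper the sub-classifiers are trained and their order follows from the stated theorems. The sketch only shows the control flow: each binary stage either claims a flow for its application or passes it on, and a multiclass sub-classifier handles whatever remains.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical flow representation: a dict of flow-level statistics.
Flow = Dict[str, float]
BinaryStage = Tuple[str, Callable[[Flow], bool]]


def cascade_classify(
    flow: Flow,
    binary_stages: List[BinaryStage],
    multiclass_fallback: Callable[[Flow], str],
) -> str:
    """Return the label of the first binary stage that accepts the flow,
    or the multiclass sub-classifier's label if no stage accepts it."""
    for label, is_member in binary_stages:
        if is_member(flow):
            return label
    return multiclass_fallback(flow)


# Toy stand-ins for trained sub-classifiers (assumptions, not from the paper).
binary_stages: List[BinaryStage] = [
    ("dns", lambda f: f["mean_pkt_size"] < 100 and f["pkt_count"] <= 4),
    ("bulk", lambda f: f["mean_pkt_size"] > 1200),
]


def multiclass_fallback(f: Flow) -> str:
    # Stand-in for the final multiclass sub-classifier.
    return "web" if f["pkt_count"] < 50 else "p2p"


flows: List[Flow] = [
    {"mean_pkt_size": 80, "pkt_count": 2},
    {"mean_pkt_size": 1400, "pkt_count": 300},
    {"mean_pkt_size": 600, "pkt_count": 20},
    {"mean_pkt_size": 600, "pkt_count": 200},
]
labels = [cascade_classify(f, binary_stages, multiclass_fallback) for f in flows]
print(labels)
```

Because each flow is classified independently of the others, a loop like this could in principle be spread across workers; the parallel scheme the authors propose is not reproduced here.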
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-com.2017.0091