Balanced feature selection method for Internet traffic classification

IET Networks

In Internet traffic classification, the class imbalance problem is mainly addressed by adjusting the class distribution. Meanwhile, feature selection is also a key factor contributing to this problem. A new filter feature selection method, called balanced feature selection (BFS), is therefore proposed. Every feature is measured both locally and globally, and an optimal feature subset is then selected by the proposed search model. A certainty coefficient is presented to measure the correlation between a feature and a particular class locally, and symmetric uncertainty is used to measure the correlation between a feature and all classes globally. In experiments on two real traffic traces using three classification algorithms, BFS is compared with five existing feature selection methods. The results show that BFS outperforms the other methods, with a g-mean improvement of more than 15.29%. Averaged over all datasets and classifiers, BFS achieves 59.54% g-mean, 86.35% Mauc and 91.42% overall accuracy.
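For readers unfamiliar with two of the quantities named in the abstract, the following is a minimal sketch of their standard textbook definitions: symmetric uncertainty between a discrete feature and the class labels, and g-mean as the geometric mean of per-class recall. It is not the authors' implementation. The function names are illustrative assumptions, features are assumed to be already discretised, and the certainty coefficient and the BFS search model are not defined in the abstract, so they are not reproduced here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(feature, labels):
    """Standard definition: SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)).

    Returns a value in [0, 1]; 0 means independent, 1 means each
    variable fully determines the other.
    """
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy
    return 2.0 * mutual_info / (h_x + h_y)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall over the classes in y_true."""
    classes = sorted(set(y_true))
    product = 1.0
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        product *= hits / total
    return product ** (1.0 / len(classes))
```

Because g-mean multiplies the per-class recalls, a classifier that misses a minority class entirely scores zero regardless of its overall accuracy, which is why the metric suits imbalanced traffic data.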

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-net.2011.0049