High-accuracy document classification with a new algorithm

T. Temel

Author(s): T. Temel¹
Affiliation: ¹ Department of Mechatronics Engineering, Faculty of Natural Sciences, Architecture and Engineering, Bursa Technical University, 16320 Bursa, Turkey
Source: Volume 54, Issue 17, 23 August 2018, pp. 1028–1030
DOI: 10.1049/el.2018.0790, Print ISSN 0013-5194, Online ISSN 1350-911X

Received 06/03/2018, Published 20/07/2018

A new algorithm based on the learning vector quantisation (LVQ) classifier is presented. It uses a modified proximity measure, which enforces a predetermined level of correct classification during training, together with a sliding-mode approach that keeps the weight updates varying stably towards convergence. The proposed algorithm and several well-known counterparts are implemented using Python libraries and compared on a text classification task for document categorisation. The results show that the new classifier is competitive with those algorithms in both training and testing performance.
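For orientation, the following is a minimal sketch of the classical LVQ1 rule applied to bag-of-words vectors, written with the Python standard library only. It is not the algorithm proposed in the Letter: the modified proximity measure and the sliding-mode weight update are not specified in this abstract, so plain squared Euclidean distance and the standard attract/repel update are used instead. The function names, the toy documents and the parameters `lr` and `epochs` are all assumptions made for illustration.

```python
import math
from collections import Counter


def build_vocab(docs):
    """Map every word seen in the training documents to a column index."""
    words = sorted({w for d in docs for w in d.lower().split()})
    return {w: i for i, w in enumerate(words)}


def to_vector(doc, index):
    """L2-normalised term-frequency vector; out-of-vocabulary words are ignored."""
    vec = [0.0] * len(index)
    for word, count in Counter(doc.lower().split()).items():
        if word in index:
            vec[index[word]] = float(count)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sq_dist(a, b):
    """Squared Euclidean distance (stand-in for the Letter's proximity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(protos, x):
    """Return the label of the nearest prototype."""
    return min(protos, key=lambda label: sq_dist(protos[label], x))


def train_lvq1(X, y, lr=0.3, epochs=20):
    """Classical LVQ1 with one prototype per class (hypothetical settings)."""
    # Initialise each prototype to the first training vector of its class.
    protos = {}
    for x, label in zip(X, y):
        protos.setdefault(label, list(x))
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        for x, label in zip(X, y):
            winner = predict(protos, x)
            # Attract the winning prototype if its label is correct, else repel it.
            sign = 1.0 if winner == label else -1.0
            w = protos[winner]
            for i in range(len(w)):
                w[i] += sign * rate * (x[i] - w[i])
    return protos


# Toy corpus, invented for this example only.
docs = [
    "stock market shares fall",
    "bank shares and market rates",
    "team wins football match",
    "football team scores goal",
]
labels = ["finance", "finance", "sport", "sport"]

index = build_vocab(docs)
X = [to_vector(d, index) for d in docs]
protos = train_lvq1(X, labels)

print(predict(protos, to_vector("market shares rise", index)))
print(predict(protos, to_vector("football goal", index)))
```

In a realistic setting the term-frequency vectors would typically be replaced by TF-IDF features from a library, and several prototypes per class would be used; the single-prototype form is kept here only to make the update rule easy to follow.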
