Hadoop framework for efficient sentiment classification using trees

As data are generated ever faster, the authors must handle a massive volume of data with the help of conventional machine learning algorithms. Big data is an enormous volume of data that is beyond the capacity of traditional database software tools to collect, store, manage, and process within a stipulated time limit. Sentiment analysis analyses data by classifying text on the basis of the strength and polarity (positive/negative) of the opinion words that define it. For big data, Hadoop provides a platform on which users can develop their own sentiment analysis with the help of a lexicon dictionary, an available application programming interface (API), or external programs. The aim of classifying data is to analyse extensive data and develop an appropriate description or model for every organised class using the features present in the data. In this work, feature extraction based on term frequency-inverse document frequency (TF-IDF) is used together with the Hadoop framework to attain a useful classification with the help of random forest techniques.
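
As an illustration of the term frequency-inverse document frequency weighting named in the abstract, the following is a minimal sketch in plain Python. The abstract does not say which TF-IDF variant the authors use, so the unsmoothed form tf × log(N/df) is an assumption here, as are the function name `tf_idf` and the three toy review snippets. The count-then-aggregate layout only loosely mirrors a map and reduce pass; it is not Hadoop code.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenised document.

    Assumed weighting (not taken from the paper):
    weight = (count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # "Map"-like step: count the terms of each document independently.
    counts = [Counter(doc) for doc in docs]
    # "Reduce"-like step: in how many documents does each term occur?
    df = Counter()
    for c in counts:
        df.update(c.keys())
    weights = []
    for doc, c in zip(docs, counts):
        total = len(doc)
        weights.append({
            term: (n / total) * math.log(n_docs / df[term])
            for term, n in c.items()
        })
    return weights


# Hypothetical toy corpus, for illustration only.
docs = [
    "good phone great battery".split(),
    "bad phone poor battery".split(),
    "great screen good value".split(),
]
vectors = tf_idf(docs)
```

In this sketch a term that occurs in only one document (such as "bad") receives a larger weight than one shared across documents (such as "phone"), and a term present in every document receives a weight of zero. The resulting vectors are the kind of feature input a tree-based classifier such as a random forest could be trained on.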

Inspec keywords: learning (artificial intelligence); tree data structures; data analysis; pattern classification; sentiment analysis; data handling; feature extraction; Big Data

Other keywords: Hadoop framework; opinion words; massive volume; machine learning algorithms; stipulated time limit; efficient sentiment classification; sentiment analysis; big data; traditional database software tool; extensive data analysis

Subjects: Other topics in statistics; Knowledge engineering techniques; Natural language interfaces; File organisation; Document processing and analysis techniques

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-net.2019.0208