Retrieving and mining professional experience of software practice from grey literature: an exploratory review

Retrieving and mining practitioners’ self-reports of their professional experience of software practice could provide valuable evidence for research. The authors are, however, unaware of any existing reviews of research conducted in this area. The authors reviewed and classified previous research, and identified insights into the challenges that research confronts when retrieving and mining practitioners’ self-reports of their experience of software practice. They conducted an exploratory review to identify and classify 42 studies. They analysed a selection of those studies for insights on challenges to mining professional experience. They identified only one directly relevant study. Even then, this study concerns the software professional's emotional experiences rather than the professional's reporting of behaviour and events occurring during software practice. They discussed the challenges concerning: the prevalence of professional experience; definitions, models and theories; the sparseness of data; units of discourse analysis; annotator agreement; evaluation of the performance of algorithms; and the lack of replications. No directly relevant prior research appears to have been conducted in this area. They discussed the value of reporting negative results in secondary studies. There is a range of research opportunities but also considerable challenges. They formulated a set of guiding questions for further research in this area.


