
Software crowdsourcing task pricing based on topic model analysis

In software crowdsourcing, the task prize is a primary incentive for engaging crowd developers. One of the main challenges in crowdsourcing task pricing is determining prizes that attract qualified workers. A few studies have proposed methods to address this challenge, but they are either too theoretical or too restricted to be applied in early crowdsourcing planning. In this study, we propose a novel approach, PTMA, to support early task pricing in software crowdsourcing from textual task requirements. PTMA consists of three phases: data pre-processing, topic extraction, and topic-based task pricing analysis. The last phase integrates 6 machine learning algorithms and 3 analogy-based models. PTMA is evaluated using data from 2016 software crowdsourcing tasks extracted from TopCoder, the largest software crowdsourcing platform. The results show that: 1) textual requirement information can aid early task pricing in software crowdsourcing; 2) the best predictor in PTMA, based on logistic regression, achieves an accuracy of 88.3% in Pred(30); and 3) PTMA outperforms the existing baseline models by 9% in Pred(30). PTMA greatly simplifies the pricing process by using only textual task descriptions as input, and can achieve better prediction accuracy in making task pricing decisions.
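The abstract reports accuracy as Pred(30) without defining it. In the estimation literature, Pred(l) is conventionally the fraction of predictions whose magnitude of relative error, |actual - predicted| / actual, is at most l per cent. The sketch below assumes that conventional definition; the function name `pred` and the sample prizes are hypothetical and are not taken from the study.

```python
def pred(actual, predicted, level=30):
    """Fraction of predictions with relative error of at most `level` per cent.

    Assumes the conventional Pred(l) definition; the paper's exact
    formulation may differ.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    threshold = level / 100.0
    hits = sum(
        1
        for a, p in zip(actual, predicted)
        if abs(a - p) / a <= threshold  # magnitude of relative error
    )
    return hits / len(actual)


# Hypothetical task prizes in dollars, for illustration only.
actual_prizes = [1000, 1500, 800, 1200]
predicted_prizes = [900, 2100, 1000, 1250]
score = pred(actual_prizes, predicted_prizes)  # three of four within 30%
```

Under this reading, a Pred(30) of 88.3% would mean that roughly 88 in 100 predicted prizes fall within 30% of the actual prize.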

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-sen.2019.0168