© The Institution of Engineering and Technology
Social media data now enriches and supplements information flow in various sectors of society. The question addressed here is whether social media can act as a credible information source of sufficient quality to meet the needs of transport planners, operators, policy makers and the travelling public. A typology of primary transport data needs and of current and new data sources is first established, after which this study focuses on social media textual data in particular. Three sub-questions are investigated: the potential to use social media data alongside existing transport data, the technical challenges in extracting transport-relevant information from social media and the wider barriers to the uptake of this data. An overview of the text mining process used to extract relevant information from the corpus is followed by a review of the challenges this approach poses for the transport sector. These include ontologies, sentiment analysis, location names and measuring accuracy. Finally, institutional issues affecting the greater use of social media are highlighted, and the study concludes that social media information has not yet been fully explored. The contribution of this study is in scoping the technical challenges in mining social media data within the transport context, laying the foundation for further research in this field.
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-its.2013.0214