© The Institution of Engineering and Technology
Extracting a statistically significant result from video of natural phenomena can be difficult for two reasons: (i) there can be considerable natural variation in the observed behaviour and (ii) computer vision algorithms applied to natural phenomena may not perform correctly on a significant number of samples. This study presents one approach to cleaning a large, noisy visual tracking dataset so that statistically sound results can be extracted from the image data. In particular, analyses of 3.6 million underwater trajectories of a fish, together with the water temperature at the time of acquisition, are presented. Although there are many false detections and incorrect trajectory assignments, a combination of data binning and robust estimation methods yields reliable evidence that fish speed increases as water temperature increases. A data cleaning method is then proposed that uses a deep learning-based clustering algorithm to remove outliers arising from false detections and incorrect trajectory assignments. The results on the cleaned data also show a rise in fish speed as temperature goes up. Several statistical tests applied to both the cleaned and the uncleaned data confirm that both results are statistically significant and show an increasing trend. However, the latter approach also produces a cleaner dataset suitable for other analyses.
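As a rough illustration of how binning and robust estimation can expose a trend in noisy data, the sketch below bins synthetic speed measurements by temperature, takes the median speed in each bin, and estimates the trend with a Theil-Sen slope (the median of pairwise slopes). The abstract does not state which binning scheme or robust estimator the study uses, so the bin width, the choice of Theil-Sen, the function names, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def binned_median_speed(temps, speeds, bin_width=1.0):
    """Group speeds into temperature bins and return (bin centres, median speeds)."""
    temps = np.asarray(temps, dtype=float)
    speeds = np.asarray(speeds, dtype=float)
    # Integer bin index for each sample (hypothetical fixed-width bins).
    idx = np.floor(temps / bin_width).astype(int)
    centres, medians = [], []
    for b in np.unique(idx):
        centres.append((b + 0.5) * bin_width)
        # The median is robust to a minority of gross outliers in the bin.
        medians.append(np.median(speeds[idx == b]))
    return np.array(centres), np.array(medians)


def theil_sen_slope(x, y):
    """Robust trend estimate: the median of all pairwise slopes."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(x), k=1)
    return float(np.median((y[j] - y[i]) / (x[j] - x[i])))


# Synthetic stand-in data (not from the study): speed rises with temperature,
# and about 10% of samples are replaced by spurious values that mimic
# false detections or wrong trajectory assignments.
rng = np.random.default_rng(0)
n = 5000
temps = rng.uniform(20.0, 30.0, n)
speeds = 2.0 + 0.5 * temps + rng.normal(0.0, 1.0, n)
outliers = rng.random(n) < 0.1
speeds[outliers] = rng.uniform(0.0, 100.0, outliers.sum())

centres, medians = binned_median_speed(temps, speeds)
slope = theil_sen_slope(centres, medians)
print(f"bins: {len(centres)}, robust slope estimate: {slope:.3f}")
```

Using per-bin medians rather than means keeps a minority of gross outliers from dominating each bin, and the pairwise-slope median keeps any single aberrant bin from dominating the trend estimate.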