This is an open access article published by the IET under the Creative Commons Attribution-NonCommercial-NoDerivs License (http://creativecommons.org/licenses/by-nc-nd/3.0/)
This Letter discusses the benefits of introducing machine learning techniques in multi-view streaming applications. The widespread use of machine learning techniques has contributed to significant gains in numerous scientific and industrial fields. Nonetheless, these techniques have not yet been specifically applied to adaptive interactive multimedia streaming systems where, typically, the encoding bit rate is adapted based on resource availability, targeting the efficient use of network resources whilst offering the best possible user quality of experience (QoE). Intrinsic user data could be coupled with such existing quality adaptation mechanisms to derive better results, driven also by the preferences of the user. Head-tracking data, captured from camera feeds available at the user side, is an example of such data, to which recurrent attention models could be applied to accurately predict the focus of attention of users within video frames. Information obtained from such models could be used to assist a preemptive buffering approach for specific viewing angles, contributing to the joint goal of maximising QoE. Based on these assumptions, a research line is presented, focusing on obtaining better QoE in an already existing multi-view streaming system.
http://iet.metastore.ingenta.com/content/journals/10.1049/el.2019.0713