Predictive multi-view content buffering applied to interactive streaming system

This Letter discusses the benefits of introducing machine learning techniques in multi-view streaming applications. Widespread use of machine learning techniques has contributed to significant gains in numerous scientific and industry fields. Nonetheless, these techniques have not yet been specifically applied to adaptive interactive multimedia streaming systems where, typically, the encoding bit rate is adapted based on resource availability, targeting the efficient use of network resources whilst offering the best possible user quality of experience (QoE). Intrinsic user data could be coupled with such existing quality adaptation mechanisms to derive better results, driven also by the preferences of the user. Head-tracking data, captured from camera feeds available at the user side, is an example of such data, to which recurrent attention models could be applied to accurately predict the focus of attention of users within video frames. Information obtained from such models could be used to assist a preemptive buffering approach of specific viewing angles, contributing to the joint goal of maximising QoE. Based on these assumptions, a research line is presented, focusing on obtaining better QoE in an already existing multi-view streaming system.
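
As an illustration of the preemptive buffering idea, the following minimal sketch shows one way predicted attention could steer which views are prefetched and at what bit rate. It is not the method of the Letter: the function `plan_prefetch`, the view names, the bit-rate ladder and the greedy allocation rule are all assumptions made for this example, and the per-view probabilities are taken as given (for instance, as the output of an attention model fed with head-tracking data).

```python
from typing import Dict, List


def plan_prefetch(view_probs: Dict[str, float],
                  bitrates_kbps: List[int],
                  budget_kbps: int) -> Dict[str, int]:
    """Pick one bit rate per view (0 = do not prefetch) within the budget.

    Hypothetical greedy rule: views are served in decreasing order of
    predicted attention, each taking the highest bit rate that still fits.
    """
    ladder = sorted(bitrates_kbps)
    plan = {view: 0 for view in view_probs}
    remaining = budget_kbps
    # Visit views from most to least likely to be watched next.
    for view in sorted(view_probs, key=view_probs.get, reverse=True):
        # Highest rung of the ladder that fits in the remaining budget.
        affordable = [b for b in ladder if b <= remaining]
        if not affordable:
            break  # nothing else fits, so the remaining views stay at 0
        plan[view] = affordable[-1]
        remaining -= plan[view]
    return plan


# Assumed example values: three camera views and a three-step ladder.
probs = {"left": 0.1, "centre": 0.7, "right": 0.2}
plan = plan_prefetch(probs, bitrates_kbps=[500, 1500, 3000], budget_kbps=4000)
```

With these assumed inputs, the most likely view receives the highest bit rate and the other views share what is left of the budget. A real system would also need to account for buffer occupancy and segment deadlines, which this sketch ignores.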

Inspec keywords: video coding; quality of experience; video streaming; telecommunication network reliability; telecommunication computing; learning (artificial intelligence)

Other keywords: machine learning techniques; multiview streaming applications; interactive multimedia; preemptive buffering approach; interactive streaming system; QoE; recurrent attention models; quality adaptation mechanisms; numerous scientific industry fields; predictive multiview content; head-tracking data; encoding bit rate; quality of experience

Subjects: Multimedia communications; Image and video coding; Communications computing; Video signal processing; Knowledge engineering techniques; Reliability

http://iet.metastore.ingenta.com/content/journals/10.1049/el.2019.0713