Beam hopping (BH) is a key technology for improving system throughput and reducing transmission delay in multibeam satellite systems. The objective of this study is to find a policy that maximises the expected long-term resource utilisation. The BH illumination plan (BHIP) optimisation problem, aimed at minimising the transmission delay, is formulated and modelled as a partially observable Markov decision process. To tackle the issues of unknown dynamics and prohibitive computation, deep reinforcement learning (DRL), an artificial intelligence method, is proposed for the first time to solve the BHIP problem in multibeam satellite systems. The proposed DRL-BHIP algorithm considers a series of realistic conditions, including the spatial distribution and temporal variation of traffic demands, ModCod constraints, the antenna radiation pattern and inter-beam interference. The state reformulation concept is adopted to characterise the spatial and temporal features of the traffic. Simulation results show that the proposed DRL-BHIP algorithm can decrease the transmission delay and improve the system throughput compared with existing algorithms.
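
The decision loop summarised in the abstract (observe per-beam traffic, choose which beams to illuminate, incur a delay-related cost) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the DRL-BHIP algorithm: the names `step`, `greedy_plan` and `simulate`, the fixed per-beam `capacity`, the uniform random arrivals, and the use of remaining backlog as a crude proxy for transmission delay are all hypothetical. ModCod constraints, the antenna radiation pattern and inter-beam interference are omitted, and a greedy largest-backlog rule stands in where a learned policy would go.

```python
import random

def step(queues, action, capacity):
    """Serve the illuminated beams for one slot; return (new_queues, reward)."""
    # Assumption: each illuminated beam drains up to `capacity` packets per slot.
    served = [min(q, capacity) if lit else 0 for q, lit in zip(queues, action)]
    remaining = [q - s for q, s in zip(queues, served)]
    # Assumption: negative total backlog is used as a rough delay proxy.
    reward = -sum(remaining)
    return remaining, reward

def greedy_plan(queues, k):
    """Illuminate the k beams with the largest backlog (baseline, not DRL)."""
    order = sorted(range(len(queues)), key=lambda i: queues[i], reverse=True)
    top = set(order[:k])
    return [i in top for i in range(len(queues))]

def simulate(num_beams=6, k=2, capacity=5, slots=50, seed=0):
    """Run the toy loop and return (cumulative_reward, final_queues)."""
    rng = random.Random(seed)
    queues = [0] * num_beams
    total = 0
    for _ in range(slots):
        # Hypothetical traffic model: 0 to 3 new packets per beam per slot.
        arrivals = [rng.randint(0, 3) for _ in range(num_beams)]
        queues = [q + a for q, a in zip(queues, arrivals)]
        action = greedy_plan(queues, k)   # a learned policy would be used here
        queues, reward = step(queues, action, capacity)
        total += reward
    return total, queues
```

In a reinforcement learning setting, the call to `greedy_plan` would be replaced by a policy trained to maximise the cumulative reward, and the queue vector would be replaced by whatever observation the agent actually receives.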


Deep reinforcement learning-based beam hopping algorithm in multibeam satellite systems
