Co-scheduling tasks on multi-core heterogeneous systems: An energy-aware perspective

Single-ISA heterogeneous multi-core processors trade off power against performance; however, threads that co-run on shared resources suffer from resource contention, which degrades performance and wastes energy. The authors introduce a novel approach to optimise the co-scheduling of multi-threaded applications on heterogeneous processors. The approach is based on the concept of a stakes function, which represents the trade-off between isolating and sharing resources. The authors also develop a co-scheduling algorithm that uses stakes functions to optimise resource usage while mitigating resource contention, thus improving performance and energy efficiency. They validated the approach using applications from the Princeton Application Repository for Shared-Memory Computers (PARSEC) benchmark suite, obtaining up to 12.88% performance speed-up, 13.65% energy speed-up and 28.29% energy-delay speed-up with respect to the standard Linux heterogeneous multi-processing scheduler.
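The three reported figures can be related by a short calculation. The sketch below assumes that each speed-up is the ratio of the baseline value to the optimised value minus one, and that the energy-delay metric is the product of energy and execution time; neither definition is stated in the abstract, and the function names are hypothetical. Under those assumptions the energy-delay gain is the compound of the other two, and 1.1288 × 1.1365 ≈ 1.2829 agrees with the reported 28.29%.

```python
def speedup(baseline: float, optimised: float) -> float:
    """Relative gain of `optimised` over `baseline` (0.10 means 10%).

    Assumed definition: baseline / optimised - 1, for a metric where
    lower is better (execution time, energy, energy-delay product).
    """
    return baseline / optimised - 1.0


def energy_delay_speedup(perf_gain: float, energy_gain: float) -> float:
    """Compound gain on the energy-delay product (assumed to be energy * time).

    If time shrinks by a factor (1 + perf_gain) and energy shrinks by a
    factor (1 + energy_gain), their product shrinks by the product of
    the two factors.
    """
    return (1.0 + perf_gain) * (1.0 + energy_gain) - 1.0


# Best-case figures quoted in the abstract.
perf_gain = 0.1288    # 12.88% performance speed-up
energy_gain = 0.1365  # 13.65% energy speed-up

edp_gain = energy_delay_speedup(perf_gain, energy_gain)
print(f"energy-delay speed-up: {edp_gain:.2%}")  # prints 28.29%
```

Because the abstract reports "up to" values, this agreement suggests, but does not prove, that the three maxima come from the same experiment.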

Inspec keywords: power aware computing; instruction sets; Linux; multi-threading; resource allocation; multiprocessing systems; processor scheduling; performance evaluation

Other keywords: PARSEC benchmark suite; energy inefficiency; energy-aware perspective; multithreaded applications; coscheduling tasks; Linux heterogeneous multiprocessing scheduler; single instruction set architecture multicore heterogeneous systems; resource sharing; performance speed-up; performance degradation; stakes function; resource contention mitigation; resource usage optimisation; energy delay speed-up; energy efficiency improvement; resource isolation; performance improvement

Subjects: Performance evaluation and testing; Operating systems; Multiprocessing systems; Parallel software

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cdt.2015.0053