Hardware-assisted instruction profiling and latency detection

Debugging and profiling tools can alter execution flow or timing and can induce heisenbugs, so they are only marginally useful for debugging time-critical systems. Software tracing, however advanced, consumes precious computing resources. In this study, the authors analyse state-of-the-art hardware-tracing support, as provided in modern Intel processors, and propose a new technique that uses the processor hardware for tracing without any code instrumentation or tracepoints. They demonstrate the utility of their approach with contributions in three areas: syscall latency profiling, instruction profiling and software-tracer impact detection. They present improvements in performance and in the granularity of data gathered with the hardware-assisted approach, as compared with traditional software-only tracing and profiling. The performance impact on the target system, measured as time overhead, is on average 2–3%, with the worst case being 22%. They also define a way to measure and quantify the time resolution provided by hardware tracers for trace events, and observe the effect of fine-tuning hardware tracing for optimum utilisation. Compared with other in-kernel tracers, they observed that hardware-based tracing has a much lower overhead while achieving greater precision. Moreover, the other tracing techniques are ineffective in certain tracing scenarios.

http://iet.metastore.ingenta.com/content/journals/10.1049/joe.2016.0127