access icon free Optimising loops in dynamic dataflow

Dynamic dataflow allows simultaneous execution of instructions in different iterations of a loop, boosting parallelism exploitation. In this model, operands are tagged with their associated instance number, which is incremented as they go through the loop. Instruction execution is triggered when all input operands with the same tag become available. However, this traditional tagging mechanism often requires the generation of several control instructions to manipulate tags and guarantee the correct match. To address this problem, this work presents three dataflow loop optimisation techniques. The stack-tagged dataflow is a tagging mechanism that uses stacks of tags to reduce control overheads in dataflow. On the other hand, as nested loops may increase the overhead of stack-tag comparison, tag resetting can be used to set the tag to zero whenever it is safe, allowing a one-level reduction at the stack depth. Finally, loop skipping allows to further avoid stack comparison overhead in loops, when the number of iterations can be determined by the compiler. Experimental results show the overhead, drawbacks and benefits for the three optimisations presented. Moreover, the results suggested that a hybrid compiling approach can be used to get the best performance of each technique.

Inspec keywords: parallel programming; data flow graphs; data handling

Other keywords: stack tagged dataflow; dataflow overhead reduction; nested loops; tagging mechanism; dataflow loop optimisation; dynamic dataflow

Subjects: Combinatorial mathematics; Parallel programming; Data handling techniques

References

    1. 1)
    2. 2)
    3. 3)
      • 10. Alves, T.A.O.: ‘Dataflow execution for reliability and performance on current hardware’. PhD thesis, COPPE – UFRJ, 2014.
    4. 4)
    5. 5)
      • 13. Marzulo, L.A.J.: ‘Exploring multithreaded execution with dataflow oriented programming’. PhD thesis, COPPE – UFRJ, 2011.
    6. 6)
    7. 7)
      • 3. Swanson, S., Michelson, K., Schwerin, A., et al: ‘WaveScalar’. 36th Annual IEEE/ACM Int. Symp. on Microarchitecture, 2003, MICRO-36, 2003, pp. 291302.
    8. 8)
      • 6. ‘Tbb flowgraph’. Available at http://www.threadingbuildingblocks.org/docs/help/reference/flow_graph.htm, accessed 8 August 2014.
    9. 9)
      • 9. Santiago, L., Marzulo, L., Goldstein, B., et al: ‘Stack-tagged dataflow’. Int. Symp. on Computer Architecture and High Performance Computing Workshop (SBAC-PADW), 2014, 2014, pp. 7883.
    10. 10)
    11. 11)
      • 12. Alves, T.A.O., Kundu, S., Marzulo, L.A.J., et al: ‘Online error detection recovery for dataflow execution’. IEEE IOLTS, 2014.
    12. 12)
      • 11. Marzulo, L.A.J., Alves, T.A., Franca, F.M.G., et al: ‘TALM: a hybrid execution model with distributed speculation support’. Int. Symp. on Computer Architecture and High Performance Computing Workshops, 2010, pp. 3136.
    13. 13)
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cds.2015.0148
Loading

Related content

content/journals/10.1049/iet-cds.2015.0148
pub_keyword,iet_inspecKeyword,pub_concept
6
6
Loading