Work Package 1

From Superoptimization
Jump to: navigation, search

Work package summary

Start date: 2 June 2014
End date: 18 July 2014

Description

  1. Identify code that can be superoptimized (for speed, size or energy) focussing on finding awkward constructs.
  2. Use rapid prototyping to evaluate key algorithms. Consider novel constructive techniques as well as exploring existing approaches.

Deliverables

Analysis of what code can be superoptimized. Prototype code that can be used to define the requirements of a future product. Example sequences identified and superoptimized, using multiple techniques.

Dependencies

None.

Status

Complete.

Approach

Identify code that can be superoptimized (for speed, size or energy) focussing on finding awkward constructs.
Use rapid prototyping to evaluate key algorithms. Consider novel constructive techniques as well as exploring existing approaches.

Notes

Difficult constructs

Memory
Possibly not too difficult if only considering a single basic block
Memory liveness needs to be taken into account (To discuss next week)
Skip instructions
Possibly difficult to encode. Consider as seperate basic blocks for now?
Flags
Naively modelling flags results in a large number of constraints
However, the constraints should fall naturally out of the bitvector arithmetic in most cases.
Increase in constraint-generator complexity for reduction in final constraint complexity (which should reduce superoptimization time)
Few instructions that use the flags (except branches), so maybe low effectiveness in superoptimization


How can we handle larger basic blocks?

Hopefully be able to generate instruction sequences up to length 10 (maybe 15).
The set of instructions used for superoptimization should be restricted (conservatively).
E.g. if the sequence does not load or store memory, then perhaps we do not need the memory instructions
Some basic blocks are larger, however we may be able to split the basic block
Consider the dataflow DAG, cyclomatic complexity. Can the DAG be partitioned (possibly make a cut which cuts the shortest amount of edges – where, how, how many?)
Potential for superoptimization often inhibited by load/stores dispersed through the code.
The load/store are a single chain which cannot be reordered/eliminated, since the volatility of the address is not known (see Illustration 1).
Code to be superoptimized could chosen by selecting a subgraph with few connections to the load/store chain (highlighted area on the right).
The complexity of the DAG and size of the basic blocks are higher than expected. The number of instructions per basic block could greatly slow down superoptimization, although partitioning the DAG into smaller areas may help.
The edges represent the live registers/flags. The search can be restricted based on liveness information

Deliverables

The above notes, and notes on the Meetings page are the output of WP1, to be formulated into the project report (WP4)