Abstract:
Despite advances in microprocessor and Instruction Set Architecture (ISA) design such as dual-core processors, CELL technology and hyper-threading, the use of Single Instruction Multiple Data (SIMD) instructions remains a recognized method of obtaining speedup in multimedia and Digital Signal Processing applications.
The application of SIMD instructions within multimedia systems is typically static and non-configurable, and cannot adapt to dramatic changes in system specifications such as a change in data path width or ISA.
Consequently, scientific code that is vectorized for one specific instruction set will not run optimally on competing processors whose instruction sets differ in capabilities and subtleties. This problem is further exacerbated by the fact that most vector code is hand-written using combinations of inline assembly code and primitive functions that map to combinations of low-level instructions when compiled.
We present a dynamic compilation scheme that automatically detects so-called "hotspots" (i.e., regions of nontrivial computation) that are parallelizable in nature and vectorizes them with SIMD instructions through a target-independent sequence of code generation steps.
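
To make the notion of a vectorizable hotspot concrete, the following is a minimal illustrative sketch in portable C, not the scheme presented in this work. It shows a scalar saturating-add loop of the kind common in multimedia code, and the same loop strip-mined by a lane count. The function names and the `lanes` parameter are hypothetical; `lanes` stands in for whatever data path width a given target offers (for example, 16 for 8-bit elements in a 128-bit register), and the inner loop marks the work that a single SIMD instruction would perform on such a target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar hotspot: saturating add of two 8-bit buffers.
 * Each iteration is independent, so the loop is parallelizable. */
void add_sat_scalar(const uint8_t *a, const uint8_t *b,
                    uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        out[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}

/* The same loop strip-mined by `lanes` (hypothetical parameter: the
 * number of 8-bit elements the target's SIMD register can hold).
 * The inner loop over one chunk is the unit that would be replaced
 * by one target-specific SIMD saturating-add instruction. */
void add_sat_strip_mined(const uint8_t *a, const uint8_t *b,
                         uint8_t *out, size_t n, size_t lanes)
{
    assert(lanes > 0);
    size_t i = 0;

    /* Main body: whole chunks of `lanes` elements. */
    for (; i + lanes <= n; i += lanes) {
        for (size_t l = 0; l < lanes; l++) {
            unsigned sum = (unsigned)a[i + l] + b[i + l];
            out[i + l] = (uint8_t)(sum > 255u ? 255u : sum);
        }
    }

    /* Remainder: leftover elements handled in scalar form. */
    for (; i < n; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        out[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

Because the lane count is a parameter rather than a constant baked into hand-written intrinsics, the same loop structure applies unchanged when the data path width differs; only the code emitted for the inner chunk would vary by target.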