|
Our compiler, referred to as a 'single source' compiler,
is a parallelizing, simdizing compiler that generates, from a single
C or Fortran source file, multiple binaries targeting the PPE and
SPE processing elements. Our goal in this compiler is to generate highly
optimized code for the multiple levels of parallelism while providing
an abstraction of the underlying architectural intricacies, thus allowing
the user to develop applications for a parallel architecture with a single
shared memory image.
At the core of our compilation strategy is our technique
for abstracting the small local memories of the SPEs. Each SPE has a 256 KB
local memory, which is used for both data and instructions. An SPE can
directly access only its local store, requiring a DMA transfer whenever
it reads or writes locations in the shared system memory. This imposes
a significant burden on the programmer, especially for large programs accessing
large amounts of data. Our compiler-controlled software cache,
memory hierarchy optimizations, and code partitioning techniques
assume that all data resides in shared system memory and enable automatic
transfer of code and data while preserving coherence across all the local
SPE memories and system memory. This infrastructure provides the underpinning
for enabling parallelism across the Cell processing elements. Our current
compiler enables this via OpenMP pragmas, but our techniques will
easily support the existing auto-parallelization techniques in the compiler
framework. Other parallelization paradigms such as UPC could be developed
on this framework.
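
As an illustration of the software-cache idea described above, the following is a minimal sketch in C and not the compiler's actual runtime. It assumes a direct-mapped cache held in an SPE's local store, and it models shared system memory and DMA transfers with an ordinary array and memcpy. All names (cached_read, cached_write, cache_flush, dma_get, dma_put) and the sizes LINE_SIZE and NUM_LINES are hypothetical. The sketch covers a single SPE only and does not model coherence across several SPEs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 128u  /* bytes per cache line (assumed value) */
#define NUM_LINES 64u   /* lines held in the local store (assumed value) */

/* Stand-in for shared system memory; an SPE cannot address it directly. */
static uint8_t system_memory[1u << 16];

typedef struct {
    uint32_t tag;             /* line-aligned system-memory address */
    int      valid;           /* line holds data */
    int      dirty;           /* line was modified locally */
    uint8_t  data[LINE_SIZE]; /* copy kept in the local store */
} cache_line;

static cache_line cache[NUM_LINES];
static unsigned   miss_count;   /* number of simulated DMA fetches */

/* Hypothetical DMA stand-ins: a real SPE would issue asynchronous transfers. */
static void dma_get(void *local, uint32_t addr, uint32_t n) {
    memcpy(local, &system_memory[addr], n);
}
static void dma_put(const void *local, uint32_t addr, uint32_t n) {
    memcpy(&system_memory[addr], local, n);
}

/* Return the local line holding addr, fetching it on a miss. */
static cache_line *lookup(uint32_t addr) {
    assert(addr < sizeof system_memory);
    uint32_t base = addr & ~(LINE_SIZE - 1u);
    cache_line *line = &cache[(base / LINE_SIZE) % NUM_LINES];
    if (!line->valid || line->tag != base) {
        if (line->valid && line->dirty)
            dma_put(line->data, line->tag, LINE_SIZE); /* write back victim */
        dma_get(line->data, base, LINE_SIZE);
        line->tag = base;
        line->valid = 1;
        line->dirty = 0;
        miss_count++;
    }
    return line;
}

/* The accessors a compiler could substitute for a shared-memory load/store. */
uint8_t cached_read(uint32_t addr) {
    return lookup(addr)->data[addr % LINE_SIZE];
}
void cached_write(uint32_t addr, uint8_t value) {
    cache_line *line = lookup(addr);
    line->data[addr % LINE_SIZE] = value;
    line->dirty = 1;
}

/* Write all modified lines back, e.g. at a synchronization point. */
void cache_flush(void) {
    for (unsigned i = 0; i < NUM_LINES; i++) {
        if (cache[i].valid && cache[i].dirty) {
            dma_put(cache[i].data, cache[i].tag, LINE_SIZE);
            cache[i].dirty = 0;
        }
    }
}
```

In this sketch a program written against a single shared memory image would have each load or store to shared data replaced by cached_read or cached_write, so the DMA traffic is hidden from the programmer. The write-back policy and the flush at synchronization points are design assumptions made for the example.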

|