Where is the tool
Source file
Binary: not built or installed by default. You have to build it yourself:
- cd rose_buildtree/tutorial
- make loopProcessor
Documentation
See more at
- Chapter 38 of http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf
Command line options
..[buildtree/tutorial]./loopProcessor --help
loopProcessor <options> <program name>
-gobj: generate object file
-orig: copy non-modified statements from original file
# split loop
#----------------------------------
-splitloop: applying loop splitting to remove conditionals inside loops
-annot <filename>
-pre: apply partial redundancy elimination
-fd: apply finite differencing to array index expressions
# Debugging options
#----------------------------------
-debugloop: print debugging information for loop transformations;
-debugdep: print debugging information for dependence analysis;
-tmloop: print timing information for loop transformations;
# Use special function to denote array access (the special function can be replaced
# with macros after transformation). This option is for circumventing complex
# subscript expressions for linearized multi-dimensional arrays.
-arracc <funcname>: use function <funcname> to denote multi-dimensional array access;
opt <level=0>: the level of loop optimizations to apply; by default, only the outermost level is optimized;
# unroll loop:
#----------------------------------
-unroll [-locond] [-nvar] [poet] <-unrollsize> : unrolling innermost loops at <unrollsize>
# break up statements in loops
#----------------------------------
-bs <stmtsize> : break up statements in loops at <stmtsize>
-bk_poet <blocksize> : parameterize the blocking transformation
-par_poet <blocksize> : paralleization transformation using POET
# loop blocking
#----------------------------------
-bk1 <blocksize> :block outer loops
-bk2 <blocksize> :block inner loops
-bk3 <blocksize> :block all loops
# copy array
#----------------------------------
-cp <copydim> :copy array regions with dimensions <= <copydim>
-cp_poet<copydim> :parameterize array copy array regions; to be applied together with blocking.
# loop interchange
#----------------------------------
-ic1 :loop interchange for more reuses // ***
# loop fission
#----------------------------------
-fs0 : maximum distribution at all loops
-fs01 : maximum distribution at inner-most loops
# loop fusing
#----------------------------------
-fs1 :single-level loop fusion for more reuses
-fs2 :multi-level loop fusion for more reuses
# Max number of nodes to split for transitive dependence analysis (to limit the overhead of transitive dep. analysis)
-ta <int> :split limit for transitive dep. analysis
# set cache line size in evaluating spatial locality (affect decisions in applying loop optimizations)
-clsize <int> :set cache line size
# set maximum distance of reuse that can exploit cache (used to evaluate temporal locality of loops)
-reuse_dist <int> :set reuse distance
-dt :perform dynamic tuning
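For readers unfamiliar with the term, the following is a hand-written sketch of loop blocking (tiling) in general, the kind of transformation the -bk options request. It is not output from loopProcessor, and it does not show which loops the tool picks for -bk1, -bk2 or -bk3. The names matmul_plain, matmul_blocked, blocking_matches and the constant BS (standing in for <blocksize>) are assumptions made for illustration only.

```c
#define N  10   /* problem size, deliberately not a multiple of BS */
#define BS 4    /* hypothetical block size, the role <blocksize> plays */

/* Reference version: plain triple loop computing c += a * b. */
static void matmul_plain(double a[N][N], double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Hand-blocked version: the ii/jj/kk loops walk over BS-sized tiles,
 * the inner loops stay inside one tile so its data can be reused
 * while it is still in cache. */
static void matmul_blocked(double a[N][N], double b[N][N], double c[N][N])
{
    for (int ii = 0; ii < N; ii += BS)
        for (int jj = 0; jj < N; jj += BS)
            for (int kk = 0; kk < N; kk += BS)
                for (int i = ii; i < ii + BS && i < N; i++)
                    for (int j = jj; j < jj + BS && j < N; j++)
                        for (int k = kk; k < kk + BS && k < N; k++)
                            c[i][j] += a[i][k] * b[k][j];
}

/* Returns 1 when both versions compute the same result. */
int blocking_matches(void)
{
    static double a[N][N], b[N][N], c1[N][N], c2[N][N];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = (double)(i + j);
            b[i][j] = (double)(i - j);
            c1[i][j] = 0.0;
            c2[i][j] = 0.0;
        }

    matmul_plain(a, b, c1);
    matmul_blocked(a, b, c2);

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (c1[i][j] != c2[i][j])
                return 0;
    return 1;
}
```

Blocking only reorders iterations. For each c[i][j] the k values are still visited in ascending order, so the two versions are expected to agree exactly. Calling blocking_matches() from a small main is enough to try it.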
Example use
Loop fusion
// -----------test loop fusion input.c ---------------
#define N 1024
void foo(double a[N], double b[N], double c[N])
{
  int i,j;
  for (i = 0; i < N; i++)
    a[i - 1] = b[i];
  for (j = 0; j < N; j++)
    c[j] = a[j];
}
// command line
[..buildtree/tutorial]./loopProcessor -fs2 input.c
//------------------------ output---------------
// test loop fusion
#define N 1024
void foo(double a[1024],double b[1024],double c[1024])
{
  int i;
  int j;
  for (i = 0; i <= 1024; i += 1) {
    if (i <= 1023) {
      a[i - 1] = b[i];
    }
    else {
    }
    if (i >= 1) {
      c[-1 + i] = a[-1 + i];
    }
    else {
    }
  }
}
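One way to convince yourself that the fused loop preserves the behavior of the input is to run both versions on the same data and compare the results. The sketch below does that. The names foo_original, foo_fused and fusion_matches are hypothetical and not part of ROSE. Because the input writes a[i - 1] with i starting at 0, the harness assumes a buffer with one extra leading element so that a[-1] stays in bounds.

```c
#include <string.h>

#define N 1024

/* The two separate loops from input.c. */
static void foo_original(double *a, const double *b, double *c)
{
    int i, j;
    for (i = 0; i < N; i++)
        a[i - 1] = b[i];
    for (j = 0; j < N; j++)
        c[j] = a[j];
}

/* The fused loop as printed in the output above (empty else branches dropped). */
static void foo_fused(double *a, const double *b, double *c)
{
    int i;
    for (i = 0; i <= 1024; i += 1) {
        if (i <= 1023)
            a[i - 1] = b[i];
        if (i >= 1)
            c[-1 + i] = a[-1 + i];
    }
}

/* Returns 1 when both versions leave identical contents in a and c. */
int fusion_matches(void)
{
    /* buf1/buf2 hold N + 1 elements; a = buf + 1 makes a[-1] a valid slot. */
    static double b[N], buf1[N + 1], buf2[N + 1], c1[N], c2[N];
    int i;

    for (i = 0; i < N; i++) {
        b[i] = (double)(i + 1);
        c1[i] = 0.0;
        c2[i] = 0.0;
    }
    for (i = 0; i <= N; i++) {
        buf1[i] = -1.0;   /* same initial contents for both runs */
        buf2[i] = -1.0;
    }

    foo_original(buf1 + 1, b, c1);
    foo_fused(buf2 + 1, b, c2);

    return memcmp(buf1, buf2, sizeof buf1) == 0
        && memcmp(c1, c2, sizeof c1) == 0;
}
```

The fused loop runs one extra iteration (i = 1024) because the second statement is shifted by one: c[i - 1] can only read a[i - 1] after the first statement has written it in the same iteration. Calling fusion_matches() from a small main is enough to run the comparison.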
This article is issued from Wikibooks. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.