top button
Flag Notify
    Connect to us
      Site Registration

Site Registration

Does gcc hoist memory-reference based loop invariant?

0 votes
428 views

For the below simple case, I think poly1[i] should be hoisted to outermost loop to avoid loading from innermost loop at each iteration.
But for arm-none-eabi target like cortex-m4, gcc fails to do so. I this a normal case or a missing optimization? Please advise.

void
PolyMul (float *poly1, unsigned int n1,
 float *poly2, unsigned int n2,
 float *polymul, unsigned int *nmul)
{
 unsigned int i, j;

 for (i = 0; i for (j = 0; j polymul[i+j] += poly1[i] * poly2[j];
}
posted Jun 13, 2013 by anonymous

Share this question
Facebook Share Button Twitter Share Button LinkedIn Share Button

1 Answer

0 votes

Unless you use 'restrict' qualifier to the pointer type, I don't think compiler can do such optimization due to aliasing issue.

answer Jun 13, 2013 by anonymous
Also I don't think -fstrict-aliasing could help for your case. Becase poly1, poly2, and polymul are all 'float *' pointer type.
Similar Questions
+2 votes

I want to use some loop optimizations. I have build GCC-4.8.1 using gmp 5.1.3, mpfr 3.1.2, mpc 1.0.1, cloog 0.18.0 and isl 0.11.1. I tried optimization flags -floop-interchange and -floop-block. I don't find any changes in the optimized code.

Are graphite loop transforms implemented in GCC-4.8.1. Should I configure and build GCC in a different way.

0 votes

I've noticed the difference in gcc and llvm behaviour with the following code:

$ cat test.c

int main()
{
 for(int i = 0;; ({break;}))
 printf("Hello, worldn");
}

$ clang test.c -pedantic &; ({break;}))
1 warning generated.
Hello, world

$ gcc test.c -std=gnu11 -pedantic &; ({break;}))
test.c:5:21: warning: ISO C forbids braced-groups within expressions [-Wpedantic]
 for(int i = 0;; ({break;}))

So, llvm thinks that this is GNU extension (seems like it really is), but compiles it, and gcc does not, even if the standard is specified as gnu11. Is it a bug?

+2 votes

I'm working on a program where a few switch statements end up being hotspots. Some manual experimentation shows that rearranging switch cases so that more commonly executed ones appear together helps, as does checking for the most common case separately. The downside is that it makes the code uglier. Can GCC perform these kinds of optimizations as part of PGO, and how good is it at it?

...