1. cpu optimization:
I tried different optimization flags for the ARM CPU and would advise using "CFLAGS_USER= -march=armv5te -mcpu=arm946e-s", as it reduces the autoexec.bin size by 12%, while "CFLAGS_USER= -march=armv5te -mtune=arm946e-s" only cuts 6%. Neither results in any performance change in the zebra benchmark.
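For anyone who wants to repeat the size comparison on their own builds, here is a minimal sketch of how the percentage can be computed. The function name and the file paths in the usage comment are hypothetical; it only assumes you kept a copy of autoexec.bin from each build.

```shell
#!/bin/sh
# size_reduction BASELINE CANDIDATE
# Prints how much smaller CANDIDATE is than BASELINE, as a whole percentage.
# Hypothetical helper, not part of the build system.
# BASELINE must not be empty (division by zero otherwise).
size_reduction() {
    # $(( ... )) strips the padding some wc implementations add.
    base=$(( $(wc -c < "$1") ))
    cand=$(( $(wc -c < "$2") ))
    # Integer arithmetic, rounds toward zero.
    echo $(( (base - cand) * 100 / base ))
}

# Usage (paths are placeholders):
#   size_reduction build-default/autoexec.bin build-mcpu/autoexec.bin
```

A negative result means the candidate build came out larger than the baseline.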
2. gcc optimization:
On my 60D, a -O2 build is 20% faster in zebras than -Os. However, -O3 crashes the camera, so I cannot say how much faster that would be. The problem has to be one or more of the flags -O3 adds over -O2, i.e. "-finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-vectorize, -fvect-cost-model, -ftree-partial-pre and -fipa-cp-clone". With trial and error it would be possible to find the culprit and exclude it, for example with "-O3 -fno-inline-functions", if anyone has time to spare.
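The trial-and-error search could be organized roughly like this: build once per flag, each time with -O3 and exactly one of the listed flags switched off, and note which build stops crashing. This is only a sketch. The function name is made up, the flag list is copied from the post above, and building and testing on the camera is still manual. If your gcc version rejects one of the names as unrecognized, drop it from the list.

```shell
#!/bin/sh
# Flags listed above as the -O3 additions over -O2 (without the -f prefix).
O3_FLAGS="inline-functions unswitch-loops predictive-commoning gcse-after-reload tree-vectorize vect-cost-model tree-partial-pre ipa-cp-clone"

# Print one CFLAGS_USER candidate per flag: -O3 with that single flag disabled.
# Hypothetical helper; it only prints the lines and does not run any build.
o3_candidates() {
    for f in $O3_FLAGS; do
        echo "CFLAGS_USER= -O3 -fno-$f"
    done
}

# Usage: run o3_candidates, then try each printed line in turn,
# rebuild, and check whether the camera still crashes.
```

If exactly one of these builds runs without crashing, the flag it disables is the likely culprit. If all of them still crash, more than one flag is probably involved, and going the other way (-O2 plus one added flag at a time) would narrow it down better.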
3. gcc version:
The Linaro gcc 4.7 shows no fps improvement over vanilla FSF 4.6. That isn't really a surprise: many Linaro optimizations are for the latest ARM cores, and only the zebra test was measured. Still, as I wrote before Ubuntu & Linaro have switched to Linaro and it certainly doesn't hurt. I'll run this build on my camera and see if any regressions occur, but since few things changed from gcc 4.6 to 4.7, I don't expect any.