From another thread:
onemoar Wrote: march=pentium 3 ... seriously do you have any fking Idea how slow that is
I think you've missed the point here. The x86 build is currently compiled for i686 with MMX. I am suggesting raising this to Pentium 3. As for speed comparisons, I have done this before and it gains 1-2%. I would be happy to dig out the benchmarks I did with Nexuiz if you have any benchmarks to the contrary.
onemoar Wrote: xonotic should be using SSE2 at the very least
You're forgetting a few things here. This is for the x86 build, NOT the x86-64 build. Anyone with a 64-bit processor should be running the x86-64 build anyway, so there is no point optimising the x86 build to infinity for them. The x86 build needs to target those with Athlon XP and pre-EM64T Pentium 4 systems. As the Athlon XP does not support SSE2, doing as you suggest would wipe out support for the Athlon XP. SSE remains the highest common denominator for x86 systems, hence -march=pentium3 being the most sensible choice. Adding -mtune=pentium4 or -mtune=athlon-xp would penalise one or the other, so should not be done.
onemoar Wrote: as I have seen 20% performance gains simply by adding -O2 -march=AMDFAM10 -mtune=AMDFAM10 -SSE2
Firstly, a 20% gain is the sort of thing you might get by switching -O2 on instead of using no -O flag at all (the -O levels are all fairly similar to each other; I have benchmarks). Unless you are comparing with some very broken build, I would not expect a 20% performance improvement from the other flags, but of course if you have benchmarks, please provide them.
Building with -march=amdfam10 is also totally out of the question. For a start, that's a 64-bit architecture and you should be using the x86-64 build instead, whereas this is about the x86 build. But most importantly, building with -march=amdfam10 will exclude support for any AMD processors pre-Phenom and all Intel processors. That simply isn't going to happen.
As for -mtune=amdfam10, this is redundant in your CFLAGS. -mtune is for tuning for an architecture different from -march whilst not excluding a lower architecture. For example, if I were to build with -march=pentium3 -mtune=pentium4, the binary would run on a Pentium 3 but would run more efficiently on a Pentium 4. Making -mtune the same as -march does nothing, because -march already implies the matching -mtune. -SSE2 (the actual GCC flag is -msse2) is also redundant in your example, as -march=amdfam10 includes support for SSE2.
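To make the redundancy point concrete, here is a rough Python sketch that strips a -mtune which merely repeats -march. The function name drop_redundant_mtune is made up for illustration and this is not part of any real build script. It only assumes what is said above: -march=X already tunes for X.

```python
def drop_redundant_mtune(cflags):
    """Return cflags without any -mtune=X that merely repeats -march=X.

    Hypothetical helper for illustration only; it looks at the last
    -march in the list and ignores everything else about flag ordering.
    """
    march = None
    for flag in cflags:
        if flag.startswith("-march="):
            march = flag[len("-march="):]
    if march is None:
        return list(cflags)
    # Keep every flag except a -mtune naming the same CPU as -march.
    return [f for f in cflags if f != "-mtune=" + march]


if __name__ == "__main__":
    # The -mtune here adds nothing, so it is dropped.
    print(drop_redundant_mtune(["-O2", "-march=amdfam10", "-mtune=amdfam10"]))
    # Here -mtune differs from -march, so it is meaningful and kept.
    print(drop_redundant_mtune(["-O2", "-march=pentium3", "-mtune=pentium4"]))
```

The second call is the Pentium 3 / Pentium 4 case from above: the binary still runs on a Pentium 3, it is just scheduled with a Pentium 4 in mind.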
onemoar Wrote: you are suggesting a 30% loss in performance to save haft a MB
No, I'm suggesting a 1-2% performance gain coupled with a 1-2% reduction in binary size.
onemoar Wrote: bandwidth inst that expensive in fact I would like to see xonotic more heavy optimist-ed for recent hardware more use of SSE4 sets and more cpu dependent opflags its 2012
Your understanding of the effect of different CFLAGS is not entirely correct. Typically, when you raise the -march flag you actually make the binary SMALLER. An analogy would be rewriting a book with a larger vocabulary, which allows you to be more concise. If you are adding many extra CFLAGS and getting anecdotal performance improvements coupled with massive bloat in the binary, then you are doing things wrong.
onemoar Wrote: Pentium 3's and x86 are pretty much dead
Look, we have the x86-64 build. This is for 64-bit processors, and anyone using a modern system should be using it. We also have the x86 build. While this is still maintained, it needs to be targeted at those who need it most: Athlon XP and Pentium 4 users. -march=pentium3 is the highest common denominator for these two processors, and no across-the-board improvement will be had from any further instruction sets.
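The "highest common denominator" reasoning can be sketched as a set intersection. This is only an illustration: the table lists just the SIMD extensions each CPU type enables (as I understand the GCC documentation), and the names CPU_FEATURES, common_features and safe_march_candidates are made up for this example. A real -march value affects more than these extensions.

```python
# Simplified SIMD feature sets per -march value (assumption: SIMD only).
CPU_FEATURES = {
    "pentium3": {"mmx", "sse"},
    "pentium4": {"mmx", "sse", "sse2"},
    "athlon-xp": {"mmx", "3dnow", "sse"},
}


def common_features(cpus):
    """Features supported by every CPU in the list."""
    return set.intersection(*(CPU_FEATURES[c] for c in cpus))


def safe_march_candidates(targets):
    """-march values that use only features every target CPU has."""
    common = common_features(targets)
    return sorted(
        name for name, feats in CPU_FEATURES.items() if feats <= common
    )


if __name__ == "__main__":
    targets = ["athlon-xp", "pentium4"]
    print(sorted(common_features(targets)))
    print(safe_march_candidates(targets))
```

With the Athlon XP and Pentium 4 as targets, SSE2 and 3DNow! both fall out of the intersection, which leaves pentium3 as the only entry in this table that both can run.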
There is of course nothing stopping people from recompiling themselves, as you have done, to tune for their own systems, but the release binaries need to be kept universal. I'm just suggesting tightening that up a bit.
For future reference you may want to check out the Gentoo safe CFLAGS page:
http://en.gentoo-wiki.com/wiki/Safe_Cflags
I'm at least a reasonably tolerable person to be around - Narcopic

