Originally posted by John Gleason
#Optimize vs #Speed vs #Neither
-
Hmmmm.... "More Taste?" or "Less Filling?"
=====================================================
One machine can do the work of fifty ordinary men.
No machine can do the work of one extraordinary man.
Elbert Hubbard (1859-1915)
=====================================================
Last edited by Gösta H. Lovgren-2; 31 Mar 2009, 07:31 PM.
-
Bern, I just checked using PBwin 9.01 and it defaults to #OPTIMIZE SPEED, i.e. there are !nop's inserted throughout when no #OPTIMIZE is specified. Specifying #..SIZE results in no inserted !nop's. :hmmm:
-
FYI - I emailed PB support about the discrepancy between the Gazette and Help/Docs. They said the Help/Docs were incorrect. PB uses #OPTIMIZE SIZE by default.
-
Paul
> Try varying the number of NOPs from 1 to 16 and see if there is any variation.
I am only getting slight variations in the fourth significant figure - so negligible in practical terms.
Interestingly, the code is nearly four times slower if I use
Code:
Local hProcess As Long
hProcess = GetCurrentProcess()
SetProcessAffinityMask( hProcess, 1) ' ie CPU 0
Last edited by David Roberts; 31 Mar 2009, 09:20 AM.
-
David,
Try varying the number of NOPs from 1 to 16 and see if there is any variation.
Some architectures are less sensitive to alignment; if yours is one, then just optimize for size.
Athlons, especially when writing ASM, can vary in performance by over 20% just by adjusting the code alignment.
Paul.
-
Paul
I ran your code above quite a few times and then
> '!nop 'Uncommenting this NOP slows the loop by 5%
and found that on my machine there wasn't a blind bit of difference.
I had noticed previously when running code by yourself and John Gleason where nops were inserted deliberately that there was no difference in speed when I altered the number of insertions.
I have an Intel E6700 which I got shortly after its introduction. Perhaps it and the architecture which has followed since simply aren't sensitive to alignment.
-
David,
As for ALIGN see the fifth post here.
http://www.powerbasic.com/support/pb...ighlight=align
Paul.
-
There are certain alignments of instructions which cause a degradation in performance.
The effect is worst when certain instructions straddle cache line boundaries.
It is more noticeable in repeated, tight loops.
The recommended way to reduce these effects is to align branch targets with the start of the cache line, which reduces the likelihood of an instruction's performance being degraded in this way. This is done by padding the code with NOPs, which slightly increases the code size.
The effect of OPTIMIZE SPEED is not to make all the code faster, since a lot of code isn't in tight loops or isn't misaligned to start with, but to reduce the chances of a misalignment slowing your code down unexpectedly.
I can demonstrate it but the exact effects vary from CPU to CPU so the following code might not perform the same on your system. However, it is good practice on all CPUs.
The following code allows the alignment of the loop to be adjusted by inserting NOPs.
As posted, it runs in 4.09 clks per loop on my AthlonXP CPU.
Uncommenting the NOP slows this to 4.32 clks/loop, 5.6% slower.
Paul
Code:
#OPTIMIZE SIZE 'stop the compiler inserting NOPs and messing up my own alignment
FUNCTION PBMAIN () AS LONG
    #ALIGN 16 'set a fixed alignment which we'll adjust by inserting NOPs
    !nop
    !nop
    !nop
    !nop
    !nop
    !nop
    !nop
    '!nop 'Uncommenting this NOP slows the loop by 5%
    sum##=0
    cnt&=1000000000
    TIX a&&
    FOR r& = 1 TO cnt&
        sum## += r&
    NEXT
    TIX END a&&
    PRINT "CPU cycles per loop="; a&&/cnt&
    PRINT sum##
    WAITKEY$
END FUNCTION
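For anyone who wants to see the arithmetic behind the padding, here is a minimal sketch. It assumes a 16-byte boundary and 1-byte NOPs, and is written for PB/CC since it uses PRINT and WAITKEY$. PadBytes and the start address &H401000 are made up for illustration; they are not anything the compiler exposes, and the real code address of a loop will differ.

```powerbasic
#COMPILE EXE
#DIM ALL

' Hypothetical helper (not part of PowerBASIC): number of 1-byte NOPs needed
' to move addr up to the next multiple of boundary (0 if already aligned).
FUNCTION PadBytes (BYVAL addr AS LONG, BYVAL boundary AS LONG) AS LONG
    FUNCTION = (boundary - (addr MOD boundary)) MOD boundary
END FUNCTION

FUNCTION PBMAIN () AS LONG
    LOCAL addr AS LONG
    ' &H401000 is an assumed example address, chosen only for illustration
    FOR addr = &H401000 TO &H401010
        PRINT HEX$(addr, 8); " needs"; PadBytes(addr, 16); " pad byte(s)"
    NEXT
    WAITKEY$
END FUNCTION
```

The worst case is boundary - 1 pad bytes per aligned target, which is where the small size increase comes from.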
-
From the Docs: If not used, the default is to choose faster code speed.
From the Gazette: The alternative is to specify #OPTIMIZE SIZE (the default) ...
I played with #OPTIMIZE when PB9 first came out and I could not tell what the default was either.
As for ALIGN see the fifth post here.
-
time elapsed plummeted from 1.612 secs to 1.595 secs and Size ballooned from 654 to 741 bytes
-
Just read the latest newsletter regarding #Optimize vs #Speed. I took a 400k program I am currently working on and decided to try them out (having used neither previously). There was virtually no size difference/change between using either or none at all.
No complaints, just curiosity.
Here's a little sample code:
Code:
'PBWIN 9.01 - WinApi 05/2008 - XP Pro SP3
#Compile Exe
#Dim All
'>>>>>>> Virtually no difference in using the #Directives or not
'>>>>>>> Rem/Unrem to run test
'$Title = "No Directive"
'#Optimize SPEED
'$Title = "Speed Directive"
#Optimize Size
$Title = "Size Directive"
#Include "WIN32API.INC"
#Include "COMDLG32.INC"
#Include "InitCtrl.inc"
Sub Testing_directives
    Local ctr As Long
    Local l As String
    For ctr = 1 To 100000
        l$ = String$(ctr, "k")
    Next ctr
End Sub
Function PBMain
    ErrClear
    Local tmr As Double
    tmr = Timer
    Call Testing_directives
    ?Using$("Done in #.### secs", Timer - tmr),,$Title
End Function
'
Speed = 1.612 secs for each or none (identical for each condition)
Size = 655 bytes for no Directive
Size = 654 bytes when using either Directive (1 byte smaller)
===========================
He of whom many are afraid
ought to fear many.
Sir Francis Bacon
===========================
Later: just for fun I put an #ALIGN 16 in front of the For Ctr loop and the time elapsed plummeted from 1.612 secs to 1.595 secs and Size ballooned from 654 to 741 bytes (for nothing and for Speed). Note #Optimize Size speed stayed at 1.612 though size didn't change. {grin}
Gee, so many options. What's a guy to do? {sigh}
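To put the #ALIGN 16 trade-off in proportion, the figures above can be turned into percentages. A minimal sketch, assuming PB/CC for the PRINT output; PctChange is a made-up helper name, and the only inputs are the timings and sizes already quoted in this post.

```powerbasic
#COMPILE EXE
#DIM ALL

' Hypothetical helper: percentage change going from oldVal to newVal
' (negative means a reduction).
FUNCTION PctChange (BYVAL oldVal AS DOUBLE, BYVAL newVal AS DOUBLE) AS DOUBLE
    FUNCTION = (newVal - oldVal) / oldVal * 100
END FUNCTION

FUNCTION PBMAIN () AS LONG
    ' Figures quoted in the post: seconds elapsed and size in bytes
    PRINT "Time change %:"; PctChange(1.612, 1.595)
    PRINT "Size change %:"; PctChange(654, 741)
    WAITKEY$
END FUNCTION
```

That works out to roughly 1% less time for roughly 13% more code on this particular test.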
=====================================
"I choose a block of marble
and chop off whatever I don't need."
Francois-Auguste Rodin (1840-1917),
when asked how he managed
to make his remarkable statues
=====================================