SWAP slower than manual

  • SWAP slower than manual

    I've been using the SWAP statement in a very tight loop with many iterations, and I noticed that when I commented out the SWAP statement and replaced it with ordinary assignment code, there was a ~4.5 second performance gain. I thought SWAP would be faster since fewer assignments were involved. Perhaps this is a candidate for improvement in the next version of the compiler?

    -Mike

  • #2
    Care to post some sample code?

    --Dave


    ------------------
    PowerBASIC Support
    Home of the BASIC Gurus
    www.basicguru.com



    • #3
      The following isn't my working code, but it's set up in a similar fashion. My program deals with fairly large files; the 5,000,000 iterations of the outer loop represent a 5,000,000-byte file.

      If I register the x1 and x2 variables instead of the loop counters, performance improves for both versions of the swap, but the manual version still beats the SWAP statement version.

      Code:
      #COMPILE EXE
      #REGISTER NONE
      #DIM ALL
      
      FUNCTION PBMAIN()
      
         DIM x1 AS DWORD
         DIM x2 AS DWORD
         DIM temp AS DWORD
         Register i AS LONG
         Register j AS LONG
         DIM lngTime AS LONG
      
         
      
         lngTime = TIMER
      
          FOR i = 0 TO 5000000
              FOR j = 0 TO 15
                  SWAP x1,x2
                  'temp = x1
                  'x1 = x2
                  'x2 = temp
              NEXT j
              SWAP x1,x2
              'temp = x1
              'x1 = x2
              'x2 = temp
          NEXT i
         
         MSGBOX "Total: " & STR$(TIMER-lngTime) & " secs."
      
      END FUNCTION

      Got the code /code tags in this time, and added the second commented-out block for the outer swap (not really necessary to see the performance difference, but this way it more accurately reflects my working code).



      [This message has been edited by Mike Joseph (edited April 01, 2000).]



      • #4
        Hello,

        Using just LONGs, the ratio between using your own temp LONG and using the SWAP statement was 20 to 1, with the temp LONG method being the faster of the two. However, with large UDTs the ratio is much smaller (72 bytes gave a 2 to 1 ratio).

        I would suggest that the help file be updated to state that the SWAP function is not faster than using your own temp variable, it is just easier.

        Thanks,
        Colin Schmidt

        ------------------
        Colin Schmidt & James Duffy, Praxis Enterprises, Canada


        [This message has been edited by Colin Schmidt (edited April 02, 2000).]



        • #5
          Oops, by the time I came up with a reply there were two more posts!

          Here is the code I used:

          The first message box displays "6.76 - .38" and the second displays "3.74 - 2.19" (SWAP time first, then the manual swap, in seconds).

          Code:
          #COMPILE EXE
          
          TYPE myBigType
              a AS LONG
              b AS LONG
              c AS LONG
              d AS DOUBLE
              e AS DOUBLE
              f AS DOUBLE
          END TYPE
          TYPE myBiggerType
              a AS myBigType
              b AS myBigType
              c AS myBigType
          END TYPE
          
          FUNCTION PBMAIN
              LOCAL a AS LONG
              LOCAL b AS LONG
              LOCAL s AS LONG
              
              LOCAL x AS myBiggerType
              LOCAL y AS myBiggerType
              LOCAL z AS myBiggerType
              
              REGISTER c1 AS LONG, c2 AS LONG
              LOCAL t1 AS DOUBLE, t2 AS DOUBLE
              
              t1 = TIMER
              FOR c1 = 1 TO 20000000
                  SWAP a, b
              NEXT c1
              t1 = TIMER - t1
          
              t2 = TIMER
              FOR c1 = 1 TO 20000000
                  s = a
                  a = b
                  b = s
              NEXT c1
              t2 = TIMER - t2
              
              MSGBOX "Swapping Longs:> " + STR$(ROUND(t1, 3)) + " - " + STR$(ROUND(t2, 3))
              
              t1 = TIMER
              FOR c1 = 1 TO 400000
                  SWAP x, y
              NEXT c1
              t1 = TIMER - t1
          
              t2 = TIMER
              FOR c1 = 1 TO 400000
                  z = x
                  x = y
                  y = z
              NEXT c1
              t2 = TIMER - t2
              
              MSGBOX "Swapping UDTs:> " + STR$(ROUND(t1, 3)) + " - " + STR$(ROUND(t2, 3))
              
          END FUNCTION
          Colin Schmidt



          • #6
            SWAP is faster for UDTs, slower for LONGs.
            SWAP copies the UDT memory around directly, whereas manual UDT copies go through SafeArrayCopy. As for LONGs, SWAP is slower because it uses !XCHG when, oddly enough, it's faster just to !MOV the LONGs around.
            Code:
            #COMPILE EXE
            #REGISTER NONE
            #DIM ALL
            %NORMAL_PRIORITY_CLASS                       = &H20
            %IDLE_PRIORITY_CLASS                         = &H40
            %HIGH_PRIORITY_CLASS                         = &H80
            %REALTIME_PRIORITY_CLASS                     = &H100
            DECLARE FUNCTION GetCurrentProcess LIB "KERNEL32.DLL" ALIAS "GetCurrentProcess" () AS LONG
            DECLARE FUNCTION SetPriorityClass LIB "KERNEL32.DLL" ALIAS "SetPriorityClass" (BYVAL hProcess AS LONG, BYVAL dwPriorityClass AS LONG) AS LONG
            DECLARE FUNCTION QueryPerformanceCounter LIB "KERNEL32.DLL" ALIAS "QueryPerformanceCounter" (lpPerformanceCount AS QUAD) AS LONG
            DECLARE FUNCTION QueryPerformanceFrequency LIB "KERNEL32.DLL" ALIAS "QueryPerformanceFrequency" (lpFrequency AS QUAD) AS LONG
            DECLARE SUB DebugBreak LIB "KERNEL32.DLL" ALIAS "DebugBreak" ()
            TYPE TestType
              A AS LONG
              B AS ASCIIZ * 60
              C AS LONG
            END TYPE
            
            %Itterations = 5000000 ' 5 mil
            FUNCTION PBMAIN()
                LOCAL TestVariable1 AS TestType
                LOCAL TestVariable2 AS TestType
                LOCAL lngTemp AS TestType
            
                LOCAL A AS QUAD, B AS QUAD, OverHead AS QUAD, Freq AS QUAD
                LOCAL ResultsCode AS EXT, ResultsSwap AS EXT
                LOCAL strText AS STRING
             
                REGISTER I AS LONG
            
                SetPriorityClass GetCurrentProcess, %REALTIME_PRIORITY_CLASS
                   
                QueryPerformanceFrequency Freq
            
                QueryPerformanceCounter A
                QueryPerformanceCounter B
                OverHead = B - A
            
                TestVariable1.A = 16
                TestVariable2.C = 32
            
                QueryPerformanceCounter A
                FOR I = 1 TO %Itterations
                  lngTemp = TestVariable1
                  TestVariable1 = TestVariable2
                  TestVariable2 = lngTemp
                NEXT
                QueryPerformanceCounter B
                ResultsCode = (B - A - OverHead) / Freq
            
                QueryPerformanceCounter A
                FOR I = 1 TO %Itterations
                  SWAP TestVariable1, TestVariable2
                NEXT
                QueryPerformanceCounter B
                ResultsSwap = (B - A - OverHead) / Freq
            
                SetPriorityClass GetCurrentProcess, %NORMAL_PRIORITY_CLASS
                
                strText = "CODE: " & FORMAT$(%Itterations,"##,###,###") & " in " & FORMAT$(ResultsCode,"###0.####") & " seconds." & CHR$(13,10)
                strText = strText & "SWAP: " & FORMAT$(%Itterations,"##,###,###") & " in " & FORMAT$(ResultsSwap,"###0.####") & " seconds." & CHR$(13,10)
                MSGBOX strText
            END FUNCTION


            [This message has been edited by Enoch S Ceshkovsky (edited April 02, 2000).]



            • #7
              Enoch, I ran your code and it produced results similar to those of my own less precise code posted above. Both UDTs and LONGs were slower when using SWAP, although the ratio was again much closer with UDTs.

              With the UDT, the code took 26 seconds, and the SWAP took 30.
              With LONGs the code took .1 second and the SWAP took 1.7

              I'm running Win98SE on an AMD K6-2 350. Would the processor and/or OS make a difference here?

              Colin Schmidt




              • #8
                My Computer: P3-500, 256MB, Win2k Server.
                Code: 12.8813, Swap: 10.1714
                Swap should be faster because of the way it copies. Code copy has the SafeArray overhead. I hope someone can comment on why a UDT is treated as a SafeArray.




                • #9
                  Enoch --
                  Make the following changes in your code and please report the results:
                  Type TestType
                    A As Long
                    B As String * 60
                    C As Long
                    d As String * 1160
                  End Type
                  %Itterations = 1000000 ' 1 mil (enough)



                  • #10
                    Here are the results after Semen's change:
                    Code: 11.3797
                    Swap: 35.8503




                    • #11
                      I played with it a little.
                      On my PC (PIII-550, Win2000), if the length of the UDT is >= 100 bytes (regardless of the member types), "code" wins, and the longer the UDT, the bigger its advantage.
                      To tell the truth, that's not a big surprise to me because, as I imagine it, SWAP has to do the same work as the code plus obtain and release a buffer.
                      The surprise is that if the UDT is < 100 bytes, SWAP works faster.



                      • #12
                        My work computer is a P2-266 (NT Diagnostics says ~267), 256MB, NT 4.0 Server.
                        Move Code:
                        Code:
                            QueryPerformanceCounter A
                            FOR I = 1 TO %Itterations
                              MoveMemory VARPTR(lngTemp), VARPTR(TestVariable1), lngSize
                              MoveMemory VARPTR(TestVariable1), VARPTR(TestVariable2), lngSize
                              MoveMemory VARPTR(TestVariable2), VARPTR(lngTemp), lngSize
                            NEXT
                            QueryPerformanceCounter B
                            ResultsMove = (B - A - OverHead) / Freq
                        Using Semen's large UDT:
                        Code: 21.0439
                        Swap: 66.0067
                        Move: 07.5481

                        It has the potential to be a bit faster if lngSize were a constant, e.g. %TestUDTSize.
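
                        A minimal sketch of the constant-size variant mentioned above, not the code that produced the timings. Several things here are assumptions: the DECLARE for MoveMemory (aliased to RtlMoveMemory in KERNEL32.DLL with BYVAL pointer parameters, since the snippet passes VARPTR values but the declaration is not shown in the post), the value 68 for %TestUDTSize (4 + 60 + 4 bytes for the small TestType from the earlier listing, assuming byte packing), and the hypothetical name udtTemp for the scratch variable.

                        ```powerbasic
                        #COMPILE EXE
                        #DIM ALL
                        ' Assumed declaration: the pointers are passed BYVAL so VARPTR() results can be used directly.
                        DECLARE SUB MoveMemory LIB "KERNEL32.DLL" ALIAS "RtlMoveMemory" (BYVAL pDest AS DWORD, BYVAL pSrc AS DWORD, BYVAL cbLength AS LONG)

                        TYPE TestType
                          A AS LONG
                          B AS ASCIIZ * 60
                          C AS LONG
                        END TYPE

                        ' Assumed size: 4 + 60 + 4 bytes. Adjust if the UDT changes.
                        %TestUDTSize = 68

                        FUNCTION PBMAIN()
                            LOCAL TestVariable1 AS TestType
                            LOCAL TestVariable2 AS TestType
                            LOCAL udtTemp AS TestType   ' hypothetical scratch variable

                            ' Guard against the constant drifting out of sync with the UDT.
                            IF LEN(TestVariable1) <> %TestUDTSize THEN
                                MSGBOX "%TestUDTSize does not match the UDT size"
                                EXIT FUNCTION
                            END IF

                            TestVariable1.A = 16
                            TestVariable2.C = 32

                            ' Three block copies with a compile-time size instead of a variable.
                            MoveMemory VARPTR(udtTemp), VARPTR(TestVariable1), %TestUDTSize
                            MoveMemory VARPTR(TestVariable1), VARPTR(TestVariable2), %TestUDTSize
                            MoveMemory VARPTR(TestVariable2), VARPTR(udtTemp), %TestUDTSize

                            ' After the swap, TestVariable2.A and TestVariable1.C hold the values set above.
                            MSGBOX "TestVariable2.A =" & STR$(TestVariable2.A) & ", TestVariable1.C =" & STR$(TestVariable1.C)
                        END FUNCTION
                        ```

                        To time it, the three MoveMemory lines would go inside the same QueryPerformanceCounter loop as the snippet above; whether the constant gives a measurable gain would need to be checked on the machine in question.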


                        [This message has been edited by Enoch S Ceshkovsky (edited April 03, 2000).]
