ASM Memory Copy..

    ASM Memory Copy..

    I am not familiar with assembly, but could someone who knows ASM write
    me a simple memory copy routine?

    I want to pass information to this ASM function and have it copy data
    from one location to another.

    Something like:

    Sub CopyMem (SourceLocation as DWORD, DestinationLocation as DWORD, NumBytes as DWORD)
    End Sub

    I need this routine to be as fast as possible, and it seems ASM is the
    only way to solve my problem.

    Any help would be appreciated.


    ------------------
    Explorations v3.0 RPG Development System
    http://www.explore-rpg.com
    Explorations v9.10 RPG Development System
    http://www.explore-rpg.com

    #2
    Use the CopyMemory API. It is as fast as you can get.

    Regards
    Peter

    ------------------
    [email protected]
    www.dreammodel.dk


      #3
      I like a "for dummies" approach, which runs at the same speed.

      Code:
      sub CpyMem(Destination As Dword, Source As Dword, Length As Long)     
         dim arrDest(Length-1) as byte at Destination    
         dim arrSour(Length-1) as byte at Source
         mat arrDest() = arrSour()
      end sub
      I won't name the author, because it looks like he has since disowned his own idea.

      Added later.
      I compared three methods. On my PC (Win2000) Steve's sub runs at the same speed as the API.
      MAT works well only with big chunks (>= 10-20 K).

      Code:
         #Compile Exe
         #Dim All
         #Register None
         
         %L = 10000
         %k = 100000
      
         Declare Function MoveMemory Lib "KERNEL32.DLL" Alias "RtlMoveMemory" (ByVal lpDest As Long, ByVal lpSource As Long, ByVal cbMove As Long) As Long
      
      
         Function MemCopyD(ByVal Source As Long, _
                                 ByVal Dest As Long, _
                                 ByVal ln As Long) As Long
        'Big mover, [ movsd ] BURP !
        'Written by Steve Hutchesson < [email protected] >
        '~~~~~~~~~~~~~~~~~~~~~~~~~~~
        Local lnth As Long, _
              divd As Long, _
              rmdr As Long
      
        ! cmp ln, 4           ; if under 4 bytes long
        ! jl tail             ; jump to label tail
        ! mov eax, ln         ; copy length into eax
        ! push eax            ; place a copy of eax on the stack
        ! shr eax, 2          ; integer divide eax by 4
        ! shl eax, 2          ; multiply eax by 4 to get dividend
        ! mov divd, eax       ; copy it into variable
        ! mov ecx, divd       ; copy variable into ecx
        ! pop eax             ; retrieve length in eax off the stack
        ! sub eax, ecx        ; subtract dividend from length to get remainder
        ! mov rmdr, eax       ; copy remainder into variable
        ! cld                 ; copy bytes forward
        ! mov ecx, ln         ; put byte count in ecx
        ! shr ecx, 2          ; divide by 4 for DWORD data size
        ! mov esi, Source     ; copy source pointer into source index
        ! mov edi, Dest       ; copy dest pointer into destination index
        ! rep movsd           ; repeat ECX times, move string DWORD
        ! mov ecx, rmdr       ; put remainder in ecx
        ! jmp over
      tail:
        ! mov ecx, ln         ; set counter if less than 4 bytes in length
        ! mov esi, Source     ; copy source pointer into source index
        ! mov edi, Dest       ; copy dest pointer into destination index
      over:
        ! rep movsb           ; copy remaining BYTES from source to dest
        ! sub ln, ecx         ; calculate return value ( little use )
      
        Function = ln         ' return bytes copied
      End Function
      
         Sub CpyMem(ByVal Destination As Dword, ByVal Source As Dword, ByVal Length As Long)
            ReDim bSource(0 : Length - 1) As Byte At Source
            ReDim bDestination(0 : Length - 1) As Byte At Destination
            Mat bDestination = bSource
      
         End Sub
      
         Function PbMain
      
            Dim s As Asciiz * %L
            Dim d As Asciiz * %L
            Dim t1 As Single, t2 As Single, i As Long
      
            s = "Test for CpyMem"
      
      
            t1 = Timer
            For i = 1 To %k
               MoveMemory VarPtr(d), VarPtr(s), %L
            Next
            t2 = Timer
            MsgBox Format$(t2 - t1, "#.### sec"),, "MoveMemory"
      
            t1 = Timer
            For i = 1 To %k
               CpyMem VarPtr(d), VarPtr(s), %L
            Next
            t2 = Timer
            MsgBox Format$(t2 - t1, "#.### sec"),, "MAT"
      
            t1 = Timer
            For i = 1 To %k
               MemCopyD VarPtr(s), VarPtr(d), %L   ' MemCopyD takes Source first, then Dest
            Next
            t2 = Timer
            MsgBox Format$(t2 - t1, "#.### sec"),, "Steve"
            
            'If d <> s Then MsgBox "Oh" Else MsgBox "Ok"
      
         End Function

      ------------------
      E-MAIL: [email protected]

      [This message has been edited by Semen Matusovski (edited March 24, 2001).]


        #4
        Steve Hutchesson once posted this to the Windows Forum. Very fast.
        All credits to Steve..
        Code:
        DECLARE FUNCTION MemCopyD(BYVAL Source AS LONG, _
                                 BYVAL Dest AS LONG, _
                                 BYVAL ln AS LONG) AS LONG
         
        FUNCTION MemCopyD(BYVAL Source AS LONG, _
                                   BYVAL Dest AS LONG, _
                                   BYVAL ln AS LONG) AS LONG
          'Big mover, [ movsd ] BURP !
          'Written by Steve Hutchesson < [email protected] >
          '~~~~~~~~~~~~~~~~~~~~~~~~~~~
          LOCAL lnth AS LONG, _
                divd AS LONG, _
                rmdr AS LONG
         
          ! cmp ln, 4           ; if under 4 bytes long
          ! jl tail             ; jump to label tail
          ! mov eax, ln         ; copy length into eax
          ! push eax            ; place a copy of eax on the stack
          ! shr eax, 2          ; integer divide eax by 4
          ! shl eax, 2          ; multiply eax by 4 to get dividend
          ! mov divd, eax       ; copy it into variable
          ! mov ecx, divd       ; copy variable into ecx
          ! pop eax             ; retrieve length in eax off the stack
          ! sub eax, ecx        ; subtract dividend from length to get remainder
          ! mov rmdr, eax       ; copy remainder into variable
          ! cld                 ; copy bytes forward
          ! mov ecx, ln         ; put byte count in ecx
          ! shr ecx, 2          ; divide by 4 for DWORD data size
          ! mov esi, Source     ; copy source pointer into source index
          ! mov edi, Dest       ; copy dest pointer into destination index
          ! rep movsd           ; repeat ECX times, move string DWORD
          ! mov ecx, rmdr       ; put remainder in ecx
          ! jmp over
        tail:
          ! mov ecx, ln         ; set counter if less than 4 bytes in length
          ! mov esi, Source     ; copy source pointer into source index
          ! mov edi, Dest       ; copy dest pointer into destination index
        over:
          ! rep movsb           ; copy remaining BYTES from source to dest
          ! sub ln, ecx         ; calculate return value ( little use )
         
          FUNCTION = ln         ' return bytes copied
        END FUNCTION

        ------------------


          #5
          Try this one. Of all the variations I have tried in search of a faster
          algo, this one still beats the rest. I have tried 6 register versions
          and 8 MMX register versions, and this one still outclocks them. The speed
          limit on memory copy is apparently imposed by the actual speed of
          memory, but the REP MOVSD pair is very well optimised and it has
          slightly less overhead.

          Regards,

          [email protected]

          Code:
          ' ###########################################################################
          
          FUNCTION MemCopyD(ByVal src as LONG, _
                            ByVal dst as LONG, _
                            ByVal ln as LONG) as LONG
          
              #REGISTER NONE
          
                ! cld
          
                ! mov esi, src
                ! mov edi, dst
                ! mov ecx, ln
          
                ! shr ecx, 2
                ! rep movsd
          
                ! mov ecx, ln
                ! and ecx, 3
                ! rep movsb
          
              FUNCTION = 0
          
          END FUNCTION
          
          ' ###########################################################################
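For readers who do not read x86: the routine above splits the length into whole DWORDs (ln shifted right by 2) and a tail of 0 to 3 bytes (ln AND 3). Below is a minimal C sketch of the same split. It is an illustration only, not Steve's code; the name copy_dwords_then_tail is made up for this example, and a C compiler will not necessarily emit REP MOVSD for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy n bytes in two phases,
   whole 4-byte words first, then the leftover bytes. */
size_t copy_dwords_then_tail(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    unsigned char *d = dst;
    const unsigned char *s = src;

    size_t dwords = n >> 2;   /* same idea as: shr ecx, 2 */
    size_t tail   = n & 3;    /* same idea as: and ecx, 3 */

    for (size_t i = 0; i < dwords; i++) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* alignment-safe 4-byte load  */
        memcpy(d, &w, sizeof w);   /* alignment-safe 4-byte store */
        s += 4;
        d += 4;
    }

    for (size_t i = 0; i < tail; i++)
        *d++ = *s++;               /* remaining 0..3 bytes */

    return dwords * 4 + tail;      /* bytes copied, always equal to n */
}
```

Like the assembly version, this sketch copies forward and assumes the two regions do not overlap.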
          ------------------
          hutch at movsd dot com
          The MASM Forum - SLL Modules and PB Libraries

          http://www.masm32.com/board/index.php?board=69.0


            #6
            Semen,

            What is mat? I found m(Move memory) documented in Win32.hlp.

            m SourceAddr Length DestAddress

            ------------------


              #7
              Charles --
              MAT is a PB statement (operations on arrays), so look in PB.HLP.

              ------------------
              E-MAIL: [email protected]


                #8
                Thank you, Semen...

                And to think that I spent so much of my working career doing matrix algebra
                with Fortran, I shouldn't have forgotten the PB mat commands. Thanks again
                for reminding me.

                ------------------


                  #9
                  Fast, easy memory copy:
                  Code:
                  LET Y = X

                  MCM
                  Michael Mattias
                  Tal Systems (retired)
                  Port Washington WI USA
                  [email protected]
                  http://www.talsystems.com


                    #10
                    Gee..

                    Thanks guys.. that was more than I expected.. I'll see if I can make
                    use of these routines..



                    ------------------
                    Explorations v3.0 RPG Development System
                    http://www.explore-rpg.com
                    Explorations v9.10 RPG Development System
                    http://www.explore-rpg.com


                      #11
                      Steve,

                      there is one possible way to copy even faster than that, but I'm not sure if it'll work.

                      Maybe you can write down the code for this because I have no experience in programming the floating point processor:

                      It should be possible to copy 8 or 10 bytes at a time, copying FPU registers around. This might be faster, maybe you can adapt the movsd loop to test this?

                      I've made a movsd loop myself, but I make sure my buffers are at least 4 bytes too large. I just increment the loop counter and copy between 1 and 4 bytes too many, and forget about movsb. It should be faster on small copies.
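The padded-buffer trick described above can be sketched in C as follows. This is only an illustration under stated assumptions: the name copy_overshoot is hypothetical, and both buffers are assumed to have at least n + 4 bytes of capacity, which is the caller's responsibility.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy (n / 4) + 1 whole
   4-byte words and skip the byte tail entirely. This writes between
   1 and 4 bytes more than n, so BOTH buffers must have a capacity of
   at least n + 4 bytes. Returns the number of bytes actually written. */
size_t copy_overshoot(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    unsigned char *d = dst;
    const unsigned char *s = src;

    size_t dwords = (n >> 2) + 1;  /* "just increment the loop counter" */

    for (size_t i = 0; i < dwords; i++) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* alignment-safe 4-byte load  */
        memcpy(d, &w, sizeof w);   /* alignment-safe 4-byte store */
        s += 4;
        d += 4;
    }

    return dwords * 4;             /* n rounded down to 4, plus 4 */
}
```

The trade-off is that the bytes just past the logical end of the destination get overwritten, so nothing meaningful may live in that padding.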


                      Peter.


                      ------------------
                      [email protected]



                      [This message has been edited by Peter Manders (edited March 26, 2001).]


                        #12
                        Peter,

                        Somewhere I have seen code for doing memory copy using floating point
                        instructions, but they use the same registers as MMX so there does not seem
                        to be any advantage.

                        I can't lay my hands on the test piece at the moment, but it is an unrolled
                        loop of this type:

                        Code:
                            ! mov esi, src
                            ! mov edi, dst
                        
                          mmSt:
                            ! movq mm(0), [esi]
                            ! movq mm(1), [esi + 8]
                            ! movq mm(2), [esi + 16]
                            ! movq mm(3), [esi + 24]
                            ! movq mm(4), [esi + 32]
                            ! movq mm(5), [esi + 40]
                            ! movq mm(6), [esi + 48]
                            ! movq mm(7), [esi + 56]
                        
                            ! movq [edi],      mm(0)
                            ! movq [edi + 8],  mm(1)
                            ! movq [edi + 16], mm(2)
                            ! movq [edi + 24], mm(3)
                            ! movq [edi + 32], mm(4)
                            ! movq [edi + 40], mm(5)
                            ! movq [edi + 48], mm(6)
                            ! movq [edi + 56], mm(7)
                        The code below was my last attempt at improving on REP MOVSD. It is still
                        about 3 - 4 % slower than REP MOVSD, and that is after instruction
                        re-ordering to maximise its loop speed; it has no pairing problems and no
                        stalls.

                        From all of the technical data I have seen and from my own testing, REP MOVSD
                        is well optimised in the PII - PIII processor range. I have also run into
                        technical data saying that the speed of the physical memory is the limiting factor
                        in memory copy, and my testing appears to bear this out: all of the algorithms
                        I have tested come within about 5% of each other, even though the MMX version
                        should be a lot faster.
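Anyone who wants to repeat this kind of comparison outside PowerBASIC could use a small timing harness like the C sketch below. It is an assumption-laden illustration, not the test piece mentioned above: the names copy_fn, copy_libc, copy_words and time_copy are made up here, clock() is a coarse timer, and results will vary with compiler, optimisation level and hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical common signature for the copy routines under test. */
typedef void (*copy_fn)(void *dst, const void *src, size_t n);

/* Candidate 1: the C library copy. */
void copy_libc(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Candidate 2: whole 4-byte words first, then the 0..3 byte tail. */
void copy_words(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < (n >> 2); i++) {
        uint32_t w;
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        s += 4;
        d += 4;
    }
    for (size_t i = 0; i < (n & 3); i++)
        *d++ = *s++;
}

/* Run fn reps times on the same buffers and return the elapsed CPU
   time in seconds, as measured by clock(). */
double time_copy(copy_fn fn, void *dst, const void *src, size_t n, long reps)
{
    assert(fn != NULL && reps > 0);

    clock_t start = clock();
    for (long i = 0; i < reps; i++)
        fn(dst, src, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy once per candidate with the same buffer size and repeat count gives figures that can be compared the way the percentages above are; checking the destination against the source afterwards guards against timing a routine that copies the wrong thing.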

                        Regards,

                        [email protected]

                        Code:
                          ; #########################################################################
                          
                          srCopy proc src :DWORD, dst :DWORD, ln :DWORD
                          
                              LOCAL cntr :DWORD
                          
                              push ebx
                              push esi
                              push edi
                          
                              mov esi, src
                              mov edi, dst
                          
                              cmp ln, 16
                              jb ShortLoop
                          
                              mov eax, ln
                              shr eax, 4
                              mov cntr, eax
                          
                            @@:
                              mov eax, [esi]
                              mov [edi], eax
                              mov ebx, [esi+4]
                              mov [edi+4], ebx
                              mov ecx, [esi+8]
                              mov [edi+8], ecx
                              mov edx, [esi+12]
                              mov [edi+12], edx
                              add esi, 16
                              add edi, 16
                              dec cntr
                              jnz @B
                          
                              and ln, 15
                          
                            ShortLoop:
                              cmp ln, 0
                              je Done               ; nothing left to copy
                              mov al, [esi]
                              inc esi
                              mov [edi], al
                              inc edi
                              dec ln
                              jmp ShortLoop         ; loop until ln reaches zero
                          
                            Done:
                          
                              pop edi
                              pop esi
                              pop ebx
                          
                              ret
                          
                          srCopy endp
                          
                          ; #########################################################################
                        hmmmm, smileys



                        [This message has been edited by Steve Hutchesson (edited March 26, 2001).]
                        hutch at movsd dot com
                        The MASM Forum - SLL Modules and PB Libraries

                        http://www.masm32.com/board/index.php?board=69.0
