Memory Matching


  • Memory Matching

    I need to compare blocks of memory as quickly as possible to determine whether they match. I found some code in the forum that emulates memcmp, but 1) I don't really care about the values at the two memory locations, only whether they match, and 2) I don't need to continue the comparison once a mismatched byte is detected. So I put together an ASM function to do this. It's not much faster than the memcmp function when the data matches, but it can be much faster when it doesn't. It was also a chance to play around with ASM. Perhaps there is a better way to do this, which would be a learning opportunity for me.

    '
    Code:
    #COMPILE EXE
    #DIM ALL
    #UNIQUE VAR ON
    #OPTION VERSION5
    #TOOLS OFF
    
    
    FUNCTION zMemMatch ( _
        BYVAL pData1 AS LONG, _   'IN:  Ptr to first data to compare
        BYVAL pData2 AS LONG, _   'IN:  Ptr to second data to compare
        BYVAL dSize AS LONG _     'IN:  Number of bytes to compare
        ) AS LONG                 'RTN: TRUE if all bytes match or FALSE if not.
    #REGISTER NONE
    
    '-- Move the memory pointers to registers.
    !mov ebx, pData1
    !mov edi, pData2
    
    '-- Calculate sizes. After division, EAX will equal the number of 4-byte chunks to compare and
    '   EDX will equal the number of bytes left to compare after the 4-byte chunks.
    !mov edx, 0           ' Clear high half of the dividend (EDX:EAX); DIV returns the remainder here
    !mov eax, dSize       ' Store the dividend; DIV returns the number of 4-byte chunks here
    !mov ecx, 4           ' Store the divisor (size of 32-bit value)
    !div ecx              ' Do the division (edx:eax / ecx). ECX is now free for loop counting.
    
    '-- Compare the 4-byte chunks first.
    !mov ecx, eax         ' Set loop counter for 4-byte chunks
    !cmp ecx, 0           ' Fewer than 4 bytes? Skip this loop, because LOOP with
    !jz Tail_bytes        ' ECX = 0 would wrap the counter and run ~4 billion times
    Loop_4:               ' Top of loop
    !mov eax, [ebx]       ' Load 32-bit value (EAX is free now; ESP is the stack pointer, never use it as scratch)
    !cmp eax, [edi]       ' Compare it with the value in the second block
    !jne not_equal        ' Exit if chunks don't match
    !add ebx, 4           ' Increment pointers for next loop or following byte compare
    !add edi, 4
    !loop Loop_4          ' Loop if counter is not zero
    Tail_bytes:
    
    '-- Process any remaining bytes here. If there are none, then exit.
    !cmp edx, 0
    !jz finished          ' Exit if no remaining bytes to check
    !mov ecx, edx         ' Reset ecx loop counter with remaining byte count
    Loop_1:               ' Top of loop
    !mov al, [ebx]        ' Look at low byte values
    !mov dl, [edi]
    !cmp al, dl           ' And compare them
    !jne not_equal        ' Not equal so exit
    !add ebx, 1           ' Increment pointers for next loop
    !add edi, 1
    !loop Loop_1          ' Loop until counter ecx reaches zero
    '!jmp Finished         ' Jump
    
    '-- Go here if all bytes match. Set RETURN to TRUE (-1).
    Finished:
    !mov function, -1
    
    '-- Jump here if all bytes do NOT match. Return FALSE (0).
    Not_equal:
    
    END FUNCTION  'zMemMatch
    
    
    
    FUNCTION PBMAIN () AS LONG
    
    LOCAL s1, s2 AS STRING * 32
    LOCAL p1, p2 AS LONG
    LOCAL cnt, ret AS LONG
    REGISTER n AS LONG
    
    
    s1 = "*123*123*123*121"
    s2 = "v*123*123*123*123"
    cnt = 200000000
    
    p1 = VARPTR(s1)
    p2 = VARPTR(s2)
    
    PRINT "Press key to start compare loop:" WAITKEY$
    FOR n = 1 TO cnt
      ret = zMemMatch(p1, p2, 32)
    '  if s1 = s2 then ret = -1
    NEXT n
    PRINT "ret =" ret
    
    Exitmain:
      WAITKEY$
    
    END FUNCTION
    '
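For anyone who wants to check the logic outside PowerBASIC, here is a minimal C sketch of the same idea: compare whole 4-byte chunks first, then the leftover bytes, and bail out at the first mismatch. The name mem_match is made up for illustration and is not from the post; the return values follow the post's convention of -1 for TRUE and 0 for FALSE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C counterpart of zMemMatch (name is an assumption).
   Returns -1 (TRUE) if all size bytes match, 0 (FALSE) otherwise. */
int mem_match(const void *p1, const void *p2, size_t size)
{
    assert(size == 0 || (p1 != NULL && p2 != NULL));

    const unsigned char *a = p1;
    const unsigned char *b = p2;
    size_t chunks = size / 4;   /* whole 4-byte chunks (EAX after DIV) */
    size_t tail   = size % 4;   /* leftover bytes (EDX after DIV)      */

    /* A while loop tests the counter first, so a size below 4
       simply skips this part. */
    while (chunks--) {
        uint32_t x, y;
        memcpy(&x, a, 4);       /* memcpy avoids unaligned-access issues */
        memcpy(&y, b, 4);
        if (x != y)
            return 0;           /* early exit on the first mismatch */
        a += 4;
        b += 4;
    }

    while (tail--) {
        if (*a++ != *b++)
            return 0;
    }
    return -1;
}
```

Testing the counter before the first pass is the point to carry back to the ASM version: LOOP decrements ECX before it tests it, so entering the loop with ECX = 0 does not fall straight through.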

  • #2
    Jerry,

    This may be useful to you.
    Code:
    ' «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    
    FUNCTION szCmp(ByVal str1 as DWORD,ByVal str2 as DWORD) as DWORD
    
        #REGISTER NONE
    
      ' --------------------------------------
      ' scan zero terminated string for match
      ' --------------------------------------
        ! mov ebx, str1
        ! mov edi, str2
        ! xor esi, esi
      cmst:
        ! mov al, [ebx+esi]
        ! cmp al, [edi+esi]
        ! jne no_match
        ! add esi, 1
        ! test al, al         ' check for terminator
        ! jne cmst
    
        ! lea eax, [ebx+esi-1]
        ! sub eax, str1       ' return length on match
        ! jmp cmpout
    
      no_match:
        ! xor eax, eax        ' return zero on no match
    
      cmpout:
        ! mov FUNCTION, eax
    
    END FUNCTION
    
    ' «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
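As a reading aid, the return convention of szCmp can be sketched in C roughly as below. This is only an illustration of what the ASM above does, with a made-up name (sz_cmp), not a replacement for it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C rendering of szCmp (name is an assumption).
   Returns the string length if both zero-terminated strings are
   identical, or 0 if they differ. */
size_t sz_cmp(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);

    size_t i = 0;
    for (;;) {
        char c = s1[i];
        if (c != s2[i])
            return 0;       /* mismatch, including different lengths */
        if (c == '\0')
            return i;       /* both terminators reached: return length */
        i++;
    }
}
```

One consequence of this convention: two empty strings match but also return 0, so a caller cannot tell that case apart from a mismatch.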
    hutch at movsd dot com
    The MASM Forum

    www.masm32.com



    • #3
      Surprised you didn't just XOR the source chunk with the target chunk (32 or 64 bits) and query (jump out of loop) on ZF.

      More than one way to skin a cat I guess.
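A rough C sketch of the XOR idea, assuming 64-bit chunks: XOR the two chunks and leave the loop as soon as the result is nonzero, which is the C-level equivalent of testing ZF after the XOR. The function name mem_match_xor is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical XOR-based match test (name is an assumption).
   Returns -1 (TRUE) if all size bytes match, 0 (FALSE) otherwise. */
int mem_match_xor(const void *p1, const void *p2, size_t size)
{
    assert(size == 0 || (p1 != NULL && p2 != NULL));

    const unsigned char *a = p1;
    const unsigned char *b = p2;

    /* 8-byte chunks: a nonzero XOR means at least one bit differs. */
    while (size >= 8) {
        uint64_t x, y;
        memcpy(&x, a, 8);
        memcpy(&y, b, 8);
        if (x ^ y)
            return 0;
        a += 8;
        b += 8;
        size -= 8;
    }

    /* Leftover bytes, same test one byte at a time. */
    while (size--) {
        if (*a++ ^ *b++)
            return 0;
    }
    return -1;
}
```

Whether XOR beats CMP here is not something this sketch settles; both set ZF on equal operands, so any difference would have to be measured on the target CPU.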
      Michael Mattias
      Tal Systems (retired)
      Port Washington WI USA
      http://www.talsystems.com



      • #4
        You could make it a bit faster by removing the stack frame and using eax rather than al in a couple of places, but it's a case of what is the point. String compares are usually done on strings well under 1k, and the duration is so short that you would have problems timing it.
        Code:
            Replace
            ! mov al, [ebx+esi]
            with
            ! movzx eax, BYTE PTR[ebx+esi]
            ----
            ! test al, al   ; replace this
            ! test eax, eax ; with this
        And to add to the fun, results will vary depending on CPU choice.

        It's usually the case that instruction choice has its main effect only when you get an algorithm up near its fastest. In slower code that is full of stalls and not especially efficient, changing instructions has little if any effect. As you have often said, pick the right algorithm first before you try to optimise it for speed.
        hutch at movsd dot com
        The MASM Forum

        www.masm32.com
