Announcement

Collapse
No announcement yet.

Comb sort in ASM

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Harry Akers
    replied
    Comb sort in ASM (new code)

    '''
    #CODE

    TYPE my_type
    my_sortkey AS STRING * 80 '' Key record
    my_stuff AS STRING * 80 '' More data
    more_stuff AS STRING * 240 '' Still more data
    END TYPE

    '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''

    FUNCTION PBMAIN() AS LONG
    '''
    DIM this_one (0 TO 10000) AS GLOBAL my_type
    DIM stp (0 TO 10000) AS GLOBAL my_type POINTER
    DIM k AS INTEGER

    ''' Below: variables used in comb-sort algorithm
    DIM _
    chgd AS GLOBAL LONG, _ ' Set to 0 (nothing changed
    ' on this iteration), or to 1
    ktr AS GLOBAL LONG, _ ' Number of instances,
    ' or items to sort
    ktra AS GLOBAL LONG, _ ' Working value of ktr,
    ' gradually diminished
    ng AS GLOBAL LONG ' A measure of the separation
    ' between two items to be
    ' compared, initially ktr \ 2,
    ' eventually 1

    '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''

    ktr = 0

    ' ' ' Read several thousand instances of this_one,
    ' ' ' incrementing ktr to become number of instances.
    ' ' ' Each instance consists of a my_sortkey and two
    ' ' ' other strings, read in from files and/or created
    ' ' ' by appropriate editing.
    ' ' '
    ' ' ' This process, and the data structure that is populated,
    ' ' ' are specific to this particular example.
    ' ' ' In this example, my_sortkey is derived from the data
    ' ' ' as read, and is eventually discarded. However, this
    ' ' ' is simply due to the requirements of this example.
    ' ' '
    ' ' ' What is important here is that one of the items in
    ' ' ' the data structure is the sort key, and the rest of
    ' ' ' the items tag along with it when pointers to it are
    ' ' ' exchanged.
    ' ' '
    ' ' ' No data are placed in instance (0), which is a dummy.
    ' ' ' Instance (ktr) is the last one populated.
    ' ' '

    '' Below: Make POINTERs for instances 0 to 10000,
    '' even if ktr < 10000 . . .
    FOR k = 0 To 10000
    stp(k) = VARPTR(this_one(k)) '' Computing POINTERs
    NEXT k

    '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''

    ktra = ktr '' working value of ktr
    ng = ktr \ 2 '' half of ktr (integer division)

    ' ' ' Below, the CALL passes stp(0), but this will quickly
    ' ' ' be incremented, so stp(1) will be the first instance
    ' ' ' to be considered . . .

    ' ' ' ASM routine to do sorting on my_sortkey . . .
    CALL do_sort (stp(0))

    '''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''

    ''' After ASM routine . . .
    '''
    ''' Output all results to a properly-formatted file.
    ''' Output is in a loop on k from 1 to ktr, and is
    ''' essentially as follows:
    '''
    ''' @stp (k).my_stuff
    ''' followed by
    ''' @stp (k).more_stuff
    ''' discarding my_sortkey
    '''
    ''' . . . with appropriate headers, blank-padding,
    ''' blank lines, formatting, and paging.
    '''
    ''' The sort key does not have to be discarded, but
    ''' this particular example no longer needs it.
    '''


    '''''''''''''''''''''''''''''''''''''''''''''''''''''''


    ''' SUB do_sort follows. It is a comb-sort.
    '''

    '''''''''''''''''''''''''''''''''''''''''''''''''''''''
    '''''''''''''''''''''''''''''''''''''''''''''''''''''''
    '''''''''''''''''''''''''''''''''''''''''''''''''''''''

    SUB do_sort (BYREF StrucPtr AS DWORD)

    '' Power Basic pushes EBX, ESI, EDI.

    ASM CLD

    Very_top: ' Scan from the top

    ASM MOV chgd, 0 ; Set or reset "changed" flag . . .
    ' Destination must be LONG for a direct MOV.

    ASM MOV ECX, ktra ; ECX will eventually become counter for
    ' comb sort's outer loop.
    ASM CMP ECX, 1 ; Test for final exit.
    ASM JE Labexit

    ASM SUB ECX, ng ; ECX = ktra - ng = counter for outer loop

    '' Below, find disp = 4 * ng = displacement in bytes between
    '' the stp() pointers of two strings to be compared . . .

    ASM LEA EAX, [EAX * 4] ; Fast way of placing EAX * 4 into EAX.
    ' This approach works only for certain
    ' kinds of operands.
    ASM MOV disp, EAX ; ng * 4 = displacement in bytes between
    ' the stp( ) pointers of two strings to
    ' be compared


    ASM MOV EDX, StrucPtr ; Pointer to stp(0). Scan will start
    ' with stp(1). EDX is dedicated to
    ' always contain current pointer stp( ).


    ''' Scan will go from the top of the array to a bottom
    ''' which gradually moves up as "heavier" items sink to
    ''' the bottom and "lighter" ones rise to the top. When
    ''' ktra, the working value of the bottom of the set,
    ''' has been decremented all the way to 1 (top of the
    ''' set), sorting is finished. It is also finished if
    ''' an iteration fails to exchange a pair of items,
    ''' provided that at least one bubble-sorting iteration
    ''' (an iteration with ng = 1) has been made.
    '''
    ''' "Very_top" starts a new iteration from the top to the
    ''' current bottom.
    '''
    ''' "Intermediate" starts a comparison between two
    ''' my_sortkey strings and a possible exchange of their
    ''' stp ( ) pointers.


    Intermediate:

    ASM PUSH ECX ; Save outer loop's counter.
    ASM ADD EDX, 4 ; Pointer to next stp ( ), initially stp(1)

    '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''

    ''' Compare my_sortkey #1 with my_sortkey #2, where #2 is
    ''' farther down in the this_one( ) array, at least
    ''' according to the stp ( ) pointers. Note: The instances
    ''' of this_one are never moved. Only their stp( )
    ''' pointers are moved around in their own array.

    ASM MOV ECX, 0 ; ECX becomes the counter to compare 80
    ' bytes (length of any my_sortkey).

    ASM MOV ESI, [EDX] ; ESI points to my_sortkey #1.
    ASM MOV EAX, disp
    ASM ADD EAX, EDX
    ASM MOV EDI, [EAX] ; EDI points to my_sortkey #2.



    '''' The following compares 4 bytes at a time, and is
    '''' faster than the more intuitive REPE CMPSB . . .

    ASM PUSH EAX ; Needed later

    NextCmp:
    '' Below, capture a 4-byte sequence (sort key)
    '' into EAX, and another one into EBX . . .
    ASM MOV EAX, [ESI + ECX] ; ECX initially 0, increases by 4
    ' for each 4-byte sequence compared.
    ASM MOV EBX, [EDI + ECX]

    '' Below, use BSWAP because characters must be
    '' compared left-to-right in a 4-byte sequence,
    '' so reverse order to get correct comparison . . .
    ASM BSWAP EAX
    ASM BSWAP EBX
    '' Compare the two 4-byte sequences . . .
    ASM CMP EAX, EBX ; Comparison will set flags.
    ASM JNE SHORT ExitCmp ; Quit this loop on inequality between
    ; sort keys, else . . .
    ASM ADD ECX, 4 ; Update ECX.
    ASM CMP ECX, 80 ; Compare with length of sort keys.
    ASM JB SHORT NextCmp ; Back for new comparison if not yet 80.
    ExitCap:

    ASM POP EAX ; still needed below

    '''''' End of key comparison '''''''


    '' There were two possible CMP results. If CMP EAX, EBX
    '' said "inequality," it also told whether EAX was "above"
    '' or "below" EBX (for example, "h" is higher-valued, or
    '' above, "g").
    ''
    '' If CMP EAX, EBX said "equality," and then if CMP ECX, 80
    '' said "equality" and caused exit from the testing loop, it
    '' indicated that EAX = EBX, meaning that a pair of duplicate
    '' sort keys existed.
    ''
    '' Next statement is very critical, as it controls whether
    '' pointers will be exchanged and this iteration will be
    '' flagged with chgd as having had such an exchange.
    '' To jump around the step that exchanges pointers, it is a
    '' mistake to use JB (jump if below).
    ''
    '' JBE is required, because the jump around the pointer
    '' exchange should also occur if the two sort keys were equal.
    ''
    '' With JB instead of JBE, a single pair of duplicate sort
    '' keys can cause an increase in sorting time by orders of
    '' magnitude, because this pair requires repeated exchanging
    '' of pointers and flagging of the exchange, hence many
    '' repeated iterations.
    ''


    ASM JBE Past_exchange ; Jump to avoid exchanging pointers.


    '' Exchanging pointers in the stp ( ) array, to change order . . .
    ASM MOV [EAX], ESI
    ASM MOV [EDX], EDI

    ASM MOV chgd, 1 ; Set "changed" flag.

    ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

    Past_exchange:

    '' Restore outer loop's counter . . .
    ASM POP ECX

    '' Decrement and check counter, for loop-back to
    '' Intermediate . . .
    ASM DEC ECX
    ASM JNZ Intermediate ; to compare another pair of instances
    ' on current iteration

    '' Current iteration has ended, so narrow the range ng
    '' (unless ng = 1 already) . . .
    ASM MOV EAX, ng
    ASM CMP EAX, 1

    ASM JE SHORT Keep_ng ; If ng was 1, keep it, else (below)
    ' scale it down.

    '' Scaling down ng by 3/4 . . .
    '' Scale factor is not an exact science, and an assunmption
    '' must be made. Literature suggests division by 1.3, with
    '' integer rounding, so this is close . . .

    ASM LEA EAX, [EAX + EAX * 2] ; Triple ng in EAX.
    ASM SAR EAX, 2 ; Divide by 4. Result can go to 0.

    ASM CMP EAX, 0 ; Testing for EAX = new ng = 0
    ASM JNE SHORT Save_ng ; If not 0, save new ng, else . . .

    ASM MOV EAX, 1 ; Force EAX = 1 (to be stored as ng,
    ' below), but only if it was = 0
    ' above. With ng = 1, sort will
    ' henceforth be a bubble-sort.

    Save_ng:
    ASM MOV ng, EAX ; Save ng, from EAX,
    ASM JMP Very_top ; To start new iteration

    Keep_ng:
    ASM CMP chgd, 0 ; Check the "changed" flag.
    ' Note: It can't get here to check chgd
    ' unless this iteration was a bubble-sort
    ' (ng = 1). The chgd flag must never be
    ' checked unless at least one bubble-sorting
    ' itgeration has been made.
    ASM JZ Labexit ; To Labexit if no exchange on this iteration,
    ; else . . .

    ASM DEC ktra ; Decrement ktra for next iteration.
    ASM JMP Very_top ; To start new iteration

    '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''

    Labexit:

    '' Nothing to POP

    END SUB '' do_sort

    Leave a comment:


  • Paul Dixon
    replied
    Harry,
    Code:
    It works only if you are multiplying by powers of 2.
    It works with a very limited range of operands.
    The address can be formed from an immediate operand plus a register plus a register *1 or *2 or *4 or *8. The second register doesn't have to be different to the first.

    So you could use:
    Code:
    !lea eax,[ecx + edx*8 + 1234]
    to put 1234 + ecx + edx*8 into the eax register.

    or
    Code:
    !lea eax,[eax + eax*4]
    to multiply the value in EAX by 5 very quickly.

    The PB version now compares very favorably with that other compiler
    Since the PB version has turned into an ASM version it should beat any compiler. If it doesn't then it needs more work!

    Paul.

    Leave a comment:


  • Harry Akers
    replied
    Comb sort in ASM

    Thanks. I think I understand how you are using LEA in that context. It works only if you are multiplying by powers of 2.
    My app can be streamlined a little more. With your method of using CMPSD instead of CMPSB, ESI and EDI are not changed during the comparison, so they don't have to be restored after the comparison (no need for ! MOV ESI, [EDX] and ! MOV EDI, [EAX] before exchanging pointers), and the elimination of these also makes a visible improvement in the timing.
    Also, an early ! MOV ECX, EAX can be eliminated by placing ktra in ECX to begin with and working with it there.

    Thanks for all your help. The PB version now compares very favorably with that other compiler.

    Leave a comment:


  • Paul Dixon
    replied
    Harry,
    LEA is Load Effective Address, it loads into the register the address of the operand that follows rather than the content pointed to by that address.
    If you used MOV eax,[ebx*4] then eax would get the content of the memory at address ebx*4
    If you use LEA eax,[ebx*4] then eax gets the address instead of the content, in this case ebx*4 is loaded into eax.

    It's a way to multiply ebx by 4 without the use of a slower MUL instruction.

    Paul.

    Leave a comment:


  • Harry Akers
    replied
    Comb sort in ASM

    Thanks, John Gleason for all the help. Your method of comparing 4 bytes at a time cuts the sorting time by more than half, down to about 0.04 second in my test case (from 0.09 to 0.10). When I said the 4-byte comparison doubled the time and gave totally wrong results, I had simply cut ECX to 20 and substituted REPE CMPSD for REPE CMPSB, with no concern for the order of the bytes within each 4-byte sequence.

    I still don't quite understand what is happening when you say
    ! MOV EBX, ng
    ! LEA EAX, [EBX * 4]
    ! MOV disp, EAX

    I see that EBX * 4 gives the desired value for disp, but it appears that you are loading EAX with the address of a numeric quantity, then moving that address to disp.

    Leave a comment:


  • John Gleason
    replied
    Harry, this now compares 4 bytes at a time AND always maintains correct sort order, but make sure your sortkey length MOD 4 is 0. This code is usually faster than my post #6 above, which does a byte by byte comparison. Sorting the the data we generate below, both are faster than PB sort by quite a margin.
    Code:
       #DIM ALL
       #REGISTER NONE
       ' Comb sort in ASM
       ''' A forum member suggested that I should post this sample
       ''' code here. It is a comb sort, generally faster than a
       ''' bubble sort.
       %ARRSIZE = 5000
    
       '''
       TYPE my_type
       my_sortkey AS STRING * 80 '' Key record
       my_stuff AS STRING * 80 '' More data
       more_stuff AS STRING * 240 '' Still more data
       END TYPE
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''
    
       FUNCTION PBMAIN() AS LONG
       '''
       DIM this_one (0 TO %ARRSIZE) AS GLOBAL my_type
       DIM stp (0 TO %ARRSIZE) AS GLOBAL my_type POINTER
       LOCAL k AS LONG, outPtr AS STRING PTR * 80, outStr AS STRING, t AS QUAD
    '   RANDOMIZE
       '’’ Below: variables used in comb-sort algorithm
       DIM   chgd AS GLOBAL LONG, _ ' Set to 0 (nothing changed on this iteration), or to 1
       ktr AS GLOBAL LONG, _ ' Number of instances, or items to sort
       ktra AS GLOBAL LONG, _ ' Working value of ktr gradually diminished
       ng AS GLOBAL LONG ' A measure of the separation
       ' between two items to be
       ' compared, initially ktr \ 2,
       ' eventually 1
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''
    
       ktr = 0
    
       FOR k = 1 TO %ARRSIZE          'fill our records with data
          this_one(k).my_sortkey = CHR$(RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122)) & STRING$(72, RND(33, 122))   'printable common chrs
    '      this_one(k).my_sortkey = STRING$(10, RND(33, 122)) & STRING$(70, RND(33, 122))   'printable common chrs
          this_one(k).my_stuff = STRING$(40, RND(33, 122)) & STRING$(40, RND(33, 122))   'printable common chrs
          this_one(k).more_stuff = STRING$(120, RND(33, 122)) & STRING$(120, RND(33, 122))   'printable common chrs
       NEXT
       ktr = %ARRSIZE
    
       ' ' ' Read several thousand instances of this_one,
       ' ' ' letting ktr = number of instances. Each instance
       ' ' ' consists of a my_sortkey and two other strings,
       '‘ ‘ ‘ all read in from files, with appropriate editing.
       ' ' ' The reading / editing step is not a bottleneck, and
       ' ' ' is fairly fast.
       ' ' ' No data are placed in instance (0), which is a dummy,
       ' ' ' so instance (ktr) is the last one populated.
    
       '' Below: Make POINTERs for instances 0 to %ARRSIZE,
       '' even if ktr < %ARRSIZE . . .
       FOR k = 0 TO %ARRSIZE
       stp(k) = VARPTR(this_one(k)) '' Computing POINTERs
       NEXT k
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''
    
       ktra = ktr '' working value of ktr
       ng = ktr \ 2 '' half of ktr (integer division)
    
       ' ' ' Below, the CALL passes stp(0), but this will quickly
       ' ' ' be incremented, so stp(1) will be the first instance
       ' ' ' to be considered . . .
    
       ' ' ' ASM routine to do sorting on my_sortkey . . .
    '     Do sort now
       TIX t
        CALL do_sort (stp(0))
    
        'try it in PB too
    '  ARRAY SORT this_one()
       TIX END t
       'for PB sort, use this outStr
    '   FOR k = 1 TO 30
    '      outStr = outStr & this_one(k).my_sortkey & $CRLF
    '   NEXT
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''
       FOR k = 1 TO 30
          outPtr = stp(k)
          outStr = outStr & @outPtr & $CRLF
    '      outStr = outStr & this_one(k).my_sortkey & $CRLF
       NEXT
    
       ? FORMAT$(t, "0,") & " tix" & $CRLF & outStr
    
       ''' Output all results to a properly-formatted file.
       ''' Output is in a loop on k from 1 to ktr, and is
       '[' essentially as follows:
       '''
       ''' @stp (k).my_stuff
       ''' followed by
       ''' @stp (k).more_stuff
       ''' discarding my_sortkey
       '''
       ''' . . . with appropriate headers, blank-padding,
       ''' blank lines, formatting, and paging. It is a
       ''' relatively fast step, not a bottleneck.
    
       END FUNCTION
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''
    
    
       ''' SUB do_sort follows. It is a comb-sort, and works
       ''' perfectly, but is still the slowest step in this
       ''' application.
       '''
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
    
       SUB do_sort (BYREF StrucPtr AS DWORD)
    
       '' Power Basic pushes EBX, ESI, EDI.
       LOCAL disp AS LONG
       ASM CLD
    
       Very_top: ' Scan from the top
    
       ASM MOV chgd, 0 ; Set or reset "changed" flag . . .
    
    '   ASM LEA EBX, ktra ; Pointer to ktra
    '   ASM MOV EAX, [EBX] ; EAX = ktra
       !mov eax, ktra
       ASM CMP EAX, 1 ; Testing for exit, if ktra has been decremented to 1
       ASM JE Labexit
    
       !mov ebx, ng       'ng in ebx
    '   ASM LEA EBX, ng ; Pointer to ng
       !sub eax, ebx      ; Subtract ng from ktra.
    '   ASM SUB EAX, [EBX] ; Subtract ng from ktra.
       ASM MOV ECX, EAX ; This ECX is based on (ktra - ng).
       ' It determines how far down in the
       ' table the testing goes on the current
       ' iteration.
    
    '   !mov eax, ebx      ; EAX = ng
       !lea eax, [ebx*4]
    '   ASM MOV EAX, [EBX] ; EAX = ng
    '   ASM SAL EAX, 2 ; Quadrupled
    '   ASM LEA EBX, disp
    '   ASM MOV [EBX], EAX ; disp (displacement) in bytes =
       !mov disp, eax
       ' 4 * ng between stp( ) pointers
       ' of the two strings to be compared
    
       ASM MOV EDX, StrucPtr ; Pointer to stp(0). Scan will
       ' start with stp(1). Below this
       ' point, EDX will always = stp ( ).
    
    
       ''' Scan will go from the top of the array to a bottom
       ''' which gradually moves up as "heavier" items sink to
       ''' the bottom and "lighter" ones rise to the top. When
       ''' ktra, the working value of the bottom of the set,
       ''' has been decremented all the way to 1 (top of the
       ''' set), sorting is finished. It is also finished if
       ''' an iteration fails to exchange a pair of items,
       ''' provided that at least one bubble-sorting iteration
       ''' has been made.
       '''
       ''' "Very_top" starts a new iteration from the top to the
       ''' current bottom.
       '''
       ''' "Intermediate" starts a comparison between two
       ''' my_sortkey strings and a possible exchange of
       ''' this_one pointers.
    
    
       Intermediate:
    
       ASM PUSH ECX ; Save master counter.
       ASM ADD EDX, 4 ; Pointer to next stp ( ), initially stp(1)
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''
    
       ''' Compare my_sortkey #1 with my_sortkey #2, where #2 is
       ''' farther down in the this_one( ) array, at least
       ''' according to the stp ( ) pointers. [The instances of
       ''' this_one are never moved. Only their stp( )
       ''' pointers are moved.]
    
       ASM MOV ECX,  0 ; ECX becomes the counter to compare to 80. Change to 80 if using REPE CMPSB below
       ' bytes (length of any my_sortkey).
    
       ASM MOV ESI, [EDX] ; ESI points to my_sortkey #1.
    
       !mov eax, disp      ; disp in EAX
    '   ASM LEA EBX, disp
    '   ASM MOV EAX, [EBX] ; disp in EAX
        ASM ADD EAX, EDX ; EAX = EDX + disp, for my_sortkey #2.
        ASM MOV EDI, [EAX] ; EDI points to my_sortkey #2.
        
    '--------------------------comparison part of code to replace REPE CMPSB-----------------------------------
    'this code handles 4 bytes at a time
    !push eax                          'because we need value later
      nextCmp:                         'compare strings with our own code--approaching 3x faster than REPE CMPSB
    '   !movzx ebx, byte ptr[edi+ecx]
    '   !cmp byte ptr[esi+ecx], bl
       !mov ebx, [edi+ecx]             'swap bytes in my_sortkey to make big endian
       !mov eax, [esi+ecx]
       !bswap ebx                      'make big endian so string sort is correct
       !bswap eax                      '         "           "            "
       !cmp eax, ebx
       !jne short exitCmp
       !add ecx, 4                     ;4 here because comparing 4 bytes at a time
       !cmp ecx, 80
       !jb short nextCmp
      exitCmp:
    !pop eax
    '----------------------end comparison part of code to replace REPE CMPSB-----------------------------------
     
    'ASM REPE CMPSB ; Compare strings. ESI and EDI change
       ' in this process. This is "repeat
       ' while equal," so it exits at first
       ' mismatch.
    
       ASM JB Past_exchange ; "Below" = smaller ASCII value, so
       ' jump to avoid changing order,
       ' else . . .
    
       '' Exchanging pointers to change order . . .
       ASM MOV ESI, [EDX] ; First, restore original ESI and EDI.
       ASM MOV EDI, [EAX]
       ASM MOV [EDX], EDI ; Then, exchange their locations in the
       ' stp( ) array . . .
       ASM MOV [EAX], ESI
    
       ASM MOV chgd, 1 ; Set "changed" flag.
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''
    
       Past_exchange:
    
       ''' Master counter (the one based on ktra) . . .
       ASM POP ECX
    
       '' Decrement and check counter, for loop-back to
       '' Intermediate . . .
       ASM DEC ECX
       ASM JNZ Intermediate ; to compare another pair of instances
       ' on current iteration
    
       '' Current iteration has ended, so narrow the range ng
       '‘’ (unless ng = 1) . . .
       !mov eax, ng
    '  ASM LEA EBX, ng
    '  ASM MOV EAX, [EBX] ; EAX = [EBX] = ng
       ASM CMP EAX, 1 ; To test for ng = 1
    
       ASM JE SHORT Keep_ng ; If ng was = 1, keep it, else (below)
       ' scale it down . .
    
       '' Scaling down ng by 3/4. Scale factor is not an exact
       '' science, and an assunmption must be made. Literature
       '' suggests division by 1.3, with integer rounding, so
       '' this is close . . .
    
       ASM SAL EAX, 1 ; Double it.
       !add eax, ng
    '   ASM ADD EAX, [EBX] ; Triple it.
       ASM SAR EAX, 2 ; Integer division by 4,
       ' can go down to 0.
    
       ASM CMP EAX, 0 ; Testing for EAX = new ng = 0
       ASM JNE SHORT Save_ng
    
       ASM MOV EAX, 1 ; Force EAX = 1 (to be stored as ng,
       ' below), but only if it was = 0
       ' above. With ng = 1, sort will
       ' become a bubble-sort.
       Save_ng:
       !mov ng, eax
    '   ASM MOV [EBX], EAX ; Save ng, which may or may not = 1.
       ' [EBX] still = ng here, but not after
       ' this point.
       ASM JMP Very_top ; To start new iteration
    
       Keep_ng:
       ASM LEA EBX, chgd ; Check "changed" flag . . .
       ' Note: No ehecking of chgd unless this
       ' iteration was a bubble-sort, using
       ' ng = 1
    
       ASM MOV EAX, [EBX]
       ASM SUB EAX, 0 ; To compare chgd with 0
    
       ASM JZ Labexit ; To Labexit if no change on this
       ' iteration, else . . .
    
       ASM LEA EBX, ktra
       ASM MOV EAX, [EBX]
       ASM DEC EAX ; Decrement ktra (move bottom upward).
       ASM MOV [EBX], EAX ; Store new ktra.
    
       ASM JMP Very_top ; To start new iteration
    
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''
    
       Labexit:
    
       '' Nothing to POP
    
       END SUB '' do_sort

    Leave a comment:


  • Harry Akers
    replied
    Comb sort in ASM

    To Paul Dixon: Thanks for the tips about things like !mov eax, disp. I had tried this in a few places, but it wouldn't compile. The reason was that I had declared disp, etc., as INTEGER instead of as LONG.

    To John Gleason: This is a puzzle! I see your change from CMPSB to CMPSD. However, when I tried this, it doubled the time, even when I cut ECX from 80 back to 20. And the results were wrong -- badly out of order.

    Leave a comment:


  • John Gleason
    replied
    Here it is with several optimizations, but by far the main one is replacing REPE CMPSB with our own compare loop, which more than doubled the speed. It could be improved to nearly 3x by comparing 4 bytes at a time.

    Code:
       #DIM ALL
       #REGISTER NONE
       ' Comb sort in ASM
       ''' A forum member suggested that I should post this sample
       ''' code here. It is a comb sort, generally faster than a
       ''' bubble sort.
    
    
       '''
       TYPE my_type
       my_sortkey AS STRING * 80 '' Key record
       my_stuff AS STRING * 80 '' More data
       more_stuff AS STRING * 240 '' Still more data
       END TYPE
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''
    
       FUNCTION PBMAIN() AS LONG
       '''
       DIM this_one (0 TO 5000) AS GLOBAL my_type
       DIM stp (0 TO 5000) AS GLOBAL my_type POINTER
       LOCAL k AS LONG, outPtr AS STRING PTR * 80, outStr AS STRING, t AS QUAD
    '   RANDOMIZE
       '’’ Below: variables used in comb-sort algorithm
       DIM   chgd AS GLOBAL LONG, _ ' Set to 0 (nothing changed on this iteration), or to 1
       ktr AS GLOBAL LONG, _ ' Number of instances, or items to sort
       ktra AS GLOBAL LONG, _ ' Working value of ktr gradually diminished
       ng AS GLOBAL LONG ' A measure of the separation
       ' between two items to be
       ' compared, initially ktr \ 2,
       ' eventually 1
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''
    
       ktr = 0
    
       FOR k = 1 TO 5000          'fill our records with data
          this_one(k).my_sortkey = CHR$(RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122),RND(33, 122)) & STRING$(72, RND(33, 122))   'printable common chrs
          this_one(k).my_stuff = STRING$(40, RND(33, 122)) & STRING$(40, RND(33, 122))   'printable common chrs
          this_one(k).more_stuff = STRING$(120, RND(33, 122)) & STRING$(120, RND(33, 122))   'printable common chrs
       NEXT
       ktr = 5000
    
       ' ' ' Read several thousand instances of this_one,
       ' ' ' letting ktr = number of instances. Each instance
       ' ' ' consists of a my_sortkey and two other strings,
       '‘ ‘ ‘ all read in from files, with appropriate editing.
       ' ' ' The reading / editing step is not a bottleneck, and
       ' ' ' is fairly fast.
       ' ' ' No data are placed in instance (0), which is a dummy,
       ' ' ' so instance (ktr) is the last one populated.
    
       '' Below: Make POINTERs for instances 0 to 5000,
       '' even if ktr < 5000 . . .
       FOR k = 0 TO 5000
       stp(k) = VARPTR(this_one(k)) '' Computing POINTERs
       NEXT k
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''
    
       ktra = ktr '' working value of ktr
       ng = ktr \ 2 '' half of ktr (integer division)
    
       ' ' ' Below, the CALL passes stp(0), but this will quickly
       ' ' ' be incremented, so stp(1) will be the first instance
       ' ' ' to be considered . . .
    
       ' ' ' ASM routine to do sorting on my_sortkey . . .
    '     Do sort now
       TIX t
        CALL do_sort (stp(0))
    
        'try it in PB too
    '  ARRAY SORT this_one()
       TIX END t
       'for PB sort, use this outStr
    '   FOR k = 1 TO 30
    '      outStr = outStr & this_one(k).my_sortkey & $CRLF
    '   NEXT
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''
       FOR k = 1 TO 30
          outPtr = stp(k)
          outStr = outStr & @outPtr & $CRLF
       NEXT
    
       ? FORMAT$(t, "0,") & " tix" & $CRLF & outStr
    
       ''' Output all results to a properly-formatted file.
       ''' Output is in a loop on k from 1 to ktr, and is
       '[' essentially as follows:
       '''
       ''' @stp (k).my_stuff
       ''' followed by
       ''' @stp (k).more_stuff
       ''' discarding my_sortkey
       '''
       ''' . . . with appropriate headers, blank-padding,
       ''' blank lines, formatting, and paging. It is a
       ''' relatively fast step, not a bottleneck.
    
       END FUNCTION
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''
    
    
       ''' SUB do_sort follows. It is a comb-sort, and works
       ''' perfectly, but is still the slowest step in this
       ''' application.
       '''
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
    
       SUB do_sort (BYREF StrucPtr AS DWORD)
    
       '' Power Basic pushes EBX, ESI, EDI.
       LOCAL disp AS LONG
       ASM CLD
    
       Very_top: ' Scan from the top
    
       ASM MOV chgd, 0 ; Set or reset "changed" flag . . .
    
    '   ASM LEA EBX, ktra ; Pointer to ktra
    '   ASM MOV EAX, [EBX] ; EAX = ktra
       !mov eax, ktra
       ASM CMP EAX, 1 ; Testing for exit, if ktra has been decremented to 1
       ASM JE Labexit
    
       !mov ebx, ng       'ng in ebx
    '   ASM LEA EBX, ng ; Pointer to ng
       !sub eax, ebx      ; Subtract ng from ktra.
    '   ASM SUB EAX, [EBX] ; Subtract ng from ktra.
       ASM MOV ECX, EAX ; This ECX is based on (ktra - ng).
       ' It determines how far down in the
       ' table the testing goes on the current
       ' iteration.
    
    '   !mov eax, ebx      ; EAX = ng
       !lea eax, [ebx*4]
    '   ASM MOV EAX, [EBX] ; EAX = ng
    '   ASM SAL EAX, 2 ; Quadrupled
    '   ASM LEA EBX, disp
    '   ASM MOV [EBX], EAX ; disp (displacement) in bytes =
       !mov disp, eax
       ' 4 * ng between stp( ) pointers
       ' of the two strings to be compared
    
       ASM MOV EDX, StrucPtr ; Pointer to stp(0). Scan will
       ' start with stp(1). Below this
       ' point, EDX will always = stp ( ).
    
    
       ''' Scan will go from the top of the array to a bottom
       ''' which gradually moves up as "heavier" items sink to
       ''' the bottom and "lighter" ones rise to the top. When
       ''' ktra, the working value of the bottom of the set,
       ''' has been decremented all the way to 1 (top of the
       ''' set), sorting is finished. It is also finished if
       ''' an iteration fails to exchange a pair of items,
       ''' provided that at least one bubble-sorting iteration
       ''' has been made.
       '''
       ''' "Very_top" starts a new iteration from the top to the
       ''' current bottom.
       '''
       ''' "Intermediate" starts a comparison between two
       ''' my_sortkey strings and a possible exchange of
       ''' this_one pointers.
    
    
       Intermediate:
    
       ASM PUSH ECX ; Save master counter.
       ASM ADD EDX, 4 ; Pointer to next stp ( ), initially stp(1)
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''
    
       ''' Compare my_sortkey #1 with my_sortkey #2, where #2 is
       ''' farther down in the this_one( ) array, at least
       ''' according to the stp ( ) pointers. [The instances of
       ''' this_one are never moved. Only their stp( )
       ''' pointers are moved.]
    
       ASM MOV ECX,  0 ; ECX becomes the counter to compare to 80. Change to 80 if using REPE CMPSB below
       ' bytes (length of any my_sortkey).
    
       ASM MOV ESI, [EDX] ; ESI points to my_sortkey #1.
    
       !mov eax, disp      ; disp in EAX
    '   ASM LEA EBX, disp
    '   ASM MOV EAX, [EBX] ; disp in EAX
        ASM ADD EAX, EDX ; EAX = EDX + disp, for my_sortkey #2.
        ASM MOV EDI, [EAX] ; EDI points to my_sortkey #2.
    
      nextCmp:                         'compare strings with our own code--over 2x faster than REPE CMPSB
       !movzx ebx, byte ptr[edi+ecx]
       !cmp byte ptr[esi+ecx], bl
    '  !mov ebx, [edi+ecx]             'if you customize my_sortkey, you can compare 4 bytes at a time for about 3x faster than REPE CMPSB
    '  !cmp [esi+ecx], ebx
       !jne short exitCmp
       !add ecx, 1                     ;1 here would change to 4 if comparing 4 bytes at a time
       !cmp ecx, 80
       !jb short nextCmp
      exitCmp:
    
    ' ASM REPE CMPSB ; Compare strings. ESI and EDI change
       ' in this process. This is "repeat
       ' while equal," so it exits at first
       ' mismatch.
    
       ASM JB Past_exchange ; "Below" = smaller ASCII value, so
       ' jump to avoid changing order,
       ' else . . .
    
       '' Exchanging pointers to change order . . .
       ASM MOV ESI, [EDX] ; First, restore original ESI and EDI.
       ASM MOV EDI, [EAX]
       ASM MOV [EDX], EDI ; Then, exchange their locations in the
       ' stp( ) array . . .
       ASM MOV [EAX], ESI
    
       ASM MOV chgd, 1 ; Set "changed" flag.
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''
    
       Past_exchange:
    
       ''' Master counter (the one based on ktra) . . .
       ASM POP ECX
    
       '' Decrement and check counter, for loop-back to
       '' Intermediate . . .
       ASM DEC ECX
       ASM JNZ Intermediate ; to compare another pair of instances
       ' on current iteration
    
       '' Current iteration has ended, so narrow the range ng
       '‘’ (unless ng = 1) . . .
       !mov eax, ng
    '  ASM LEA EBX, ng
    '  ASM MOV EAX, [EBX] ; EAX = [EBX] = ng
       ASM CMP EAX, 1 ; To test for ng = 1
    
       ASM JE SHORT Keep_ng ; If ng was = 1, keep it, else (below)
       ' scale it down . .
    
       '' Scaling down ng by 3/4. Scale factor is not an exact
       '' science, and an assunmption must be made. Literature
       '' suggests division by 1.3, with integer rounding, so
       '' this is close . . .
    
       ASM SAL EAX, 1 ; Double it.
       !add eax, ng
    '   ASM ADD EAX, [EBX] ; Triple it.
       ASM SAR EAX, 2 ; Integer division by 4,
       ' can go down to 0.
    
       ASM CMP EAX, 0 ; Testing for EAX = new ng = 0
       ASM JNE SHORT Save_ng
    
       ASM MOV EAX, 1 ; Force EAX = 1 (to be stored as ng,
       ' below), but only if it was = 0
       ' above. With ng = 1, sort will
       ' become a bubble-sort.
       Save_ng:
       !mov ng, eax
    '   ASM MOV [EBX], EAX ; Save ng, which may or may not = 1.
       ' [EBX] still = ng here, but not after
       ' this point.
       ASM JMP Very_top ; To start new iteration
    
       Keep_ng:
       ASM LEA EBX, chgd ; Check "changed" flag . . .
       ' Note: No ehecking of chgd unless this
       ' iteration was a bubble-sort, using
       ' ng = 1
    
       ASM MOV EAX, [EBX]
       ASM SUB EAX, 0 ; To compare chgd with 0
    
       ASM JZ Labexit ; To Labexit if no change on this
       ' iteration, else . . .
    
       ASM LEA EBX, ktra
       ASM MOV EAX, [EBX]
       ASM DEC EAX ; Decrement ktra (move bottom upward).
       ASM MOV [EBX], EAX ; Store new ktra.
    
       ASM JMP Very_top ; To start new iteration
    
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''
    
       Labexit:
    
       '' Nothing to POP
    
       END SUB '' do_sort

    Leave a comment:


  • John Gleason
    replied
    Definitely note all the changes Paul mentioned above, and for my sortkeys below in particular, the one change from byte to dword made a huge--over 150x--difference in speed. That's an extreme case, but does demo what a small optimization can do.
    Code:
    ASM REPE CMPSD ; Compare strings. ESI and EDI change
    Code:
       #DIM ALL
       #REGISTER NONE
       ' Comb sort in ASM
       ''' A forum member suggested that I should post this sample
       ''' code here. It is a comb sort, generally faster than a
       ''' bubble sort.
    
    
       '''
       TYPE my_type
       my_sortkey AS STRING * 80 '' Key record
       my_stuff AS STRING * 80 '' More data
       more_stuff AS STRING * 240 '' Still more data
       END TYPE
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''
    
       FUNCTION PBMAIN() AS LONG
       '''
       DIM this_one (0 TO 5000) AS GLOBAL my_type
       DIM stp (0 TO 5000) AS GLOBAL my_type POINTER
       LOCAL k AS LONG, outPtr AS STRING PTR * 80, outStr AS STRING, t AS QUAD
       RANDOMIZE
       '’’ Below: variables used in comb-sort algorithm
       DIM   chgd AS GLOBAL INTEGER, _ ' Set to 0 (nothing changed on this iteration), or to 1
       ktr AS GLOBAL INTEGER, _ ' Number of instances, or items to sort
       ktra AS GLOBAL INTEGER, _ ' Working value of ktr gradually diminished
       ng AS GLOBAL INTEGER ' A measure of the separation
       ' between two items to be
       ' compared, initially ktr \ 2,
       ' eventually 1
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''
    
       ktr = 0
       
       FOR k = 1 TO 5000          'fill our records with data
          this_one(k).my_sortkey = STRING$(40, RND(33, 122)) & STRING$(40, RND(33, 122))   'printable common chrs
          this_one(k).my_stuff = STRING$(40, RND(33, 122)) & STRING$(40, RND(33, 122))   'printable common chrs
          this_one(k).more_stuff = STRING$(120, RND(33, 122)) & STRING$(120, RND(33, 122))   'printable common chrs
       NEXT
       ktr = 5000
    
       ' ' ' Read several thousand instances of this_one,
       ' ' ' letting ktr = number of instances. Each instance
       ' ' ' consists of a my_sortkey and two other strings,
       '‘ ‘ ‘ all read in from files, with appropriate editing.
       ' ' ' The reading / editing step is not a bottleneck, and
       ' ' ' is fairly fast.
       ' ' ' No data are placed in instance (0), which is a dummy,
       ' ' ' so instance (ktr) is the last one populated.
    
       '' Below: Make POINTERs for instances 0 to 5000,
       '' even if ktr < 5000 . . .
       FOR k = 0 TO 5000
       stp(k) = VARPTR(this_one(k)) '' Computing POINTERs
       NEXT k
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''
    
       ktra = ktr '' working value of ktr
       ng = ktr \ 2 '' half of ktr (integer division)
    
       ' ' ' Below, the CALL passes stp(0), but this will quickly
       ' ' ' be incremented, so stp(1) will be the first instance
       ' ' ' to be considered . . .
    
       ' ' ' ASM routine to do sorting on my_sortkey . . .
    '     Do sort now
       TIX t
        CALL do_sort (stp(0))
        
        'try it in PB too
    '  array sort this_one()
       TIX END t
       'for PB sort, use this outStr
    '   FOR k = 1 TO 30
    '      outStr = outStr & this_one(k).my_sortkey & $CRLF
    '   NEXT
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''
       FOR k = 1 TO 30
          outPtr = stp(k)
          outStr = outStr & @outPtr & $CRLF
       NEXT
       
       ? FORMAT$(t, "0,") & " tix" & $CRLF & outStr
          
       ''' Output all results to a properly-formatted file.
       ''' Output is in a loop on k from 1 to ktr, and is
       '[' essentially as follows:
       '''
       ''' @stp (k).my_stuff
       ''' followed by
       ''' @stp (k).more_stuff
       ''' discarding my_sortkey
       '''
       ''' . . . with appropriate headers, blank-padding,
       ''' blank lines, formatting, and paging. It is a
       ''' relatively fast step, not a bottleneck.
    
       END FUNCTION
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''
    
    
       ''' SUB do_sort follows. It is a comb-sort, and works
       ''' perfectly, but is still the slowest step in this
       ''' application.
       '''
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
    
       SUB do_sort (BYREF StrucPtr AS DWORD)
    
       '' Power Basic pushes EBX, ESI, EDI.
       LOCAL disp AS LONG
       ASM CLD
    
       Very_top: ' Scan from the top
    
       ASM MOV chgd, 0 ; Set or reset "changed" flag . . .
    
       ASM LEA EBX, ktra ; Pointer to ktra
       ASM MOV EAX, [EBX] ; EAX = ktra
       ASM CMP EAX, 1 ; Testing for exit, if ktra has been decremented to 1
       ASM JE Labexit
    
       ASM LEA EBX, ng ; Pointer to ng
       ASM SUB EAX, [EBX] ; Subtract ng from ktra.
       ASM MOV ECX, EAX ; This ECX is based on (ktra - ng).
       ' It determines how far down in the
       ' table the testing goes on the current
       ' iteration.
    
       ASM MOV EAX, [EBX] ; EAX = ng
       ASM SAL EAX, 2 ; Quadrupled
       ASM LEA EBX, disp
       ASM MOV [EBX], EAX ; disp (displacement) in bytes =
       ' 4 * ng between stp( ) pointers
       ' of the two strings to be compared
    
       ASM MOV EDX, StrucPtr ; Pointer to stp(0). Scan will
       ' start with stp(1). Below this
       ' point, EDX will always = stp ( ).
    
    
       ''' Scan will go from the top of the array to a bottom
       ''' which gradually moves up as "heavier" items sink to
       ''' the bottom and "lighter" ones rise to the top. When
       ''' ktra, the working value of the bottom of the set,
       ''' has been decremented all the way to 1 (top of the
       ''' set), sorting is finished. It is also finished if
       ''' an iteration fails to exchange a pair of items,
       ''' provided that at least one bubble-sorting iteration
       ''' has been made.
       '''
       ''' "Very_top" starts a new iteration from the top to the
       ''' current bottom.
       '''
       ''' "Intermediate" starts a comparison between two
       ''' my_sortkey strings and a possible exchange of
       ''' this_one pointers.
    
    
       Intermediate:
    
       ASM PUSH ECX ; Save master counter.
       ASM ADD EDX, 4 ; Pointer to next stp ( ), initially stp(1)
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''
    
       ''' Compare my_sortkey #1 with my_sortkey #2, where #2 is
       ''' farther down in the this_one( ) array, at least
       ''' according to the stp ( ) pointers. [The instances of
       ''' this_one are never moved. Only their stp( )
       ''' pointers are moved.]
    
       ASM MOV ECX, 80 ; ECX becomes the counter to compare 80
       ' bytes (length of any my_sortkey).
    
       ASM MOV ESI, [EDX] ; ESI points to my_sortkey #1.
    
       ASM LEA EBX, disp
       ASM MOV EAX, [EBX] ; disp in EAX
       ASM ADD EAX, EDX ; EAX = EDX + disp, for my_sortkey #2.
       ASM MOV EDI, [EAX] ; EDI points to my_sortkey #2.
    
       ASM REPE CMPSD ; Compare strings. ESI and EDI change
       ' in this process. This is "repeat
       ' while equal," so it exits at first
       ' mismatch.
    
       ASM JB Past_exchange ; "Below" = smaller ASCII value, so
       ' jump to avoid changing order,
       ' else . . .
    
       '' Exchanging pointers to change order . . .
       ASM MOV ESI, [EDX] ; First, restore original ESI and EDI.
       ASM MOV EDI, [EAX]
       ASM MOV [EDX], EDI ; Then, exchange their locations in the
       ' stp( ) array . . .
       ASM MOV [EAX], ESI
    
       ASM MOV chgd, 1 ; Set "changed" flag.
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''
    
       Past_exchange:
    
       ''' Master counter (the one based on ktra) . . .
       ASM POP ECX
    
       '' Decrement and check counter, for loop-back to
       '' Intermediate . . .
       ASM DEC ECX
       ASM JNZ Intermediate ; to compare another pair of instances
       ' on current iteration
    
       '' Current iteration has ended, so narrow the range ng
       '‘’ (unless ng = 1) . . .
       ASM LEA EBX, ng
       ASM MOV EAX, [EBX] ; EAX = [EBX] = ng
       ASM CMP EAX, 1 ; To test for ng = 1
    
       ASM JE SHORT Keep_ng ; If ng was = 1, keep it, else (below)
       ' scale it down . .
    
       '' Scaling down ng by 3/4. Scale factor is not an exact
       '' science, and an assunmption must be made. Literature
       '' suggests division by 1.3, with integer rounding, so
       '' this is close . . .
    
       ASM SAL EAX, 1 ; Double it.
       ASM ADD EAX, [EBX] ; Triple it.
       ASM SAR EAX, 2 ; Integer division by 4,
       ' can go down to 0.
    
       ASM CMP EAX, 0 ; Testing for EAX = new ng = 0
       ASM JNE SHORT Save_ng
    
       ASM MOV EAX, 1 ; Force EAX = 1 (to be stored as ng,
       ' below), but only if it was = 0
       ' above. With ng = 1, sort will
       ' become a bubble-sort.
       Save_ng:
       ASM MOV [EBX], EAX ; Save ng, which may or may not = 1.
       ' [EBX] still = ng here, but not after
       ' this point.
       ASM JMP Very_top ; To start new iteration
    
       Keep_ng:
       ASM LEA EBX, chgd ; Check "changed" flag . . .
       ' Note: No ehecking of chgd unless this
       ' iteration was a bubble-sort, using
       ' ng = 1
    
       ASM MOV EAX, [EBX]
       ASM SUB EAX, 0 ; To compare chgd with 0
    
       ASM JZ Labexit ; To Labexit if no change on this
       ' iteration, else . . .
    
       ASM LEA EBX, ktra
       ASM MOV EAX, [EBX]
       ASM DEC EAX ; Decrement ktra (move bottom upward).
       ASM MOV [EBX], EAX ; Store new ktra.
    
       ASM JMP Very_top ; To start new iteration
    
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''
    
       Labexit:
    
       '' Nothing to POP
    
       END SUB '' do_sort

    Leave a comment:


  • Harry Akers
    replied
    I'll try your suggestions.

    Leave a comment:


  • Paul Dixon
    replied
    Harry,
    you aren't using the available opcodes efectively in your program, in particular the addressing modes available to the CPU.
    Throughout your code you do things like this:
    Code:
    ASM LEA EBX, ktra ; Pointer to ktra
    ASM MOV EAX, [EBX] ; EAX = ktra
    When the same can be acheived in a single instruction like this:
    Code:
    !mov eax,ktra   'move the value in ktra into eax register
    There is no need to load the address into a register first, just access the variable directly and its address is encoded into the instruction.



    You also have this:
    Code:
    ASM LEA EBX, disp
    ASM MOV EAX, [EBX] ; disp in EAX
    ASM ADD EAX, EDX   ; EAX = EDX + disp, for my_sortkey #2.
    ASM MOV EDI, [EAX] ; EDI points to my_sortkey #2.
    Which could be written
    Code:
    !mov eax,disp
    !mov edi,[eax+edx]


    Code:
    Keep_ng:
    ASM LEA EBX, chgd ; Check "changed" flag . . .
    ' Note: No ehecking of chgd unless this
    ' iteration was a bubble-sort, using
    ' ng = 1
    
    ASM MOV EAX, [EBX]
    ASM SUB EAX, 0 ; To compare chgd with 0
    
    ASM JZ Labexit ; To Labexit if no change on this
    Could be written:
    Code:
    Keep_ng:
    !cmp chgd,0    'compare chgd to zero
    !je Labexit    'jump if equal to Labexit


    and this:
    Code:
    ASM LEA EBX, ktra
    ASM MOV EAX, [EBX]
    ASM DEC EAX ; Decrement ktra (move bottom upward).
    ASM MOV [EBX], EAX ; Store new ktra.
    Could be:
    Code:
    !dec ktra    'decrement ktra
    I'm sure there are more.

    Also,
    Code:
    REPE CMPSB
    is easy to program but is usually slower than coding the search explicitly using a number of simpler opcodes and, since you're scanning for =, you could scan 4 bytes at a time instead of 1 and it would be a lot quicker.


    Paul.

    Leave a comment:


  • John Gleason
    replied
    Harry, I got it to compile (PB9) but I haven't debugged it yet. It has a page fault somewhere at this time.

    Note: You can solve those formatting problems using [CODE ][/CODE ] tags (without the spaces) around your code.
    Code:
       #DIM ALL
       #REGISTER NONE
       ' Comb sort in ASM
       ''' A forum member suggested that I should post this sample
       ''' code here. It is a comb sort, generally faster than a
       ''' bubble sort.
    
    
       '''
       TYPE my_type
       my_sortkey AS STRING * 80 '' Key record
       my_stuff AS STRING * 80 '' More data
       more_stuff AS STRING * 240 '' Still more data
       END TYPE
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''
    
       FUNCTION PBMAIN() AS LONG
       '''
       DIM this_one (0 TO 5000) AS GLOBAL my_type
       DIM stp (0 TO 5000) AS GLOBAL my_type POINTER
       LOCAL k AS LONG
    
       '’’ Below: variables used in comb-sort algorithm
       DIM   chgd AS GLOBAL INTEGER, _ ' Set to 0 (nothing changed on this iteration), or to 1
       ktr AS GLOBAL INTEGER, _ ' Number of instances, or items to sort
       ktra AS GLOBAL INTEGER, _ ' Working value of ktr gradually diminished
       ng AS GLOBAL INTEGER ' A measure of the separation
       ' between two items to be
       ' compared, initially ktr \ 2,
       ' eventually 1
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''
    
       ktr = 0
    
       ' ' ' Read several thousand instances of this_one,
       ' ' ' letting ktr = number of instances. Each instance
       ' ' ' consists of a my_sortkey and two other strings,
       '‘ ‘ ‘ all read in from files, with appropriate editing.
       ' ' ' The reading / editing step is not a bottleneck, and
       ' ' ' is fairly fast.
       ' ' ' No data are placed in instance (0), which is a dummy,
       ' ' ' so instance (ktr) is the last one populated.
    
       '' Below: Make POINTERs for instances 0 to 5000,
       '' even if ktr < 5000 . . .
       FOR k = 0 TO 5000
       stp(k) = VARPTR(this_one(k)) '' Computing POINTERs
       NEXT k
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''
    
       ktra = ktr '' working value of ktr
       ng = ktr \ 2 '' half of ktr (integer division)
    
       ' ' ' Below, the CALL passes stp(0), but this will quickly
       ' ' ' be incremented, so stp(1) will be the first instance
       ' ' ' to be considered . . .
    
       ' ' ' ASM routine to do sorting on my_sortkey . . .
       CALL do_sort (stp(0))
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''
    
       ''' Output all results to a properly-formatted file.
       ''' Output is in a loop on k from 1 to ktr, and is
       '[' essentially as follows:
       '''
       ''' @stp (k).my_stuff
       ''' followed by
       ''' @stp (k).more_stuff
       ''' discarding my_sortkey
       '''
       ''' . . . with appropriate headers, blank-padding,
       ''' blank lines, formatting, and paging. It is a
       ''' relatively fast step, not a bottleneck.
    
       END FUNCTION
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''
    
    
       ''' SUB do_sort follows. It is a comb-sort, and works
       ''' perfectly, but is still the slowest step in this
       ''' application.
       '''
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
       '''''''''''''''''''''''''''''''''''''''''''''''''' ''''
    
       SUB do_sort (BYREF StrucPtr AS DWORD)
    
       '' Power Basic pushes EBX, ESI, EDI.
       LOCAL disp AS LONG
       ASM CLD
    
       Very_top: ' Scan from the top
    
       ASM MOV chgd, 0 ; Set or reset "changed" flag . . .
    
       ASM LEA EBX, ktra ; Pointer to ktra
       ASM MOV EAX, [EBX] ; EAX = ktra
       ASM CMP EAX, 1 ; Testing for exit, if ktra has been
       ' decremented to 1
       ASM JE Labexit
    
       ASM LEA EBX, ng ; Pointer to ng
       ASM SUB EAX, [EBX] ; Subtract ng from ktra.
       ASM MOV ECX, EAX ; This ECX is based on (ktra - ng).
       ' It determines how far down in the
       ' table the testing goes on the current
       ' iteration.
    
       ASM MOV EAX, [EBX] ; EAX = ng
       ASM SAL EAX, 2 ; Quadrupled
       ASM LEA EBX, disp
       ASM MOV [EBX], EAX ; disp (displacement) in bytes =
       ' 4 * ng between stp( ) pointers
       ' of the two strings to be compared
    
       ASM MOV EDX, StrucPtr ; Pointer to stp(0). Scan will
       ' start with stp(1). Below this
       ' point, EDX will always = stp ( ).
    
    
       ''' Scan will go from the top of the array to a bottom
       ''' which gradually moves up as "heavier" items sink to
       ''' the bottom and "lighter" ones rise to the top. When
       ''' ktra, the working value of the bottom of the set,
       ''' has been decremented all the way to 1 (top of the
       ''' set), sorting is finished. It is also finished if
       ''' an iteration fails to exchange a pair of items,
       ''' provided that at least one bubble-sorting iteration
       ''' has been made.
       '''
       ''' "Very_top" starts a new iteration from the top to the
       ''' current bottom.
       '''
       ''' "Intermediate" starts a comparison between two
       ''' my_sortkey strings and a possible exchange of
       ''' this_one pointers.
    
    
       Intermediate:
    
       ASM PUSH ECX ; Save master counter.
       ASM ADD EDX, 4 ; Pointer to next stp ( ), initially stp(1)
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''
    
       ''' Compare my_sortkey #1 with my_sortkey #2, where #2 is
       ''' farther down in the this_one( ) array, at least
       ''' according to the stp ( ) pointers. [The instances of
       ''' this_one are never moved. Only their stp( )
       ''' pointers are moved.]
    
       ASM MOV ECX, 80 ; ECX becomes the counter to compare 80
       ' bytes (length of any my_sortkey).
    
       ASM MOV ESI, [EDX] ; ESI points to my_sortkey #1.
    
       ASM LEA EBX, disp
       ASM MOV EAX, [EBX] ; disp in EAX
       ASM ADD EAX, EDX ; EAX = EDX + disp, for my_sortkey #2.
       ASM MOV EDI, [EAX] ; EDI points to my_sortkey #2.
    
       ASM REPE CMPSB ; Compare strings. ESI and EDI change
       ' in this process. This is "repeat
       ' while equal," so it exits at first
       ' mismatch.
    
       ASM JB Past_exchange ; "Below" = smaller ASCII value, so
       ' jump to avoid changing order,
       ' else . . .
    
       '' Exchanging pointers to change order . . .
       ASM MOV ESI, [EDX] ; First, restore original ESI and EDI.
       ASM MOV EDI, [EAX]
       ASM MOV [EDX], EDI ; Then, exchange their locations in the
       ' stp( ) array . . .
       ASM MOV [EAX], ESI
    
       ASM MOV chgd, 1 ; Set "changed" flag.
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''
    
       Past_exchange:
    
       ''' Master counter (the one based on ktra) . . .
       ASM POP ECX
    
       '' Decrement and check counter, for loop-back to
       '' Intermediate . . .
       ASM DEC ECX
       ASM JNZ Intermediate ; to compare another pair of instances
       ' on current iteration
    
       '' Current iteration has ended, so narrow the range ng
       '‘’ (unless ng = 1) . . .
       ASM LEA EBX, ng
       ASM MOV EAX, [EBX] ; EAX = [EBX] = ng
       ASM CMP EAX, 1 ; To test for ng = 1
    
       ASM JE SHORT Keep_ng ; If ng was = 1, keep it, else (below)
       ' scale it down . .
    
       '' Scaling down ng by 3/4. Scale factor is not an exact
       '' science, and an assunmption must be made. Literature
       '' suggests division by 1.3, with integer rounding, so
       '' this is close . . .
    
       ASM SAL EAX, 1 ; Double it.
       ASM ADD EAX, [EBX] ; Triple it.
       ASM SAR EAX, 2 ; Integer division by 4,
       ' can go down to 0.
    
       ASM CMP EAX, 0 ; Testing for EAX = new ng = 0
       ASM JNE SHORT Save_ng
    
       ASM MOV EAX, 1 ; Force EAX = 1 (to be stored as ng,
       ' below), but only if it was = 0
       ' above. With ng = 1, sort will
       ' become a bubble-sort.
       Save_ng:
       ASM MOV [EBX], EAX ; Save ng, which may or may not = 1.
       ' [EBX] still = ng here, but not after
       ' this point.
       ASM JMP Very_top ; To start new iteration
    
       Keep_ng:
       ASM LEA EBX, chgd ; Check "changed" flag . . .
       ' Note: No ehecking of chgd unless this
       ' iteration was a bubble-sort, using
       ' ng = 1
    
       ASM MOV EAX, [EBX]
       ASM SUB EAX, 0 ; To compare chgd with 0
    
       ASM JZ Labexit ; To Labexit if no change on this
       ' iteration, else . . .
    
       ASM LEA EBX, ktra
       ASM MOV EAX, [EBX]
       ASM DEC EAX ; Decrement ktra (move bottom upward).
       ASM MOV [EBX], EAX ; Store new ktra.
    
       ASM JMP Very_top ; To start new iteration
    
    
       '''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''
    
       Labexit:
    
       '' Nothing to POP
    
       END SUB '' do_sort

    Leave a comment:


  • Harry Akers
    started a topic Comb sort in ASM

    Comb sort in ASM

    ''' A forum member suggested that I should post this sample
    ''' code here. It is a comb sort, generally faster than a
    ''' bubble sort.


    '''
    TYPE my_type
    my_sortkey AS STRING * 80 '' Key record
    my_stuff AS STRING * 80 '' More data
    more_stuff AS STRING * 240 '' Still more data
    END TYPE

    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

    FUNCTION PBMAIN() AS LONG
    '''
    DIM this_one (0 TO 5000) AS GLOBAL my_type
    DIM stp (0 TO 5000) AS GLOBAL my_type POINTER

    ‘’’ Below: variables used in comb-sort algorithm
    DIM
    chgd AS GLOBAL INTEGER, _ ' Set to 0 (nothing changed
    ' on this iteration), or to 1
    ktr AS GLOBAL INTEGER, _ ' Number of instances,
    ' or items to sort
    ktra AS GLOBAL INTEGER, _ ' Working value of ktr,
    ' gradually diminished
    ng AS GLOBAL INTEGER ' A measure of the separation
    ' between two items to be
    ' compared, initially ktr \ 2,
    ' eventually 1

    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

    ktr = 0

    ' ' ' Read several thousand instances of this_one,
    ' ' ' letting ktr = number of instances. Each instance
    ' ' ' consists of a my_sortkey and two other strings,
    ‘ ‘ ‘ all read in from files, with appropriate editing.
    ' ' ' The reading / editing step is not a bottleneck, and
    ' ' ' is fairly fast.
    ' ' ' No data are placed in instance (0), which is a dummy,
    ' ' ' so instance (ktr) is the last one populated.

    '' Below: Make POINTERs for instances 0 to 5000,
    '' even if ktr < 5000 . . .
    FOR k = 0 To 5000
    stp(k) = VARPTR(this_one(k)) '' Computing POINTERs
    NEXT k

    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

    ktra = ktr '' working value of ktr
    ng = ktr \ 2 '' half of ktr (integer division)

    ' ' ' Below, the CALL passes stp(0), but this will quickly
    ' ' ' be incremented, so stp(1) will be the first instance
    ' ' ' to be considered . . .

    ' ' ' ASM routine to do sorting on my_sortkey . . .
    CALL do_sort (stp(0))

    ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

    ''' Output all results to a properly-formatted file.
    ''' Output is in a loop on k from 1 to ktr, and is
    '[' essentially as follows:
    '''
    ''' @stp (k).my_stuff
    ''' followed by
    ''' @stp (k).more_stuff
    ''' discarding my_sortkey
    '''
    ''' . . . with appropriate headers, blank-padding,
    ''' blank lines, formatting, and paging. It is a
    ''' relatively fast step, not a bottleneck.


    '''''''''''''''''''''''''''''''''''''''''''''''''''''''


    ''' SUB do_sort follows. It is a comb-sort, and works
    ''' perfectly, but is still the slowest step in this
    ''' application.
    '''

    ''''''''''''''''''''''''''''''''''''''''''''''''''''''
    ''''''''''''''''''''''''''''''''''''''''''''''''''''''
    ''''''''''''''''''''''''''''''''''''''''''''''''''''''

    SUB do_sort (BYREF StrucPtr AS DWORD)

    '' Power Basic pushes EBX, ESI, EDI.

    ASM CLD

    Very_top: ' Scan from the top

    ASM MOV chgd, 0 ; Set or reset "changed" flag . . .

    ASM LEA EBX, ktra ; Pointer to ktra
    ASM MOV EAX, [EBX] ; EAX = ktra
    ASM CMP EAX, 1 ; Testing for exit, if ktra has been
    ' decremented to 1
    ASM JE Labexit

    ASM LEA EBX, ng ; Pointer to ng
    ASM SUB EAX, [EBX] ; Subtract ng from ktra.
    ASM MOV ECX, EAX ; This ECX is based on (ktra - ng).
    ' It determines how far down in the
    ' table the testing goes on the current
    ' iteration.

    ASM MOV EAX, [EBX] ; EAX = ng
    ASM SAL EAX, 2 ; Quadrupled
    ASM LEA EBX, disp
    ASM MOV [EBX], EAX ; disp (displacement) in bytes =
    ' 4 * ng between stp( ) pointers
    ' of the two strings to be compared

    ASM MOV EDX, StrucPtr ; Pointer to stp(0). Scan will
    ' start with stp(1). Below this
    ' point, EDX will always = stp ( ).


    ''' Scan will go from the top of the array to a bottom
    ''' which gradually moves up as "heavier" items sink to
    ''' the bottom and "lighter" ones rise to the top. When
    ''' ktra, the working value of the bottom of the set,
    ''' has been decremented all the way to 1 (top of the
    ''' set), sorting is finished. It is also finished if
    ''' an iteration fails to exchange a pair of items,
    ''' provided that at least one bubble-sorting iteration
    ''' has been made.
    '''
    ''' "Very_top" starts a new iteration from the top to the
    ''' current bottom.
    '''
    ''' "Intermediate" starts a comparison between two
    ''' my_sortkey strings and a possible exchange of
    ''' this_one pointers.


    Intermediate:

    ASM PUSH ECX ; Save master counter.
    ASM ADD EDX, 4 ; Pointer to next stp ( ), initially stp(1)

    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

    ''' Compare my_sortkey #1 with my_sortkey #2, where #2 is
    ''' farther down in the this_one( ) array, at least
    ''' according to the stp ( ) pointers. [The instances of
    ''' this_one are never moved. Only their stp( )
    ''' pointers are moved.]

    ASM MOV ECX, 80 ; ECX becomes the counter to compare 80
    ' bytes (length of any my_sortkey).

    ASM MOV ESI, [EDX] ; ESI points to my_sortkey #1.

    ASM LEA EBX, disp
    ASM MOV EAX, [EBX] ; disp in EAX
    ASM ADD EAX, EDX ; EAX = EDX + disp, for my_sortkey #2.
    ASM MOV EDI, [EAX] ; EDI points to my_sortkey #2.

    ASM REPE CMPSB ; Compare strings. ESI and EDI change
    ' in this process. This is "repeat
    ' while equal," so it exits at first
    ' mismatch.

    ASM JB Past_exchange ; "Below" = smaller ASCII value, so
    ' jump to avoid changing order,
    ' else . . .

    '' Exchanging pointers to change order . . .
    ASM MOV ESI, [EDX] ; First, restore original ESI and EDI.
    ASM MOV EDI, [EAX]
    ASM MOV [EDX], EDI ; Then, exchange their locations in the
    ' stp( ) array . . .
    ASM MOV [EAX], ESI

    ASM MOV chgd, 1 ; Set "changed" flag.

    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

    Past_exchange:

    ''' Master counter (the one based on ktra) . . .
    ASM POP ECX

    '' Decrement and check counter, for loop-back to
    '' Intermediate . . .
    ASM DEC ECX
    ASM JNZ Intermediate ; to compare another pair of instances
    ' on current iteration

    '' Current iteration has ended, so narrow the range ng
    ‘’ (unless ng = 1) . . .
    ASM LEA EBX, ng
    ASM MOV EAX, [EBX] ; EAX = [EBX] = ng
    ASM CMP EAX, 1 ; To test for ng = 1

    ASM JE SHORT Keep_ng ; If ng was = 1, keep it, else (below)
    ' scale it down . .

    '' Scaling down ng by 3/4. Scale factor is not an exact
    '' science, and an assunmption must be made. Literature
    '' suggests division by 1.3, with integer rounding, so
    '' this is close . . .

    ASM SAL EAX, 1 ; Double it.
    ASM ADD EAX, [EBX] ; Triple it.
    ASM SAR EAX, 2 ; Integer division by 4,
    ' can go down to 0.

    ASM CMP EAX, 0 ; Testing for EAX = new ng = 0
    ASM JNE SHORT Save_ng

    ASM MOV EAX, 1 ; Force EAX = 1 (to be stored as ng,
    ' below), but only if it was = 0
    ' above. With ng = 1, sort will
    ' become a bubble-sort.
    Save_ng:
    ASM MOV [EBX], EAX ; Save ng, which may or may not = 1.
    ' [EBX] still = ng here, but not after
    ' this point.
    ASM JMP Very_top ; To start new iteration

    Keep_ng:
    ASM LEA EBX, chgd ; Check "changed" flag . . .
    ' Note: No ehecking of chgd unless this
    ' iteration was a bubble-sort, using
    ' ng = 1

    ASM MOV EAX, [EBX]
    ASM SUB EAX, 0 ; To compare chgd with 0

    ASM JZ Labexit ; To Labexit if no change on this
    ' iteration, else . . .

    ASM LEA EBX, ktra
    ASM MOV EAX, [EBX]
    ASM DEC EAX ; Decrement ktra (move bottom upward).
    ASM MOV [EBX], EAX ; Store new ktra.

    ASM JMP Very_top ; To start new iteration


    '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

    Labexit:

    '' Nothing to POP

    END SUB '' do_sort
    Last edited by Harry Akers; 23 Jul 2009, 06:10 PM. Reason: It gave me smiley faces where I wanted something else Also, problems with and spacing.
Working...
X