Inline assembly problem

  • Inline assembly problem

    This is probably the most peculiar problem I have seen when programming in PB: my inline assembly code doesn't work, and yet it does. A contradiction? I think so, but it is nonetheless an accurate, if confusing, description.

    The current situation is that my assembly code works sometimes, but not every time. When it fails, the behaviour differs quite a lot. Some failures crash so badly that even Dr. Watson cannot do its job properly. There seems to be a system to it, since the response is repeatable for the same setup and set of parameters, although I am unable to predict it. I cannot find any corresponding problems with my pure BASIC code. No wonder, you might think: it's bad code. I tend to agree, but I am unable to isolate the problem.

    The thing is, my code works 100% correctly when I debug it! When run from inside the PB debugger, all is well, using the very same data set that fails when run outside of the debugger!

    I first detected this when reading a forum discussion the other day, and failed to make a local comparison using my own code. I left that for a while, until I stumbled over a variation of the problem. After following up on that, I detected that the problem is present in more (all?) of my assembly functions. Since then I have fiddled around with this and tried all the possibilities I can think of, with no success. I have tested it under XP SP3 and W2K SP4, with and without firewalls and anti-everything SW. My usual settings in such SW are at "paranoia level", which has previously hampered PC usage, but I think that possibility has been eliminated this time.
    This is NOT related to a particular PB compiler, as several versions reveal the same problem. The behaviour is consistent over PBCC versions 4.00, 4.01, 4.03, 4.04, 5.0 and 5.01, and PBWin versions 8.04 and 9.01, on old and fresh installations (4.01 through 4.04 were installed by upgrade).

    Normally, I would have poured debugging-code to log status all over the place in an attempt to isolate the problem, but I do not (yet) have any code to do that reliably from within pure assembly code.

    ------

    In order to avoid the "code not shown" comment, here is compilable code to demonstrate my problem:

    The code shown consistently crashes on me when run from outside of the debugger, but it works fine when run from inside the debugger. (The CC compilers seem to struggle the most, and Dr. Watson has difficulties handling it, while the WIN compilers simply return with a "sorry, need to close" message.) The code should work without modification of any sort with compiler versions older than the ones I have used for testing.

    I would appreciate it if anyone could spare the time to confirm that it does or does not work with their compiler / computer.

    Is this a situation known to anyone?
    Is this a violation of "owned" code or data space rules?
    What am I missing here?
    My gut feeling tells me that there is some basic stuff (no pun intended), but I clearly fail to see it.

    ------

    Code description:

    This is an assembly routine that searches a string for a substring, primarily case insensitively (but it's your choice), without using temporary storage. As such it is very well suited for looong string searches. At the time of writing, it was faster than the then current PB versions' INSTR, but I have not timed it against the PB5/9 INSTR.
    I have previously published this code in the source forum (http://www.powerbasic.com/support/pb...ad.php?t=25128), so it may already be known to some of you. The link doubly demonstrates my point, as the samples published there work both inside and outside of the debugger. (As you can see, Colin Schmidt complained about the code crashing on his computer, but I didn't hear anything more after I released version 2.)
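
    For cross-checking results, the search described above can be sketched in plain BASIC. This is only a rough reference model under my own assumptions, not the published routine: RefInstr is a made-up name, it handles forward searches only, and it copies both strings (so it gives up the "no temporary storage" property and is of no use for timing).

    ```powerbasic
    ' Hypothetical reference routine (RefInstr is a made-up name, not part of the published code).
    ' Forward search only. It copies both strings, so it is meant for checking results, not for speed.
    FUNCTION RefInstr( BYVAL lStartPos AS LONG, sBuffer AS STRING, sWantedString AS STRING, sCollateString AS STRING ) AS LONG
      LOCAL i AS LONG
      LOCAL sBuf, sWant AS STRING

      ' Same parameter checks as described for CC_Instr: return 0 on illegal input
      IF lStartPos < 1 OR LEN(sCollateString) <> 256 THEN EXIT FUNCTION

      sBuf  = sBuffer
      sWant = sWantedString

      ' Translate every character through the 256-entry table
      ' (character code + 1 = position in sCollateString)
      FOR i = 1 TO LEN(sBuf)
        MID$(sBuf, i, 1) = MID$(sCollateString, ASC(sBuf, i) + 1, 1)
      NEXT i
      FOR i = 1 TO LEN(sWant)
        MID$(sWant, i, 1) = MID$(sCollateString, ASC(sWant, i) + 1, 1)
      NEXT i

      ' Let the built-in INSTR do the actual search on the translated copies
      FUNCTION = INSTR(lStartPos, sBuf, sWant)
    END FUNCTION
    ```

    For a forward search, RefInstr(1, sString, sSubString, sXlat) and CC_Instr(1, sString, sSubString, sXlat, %CC_INSTR_SEARCHFORWARD) should then report the same position if both behave as described.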

    The code supplied below has a modification as compared to the code already published: To make the code even more solid, I have put a (to my knowledge, unnecessary) PUSHAD at the start of the assembly code and a corresponding POPAD at the exit point. No change in behaviour is noticed after the added "security", so these two commands can safely be commented out.

    Code:
    ' Code shown is tested to compile and run with
    ' PB CC4, PBCC5, PB WIN8, PB WIN9. It will probably
    ' work unchanged with many older versions as well.
    #COMPILE EXE
    #DIM ALL
    
    %CC_INSTR_SEARCHFORWARD   =  0&
    %CC_INSTR_SEARCHBACKWARDS = -1&
    
    DECLARE FUNCTION CC_Instr( lStartPos AS LONG, sBuffer AS STRING, sWantedString AS STRING, sCollateString AS STRING, lDirection AS LONG ) AS LONG
    ' ==============================================================================================================
    
      FUNCTION CC_Instr( lStartPos AS LONG, sBuffer AS STRING, sWantedString AS STRING, sCollateString AS STRING, lDirection AS LONG ) AS LONG
      ' CC_Instr (=CustomCase INSTR or CustomCharacterset INSTR) will search sBuffer for sWantedString starting at position lStartPos and return an integer informing of
      ' the location of the string while using sCollateString for character translation when comparing. The result is indicated by the returnvalue.
      ' sBuffer will be searched from the beginning to the end (left to right) or from the end to the beginning (right to left) according to the lDirection flag.
      ' None of the parameters are altered during execution, and only 16 bytes are allocated for temporary data storage.
      '
      ' Also worth noting, is that - due to the (mandatory) custom-characterset feature, this routine does not have ANY speed- nor resource penalty when handling case
      ' INSENSITIVE searches or any odd needs for character substitution.
      ' As for speed in general, the author's measurements suggest that this function completes in around 50% to 60% of the time used by PowerBasic's INSTR() for
      ' short-to-medium length strings (<90 chars used when testing), and approx. 20% faster than INSTR() for long strings (750-900KB strings used when testing).
      ' Note, though, that such timings are highly need-, hit- and environment-dependent, so your timings will most probably differ if you try for yourself using
      ' your own strings.
      '
      ' Input:  lStartPos      = Where in sBuffer to start your search. 1 = start of sBuffer. Which end to start searching from is decided by lDirection.
      '         sBuffer        = The buffer to search when looking for the string. sBuffer is NOT altered during the search.
      '         sWantedString  = The string we want to find in sBuffer. sWantedString is NOT altered during the search.
      '         sCollateString = A 256-character string containing the characters to use when translating chars in sBuffer and sWantedString.
      '                          This string may also be built according to PowerBasic's COLLATE STRING rules, and may
      '                          be - if used in a search-algorithm within sorted arrays - identical to the COLLATE-string
      '                          used when sorting the same array, so your custom sorting-tables will get extended mileage.
      '         lDirection     = 0 or not 0 (or, if you prefer: %CC_INSTR_SEARCHFORWARD or %CC_INSTR_SEARCHBACKWARDS)
      '                          0 -->     lStartPos = Relative to the beginning of sBuffer, and the search will start from
      '                                    beginning-of-sBuffer and continue to end-of-sBuffer ("normal" search left to right).
      '                          Not 0 --> lStartPos = Relative to the end-of-sBuffer, and the search will start from the
      '                                    end-of-sBuffer and continue towards the beginning-of-sBuffer. ("reverse"/"backwards" search right to left)
      '
      ' Output: <nothing>        No alterations are done to the function's parameters.
      '
      ' Returnvalues:       0  = sWantedString was not found in sBuffer, or an illegal parameter was detected:
      '                              - lStartPos < 1
      '                              - LEN(sCollateString) <> 256
      '                    >0  = sWantedString is found at that position. 1 = always the beginning of (the leftmost position in ) sBuffer.
        LOCAL lRetVal AS LONG
        LOCAL lBufSize AS LONG
        LOCAL lWantLen AS LONG
        LOCAL psWantString AS STRING PTR
          ' Here and there one or more NOP instructions are inserted. This is in order to align the code to
          ' WORD-, DWORD- or PARAGRAPH-addresses. It has been attempted to collect all NOPs into places where
          ' they only fill the alignment function and are never executed, but where that has proved to be
          ' impossible or impractical, the NOPs are scattered around in the preceding code where they can be of
          ' some use; aligning variable accessing instructions (and others) and to fill in where some register- or
          ' execution-channel congestion can be thought to be eased. Although the effects of this has not been
          ' measured, timings more than suggests that the alignment work pays off big time (up to 20% has been measured)
          ' for long strings, while it doesn't really matter when it comes to short strings. However, I have not experimented
          ' with more or less alignment code, so other ways may prove to be faster still (e.g. only go for DWORD alignment all over
          ' as opposed to try to go for PARAGRAPH alignment as I have done here). One has to stop development at some point and say
          ' "That's it". There will always be something that can be improved upon or experimented with to see if it is better or not.
          ' Make sure that sCollateString is exactly 256 characters long
          ! PUSHAD
          ! mov EAX, sCollateString                  ' EAX = pointer to sCollateString handle
          ! mov EBX, [EAX]                           ' EBX = pointer to start of sCollateString
          ! cmp EBX, 0                               ' EBX = 0 --> Null pointer: sCollateString is empty
          ! je RevEndFindString
          ! mov ECX, [EBX-4]                         ' ECX = LEN(sCollateString)
          ! cmp ECX, 256                             ' Check that sCollateString is exactly 256 characters long
          ! jne RevEndFindString                     ' Exit if LEN(sCollateString)<>256
          ' Store pointer to sWantedString in memory
          ' Store LEN(sWantedString) in memory
          ' Check that lWantLen > 0
          ! mov EAX, sWantedString                   ' EAX = handle to sWantString
          ! mov ESI, [EAX]                           ' ESI = Pointer to sWantString
          ! cmp ESI, 0                               ' ESI = 0 --> Null pointer: LEN(sWantedString) = 0
          ! je RevEndFindString
          ! mov EDX, [ESI-4]                         ' move length of sWantedString into EDX
          ! mov lWantLen, EDX                        ' save LEN(sWantString)
          ' Check that lWantLen <= LEN(sBuffer)
          ! mov EAX, sBuffer                         ' EAX = pointer to string handle
          ! mov EDI, [EAX]                           ' EDI = pointer to string data
          ! cmp EDI, 0                               ' EDI = 0 --> Null pointer: sBuffer is empty
          ! je RevEndFindString
          ! mov ECX, [EDI-4]                         ' move length of string into ECX
          ! mov lBufSize, ECX                        ' Set the lBufSize parameter also
          ! cmp ECX, EDX                             ' Is LEN(sBuffer) < LEN(sWantedString)?
          ! jc RevEndFindString                      ' If CY, then sWantedString is longer than sBuffer - exit
          ' Current situation:
          '   EAX = No interest at this point
          '   EBX = pointer to sCollateString (does not change until ending calculations when sWantString has been located)
          '   ECX = number of Chars in sBuffer (=Chars to search)
          '   EDX = lWantLen, and variable lWantLen initialized
          '   ESI = pointer to start (left side) of sWantString, variable psWantString (pointer to sWantString) initialized
          '   EDI = pointer to start (left side) of sBuffer - no need to save this one in memory
          '
          '   Also, we have made sure that LEN(sWantString) <= LEN(sBuffer), lWantLen > 0 and LEN(sBuffer) >= lWantLen
          '
          '   The above is in anticipation that Left-To-Right search is most frequently used ...
          '   for Right-To-Left calculations, we need to readjust ECX prior to continuing
          ' Check the search direction we want
          ! mov EAX, lDirection                      ' EAX = pointer to lDirection handle
          ! mov EAX, [EAX]                           ' EAX = value of lDirection
          ! cmp EAX, 0                               ' See if lDirection indicates a Left-To-Right search or not
          ! jne RightToLeftSearchStart               ' If lDirection <> 0 then this is a RightToLeft search
          ' -----  LEFT-TO-RIGHT-SEARCH starts here  -------
          ' Adjust ECX so that we do not search more of sBuffer than can fit sWantLen
          ! nop                                      ' WORD alignment
          ! sub ECX, EDX                             ' ECX = LEN(sBuffer), EDX = LEN(sWantString) --> ECX = adjusted number of chars to search
          ! jc RevEndFindString                      ' If CY, then lWantLen > LEN(sBuffer)
          ! inc ECX                                  ' ECX = LEN(sBuffer) - LEN(sWantString) + 1
          ! mov psWantString, ESI                    ' Initialize pointer to sWantString
          ! nop                                      ' WORD alignment
          ' Check that lStartPos is not less than 1
          ! mov EAX, lStartPos                       ' EAX = pointer to lStartPos
          ! nop                                      ' PARAGRAPH alignment
          ! mov EDX, [EAX]                           ' EDX = Value of lStartPos
          ! cmp ECX, EDX                             ' Compare LEN(sBuffer) to lStartPos
          ! jc RevEndFindString                      ' If lStartPos > LEN(sBuffer) - exit
          ! dec EDX
          ! nop                                      ' DWORD alignment
          ! cmp EDX, 0                               ' Check lStartPos > 0 upon input
          ! nop                                      ' PARAGRAPH alignment
          ! jl RevEndFindString
          ' Part of PARAGRAPH alignment for the loops below ... DWORD aligning the following instruction
          ! nop
          ! nop                                      ' DWORD alignment
          ' Adjust EDI to point to the real starting position and decrease ECX with the corresponding number of chars
          ! add EDI, EDX                             ' Make sure EDI points to the real starting position
          ! sub ECX, EDX                             ' Subtract the corresponding number of chars from ECX (= number of chars to search)
          ' PARAGRAPH alignment done - the top of the instruction-loop is aligned
          ! nop
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
          ' Look for the first character to match
        FindFirstCharLoopTopInit:
          ! mov ESI, psWantString
          ! nop
          ! nop                                      ' DWORD alignment
          ! movzx EAX, BYTE PTR [ESI]                ' Get sWantedString's first character into AL (we're using only AL)
          ! nop                                      ' DWORD alignment
          ! movzx EDX, BYTE PTR [EBX+EAX]            ' Get sWantedString's first character's value (or SORT-WEIGHT if you're using an ARRAY SORT COLLATE string as sCollateString) into DL
          ' The FindFirstCharLoopTop label is now PARAGRAPH aligned (relative address 0)
        FindFirstCharLoopTop:
          ! movzx EAX, BYTE PTR [EDI]                ' Get sBufferString's next character into AL
          ! movzx EAX, BYTE PTR [EBX+EAX]            ' Get sBufferString's next character's value (or SORT-WEIGHT if you're using an ARRAY SORT COLLATE string as sCollateString) into AL
          ! inc EDI
          ! cmp EAX, EDX                             ' Compare AL with DL
          ! je FindRestCharsInit
          ! dec ECX
          ! jnz FindFirstCharLoopTop
          ! jmp EndCC_Instr                          ' ECX = 0 --> No strings found. To test: Use 1-letter strings as well!
          ' PARAGRAPH align the top of the FindRestCharsInit (relative address 0)
          ' This code is never executed - its sole purpose is to fill up to the next paragraph
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        FindRestCharsInit:
          ' Here: Check if the rest of the string matches also
          ! dec ECX
          ! push EDI                                 ' Save the current position in sBuffer
          ! push ECX                                 ' ECX = Remaining chars of sBuffer
          ! mov ECX, lWantLen                        ' Number of characters in sWantedString to ECX
          ! inc ESI                                  ' Move the pointer to the next Char of sWantString before comparison commences ...
          ! dec ECX
          ! jz StringEqual                           ' Is the Wanted String empty now (could have been only 1 char)?
          ' PARAGRAPH align the top of the FindRestCharsLoopTop (to relative address 0)
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        FindRestCharsLoopTop:
          ! movzx EAX, BYTE PTR [EDI]                ' Get sBufferString's next character into AL (we're still only using AL)
          ! movzx EDX, BYTE PTR [EBX+EAX]            ' Get sBufferString's next character's value (or SORT-WEIGHT if you're using an ARRAY SORT COLLATE string as sCollateString) into DL
          ! movzx EAX, BYTE PTR [ESI]                ' Get sWantedString's next character into AL
          ! movzx EAX, BYTE PTR [EBX+EAX]            ' Get sWantedString's next character's value (or SORT-WEIGHT if you're using an ARRAY SORT COLLATE string as sCollateString) into AL
          ! inc ESI
          ! inc EDI
          ! dec ECX                                  ' Check if ECX=0
          ! jz FindRestCharsLoopEnd                  ' If ECX = 0, no more characters in sWantedString
          ! cmp EAX, EDX                             ' Compare AL with DL
          ! je FindRestCharsLoopTop                  ' If equal - check next char as well
          ! pop ECX                                  ' ECX = Remaining chars of sBuffer
          ! pop EDI                                  ' EDI = Restore the saved position in sBuffer
          ! cmp ECX, 0                               ' ECX = 0?
          ! je EndCC_Instr                           ' If ECX is zero, we have searched the entire buffer with no sWantedString
          ! jmp FindFirstCharLoopTopInit
          ' The FindRestCharsLoopEnd label is PARAGRAPH aligned
          ' This code is never executed - its sole purpose is to fill up to the next paragraph
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        FindRestCharsLoopEnd:
          ! cmp EAX, EDX                             ' ECX = 0, but the last characters aren't compared yet
          ! je StringEqual                           ' If comparison was successful - and the last character of sWantedString matched sBuffer
          ! pop ECX                                  ' ECX = Remaining chars of sBuffer
          ! pop EDI                                  ' EDI = Restore the saved position in sBuffer
          ! cmp ECX, 0                               ' ECX = 0?
          ! je EndCC_Instr                           ' If ECX is zero, we have searched the entire buffer with no sWantedString
          ! jmp FindFirstCharLoopTopInit
          ' This code is never executed - its sole purpose is to fill up to the next paragraph
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        EndFindString:
          ' We did not locate sWantedString or sWantedString was in part found at the very end of sBuffer, so nothing more to look for!
          ! xor EAX, EAX
          ! mov lRetVal, EAX
          ! jmp EndCC_Instr
          ' This code is never executed - its sole purpose is to fill up to the next paragraph
          ! nop
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        StringEqual:
          ! mov EDX, lWantLen
          ! mov EBX, lBufSize                        ' EBX = SizeOf(sBuffer)
          ! dec EDX
          ! pop ECX                                  ' ECX = Number of Chars left of sBuffer from the second Char of sWantedString - this is OK since lRetVal returned is offset 1, not 0
          ! sub EBX, EDX
          ! pop EDI                                  ' Get back previous positions - here: It only serves as a Stack-balance
          ! sub EBX, ECX                             ' EBX = sBuffer Offset to the first character of sWantedString
          ! mov lRetVal, EBX                         ' This offset-pointer (the Char following sWantedString) is stored in lRetVal
          ! jmp EndCC_Instr
    
      ' --------  RIGHT-TO-LEFT search starts here  ------------
          ' PARAGRAPH align the top of the RightToLeftSearchStart (to relative pos. 0)
          ' This code is never executed - its sole function is to fill up to the next paragraph
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        RightToLeftSearchStart:
          ' This is the Search-Backwards-part. The logic is identical, but we decrease string pointers instead.
          ' Make ESI and EDI to point to the end of their strings
          ! add EDI, ECX                             ' EDI = pointer to the rightmost position of sBuffer + 1
          ! nop
          ! dec EDI                                  ' EDI = pointer to the rightmost position of sBuffer
          ! add ESI, EDX                             ' ESI = pointer to the rightmost position of sWantLen + 1
          ! nop
          ! dec ESI                                  ' ESI = pointer to the rightmost position of sWantLen
          ! mov psWantString, ESI
          ' Adjust ECX so that we do not search more of sBuffer than can fit sWantLen
          ! sub ECX, EDX                             ' ECX = LEN(sBuffer), EDX = LEN(sWantString) --> ECX = adjusted number of chars to search
          ! jc RevEndFindString                      ' If CY, then lWantLen > LEN(sBuffer)
          ! inc ECX                                  ' ECX = LEN(sBuffer) - LEN(sWantString) + 1
          ' Check that lStartPos is not less than 1
          ! nop
          ! mov EAX, lStartPos                       ' EAX = pointer to lStartPos
          ! nop
          ! mov EDX, [EAX]                           ' EDX = Value of lStartPos
          ! cmp ECX, EDX                             ' Compare LEN(sBuffer) to lStartPos
          ! jc RevEndFindString                      ' If lStartPos > LEN(sBuffer) - exit
          ! dec EDX
          ! nop
          ! cmp EDX, 0                               ' Check lStartPos > 0 upon input
          ! nop
          ! jl RevEndFindString
          ! sub EDI, EDX                             ' EDI = pointer to the start-to-search position of sBuffer, offset by lStartPos as appropriate
          ! nop
          ! nop
          ! sub ECX, EDX                             ' ECX = ECX - lStartPos
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
          ' Look for the first character to match
        RevFindFirstCharLoopTopInit:
          ! mov ESI, psWantString
          ! nop
          ! nop                                      ' DWORD alignment
          ! movzx EAX, BYTE PTR [ESI]                ' Get sWantedString's first character into AL
          ! nop
          ! movzx EDX, BYTE PTR [EBX+EAX]            ' Get sWantedString's first character's value (or SORT-WEIGHT if you're using an ARRAY SORT COLLATE string as sCollateString) into DL
          ' The RevFindFirstCharLoopTop label is now PARAGRAPH aligned (relative address 0)
        RevFindFirstCharLoopTop:
          ! movzx EAX, BYTE PTR [EDI]                ' Get sBufferString's first character into AL
          ! movzx EAX, BYTE PTR [EBX+EAX]            ' Get sBufferString's first character's value (or SORT-WEIGHT if you're using an ARRAY SORT COLLATE string as sCollateString) into AL
          ! dec EDI
          ! cmp EAX, EDX                             ' Compare DL with AL
          ! je RevFindRestCharsInit
          ! dec ECX
          ! jnz RevFindFirstCharLoopTop
          ! jmp EndCC_Instr                          ' ECX = 0 --> No strings found. To test: Use 1-letter strings as well!
          ' This code is never executed - its sole purpose is to fill up to the next paragraph
          ' .... the seemingly longer-than-needed string of NOPs are necessary due to jumps becoming
          ' long or short in ways not fit for paragraph alignment. No harm done though - this is never
          ' executed.
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        RevFindRestCharsInit:
          ' Here: Check if the rest of the string matches also
          ! dec ECX
          ! push EDI                                 ' Save the current position in sBuffer
          ! push ECX                                 ' ECX = Remaining chars of sBuffer
          ! nop                                      ' DWORD alignment
          ! mov ECX, lWantLen                        ' Number of characters in sWantedString to ECX
          ! dec ESI                                  ' Move the pointer to the next Char of sWantString before comparison commences ...
          ! dec ECX
          ! jz RevStringEqual                        ' Is the Wanted String empty now (could have been only 1 char)?
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        RevFindRestCharsLoopTop:
          ! movzx EAX, BYTE PTR [EDI]                ' Get sBufferString's next character into AL (we're still only using AL)
          ! movzx EDX, BYTE PTR [EBX+EAX]            ' Get sBufferString's next character's value (or SORT-WEIGHT if you're using an ARRAY SORT COLLATE string as sCollateString) into DL
          ! movzx EAX, BYTE PTR [ESI]                ' Get sWantedString's next character into AL
          ! movzx EAX, BYTE PTR [EBX+EAX]            ' Get sWantedString's next character's value (or SORT-WEIGHT if you're using an ARRAY SORT COLLATE string as sCollateString) into AL
          ! dec ESI
          ! dec EDI
          ! dec ECX                                  ' Check if ECX=0
          ! jz RevFindRestCharsLoopEnd               ' If ECX = 0, no more characters in sWantedString
          ! cmp EAX, EDX                             ' Compare AL with DL
          ! je RevFindRestCharsLoopTop               ' If equal - check next char as well
          ! pop ECX                                  ' ECX = Remaining chars of sBuffer
          ! pop EDI                                  ' EDI = Restore the saved position in sBuffer
          ! cmp ECX, 0                               ' ECX = 0?
          ! je EndCC_Instr                           ' If ECX is zero, we have searched the entire buffer with no sWantedString
          ! jmp RevFindFirstCharLoopTopInit
          ' Label RevFindRestCharsLoopEnd is PARAGRAPH aligned
        RevFindRestCharsLoopEnd:
          ! cmp EAX, EDX                             ' ECX = 0, but the last characters aren't compared yet
          ! je RevStringEqual                        ' If comparison was successful - and the last character of sWantedString matched sBuffer
          ! pop ECX                                  ' ECX = Remaining chars of sBuffer
          ! pop EDI                                  ' EDI = Restore the saved position in sBuffer
          ! cmp ECX, 0                               ' ECX = 0?
          ! je EndCC_Instr                           ' If ECX is zero, we have searched the entire buffer with no sWantedString
          ! jmp RevFindFirstCharLoopTopInit
          ' This code is never executed - its sole purpose is to fill up to the next paragraph
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        RevEndFindString:
          ' We did not locate sWantedString or sWantedString was in part found at the very end of sBuffer, so nothing more to look for!
          ' Or ... an illegal condition was identified during initialization. Either way, we're through.
          ! xor EAX, EAX
          ! mov lRetVal, EAX
          ! jmp short EndCC_Instr
          ' This code is never executed - its sole purpose is to fill up to the next paragraph
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop
          ! nop                                      ' PARAGRAPH alignment
        RevStringEqual:
          ! pop ECX
          ! inc ECX
          ! mov lRetVal, ECX
          ' This position is DWORD aligned, so no adjustments made ...
        EndCC_Instr:
          ! POPAD
        FUNCTION = lRetVal
      END FUNCTION
    
    ' =============================================================================================================================================
    
    FUNCTION PBMAIN () AS LONG
    
      LOCAL sXlat, sString, sSubString AS STRING
      LOCAL i, lPos AS LONG
    
      ' FUNCTION CC_Instr( lStartPos AS LONG, sBuffer AS STRING, sWantedString AS STRING, sCollateString AS STRING, lDirection AS LONG ) AS LONG
      sXlat = CHR$(0 TO 255)
    
      ' The ANSI part of UCASE$() is not defined in PB versions below CC4.03 or WIN8.03,
      ' except maybe the .02 versions which I have never seen - have they been generally available?
      #IF %DEF(%PB_DLL32)
        #IF %PB_REVISION > &H802
          sXlat = UCASE$(sXlat, ANSI)
         #ELSE
          sXlat = UCASE$(sXlat)
        #ENDIF
       #ELSE
        #IF %PB_REVISION > &H402
          sXlat = UCASE$(sXlat, ANSI)
         #ELSE
          sXlat = UCASE$(sXlat)
        #ENDIF
      #ENDIF
    
      sString = "This is my string to be searched"
      sSubString  = "hIs"
      lPos = CC_Instr(1, sString, sSubString, sXlat, %CC_INSTR_SEARCHFORWARD)
    
      ' When successfully run, lPos = 2 and one of the messages below will display it.
      #IF %DEF(%PB_DLL32)
      ' Code to display results for the Windows GUI compilers.
        MSGBOX "sSubString = " + $DQ + sSubString + $DQ + " was found in pos." + STR$(lPos) + " of " + $CRLF + $DQ + sString + $DQ + "."
       #ELSE
      ' Code to display results for the Console compilers.
        PRINT "sSubString = " + $DQ + sSubString + $DQ + " was found in pos." + STR$(lPos) + " of " + $DQ + sString + $DQ + "."
        WAITKEY$
      #ENDIF
    
    END FUNCTION
    
    
    ' ===============================================================================================================================================================================
    ViH

    ----------------------------------------------------------------------------------------------------------

    "If debugging is the process of removing bugs, then programming must be the process of putting them in." - Unknown

    "You can fool some of the people all of the time, and all of the people some of the time, but you can make
    a fool of yourself anytime." - Unknown

  • #2
    #register none

    Could it be that #REGISTER NONE is in effect under the debugger but not in normal execution?

    Add the following to the top of the file.
    Code:
    #register none



    • #3
      :exactly:Thank you, Brian! That seems to fix it!

      I have yet to test my other functions, but I suspect you hit the nail on the head!

      In order to make the assembly functions work without hampering PB's optimisation of the other parts of the code, the #REGISTER NONE statement should be placed inside the function itself - below the FUNCTION header (not to be confused with the DECLARE statement) and before any variables are declared.

      I'll return with a final report shortly, and a correction to CC_INSTR() containing this fix (of course).

      Again, thank you very much! :shake:

      ViH
      ----------------------------------------
      Isn't this wonderful? A solution in less than 40 minutes - when posting the question at midnight!!!



      • #4
        Originally posted by Vidar Hanto
        ----------------------------------------
        Isn't this wonderful? A solution in less than 40 minutes - when posting the question at midnight!!!
        Probably took you longer than that just to compose and post the message {grin}.

        =================================
        "If you can't annoy somebody,
        there's little point in writing."

        Anonymous
        =================================
        It's a pretty day. I hope you enjoy it.

        Gösta

        JWAM: (Quit Smoking): http://www.SwedesDock.com/smoking
        LDN - A Miracle Drug: http://www.SwedesDock.com/LDN/



        • #5
          Code:
          ' This code is never executed - its sole purpose is to fill up to the next paragraph
                ' .... the seemingly longer-than-needed string of NOPs are necessary due to jumps becoming
                ' long or short in ways not fit for paragraph alignment. No harm done though - this is never
                ' executed.
                ! nop
                ! nop
                ....
          >>>>
          Code:
          #ALIGN  boundary
          Michael Mattias
          Tal Systems Inc. (retired)
          Racine WI USA
          [email protected]
          http://www.talsystems.com



          • #6
            #REGISTER NONE can also be placed inside the SUB/FUNCTION that contains the inline ASM. This disables register variables for that routine only.
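            A minimal sketch of that placement, assuming a hypothetical function AddOne that is not part of the code in this thread. The directive sits directly below the function header and before any LOCAL declarations, so only this routine gives up its register variables.
            Code:
            FUNCTION AddOne (BYVAL lValue AS LONG) AS LONG
              #REGISTER NONE                 ' applies to this FUNCTION only
              LOCAL lResult AS LONG          ' a plain memory variable
              ! mov eax, lValue              ' EAX = the argument
              ! inc eax                      ' EAX = lValue + 1
              ! mov lResult, eax             ' store the result in memory
              FUNCTION = lResult
            END FUNCTION
            Other SUBs/FUNCTIONs in the same source keep whatever #REGISTER setting is in effect at the top of the program.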
            Scott Slater
            Summit Computer Networks, Inc.
            www.summitcn.com



            • #7
              Originally posted by Michael Mattias
              Code:
              #ALIGN  boundary
              (This just turned into an inline assembly discussion, but I'll return to the problem in a later posting).

              Yes, and no.

              Yes: I am aware of that possibility, and I actually had it in my original posting until just before I posted it, but I removed that and some other stuff to shorten the posting and keep it more to the point. I am planning revisions of most of my assembly code (incl. CC_INSTR()), but PBWin9 and PBCC5 have added so much more for assembly coding than just #ALIGN that I want to look into it before publishing any updates.

              No:
              Code:
              ' Code shown is tested to compile and run with
              ' PB CC4, PBCC5, PB WIN8, PB WIN9. It will probably
              ' work unchanged with many older versions as well.
              #ALIGN would keep the code from compiling with anything but PBCC5 and PBWIN9. And it was a major point to demonstrate that it failed in every version of PB I've tested it with. I could have wrapped it (and at one point I had) in an #IF/#ENDIF clause, but in the end I didn't.

              But this brings forward a topic from my original posting of CC_INSTR() and removed parts from before my first posting in this thread:
              Originally posted by from the source forum
              Code:
               ...and I have attempted to align the code to paragraphs and DWORDs as I see
              fit. This is based on the assumption that a function's code always starts off a
              paragraph. I HAVE NOT VERIFIED THAT YET...
              Well, I am still unable to verify that for sure, but what I have seen would suggest this: If a function starts off a boundary, it is not a paragraph boundary. But if CC_INSTR() is your VERY FIRST function in your source, it seems to get aligned. And such alignment is worth looking for if you switch to assembly in the first place.

              Enter #ALIGN. This is a documented way to get code aligned. Remove the "! PUSHAD" and "! POPAD" instructions, then insert the code shown below (you'll quickly see where - the top and bottom lines overlap). Voilà! You have yourself a fully functional string search function that is >5% faster (if my timings from the source forum still hold true). The code shown below is functionally tested, but not timed at this time.
              Code:
                ' Returnvalues:       0  = sWantedString was not found in sBuffer, or an illegal parameter was detected:
                '                              - lStartPos < 1
                '                              - LEN(sCollateString) <> 256
                '                    >0  = sWantedString is found at that position. 1 = always the beginning of (the leftmost position in ) sBuffer.
                  #REGISTER NONE
                  LOCAL lRetVal AS LONG
                  LOCAL lBufSize AS LONG
                  LOCAL lWantLen AS LONG
                  LOCAL psWantString AS STRING PTR
                    ' Here and there one or more NOP instructions are inserted. This is in order to align the code to
                    ' WORD-, DWORD- or PARAGRAPH-addresses. It has been attempted to collect all NOPs into places where
                    ' they only fill the alignment function and are never executed, but where that has proved to be
                    ' impossible or impractical, the NOPs are scattered around in the preceding code where they can be of
                    ' some use; aligning variable accessing instructions (and others) and to fill in where some register- or
                    ' execution-channel congestion can be thought to be eased. Although the effects of this have not been
                    ' measured, timings more than suggest that the alignment work pays off big time (up to 20% has been measured)
                    ' for long strings, while it doesn't really matter when it comes to short strings. However, I have not experimented
                    ' with more or less alignment code, so other ways may prove to be faster still (e.g. only go for DWORD alignment all over
                    ' as opposed to try to go for PARAGRAPH alignment as I have done here). One has to stop development at some point and say
                    ' "That's it". There will always be something that can be improved upon or experimented with to see if it is better or not.
                    ' Make sure that sCollateString is exactly 256 characters long
                    #IF %DEF(%PB_DLL32)
                      #IF %PB_REVISION > &H8FF
                        #ALIGN 64
                      #ENDIF
                     #ELSE
                      #IF %PB_REVISION > &H4FF
                        #ALIGN 64
                      #ENDIF
                    #ENDIF
                    ! mov EAX, sCollateString                  ' EAX = pointer to sCollateString handle



              • #8
                Here is the lengthy, but as far as I know, complete solution to my problem.
                Code:
                CONCLUSION: #REGISTER NONE is not required for use in SUBs or FUNCTIONs that contain inline assembly code, but it is strongly recommended. If you leave it out, you better know what you are doing!
                After Brian served me a solution for my problem on a silver platter, and thus pointed me in the right direction, I wanted to look into whether it was true that - apparently - in order to get ASM speed, you had to switch off BASIC speed! And, indeed! I was lucky, and found a way to keep both ASM and BASIC speed!

                I have mostly pieced together info from different parts of the official docs. Where info was missing, I found more, or something that gave me an idea, somewhere in the forums. I have tested what I have found/deduced outside the docs as best I can (by experience, I consider the official docs to be reliable. Errors are rare.), but due to the nature of this problem I cannot be completely sure I have chosen situations that will trigger an error I can detect.

                If my reasoning below is wrong, inaccurate, or right - but for the wrong reasons, I trust someone knowledgeable will be so kind as to put forward the correct information.

                ---





                What hampered my original code was register variables. Register variables cannot be accessed like other variables in all circumstances. They are stored - as their name says - in a [CPU] register. E.g. VARPTR(x) will not work (even in BASIC) if x is a register variable(!), although using register variables in calls to other SUBs/FUNCTIONs still works. PB's use of register variables does interfere with ASM code, but exactly how is not clear.
                • #REGISTER DEFAULT and #REGISTER ALL make variables register variables and store them in CPU registers during the execution of your code. That way, accessing them becomes much faster than accessing ordinary variables stored in memory (RAM). There may be up to two integer-type variables and up to four extended-precision floating-point variables.
                • #REGISTER ALL sets out to make all integers and floats register variables - up to the current limit. The first two integer variables defined in a SUB/FUNCTION and the first four floats are assigned to registers, and those remaining are kept as normal memory variables. If, in the future, a CPU with more registers should come along (with a new PB version), more variables could be assigned to registers.
                • #REGISTER DEFAULT sets out to make the first two integer-type variables it finds in each SUB/FUNCTION register variables, and likewise up to four extended-precision floating-point variables. SUBs/FUNCTIONs that make calls to other SUBs/FUNCTIONs will have no floats assigned to registers.
                • #REGISTER NONE switches off the automatic assignment of register variables completely, for the duration of the SUB/FUNCTION where it resides, or for the entire program if entered at the very top of your program. You really don't want that. Register variables improve speed, and speed is exactly why you would consider using assembly in the first place. You don't want to "switch off" speed when speed is what you want! But more important than fast code is reliable code ...
                As it turns out, the use of register variables has the side effect of hampering inline assembly code. But ONLY if the register variables are being accessed from within the assembly code itself!!!! Also, beware that register variables are local to each and every SUB/FUNCTION. They do not "survive" (but can be used for) SUB/FUNCTION calls, and as such are not transferred between functions/subs.

                ---

                So, in the end, there are several ways to go about writing assembly code and making it reliable. There could be even more ways than I have been able to uncover so far:
                1. #REGISTER NONE at the top of your code will enable you to go back and forth between BASIC and ASM as you wish without worries in your entire program. But the overall speed of your program may suffer (see more below).
                2. #REGISTER NONE at the top of the SUB/FUNCTION will enable you to go back and forth between BASIC and ASM as you wish without worries inside that particular SUB/FUNCTION. But the overall speed of your sub/function may suffer (see more below).
                3. Write the ENTIRE SUB/FUNCTION in assembly, using NO LOCAL VARIABLES! (No DIMs, no LOCALs etc.). Now #REGISTER DEFAULT or #REGISTER ALL can find no variables to put in registers, and in effect disables the mechanism.
                4. If one cannot do without variables, one can reserve a stack-frame for such storage (recommended against in the docs). This could be remedied by the use of a single register + numeric-equate offsets (to maintain code readability). However, to be useful, most ASM snippets need input and output - which do require regular variables of some sort. To be universally applicable, this approach would require its own SUB/FUNCTION. That being so, 3) above with local variables and combined with #REGISTER NONE is more viable and just as [speed] efficient.
                5. Write your assembly code in a way that does not access register variables! If you need mixed BASIC and ASM code, this is what you really should be looking for: PB gives you speed for "free", and you want to keep it - but you'll want your ASM code to gain even more speed.
                How do you do that? Assume you leave out any #REGISTER statements (= most programs, I would think). That means #REGISTER DEFAULT is in effect. Read again #REGISTER DEFAULT above... See?
                YOU SIMPLY DEFINE TWO VARIABLES that you do NOT ACCESS from within your assembly code AT THE VERY TOP in your sub/function!

                Alternatively, you use #REGISTER NONE and issue the REGISTER command twice.

                Choose LONG or DWORD variables for your register variables (DWORDs are just as speed-efficient as LONGs inside [CPU] registers). These should be variables you use frequently, such as loop counters and/or array indexes. Such variables are the definitive winners when it comes to register variables! You gain speed both ways! More BASIC speed and clean, reliable and (of course), fast ASM code!

                If you don't have two integer class variables to choose from that you don't access from your ASM code, you define an extra dummy variable - at the top, but below the one you'll actually use elsewhere in your code. That will fool PB into making it a register variable in place of those you access from within your assembly code. (If there are no integer class variables defined that you do not access from your assembly code you simply use #REGISTER NONE).
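                A rough sketch of the dummy-variable idea, assuming #REGISTER DEFAULT is in effect and that the compiler picks the first two integer-class locals as described above. The function TimesTwo and its variable names are hypothetical. Whether an unused dummy really occupies a register is an observation from my tests, not something the docs state.
                Code:
                FUNCTION TimesTwo (BYVAL lCount AS LONG) AS LONG
                  LOCAL i AS LONG            ' 1st integer local: register candidate, used from BASIC only
                  LOCAL lDummy AS LONG       ' 2nd integer local: unused dummy meant to take the other register
                  LOCAL lTotal AS LONG       ' 3rd integer local: stays in memory, accessed from ASM
                  FOR i = 1 TO lCount        ' the loop counter is never touched by the ASM code
                    ! mov eax, lTotal        ' EAX = running total
                    ! add eax, 2             ' add 2 on each pass
                    ! mov lTotal, eax        ' write the total back to memory
                  NEXT i
                  FUNCTION = lTotal          ' 2 * lCount for lCount >= 0
                END FUNCTION
                Only the scratch register EAX is used here, so the loop counter keeps its register whichever one PB assigned.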

                There's not much difference if #REGISTER ALL is in effect. Actually, the only difference is when extended precision floats are made register variables.
                ---

                My original CC_INSTR() example that demonstrated the problem can be fixed by inserting a single dummy variable at the very top. However, I cannot be sure that one dummy is sufficient to keep other data sets from stumbling the way the current one does without any dummy variable. To be sure, two dummy variables are needed if #REGISTER NONE is not in effect. However, since CC_INSTR() is also an all-assembly function, using #REGISTER NONE will not hamper speed.


                ================================================

                <This part has been removed due to new info received that invalidates major parts of it. Leaving it in, would needlessly confuse new readers>

                ================================================

                I want to finish off with one wish for PB:

                Please have the register variable effect on assembly language - and the ways to remedy it, properly visible at the heart of the PB docs. Information that critical to writing reliable inline assembly code should be easily spotted - both in the ASM keyword reference and in the Programmer's Reference. There's a lot of good info for assembly language users in the docs, but this part is missing.

                Admittedly, the #REGISTER NONE effect IS documented if you dig deep enough *) and look in a place somewhat unrelated (but not fully) to this problem: If you read the Programmer's Reference, Inline Assembler's part: "Intermixing ASM and BASIC code" and look at the example, you'll find that it makes use of #REGISTER NONE, with the following comment attached: "Ensure there is no conflict with PowerBASIC Register variables".



                *) This is not to say the documentation is poor! The documentation is really quite good! But it should be read literally more often than not. The language is very "compact" and accurate. In that way, it excels in comparison to most other documentation. But it very often needs you - well, at least me, who does not have English as a native tongue - to read slowly, analyzing the text, in order to grasp the full reach of it.



                ---


                ViH
                Last edited by Vidar Hanto; 30 Mar 2009, 06:52 PM. Reason: Removed info



                • #9
                  Other thoughts

                  Vidar,

                  Glad I helped.

                  A few more thoughts:

                  When using #REGISTER NONE, you can still explicitly specify a register variable

                  Code:
                  register x as long


                  I do not know the algorithm PB uses to decide which variables become register variables, so

                  Code:
                  local x as long
                  changed to

                  Code:
                  local y as long 
                  local x as long
                  The above could result in the application running at an entirely different speed if Y is now held in the register and X is not. Explicitly defining register variables gives control.



                  • #10
                    Brian.

                    After my digging for info on the matter, my impression is that PB automatically takes the first, or the first two, integer variables defined and makes it/them register variable(s).
                    Except where the REGISTER statement has been used explicitly. In that case, variables declared with the REGISTER statement are prioritized over any variables defined before them. But again, if more than two REGISTER variables are defined, then the sequence in which they appear is the deciding factor.
                    My own tests "confirm" that impression - but again, confirm is in quotes because in this area I cannot be absolutely sure that my test scenario will generate a fault I can detect - if an error should occur! The effects have proven that they can be very subtle, and, although the effects are repeatable and thus not random, I have yet to find a way to predict the effects of register variable interference when introducing something new into the test scenario.

                    My impression is based on the following (from the Register statement docs - my boldfacing):
                    Register variables are always local to the Sub or Function where they appear. In the current version of PowerBASIC, there may be up to two integer-class variables (Word/Dword/Integer/Long) and up to four Extended-precision floats. It is possible that future versions of the compiler will change these limits, so you may declare an unlimited number of them. Any "extra" Register variables are automatically reclassified as locals during compilation.
                    The REGISTER statement allows you to choose which variables will be classified as Register variables. If you do not make the choice in a particular Sub/Function, the compiler will attempt to choose for you. By default, the compiler will always assign any integer-class local variables available. Extended-precision float variables will be automatically assigned only in Functions that contain no external Function calls.
                    Add to that the information given by Lance Edmonds back in 2000 (that message was one of the lucky strikes during my info search), which can be found here http://www.powerbasic.com/support/fo...ML/000027.html:
                    Notes:
                    1. The integer variable (xyz%) must NOT be a register variable. This means that you will need to either disable register variables (for that sub/function), or declare a few dummy numeric integer-class variables first, since xyz% must be a memory variable.
                    With #REGISTER NONE not in effect, that would render
                    Code:
                    local x as long
                    Code:
                    #REGISTER NONE
                    register x as long
                    Code:
                    local y as long
                    local x as long
                    all to be register variables. In the last example, to make sure to keep x out of registers - and the others in, you need to do one of the following (assuming that #REGISTER NONE is not in effect when entering the sub/function):
                    Code:
                    local x as long
                    register y as long
                    register z as long
                    Code:
                    local y as long
                    local z as long
                    local x as long
                    Code:
                    #register none
                    local x as long
                    register y as long
                    Note the use of a third variable in the first two of the three examples. z may well be an unused (dummy) variable if your program does not make use of three integer type variables.





                    Note also:
                    • The same logic can be applied to (up to four) extended precision floating variables as well.
                    • In the docs, PB says that the limit of two integer and four ext. float register variables per sub/function may be extended in the future.
                    So, I agree with you. The universal, true, proper and future-proof recommendation should be to have ALL subs/functions (or entire programs) use #REGISTER NONE, and then each sub/function should explicitly define its own register variables by using the REGISTER statement!


                    ViH
                    Last edited by Vidar Hanto; 30 Mar 2009, 09:28 AM. Reason: Typing error



                    • #11
                      > The universal, true, proper and future proof recommendation .....

                      Send those New Feature Suggestions to [email protected]

                      (But only if you'd actually like to see it some day.)
                      Michael Mattias
                      Tal Systems Inc. (retired)
                      Racine WI USA
                      [email protected]
                      http://www.talsystems.com



                      • #12
                        Due to all the above, there remains the sad fact that using assembly code from within a PB macro of any kind is potentially unsafe!
                        Two ways you can make your asm safe always--even in macros--are as follows: 1) Push ebx, esi, and edi before your asm if you are going to use them, then pop them after your asm is complete. 2) Limit your asm to eax, ecx, and edx (the scratch registers) and no register conflicts will occur, because the register variables don't use the scratch registers.
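                        Option 2 might look like the sketch below. SwapLongs is a hypothetical macro, and it assumes both arguments are LONG or DWORD variables. It touches only EAX and EDX, so it overwrites neither ESI nor EDI.
                        Code:
                        MACRO SwapLongs (a, b)
                          ! mov eax, a               ' EAX = old value of a
                          ! mov edx, b               ' EDX = old value of b
                          ! mov a, edx               ' a = old b
                          ! mov b, eax               ' b = old a
                        END MACRO
                        A call such as SwapLongs(lFirst, lSecond) then expands in place inside whichever SUB/FUNCTION uses it.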
                        Last edited by John Gleason; 30 Mar 2009, 11:02 AM. Reason: clarified



                        • #13
                          Two ways you can make your asm safe always--even in macros--are as follows: ...
                          I wish that was true, but it is not correct.

                          That is about the only thing I've been able to conclude with 100% certainty in this area. The proof is in my first posting in this thread: PUSHAD/POPAD doesn't remedy the problem. According to the Intel docs, PUSHAD saves EAX, ECX, EDX, EBX, ESP, EBP, ESI and EDI (in that order), and POPAD restores them in the reverse order, except that the saved ESP value is discarded rather than copied to ESP.
                          The register variable mechanism still interferes with the assembly code. I think this may be related to the fact that, from BASIC code, VARPTR() cannot be used to reference a register variable.

                          Based on that fact, I posted the Macro warning.

                          But when it comes to Windows API calls, your reasoning is correct (e.g. ref. PB docs, The Inline Assembler, Saving registers).


                          ViH



                          • #14
                            Folks,

                            There is a huge amount of misinformation in this thread. I hope that none of our friends here will allow it to influence their programming methods.

                            1- There is nothing unsafe about PowerBASIC ASM. It works as advertised.
                            2- There is nothing unsafe about PowerBASIC Macros. They work as advertised.
                            3- There is nothing unsafe about PowerBASIC Register Variables. They work as advertised.
                            4- PowerBASIC Register Variables do not interfere with ASM code. You can even reference them by name.

                            I've waited almost a week to see an example of these things. None has been provided to PowerBASIC support or here in the forums.

                            The rules (abridged):

                            * Register variables are stored in ESI and EDI. If you use RegVars, don't overwrite those registers, nor the other protected registers. If you are having a problem which involves register variables, here is the reason: You are overwriting the values in ESI and EDI. Either use them as register variables yourself, accessing them by their variable name or their register name (ESI/EDI), ==or== use them for any other purpose, but save/restore the entry values.

                            * If you use RegVars, don't try to use an integer FPU opcode (like FILD) to reference them. The FPU loads from memory only, not CPU registers. If you do, you'll receive a warning exception at compile-time.
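                            A minimal sketch of the save/restore rule, using a hypothetical function SumBytes that is not taken from this thread. ESI is borrowed as a byte pointer, its entry value is put back before the result is stored, and the result is written by variable name only after the restore. That way the store lands correctly whether lSum lives in memory or in a register.
                            Code:
                            FUNCTION SumBytes (BYVAL pData AS DWORD, BYVAL nBytes AS LONG) AS LONG
                              LOCAL lSum AS LONG            ' may be held in ESI or EDI by the compiler
                              ! push esi                    ' save the entry value of ESI
                              ! mov esi, pData              ' ESI = address of the first byte
                              ! mov ecx, nBytes             ' ECX = number of bytes to add up
                              ! xor eax, eax                ' EAX = running total
                              ! xor edx, edx                ' EDX = 0, so DL can be added as a DWORD
                              ! test ecx, ecx
                              ! jle SumDone                 ' nothing to do for nBytes <= 0
                            SumNext:
                              ! mov dl, [esi]               ' fetch one byte
                              ! add eax, edx                ' add it to the total
                              ! inc esi
                              ! dec ecx
                              ! jnz SumNext
                            SumDone:
                              ! pop esi                     ' restore ESI before touching any variable by name
                              ! mov lSum, eax               ' safe for a memory or a register variable
                              FUNCTION = lSum
                            END FUNCTION
                            Storing into lSum before the pop would be the failure case: if lSum happened to live in ESI, the restore would overwrite the result.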

                            The PowerBASIC Assembler functions perfectly. Follow the very simple rules, and it will benefit your code greatly.

                            Have questions??? Ask us! We'll be very pleased to help!

                            Bob Zale
                            PowerBASIC Inc.



                            • #15
                              That is about the only thing I've been able to conclude with 100% certainty in this area. The proof is in my first posting in this thread: PUSHAD/POPAD doesn't remedy the problem.
                              You are correct sir, the registered variables do still interfere. :ashamed: Strike the push/pop solution from my above post. Limiting the registers to eax, ecx, and edx is still a valid technique, but hugely restricts integer asm coding options.

                              The remaining answer then is: #REGISTER NONE either globally or at the procedural level where the macros are used. Then they can be safely used without interference.

                              I forgot to mention that floating point asm also sometimes requires #REGISTER NONE, and I know of no other way around that requirement. So the ultimate bulletproof solution for all asm must be: #REGISTER NONE. Realistically, the loss of compiler register optimization speed will probably be very small compared to the big speed increase your asm will generate. There is no need to abandon any of your prior hard-earned macros imho.

                              Added: and yes, what Mr. Zale succinctly points out above far better than I.
                              Last edited by John Gleason; 30 Mar 2009, 12:57 PM.



                              • #16
                                Originally posted by John Gleason
                                ...the registered variables do still interfere....
                                Sorry, John, but Register Variables do not interfere with assembler code. Integer class register variables are stored in ESI and EDI. If you wish to reference them by variable name, you can do so. If you wish to use those registers for another purpose, you must save and restore them. I would not call that interference... I'd call it priceless.

                                Best regards,

                                Bob Zale
                                PowerBASIC Inc.



                                • #17
                                  Thanks Bob for the clear explanation. Let me just scratch that interference reference if you will. I, for one, now have a much improved understanding of register variables.



                                  • #18
                                    Thank you Bob!

                                    ESI/EDI that's the answer I was looking for all the time!

                                    Where is it documented? How future proof is it?



                                    • #19
                                      Vidar, hope you don't mind a couple kind of off-subject questions: how did you determine the paragraph alignment for your search code? How big is a paragraph and is that size alignment usually optimal?



                                      • #20
                                        John.

                                        I used MASM to decide the NOPs. With that, only the initial code's alignment becomes important from within PB. Now #ALIGN takes care of that.

                                        A paragraph is 16 bytes. It is an expression that has survived since at least DOS and the 8088's segmented addressing. Maybe it is time to stop using that expression. As such, a paragraph-aligned address always has 0 as its last hex digit. That is why I call it "relative offset 0" in my comments to CC_INSTR().
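                                        The arithmetic behind the NOP count can be sketched as below. PadToParagraph is a hypothetical helper, not part of CC_INSTR(); it returns how many filler bytes are needed to reach the next address whose last hex digit is 0.
                                        Code:
                                        FUNCTION PadToParagraph (BYVAL dwAddr AS DWORD) AS LONG
                                          ' dwAddr AND 15 = offset within the current 16-byte paragraph.
                                          ' The outer AND 15 turns a result of 16 into 0 for aligned addresses.
                                          FUNCTION = (16 - (dwAddr AND 15)) AND 15
                                        END FUNCTION
                                        An address ending in hex digit 9 gives 7, and an address ending in 0 gives 0. Passing CODEPTR() of a label or function should tell you where that code actually ended up, assuming the address is taken in the running program.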

                                        No. I am not certain that paragraph alignment is always the right choice. There's another thread just recently started http://www.powerbasic.com/support/pb...ad.php?t=40251 which you have seen, that delves into this as well.

                                        However, paragraph alignment makes sense in that it is half of a Pentium cache block, and that instructions are fetched from memory 8 bytes at a time (64 bit bus). Add to that a 128 byte prefetch queue and branch prediction, then 16 byte alignment fits many ways without growing the code too much. That's Pentium. In Athlon's case (you can see for yourself by following a link in the thread referenced just above) things are a little different: AMD recommends paragraph aligned labels. Spot on.

                                        In general, my experience with code alignment is that the effects vary, but they are never substantial. However, it is there for the taking if you need it. Your mileage will also differ from situation to situation, CPU to CPU (e.g. Intel vs AMD) etc.

                                        Note that if code alignment is not overdone, I have yet to see any LOSS in speed, even though there is more code. Since it is fairly easy to insert the proper number of NOPs, I routinely do that when finishing off an ASM stub that amounts to more than a few lines and contains a loop of some sort, and leave it at that. Until I find that another value is better suited, paragraph aligning labels is my favourite - particularly so if I can insert NOPs that are never executed.


                                        ViH

