Variant array assignment breaks POKE$/PEEK$.

  • Variant array assignment breaks POKE$/PEEK$.

    Please bear with me - long post.

    In working on a project that needs to do lots of array manipulation, I was experimenting with various techniques to quickly copy very large arrays. I initially just used PowerBASIC's POKE$/PEEK$ technique (take the result of PEEK$'ing from a pointer to the source array's first element for the number of bytes in the array, and POKE$ it to a pointer to the destination array).

    I wondered if I could construct a "purer" block copy method that didn't use PEEK$/POKE$, since I wasn't sure if their implementation involved the use of temporary moves that might slow down the copy of very large arrays (it seems pretty fast, but I wasn't sure about memory usage and intermediate copy operations).

    Anyway, I constructed a hybrid BLOCKCOPY routine which uses pointers to very large strings, and basic string assignment statements, which I was hoping compiled to very efficient machine code, and which I was at least fairly certain wouldn't use any memory outside of the source and destination arrays themselves.

    I then discovered that PowerBASIC has the ability to directly "assign" a whole array to a variant, and I decided to experiment with that technique as well (even though I was nearly certain that wouldn't be the fastest method, as any array-copy implementation that uses it would *explicitly* be using additional memory and intermediate copy operations).

    I wound up writing a test program that used all three techniques to repeatedly copy a large array to another array. As the arrays grew in size, I discovered that when they reach a certain size, the PEEK$/POKE$ technique doesn't work anymore, once you have performed a variant assignment. My own BLOCKCOPY routine continues to work just fine, but the PowerBASIC PEEK$/POKE$ method doesn't work anymore once the variant assignment technique has been done even one time.

    Note that the program doesn't blow up or GPF, and I can still continue to use it, but the PEEK$/POKE$, even though it executes without apparent error, doesn't actually do anything - the destination array is just never filled again by this technique.

    I have included the program's source in this posting in hopes that someone can tell me whether they see any problems (bugs) in my code. I don't think I've ever posted any code here, so I'm hoping folks will be gentle if they think I am doing something dumb. It is more than possible that my logic has an obvious flaw that I am somehow just not seeing.

    I've tried to organize and comment the code to make it as easy as possible to follow, since I'm asking for help understanding what is going on with my code.

    TIA for any help or information anyone sees fit to offer.

    Oh, BTW I am using PBCC 5.0 - I'm not sure, but I *think* the array/variant assignment stuff might be new with that version. If that is true, then this code probably won't compile on anything before 5.0.

    Code:
    #COMPILE EXE
    #DIM ALL
    
    '%ELEMENTCOUNT = 98664440   'Largest value for ELEMENTCOUNT where variant/array assignment doesn't break future POKE$/PEEK$ ops.
    %ELEMENTCOUNT = 98664441   'One larger, and MOVEDATA (POKE$/PEEK$) doesn't ever work after the first variant assignment is done
    %ELEMENTLAST = %ELEMENTCOUNT - 1
    %ARRAYBYTES = %ELEMENTCOUNT * 4
    %BLOCKSIZE = 8333375       'Largest value for BLOCKSIZE that works for me without causing a GPF in BLOCKCOPY subroutine.
    
    GLOBAL glArray() AS LONG          'Source array will be filled with random data one time.  It is not changed.
    GLOBAL glArrayCopy() AS LONG      'Copy array will be repeatedly REDIM'd and filled with a copy of glArray's data.
    
    '
    ' Basic method for block copying data - right from the PowerBASIC help file.
    '
    SUB MOVEDATA(BYVAL pDestination AS BYTE POINTER, BYVAL pSource AS BYTE POINTER, BYVAL nBytes AS LONG)
    
        POKE$ pDestination, PEEK$(pSource, nBytes)
    
    END SUB
    
    '
    ' Alternate method, involving assignment of very large fixed-length strings, using only Basic language statements (with
    ' a little help from PowerBASIC's pointers, of course).
    '
    ' Note that except for arrays that are a multiple of %BLOCKSIZE, the final block will be smaller than %BLOCKSIZE, and
    ' therefore cannot use the same block assignment statement, since that would overrun the destination array, as well as
    ' the source array.  Instead, a loop is performed that simply copies the final, smaller block a byte at a time.  Worst
    ' case is that this loop is executed BLOCKSIZE-1 times.
    '
    SUB BLOCKCOPY(BYVAL pDestination AS STRING POINTER * %BLOCKSIZE, BYVAL pSource AS STRING POINTER * %BLOCKSIZE, BYVAL nBytes AS LONG)
    
        DIM iLoops AS LONG,iLast AS LONG,i AS LONG
        DIM pDestinationByte AS BYTE POINTER, pSourceByte AS BYTE POINTER
    
        iLoops = nBytes \ %BLOCKSIZE                         'Calculate the number of complete blocks contained in the source array
        iLast = nBytes MOD %BLOCKSIZE                        'Calculate the number of leftover bytes after all block moves are done
    
        FOR i = 0 TO iLoops - 1                              'Using 0 instead of 1 since we're using i as a pointer offset
            @pDestination[i] = @pSource[i]                   'Move a block of data
        NEXT i
    
        IF iLast > 0 THEN                                    'If there are any leftover bytes to move
            pDestinationByte = VARPTR(@pDestination[iLoops]) 'Make a pointer to where to start in the destination array
            pSourceByte = VARPTR(@pSource[iLoops])           'Make a pointer to where to start in the source array
            FOR i = 0 TO iLast - 1                           'Again, use 0 instead of 1 since i is used as a pointer offset
                @pDestinationByte[i] = @pSourceByte[i]       'Copy 1 byte
            NEXT i
        END IF
    
    END SUB
    
    '
    ' Main function - allocate array storage, initialize array with random data, drop into menu loop to repeatedly use the three
    ' copy techniques to copy the original array to the test copy, comparing the result for correctness each time.
    '
    FUNCTION PBMAIN () AS LONG
    
        REDIM glArray(%ELEMENTLAST)
        REDIM glArrayCopy(%ELEMENTLAST)
    
        DIM i AS LONG, j AS LONG
        DIM iCopyErrors AS LONG
        DIM nSkip AS LONG, sKey AS STRING*1
        DIM vVar AS VARIANT
    
        STDOUT "Filling array..."
        FOR i = 0 TO %ELEMENTLAST
            glArray(i) = RND(1,%ELEMENTLAST)
        NEXT i
    
        DO WHILE -1
    
            sKey = ""
            DO WHILE INSTR("123Q",sKey) = 0
    
                STDOUT
                STDOUT "1 = Basic statements only (BLOCKCOPY)"
                STDOUT "2 = PowerBASIC POKE$/PEEK$ technique (MOVEDATA)"
                STDOUT "3 = Assignment to/from a Variant"
                STDOUT "Q = Quit"
                STDOUT
                STDOUT "Press a key for one of the choices above..."
                sKey = UCASE$(WAITKEY$)
    
            LOOP
    
            STDOUT
    
            REDIM glArrayCopy(%ELEMENTLAST)
    
            SELECT CASE sKey
    
                CASE "1"
                    STDOUT "Making a copy of the array using BLOCKCOPY..."
                    BLOCKCOPY VARPTR(glArrayCopy(0)), VARPTR(glArray(0)), %ARRAYBYTES
    
                CASE "2"
                    STDOUT "Making a copy of the array using MOVEDATA..."
                    MOVEDATA VARPTR(glArrayCopy(0)), VARPTR(glArray(0)), %ARRAYBYTES
    
                CASE "3"
                    STDOUT "Making a copy of the array using variant assignment..."
                    vVar = glArray()
                    glArrayCopy() = vVar
    
                CASE "Q"
                    EXIT DO
    
            END SELECT
    
            STDOUT "Comparing the two arrays..."
            nSkip = 0
            iCopyErrors = 0
            FOR i = 0 TO %ELEMENTLAST
                IF glArray(i) <> glArrayCopy(i) THEN
                    INCR iCopyErrors
                    IF NOT nSkip THEN
                        STDOUT USING$("Error: glArrayCopy(#) = #_, but glArray(#) = #", i, glArrayCopy(i), i, glArray(i))
                        sKey = UCASE$(WAITKEY$)
                        IF sKey = "Q" THEN EXIT FOR
                        IF sKey = "S" THEN nSkip = -1
                    END IF
                END IF
            NEXT i
    
            IF i < %ELEMENTCOUNT THEN
                STDOUT "Comparison stopped with " & STR$(iCopyErrors) & " copy errors detected."
            ELSE
                STDOUT "Comparison complete with " & STR$(iCopyErrors) & " copy errors detected."
            END IF
    
        LOOP
    
    END FUNCTION
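
    One check that might narrow this down, sketched under the untested assumption that the failure happens in the temporary string that PEEK$ builds and not in POKE$ itself: hold that string in a variable and compare its length with the requested byte count. MOVEDATACHECKED is a hypothetical name, not part of the program above; it takes the same arguments as MOVEDATA and can be called in its place.

    Code:
    SUB MOVEDATACHECKED(BYVAL pDestination AS BYTE POINTER, BYVAL pSource AS BYTE POINTER, BYVAL nBytes AS LONG)

        DIM sTemp AS STRING

        sTemp = PEEK$(pSource, nBytes)                       'Keep the intermediate string so its length can be inspected
        IF LEN(sTemp) <> nBytes THEN                         'The string is not the size that was asked for
            STDOUT "PEEK$ returned" & STR$(LEN(sTemp)) & " bytes; expected" & STR$(nBytes)
        ELSE
            POKE$ pDestination, sTemp                        'Same copy that MOVEDATA performs
        END IF

    END SUB

    If the message shows up only after a variant assignment has been done, the intermediate string was not created at the requested size, which would be consistent with the destination array never being filled.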

  • #2
    Using a variant as temporary storage for an array is extraordinarily inefficient since it requires creation of a SafeArray. Avoid it like the plague.

    Bob Zale
    PowerBASIC Inc.



    • #3
      And try to use pointers instead of PEEK$/POKE$ these days. Much safer and more efficient. But even avoid that, just as Bob says.
      Barry



      • #4
        Test program does not show what you are really trying to accomplish, but..

        Do you really want a COPY of the array? Or, do you want to be able to refer to EITHER array and by doing so be looking at the same data?

        If the latter, all you need to do is set up an absolute array:

        Code:
          REDIM glArrayCopy(%ELEMENTLAST) AT VARPTR (glArray(0))
        I use this technique to create an array which is a SUBSET of larger arrays, e.g.:

        I load a 'master' array for <n> sets of data, each set with a variable number of elements. I keep track of the subscript where each new set starts in master(), and the number of elements in that set.

        When I need only the elements for a particular set, I just..

        Code:
         REDIM WorkArray (nElements(setNo) -1) AT  VARPTR (masterArray(starting_subscript)) 
        
         CALL FunctionWhichWantsArrayForThisSetOnly  (WorkArray())
        Now my receiving function gets an array consisting only of the elements of MasterArray() which belong to set 'setno'
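
        A minimal, self-contained sketch of the same idea on a ten-element array, assuming nothing beyond the AT clause used above. The names lMaster and lWork are only for illustration. It also shows that an absolute array is an overlay and not a copy: a write through lWork lands in lMaster.

        Code:
          #COMPILE EXE
          #DIM ALL

          FUNCTION PBMAIN () AS LONG

              DIM lMaster(9) AS LONG
              DIM i AS LONG

              FOR i = 0 TO 9
                  lMaster(i) = i * 10
              NEXT i

              'Overlay a 3-element array on master elements 4 through 6; no data is copied
              DIM lWork(2) AS LONG AT VARPTR(lMaster(4))

              lWork(0) = -1                                  'Writes through to lMaster(4)

              STDOUT "lMaster(4) =" & STR$(lMaster(4))
              STDOUT "lWork(2)   =" & STR$(lWork(2))         'Same storage as lMaster(6)

          END FUNCTION
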

        It's a thought.

        MCM
        Michael Mattias
        Tal Systems (retired)
        Port Washington WI USA
        http://www.talsystems.com



        • #5
          You'll need MoveMemory() API

          MoveMemory( Byval pTarget, Byval pSource, Length )
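
          As a rough sketch of how that call could be wired into a test, assuming the routine is declared locally instead of being taken from WIN32API.INC (whose declaration may differ between versions): ApiMoveMemory is a hypothetical local name for the RtlMoveMemory export of KERNEL32.DLL, with both addresses declared BYVAL so that VARPTR results can be passed directly.

          Code:
          #COMPILE EXE
          #DIM ALL

          'Hypothetical local declaration; WIN32API.INC ships its own MoveMemory/CopyMemory definitions
          DECLARE SUB ApiMoveMemory LIB "KERNEL32.DLL" ALIAS "RtlMoveMemory" (BYVAL pDestination AS DWORD, BYVAL pSource AS DWORD, BYVAL nBytes AS DWORD)

          FUNCTION PBMAIN () AS LONG

              DIM lSource(9) AS LONG, lCopy(9) AS LONG
              DIM i AS LONG

              FOR i = 0 TO 9
                  lSource(i) = i * i
              NEXT i

              'Copy 10 LONGs (4 bytes each) in a single call
              ApiMoveMemory VARPTR(lCopy(0)), VARPTR(lSource(0)), 10 * 4

              FOR i = 0 TO 9
                  IF lCopy(i) <> lSource(i) THEN STDOUT "Mismatch at element" & STR$(i)
              NEXT i
              STDOUT "Copy finished."

          END FUNCTION
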
          hellobasic



          • #6
            My question to the group remains - can anyone explain why assigning a large array to a VARIANT and then assigning that VARIANT to a second array causes all subsequent POKE$/PEEK$-based copies from the first array to the second to silently fail?

            Thanks for the replies. I've used the standard @someone notation below to reply to individuals with this one post.

            @All others

            I think some may have missed the point I was making about a potential problem with the PBCC implementation of VARIANT/array assignment.

            There is no question that this method is inefficient for my purposes, but, if the reference to it is done correctly from a coding point of view, it should not break something.

            @Bob Zale

            Thanks very much for your reply. I am aware of the dangers of VARIANT assignment. This program was mostly just a recreational attempt to try and understand exactly how inefficient, relatively speaking, VARIANT assignment is. In all truth, this is the first and only time I've ever even tried using a VARIANT in a PowerBASIC program.

            However, regarding your comments as to efficiency:

            I have changed my program to calculate the average copy time for 10 copies of a 50-million element LONG array, using each of the 3 techniques given, and discovered that, although the VARIANT assignment is always the slowest, it isn't dramatically slower. Average times on my machine hover around the .55 second mark for ~50-million +/- 10% (exact count chosen randomly) LONGS using VARIANT assignment, and around a .5 second average for the POKE$/PEEK$ technique. With careful selection of the blocksize, my pointer-based BLOCKCOPY routine, using a blocksize of 1-million bytes, can very consistently beat the POKE$/PEEK$ technique by about .15 seconds, achieving an average copy time of about .35 seconds.
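
            For anyone who wants to repeat the measurement, a timing loop along these lines should do. This is a simplified sketch and not the exact test program behind the numbers above: it uses a fixed element count instead of a randomly chosen one, times only the POKE$/PEEK$ method, and the names %TESTCOUNT and %TESTELEMENTS are made up for the example. TIMER wraps at midnight, which the sketch ignores.

            Code:
            #COMPILE EXE
            #DIM ALL

            %TESTCOUNT    = 10                               'Number of copies to average over
            %TESTELEMENTS = 50000000                         'Fixed element count (assumption; see note above)

            FUNCTION PBMAIN () AS LONG

                DIM lSource(%TESTELEMENTS - 1) AS LONG, lCopy(%TESTELEMENTS - 1) AS LONG
                DIM n AS LONG
                DIM dStart AS DOUBLE, dTotal AS DOUBLE

                FOR n = 1 TO %TESTCOUNT
                    dStart = TIMER
                    POKE$ VARPTR(lCopy(0)), PEEK$(VARPTR(lSource(0)), %TESTELEMENTS * 4)
                    dTotal = dTotal + (TIMER - dStart)       'Accumulate elapsed seconds for this copy
                NEXT n

                STDOUT "Average copy time:" & STR$(dTotal / %TESTCOUNT) & " seconds"

            END FUNCTION

            Swapping the POKE$/PEEK$ line for one of the other copy methods gives the corresponding average for that method.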

            Seems to me that the major drawback of the VARIANT assignment technique is more about unnecessary memory usage than about raw speed, since it really isn't *that* much slower than the POKE$/PEEK$.

            The winner seems to be the pointer-based method, as far as using only the features of PowerBASIC.

            @Edwin Knoppert

            I will check out the MoveMemory() API right away, and add it to my tests. I wouldn't be at all surprised if this turned out to be the absolute fastest method. Thanks for your suggestion. I hadn't even thought about checking out the Windows API.

            (EDIT) Edwin - MoveMemory is the winner by a wide margin! The CopyMemory MACRO (in the WIN32API.INC file) uses the same syntax as my BlockCopy routine. This method copies 50 million LONGs in about .11 seconds. The best I have been able to do with pointers and fussy selection of block sizes is .33 seconds. The MoveMemory API wins by a factor of 3!

            @Michael Mattias

            Yes - for my application, I need to maintain a separate copy of the original array, which is then compared for changes to the original array. I need to be able to consistently return the original array to its initial (populated, not REDIM'd) state after each set of changes. Your technique is interesting and looks very useful for certain situations, but is not applicable to my needs, if I am understanding it correctly. Thanks much for your reply.

            @Barry Erick

            Concerning pointers - yes, that's exactly why I decided to write the pointer-based BLOCKCOPY routine in the program I posted, and, in fact, as expected, it's the fastest method I've found so far.
            (EDIT) - pointers lose out to the Windows API - Windows' CopyMemory macro (in the WIN32API.INC file) is 3 times as fast as my best attempt with pointers and block copies using assignments to very long strings.

            Thanks to all for replying. I am still hoping for an explanation for the VARIANT/array assignment behavior which silently breaks subsequent POKE$/PEEK$ array copies.
            Last edited by Patrick Mills; 9 Jan 2009, 10:46 AM.



            • #7
              Yes - for my application, I need to maintain a separate copy of the original array, which is then compared for changes to the original array. I need to be able to consistently return the original array to its initial (populated, not REDIM'd) state after each set of changes
              Based on the amount of process memory that requires to keep the original array in memory, I'd take a hard look at my design.

              While I am a self-confessed lover of memory-mapped files, I think an MMF might work well here. Save the original array in a disk file... since it's all LONG integers, the records are all the same size (SIZEOF(LONG), a syntax not added to CC5).

              When compare time comes, you just memory map and view the file, and refer to individual long integers by offset from the start of the file, or use the mapped view of the MMF as the underlying memory for an absolute array, a technique shown here: Memory Mapped Files instead of RANDOM disk file access 5-8-04
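
              A bare-bones sketch of the absolute-array variant, not the linked demo code. It assumes the Win32 declarations and equates in WIN32API.INC accept the BYVAL %NULL overrides shown, and the file name original.dat is hypothetical: a file that already holds the saved array as raw 4-byte LONGs.

              Code:
              #COMPILE EXE
              #DIM ALL
              #INCLUDE "WIN32API.INC"

              FUNCTION PBMAIN () AS LONG

                  LOCAL hFile AS DWORD, hMap AS DWORD, pView AS DWORD
                  LOCAL nElements AS LONG
                  LOCAL szFile AS ASCIIZ * 260

                  szFile = "original.dat"                    'Hypothetical file holding the saved LONG array
                  hFile = CreateFile(szFile, %GENERIC_READ, %FILE_SHARE_READ, BYVAL %NULL, _
                                     %OPEN_EXISTING, %FILE_ATTRIBUTE_NORMAL, BYVAL %NULL)
                  IF hFile = %INVALID_HANDLE_VALUE THEN EXIT FUNCTION

                  nElements = GetFileSize(hFile, BYVAL %NULL) \ 4    'Each record is one 4-byte LONG
                  hMap = CreateFileMapping(hFile, BYVAL %NULL, %PAGE_READONLY, 0, 0, BYVAL %NULL)
                  IF hMap THEN
                      pView = MapViewOfFile(hMap, %FILE_MAP_READ, 0, 0, 0)   'Map the whole file
                      IF pView THEN
                          DIM lOriginal(nElements - 1) AS LONG AT pView       'Absolute array over the mapped view
                          STDOUT "First saved element:" & STR$(lOriginal(0))
                          UnmapViewOfFile BYVAL pView
                      END IF
                      CloseHandle hMap
                  END IF
                  CloseHandle hFile

              END FUNCTION

              The compare loop would then read lOriginal() in place of the in-memory copy; mapping the file in pieces, as mentioned under the advantages, would need nonzero offsets and sizes in the MapViewOfFile call.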


              Advantages:
              1. Memory for the original array is now charged to Windows, not to your process.
              2. If you are low on the TOTAL memory available, you can MapViewOfFile in pieces for the compare, reducing RAM usage even more.
              3. You are not using valuable system memory for your process except when you need it, which should enhance performance for all processes on the system when you are not using it. If you want to verify the memory use improvement, you can always insert checks using Add Process Memory Usage Report to any program 1-12-04

              Disadvantages:
              1. You have to invest the time necessary to copy and paste my demo code and read the doc on MMFs. Wait a minute, reading up and learning about MMFs is an advantage, not a disadvantage!

              MCM
              Last edited by Michael Mattias; 10 Jan 2009, 10:59 AM.
              Michael Mattias
              Tal Systems (retired)
              Port Washington WI USA
              http://www.talsystems.com
