Very slow VIRTUAL arrays

  • Very slow VIRTUAL arrays

    I have a question about the speed of operations with VIRTUAL arrays.
    The following test code, using static arrays, executed on my system in less
    than 0.05 seconds:

    DIM a?(32000), b?(32000)
    CLS
    t! = TIMER
    FOR j% = 1 TO 10
    FOR i% = 0 TO 32000
    IF a?(i%)<>b?(i%) THEN x = 0
    NEXT i%
    NEXT j%
    PRINT USING "##.####"; TIMER-t!

    The following code, which differs from the above only in the VIRTUAL keyword,
    took 13.5 seconds to execute, which is 270 times slower:

    DIM VIRTUAL a?(32000), b?(32000)
    CLS
    t! = TIMER
    FOR j% = 1 TO 10
    FOR i% = 0 TO 32000
    IF a?(i%)<>b?(i%) THEN x = 0
    NEXT i%
    NEXT j%
    PRINT USING "##.####"; TIMER-t!

    This last code, which performs the same operation reading data from disk,
    took 5.2 seconds:

    OPEN "TEST1" FOR BINARY AS #1
    OPEN "TEST2" FOR BINARY AS #2
    CLS
    t! = TIMER
    FOR j% = 1 TO 10
    FOR i% = 0 TO 32000
    GET #1, ,a?
    GET #2, ,b?
    IF a?<>b? THEN x = 0
    NEXT i%
    NEXT j%
    PRINT USING "##.####"; TIMER-t!
    CLOSE

    So this is my question: I understand that comparisons between two different
    VIRTUAL arrays require multiple memory remapping steps. But I never
    imagined that these additional steps would make the code 270 times slower than
    with static arrays, and even slower than reading from the hard disk! Is there
    no way to make access to VIRTUAL arrays faster?

    Hans Ruegg

  • #2
    The effect of page swapping is certainly going to be large when you consider that EMS paging must occur for every single comparison in your loop. Paging takes a finite amount of time because the EMS manager must swap at least one EMS "page" of 4K or more. Compared to array access in conventional memory, where there are no comparable overheads, this will be much slower.
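
    To put the timings from the first post in perspective, here is a rough back-of-the-envelope sketch. It assumes that every array read in the test loop counts as one access (10 passes x 32001 elements x 2 arrays), and the variable name n& is made up for illustration. The static figure is an upper bound, since that run was reported only as "less than 0.05 seconds".
    Code:
    ' Rough per-read cost, using the timings reported in the first post
    n& = 10& * 32001& * 2&     ' 640,020 element reads in total
    PRINT USING "VIRTUAL ###.## microseconds per read"; 13.5 / n& * 1000000
    PRINT USING "Disk    ###.## microseconds per read"; 5.2 / n& * 1000000
    PRINT USING "Static  ###.## microseconds per read"; 0.05 / n& * 1000000

    That works out to roughly 21 microseconds per VIRTUAL read, about 8 for the disk version, and under 0.1 for the static arrays.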

    Virtual arrays do work, but if you want performance then you'll need to revise your approach to this type of problem.

    The most effective technique is to copy "large" blocks of memory from EMS into conventional memory and work with these blocks without the overheads of EMS. My approach would be to allocate a fixed length string array in EMS and copy this section by section into conventional memory to work with.

    To further increase performance with this block of memory, we can add in the use of indexed pointers... they can be used to speed up all sorts of code quite dramatically if applied correctly.

    The following is my optimized version of your code, which executes in a fraction of a second.
    Code:
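    ' 32001 bytes each, to match DIM a?(32000) (elements 0 to 32000)
    DIM VIRTUAL a(0) AS STRING * 32001
    DIM VIRTUAL b(0) AS STRING * 32001
    DIM ap AS BYTE PTR
    DIM bp AS BYTE PTR

    CLS
    t! = TIMER
    ' Copy each block from EMS into a conventional-memory string once
    a$ = a(0)
    b$ = b(0)
    ' Point at the string data, then index it as bytes
    ap = STRPTR32(a$)
    bp = STRPTR32(b$)
    FOR j% = 1 TO 10
      FOR i% = 0 TO 32000
        IF @ap[i%] <> @bp[i%] THEN x = 0
      NEXT i%
    NEXT j%
    PRINT USING "##.####"; TIMER-t!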
    ------------------
    Lance
    PowerBASIC Support
    mailto:[email protected]

    • #3
      The loss of speed when using VIRTUAL arrays is indeed tremendous, which makes them of hardly any use (for me). I ran into the slow memory access when applying a filter effect to an image stored in a virtual array.

      As I understand it, each read or write operation on a virtual array transfers a chunk of 16 KB. That's the way EMS is accessed. So it's best to read and write large chunks at a time...

      ------------------
      Sebastian Groeneveld
      mailto:[email protected]

      • #4
        Lance,

        I am experimenting with your technique to speed up one
        of my programs that keeps a lot of data in virtual arrays.

        If I understand you correctly, we can use 32-bit pointers
        to get data out of EMS directly and the program will not
        do any of that page swapping to conventional memory.
        Is that right?

        In your example you get a pointer value outside the loop
        ( ap = strptr32(a$) ) Can you take a pointer, evaluate
        it once to the array start point, and then increment it
        yourself in a loop? Will it eventually run into a segment
        boundary and become invalid?

        This is an example of the situation:

        TYPE atype
        L AS LONG
        R AS SINGLE
        END TYPE
        DIM VIRTUAL VV(20000) AS atype
        DIM PL as LONG PTR
        DIM PX as LONG PTR

        FOR j& = 0 TO 15000
        VV(j&).L = j&
        NEXT j&

        PL = varptr32(VV(0))
        FOR j& = 0 TO 15000
        PX = varptr32(VV(j&))
        K& = @PX
        L& = @PL
        IF j&<10 THEN : PRINT j&,PX,PL,K&,L&
        IF L& <> j& THEN
        PRINT "mismatch"
        PRINT j&,PX,PL,K&,L&
        EXIT FOR
        END IF
        INCR PL,8
        NEXT j&

        To get the correct value of VV( ) seems to require
        using @PX which is from a VARPTR32( ) evaluated inside the loop
        rather than my @PL which comes from incrementing a starting
        pointer value.



        ------------------

        [This message has been edited by Larry Shelton (edited May 26, 2000).]


        • #5
          [quote]If I understand you correctly, we can use 32-bit pointers
          to get data out of EMS directly and the program will not
          do any of that page swapping to conventional memory.
          Is that right?[/quote]
          Yes, you can, but you cannot guarantee that the EMS page will not be switched if you reference other EMS array elements, or if some other EMS-related event occurs, etc. Therefore, the most reliable way is to copy sections of the virtual array into conventional memory, where you can control it, use pointers, etc. This is my reason for using 32-kbyte strings, which I treat as a collection of BYTEs. By using strings in this way, you can move large chunks of data between conventional and virtual memory with ease. Of course, if performance is not an issue, just work with the elements directly through the PB RTL.

          [quote]In your example you get a pointer value outside the loop
          ( ap = strptr32(a$) ) Can you take a pointer, evaluate
          it once to the array start point, and then increment it
          yourself in a loop? Will it eventually run into a segment
          boundary and become invalid?[/quote]
          Yes, segments must be respected, but if you "normalize" a pointer before using it, you'll have to exceed 64K to cross the next segment boundary. For example, a HUGE conventional array will often have some space at the segment boundary that is not part of the array, and indexed pointer code will need to take this into account.

          In my example, I used a dynamic string to hold a conventional memory copy of the virtual memory data - dynamic strings can only be 32750 bytes or less, and therefore they represent no problem at all.
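
          For data larger than a single dynamic string can hold, the same idea can be repeated block by block. What follows is only a sketch under stated assumptions: the array names va and vb, the block count, and the 16384-byte block size are all made up for illustration, and the code has not been benchmarked.
          Code:
          ' Hypothetical layout: 4 blocks of 16384 bytes per virtual array
          DIM VIRTUAL va(3) AS STRING * 16384
          DIM VIRTUAL vb(3) AS STRING * 16384
          DIM ap AS BYTE PTR
          DIM bp AS BYTE PTR

          diffs& = 0
          FOR blk% = 0 TO 3
            ' Copy one block of each array into conventional memory
            a$ = va(blk%)
            b$ = vb(blk%)
            ' Re-take the pointers after every copy, since the string data may move
            ap = STRPTR32(a$)
            bp = STRPTR32(b$)
            FOR i% = 0 TO 16383
              IF @ap[i%] <> @bp[i%] THEN INCR diffs&
            NEXT i%
          NEXT blk%
          PRINT "Differing bytes:"; diffs&

          The virtual arrays are touched only twice per block, and the inner loop runs entirely on the conventional-memory copies.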

          [quote]This is an example of the situation:
          TYPE atype
          L AS LONG
          R AS SINGLE
          END TYPE
          <snip>[/quote]
          Your code will fail because it does not respect the segment and page boundaries of the EMS page frame. Use this approach at your own risk!

          [quote]To get the correct value of VV( ) seems to require
          using @PX which is from a VARPTR32( ) evaluated inside the loop
          rather than my @PL which comes from incrementing a starting
          pointer value.[/quote]
          Exactly! This is because the PB RTL will give you the address of the array element, but an address that was requested several thousand loop iterations ago will likely be invalid, because you cannot determine when the EMS manager will perform an EMS page swap. Again, use this approach at your own risk. The only way to ensure success in all circumstances is to copy the data to conventional memory, or just work with the array elements in pure BASIC code instead.
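
          For the test loop from the previous post, the "pure BASIC" route might look like the sketch below. It reuses the TYPE and array from that post and simply lets the PB RTL resolve every element. It is meant as the safe version, not the fast one.
          Code:
          TYPE atype
            L AS LONG
            R AS SINGLE
          END TYPE
          DIM VIRTUAL VV(20000) AS atype

          FOR j& = 0 TO 15000
            VV(j&).L = j&
          NEXT j&

          ' No stored pointers: each access goes through the array itself,
          ' so it does not depend on which EMS page is currently mapped
          FOR j& = 0 TO 15000
            IF VV(j&).L <> j& THEN
              PRINT "mismatch at"; j&
              EXIT FOR
            END IF
          NEXT j&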

          In summary: Performance can be had but the price is more code - just like any other programming problem, more code can often result in better performance!

          OTOH, if performance is not critical, then don't worry - virtual arrays will work fine as they are!


          ------------------
          Lance
          PowerBASIC Support
          mailto:[email protected]