Announcement

Collapse
No announcement yet.

ARRAY SORT Performance

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • ARRAY SORT Performance

    I have been touting the merits of PowerBasic for some time at work.
    I got my copies of DLL 6.0 can CC 2.0 and have been playing with them at home.
    I wanted to show off at work so I brought everything in.
    I compiled the example VBSORT.DLL and passed it around.

    They asked me to code up and example of how to use it.
    I sorted 48100 strings of 194 characters each in 13 seconds.
    The VB 5.0 program reads in the array and passes it to the DLL for sorting.
    I thought that the low performance must be due to DLL loading and unloading.
    I put a button on my form to start the process so that I could do it several times without stopping the program.

    I was then asked to compare with a pure VB routine to sort the array.
    Imagine my shock when QSORT sorted the array in 3 seconds.

    What did I do wrong? Why was VB so much faster?
    Help! I have egg on my face BIG TIME.


    ------------------
    ATB

    Charles Kincaid

  • #2
    Loading and unloading DLLs does involve overhead. What you're most likely running
    into here is a perversity of VB's string handling. 32-bit versions of VB use Unicode
    strings internally, but convert to/from the more common ANSI string format when
    passing strings to DLLs. Major, major overhead here.

    If practical, avoid the step of having VB read in the array, and do that directly in
    PowerBASIC. That will avoid the conversion process.

    If your code isn't too large, feel free to post it here, and perhaps we can come up
    with other suggestions.


    ------------------
    Tom Hanlin
    PowerBASIC Staff

    Comment


    • #3
      Originally posted by Tom Hanlin:
      If practical, avoid the step of having VB read in the array, and do that directly in
      PowerBASIC. That will avoid the conversion process.
      Or do your time calculation (and MsgBox ?) in the dll procedure.


      [This message has been edited by Peter Scheutz (edited April 23, 2001).]
      Best Regards
      Peter Scheutz

      Comment


      • #4
        found the following in poffs, but not here. maybe moved to the archives?
        anyway, some tips from dave navarro, that may be of help:
        ----------------------------------------------------------------
        'message
        'forum: powerbasic for windows
        'topic: vb string array tip
        'name: dave navarro, member
        'date: july 30, 1999 03:33 pm

        if you are passing a visual basic string array to a powerbasic dll, you may
        have noticed that it can take vb some time to convert all those unicode strings
        into ansi and then back again.

        the most efficient way to handle string arrays is to actually dim them as
        global in your dll where the entire array doesn't ever have to be converted.
        then you simply ask the dll for individual strings in the array when you
        actually need them.

        however, sometimes that's not practical. you must have the array in vb.

        lets say that you need to sort a 30,000 element string array. a quick-sort
        routine in vb code sorts the array in about 8 seconds. array sort in
        powerbasic does it in 2 seconds. but, when you add the conversion from unicode
        to ansi (before the sort) and ansi back to unicode (after the sort) the total
        time for the call to pb/dll is 12 seconds. you lose 4 more seconds instead of
        gaining. do you give up pb/dll? maybe not...

        if you dimension the array starting at element zero and start your data at
        element one, you can pass *just* element zero to your dll as a single string by
        reference. vb then only converts that single string from unicode to ansi, not
        the entire array. the remaining elements remain in unicode format.

        array sort in powerbasic doesn't care what format the data in each string is
        in. it does a binary sort, so using array sort on an array of unicode strings
        works the same as if the strings had all been converted to ansi.

        first, our powerbasic code:

        sub sortstrings(byval dummystring as long) export
        'dummystring is a pointer to the element zero which is never actually used
        'we add four to that (handles are four bytes long) to get the address of the
        'first valid string
        dummystring = dummystring + 4

        'then we dimension a powerbasic string array at the same location in memory
        'as the vb array
        dim a(1 to 30000) as string at dummystring

        'now we can sort all the unicode strings
        array sort a()

        'and we're done
        end sub

        in visual basic we declare our sub:

        declare sub sortstrings lib "our.dll" alias "sortstrings" (byref dummy$)

        we dimension our visual basic array starting at element zero:

        dim a$(0 to 30000)

        and when we're ready to sort the array, we simply pass the unused element zero:

        call sortstrings(a$(0))

        and now our array is sorted. vb only converted the unused element zero to
        ansi, not the whole array. so instead of 12 seconds, we sort the array in 2.

        i should note that i have only tested this on machines with the english version
        of windows installed. i don't know if the same technique would work with other
        languages.

        --dave


        ------------------

        Comment


        • #5
          Ah Ha! That would explain it. I'll recode the DLL example to have
          it pass the Lbound and Ubound along with the address of the Lbound
          element. That should avoid the conversions.

          Thanks for the tip. I'll post the results when done.


          ------------------
          ATB

          Charles Kincaid

          Comment


          • #6
            Borje,

            Thanks for digging out that tidbit of info! That will certainly
            speed up some of VB code as well.



            ------------------
            Paul Squires
            mailto:[email protected][email protected]</A>
            Paul Squires
            FireFly Visual Designer (for PowerBASIC Windows 10+)
            Version 3 now available.
            http://www.planetsquires.com

            Comment


            • #7
              Why is it that Dave Navarro's code (shown in Borje's post) increments
              dummyString by 4 when unicode uses 2 bytes per character? It seems
              like dummyString should be incremented by 2 in order for it to point
              to a(1). Could someone please clarify this for me?

              dummyString = dummyString + 4

              ------------------


              [This message has been edited by Charles Dietz (edited April 23, 2001).]

              Comment


              • #8
                Charles,

                The answer is in Dave's post.

                The array is actually a DWORD/LONG array of POINTERS (handles) to
                the actual strings. So, a$(0) is a pointer to the string for a$(0),
                a$(1) is a pointer to the string for a$(1), etc.

                DWORDS/LONGS are four bytes in size.



                ------------------
                Bernard Ertl
                Bernard Ertl
                InterPlan Systems

                Comment


                • #9
                  I wrote a program in CC to read in my file and sort it.
                  The problem then turned out to be that I needed a time display that
                  displays fractional seconds.

                  100,000 lines of 194 characters each sorts in 1.35 seconds.

                  Now I need a way to find out the path and name of the
                  executable that I am running. In VB it was easy.
                  App.Path gave me the path and App.ExeName gives me the file name.

                  Anybody know the PB equivalent?


                  ------------------
                  ATB

                  Charles Kincaid

                  Comment


                  • #10
                    Charles,

                    These came from the PB community, but I don't remember where.

                    Hope they help.

                    Regards,
                    --Bob

                    '==============================================================================
                    FUNCTION GetFullExePath() AS STRING
                    'Returns full path and App name.
                    ' usage: ret$ = GetFullExePath()
                    ' ret$ = c:\bobs\bobs.exe
                    LOCAL TmpAsciiz AS ASCIIZ * %MAX_PATH
                    GetModuleFileName GetModuleHandle(BYVAL 0&), TmpAsciiz, 255
                    FUNCTION = TmpAsciiz
                    END FUNCTION
                    '==============================================================================
                    FUNCTION GetFullAppPath() AS STRING
                    'Returns full App path.
                    ' usage: ret$ = GetFullAppPath()
                    ' ret$ = c:\bobs\
                    LOCAL TmpAsciiz AS ASCIIZ * %MAX_PATH
                    GetModuleFileName GetModuleHandle(BYVAL 0&), TmpAsciiz, 255
                    FUNCTION = LEFT$(TmpAsciiz, INSTR(-1, TmpAsciiz, "\"))
                    END FUNCTION
                    '==============================================================================
                    FUNCTION GetFullAppName() AS STRING
                    'Returns full App name.
                    ' usage: ret$ = GetFullappName()
                    ' ret$ = bobs.exe
                    LOCAL TmpAsciiz AS ASCIIZ * %MAX_PATH
                    GetModuleFileName GetModuleHandle(BYVAL 0&), TmpAsciiz, 255
                    FUNCTION = MID$(TmpAsciiz, INSTR(-1, TmpAsciiz, "\") + 1)
                    END FUNCTION
                    '==============================================================================


                    ------------------
                    "It was too lonely at the top".

                    Comment


                    • #11
                      Those will work, but I'd suggest:
                      For 255, substitute %MAX_PATH.
                      For GetModuleHandle(BYVAL 0&), just use 0 (zero).

                      ------------------
                      Tom Hanlin
                      PowerBASIC Staff

                      Comment


                      • #12
                        Originally posted by Bob Houle:
                        Charles,

                        These came from the PB community, but I don't remember where.

                        Hope they help.

                        Regards,
                        --Bob
                        Thanks Bob, they did help. I'll go back and put in Tom's enhancment.
                        Thanks Tom. %MAX_PATH might be smaller than 255. If so, boom!

                        I never did get the code from Dave Navarro to work. Guess I'll have to write
                        everything in PB from now on.



                        ------------------
                        ATB

                        Charles Kincaid

                        Comment


                        • #13
                          %MaxPath = 260

                          Hardcoding 255 as a path limit is the real dangerous behavior!

                          ------------------
                          Lance
                          PowerBASIC Support
                          mailto:[email protected][email protected]</A>
                          Lance
                          mailto:[email protected]

                          Comment

                          Working...
                          X