Announcement

Collapse
No announcement yet.

Is SELECT CASE slower

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Is SELECT CASE slower

    In a recent posting on high speed graphics Lance and others commented that SELECT is slower than other decision structures.

    I have a program in PB/DLL 6.0 which creates pseudo UDT's (ie the program doesn't know the UDT structure in advance) and then sorts them with a binary sort, this is controlled with SELECT so as it is time critical I converted it to ON GOTO, to my surprise it ran 20% slower. To be precise in my test data I created 2.25 million pseudo UDT's with a structure of string * 18, long integer, string * 2. The SELECT method sorted then in 3 min 50 sec, the ON GOTO took 4 min 58 sec. I verified with the counters srtlpcnt& and efldcnt& that both verions went through the two decision structures the same number of times, 508 million times for the first and 271 million times for the second.

    I then coded my own type of goto by building arrays of codeptr's to the labels and using a goto of the form :-
    GOTO DWORD sel4(selection number)
    this brought the time back to 3 min 52 sec. The claim that SELECT always using floating point seems strange as SELECT can also handle strings. From my time tests it would appear that the compiler is smart enough to construct something similar to the last method outlined. If this is the case then it would be even faster than IF - ELSEIF -ELSE constructs which do every test until they hit a match.

    I have no idea why the ON GOTO was so slow, does anyone know what is the fastest decision method?

    The code is below in SELECT form with the ON GOTO code remarked out.
    The type of field to be sorted is stored in the integer array ka
    Type 1 and 5 are strings, 2 and 6 are longs and 3 and 7 are double reals.
    Type 1, 2 and 3 are sorted ascending, 5, 6 and 7 descending, the first field must be ascending.
    Code:
    swaplen& = keylen& \ 4 - 1
    offset&= (xx& + 1) \ 2
    DO WHILE offset& > 0
        lmt& = xx& - offset&
        DO
            swtch& = 0
            FOR yy& = 0 TO lmt&
                eql& = -1
                swp& = 0
                srtlpcnt& = srtlpcnt& + 1
                ibp1 =VARPTR(ind1(0))+ yy& * keylen&
                wbp1 = VARPTR(ind1(0)) +(yy& + offset&) * keylen&
                SELECT CASE ka(0,0,kma)
                'ON ka(0,0,kma)GOTO sel05_1, sel05_2, sel05_3
                'GOTO ESel05
                    CASE 1
                    'sel05_1:
                       dwp1 = ibp1
                       dwp2 = wbp1
                       FOR u& = 0 TO sslen(0) - 1
                            IF @dwp1[u&] > @dwp2[u&] THEN
                               swp& = -1
                               swtch& = yy&
                               EXIT FOR
                            ELSEIF @dwp1[u&] < @dwp2[u&] THEN
                               eql& = 0
                               EXIT FOR
                            END IF
                        NEXT
                        'GOTO ESel05
                    CASE 2
                    'sel05_2:
                        wolp = ibp1
                        wlp1 = wbp1
                        IF @wolp > @wlp1 THEN
                           swp& = -1
                           swtch& = yy&
                        ELSEIF @wolp < @wlp1 THEN
                           eql& = 0
                        END IF
                        'GOTO ESel05
                    CASE 3
                    'sel05_3:
                        wodp = ibp1
                        wdp1 = wbp1
                        IF @wodp > @wdp1 THEN
                           swp& = -1
                           swtch& = yy&
                        ELSEIF @wodp < @wdp1 THEN
                           eql& = 0
                        END IF
               END SELECT
               'ESel05:
               zz& = 1
               DO WHILE eql& AND zz& <= numkeys&
                            efldcnt& = efldcnt& + 1
                            SELECT CASE ka(zz&,0,kma)
                            'ON ka(zz&,0,kma) GOTO sel04_1, sel04_2, Sel04_3, ESel04, sel04_5, sel04_6, sel04_7
                            'GOTO ESel04
                                CASE 1
                                'sel04_1:
                                    dwp1 = ibp1 + koff1(zz&)
                                    dwp2 = wbp1 + koff1(zz&)
                                    FOR u& = 0 TO sslen(zz&) - 1
                                        IF @dwp1[u&] > @dwp2[u&] THEN
                                            swp& = -1
                                            swtch& = yy&
                                            EXIT FOR
                                        ELSEIF @dwp1[u&] < @dwp2[u&] THEN
                                            eql& = 0
                                            EXIT FOR
                                        END IF
                                    NEXT
                                    'GOTO ESel04
                                CASE 2
                                'sel04_2:
                                    wolp = ibp1 + koff1(zz&)
                                    wlp1 = wbp1 + koff1(zz&)
                                    IF @wolp > @wlp1 THEN
                                        swp& = -1
                                        swtch& = yy&
                                    ELSEIF @wolp < @wlp1 THEN
                                        eql& = 0
                                    END IF
                                    'GOTO ESel04
                                CASE 3
                                'sel04_3:
                                    wodp = ibp1 + koff1(zz&)
                                    wdp1 = wbp1 + koff1(zz&)
                                    IF @wodp > @wdp1 THEN
                                        swp& = -1
                                        swtch& = yy&
                                    ELSEIF @wodp < @wdp1 THEN
                                        eql& = 0
                                    END IF
                                    'GOTO ESel04
                                CASE 5
                                'sel04_5:
                                    dwp1 = ibp1 + koff1(zz&)
                                    dwp2 = wbp1 + koff1(zz&)
                                    FOR u& = 0 TO sslen(zz&) - 1
                                        IF @dwp1[u&] < @dwp2[u&] THEN
                                            swp& = -1
                                            swtch& = yy&
                                            EXIT FOR
                                        ELSEIF @dwp1[u&] > @dwp2[u&] THEN
                                            eql& = 0
                                            EXIT FOR
                                        END IF
                                    NEXT
                                    'GOTO ESel04
                                CASE 6
                                'sel04_6:
                                    wolp = ibp1 + koff1(zz&)
                                    wlp1 = wbp1 + koff1(zz&)
                                    IF @wolp < @wlp1 THEN
                                        swp& = -1
                                        swtch& = yy&
                                    ELSEIF @wolp > @wlp1 THEN
                                        eql& = 0
                                    END IF
                                    'GOTO ESel04
                                CASE 7
                                'sel04_7:
                                    wodp = ibp1 + koff1(zz&)
                                    wdp1 = wbp1 + koff1(zz&)
                                    IF @wodp < @wdp1 THEN
                                        swp& = -1
                                        swtch& = yy&
                                    ELSEIF @wodp > @wdp1 THEN
                                        eql& = 0
                                    END IF
                            END SELECT
                            'ESel04:
                    zz& = zz& +1
                LOOP
                IF swp& THEN
                   dwp1 = ibp1
                   dwp2 = wbp1
                   FOR u& = 0 TO swaplen&
                       @dwps[u&] = @dwp1[u&]
                       @dwp1[u&] = @dwp2[u&]
                       @dwp2[u&] = @dwps[u&]
                   NEXT
                END IF
            NEXT
            lmt& = swtch&
        LOOP WHILE swtch&
        offset& = offset& \ 2
    LOOP
    ------------------


    [This message has been edited by John Petty (edited March 10, 2001).]

  • #2
    John --

    > The claim that SELECT always using floating
    > point seems strange as SELECT can also handle
    > strings.

    Well, I'm pretty sure that the claim was that all numeric SELECTs are performed with floating point math.

    But I have recently been told by PB R&D that I was mistaken about that. (The actual term they used was "drivel". ) That used to be true, and still is (I believe) in PB/DOS, but the more recent versions of the Windows compilers are smarter than that. They do in fact use the data type of the SELECT CASE statement for all comparisons.

    According to R&D, IF/THEN and SELECT CASE should execute at roughly the same speed (all other things being equal) with IF/THEN having a minor speed advantage.

    > If this is the case then it would be even faster
    > than IF - ELSEIF -ELSE constructs which do every
    > test until they hit a match.

    A SELECT CASE statement also has to perform every test until it hits a match. The only reason that ON GOTO is different is that only 256 values can be evaluated, so all 256 jumps (if you actually use that many) can be defined ahead of time. That's not true with SELECT CASE. PB has no choice but to try the tests, one by one, until it finds one that evaluates as True.

    As for ON GOTO being slower I'm very surprised by that. I guess it's possible that my "knowledge" about ON GOTO is out of date too, but it's hard to imagine that ON GOTO could be that inefficient. Time to write my own test program...

    By the way, if you put formatting tags before and after your test program like this:

    [ CODE ]
    Code:
    SOURCE CODE GOES HERE
    [ /CODE ]

    ...it will be much easier to read, and your indenting will be preserved. (I added spaces to the tags above so they would be visible. You would actually type them without the spaces.)

    -- Eric


    ------------------
    Perfect Sync Development Tools
    Perfect Sync Web Site
    Contact Us: mailto:[email protected][email protected]</A>



    [This message has been edited by Eric Pearson (edited March 10, 2001).]
    "Not my circus, not my monkeys."

    Comment


    • #3
      Oops I'm wrong again. The recent thread didn't specify "numeric" SELECT CASE statements when it talked about floats/integers/etc. My mistake, then and now.

      -- Eric

      P.S. Lance probably won't be jumping in to this thread any time soon... He got married on Friday and is on his honeymoon.



      [This message has been edited by Eric Pearson (edited March 10, 2001).]
      "Not my circus, not my monkeys."

      Comment


      • #4
        I have tried them all in attempts to optimize my language translator
        as much as possible and found that SELECT CASE actually was a bit faster
        at comparing BYTE pointers. At least in my code it was. Therefore, I
        couldn't understand why people said it should be slower..

        (no, GOTO was impossible to use/test. Too complicated code for that..)

        ..and in case you still can't keep away from here - congratulations Lance!


        ------------------

        Comment


        • #5
          Eric
          Thanks for the code delimiters (have now edited), I new they existed but aren't listed in the bulletin board FAQ.
          If the SELECT tests until it gets a match then then the 3rd metod I described of using the value as an index into an array of code pointers should be miles faster.
          Just goes to show how good the compiler really is

          ------------------

          Comment


          • #6
            Hi Eric,

            > Time to write my own test program...

            OK, I have some approximate results.

            My current project requires scanning of a byte stream in the minimum time. I originally started coding it in
            assembly language, but the complexity is such that I just became overwhelmed. So I'm using PBDLL6.

            Anyway, I've built 3 separate versions of a test program which walks through the bytes of the Win32 help file,
            which on my machine is 12,312 KB in size. The test program versions differ only in the way the current byte is
            disposed of - they use one of
            Code:
              ON...GOTO
              
              IF...THEN
              
              SELECT CASE
            Each version of the test program is over 800 lines (purely due to having an explicit branch for each value in
            the range 0 to 255), so I haven't posted them, but I've put the program skeleton below.

            The results I obtained were for the number of API ticks taken by the scanning process in each case. This is in
            no way a scientific analysis - just a 'look see'. The results (number of ticks) I got were -
            Code:
              Method        Test 1    Test 2     Test 3
              ------        ------    ------     ------
            	
              ON...GOTO       1172      1152       1162
              
              IF...THEN       5729      5748       5729
              
              SELECT CASE     5869      5848       5878
            This is the program skeleton - no frills, just a testbed -
            Code:
            #compile  exe
            #register none
            #include  "win32api.inc"
            
            function pbmain() as long
            
              dim   mCounts(0 to 255) as long                'something to do work on
            	
              local sBuffer           as string              'for file contents
              local b                 as byte                'the current char
              local p                 as byte ptr            '-> current char
              local pLastChar         as byte ptr            '-> last char in buffer 
            	
              local mTicks            as long                'API tick count
            	
              'Buffer the target file
              open "D:\PBStuff\Help Files\win32.hlp" for binary access read as #1
              get$ #1, lof(1), sBuffer
              close #1
            	
              'Initialise...
              p = strptr( sBuffer )
              pLastChar = p + len(sBuffer) - 1
              decr p                                         'in readiness looping
              
              'Buffer scan starts here...
              mTicks = GetTickCount()
            
            NextChar:
            
              if p < pLastChar then incr p else goto Done
              b = @p
            
            '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            'Individual byte values get handled here ; regardless of which
            'of the 3 methods is used, the action performed against a char is -
            '
            ' incr mCounts(actual char value)                'do something with char
            ' goto NextChar
            '
            ' ...
            '
            '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            
            Done:
              msgbox "Buffer scan took " & format$(GetTickCount() - mTicks) & " ticks"
            	
            end function
            Cheers -

            Paul




            [This message has been edited by Paul Noble (edited March 11, 2001).]
            Zippety Software, Home of the Lynx Project Explorer
            http://www.zippety.net
            My e-mail

            Comment


            • #7
              Paul --

              Thanks! Those are the results that I would expect, knowing what I know now. IF/THEN is slightly faster than SELECT CASE, and ON GOTO beats the pants off both of them.

              -- Eric


              ------------------
              Perfect Sync Development Tools
              Perfect Sync Web Site
              Contact Us: mailto:[email protected][email protected]</A>
              "Not my circus, not my monkeys."

              Comment


              • #8
                Paul --

                Thinking about your test, I wrote a program to calculate the average ASCII value in the Win32.HLP file (am I a geek or what?) and it is around 78. Since the average ASCII value in a truly random data stream would be 127.5 your test was inadvertently biased in favor of the IF/THEN and SELECT tests. Since IF and SELECT perform tests sequentially, if the data has a low average value then they will both "find a match" faster, on average. The speed of ON GOTO is not dependent on the average test value or the number of "target" values that are being used. You can test for 256 different values just as rapidly as you can test for 2.

                Win32.HLP is a valid test for a data stream that is primarily text, but I suspect that if you repeat your tests using a ZIP file (or RND(0,255) for that matter) that ON GOTO will be almost a full order of magnitude (1,000%) faster than either of the other methods. That result would be right in line with the ON GOTO tests I performed back in my PB/DOS days, oh so many clock-ticks ago.

                -- Eric


                ------------------
                Perfect Sync Development Tools
                Perfect Sync Web Site
                Contact Us: mailto:[email protected][email protected]</A>
                "Not my circus, not my monkeys."

                Comment


                • #9
                  Eric,

                  That's a very good point about the ASCII bias. Before your advice earlier in the week,
                  I had been using SELECT CASE in the project and had in fact made a note to do 'tuning'
                  runs over sample data (it's all text) before release, to try to establish character
                  frequencies and promote the commonest ones to the top of the SELECT CASE.

                  Since then I've converted the core logic to use ON...GOTO, and as you say, ASCII value
                  is no longer an issue. It had escaped my notice for the examples, though.

                  > oh so many clock-ticks ago

                  Cheers -
                  Zippety Software, Home of the Lynx Project Explorer
                  http://www.zippety.net
                  My e-mail

                  Comment


                  • #10
                    John's code still begs the question as to why ON GOTO produced worse
                    performance.

                    I'm wondering if it might have something to do with the ratio of
                    evaluations that meet the criteria vs. those that fail?

                    Paul, Eric -> Your samples hit one of the defined jump labels 99%
                    of the time, didn't they?

                    It appears from John's code that the ON GOTO will fail at a higher
                    rate. Could that make any difference?

                    John, would mind replacing :

                    Code:
                    'ON ka(0,0,kma)GOTO sel05_1, sel05_2, sel05_3
                    'GOTO ESel05
                    with:

                    Code:
                    'ON ka(0,0,kma)GOTO sel05_1, sel05_2, sel05_3, ESel05, ESel05, ESel05, ESel05
                    and see if it makes any difference? I'm assuming the value of ka()
                    always ranges from 1-7.


                    BTW, I experienced a 33% performance boost when I changed a SELECT
                    CASE to ON GOTO in a section of my code.

                    ------------------
                    Bernard Ertl

                    [This message has been edited by Bern Ertl (edited March 11, 2001).]
                    Bernard Ertl
                    InterPlan Systems

                    Comment


                    • #11
                      Bern, Eric
                      Between both your comments I think I now know why program ran slower with ON GOTO.
                      The evaluation succeeds 100% of the time as the types are preset, however applying Eric's ASCII bias the data I used (real not done for testing) matches to the first option in the first SELECT CASE 100% of the time and so it is only doing one compare, in the second SELECT CASE it will always match the first or second!

                      ON GOTO has to do range checking before it can jump as 0 or any number larger than the number of choices are ignored. This would explain why the third method I used, of constructing an array of codepointers and using the value as an index brought the speed back, I didn't do any range checking.

                      I was surprised at my results, as the ON GOTO seemed a logical solution. In my case where range checking is not necessary as the values will always match a choice then building an array of code pointers will by far be the fastest in the same way that PB gains speed by not doing bounds checking on arrays. BTW this method also allows more than 255 choices


                      ------------------

                      Comment

                      Working...
                      X