QB4.5 Vs PB3.5

  • QB4.5 Vs PB3.5

    On my 166MHz machine, running under Windows 95 and launched from File Manager, the following runs in 21+ seconds using QB4.5 and 63+ seconds using PB3.5.
    The code was compiled using $CPU 80386 and $OPTIMIZE SPEED.

    DEFINT A - Z
    $STATIC

    DIM Arr(59, 255)

    PRINT TIMER

    FOR I = 0 TO 59
       FOR J = 0 TO 255
          Arr(I, J) = I + J
          FOR K = 0 TO 255
             SWAP Arr(I, J), Arr(I, 255 - J)
             FOR L = 0 TO 255
             NEXT L
          NEXT K
       NEXT J
    NEXT I

    PRINT TIMER

    Barring the use of ASM (which I can't see speeding it up any), is there a way to optimize the code to run faster than the QB4.5 code?

    ------------------


    [This message has been edited by Walt Decker (edited August 16, 2001).]
    Walt Decker

  • #2
    Greetings Walt!

    I'm not a brilliant programmer, and I don't know of a better way to handle what you've done, but I tested it on my 600MHz Win98 machine.

    PowerBasic for DOS 3.5
    Code:
    $CPU 80386
    $OPTIMIZE SPEED
    
    DEFINT A - Z
    $STATIC
    
    DIM Arr(59, 255)
    
    Start& = TIMER
    
    FOR I = 0 TO 59
       FOR J = 0 TO 255
          Arr(I, J) = I + J
          FOR K = 0 TO 255
             SWAP Arr(I, J), Arr(I, 255 - J)
             FOR L = 0 TO 255
             NEXT L
          NEXT K
       NEXT J
    NEXT I
    
    Ed& = TIMER
    
    PRINT Ed& - Start&
    QuickBasic PDS 7.1
    Code:
    DEFINT A-Z
    
    DIM Arr(59, 255)
    
    Start& = TIMER
    
    FOR I = 0 TO 59
       FOR J = 0 TO 255
          Arr(I, J) = I + J
          FOR K = 0 TO 255
             SWAP Arr(I, J), Arr(I, 255 - J)
             FOR L = 0 TO 255
             NEXT L
          NEXT K
       NEXT J
    NEXT I
    
    Ed& = TIMER
    
    PRINT Ed& - Start&
    ...I averaged 13 seconds with PB and 45 seconds with QBX.
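
    One way to take such an average, offered as a hedged sketch rather than the code actually used for the figures above: time several runs, keep TIMER in a double-precision variable (assigning it to a long such as Start& rounds it to whole seconds), and divide the total by the run count. The names Runs, Total#, T0#, T1# and the Benchmark label are illustrative assumptions, and the run count of 5 is arbitrary.

    ```basic
    ' Sketch only: average several timed runs of the benchmark.
    ' Runs, Total#, T0#, T1# and the Benchmark label are illustrative names.
    DEFINT A-Z
    DIM Arr(59, 255)

    Runs = 5                                     ' arbitrary repetition count
    Total# = 0

    FOR R = 1 TO Runs
       T0# = TIMER
       GOSUB Benchmark
       T1# = TIMER
       IF T1# < T0# THEN T1# = T1# + 86400#      ' TIMER restarts from 0 at midnight
       Total# = Total# + (T1# - T0#)
    NEXT R

    PRINT "Average seconds:"; Total# / Runs
    END

    Benchmark:
       FOR I = 0 TO 59
          FOR J = 0 TO 255
             Arr(I, J) = I + J
             FOR K = 0 TO 255
                SWAP Arr(I, J), Arr(I, 255 - J)
                FOR L = 0 TO 255
                NEXT L
             NEXT K
          NEXT J
       NEXT I
    RETURN
    ```

    Averaging will not remove the effect of other programs running under Windows, but it does smooth out the one-second granularity of a single whole-second reading.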

    Kurt Reonis
    [email protected]
    Donnie Ewald
    [email protected]



    • #3
      It's possible to make virtually any code faster by the judicious use of ASM but,
      when the code does nothing meaningful, one must question the value of optimizing
      it.

      Optimization hint #1: remove the empty FOR L / NEXT L loop.


      ------------------
      Tom Hanlin
      PowerBASIC Staff



      • #4
        Strange. Testing the code on my 800MHz machine I got the same kind of results.

        Tom, the empty FOR ... NEXT loop does something. I didn't fill it in because I'm trying to find the
        slowest piece of code and optimize that if I can. I really don't want to use ASM unless I absolutely have to.


        Besides, this isn't my code. It happens to be from a fellow who gained employment because I broke your rule about not posting employment information. He talked
        his employer into purchasing PB3.5 for the project he is working on and now has egg on his face.
        ------------------


        [This message has been edited by Walt Decker (edited August 16, 2001).]
        Walt Decker



        • #5
          Walter, are you saying that you got him a job and now you are doing the programming for him too?

          Seriously, what exactly does the real code need to achieve? If this is just toy code, then we are wasting time... let's deal with the real code.

          For example, you could use pointers to give the code a step upward, etc.

          ------------------
          Lance
          PowerBASIC Support
          mailto:[email protected]


          • #6
            Different brand-name compilers are designed differently; it follows that there will be some things one compiler does better than another, and vice versa; no compiler does everything better than another.

            As Lance says, take a look at your fundamental logic structure. Try another approach.

            MCM

            Michael Mattias
            Tal Systems Inc. (retired)
            Racine WI USA
            [email protected]
            http://www.talsystems.com



            • #7
              QUOTE

              Walter, are you saying that you got him a job...

              UNQUOTE

              No, I did not get him a job. He read about my post here before Mr. Hanlin pulled it, contacted me for the information and that's the last I heard from him until last Friday.

              QUOTE

              ...and now you are doing the programming for him too?

              UNQUOTE

              No. He sent me that code only, except that the inner loop has code to read and compare bit pairs in each integer element.

              QUOTE

              Seriously, what exactly does the real code need to achieve?

              UNQUOTE

              I have no idea, other than that it has something to do with encryption.

              QUOTE

              If this is just toy code, then we are wasting time... let's deal with the real code.

              UNQUOTE

              I'd hate to waste your time. I'd think that with a common procedure like SWAP, PB3.5 would execute at least as fast as QB4.5. I realize that PB3.5 is no longer Mr. Zale's bread and butter, but this code certainly reduces my admiration for the product.

              QUOTE

              For example, you could use pointers to give the code a step upward, etc.

              UNQUOTE

              Tried that. It helped a little, but not enough. I've also used a single-dimension array and calculated the offset myself. I've also unrolled the loops.



              ------------------
              Walt Decker



              • #8
                The point here is that there are probably a number of ways to make the code fly, but as this is not the "real code", any specific optimizations will need to be rewritten for the "real" code.

                You said pointers did not help you (but you did not say how you implemented the pointer code)... if you post your pointer-based code, I'll see if I can reduce the execution time without resorting to inline-assembler. How is that for an offer?

                "I'd think that with a common procedure like SWAP, PB3.5 would execute at least as fast as QB4.5. I realize that PB3.5 is no longer Mr. Zale's bread and butter, but this code certainly reduces my admiration for the product."

                Hang on a tic, Walter, please try to keep to the subject at hand... we *are* trying to help you and your friend here.

                Firstly, the code works, right? You just want more speed for a particular section.

                Secondly, any given code snippet for another Basic language may or may not run faster in PB without change (because of the differences in compiler design); that's where optimization comes in. This applies to porting code between most types of compilers - that's what makes compilers different.

                Lastly, there *are* ways to optimize the code to speed it up... I'd put money on it.

                So, if you would be so good as to post the code you have got the best performance from to date, we'll see what we can do to help you further...


                ------------------
                Lance
                PowerBASIC Support
                mailto:[email protected]


                • #9
                  Walt, posts for PowerBASIC jobs might be acceptable here. However, the job posts you
                  have made have not been for PowerBASIC jobs and were hence ruled off topic. Let's not
                  grumble about it. If you want to have a generic BASIC job forum, you are free to set
                  up your own web site and, odds are, we'll be happy to provide a link to it.

                  As Michael notes, "no compiler does everything better than another". If PB/DOS
                  were identical to QB, there wouldn't have been much point in writing it. If you show
                  us your code, perhaps we can suggest improvements. If the only reason you prefer one
                  BASIC to another is the speed of the SWAP statement, I'm baffled, but that's certainly
                  your decision to make.

                  ------------------
                  Tom Hanlin
                  PowerBASIC Staff



                  • #10
                    My Findings:

                    I tested the above code on PB 3.5, QB 4.5, and PDS 7.1. PB ran the
                    snippet faster on my machine; in fact, it completed in a quarter of the time of
                    QB 4.5. I know you didn't want to use ASM, but I even tried replacing the
                    SWAP statement with a simple asmSWAP that I wrote. That did not increase the
                    speed. The only way I was able to come up with a very slight
                    increase in speed was by swapping the variables manually. Instead of a
                    statement like:

                    SWAP x, y

                    I used:

                    z = x: x = y: y = z

                    This gave only a very slight increase in speed, about 0.5%.
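
                    Applied to the benchmark posted at the top of the thread, the manual swap might look like the sketch below. This is an illustration, not the exact code that was timed: the temporary T and the hoisted mirror index M are assumptions added here, and the empty FOR L loop is left out for brevity.

                    ```basic
                    ' Sketch only: three-assignment swap in place of the SWAP statement.
                    ' T (temporary) and M (mirror index) are illustrative names.
                    DEFINT A-Z
                    DIM Arr(59, 255)

                    FOR I = 0 TO 59
                       FOR J = 0 TO 255
                          Arr(I, J) = I + J
                          M = 255 - J                 ' computed once per J instead of once per swap
                          FOR K = 0 TO 255
                             T = Arr(I, J)            ' same effect as SWAP Arr(I, J), Arr(I, M)
                             Arr(I, J) = Arr(I, M)
                             Arr(I, M) = T
                          NEXT K
                       NEXT J
                    NEXT I
                    ```

                    Whether this beats the built-in SWAP depends on the compiler and the machine, which is consistent with the tiny difference reported above.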


                    Scott


                    ------------------
                    Scott Slater
                    Summit Computer Networks, Inc.
                    www.summitcn.com



                    • #11
                      I owe PB an apology. It seems that there are at least 2 computers on the planet on which PB executes at a snail's pace compared to QB.

                      quote
                      ----------------------------
                      You said pointers did not help you (but you did not say how you implemented the pointer code)... if you post your pointer-based code, I'll see if I can reduce the execution time without resorting to inline-assembler. How is that for an offer?
                      ----------------------------

                      No. I said they didn't help "much." On my laptop they trimmed about 5 seconds off the execution time of about 63 seconds. The way I implemented the pointers was by changing the m x n array to an m array, assigning one pointer to the array, and then calculating the indices of the pointer for initializing the array and swapping the values.
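
                      The index calculation for such a one-dimensional layout can be sketched as follows. This shows only the flattened array with hand-computed offsets, not the pointer code itself, which was not posted; the names Flat, RowBase, P and Q are illustrative assumptions, and the empty FOR L loop is left out for brevity.

                      ```basic
                      ' Sketch only: 60 x 256 array stored as one dimension, row by row.
                      ' Flat, RowBase, P and Q are illustrative names.
                      DEFINT A-Z
                      DIM Flat(15359)                 ' 60 * 256 elements, indices 0 to 15359

                      FOR I = 0 TO 59
                         RowBase = I * 256            ' offset of row I, computed once per row
                         FOR J = 0 TO 255
                            P = RowBase + J           ' stands in for Arr(I, J)
                            Q = RowBase + 255 - J     ' stands in for Arr(I, 255 - J)
                            Flat(P) = I + J
                            FOR K = 0 TO 255
                               SWAP Flat(P), Flat(Q)
                            NEXT K
                         NEXT J
                      NEXT I
                      ```

                      The largest index is 59 * 256 + 255 = 15359, so all of the offset arithmetic stays within the range of a 16-bit integer.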


                      Mr. Hanlin
                      quote
                      ----------------------------
                      Walt, posts for PowerBASIC jobs might be acceptable here. However, the job posts you
                      have made have not been for PowerBASIC jobs and were hence ruled off topic.
                      ----------------------------

                      So what? A PB programmer gained employment because of it. PB made at least one sale because of it.

                      quote
                      -----------------------------
                      Let's not grumble about it.
                      -----------------------------

                      I'm not grumbling. I think the policy is a disservice to the people who frequent your forum and a detriment to PB in general. People who use PB are going to promote it, not only to other programmers but also to the company with which they are employed.

                      quote
                      -----------------------------
                      If you want to have a generic BASIC job forum, you are free to set
                      up your own web site and, odds are, we'll be happy to provide a link to it.
                      -----------------------------

                      I have a generic programming job topic.



                      ------------------
                      Walt Decker



                      • #12
                        I can guarantee any non-graphical code in PB 3.5 will run faster
                        than QB when correctly used. Having been the on-line support guy
                        for PB back in the DOS days, I proved that often. PBDLL also runs way
                        faster than VB and as fast as, and sometimes faster than, C++.
                        -- Barry


                        ------------------
                        Barry



                        • #13
                          "I can guarantee any non-graphical code in PB 3.5 will run faster
                          than QB when correctly used. Having been the on-line support guy
                          for PB back in the DOS days, I proved that often. PBDLL also runs way
                          faster than VB and as fast as, and sometimes faster than, C++."

                          Having been around that long, surely you know that it's not the compiler, it's the programmer.

                          MCM

                          Michael Mattias
                          Tal Systems Inc. (retired)
                          Racine WI USA
                          [email protected]
                          http://www.talsystems.com



                          • #14
                            Just wondering ... (without having done any testing): If two people
                            are testing the same process and the results are not the same (as the
                            above posts show), they must be testing it with different parameters.
                            Might the compiler settings have an influence here? For example, did
                            you test with "Optimize Speed" and 80386 code?
                            I might be far away from the truth, but since nobody asked this question
                            so far...

                            Hans Ruegg



                            • #15
                              Different CPUs, especially CPUs by different companies, are optimized
                              differently... these days, it's not unusual to test two algorithms on
                              two different machines, and see algorithm 1 run faster on the first
                              machine, and algorithm 2 run faster on the second machine-- especially
                              when timing tiny snippets of code like this.

                              ------------------
                              Tom Hanlin
                              PowerBASIC Staff



                              • #16
                                I tested the *.exe on 3 different computers. It was compiled using $OPTIMIZE SPEED and $CPU 386. The three computers were a 166MHz IBM ThinkPad laptop with a Pentium II processor running Win95B, an 800MHz custom-built desktop with a Pentium III processor running Win2000Pro, and a 33MHz custom-built 486 running Win95B.
                                The 800MHz machine and the 33MHz machine ran the PB code faster than the QB code. I also ported the PB-DOS code to PB/DLL and, believe it or not, the DOS code beat the PB/DLL code by about 2 seconds on the 800MHz machine and about 5 seconds on the 33MHz machine. On the 166MHz machine, the QB code was faster by about 30 seconds.

                                ------------------
                                Walt Decker



                                • #17
                                  My apologies, Walt ... I realize you did test with these settings, so I had to do
                                  some more research. (One other factor is whether error testing is on or off, but I found
                                  this is not the main point either.) I have now run some tests with the code you
                                  posted initially, and was surprised to find that QB really runs faster when compiled
                                  as an EXE (although it is slower when running in the IDE). But I also found that when
                                  testing execution speed under Windows, many other factors that you cannot easily
                                  control influence the result, for example what other programs are running at the same
                                  time.
                                  These were the results (all referring to your code compiled as EXE, on a Pentium II at
                                  266 MHz and under Windows 98):
                                  First test:
                                  QB 4.5: 25 seconds
                                  PB 3.5 (Errorchecking on): 59 seconds
                                  PB 3.5 (Errorchecking all off): 42 seconds
                                  Second test: (The difference in speed seems to be due to the fact that during the first
                                  test, the PB IDE was running in the background, while during the second test it was
                                  closed.)
                                  QB 4.5: 13 seconds
                                  PB 3.5 (Errorchecking all off): 29 seconds
                                  So I tested again in DOS mode:
                                  QB 4.5: 11.97 seconds
                                  PB 3.5 (Errorchecking all off): 28.72 seconds

                                  ...and it's not the SWAP statement nor anything about array handling,
                                  it's just the FOR ... NEXT loops by themselves! I took everything out
                                  of the code except the loops, and it still took 27 seconds to execute.
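
                                  A loops-only version of the benchmark might look like the sketch below. It is a reconstruction from the code posted at the top of the thread, not the exact file that was timed, and the variable T0# is an illustrative name. The innermost NEXT executes 60 * 256 * 256 * 256 = 1,006,632,960 times, so the loop overhead alone is substantial.

                                  ```basic
                                  ' Sketch only: the four nested loops with the array work removed.
                                  ' T0# is an illustrative name for the start time.
                                  DEFINT A-Z

                                  T0# = TIMER

                                  FOR I = 0 TO 59
                                     FOR J = 0 TO 255
                                        FOR K = 0 TO 255
                                           FOR L = 0 TO 255
                                           NEXT L
                                        NEXT K
                                     NEXT J
                                  NEXT I

                                  PRINT TIMER - T0#           ' elapsed seconds (assumes no midnight rollover)
                                  ```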

                                  So there might be a way of optimizing the loops by using inline assembler?
                                  Consider the following test code which resulted (on my system) in considerable
                                  speed differences. The first example is just PB code. The second performs INC
                                  and CMP instructions on memory references (counter variables). The third example
                                  moves the counter value to AX, performs INC and CMP on AX, and moves AX back
                                  to the variable in memory. In my tests, this third example was fastest.
                                  (First: 15.8 sec
                                  Second: 27.3 sec
                                  Third: 11.7 sec)

                                  Code:
                                  '-------------------------------------
                                  PRINT TIMER
                                  
                                  FOR i = 0 TO 32000
                                  FOR k = 0 TO 32000
                                        '(this loop is empty)
                                  NEXT k
                                  NEXT i
                                  
                                  PRINT TIMER
                                  
                                  ! MOV i, 0
                                  ! JMP EvalI
                                  LoopI:
                                    ! MOV k, 0
                                    ! JMP EvalK
                                  LoopK:
                                        '(this loop is empty)
                                  NextK:
                                    ! INC k
                                  EvalK:
                                    ! CMP k, 32000
                                    ! JLE LoopK
                                  NextI:
                                    ! INC i
                                  EvalI:
                                  ! CMP i, 32000
                                  ! JLE LoopI
                                  
                                  PRINT TIMER
                                  
                                  ! MOV AX, 0
                                  ! JMP EvalI2
                                  LoopI2:
                                    ! MOV AX, 0
                                    ! JMP EvalK2
                                  LoopK2:
                                        '(this loop is empty)
                                  NextK2:
                                    ! MOV AX, k
                                    ! INC AX
                                  EvalK2:
                                    ! MOV k, AX
                                    ! CMP AX, 32000
                                    ! JLE LoopK2
                                  NextI2:
                                    ! MOV AX, i
                                    ! INC AX
                                  EvalI2:
                                  ! MOV i, AX
                                  ! CMP AX, 32000
                                  ! JLE LoopI2
                                  
                                  PRINT TIMER
                                  '-------------------------------------
                                  Conclusion: The cause of the speed difference seems to be the way the loops
                                  are constructed, and in this case, the way QB does it seems to be faster.
                                  If PB turns out faster on some systems, might that be due to differences in processor
                                  architecture? (For example, I had learned that XOR AX, AX is faster than MOV AX, 0.
                                  But recently I was told that on some newer processors this is no longer true. Might
                                  the case discussed here be similar?)

                                  I hope this helps in some way.

                                  Hans Ruegg



                                  • #18
                                    Just out of curiosity, have you tried doing something simple like adding X=1 inside the currently-empty loops? Some compilers recognize things like empty FOR/NEXT loops and do not actually compile or execute the code. This is considered (by some) to be an optimization, but I prefer compilers like PB that "do exactly what I say". I'd be surprised (but not amazed) if QB did that, but it would certainly account for the speed differences you are seeing.

                                    -- Eric


                                    ------------------
                                    Perfect Sync Development Tools
                                    Perfect Sync Web Site
                                    Contact Us: mailto:[email protected]
                                    "Not my circus, not my monkeys."



                                    • #19
                                      I am sorry, but I do not have PB and internet connection in the same place,
                                      so I cannot test it right now. Maybe later ... or someone else could take
                                      the time to do it?

                                      Hans Ruegg



                                      • #20
                                        Ok, I compiled the code below on PBDOS v3.2, PBDOS v3.5, QuickBASIC
                                        v4.5, and QBX v7. I even compiled it using the PDQ library for yet
                                        a 5th EXE file. The results were all very much the same for all 5
                                        variants of the program. Below are the speed results, in seconds, on an Intel
                                        Celeron 333 machine running under Win98 (First Edition).

                                        QB v4.5      26.5898
                                        QBX v7       26.6406
                                        QBX w/PDQ    26.5732
                                        PB v3.2      26.5839
                                        PB v3.5      26.6388

                                        Even PDQ, with its 1,154-byte EXE file, still took 26 seconds. So I
                                        think that this is pretty irrelevant at this point. I added a
                                        statement inside of the dead loop and used this code for all cases
                                        except PDQ, where I had to change the arrays to suit its needs, but
                                        this still did not make any drastic improvements in speed.

                                        Code:
                                        DEFINT A-Z
                                         
                                        $STATIC
                                         
                                        DIM Arr(59, 255)
                                         
                                        TT# = TIMER
                                         
                                        PRINT TT#
                                         
                                        FOR I = 0 TO 59
                                          FOR J = 0 TO 255
                                            Arr(I, J) = I + J
                                            FOR K = 0 TO 255
                                              SWAP Arr(I, J), Arr(I, 255 - J)
                                              FOR L = 0 TO 255
                                                QQ& = QQ& + 1
                                              NEXT L
                                            NEXT K
                                          NEXT J
                                        NEXT I

                                        Scott


                                        ------------------
                                        Scott Slater
                                        Summit Computer Networks, Inc.
                                        www.summitcn.com
