Integer performance

  • Integer performance

    (Sorry for my terrible written English.)

    I'm porting a program from Microsoft Basic 7.1 PDS to PowerBASIC 3.5 for DOS.
    With just a few modifications to the code, things are working pretty well. But I have a performance issue: the PB-compiled program runs slower than the MS version!
    After a little investigation, it seems to me that working with integer arrays is not so fast in PowerBASIC.

    An example:

    Code:
    DEFINT A-Z
    
    DIM a(1000)
    DIM b(10000)
    
    t! = TIMER
    
    FOR r = 1 TO 5
        PRINT r
        FOR i = 1 TO 1000
            c = 0
            FOR j = 1 TO 10000
                c = c + 1
                a(i) = c
                b(j) = a(i)
            NEXT j
        NEXT i
    NEXT r
    
    tt! = TIMER
    
    PRINT tt! - t!
    On a P3/550 MHz under WinXP, the times are:

    MS Basic: 0.8 sec.
    PB: 4.61 sec.

    Using LONG instead of INTEGER (DEFINT -> DEFLNG):

    MS Basic: 10.10 sec.
    PB: 10.21 sec.

    I enabled 386 code generation, disabled all error testing, disabled Ctrl-Break checking, etc.

    Am I doing something wrong? Why such a big difference with integers?

    Thx for any info,
    Bye!

    ------------------
    May the Force be with you!
    -- The universe tends toward maximum irony. Don't push it.

    File Extension Seeker - Metasearch engine for file extensions / file types
    Online TrID file identifier | TrIDLib - Identify thousands of file formats

  • #2
    Sorry to bother you, but... I received a notification from the forum saying that Tony Burcham has replied, but I can't see any message. What's happening?

    Thx, Bye!





    • #3
      Firstly, benchmarking with simplistic (non-real-world) code like this is prone to misleading results because many factors can influence the result. For example, the brand of CPU can have a big influence on small-loop orientated test code, since the compiled code may execute better on some brands/configurations than others.

      Next, this code is performing 100,000,000 array assignments. To perform that in 0.8 seconds seems implausible in real mode unless you are using a VERY fast PC, so there is likely to be some other factor involved in the supposedly "fast" MS time test, such as a compiler optimization that is able to exploit this simplistic test code. To test this theory, add a PRINT statement into the inner FOR loop and run it again (it'll greatly extend the execution time but you can be sure all the loops are being executed by the MS BASIC version).
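
      A less intrusive way to confirm that every iteration runs is to count the assignments and print the total once the loops finish. The following is a minimal sketch, not part of the original test: the count& counter and the final check of b(10000) are illustrative additions, and the LONG arithmetic will itself add to the measured time, so the timing is only useful for comparing the two compilers on this same variant.
      Code:
      DEFINT A-Z
      DIM a(1000)
      DIM b(10000)
      count& = 0                       ' hypothetical counter, LONG so it cannot overflow
      t! = TIMER
      FOR r = 1 TO 5
          FOR i = 1 TO 1000
              c = 0
              FOR j = 1 TO 10000
                  c = c + 1
                  a(i) = c
                  b(j) = a(i)
                  count& = count& + 2  ' two array assignments per pass
              NEXT j
          NEXT i
      NEXT r
      tt! = TIMER
      PRINT "Assignments executed:"; count&
      PRINT "b(10000) ="; b(10000)     ' last value stored by the inner loop
      PRINT tt! - t!
      If both compilers report 100000000 assignments and the timings still differ widely, skipped loop iterations can be ruled out as the explanation.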

      BTW, in my brief tests, the 100,000,000 (1000 * 10000 * 5 * 2) array assignments take 2.6 seconds without error testing enabled, and 16.5 seconds with it enabled. Optimizing for speed shaves around 0.5 seconds off too, but only if error testing is enabled (it has an almost unmeasurable effect when disabled).

      Now, you mention that you disabled error testing, but as error testing will have a big impact on this type of code (there will be a lot of numeric overflow testing and array bounds testing code), the best way to be sure your results are correct is to use metastatements to control code generation. That way, compiling with the command-line compiler will give the same results as compiling with the IDE.

      Here is my "final" version of your code. It does the 100 million array assignments in about 2.5 seconds on my PC. If I change DEFINT to DEFLNG, it takes 5.1 seconds to run: just on double the time, which reflects the doubled data size (2 bytes -> 4 bytes).

      Please try this variant, and see how it compares with your previous results?

      Thanks!
      Code:
      $error all off
      $optimize speed
      $option cntlbreak off
      $cpu 80386
      $debug map off
      $event off
      $lib all off
      DEFINT A-Z
      DIM STATIC a(1000)
      DIM STATIC b(10000)
      t! = TIMER
      FOR r = 1 TO 5
          PRINT r
          FOR i = 1 TO 1000
              c = 0
              FOR j = 1 TO 10000
                  c = c + 1
                  a(i) = c
                  b(j) = a(i)
              NEXT j
          NEXT i
      NEXT r
      tt! = TIMER
      PRINT tt! - t!


      ------------------
      Lance
      PowerBASIC Support


      • #4
        Originally posted by Lance Edmonds:

        Code:
        $error all off
        $optimize speed
        $option cntlbreak off
        $cpu 80386
        $debug map off
        $event off
        $lib all off
        DEFINT A-Z
        DIM STATIC a(1000)
        DIM STATIC b(10000)
        t! = TIMER
        FOR r = 1 TO 5
            PRINT r
            FOR i = 1 TO 1000
                c = 0
                FOR j = 1 TO 10000
                    c = c + 1
                    a(i) = c
                    b(j) = a(i)
                NEXT j
            NEXT i
        NEXT r
        tt! = TIMER
        PRINT tt! - t!
        There is no need to use the variable 'c' as all it does is run up the 'tic' bill a bit.
        Code:
          FOR j = 1 TO 10000
            a(i) = j
            b(j) = j
          NEXT j
        All in all I'd say this was/is a pretty bad chunk of code to test on.



        ------------------
        C'ya
        d83
        C'ya
        Don

        http://www.ImagesBy.me



        • #5
          Originally posted by Lance Edmonds:
          Firstly, benchmarking with simplistic (non-real-world) code like this is prone to misleading results because many factors can influence the result. For example, the brand of CPU can have a big influence on small-loop orientated test code, since the compiled code may execute better on some brands/configurations than others.
          Yes, you are right. But I ran into this "strange" performance issue (strange in the sense that I expected the program to run faster) in a real-world application, and then simply tried to post some code, as little as possible, to show what I mean.

          some other factor involved in the supposedly "fast" MS time test, such as a compiler optimization that is able to exploit this simplistic test code. To test this theory, add a PRINT statement into the inner FOR loop and run it again (it'll greatly extend the execution time but you can be sure all the loops are being executed by the MS BASIC version).
          You are right again. In fact, that difference is too big. With the software I am porting, the difference is around 2x at most.

          Please try this variant, and see how it compares with your previous results?
          No difference. I had already played with all the optimization switches and metacommands before posting.

          Thanks for your time and the fast reply. I will try to test some more, and I'll keep you informed.

          Thanks!

          Bye!



          • #6
            Originally posted by Don Schullian:
            There is no need to use the variable 'c' as all it does is run up the 'tic' bill a bit.
            I agree... I put it there only to do something more. Removing it changes the final execution time, even with the MS compiler.

            All in all I'd say this was/is a pretty bad chunk of code to test on.
            As I said, this was only a little piece of code to show some difference... The software I was experimenting with, and porting, is much "prettier", I hope. It's basically a MARS for CoreWars, written more than 10 years ago. http://mark0.ngi.it/xrk/index.html

            Bye!


            [This message has been edited by Marco Pontello (edited October 13, 2002).]


            • #7
              Just some more experimenting.
              Sorry for any mistakes; I'm new to PB and just tinkering with the compiler. The company I work for has bought PB/DOS to see if it could help us speed up a lot of old MS BC software we are using.

              Take this little (and simplistic) piece of code:
              Code:
              DEFINT A-Z
              
              DIM Total(8191)
              DIM Total2(8191)
              DIM Total3(8191)
              
              t! = TIMER
              
              FOR a = 1 TO 10000
                  FOR b = 1 TO 8191
                      Total(b) = a
                      Total2(b) = a
                      Total3(b) = a
                  NEXT b
              NEXT a
              
              tt! = TIMER
              PRINT tt! - t!
              Compiling it with PB, I see that there is a big performance hit for every array-element assignment, compared to the EXE generated by MS BC 7.10.
              I tried commenting out each "Totalx()" assignment in turn and measuring the resulting timings. These are the results in seconds, on a P3/550 MHz:
              Code:
                          PB     QBX
              0 assign   1.09   0.27 (only the two nested loops)
              1 assign   2.96   0.60
              2 assign   5.32   0.82
              3 assign   7.68   1.15 (as for the listing above)
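              The table can be reduced to a cost per added assignment line by subtracting the empty-loop time and dividing by three. A minimal sketch using only the timings above; the variable names are illustrative:
              Code:
              ' Timings in seconds, copied from the table (P3/550 MHz)
              pbLoop! = 1.09: pbAll! = 7.68      ' PB: 0 and 3 assignments
              qbLoop! = .27: qbAll! = 1.15       ' QBX: 0 and 3 assignments

              pbCost! = (pbAll! - pbLoop!) / 3   ' seconds per added assignment line
              qbCost! = (qbAll! - qbLoop!) / 3

              PRINT "PB  cost per assignment line:"; pbCost!
              PRINT "QBX cost per assignment line:"; qbCost!
              PRINT "Ratio PB/QBX:"; pbCost! / qbCost!
              This works out to roughly 2.2 seconds per assignment line for PB against roughly 0.29 seconds for QBX, a ratio of about 7.5, with each line executing 81,910,000 times (10000 * 8191).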
              Looking at the code generated by MS BC, I see no nasty tricks:
              Code:
              11:             Total(b) = a                                                  
              4097:0050 8BF0           MOV       SI,AX
              4097:0052 D1E6           SHL       SI,1                                       
              4097:0054 8B0E5AC0       MOV       CX,Word Ptr [C05A]                         
              4097:0058 898C5600       MOV       Word Ptr [SI+0056],CX                      
              12:             Total2(b) = a                                                 
              4097:005C 898C5640       MOV       Word Ptr [SI+4056],CX                      
              13:             Total3(b) = a                                                 
              4097:0060 898C5680       MOV       Word Ptr [SI-7FAA],CX                      
              14:         NEXT b                                                            
              4097:0064 40             INC       AX
              This may explain why I see a 1.5-2x performance difference in my MARS, as the representation of the Core uses 4 or 5 parallel integer arrays of the same size.
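
              The addressing in the listing can be checked by hand: SHL SI,1 doubles the index because each integer element is 2 bytes, and the three displacements sit one array apart. A minimal sketch of that arithmetic, assuming the default lower bound of 0 (so DIM Total(8191) holds 8192 elements); the variable names are illustrative:
              Code:
              DEFLNG A-Z
              elemSize = 2                     ' bytes per INTEGER element
              elements = 8191 + 1              ' indices 0..8191
              stride = elements * elemSize     ' bytes occupied by one array

              disp1 = &H56                     ' displacement of Total() in the listing
              PRINT HEX$(stride)               ' 4000
              PRINT HEX$(disp1 + stride)       ' 4056, displacement of Total2()
              PRINT HEX$(disp1 + 2 * stride)   ' 8056, displacement of Total3()
              PRINT HEX$(65536 - &H7FAA)       ' 8056, i.e. -7FAA as an unsigned 16-bit value
              So the three arrays appear to be laid out back to back, and one scaled index in SI serves all three stores.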

              Bye!

              [This message has been edited by Marco Pontello (edited October 14, 2002).]


              • #8
                Compiling it with PB, I see that there is a big performance hit for every array-element assignment, compared to the EXE generated by MS BC 7.10
                Of course there is, because there is no such thing as a free lunch.

                Of course, PB does not impose a restriction on array element size when the array goes over 128Kb, either (assuming you even remember to compile with the correct command-line switch).. Oh, and MS Basic does not allow you to access the array elements via pointer variables.

                What you are doing here is analyzing a specific routine; your "results" cannot be applied generally. That's like saying because you don't like the taste of kumquats, all fruit is ill-tasting.

                My point, yet again? "Do nothing" loops are in no way, form, shape or manner any kind of valid "compiler comparison."

                MCM
                PS: If you really want to optimize that routine, you may contact my office for rates and availability


                Michael Mattias
                Tal Systems Inc. (retired)
                Racine WI USA
                http://www.talsystems.com



                • #9
                  Ehm...
                  Sorry, I think it's my fault for not being able to express myself well in English.
                  Let me try to make it clear: I was not, by any means, "criticizing" PB/DOS in general! I think it's a very good product, and I was very impressed by the number of features it puts on the table.
                  I was also very impressed by the general speed and size of the executables it creates.
                  I had tried the free/demo 3.2 version before, and even the FirstBasic "edition", and I suggested it to the company where I work.

                  In fact, the good impression I had using PB/DOS, and the level of support and commitment shown by the PB team in the forum, are the reasons I posted and pointed out (asking for confirmation or otherwise) what I thought was a slightly strange thing. Nothing more, nothing less.

                  And I don't think I used bad language or a bad "attitude" in expressing the concept. Aside, indeed, from my terrible written English (which I pointed out clearly on the first line of my first post!).

                  So, I don't really understand the "tone" of your reply.

                  Originally posted by Michael Mattias:
                  Of course there is, because there is no such thing as a free lunch.
                  Of course there must be. That is why I wrote "big performance hit" and not "performance hit".

                  What you are doing here is analyzing a specific routine; your "results" cannot be applied generally. That's like saying because you don't like the taste of kumquats, all fruit is ill-tasting.
                  I agree. Indeed, I did not generalize. I noted that performance hit in a very specific situation (working with integer array elements, in very tight loops) and presented it. I have nothing to say against the speed of PB/DOS in any other situation or aspect, and I don't think I wrote anything different.

                  My point, yet again? "Do nothing" loops are in no way, form, shape or manner any kind of valid "compiler comparison."
                  "Do NEAR nothing" code may have some use in some particular and very specific situations, if only to see what code is generated in that circumstance...

                  PS: If you really want to optimize that routine, you may contact my office for rates and availability
                  WOW! I suppose I have to smile!
                  Thanks for the offer. I think I already have a 22 KB x86 ASM source that is optimized enough, if that was the whole point. No need to bother your office!


                  In the end, believe me, I have no intention of "trolling" or starting a flame thread. I sincerely hope there was simply a problem with my English! Sorry to all for that.

                  Bye!


                  [This message has been edited by Marco Pontello (edited October 14, 2002).]


                  • #10
                    Originally posted by Marco Pontello:
                    Sorry to bother you, but... I received a notification from the forum saying that Tony Burcham has replied, but I can't see any message. What's happening?

                    Thx, Bye!
                    Hi Marco,
                    I did post a reply, but then decided I should let the
                    experts on compiler theory answer this one.

                    Tony Burcham



                    ------------------
                    TheirCorp's projects at SourceForge

                    TheirCorp's website



                    • #11
                      Originally posted by Tony Burcham:
                      Hi Marco,
                      I did post a reply, but then decided I should let the
                      experts on compiler theory answer this one.
                      Later, I figured out what may have happened.
                      Thanks for the message and the clarification, Tony!

                      Bye!

