Announcement

Collapse
No announcement yet.

Weird performance boost

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • #21
    Originally posted by Paul Dixon View Post
    Neil,

    1) You drop out of the "g" loop one too soon

    Are you sure? The g loop is 162 to 2 and I exit when it gets decremented to 1. You appear to exit when it reaches 0 so you include an iteration at g=1 which was not there in the original BASIC loop.
    In the original, it was doing two tests inside the loop.
    Originally posted by Paul Dixon View Post
    2) the EXIT drops out of the f loop, not the g loop.
    Guilty as charged. I thought the speed improved a lot more than expected with that change but since it gave the right answer I didn't investigate.

    Paul.
    It gave the right answer on the first line because that hadn't really changed anything but was the complete "score" for the set unchanged. Hence the inclusion of the second line. It's no bad thing. If it had worked right first time I wouldn't have made the effort to understand exactly what it was doing.
    Neil Croft (cissp)

    Comment


    • #22
      Originally posted by Paul Dixon View Post
      Neil,
      Slight additional speed up.
      Code:
      FOR e&=48 TO d&+1 STEP -1
      	e2&=e&*163
      	!mov esi,nptr '<---add this line to make use of the now unused esi register
      
      	FOR f&=49 TO e&+1 STEP -1
      replace these 3 lines:
      Code:
      !mov ecx,edi 'g&             'offset to current row
      !sal ecx,2              '4 bytes per entry in array
      !add ecx,nptr           'add address of array
      with this one:
      Code:
      !lea ecx,[edi*4+esi]        'faster address calculation
      Paul.
      I shall amend and test.
      Neil Croft (cissp)

      Comment


      • #23
        Originally posted by Michael Mattias View Post
        > Here's my revised code and it works........shortcuts on the addition line

        Aren't you STILL doing all the additions?

        Can't you ...
        ! cmp eax, 2
        after each
        !add eax,[ecx+edx*4]

        ..and jump somewhere if > 2?

        MCM
        Yes, you're quite correct and the currently running code (without Paul's additional improvements actually does this. I'll post revised shortly when I've added Paul's other bits.
        Neil Croft (cissp)

        Comment


        • #24
          Latest function with Paul's ASM and Michael's drop out shortcut.

          Code:
          THREAD FUNCTION MyThread (BYVAL dummy AS LONG) AS LONG
              DIM nptr AS LONG POINTER
              LOCAL a&,b&,c&,d&,e&,f&,g&,z&
              LOCAL a2&,b2&,c2&,d2&,e2&,f2&,sc&
              nptr=VARPTR(n(0,0))
              tst&=0
              dlt&=0
              FOR z&=161 TO 1 STEP -2
                  sc&=0
                  FOR a&=44 TO 1 STEP -1
                      a2&=a&*163
                      FOR b&=45 TO a&+1 STEP -1
                          b2&=b&*163
                          FOR c&=46 TO b&+1 STEP -1
                              c2&=c&*163
                              FOR d&=47 TO c&+1 STEP -1
                                  d2&=d&*163
                                  FOR e&=48 TO d&+1 STEP -1
                                      e2&=e&*163
                                      !mov esi,nptr '<---add this line to make use of the now unused esi register
                                      FOR f&=49 TO e&+1 STEP -1
                                          f2&=f&*163
                                           !mov edi,162   'use register edi to hold g&
                                          #ALIGN 32
                                          lp1:
                                              !cmp edi,z&
                                              !je skip1               'if = then skip the calculation
                                              !lea ecx,[edi*4+esi]        'faster address calculation
                                              !mov edx,a2&
                                              !mov eax,[ecx+edx*4]    'get a
                                              !mov edx,b2&
                                              !add eax,[ecx+edx*4]    'add b
                                              !mov edx,c2&
                                              !add eax,[ecx+edx*4]    'add c
                                              !cmp eax,3
                                              !je isc
                                              !mov edx,d2&
                                              !add eax,[ecx+edx*4]    'add d
                                              !cmp eax,3
                                              !je isc
                                              !mov edx,e2&
                                              !add eax,[ecx+edx*4]    'add e
                                              !cmp eax,3
                                              !je isc
                                              !mov edx,f2&
                                              !add eax,[ecx+edx*4]    'add f
                                              !cmp eax,3              '=3
                                              !jne skip1              'no.
                                              isc:
                                              !inc sc&                'yes, >2 so inc sc& and then ...
                                              !jmp ngloop                    '... exit the loop
                                              skip1:
                                              !dec edi
                                              !cmp edi,0
                                              !jne lp1
                                         ngloop:
                                      NEXT
                                  NEXT
                              NEXT
                          NEXT
                      NEXT
                  NEXT
                  IF sc&>tst& THEN tst&=sc&:dlt&=z&
              NEXT
              FUNCTION=-1
          END FUNCTION
          Neil Croft (cissp)

          Comment


          • #25
            I'd love to load up this program into some kind of magic meter which would show me a little screen like

            Code:
              Original Execution time     nnnnn
              New Execution time         nnnnnn
              %improvement due to use of assembly language per se        XX
              %improvement due to use of better alogorithm and programming techinque YY
            Any bets against "YY" being greater than "XX?"

            I thought not.
            Michael Mattias
            Tal Systems (retired)
            Port Washington WI USA
            [email protected]
            http://www.talsystems.com

            Comment


            • #26
              Calculating the pointer indexes had about as much effect as splitting the adds but that in turn made the ASM simple (well, simple enough for me to understand).

              The algorithm v implementation argument is as old as the hills and one that will rage forever I suspect. I do believe, based upon my own, limited, experience that a better algorithm wins every time but I'm not aware of any compiler or software that'll write a better algorithm. Now, if someone could come up with a pre-compiler that went through code that said "You don't want to do it that way, you want to do it this way", that would be one rich developer.

              On this problem, I recently stumbled upon a series of blog posts where a C++ chap had approached it differently. His code was fast but used vast amounts of memory and didn't even manage to achieve the currently known best value either.

              Regardless of whether this is the "right" way to solve the problem or not, I've had a simple improvement (splitting the adds) pointed out to me that will hopefully make me look at my code slightly differently in the future. I have also had my very limited knowledge of pointers and ASM greatly improved and therefore, as I've said before on here, learning anything is a "good thing"(tm). I just wish I could justify the money to get Knuth's books on programming and algorithms

              Current best progress on this particular problem is
              Code:
              07-31-2009 16:03:22 13721577  262239 98.1247% 1
              which is better than my Monte Carlo runs given I restarted the runs yesterday. I shall now go and look once more at my bit array attempts to see why they can't be as quick.
              Neil Croft (cissp)

              Comment


              • #27
                > but I'm not aware of any compiler or software that'll write a better algorithm.

                Compilers don't write application algorithms; programmers write application algorithms.

                Now, if someone could come up with a pre-compiler that went through code that said "You don't want to do it that way, you want to do it this way", that would be one rich developer.
                The precompiler you want is available now..right in your home or office, at zero cost.

                I think it's marketed under the brand name " THINKING BEFORE YOU WRITE THAT FIRST LINE OF CODE."


                MCM
                Michael Mattias
                Tal Systems (retired)
                Port Washington WI USA
                [email protected]
                http://www.talsystems.com

                Comment


                • #28
                  Originally posted by Michael Mattias View Post
                  I think it's marketed under the brand name " THINKING BEFORE YOU WRITE THAT FIRST LINE OF CODE."
                  I looked on Amazon but I couldn't find that.

                  At the end of the day, ASM is just another language. Bad code is bad code whatever language you use. Ditto, good code is good code. I just hope to glean enough from these forums to move further away from the former and closer to the latter.
                  Neil Croft (cissp)

                  Comment


                  • #29
                    Michael,
                    if ASM improves the speed by 50% then it'll likely improve the speed of bad algorithms by 50% and good algorithms by 50%. Either way, it's 50% better.

                    Paul.

                    Comment


                    • #30
                      >if ASM improves the speed by 50%

                      Fifty percent compared to WHAT?

                      Hack ASM is not going to be 50% faster than quality BASIC.

                      And no language is going to be fast with bad algorithms.

                      I think it's safe to say this thread is NOT a demonstration of "when you can write better ASM than BASIC".... oh, gee whiz, I hope it isn't anyway. Bad BASIC makes bad COBOL look good.
                      Michael Mattias
                      Tal Systems (retired)
                      Port Washington WI USA
                      [email protected]
                      http://www.talsystems.com

                      Comment


                      • #31
                        > Either way, it's 50% better.

                        No, it's not.. it's 50% faster.
                        Michael Mattias
                        Tal Systems (retired)
                        Port Washington WI USA
                        [email protected]
                        http://www.talsystems.com

                        Comment


                        • #32
                          Michael,
                          Fifty percent compared to WHAT?
                          Compared to what it was before it was augmented with ASM.
                          Good BASIC does the job in X
                          Good BASIC with a bit of ASM does the same job in X/2

                          Bad BASIC does the same job in Y where Y>>X
                          Bad BASIC with a bit of ASM does the same job in Y/2

                          When you need something to run fast then a good algorithm helps and a bit of ASM helps more. You can't just ignore the help from one because the other is yet perfect.

                          Paul.
                          Last edited by Paul Dixon; 31 Jul 2009, 06:21 PM.

                          Comment


                          • #33
                            Michael,
                            in a case such as this, how good something is is determined by how fast it is so 50% faster is 50% better. That's not to say you can't make it better still with a different algorithm, but as things stand at this point of this thread, the current code is 50% better than it was at the start of the thread.

                            Paul.

                            Comment


                            • #34
                              Neil,
                              I don't think bit arrays are going to help at all. Your array is small enough that it easily fits in the CPU cache so there's going to be no access speed advantage with bit array but you will have the disadvantage of having to work with bits which is inherently slower.

                              I think you'd be better off looking at my original suggestion of moving the g& loop to outside the a& loop. It will allow you to calculate sub totals once only instead of every time and there are other conditions you can test to exit early which will save you as much time as save now by ending the g& loop early.

                              The following code works for the first set (but gives a different answer fore the second, I haven't worked out why yet) but it's entirely in BASIC (MCM will be happy!) and runs in a similar time on my machine to the last version you posted.

                              It could be improved further by checking if the sum has already reached 3 before some of the loops or by converting parts to ASM and losing style points from Michael.

                              Paul.
                              Code:
                                  FOR z&=161 TO 1 STEP -2
                                      '
                                      sc&=0
                                          FOR g&=162 TO 1 STEP -1
                                              IF g&<>z& THEN
                              
                                              FOR a&=44 TO 1 STEP -1
                                                  aSum&[email protected][g& OF 162,a& OF 49]
                                                  FOR b&=45 TO a&+1 STEP -1
                                                      bSum& = aSum& + @nptr[g& OF 162,b& OF 49]
                                                      FOR c&=46 TO b&+1 STEP -1
                                                          cSum& = bSum& + @nptr[g& OF 162,c& OF 49]
                                                          FOR d&=47 TO c&+1 STEP -1
                                                              dSum& = cSum& + @nptr[g& OF 162,d& OF 49]
                                                              IF dSum&>0 THEN   'if it's not > 0 by now then it will never exceed 2 in the last loop so skip the next 2 loops
                                                              FOR e&=48 TO d&+1 STEP -1
                                                                  eSum& = dSum& + @nptr[g& OF 162,e& OF 49]
                                                                  IF eSum&>1 THEN  'if it's not > 1 by now then it will never exceed 2 in the next loop so skip the next loop
                              
                                                                      FOR f&=49 TO e&+1  STEP -1
                                                                      fSum& = eSum& + @nptr[g& OF 162,f& OF 49]
                              
                                                                      IF fSum& >2 THEN INCR sc&
                              
                                                                      NEXT
                                                                  END IF
                              
                                                              NEXT
                                                              END IF
                                                          NEXT
                                                      NEXT
                                                  NEXT
                                              NEXT
                              
                                             END IF
                              
                                          NEXT
                                      '
                                      IF sc&>tst& THEN tst&=sc&:dlt&=z&
                                      '
                                  NEXT

                              Comment


                              • #35
                                It is a fallacy that faster is ALWAY better.

                                It all depends on what is important in this application. Size? Resource usage? Maintainability? Features? Development expense? Deadline?

                                Developers must always balance competing needs.

                                Look at, let's say, the much-maligned Microsoft Visual BASIC. Is there anyone who can't give VB a good score for "ease of GUI development" and "how quickly I can have my screen working?"

                                Of course, no one will say that development speed and ease of development comes for free: An MS-VB application is a dog to deploy and often runs like a pig. But by golly, it's done on time and it looks good - which may be plenty good enough!

                                MCM
                                Michael Mattias
                                Tal Systems (retired)
                                Port Washington WI USA
                                [email protected]
                                http://www.talsystems.com

                                Comment


                                • #36
                                  Michael,
                                  It all depends on what is important in this application. Size? Resource usage? Maintainability? Features? Development expense? Deadline?
                                  Important in this application? That'd be speed.

                                  Paul.

                                  Comment


                                  • #37
                                    Originally posted by Paul Dixon View Post
                                    Michael,

                                    Important in this application? That'd be speed.

                                    Paul.
                                    That would be correct.

                                    This isn't something I'll look at in 2 years and wonder what the heck G& is supposed to represent so readability isn't important.

                                    There isn't a machine here with less than 2gig (and only the one with less than 4 gig) of RAM so memory constraints aren't relevant.

                                    It's my hobby (like Sudoku or Crosswords) and I program in my free time so development costs are zero.

                                    Michael is quite correct that there are many reasons to develop code in many ways. I was once a climbing instructor on a "management training" course for part of the software team of a large aircraft manufacturer. I had never seen any team before or since who could write so small and neatly on a whiteboard. Over some beers we discussed software development. They told a few horror stories (coding metres instead of feet or vice versa and how high inside an aircraft the nose wheel comes when you land with that bug.) Their requirements were absolute accuracy, complete readability and fairly quick development cycles. Performance really wasn't that high on their list of needs. Obscure, CPU specific opcodes were a big no-no. Easily maintained generic libraries was their game.

                                    It's also true that a notepad and pen should always be the first tools of programming anything other than a trivial home-use utility. ASM is just another one of the tools in the box, like a different spanner is to a mechanic. It should neither be dismissed as irrelevant or held to be the only way to do the job.

                                    Someone once said, if the only tool in your toolkit is a hammer, every problem looks increasingly like a nail.
                                    Neil Croft (cissp)

                                    Comment


                                    • #38
                                      Originally posted by Michael Mattias View Post
                                      An MS-VB application ... done on time and it looks good
                                      I've worked for a couple of large (global) IT companies since 1996 and I can vouch that's not actually always true. A case in point was a George III mainframe replacement program for Y2K which someone suddenly realised in October 1999 wasn't going to be done on time. It is fair to say in that case however that it was the project management and client's fault more than the programmers and language choice.

                                      I would agree it's a dog to deploy though.

                                      And don't get me started on MS load balancing and what "damage" that can do to an RFC compliant network.
                                      Neil Croft (cissp)

                                      Comment

                                      Working...
                                      X