What is the fastest PB command to read data at a position in a large string


  • Just looking at the code, which takes 32.766 seconds on an i7-2600K.

    Added a global counter just to see if there is anything obvious.
    The cancel check takes about 5 seconds.
    INCR gcounter
    'IF master$="exit" THEN GOTO 99 'mike remarked this line 234

    endtime = TIMER
    ? USING$("gmaster #, time ##.###",gcounter,endtime-starttime) '123,170,400 times

    Not sure how critical the time is.
    Use [ code ] [ /code ] in source code before pasting to keep indentation


    • Hi Mike, sorry... the master$="exit" line can simply be commented out, as it is in my main application. (There was also a discussion about removing the DIALOG DOEVENTS, but I need it so that the processing dialog updates. In my main code, as opposed to the standalone I provided, having the DIALOG DOEVENTS only in the main iz loop is not enough, so I also have it in the FOR loop below that, but only activate it on every 20th iteration, which works well enough to update the dialog.)


      • Dean,
        You forgot to REM out both "dum = ASC(file3d$, ..." lines.
        About a 20% speed increase...


        Mike told me that those lines were already REMed out in the download he got.
        He is right, I surely mixed something up...
        Got it: I un-REMed them while redoing the indentation...


          • By changing master$ to a LONG instead of a string, I gained nearly 2%.
            Unneeded, judging by the posts above...


          • MOD is slow; sometimes you can replace it with AND, which is about 8 times faster. This is the case for MOD 256:

            iz2 = amigall MOD 256
            is the same as
            iz2 = amigall AND 255

            Sadly, since it's not called often, it does not give much speed improvement...


            Added:
            Same thing for integer division (\), which is about 50% faster than FIX() with floating-point division (/):

            iz1 = FIX((amigall) / 256)
            is slower than, but gives the same result as,
            iz1 = amigall \ 256
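
            A quick way to check that both replacements are safe before relying on them. This is only a minimal sketch, assuming amigall is a non-negative LONG; the 0 to 65535 test range is an assumption, not something taken from the original program. For negative values MOD 256 and AND 255 can give different results, so the shortcut only applies where the value cannot go negative.

            ```powerbasic
            #COMPILE EXE
            #DIM ALL

            FUNCTION PBMAIN() AS LONG
              LOCAL amigall AS LONG, mismatches AS LONG

              ' Compare the slow and fast forms over every non-negative 16-bit value.
              FOR amigall = 0 TO 65535
                IF (amigall MOD 256) <> (amigall AND 255) THEN INCR mismatches
                IF FIX(amigall / 256) <> (amigall \ 256) THEN INCR mismatches
              NEXT amigall

              ? "Mismatches for 0..65535:" + STR$(mismatches)

              ' Show both forms for a negative value, where they are not interchangeable.
              amigall = -1
              ? "-1 MOD 256 =" + STR$(amigall MOD 256) + ", -1 AND 255 =" + STR$(amigall AND 255)
            END FUNCTION
            ```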


            • I am not using any of the string reversal, swap or rotate code.
              Still using MID$ instead of pointers or PEEK/POKE.
              Hopefully there is a control file to compare against while testing, to make sure that dotting an i or crossing a t doesn't introduce an error:
              FC Newfile ControlFile

              Here are two more small ones:
              Code:
              ' temp$=CHR$(255)+CHR$(254) 'line 171
              ' zscan$=""
              ' zzscan$=""
              ' FOR i=1 TO ngz
              '  zscan$=zscan$+temp$
              ' NEXT i
              
              zscan$ = REPEAT$(ngz,CHR$(255,254))
              
              
              
              'temp$=CHR$(iz1,iz2) 'line 279 not always needed (removed)
              IF iz<migsample1 THEN
               MID$(zzscan$,iy1ngx2+(ix-1)*2+1,2)=MID$(file3d$,iz1ngxngy2+iy1ngx2+(ix-1)*2+1,2)
              ELSE
               'MID$(zzscan$,iy1ngx2+(ix-1)*2+1,2)=temp$  '(removed)
                MID$(zzscan$,iy1ngx2+(ix-1)*2+1,2)=CHR$(iz1,iz2) '(changed to)
              END IF


              • Potato vs. potatoe: writing records in reverse.
                Use STEP instead of calculating the start byte of each record inside the loop.

                'reorder the writing of the volume planes line 300
                '
                Code:
                  KILL "mig-"+autoname$
                  OPEN "mig-"+autoname$ FOR BINARY AS #3
                
                  'FOR i= 1 to ngz
                  ' PUT$#3, MID$(file3d$,(ngz-i)*ngxngy2+1,ngxngy2)
                  ' NEXT i
                  'CLOSE#3
                
                  FOR i=(ngz-1)*ngxngy2+1 TO 0 STEP -ngxngy2
                   PUT$#3, MID$(file3d$,i,ngxngy2)
                  NEXT i
                  CLOSE#3
                  '
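
                To check that the STEP loop visits the same start bytes as the original (ngz-i)*ngxngy2+1 calculation, the two can be compared side by side. This is only a minimal sketch: the values given to ngz and ngxngy2 are hypothetical, and k is a helper counter that is not in the original code.

                ```powerbasic
                #COMPILE EXE
                #DIM ALL

                FUNCTION PBMAIN() AS LONG
                  LOCAL i AS LONG, k AS LONG, diffs AS LONG
                  LOCAL ngz AS LONG, ngxngy2 AS LONG

                  ngz = 5 : ngxngy2 = 12        ' hypothetical sizes, for illustration only

                  ' k plays the role of i in the original "FOR i = 1 TO ngz" loop.
                  k = 1
                  FOR i = (ngz - 1) * ngxngy2 + 1 TO 0 STEP -ngxngy2
                    ' i is the start byte from the STEP loop; compare it with the old formula.
                    IF i <> (ngz - k) * ngxngy2 + 1 THEN INCR diffs
                    INCR k
                  NEXT i

                  ? "Planes visited:" + STR$(k - 1) + ", differing offsets:" + STR$(diffs)
                END FUNCTION
                ```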


                • Thanks Pierre and Mike... I learned a few more programming tricks that I did not know. Unfortunately, as mentioned, things do not get much faster because those lines are not called much. I think we have more or less reached the top of the speed curve, and the difference between where we were last week and now is big and important for us: it makes 3D processing more viable in the time domain. Have a good rest of the week...


                  • Yep, with only necessary calculations done, it becomes hard to optimize.
                    FOR/NEXT loops are already fast, faster than DO/LOOP or WHILE/WEND, as I have often seen.
                    4,200,000 loop iterations with many calculations is a fairly heavy task...
                    Going to ASM might help... It will require some work, though.
                    Let's hope a good idea is around the corner, for the fun of coding...


                    • "with only necessary calculations done, it become hard to optimize."

                      I think there is a LOT of unnecessary calculation being done.
                      I'm sure the speed could be doubled with a bit of thought.


                      • Great, with a little luck, Dean will go down to 14 seconds...


                        • deleted


                          • On my machine, I replaced both PEEK sections as you did.
                            I also replaced the width section following your code.

                            I ran the test twice for each configuration...
                            Old code: 47 and 49 seconds
                            Your new code: 49 and 49 seconds

                            So in my case, the time taken by the extra variable kills the gain.
                            Are you sure about your test?
                            60 to 40 is impressive!


                            • Originally posted by Pierre Bellisle
                              On my machine, I replaced both PEEK sections as you did.
                              I also replaced the width section following your code.

                              I ran the test twice for each configuration...
                              Old code: 47 and 49 seconds
                              Your new code: 49 and 49 seconds

                              So in my case, the time taken by the extra variable kills the gain.
                              Are you sure about your test?
                              60 to 40 is impressive!

                              I guess you are referring to my deleted message. I found the increase hard to believe, so I checked and found I'd made a stupid typo that produced the supposed gain.
                              (I typed an equals sign instead of a minus!)

                              That's why I deleted it.


                              • The next big saving will come from changing how the code works, not from tweaking it.

                                The innermost loop is run 900 million times.
                                The endian swap is done 490 million times.
                                Yet the initial file3d$ string is only 4.2 million words long! Why are there 490 million swaps when there are only 4.2 million unique values to be swapped?
                                Why isn't the string swapped once in advance?

                                The tempvalue in the innermost loop only takes on 3.9 million possible values yet it's calculated 490 million times.

                                Some of the calculations done in the innermost loop need to be removed completely and put outside of the 3 outer loops.
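
                                As an illustration of swapping the string once in advance, here is a minimal sketch. It assumes file3d$ holds big-endian 16-bit samples; the three sample values, the WORD pointer and the use of ROTATE are illustrative choices and not the code from the program under discussion.

                                ```powerbasic
                                #COMPILE EXE
                                #DIM ALL

                                FUNCTION PBMAIN() AS LONG
                                  LOCAL file3d AS STRING
                                  LOCAL pw AS WORD PTR
                                  LOCAL w AS WORD
                                  LOCAL i AS LONG, nwords AS LONG

                                  ' Stand-in data: three big-endian 16-bit samples (hypothetical values).
                                  file3d = CHR$(&H12, &H34, &HAB, &HCD, &H00, &H01)
                                  nwords = LEN(file3d) \ 2

                                  ' Swap the two bytes of every word exactly once, in place.
                                  pw = STRPTR(file3d)
                                  FOR i = 0 TO nwords - 1
                                    w = @pw[i]
                                    ROTATE LEFT w, 8          ' rotating a 16-bit value by 8 bits swaps its bytes
                                    @pw[i] = w
                                  NEXT i

                                  ' From here on, every read is a plain word read with no swap.
                                  ? HEX$(@pw[0], 4) + " " + HEX$(@pw[1], 4) + " " + HEX$(@pw[2], 4)
                                END FUNCTION
                                ```

                                With this arrangement the swap cost scales with the 4.2 million words in the string rather than with the 490 million reads in the inner loop.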


                                • Background: 3D migration. We have a 3D volume and, given the velocity of microwaves in the volume (the ground), a 3D hyperboloid (whose shape is narrow or wide depending on the velocity) intersecting the volume. We need to add up all the cells that the hyperboloid intersects and place the sum in the cell at the apex of the hyperboloid. We have to move the hyperboloid through every cell in the volume and repeat this addition. We use an angle division of 10 degrees in the example .bas, but one can use a setting of 1 degree, or even 90 degrees to look at portions of the hyperboloid. That is why there are so many loops and why the processing is so heavy...


                                  • Dean,
                                    what I'm suggesting is to move the more complex rotation of the hyperbola to the outside loop; the inside loops then become relatively quick summations.

                                    Code:
                                    'As it is now
                                    'scan every point in the 3-dimensional block
                                    FOR x = xStart TO xFinish
                                      FOR y = yStart TO yFinish
                                        FOR z = zStart TO zFinish

                                          'now, for each point x,y,z, create a hyperbola with a vertex at that point
                                          ' and rotate that hyperbola through 360 degrees to sum all points that the resulting hyperboloid intersects.
                                          FOR Angle = 0 TO 360
                                            'calculate the rotated hyperbola to give the next section of the hyperboloid to sum with all the other sections
                                            'This is the more computationally intensive part
                                            'The hyperboloid is being calculated millions of times but it's the same hyperboloid every time
                                          NEXT Angle
                                        NEXT z
                                      NEXT y
                                    NEXT x



                                    'Instead.. do the hyperboloid first
                                    FOR Angle = 1 TO 360
                                      'calculate the rotated hyperbola to give the next section of the hyperboloid

                                      FOR x = xStart TO xFinish
                                        FOR y = yStart TO yFinish
                                          FOR z = zStart TO zFinish
                                            'now, for each point x,y,z, sum the points that intersect the hyperbola
                                            'It's only a partial sum and will need to be accumulated on each rotation of the hyperbola
                                            'Now, the part done millions of times is just a simple summation. Not the more complex rotation and summation.
                                          NEXT z
                                        NEXT y
                                      NEXT x
                                    NEXT Angle
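
                                    The effect of the reordering can be seen by counting how often each part runs. This is only a minimal sketch: the 8x8x8 volume and the counters are hypothetical, and the COS/SIN pair stands in for whatever per-angle calculation the real code performs.

                                    ```powerbasic
                                    #COMPILE EXE
                                    #DIM ALL

                                    FUNCTION PBMAIN() AS LONG
                                      LOCAL a AS LONG, x AS LONG, y AS LONG, z AS LONG
                                      LOCAL nAngles AS LONG, angleSetups AS LONG, innerSteps AS LONG
                                      LOCAL rad AS DOUBLE, ca AS DOUBLE, sa AS DOUBLE

                                      nAngles = 36                 ' 10-degree steps, as in the example .bas

                                      FOR a = 0 TO nAngles - 1
                                        ' The expensive per-angle work is done once per angle...
                                        rad = a * 10 * 3.14159265358979# / 180
                                        ca = COS(rad) : sa = SIN(rad)
                                        INCR angleSetups

                                        ' ...and the voxel loops only reuse ca and sa.
                                        FOR x = 1 TO 8             ' tiny hypothetical volume
                                          FOR y = 1 TO 8
                                            FOR z = 1 TO 8
                                              INCR innerSteps      ' placeholder for the partial summation
                                            NEXT z
                                          NEXT y
                                        NEXT x
                                      NEXT a

                                      ? "Angle setups:" + STR$(angleSetups) + ", inner steps:" + STR$(innerSteps)
                                    END FUNCTION
                                    ```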


                                    • Another alternative, if the apex of the hyperbola always points vertically upwards, is to forget about rotating the hyperbola.
                                      A horizontal cross section through the hyperboloid is a circle, and a circle is much easier to handle.
                                      The radius of the circle can be calculated from the hyperbola and the distance from the apex.

                                      This would avoid the problem you have now, where you need to check whether points have already been done and exclude them.
                                      That happens near the apex: there are only a few possible points immediately below it, but scanning in 10-degree steps means you hit those few points 36 times.

                                      At the same time, you're likely missing points far from the apex, as 10-degree steps are too coarse: they give only 36 points where there could be hundreds.

                                      Using circles, you know which points to include and don't need to check that they haven't already been done.
                                      Scan each layer as circles then sum the circles which form part of each hyperboloid.
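
                                      A minimal sketch of the circle idea, under stated assumptions: the hyperbola is taken to be z^2/a^2 - r^2/b^2 = 1 with its apex at z = a, and a, b, the function name RingRadius and the half-cell tolerance are all hypothetical. The real shape would come from the velocity, which is not modelled here.

                                      ```powerbasic
                                      #COMPILE EXE
                                      #DIM ALL

                                      ' Radius of the horizontal circle at distance d below the apex, for the
                                      ' hyperbola z^2/a^2 - r^2/b^2 = 1 whose apex sits at z = a.
                                      ' a and b are hypothetical shape parameters, measured in cells.
                                      FUNCTION RingRadius(BYVAL d AS DOUBLE, BYVAL a AS DOUBLE, BYVAL b AS DOUBLE) AS DOUBLE
                                        FUNCTION = b * SQR(((a + d) / a) ^ 2 - 1)
                                      END FUNCTION

                                      FUNCTION PBMAIN() AS LONG
                                        LOCAL d AS LONG, dx AS LONG, dy AS LONG, cells AS LONG
                                        LOCAL r AS DOUBLE, report AS STRING

                                        FOR d = 1 TO 4                        ' layers below the apex
                                          r = RingRadius(d, 2, 3)
                                          cells = 0
                                          ' Visit each cell of the layer once and keep those lying on the circle.
                                          FOR dx = -CEIL(r) TO CEIL(r)
                                            FOR dy = -CEIL(r) TO CEIL(r)
                                              IF ABS(SQR(dx * dx + dy * dy) - r) <= 0.5 THEN INCR cells
                                            NEXT dy
                                          NEXT dx
                                          report = report + "layer" + STR$(d) + ": r = " + FORMAT$(r, "0.00") + "," + STR$(cells) + " cells" + $CRLF
                                        NEXT d

                                        ? report
                                      END FUNCTION
                                      ```

                                      Because each cell of a layer is tested once, no cell can be counted twice, and the number of cells on the ring grows with the radius instead of staying fixed at 36.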

                                      But maybe I haven't fully understood the task so this may all be wrong!
