No announcement yet.

PB source and square roots

  • Filter
  • Time
  • Show
Clear All
new posts

  • PB source and square roots

    I recently wrote a program which involved taking the
    square roots of a sequence of numbers until one of them was
    found to be an integer. In about half of these cases, all the
    numbers in the sequence were multiples of 4, and it occurred to
    me that I could work with the numbers divided by 4, instead of
    as they first occurred in the sequence. I expected this to
    produce a (tiny) increase in speed. However, the main loop of
    the program appears to take 16/15 as long to find the integer as
    it did before, and that's before you take into account the mult.
    and div. by 4 at the start and end.

    I'm using PowerBASIC 3.2 for DOS. My machine is a 133MHZ
    Pentium. The metastatements I used were:

    $CPU 80386

    (it occurs to me that $CPU 80486, $CPU PENTIUM, etc. would be
    good metastatements for future versions.)

    The loop kept repeating these two lines:

    40 incr s&&, v&&: incr v&&, x: s##=sqr(s&&): if s##=int(s##) then 50
    45 goto 40

    Now, I think that the most likely explanation for the loss of
    speed is that the square root algorithm is taking longer, so
    my first question is: can anybody think of another reason?

    Secondly, I know that root extraction involves an iteration.
    Does anyone know what starting value PB uses for this iteration
    when trying to extract the square root of a quad integer? Plus,
    does anyone know what algorithm PB uses to extract the root?

    This is mainly to help me find out what happened, but also
    because I'd like to construct my own square root algorithm that
    would allow the user to specify the iteration's start value.

    Finally, is there any way to purchase the source code for your
    version of PB? This is standard practice for vendors of C and
    C++ and doesn't appear to have done their sales any harm.


    James McLaughlin.

  • #2
    since you're using floats and you have a FPU in your computer the compiler won't be using an iterative method or any other method. It'll just load the FPU with the value, execute the FSQRT instruction then read the result from the FPU.

    As for the speeed, if you post the before and after code it'd be easier to point to the "problem". My first guess is that you ARE running faster by nearly 50% but you're finding twice as many roots and don't realise it.

    I doubt very much if the squre root takes a different time in different cases. On my CPU it takes 35 clks to do an extended precision square root regardless of the value.




    • #3
      The way you approach a problem often is more important than the
      method you use in implementing it.

      Since you are interested in squareroots that are integers,then
      you would be better served by incrementing an integer, then
      multiplying it by itself, to identify each number that has an
      integer as its square root. From 1 you get 1; increment to 2
      and you get four; increment to 3 and you get 9; and so on.

      You can exhaust the possibilities much faster this way. and by
      calculating the different from one squared integer to the next,
      you can calculate how many cycles you saved by this method, or
      with what percentage that squares of integers have in the real
      real number set.

      Old Navy Chief, Systems Engineer, Systems Analyst, now semi-retired


      • #4
        In response to the replies I've received: thanks to Paul for
        telling me about FSQRT, but considering I'm using quad integers,
        are you sure that the opcode can handle numbers that big?

        Plus, there's no "instant" means of extracting a square root -
        I don't know what FSQRT does, but it's got to be using some
        kind of recursion - I'll try to find out what. Thanks!

        I'm not finding twice as many roots - I just realised that as
        the square root I was searching for was a multiple of 2, it's
        square (and all the s&& and v&& values) was a multiple of 4, so
        by dividing by 4, I could find, by testing the same number of
        values, the intended value (but divided by 2). When it finds
        that integer, my program terminates.

        I'm only interested in one specific square root - and if Donald
        Darden takes a close look at the increments occurring, he'll
        realise I've already taken the differences between squares into
        account to optimise my code.

        The program is a slightly optimised version of Fermat's
        factoring method - I
        realise that's a crap method, but I was just replacing the
        $FLOAT PROCEDURE on my old programs with $FLOAT NPX, and I
        thought it would be fun to see what I could do to the old


        James McLaughlin



        • #5
          quads are 64 bit. The FPU handles 64 bit integers without problems. A lot of people don't realise it's a 64 bit device.

          The method used does not need to be recursive. The standard method for calculating square roots is to use a repeated subtraction algorithm which gives a certain number of bits of the result per clock cycle, in the case of the FPU I think it's 2 bits per clock, that's why a SQRT of an extended number (64 bit mantissa) takes 35clks every time regardless of value. 3 clks will be overhead and 32 clks to give 64 bits of result.

          The exact timing on your CPU may be different but it'll still be a fixed value regardless of the number you are calculating ther root of.