Announcement

Collapse
No announcement yet.

fast integer to ascii

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • fast integer to ascii

    Added an integer to ascii routine here. It converts integers to ascii in between 22.3 and 71.6 cycles for 1 to 10 digit integers. It uses an StringZ buffer.
    LarryC
    Website
    Sometimes life's a dream, sometimes it's a scream

  • #2
    DIV is a relatively slow opcode. It's usually better to avoid/minimise its use and replace it by MULs.

    http://www.powerbasic.com/support/pb...1&postcount=30

    Comment


    • #3
      Great timing! Just posted. It's a bit faster than the div version as well. Disclaimer, I'm still struggling a bit with the math so I cheated and adapted the math from the asm of a C compiler.

      Edit: And wow wish I'd seen that post sooner.
      Last edited by Larry Charlton; 8 Mar 2015, 02:24 PM.
      LarryC
      Website
      Sometimes life's a dream, sometimes it's a scream

      Comment


      • #4
        Hi Paul,

        Thanks for posting the link to your algo, I had lost the damned thing. Converts to MASM just fine. :coffee4:
        hutch at movsd dot com
        The MASM Forum

        www.masm32.com

        Comment


        • #5
          Originally posted by Paul Dixon View Post
          DIV is a relatively slow opcode. It's usually better to avoid/minimise its use and replace it by MULs.

          http://www.powerbasic.com/support/pb...1&postcount=30
          Still trying to wrap my head around exactly what's happening in your code. Here's the gist I came up with.

          Ignoring that you're processing upper digits and lower digits separately, and skipping leading zeros, what I think you were doing highly paraphrased is:

          Edit: updated to be closer.

          Code:
          #Compile Exe
          #Dim All
            
           AsmData cvt
            DD &h3030, &h3130, &h3230, &h3330, &h3430, &h3530, &h3630, &h3730, &h3830, &h3930
            DD &h3031, &h3131, &h3231, &h3331, &h3431, &h3531, &h3631, &h3731, &h3831, &h3931
            DD &h3032, &h3132, &h3232, &h3332, &h3432, &h3532, &h3632, &h3732, &h3832, &h3932
            DD &h3033, &h3133, &h3233, &h3333, &h3433, &h3533, &h3633, &h3733, &h3833, &h3933
            DD &h3034, &h3134, &h3234, &h3334, &h3434, &h3534, &h3634, &h3734, &h3834, &h3934
            DD &h3035, &h3135, &h3235, &h3335, &h3435, &h3535, &h3635, &h3735, &h3835, &h3935
            DD &h3036, &h3136, &h3236, &h3336, &h3436, &h3536, &h3636, &h3736, &h3836, &h3936
            DD &h3037, &h3137, &h3237, &h3337, &h3437, &h3537, &h3637, &h3737, &h3837, &h3937
            DD &h3038, &h3138, &h3238, &h3338, &h3438, &h3538, &h3638, &h3738, &h3838, &h3938
            DD &h3039, &h3139, &h3239, &h3339, &h3439, &h3539, &h3639, &h3739, &h3839, &h3939
          End AsmData
            
           Function PBMain () As Long
            Local fraction As Ext
            Local i, num, number As Long
            Local v As Dword
            Local wp As Dword Ptr
            Local buf As StringZ*14
            Local bp As Word Ptr
            
             number = 1234567890
            
             wp = CodePtr( cvt )
            bp = VarPtr( buf )
            fraction = CDbl(number) / (10000000000##-1##)
            
             For i=0 To 4
              fraction *= 100##
              num = Fix(fraction)
              v = @wp[num]
              @bp[i] = @wp[num]
              fraction = Frac(fraction)
            Next
            @bp[5] = 0
            ? buf
          End Function
          only with Longs instead of Ext. Do you know of any references to managing floating point numbers or fractions in a long? That looks like a useful technique and I'd like to understand it a bit better.
          Last edited by Larry Charlton; 9 Mar 2015, 10:51 AM.
          LarryC
          Website
          Sometimes life's a dream, sometimes it's a scream

          Comment


          • #6
            Larry,
            you have the gist of it but don't think of it as floats and fractions, just as integers and remainders.


            The technique is sometimes called Magic Number Division because when you first see the peculiar multiplier producing the correct result it looks like magic.

            I work out the numbers myself but you can get websites to do it for you:
            http://www.hackersdelight.org/magic.htm

            I just checked and that site gives the exact same numbers I used, for division by 10000 use MUL by &hD1B71759 followed by a 13 bit shift.


            The confusing part is probably that I have 2 sections of code doing almost the same thing.
            The exact path taken through the code depends on the initial number of digits.
            Until a valid, non-zero leading digit is found, I have to check for leading zeroes and suppress them.
            Once I get that leading non-zero digit, I no longer need to check as all subsequent digits are valid pairs so I jump to the ZeroSupressed block of code which runs much faster as it doesn't need to check.


            The multiply/remainder part is straight forward, once you know what the magic number does and that the CPU MUL instruction gives a 64 bit answer, with the upper 32 bits in EDX and the lower 32 bits in EAX.


            Doing it in decimal to hopefully make it clearer, assume 10 digit registers..
            Just imagine that the upper register contains the integer part and the lower register the remainder.

            Num = 1234567890

            Multiply by 1/1E8 (same as divide by 1E8 but faster)
            That gives:
            0000000012 3456789000 in the 2 registers
            Use the 12 in the upper register as an index into the table to lookup the digits "12" and store them.

            3456789000 MUL 100 then gives
            0000000034 5678900000
            Use the 34 in the upper register to lookup the digits "34" and store them.

            5678900000 MUL 100 then gives
            0000000056 7890000000
            Use the 56 in the upper register to lookup the digits "56" and store them.

            etc.

            In the real CPU, the registers are 32 bits each and the numbers are binary but the principle is the same.

            Paul.

            Comment


            • #7
              Interesting. Back in the 1960's, the "decimal print routine" was a big thing. (Read Steven Levy's book Hackers.. which book even has its own "wikipedia" enty here.. http://en.wikipedia.org/wiki/Hackers...ter_Revolution).

              In summary, for a long time the "Holy Grail" was to create this routine in fifty or fewer instructions. It got to fifty-three, then fifty-two, fifty one and then one day some guy posted it in only FORTY-NINE instructions.

              Victory, right? Wrong.

              It wasn't long before the next post in which someone posted an eleven-instruction routine... of course, a thing of elegance, marvel and beauty.

              Looking at what was posted today, I'd have to say it appears we've come a long way, baby... but in what direction?

              MCM
              PS:
              Finding that link just now, it appears Mr. Levy's book is now available in "e-publishing" format. Highly recommended for anyone with an interest in how we got where we are.

              Comment


              • #8
                Originally posted by Paul Dixon View Post
                ...
                Num = 1234567890

                Multiply by 1/1E8 (same as divide by 1E8 but faster)
                1 / 1E8 you picked because that leaves the first two digits in EDX? That means eax is the "fractional" remainder so multiplying whatever EAX is by 100 will put the next two digits in EDX and you can repeat for the other digit pairs? Specifically the part that's giving me grief is why you can divide by 1 / 1E8 and then start multiplying the remainder to pull to more digits out.

                It's this part of the code that's making my head hurt
                Code:
                  mov    edx, &hD1B71759        ; =2^45\10000    13 bit extra shift
                  mul    edx                    ; gives 6 high digits in edx
                
                  mov    eax,&h68DB9            ; =2^32\10000+1
                
                  shr    edx,13                 ; correct for multiplier offset used to give better accuracy
                  jz     skiphighdigits         ; if zero then don't need to process the top 6 digits
                
                
                  mov    ecx,edx                ; get a copy of high digits
                  imul   ecx,10000              ; scale up high digits
                  sub    edi,ecx                ; subtract high digits from original. EDI now = lower 4 digits
                
                
                  mul    edx                    ; get first 2 digits in edx
                  mov    ecx,100                ; load ready for later
                
                  jnc    next1                  ; if zero, supress them by ignoring
                  cmp    edx,9                  ; 1 digit or 2?
                  ja     ZeroSupressed          ; 2 digits, just continue with pairs of digits to the end
                
                  mov    edx,chartab[edx*4]     ; look up 2 digits
                  mov    [esi],dh               ; but only write the 1 we need, supress the leading zero
                  inc    esi                    ; update pointer by 1
                  jmp    ZS1                    ; continue with pairs of digits to the end
                
                next1:
                  mul    ecx                    ; get next 2 digits

                Any chance this is at a basic level how floating point numbers work (excluding the mantissa and stuff).
                LarryC
                Website
                Sometimes life's a dream, sometimes it's a scream

                Comment


                • #9
                  Larry,
                  1 / 1E8 you picked because that leaves the first two digits in EDX?
                  Yes. Of course these figures are just a decimal example to show how it works. In real life things are binary.

                  That means eax is the "fractional" remainder so multiplying whatever EAX is by 100 will put the next two digits in EDX and you can repeat for the other digit pairs?
                  Yes.

                  Specifically the part that's giving me grief is why you can divide by 1 / 1E8 and then start multiplying the remainder to pull to more digits out.
                  Just imagine the decimal point is between the 2 decimal registers
                  Code:
                      EDX       EAX
                  1234567890 . 000000000
                  divide by 1E8:
                  0000000012.3456789000
                  Remove the 12 in the high register:
                  0000000000.3456789000
                  x100
                  0000000034.5678900000
                  Remove the 34 in the high register:
                  0000000000.5678900000
                  x100
                  0000000056.7890000000
                  etc.

                  In real life, it's binary, but the principle is the same.
                  Imagine the original number is 98765 decimal.
                  98765 = 181CD hex
                  Code:
                      EDX     EAX
                  000181CD.00000000 
                  Divide by 10000 decimal
                  00000009.E0624DD3      
                  Remove the first digits  09hex = 09 dec
                  00000000.E0624DD3
                  Multiply by 100 decimal (64 hex)
                  00000057.A666666C
                  Remove the digits 57 hex = 87 dec.
                  00000000.A666666C
                  Multiply by 100 decimal (64 hex)
                  00000041.00000230
                  Remove the digits 41 hex = 65 dec
                  So the digits removed are 09 87 65 giving the 98765 decimal we expected.

                  Don't think of this as floating point which is something different.

                  Paul.

                  Comment


                  • #10
                    Thought I had it, then poof. When I run through I can get the problem to work using division, but not multiplication:

                    Code:
                              499602d2     1234567890 
                              05f5e100     100000000  
                    div 499602d2 / 05f5e100-----
                    x         C            12
                    remainder 020f76d2     34567890
                    div       020f76d2 / f4240h ------
                    x         22           34
                    remainder 8AA52        567890
                    div       8AA52 / 2710h ----------
                    x         38           56
                    remainder 1ed2         7890
                    div 1ed2 / 64h -------------
                    x         4e            78
                    remainder 5a            90
                    Just using div opcodes:
                    Code:
                        Prefix "!"
                        xor edx, edx
                        mov eax, 1234567890
                        mov ebx, 100000000
                        div ebx
                         mov ebx, 1000000
                        mov eax, edx
                        xor edx, edx
                        div ebx
                        mov ebx, 10000
                        mov eax, edx
                        xor edx, edx
                        div ebx
                        mov ebx, 100
                        mov eax, edx
                        xor edx, edx
                        div ebx
                    I'll work on it some more, I think it'll click after a bit.
                    LarryC
                    Website
                    Sometimes life's a dream, sometimes it's a scream

                    Comment


                    • #11
                      Larry,
                      first, look at a binary/hex version as it's easy to see what's happening

                      Code:
                          PREFIX "!"
                          mov eax, &h12345678     '12345678 is the number in hex to start with
                          mov ebx, &h100          '100 the multiplier, we're doing 2 hex digits at a time so use &h100
                          mul ebx                 'This puts the  top 2 digits 12 in edx and shifts up the remainder in eax
                          mul ebx                 'This puts the next 2 digits 34 in edx and shifts up the remainder in eax
                          mul ebx                 'This puts the next 2 digits 56 in edx and shifts up the remainder in eax
                          mul ebx                 'This puts the last 2 digits 78 in edx and shifts up the remainder in eax leaving 0.
                                                                        '
                          END PREFIX

                      Comment


                      • #12
                        I was naïve, I tried that first.

                        1234567890 x 100 = 123456789000 (1C BE991A08)

                        Couldn't see how to make 1C be a C (12). That's what's making my head hurt

                        1234567890 x 100h = 49 9602 d200 can't see how 49h becomes 12d either.
                        Last edited by Larry Charlton; 10 Mar 2015, 10:47 AM.
                        LarryC
                        Website
                        Sometimes life's a dream, sometimes it's a scream

                        Comment


                        • #13
                          Code:
                          'PBCC6 program
                          FUNCTION PBMAIN () AS LONG
                          LOCAL a,b,scle AS QUAD
                          LOCAL ans AS STRINGZ * 12
                          
                          a = 123456   'the decimal number to do
                          'we need to scale the number by 1000000
                          b = a / 1000000   'unfortunately, we can't do this as b is an integer and we lose all the lower order, fractional bits.
                          
                          scle = &h100000000 / 1000000 'so, instead, we scale the number up by 32 bits then divide   (32 bits for this example, usually more to make full use of the registers)
                          b = a * scle
                          
                          'The top half of a now contains the integer part, the bottom half, the residue
                          'this appears cumbersome in BASIC but in ASM it's very neat
                          PRINT HI(DWORD,b),LO(DWORD,b)
                          
                          'Then, repeatedly multiply the residue by 100 to get each pair of digits
                          b = LO(DWORD,b) * 100
                          PRINT HI(DWORD,b),LO(DWORD,b)
                          
                          b = LO(DWORD,b) * 100
                          PRINT HI(DWORD,b),LO(DWORD,b)
                          
                          b = LO(DWORD,b) * 100
                          PRINT HI(DWORD,b),LO(DWORD,b)
                          
                          
                          !int 3
                              PREFIX "!"
                              ;now do it in decimal
                              mov ecx,scle            'The same scale we used in BASIC
                              mov eax,123456          'get the number
                              mul ecx                 'scale it
                          
                          
                              mov ecx,100             'the multiplier to give 2 decimal digits at a time
                                                      'note, we now switch to looking in edx, the upper 32 bits of the result, for the 2 digits
                          
                              mul ecx                 'This gives edx = &h0c = 12 decimal, the first 2 digits
                              mul ecx                 'This gives edx = &h22 = 34 decimal, the next 2 digits
                              mul ecx                 'This gives edx = &h38 = 56 decimal, the last 2 digits
                          
                              'we get the same answer as the BASIC version
                              END PREFIX
                          
                          
                          WAITKEY$
                          
                          END FUNCTION
                          Last edited by Paul Dixon; 10 Mar 2015, 04:59 PM.

                          Comment


                          • #14
                            wow, not sure why that works, but it seems really clear now. Now to work on understanding the math side of it a bit. Thank you so much.
                            LarryC
                            Website
                            Sometimes life's a dream, sometimes it's a scream

                            Comment


                            • #15
                              it's getting clearer. It actually has a name, binary scaling. The reason it kept looking like floating point math is it's pseudo floating point.

                              Thanks again for hanging in there with me. It looks like a highly useful technique for a number of things and a technique I'm very likely to find uses for.
                              LarryC
                              Website
                              Sometimes life's a dream, sometimes it's a scream

                              Comment


                              • #16
                                Larry,
                                if it's becoming clear now, we can move on to the next step!
                                You only need to read this if you want to understand why the scaling value is what it is. If you're happy to accept the values calculated for you by that website I mentioned then you can skip it.


                                You might have noticed that in the original fastLong2Str I split the number into 2 parts before processing it and I used a peculiar scale value:
                                Code:
                                !mov edx, &hD1B71759        '=2^45\10000    13 bit extra shift
                                The reason is that when the number is scaled the result sometimes isn't accurate enough to hold the number to a high enough precision to reproduce all the digits and the lower order digits turn out wrong.

                                By scaling by 2^45\10000 instead of the expected 2^32\10000 I make sure all the bits of the registers are used in the scaling calculation instead of leaving 13 bits unused at the top.
                                This adds 13 bits of precision to the MUL and then afterwards, we shift the result to the right by 13 bits to compensate.
                                It's still way faster to MUL then SHIFT than it is to DIV.

                                If you try to work immediately with a 10 digit number you'll find you can never get enough accuracy as there are no unused bits at the top of the register to be used as they contain real data already.
                                So, first split the number roughly in half then scale using all bits of the register, then shift back to compensate.

                                Paul.

                                Comment


                                • #17
                                  in your conversion you split the number into 6 digits and 4 digits. That was to keep the number from overflowing right? So to get the full number you would do something like:

                                  Code:
                                  a = 1234567890   'the decimal number to do
                                   scle = &h100000000 / 10000 'so, instead, we scale the number up by 32 bits then divide   (32 bits for this example, usually more to make full use of the registers)
                                    
                                  b = a * scle
                                   'The top half of a now contains the integer part, the bottom half, the residue
                                  'this appear cumbersome in BASIC but in ASM it's very neat
                                    
                                   Local hi6int, lo4int As Quad
                                   hi6int = Hi(Dword,b)
                                  lo4int = a - hi6int * 10000
                                    
                                   b = hi6int * scle
                                  Print Hi(Dword,b),Lo(Dword,b) ' 1st 2 digits
                                    
                                   b = Lo(Dword,b) * 100
                                  Print Hi(Dword,b),Lo(Dword,b) ' 2nd 2 digits
                                    
                                   b = Lo(Dword,b) * 100
                                  Print Hi(Dword,b),Lo(Dword,b) ' 3rd 2 digits
                                    
                                   scle = &h100000000 / 100 'so, instead, we scale the number up by 32 bits then divide   (32 bits for this example, usually more to make full use of the registers)
                                   b = lo4int * scle
                                  Print Hi(Dword,b),Lo(Dword,b) ' next 2
                                    
                                   b = Lo(Dword,b) * 100
                                  Print Hi(Dword,b),Lo(Dword,b) ' next 2
                                  LarryC
                                  Website
                                  Sometimes life's a dream, sometimes it's a scream

                                  Comment


                                  • #18
                                    Larry,
                                    yes, that looks about right.
                                    6 and 4 digits to keep everything in neat pairs as 5 and 5 would be awkward.

                                    Paul.

                                    Comment


                                    • #19
                                      Ah yes the 13 bits So really a bit more like this:

                                      Code:
                                      a = 1234567890   'the decimal number to do
                                       scle = &H200000000000 / 10000
                                      b = a * scle
                                      Shift Right b, 13
                                        
                                      Local hiInteger, loFraction As Long
                                      hiInteger = Hi(Dword,b)
                                      loFraction = Lo(Dword, b)
                                        
                                       b = hiInteger * scle
                                      Shift Right b, 13
                                        
                                      Print Hi(Dword,b),Lo(Dword,b) ' 1st 2 digits
                                        
                                       b = Lo(Dword,b) * 100
                                      Print Hi(Dword,b),Lo(Dword,b) ' 2nd 2 digits
                                        
                                       b = Lo(Dword,b) * 100
                                      Print Hi(Dword,b),Lo(Dword,b) ' 3rd 2 digits
                                        
                                       b = loFraction
                                      b = Lo(Dword,b) * 100
                                      Print Hi(Dword,b),Lo(Dword,b) ' next 2
                                        
                                       b = Lo(Dword,b) * 100
                                      Print Hi(Dword,b),Lo(Dword,b) ' next 2
                                      Which makes this doubly handy because when we get our high 6 digits, we can test for 0 and skip if it the number isn't that big. It's also one multiply less that the previous solution but 2 13 bit shifts more, wonder which would be faster
                                      Last edited by Larry Charlton; 10 Mar 2015, 06:57 PM.
                                      LarryC
                                      Website
                                      Sometimes life's a dream, sometimes it's a scream

                                      Comment


                                      • #20
                                        Larry,
                                        I knew you'd understand it eventually.
                                        Just make sure the scaling constant is as big as you can make it without overflowing the register.

                                        Paul

                                        Comment

                                        Working...
                                        X