Added an integer to ascii routine here. It converts integers to ascii in between 22.3 and 71.6 cycles for 1 to 10 digit integers. It uses an StringZ buffer.
Announcement
Collapse
No announcement yet.
fast integer to ascii
Collapse
X
-
DIV is a relatively slow opcode. It's usually better to avoid/minimise its use and replace it by MULs.
http://www.powerbasic.com/support/pb...1&postcount=30
-
Great timing! Just posted. It's a bit faster than the div version as well. Disclaimer, I'm still struggling a bit with the math so I cheated and adapted the math from the asm of a C compiler.
Edit: And wow wish I'd seen that post sooner.Last edited by Larry Charlton; 8 Mar 2015, 02:24 PM.
Comment
-
Hi Paul,
Thanks for posting the link to your algo, I had lost the damned thing. Converts to MASM just fine.hutch at movsd dot com
The MASM Forum - SLL Modules and PB Libraries
http://www.masm32.com/board/index.php?board=69.0
Comment
-
Originally posted by Paul Dixon View PostDIV is a relatively slow opcode. It's usually better to avoid/minimise its use and replace it by MULs.
http://www.powerbasic.com/support/pb...1&postcount=30
Ignoring that you're processing upper digits and lower digits separately, and skipping leading zeros, what I think you were doing highly paraphrased is:
Edit: updated to be closer.
Code:#Compile Exe #Dim All AsmData cvt DD &h3030, &h3130, &h3230, &h3330, &h3430, &h3530, &h3630, &h3730, &h3830, &h3930 DD &h3031, &h3131, &h3231, &h3331, &h3431, &h3531, &h3631, &h3731, &h3831, &h3931 DD &h3032, &h3132, &h3232, &h3332, &h3432, &h3532, &h3632, &h3732, &h3832, &h3932 DD &h3033, &h3133, &h3233, &h3333, &h3433, &h3533, &h3633, &h3733, &h3833, &h3933 DD &h3034, &h3134, &h3234, &h3334, &h3434, &h3534, &h3634, &h3734, &h3834, &h3934 DD &h3035, &h3135, &h3235, &h3335, &h3435, &h3535, &h3635, &h3735, &h3835, &h3935 DD &h3036, &h3136, &h3236, &h3336, &h3436, &h3536, &h3636, &h3736, &h3836, &h3936 DD &h3037, &h3137, &h3237, &h3337, &h3437, &h3537, &h3637, &h3737, &h3837, &h3937 DD &h3038, &h3138, &h3238, &h3338, &h3438, &h3538, &h3638, &h3738, &h3838, &h3938 DD &h3039, &h3139, &h3239, &h3339, &h3439, &h3539, &h3639, &h3739, &h3839, &h3939 End AsmData Function PBMain () As Long Local fraction As Ext Local i, num, number As Long Local v As Dword Local wp As Dword Ptr Local buf As StringZ*14 Local bp As Word Ptr number = 1234567890 wp = CodePtr( cvt ) bp = VarPtr( buf ) fraction = CDbl(number) / (10000000000##-1##) For i=0 To 4 fraction *= 100## num = Fix(fraction) v = @wp[num] @bp[i] = @wp[num] fraction = Frac(fraction) Next @bp[5] = 0 ? buf End Function
Last edited by Larry Charlton; 9 Mar 2015, 10:51 AM.
Comment
-
Larry,
you have the gist of it but don't think of it as floats and fractions, just as integers and remainders.
The technique is sometimes called Magic Number Division because when you first see the peculiar multiplier producing the correct result it looks like magic.
I work out the numbers myself but you can get websites to do it for you:
http://www.hackersdelight.org/magic.htm
I just checked and that site gives the exact same numbers I used, for division by 10000 use MUL by &hD1B71759 followed by a 13 bit shift.
The confusing part is probably that I have 2 sections of code doing almost the same thing.
The exact path taken through the code depends on the initial number of digits.
Until a valid, non-zero leading digit is found, I have to check for leading zeroes and suppress them.
Once I get that leading non-zero digit, I no longer need to check as all subsequent digits are valid pairs so I jump to the ZeroSupressed block of code which runs much faster as it doesn't need to check.
The multiply/remainder part is straight forward, once you know what the magic number does and that the CPU MUL instruction gives a 64 bit answer, with the upper 32 bits in EDX and the lower 32 bits in EAX.
Doing it in decimal to hopefully make it clearer, assume 10 digit registers..
Just imagine that the upper register contains the integer part and the lower register the remainder.
Num = 1234567890
Multiply by 1/1E8 (same as divide by 1E8 but faster)
That gives:
0000000012 3456789000 in the 2 registers
Use the 12 in the upper register as an index into the table to lookup the digits "12" and store them.
3456789000 MUL 100 then gives
0000000034 5678900000
Use the 34 in the upper register to lookup the digits "34" and store them.
5678900000 MUL 100 then gives
0000000056 7890000000
Use the 56 in the upper register to lookup the digits "56" and store them.
etc.
In the real CPU, the registers are 32 bits each and the numbers are binary but the principle is the same.
Paul.
Comment
-
Interesting. Back in the 1960's, the "decimal print routine" was a big thing. (Read Steven Levy's book Hackers.. which book even has its own "wikipedia" enty here.. http://en.wikipedia.org/wiki/Hackers...ter_Revolution).
In summary, for a long time the "Holy Grail" was to create this routine in fifty or fewer instructions. It got to fifty-three, then fifty-two, fifty one and then one day some guy posted it in only FORTY-NINE instructions.
Victory, right? Wrong.
It wasn't long before the next post in which someone posted an eleven-instruction routine... of course, a thing of elegance, marvel and beauty.
Looking at what was posted today, I'd have to say it appears we've come a long way, baby... but in what direction?
MCM
PS:
Finding that link just now, it appears Mr. Levy's book is now available in "e-publishing" format. Highly recommended for anyone with an interest in how we got where we are.Michael Mattias
Tal Systems (retired)
Port Washington WI USA
[email protected]
http://www.talsystems.com
Comment
-
Originally posted by Paul Dixon View Post...
Num = 1234567890
Multiply by 1/1E8 (same as divide by 1E8 but faster)
It's this part of the code that's making my head hurt
Code:mov edx, &hD1B71759 ; =2^45\10000 13 bit extra shift mul edx ; gives 6 high digits in edx mov eax,&h68DB9 ; =2^32\10000+1 shr edx,13 ; correct for multiplier offset used to give better accuracy jz skiphighdigits ; if zero then don't need to process the top 6 digits mov ecx,edx ; get a copy of high digits imul ecx,10000 ; scale up high digits sub edi,ecx ; subtract high digits from original. EDI now = lower 4 digits mul edx ; get first 2 digits in edx mov ecx,100 ; load ready for later jnc next1 ; if zero, supress them by ignoring cmp edx,9 ; 1 digit or 2? ja ZeroSupressed ; 2 digits, just continue with pairs of digits to the end mov edx,chartab[edx*4] ; look up 2 digits mov [esi],dh ; but only write the 1 we need, supress the leading zero inc esi ; update pointer by 1 jmp ZS1 ; continue with pairs of digits to the end next1: mul ecx ; get next 2 digits
Any chance this is at a basic level how floating point numbers work (excluding the mantissa and stuff).
Comment
-
Larry,
1 / 1E8 you picked because that leaves the first two digits in EDX?
That means eax is the "fractional" remainder so multiplying whatever EAX is by 100 will put the next two digits in EDX and you can repeat for the other digit pairs?
Specifically the part that's giving me grief is why you can divide by 1 / 1E8 and then start multiplying the remainder to pull to more digits out.
Code:EDX EAX 1234567890 . 000000000 divide by 1E8: 0000000012.3456789000 Remove the 12 in the high register: 0000000000.3456789000 x100 0000000034.5678900000 Remove the 34 in the high register: 0000000000.5678900000 x100 0000000056.7890000000 etc.
In real life, it's binary, but the principle is the same.
Imagine the original number is 98765 decimal.
98765 = 181CD hex
Code:EDX EAX 000181CD.00000000 Divide by 10000 decimal 00000009.E0624DD3 Remove the first digits 09hex = 09 dec 00000000.E0624DD3 Multiply by 100 decimal (64 hex) 00000057.A666666C Remove the digits 57 hex = 87 dec. 00000000.A666666C Multiply by 100 decimal (64 hex) 00000041.00000230 Remove the digits 41 hex = 65 dec
Don't think of this as floating point which is something different.
Paul.
Comment
-
Thought I had it, then poof. When I run through I can get the problem to work using division, but not multiplication:
Code:499602d2 1234567890 05f5e100 100000000 div 499602d2 / 05f5e100----- x C 12 remainder 020f76d2 34567890 div 020f76d2 / f4240h ------ x 22 34 remainder 8AA52 567890 div 8AA52 / 2710h ---------- x 38 56 remainder 1ed2 7890 div 1ed2 / 64h ------------- x 4e 78 remainder 5a 90
Code:Prefix "!" xor edx, edx mov eax, 1234567890 mov ebx, 100000000 div ebx mov ebx, 1000000 mov eax, edx xor edx, edx div ebx mov ebx, 10000 mov eax, edx xor edx, edx div ebx mov ebx, 100 mov eax, edx xor edx, edx div ebx
Comment
-
Larry,
first, look at a binary/hex version as it's easy to see what's happening
Code:PREFIX "!" mov eax, &h12345678 '12345678 is the number in hex to start with mov ebx, &h100 '100 the multiplier, we're doing 2 hex digits at a time so use &h100 mul ebx 'This puts the top 2 digits 12 in edx and shifts up the remainder in eax mul ebx 'This puts the next 2 digits 34 in edx and shifts up the remainder in eax mul ebx 'This puts the next 2 digits 56 in edx and shifts up the remainder in eax mul ebx 'This puts the last 2 digits 78 in edx and shifts up the remainder in eax leaving 0. ' END PREFIX
Comment
-
I was naïve, I tried that first.
1234567890 x 100 = 123456789000 (1C BE991A08)
Couldn't see how to make 1C be a C (12). That's what's making my head hurt
1234567890 x 100h = 49 9602 d200 can't see how 49h becomes 12d either.Last edited by Larry Charlton; 10 Mar 2015, 10:47 AM.
Comment
-
Code:'PBCC6 program FUNCTION PBMAIN () AS LONG LOCAL a,b,scle AS QUAD LOCAL ans AS STRINGZ * 12 a = 123456 'the decimal number to do 'we need to scale the number by 1000000 b = a / 1000000 'unfortunately, we can't do this as b is an integer and we lose all the lower order, fractional bits. scle = &h100000000 / 1000000 'so, instead, we scale the number up by 32 bits then divide (32 bits for this example, usually more to make full use of the registers) b = a * scle 'The top half of a now contains the integer part, the bottom half, the residue 'this appears cumbersome in BASIC but in ASM it's very neat PRINT HI(DWORD,b),LO(DWORD,b) 'Then, repeatedly multiply the residue by 100 to get each pair of digits b = LO(DWORD,b) * 100 PRINT HI(DWORD,b),LO(DWORD,b) b = LO(DWORD,b) * 100 PRINT HI(DWORD,b),LO(DWORD,b) b = LO(DWORD,b) * 100 PRINT HI(DWORD,b),LO(DWORD,b) !int 3 PREFIX "!" ;now do it in decimal mov ecx,scle 'The same scale we used in BASIC mov eax,123456 'get the number mul ecx 'scale it mov ecx,100 'the multiplier to give 2 decimal digits at a time 'note, we now switch to looking in edx, the upper 32 bits of the result, for the 2 digits mul ecx 'This gives edx = &h0c = 12 decimal, the first 2 digits mul ecx 'This gives edx = &h22 = 34 decimal, the next 2 digits mul ecx 'This gives edx = &h38 = 56 decimal, the last 2 digits 'we get the same answer as the BASIC version END PREFIX WAITKEY$ END FUNCTION
Last edited by Paul Dixon; 10 Mar 2015, 04:59 PM.
Comment
-
it's getting clearer. It actually has a name, binary scaling. The reason it kept looking like floating point math is it's pseudo floating point.
Thanks again for hanging in there with me. It looks like a highly useful technique for a number of things and a technique I'm very likely to find uses for.
Comment
-
Larry,
if it's becoming clear now, we can move on to the next step!
You only need to read this if you want to understand why the scaling value is what it is. If you're happy to accept the values calculated for you by that website I mentioned then you can skip it.
You might have noticed that in the original fastLong2Str I split the number into 2 parts before processing it and I used a peculiar scale value:
Code:!mov edx, &hD1B71759 '=2^45\10000 13 bit extra shift
By scaling by 2^45\10000 instead of the expected 2^32\10000 I make sure all the bits of the registers are used in the scaling calculation instead of leaving 13 bits unused at the top.
This adds 13 bits of precision to the MUL and then afterwards, we shift the result to the right by 13 bits to compensate.
It's still way faster to MUL then SHIFT than it is to DIV.
If you try to work immediately with a 10 digit number you'll find you can never get enough accuracy as there are no unused bits at the top of the register to be used as they contain real data already.
So, first split the number roughly in half then scale using all bits of the register, then shift back to compensate.
Paul.
Comment
-
in your conversion you split the number into 6 digits and 4 digits. That was to keep the number from overflowing right? So to get the full number you would do something like:
Code:a = 1234567890 'the decimal number to do scle = &h100000000 / 10000 'so, instead, we scale the number up by 32 bits then divide (32 bits for this example, usually more to make full use of the registers) b = a * scle 'The top half of a now contains the integer part, the bottom half, the residue 'this appear cumbersome in BASIC but in ASM it's very neat Local hi6int, lo4int As Quad hi6int = Hi(Dword,b) lo4int = a - hi6int * 10000 b = hi6int * scle Print Hi(Dword,b),Lo(Dword,b) ' 1st 2 digits b = Lo(Dword,b) * 100 Print Hi(Dword,b),Lo(Dword,b) ' 2nd 2 digits b = Lo(Dword,b) * 100 Print Hi(Dword,b),Lo(Dword,b) ' 3rd 2 digits scle = &h100000000 / 100 'so, instead, we scale the number up by 32 bits then divide (32 bits for this example, usually more to make full use of the registers) b = lo4int * scle Print Hi(Dword,b),Lo(Dword,b) ' next 2 b = Lo(Dword,b) * 100 Print Hi(Dword,b),Lo(Dword,b) ' next 2
Comment
-
Ah yes the 13 bitsSo really a bit more like this:
Code:a = 1234567890 'the decimal number to do scle = &H200000000000 / 10000 b = a * scle Shift Right b, 13 Local hiInteger, loFraction As Long hiInteger = Hi(Dword,b) loFraction = Lo(Dword, b) b = hiInteger * scle Shift Right b, 13 Print Hi(Dword,b),Lo(Dword,b) ' 1st 2 digits b = Lo(Dword,b) * 100 Print Hi(Dword,b),Lo(Dword,b) ' 2nd 2 digits b = Lo(Dword,b) * 100 Print Hi(Dword,b),Lo(Dword,b) ' 3rd 2 digits b = loFraction b = Lo(Dword,b) * 100 Print Hi(Dword,b),Lo(Dword,b) ' next 2 b = Lo(Dword,b) * 100 Print Hi(Dword,b),Lo(Dword,b) ' next 2
Last edited by Larry Charlton; 10 Mar 2015, 06:57 PM.
Comment
Comment