No announcement yet.

Rnd2 disscussion

  • Filter
  • Time
  • Show
Clear All
new posts

  • Rnd2 disscussion

    Given that fine and super comprehensive code you wrote there David, I thought it'd be wise to research George Marsaglia's multiply-with-carry (mwc) theory once more. So I did and discovered one more zero-period seed group to avoid: if mwcRandom = &hffffffff and mwcCarry = multiplier - 1 (eg. 4221480339 - 1) then period is apparently 0.

    The odds of that happening by chance are 1 in 18131119988195545710 or 1.8*10^19 random seeds. I'm not sure if this needs to be addressed in the code since 1) current PCs doing nothing but re-seeding 24/7 would hit such a seed only once every several thousand years. Under normal Rnd2 usage re-seeding a few times a day, make that some billion years. And 2) it looks like the seed mwcCarry is fixed in the code and never can be any of the multiplier - 1 values. ? I think.

  • #2
    Hi John

    mwcCarry's initial state is fixed for all of the Cases except when Randomise is used and then it is changed by the generator itself. I did that instead of your 'mwcCarry = mwcCarry + TIMER * 18' in rndExt as the outcome here could not be guaranteed to not fall foul of any restrictions.

    Added: The problematic pairings are, of course, non-existent entry points to a multiplier's sequence. An active sequence would not see such a pairing being generated either as that would contradict the life of the sequence in the first place which is why I let the generator loose on mwcCarry's initial state.
    Last edited by David Roberts; 2 Sep 2008, 04:16 PM.


    • #3
      David, I posted a somewhat revised Rnd2 in source and would like to know what you think of it. I tried to make it more like RND basically. I saw you were making an object of it too.

      The significant changes to your original code are: 1) toggleScope uses only one generator so it cannot un-toggle then repeat the last sequence anymore. To repeat it must be in the same toggle state. 2) Different arguments are used for closer RND match, especially allowing 2GB of negative integers to be used to re-seed generator. 3) Wider range of multipliers used and scrambled in order. 4) Randomise code was updated and called TrueRandom It's a memory device hehheh. 5) Optimized integer-range handling. (Note the little brainwave in there. I was kinda proud of those 3 or 4 lines.)


      • #4
        Hi John

        That kept you quiet for a few days, didn't it?

        No comments yet, I'm just acknowledging your last post.

        You have obviously put a lot of effort into your revision deserving a good look and that is what it will get.


        • #5
          Still testing.

          mwcRandom = &h55b218da - one 'assures maximum possible 2147483648 unique sequences

          That is a great tweak to my RandomiseInt which simply gave access to the multipliers with just one entry point.

          The increase in speed is astonishing - I didn't think there was much that could be done on that score.

          So far this has the makings of the definitive PB prng and the argument for using RND is getting weaker. Those that use random numbers a lot should be queuing up to buy you a beer.

          Bit short of time at the moment so it may be a few days before completion. The floating point asm will be a good teaching aid for me as that is an aspect that I have not dabbled with much.

          This is good stuff, John.


          • #6
            I haven't been able to do much with my PC of late - I've got a overheating problem and both cores of my E6700 are running red hot. I've got the side off at the moment and a big household fan blasting the motherboard and the temperature seems to be stabilized. The system had been throttling back to 25% so I was down to about 650MHz. I can live with that until I get it fixed but the temperatures were rising such that I had to turn the machine off to let it cool down.

            Anyway, I've been looking at your 'Case 1' ie Randomise aka TrueRandom "a fun way to think of it"

            To choose the multiplier you are multiplying the QueryPerformanceConter (QPC) with the Time Stamp Counter (TSC). With modern systems it seems that they are one in the same. Have you seen this?

            Since there is a slight delay before the TSC is read then the result is effectively QPC * ( QPC + delta) where delta is the increment to the 'second' reading.

            Both QPC and the TSC are functions of when the system was started. Timer returns the number of seconds since midnight and is clearly not dependent upon when the system was started so Timer would be a better choice for the 'second' reading. This approach will work whether QPC and the TSC are one in the same or not.

            The original Rnd2 used

            If Not RandomizeYet Then
              Seed = Timer*2^32/86400 ' 60*60*24
              Randomize Seed
              RandomizeYet = %True
            End If
            mwcSoph = @ptrMultiplier[Rnd(0,383)]
            Seed, then will be in the range (0, 2^32).

            Your approach, if something other than the TSC is used for the 'second' reading, is an improvement but nothing like strong enough to name the method TrueRandom.

            The TSC is read again to vary mwcRandom. Given the frequency and the time since the the TSC was first read then we can determine the new TSC without reading it. OK we have asm twix the two readings but we are not introducing new information.

            The default wmcCarry is OK with all of the multipliers but its current value may not be valid with a chosen multiplier and I see that you check this. The original Rnd2 does not do such a check and I'll need to 'fix' that.

            Having corrected mwcCarry, if required, you then call sRandomLong as the original Rnd2 does.

            The above is almost wholly time oriented but that is OK if we do not know when it will be used.

            DO: TrueRandom: ? STR$(Rnd2(1, 6)): TrueRandom: ? STR$(Rnd2(1, 6)): LOOP

            has the TrueRandom calls separated by the same time increments. Theoretically, the above line is predicatable upto using sRandomLong; and thus pseudo-random algorithm dependent. In practice, PCs being PCs there will be a little 'drift' but we should not rely on 'drift' for randomness.

            I must conclude that 'Case 1' is not much of an improvement on the original Rnd's Randomise; which also 'fails' with the dice example.

            Notwithstanding the above I understand the need for "rolling real dice in real time with no dependency on any pseudo-random algorithm". To achieve that we need something like your approach but with a lot more parameters and with plenty which are not time based. We are, of course, now talking about a Cryptographic generator. Rather than reinvent the wheel we can use the beast that comes with our PC courtesy of Microsoft and covered here. We could get 10 bytes, use the first two for choosing a multiplier and the next two sets of four for mwcRandom and mwcCarry making sure that the latter two are valid for the mwc process. We now do not need to call sRandomLong which muddies our crypto with pseudo.

            If you do this then 'Randomise aka TrueRandom' will be well justified.
            Last edited by David Roberts; 30 Sep 2008, 08:42 AM.


            • #7
              Hi Dave, my main thoughts here are about the queryPerformanceCounter and queryPerfFreq functions. On my old laptop, the perf freq is a bit under a microsecond. The time stamp cntr TSC is ~2 nanosecs. That's like 500 times faster. Your post shows the two nearly identical, which must be a change in, like you said, the modern processors.

              But they are different by significant amounts, and since the performance counter is a cumulative number, whereas the TSC a repeat-every-few-seconds number, those small fluctuations relative to each other still look like "time static" to me and may be sufficient to give statistically good random numbers. They certainly are sufficient on my machine. Can you run a test to try this out?

              The worst case ie. most bytes returned with least possible time fluctuation randomness would be:

              FUNCTION PBMAIN () AS LONG
                LOCAL ii AS LONG
                LOCAL lineo AS STRING, Lptr AS LONG PTR
                lineo = SPACE$(30000000)
                Lptr = STRPTR(lineo)
                FOR ii = 1 TO 7500000
                   @Lptr = rnd2Long
                   INCR Lptr
                OPEN "c:\trueRandomTest.dat" FOR OUTPUT AS #1
                PRINT #1, lineo;
                ? "done"
              EXIT FUNCTION
              Here's a very stringent quick stat test for your resultant file. It works by drag and dropping your file onto it, or putting it in the "Send To" menu. (Some day I'll add the little file select box ) You can make the file bigger if you want; larger files are better because ever more subtle flaws can be found.
              Attached Files


              • #8
                Hi John

                I get 244.06 and 67.727%

                I cannot make sense of them with my chi^2 tables.

                Good or bad results?


                • #9
                  I've just replaced your Randomise with the original Rnd2 Randomise and got 254.4 and 49.706%.

                  If your test is based upon less than 10%, say, would question the null hypothesis of randomness then neither Randomise is a worry.

                  Anyway, I await your reply.


                  • #10
                    That's what we call a big time PASS on both accounts. The chi square fails if it consistently goes < 1% or returns something like .0001% or .9999% on a single run. Multiple chi square runs should produce a graph like what I'll post in a little bit heheh.
                    Attached Files
                    Last edited by John Gleason; 30 Sep 2008, 01:03 PM. Reason: posted graph


                    • #11
                      Btw, your RANDOMIZE method works fine too, but I was trying for more resolution and a few less tix. I originally used getTickCount (1/1000 sec) rather then the perf counter, but thought the pf would be more resolution. I could always go back to it if pf shows any problems

                      The chart above shows that the %'s returned from multiple runs should be a straight line, and my 155 test runs clearly formed a perfectly straight line.
                      Hey, ya gotta give a line a break there. It needs a little room for some random wobble.

                      But the point is, you should get a 99%, a 1%, a 5% etc. just as often--that is, one percent of the time--as any other percentage on the straight line. If you don't, then the line is going to have a curve or step and that means the algo needs a tweak or, *gulp* a deep-six.


                      • #12
                        OK, with a sufficiently old system the QPF and the TSC may be different animals. With a sufficiently modern system they seem to be one in the same. Rather than get involved in the age of a system the simplest solution is to drop one of them and use something else.

                        If they are one in the same despite the QPF being cumulative and the TSC being cyclic they are nevertheless born of the system start and we are relying upon weird and wonderful fluctuations to give a difference. The simplest solution to remove any doubt is to drop one of them and use something else.

                        GetTickCount is also dependent upon the system start so I would reject that no matter what resolution it had.

                        Resolution and tix: I'm not that concerned about resolution. I'd rather have half a dozen truly random low resolution figures multiplied together than one high resolution figure with a suspect randomness. Randomizing in normal circumstances should only be used once whether we are using RND or Rnd2 ( with the same multiplier ) so tix isn't a problem here. Tix will be important in code such as the dice example.

                        I'm not convinced that a non-pseudo method is necessary in the dice example. You have shown that the wmc method passes all of the Die Hard routines so how are we going to prove that a non-pseudo method is better?

                        A cryptographic generator may be a bit over the top.

                        With my Randomise the seed for RND is got from Timer and RND is used to choose the multiplier with the PB Randomize only used the first time Randomise is invoked. I doubt very much that Randomise will be used so much that the period of RND is compromised. Do you have any strong arguments against this approach for choosing the multiplier?

                        mwcRandom is got by adding the TSC to the current mwcRandom. This is what you use as well and I got it from your earlier code. We could improve on this by introducing one or two other 'random' events.

                        mwcCarry is got by invoking the mwc on the current mwcCarry; which also 'shuffles' mwcRandom in the process. You also use this approach.

                        However, the current mwcCarry may be incompatible with the newly chosen multiplier so needs to be checked as you have done.

                        What I'd like to get is an agreement on 'Case 1' so that I can move on to the rest of Rnd2 revisited. Ultimately I'd like to not have your Rnd2 and my Rnd2 but have Rnd2 revisited as the final version. I'll then put a comment in Rnd2 to treat it as a beta and go to Rnd2 revisited for the latest version.

                        Of course, if anyone else wants to chip in then I'd welcome that but I have a feeling that won't happen.


                        The area behind my CPU fan was filthy. My vacuum cleaner has an adapter and small attachments to get into tiny places. Yeah, yeah, yeah - I know. Well, I heard a strange sound when I accidentally touched the heat sink ( I think ). A boot up failed. Waited a while - no luck. I turned the machine off and pulled the mains socket out and left for a few hours hoping that whatever I put into the system would dissipate. I'm back in business again and my core temps are now 30°C lower without my household fan.

                        The system has told me of one file corruption so I'll need to check the drive fully.

                        How lucky am I?
                        Last edited by David Roberts; 30 Sep 2008, 06:18 PM.


                        • #13
                          Cool, no spontaneous computooular combustion. I'll think about rnd2(1) some more, but by all means, test the remainder of rnd2 revisited any way you can think of. I actually used it yesterday testing yet another round-up-on-5 algo a fellow PB'er posted.


                          • #14
                            > and since the performance counter is a cumulative number, whereas the TSC a repeat-every-few-seconds number

                            I repeated that a few posts later forgetting that we tend to use only the LO Dword of the TSC in eax. The HI Dword is in edx. TSC is cumulative as well.

                            Interestingly the QPC seems to have a 125ms head start on the TSC on my machine!

                            Added: Remembering that on my machine the QPC and TSC have the same frequency.

                            QPC: 5559228891510 
                            TSC: 5558895442640 
                            Difference  333448870   125.031917ms
                            TSC: 5558896571470 
                            QPC: 5559230021490 
                            Difference -333450020  -125.032348ms
                            ' PBCC 5
                            #Compile  Exe
                            #Optimize speed
                            #Register None
                            #Dim      All
                            #Tools    Off
                            %NOMMIDS = 1
                            %NOGDI = 1
                            #Include "WIN32API.INC"
                            Function PBMain() As Long
                            Local qFreq, qQPC, qTSC As Quad
                            ' If you have a single core remove the next three lines
                              Local hProcess As Dword
                              hProcess = GetCurrentProcess()
                              SetProcessAffinityMask( hProcess, 1) ' ie CPU 0
                              QueryPerformanceFrequency qFreq
                              QueryPerformanceCounter qQPC
                              !lea esi, qTSC
                              !mov [esi], eax
                              !mov [esi+4], edx
                              Print "QPC:";qQPC
                              Print "TSC:";qTSC
                              Print "Difference ";qQPC - qTSC;Using$("#####.######",(qQPC - qTSC)*1000/qFreq) + "ms"
                              !lea esi, qTSC
                              !mov [esi], eax
                              !mov [esi+4], edx
                              QueryPerformanceCounter qQPC
                              Print "TSC:";qTSC
                              Print "QPC:";qQPC
                              Print "Difference ";qTSC - qQPC;Using$("#####.######",(qTSC - qQPC)*1000/qFreq) + "ms"
                            End Function
                            Last edited by David Roberts; 1 Oct 2008, 06:48 PM.


                            • #15
                              The plot thickens.

                              If we add
                              Print "QPC Age:";Using$("#####",qQPC*1000/qFreq) + "ms"
                              Print "GetTickCount Age:";Using$("#####",GetTickCount) + "ms"
                              just before Waitkey$

                              I got GetTickCount a couple of seconds older than QPC. From SDK: "The GetTickCount function retrieves the number of milliseconds that have elapsed since the system was started."

                              This is what I get shortly after a Restart:
                              QPC: 217591254830 
                              TSC: 214924213290 
                              Difference  2667041540  1000.060573ms
                              TSC: 214928608140 
                              QPC: 217595651090 
                              Difference -2667042950 -1000.061101ms
                              QPC Age:81592ms
                              GetTickCount Age:81671ms
                              Here GetTickCount is only 79ms older than QPC

                              QPC is one second older than TSC with this run.

                              Not sure that I'd learn much by doing extensive tests, like running the above every hour after a Restart but it does show that when it comes to timing nothing is straightforward with our machines.


                              • #16
                                22 minutes later. plus a <yawn>
                                QPC: 3594891028090 
                                TSC: 3592223986900 
                                Difference  2667041190  1000.060441ms
                                TSC: 3592225185310 
                                QPC: 3594892227970 
                                Difference -2667042660 -1000.060993ms
                                QPC Age:1347977ms
                                GetTickCount Age:1348265ms
                                GetTickCount is now 288ms older than QPC.

                                Perhaps GetTickCount needs to be wound up like an old fashioned watch.

                                TSC's frequency may be a tad faster than QPC - a small tad - and is chasing QPC.

                                Added: The last statement seems to be false. It appears that the difference may well be 'fixed' for the duration of a Windows session.
                                Last edited by David Roberts; 2 Oct 2008, 07:29 AM.


                                • #17
                                  Ignorance is bliss!

                                  My beast is currently at 45°C. If it gets any cooler I'll have to get it a scarf.

                                  Anyway, Core Temp tells me the frequency is 1600.13MHz (266.69 x 6.0). Do some work and it changes to 2666.88MHz (266.69 x 10.0). I assume that this is the Power Management at play. So, QPC and TSC are no indicators to when the system started.

                                  My apologies to GetTickCount. It is my shout at the next tea break. One lump or two?

                                  This also means that we can use QPC/TSC and GetTickCount as independent figures in our prng manipulations. I rejected GetTickCount earlier. Blimey, looks like I'm buying the rolls as well.
                                  Last edited by David Roberts; 2 Oct 2008, 06:37 AM.


                                  • #18
                                    I thought of potentially even a better way than GTC/QPC. Ready for this?
                                    !BSWAP EAX     ;!  1/2 the brainwave...
                                    !ROR EAX, 1    ;!! and the other 1/2 ;)
                                    Heyhey, my second brainwave in three weeks. Now the second rdtsc value has very little in common with the first in rnd2.

                                    Now, re. the true randomness of the algo--the below code creates a worst case for randomness, and conversely, best case for precision reading the TSC. You should close all programs then run it, and if there were no inherent randomness, after a few iterations into the loop, the same tick count would repeat for a long time. That is exactly what doesn't happen. Extra ticks get added here and there in an entirely unpredictable manner--on my machine one tick per 57 bytes on average. Isolating just those extra ticks results in a very close to random distribution.

                                    As used with successive calls to the rnd2 function where you call other functions between the rnd2 function, (you can uncomment the rnd2 calls to simulate this) the randomness increases by a factor of 19, or about one tick per 3 bytes on average. So, in the dice example you would get on average over 1.33 bits of trueRandom per die (I think 4/3 + 4/57 = 1.4 is the exact value ). IMHO that validates the TrueRandom designation because as long as one bit or more varies in either multiplier (optimally, both vary in rnd2) of the mwc algo, an entirely different result is returned.

                                    #COMPILE EXE
                                    #DIM ALL
                                    #INCLUDE ""
                                    FUNCTION PBMAIN () AS LONG
                                        LOCAL ii, ii2 AS LONG, lineo AS STRING
                                        LOCAL x, y AS LONG, Lptr AS LONG PTR
                                        LOCAL z AS QUAD, t AS SINGLE
                                        lineo = SPACE$(8000000)
                                        Lptr = STRPTR(lineo)
                                        OPEN "c:\bswapFile2.dat" FOR BINARY AS #1
                                        t = TIMER
                                        FOR ii = 1 TO 2000000
                                    '       rnd2
                                           !mov x, eax
                                    '       rnd2
                                           !mov y, eax
                                           y = y - x
                                           @Lptr = y
                                           INCR Lptr
                                        PUT #1,, lineo
                                        ? "k" & STR$(TIMER - t)
                                    END FUNCTION


                                    • #19
                                      Ignorance is bliss indeed
                                      My beast is currently at 45°C. If it gets any cooler I'll have to get it a scarf.
                                      Mine is running 36C and it is alarming me because it should be 28C normally

                                      (I wonder if the heater was turned on and no one told me???)
                                      Engineer's Motto: If it aint broke take it apart and fix it

                                      "If at 1st you don't succeed... call it version 1.0"

                                      "Half of Programming is coding"....."The other 90% is DEBUGGING"

                                      "Document my code????" .... "WHYYY??? do you think they call it CODE? "


                                      • #20

                                        > Now the second rdtsc value has very little in common with the first in rnd2.

                                        It has everything in common. All the original information is still there - nothing removed and nothing added.

                                        Consider X = A XOR B

                                        X may look nothing like A but multiply both sides with XOR B and we get

                                        X XOR B = A XOR B XOR B
                                        = A XOR 0
                                        = A

                                        !BSWAP EAX
                                        !ROR EAX, 1

                                        can be reverse engineered as well.

                                        Put another way. I give you 32 bits of random data. You make a copy and then apply an algorithm to the original which can be reverse engineered and tell me that you are now going to hand back 64 bits of random data. Don't think so.

                                        !BSWAP EAX
                                        !ROR EAX, 1

                                        is cryptography, not statistics.

                                        With regard the the exe I haven't a clue what you are doing. I need some sleep - I'll look at it when I'm bushy tailed - stopped being bright eyed ages ago.


                                        At one point I saw 80 odd °C and hence the CPU throttling. Left off overnight the boot saw 37ºC. I'm at 50ºC at the moment. I got a quote this morning for a supply and fit new fan and heat sink. It won't kill me so that is what I'll do. Fit it myself? I could write a book on the near death experiences I've had with DIY projects. Murphy's Law kicks in as soon as a pick up a screw driver.