
On Hamming Weight


  • #21
    Describing a pseudo-random number generator as "just wonderful" sums up the last post for me.

    Anyone new to PowerBASIC should be made aware that "it works as advertised" is a false statement if we request more than 4 KB of data. By that, I mean there is evidence that beyond 4 KB the data examined is not random. So, any floats beyond 1 KB should not be trusted to be random, as advertised, and it is misleading to say otherwise.


    • #22
      By that, I mean there is evidence that beyond 4 KB the data examined is not random. So, any floats beyond 1 KB...
      I am not sure what you mean by "4 KB" and "1 KB" here (the number of values requested? The total size, at 4 bytes (SIZEOF(SINGLE)) per call to RND(), for a single seeding?), but in any case, where is said evidence?
      Michael Mattias
      Tal Systems (retired)
      Port Washington WI USA
      [email protected]


      • #23
        By 4 KB I mean just that: 4096 bytes. By 1 KB I mean 1024 RND requests.

        The evidence was obtained using PractRand 0.94. Years ago we used George Marsaglia's Diehard test suite. PractRand is now the de facto standard test suite, used by Melissa O'Neill, Sebastiano Vigna, Bernard Widynski, Mark Overton and many other PRNG developers.

        Mr Mattias, I admire your loyalty to PB's RND, but in 2021 you are on very thin ice. If someone asked me, "I just want a few hundred RNDs; is PB's RND OK?", I would say, "Yes, probably, because I couldn't prove otherwise." If more than about 1 KB of RNDs are required, then I would strongly recommend another generator. We could use PCG32, for example, which passes PractRand to 16 TB of data, that is, 4 × 2^40 four-byte floats, and that is a lot of floats. PCG32 has a period of 2^64 compared with PB's 2^32, and in this day and age 2^32 is regarded as inadequate. PB's throughput should be OK for most applications, but if speed is an issue, the generators developed in the last few years are very much faster.

        PB's RND still has a role to play, but that role is small.
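        For anyone curious what PCG32 actually does, the core is only a few lines. Below is a sketch in Python, transliterated from O'Neill's published minimal C implementation; the class name and the next_float scaling are my own additions for illustration.

```python
# Minimal PCG32, transliterated from the published C reference (pcg32_random_r).
# The 64-bit state advances through a linear congruential step; the 32-bit
# output is an xorshifted, randomly rotated slice of the *previous* state.
MASK64 = (1 << 64) - 1
PCG_MULT = 6364136223846793005  # the reference 64-bit LCG multiplier

class PCG32:
    def __init__(self, seed, stream=54):
        self.inc = ((stream << 1) | 1) & MASK64  # increment must be odd
        self.state = 0
        self.next_u32()                          # reference seeding procedure
        self.state = (self.state + seed) & MASK64
        self.next_u32()

    def next_u32(self):
        old = self.state
        self.state = (old * PCG_MULT + self.inc) & MASK64
        xorshifted = (((old >> 18) ^ old) >> 27) & 0xFFFFFFFF
        rot = old >> 59
        return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & 0xFFFFFFFF

    def next_float(self):
        return self.next_u32() / 2.0 ** 32       # uniform float in [0, 1)
```

        The period is 2^64 per stream, and each odd increment selects a distinct stream.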


        • #24
          Another way:

          The average HW of the n-binary-digit numbers is the sum of all their HWs divided by 2^n.

          The sum of all the HWs can be partitioned into:
            1 * (# with HW = 1)
          + 2 * (# with HW = 2)
          + 3 * (# with HW = 3)
          + ...
          + n * (# with HW = n)
          Let C(n,k) = the number of ways k things can be chosen from n things without regard for order:
          C(n,k)  =  n! / (k! * (n - k)!).
          Then the first expression above equals
          1 * C(n,1) + 2 * C(n,2) + 3 * C(n,3) + ... + n * C(n,n)
          Offhand I don’t see a simplification of this expression. For example,
          the sum of all HWs for n = 5 is
          5 + 20 + 30 + 20 + 5 = 80
          and thus the average HW for a 5-binary-digit number is
          80/32 = 2.5
          Turning that around (each of the n bit positions is set in exactly half of the 2^n numbers, so the total is n * 2^(n-1)), we have the combinatorial identity:
          1 * C(n,1) + 2 * C(n,2) + 3 * C(n,3) + ... + n * C(n,n)  =  n * 2^(n-1)
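          The identity is easy to check by brute force. A quick sketch (Python here purely for the arithmetic):

```python
from math import comb

def total_hamming_weight(n):
    """Sum of k * C(n, k) over k = 1..n, i.e. the total HW of all n-bit numbers."""
    return sum(k * comb(n, k) for k in range(1, n + 1))

# The identity: 1*C(n,1) + 2*C(n,2) + ... + n*C(n,n) = n * 2^(n-1)
for n in range(1, 16):
    assert total_hamming_weight(n) == n * 2 ** (n - 1)

# The n = 5 example from the post: 5 + 20 + 30 + 20 + 5 = 80, average 80/32 = 2.5
print(total_hamming_weight(5), total_hamming_weight(5) / 2 ** 5)  # 80 2.5
```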
          Last edited by Mark Hunter; 17 Aug 2021, 04:16 AM.
          Politically incorrect signatures about immigration patriots are forbidden. Searching “immigration patriots” is forbidden. Thinking about searching ... well, don’t even think about it.


          • #25
            Well done Mark. I spotted your 64 as opposed to 32 and have been looking for a simplification of the i * C(n,i) expression, but you beat me to it.

            I mentioned, in post #14, "If we ran the code to exhaustion, each row would be 32!/(r! * (32 - r)!) [Combination], so 05, for example, would read 201376." but, it seems, that was not entirely true.


            • #26
              Mr Mattias I admire your loyalty to PB's RND
              I am not loyal to any brand (except maybe Oracle for its DB) ... but I am loyal to "works as advertised."

              I have a hunch we are not on the same page here, but I first will have to read up on this "PractRand" thing. But today I have to go to the gym and then come home, have some lunch then drive down to Grafton to pick up my brother and drive him up to Two Rivers (WI). Probably won't be back until dinnertime so it won't be until tomorrow at soonest.
              Michael Mattias
              Tal Systems (retired)
              Port Washington WI USA
              [email protected]


              • #27
                PractRand is a bit 'geeky' and not everyone takes to it. I struggled with it at first, and it has a CLI which I am not overly keen on. John Gleason found it some years ago. I had to make it work, so persevered. I have had to help out a few at FreeBASIC as some of them found it a bit of a struggle as well. Give me a PRNG now, and I will test it fairly quickly.

                If you have a bit of spare time, have a look at PCG32. It has a lot of functionality and the DLL is only 27 KB. Admittedly, it is not as convenient as just using RND, but it isn't that inconvenient using a DLL.

                FreeBASIC has a useful keyword called #undef. I can develop using one of FB's PRNGs via RND. When I am done, I can include one of my PRNGs' .bas files and then execute this, for example:
                  Dim Shared As pcg32 pcg
                  #undef Rnd
                  #define Rnd pcg.randse
                My source code does not require any other editing. RND is left as is and uses my PRNG.
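                PowerBASIC has no #undef, but the underlying idea, swapping the generator behind an existing name so that no call sites change, can be sketched in other languages too. A hypothetical Python analogue, rebinding the library function to a toy stand-in generator (the LCG constants are Knuth's MMIX values, not PCG32):

```python
import random

def make_lcg(seed):
    """A deliberately simple 64-bit LCG stand-in; constants from Knuth's MMIX."""
    state = seed & ((1 << 64) - 1)
    def rnd():
        nonlocal state
        state = (state * 6364136223846793005 + 1442695040888963407) % (1 << 64)
        return (state >> 11) / 2.0 ** 53   # top 53 bits -> float in [0, 1)
    return rnd

random.random = make_lcg(12345)   # existing code calling random.random()
x = random.random()               # now uses the replacement, no other edits
assert 0.0 <= x < 1.0
```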


                • #28
                  I have had a 'new feature request' in for #undef, or an equivalent, for about twelve years and 3 or 4 major releases.

                  Still waiting.

                  (Really handy if it applies to DECLARE statements and if %DEF() may be used against procedures. Great when the definitions change across releases.)
                  Michael Mattias
                  Tal Systems (retired)
                  Port Washington WI USA
                  [email protected]


                  • #29
                    Still waiting.
                    I think that Bob Zale may have been like some bosses that I have worked with over the years, when I had to put an idea to them and manipulate the conversation so that it looked like the idea was theirs. If I didn't do that, there was no chance of my ideas being taken on board. The trick is to present the idea with an intended flaw and ask if they can spot the flaw. They would spot it and then take credit for the idea. Yes, it is a bit sneaky, but most of them were egotists and had it coming to them as far as I was concerned.


                    • #30
                      I think that Bob Zale may have been like some bosses that I have worked with over the years when I had to put an idea to them and manipulate the conversation so that it looked like the idea was theirs
                      Not my experience. I received several phone calls and/or emails over the years from Mr. Zale regarding new feature suggestions I had sent in. No "manipulation" was ever done in either direction.

                      But he really had to like the idea... these were always for something he was already planning on incorporating. That is, these were never "ifs"; they were always "whens".

                      Michael Mattias
                      Tal Systems (retired)
                      Port Washington WI USA
                      [email protected]


                      • #31
                        I agree with Michael; Bob Zale knew what he was doing and did it properly. I see it as presumptuous to suggest that he was manipulated in the manner suggested. Over time I have heard many suggestions about how PB should be developed in the future, but I have not seen the talent to do so, just aspirations of grandeur from people who don't have a clue.
                        hutch at movsd dot com
                        The MASM Forum



                        • #32
                          An obvious use of Hamming Weight (HW) is to consider a PRNG's state vector. To keep life simple, I looked at PCG32's state vector, which is a QWORD; 'typedef unsigned __int64 QWORD' in Microsoft parlance.

                          The consensus is that the initial state vector should have good entropy, by which is meant a HW of 32 ± k, where 0 <= k <= 32 and k should be close to zero, if not zero.
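                          Checking a candidate seed against that criterion only needs a popcount. A sketch, where the ±6 tolerance is my own arbitrary choice rather than a published threshold:

```python
def hamming_weight(x):
    """Number of set bits (popcount) in a 64-bit value."""
    return bin(x & ((1 << 64) - 1)).count("1")

def acceptable_seed(seed, tolerance=6):
    """Accept a 64-bit seed only if its HW lies within 32 +/- tolerance."""
    return abs(hamming_weight(seed) - 32) <= tolerance

assert hamming_weight(0) == 0
assert hamming_weight((1 << 64) - 1) == 64
assert acceptable_seed(0x5555555555555555)   # HW = 32, ideal
assert not acceptable_seed(1)                # HW = 1, far too low
```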

                          Finding a justification for that on the internet is like looking for a needle in a haystack. However, Sebastiano Vigna has shown that with an initial state of poor entropy, the early numbers generated by a PRNG have a poor quality of randomness. This has led to many suggesting that we 'burn' some early numbers by requesting a sufficient number to raise the entropy to a satisfactory value before using any. It seems that we do not suddenly jump from a poor entropy to a good entropy, but drift upwards. The upwards drift may be quick, but may also be slow.

                          Needless to say, we are talking about a bit of a black art here.

                          My latest approach is to force the initial state vector to have good entropy, allowing us to dispense with 'burning'. We can generate initial state vectors by a variety of means: another generator such as SplitMix64 (Vigna, 2015), a CSPRNG such as those provided by Microsoft, Intel's RDSEED (if you have it), or polling RDTSC, for example. I favour polling RDTSC because it is fast and pretty much unpredictable.
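                          SplitMix64 itself is only a few lines. Here is a Python transliteration of Vigna's C code, used as a seed-stretcher: feed it one entropy word (an RDTSC reading, say) and take successive outputs as state words. The fixed input below is just a stand-in for a real entropy source.

```python
MASK64 = (1 << 64) - 1

def splitmix64(state):
    """One step of Vigna's SplitMix64: returns (new_state, 64-bit output)."""
    state = (state + 0x9E3779B97F4A7C15) & MASK64
    z = state
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return state, z ^ (z >> 31)

# Stretch one entropy word (a fixed stand-in here) into two seed words
# for some downstream generator:
s = 0x123456789ABCDEF0
s, seed_lo = splitmix64(s)
s, seed_hi = splitmix64(s)
```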

                          Having done that, I then wondered what the HW of the state vector was during a generator session. I have always thought a good quality PRNG goes through stages of randomness, and every now and again we go through a section of the sequence where the randomness is not as good as we would like.

                          So I looked at the HW of the state vector of PCG32 after 100 million requests of a random DWORD and got the table below (HW, count). Because PowerBASIC is bereft of an unsigned 64-bit datatype, I had to use a different compiler.
                          0             0
                          1             0
                          2             0
                          3             0
                          4             0
                          5             0
                          6             0
                          7             0
                          8             0
                          9             2
                          10            9
                          11            35
                          12            179
                          13            719
                          14            2581
                          15            8768
                          16            26326
                          17            73926
                          18            195798
                          19            472148
                          20            1062866
                          21            2227881
                          22            4357553
                          23            7956287
                          24            13587614
                          25            21737022
                          26            32617893 }
                          27            45891669  }
                          28            60645056   }
                          29            75299464    }
                          30            87835259     }
                          31            96331150      }
                          32            99368156 89.66%
                          33            96328979      }
                          34            87815806     }
                          35            75281703    }
                          36            60652215   }
                          37            45906418  }
                          38            32596894 }
                          39            21748287
                          40            13589013
                          41            7952985
                          42            4353159
                          43            2228927
                          44            1064730
                          45            472979
                          46            195966
                          47            74892
                          48            26349
                          49            8703
                          50            2657
                          51            751
                          52            176
                          53            43
                          54            6
                          55            1
                          56            0
                          57            0
                          58            0
                          59            0
                          60            0
                          61            0
                          62            0
                          63            0
                          Clearly we have a binomial distribution and, not surprisingly, a HW of 32 gave the highest count. In fact, nearly 90% are in the range 32 ± 6. I have no idea if 32 ± 6 is good enough to use as an initial state. If it is, then anything outside that range is not good enough, and that happens about 10% of the time. So just using a random initial state is taking a chance, which justifies my approach of forcing the initial state vector to have good entropy.
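                          The 'nearly 90%' figure can be checked exactly against the binomial distribution: the fraction of 64-bit words with HW in 32 ± 6 is the sum of C(64,k)/2^64 for k = 26..38. A quick sketch:

```python
from math import comb

def fraction_within(n=64, centre=32, k=6):
    """Exact probability that a uniform n-bit word has HW within centre +/- k."""
    return sum(comb(n, j) for j in range(centre - k, centre + k + 1)) / 2 ** n

p = fraction_within()
print(f"{p:.2%}")   # 89.65%, matching the observed figure of roughly 89.66%
```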

                          Looking at the low HW in the above, the code is not forensic enough to tell us when they occurred. Remembering that "we do not suddenly jump from a poor entropy to a good entropy but drift upwards" I should imagine that the opposite is true, so those low HW values probably occurred at the same time.

                          This is speculation, but it may be that when we drift into the tail of the distribution and stay there too long before drifting back to the centre of the 'bell shape', PractRand issues an 'Evaluation unusual' message. If we are in the tail for much longer, PractRand may abort a test with a failure. There may be no correlation between an unusual 'p-value', originally used by George Marsaglia's Diehard suite of tests, and a poor current state vector entropy, but for what it is worth, I reckon that there is. If we do several tests with PractRand we may see 'Evaluation unusual' at different times: at 4 GB, or 16 GB, or whatever. This may be because we have entered the generator's sequence at different positions. We could have a run between two poor-entropy phases and see a 1 TB test with no 'Evaluation unusual' at all. Mark Overton, author of RomuTrio, suggests running PractRand to 4 TB, and more than once.

                          The above is not an indictment of PRNGs, although it may be. To my mind, we are simply observing how random numbers behave, and we will get unusual results even with quantum random numbers.

                          Drifting into a tail of the distribution for a short period is not grounds for concern. George Marsaglia wrote: By all means, do not, as a Statistician might, think that a p < .025 or p > .975 means that the RNG has "failed the test at the .05 level". Such p's happen among the hundreds that DIEHARD produces, even with good RNG's. So keep in mind that "p happens".

                          So, is the above telling us to change our ways? I think it is. I am now convinced that simply 'burning' the early numbers of a PRNG is not good enough. If we have a low-entropy initial state vector, then burning an insufficient number of early values may push us further into a tail. I do not have Intel's RDSEED, but if I did, I would not use a seed from it unless it had a quality HW. Intel tell us that RDSEED is multiplicative prediction resistant, whereas RDRAND is only additive prediction resistant. I have no idea what that actually means, and I have not seen anyone else, other than Intel, mention prediction resistance.


                          • #33
                            Originally posted by Yours truly
                            so those low HW values probably occurred at the same time.
                            After making the code a little more forensic, the above statement was found to be false. It seems that if we do drift into a tail, the residency is short-lived. In fact, I couldn't find any evidence that we remained there after the next random number request. 100 million requests is small compared to a PractRand run of 1 TB, so a longer test may see us in a tail for longer. Getting a few 'Evaluation unusual' reports in a 1 TB PractRand run can be ignored. In fact, there is a switch in PractRand to stop it reporting 'Evaluation unusual'. I don't use it; I want to see every anomaly, no matter how small.

                            So, with a quality PRNG like PCG32, drifting into a tail during a generator session is not an issue.

                            However, not rejecting initial state vectors with non-optimal entropy is an issue. Using an initial state with optimal entropy not only ensures that the early numbers generated are not suspect, it also means 'burning' is no longer required.
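                            A more forensic version of the experiment just tracks how long the state's HW stays outside the band. A sketch using the same 64-bit LCG state transition that PCG32 uses internally; the increment and the ±6 band are my own choices:

```python
MASK64 = (1 << 64) - 1
MULT = 6364136223846793005          # PCG's default 64-bit LCG multiplier
INC = 1442695040888963407           # any odd constant serves as the increment

def longest_tail_run(seed, steps, band=6):
    """Longest run of consecutive states whose HW falls outside 32 +/- band."""
    state, run, longest = seed & MASK64, 0, 0
    for _ in range(steps):
        state = (state * MULT + INC) & MASK64
        if abs(bin(state).count("1") - 32) > band:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest

print(longest_tail_run(0x9E3779B97F4A7C15, 100_000))
```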


                            • #34
                              Originally posted by Dale Yarker
                              Yes, some 64-bit operations are possible. I've "experimented" a little with SHA-512 (which is 64-bit) in PB assembly on the CPU. Did you use CPU or FPU registers for your PRNGs?

                              The CPU has assembly DIV / IDIV and MUL / IMUL to cover signed and unsigned arithmetic (ADD and SUB don't need "I" versions). My "assumption" has been that Intel did not put an unsigned version of divide or multiply in the FPU; any arithmetic that set the 64th bit would make a mess with automatic two's complement. So, implementation of QWORD would only be for some operations. (And PRNGs are strictly unsigned.)
                              The FPU holds the 64 bit mantissa as an unsigned 64 bit number.
                              There is no twos complement.
                              If there is an overflow from the 64th bit, then the most significant 64 bits are kept, the result is shifted right, the low bits are lost, and the exponent is adjusted.
                              If the FPU multiplies or divides signed numbers then the 64 bit mantissa stays as unsigned and the extra sign bit is set or cleared accordingly.

                              Whether the FPU can be used efficiently in random number generators depends on what you're trying to do with it.
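                              The practical effect of mantissa width is easy to demonstrate with ordinary doubles. Python floats are IEEE 64-bit doubles with a 53-bit significand, so they lose the low bits of large 64-bit integers, which is exactly the loss the x87's 64-bit significand avoids for QWORD values. A quick sketch:

```python
import sys

# A double's significand holds 53 bits, so integers up to 2^53 round-trip exactly:
assert float(1 << 53) == 1 << 53
assert float((1 << 52) + 1) == (1 << 52) + 1

# Beyond 53 bits, low-order bits are rounded away -- the loss the x87's
# 64-bit significand avoids for QWORD values:
big = (1 << 63) + 1
assert float(big) != big            # rounds to 2^63; the +1 is lost
assert sys.float_info.mant_dig == 53
```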


                              • #35
                                Here is the speed, in MHz, of one of my PRNGs.

                                The FreeBASIC compiler uses the FPU by default, but I can force it to use SSE instead.
                                          32-bit  64-bit
                                     SSE    599    1668
                                     FPU    599    1550
                                In 32-bit mode there is no difference, but in 64-bit mode SSE is 7.6% faster. In fact, I have SSE hardwired in the compiler's command line.

                                 The above throughputs exclude the For/Next overhead.


                                • #36
                                  Extended precision - Wikipedia
                                      I guess I misread the source, where it says the 80-bit extended mantissa is 63 bits for the fraction and 1 bit for the sign, with the remaining bits for the exponent.


                                  • #37
                                    Lots of people get it wrong.
                                    The 63 bits for the fraction is correct but there's the leading "1" for the integer part of the mantissa which makes 64 bits in total.

                                    The diagram from the Intel manual makes it clearer:
                                    Bits 0-63 (i.e. 64 bits) make up the significand.
                                    Bits 64-78 (i.e. 15 bits) make up the exponent.
                                    Bit 79 is the sign bit.


                                    • #38
                                      Oh well. But a fixed "1" bit sounds just as useless for unsigned 64 bit work as a sign bit.


                                      Bold, larger font and red added after next post was made. Text was not changed. Maybe braille will help?


                                      • #39
                                        The leading "1" is not useless, it's the Most Significant Bit.
                                        In binary, other than number zero, all numbers begin with a 1.
                                        With the FPU, the exponent is adjusted to make sure the first 1 in the number is in bit 63 which then always allows for 64 bits of precision.

                                        Imagine you only have 3 bits.
                                        How would the unsigned numbers be represented?

                                        The way it works with the CPU:
                                        0 000
                                        1 001  This is really 1
                                        2 010  This is really 10
                                        3 011  This is really 11
                                        4 100  This is really 100
                                        5 101  This is really 101
                                        6 110  This is really 110
                                        7 111  This is really 111
                                        Note that apart from the number zero, all the numbers really begin with a 1 but when they are stored right aligned in a register the preceding bits before the first bit in the number are padded out with 0s.

                                        How it works in the FPU:
                                        0 000 plus exponent of 0 = 0 x 2^0 = zero
                                        1 100 plus exponent of 0 = 1.00 x 2^0 = 1
                                        2 100 plus exponent of 1 = 1.00 x 2^1 = 2
                                        3 110 plus exponent of 1 = 1.50 x 2^1 = 3
                                        4 100 plus exponent of 2 = 1.00 x 2^2 = 4
                                        5 101 plus exponent of 2 = 1.25 x 2^2 = 5
                                        6 110 plus exponent of 2 = 1.50 x 2^2 = 6
                                        7 111 plus exponent of 2 = 1.75 x 2^2 = 7
                                        Note that for the FPU, for all numbers other than zero, the first bit is always set. The bits of the number are left aligned in the register so the padding is after the number instead of before it. But this still fully represents the whole range of 3 bit numbers, or in the case of the real FPU, the full range of 64 bit numbers.


                                        • #40
                                          I did not say useless for floating point work! In fact it is standard to have one digit left of the decimal point, 1 through 9 in decimal. In binary the only possible digit left of the binary point is 1, so it is fixed or assumed.

                                          Your discussion is all true for signed floating point. I'm saying it is useless for unsigned integers. In the first go-around I was in error about sign bit vs MSB, and you corrected me. After that you forgot how to read! Post 38 was made more obvious for you!

                                          Sorry for interruption David.