Announcement

Collapse
No announcement yet.

Algorithms for combinations of 5 numbers in 50

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Originally posted by Ribeiro Alvo View Post
    Did you run the program?
    I do not think so.
    Apologies. Please see my post above. I missed a line in my test program. I'll let you know what I find with a proper test.

    Comment


    • It took me a while to realise how you got such a flat distribution from such a flawed random selection. I finally realised that these lines in your loop:
      Code:
      ...
      r+=1
      .... 
        n(i)=(n(i)+r) MOD 50+1 ...
      means that you are not doing 10 million independent trials of one algorithm, you are using either 50 or 10 million (depending on how you look at it) different algorithms to generate your test data. The probability of a certain combination will change with every iteration of your For...Next Loop.

      Comment


      • Even better.
        Very good distribution.
        Although not as elegant with Knuth's algorithm, it does the same work as an alternative.
        Here goes.

        Code:
        'PBCC Program
        #COMPILE EXE
        #DIM ALL
        
        FUNCTION PBMAIN () AS LONG
         LOCAL i,r AS BYTE
         DIM n(1 TO 5) AS BYTE
         RANDOMIZE TIMER
        
         r=RND(0,45)
         FOR i=1 TO 5
          n(i)=(n(i-1)+r+i)
         NEXT
        
        DO
         r=RND(1,50)
         n(5)=RND(n(4)+1,50)
         FOR i=1 TO 5
          n(i)=(n(i)+r) MOD 50+1
          PRINT RIGHT$(" "+STR$(n(i)),3);
         NEXT
          PRINT
        LOOP UNTIL WAITKEY$=$ESC
        
        END FUNCTION
        Last edited by Ribeiro Alvo; 13 Apr 2019, 10:11 AM.

        Comment


        • Originally posted by Ribeiro Alvo View Post
          Even better.
          Very good distribution.
          Although not as elegant with Knuth's algorithm, it does the same work as an alternative.
          Here goes.

          Code:
          'PBCC Program
          #COMPILE EXE
          #DIM ALL
          
          FUNCTION PBMAIN () AS LONG
          LOCAL i,r AS BYTE
          DIM n(0 TO 5) AS BYTE
          RANDOMIZE TIMER
          
          n(0)=RND(0,45)
          FOR i=1 TO 5
          n(i)=n(i-1)+i
          NEXT
          
          DO
          r+=1
          n(5)=RND(n(4)+1,50)
          FOR i=1 TO 5
          n(i)=(n(i)+r) MOD 50+1
          PRINT RIGHT$(" "+STR$(n(i)),3);
          NEXT
          PRINT
          LOOP UNTIL WAITKEY$=$ESC
          
          END FUNCTION
          So n(1) to n(4) are initialised with a fixed sequence based solely on n(0)?
          And after you set the initial values of n(1) to n(4), you just increment them in each iteration? The ONLY random valuer after initialisation is n(5)?

          Are you sure that "DO" is in the correct place?

          Apparently you didn't understand the implication of post #302.

          Even if you move the "DO", it is still a VERY poor implementation for selecting ONE instance of "5 out of 50". It certainly doesn't do the same as a single instance of a FY/K shuffle.

          It is only a good distribution of individual digits when iterated a multiple of 50 times. Even then, it is a very poor distribution of combinations (which is the whole point of the 5 out of 50 for lottery pick etc).


          The individual digit distribution only flattens out because you are implementing 50 different algorithms by incrementing r every time you you loop through the algorithm. Each of those 50 algorithms contains a strong bias towards certain numbers, but each loop shifts the bias by one position in the number line, consequently flattening out the distribution only after a multiple of 50 iterations and only when you are selecting from 50 numbers. .

          Comment


          • Stuart

            I changed the code for even better.

            Code:
            r=RND(1,50)

            Comment


            • Originally posted by Ribeiro Alvo View Post
              Stuart

              I changed the code for even better.

              Code:
              r=RND(1,50)
              That does not make it "even better", it makes it "far worse". It doesn't fix the fundamental problem. It's still 50 different algorithms, only now you are not ensuring that they are used sequentially. IOW, much of the time, your digit distribution will be worse.

              At least the change to
              Code:
              r=RND(0,45)
                FOR i=1 TO 5   n(i)=(n(i-1)+r+i)  NEXT
              does increase the INITIAL randomness to some extent. However, there are still only 46 possible initial combinations of the first four columns (you discard the value in n(5) later)
              r n1 n2 n3 n4
              0 1 3 6 10
              1 2 5 9 14
              2 3 7 12 18
              3 4 9 15 22
              4 5 11 18 26
              5 6 13 21 30
              6 7 15 24 34
              7 8 17 27 38
              8 9 19 30 42
              9 10 21 33 46
              10 11 23 36 50
              11 12 25 39 54
              12 13 27 42 58
              13 14 29 45 62
              14 15 31 48 66
              15 16 33 51 70
              16 17 35 54 74
              17 18 37 57 78
              18 19 39 60 82
              19 20 41 63 86
              20 21 43 66 90
              21 22 45 69 94
              22 23 47 72 98
              23 24 49 75 102
              24 25 51 78 106
              25 26 53 81 110
              26 27 55 84 114
              27 28 57 87 118
              28 29 59 90 122
              29 30 61 93 126
              30 31 63 96 130
              31 32 65 99 134
              32 33 67 102 138
              33 34 69 105 142
              34 35 71 108 146
              35 36 73 111 150
              36 37 75 114 154
              37 38 77 117 158
              38 39 79 120 162
              39 40 81 123 166
              40 41 83 126 170
              41 42 85 129 174
              42 43 87 132 178
              43 44 89 135 182
              44 45 91 138 186
              45 46 93 141 190
              Next we have:
              n(5)=RND(n(4)+1,50) so for each value of n(1)...n(4), n(5) has a limited number of values.

              Finally we do this:
              n(i)=(n(i)+r) MOD 50+1 to return each of the values (which can be up to "190") to the range 1 to 50.

              Here's a rewrite which uses your exact algorithm,but instead of selecting a couple of random numbers, it iterates through all possible values. Turns out there are about 40,000 of them but only about 1300 different combinations. Unfortunately, we are no longer generating unique combinations. For instance, the set 1,1,2,3,4 appears. There are many other non unique combinations as well. Bottom Line; the latest version is totally useless.

              Code:
              #COMPILE EXE
              #DIM ALL
              
              FUNCTION PBMAIN () AS LONG
              DIM n(1 TO 5) AS BYTE
              LOCAL r1,r2,r3, i AS LONG
              
              'Generate all possible combinations for Ribeiro's latest version
              OPEN "NotRandom.txt" FOR OUTPUT AS #1
              '=====Ribeiro part 1
              'r=RND(0,45)
              ' FOR i=1 TO 5
              '  n(i)=(n(i-1)+r+i)
              ' NEXT
              
              'So we iterate through each of the possible values of r
              FOR r1=0 TO 45
                  FOR i=1 TO 5
                      n(i)=(n(i-1)+r1+i)
                  NEXT
              
                  '======Ribeiro part 2
                  'r=RND(1,50)
              
                  'now let's iterate through each of these possibles in an inner loop
                  FOR r2 = 1 TO 50
                       '======Ribeiro part 3
                      ' n(5)=RND(n(4)+1,50)
              
                      'and we let loop through all possible values of n(5) for each value of n(1) to n(4)
                       FOR r3 = MIN(n(4) + 1, 50) TO MAX(n(4)+1,50) 'note n(4) is one of 45 values in the range 10 to 190!
                          n(5)=r3
              
                          ' Ribeiro Part 4
                          'FOR i=1 TO 5
                          '  n(i)=(n(i)+r) MOD 50+1
                          '  PRINT RIGHT$(" "+STR$(n(i)),3);
                          ' NEXT
              
                          ' we do the same and store the values of n(1) to n(5) sorted for ease of analysis
                          FOR i=1 TO 5
                              n(i)=(n(i)+r2) MOD 50+1
                          NEXT
                          ARRAY SORT n()
                          PRINT #1, n(1) $TAB n(2) $TAB n(3) $TAB n(4) $TAB n(5)
                      NEXT  'r3
                  NEXT 'r2
              NEXT  'r1
              CLOSE #1
              CLOSE #1
              ? "Done"
              END FUNCTION

              Comment


              • Stuart

                Without being impolite, I do not even know what to say to comment on this analysis.
                Run the code below. To my algorithm and then to Knuth's.

                Code:
                'PBCC Program
                #COMPILE EXE
                #DIM ALL
                
                FUNCTION PBMAIN () AS LONG
                
                 PRINT " Processing..."
                 LOCAL i,a,b,c,d,e,m,n,r AS BYTE
                 LOCAL y,t AS LONG
                 LOCAL x,h AS QUAD
                 RANDOMIZE TIMER
                 m=50
                 n=5
                 h=10000000
                 DIM n(0 TO 5) AS BYTE
                 DIM nums(1 TO 50) AS BYTE
                 DIM freq(1 TO 50) AS LONG
                 DIM fr(1 TO 2118760) AS LONG
                 DIM qn(1 TO 2118760) AS STRING
                 DIM qr(1 TO h) AS LONG
                 DIM pn(1 TO 46, 2 TO 47, 3 TO 48,4TO 49,5 TO 50) AS LONG
                
                 FOR a=1 TO 46
                  FOR b=a+1 TO 47
                   FOR c=b+1 TO 48
                    FOR  d=c+1 TO 49
                      FOR e=d+1 TO 50
                        y+=1
                        pn(a,b,c,d,e)=y
                        qn(y)=CHR$(a,b,c,d,e)
                      NEXT
                    NEXT
                   NEXT
                  NEXT
                 NEXT
                
                
                FOR i=1 TO m
                 nums(i)=i
                NEXT
                
                 r=RND(0,45)
                 FOR i=1 TO 5
                  n(i)=n(i-1)+r+i
                 NEXT
                
                FOR x=1 TO h
                
                'My algorithm
                 n(5)=RND(n(4)+1,50)
                 r=RND(1,50)
                 FOR i=1 TO 5
                  n(i)=(n(i)+r) MOD 50+1
                 NEXT
                
                'Knuth Shuffle
                'for i=1 to 5
                ' SWAP nums(i),nums(RND(i,45+i)  )
                ' n(i)=nums(i)
                'next
                
                 ARRAY SORT n(1) FOR 5
                
                 qr(x)=pn(n(1),n(2),n(3),n(4),n(5))
                
                 fr(qr(x))+=1
                 IF fr(qr(x))=1 THEN t+=1
                
                FOR i=1 TO 5
                   freq(n(i))+=1
                  'PRINT right$(" "+str$(n(i)),3);
                NEXT
                  'PRINT:if WAITKEY$=$ESC THEN END
                
                NEXT
                
                FOR i=1 TO 50
                 PRINT i,freq(i),fr(i)
                NEXT
                
                PRINT t,t/2118760*100
                WAITKEY$
                
                END FUNCTION

                Comment


                • I give up.

                  Click image for larger version

Name:	i-can-explain_1024x1024.jpg
Views:	75
Size:	35.8 KB
ID:	780353

                  Comment


                  • hutch at movsd dot com
                    The MASM Forum

                    www.masm32.com

                    Comment


                    • Here is one more.
                      This is based on the separation of the set of numbers into groups.
                      In this case I divided the 50 numbers into 5 groups, with sizes between 1 and 12.

                      Code:
                      'PBCC Program
                      #COMPILE EXE
                      #DIM ALL
                      
                      FUNCTION PBMAIN () AS LONG
                       LOCAL i,r,s,u AS BYTE
                       RANDOMIZE TIMER
                       DIM n(1 TO 5) AS BYTE
                       DIM g(1 TO 50) AS BYTE
                       DIM tg(1 TO 5) AS BYTE
                       DIM ng(1 TO 5, 1 TO 46) AS BYTE
                      
                      DO
                       r=RND(0,49)
                       FOR u=1 TO 50
                        g(u)=(r+u) MOD 50+1
                       NEXT
                       s=0
                       FOR i=1 TO 5
                        tg(i)=RND(1,12)
                        IF i=5 THEN tg(i)=50-s
                        FOR u=1 TO tg(i)
                         s+=1
                         ng(i,u)=g(s)
                        NEXT
                        PRINT RIGHT$(" "+STR$(ng(i,RND(1,tg(i)))),3);
                       NEXT
                       PRINT
                      LOOP UNTIL WAITKEY$=$ESC
                      END FUNCTION

                      Comment


                      • Originally posted by Ribeiro Alvo View Post
                        Here is one more.
                        This is based on the separation of the set of numbers into groups.
                        In this case I divided the 50 numbers into 5 groups, with sizes between 1 and 12.
                        Why?

                        It would help if you commented your code so we know what you THINK you are doing.
                        (And meaningful variable names and indentation won't hurt either).



                        Comment


                        • Originally posted by Ribeiro
                          Here is one more.
                          Keep going, Ribeiro, you are only two short of the Guinness book of records.

                          Originally posted by Stuart
                          (And meaningful variable names and indentation won't hurt either).
                          Amen to that, and I would suggest, Stuart, a lie down in the nearest darkened room before you have a breakdown.

                          Comment


                          • Stuart

                            Is it necessary to comment on these few lines?
                            But if that's the case, I'll do it.

                            David

                            Guinness book of records. LOL

                            Comment


                            • I cant be bothered going through the faulty logic in thinking that this can generate an evenly distributed full spectrum of combinations

                              All I will say that the the programming logic is also faulty.

                              This: "ng(i,u)=g(s) "

                              Generates a Subscript Out Of Range error as soon as u is greater than 12.

                              I'll leave it up to you to work out why u exceeds 12 but this may give you a clue "=50-s"

                              Comment


                              • Stuart

                                Thanks for the note.
                                The value of 'u' can reach 46 if the sum of the first 4 groups is 4.
                                Thanks again.
                                Corrected code.

                                Comment


                                • Originally posted by Ribeiro Alvo View Post
                                  Stuart

                                  Thanks for the note.
                                  The value of 'u' can reach 46 if the sum of the first 4 groups is 4.
                                  Thanks again.
                                  Corrected code.
                                  Editing the code in a previous post without commenting as to what you have changed is very bad practice. It makes nonsense of subsequent posts.

                                  For the record:

                                  You edited POST #310 to change
                                  DIM ng(1 TO 5, 1 TO 12) AS BYTE to DIM ng(1 TO 5, 1 TO 46) AS BYTE

                                  Comment


                                  • At the end of this whole topic, I can conclude that, for me, the best performance generator at all levels is what, from a Pseudo Random integer, gives us the corresponding combination.
                                    Thanks to Paul Dixon for Post # 75.
                                    Here is an example code for that purpose.

                                    Code:
                                    'PBCC6 program
                                    #COMPILE EXE
                                    #DIM ALL
                                    
                                    FUNCTION PBMAIN () AS LONG
                                    
                                     LOCAL i,m,r,u AS LONG
                                     LOCAL a,c,h AS QUAD
                                     DIM comb(1 TO 52 , 0 TO 5) AS QUAD
                                     RANDOMIZE TIMER
                                    
                                    FOR m=1 TO 52
                                      FOR i=0 TO 5
                                       h=1
                                        FOR a=(m-i+1) TO m
                                         h*=a
                                        NEXT
                                        FOR a=1 TO i
                                         h/=a
                                        NEXT
                                        comb(m,i)=h
                                       NEXT
                                     NEXT
                                    
                                    DO
                                     r=RND(1,comb(50,5))
                                     c=0
                                     FOR i=4 TO 0 STEP-1
                                      c+=1
                                      FOR u=1 TO 45
                                       IF r-comb(50-c,i) <1 THEN EXIT
                                       r-=comb(50-c,i)
                                       c+=1
                                      NEXT
                                      PRINT RIGHT$(" "+STR$(c),3);
                                     NEXT
                                     PRINT
                                     LOOP UNTIL WAITKEY$=$ESC
                                    ERASE comb()
                                    END FUNCTION
                                    Last edited by Ribeiro Alvo; 15 Apr 2019, 11:56 PM. Reason: Bug fix.

                                    Comment


                                    • Do you test your code before posting? More "Subscript out of range" errors! Second one occurs after you correct the first error in the code.

                                      Comment


                                      • Originally posted by Stuart McLachlan View Post
                                        Do you test your code before posting? More "Subscript out of range" errors! Second one occurs after you correct the first error in the code.
                                        No.
                                        It is now fixed.
                                        Thanks Stuart.

                                        Comment


                                        • Originally posted by Ribeiro
                                          It is now fixed.
                                          HOW?

                                          You are not listening to Stuart.
                                          Originally posted by Stuart
                                          Editing the code in a previous post without commenting as to what you have changed is very bad practice. It makes nonsense of subsequent posts.
                                          Use '#Debug Display'. Your code is violating comb's second parameter limits all over the place.

                                          Only Stuart has stayed with this thread, even though he wrote "I give up". The rest have given up without saying so.

                                          LISTEN & LEARN.

                                          Comment

                                          Working...
                                          X