Announcement

Collapse
No announcement yet.

ARRAY DELETE on selected rows

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • ARRAY DELETE on selected rows

    Consider an array of CSV rows obtained through FILESCAN and LINE INPUT#. SQL is the simplest way to show my intentions:
    Code:
    DELETE FROM MyArray WHERE CSVcolumn12 <> 68
    /*
    Being able to run SQL against an array.... Might be a good library idea?
    */

    The deleted rows will never be replaced by anything, so I want to be able to REDIM PRESERVE to the smaller size. This is not only to save memory but also because the UBOUND is used elsewhere.
    Erich Schulman (KT4VOL/KTN4CA)
    Go Big Orange

  • #2
    Code:
    nElement =  UBOUND (MyArray,1) + 1    ' assume zero-based array
    DO 
        ARRAY SCAN MyArray(0) FOR nElement, <> 68,  to iHit 
         IF ISTRUE iHit 
             DECR   iHit        ' correct to allow for 0-based array 
             ARRAY DELETE MyArray (iHit) 
             DECR nElement 
         ELSE
           EXIT DO   ' no more hits, all done
        END IF 
    LOOP
    REDIM PRESERVE myArray (nElement -1)
    Wrap and gift-wrap as desired.

    MCM
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    [email protected]
    http://www.talsystems.com

    Comment


    • #3
      Or, did you mean each array element is a CSV-string representing a row of a table?

      In that case....

      1. Write Array to disk, (JOIN$(Myarray(), $CRLF).
      2a. Set up an ODBC datasource; use the Microsoft text driver and write all the SQL you want using your favorite techniques (ODBC API or ADO).
      2b. If you can't be bothered to set up an ODBC datasource, simply look here....

      .. where you will find the "DSN-less" connection string for use with ADO.... and use all the SQL statements you want

      3. When done manipulating the data with SQL, reload remaining rows into your PB Array.

      MCM

      PS:

      4. Post demo of same in source code forum
      Last edited by Michael Mattias; 27 Feb 2009, 11:25 AM.
      Michael Mattias
      Tal Systems (retired)
      Port Washington WI USA
      [email protected]
      http://www.talsystems.com

      Comment


      • #4
        Originally posted by Michael Mattias View Post
        Or, did you mean each array element is a CSV-string representing a row of a table?
        Yes, that is what I am working with. I don't have to do this using SQL. I thought posting the SQL line would explain my intentions more succinctly than in English.

        I know that I could DIM a temporary array of equal size, copy the rows I want to keep, REDIM the original array without PRESERVE, copy back from the temp array, and ERASE the temp array. But that just seems like a terrible way to accomplish the task.
        Erich Schulman (KT4VOL/KTN4CA)
        Go Big Orange

        Comment


        • #5
          >But that just seems like a terrible way to accomplish the task

          Well, maybe you can think of another way.

          Personally I doubt I would have even considered working with an array of CSV-strings if I needed to delete rows, except perhaps to use a TWO-dimensional array subscripted on row and column.

          know that I could DIM a temporary array of equal size, copy the rows I want to keep, REDIM the original array without PRESERVE, copy back from the temp array, and ERASE the temp array.
          You don't need a temp array at all.
          Code:
          ub    = UBOUND (myArray,1) 
          iSub = LBOUND (Myarray,1) 
          nDel = 0 
          DO 
             IF PARSE$(MyArray (iSub), column_no) <> 68 THEN 
                   ARRAY DELETE MyArray (iSub) 
                   INCR nDel 
            ELSE 
                   INCR iSUB
            END IF
          LOOP UNTIL iSub > (ub - ndel)    ' Might need to add or subtract one here 
          REDIM PRESERVE myArray (ub - nDel)
          MCM
          Last edited by Michael Mattias; 27 Feb 2009, 12:20 PM. Reason: fixed error on last code line the REDIM PRESERVE
          Michael Mattias
          Tal Systems (retired)
          Port Washington WI USA
          [email protected]
          http://www.talsystems.com

          Comment


          • #6
            Here is a way:
            Code:
            #COMPILE EXE
            #DIM ALL
            
            FUNCTION PBMAIN () AS LONG
                LOCAL ii, writeElement AS LONG
                DIM arrCsv(10000) AS STRING
                
                FOR ii = 0 TO 10000
                   arrCsv(ii) = "10000,10001,10002," & USING$("#", ii AND 127) & ",10003,10004,10005"
                NEXT
            
                'now dump any records that have 4th field [COLOR="Red"]<>[/COLOR] 68 and compress the empty space...
            
                FOR ii = 0 TO 10000
                   IF INSTR(18, arrCsv(ii), "68") [COLOR="Red"]<>[/COLOR] 0 THEN
                      arrCsv(writeElement) = arrCsv(ii)
                      INCR writeElement
                   END IF
                NEXT
            
                
                REDIM PRESERVE arrCsv(writeElement - 1)
                ? "done"
            
            END FUNCTION
            added: I just noticed, you asked to delete the element if the column <> 68, but I deleted if it = 68. So I changed it above in red.
            Last edited by John Gleason; 27 Feb 2009, 01:08 PM.

            Comment


            • #7
              >you asked to delete the element if the column <> 68....

              <Stuff> happens... like not recognizing you made the job more complicated than it needed to be...

              Code:
              ub    = UBOUND (myArray,1) 
              lb     = LBOUND(myarray,1)
              iSub = lb
              DO  WHILE iSub <=Ub 
                 IF PARSE$(MyArray (iSub), column_no) <> 68 THEN 
                       ARRAY DELETE MyArray (iSub) 
                       DECR ub 
                ELSE 
                       INCR iSUB
                END IF
              LOOP 
              IF Lb = 0 THEN 
                   REDIM PRESERVE myArray (ub)
              ELSE
                   REDIM PRESERVE myArray (lb TO UB) 
              END IF
              MCM
              Michael Mattias
              Tal Systems (retired)
              Port Washington WI USA
              [email protected]
              http://www.talsystems.com

              Comment


              • #8
                Originally posted by Michael Mattias View Post
                Personally I doubt I would have even considered working with an array of CSV-strings if I needed to delete rows, except perhaps to use a TWO-dimensional array subscripted on row and column.
                I am using the single dimension array so I can reuse your answer at http://www.powerbasic.com/support/pb...ad.php?t=37839

                Sorting is the last thing done before writing to disk. The filtering of rows (the 68 thing) is done early on.
                Erich Schulman (KT4VOL/KTN4CA)
                Go Big Orange

                Comment


                • #9
                  >you made the job more complicated than it needed to be...

                  Complicated can be in the eyes of the beholder, but depending on whether you're removing a few or a lot of records, the code I posted may perform a couple orders of magnitude faster.

                  Comment


                  • #10
                    > so I can reuse your answer at .....

                    That answer was based on your question as stated: READING a CSV-file in an order dependent on the values of multiple columns.

                    You don't talk about DELETING at all.

                    See my 'updated' post at #7 here... that's the kind of stuff which happens when you try to use previously-developed code for another purpose.

                    Here of course, to delete, I would just use the sort key: by adding the 'target' columns to the key, you could get 'em all together and not have to traverse the array twice.

                    That, or maybe just not include in the sort key any rows which would be deleted based on selected columns' values. Why bother sorting 'em if they will be gone anyway?

                    MCM
                    Michael Mattias
                    Tal Systems (retired)
                    Port Washington WI USA
                    [email protected]
                    http://www.talsystems.com

                    Comment


                    • #11
                      This project is unrelated to the one from last year, even though they do a few similar things.

                      I read in the CSV file into the array and discard the lines I don't need. I use the shortened array as though it were the original. Then, with a parse-replace routine, I perform calculations and other transformations on each line as needed. The sort comes last, before the new CSV is written.
                      Erich Schulman (KT4VOL/KTN4CA)
                      Go Big Orange

                      Comment


                      • #12
                        Here's a more complete example:
                        Code:
                        #COMPILE EXE
                        #DIM ALL
                        %ARRAYSIZE = 500000
                        $MATCHSTRING = "68"
                        
                        FUNCTION PBMAIN () AS LONG
                            LOCAL ii, writeElement AS LONG
                            DIM arrCsv(%ARRAYSIZE) AS STRING
                        
                            FOR ii = 0 TO %ARRAYSIZE     'creates data array where ~1/2 the records have 12th field = 68
                               arrCsv(ii) = USING$("#",ii) & ",10001,10002,10000,10001,10002,10000,10001,10002,10000,10001," & _
                               USING$("#", RND(68, 69)) & ",10003,10004,10005"
                            NEXT
                            ? "Created test data, ready to go..."
                        
                            'now quickly dump any records that have 12th field <> 68 and compress the empty space...
                            FOR ii = 0 TO %ARRAYSIZE
                               IF PARSE$(arrCsv(ii), ",", 12) = $MATCHSTRING THEN
                                  arrCsv(writeElement) = arrCsv(ii)
                                  INCR writeElement
                               END IF
                            NEXT
                        
                            REDIM PRESERVE arrCsv(writeElement - 1)
                            ? "Done." & STR$(UBOUND(arrCsv())) & " final array elements"
                            
                            'then do calculations
                            'finally do the sort and save.
                        
                        END FUNCTION

                        Comment


                        • #13
                          Code:
                          'John Gleason code put into a SUB for reuse and timing tests
                          '766 ticks on this machine
                           
                          #COMPILE EXE  'ParseCSVDump.Bas
                          #DIM ALL
                          %TimeIt = 1
                          DECLARE FUNCTION GetTickCount LIB "KERNEL32.DLL" ALIAS "GetTickCount" () AS DWORD
                          FUNCTION PBMAIN () AS LONG
                              LOCAL ii AS LONG
                              LOCAL lArraySize AS LONG
                              LOCAL sMatchString AS STRING
                           
                              lArraySize = 500000
                              sMatchString = "68"
                              DIM arrCsv(lARRAYSIZE) AS STRING
                              FOR ii = 0 TO lARRAYSIZE     'creates data array where ~1/2 the records have 12th field = 68
                                 arrCsv(ii) = USING$("#",ii) & ",10001,10002,10000,10001,10002,10000,10001,10002,10000,10001," & _
                                 USING$("#", RND(68, 69)) & ",10003,10004,10005"
                              NEXT
                           
                              ParseDump arrCSV(),12,sMATCHSTRING
                           
                              ? "Elements after parse" + STR$(UBOUND(arrCSV))'249842 elements in 765 ticks
                              SLEEP 3000  'for PBCC users
                              'then do calculations
                              'finally do the sort and save.
                           END FUNCTION
                           
                          SUB ParseDump(arrCSV() AS STRING, FieldNumber AS LONG, sMatchString AS STRING)
                             'Code by John Gleason placed into a SUB with optional timing test
                           
                             LOCAL ii              AS LONG
                             LOCAL WriteElement    AS LONG
                           
                             #IF %TimeIt
                               'LOCAL CycleCount      AS QUAD  'TIX
                               LOCAL StartTickCount  AS DWORD
                               LOCAL EndTickCount    AS DWORD
                               StartTickCount = GetTickCount
                             #ENDIF
                           
                              'now quickly dump any records that have 12th field <> sMatchString and compress the empty space...
                              FOR ii = 0 TO UBOUND (arrCSV)
                                 IF PARSE$(arrCsv(ii), ",", FieldNumber) = sMATCHSTRING THEN
                                    arrCsv(writeElement) = arrCsv(ii)
                                    INCR writeElement
                                 END IF
                              NEXT
                              REDIM PRESERVE arrCsv(writeElement - 1)
                           
                              #IF %TimeIt
                                EndTickCount = GetTickCount
                                'TIX END CycleCount&&  'TIX method
                                '249842 elements in 766 ticks
                                 ? "Elements" + STR$(UBOUND(arrCSV)) +  "  ticks" + STR$(EndTickCount-StartTickCount)
                              #ENDIF
                           
                          END SUB
                          The world is full of apathy, but who cares?

                          Comment

                          Working...
                          X