Faster way to fill structure?

  • Faster way to fill structure?

    'Opening a file and reading into a string is very quick, but then placing
    'the values into a structure in the example below takes 16 seconds
    'to process 8000 records.
    '
    'If the file is opened for random access and each record is just read
    'it takes under 1-second. Is LSET CLIENT slow because of the MID$ ?


    Code:
    TYPE ClientType
      Everything AS STRING * 203
    END TYPE
      FUNCTION PBMAIN AS LONG
        LOCAL CLIENT AS ClientType
        LOCAL RecordNumber AS LONG
        LOCAL TotalRecords AS LONG
        LOCAL sData          AS STRING
        OPEN "\grid\file101.dat" FOR BINARY AS #1
        TotalRecords = LOF(1)\LEN(CLIENT)
        GET$ #1, LOF(1),sData
        CLOSE #1
        FOR RecordNumber = 1 TO TotalRecords         'This takes 16 seconds for 8000 records
          LSET CLIENT = MID$(sData,(RecordNumber-1) *LEN(CLIENT)+1)
        NEXT
     
        OPEN "\grid\file101.dat" FOR RANDOM AS #1  LEN  = LEN(CLIENT) 'under a second for 8000 records
        TotalRecords = LOF(1)\LEN(CLIENT)
        FOR RecordNumber = 1 TO TotalRecords
           GET #1,RecordNumber,CLIENT
        NEXT
       END FUNCTION
    Last edited by Mike Doty; 21 Jan 2008, 03:14 PM.
    The world is full of apathy, but who cares?

  • #2
    >Is LSET CLIENT slow because of the MID$ ?

    Yes.

    Except it is not "slow"; it is "slower than reading from disk."

    Imagine that... reading the disk is faster than doing it in memory... in this application.

    Maybe whoever it is who is always saying "optimization is always application-specific" is not full of crap after all.

    MCM
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    [email protected]
    http://www.talsystems.com
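
    A possible explanation for the MID$ cost: when MID$ is called without a length argument it returns everything from the start position to the end of the string, so each pass through the loop in the first post copies a large tail of sData before LSET keeps only the first 203 bytes. The sketch below is a minimal, untimed illustration that passes the record length as the third argument. The 8000 blank records built with SPACE$ are a stand-in for the file contents and are an assumption, not part of the original code.

    ```powerbasic
    #COMPILE EXE
    #DIM ALL

    TYPE ClientType
      Everything AS STRING * 203
    END TYPE

    FUNCTION PBMAIN () AS LONG
      LOCAL CLIENT       AS ClientType
      LOCAL RecordNumber AS LONG
      LOCAL TotalRecords AS LONG
      LOCAL sData        AS STRING
      LOCAL t            AS SINGLE

      ' Stand-in for the file contents: 8000 blank records (assumption for this demo)
      TotalRecords = 8000
      sData = SPACE$(TotalRecords * LEN(CLIENT))

      t = TIMER
      FOR RecordNumber = 1 TO TotalRecords
        ' The third argument limits each copy to a single record
        LSET CLIENT = MID$(sData, (RecordNumber - 1) * LEN(CLIENT) + 1, LEN(CLIENT))
      NEXT
      MSGBOX FORMAT$(TIMER - t), , "MID$ with explicit length"
    END FUNCTION
    ```

    Whether this closes the gap with the RANDOM-access loop would need to be timed on the real file.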



    • #3
      Interesting, I thought it would be faster having the file in memory and
      just looping through the string. Maybe using STRPTR or something might
      get them a little bit closer. After seeing this, I'm no longer going to
      keep the file in memory and use a TYPE. This is a good example of
      what not to do.
      The world is full of apathy, but who cares?



      • #4
        Mike,

        You can still keep the file in memory and loop through it. The following uses a pointer to access each record:

        Code:
        #COMPILE EXE
        #DIM ALL
        
        TYPE ClientInformation
            Everything  AS STRING * 203
        END TYPE
        
        FUNCTION PBMAIN () AS LONG
        
            LOCAL x     AS LONG
            LOCAL CI    AS ClientInformation
            LOCAL CIptr AS ClientInformation PTR
            LOCAL s, f  AS SINGLE
            LOCAL fdata AS STRING
            
            OPEN "C:\FILE101.DAT" FOR RANDOM AS #1 LEN = 203
            FOR x = 1 TO 8000
                PUT #1, x, CI
            NEXT
            CLOSE #1
        
            s = TIMER
            OPEN "C:\FILE101.DAT" FOR BINARY ACCESS READ AS #1 LEN = 16384
            GET$ #1, LOF(1), fData
            CLOSE #1
            
            CIPtr = STRPTR(fData)
            FOR x = 0 TO 7999
                TYPE SET CI = @CIPtr[x]
            NEXT
            
            f = TIMER
            
            MSGBOX FORMAT$(f-s),, "In Memory using Pointer"
            
        END FUNCTION
        Adam Drake
        PowerBASIC



        • #5
          Memory Mapped Files instead of RANDOM disk file access 5-8-04
          Michael Mattias
          Tal Systems (retired)
          Port Washington WI USA
          [email protected]
          http://www.talsystems.com
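
          For readers without access to the linked post, here is a rough sketch of what a memory-mapped read of the same file might look like. It is not the code from that post. It assumes the Win32 declarations and equates (CreateFile, CreateFileMapping, MapViewOfFile, %PAGE_READONLY and so on) come from WIN32API.INC as shipped with the compiler, assumes the file fits in one view, and reuses the C:\FILE101.DAT path and 203-byte record from the earlier posts.

          ```powerbasic
          #COMPILE EXE
          #DIM ALL
          #INCLUDE "WIN32API.INC"

          TYPE ClientType
            Everything AS STRING * 203
          END TYPE

          FUNCTION PBMAIN () AS LONG
            LOCAL hFile, hMap, pView, nBytes AS DWORD
            LOCAL Client       AS ClientType
            LOCAL TotalRecords AS LONG

            ' Open the existing data file read-only
            hFile = CreateFile("C:\FILE101.DAT", %GENERIC_READ, %FILE_SHARE_READ, _
                               BYVAL %NULL, %OPEN_EXISTING, %FILE_ATTRIBUTE_NORMAL, BYVAL %NULL)
            IF hFile = %INVALID_HANDLE_VALUE THEN EXIT FUNCTION

            nBytes = GetFileSize(hFile, BYVAL %NULL)
            TotalRecords = nBytes \ LEN(Client)

            ' Map the whole file; no copy into a string is made
            hMap = CreateFileMapping(hFile, BYVAL %NULL, %PAGE_READONLY, 0, 0, BYVAL %NULL)
            IF hMap THEN
              pView = MapViewOfFile(hMap, %FILE_MAP_READ, 0, 0, 0)
              IF pView THEN
                IF TotalRecords > 0 THEN
                  ' Overlay an absolute array on the mapped bytes
                  DIM Recs(1 TO TotalRecords) AS ClientType AT pView
                  MSGBOX LEFT$(Recs(TotalRecords).Everything, 20), , "Last record, first 20 bytes"
                END IF
                UnmapViewOfFile BYVAL pView
              END IF
              CloseHandle hMap
            END IF
            CloseHandle hFile
          END FUNCTION
          ```

          The array is only valid while the view is mapped, so it must not be touched after UnmapViewOfFile.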



          • #6
            I wonder if something like the following would work. Since the string is already in memory, try overlaying it with an absolute array:

            Code:
               Dim MyArray( TotalRecords ) As ClientType At StrPtr(sData)
            Paul Squires
            FireFly Visual Designer (for PowerBASIC Windows 10+)
            Version 3 now available.
            http://www.planetsquires.com



            • #7
              That will work great provided the array size stays the same...
              Adam Drake
              PowerBASIC



              • #8
                'Yes, that was it. Extremely fast, now!
                Code:
                DIM MyArray(1 TO TotalRecords) AS ClientType AT STRPTR(sData)
                MSGBOX  MyArray(TotalRecords).lastname
                The world is full of apathy, but who cares?



                • #9
                  So if I understand this right:
                  If a very long string sData is created initially then REDIMing will be very fast as there are no memory allocation steps.

                  Code:
                  sData = STRING$(1000000, 0) ' 1MB of zero bytes (STRING$ needs a fill character)
                  
                  TotalRecords = 20
                  
                   DIM MyArray( TotalRecords ) As ClientType At StrPtr(sData)
                  
                  NewRec = 10
                   REDIM MyArray( TotalRecords+NewRec ) As ClientType At StrPtr(sData)
                  Where ClientType is a UDT of say 1000 bytes so MyArray will use 30k max

                  Do I have this right?



                  • #10
                    I would also like to know.
                    If the absolute array is an overlay, why couldn't the array just be 1-element and the pointer incremented? For that matter, why even use an array? See page 227 of the manual, under DIM.
                    The world is full of apathy, but who cares?



                    • #11
                      >If the absolute array is an overlay why couldn't the array just be 1-element and the pointer incremented? For that matter, why even use an array?

                      You wouldn't.

                      You'd either use an incremented (or offset) pointer to get elements, or you'd use a PB array.
                      Michael Mattias
                      Tal Systems (retired)
                      Port Washington WI USA
                      [email protected]
                      http://www.talsystems.com
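
                      To make the two choices concrete, here is a small sketch that reads the same three 1-byte records three ways: stepping a pointer, indexing a fixed pointer, and overlaying an absolute array. The ClientType with a single member named me and the "ABC" buffer follow the example style used elsewhere in this thread; the variable names are illustrative only.

                      ```powerbasic
                      #COMPILE EXE
                      #DIM ALL

                      TYPE ClientType
                        me AS STRING * 1
                      END TYPE

                      FUNCTION PBMAIN () AS LONG
                        LOCAL sData AS STRING
                        LOCAL rec   AS ClientType
                        LOCAL p     AS ClientType PTR
                        LOCAL x     AS LONG
                        LOCAL viaStep, viaIndex, viaArray AS STRING

                        sData = "ABC"                       ' three 1-byte records

                        ' 1) Incremented pointer: advance by one record size each pass
                        p = STRPTR(sData)
                        FOR x = 1 TO 3
                          viaStep = viaStep & @p.me
                          p = p + SIZEOF(rec)
                        NEXT

                        ' 2) Offset pointer: leave the pointer alone and index from it
                        p = STRPTR(sData)
                        FOR x = 0 TO 2
                          viaIndex = viaIndex & @p[x].me
                        NEXT

                        ' 3) PB array: absolute array laid over the same bytes
                        DIM Recs(1 TO 3) AS ClientType AT STRPTR(sData)
                        FOR x = 1 TO 3
                          viaArray = viaArray & Recs(x).me
                        NEXT

                        ' All three lines should show the same text
                        MSGBOX viaStep & $CRLF & viaIndex & $CRLF & viaArray
                      END FUNCTION
                      ```

                      None of the three allocates a second copy of the data; they differ only in how the address of each record is computed.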



                      • #12
                        >Where ClientType is a UDT of say 1000 bytes so MyArray will use 30k max
                        >
                        >Do I have this right?

                        You do! Furthermore, it makes no difference how big the UDT is (as long as it fits the allocated string). A 22K UDT DIMS just as fast as the 1K UDT.



                        • #13
                          'Would something like this save memory?
                          Code:
                           
                          TYPE ClientType
                            me AS STRING * 1
                          END TYPE
                          FUNCTION PBMAIN () AS LONG
                            DIM sData AS STRING
                            sData = "ABC"
                            DIM x AS LONG
                            FOR x = 0 TO 2
                              REDIM MyArray(0) AS clientType AT STRPTR(sData) + x * LEN(clientType)
                              PRINT MyArray(0).me
                            NEXT
                            WAITKEY$
                          END FUNCTION
                          The world is full of apathy, but who cares?



                          • #14
                            Mike,
                            The memory is saved because you are dimensioning the array on top of memory that has already been allocated for the string.

                            How the Compiler keeps track of all this is amazing to me.

                            You could do this with pointers

                            LOCAL pMyArray AS ClientType PTR

                            Then

                            pMyArray = STRPTR(sData) ' First UDT array element (index 0)


                            pMyArray = STRPTR(sData) + SIZEOF(ClientType)*9 ' element at zero-based index 9

                            Then

                            PRINT @pMyArray.me

                            just forget the array altogether.
                            Last edited by Mike Trader; 23 Jan 2008, 05:28 PM.



                            • #15
                              John,

                              That raises another question then, if the string is being updated in one thread and a second thread is REDIMing an array on top of it is there a risk of a GPF?

                              I assume not because context switching will take care of the serializing of the operations as per this thread:
                              User to user discussions about the PB/Win (formerly PB/DLL) product line. Discussion topics include PowerBASIC Forms, PowerGEN and PowerTree for Windows.



                              • #16
                                >REDIMing an array on top of it is there a risk of a GPF?

                                Like you, I would think not since the two operations seem unrelated, but some testing to confirm it probably wouldn't hurt.



                                • #17
                                  >That raises another question then, if the string is being updated in one thread and a second thread is REDIMing an array on top of it is there a risk of a GPF?

                                  A GPF is the best-case scenario. Well, I for one would prefer a GPF to silent no-warning data corruption.

                                  FWIW: 'REDIM AT' allocates NO memory at all. Ever.

                                  MCM
                                  Michael Mattias
                                  Tal Systems (retired)
                                  Port Washington WI USA
                                  [email protected]
                                  http://www.talsystems.com



                                  • #18
                                    If you have to make modifications to the size of the array, you should not use an array that was dimensioned using 'AT'...

                                    Try something like this to read it fast, and then do what you want to the array:

                                    Code:
                                    #COMPILE EXE
                                    #DIM ALL
                                                            
                                    TYPE ClientInformation
                                        Everything  AS STRING * 203
                                    END TYPE
                                    
                                    FUNCTION PBMAIN () AS LONG
                                    
                                        LOCAL x     AS LONG
                                        LOCAL CIptr AS ClientInformation PTR
                                        LOCAL s, f  AS SINGLE
                                        LOCAL fdata AS STRING
                                        LOCAL CITmp AS ClientInformation
                                    
                                        DIM CI(1 TO 8000) AS ClientInformation
                                    
                                        OPEN "C:\FILE101.DAT" FOR RANDOM AS #1 LEN = 203
                                        FOR x = 1 TO 8000
                                            PUT #1, x, CITmp
                                        NEXT
                                        CLOSE #1
                                    
                                        s = TIMER
                                        OPEN "C:\FILE101.DAT" FOR BINARY ACCESS READ AS #1 LEN = 16384
                                        GET$ #1, LOF(1), fData
                                        CLOSE #1
                                    
                                        CIPtr = STRPTR(fData)
                                        FOR x = 0 TO 7999
                                            TYPE SET CI(x+1) = @CIPtr[x]
                                        NEXT
                                    
                                        f = TIMER
                                    
                                        MSGBOX FORMAT$(f-s),, "In Memory using Pointer"
                                    
                                    END FUNCTION
                                    Adam Drake
                                    PowerBASIC



                                    • #19
                                      FWIW, when you alter a string in any way, the address of the data may change, although the address of the string handle does not. In addition, the precise corruption or protection error you get will vary depending on WHAT data type you are REDIM'ing "AS", i.e. you might get lucky.

                                      The bottom line is, don't do this. (Subjective, biased opinion).
                                      Michael Mattias
                                      Tal Systems (retired)
                                      Port Washington WI USA
                                      [email protected]
                                      http://www.talsystems.com
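
                                      One way to check this on a given compiler version is to record STRPTR before and after different kinds of change. The sketch below is only a probe, not a guarantee of runtime behavior: it compares an in-place MID$ statement with an assignment that changes the string length. The variable names and sample text are made up for the test.

                                      ```powerbasic
                                      #COMPILE EXE
                                      #DIM ALL

                                      FUNCTION PBMAIN () AS LONG
                                        LOCAL s AS STRING
                                        LOCAL pBefore, pInPlace, pAfter AS DWORD

                                        s = "john      gleason   "
                                        pBefore = STRPTR(s)

                                        ' In-place change: same length, written with the MID$ statement
                                        MID$(s, 1) = "adam      "
                                        pInPlace = STRPTR(s)

                                        ' Length-changing assignment: the runtime may allocate a new data block
                                        s = s & SPACE$(100000)
                                        pAfter = STRPTR(s)

                                        MSGBOX "Before:   " & HEX$(pBefore, 8) & $CRLF & _
                                               "In place: " & HEX$(pInPlace, 8) & $CRLF & _
                                               "After:    " & HEX$(pAfter, 8), , "STRPTR values"
                                      END FUNCTION
                                      ```

                                      If the "After" address differs from "Before", any array previously DIMmed AT the old address is left pointing at memory the string no longer owns.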



                                      • #20
                                        Adam wrote:
                                        >If you have to make modifications to the size of the array, you should not use an array that was dimensioned using 'AT'...

                                        Mike Trader,
                                        I believe now Adam is correct and I was wrong earlier saying you can REDIM AT an array you already DIM'ed or REDIM'ed AT earlier. It doesn't work as I originally expected and you can test the failure in the code below. However, the fastest "workaround" is to simply create a second array name AT the same STRPTR address as shown. Still only one copy in memory of the data is needed.

                                        Michael wrote:
                                        >when you alter a string in any way, the address of the data may change

                                        I can only hope that's not true, because I have a ton of code predicated on the idea that the string stays put until I erase or re-create it. Again see the example below.
                                        Code:
                                        #COMPILE EXE
                                        #DIM ALL
                                        
                                        FUNCTION PBMAIN () AS LONG
                                            LOCAL ii AS LONG
                                            LOCAL strX, theString AS STRING
                                            
                                            theString = "john      gleason   programmermichael   mattias   adviser   adam      drake     analyst   "
                                            REDIM nameNjobArr(8) AS STRING * 10 AT STRPTR(theString)
                                            FOR ii = 0 TO 8
                                               strX = strX & nameNjobArr(ii) & $CRLF
                                            NEXT
                                            ? strX
                                            
                                            strX = ""
                                                                 'changing theString is okay as long as you do it "in place". eg. MID$=, @pointer, asm.
                                                                 'The memory block accessed is therefore exactly the same, but just has different contents.
                                            MID$(theString, 1) = "john      gleason   coder     michael   mattias   designer  adam      drake     lawmaker  "
                                            FOR ii = 0 TO 8
                                               strX = strX & nameNjobArr(ii) & $CRLF
                                            NEXT
                                            ? strX
                                        
                                            strX = ""
                                        '    REDIM nameNjobArr(2) AS STRING * 30 AT STRPTR(theString) '<< REDIM'ing same array name does not work properly
                                            REDIM nameWithJobArr(2) AS STRING * 30 AT STRPTR(theString)  'DIM or REDIM'ing new array name AT works properly
                                            FOR ii = 0 TO 2
                                               strX = strX & nameWithJobArr(ii) & $CRLF
                                            NEXT
                                            ? strX
                                        
                                        END FUNCTION

