Is String Append possible

  • Is String Append possible

    I need to concatenate two strings. One is very large (20 to 30 MB);
    the other is smaller (around 1.5 MB).

    While Files

    FOR i = 1 to 28000 ' or so
    TempStr =TempStr + NewLine ' where NewLine is about 80 chars
    NEXT

    LargeStr = LargeStr + TempStr ' add latest file to existing data

    Loop

    Each time I add NewLine to TempStr, the entire contents
    of TempStr are copied in memory before NewLine is added. This
    is very time-consuming and bogs down my program badly.

    Then when I add TempStr to LargeStr there is a long wait also.

    Is there a way to append to a string WITHOUT copying its contents
    each time, like appending to a file on disk?

    ------------------
    Kind Regards
    Mike

  • #2
    To append to a file on disk, open the file in APPEND mode.

    Code:
    DIM FF AS LONG
    FF = FREEFILE
    
    OPEN "c:\my file.dat" FOR APPEND ACCESS READ WRITE AS #FF
    
    'Then, do your appending loop
    FOR L& = 1 TO 50000
        'String handling code here (to build, etc., the string)
        PRINT #FF, TempStr
    NEXT
    CLOSE #FF 'flush and release the file when the loop is done
    This will append the new strings to the end of the opened file,
    in the order that they are PRINTed.

    I don't know how your strings are arranged, so you may still have
    to have the preliminary block statements to build each TempStr
    before it is appended to the file. Unfortunately, I'm not
    familiar enough with string handling to suggest a good way to do this.
    Maybe using pointers would make it faster?

    Regards,


    ------------------
    Clay C. Clear

    mailto:[email protected]

    Clay Clear's Software

    [This message has been edited by Clay Clear (edited August 31, 2001).]


    • #3
      You might also consider using a fixed length or ASCIIZ string if
      you have a maximum buffer size. You can then copy the contents
      of the temp string directly to the buffer space of the main string.

      Search the Forums for memcopy and I think you will find useful code.
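
      As a rough illustration of the fixed-buffer idea, here is a minimal
      sketch that uses the MID$ statement rather than a memcopy routine
      (a swapped-in technique, not the code the search would turn up).
      MID$ used as a statement overwrites characters in place, so the
      buffer is never reallocated. BufAppend, sLine and the buffer size
      are hypothetical names and values chosen for the example.

      ```powerbasic
      #COMPILE EXE
      #DIM ALL

      ' Hypothetical helper: copy AddOn into Buffer at byte offset CurPos
      ' (0-based) and advance CurPos. Returns 0 and copies nothing if the
      ' data would not fit in the pre-allocated buffer.
      FUNCTION BufAppend(BYREF CurPos AS LONG, BYREF Buffer AS STRING, _
                         BYVAL AddOn AS STRING) AS LONG
          IF CurPos + LEN(AddOn) > LEN(Buffer) THEN EXIT FUNCTION
          ' MID$ as a statement overwrites in place; LEN(Buffer) is unchanged
          MID$(Buffer, CurPos + 1) = AddOn
          CurPos = CurPos + LEN(AddOn)
          FUNCTION = -1
      END FUNCTION

      FUNCTION PBMAIN () AS LONG
          LOCAL Buffer AS STRING, sLine AS STRING
          LOCAL CurPos AS LONG, i AS LONG

          Buffer = SPACE$(28000 * 80)      ' allocate once, up front
          sLine = STRING$(80, "x")         ' dummy 80-character line

          FOR i = 1 TO 28000
              IF ISFALSE BufAppend(CurPos, Buffer, sLine) THEN EXIT FOR
          NEXT

          MSGBOX "Valid bytes:" + STR$(CurPos)
      END FUNCTION
      ```

      The valid data is then LEFT$(Buffer, CurPos); everything after it
      is unused padding.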



      ------------------
      Bernard Ertl
      InterPlan Systems


      • #4
        Hello,

        This example takes 2.5 seconds on my K6II-500 with 256MB RAM.

        I have not debugged it to confirm accuracy, as I'm out of time at the
        moment, but it should serve as a good example. I'll double-check it later.

        Note that if your hard drive runs constantly while working with this
        much data, then nothing you can do will ever make this lightning
        quick. But cutting down on memory moves will make the largest
        difference in this case.

        Code:
        'Copy large data with pointers example
        'Questions? Ask here or email Colin Schmidt
        'at [email protected]
        
        #COMPILE EXE
        #REGISTER NONE
        
        FUNCTION PBMAIN
            LOCAL llCount1 AS LONG
            LOCAL lnTimer AS SINGLE
            
            LOCAL lsBig AS STRING
            LOCAL lsLine AS STRING
            LOCAL lbpBig AS BYTE PTR
            LOCAL lbpLine AS BYTE PTR
            LOCAL llBigPos AS LONG
            LOCAL llLinePos AS LONG
            LOCAL llBigSize AS LONG
            LOCAL llLineSize AS LONG
            
            lnTimer = TIMER
        
            'Make dummy data
            lsBig = SPACE$(31457280) '30MB
        
            'Start of copy stuff...
        
            'Save location of first byte after end of lsBig
            'Note that count starts at zero with pointers
            llBigPos = LEN(lsBig)
        
            'Append buffer space to lsBig
            lsBig = lsBig + SPACE$(LEN(lsBig) * .2)
            llBigSize = LEN(lsBig)
            
            'Get pointer to start of lsBig
            lbpBig = STRPTR(lsBig)
            
            FOR llCount1 = 1 TO 28000 'x 80 = ~2.1MB
                
                'Create dummy data
                lsLine = SPACE$(80)
                llLineSize = LEN(lsLine)
                
                'Reset pointers
                lbpBig = STRPTR(lsBig)
                lbpLine = STRPTR(lsLine)
                
                'Check to see if more buffer space is needed in lsBig
                IF llLineSize + llBigPos > llBigSize THEN
                    lsBig = lsBig + SPACE$(llBigSize * .2)
                    llBigSize = LEN(lsBig)
                    'Grab our pointer again - it may have moved
                    lbpBig = STRPTR(lsBig)
                END IF
                
                'Move lbpBig to start of paste space
                lbpBig = lbpBig + llBigPos
                
                'Copy
                FOR llLinePos = 0 TO llLineSize - 1
                    @lbpBig[llLinePos] = @lbpLine[llLinePos]
                NEXT llLinePos
                llBigPos = llBigPos + llLineSize
                
            NEXT llCount1
            
            lsBig = LEFT$(lsBig, llBigPos)
        
            MSGBOX "Done - " + FORMAT$(TIMER - lnTimer, "##.##")
        
        END FUNCTION
        ------------------
        Colin Schmidt & James Duffy, Praxis Enterprises, Canada
        [email protected]



        • #5
          If you only want to merge files, without touching the data, there are
          better ways to do that....

          Code:
                                                                      
           Global LargeStr$  'This needs to be global to hold more than 1 meg of data
           Local CurPos&,NewLine$
          
           LargeStr$ = String$(30000000,0) '30 meg dynamic string of nulls: String$(count, code)
           CurPos& = 0
           Do While Files   'pseudocode: loop while there are files left to process
            FOR i = 1 to 28000 ' or so
             String_Append CurPos&,LargeStr$,NewLine$ ' where NewLine is about 80 chars
            Next
           Loop
           If CurPos& > 0 then
            PutDataToDisk Left$(LargeStr$,CurPos&) 'save only the valid part of the buffer
            CurPos& = 0
           End if
             
          ============================================================  
          Sub String_Append(ByRef CurPos&,ByRef Buffer$,ByVal AddOn$)
          'Code and idea: Steve Hutchesson
          'No bounds check: caller must ensure CurPos& + Len(AddOn$) <= Len(Buffer$)
          #Register None 
          Local pBuff&,pAddOn&,lenAddOn&,Cp&    
            Cp& = CurPos&
            pBuff&    = StrPtr(Buffer$)
            pAddOn&   = StrPtr(AddOn$)
            lenAddOn& = Len(AddOn$)
              ! cld                 ' read forwards
              ! mov edi, pBuff&     ' put buffer address in edi
              ! add edi, Cp&        ' add starting offset to it
              ! mov esi, pAddOn&    ' put string address in esi
              ! mov ecx, lenAddOn&  ' length in ecx as counter
              ! rep movsb           ' copy ecx count of bytes from esi to edi
              ! mov edx, Cp&        '
              ! add edx, lenAddOn&  ' add CurPos and lenAddOn 
              ! mov Cp&, edx        ' put new value in CurPos
              CurPos& = Cp&
          End Sub
          ------------------
          Fred
          mailto:[email protected]
          http://www.oxenby.se

          [This message has been edited by Fred Oxenby (edited August 31, 2001).]


          • #6
            Colin thx for that - I learned more about pointers!

            Fred that is exactly what i wanted. fantastic.
            you guys are so clever.

            BTW, do I need CurPos?

            Can I not just do this:

            OPEN DestFile FOR BINARY Access WRITE LOCK Shared AS #200
            PUT$ #200, LargeStr ' write the new file to HD
            CLOSE #200

            ------------------
            Kind Regards
            Mike

            [This message has been edited by Mike Trader (edited August 31, 2001).]



            • #7
              CurPos& tells you the next insertion point in the string.
              You only want to save the actual data;
              everything after CurPos& is invalid.
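
              To make that concrete, a minimal sketch of writing only
              the valid part of the buffer, based on the OPEN/PUT$ lines
              in post #6. SaveValidData is a hypothetical helper name;
              the SETEOF call is an addition that assumes the destination
              file may already exist and be longer than the new data.

              ```powerbasic
              ' Hypothetical helper: write the first CurPos bytes of Buffer
              ' to DestFile and ignore the unused tail of the buffer.
              SUB SaveValidData(BYVAL DestFile AS STRING, BYREF Buffer AS STRING, _
                                BYVAL CurPos AS LONG)
                  LOCAL ff AS LONG
                  ff = FREEFILE
                  OPEN DestFile FOR BINARY ACCESS WRITE LOCK SHARED AS #ff
                  PUT$ #ff, LEFT$(Buffer, CurPos)  ' valid data only
                  SETEOF #ff                       ' drop any older, longer contents
                  CLOSE #ff
              END SUB
              ```

              Writing the whole of LargeStr instead would also write the
              unused padding after CurPos& into the file.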


              ------------------
              Fred
              mailto:[email protected]
              http://www.oxenby.se


              • #8
                Fred,

                This is lightning fast. Very cool.

                I have a small problem: I have to do other things with LargeStr
                in between adding lines to it. It seems that:
                CurPos& <> Len(LargeStr)

                This causes me problems in the rest of the program.
                Is there any way to get them equal?

                Perhaps adding a null string somehow to CurPos&+1.

                I don't fully understand how you are doing what you are doing, so I'm
                not sure how to modify it so that all the normal PB functions
                will work as expected on LargeStr.

                Can this sub be modified to produce a string that will pass the
                look and smell test?

                ------------------
                Kind Regards
                Mike



                • #9

                  >I have a small problem: I have to do other things with LargeStr
                  >in between adding lines to it. It seems that: CurPos& <> Len(LargeStr)

                  Yes: Len(LargeStr$) = 30000000, but CurPos& = length of valid data.
                  And no: LargeStr$ is only a pre-allocated area of memory. It is not
                  meant to be manipulated with string functions or used as a replacement
                  for dynamic strings. It is solely for appending.
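
                  If ordinary string functions are needed now and then,
                  one option is to run them on a trimmed copy, much as
                  post #4 does with LEFT$ at the end. A minimal sketch
                  follows; the 1000-byte buffer, the sample text and the
                  ValidData name are stand-ins invented for the example.
                  Note that LEFT$ copies the valid data, so this is not
                  something to do on every append.

                  ```powerbasic
                  #COMPILE EXE
                  #DIM ALL

                  FUNCTION PBMAIN () AS LONG
                      LOCAL LargeStr AS STRING, ValidData AS STRING
                      LOCAL CurPos AS LONG

                      LargeStr = STRING$(1000, 0)        ' small stand-in for the 30 MB buffer
                      MID$(LargeStr, 1) = "hello world"  ' stand-in for String_Append calls
                      CurPos = 11                        ' length of valid data so far

                      ' Trimmed copy: LEN(ValidData) = CurPos, so INSTR, LEN, etc.
                      ' behave as expected. LEN(LargeStr) is still 1000.
                      ValidData = LEFT$(LargeStr, CurPos)

                      MSGBOX "Length:" + STR$(LEN(ValidData)) + _
                             "  Position of 'world':" + STR$(INSTR(ValidData, "world"))
                  END FUNCTION
                  ```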

                  Code:
                  If you look at it this way, perhaps things become self-explanatory:
                  
                  1) Create a large buffer of all NULs (LargeStr$ = String$(30Meg,0)).
                     This buffer only needs to be filled with NULs once.
                     The physical length of the buffer is 30 MB, but you have no data, so the actual length is zero.
                     
                  2) Create a variable to hold the actual data length and set it to zero. (CurPos& = 0)
                  3) Call String_Append and add 80 bytes of data to your buffer.
                     (String_Append CurPos&,LargeStr$,NewLine$)
                  4) The buffer will still be 30 MB, but Sub String_Append will return CurPos& = 80 (length of valid data)
                  5) Call String_Append and add another 80 bytes of data to your buffer. (String_Append CurPos&,LargeStr$,NewLine$)
                  6) The buffer will still be 30 MB, but Sub String_Append will return CurPos& = 160 (length of valid data)
                  7) When saving LargeStr$ to disk, you only want to save valid data; in this case that is 160 bytes.
                     PutDataToDisk Left$(LargeStr$,CurPos&)
                  8) Invalidate the data in the buffer. (CurPos& = 0)
                  
                  A small example of how I use String_Append to create IBM-style variable-length records
                  with a record descriptor at the front and end of every record:
                    
                  '--Create IBM-record---------------------------------
                       Rl% = 1 + Len(Rad$)
                       If Rl% > 255 Then Function = 111:Exit Function
                       String_Append CurPos&,LargeStr$,Mki$(Rl%) & Cmd$ & Rad$ & Mki$(Rl%)
                  ------------------
                  Fred
                  mailto:[email protected]
                  http://www.oxenby.se

                  [This message has been edited by Fred Oxenby (edited September 01, 2001).]


                  • #10
                    Fred,
                    What you describe above makes perfect sense.

                    In the above example you PRE-ALLOCATE all the memory for LargeStr.

                    I didn't realize that needed to be done. Now I get it.

                    The problem is I don't know how many files the user
                    is going to want to process. I guess I could estimate it based
                    on the maximum file size possible and create a string long enough.
                    OR
                    I could detect when I am getting close to the end and create a
                    second string even longer
                    OR
                    ???
                    ------------------
                    Kind Regards
                    Mike

                    [This message has been edited by Mike Trader (edited September 01, 2001).]



                    • #11
                      Without knowing what you are trying to do, it's hard to give advice.
                      But if you are only going to append files, you will know
                      their names at runtime. Then you can start by querying and adding up the sizes of these files.
                      Use their total size to preallocate the buffer.
                      You probably already have the files to process in a string array.
                      It won't take many milliseconds to get their total size.
                      ------
                      >I could detect when I am getting close to the end and create a
                      >second string even longer
                      You can expand your buffer: LargeStr$ = LargeStr$ & String$(10000000,0)
                      Len(LargeStr$) = allocated size
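
                      A minimal sketch of expanding on demand, assuming
                      the buffer is only checked just before each append.
                      EnsureRoom is a hypothetical helper, and the 50%
                      growth step is an arbitrary choice, in the same
                      spirit as the 20% step in post #4.

                      ```powerbasic
                      ' Hypothetical helper: make sure Buffer has room for Needed more
                      ' bytes at offset CurPos, growing it with NUL padding if it does not.
                      ' Assumes CurPos <= LEN(Buffer).
                      SUB EnsureRoom(BYREF Buffer AS STRING, BYVAL CurPos AS LONG, _
                                     BYVAL Needed AS LONG)
                          LOCAL Extra AS LONG
                          IF CurPos + Needed <= LEN(Buffer) THEN EXIT SUB
                          ' Grow by at least half the current size so that growth is rare
                          Extra = MAX&(Needed, LEN(Buffer) \ 2)
                          ' One full copy of the buffer per growth, not one per line
                          Buffer = Buffer + STRING$(Extra, 0)
                      END SUB

                      ' Usage before each append (String_Append is the Sub from post #5):
                      '   EnsureRoom LargeStr$, CurPos&, Len(NewLine$)
                      '   String_Append CurPos&, LargeStr$, NewLine$
                      ```

                      Growing the string may move it in memory, so any
                      pointer from StrPtr has to be fetched again
                      afterwards. String_Append already does this on
                      every call.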

                      Code:
                      'Requires #INCLUDE "WIN32API.INC" for WIN32_FIND_DATA, FindFirstFile and FindClose
                      Function FSO_GetFileSize(ByVal FileSpec$)Export As Quad
                      Local FD As WIN32_FIND_DATA,hFile As Long
                        hFile = FindFirstFile(ByVal StrPtr(FileSpec$), FD)
                        If hFile = %INVALID_HANDLE_VALUE Then Function = -1:Exit Function
                        FindClose hFile
                        Function = FD.nFileSizeHigh * &H100000000 + FD.nFileSizeLow
                      End Function

                      ------------------
                      Fred
                      mailto:[email protected]
                      http://www.oxenby.se

                      [This message has been edited by Fred Oxenby (edited September 02, 2001).]