
Suggestions on my string function


  • Suggestions on my string function


    I would like to throw open a problem I have to the group - your input would be great:

    I have been developing a stock market application that loops through approximately 120000 different scenarios, each time working out the profit and loss for each scenario.

    I have found that the more scenarios I run in total, the longer each scenario takes. This has nothing to do with the core engine, as it processes the same number of data items per scenario every time.

    I was puzzled, but then thought it could be this:

    At the end of each scenario run, I add the results to a text string like this:

    [scenario run # 0]
    output = ""

    [then for each scenario run]
    output = output + str$(profit for that scenario) + "," + str$(loss for that scenario) + "," + str$(other) + chr$(13)

    I hold "output" as a string in memory until all scenarios have finished, then write "output" as a .csv and open in excel to sort the columns.

    Could my "output = output +" be slowing the application down?

    It could make sense: does PB take a copy of the old string and add the new line to it each time? I don't know how PB manages strings. If it does, then as output grows in length, so does the time needed to create each new output string.

    That would explain why my app slows down, i.e. the number of scenarios run per minute falls as the number of scenarios tested rises.

    Suggestions please - thanks.


  • #2
    I think it's almost a certainty, Alex, that each concatenation of the string will result in PB having to request memory for a new, longer string, move the string to the new location, then relinquish the memory of the old string. That's why string manipulations of this sort give those kinds of results. It might be faster to keep writing the data to a file, then read it back when finished. Just a thought.
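
To put a rough number on that: if every append really does allocate a new string and copy everything accumulated so far, the total work grows with the square of the number of lines. Below is a minimal sketch of the arithmetic. The 120000 figure comes from the first post; the average line length of 30 bytes is an assumption, since the thread does not give the real figure.

```basic
#COMPILE EXE
#DIM ALL

FUNCTION PBMAIN() AS LONG
   LOCAL nLines  AS QUAD   ' number of scenario lines (from the first post)
   LOCAL lineLen AS QUAD   ' ASSUMED average line length in bytes
   LOCAL total   AS QUAD

   nLines  = 120000
   lineLen = 30

   ' Append number k builds a string of k * lineLen bytes, all of which
   ' must be written into the newly allocated string.
   ' Summing over k = 1..n gives lineLen * n * (n + 1) / 2.
   total = lineLen * (nLines * (nLines + 1) / 2)

   MSGBOX "Approx. bytes moved: " & FORMAT$(total, "0")
END FUNCTION
```

With those assumed figures the total is on the order of 2 x 10^11 bytes moved, against roughly 3.6 MB of actual output, which would fit the slowdown described.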


    • #3
      Could my "output = output +" be slowing the application down?
      Quite sure.

      Suggestions please - thanks.
      GLOBAL m_Buffer AS STRING                        ' // Buffer for generated code
      GLOBAL m_BufLen AS LONG                          ' // Length of used buffer
      SUB AddLine(BYREF strText AS STRING)
         IF LEN(strText) + m_BufLen > LEN(m_Buffer) THEN
            m_Buffer = m_Buffer & SPACE$(1000000)
         END IF
         MID$(m_Buffer, m_BufLen + 1, LEN(strText)) = strText
         m_BufLen = m_BufLen + LEN(StrText)
      END SUB
      Use it as follows:

      AddLine str$(profit for that scenario) + "," + str$(loss for that scenario) + "," + str$(other) + chr$(13)
      When finished adding lines...

      m_Buffer = LEFT$(m_Buffer, m_BufLen)
      And save m_Buffer.


      • #4
        Concatenating strings is slow, not just in PB but in all compilers. Create a string long enough to hold all your data, output = space$(100000). Then use the MID$ statement to insert the data :

        ' Check the length of output$ and, if need be, extend it with another space$(x).

        MID$(output, currentLengthPos) = str$(profit for that scenario) + "," + str$(loss for that scenario) + "," + str$(other) + chr$(13)
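
The length check mentioned in the comment above could look something like this. It is only a sketch: the names AppendLine, buf and used are made up for illustration, and doubling the buffer is one possible growth policy, not something the post prescribes.

```basic
' Hypothetical helper: AppendLine, buf and used are illustrative names.
SUB AppendLine(buf AS STRING, used AS LONG, BYVAL sLine AS STRING)
   ' Grow until the new line fits. Doubling (with a 64K minimum step)
   ' keeps the number of reallocations small.
   DO WHILE used + LEN(sLine) > LEN(buf)
      buf = buf & SPACE$(MAX&(LEN(buf), 65536))
   LOOP
   MID$(buf, used + 1) = sLine   ' MID$ statement overwrites in place
   used = used + LEN(sLine)      ' track how much of the buffer is real data
END SUB
```

Call it once per scenario with the same buffer and counter, then trim with LEFT$(buf, used) before saving, so the unused padding does not end up in the file.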

        Steve Rossell
        PowerBASIC Staff


        • #5
          Thanks everyone - loads of great advice, and quick as well

          It's amazing how ideas come after exercise. I went out for a run and got an idea on this issue - hence my posting.



          • #6
            You could use an array, maybe? Something like this?
            This is very fast to collect data and save to a file
               DIM sArray(0 TO 10000) AS STRING
               DIM lLineNumber AS LONG
               lLineNumber = 0
               'use a loop or whatever to inc the count
               'do your thing
               sArray(lLineNumber) = "your string "
               lLineNumber = lLineNumber + 1 'inc the count
               IF lLineNumber = 10000 THEN MSGBOX "need a bigger array" : EXIT
               'when done save the array
               OPEN "C:\" & "saved.txt" FOR OUTPUT AS #1
               PRINT #1, sArray()
               CLOSE #1
            A dozen what.


            • #7
              String Concatenation is deadly.
              Use a class like this:

              String Builder class
              An extension of what Jose proposed
              Or just write it out to disk.


              • #8

                What I would do is:

                At the top of the routine:
                Open "Scenarios.txt" for OutPut as #1 'Starts a new file
                Print #1, "File for " & Date$
                Close #1
                Open "Scenarios.txt" for Append as #1

                'other code here.

                Print #1, str$(profit for that scenario) + "," + str$(loss for that scenario) + "," + str$(other) 'Actually I would use Using$(...) for cleaner formatting
                Simpler than fooling around with a humongous string (code wise) and probably just as fast as Windows will handle the buffering at the OpSys level.
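
A minimal sketch of that write-as-you-go approach, for illustration only. The file name, the three-pass loop and the placeholder values are assumptions, and FORMAT$ is used here in place of the Using$ formatting mentioned above.

```basic
#COMPILE EXE
#DIM ALL

FUNCTION PBMAIN() AS LONG
   LOCAL hFile AS LONG
   LOCAL i     AS LONG
   LOCAL profit, loss, other AS DOUBLE

   hFile = FREEFILE                     ' ask PB for an unused file number
   OPEN "Scenarios.csv" FOR OUTPUT AS #hFile

   FOR i = 1 TO 3                       ' stand-in for the real scenario loop
      profit = i * 10.5                 ' placeholder values only
      loss   = i * 2.25
      other  = i
      ' One CSV line per scenario; PRINT # adds the line ending itself.
      PRINT #hFile, FORMAT$(profit, "0.00") & "," & _
                    FORMAT$(loss, "0.00") & "," & _
                    FORMAT$(other, "0.00")
   NEXT i

   CLOSE #hFile
END FUNCTION
```

Because each line goes straight to the file, nothing in memory grows with the number of scenarios.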

                "People demand freedom of speech
                to make up for the freedom of thought
                which they avoid."
                Soren Aabye Kierkegaard (1813-1855)
                It's a pretty day. I hope you enjoy it.




                • #9
                  Here is the code that I have been using. It is deadly fast. I'm sure that I stole it from somewhere on this forum (probably from Hutch?). Nonetheless, it makes concatenating strings extremely fast.

                  #Compile Exe
                  '//  Fast function for concatenating many strings
                  Function AppendStr2( ByVal stPos   As Long, _
                                       ByRef sBuffer As String, _
                                       ByVal Addon   As Long, _
                                       ByVal lenAdd  As Long _
                                       ) Export As Long
                      #Register None
                      Local pBuffer As Long
                      ' If the buffer is not large enough to handle the adding
                      ' of this string then we need to expand the buffer.
                      If stPos + lenAdd + 1 > Len(sBuffer) Then
                         sBuffer = sBuffer & String$( Max&(lenAdd, 100 * 1024), 0)   ' increase 100K minimum
                      End If
                      ' Copy the new string to the end of the buffer
                      pBuffer = StrPtr(sBuffer)
                      ! cld               ; Read forwards
                      ! mov edi, pBuffer  ; Put buffer address In edi
                      ! Add edi, stPos    ; Add starting offset To it
                      ! mov esi, Addon    ; Put String address In esi
                      ! mov ecx, lenAdd   ; length In ecx As counter
                      ! rep movsb         ; Copy ecx count Of bytes From esi To edi
                      ! mov edx, stPos
                      ! Add edx, lenAdd   ; Add stPos And lenAdd For Return value
                      ! mov Function, edx
                  End Function
                  Function PBMain() As Long
                     Local sBuffer As String
                     Local st      As String
                     Local x       As Long
                     Local nPos    As Long
                     Local t       As Double
                     %NUM_ITERATIONS = 20000
                     '//  FAST ASSEMBLY LANGUAGE METHOD
                     ' Concatenate a large number of strings very quickly.
                     t = Timer
                     nPos = 0
                     For x = 1 To %NUM_ITERATIONS
                        st = "123456789012345678901234567890"   '<-- 30 bytes
                        nPos = AppendStr2( nPos, sBuffer, StrPtr(st), Len(st) )
                     Next x
                     ' sBuffer will now contain the fully concatenated string
                     sBuffer = Left$( sBuffer, nPos )
                     MsgBox Format$(Timer-t, "0.000") & " seconds (AppendStr2)." & $CrLf & _
                            "Length=" & Str$(Len(sBuffer)) 
                     '//  SLOWER BASIC METHOD
                     sBuffer = ""
                     t = Timer
                     For x = 1 To %NUM_ITERATIONS
                        st = "123456789012345678901234567890"
                        sBuffer = sBuffer & st
                     Next x
                     MsgBox Format$(Timer-t, "0.000") & " seconds (BASIC method)." & $CrLf & _
                            "Length=" & Str$(Len(sBuffer)) 
                  End Function
                  Paul Squires
                  FireFly Visual Designer (for PowerBASIC Windows 10+)
                  Version 3 now available.


                  • #10
                    > extremely fast.

                    20000 iterations on my beast came in under Timer's radar so I increased it to 50000.

                    0.016s compared to 39.718s.

                    Paul, that is lethal.


                    • #11
                      Some good ideas.

                      I may use the idea suggested by Michael Mayerhoffer - if I use an array of strings for "output", I can reference individual scenarios directly.

                      How quick is it to:

                      1. REDIM Output(120000) as string

                      2. For each scenario, Output(scenario) = {{{string text}}}

                      Do I need to declare the maximum string length for each string array item in advance to speed up, eg:

                      REDIM Output(120000) as string
                      Set Each Output array item to be a maximum of 20 chars at the start before running the application
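
For what it's worth, a rough way to answer the "how quick" question is simply to time it. The sketch below assumes ordinary dynamic strings, which take whatever length is assigned to them, so no maximum is declared. The array name sLines, the file name and the placeholder line text are all illustrative.

```basic
#COMPILE EXE
#DIM ALL

FUNCTION PBMAIN() AS LONG
   LOCAL i     AS LONG
   LOCAL hFile AS LONG
   LOCAL t     AS DOUBLE
   DIM sLines(1 TO 120000) AS STRING    ' dynamic strings: no length declared

   t = TIMER
   FOR i = 1 TO 120000
      sLines(i) = FORMAT$(i) & ",0,0"   ' placeholder for the real result line
   NEXT i

   ' JOIN$ glues all elements together in one pass, so the file is
   ' written once instead of growing a string line by line.
   hFile = FREEFILE
   OPEN "Scenarios.csv" FOR OUTPUT AS #hFile
   PRINT #hFile, JOIN$(sLines(), $CRLF)
   CLOSE #hFile

   MSGBOX FORMAT$(TIMER - t, "0.000") & " seconds"
END FUNCTION
```

One thing to watch if fixed-length strings are used instead: a STRING * 20 element pads short text with spaces and truncates anything longer than 20 characters, so the width has to cover the longest line.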



                      • #12
                        Originally posted by Alex Chambers

                        REDIM Output(120000) as string * 20

                        'allocates the buffer immediately, much faster than doing it dynamically as each array item is filled
                        "Sometimes it is not enough to do our best;
                        we must do what is required."
                        Sir Winston Churchill (1874-1965)


                        • #13
                          Just to give you an idea of what it did for me: I generate 40k data points based on some math functions, then save them to a file.

                          Using standard string concatenation, it took minutes, maybe 6-7.

                          Using the array method, around a second.


                          • #14
                            A quick update on this one:

                            Now using:

                            a) Output(0-120000) as string


                            b) Paul Squires's assembly Fast function for concatenating many strings

                            I have gone from 1 scenario per second being run to 1013!!!!

                            This is unbelievable!

                            PB is extremely fast - thanks once again.