Need string with more than 100k bytes

  • Need string with more than 100k bytes

    Hello to everybody. I was wondering whether it is possible to have a single string of more than, say, 100k (100,000) bytes. I searched all the forums but couldn't find anything. My application downloads HTML data, and some pages easily exceed 100k bytes. It would be a lot easier for me to process all the data as one string than to split it across several.
    Thanks in advance.

  • #2
    An array (or even several separate strings) will still be the simplest approach for you by far...

    PB/DOS strings are limited to 32750 bytes or less, depending on the $STRING segment size your code is using. You therefore cannot use the built-in string functionality of PB/DOS to manipulate such a large "string" buffer, even though it is possible to create a large buffer in memory. You'd have to use pointers.

    If your PC is using some form of disk cache or (even better) a ram-disk, then it would be simple to place the data into a binary file, and use the binary file functionality of PB/DOS to retrieve sections of the binary file, using GET$ as you would use MID$() on a normal string.

    By using a ram-disk, performance should still be fairly reasonable, and limited only by how big you can create a RAM drive in memory.
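    As a rough illustration of the "binary file as one big string" idea, here is a minimal sketch in Python rather than PB/DOS. The helper name `file_mid` and the demo data are hypothetical; the point is only that a seek followed by a fixed-length read behaves like MID$() on a string that never has to fit in memory at once.

```python
import os
import tempfile

def file_mid(path, start, length):
    """Return `length` bytes starting at 1-based position `start`, like MID$."""
    with open(path, "rb") as f:
        f.seek(start - 1)      # file offsets are 0-based; MID$ positions are 1-based
        return f.read(length)  # a read near the end of the file simply returns fewer bytes

# Demo: a 150,000-byte "download" written to a temporary file.
data = b"<html>" + b"x" * 149_987 + b"</html>"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    demo_path = tmp.name

head = file_mid(demo_path, 1, 6)               # first 6 bytes
tail = file_mid(demo_path, len(data) - 6, 7)   # last 7 bytes
os.remove(demo_path)
```

    In PB/DOS the same pattern would use a file opened FOR BINARY with SEEK and GET$; check the base of the file position (0 or 1) in your own setup before relying on the arithmetic.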

    With a dynamic array, it is possible to allocate 64Kb of contiguous memory, so you could work with that amount fairly easily, treating the data as a bit array, or by using pointers. It is going over the 64K segment boundary that makes life more complicated.

    If you stick with the array method, the EMS page frame means you could not treat a virtual array as a contiguous block either, even though you can create a fixed-length string array of up to 32Mb for storing such data. (The actual EMS limit depends on your memory manager: QEMM 8 can provide 64Mb of EMS for virtual arrays, but individual virtual arrays are limited to 32Mb each, so with 64Mb you could create two 32Mb virtual arrays, four 16Mb arrays, etc.)

    For this type of work, you may really want to consider moving to PB/CC, which can create a single dynamic string of up to 2Gb in length, limited by virtual memory (hard disk space for the Windows swap file to grow).

    Anyone else have any ideas?

    ------------------
    Lance
    PowerBASIC Support
    mailto:[email protected]

    • #3
      I believe HUGE arrays do, in fact, allocate in contiguous blocks, and the only 'descriptor' is the single array descriptor. However, you can't use 'plain' pointer math to access these arrays across segment boundaries because of the segment:offset architecture of the 16-bit OS.

      That out of the way, you could always store >100K of string data in a string array consisting of multiple elements of, say, 16K:
      Code:
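      DIM X(6) AS STRING          ' 7 elements x 16384 bytes = 114688 bytes (6 elements fall short of 100K)
      FOR N = 0 TO 6
        X(N) = SPACE$(16384)      ' pre-size each 16K element
      NEXT N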
      True, when parsing the large HTML files, you'd have to count characters processed and change to the correct array element, but it's do-able.
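      The character counting described above comes down to one division: position divided by element size gives the array element, and the remainder gives the offset inside it. A minimal sketch in Python (not PB/DOS), with hypothetical names `locate` and `chunked_mid`; the demo uses seven 16,384-character elements so the total exceeds 100,000 characters.

```python
CHUNK = 16384  # characters per array element, as suggested in the post

def locate(pos):
    """Map a 0-based character position to (array element, offset in element)."""
    return divmod(pos, CHUNK)

def chunked_mid(chunks, pos, length):
    """Read `length` characters from 0-based `pos`, crossing elements as needed."""
    out = []
    while length > 0:
        idx, off = locate(pos)
        if idx >= len(chunks):
            break                      # ran past the last element
        piece = chunks[idx][off:off + length]
        if not piece:
            break                      # element is shorter than expected
        out.append(piece)
        pos += len(piece)
        length -= len(piece)
    return "".join(out)

# Demo: each element is filled with its own index digit so crossings are visible.
chunks = [str(n) * CHUNK for n in range(7)]
across = chunked_mid(chunks, 16382, 4)   # spans the end of element 0 and start of element 1
```

      Here `locate(100000)` lands in element 6 at offset 1696, and `across` picks up two characters from each side of the first element boundary.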

      And there's always the HUGE array approach to allocate 100K as long as you are willing to do your own segment-shifting arithmetic.
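      For anyone unsure what the segment-shifting arithmetic involves, here is a minimal sketch in Python (not PB/DOS). It assumes real-mode addressing, where the physical address is segment * 16 + offset, and a block that starts at offset 0 of its base segment; whether a given HUGE array starts that way is an assumption, and the function names are hypothetical.

```python
def far_address(base_seg, index):
    """Return (segment, offset) of byte `index` in a block starting at base_seg:0000."""
    segment = base_seg + (index >> 16) * 0x1000  # each full 64K step adds 1000h paragraphs
    offset = index & 0xFFFF                      # position inside that 64K window
    return segment, offset

def linear(segment, offset):
    """Real-mode physical address: segment * 16 + offset."""
    return segment * 16 + offset

# Demo: byte 100,000 of a block whose base segment is 2000h.
seg, off = far_address(0x2000, 100000)
```

      With a base segment of 2000h, byte 100,000 comes out at 3000h:86A0h, which is the same physical address as the base plus 100,000.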

      MCM
      Michael Mattias
      Tal Systems Inc. (retired)
      Racine WI USA
      [email protected]
      http://www.talsystems.com

      • #4
        I'm sure that at least some classes of HUGE arrays cannot be stored in contiguous blocks, but I'll check with R&D.


        ------------------
        Lance
        PowerBASIC Support
        mailto:[email protected]

        • #5
          Lance,
          I personally am very interested in finding out what R&D has to say
          about the different types of HUGE arrays, particularly BYTE arrays,
          as to whether or not they are allocated in contiguous memory.
          Since this subject has come up several times, I think it would be
          nice to have some kind of reference material indicating which types
          of HUGE arrays will be allocated contiguously, and which cannot.
          I've done tests that indicate that HUGE BYTE arrays are contiguous,
          but I guess I can't assume that that will always be the case unless
          I hear otherwise.

          Thanks


          ------------------

          • #6
            Originally posted by G Grant:
            Lance,
            I personally am very interested in finding out what R&D has to say
            about the different types of HUGE arrays, particularly BYTE arrays,
            as to whether or not they are allocated in contiguous memory.
            All arrays, other than virtual, are always contiguous when possible. The only exception is an element size that is not a power of two. In that case, there is an appropriate gap at each 64K boundary.
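            To make the 64K-boundary gap concrete, here is a minimal sketch in Python (not PB/DOS). It assumes elements are packed from offset 0 of each 64K block and never straddle a boundary; that layout is an illustration of the statement above, not documented PowerBASIC behavior, and the function names are hypothetical.

```python
SEGMENT_BYTES = 65536

def boundary_gap(elem_size):
    """Unused bytes at the end of each 64K block under the assumed layout."""
    return SEGMENT_BYTES % elem_size

def element_location(index, elem_size):
    """Return (64K block number, offset in block) for element `index`."""
    per_block = SEGMENT_BYTES // elem_size   # whole elements that fit in one block
    block, slot = divmod(index, per_block)
    return block, slot * elem_size

# Demo: 1-byte and 8-byte elements leave no gap; 10-byte elements leave 6 bytes.
gaps = {size: boundary_gap(size) for size in (1, 8, 10)}
```

            Under this assumption a BYTE array has no gap at all, which fits the test results reported earlier in the thread, while a 10-byte element type would skip 6 bytes at the end of every 64K block.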

            Bob Zale
            PowerBASIC Inc.


            ------------------

            • #7
              That's good to hear. I've written several applications that rely on
              contiguous HUGE arrays, and I was getting a little nervous (unnecessarily
              so) when this subject kept coming up.

              Thanks Bob

              ------------------
