
Record over-reach in Variable Length files?


  • Record over-reach in Variable Length files?

    I could tool up to test this, but it oughta be simpler just to
    ask here!

    What happens when you grossly over-reach a Variable Length file
    in PB 3.5 for DOS?

    I typically do a quick file integrity check 'proof' on many of the
    Variable Length files created. My habit is to not use the record
    number one for actual data, but to use it for check-sum sort of
    results from data that has to be normally known from the cap record.

    That way, if the initial record doesn't conform to both the needed
    check data that was generated by the cap record, and that doesn't
    match what else the system knows about this file, we can begin to
    suspect the file is corrupt. So far for years, that's worked well
    enough to catch most thingees that go bump in the night.

    Now I want to very much enlarge the entire record size in a certain
    file. Obviously, to 'discover' that the file might be the old length
    record style, much shorter, I can't quite rely, I think, on any
    normal EOF technique to tell me. In this case, the EOF and last
    byte record for the file will be at, say the 277,000 byte mark, where
    the new file size would put that same place at about 4,777,000 bytes.

    I don't really want to open it twice, once in straight Binary to see
    what the EOF byte mark is, then compare it to what was in the first
    record, which I could do. Turns out that the two versions will, for
    fact, have the first 512 bytes that match exactly in the file.
    Thus even if I open it in the much larger length for the new format,
    whatever data was at the Magic Point(s) in the first record will
    still track and tell me what the expected cap and crypt code check
    sum was for the original file .. even if that is WAY short in the
    new one!

    Now comes the curious point!

    I could do a 'read' for data in at the supposed known good record
    NUMBER location in the file. But of course that will put it FAR
    beyond the 'end' of what is the old file that is the wrong style
    which might be still on some disk or restoration effort!

    If I do that, as far as I know, I think (?) <- puppy with tail in
    curlicue if it doesn't smiley funny, the box will cheerfully go out
    and 'read' Heaven knows what from where?

    What does PB 3.5 do when this happens? Will it attempt to actually
    read beyond the actual EOF on that hard disk? Or will it return
    a disk I/O error that I can actually trap to make the determination?
    That said, if it does return garbage, and I discover it, if I then
    close the file, what happens to the file itself for having tried
    to read way past the end?

    Curious puppy wants to know!

    Thank you ..



    ------------------
    Mike Luther
    [email protected]

  • #2
    If you haven't yet filled your 512 bytes of control data in the first record (1), add a version # string to a currently unused area.

    For now, if the first record does not have the signature, it must be 'old' style.

    As long as you are adjusting the first record, you might want to add record count and/or file length members to the control data.

    Note (1): If the 512 bytes is NOT control data; that is, it is "application" data which just happens to be the same, consider adding a header record with control data, or a separate control file with control data, or even putting the control data record at the END of the file.
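
    A minimal sketch of the signature-plus-version idea, written in Python rather than PowerBASIC so it can be run anywhere. The field layout, the signature text, and the names `pack_header` and `read_version` are all assumptions for illustration, not the layout of any real file discussed here.

```python
import struct

# Hypothetical control-record layout (an assumption, not the poster's format):
# 16-byte signature, then version, record count and file length as 32-bit values.
HEADER = struct.Struct("<16sIII")
HEADER_SIZE = 512                  # control record padded out to 512 bytes
SIGNATURE = b"MYAPP-CTRL-REC01"    # exactly 16 bytes, made up for this sketch


def pack_header(version, record_count, file_length):
    """Build a 512-byte control record carrying the signature and counters."""
    body = HEADER.pack(SIGNATURE, version, record_count, file_length)
    return body.ljust(HEADER_SIZE, b"\x00")


def read_version(first_record):
    """Return the stored version, or 0 when the signature is absent (old style)."""
    if len(first_record) < HEADER.size:
        return 0
    sig, version, _count, _length = HEADER.unpack_from(first_record)
    return version if sig == SIGNATURE else 0
```

    A first record without the signature reads back as version 0, which is how "no signature means old style" would fall out in practice.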

    Oops, forgot to answer the question: if you access beyond End of File in sequential mode (open for INPUT), you get an error.

    For a file opened BINARY, you can raise EOF() but only if you do at least one read beyond the end.

    For files opened RANDOM, a WRITE (PUT) extends the size of the file to handle the new record and I think a READ (GET) returns "whatever happens to be on the disk in the designated location had the file been that large."
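
    Since a read past the end may hand back garbage rather than an error, one way around the question is to compare the record's end offset with the file length before reading at all. A rough sketch in Python (not PowerBASIC; `record_in_file` is a hypothetical helper, and records are assumed to be numbered from 1):

```python
import os


def record_in_file(path, record_number, record_length):
    """True when 1-based record `record_number` lies wholly inside the file."""
    if record_number < 1 or record_length < 1:
        return False
    # The record ends at byte offset record_number * record_length.
    return record_number * record_length <= os.path.getsize(path)
```

    With an old-style file that ends near the 277,000-byte mark, any record whose end offset falls beyond that comes back False, so the read is never attempted.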

    MCM


    [This message has been edited by Michael Mattias (edited August 09, 2001).]
    Michael Mattias
    Tal Systems Inc. (retired)
    Racine WI USA
    [email protected]
    http://www.talsystems.com



    • #3
      Thanks Mike ..

      "If you haven't yet filled your 512 bytes of control
      data in the first record (1), add a version # string
      to a currently unused area.

      "For now, if the first record does not have the signature,
      it must be 'old' style.

      "As long as you are adjusting the first record, you might
      want to add record count and/or file length members to
      the control data."

      First record already has 'record count' in it. Looking
      back over the more recent work last year, I realize that
      I'm already working with a file length pointer to the
      last record in the first record stub-in for one other
      file I last did. It is an indexed completely variable
      length trash text collection file for comment text with
      pointers to where records start and stop.

      I have 11 bytes 'extra' still left in the original, spill
      over from my habit of always aligning files where possible
      on 256, 512, 1024 ... byte boundaries for network use
      'efficiency', if any of that ever matters any more, sigh ..

      I didn't snap in the early morning mental fog to the EOF
      length location that could be added to even these yet until
      I read your post! But that gives me one more way to proof
      the file if I cross-post a byte pattern sample from the
      last PUT operation in there as well as a few of those last
      11 bytes!
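
      A rough sketch of that extra proof, in Python rather than PowerBASIC: the control record is assumed to hold the expected file length plus a copy of the last few bytes written, and both are checked against the file on disk. The name `tail_matches` and its parameters are hypothetical.

```python
import os


def tail_matches(path, stored_length, stored_tail):
    """Check the recorded length and last-bytes sample against the actual file."""
    if os.path.getsize(path) != stored_length:
        return False              # file is shorter or longer than the header says
    if len(stored_tail) > stored_length:
        return False              # sample cannot be longer than the file itself
    with open(path, "rb") as f:
        f.seek(stored_length - len(stored_tail))
        return f.read(len(stored_tail)) == stored_tail
```

      A restored old-format file would fail the length comparison straight away, before any record past its real end is ever touched.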

      "Note(1):If the 512 bytes is NOT control data; that is, it
      is "application" data which just happens to be the same,
      consider adding a header record with control data, or a
      separate control file with control data, or even putting
      the control data record at the END of the file."

      I never thought of putting control data at the END of a
      file. All my life at this I keep trying hardest to build
      all the generic data files such that they pile up latest
      written text, for example, on the top of the file! When
      you assemble most paper data files, I think you mostly
      want to see the latest stuff on top, going further to
      the bottom only as needed. It's a sort of pain to do that
      with stuff, but to me, no client, I would think, would
      normally want to read a case always flipping to the back
      of the file.

      --> What we really need in PowerBASIC is, in addition
      --> to the OPEN for APPEND ..

      --> is

      --> OPEN for PREPEND ..



      "Oops, forgot to answer the question: if you access
      beyond End of File in sequential mode (open for
      INPUT), you get an error."

      Realized that.

      "For a file in opened BINARY, you can raise EOF() but
      only if you do at least one read beyond the end."

      I'm not sure I ever tried that with BINARY! I've always
      just used LOF to find the pointer to the final byte
      at open time and then used the actual long integer value
      gotten from that as a cap pointer from that point forward
      or backward ... as in the case of binary searches into
      them ...

      Doesn't that also hold true for INPUT files as well?
      I'm sure I frequently use an IF NOT EOF FL%(#) to figure
      out when we've run out of steam on them without using
      an error mode control technique to do that...

      "For files opened RANDOM, a WRITE (PUT) extends the size
      of the file to handle the new record and I think a
      READ (GET) returns 'whatever happens to be on the disk
      in the designated location had the file been that large.'"

      That's exactly what I *THINK* is the case! It's sort of like the
      alleged Winston Churchill remark, or was it Mae West, I can't
      recall .. "That's a pr(e)(o)position up with which I do not want
      to put!"



      Foggy memory says that "When first I came to Louisville, some
      pleasure there to find!", with RANDOM order files, hard disk space
      was an expensive proposition! Grin. Thus if you asked for something
      on the menu outside of normal boundaries, you smashed the budget
      big time! And sometimes, you'd bring down the house!

      However, now that we've opened up how to read a hard disk at any
      point from just a simple OPEN for RANDOM game, that raises some
      very interesting perspectives on what one could do with that!

      By creating a reasonable length buffer and starting at a known
      file name that wasn't yours, you could whisk off a whale of a
      lot across a TCP/IP connection without further ado, correct
      Mssr.? The mule needs to chew on the bit a little bit about that!
      Not in PowerBASIC of course, but some competitive thingees ...

      Thank you for your thoughts Mike...

      ------------------
      Mike Luther
      [email protected]



      • #4
        "I never thought of putting control data at the END of a file."

        I only suggested that because it sounds like you are trying to 'retrofit' existing datafiles.

        Combined with the 'signature' thing, it could work very simply..

        Code:
        TYPE ControlRecordType
          Signature AS STRING * 16
          Version   AS LONG
          ' yadda_yadda AS whatever   (any further control fields go here)
        END TYPE
        
        DIM CR AS ControlRecordType
        
          hFile = FREEFILE                       ' get an unused file number
          OPEN "Thefile" FOR BINARY AS hFile
          SEEK hFile, LOF(hFile) - SIZEOF(CR)    ' control record sits in the last SIZEOF(CR) bytes
          GET hFile,,CR
          IF CR.Signature <> "Zshfdjsah8rnlasn" THEN  ' something really, really hard to get by accident
              AppVersion = %OLD_VERSION
          ELSE
              AppVersion = CR.Version
          END IF
        
          CLOSE hFile
        
          IF AppVersion = %OLD_VERSION THEN
             ' (existing code)
          ELSEIF AppVersion = 216 THEN
             ' (code for version 2.16)
          ELSEIF AppVersion = 217 THEN
             ' (code for version 2.17)
          END IF
        Using a control record like this also allows you to do "on the fly" conversions of the user data from %OLD_VERSION to 2.17 or whatever.
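
        One possible shape for such an "on the fly" conversion, sketched in Python rather than PowerBASIC. It assumes the simplest case, where every fixed-length record just grows and is padded out to the new length; the helper name `widen_records` and the padding byte are made up for illustration, and a real conversion would also have to rewrite the control record.

```python
def widen_records(old_data, old_len, new_len, pad=b" "):
    """Re-pack fixed-length records from old_len to new_len, padding each one."""
    if old_len < 1 or new_len < old_len or len(old_data) % old_len:
        raise ValueError("unexpected record lengths")
    out = bytearray()
    for start in range(0, len(old_data), old_len):
        # Copy one old record and pad it on the right to the new length.
        out += old_data[start:start + old_len].ljust(new_len, pad)
    return bytes(out)
```

        For example, two 3-byte records widened to 5 bytes each turn `b"ABCDEF"` into `b"ABC  DEF  "`.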

        "Doesn't [EOF() true condition] also hold true for INPUT files as well?"

        Nope. With sequential access (OPEN xxxx FOR INPUT ...), EOF() is true when you are at the end of the file. When you use "OPEN xxx FOR BINARY..." EOF() is not true until you go BEYOND EOF.

        Of course, if you are keeping track of where you are yourself, this is moot; but others may wish to use EOF() and I thought it best to explain how EOF() is not the same as EOF(). (Go figger!)


        MCM





        [This message has been edited by Michael Mattias (edited August 09, 2001).]
        Michael Mattias
        Tal Systems Inc. (retired)
        Racine WI USA
        [email protected]
        http://www.talsystems.com
