Problem with OPEN for APPEND on a file with existing EOF-mark


  • Problem with OPEN for APPEND on a file with existing EOF-mark

    From the birth of DOS there was a standard for putting an End-Of-File mark (&H1A = CHR$(26)) at the end of an ASCII text file (after the last CR/LF). Today this standard is no longer used (the size of the file is determined by the file length in the directory entry). But there are still programs in use at customer sites that put this EOF mark at the end of the file when the output file is closed.
    Therefore, when a BASIC program OPENs a text file FOR APPEND and the file contains an EOF mark, the EOF mark is discarded before the new records are APPENDed.
    This is true for old BASIC compilers, including PB/DOS, but PB/CC does not remove this EOF mark. That is a problem for me, because I have a text file that is updated both by an old editor and by my BASIC program APPENDing to it.
    Did PowerBASIC intentionally remove this "backward compatibility" from PB/CC?


    ------------------


    [This message has been edited by Peter Voll (edited August 04, 2000).]

  • #2
    If your BASIC program is written in PB/CC, why not open the file for append and seek (back) to the chr$(26) and append from there
    then add a chr$(26) when you have completed your appending.
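
    A minimal sketch of this idea, assuming the file is opened FOR BINARY rather than FOR APPEND so that the file pointer can be moved freely, and assuming the default file base of 1. The SUB name AppendKeepingEofMark and its parameters are made up for illustration, and the code has not been tested against a particular compiler version:

    ```basic
    ' Hypothetical helper: append one record and keep a single trailing CHR$(26).
    SUB AppendKeepingEofMark(BYVAL fileName AS STRING, BYVAL newText AS STRING)
        LOCAL f AS LONG, n AS LONG, lastByte AS STRING

        f = FREEFILE
        OPEN fileName FOR BINARY AS #f
        n = LOF(f)
        IF n > 0 THEN
            SEEK #f, n                  ' position on the last byte
            GET$ #f, 1, lastByte        ' and read it
        END IF

        IF lastByte = CHR$(26) THEN
            SEEK #f, n                  ' overwrite the old EOF mark
        ELSE
            SEEK #f, n + 1              ' no mark: write at the true end of the file
        END IF

        PUT$ #f, newText + $CRLF + CHR$(26)   ' new record, then a fresh EOF mark
        CLOSE #f
    END SUB
    ```

    If neither program needs the mark, the final CHR$(26) can simply be left out of the PUT$.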

    ------------------
    Ron



    • #3
      The "standard" was tenuous at best, and as you note, this "standard" has long since evaporated!

      First, let's note that neither PB/DOS nor PB/CC executables ever terminate a file with a CHR$(26)... you must explicitly write a CHR$(26) into the file if you want one to be in there. Therefore the issue is "natural" compatibility with CHR$(26) files.

      PB/DOS does adjust the file pointer to overwrite the CHR$(26) in APPEND mode (if one exists). In PB/CC, you are responsible for overwriting any trailing CHR$(26) - in other words, you are the programmer and PB/CC faithfully writes the data to the "true" end of the file, without any "behind the scenes" activity, simply because the world of Windows never uses CHR$(26) termination flags.

      Essentially, it is up to you to write code that either respects the CHR$(26) if one is encountered, or plainly ignores it.

      That said, I'll pass a note to R&D about this... they may consider a revision of this behavior, but they may have other reasons to retain the current behavior.



      ------------------
      Lance
      PowerBASIC Support
      mailto:[email protected]



      • #4
        OK, thank you for the answers. So I understand, that this "backward compatibility" was intentionally removed.
        As Ron suggests, I can overcome the problem "programmatically":
        Code:
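        OPEN "ASCIITXT.FIL" FOR BINARY AS #1
        i& = LOF(1)                     ' file length = position of the last byte
        IF i& > 0 THEN                  ' nothing to check in an empty file
          SEEK #1, i&
          GET$ #1, 1, W$                ' read the last byte
          IF W$ = CHR$(26) THEN SEEK #1, i&: SETEOF #1   ' cut the file off at the EOF mark
        END IF
        CLOSE #1
        OPEN "ASCIITXT.FIL" FOR APPEND AS #1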
        Neither the editor nor I need the EOF mark, so I do not need to add one before I close the file.

        Ron said:
        If your BASIC program is written in PB/CC...
        My program is not yet converted to PB/CC. I'm waiting (patiently) for the update that fixes the "Error 496 Destination file write error" problem, so it was more or less by accident that I discovered this difference in behavior between PB/DOS and PB/CC.
        ------------------


        [This message has been edited by Peter Voll (edited August 04, 2000).]



        • #5
          The CHR$(26) end-of-file mark is actually not an MS-DOS standard, but a CP/M standard. Early versions of MS-DOS were designed with the idea of making it easier for people to port programs from CP/M to the PC. Unlike MS-DOS, CP/M didn't keep track of exact file sizes, so it was necessary to have an end-of-file marker. Needless to say, this idea is long obsolete.

          The need to search an entire text file to find the first CHR$(26) is a disaster in terms of speed and code complexity, and can cause major havoc if the file contains CHR$(26) codes that aren't intended as end-of-file markers.

          So, CHR$(26) is not treated as a special case by Windows versions of PowerBASIC.


          ------------------
          Tom Hanlin
          PowerBASIC Staff



          • #6
            I agree, the idea about CHR$(26) as an EOF-mark is long obsolete.
            But if I had released my DOS BASIC program converted to PB/CC before I discovered this difference in behaviour between PB/DOS and PB/CC, I would have had big trouble with my customers! They would have lost all the data they had typed into the data-entry system after editing the file with the old editor!
            So, if you don't intend to change the behaviour of PB/CC, maybe it would be a good idea to include a note about this difference in the help file's "Appendix B - Upgrading from DOS".
            The issue is easy to fix (as I demonstrate above), but if you don't know there is an issue, you wouldn't think of fixing it!

            ------------------



            • #7
              Good idea... I'll ask the Docs department to add such a note.

              Thanks!


              ------------------
              Lance
              PowerBASIC Support
              mailto:[email protected]



              • #8
                I have done some more testing.
                Lance said:
                "... because the world of Windows never uses CHR$(26) termination flags."
                If you have an ASCII text file with a CHR$(26) in it (because you APPENDed to it without removing the existing EOF mark), for example:
                Code:
                Text line 1
                Text line 2
                CHR$(26)Text line 3
                Text line 4
                and you read this file with LINE INPUT#:
                Code:
                OPEN "ASCIITXT.FIL" FOR INPUT AS #1
                WHILE NOT EOF(1)
                LINE INPUT #1,W$
                PRINT W$
                WEND
                CLOSE #1
                then EOF is raised at the CHR$(26) point, and lines 3 and 4 are never read!
                The same thing happens with the TYPE command at the DOS prompt: the output stops at the CHR$(26).
                So IMO, if OPEN APPEND should write data to the "true" end of the file, then LINE INPUT# should also read to the "true" end.
                Or better (fewer problems with backward compatibility): OPEN APPEND should act in PB/CC as it always has done, i.e. overwrite a CHR$(26) at the end, if it exists.


                ------------------


                [This message has been edited by Peter Voll (edited August 05, 2000).]



                • #9
                  Ooh ick! Thanks. I'll pass that along.

                  ------------------
                  Tom Hanlin
                  PowerBASIC Staff



                  • #10
                    As far as I can see, this issue has not been dealt with in the new version 2.1 of PB/CC.
                    OPEN FOR APPEND does not remove a CHR$(26) at the end of the file before appending to the file, but LINE INPUT # raises EOF at the CHR$(26) point.
                    I still vote for reestablishing the backward compatibility (fix the APPEND and do not break LINE INPUT #).

                    ------------------


                    [This message has been edited by Peter Voll (edited January 21, 2002).]



                    • #11
                      If you must read a text file which has an ascii 26 character (or characters), handle the file using binary file mode. If another program is injecting the ascii 26 and you don't have the source to modify the offending program you have little choice but to use binary file mode.

                      Using BINARY mode and depending upon the size of the file, you might choose from several approaches. You might create your own LINE INPUT function where you read one BYTE at a time checking for a $LF character to indicate the end of a line. While not lightning fast, this would work fine as long as you really have $CRLF pairs at the end of each line - otherwise you will need to apply a slightly more complex means of determining the EOL. Ignore any ascii 26 character(s). There should only be one (if any) unless the file was seriously mutilated during previous writes.
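
                      A rough sketch of such a byte-at-a-time reader, assuming the default file base of 1. The function name ReadLineBinary and the file name are only illustrative, and the code is untested. It treats $LF as the end of a line and silently drops CR and CHR$(26) bytes:

                      ```basic
                      ' Hypothetical replacement for LINE INPUT # on a file opened FOR BINARY.
                      ' Returns -1 while a line was read, 0 once nothing is left.
                      FUNCTION ReadLineBinary(BYVAL f AS LONG, sLine AS STRING) AS LONG
                          LOCAL ch AS STRING

                          sLine = ""
                          DO WHILE SEEK(f) <= LOF(f)      ' bytes remain in the file
                              GET$ #f, 1, ch
                              SELECT CASE ASC(ch)
                                  CASE 10                 ' $LF: the line is complete
                                      FUNCTION = -1
                                      EXIT FUNCTION
                                  CASE 13, 26             ' drop CR and any EOF mark
                                  CASE ELSE
                                      sLine = sLine + ch
                              END SELECT
                          LOOP
                          FUNCTION = (LEN(sLine) > 0)     ' last line without a line break
                      END FUNCTION

                      FUNCTION PBMAIN () AS LONG
                          LOCAL W AS STRING

                          OPEN "ASCIITXT.FIL" FOR BINARY AS #1
                          DO WHILE ReadLineBinary(1, W)
                              PRINT W
                          LOOP
                          CLOSE #1
                      END FUNCTION
                      ```

                      With the four-line example file from post #8, this loop should print all four lines instead of stopping at the CHR$(26).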

                      Another way (faster) could consist of creating a BUFIN type function - discussed here in previous thread(s). Read a chunk of the file and extract each line from memory. It is much quicker than reading each BYTE with GET or GET$.
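
                      A simplified sketch of the buffered approach. This is not Ethan Winer's BUFIN code: it assumes the whole file fits in memory and reads it with a single GET$ instead of working in fixed-size chunks. The file name is taken from the earlier posts, and the code is untested:

                      ```basic
                      FUNCTION PBMAIN () AS LONG
                          LOCAL buf AS STRING, sLine AS STRING
                          LOCAL p AS LONG, q AS LONG

                          OPEN "ASCIITXT.FIL" FOR BINARY AS #1
                          GET$ #1, LOF(1), buf            ' one read; CHR$(26) is just another byte here
                          CLOSE #1

                          buf = REMOVE$(buf, CHR$(26))    ' ignore any EOF marks
                          p = 1
                          DO
                              q = INSTR(p, buf, $LF)      ' find the end of the next line
                              IF q = 0 THEN EXIT DO
                              sLine = RTRIM$(MID$(buf, p, q - p), $CR)   ' drop the CR of a CR/LF pair
                              PRINT sLine
                              p = q + 1
                          LOOP
                          IF p <= LEN(buf) THEN PRINT MID$(buf, p)   ' last line without a line break
                      END FUNCTION
                      ```

                      For very large files, the same line-splitting loop could be applied to successive chunks, carrying any incomplete last line over to the next chunk.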

                      Search for BUFIN or WINER. Ethan Winer released the code from a book he wrote about BASIC programming perhaps a decade ago or so which contained a BUFIN function in BASIC. His PDQ product from the late 80's had an assembler BUFIN$.

                      [This message has been edited by Ron Pierce (edited January 22, 2002).]



                      • #12
                        Ron,
                        I appreciate your trying to help, but if you read this thread from the top, you will see, that I do not have a real problem solving this. In fact, the 2nd post (by you !) and the 4th post (by me) show the solution.
                        What I am complaining about is the inconsistency between APPEND and LINE INPUT#, that has been introduced in PB/CC.
                        Lance said:
                        ...simply because the world of Windows never uses CHR$(26) termination flags.
                        and Tom said:
                        ...So, CHR$(26) is not treated as a special case by Windows versions of PowerBASIC.
                        PB/DOS was consistent in the way, that APPEND removed a CHR$(26) at the end of the file before appending, and LINE INPUT# raised EOF at the CHR$(26) point.
                        PB/CC is inconsistent in the way, that APPEND does not remove the CHR$(26), but LINE INPUT# still raises EOF at the CHR$(26) point.
                        So if the folks at PowerBasic insist, that CHR$(26) is not a special case, the consequence would be, that LINE INPUT# should treat it as any other normal character and read to the "true" end of the file.
                        But that's not what I would suggest. I'm pleased with the way LINE INPUT# works, but certainly not pleased with the way APPEND works in PB/CC...

                        ------------------


                        [This message has been edited by Peter Voll (edited January 22, 2002).]



                        • #13
                          Peter --

                          > the inconsistency between APPEND and LINE INPUT#,
                          > that has been introduced in PB/CC.

                          I think you are making a connection between APPEND and LINE INPUT that isn't really there. LINE INPUT is used only for text files. You're assuming that OPEN FOR APPEND is used only for text files, and that's not true. It's perfectly ok to use OPEN FOR APPEND to add binary data to the end of a file. If APPEND was modified in the way you are suggesting, you could no longer use it for binary files.

                          > PB/DOS was consistent

                          That's because using EOF markers was a DOS convention. In fact, if I'm not mistaken, some versions of PB/DOS search the entire file for the first EOF marker, and append the file at that point. I hope you are not suggesting that PB/CC emulate that behavior, because while it produces the technically-correct results for DOS-convention files it results in a huge speed penalty if the file is large. Which APPEND files often are.

                          > LINE INPUT# still raises EOF at the CHR$(26) point.

                          IMO that is the appropriate behavior. If a text-only function like LINE INPUT sees a non-text character that means "EOF" in the context of a text-only file, it should stop.

                          -- Eric


                          ------------------
                          Perfect Sync Development Tools
                          Perfect Sync Web Site
                          Contact Us: mailto:[email protected]

                          [This message has been edited by Eric Pearson (edited January 22, 2002).]
                          "Not my circus, not my monkeys."



                          • #14
                            > ...use OPEN FOR APPEND to add binary data...

                            OK, I can see that you have a point here !
                            As I said, this is not a big problem, when you know about the problem.
                            And the issue is now described in "Appendix B - Upgrading from DOS" (as suggested).

                            > LINE INPUT# still raises EOF at the CHR$(26) point and that is the appropriate behavior.

                            I agree. So we can conclude that there are still situations where CHR$(26) is treated as a special case by Windows versions of PowerBASIC.

                            ------------------



                            • #15
                              So, we can conclude, that there are still situations, where CHR$(26) is treated as a special case by Windows versions of PowerBASIC.
                              Or was left 'untreated' ????

                              Regardless, this is an inconsistency and should be addressed in a future version of the compilers.

                              MCM
                              Michael Mattias
                              Tal Systems (retired)
                              Port Washington WI USA
                              [email protected]
                              http://www.talsystems.com



                              • #16
                                Michael --

                                How is this an "inconsistency"? LINE INPUT is just for text, and OPEN FOR APPEND is not, so they can't follow exactly the same rules. If you mean that PB/DOS and the PB Windows compilers behave differently, then I'd agree that a note in the docs would be a good idea. But I don't see any reason for the compilers to be changed.

                                -- Eric

                                ------------------
                                Perfect Sync Development Tools
                                Perfect Sync Web Site
                                Contact Us: mailto:[email protected]

                                [This message has been edited by Eric Pearson (edited January 22, 2002).]
                                "Not my circus, not my monkeys."



                                • #17
                                  OPEN for APPEND really is a text-mode operation. Since there isn't much
                                  distinction between text and binary files under Microsoft OSes, it can
                                  be used with binary files also, although I'm not sure that falls into
                                  the category of best programming practices. With binary files, you are
                                  usually best off using binary operations, which are guaranteed not to
                                  interpret any of your file data as control codes.

                                  ------------------
                                  Tom Hanlin
                                  PowerBASIC Staff



                                  • #18
                                    Tom, that is an interesting point of view, which I'm happy to see.
                                    Are you saying that PB may, after all, consider changing the behaviour of APPEND to act as it did in PB/DOS?
                                    (BTW, I'm not suggesting that APPEND should search the entire file for CHR$(26), only a CHR$(26) in the last position of the file should be removed.)

                                    ------------------



                                    • #19
                                      If something is going to be done to "APPEND", it should certainly
                                      not be removing the last EOF char. It should be the first EOF char
                                      in the file. It is not logical to assume that "old editors" never
                                      make an existing file smaller in size....


                                      ------------------
                                      Fred
                                      mailto:[email protected]
                                      http://www.oxenby.se



                                      • #20
                                        Well, the "old editor" I'm referring to writes the output file with the correct file length, even if the file has been edited smaller. So there will never be more than one CHR$(26) in the file (and it will be in the last position in the file).
                                        I believe an editor must be "very, very old" if it just writes the output file with the same length as the input file and puts an EOF in the middle of the file...
                                        So I'll be happy if APPEND could remove a CHR$(26) at the end of the file before appending, but the "safe" suggestion will of course be: "APPEND in PB/CC should do as it does in PB/DOS".

                                        ------------------

