This is *NOT* a complaint about PowerBasic. It's how some very subtle
data file corruption can happen using PB and other file operations,
in a way I had no idea could happen, in all these years, until today.
For a number of purposes at a few places I use string data with a
pattern of not <CR LF> end of line, but just the ASCII 10 (line feed)
character, followed by more characters, then another ASCII 10 and so
on. I write that string with the WRITE function in PB 3.5, and read it
back as well. The data in the case where I observed this interesting
way to corrupt that string is never changed in normal client actions
for my suite. It, and the CRC checksums for things related to it, are
purely administrative matters. Imagine my surprise when yesterday I
saw the whole thing come apart!
Before the corruption, here is a sample of a snip that got corrupted:
000050 45 20 45 58 43 4C 55 53 49 56 45 20 55 53 45 20  E EXCLUSIVE USE
000060 4F 46 3A 0A 0A 0D 5A 69 70 6C 6F 67 2C 20 49 6E  OF: Ziplog, In
Note the adjacent <0A 0A 0D> characters in the above string data.
Suddenly I see the following:
000050 48 45 20 45 58 43 4C 55 53 49 56 45 20 55 53 45  HE EXCLUSIVE USE
000060 20 4F 46 3A 0D 0A 0D 0A 0D 5A 69 70 6C 6F 67 2C  OF: Ziplog,
Note the corruption into the sequence <0D 0A 0D 0A 0D> in the above!
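For anyone who wants to check a data file for this, here is a minimal sketch in Python (not PB) that counts bare 0A bytes against 0D 0A pairs. The function name `count_line_endings` and the variable names are made up for illustration, and the two byte strings are simply typed in from the dumps above.

```python
import re

def count_line_endings(data: bytes) -> dict:
    """Count CR LF pairs and bare LF bytes (an LF not preceded by CR)."""
    crlf = len(re.findall(rb"\r\n", data))
    bare_lf = len(re.findall(rb"(?<!\r)\n", data))
    return {"crlf": crlf, "bare_lf": bare_lf}

# Bytes typed in from the two dumps above (hypothetical variable names).
before = bytes.fromhex("4F 46 3A 0A 0A 0D 5A 69 70 6C 6F 67")
after = bytes.fromhex("4F 46 3A 0D 0A 0D 0A 0D 5A 69 70 6C 6F 67")

print(count_line_endings(before))  # two bare LFs, no CR LF pairs
print(count_line_endings(after))   # no bare LFs, two CR LF pairs
```

A file that is supposed to hold only bare 0A separators should report zero CR LF pairs, so a non-zero count is a hint that something has rewritten it.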
Years and years of using this have never been a problem. Countless
hundreds of thousands of times PB 3.5 has read and written this
data over and over without EVER showing the above.
Well, until yesterday. And it is *NOT* a PB 3.5 issue either. Here
is how it can be provoked.
I started using WordStar back in CP/M-86 days. Oh yes. I know exactly
one of the issues with use of it, even as to Version 7 for DOS and
even the WordStar for Windows. Even in TEXT mode, it pads from
the end of a text file out to the adjacent sector boundary of
the file with additional EOF (hex 1A) characters, as in:
001500 6C 20 72 65 63 6F 72 64 2E 0D 0A 1A 1A 1A 1A 1A  l record.
And yes, I'm fully aware that this can cause serious problems for
some programs and data files. A prime example of the problems that
can cause is the early CONFIG.SYS file in OS/2 operations.
In this case, until IBM fixed that issue, a whole host of troubles
could be caused in OS/2 setup operations, particularly in regard to
PEER LAN work. So if one uses WS or any other program which does
this, and there are others, the only way to be sure that you don't
corrupt data files from this is to strip the surplus EOF characters
from the end of the file if you use an editor which produces them.
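Stripping the surplus EOF characters can be sketched like this in Python. This is only an illustration under the assumption that the file never legitimately ends in 1A bytes of its own; `strip_trailing_eof` and `EOF_BYTE` are names invented here.

```python
EOF_BYTE = b"\x1a"  # Ctrl-Z, the DOS / CP/M end-of-file marker

def strip_trailing_eof(data: bytes) -> bytes:
    """Remove any run of 1A bytes at the very end; touch nothing else."""
    return data.rstrip(EOF_BYTE)

# Tail of a file padded the way the dump above shows.
padded = b"l record.\r\n" + EOF_BYTE * 5
print(strip_trailing_eof(padded))
```

When applying this to a real file, read and write it in binary mode ("rb" and "wb") so that nothing translates the line ends on the way through.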
And for all source code work with the compiler in PB 3.5, Bob has
been very careful in his better-than-WS compatibility work to provide
that if you do use an editor which leaves them in the source, it
makes no difference to the PB toolset. So what.
But .. that turns out to be NOT TRUE in relation to data files with
embedded CHR$ <0A 0A> constructs further up in a file that has the EOF
characters at the end of it!
What I can now prove is that if, by accident, the EOF characters are
present at the end of a DATA file, and I then EDIT that data file
with, say, QEDIT for DOS or TED for DOS, which do *NOT* add the
trailing EOF characters, the following will happen.
1.) The trailing EOF characters will be eliminated when you simply
open and save the data file.
2.) Each of the CHR$(0A) marks in the data file ABOVE the end of the
file will get a CHR$(0D) inserted in front of the correct
CHR$(0A) data for that byte in the file!!
Duhhhhh?
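The two effects listed above can be modelled in a few lines of Python. This is only a model of what I observed, not the real code of QEDIT, TED or any other editor, and `simulate_editor_save` is a name made up for the sketch.

```python
import re

def simulate_editor_save(data: bytes) -> bytes:
    """Model of the observed effect only, not any editor's real code."""
    # Effect 1: the trailing EOF (1A) bytes are dropped.
    data = data.rstrip(b"\x1a")
    # Effect 2: a 0D is put in front of every 0A that lacks one.
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)

# The <0A 0A 0D> construct from the first dump, plus some EOF padding.
original = b"OF:\n\n\rZiplog" + b"\x1a" * 4
saved = simulate_editor_save(original)
print(saved.hex(" ").upper())
```

Run on the <0A 0A 0D> construct, the model gives back the same <0D 0A 0D 0A 0D> sequence as the corrupted dump.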
I can absolutely prove this happens when a data file that was saved
by a simple open and close in text mode in WS7 is then put through a
simple open and close in QEDIT for DOS (or QEDIT for OS/2!), and worse ..
If the data file that has the trailing EOF marks is READ with PB 3.5
for DOS, and it absolutely *DOES* have the proper construct in the data
above them, then when that file is written again with *NO* change in
the string data for the afflicted string, here in my world it is written
back to the disk with the corrupted data in it.
Uhhhh?
I have not checked this on any development system other than DOS-VDM
work in OS/2, so I can't tell whether it is also an issue in FreeDOS
or MS-DOS 6.2+, for example. It is *NOT* an issue in any normal use
of PB 3.5 with such data files, whether in DOS-VDM, FreeDOS or
MS-DOS 6.2+. It's not a user issue, as I would view this.
But as a developer, this is a very interesting glitch which I can see
would present a perhaps very hard to find error, which I thought I
would describe in an effort to help others here. Just a casual look
at the data with any tool that presents an EOF issue, and even an
inadvertent save of that file, can really create havoc...
FWIW ..
------------------
Mike Luther
[email protected]