File Storage

  • File Storage

    I am working on an adaptation of the Huffman routines provided in the
    download sections here to archive multiple files.

    Once the files are compressed with the Huffman routines, I try to
    store them all in a single file.

    Originally, I used a loop that simply wrote a header, and appended
    each file into the "archive" file. That is pretty much the format
    I want to use, since it is simple, and easy to work with.

    At first, I tried appending the files with LINE INPUT # and
    PRINT #. The problem is that some of the binary files contain
    characters (the one I noticed immediately was CHR$(26)) that cause
    the function to abort, so the rest of the file is not added.

    I tried to solve this problem by using GET and PUT in binary mode,
    but this creates a different problem: difficulty in telling
    the header from the data, meaning the whole file is extracted
    to the first filename.

    I have tried numerous ideas, but none seem to work.

    Does anyone have any idea how to do this with a function that can
    handle a variety of file types?

    Perhaps I should get the UUEncode source, do a safe (low-ASCII)
    encoding, add the files, then compress with the Huffman routine?

    Unless someone has a better idea....


    ------------------
    Amos
    mailto:[email protected]

  • #2
    My way of Archiving multiple files from within one of my PB DOS 3.5
    programs written for clients is to simply:

    SHELL "PKZIP C:\ARCHIVE\ARCH-ZIP Files.* FileA.Dat EtcFiles.*"

    Vary the PKZIP Commandline Params according to your needs.

    Oooops ! Forgot that this works in Pure Dos only. In Windows it causes
    an "Illegal Operation" Shelled from the DOS program but not when invoked
    from the Windows DOS Prompt.



    [This message has been edited by OTTO WIPFEL (edited March 26, 2002).]



    • #3
      Assuming your file is laid out something like this:
      Header
      Data
      Header
      Data
      Header
      Data
      ...


      Code:
      TYPE FileHeaderType
         FileName  AS STRING * 30
         FileLen   AS LONG
         ' OtherControlData AS yadda,yadda,yadda   (any further fixed-size fields)
      END TYPE
      
      DIM  FH AS FileHeaderType
      
      OPEN "Big_File" FOR BINARY AS #1 
      SEEK #1, 0
      DO
         GET #1,,FH                 'read header and advance 
         GET$ #1,FH.FileLen,X$      'get the file data as a string (X$)
         CALL ProcessFileData (X$)  'do something with it
      LOOP UNTIL SEEK(1) >= LOF(1)  ' might be =, might be >=
      CLOSE #1
      There are variations, but this should work.
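As a rough illustration of the header-then-data layout above, here is a minimal round-trip sketch. It is written in Python rather than PowerBASIC purely so it can be run anywhere. The 34-byte packed layout (a 30-byte name plus a 4-byte little-endian signed length) is an assumption about how FileHeaderType is stored, and `write_archive` and `read_archive` are hypothetical names, not routines from this thread.

```python
import io
import struct

# Assumed on-disk layout mirroring FileHeaderType:
# 30-byte name field followed by a 4-byte little-endian signed length.
HEADER = struct.Struct("<30sl")

def write_archive(out, files):
    """Write each (name, data) pair as a fixed-size header followed by the raw bytes."""
    for name, data in files:
        padded = name.encode("ascii").ljust(30, b" ")  # space-padded name field
        out.write(HEADER.pack(padded, len(data)))
        out.write(data)

def read_archive(src):
    """Yield (name, data) pairs until no complete header is left."""
    while True:
        raw = src.read(HEADER.size)
        if len(raw) < HEADER.size:  # end of archive
            break
        name, length = HEADER.unpack(raw)
        data = src.read(length)     # exactly `length` bytes, whatever they contain
        yield name.decode("ascii").rstrip(" \x00"), data

# Round trip through an in-memory "archive"; the first payload contains byte 26.
buf = io.BytesIO()
write_archive(buf, [("A.DAT", b"one\x1atwo"), ("B.BIN", bytes(range(256)))])
buf.seek(0)
entries = list(read_archive(buf))
```

Because the reader takes exactly the number of bytes named in each header, payload bytes such as CHR$(26) are never interpreted, and the next header starts at a known offset.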

      MCM

      Michael Mattias
      Tal Systems (retired)
      Port Washington WI USA
      [email protected]
      http://www.talsystems.com



      • #4
        Originally posted by OTTO WIPFEL:
        Oooops ! Forgot that this works in Pure Dos only. In Windows it causes
        an "Illegal Operation" Shelled from the DOS program but not when invoked
        from the Windows DOS Prompt.

        I use SHELL to PKZIP.EXE under all versions of Windows, including XP, and it works fine for me... and believe me, if it didn't I'd hear about it pretty promptly!

        Anyway, your comment suggests something else is wrong there Otto. Remember that in a SHELL, conventional memory could be quite low, and this could conceivably cause problems for PKZIP (or any target APP)...

        If you are not using UltraShell to free conventional memory before the SHELL starts, have you tried using some of the PKZIP "trouble shooting options" to selectively disable EMS/XMS/HMA/UMB/DPMI when run on your test machine? One of those options may give you the hint you need to fix the problem.

        That said, I have not (and do not) use any of those options in my DOS apps that SHELL to PKZIP.EXE under Windows.

        However, I do remember a problem with Novell PNW that would cause PKZIP to crash much like you describe above... IIRC, the cause was the DPMI handler in PNW. Maybe your test machine(s) have similar problems?



        ------------------
        Lance
        PowerBASIC Support
        mailto:[email protected]



        • #5
          It seems that Michael is the only one that actually got what I was
          saying. That looks like the solution I need too. (well a variant
          anyway)

          Since I have to do it that way anyway, I'm going to add more
          features to the file format...such as attributes and the like.

          I'll just have to rewrite the archive and unarchive functions.
          (no problem really)

          For Otto and Lance, who suggest PKZIP. I am doing an installer
          for an OpenSource project, and see no point in distributing a
          shareware program (PKUNZIP) with it.

          Besides, it's good to learn new things. It's also good to have
          an archiver library to just plug in to programs where I need it.


          ------------------
          Amos
          mailto:[email protected]



          • #6
            Thanks Michael, it works perfectly now! I will give you credit
            in the comment block at the top of the code, and in the readme.



            ------------------
            Amos
            mailto:[email protected]



            • #7
              Glad you found a solution (Michael's variable-length-record database scheme).

              However, to be clear, I never actually recommended any method for your question... I was only responding to Otto's message that using PKZIP.EXE under Windows reportedly did not work...

              That said, I prefer to use self-extracting archives that my custom installer app launches. LHA and RAR seem to work pretty well. That way there is no additional app to distribute - you simply include a 'SFX-style' archive in your distribution set. Licensed appropriately of course.

              Another benefit is that these compression apps include all kinds of useful features and CRC testing... which may save you significant time writing from scratch.
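For anyone weighing the hand-rolled route, a per-file checksum is one of the simpler features to add. The following is only a sketch in Python, assuming the CRC-32 would be stored in an extra header field that this thread never defines. `entry_crc` and `verify_entry` are hypothetical helper names, and `zlib.crc32` is Python's standard-library CRC-32.

```python
import zlib

def entry_crc(data):
    """CRC-32 of the payload as an unsigned 32-bit value, to be stored with the header."""
    return zlib.crc32(data) & 0xFFFFFFFF

def verify_entry(data, stored_crc):
    """True when the payload read back still matches the checksum stored at archive time."""
    return entry_crc(data) == stored_crc

payload = b"compressed bytes go here"
stored = entry_crc(payload)                       # computed when archiving
intact = verify_entry(payload, stored)            # checked when extracting
damaged = verify_entry(payload[:-1] + b"!", stored)
```

The checksum only detects damage; it does not say where the damage is or repair it.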




              ------------------
              Lance
              PowerBASIC Support
              mailto:[email protected]



              • #8
                AMOS - I also believe in doing my own thing
                ====
                That's what got me started programming 20 years ago when, being an Accountant,
                I could not find Accounting Software to do what I wanted it to.

                What you are doing is way above my head. As my son, Novell's Clustering Architect,
                excuse me for advertising, keeps saying "Why try re-inventing the wheel". There
                is no way I can do better than PKZIP, of which I have a licensed version 2.05.

                LANCE: Once again, your invaluable experience can help me solve a long-standing
                ====== puzzle, which I worked around by giving my Accounting Software clients using it
                in Windows a Batch file to run from the DOS prompt or via a Desktop Shortcut.

                Your comment prompted me to have a closer look at the problem. Testing a stand
                alone version of my Archiver, usually EXECUTED into by the MENU.EXE, I got no
                problem. The moment I ran it as part of my Accounting Software System, I did !

                It's BTRIEVE's DOS/4G Protected Mode Run-Time / Switch program !
                Code:
                Modules using memory below 1 MB:
                
                  Name           Total           Conventional       Upper Memory
                  --------  ----------------   ----------------   ----------------
                  SYSTEM      41,312   (40K)     15,008   (15K)     26,304   (26K)
                  HIMEM        1,168    (1K)      1,168    (1K)          0    (0K)
                  EMM386       4,320    (4K)      4,320    (4K)          0    (0K)
                  WIN          2,336    (2K)      2,336    (2K)          0    (0K)
                  vmm32      105,488  (103K)      1,392    (1K)    104,096  (102K)
                  SBEINIT      4,480    (4K)      4,480    (4K)          0    (0K)
                  COMMAND     11,056   (11K)     11,056   (11K)          0    (0K)
                  ADMMENU      1,088    (1K)      1,088    (1K)          0    (0K)
                  PMSWITCH    57,904   (57K)     57,904   (57K)          0    (0K)
                  COMMAND     10,688   (10K)     10,688   (10K)          0    (0K)
                  ADMARCHV   145,760  (142K)    145,760  (142K)          0    (0K)
                  COMMAND     10,688   (10K)     10,688   (10K)          0    (0K)
                  ANSI         4,320    (4K)          0    (0K)      4,320    (4K)
                  IFSHLP       2,864    (3K)          0    (0K)      2,864    (3K)
                  COMMAND     13,776   (13K)          0    (0K)     13,776   (13K)
                  KEYB         6,944    (7K)          0    (0K)      6,944    (7K)
                  Free       389,184  (380K)    389,184  (380K)          0    (0K)
                
                Memory Summary:
                
                  Type of Memory       Total         Used          Free
                  ----------------  -----------   -----------   -----------
                  Conventional          655,360       266,176       389,184
                  Upper                 158,304       158,304             0
                  Reserved                    0             0             0
                  Extended (XMS)     66,885,024             ?   534,974,464
                  ----------------  -----------   -----------   -----------
                  Total memory       67,698,688             ?   535,363,648
                
                  Total under 1 MB      813,664       424,480       389,184
                
                  Largest executable program size         373,920   (365K)
                  Largest free upper memory block               0     (0K)
                  MS-DOS is resident in the high memory area.
                The above taken SHELLED from within the Archiver.

                Any idea how to solve that one ?
                ========================

                ------------------




                [This message has been edited by OTTO WIPFEL (edited March 27, 2002).]



                • #9
                  Arrgh! It was working at first, now I seem to be getting a file
                  not found error when I try to write the file out.

                  The code I am using is

                  Code:
                  TYPE headerType
                      fileName        AS      STRING*30
                      FileLen         AS      LONG
                  END TYPE
                  
                  DIM  FH AS headerType
                  dcf%=FREEFILE
                  OPEN arcFile$ FOR BINARY AS #dcf%
                  
                  Seek #dcf%, 0
                  DO
                          GET #dcf%,,FH
                          print "Found file: ";FH.fileName
                          tf%=FREEFILE
                          ON ERROR RESUME NEXT
                          outFile$=trim$(FH.filename)
                          open outFile$ for BINARY as #tf%
                                  IF ERR=53 then
                                          print "Error: ";outFile$
                                          EXIT LOOP
                                  end if
                          GET$ #dcf%,FH.FileLen,X$
                          PUT$ #tf%, X$
                          close #tf%
                  LOOP UNTIL SEEK(dcf%) >= LOF(dcf%)
                  close #dcf%
                  Which is quite similar to Michael's, since I haven't added any
                  features (attributes, etc) to the initial test functions.

                  A few things I have tried are
                  • Writing a function that opens the file, writes a space to it
                    then closes it back up before opening the file for BINARY.
                  • Changing to different directories to write the file.
                  • Restarting my computer (sometimes works)


                  I have never had this problem with output files before...so I
                  must have something wrong. Any ideas?

                  If anyone needs the full source, please ask. It's all open
                  source so there are no "trade secrets" in it or anything.



                  ------------------
                  Amos
                  mailto:[email protected]



                  • #10
                    Check for nulls (CHR$(0)) in FH.Filename (or trim with ANY CHR$(0,32) to remove trailing spaces and nulls).

                    When you create the FH.Filename member, remember to use LSET instead of simple assignment (=) in order to ensure spaces rather than nulls throughout the member.
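The padding issue can be seen in a small sketch. It is in Python, not PowerBASIC, and `make_name_field` and `clean_name` are hypothetical helpers. The 30-byte width comes from the header type earlier in the thread. Stripping both ends rather than only the trailing end is a defensive choice assumed here, not something this post prescribes.

```python
def make_name_field(name, width=30, pad=b" "):
    """Build a fixed-width name field, padded with spaces by default (LSET-like)."""
    return name.encode("ascii")[:width].ljust(width, pad)

def clean_name(field):
    """Strip NULs and spaces from both ends of a name field read back from a header."""
    return field.strip(b"\x00 ").decode("ascii")

space_padded = make_name_field("README.TXT")             # b"README.TXT" + 20 spaces
null_padded = make_name_field("README.TXT", pad=b"\x00")  # b"README.TXT" + 20 NULs

# Both clean up to the same usable filename.
same = clean_name(space_padded) == clean_name(null_padded) == "README.TXT"
```

A name that still carries padding bytes is a different string from the intended filename, which is one way an OPEN on it can fail.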

                    MCM



                    [This message has been edited by Michael Mattias (edited March 27, 2002).]
                    Michael Mattias
                    Tal Systems (retired)
                    Port Washington WI USA
                    [email protected]
                    http://www.talsystems.com



                    • #11
                      LANCE:
                      =====

                      PKZIP -3 -), disabling 32bit support did the trick

                      Thanks.

                      ------------------



                      • #12
                        First, try opening the files with LOCK SHARED access.

                        Next, and in addition to Michael's comments, your code appears not to be testing the error 53 condition after opening the 1st file, and an error here may propagate through the code.

                        The 'fix' is to test ERR after the first OPEN statement, and also to use ERRCLEAR right before the OPEN statement in the loop... that way an ERR 53 after the 2nd OPEN can definitely be attributed to that statement (otherwise the point of failure cannot be readily attributed).

                        That is, ERR/ERRTEST is only set when an error occurs... it is not cleared when a subsequent statement executes correctly. Therefore, use plenty of ERR testing, but do it at appropriate places in the code. Also, don't be afraid of using ERRCLEAR at the start of 'critical' blocks of code either. In your code above, you only test the 2nd OPEN statement for an error, but no other file I/O statements are error tested.

                        Finally, you should be able to move the ON ERROR RESUME NEXT statement outside of the loop.
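The "sticky" behaviour described above can be modelled with a toy class. This is a Python sketch of the idea only, not PowerBASIC's runtime: `StickyErr`, its `run` and `clear` methods, and the catch-all code -1 are all hypothetical. Code 53 is borrowed from the thread as the "file not found" value.

```python
class StickyErr:
    """Toy model of a sticky error code: set on failure, left untouched on success."""

    def __init__(self):
        self.err = 0

    def clear(self):
        """Reset the code, in the spirit of ERRCLEAR."""
        self.err = 0

    def run(self, op):
        """Run op(); on failure record a code and carry on, like ON ERROR RESUME NEXT."""
        try:
            return op()
        except FileNotFoundError:
            self.err = 53
        except OSError:
            self.err = -1  # hypothetical catch-all code
        return None

def failing_open():
    raise FileNotFoundError("no such file")

def working_open():
    return "ok"

state = StickyErr()
state.run(failing_open)   # first "OPEN" fails and nobody checks
state.run(working_open)   # second "OPEN" succeeds
stale = state.err         # still 53: the old failure gets blamed on the new statement

state.clear()             # reset right before the critical statement
state.run(working_open)
fresh = state.err         # 0: a nonzero value here could only come from this statement
```

Clearing immediately before the statement under test is what makes the later check attributable to that statement.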

                        ------------------
                        Lance
                        PowerBASIC Support
                        mailto:[email protected]



                        • #13
                          Otto: excellent news!

                          ------------------
                          Lance
                          PowerBASIC Support
                          mailto:[email protected]



                          • #14
                            With a little more research, and everyone's tips, I finally got it
                            working. I can now compress and archive files, then decompress them.

                            The filename problem was in fact a CHR$(32) at the beginning of the
                            data. That was easily fixed. I also took Lance's suggestions and
                            added more error checking.

                            I will comment the code tonight, and post it for everyone's access
                            later. After the project I am using it in is finished, I will
                            also learn more about LHA's optimizations to the Huffman Codes
                            and optimize in a similar manner.

                            Keep a lookout for smaller archives, smaller executables, and more
                            features (SFX, Attributes, Pathnames, etc)



                            ------------------
                            Amos
                            mailto:[email protected]



                            • #15
                              Good news Amos... Thanks for letting us know!

                              ------------------
                              Lance
                              PowerBASIC Support
                              mailto:[email protected]
