utility to split files sizes greater than zero into smaller sizes down to 1 megabyte

  • utility to split files sizes greater than zero into smaller sizes down to 1 megabyte

    A work in progress, but it works.
    I had to transfer some very large files: disk image files.
    It has code to work with a destination directory that is mostly written, but that part has not been worked out or tested.
    It does a good job on large files, though.
    I worked on using the CreateFile API to be able to remove buffering, but had to back off from it for now because I could not get it all right.
    This program uses a smaller buffer in an effort to balance memory concerns, but it will handle almost anything you can throw at it.
    For lack of something better to call the smaller files, I add an extension of "-xxxxxx.pie" to the original file name.
    An ERRORLEVEL of 1 or higher is returned on a failure; that path has not really been tested because nothing has broken in my testing.
    ERRORLEVEL 0 is success.
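
The piece naming described above can be illustrated with a small sketch. It is written in Python purely as an illustration and is not part of filesplt.bas; the function name `piece_name` is hypothetical. The six-digit, one-based counter follows the `RIGHT$("000000"+...,6)` expression in the listing below.

```python
def piece_name(original: str, index: int) -> str:
    """Return the name of piece number `index` (1-based) of `original`.

    Mirrors the "-xxxxxx.pie" convention: a dash, a zero-padded
    six-digit counter, and the .pie extension appended to the full
    original file name (its own extension is kept).
    """
    if not 1 <= index <= 999_999:
        # Assumption: refuse counters that no longer fit in six digits
        # instead of silently truncating them.
        raise ValueError("the six-digit counter only covers 1..999999")
    return f"{original}-{index:06d}.pie"


print(piece_name("largefile.zip", 1))  # largefile.zip-000001.pie
```

Because the counter is zero-padded, a plain name sort of the pieces is also the correct order for joining them.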

    Code:
    'filesplt.bas
    
    'compiled with pbcc 4.04
    'non unicode but that should not matter
    ';
    #COMPILE EXE "filesplt.exe"
    #DIM ALL
    #BREAK ON
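    #INCLUDE "win32api.inc"
    
    'forward declaration, because PBMAIN calls splitfilebysize before it is defined
    'older compilers need this; it is harmless where it is not required
    DECLARE FUNCTION splitfilebysize(BYVAL sfilesource AS STRING, BYVAL sdestination AS STRING, BYVAL isplitfilesize AS QUAD) AS LONG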
    
    
    FUNCTION getdestination(BYREF sdestination AS STRING) AS LONG
        sdestination=TRIM$(COMMAND$(3))
        IF LEFT$(sdestination,1&)="." THEN
            sdestination=""
            EXIT FUNCTION
        END IF
    
        WHILE INSTR(sdestination,"\\")
            REPLACE "\\" WITH "\" IN sdestination
            sdestination=TRIM$(sdestination)
        WEND
        IF sdestination="\" THEN
            sdestination=LEFT$(CURDIR$,3&)
            EXIT FUNCTION
        END IF
        WHILE RIGHT$(sdestination,1&)="\"
            IF sdestination="\" THEN EXIT FUNCTION
            IF MID$(sdestination,2&,1&)=":" THEN
                IF LEN(sdestination)=3& THEN EXIT FUNCTION
                sdestination=TRIM$(LEFT$(sdestination,LEN(sdestination)-1&))
                ITERATE
            END IF
            IF RIGHT$(sdestination,1&)="\" AND LEN(sdestination)>3& THEN
                sdestination=TRIM$(LEFT$(sdestination,LEN(sdestination)-1&))
                ITERATE
            END IF
        WEND
        WHILE INSTR(sdestination,"\\")
            REPLACE "\\" WITH "\" IN sdestination
            sdestination=TRIM$(sdestination)
        WEND
        WHILE LEFT$(sdestination,1&)="\"  AND LEN(sdestination)>1& AND RIGHT$(sdestination,1&)="\"
            sdestination=TRIM$(LEFT$(sdestination,LEN(sdestination)-1&))
            WHILE INSTR(sdestination,"\\")
                REPLACE "\\" WITH "\" IN sdestination
                sdestination=TRIM$(sdestination)
            WEND
        WEND
        WHILE INSTR(sdestination,"\\")
            REPLACE "\\" WITH "\" IN sdestination
            sdestination=TRIM$(sdestination)
        WEND
        sdestination=TRIM$(sdestination)
        IF sdestination="\" THEN
            sdestination=LEFT$(CURDIR$,3&)
            EXIT FUNCTION
        END IF
    
    END FUNCTION
    
    REM check if destination directory exist
    FUNCTION checkdestinationexist(BYREF sdestination AS STRING) AS LONG
        LOCAL scurrentdir AS STRING
        LOCAL scurrentdrive AS STRING
        LOCAL sdestinationdir AS STRING
        LOCAL sdestinationdrive AS STRING
        LOCAL iquaddrivesize AS QUAD
    
        FUNCTION=1&
        sdestination=TRIM$(sdestination)
        sdestinationdir=sdestination
        IF LEN(sdestination)=0& THEN FUNCTION=0&:EXIT FUNCTION
        scurrentdir=TRIM$(CURDIR$)
        scurrentdrive=UCASE$(LEFT$(scurrentdir,1&))
        IF MID$(sdestination,2&,1&)=":" THEN
            'take the drive letter from the destination path (the variable was read before being set)
            sdestinationdrive=UCASE$(MID$(sdestination,1&,1&))
            IF INSTR("ABCDEFGHIJKLMNOPQRSTUVWXYZ",sdestinationdrive)=0& THEN
                FUNCTION=0&
                EXIT FUNCTION
            END IF
            IF  sdestinationdrive<>scurrentdrive THEN
                iquaddrivesize=DISKSIZE(sdestinationdrive)
                IF iquaddrivesize<4096& THEN FUNCTION=0&:EXIT FUNCTION
                iquaddrivesize=DISKFREE(sdestinationdrive)
                IF iquaddrivesize<4096& THEN FUNCTION=0&:EXIT FUNCTION
                TRY
                    CHDRIVE sdestinationdrive
                    'the drive is reachable, switch back so relative paths still resolve
                    CHDRIVE scurrentdrive
                CATCH
                    CHDRIVE scurrentdrive
                    FUNCTION=0&
                    EXIT FUNCTION
                END TRY
            END IF
       END IF
    END FUNCTION
    
    
    
    
    FUNCTION PBMAIN () AS LONG
        LOCAL iresult AS LONG
        LOCAL isplitsize AS QUAD
        LOCAL isplitcount AS QUAD
        LOCAL sfilename AS STRING
        LOCAL sdestination AS STRING
        LOCAL scommandlineitem AS STRING
        LOCAL stemp AS STRING
    
        sdestination=TRIM$(COMMAND$(3))
    
        scommandlineitem=TRIM$(UCASE$(COMMAND$(1)))
        IF LEN(scommandlineitem)<2& THEN
            FUNCTION=1
            GOTO displayhelp
        END IF
        IF INSTR("NM",LEFT$(scommandlineitem,1&))=0& THEN
            FUNCTION=1
            GOTO displayhelp
        END IF
        IF INSTR(scommandlineitem," ") THEN
            FUNCTION=1
            GOTO displayhelp
        END IF
        stemp=TRIM$(RIGHT$(scommandlineitem,LEN(scommandlineitem)-1))
        IF LEN(EXTRACT$(stemp, ANY "0123456789")) THEN
            FUNCTION=1
            GOTO displayhelp
        END IF
        IF LEFT$(scommandlineitem,1)="N" THEN
            isplitcount=VAL(STEMP)
            IF isplitcount<1&& THEN
                FUNCTION=1
                GOTO displayhelp
            END IF
        END IF
        IF LEFT$(scommandlineitem,1)="M" THEN
            isplitsize=VAL(STEMP)
            IF isplitsize<1&& THEN
                FUNCTION=1
                GOTO displayhelp
            END IF
        END IF
    
        IF isplitsize THEN
            IF isplitsize>65536&& THEN isplitsize=65536&&
        END IF
    
        IF isplitsize<1&& AND isplitcount<1&& THEN
            FUNCTION=1
            GOTO displayhelp
        END IF
    
        sfilename=TRIM$(COMMAND$(2))
        IF LEN(sfilename)=0& THEN
            STDOUT "no file name given"
            FUNCTION=1
            GOTO displayhelp
        END IF
    
        IF NOT ISFILE(sfilename) THEN
            STDOUT sfilename+" does not exist"
            FUNCTION=1
            GOTO displayhelp
        END IF
    
        stemp=TRIM$(COMMAND$(3))
        IF LEN(stemp) THEN
            getdestination(sdestination)
            IF LEN(sdestination)=0& THEN
                FUNCTION=1
                GOTO displayhelp
            END IF
            IF LEFT$(sdestination,1&)="." THEN
                STDOUT "Destination cannot have a dot in the beginning"
                FUNCTION=1
                GOTO displayhelp
            END IF
        END IF
        IF LEN(sdestination)=0& THEN sdestination=CURDIR$
        IF LEN(sdestination) THEN
            IF checkdestinationexist(sdestination)=0& THEN
                FUNCTION=1
                GOTO displayhelp
            END IF
        END IF
        IF LEN(sdestination)=0& THEN sdestination=CURDIR$
        IF isplitsize+isplitcount=0&& THEN
            FUNCTION=1
            GOTO displayhelp
        END IF
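        IF isplitsize THEN
            iresult=splitfilebysize(sfilename,sdestination,isplitsize)
        ELSE
            'splitting by count (the N option) is parsed but not implemented yet,
            'so report a failure instead of returning success without doing anything
            STDOUT "splitting by number of files (N) is not implemented yet"
            iresult=1&
        END IF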
        FUNCTION=iresult
        EXIT FUNCTION
        displayhelp:
        STDOUT "This program splits a file into separate smaller pieces by size"
        STDOUT "  in increments of 1MB. This program will not split a file"
        STDOUT "  with a size of zero; the input file has to be larger than zero bytes."
        STDOUT "To set the largest piece size, use an M character immediately followed by the size in MB."
        STDOUT "There is a limit on the size of not more than 65536MB, 68,719,476,736 bytes."
        STDOUT "A larger figure is reduced to that limit, and the lowest is 1MB, 1,048,576 bytes."
        STDOUT "The files are created with names of the form originalfilename-xxxxxx.pie"
        STDOUT "The first file created will be originalfilename-000001.pie"
        STDOUT "The case of the file names will be the same as given on the command line."
        STDOUT "If pie files already exist, each one is deleted just before"
        STDOUT "  its replacement is created."
        STDOUT "It is prudent to delete pie files first, before running this program,"
        STDOUT "  and also to make backups in case of a failure of some sort."
        STDOUT "An errorlevel 0 returned means success, otherwise a positive value for failure."
        STDOUT "usage:  filesplt M12 largefile.zip"
        STDOUT "    will break up the file largefile.zip into files no larger than 12 MB"
        STDOUT "This program does not put files back together. You have to use another utility"
        STDOUT " or the DOS command COPY /B, which packs files together. See COPY for use."
    
    
    
    END FUNCTION
    
    FUNCTION splitfilebysize(BYVAL sfilesource AS STRING, BYVAL sdestination AS STRING, BYVAL isplitfilesize AS QUAD) AS LONG
        LOCAL ifiletempsize AS QUAD
        LOCAL ifilechunksize AS LONG
        LOCAL ifilesizecountdown AS QUAD
        LOCAL ifilecount AS LONG
        LOCAL ifilereadnofile AS LONG
        LOCAL ifilewritenofile AS LONG
        LOCAL ifilesourcesize AS QUAD
        LOCAL idrivespace AS QUAD
        LOCAL scurrentdrive AS STRING
        LOCAL scurrentdir AS STRING
        LOCAL sdestinationdrive AS STRING
        LOCAL sdestinationdir AS STRING
        LOCAL sfilechunkdata AS STRING
        LOCAL soutputfilename AS STRING
        LOCAL idatabuffermark AS LONG
        LOCAL isplitnooffiles AS LONG
        LOCAL ssplitnooffiles AS STRING
        LOCAL ioutputfileopenflag AS LONG
        LOCAL sdatenow AS STRING
        LOCAL stimenow AS STRING
    
    
    
        FUNCTION=1&
        IF NOT ISFILE(sfilesource) THEN EXIT FUNCTION
        scurrentdir=CURDIR$
        scurrentdrive=UCASE$(LEFT$(scurrentdir,1&))
        sdestinationdrive=scurrentdrive
        sdestinationdir=scurrentdir
    
        ifilereadnofile=FREEFILE
    
        TRY
            'the source is only read, so do not ask for write access (that fails on read-only files)
            OPEN sfilesource FOR BINARY ACCESS READ LOCK SHARED AS ifilereadnofile
        CATCH
            CLOSE ifilereadnofile
            EXIT FUNCTION
        END TRY
    
        ifilesourcesize=LOF(ifilereadnofile)
        IF ifilesourcesize<1&& THEN
            CLOSE ifilereadnofile
            EXIT FUNCTION
        END IF
    
        IF MID$(sdestination,2&,1&)=":" THEN
            sdestinationdrive=UCASE$(LEFT$(sdestination,1&))
        END IF
        IF LEN(sdestination) THEN sdestinationdir=sdestination
        idrivespace=DISKFREE(sdestinationdrive)
        IF idrivespace < ifilesourcesize+(4096*4) THEN
            CLOSE ifilereadnofile
            EXIT FUNCTION
        END IF
        isplitfilesize=isplitfilesize * 1048576&&
        IF isplitfilesize > ifilesourcesize THEN isplitfilesize = ifilesourcesize
        'size the read buffer in a QUAD first so a large split size cannot overflow the LONG
        ifiletempsize=isplitfilesize\4&&
        ifiletempsize=(ifiletempsize\65536&&) * 65536&&
        IF ifiletempsize < 4194304&&  THEN ifiletempsize = 4194304&&
        IF ifiletempsize > 33554432&& THEN ifiletempsize = 33554432&&
        IF ifiletempsize > ifilesourcesize THEN ifiletempsize = ifilesourcesize
        IF ifiletempsize > isplitfilesize THEN ifiletempsize = isplitfilesize
        ifilechunksize=ifiletempsize
        'round up, so a source that is an exact multiple of the split size is not counted as one extra file
        isplitnooffiles=(ifilesourcesize+isplitfilesize-1&&)\isplitfilesize
        ssplitnooffiles=TRIM$(STR$(isplitnooffiles))
        ssplitnooffiles=RIGHT$(REPEAT$(LEN(ssplitnooffiles),"0")+ssplitnooffiles,LEN(ssplitnooffiles))
        sfilechunkdata=SPACE$(ifilechunksize)
        STDOUT "creating "+ssplitnooffiles+" files from "+ sfilesource +"  [filesize="+TRIM$(STR$(ifilesourcesize))+"]"
        ifilecount=1&
        soutputfilename=sfilesource+"-"+RIGHT$("000000"+TRIM$(STR$(ifilecount)),6)+".pie"
        STDOUT "making-- "+ssplitnooffiles+" files on"+STR$(ifilecount)
        'get a file number for the output; the same number is reused for every later piece
        ifilewritenofile=FREEFILE
        TRY
            IF ISFILE(soutputfilename) THEN KILL soutputfilename
            OPEN soutputfilename FOR BINARY ACCESS READ WRITE LOCK WRITE  AS ifilewritenofile  LEN=ifilechunksize
            ioutputfileopenflag=1&
        CATCH
            CLOSE
            EXIT FUNCTION
        END TRY
        ifilesizecountdown=ifilesourcesize
        ifiletempsize=0&
          WHILE  ifilesizecountdown>0&
            IF ifilesizecountdown < ifilechunksize THEN
                ifilechunksize=ifilesizecountdown
                sfilechunkdata=""
                sfilechunkdata=SPACE$(ifilechunksize)
            END IF
            IF (ifiletempsize+ifilechunksize)<=isplitfilesize THEN
                TRY
                    GET$ ifilereadnofile, ifilechunksize, sfilechunkdata
                    PUT$ ifilewritenofile,sfilechunkdata
                CATCH
                    CLOSE
                    EXIT FUNCTION
                END TRY
                ifilesizecountdown-=ifilechunksize
                ifiletempsize+=ifilechunksize
                ITERATE
            END IF
    
            TRY
                GET$ ifilereadnofile, ifilechunksize, sfilechunkdata
                idatabuffermark=isplitfilesize-ifiletempsize
                PUT$ ifilewritenofile,MID$(sfilechunkdata,1&,idatabuffermark)
                CLOSE ifilewritenofile
                stimenow=TIME$
                sdatenow=DATE$
                ioutputfileopenflag=0&
                ifiletempsize+=idatabuffermark
                STDOUT "created- "+soutputfilename+"  [filesize="+TRIM$(STR$(ifiletempsize))+"] [date="+_
                    MID$(sdatenow,7&,4&)+MID$(sdatenow,1&,2&)+MID$(sdatenow,4&,2&)+" "+_
                    MID$(stimenow,1&,2&)+MID$(stimenow,4&,2&)+MID$(stimenow,7&,2&)+"]"
            CATCH
                CLOSE
                EXIT FUNCTION
            END TRY
            ifilesizecountdown-=idatabuffermark
            IF ifilesizecountdown<1&& THEN ITERATE
            ifiletempsize=0&
            INCR ifilecount
            soutputfilename=sfilesource+"-"+RIGHT$("000000"+TRIM$(STR$(ifilecount)),6)+".pie"
            STDOUT "making-- "+ssplitnooffiles+" files on"+STR$(ifilecount)
            'ifilewritenofile=FREEFILE
            TRY
                IF ISFILE(soutputfilename) THEN KILL soutputfilename
                OPEN soutputfilename FOR BINARY ACCESS READ WRITE LOCK WRITE AS ifilewritenofile    LEN=ifilechunksize
                PUT$ ifilewritenofile,MID$(sfilechunkdata,idatabuffermark+1&,(ifilechunksize-idatabuffermark))
                ioutputfileopenflag=1&
            CATCH
                CLOSE
                EXIT FUNCTION
            END TRY
            ifiletempsize=ifilechunksize-idatabuffermark
            ifilesizecountdown-=ifilechunksize-idatabuffermark
        WEND
        CLOSE
    
        stimenow=TIME$
        sdatenow=DATE$
        IF ioutputfileopenflag THEN
                STDOUT "created- "+soutputfilename+"  [filesize="+TRIM$(STR$(ifiletempsize))+"] [date="+_
                    MID$(sdatenow,7&,4&)+MID$(sdatenow,1&,2&)+MID$(sdatenow,4&,2&)+" "+_
                    MID$(stimenow,1&,2&)+MID$(stimenow,4&,2&)+MID$(stimenow,7&,2&)+"]"
        END IF
        STDOUT "created- "+ssplitnooffiles+" files   from "+sfilesource+"  [filesize="+TRIM$(STR$(ifilesourcesize))+"]"
        FUNCTION=0&
    END FUNCTION
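
For readers who do not use PowerBASIC, the split-by-size idea in the listing can be sketched in Python. This is a simplified illustration under stated assumptions, not a translation: the name `split_by_size` and its parameters are hypothetical, the 4 MB default buffer is only borrowed from the listing's lower chunk bound, and each read is capped at the space left in the current piece, so a buffer never has to be cut in two the way the listing does.

```python
import os

MB = 1024 * 1024


def split_by_size(source, piece_mb, chunk_size=4 * MB):
    """Split `source` into pieces of at most `piece_mb` megabytes.

    At most `chunk_size` bytes are held in memory at a time.
    Pieces are written next to the source as source-000001.pie,
    source-000002.pie, and so on. Returns the piece names in order.
    """
    if piece_mb < 1:
        raise ValueError("piece size must be at least 1 MB")
    if os.path.getsize(source) == 0:
        raise ValueError("a zero-length file cannot be split")
    piece_limit = piece_mb * MB
    names = []
    with open(source, "rb") as src:
        while True:
            # Read the start of the next piece; an empty read means done.
            data = src.read(min(chunk_size, piece_limit))
            if not data:
                break
            name = f"{source}-{len(names) + 1:06d}.pie"
            # "wb" replaces any existing piece of the same name.
            with open(name, "wb") as out:
                written = 0
                while data:
                    out.write(data)
                    written += len(data)
                    if written >= piece_limit:
                        break
                    data = src.read(min(chunk_size, piece_limit - written))
            names.append(name)
    return names
```

Calling `split_by_size("largefile.zip", 12)` would correspond to `filesplt M12 largefile.zip`. Capping each read at the remaining space costs a few short reads near piece boundaries, but it removes the bookkeeping for a buffer that straddles two output files.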
    p purvis

  • #2
    Here is an example of a test where I was splitting a large file into smaller, but still large, files.
    The file WKST51-c.VHD was split with the command "filesplt M8048 WKST51-c.VHD"
    and the pie files were joined back together with the command "copy /b *.pie 1.VHD".
    Both 1.VHD and WKST51-c.VHD were compared with the DOS COMP command and came out equal.
    I did many tests on other file sizes too and had no issues at all.
    The split was fast enough. I am sure it could be better, but for me the transferring is what is going to take the time.
    Heck no, I am not transferring files of these sizes.
    Many of the tests were on small files and on files just over 500,000,000 bytes, using M1 to create over 500 files of 1 megabyte each.
    They all joined back into the original file with copy /b *.pie on a Windows NTFS drive. That will not happen on a Linux server,
    because the pieces will not be taken in order. I am sure there will be a program to put the pie files back together soon.
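
Until such a program exists, a joiner that does not depend on directory order could look like the sketch below. It is an assumption-laden illustration in Python (chosen so it can run on the Linux side), not the author's tool: `join_pieces` and its parameters are hypothetical names. It sorts the pieces by their six-digit counter and refuses to run if a number is missing.

```python
import re
import shutil
from pathlib import Path

# Matches names such as "WKST51-c.VHD-000003.pie".
PIECE_RE = re.compile(r"^(?P<base>.+)-(?P<num>\d{6})\.pie$")


def join_pieces(directory, base_name, output):
    """Concatenate base_name-NNNNNN.pie files from `directory` into `output`.

    Pieces are ordered by their numeric counter, not by the order in
    which the file system happens to list them. Returns the number of
    pieces joined.
    """
    pieces = []
    for path in Path(directory).iterdir():
        match = PIECE_RE.match(path.name)
        if match and match.group("base") == base_name:
            pieces.append((int(match.group("num")), path))
    pieces.sort()
    numbers = [number for number, _ in pieces]
    if not numbers:
        raise FileNotFoundError(f"no .pie files found for {base_name}")
    if numbers != list(range(1, len(numbers) + 1)):
        raise ValueError(f"piece numbers are not 1..n without gaps: {numbers}")
    with open(output, "wb") as out:
        for _, path in pieces:
            with open(path, "rb") as src:
                # Copy in 4 MB blocks so large pieces are never fully in memory.
                shutil.copyfileobj(src, out, 4 * 1024 * 1024)
    return len(pieces)
```

For the test above, `join_pieces(".", "WKST51-c.VHD", "1.VHD")` would play the role of `copy /b *.pie 1.VHD`. The output should be written outside the set of input names, as it is here, so it is never picked up as a piece.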

    Code:
    12/04/2018  12:18 PM    46,588,930,048 1.VHD
    12/04/2018  12:06 AM    46,588,930,048 WKST51-c.VHD
    12/04/2018  12:06 PM     8,438,939,648 WKST51-c.VHD-000001.pie
    12/04/2018  12:07 PM     8,438,939,648 WKST51-c.VHD-000002.pie
    12/04/2018  12:08 PM     8,438,939,648 WKST51-c.VHD-000003.pie
    12/04/2018  12:10 PM     8,438,939,648 WKST51-c.VHD-000004.pie
    12/04/2018  12:11 PM     8,438,939,648 WKST51-c.VHD-000005.pie
    12/04/2018  12:11 PM     4,394,231,808 WKST51-c.VHD-000006.pie
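
The sizes in this listing can be checked with a little arithmetic: M8048 means 8048 x 1,048,576 = 8,438,939,648 bytes per piece, and five full pieces plus a 4,394,231,808-byte remainder add up to the 46,588,930,048 bytes of the original. A short Python sketch of that check follows; the helper name `piece_sizes` is hypothetical.

```python
MB = 1024 * 1024


def piece_sizes(total_bytes, piece_mb):
    """Return the expected size of every piece, in order."""
    limit = piece_mb * MB
    full, rest = divmod(total_bytes, limit)
    # A remainder of zero means the last piece is a full one, not an extra file.
    return [limit] * full + ([rest] if rest else [])


sizes = piece_sizes(46_588_930_048, 8048)
print(len(sizes), sizes[0], sizes[-1])  # 6 8438939648 4394231808
```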
    p purvis



    • #3
      I just spent a fortune of time trying to do what seems to me to be impossible: removing Windows file caching by using the Windows API functions ReadFile and WriteFile.
      Using ReadFile alone might work, but writing files, large ones and even small ones, brings headaches with what seem to be some kind of block boundaries.
      I am not so sure I have the best chunk sizes in use, but from my tests the CPU usage looks good, and the small chunk sizes help with that too.
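
The block boundaries described here match what Windows documents for unbuffered I/O: when a file is opened through CreateFile with FILE_FLAG_NO_BUFFERING, every read or write size and every file offset has to be a whole multiple of the volume sector size, and the buffer address has to be sector-aligned as well. Whether that flag is what was tried here is an assumption; the post does not say. The arithmetic involved can be sketched as below, in Python for illustration only. The helper names are hypothetical, and the 4096-byte sector is a placeholder for the value a call such as GetDiskFreeSpace would report.

```python
def align_down(n, sector):
    """Largest multiple of `sector` that is <= n."""
    return (n // sector) * sector


def align_up(n, sector):
    """Smallest multiple of `sector` that is >= n."""
    return -(-n // sector) * sector


def plan_unbuffered_write(length, sector=4096):
    """Split a write of `length` bytes into (aligned_body, tail).

    `aligned_body` bytes can go through an unbuffered handle as they are.
    The `tail` (fewer than `sector` bytes) cannot, so it needs separate
    handling, for example a normal buffered write.
    """
    if sector <= 0 or sector & (sector - 1):
        raise ValueError("sector size must be a positive power of two")
    if length < 0:
        raise ValueError("length must not be negative")
    body = align_down(length, sector)
    return body, length - body


print(plan_unbuffered_write(10_000))  # (8192, 1808)
```

Only the last chunk of a piece can have a non-zero tail, as long as the chunk size itself is a multiple of the sector size. The listing's chunk sizes are multiples of 65536, which satisfies that for common sector sizes.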
      Even though you can get the split files down to a really small size in megabytes, this program was intended for breaking down large files into sizes that transfer well over the internet.
      Plus, you can transfer the pieces whenever the time seems best.
      The standard output text is laid out so that one can remove the clutter by keeping only the lines that have the word "created" in them; the other lines are just for viewing the progress.
      I usually do not send that much info to the screen, but I wanted the program to be somewhat useful right now.
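
As an illustration of that filtering, the sketch below keeps only the result lines. It is Python for illustration only; `created_lines` is a hypothetical helper, and the sample log is assembled from the STDOUT format strings in the listing with made-up file names, sizes, and dates rather than captured from a real run.

```python
def created_lines(log_text):
    """Keep only the 'created-' result lines from filesplt's standard output."""
    return [line for line in log_text.splitlines() if line.startswith("created- ")]


# Illustrative log only: the layout follows the listing's STDOUT statements,
# the values are invented for the example.
SAMPLE_LOG = """\
creating 2 files from demo.bin  [filesize=1500000]
making-- 2 files on 1
created- demo.bin-000001.pie  [filesize=1048576] [date=20181204 120600]
making-- 2 files on 2
created- demo.bin-000002.pie  [filesize=451424] [date=20181204 120601]
created- 2 files   from demo.bin  [filesize=1500000]
"""

for line in created_lines(SAMPLE_LOG):
    print(line)
```

On a Windows command line, piping the program's output through `find "created"` should give a similar result without any extra code.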
      p purvis
