Calculate large file size correctly

  • Calculate large file size correctly

    Code:
    'Is a quad value needed to do this calculation?
    ' MyString$ = FORMAT$(WFD.nFileSizeHigh * (%MAXDWORD + 1) + WFD.nFileSizeLow)
     
    #COMPILE EXE
    #INCLUDE "WIN32API.INC"
    DECLARE FUNCTION GetFileSizeAndAttributes (sFileName AS STRING) AS STRING
    FUNCTION PBMAIN AS LONG
        LOCAL sFileName AS STRING
        LOCAL sDefault  AS STRING
        LOCAL sResult   AS STRING
        sDefault = "c:\windows\winnt.bmp"
     
        DO
          sFileName = INPUTBOX$ ("File name?","Title",sDefault,0,0)
          IF LEN(sFileName) = 0 THEN EXIT DO
          sResult = GetFileSizeAndAttributes (sFileName)
          IF LEN(sResult) THEN MSGBOX sResult  ELSE  MSGBOX sFileName + " not found"
        LOOP
    END FUNCTION
    FUNCTION GetFileSizeAndAttributes (sFileName AS STRING) AS STRING
       LOCAL s     AS STRING
       LOCAL hfile AS DWORD
       LOCAL wfd   AS WIN32_FIND_DATA
       LOCAL a     AS DWORD
     
       s = sFileName
       IF RIGHT$(s,1) = "\" THEN s = LEFT$(s,LEN(s)-1)  'trim trailing backslash
       hfile = FindFirstFile(s + $NUL, WFD)             'search for file
       IF hfile = %INVALID_HANDLE_VALUE THEN            'file not found
          EXIT FUNCTION                                 'exit
       END IF
     
       FindClose(hfile)                                 'close handle
       a = wfd.dwFileAttributes                         'file attributes
       s =  FORMAT$(WFD.nFileSizeHigh * (%MAXDWORD + 1) + WFD.nFileSizeLow) + ","
       IF (a AND 32)  = 32  THEN  s = s + "A"           'Archive
       IF (a AND 16)  = 16  THEN  s = s + "D"           'Directory
       IF (a AND 8)   = 8   THEN  s = s + "V"           'Volume label
       IF (a AND 4)   = 4   THEN  s = s + "S"           'System
       IF (a AND 2)   = 2   THEN  s = s + "H"           'Hidden
       IF (a AND 1)   = 1   THEN  s = s + "R"           'Readonly
       FUNCTION = s
    END FUNCTION
     
    #IF 0
    'None of this is needed, just curious if the right calculation is used for large file sizes
    '
    ' [URL]http://www.powerbasic.com/support/pbforums/showthread.php?t=18703&highlight=maxdword[/URL]
    '
     ' What is the correct calculation determining the file size from a WIN32_FIND_DATA structure?
    ' I've read that many programs have calculated this incorrectly because of an error at a Microsoft site.
    ' [URL]http://www.freevbcode.com/ShowCode.asp?ID=6836[/URL]
    ' [URL]http://support.microsoft.com/default.aspx?scid=kb;EN-US;185476[/URL]
    ' Read program and the calculation is:   (WFD.nFileSizeHigh *  MAXDWORD) + WFD.nFileSizeLow
    '            where MAXDWORD is defined as &HFFFF  should be &HFFFFFFFF  like in win32api.inc
    '        quadvalue = WFD.nFileSizeHigh * (%MAXDWORD + 1) + WFD.nFileSizeLow
    'Correct calculating file size
    'http://www.freevbcode.com/ShowCode.asp?ID=6836
    %MAX_PATH = 260                                    ' max. length of full pathname
    %INVALID_HANDLE_VALUE = &HFFFFFFFF???
    %MAXDWORD = &HFFFFFFFF???                          'correct in win32api.inc
    DECLARE FUNCTION FindFirstFile LIB "KERNEL32.DLL" ALIAS "FindFirstFileA" (lpFileName AS ASCIIZ, lpFindFileData AS WIN32_FIND_DATA) AS DWORD
    DECLARE FUNCTION FindClose LIB "KERNEL32.DLL" ALIAS "FindClose" (BYVAL hFindFile AS DWORD) AS LONG
    TYPE FILETIME
      dwLowDateTime AS DWORD
      dwHighDateTime AS DWORD
    END TYPE
    TYPE WIN32_FIND_DATA
      dwFileAttributes AS DWORD
      ftCreationTime AS FILETIME
      ftLastAccessTime AS FILETIME
      ftLastWriteTime AS FILETIME
      nFileSizeHigh AS DWORD
      nFileSizeLow AS DWORD
      dwReserved0 AS DWORD
      dwReserved1 AS DWORD
      cFileName AS ASCIIZ * %MAX_PATH
      cAlternateFileName AS ASCIIZ * 14
    END TYPE
    #ENDIF
    'MAK
    'Purpose: Create an integer class value of a specified data type.
    'Syntax:  resultvar = MAK(datatype, loworderval, highorderval)
    'Remarks: Create an integer class value of a specified data type (WORD, DWORD, PTR, INTEGER, LONG, QUAD) from a low-order and a high-order part.
    Mike Doty
    Member
    Last edited by Mike Doty; 10 Sep 2007, 03:21 PM.

  • #2
    'Is a quad value needed to do this calculation?
    ' MyString$ = FORMAT$(WFD.nFileSizeHigh * (%MAXDWORD + 1) + WFD.nFileSizeLow)
    Well I could just ask what happened when you tried, but my money is on (%MAXDWORD+1) overflowing since it is not cast to a quad; therefore that part of equation is invalid ("undefined"), so the entire equation results in an undefined value. (Probably not what you were looking for).

    So, yes I would use a quad:
    Code:
      REDIM q(0) AS QUAD AT VARPTR (WFD.nFileSizeHigh)
      MyString$ = FORMAT$(q(0))
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    [email protected]
    http://www.talsystems.com



    • #3
      Calculate large file size correctly

      Mike,

      No doubt the valid formula is WFD.nFileSizeHigh * (%MAXDWORD + 1) + WFD.nFileSizeLow,
      because with WFD.nFileSizeHigh * (%MAXDWORD) + WFD.nFileSizeLow
      the same value could be represented in two forms. Consider the following:
      &h1 * (&hFFFFFFFF) + 0 would be the same as &h0 * (&hFFFFFFFF) + &hFFFFFFFF
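
To make the collision concrete, here is a small sketch in Python, used only as a neutral calculator. The function names `size_bad` and `size_good` are made up for this illustration; only the MAXDWORD value (&HFFFFFFFF) comes from the thread.

```python
MAXDWORD = 0xFFFFFFFF  # same value as %MAXDWORD in WIN32API.INC


def size_bad(high, low):
    """Formula that multiplies the high DWORD by MAXDWORD (off by one)."""
    return high * MAXDWORD + low


def size_good(high, low):
    """Formula that multiplies the high DWORD by MAXDWORD + 1, i.e. 2^32."""
    return high * (MAXDWORD + 1) + low


# Two different (high, low) pairs give the same result with the bad formula...
print(size_bad(1, 0), size_bad(0, MAXDWORD))
# ...but remain distinct with the good one.
print(size_good(1, 0), size_good(0, MAXDWORD))
```

With the bad formula, every increment of the high DWORD also loses one byte, so the error grows with file size.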

      Note that the bad formula appeared in the old Win32 API help file.

      The file size value corresponds to a qWord, or unsigned Quad,
      which can go as large as 18,446,744,073,709,551,615, or 18 exabytes.
      There is no unsigned Quad in PB, but a signed Quad
      is big enough to hold up to 9,223,372,036,854,775,807 bytes,
      which is 9 exabytes or 9,000 million gigabytes.

      With FORMAT$, the argument is converted to extended precision and
      the maximum number of significant digits is 18, so your formula can go
      up to more than 936 petabytes and still be precise to the byte.
      Bigger numbers will be expressed in scientific notation, like 1.0E2,
      or will have their least significant digits set to zero if FORMAT$ is used with a mask like "0,000".

      So the formula is valid until you need something bigger
      than 936 petabytes, or 936,000,000,000,000,000 bytes.

      megabytes = 1,000,000
      gigabytes = 1,000,000,000
      terabytes = 1,000,000,000,000
      petabytes = 1,000,000,000,000,000
      exa ......= 1,000,000,000,000,000,000
      zetta ....= 1,000,000,000,000,000,000,000
      yotta ....= 1,000,000,000,000,000,000,000,000

      Do not use Michael Mattias' suggestion...
      DIM q(0) AS QUAD AT VARPTR(WFD.nFileSizeHigh)
      This is wrong because the high DWORD and low DWORD
      are not in the same order as in a QUAD.
      His overflow statement is also wrong, because
      FORMAT$ converts the expression to Extended, not DWORD.
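
The ordering problem can be sketched in Python with the `struct` module. This assumes the usual little-endian x86 layout and the field order shown in the WIN32_FIND_DATA TYPE in post #1 (nFileSizeHigh first, then nFileSizeLow); the variable names are invented for the example.

```python
import struct

high, low = 1, 0  # a file of exactly 4,294,967,296 bytes

# The structure stores nFileSizeHigh first, then nFileSizeLow,
# each as a little-endian 32-bit value.
raw = struct.pack("<II", high, low)

# Reading the same 8 bytes as one little-endian 64-bit integer treats the
# FIRST DWORD as the low half, so the two halves end up swapped.
overlaid = struct.unpack("<Q", raw)[0]

# Combining the halves explicitly puts them in the right places.
combined = (high << 32) | low

print(overlaid)
print(combined)
```

The overlay reports 1 byte for this file, while the explicit combination gives 4,294,967,296.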

      An easy way to do the calculation is something like...
      qFileSize = WFD.nFileSizeLow + (WFD.nFileSizeHigh * &H100000000)
      or
      qFileSize = MAK(QUAD, WFD.nFileSizeLow , WFD.nFileSizeHigh)

      Pierre
      Pierre Bellisle
      Member
      Last edited by Pierre Bellisle; 10 Sep 2007, 02:56 PM.



      • #4
        ty very much!



        • #5
          Do not use Michael Mattias' suggestion...
          DIM q(0) AS QUAD AT VARPTR(WFD.nFileSizeHigh),
          this is wrong because high DWORD and low DWORD
          are not in same order as a QUAD
          You're right, good catch! My bad.

          I like the MAK() (new in 8x) method better, too.
          Michael Mattias



          • #6
            Not correct result using MAK. What is wrong here?

            Code:
            #COMPILE EXE
            FUNCTION PBMAIN () AS LONG
               'DWORD range 0 to 4,294,967,295
               f$ = "#,"
             
               LOCAL dwlow     AS DWORD
               LOCAL dwHigh    AS DWORD
             
               LOCAL qTest     AS QUAD
               LOCAL qFileSize AS QUAD
             
               dwLow??? = 4294967295
               dwHigh??? =4294967295
             
               qTest&& = dwLow??? + dwHigh???                       'just adding seems correct
               ? FORMAT$(qTest&&,f$)                                'returns 8,589,934,590
             
               qFileSize&& = MAK(QUAD, dwLow??? , dwHigh???)        'error
               ? FORMAT$(qFileSize&&,f$)                            'returns -1
               WAITKEY$
            END FUNCTION
            Purpose Create an integer class value of a specified data type.
            Syntax resultvar = MAK(datatype, loworderval, highorderval)
            Remarks Create an integer class value of a specified data type (WORD, DWORD, PTR, INTEGER, LONG, QUAD) from a low-order and a high-order part.
            Mike Doty
            Member
            Last edited by Mike Doty; 10 Sep 2007, 03:18 PM.



            • #7
              qFileSize = MAK(QUAD, WFD.nFileSizeLow , WFD.nFileSizeHigh)

              WFD.nFileSizeHigh isn't added to WFD.nFileSizeLow to obtain the value.
              So is WFD.nFileSizeHigh a multiplier used each time nFileSizeLow reaches the maximum value of a DWORD?



              • #8
                So is WFD.nFileSizeHigh a multiplier used each time nFileSizeLow reaches the maximum value of a DWORD?
                Short answer - yes...

                Take a look at this short snippet that illustrates how a LONG is calculated based on its 4 bytes:

                Code:
                    ? FORMAT$(CVL(CHR$(1, 2, 3, 4)))
                
                    lTest=(4*(256^3))+(3*(256^2))+(2*(256))+1
                    MSGBOX FORMAT$(lTest)
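
The same check can be sketched in Python, assuming CVL reads the four bytes as a little-endian signed 32-bit value (which is what the hand calculation above implies). The variable names are invented for the example.

```python
import struct

# Interpret the bytes 1, 2, 3, 4 as a little-endian signed 32-bit integer.
from_bytes = struct.unpack("<l", bytes([1, 2, 3, 4]))[0]

# Same value built by hand: each byte is a "digit" in base 256.
by_hand = 4 * 256**3 + 3 * 256**2 + 2 * 256 + 1

print(from_bytes, by_hand)
```

The file size works the same way one level up: nFileSizeLow and nFileSizeHigh are two "digits" in base 2^32.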
                Adam J. Drake
                Administrator
                Last edited by Adam J. Drake; 10 Sep 2007, 04:48 PM.
                Adam Drake
                PowerBASIC



                • #9
                  Long answer

                  Mike,

                  The easier solution would be for PB to provide us
                  with a new unsigned data type like qWord.

                  In the meantime, the -1 result is valid. Remember that QUADs are signed:
                  they range from about -9.22*10^18 to +9.22*10^18.

                  By using a dwHigh bigger than &h7FFFFFFF
                  you are flipping the sign bit and therefore
                  falling into the negative side of the Quad.

                  This is why a file bigger than 9,223,372,036,854,775,807 bytes
                  won't be processed correctly.
                  If you want to get a file length bigger than
                  9 exabytes then more complex code will be needed.
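
A rough Python sketch of the sign-bit effect follows. Python integers are unbounded, so the signed 64-bit interpretation is emulated by hand; `combine` and `as_signed_quad` are hypothetical helper names, not PowerBASIC functions.

```python
def combine(high, low):
    """Join two 32-bit halves into one unsigned 64-bit value."""
    return (high << 32) | low


def as_signed_quad(value):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit value."""
    value &= 0xFFFFFFFFFFFFFFFF
    return value - (1 << 64) if value >= (1 << 63) else value


# Both halves at maximum: all 64 bits set, which reads as -1 when signed.
print(as_signed_quad(combine(0xFFFFFFFF, 0xFFFFFFFF)))
# Largest high DWORD that keeps the sign bit clear.
print(as_signed_quad(combine(0x7FFFFFFF, 0xFFFFFFFF)))
# One step further sets the sign bit and the value turns negative.
print(as_signed_quad(combine(0x80000000, 0)))
```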

                  For the record...

                  -If filesize = 4,294,967,295 (&hFFFFFFFF) then
                  --WFD.nFileSizeHigh = 0
                  --WFD.nFileSizeLow = &hFFFFFFFF

                  -If filesize = 4,294,967,296 then
                  --WFD.nFileSizeHigh = 1
                  --WFD.nFileSizeLow = 0

                  -If filesize = 4,294,967,297 then
                  --WFD.nFileSizeHigh = 1
                  --WFD.nFileSizeLow = 1

                  -If filesize = (5 times 4,294,967,296) plus 1 then
                  --WFD.nFileSizeHigh = 5
                  --WFD.nFileSizeLow = 1
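
The split of a byte count into its two DWORD halves can be checked with a short Python sketch. `split_size` is a hypothetical helper written for this example; it simply divides by 2^32.

```python
def split_size(size):
    """Return (high, low) halves of a byte count, like nFileSizeHigh/nFileSizeLow."""
    return divmod(size, 1 << 32)


for size in (4294967295, 4294967296, 4294967297, 5 * 4294967296 + 1):
    high, low = split_size(size)
    print(size, "-> high =", high, ", low =", hex(low))
```

The quotient is the high DWORD and the remainder is the low DWORD, which is the inverse of the MAK(QUAD, low, high) combination.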



                  • #10
                    Code:
                    FUNCTION FileSize(BYVAL f AS STRING) AS QUAD
                      LOCAL FindData AS WIN32_FIND_DATA
                      LOCAL hDir AS LONG
                      hDir = FindFirstFile(BYVAL STRPTR(f), FindData)
                      IF hDir = -1 THEN  'if not found return -1
                        FUNCTION = -1
                      ELSE               'return number of bytes
                        FindClose hDir
                       'Thank you, Pierre Bellisle
                        FUNCTION = MAK(QUAD, FindData.nFileSizeLow , FindData.nFileSizeHigh)
                      END IF
                    END FUNCTION



                    • #11
                      This is why a file bigger than 9,223,372,036,854,775,807 bytes
                      won't be [processed] correctly.
                      I hate when that happens.
                      Michael Mattias



                      • #12
                        Purpose: To handle files over 4 gigabytes.
                        It also processes open files (flushed to the drive.)
                        Code:
                        FUNCTION FileSize(BYVAL f AS STRING) AS QUAD
                          LOCAL FindData AS WIN32_FIND_DATA
                          LOCAL hDir AS LONG
                          hDir = FindFirstFile(BYVAL STRPTR(f), FindData)
                          IF hDir = -1 THEN  'if not found return -1
                            FUNCTION = -1
                          ELSE               'return number of bytes
                            FindClose hDir
                           'Thank you, Pierre Bellisle
                            FUNCTION = MAK(QUAD, FindData.nFileSizeLow , FindData.nFileSizeHigh)
                          END IF
                        END FUNCTION
                        Mike Doty
                        Member
                        Last edited by Mike Doty; 7 Apr 2009, 12:57 PM.



                        • #13
                          If you want to get file length bigger than
                          9 exabytes then more complex code will be needed.
                          That'll be one happy day when I have to write code for such a file. It'd take 9 million of my current biggest terabyte HDs to cram it in there. But with Moore's Law or more, I'm not ruling it out. I think for current supercomputers, petabyte storage is becoming the norm.



                          • #14
                            Originally posted by John Gleason
                            That'll be one happy day when I have to write code for such a file. It'd take 9 million of my current biggest terabyte HD's to cram it in there. But with Moore's Law or more, I'm not ruling it out. I think for current supercomputers, petabyte storage is becoming the norm
                            That just sparked an errant thought, John. Imagine what the world would be like today if Edison or DaVinci or the early Egyptians or ... had had today's computers at their disposal.

                            I mean, just take a minute and let your mind wander. (Well, not everybody. There are some among us for whom it might be dangerous. {grin})

                            ==============================
                            "Show me a sane man and
                            I will cure him for you."
                            Carl Gustav Jung (1875-1961)
                            ==============================
                            It's a pretty day. I hope you enjoy it.

                            Gösta

                            JWAM: (Quit Smoking): http://www.SwedesDock.com/smoking
                            LDN - A Miracle Drug: http://www.SwedesDock.com/LDN/



                            • #15
                              For easy testing

                              Code:
                              #COMPILE EXE
                              #DIM ALL
                              #INCLUDE "WIN32API.INC"
                               
                              FUNCTION FileSize(BYVAL sFileName AS STRING) AS QUAD
                                REM Usage:  Bytes = FileSize(sFileName)  works with files over 4 gigabytes
                               
                                LOCAL FindData AS WIN32_FIND_DATA
                                LOCAL hDir AS LONG
                                hDir = FindFirstFile(BYVAL STRPTR(sFileName), FindData)
                                IF hDir = -1 THEN  'if not found return -1
                                  FUNCTION = -1
                                ELSE               'return number of bytes
                                  FindClose hDir
                                  'Thank you, Pierre Bellisle
                                  FUNCTION = MAK(QUAD, FindData.nFileSizeLow , FindData.nFileSizeHigh)
                                END IF
                              END FUNCTION
                               
                              FUNCTION PBMAIN AS LONG
                                'Create and later kill a temp file to test
                                LOCAL sTempFile AS STRING
                                LOCAL Bytes     AS QUAD
                                LOCAL hFile     AS LONG
                                sTempFile = "\junk.tmp"    'temp file to create, it is deleted after testing
                                Bytes     = 9876543210 '*****    modify this line for file size in bytes *******
                                hFile = FREEFILE
                                OPEN sTempFile FOR BINARY AS #hFile
                                IF ERR THEN ? "Unable to open " + sTempFile + " error" + STR$(ERRCLEAR):EXIT FUNCTION
                                SEEK #hFile, Bytes +1
                                IF ERR THEN ? "Unable to seek to byte" + STR$(Bytes) + " error" + STR$(ERRCLEAR):CLOSE #hFile:EXIT FUNCTION
                                SETEOF #hFile
                                IF ERR THEN ? "Unable to SETEOF, error" + STR$(ERRCLEAR):CLOSE #hFile:EXIT FUNCTION
                                ? sTempFile + " contains " +  FORMAT$(FileSize(sTempFile),"#,") + " bytes"
                                CLOSE #hFile
                                KILL sTempFile
                                IF ERR THEN
                                  ? "Unable to kill " + sTempFile + " error" + STR$(ERRCLEAR)
                                ELSE
                                  ? "Killed " + sTempFile
                                END IF
                                SLEEP 3000  'for PB/CC users
                              END FUNCTION
