Announcement

Collapse

Forum Guidelines

This forum is for finished source code that is working properly. If you have questions about this or any other source code, please post it in one of the Discussion Forums, not here.
See more
See less

Reading Excel Files Part 2: Reading Cell Contents Without Automating Excel

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • PBWin Reading Excel Files Part 2: Reading Cell Contents Without Automating Excel

    Hey, it's been a long time since my first post Part 1. I've been so swamped with other things.

    I'm posting this as I learn. This example reads all worksheets in a workbook and spits the value of every cell out to a text file. It does not convert Excel dates and times into dates. It is recommended you look at the BIFF documentation linked on my first post.

    I hope you find this useful. It only seems to work on vanilla excel files for now...

    Code:
    ' ########################################################################################
    ' Reads the contents of an Excel BIFF file and outputs it to a text file.
    ' ########################################################################################
    #COMPILE EXE
    #DIM ALL
    #INCLUDE ONCE "propidl.inc"
    #INCLUDE ONCE "OLE2UTILS.inc"
    TYPE tIdentifier
        idNumber AS WORD
        szSize AS WORD
    END TYPE
    'Beginning of file for BIFF 5-7
    TYPE tBOF57
        vers AS WORD
        dt AS WORD
        rupBuild AS WORD
        rupYear AS WORD
    END TYPE
    'Beginning of file for BIFF 8
    TYPE tBOF8
        vers AS WORD
        dt AS WORD
        rupBuild AS WORD
        rupYear AS WORD
        bfh AS DWORD
        sfo AS DWORD
    END TYPE
    'This record stores the sheet name, sheet type, and stream position.
    TYPE tBoundSheet
        lbPlyPos AS DWORD 'Stream position of the start of the BOF record for the sheet
        grbit AS WORD     'Option flags
        'The following are made redundant with BIFF8 unicode strings.
        'They should be processed separately
        'cch as BYTE       'Length of the sheet name (in characters)
        'rgch as STRING    'Sheet name (grbit/rgb fields of Unicode String)
    END TYPE
    TYPE tWorksheetInfo
        lbPlyPos AS DWORD 'Stream position of the start of the BOF record for the sheet
        shtName AS ASCIIZ * 256 'Worksheet name
    END TYPE
    'A BiffString formatting run
    TYPE tRun
        charOffset AS WORD 'zero based starting character of the run
        fontIndex AS WORD  'zero based font record index
    END TYPE
    'SST: Shared String Table (FCh)
    TYPE tSharedStringTable
        cstTotal AS DWORD  'Total number of strings in the shared string table and extended string table (EXTSST record)
        cstUnique AS DWORD 'Number of unique strings in the shared string table
        'rgb               'Array of unique unicode strings.
    END TYPE
    TYPE tRow
        rw AS WORD         'Row number. (zero based)
        colMic AS WORD     'First defined column in the row. (zero based)
        colMac AS WORD     'Last defined column in the row, plus 1. (zero based)
        miyRw AS WORD      'Row height in units of 1/20th of a point. The miyRw field may have the 8000h (215) bit set,
                           'indicating that the row is standard height
        irwMac AS WORD     'Used by Excel to optimize loading the file; if you are creating a BIFF file, set irwMac to 0.
        reserved AS WORD   ' ...
        grbit AS WORD      'Option flags.
        ixfe AS WORD       'If fGhostDirty=1 (see grbit field), this is the index to the XF record for the row.
                           'Otherwise, this field is undefined. Note: ixfe uses only the low-order 12 bits of the field
                           '(bits 11–0). Bit 12 is fExAsc, bit 13 is fExDsc, and bits 14 and 15 are reserved. fExAsc and
                           'fExDsc are set to true if the row has a thick border on top or on bottom, respectively.
    END TYPE
    TYPE tRowOptions
        'This is an option from miyRw
        stdRowHt AS BYTE   'indicating that the row is standard height
        'These are options from grbit
        iOutLevel AS BYTE  'Outline level of the row
        fCollapsed AS BYTE '=1 if the row is collapsed in outlining
        fDyZero AS BYTE    '=1 if the row height is set to 0 (zero)
        fUnsynced AS BYTE  '=1 if the font height and row height are not compatible
        fGhostDirty AS BYTE'=1 if the row has been formatted, even if it contains all blank cells
        'These are options from ixfe
        fExAsc AS BYTE     'Thick row border on top
        fExDsc AS BYTE     'Thick row border on bottom
    END TYPE
    TYPE tRkNumber
        rw AS WORD         'Row number (zero based)
        col AS WORD        'Column number (zero based)
        ixfe AS WORD       'Index to the XF record that contains the cell format
        rk AS LONG         'RK Number
    END TYPE
    TYPE tRkRec
        ixfe AS WORD       'Index to the XF record that contains the cell format
        rk AS LONG         'RK Number
    END TYPE
    TYPE tMulRk
        rw AS WORD            'Row number (zero based)
        colFirst AS WORD      'First Column number (zero based)
        rgrkreg AS tRkRec PTR 'Array of 6-byte RKREC structures
        rkCount AS BYTE       'Number or rk elements (zero based)
        colLast AS WORD       'Last Column number (zero based)
    END TYPE
    TYPE tNumber
        rw AS WORD         'Row number (zero based)
        col AS WORD        'Column number (zero based)
        ixfe AS WORD       'Index to the XF record that contains the cell format
        num AS DOUBLE      'Number
    END TYPE
    TYPE tLabelSst
        rw AS WORD         'Row number (zero based)
        col AS WORD        'Column number (zero based)
        ixfe AS WORD       'Index to the XF record that contains the cell format
        isst AS DWORD      'Index into the SST record where actual string is stored
    END TYPE
    TYPE tFormat
        ifmt AS WORD       'Format index code (for internal use only)
        'What follows is a BiffString with a 2 byte cch length
    END TYPE
    TYPE tXfRecordBiff8
        ifnt         AS WORD
        ifmt         AS WORD
        fLocked      AS BIT * 1 IN WORD
        fHidden      AS BIT * 1
        fStyle       AS BIT * 1
        f123Prefix   AS BIT * 1
        ixfParent    AS BIT * 12
        alc          AS BIT * 3 IN WORD
        fWrap        AS BIT * 1
        alcV         AS BIT * 3
        fJustLast    AS BIT * 1
        trot         AS BIT * 8
        cIndent      AS BIT * 4 IN WORD
        fShrinkToFit AS BIT * 1
        fMergeCell   AS BIT * 1
        iReadOrder   AS BIT * 2
        reserved1    AS BIT * 2
        fAtrNum      AS BIT * 1
        fAtrFnt      AS BIT * 1
        fAtrAlc      AS BIT * 1
        fAtrBdr      AS BIT * 1
        fAtrPat      AS BIT * 1
        fAtrProt     AS BIT * 1
        dgLeft       AS BIT * 4 IN WORD
        dgRight      AS BIT * 4
        dgTop        AS BIT * 4
        dgBottom     AS BIT * 4
        icvLeft      AS BIT * 7 IN WORD
        icvRight     AS BIT * 7
        grbitDiag    AS BIT * 2
        icvTop       AS BIT * 7 IN DWORD
        icvBottom    AS BIT * 7
        icvDiag      AS BIT * 7
        dgDiag       AS BIT * 4
        fHasXFExt    AS BIT * 1
        fls          AS BIT * 6
        icvFore      AS BIT * 7 IN WORD
        icvBack      AS BIT * 7
        fSxButton    AS BIT * 1
        reserved2    AS BIT * 1
    END TYPE
    TYPE tXfRecordBiff7
        ifnt         AS WORD
        ifmt         AS WORD
        fLocked      AS BIT * 1 IN WORD
        fHidden      AS BIT * 1
        fStyle       AS BIT * 1
        f123Prefix   AS BIT * 1
        ixfParent    AS BIT * 12
        alc          AS BIT * 3 IN WORD
        fWrap        AS BIT * 1
        alcV         AS BIT * 3
        fJustLast    AS BIT * 1
        ori          AS BIT * 2
        fAtrNum      AS BIT * 1
        fAtrFnt      AS BIT * 1
        fAtrAlc      AS BIT * 1
        fAtrBdr      AS BIT * 1
        fAtrPat      AS BIT * 1
        fAtrProt     AS BIT * 1
        icvFore      AS BIT * 7 IN WORD
        icvBack      AS BIT * 6 'not a typo
        fSxButton    AS BIT * 1
        reserved1    AS BIT * 2
        fls          AS BIT * 6 IN WORD
        dgBottom     AS BIT * 3
        icvBottom    AS BIT * 7
        dgTop        AS BIT * 3 IN WORD
        dgLeft       AS BIT * 3
        dgRight      AS BIT * 3
        icvTop       AS BIT * 7
        icvLeft      AS BIT * 7 IN WORD
        icvRight     AS BIT * 7
        reserved2    AS BIT * 2
    END TYPE
    GLOBAL rkRec() AS tRkRec
    FUNCTION PBMAIN
        DIM hr AS LONG
        DIM pStorage AS IStorage
        DIM pStream AS IStream
        DIM wszName AS STRING
        'Movement variables
        DIM qNewPos AS QUAD
        DIM cbRead AS DWORD
        DIM cbSize AS QUAD
        DIM seekDistance AS QUAD
        'Excel file UDTs
        DIM Identifier AS tIdentifier
        DIM BoundSheet AS tBoundSheet
        DIM sst AS tSharedStringTable
        DIM rk AS tRkNumber
        DIM mulRk AS tMulRk
        DIM num AS tNumber
        DIM labelSst AS tLabelSst
        'Stored Excel File Info
        DIM BiffVersion AS BYTE
        DIM sstArray() AS STRING 'shared string table
        DIM numberFormats() AS STRING
        DIM curNumFmt AS LONG
        DIM xfArray8() AS tXfRecordBiff8
        DIM xfArray7() AS tXfRecordBiff7
        DIM curXf AS LONG
        DIM sheetNames() AS tWorksheetInfo
        DIM sheetNamesCount AS INTEGER
        DIM curSheet AS INTEGER
        DIM rw AS WORD, col AS WORD 'stored from formula records to grab the next String record
        'Immediate tmp variables
        DIM s AS STRING
        DIM i AS LONG
        'Timer variable
        DIM start AS DOUBLE
        start = TIMER
        wszName = UCODE$("Test.xls")
        hr = StgOpenStorage(STRPTR(wszName), NOTHING, _
            %STGM_DIRECT_SWMR OR %STGM_READWRITE OR %STGM_SHARE_EXCLUSIVE, _
            0, %NULL, pStorage)
        IF FAILED(hr) THEN
          MSGBOX "StgOpenStorage failure: " & HEX$(hr)
          EXIT FUNCTION
        END IF
        wszName = UCODE$("Workbook")
        hr = pStorage.OpenStream(STRPTR(wszName), 0, %STGM_READWRITE OR %STGM_SHARE_EXCLUSIVE, 0, pStream)
        IF FAILED(hr) THEN
            wszName = UCODE$("Book")
            hr = pStorage.OpenStream(STRPTR(wszName), 0, %STGM_READWRITE OR %STGM_SHARE_EXCLUSIVE, 0, pStream)
            IF FAILED(hr) THEN
                MSGBOX "IStorage.OpenStream failure: " & HEX$(hr)
                EXIT FUNCTION
            END IF
        END IF
        cbSize = IStream_GetSize(pStream)
        OPEN "out.txt" FOR OUTPUT ACCESS WRITE AS #1
        PopulateDefaultNumberFormats numberFormats()
        sheetNamesCount = -1
        curSheet = -1
        curNumFmt = &H31
        curXf = -1
        DO UNTIL qNewPos + 1 > cbSize
            'Echo the worksheet name if needed
            IF sheetNamesCount > -1 THEN
                IF qNewPos = sheetNames(curSheet + 1).lbPlyPos THEN
                    INCR curSheet
                    PRINT #1, "============================================================================================="
                    PRINT #1, "Worksheet: " &  sheetNames(curSheet).shtName
                    PRINT #1, "============================================================================================="
                END IF
            END IF
            'Read the current identifier
            hr = pStream.Read(VARPTR(Identifier), SIZEOF(Identifier), cbRead)
            hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
            IF LO(BYTE, Identifier.idNumber) = &H09 THEN                                    'BOF marker
                BiffVersion = HI(BYTE, Identifier.idNumber)
                'If needed you would read either tBOF57 or tBOF8 here depending on BiffVersion.
                'We are defaulting to the seek in the else statement for this example.
            END IF
            IF Identifier.idNumber = &H85 THEN                                              'BoundSheet inf
                hr = pStream.Read(VARPTR(BoundSheet), SIZEOF(BoundSheet), cbRead)
                hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
                s = ReadBiffString(1, BiffVersion, pStream)
                INCR sheetNamesCount
                REDIM PRESERVE sheetNames(sheetNamesCount) AS tWorksheetInfo
                sheetNames(sheetNamesCount).shtName = s
                sheetNames(sheetNamesCount).lbPlyPos = BoundSheet.lbPlyPos
            ELSEIF Identifier.idNumber = &HFC THEN                                          'Shared String Table
                hr = pStream.Read(VARPTR(sst), SIZEOF(sst), cbRead)
                REDIM sstArray(sst.cstUnique) AS STRING
                FOR i = 1 TO sst.cstUnique
                    sstArray(i - 1) = ReadBiffString(2, BiffVersion, pStream)
                NEXT
                hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
            ELSEIF Identifier.idNumber = &HFD THEN                                          'LabelSst
                hr = pStream.Read(VARPTR(labelSst), SIZEOF(labelSst), cbRead)
                hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
                PRINT #1, OrdinalToName(labelSst.rw, labelSst.col) & ": " & _
                  sstArray(labelSst.isst)
            ELSEIF Identifier.idNumber = &H406 OR Identifier.idNumber = &H06 THEN           'Formula
                hr = pStream.Read(VARPTR(rw), SIZEOF(rw), cbRead)
                hr = pStream.Read(VARPTR(col), SIZEOF(col), cbRead)
                seekDistance = Identifier.szSize - SIZEOF(rw) - SIZEOF(col)
                hr = pStream.Seek(seekDistance, %STREAM_SEEK_CUR, qNewPos)
            ELSEIF Identifier.idNumber = &H207 OR Identifier.idNumber = &H07 THEN           'String
                s = ReadBiffString(2, BiffVersion, pStream)
                hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
                PRINT #1, OrdinalToName(rw, col) & ": " & s
            ELSEIF Identifier.idNumber = &H7E  OR Identifier.idNumber = &H27E THEN          'RK Number
                hr = pStream.Read(VARPTR(rk), SIZEOF(rk), cbRead)
                hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
                IF BiffVersion = 8 THEN
                    PRINT #1, OrdinalToName(rk.rw, rk.col) & ": " & LTRIM$(STR$(NumFromRk(rk.rk))) & _
                      " Number Format: " & numberFormats(xfArray8(rk.ixfe).ifmt)
                ELSE
                    PRINT #1, OrdinalToName(rk.rw, rk.col) & ": " & LTRIM$(STR$(NumFromRk(rk.rk))) & _
                      " Number Format: " & numberFormats(xfArray7(rk.ixfe).ifmt)
                END IF
     
            ELSEIF Identifier.idNumber = &HBD THEN                                          'MulRK Structure
                ReadMulRkStructure pStream, mulRk, Identifier.szSize
                hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
                FOR i = 0 TO mulRk.rkCount
                    IF BiffVersion = 8 THEN
                        PRINT #1, OrdinalToName(mulRk.rw, mulRk.colFirst + i) & ": " & _
                          LTRIM$(STR$(NumFromRk([email protected][i].rk))) & _
                          " Number Format: " & numberFormats(xfArray8([email protected][i].ixfe).ifmt)
                    ELSE
                        PRINT #1, OrdinalToName(mulRk.rw, mulRk.colFirst + i) & ": " & _
                          LTRIM$(STR$(NumFromRk([email protected][i].rk))) & _
                          " Number Format: " & numberFormats(xfArray7([email protected][i].ixfe).ifmt)
                    END IF
                NEXT
            ELSEIF Identifier.idNumber = &H03  OR Identifier.idNumber = &H203 THEN          'Number
                hr = pStream.Read(VARPTR(num), SIZEOF(num), cbRead)
                hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
                IF BiffVersion = 8 THEN
                    PRINT #1, OrdinalToName(num.rw, num.col) & ": " & LTRIM$(STR$(num.num)) & _
                      " Number Format: " & numberFormats(xfArray8(num.ixfe).ifmt)
                ELSE
                    PRINT #1, OrdinalToName(num.rw, num.col) & ": " & LTRIM$(STR$(num.num)) & _
                      " Number Format: " & numberFormats(xfArray7(num.ixfe).ifmt)
                END IF
            ELSEIF Identifier.idNumber = &H1E  OR Identifier.idNumber = &H41E THEN          'Number Format
                INCR curNumFmt
                'Ghetto vector
                IF curNumFmt > UBOUND(numberFormats) THEN
                    REDIM PRESERVE numberFormats(UBOUND(numberFormats) + 20) AS STRING
                END IF
                seekDistance = SIZEOF(tFormat)
                hr = pStream.Seek(seekDistance, %STREAM_SEEK_CUR, qNewPos)
                numberFormats(curNumFmt) = ReadBiffString(2, BiffVersion, pStream)
                hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
            ELSEIF Identifier.idNumber =  &H43  OR Identifier.idNumber = &H243 _
                OR Identifier.idNumber = &H443  OR Identifier.idNumber = &H843 _
                OR Identifier.idNumber =  &HE0 THEN                                         'XF Record
                INCR curXf
                IF BiffVersion = 8 THEN
                    'Ghetto vector
                    IF curXf > UBOUND(xfArray8) THEN
                        REDIM PRESERVE xfArray8(UBOUND(xfArray8) + 20) AS tXfRecordBiff8
                    END IF
                    hr = pStream.Read(VARPTR(xfArray8(curXf)), SIZEOF(tXfRecordBiff8), cbRead)
                ELSE
                    'Ghetto vector
                    IF curXf > UBOUND(xfArray7) THEN
                        REDIM PRESERVE xfArray7(UBOUND(xfArray7) + 20) AS tXfRecordBiff7
                    END IF
                    hr = pStream.Read(VARPTR(xfArray7(curXf)), SIZEOF(tXfRecordBiff7), cbRead)
                END IF
                hr = pStream.Seek(0, %STREAM_SEEK_CUR, qNewPos)
            ELSE
                'PRINT #1, "Identifier: " & IdToName$(Identifier.idNumber) & " - Size: " & FORMAT$(Identifier.szSize)
                seekDistance = Identifier.szSize
                hr = pStream.Seek(seekDistance, %STREAM_SEEK_CUR, qNewPos)
            END IF
        LOOP
        CLOSE #1
        pStream = NOTHING
        pStorage = NOTHING
        MSGBOX "Total runtime: " & FORMAT$(TIMER - start, "#,##0.00")
    END FUNCTION
     
    ''This sub is used to set option flags for row UDTs. They are not used in this example, but
    ''I'm leaving this here incase it is useful to you.
    SUB SetRowOptions(row AS tRow, rowOpt AS tRowOptions)
        rowOpt.stdRowHt      = BIT(row.miyRw, 15)
        rowOpt.iOutLevel     = row.grbit AND &H07
        rowOpt.fCollapsed    = BIT(row.grbit, 4)
        rowOpt.fDyZero       = BIT(row.grbit, 5)
        rowOpt.fUnsynced     = BIT(row.grbit, 6)
        rowOpt.fGhostDirty   = BIT(row.grbit, 7)
        rowOpt.fExAsc        = BIT(row.ixfe,  12)
        rowOpt.fExDsc        = BIT(row.ixfe,  13)
    END SUB
    ''Converts an RK number to an IEEE number
    FUNCTION NumFromRk(rk AS LONG) AS DOUBLE
        DIM num AS DOUBLE
        DIM tmp AS LONG
        DIM ptmp AS LONG PTR
        IF ISTRUE BIT(rk, 1) THEN
            'int
            tmp = rk
            SHIFT RIGHT tmp, 2
            num = CDBL(tmp)
        ELSE
            'hi words of IEEE num
            tmp = rk AND &HFFFFFFFC
            ptmp = VARPTR(num) + 4
            @ptmp = tmp
        END IF
        IF ISTRUE BIT(rk, 0) THEN
            num /= 100
        END IF
        FUNCTION = num
    END FUNCTION
    ''Excel BIFF Multiple RK record
    SUB ReadMulRkStructure(pStream AS IStream, BYREF mulRk AS tMulRk, szSize AS WORD)
        DIM hr AS LONG
        DIM cbRead AS DWORD
        hr = pStream.Read(VARPTR(mulRk.rw), SIZEOF(mulRk.rw), cbRead)
        hr = pStream.Read(VARPTR(mulRk.colFirst), SIZEOF(mulRk.colFirst), cbRead)
        mulRk.rkCount = (szSize - 6) / 6 - 1
        REDIM rkRec(mulRk.rkCount) AS tRkRec
        hr = pStream.Read(VARPTR(rkRec(0)), SIZEOF(tRkRec) * (mulRk.rkCount + 1), cbRead)
        mulRk.rgrkreg = VARPTR(rkRec(0))
        hr = pStream.Read(VARPTR(mulRk.colLast), SIZEOF(mulRk.colLast), cbRead)
    END SUB
    ''Zero based row and column ordinals to Excel friendly address "A3"
    FUNCTION OrdinalToName(BYVAL rw AS WORD, BYVAL col AS WORD) AS STRING
        'Const BaseSize = 26 'Len("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
        DIM s AS STRING, tmp AS DOUBLE, i AS INTEGER, lastI AS INTEGER
        DIM b(26) AS BYTE
        DIM mycol AS LONG
        INCR col
        INCR rw
        IF col < 1 OR rw < 1 THEN
            EXIT FUNCTION
        END IF
        'A-Z
        ARRAY ASSIGN b() = 0, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90
        mycol = col
        DO WHILE mycol <> 0
            tmp = mycol
            i = 0
            DO WHILE tmp >= 26
                i = i + 1
                tmp = tmp / 26
            LOOP
            tmp = INT(tmp)  'truncate decimals
            s = s & CHR$(b(tmp))
            mycol = mycol - tmp * (26 ^ i)
            lastI = i
        LOOP
        FUNCTION = s & LTRIM$(STR$(rw))
    END FUNCTION
    ''Tested and appears to be working as expected. SizeOfCch should be 2 except in the case of worksheet
    ''names when it is 1. Refer to the BIFF docs.
    FUNCTION ReadBiffString(BYVAL SizeOfCch AS DWORD, BiffVersion AS BYTE, pStream AS IStream) AS STRING
        DIM hr AS LONG
        DIM cbRead AS DWORD
        DIM qNewPos AS QUAD
        DIM cc1 AS BYTE
        DIM cch AS WORD         'Count of characters in the string (Note: this is the number of chars, NOT bytes)
        DIM grbit AS BYTE       'Option flags (only used in BIFF8)
        DIM sRgb AS STRING      'Array of string characters and formatting runs
        DIM fHighByte AS BYTE   '=0 if all the characters in the string have a high byte of 00h
                                'and only the low bytes are saved in the file (compressed)
                                '=1 if at least one character in the string has a nonzero high byte and therefore
                                'all characters in the string are saved as double-byte characters (not compressed)
        DIM fExtSt AS BYTE      'Extended string flag
        DIM fRichSt AS BYTE     'Rich string flag
        DIM cchExtRst AS DWORD  'Length of ExtRst data
        DIM numRuns AS WORD     'The number of formatting runs in the BiffString
        'dim runs() as tRun     'The formatting runs.. not implemented in this function
        'dim i as word
        IF SizeOfCch = 1 THEN
            hr = pStream.Read(VARPTR(cc1), SIZEOF(cc1), cbRead)
            cch = cc1
        ELSE
            hr = pStream.Read(VARPTR(cch), SIZEOF(cch), cbRead)
        END IF
        IF BiffVersion < &H08 THEN
            FUNCTION = IStream_ReadText(pStream, CLNG(cch))
        ELSE
            hr = pStream.Read(VARPTR(grbit), SIZEOF(grbit), cbRead)
            SELECT CASE grbit
                CASE &H00 'the BiffString is compressed unicode
                    FUNCTION = IStream_ReadText(pStream, CLNG(cch))
                CASE &H01 'the BiffString is Unicode
                    FUNCTION = IStream_ReadText(pStream, CLNG(cch * 2))
                CASE ELSE
                    fHighByte = BIT(grbit, 0)
                    fExtSt    = BIT(grbit, 2)
                    fRichSt   = BIT(grbit, 3)
                    IF fExtSt = 0 THEN
                        'The BiffString is a rich string that contains a series of runs
                        hr = pStream.Read(VARPTR(numRuns), SIZEOF(numRuns), cbRead)
                        'redim runs(numRuns) as tRun
                        IF fHighByte = 0 THEN
                            'the BiffString is compressed unicode
                            FUNCTION = IStream_ReadText(pStream, CLNG(cch))
                        ELSE
                            'the BiffString is Unicode
                            FUNCTION = IStream_ReadText(pStream, CLNG(cch * 2))
                        END IF
                        'Read in all of the run data
                        'for i = 0 to numRuns - 1
                        '   hr = pStream.Read(varptr(runs(i)), sizeof(runs(i)), cbRead)
                        'NEXT
                        'Since we are not processing the formatting code, just bypass it
                        hr = pStream.Seek(CQUD(numRuns * 4), %STREAM_SEEK_CUR, qNewPos)
                    ELSE
                        'The BiffString is an Extended String which contains some undocumented features
                        IF fRichSt = 1 THEN
                            'The BiffString does contain formatting runs
                            hr = pStream.Read(VARPTR(numRuns), SIZEOF(numRuns), cbRead)
                            'redim runs(numRuns) as tRun
                        END IF
                        hr = pStream.Read(VARPTR(cchExtRst), SIZEOF(cchExtRst), cbRead)
                        IF fHighByte = 0 THEN
                            'the BiffString is compressed unicode
                            FUNCTION = IStream_ReadText(pStream, CLNG(cch))
                        ELSE
                            'the BiffString is Unicode
                            FUNCTION = IStream_ReadText(pStream, CLNG(cch * 2))
                        END IF
                        IF fRichSt = 1 THEN
                            'Read in all of the run data
                            'for i = 0 to numRuns - 1
                            '   hr = pStream.Read(varptr(runs(i)), sizeof(runs(i)), cbRead)
                            'NEXT
                            'Since we are not processing the formatting code, just bypass it
                            hr = pStream.Seek(CQUD(numRuns * 4), %STREAM_SEEK_CUR, qNewPos)
                        END IF
                        'Bypass undocumented ExtRst data
                        hr = pStream.Seek(CQUD(cchExtRst), %STREAM_SEEK_CUR, qNewPos)
                    END IF
            END SELECT
        END IF
    END FUNCTION
    SUB PopulateDefaultNumberFormats(numberFormats() AS STRING)
        REDIM numberFormats(&H40) AS STRING
        numberFormats(&H0) = "General"
        numberFormats(&H1) = "0"
        numberFormats(&H2) = "0.00"
        numberFormats(&H3) = "#,##0"
        numberFormats(&H4) = "#,##0.00"
        numberFormats(&H5) = "($#,##0_);($#,##0)"
        numberFormats(&H6) = "($#,##0_);[Red]($#,##0)"
        numberFormats(&H7) = "($#,##0.00_);($#,##0.00)"
        numberFormats(&H8) = "($#,##0.00_);[Red]($#,##0.00)"
        numberFormats(&H9) = "0%"
        numberFormats(&HA) = "0.00%"
        numberFormats(&HB) = "0.00E+00"
        numberFormats(&HC) = "# ?/?"
        numberFormats(&HD) = "# ??/??"
        numberFormats(&HE) = "m/d/yy"
        numberFormats(&HF) = "d-mmm-yy"
        numberFormats(&H10) = "d-mmm"
        numberFormats(&H11) = "mmm-yy"
        numberFormats(&H12) = "h:mm AM/PM"
        numberFormats(&H13) = "h:mm:ss AM/PM"
        numberFormats(&H22) = "h:mm"
        numberFormats(&H23) = "h:mm:ss"
        numberFormats(&H24) = "h:mm"
        numberFormats(&H25) = "(#,##0_);(#,##0)"
        numberFormats(&H26) = "(#,##0_);[Red](#,##0)"
        numberFormats(&H27) = "(#,##0.00_);(#,##0.00)"
        numberFormats(&H28) = "(#,##0.00_);[Red](#,##0.00)"
        numberFormats(&H29) = "_(* #,##0_);_(* (#,##0);_(* ""-""_);_(@_)"
        numberFormats(&H2A) = "_($* #,##0_);_($* (#,##0);_($* ""-""_);_(@_)"
        numberFormats(&H2B) = "_(* #,##0.00_);_(* (#,##0.00);_(* ""-""??_);_(@_)"
        numberFormats(&H2C) = "_($* #,##0.00_);_($* (#,##0.00);_($* ""-""??_);_(@_)"
        numberFormats(&H2D) = "mm:ss"
        numberFormats(&H2E) = "[h]:mm:ss"
        numberFormats(&H2F) = "mm:ss.0"
        numberFormats(&H30) = "##0.0E+0"
        numberFormats(&H31) = "@"
    END SUB
    Attached Files
    Last edited by David Maruca; 9 Jul 2010, 05:41 PM.

  • #2
    Realy interesting

    i tested the part 2 prog with the "test.xls" file that is in part one and i see a litle error:

    the C1 cell is formated as Date JJ/MM/AAAA and the result from the prog
    says it is a "C1: 40283 Number Format: m/d/yy"

    isn't it incorrect ?

    Thank's

    Dominique
    Dominique

    Comment


    • #3
      That's not incorrect. Excel stores dates and times as a number which is converted into a real date depending on the 1904 flag set in the workbook headers. This code does not convert that number. It simply reports the stored number and the cell format. If you open that xls in Excel and format it as a number you will see the same value.

      I don't have it with me and I am about to head out of town, but I can post some VB code I use to convert the date serial into a real date. I'm gone for almost a week. I'm sure this info can be found on google, though. Thanks for your interest.

      Comment


      • #4
        in fact

        in fact my question was not about the number by itself,
        but about the format listed:
        the prog says it is of the date format of : "m/d/yy"
        in fact it is a date format of : "dd/mm/yyyy"

        that was just a question about the info returned .

        I realy thank you for all these infos about BIFF format !

        Dominique
        Dominique

        Comment


        • #5
          I think I definitely still have some confusion from the docs about the styles array set up in the BIFF document. I changed the date format and crashed because of an array bounds issue lol. I will continue researching on this after my vacation.

          FYI the date format specified in the document in the first thread is marked as an internal format that is marked with an asterisk on my cell format dialog which means it responds to regional settings from the OS. I have the internal formats hard coded and they do not check OS settings. That may be why there is a discrepancy.

          Comment


          • #6
            Oh, it's also interesting to note that the BIFF format is remarkably close to the new OpenXML formats in the 2007 versions. In some cases I have had to use a combination of the MS BIFF docs, open office.org docs, and ECMA 376 to understand some things.

            Comment

            Working...
            X