Counting characters/letters

  • Counting characters/letters

    I recently had a query about how to do this (in a different programming language). Having knocked up a solution, I decided to make a PB utility to do it as well.
    It's trivial with ANSI strings, where there are only a limited number of printable characters: you can just increment values in a LONG array DIMed as (32 TO 255) as you step through the string, then step through the array, listing the CHR$() and value of each index where the value is greater than 0.
    i.e.
    Code:
    DIM arrC(32 TO 255) AS LONG
    FOR x = 1 TO LEN(mystring)
        INCR arrC(ASC(mystring,x))
    NEXT
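    The snippet above covers only the counting pass. A minimal sketch of the whole ANSI approach, including the listing pass described earlier, might look like the following. The sample text and the names c and sOut are made up for illustration, and the check for codes below 32 is an added assumption so that control characters such as CR/LF cannot index outside the array.

    ```powerbasic
    #COMPILE EXE
    #DIM ALL

    FUNCTION PBMAIN() AS LONG
        LOCAL mystring, sOut AS STRING
        LOCAL x, c AS LONG                 ' c and sOut are illustrative names
        DIM arrC(32 TO 255) AS LONG        ' one counter per printable ANSI code

        mystring = "Mississippi is not in Indiana."   ' sample text only

        ' Pass 1: tally each character code, ignoring control characters
        FOR x = 1 TO LEN(mystring)
            c = ASC(mystring, x)
            IF c >= 32 THEN INCR arrC(c)
        NEXT

        ' Pass 2: list every code that occurred at least once
        FOR x = 32 TO 255
            IF arrC(x) > 0 THEN sOut &= CHR$(x) & $TAB & STR$(arrC(x)) & $CRLF
        NEXT

        ? sOut,,"ANSI count"
    END FUNCTION
    ```

    Because the second pass walks the array in index order, the output comes out sorted by character code without a separate sort.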
    With a WSTRING, where there is a huge range of potential Unicode values, that is not really practical; there are better ways. Here's one.

    Not posted in Source Code, because it can probably be improved on. Have at it:

    '
    Code:
    #COMPILE EXE
    #DIM ALL
    
    %Case = 1 '0 = case sensitive, 1 = convert all to uppercase, 2 = convert all to lower case
    
    FUNCTION PBMAIN() AS LONG
        LOCAL wstr, wsOut,wstrPunct AS WSTRING
        LOCAL x,y AS LONG
        DIM arrOut() AS WSTRING
        DIM Freq() AS LONG
        wstrPunct = CHR$$(0 TO 31,".,;:!? ")   ' add other excluded characters as desired!
    
        wstr =  "Mississippi is not in Indiana. Neither is Arkansas! Is Missouri?" & $CRLF &  "These are accented Latin characters: " _
                    & CHR$$(&H0200,&H0201,&H0202,&H0203,&H0204,&H0205,&H0206,&H0207)
    
        SELECT CASE %Case
            CASE 1 : wstr=UCASE$(wstr)
            CASE 2 : wstr=LCASE$(wstr)
        END SELECT
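        DIM freq(1 TO LEN(wstr)) AS LONG
        ' Count each character at the position of its first occurrence.
        ' Later duplicates are overwritten with CHR$(0) so they are skipped.
        FOR x = 1 TO LEN(wstr)
            IF ASC(wstr,x) = 0 THEN ITERATE   ' already counted as a duplicate
            freq(x) = 1
            FOR y = x+1 TO LEN(wstr)
                IF ASC(wstr,x) = ASC(wstr,y) THEN
                    freq(x) = freq(x) + 1
                    MID$(wstr,y,1) = CHR$(0)
                END IF
            NEXT
        NEXT
        ' Build one output line per surviving character: hex code, character, count.
        ' Zeroed duplicates and anything listed in wstrPunct are left out.
        FOR x =  1 TO LEN(wstr)
            IF ASC(wstr,x) <> 0 AND (INSTR(wstrPunct,MID$(wstr,x,1)) = 0) THEN
                wsOut &= $LF & HEX$(ASC(wstr,x),4) & $TAB & MID$(wstr,x,1) & $TAB &  STR$(freq(x))
            END IF
        NEXT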
        DIM arrOut(1 TO PARSECOUNT(wsOut,$LF))
        PARSE wsOut,ArrOut(),$LF
        ARRAY SORT arrOut()
        ? JOIN$(arrOut(),$LF),,"Lettercount"
    END FUNCTION
    '
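    Taking up the invitation to improve on it: the nested loops above compare every character with every later one, so the work grows with the square of the string length. One possible alternative, offered only as a sketch, is to copy the character codes into a LONG array, sort it, and count runs of equal values. The names codes, n and runLen are made up for illustration; the sketch assumes ASC() returns the 16-bit code for each WSTRING character (as the code above already relies on), and for brevity it skips only spaces and control characters rather than the full wstrPunct list.

    ```powerbasic
    #COMPILE EXE
    #DIM ALL

    FUNCTION PBMAIN() AS LONG
        LOCAL wstr, wsOut AS WSTRING
        LOCAL x, n, runLen AS LONG         ' illustrative names

        wstr = UCASE$("Mississippi is not in Indiana.")   ' sample text only
        n = LEN(wstr)
        IF n = 0 THEN EXIT FUNCTION

        ' Copy the character codes into an array and sort them,
        ' so that equal characters end up next to each other.
        DIM codes(1 TO n) AS LONG
        FOR x = 1 TO n
            codes(x) = ASC(wstr, x)
        NEXT
        ARRAY SORT codes()

        ' Walk the sorted codes; each run of equal values is one character.
        runLen = 1
        FOR x = 2 TO n + 1
            IF x <= n THEN
                IF codes(x) = codes(x - 1) THEN
                    INCR runLen
                    ITERATE FOR
                END IF
            END IF
            ' The run ended at x - 1: report it unless it is a space or control code
            IF codes(x - 1) > 32 THEN
                wsOut &= HEX$(codes(x - 1), 4) & $TAB & CHR$$(codes(x - 1)) & $TAB & STR$(runLen) & $LF
            END IF
            runLen = 1
        NEXT

        ? wsOut,,"Lettercount (sorted)"
    END FUNCTION
    ```

    The output is already ordered by character code, so no PARSE / ARRAY SORT step on the result text is needed. The original string is also left unmodified, since nothing is overwritten with CHR$(0).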