Tally Wrong Choice


  • Tally Wrong Choice

    I wanted to know how many times the whole word "ice" was found in " ice ice ". The answer I expected was 2.

    I used Tally, which returns 1. Until I saw the result, I hadn't realized that Tally resumes its scan after the end of each match rather than sliding a window along one character at a time. So for Tally to return 2, there would need to be 2 spaces between the two occurrences of ice.

    Here's the Tally code and also code that gives the answer I wanted:

    Code:
    Function PBMain() As Long
       Local i,iCount As Long, t$, s$
       s$ = " ice "
    
       'gave answer I didn't want
       t$ = " ice ice "
       ? Str$(Tally(t$, s$))
    
       'gives answer I want
       i = InStr(t$,s$)
       While i
          Incr iCount
          i = InStr(i+Len(s$)-1,t$,s$)   'InStr(i+1,t$,s$) works too but is more effort
       Wend
       ? Str$(iCount)
    End Function
    I've previously posted a function INSTRW, which would find occurrences of whole words, but it was overkill for this particular need.

  • #2
    Hey Gary!

    What if s$ = "o" and t$ = "Groovy" ?


    • #3
      Howdy, Pierre!

      The code above would only be used to search for whole words, so s$ cannot be "o". It would be " o ", where all whitespace is converted to $spc.


      • #4
        Howdy Gary,

        My logic is: if t$ contains two s$, then there must be four spaces in t$.
        I mean, I don't understand the wording "Tally wrong choice" or why you expected 2 as an answer.

        Also, what if t$ = " ice" & $CRLF & "ice " or " ice" & $TAB & "ice "?
        Is it allowable in your context?


        • #5
          You could try
          '
          Code:
          #COMPILE EXE
          #DIM ALL
          FUNCTION PBMAIN () AS LONG
          LOCAL s,t AS STRING
          s = " ice "
          t = " ice ice "
          REPLACE s WITH s & " " IN t
          ? STR$(TALLY(t,s))
          END FUNCTION
          '


          • #6
            Howdy, Pierre!

            You are correct in your description of how it works.

            In the app where I'm doing this, I do something like Replace WhiteSpace$ with $Spc in t$, so the $CRLF are handled.

            But I was thinking of Tally as a moving window, the width of s$, which goes from the left to right of t$, one character at a time, looking for a match of the underlying letters. If Tally did that, it would have given an answer of 2.


            And Howdy, Stuart!

            Yep, that would seem to do it as well.

            I'd guess that your string replacement suggestion would take longer than using INSTR?


            • #7
              Just a small reflection: if the search length is 1, the following will cause an endless loop (which has to be closed via Task Manager), because each subsequent INSTR starts from the same point: INSTR( i+LEN(s$)-1, ..
              INSTR( i+1, .. is safer.
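
              A minimal sketch of a loop that cannot stall, assuming the goal is still to count space-delimited words. CountDelimited and nStep are hypothetical names used only for illustration; the only built-ins involved are INSTR and LEN.

              Code:
              FUNCTION CountDelimited(BYVAL t AS STRING, BYVAL s AS STRING) AS LONG
                 LOCAL i, nStep, nCount AS LONG
                 IF LEN(s) = 0 THEN EXIT FUNCTION   'nothing to search for, result is 0
                 nStep = LEN(s) - 1                 'restart on the shared trailing delimiter
                 IF nStep < 1 THEN nStep = 1        'never restart at the same position
                 i = INSTR(t, s)
                 WHILE i
                    INCR nCount
                    i = INSTR(i + nStep, t, s)
                 WEND
                 FUNCTION = nCount
              END FUNCTION

              Stepping by LEN(s)-1 only picks up matches that overlap by a single character, which is enough when the match string is a word wrapped in one space on each side. For arbitrary overlaps, stepping by 1 is the general choice.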


              • #8
                Gary,
                The documentation on TALLY states:
                "When a match is found, the scan for the next match begins at the position immediately following the prior match."
                So you should not expect the search to move along character by character through the complete string.
                Code:
                #COMPILE EXE
                #DIM ALL
                FUNCTION PBMAIN () AS LONG
                LOCAL p,q,r,s,t AS STRING
                p= "ice "
                q= "ice"
                r= " ice"
                s = " ice "
                t = " ice ice "
                'REPLACE s WITH s & " " IN t
                ? STR$(TALLY(t,p))+$CRLF+STR$(TALLY(t,q))+$CRLF+STR$(TALLY(t,r))+$CRLF+STR$(TALLY(t,s))
                END FUNCTION
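
                To make the quoted rule concrete, here is a rough emulation of it built on INSTR. TallyLike is a hypothetical helper, not a PowerBASIC built-in, and it only mimics the documented behavior for a single match string; it is a sketch, not a replacement for TALLY.

                Code:
                FUNCTION TallyLike(BYVAL sMain AS STRING, BYVAL sMatch AS STRING) AS LONG
                   LOCAL i, nCount AS LONG
                   IF LEN(sMatch) = 0 THEN EXIT FUNCTION      'nothing to search for, result is 0
                   i = INSTR(sMain, sMatch)
                   WHILE i
                      INCR nCount
                      'resume at the position immediately following the prior match
                      i = INSTR(i + LEN(sMatch), sMain, sMatch)
                   WEND
                   FUNCTION = nCount
                END FUNCTION

                Traced by hand against " ice ice ", this gives 2 for "ice ", 2 for "ice", 2 for " ice" and 1 for " ice ". In the last case the first match consumes the middle space, so the scan resumes at the second "i" and finds no leading space.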
                Rod
                I want not 'not', not Knot, not Knott, not Nott, not knot, not naught, not nought, but aught.


                • #9
                  Howdy, Borje!
                  Not "will", but "did"! One of my tests was with a single character word and I had to visit Task Manager to get out of it!


                  And yo, Rodney!
                  Yes, I made an unverified assumption. Didn't read Help. I've used Tally lots of times. It was this version with a word surrounded by spaces that caught me.


                  • #10
                    Gary,

                    I took your code and placed it into a function that does not recalculate len(s$)-1 on each pass through the loop.
                    A word can start a line without a leading space or end a line without a trailing space, so the function wraps the line in $SPC.
                    It also wraps the search word in spaces, so the caller doesn't have to pass it already wrapped.
                    Someone using pointers or assembler could beat the INSTR loop for performance.
                    Code:
                    FUNCTION PBMAIN() AS LONG
                     ? STR$(WordCount("ice ice","ice"))
                    END FUNCTION
                    
                    FUNCTION WordCount(BYVAL sMain AS STRING, BYVAL SearchFor AS STRING) AS LONG
                     LOCAL i,icount,LengthMinus1 AS LONG
                     sMain = WRAP$(sMain," "," ")
                     SearchFor = WRAP$(SearchFor,$SPC,$SPC)
                     LengthMinus1 = LEN(SearchFor) -1
                     i = INSTR(sMain,SearchFor)
                     WHILE i
                      INCR iCount
                      i = INSTR(i+LengthMinus1,sMain,SearchFor)
                     WEND
                     FUNCTION = iCount
                     ' https://forum.powerbasic.com/forum/user-to-user-discussions/powerbasic-for-windows/795149-tally-wrong-choice#post795163
                    END FUNCTION
                    How long is an idea? Write it down.


                    • #11
                      I guess it depends on what is being counted.

                      What about "ice ice, ice mice ice." I would expect 4


                      • #12
                        Hey Gary,

                        So the rule is:
                        If a substring is found (as is, no surrounding space)
                        and both PreChar and PostChar have character codes less than 33, then this is considered a word.
                        If the MainString starts with the SubString then PreChar is ignored.
                        If the MainString ends with the SubString then PostChar is ignored.

                        Is that a valid description of the spec you want?
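
                        For what it's worth, the rule as worded above might be sketched like this. CountWholeWords and its variable names are hypothetical, chosen only for illustration; the sketch assumes single-byte strings and uses only INSTR, LEN and ASC with a position argument.

                        Code:
                        FUNCTION CountWholeWords(BYVAL sMain AS STRING, BYVAL sWord AS STRING) AS LONG
                           LOCAL i, n, preOk, postOk, nCount AS LONG
                           n = LEN(sWord)
                           IF n = 0 THEN EXIT FUNCTION          'nothing to search for, result is 0
                           i = INSTR(sMain, sWord)
                           WHILE i
                              'PreChar: ignored at the start of the string, otherwise must be below 33
                              preOk = 0
                              IF i = 1 THEN
                                 preOk = 1
                              ELSEIF ASC(sMain, i - 1) < 33 THEN
                                 preOk = 1
                              END IF
                              'PostChar: ignored at the end of the string, otherwise must be below 33
                              postOk = 0
                              IF i + n - 1 = LEN(sMain) THEN
                                 postOk = 1
                              ELSEIF ASC(sMain, i + n) < 33 THEN
                                 postOk = 1
                              END IF
                              IF preOk = 1 AND postOk = 1 THEN INCR nCount
                              i = INSTR(i + 1, sMain, sWord)    'step by 1 so neighbouring words can share a delimiter
                           WEND
                           FUNCTION = nCount
                        END FUNCTION

                        Traced by hand, CountWholeWords(" ice ice ", "ice") gives 2. For "ice ice, ice mice ice." it gives 2 rather than the 4 expected in post #11, because "," and "." have character codes above 32, so punctuation would need its own clause in the rule.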


                        • #13
                          So if I wanted to count the number of 'a's in a string, it would not work unless each 'a' is at least two spaces away from the next 'a'????

                          Good heavens. That is definitely a weakness - bordering on a bug

                          I guess you could REMOVE$ everything except 'a' and then use a LEN?? But that might be slow??
                          I made a coding error once - but fortunately I fixed it before anyone noticed
                          Kerry Farmer


                          • #14
                            To count the number of letter 'a' the match string would be "a".
                            To count the number of word 'a' the match string would be " a ".
                            In " ice ice " the space between the ice's is part of the first match, therefore the second ice does not have a leading space so does not match. (ref post 1 code)
                            TALLY works as advertised, just not the way Gary "wanted", hence the title of this thread.

                            Cheers,
                            Dale


                            • #15
                              Dale

                              Duh - I do not understand sorry

                              If my string was 'a space a space a space a', then how many a's would tally count?
                              Kerry Farmer


                              • #16
                                Code:
                                'count = TALLY (mainstring,matchstring)
                                FUNCTION PBMAIN () AS LONG
                                 ? STR$(TALLY("a a a a","a")) '4
                                 ? STR$(TALLY("a a a a","a ")) '3
                                END FUNCTION

                                • #17
                                  Thanks Mike

                                  So if my string was 'aaaa' how many would it tally? [I should code it!]
                                  Kerry Farmer


                                  • #18
                                    I coded it

                                    Code:
                                    DIM s AS STRING
                                    DIM ss AS STRING
                                    s = "ice"
                                    ss = "ice ice"
                                    ? TALLY (ss,s)
                                    WAITKEY$ 
                                    Gives 2

                                    Code:
                                    DIM s AS STRING
                                    DIM ss AS STRING
                                    s = "ice"
                                    ss = "iceice"
                                    ? TALLY (ss,s)
                                    WAITKEY$
                                    Gives 2

                                    Code:
                                    DIM s AS STRING
                                    DIM ss AS STRING
                                    s = "a"
                                    ss = "a a a a"
                                    ? TALLY (ss,s)
                                    WAITKEY$
                                    Gives 4

                                    Code:
                                    DIM s AS STRING
                                    DIM ss AS STRING
                                    s = "a"
                                    ss = "aaaa"
                                    ? TALLY (ss,s)
                                    WAITKEY$
                                    Gives 4

                                    which is totally what I expected.

                                    This is important to me - so forgive my denseness

                                    What point am I missing?

                                    Windows 10, latest PBCC

                                    Thanks

                                    Kerry Farmer


                                    • #19
                                      Code:
                                      ' Looking for an exact word ice
                                      ' ice. ice, "ice"  and others are also valid.
                                      FUNCTION PBMAIN () AS LONG
                                       ? STR$(TALLY("dice, spice lice icey","ice")) 'tally finds 4, but 0 are the word ice
                                      END FUNCTION

                                      • #20
                                        Gary, depending on your exact goal and context, this one might be interesting for you...
                                        It is not optimized for speed but for little footprint in the ide.
                                        I have not exactly the same view but I think it's what you like...
                                        It should be robust enough.

                                        Code:
                                         LOCAL sMainString AS STRING
                                         LOCAL sSubString  AS STRING
                                         LOCAL index       AS LONG
                                         LOCAL WordCount   AS LONG
                                        
                                         sMainString = "ice ice iceberg ice$ %ice   iceice" & $CRLF & "rice" & $TAB & "ice,ice."
                                         sSubString  = "ice"  
                                        
                                         sMainString = SHRINK$(sMainString, $WHITESPACE & ",.?!=%$&") '<- Add all non word characters you want
                                         FOR index = 1 TO PARSECOUNT(sMainString, $SPC) 'one field per space-delimited word
                                           IF PARSE$(sMainString, $SPC, index) = sSubString THEN INCR WordCount 'compare the word itself, not just its length
                                         NEXT
                                        
                                         ? "[" & sMainString & "]" & $CRLF & "[" & sSubString & "]" & $CRLF & "Count: " & STR$(WordCount)
