
Finding corrent name with misspelling?


  • Finding corrent name with misspelling?

    Hello. I am attempting to write a "location verification" program of sorts. I.e., if I have a death certificate (birth or marriage as well), and it says the birth took place in "Kentucky, Stott County, Georgetown," then the program can figure out that "Stott" should be "Scott" and thereby I have correct information to use.

    I have built a data file of about 2.6 million U.S. locations, but now I am wondering how to apply fuzzy matching to it: When you run a spell checker, how does it decide which words you were likely trying to spell? Is there a formula for matching a misspelled name to possible correct names? I.e., suppose on the certificate I see "Fborih," I would like the formula to give me something I can use to suggest "Florida." (I wonder if I am really looking for a Soundex formula? That is what genealogists use to match similar last names such as Smith and Smyth, or Carneal and Carnall, for example.)

    Using PowerBasic DOS, but hoping to get PowerBasic for Windows soon!

    Thanks.

    Robert

  • #2
    > Is there a formula to .....

    There is no one 'formula' to do what you want.

    There are things like SOUNDEX values you can use to get close, but at some point you will have to compare what you think a word might be against a table of valid values, and pick one.

    But even that will not be enough: For example, look at your thread title:
    By "Corrent" did you mean "Current" or "Correct"? Both are valid words, both differ from it by just one letter, and both kind of ("fuzzily") make sense.

    The ability to do this kind of thing is why Google(r) pays its people well.

    MCM
    Michael Mattias
    Tal Systems (retired)
    Port Washington WI USA
    http://www.talsystems.com



    • #3
      In your DB of states, list the valid counties for each state. If there are 2 similar counties, then look at boroughs/towns/cities within the counties and compare to what was given. Then just test each entry against the database. If you get 2 or more likely candidates, prompt the user (something like mapquest does).
      Scott Slater
      Summit Computer Networks, Inc.
      www.summitcn.com
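The lookup described in #3 could be sketched roughly as follows. This is Python rather than PowerBASIC, and only an illustration: `LOCATIONS` is a tiny hypothetical stand-in for the 2.6-million-row data file, `suggest_counties` is a made-up name, and the cutoffs are guesses that would need tuning. The similarity test comes from `difflib.get_close_matches` in the Python standard library.

```python
from difflib import get_close_matches

# Hypothetical sample of the location table: state -> county -> towns.
LOCATIONS = {
    "Kentucky": {
        "Scott": ["Georgetown", "Sadieville", "Stamping Ground"],
        "Todd": ["Elkton", "Guthrie"],
    },
}

def suggest_counties(state, county, town):
    """Return the likely county names for a possibly misspelled county."""
    counties = LOCATIONS.get(state, {})
    # Step 1: counties in this state whose spelling is close to the input.
    # The comparison is case-sensitive; normalize case first if needed.
    candidates = get_close_matches(county, list(counties), n=3, cutoff=0.6)
    # Step 2: prefer candidates that actually contain the given town.
    confirmed = [
        c for c in candidates
        if get_close_matches(town, counties[c], n=1, cutoff=0.8)
    ]
    # Fall back to all spelling candidates if the town confirms none.
    return confirmed or candidates

print(suggest_counties("Kentucky", "Stott", "Georgetown"))
```

If the returned list has more than one entry, that is the point at which the program would prompt the user to choose.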



      • #4
        If there are 2 similar counties...
        I think what he's asking is how can you tell if they are similar? Would really like to know how Google does it myself.

        Good question.
        There are no atheists in a fox hole or the morning of a math test.
        If my flag offends you, I'll help you pack.



        • #5
          Originally posted by Mel Bishop
          I think what he's asking is how can you tell if they are similar? Would really like to know how Google does it myself.
          Yes, Mel, that is exactly what I am inquiring about. Thank you for restating that very clearly. There has to be a way to do this mathematically and programmatically, or Google couldn't do it.

          Thanks.

          Robert
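One common way to put a number on "how similar are two spellings" is edit distance (Levenshtein distance): the fewest single-letter insertions, deletions, and substitutions needed to turn one word into the other. Nobody in this thread names it, and this is not a claim about how Google does it. What follows is a minimal Python sketch; the function names and the three-state candidate list are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    # prev[j] holds the distance between the first i-1 letters of a
    # and the first j letters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]

def rank_candidates(word, candidates):
    """Sort valid names from most to least similar to word."""
    w = word.lower()
    return sorted(candidates, key=lambda c: edit_distance(w, c.lower()))

states = ["Kentucky", "Georgia", "Florida"]  # hypothetical short list
print(rank_candidates("Fborih", states)[0])
print(edit_distance("stott", "scott"))
```

Under this measure "Stott" is one edit away from "Scott" and "Fborih" is three edits away from "Florida". It does not settle the ambiguity raised in #2, though: "corrent" is exactly one edit from both "current" and "correct", so ties still need context or a prompt to the user.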



          • #6
            As Michael pointed out (perhaps not clearly enough), one of the phonetic algorithms for finding similarities between words is Soundex.

            For a PB/DOS implementation of Soundex, see http://www.powerbasic.com/support/pb...hlight=soundex
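For readers who cannot reach the linked code, here is a minimal sketch of the standard American Soundex rules in Python. It is not the PB/DOS implementation referenced above, and the names `CODES` and `soundex` are just illustrative.

```python
# Letter-to-digit table for American Soundex; vowels, H, W and Y get no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Return the 4-character Soundex code (letter plus three digits)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]

for name in ("Smith", "Smyth", "Carneal", "Carnall", "Scott", "Stott"):
    print(name, soundex(name))
```

Smith and Smyth both code to S530, and Carneal and Carnall both to C654, which is why genealogists find it useful. It is tuned for sound-alike names rather than transcription errors, though: Scott codes to S300 while Stott codes to S330, so the original poster's own example would not match on Soundex alone.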



            • #7
              There has to be a way to do this mathematically and programmatically, or Google couldn't do it.
              And to think your mother and father wondered how you could ever support yourself as a lexicographer.
              Michael Mattias
              Tal Systems (retired)
              Port Washington WI USA
              http://www.talsystems.com
