
Finding corrent name with misspelling?


  • Michael Mattias
    replied
    There has to be a way to do this mathematically and programmatically, or Google couldn't do it.
    And to think your mother and father wondered how you could ever support yourself as a lexicographer.



  • Knuth Konrad
    replied
    As Michael pointed out (perhaps not clearly enough), one of the phonetic algorithms for finding similarities between words is Soundex.

    For a PB/DOS implementation of Soundex, see http://www.powerbasic.com/support/pb...hlight=soundex
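
    For readers without the PB/DOS code at hand, here is a minimal Python sketch of American Soundex as it is commonly described. It is not taken from the linked implementation; the names soundex and CODES are illustrative only.

```python
# Sketch of American Soundex; function and table names are illustrative.
CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name):
    """Return a 4-character code: first letter plus up to three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(ch, "")  # vowels and Y have no code
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
        if len(result) == 4:
            break
    return (result + "000")[:4]  # pad short codes with zeros


for pair in (("Smith", "Smyth"), ("Carneal", "Carnall"), ("Stott", "Scott")):
    print(pair, [soundex(n) for n in pair])
```

    Under this sketch, Smith and Smyth share a code, as do Carneal and Carnall, but Stott and Scott do not (S330 versus S300), so Soundex alone would not catch that particular misspelling.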



  • Robert E. Carneal
    replied
    Originally posted by Mel Bishop:
    I think what he's asking is how can you tell if they are similar? Would really like to know how Google does it myself.
    Yes, Mel, that is exactly what I am inquiring about. Thank you for restating that very clearly. There has to be a way to do this mathematically and programmatically, or Google couldn't do it.

    Thanks.

    Robert



  • Mel Bishop
    replied
    If there are 2 similar counties...
    I think what he's asking is how can you tell if they are similar? Would really like to know how Google does it myself.

    Good question.



  • Scott Slater
    replied
    In your DB of states, list the valid counties for each state, then test each entry against the database. If there are 2 similar counties, look at the boroughs/towns/cities within those counties and compare them to what was given. If you still get 2 or more likely candidates, prompt the user (something like MapQuest does).
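
    The lookup described above could be sketched roughly as follows in Python. The PLACES table is a tiny illustrative extract, resolve_county is a hypothetical helper, and the standard library's difflib.get_close_matches stands in for whatever similarity test is actually used; the 0.5 cutoff is an arbitrary assumption.

```python
import difflib

# Tiny illustrative extract; a real table would cover every state and county.
PLACES = {
    "Kentucky": {
        "Scott": {"Georgetown", "Stamping Ground"},
        "Knott": {"Hindman"},
        "Todd": {"Elkton"},
    },
}


def resolve_county(state, county, town, cutoff=0.5):
    """Return the likely counties for a possibly misspelled county name.

    An empty list means nothing was close enough; more than one entry
    means the caller should prompt the user to choose.
    """
    counties = PLACES.get(state, {})
    if county in counties:
        return [county]  # exact match, nothing to correct
    # Similar county names within the state, best match first.
    candidates = difflib.get_close_matches(county, list(counties), n=5, cutoff=cutoff)
    # Use the town to narrow the choice when several counties look similar.
    narrowed = [c for c in candidates if town in counties[c]]
    return narrowed or candidates


print(resolve_county("Kentucky", "Stott", "Georgetown"))
print(resolve_county("Kentucky", "Stott", ""))
```

    With the town supplied, only Scott survives; without it, both Scott and Knott come back and the user would have to pick.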



  • Michael Mattias
    replied
    > Is there a formula to .....

    There is no one 'formula' to do what you want.

    There are things like SOUNDEX values you can use to get close, but at some point you will have to compare what you think a word might be against a table of valid values, and pick one.

    But even that will not be enough: For example, look at your thread title:
    By "Corrent" did you mean "Current" or "Correct"? Both are valid words, both differ from it by just one letter, and both kind of ("fuzzily") make sense.
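
    One common way to put a number on "one letter which is different" is edit (Levenshtein) distance. The Python sketch below is an assumption about how the table comparison might be done, not a method anyone in this thread has confirmed; edit_distance and closest are illustrative names.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-letter inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]


def closest(word, valid):
    """Return the smallest distance and every valid value at that distance."""
    scored = [(edit_distance(word.lower(), v.lower()), v) for v in valid]
    best = min(d for d, _ in scored)
    return best, [v for d, v in scored if d == best]


print(closest("corrent", ["current", "correct", "corrupt"]))
print(closest("Stott", ["Scott", "Todd"]))
```

    For "corrent" the two words tie at distance 1, which is exactly the ambiguity described above: the distance narrows the table down, but something else (context, or the user) still has to pick.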

    The ability to do this kind of thing is why Google(r) pays its people well.

    MCM



  • Robert E. Carneal
    started a topic Finding corrent name with misspelling?


    Hello. I am attempting to write a "location verification" program of sorts. I.e., if I have a death certificate (birth or marriage as well), and it says the birth took place in "Kentucky, Stott County, Georgetown," then the program can figure out that "Stott" should be "Scott" and thereby I have correct information to use.

    I have built a data file of about 2.6 million U.S. locations, but now I am wondering how to apply fuzzy logic to it: When you run a spell checker, how does it decide which words you were likely trying to spell? Is there a formula for matching a misspelled name to a possible correct name? E.g., suppose on the certificate I see "Fborih"; I would like the formula to give me something I can use to suggest "Florida." (I wonder if I am really looking for a Soundex formula? That is what genealogists use to match similar last names such as Smith and Smyth, or Carneal and Carnall, for example.)

    Using PowerBasic DOS, but hoping to get PowerBasic for Windows soon!

    Thanks.

    Robert