Finding corrent name with misspelling?
-
As Michael pointed out (perhaps not clearly enough), one of the phonetic algorithms for finding similarities between words is Soundex.
For a PB/DOS implementation of Soundex, see http://www.powerbasic.com/support/pb...hlight=soundex
-
Originally posted by Mel Bishop:
> I think what he's asking is how can you tell if they are similar? Would really like to know how Google does it myself.
There has to be a way to do this mathematically and programmatically, or Google couldn't do it.
Thanks.
Robert
-
If there are 2 similar counties...
Good question.
-
In your DB of states, list the valid counties for each state. If there are 2 similar counties, then look at the boroughs/towns/cities within those counties and compare them to what was given. Then just test each entry against the database. If you get 2 or more likely candidates, prompt the user (something like MapQuest does).
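The lookup described above might be sketched like this in Python. It is only an illustration: the `COUNTIES_BY_STATE` table and the `county_candidates` function are made-up names, the table holds just a few sample Kentucky counties, and the 0.6 similarity cutoff is an arbitrary starting point rather than a tuned value.

```python
import difflib

# Hypothetical sample data: a real table would list every county per state.
COUNTIES_BY_STATE = {
    "Kentucky": ["Fayette", "Scott", "Shelby"],
}


def county_candidates(state, county, max_results=3):
    """Return the valid counties of `state` that resemble `county`."""
    valid = COUNTIES_BY_STATE.get(state, [])
    if county in valid:
        # Exact hit: nothing to correct.
        return [county]
    # Otherwise rank the valid names by similarity, best match first.
    return difflib.get_close_matches(county, valid, n=max_results, cutoff=0.6)


candidates = county_candidates("Kentucky", "Stott")
print(candidates)
```

If the list comes back with one name, it can be accepted; with two or more, the program could compare the town against each county's towns or prompt the user; with none, the entry needs a manual look.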
-
> Is there a formula to .....
There is no one 'formula' to do what you want.
There are things like SOUNDEX values you can use to get close, but at some point you will have to compare what you think a word might be against a table of valid values, and pick one.
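For reference, here is a rough Python sketch of the common American Soundex rules (keep the first letter, code the remaining consonants as digits, drop vowels, collapse repeated codes, pad to four characters). The function name `soundex` is just for illustration, and other Soundex variants handle some letters differently.

```python
def soundex(name):
    """Return a 4-character American Soundex code, or "" if no letters."""
    codes = {}
    groups = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
              ("L", "4"), ("MN", "5"), ("R", "6"))
    for letters, digit in groups:
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate equal codes.
            continue
        code = codes.get(ch, "")  # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


print(soundex("Smith"), soundex("Smyth"))
print(soundex("Scott"), soundex("Stott"))
```

Smith and Smyth share a code, but Scott and Stott do not, because C and T fall into different digit groups. That is the "get close" limitation: Soundex catches sound-alike spellings, not arbitrary typos.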
But even that will not be enough: For example, look at your thread title:
By "Corrent" did you mean "Current" or "Correct"? Both are valid words, both differ from it by just one letter, and both kind of ("fuzzily") make sense.
The ability to do this kind of thing is why Google pays its people well.
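To make the tie concrete, here is a small Python sketch that counts single-letter edits (Levenshtein distance) and picks the nearest entries from a table of valid values. Levenshtein distance is one common measure, not necessarily what any given spell checker uses, and the names `edit_distance` and `closest` are made up for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def closest(word, valid_words):
    """Return every valid word tied for the smallest distance to `word`."""
    distances = [(edit_distance(word, v), v) for v in valid_words]
    best = min(d for d, _ in distances)
    return [v for d, v in distances if d == best]


print(closest("corrent", ["current", "correct", "corner"]))
```

Both "current" and "correct" come back, each one edit away, so the distance alone cannot choose between them. Something else (context, word frequency, or asking the user) has to break the tie.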
MCM
-
Finding corrent name with misspelling?
Hello. I am attempting to write a "location verification" program of sorts. For example, if I have a death certificate (or a birth or marriage certificate), and it says the birth took place in "Kentucky, Stott County, Georgetown," then the program can figure out that "Stott" should be "Scott," and thereby I have correct information to use.
I have built a data file of about 2.6 million U.S. locations, but now I am wondering how to do the fuzzy matching: When you run a spell checker, how does it decide which words you were likely trying to spell? Is there a formula for matching a misspelled name to a possible correct name? For example, suppose on the certificate I see "Fborih"; I would like the formula to give me something I can use to suggest "Florida." (I wonder if I am really looking for a Soundex formula? That is what genealogists use to match similar last names such as Smith and Smyth, or Carneal and Carnall.)
Using PowerBasic DOS, but hoping to get PowerBasic for Windows soon!
Thanks.
Robert