Basic Examples
Find the Soundex identifier of a word:
In[]:=
Out[]=
h660
Find the Soundex identifier of a misspelled word (the result is the same as above):
In[]:=
Out[]=
h660
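The two results agree because Soundex keeps the first letter and encodes only the consonants that follow, so many single-letter slips leave the code unchanged. The following is a minimal sketch of the classic American Soundex rules, not the implementation behind the function documented here. The names `soundexDigit` and `soundexSketch` are hypothetical, and the sketch simply drops non-letter characters, which may differ from the documented function's handling of digits.

```wolfram
(* Hypothetical helper: consonant classes of classic American Soundex *)
soundexDigit = Association @ Flatten[{
    Thread[Characters["bfpv"] -> "1"],
    Thread[Characters["cgjkqsxz"] -> "2"],
    Thread[Characters["dt"] -> "3"],
    "l" -> "4",
    Thread[Characters["mn"] -> "5"],
    "r" -> "6"}];

(* Hypothetical sketch, not the repository function *)
soundexSketch[word_String] := Module[{letters, rest, codes},
  (* Keep letters only, ignoring case *)
  letters = Select[Characters[ToLowerCase[word]], LetterQ];
  If[letters === {},
   "",
   (* "h" and "w" after the first letter do not separate equal codes *)
   rest = DeleteCases[Rest[letters], "h" | "w"];
   (* Vowels and other unlisted letters get the placeholder "0" *)
   codes = Lookup[soundexDigit, Prepend[rest, First[letters]], "0"];
   (* Collapse runs of identical codes to a single code *)
   codes = First /@ Split[codes];
   (* Drop the first letter's own code, then the vowel placeholders *)
   codes = DeleteCases[Rest[codes], "0"];
   (* First letter plus exactly three digits, zero-padded or truncated *)
   StringJoin[First[letters], PadRight[codes, 3, "0"]]
   ]
  ]

soundexSketch /@ {"Robert", "Rupert"}
(* {"r163", "r163"} *)
```

The lowercase first letter mirrors the output style shown on this page; the classic algorithm is usually written with an uppercase initial.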
Scope
The character case does not matter:
In[]:=
Out[]=
d526
In[]:=
Out[]=
d526
Digits are not processed and become part of the result:
In[]:=
Out[]=
d452
Applications
Compute an association of dictionary words and corresponding Soundex codes:
In[]:=
AbsoluteTiming[codes=Association@Map[#->[#]&,DictionaryLookup["*"]];]
Out[]=
{2.27908,Null}
Show the 20 most common Soundex codes:
In[]:=
top=TakeLargestBy[Tally[Values[codes]],#〚2〛&,20]
Out[]=
{{t652,403},{c536,317},{p632,235},{c623,219},{c526,214},{p625,212},{c516,212},{s162,208},{p623,203},{p612,198},{s362,193},{c535,190},{a200,183},{s163,182},{p622,182},{c512,176},{s352,170},{s363,165},{p626,163},{s520,163}}
Show that the Soundex codes adhere to the Pareto principle, with a small share of the codes covering most of the words:
In[]:=
freqs=SortBy[Tally[Values[codes]],-#〚2〛&]〚All,2〛;freqs=Accumulate[freqs]/Total[freqs];ListLinePlot[freqs,PlotTheme->"Detailed"]
Out[]=
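The cumulative curve can also be summarized by a single number: the fraction of distinct codes needed to cover 80% of the words. The following is a minimal sketch on hypothetical toy tallies (`counts`, `cumulative` and `needed` are illustrative names, not part of the original example). In the notebook, the counts would come from the tallied Soundex codes instead.

```wolfram
(* Hypothetical toy tallies: number of words per code *)
counts = {50, 20, 10, 5, 5, 4, 3, 1, 1, 1};

(* Cumulative share of words, most frequent codes first *)
cumulative = Accumulate[Reverse[Sort[counts]]]/Total[counts];

(* Codes whose cumulative share stays below 80%, plus the one that reaches it *)
needed = LengthWhile[cumulative, # < 4/5 &] + 1;

(* Fraction of distinct codes needed to cover 80% of the words *)
N[needed/Length[counts]]
(* 0.3 *)
```

A value well below 0.5 indicates that a minority of codes accounts for most of the words.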
Show the words corresponding to one of the top Soundex codes:
In[]:=
GroupBy[Normal[codes],#〚2〛&,#〚All,1〛&][top〚20,1〛]
Out[]=
{sames,samosa,Sancho,sanes,sang,Sang,sangs,sank,Sanka,sans,saunas,sawing,saying,sayings,scams,scans,scenes,scenic,schemas,schemes,schmoes,schmooze,schmuck,schnook,schnooks,schnoz,science,scions,sconce,scones,scums,seams,seance,seeing,seeings,seems,seines,semis,Seneca,sens,sense,sewing,shames,shams,shanghai,Shanghai,shank,shanks,Shawnees,shewing,shimmies,shims,shines,shinnies,shins,shoeing,shooing,showiness,showing,showings,shuns,shying,shyness,Siamese,Sihanouk,Sims,since,sines,sinews,sing,singe,sings,sink,sinks,sins,sinuous,sinus,skeins,skewing,skewness,skiing,skims,skins,skunk,skunks,skying,smack,smash,smock,smog,smoggy,smogs,smoke,smokey,Smokey,smoky,smooch,smoochy,smug,smugs,snack,snag,snags,snake,Snake,snaky,snazzy,sneak,sneaks,sneaky,sneeze,snick,snog,snogs,snooze,snows,snowshoe,snug,snugs,someways,Somoza,song,Songhai,Songhua,songs,sonic,sonics,Sonja,sonnies,sons,soonish,sowing,Soyinka,squamous,squeamish,suing,sumac,sums,sung,Sung,sunk,sunks,sunnies,Sunnis,suns,swains,swamis,swank,swanks,swanky,swans,Swansea,swaying,swims,swines,swing,swings,swinish,swoons,swung,sync,syncs,Synge}
Neat Examples
Pick some words:
In[]:=
SeedRandom[332];words=RandomWord["CommonWords",12]
Out[]=
{reversal,reheat,charlatanism,fistulous,conformism,headdress,maidenhair,subculture,conservatism,laddie,crossbeam,negotiator}
Introduce random misspellings:
In[]:=
misspelled=MapThread[RandomChoice[{StringDrop[#1,{#2}],StringReplacePart[#,RandomChoice[CharacterRange["a","x"]],{#2,#2}]}]&,{words,RandomInteger[{1,StringLength[#]}]&/@words}]
Out[]=
{reveral,rehead,charlatasism,fistuous,conformsm,hsaddress,maienhair,subcultpre,consedvatism,ladie,vrossbeam,negotiatow}
Compare the corresponding Soundex codes for both word sets:
In[]:=
Tally[MapThread[[#1]==[#2]&,{words,misspelled}]]
Out[]=
{{False,6},{True,6}}
The Soundex code of each original word is found among the Soundex codes of the spelling corrections suggested for its misspelled form:
In[]:=
MapThread[MemberQ[/@SpellingCorrectionList[#2],[#1]]&,{words,misspelled}]
Out[]=
{True,True,True,True,True,True,True,True,True,True,True,True}