A spelling corrector written in Python by Peter Norvig
The article is here: http://www.norvig.com/spell-correct.html
import re, string, collections

def words(text):
    return re.findall('[a-z]+', text.lower())

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

NWORDS = train(words(open('Documents/holmes.txt').read()))

def edits1(word):
    n = len(word)
    return set([word[0:i]+word[i+1:] for i in range(n)] + ## deletion
               [word[0:i]+word[i+1]+word[i]+word[i+2:] for i in range(n-1)] + ## transposition
               [word[0:i]+c+word[i+1:] for i in range(n) for c in string.ascii_lowercase] + ## alteration
               [word[0:i]+c+word[i:] for i in range(n+1) for c in string.ascii_lowercase]) ## insertion

def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

def known(words):
    return set(w for w in words if w in NWORDS)

def correct(word):
    return max(known([word]) or known(edits1(word)) or known_edits2(word) or [word], key=lambda w: NWORDS[w])
A master is a master: he wrote these few lines of code on a plane.
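To try the idea without the holmes.txt file, here is a minimal sketch that follows the same approach on a tiny in-memory corpus. SAMPLE_TEXT and the two-argument correct(word, model) are assumptions made for this illustration and are not part of the original code, which trains on a real text file and uses a global NWORDS.

```python
import collections
import re
import string

# Hypothetical stand-in corpus; the post trains on Documents/holmes.txt instead.
SAMPLE_TEXT = "spelling spelling spelling selling the the word words"

def train(text):
    # Count every lowercase word; unseen words default to 1, as in the post.
    model = collections.defaultdict(lambda: 1)
    for w in re.findall('[a-z]+', text.lower()):
        model[w] += 1
    return model

def edits1(word):
    # All strings one edit away from word.
    letters = string.ascii_lowercase
    n = len(word)
    return set([word[:i] + word[i+1:] for i in range(n)] +                            # deletion
               [word[:i] + word[i+1] + word[i] + word[i+2:] for i in range(n-1)] +    # transposition
               [word[:i] + c + word[i+1:] for i in range(n) for c in letters] +       # alteration
               [word[:i] + c + word[i:] for i in range(n+1) for c in letters])        # insertion

def correct(word, model):
    # Prefer the word itself, then known words 1 edit away, then 2 edits away.
    def known(candidates):
        return set(w for w in candidates if w in model)
    candidates = (known([word]) or known(edits1(word)) or
                  known(e2 for e1 in edits1(word) for e2 in edits1(e1)) or [word])
    # Among the candidates, pick the one seen most often in the corpus.
    return max(candidates, key=lambda w: model[w])

model = train(SAMPLE_TEXT)
print(correct('speling', model))  # spelling
print(correct('wrod', model))     # word
print(correct('the', model))      # the
```

With such a small corpus the corrections are only as good as the handful of words it contains; a large text file gives much better word counts.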