simohayha posted on 2013-1-27 06:13:37

A spelling corrector written in Python by Peter Norvig

The article is here:
http://www.norvig.com/spell-correct.html
import re, string, collections

def words(text): return re.findall('[a-z]+', text.lower())

def train(features):
    # Every word starts at a count of 1, so unseen words are never treated as impossible.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

NWORDS = train(words(open('Documents/holmes.txt').read()))

def edits1(word):
    # All strings at edit distance 1 from word.
    n = len(word)
    return set([word[0:i]+word[i+1:] for i in range(n)] + # deletion
               [word[0:i]+word[i+1]+word[i]+word[i+2:] for i in range(n-1)] + # transposition
               [word[0:i]+c+word[i+1:] for i in range(n) for c in string.ascii_lowercase] + # alteration
               [word[0:i]+c+word[i:] for i in range(n+1) for c in string.ascii_lowercase]) # insertion

def known_edits2(word):
    # Known words at edit distance 2.
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

def known(words): return set(w for w in words if w in NWORDS)

def correct(word):
    # Prefer the word itself, then distance-1 edits, then distance-2 edits; pick the most frequent.
    return max(known([word]) or known(edits1(word)) or known_edits2(word) or [word],
               key=lambda w: NWORDS[w])
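
To try the idea without a corpus file, here is a minimal Python 3 sketch. It is not Norvig's exact code. The sample text, `COUNTS` and `edit_candidates` are made up for illustration, a plain `Counter` replaces the smoothed `defaultdict`, and the distance-2 step is left out to keep it short. It also shows how many candidates the four edit types produce for a word of length n: n deletions, n-1 transpositions, 26n alterations and 26(n+1) insertions, 54n+25 in total before duplicates are removed.

```python
import re
import string
from collections import Counter

# Hypothetical toy corpus standing in for Documents/holmes.txt.
SAMPLE_TEXT = "the quick brown fox jumps over the lazy dog and the cat"


def words(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# Word frequencies; unlike the original, unseen words get a count of 0.
COUNTS = Counter(words(SAMPLE_TEXT))


def edit_candidates(word):
    """Return all distance-1 edits as a list, duplicates kept."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return deletes + transposes + replaces + inserts


def known(candidates):
    """Keep only the candidates that occur in the corpus."""
    return {w for w in candidates if w in COUNTS}


def correct(word):
    """Pick the most frequent known candidate, falling back to the word itself."""
    candidates = known([word]) or known(edit_candidates(word)) or [word]
    return max(candidates, key=lambda w: COUNTS[w])


print(len(edit_candidates("the")))  # 187, i.e. 54 * 3 + 25
print(correct("teh"))               # the
```

With a real corpus you would bring back the distance-2 step, since many typos are two edits away from the intended word.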

A master really is a master: he wrote these few lines of code on a plane.