====== Statistical Text Processing ======

===== Interesting Subproblems =====

  * Chinese Word Segmentation
  * Chinese Spelling Checking
  * Bibliographic Attributes Extraction from Reference Strings

===== Papers =====

  * Corpus-based Statistical Screening for Phrase Identification. [[http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=79045|Details]]
  * Text Segmentation for Chinese Spell Checking. {{kb:cssc.pdf|}}
  * A Chinese OCR Spelling Check Approach Based on Statistical Language Models. {{kb:01401278.pdf|}}

===== Books =====

  * [[http://nlp.stanford.edu/~manning/|Chris Manning]] and [[http://www-csli.stanford.edu/~schuetze/|Hinrich Schütze]], //Foundations of Statistical Natural Language Processing//, MIT Press. Cambridge, MA: May 1999. [[http://nlp.stanford.edu/fsnlp/|Details]]

===== Useful Links =====

  * An annotated list of resources: [[http://nlp.stanford.edu/links/statnlp.html|Statistical natural language processing and corpus-based computational linguistics]]
  * NLP in Wikipedia: [[http://en.wikipedia.org/wiki/Natural_language_processing#Statistical_NLP|Details]]