Aims & Objectives : To develop a good understanding of all aspects of Text Processing and to provide a solid grounding in selected topics.
Course Content :
String Processing – Efficient techniques for string processing. String searching algorithms – Knuth-Morris-Pratt, Boyer-Moore and Rabin-Karp algorithms. Processing binary strings. Incremental search techniques.
Pattern Matching – Regular expressions, regular grammars, deterministic and non-deterministic finite state machines for pattern matching.
Corpus Analysis – Corpus creation. Storage and indexing techniques. Morpheme, word and sentence level statistics. Zipf’s law. Corpus indexing techniques. Word and sentence level n-grams. Analysis for Hidden Markov Models. Text tagging.
Computational Techniques in Lexicography – From corpus to lexicon. Lexical knowledge bases – Electronic dictionaries and thesauri. Efficient storage and retrieval – B-Trees, tries, and hashing. Dictionary analysis tools. Internal consistency and validation techniques. Dictionary update and maintenance tools.
Word Processing – Text layout – Justification, placement of figures, equations, etc. Paragraph and page formatting. Table of Contents, Index, and Bibliography creation. Footnotes and cross references.
Spell Checking, Grammar Checking and Style Checking – Statistical and linguistic approaches to better writing tools. Isolated and context-dependent spell and grammar checking tools. Introduction to Grammars and Parsers. Active Chart Parsing. Acceptance-based, Relaxation-based and Expectation-based techniques.
Multi-Script and Multi-lingual Text Processing – Scripts and Fonts – Multi-Script processing and GIST technology. Fonts and font libraries. Bilingual and Multi-lingual dictionaries, thesauri and word processors.
Cryptology – Techniques for text encryption and decryption.
Text Compression – Compression for efficient storage and transmission of textual data.
Applications – Natural Language Processing, Speech Recognition, Optical Character Recognition, Information Retrieval and Office Automation.
Recommended Books :
Gerard Salton, “Automatic Text Processing”, Addison-Wesley, 1989.
Bran Boguraev, Ted Briscoe (Eds), “Computational Lexicography for Natural Language Processing”, Longman, 1989.
Robert Sedgewick, “Algorithms in C”, Addison-Wesley, 1990.
J.E. Hopcroft and J.D. Ullman, “Introduction to Automata Theory, Languages and Computation”, Narosa, 1992.
A V Aho, Ravi Sethi, J D Ullman, “Compilers: Principles, Techniques and Tools”, Addison-Wesley, 1986.
S.N. Srihari, “Computer Text Recognition and Error Correction”, IEEE Computer Society Press, 1984.
Comment for University of Hyderabad (UOH) Syllabus of Master of Computer Applications (MCA), Semester IV - CS-700-level Advanced Course, CS-716 Text Processing