COMP30421: Natural Language Engineering (2008-2009)
The course unit aims to teach the principles and practice of natural language engineering and text mining:
To demonstrate how the essential components of NLE systems are built and modified.
To introduce the principal applications of NLE, including machine translation, information retrieval, information extraction.
To explain the major challenges in processing large-scale, real-world natural language (text mining).
To give students an understanding of what, realistically, NLE and text mining can and cannot hope to achieve, why, and why it matters.
To link theory in NLE with practical applications in text mining.
For more details of this course unit, including slides, see the course unit webpage.
A student completing this course unit should:
Understand how to build a large-scale NLE system. (A2, A5, B1)
Know something about the principal practical applications of natural language engineering and text mining. (A5)
Understand the difficulties in NLE, and be able to predict, given an unfamiliar text, what problems it may cause. (A2, B3)
Be able to make an informed decision, given a previously unseen practical problem, as to which NLE techniques are likely to be worthwhile. (B3)
Evaluate the performance of NLE systems. (A2, A5, B3, C4, D4)
Be able to meet rigid short-term deadlines. (D5)
Assessment of Learning outcomes
Learning outcomes 1, 2 and 3 are assessed by examination and coursework, learning outcomes 2 and 4 by examination, learning outcome 5 by examination and coursework, and learning outcome 6 by coursework.
Contribution to Programme Learning Outcomes
A2, A5, B1, B3, C4, D4, D5.
Introduction, motivation (1)
Mathematical foundations, probabilistic grammars, hidden Markov models (3)
Overview, linguistic foundations, POS tagging, tokenisation, parsing, word sense disambiguation (5)
Machine translation, information retrieval (document classification), information extraction, opinion mining
Core Text
Title: Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition (2nd edition)
Author: Jurafsky, Daniel and James H. Martin
Publisher: Pearson International
Core Text
Title: Foundations of Statistical Natural Language Processing
Author: Manning, Christopher D. and Hinrich Schütze
Publisher: MIT Press
Supplementary Text
Title: Text mining for biology and biomedicine
Author: Ananiadou, Sophia and John McNaught (eds.)
Publisher: Artech House
Supplementary Text
Title: Text mining handbook: advanced approaches in analyzing unstructured data
Author: Feldman, Ronen and James Sanger