COMP30421: Natural Language Engineering (2007-2008)
The course unit aims to teach the principles and practice of natural language engineering:
To demonstrate how the essential components of NLE systems are built and modified.
To introduce the principal applications of NLE, including machine translation, information retrieval, information extraction, and the semantic web.
To explain the major challenges in processing large-scale, real-world natural language.
To align issues involved in NLE with related issues in software engineering and information systems.
To give students an understanding of what, realistically, NLE can and cannot hope to achieve, why, and why it matters.
A student completing this course unit should:
Understand how to build a large-scale NLE system. (A2, A5, B1)
Know something about the principal practical applications of natural language engineering. (A5)
Understand the difficulties in NLE, and be able to predict, given an unfamiliar text, what problems it may cause. (A2, B3)
Be able to make an informed decision, given a previously unseen practical problem, as to which NLE techniques are likely to be worthwhile. (B3)
Evaluate the performance of NLE systems. (A2, A5, B3, C4, D4)
Be able to meet rigid short-term deadlines. (D5)
Assessment of Learning outcomes
Learning outcomes 1, 2 and 3 are assessed by examination and coursework, learning outcomes 2 and 4 by examination, learning outcome 5 by examination and coursework, and learning outcome 6 by coursework.
Contribution to Programme Learning Outcomes
A2, A5, B1, B3, C4, D4, D5.
Introduction, motivation (1)
Probabilistic grammars, N-grams, Hidden Markov Models (3)
Overview, POS tagging, Stemming and morphology, Parsing, Word Sense Disambiguation (5)
Corpora and coding systems, Lexicons and ontologies, Standards (3)
Machine translation, Spoken Language Dialogue Systems, Information Retrieval, Information Extraction, Intelligent Tutoring Systems (5)
Core Text
Title: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd edition)
Author: Jurafsky, Daniel and James H. Martin
Publisher: Pearson International
This is the course textbook.
Supplementary Text
Title: Foundations of Statistical Natural Language Processing
Author: Manning, Christopher D. and Hinrich Schütze
Publisher: MIT Press
This is the secondary text for the course.