
COMP30421: Natural Language Engineering (2008-2009)

This is an archived syllabus from 2008-2009

Natural Language Engineering
Level: 3
Credit rating: 10
Pre-requisites: No Pre-requisites
Co-requisites: No Co-requisites
Duration: 11 weeks
Lectures: 11 x 2 hours
Examples classes: none
Lecturers: Sophia Ananiadou

Timetable: Sem 1, weeks 1-5 and 7-12, Lecture, 1.3, Wed 11:00 - 13:00
Assessment Breakdown
Exam: 80%
Coursework: 20%
Lab: 0%
Degrees for which this unit is optional
  • Artificial Intelligence BSc (Hons)


Aims

The course unit aims to teach the principles and practice of natural language engineering (NLE) and text mining:

To demonstrate how the essential components of NLE systems are built and modified.
To introduce the principal applications of NLE, including machine translation, information retrieval and information extraction.
To explain the major challenges in processing large-scale, real-world natural language (text mining).
To give students an understanding of what, realistically, NLE and text mining can and cannot hope to achieve, why, and why it matters.
To link theory in NLE with practical applications in text mining.

For more details of this course unit, including slides, see the course unit webpage.

Learning Outcomes

A student completing this course unit should:

Understand how to build a large-scale NLE system. (A2, A5, B1)
Know something about the principal practical applications of natural language engineering and text mining. (A5)
Understand the difficulties in NLE, and be able to predict, given an unfamiliar text, what problems it may cause. (A2, B3)
Be able to make an informed decision, given a previously unseen practical problem, as to which NLE techniques are likely to be worthwhile. (B3)
Evaluate the performance of NLE systems. (A2, A5, B3, C4, D4)
Be able to meet rigid short-term deadlines. (D5)

Assessment of Learning outcomes

Learning outcomes 1, 2 and 3 are assessed by examination and coursework; learning outcomes 2 and 4 by examination; learning outcome 5 by examination and coursework; and learning outcome 6 by coursework.

Contribution to Programme Learning Outcomes

A2, A5, B1, B3, C4, D4, D5.


Syllabus

Introduction, motivation (1)


Mathematical foundations, probabilistic grammars, Hidden Markov Models (3)


Overview, linguistic foundations, POS tagging, tokenisation, parsing, word sense disambiguation (5)


Machine translation, information retrieval (document classification), information extraction, opinion mining

Reading List

Core Text
Title: Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition (2nd edition)
Author: Jurafsky, Daniel and James H. Martin
ISBN: 9780135041963
Publisher: Pearson International
Year: 2009

Core Text
Title: Foundations of Statistical Natural Language Processing
Author: Manning, Christopher D. and Hinrich Schütze
ISBN: 0262133601
Publisher: MIT Press
Year: 2003

Supplementary Text
Title: Text mining for biology and biomedicine
Author: Ananiadou, Sophia and John McNaught (eds.)
ISBN: 158053984X
Publisher: Artech House
Year: 2006

Supplementary Text
Title: Text mining handbook: advanced approaches in analyzing unstructured data
Author: Feldman, Ronen and James Sanger
ISBN: 9780521836579
Publisher: Cambridge University Press
Year: 2008