COMP34412 Natural Language Systems syllabus 2021-2022
The course unit will cover key linguistic and algorithmic foundations of natural language processing. The topics will include lexical processing, word sense disambiguation, information extraction, and speech recognition and speech synthesis. It will consider both rule-based and machine learning methods, and key applications such as document summarisation or sentiment analysis.
This course unit detail provides the framework for delivery in 20/21 and may be subject to change due to any additional Covid-19 impact. Please see Blackboard / course unit related emails for any further updates.
Enabling computers to use 'natural language' (the kind of language that people use to communicate with one another) is becoming increasingly important. It allows people to communicate with computers, and it allows computers to access the enormous amount of material stored as natural language text on the web or in document repositories. This course unit provides an introduction to natural language processing (NLP), one of the key areas of artificial intelligence. It aims to introduce the essential components of NLP and to explain the major challenges in processing large-scale, real-world natural language in both its written and spoken forms.
· Introduction, motivation, review of NLP principles
· Essential steps for NLP algorithms
o Part-of-speech tagging: probabilistic tagging, transformation-based learning
o Parsing: chunking, shallow parsing, statistical parsing
o Lexical semantics: lexical resources, word sense disambiguation algorithms
· Evaluation of natural language systems
o Crowdsourcing and inter-annotator agreement
· Information retrieval and extraction
o Document matching
o Named-entity recognition
o Template-filling, free text question answering systems
o Summarisation algorithms
· Natural language generation
o Surface realisation
o Discourse planning
· Machine translation
o Transfer-based approaches: the MT pyramid, transfer rules
o Statistical MT, memory-based MT
· Introduction to spoken language systems
o The nature of speech
o Speech synthesis
o Speech recognition
Lectures, practicals, surgeries, coursework
Unassessed formative homework is set weekly; each lecture then starts with a 10-minute feedback session in which students may review the homework from the previous week. Assessed coursework is due in weeks 6 and 11, with feedback in weeks 7 and 12 respectively.
- Lectures (22 hours)
- Analytical skills
- Problem solving
On successful completion of this unit, a student will be able to:
- Understand and explain the major challenges in processing large-scale, real-world natural language data.
- Demonstrate how the essential components of NLP systems are built and modified.
- Understand the opportunities and challenges of knowledge- and data-driven NLP methods.
- Explain the principal applications of NLP, including information extraction, question-answering, document summarisation, and spoken language access to software services.
- Understand the issues involved in evaluating NLP systems.
- Bird, Steven. Natural language processing with Python. O'Reilly, c2009. ISBN 9780596516499.
- Jurafsky, Dan. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition. Pearson/Prentice Hall, c2009. ISBN 9780135041963.
Course unit materials
Links to course unit teaching materials can be found on the School of Computer Science website for current students.