COMP60711 Data Engineering syllabus 2021-2022

Level 6
Credits: 15
Enrolled students: 84

Course leader: Sandra Sampaio

Assessment methods

  • 100% Coursework
Sem 1 w1-5 ONLINE Lecture Thu 09:00 - 11:00
Sem 1 w1-5 DROP-IN Simon TH D Fri 09:00 - 11:00
Sem 1 w1-5 ONLINE Laboratory Thu 12:00 - 14:00
Sem 1 w1-5 DISCUSSION Schuster BLACKETT TH Thu 15:00 - 16:00
Sem 1 w2-5 DROP-IN Schuster BLACKETT TH Tue 16:00 - 17:00
Themes to which this unit belongs
  • Data Engineering and Systems Governance


This course unit detail provides the framework for delivery in 20/21 and may be subject to change due to any additional Covid-19 impact. Current students should see Blackboard/course unit related emails for any further updates.

All application areas are witnessing the "data deluge", i.e. the ever-growing amount of digital data available as part of day-to-day activities in business, science, education, entertainment, etc. Indeed, "Big data" has become part of the modern vernacular. Engineering, managing and analysing such data is key to the success of all organisations. In addition to the need to work with huge volumes of data, current applications are also challenged by multi-modal data, including unstructured and semi-structured data, image and video data, spatial and temporal data, etc.


This module will examine the entire data life cycle, including data creation, modelling, acquisition, representation, use, maintenance, preservation and disposal. As the majority of data is stored in databases, the module will examine various database engineering approaches to support data management, including database design, data warehousing, maintenance and analytics. Data standards and data quality will be examined and the challenge of "big datasets" will be considered.


  • An overview of the data life cycle
  • Data engineering, modelling and design techniques
  • Data storage and warehousing
  • Data access and maintenance
  • Data analytics applications and algorithms
  • Engineering non-traditional data types
  • Data standards and data quality

Feedback methods

Regular coursework, returned marked with feedback

Employability skills

  • Analytical skills
  • Problem solving
  • Research
  • Written communication

Learning outcomes

On successful completion of this unit, a student will be able to:

  • Explain and apply the constituent steps of the data life cycle
  • Describe data engineering techniques, and apply and document large-scale data engineering for a given task involving various multimodal data types
  • Describe, and apply an understanding of, technical, ethical and societal issues related to data engineering, storage, access and maintenance
  • Explain and apply the main principles of data analytics and algorithms, and explain their application to various domains
  • Describe relevant standards and best practice in data engineering, analyse shortcomings and identify possible strategies and approaches to overcome them

Reading list

Data mining: concepts and techniques. Han, Jiawei. Elsevier, 2012. ISBN 9780123814807; 9780123814791.
Data mining: practical machine learning tools and techniques. Witten, I. H. Morgan Kaufmann, 2017. ISBN 0128043571; 9780128043578.
Artificial intelligence: a textbook. Aggarwal, Charu C. Springer, 2021. ISBN 9783030723576; 3030723577.
Introduction to data mining. Tan, Pang-Ning. Pearson Education Limited, 2019. ISBN 9780273775324; 0273775324.
Measuring data quality for ongoing improvement: a data quality assessment framework. Sebastian-Coleman, Laura. Elsevier Science, 2013. ISBN 9780123977540; 0123977541.
The Philosophy of Information Quality. Springer International Publishing, 2014. ISBN 9783319071213.
Information quality in information fusion and decision making. Springer, 2019. ISBN 303003643X; 9783030036447; 3030036448; 9783030036430.

Additional notes

Course unit materials

Links to course unit teaching materials can be found on the Department of Computer Science website for current students.