This is an archived syllabus from 2019-2020
COMP60711 Data Engineering syllabus 2019-2020
Level 6
Credits: 15
Enrolled students: 75
Course leader: John Keane
Assessment methods
- 50% Written exam
- 50% Coursework
Semester | Event | Location | Day | Time | Group |
---|---|---|---|---|---|
Sem 1 w1 | Lecture | 2.19 | Thu | 09:00 - 17:00 | - |
Sem 1 w2-5 | Lecture | 2.19 | Thu | 09:00 - 13:00 | - |
Sem 1 w2-5 | Lab | 2.25 (A+B) | Thu | 13:00 - 17:00 | - |
- Data Engineering and Systems Governance
Overview
All application areas are witnessing the "data deluge", i.e. the ever-growing amount of digital data available as part of day-to-day activities in business, science, education, entertainment, etc. Indeed, "big data" has become part of the modern vernacular. Engineering, managing and analysing such data is key to the success of any organisation. In addition to working with huge volumes of data, current applications are also challenged by multi-modal data, including unstructured and semi-structured data, image and video data, and spatial and temporal data.
Aims
This module will examine the entire data life cycle, including data creation, modelling, acquisition, representation, use, maintenance, preservation and disposal. As the majority of data is stored in databases, the module will examine various database engineering approaches to support data management, including database design, data warehousing, maintenance and analytics. Data standards and data quality will be examined and the challenge of "big datasets" will be considered.
Syllabus
- An overview of the data life cycle
- Data engineering, modelling and design techniques
- Data storage and warehousing
- Data access and maintenance
- Data analytics applications and algorithms
- Engineering non-traditional data types
- Data standards and data quality
Feedback methods
Regular coursework, returned marked with feedback
Study hours
Employability skills
- Analytical skills
- Problem solving
- Research
- Written communication
Learning outcomes
On successful completion of this unit, a student will be able to:
- Explain and apply the constituent steps of the data life cycle.
- Describe data engineering techniques, and apply and document large-scale data engineering for a given task involving various multimodal data types.
- Describe, and take account of, technical, ethical and societal issues related to data engineering, storage, access and maintenance.
- Explain and apply the main principles of data analytics/algorithms, and explain their application to various domains.
- Describe relevant standards and best practice in data engineering, analyse shortcomings and identify possible strategies and approaches to overcome them.
Reading list
Title | Author | ISBN | Publisher | Year |
---|---|---|---|---|
Data mining: concepts and techniques | Han, Jiawei | 9780123814807; 9780123814791 | Elsevier | 2012 |
Data mining: practical machine learning tools and techniques | Witten, I. H. | 0128043571; 9780128043578 | Morgan Kaufmann Publisher | 2017 |
Artificial intelligence: a textbook | Aggarwal, Charu C. | 9783030723576; 3030723577 | Springer | 2021 |
Introduction to data mining | Tan, Pang-Ning | 9780273775324; 0273775324 | Pearson Education Limited | 2019 |
Measuring data quality for ongoing improvement: a data quality assessment framework | Sebastian-Coleman, Laura | 9780123977540; 0123977541 | Elsevier Science | 2013 |
The Philosophy of Information Quality | - | 9783319071213 | Springer International Publishing; Imprint Springer | 2014 |
Information quality in information fusion and decision making | - | 303003643X; 9783030036447; 3030036448; 9783030036430 | Springer | 2019 |
Additional notes
Course unit materials
Links to course unit teaching materials can be found on the School of Computer Science website for current students.