COMP37332: Data Integration and Analysis (2008-2009)

This is an archived syllabus from 2008-2009

Data Integration and Analysis
Level: 3
Credit rating: 10
Pre-requisites: COMP20312 or INFO20015
Co-requisites: None
Duration: 11 weeks
Lectures: 22
Lecturers: Goran Nenadic, Sandra Sampaio
Timetable
Semester | Event | Location | Day | Time | Group
Sem 2 w19-26,30-33 | Lecture | LF15 | Tue | 12:00 - 14:00 | -
Assessment Breakdown
Exam: 85%
Coursework: 0%
Lab: 15%
Degrees for which this unit is optional
  • Artificial Intelligence BSc (Hons)

Aims

The aim of the course is to give students an awareness of the problems and opportunities associated with data integration, including the analysis of the data once integrated.

Learning Outcomes

A student completing this course unit should:

Understand the main issues in data integration and analysis, and be aware of potential approaches (A).
Be aware of the principal challenges that have to be addressed in the development of distributed database systems (A).
Understand the key issues, advantages and problems in data integration and warehousing, including various architectures, models and designs (A, B).
Understand data analysis approaches, including OLAP and data mining, and be familiar with the data analysis process, its motivation, applicability, advantages and issues (A, B).
Be familiar with the principles and techniques for different data mining tasks, including data classification, clustering and association analysis (A, B).
Be able to identify application areas, opportunities and challenges in data integration and analysis tasks (B).

Assessment of Learning outcomes

All learning outcomes are assessed by examination and during lab sessions.

Examination: 85% (3 questions from 5).
Laboratory: 15% (5 sessions, practising the tools and methods discussed in the lectures).

Contribution to Programme Learning Outcomes

A2, A5, B3.

Syllabus

Introduction to Data Integration and Analysis

A review of current technologies, the issues raised by them, and outstanding problems. (1)

Distributed databases

Rationale; transparency; architectures; top-down and bottom-up distributed database design; Oracle as a case study. (5)

Data warehousing

Data models and architectures for warehousing; ETL (extract, transform, load) process; data integration and meta-data generation. (3)

Online analytical processing (OLAP)

Introduction to OLAP; OLAP operations and SQL extensions; case studies; trends and open issues. (4)

Data mining

Rationale, aims and approaches, KDD (knowledge discovery in databases); techniques and algorithms for association analysis, data classification and clustering; evaluation of data mining; application areas and case studies; major open issues. (9)

Reading List

Core Text
Title: Principles of distributed database systems (3rd edition)
Author: Özsu, M. Tamer and Patrick Valduriez
ISBN: 9781441988331
Publisher: Springer
Edition: 3rd
Year: 2011


Supplementary Text
Title: Database systems: a practical approach to design, implementation, and management (4th edition)
Author: Connolly, Thomas and Carolyn Begg
ISBN: 0321210255
Publisher: Addison-Wesley
Edition: 4th
Year: 2005


Supplementary Text
Title: Fundamentals of database systems (5th edition)
Author: Elmasri, Ramez and Shamkant B. Navathe
ISBN: 032141506X
Publisher: Pearson
Edition: 5th
Year: 2007