
COMP33111: Data Integration and Analysis (2011-2012)

This is an archived syllabus from 2011-2012

Data Integration and Analysis
Level: 3
Credit rating: 10
Pre-requisites: COMP23111
Co-requisites: No Co-requisites
Duration: 11 weeks
Lectures: 22
Lecturers: Goran Nenadic

Timetable
Semester     Event    Location  Day  Time           Group
Sem 1        Lab      3rdLab    Mon  12:00 - 13:00  -
Sem 1        Lecture  LF15      Mon  13:00 - 15:00  -
Sem 1        Lab      G23       Mon  15:00 - 16:00  -
Sem 1 w5,10  Lab      3rdLab    Mon  13:00 - 15:00  wk5,10
Assessment Breakdown
Exam: 85%
Coursework: 0%
Lab: 15%

Themes to which this unit belongs
  • Enterprise Information Systems

Introduction

All application areas are witnessing the "data deluge", i.e. the ever-growing amount of digital data generated by day-to-day activities in business, science, education, entertainment, etc. Making sense of this data through integration and analysis is key to the success of any organisation. In addition to handling huge volumes of data, current applications are also challenged by multi-modal data, including unstructured and semi-structured data, text, image and video data, and spatial and temporal data.

Aims

The aim of the course is to give students an awareness of the problems and opportunities associated with data integration, including the analysis of the data once integrated. Previous database courses focused on the infrastructure for managing and querying data, database design and database programming. This course unit focuses principally on making the most of data within an organisation through:

  • Data integration: getting the data into a form that supports and facilitates aggregation, exploration and mining.
  • Data analysis: techniques for learning new lessons from the data.

Programme outcome | Unit learning outcomes | Assessment
A2 A5 | Understand the main issues in data integration and analysis, and be aware of potential approaches. | Examination, Lab assessment
A2 A5 B3 | Understand the key issues, advantages and problems in data integration and warehousing, including various architectures, models and designs. | Lab assessment, Examination
A2 A5 B3 | Understand data analysis approaches, including OLAP and data mining, and be familiar with the data analysis process, its motivation, applicability, advantages and issues. | Examination, Lab assessment
A2 A5 B3 | Be familiar with the principles and techniques for different data mining tasks, including data classification, clustering and association analysis. | Lab assessment, Examination
A2 A5 B3 | Understand the main issues in multi-modal data, and be familiar with techniques for storage, querying and retrieval of multimedia data. | Examination, Lab assessment
B3 | Be able to identify application areas, opportunities and challenges in data integration and analysis tasks. | Examination, Lab assessment

Syllabus

Introduction to Data Integration and Analysis

A review of current technologies, the issues raised by them, and outstanding problems. (1 lecture)

Data warehousing

Data models and architectures for warehousing; ETL (extract, transform, load) process; data quality, integration and meta-data generation. (3 lectures)

Online analytical processing (OLAP)

Introduction to OLAP; OLAP operations and SQL extensions; case studies; trends and open issues. (4 lectures)

Data mining

Rationale, aims and approaches; KDD (knowledge discovery in databases); techniques and algorithms for association analysis, data classification and clustering; evaluation of data mining; application areas and case studies; major open issues. (10 lectures)

Mining and integration of multi-modal data

Nature of multi-modal and multimedia data; content-based querying and retrieval; meta-data generation, ontologies, semantic annotation and integration; querying and retrieval from textual databases. (4 lectures)

Reading List

Supplementary Text
Title: Fundamentals of database systems (5th edition)
Author: Elmasri, Ramez and Shamkant B. Navathe
ISBN: 032141506X
Publisher: Pearson
Edition: 5th
Year: 2007


Supplementary Text
Title: Database systems: a practical approach to design, implementation, and management (4th edition)
Author: Connolly, Thomas and Carolyn Begg
ISBN: 0321210255
Publisher: Addison-Wesley
Edition: 4th
Year: 2005