COMP13212 Data Science syllabus 2019-2020

Level 1
Credits: 10
Enrolled students: 265

Assessment methods

• 80% Written exam
• 20% Practical skills assessment

Timetable

Semester    Event    Location             Day  Time           Group
Sem 2       Lecture  1.1                  Tue  10:00 - 11:00  -
Sem 2       Lecture  1.1                  Thu  12:00 - 13:00  -
Sem 2 B     Lab      G23                  Fri  12:00 - 14:00  X
Sem 2 B     Lab      LF31                 Fri  15:00 - 17:00  M+W
Sem 2 B     Lab      LF31                 Thu  16:00 - 18:00  Y
Sem 2 B     Lab      LF31                 Mon  16:00 - 18:00  Z
Sem 2 w20   Lecture  Crawford House TH 1  Fri  10:00 - 11:00  -

Overview

This course unit has two objectives. The first is to introduce the student to a range of fundamental, non-trivial algorithms, and to the techniques required to analyse their correctness and running time.

The second is to present a conceptual framework for analysing the intrinsic complexity of computational problems, which abstracts away from details of particular algorithms.

Aims

To give students awareness of the elements of the “Data Science Process” (many, but not all, of which will be studied in detail in this course).

To give students practice in using Python tools for data processing and analysis, including numpy, scipy.stats, pandas, and Jupyter notebooks.

To give students understanding and practice in exploration and visualization of data.

To give students understanding of uncertainty in data, including how to measure it, visualise it, and model it.

To give students an introduction to statistical thinking and Bayesian reasoning.

To give students an introduction to ethical considerations in analysing data and drawing responsible conclusions.

To give students a brief introduction to machine learning through naive Bayes classification, linear regression, and logistic regression.

Teaching methods

Lectures and coursework reported via Jupyter notebooks in Python.

Study hours

• Lectures (22 hours)
• Practical classes & workshops (12 hours)

Learning outcomes

On successful completion of this unit, a student will be able to:

• Demonstrate awareness of the “Data Science Process” by describing qualitatively how it would apply in a given situation.
• Demonstrate awareness of the need for data cleaning, both descriptively and by doing elementary data cleaning and preparation in the laboratory.
• Demonstrate ability to measure and express uncertainty in a set of data and in quantities derived from that data.
• Demonstrate ability to choose and build appropriate models of different datasets.
• Demonstrate ability to evaluate the quality of a model of a dataset.
• Demonstrate the ability to compare different models of a dataset, and models of different datasets, in order to draw statistically sound conclusions about hypotheses or claims from the data.
• Demonstrate ability to use Python tools to: read and write datasets to and from files; produce descriptive statistics and draw conclusions from them; produce graphical visualisations and draw conclusions from them; perform basic statistical tests, including a test for the difference between means; and perform a simple machine learning experiment by building an email spam filter using a naive Bayes classifier.
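
As a rough illustration of the last outcome, the sketch below builds a toy naive Bayes spam filter using only the Python standard library. It is not the course's laboratory exercise. The function names (`train_naive_bayes`, `classify`), the whitespace tokenisation, the add-one (Laplace) smoothing, and the four training messages are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(messages):
    """Count words per label from (text, label) pairs.

    Hypothetical helper for illustration; tokenisation is a plain
    lowercase whitespace split.
    """
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of messages
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior score."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Log prior: fraction of training messages with this label.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing avoids log(0) for words
            # seen under one label but not the other.
            numerator = word_counts[label][word] + 1
            denominator = total_words + len(vocabulary)
            score += math.log(numerator / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training data, purely for illustration.
training = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting at noon", "ham"),
    ("lunch meeting tomorrow", "ham"),
]
model = train_naive_bayes(training)
print(classify("win a free prize", model))
print(classify("meeting tomorrow at noon", model))
```

Working in log space keeps the product of many small word probabilities from underflowing. A realistic experiment would use a much larger labelled corpus and hold out part of it to evaluate the classifier.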