Enrolment on this course unit is limited to 70 students.

COMP38120 Documents, Services and Data on the Web syllabus 2017-2018

Level 3
Credits: 20
Enrolled students: 61

Course leader: Norman Paton



Requisites

  • Pre-Requisite (Compulsory): COMP18112
  • Pre-Requisite (Compulsory): COMP23111
  • Pre-Requisite (Compulsory): COMP28112

Additional requirements

  • Students who are not from the School of Computer Science must have permission from both Computer Science and their home School to enrol.

Assessment methods

  • 60% Written exam
  • 40% Coursework

Timetable

Semester               Event     Location  Day  Time           Group
Sem 1 w1-5,7-10        Workshop  Collab    Fri  14:00 - 16:00  -
Sem 1 w11-12           Lab       1.8       Fri  14:00 - 16:00  -
Sem 2 w19-20,26,30-32  Lab       1.8       Fri  13:00 - 15:00  -
Sem 2 w21-25,33        Workshop  Collab    Fri  13:00 - 15:00  -

Themes to which this unit belongs

  • Web and Distributed Systems

Overview

The web is a rich and rapidly evolving resource. In this course unit we will explore principles and techniques that underpin the web, and investigate how these are applied to provide webs of documents, services and data. In so doing, the concepts and standards associated with resource identification, access, description and scalability will be introduced, along with recurring functionalities such as publication and search.

This is a 20 credit course unit that runs for the entire year. Each semester involves workshops that introduce and provide experience with the key technical concepts, which are then brought together in an individual software project that will include the development of scalable search techniques for documents and data.

Aims

The aim of this course unit is to provide insight into, and experience of, techniques relating to documents, services and data on the web. Fundamental drivers, concepts and techniques for web documents, services and data are presented and discussed in workshops, and are then applied and evaluated in practice in the laboratory.

Syllabus

Enabling the web

  • The internet and the web.
  • Basic platform: URI, HTTP, DNS.
  • Recurring themes: browsing, searching, crawling, linking, annotating, dynamism, scale.
  • Web standards: HTTP, XML, RDF.

The document web

  • Document management.
  • Crawling and analysing the web.
  • Information retrieval: meeting information needs, indexing, ranking.
  • Web graph mining, including PageRank.
  • Enhancing search through analytics and annotation.

The services web

  • Services and the web.
  • Types of service: software, platform, infrastructure.
  • Cloud services: drivers and challenges.
  • Developing scalable cloud services, including map/reduce.

The web of data

  • Data on the web: the shallow and deep web.
  • Linked open data, and the linked data principles.
  • Linked data design.
  • Publishing linked data.
  • Consuming and aggregating linked data.

Feedback methods

The unit consists of workshops and laboratories; both formats are interactive and enable continuous formative feedback. Summative feedback will be provided on two assessed laboratory activities.

Study hours

  • Practical classes & workshops (46 hours)

Employability skills

  • Analytical skills
  • Innovation/creativity
  • Problem solving

Learning outcomes

Each entry below gives the programme outcomes, then the unit learning outcome, then the assessment methods.

A5, A7: Students will understand the key properties of web architectures and standards, and the key concepts and terminology of the domain.
  • Examination

A5: Students will understand how the key web properties have been applied to the document web, and will be able to distinguish between different techniques for search.
  • Examination
  • Lab assessment

A5: Students will understand the drivers for cloud services, and will have obtained experience with a representative cloud platform.
  • Lab assessment
  • Examination

A5: Students will understand how the key web properties have been applied to the web of data, and will be able to demonstrate how they underpin data publication and consumption.
  • Examination
  • Lab assessment

B1, B3, C2, C6, D4: Students will be able to apply techniques from services to problems from the data and document webs.
  • Lab assessment

B1, B3, C1, C6, D4: Students will be able to evaluate the techniques applied in terms of their effectiveness and scalability in practice.
  • Lab assessment

A7: Students will have an appreciation of the role of the information systems professional in web-scale processing.
  • Examination

Reading list

  • Heath, Tom and Christian Bizer. Linked data: evolving the web into a global data space. Morgan & Claypool, 2011. ISBN 9781608454303.
  • Lin, Jimmy and Chris Dyer. Data-intensive text processing with MapReduce. Morgan & Claypool, 2010. ISBN 9781608453429.
  • Miner, Donald and Adam Shook. MapReduce design patterns: building effective algorithms and analytics for Hadoop and other systems. O'Reilly Media, 2012. ISBN 9781449327170.
  • Williams, Bill. Economics of cloud computing: an overview for decision makers. Cisco Press, 2012. ISBN 9781587143069.
  • Manning, Christopher D. and Prabhakar Raghavan and Hinrich Schutze. Introduction to information retrieval. Cambridge University Press, 2008. ISBN 9780521865715.

Additional notes

Course unit materials

Links to course unit teaching materials can be found on the School of Computer Science website for current students.