
This is an archived syllabus from 2013-2014

COMP60411 Semi-Structured Data and the Web syllabus 2013-2014

COMP60411 Semi-Structured Data and the Web

Level 6
Credits: 15
Enrolled students: 51

Course leader: Uli Sattler

Assessment methods

  • 50% Written exam
  • 50% Coursework
Sem 1 P1 Lecture 2.19 Fri 09:00 - 09:00 -
Sem 1 P1 Lecture 2.19 Fri 14:00 - 14:00 -
Sem 1 P1 Lab 2.25abcd Fri 15:00 - 15:00 -
Themes to which this unit belongs
  • Advanced Web Technologies
  • Managing Data


This course unit covers various formalisms (predominantly XML) and applications for semi-structured data. Semi-structured data focuses on describing and querying data that comes in a format less tightly structured than that found in relational databases. Such data is dominant on the Web, from HTML pages and weblog feeds to SOAP messages and vector graphics.

See the pitch talk for more details.
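
As a rough illustration of "less tightly structured than relational", the sketch below parses a small XML document in which records of the same kind carry different fields, something a fixed relational schema would not allow without nulls or extra tables. The document, element names, and the helper `emails_by_person` are made up for this example and are not taken from the course materials; it uses only Python's standard `xml.etree.ElementTree` module and its limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two <person> records with different shapes.
# One has two <email> children and no <phone>; the other has the reverse.
DOC = """
<people>
  <person id="p1">
    <name>Ada</name>
    <email>ada@example.org</email>
    <email>ada@example.com</email>
  </person>
  <person id="p2">
    <name>Bob</name>
    <phone>555-0100</phone>
  </person>
</people>
"""

root = ET.fromstring(DOC)


def emails_by_person(people):
    """Map each person's id to the list of their email addresses (possibly empty)."""
    return {
        person.get("id"): [e.text for e in person.findall("email")]
        for person in people.findall("person")
    }


# XPath-style predicate: select only people that have at least one <phone> child.
with_phone = [p.findtext("name") for p in root.findall("./person[phone]")]

print(emails_by_person(root))
print(with_phone)
```

Optional and repeated fields are handled at query time rather than being fixed by the schema in advance, which is the trade-off that the schema and query languages in this unit address.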


This course unit aims to give students a good overview of the ideas and the techniques which are behind the description and query mechanisms for semi-structured data as well as various design and modelling issues that arise, especially in the context of the Web. We discuss the basic concepts of semi-structured data and their representation as well as three major families of formalisms for semi-structured data:

  • XML, including schema languages for XML data (DTD and XML Schema), processing and manipulating XML data (XPath, XQuery), and some theoretical aspects of XML data processing
  • HTML (including HTML 5, CSS, and AJAX)
  • Knowledge representation based languages such as RDF and OWL

Laboratory sessions will ground the abstract notions on practical cases and tools.


  • Introduction: Semi-structured data.
  • XML: core concepts
  • DTDs, a simple schema language for XML documents
  • XPath, a navigation language for XML documents
  • XML namespace: a concept ignored so far
  • XSLT, a transformation language for XML documents
  • DOM and SAX, programmatic APIs for manipulating XML documents
  • XML Schema, a more expressive schema language for XML documents
  • XQuery, a query language for XML documents
  • HTML 5: text/html vs. application/xhtml+xml
  • Validation of HTML 5 (including use of Schematron)
  • CSS and the DOM: Web Data vs. Web Documents vs. Web Applications
  • RDF and Linked Data
  • OWL: How inference can help data

Feedback methods

Weekly coursework will be collected via Blackboard, and feedback is provided through the same mechanism.

Study hours

  • Assessment written exam (2 hours)
  • Lectures (20 hours)
  • Practical classes & workshops (15 hours)

Employability skills

  • Analytical skills
  • Problem solving
  • Research
  • Written communication

Learning outcomes

On successful completion of this unit, a student will be able to:

Learning outcomes are detailed on the COMP60411 course unit syllabus page on the School of Computer Science's website for current students.

Reading list

Beaulieu, Alan. Learning SQL [electronic resource]. O'Reilly Media, 2009. ISBN 9780596555580 (electronic bk.); 059655558X (electronic bk.); 9780596522360; 0596522363.
Ray, Erik T. Learning XML [electronic resource]. O'Reilly, 2003. ISBN 9780596516826; 0596516827; 9781449378875; 1449378870; 0596004206; 9780596004200.
Means, W. Scott; Harold, Elliotte Rusty. XML in a Nutshell. O'Reilly Media, Inc., 2004-09-23. ISBN 9780596007645.
DuCharme, Bob. Learning SPARQL: querying and updating with SPARQL 1.1. O'Reilly, 2013-07-03. ISBN 9781449371432.

Additional notes

Course unit materials

Links to course unit teaching materials can be found on the School of Computer Science website for current students.