This is an archived syllabus from 2013-2014

COMP60411 Semi-Structured Data and the Web syllabus 2013-2014

Level 6
Credits: 15
Enrolled students: 51

Course leader: Uli Sattler

Assessment methods

  • 50% Written exam
  • 50% Coursework

Sem 1 P1 Lecture 2.19 Fri 09:00 - 09:00 -
Sem 1 P1 Lecture 2.19 Fri 14:00 - 14:00 -
Sem 1 P1 Lab 2.25abcd Fri 15:00 - 15:00 -

Themes to which this unit belongs

  • Advanced Web Technologies
  • Managing Data


This course unit covers various formalisms (predominantly XML) and applications for semi-structured data. Work on semi-structured data focuses on describing and querying data whose format is less tightly structured than that of relational databases. Such data is dominant on the Web, from HTML pages and weblog feeds to SOAP messages and vector graphics.

See the pitch talk for more details.
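
As a rough illustration of what "less tightly structured" means, the sketch below parses a small XML document in which two records carry different fields, then queries it with path expressions. The sample data and element names are hypothetical and not taken from the course materials. Python's standard `xml.etree.ElementTree` is used here, which supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the two <person> records do not share a fixed
# set of fields, which a relational table would normally require.
DOC = """
<people>
  <person id="p1">
    <name>Ada</name>
    <email>ada@example.org</email>
    <email>ada@example.com</email>
  </person>
  <person id="p2">
    <name>Bob</name>
    <phone>555-0100</phone>
  </person>
</people>
""".strip()

root = ET.fromstring(DOC)

# All email addresses, however many each person happens to have.
emails = [e.text for e in root.findall("./person/email")]

# Ids of the people that have at least one <phone> child.
with_phone = [p.get("id") for p in root.findall("./person[phone]")]

# The fields present in each record, in document order.
fields = {p.get("id"): [child.tag for child in p] for p in root.findall("person")}

print(emails)
print(with_phone)
print(fields)
```

A repeated field (`email`) and an optional one (`phone`) need no schema change here. A relational design would typically need extra tables or nullable columns.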


This course unit aims to give students a good overview of the ideas and techniques behind the description and query mechanisms for semi-structured data, as well as the various design and modelling issues that arise, especially in the context of the Web. We discuss the basic concepts of semi-structured data and their representation, as well as three major families of formalisms for semi-structured data:

  • XML, including schema languages for XML data (DTD and XML Schema), processing and manipulating XML data (XPath, XQuery), and some theoretical aspects of XML data processing
  • HTML, including HTML 5, CSS, and AJAX
  • Knowledge representation based languages such as RDF and OWL

Laboratory sessions will ground the abstract notions in practical cases and tools.


  • Introduction: Semi-structured data.
  • XML: core concepts
  • DTDs, a simple schema language for XML documents
  • XPath, a navigation language for XML documents
  • XML namespaces: a concept ignored so far
  • XSLT, a transformation language for XML documents
  • DOM and SAX, programmatic APIs for manipulating XML documents
  • XML Schema, a more expressive schema language for XML documents
  • XQuery, a query language for XML documents
  • HTML 5: text/html vs. application/xhtml+xml
  • Validation of HTML 5 (including use of Schematron)
  • CSS and the DOM: Web Data vs. Web Documents vs. Web Applications
  • RDF and Linked Data
  • OWL: How inference can help data
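
To give a flavour of the DOM and SAX topic listed above, the following minimal sketch extracts the same values from a small document in both styles, using Python's standard library. The sample document and the `TitleHandler` class are hypothetical illustrations, not part of the course materials. The DOM version builds the whole tree in memory before querying it. The SAX version reacts to parse events as they stream past.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = (
    "<feed>"
    "<entry><title>First</title></entry>"
    "<entry><title>Second</title></entry>"
    "</feed>"
)

# DOM style: parse everything into a tree, then navigate the tree.
dom = minidom.parseString(DOC)
dom_titles = [t.firstChild.data for t in dom.getElementsByTagName("title")]


class TitleHandler(xml.sax.ContentHandler):
    """SAX style: collect the text of <title> elements from parse events."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it until the end tag.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


handler = TitleHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_titles = handler.titles

print(dom_titles)
print(sax_titles)
```

Both approaches yield the same titles. The trade-off is that DOM allows random access to the tree at the cost of memory, whereas SAX keeps only the state the handler chooses to retain.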

Feedback methods

Weekly coursework will be collected via Blackboard, and feedback is provided through the same mechanism.

Study hours

  • Assessment written exam (2 hours)
  • Lectures (20 hours)
  • Practical classes & workshops (15 hours)

Employability skills

  • Analytical skills
  • Problem solving
  • Research
  • Written communication

Learning outcomes

Programme outcome | Unit learning outcomes | Assessment

G1 | Have an understanding of the history and foundations of semi-structured data and their representation.
  • Examination
G1 | Have an understanding of XML, schema languages for XML (DTD, XML Schema, RELAX NG, Schematron), processing, querying, and manipulating XML data (DOM, SAX, XPath, XQuery), and some theoretical aspects of XML data processing.
  • Examination
G1 | Have an understanding of HTML (esp. HTML 5), CSS, HTML validation, HTML processing and manipulation for Web Applications, and an understanding of the design issues involved.
  • Examination
G1 | Have an understanding of knowledge representation based languages such as RDF and OWL, their semantics, the theoretical aspects of their core inference services, and their applicability to dealing with semi-structured data.
  • Examination
G2 G3 G4 | Have mastered the basic range of techniques for representing, modelling, and querying semi-structured data, and be able to use tools developed for them.
  • Lab assessment
  • Examination

Reading list

COMP60411 does not have a specified reading list.

Additional notes

Course unit materials

Links to course unit teaching materials can be found on the School of Computer Science website for current students.