COMP60411: Semi-structured Data and the Web (2011-2012)
This course unit covers various formalisms (predominantly XML) and applications for semi-structured data. Work on semi-structured data focuses on describing and querying data whose format is less tightly structured than that found in relational databases. Such data is dominant on the Web, from HTML pages and weblog feeds to SOAP messages and vector graphics.
See the pitch talk for more details.
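As a rough illustration of what "less tightly structured" means in practice, the sketch below parses two records that do not share the same fields, something a fixed relational schema would not allow without NULLs or extra tables. The element names (`people`, `person`, `name`, `email`, `phone`), the sample values, and the helper `summarise` are all hypothetical and not taken from the course material.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two <person> records with different shapes.
# The first has an email and no phone; the second has two phones and no email.
DOC = """
<people>
  <person>
    <name>Ada</name>
    <email>ada@example.org</email>
  </person>
  <person>
    <name>Grace</name>
    <phone>555-0100</phone>
    <phone>555-0101</phone>
  </person>
</people>
"""


def summarise(xml_text):
    """Return one dict per <person>, tolerating missing or repeated children."""
    root = ET.fromstring(xml_text)
    rows = []
    for person in root.findall("person"):
        rows.append({
            "name": person.findtext("name"),
            # findtext returns None when the child element is absent.
            "email": person.findtext("email"),
            # findall returns an empty list when there are no matches.
            "phones": [p.text for p in person.findall("phone")],
        })
    return rows


if __name__ == "__main__":
    for row in summarise(DOC):
        print(row)
```

The consumer has to cope with absent and repeated elements itself, which is one reason schema languages and query languages for such data are worth studying.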
This course unit aims to give students a good overview of the ideas and techniques behind the description and query mechanisms for semi-structured data, as well as the various design and modelling issues that arise, especially in the context of the Web. We discuss the basic concepts of semi-structured data and their representation, as well as three major families of formalisms for semi-structured data: XML (including schema languages for XML data (DTD and XML Schema), processing and manipulating XML data (XPath, XQuery), and some theoretical aspects of XML data processing), HTML (including HTML 5, CSS, and AJAX), and knowledge representation based languages such as RDF and OWL. We will ground the abstract notions in practical cases and tools.
|Programme outcome||Unit learning outcomes||Assessment|
|G1||Have an understanding of the history and foundations of semi-structured data and their representation.|
|G1||Have an understanding of XML, schema languages for XML (DTD, XML Schema, Relax-NG, Schematron), processing, querying, and manipulating XML data (DOM, SAX, XPath, XQuery), and some theoretical aspects of XML data processing.|
|G1||Have an understanding of HTML (esp. HTML 5), CSS, HTML validation, HTML processing and manipulation for Web Applications, and an understanding of the design issues involved.|
|G1||Have an understanding of knowledge representation based languages such as RDF and OWL, their semantics, the theoretical aspects of their core inference services, and their applicability to dealing with semi-structured data.|
|G2 G3 G4||Have mastered the basic range of techniques for representing, modelling, and querying semi-structured data, and be able to use tools developed for them.|
Introduction: Semi-structured data.
XML: core concepts
DTDs, a simple schema language for XML documents
XPath, a navigation language for XML documents
XML namespace: a concept ignored so far
XSLT, a transformation language for XML documents
DOM and SAX, programmatic APIs for manipulating XML documents
XML Schema, a more expressive schema language for XML documents
XQuery, a query language for XML documents
HTML 5: text/html vs. application/xhtml+xml
Validation of HTML 5 (including use of Schematron)
CSS and the DOM: Web Data vs. Web Documents vs. Web Applications
RDF and Linked Data
OWL: How inference can help data
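To give a flavour of three of the syllabus items above (XPath, DOM, and SAX), here is a minimal sketch that extracts the same information in three ways using only the Python standard library. The sample document, its element names, and the function names are hypothetical. Note that `xml.etree.ElementTree` implements only a limited subset of XPath, so this is not a full XPath processor.

```python
import xml.sax
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
DOC = (
    "<library>"
    "<book year='2008'><title>XQuery</title></book>"
    "<book year='2011'><title>HTML 5</title></book>"
    "</library>"
)


def titles_via_xpath(xml_text, year):
    """Select titles with a path expression and an attribute predicate."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall(f"./book[@year='{year}']/title")]


def titles_via_dom(xml_text):
    """DOM style: build the whole tree in memory, then walk it."""
    dom = minidom.parseString(xml_text)
    return [node.firstChild.data for node in dom.getElementsByTagName("title")]


class TitleHandler(xml.sax.ContentHandler):
    """SAX style: react to a stream of events without building a tree."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def titles_via_sax(xml_text):
    handler = TitleHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


if __name__ == "__main__":
    print(titles_via_xpath(DOC, "2011"))
    print(titles_via_dom(DOC))
    print(titles_via_sax(DOC))
```

The trade-off this sketch hints at: the DOM approach keeps the whole document in memory and allows random access, while the SAX approach processes the document as a stream and must track its own state.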
Students taking the course do not need to buy any books. However, there are some resources that students may wish to consult:
W3C documents at http://www.w3.org/TR/...
We will use the