COMP60411 Modelling Data on the Web syllabus 2019-2020
This course unit covers central concepts and considerations for semi-structured and unstructured data on the web. Starting from the traditional relational view, we consider the four core concepts of data modelling: core data models, schema languages, query languages, and update mechanisms. We discuss these four concepts for relational, tree-shaped, and graph-shaped data models, and along the way meet various formalisms, APIs, and languages that have been developed for them, including XML, JSON, RDF, XQuery, XML Schema, and SPARQL.
We develop a framework for analysing whether a particular information model is well designed, i.e., whether it makes good use of the features provided by the formalisms used, and whether it fits its purpose. We apply these formalisms, APIs, and languages in various modelling exercises to deepen our understanding of the concepts.
This course unit aims to give students a good understanding of the core concepts of data modelling, together with some familiarity with various formalisms, APIs, and languages that have been developed for modelling data on the web, and with the design and representation issues that arise. Students will learn how to compare different data modelling formalisms, and how to design a data management system or analyse an existing one, i.e., determine whether it makes good use of the features provided by the formalisms used and whether it fits its purpose.
Laboratory sessions will ground the abstract notions in practical cases and tools.
• Introduction: four fundamental concepts of data modelling, i.e., core data models, schema languages, query languages, and update mechanisms
• Tree data and formalisms (XML, JSON)
• Schema Languages for tree data (DTDs, XML Schema, JSON Schema, and more)
• Query Languages for tree data (XPath, XQuery)
• APIs for tree data (DOM, SAX, ...)
• Updating tree data and robustness
• Graph data and formalisms (RDF, GraphML)
• Schema Languages for graph data (RDFS)
• Query Languages for graph data (SPARQL)
• APIs for graph data
• Updating graph data and robustness
Weekly coursework is collected via Blackboard, and feedback is provided through the same mechanism.
- Lectures (20 hours)
- Practical classes & workshops (15 hours)
- Analytical skills
- Problem solving
- Written communication
On successful completion of this unit, a student will:
- Have an understanding of the foundations of various forms of (semi-/un-)structured data and their formalisms.
- Have an understanding of the four fundamental concepts (core data models, schema languages, query languages, and update mechanisms) and their relevant properties, and be able to analyse their use in a given data management system.
- Be able to use the basic range of techniques for representing, modelling, and querying (semi-/un-)structured data, and the tools developed for them.
- Have an understanding of various formalisms developed for the fundamental concepts for (semi-/un-)structured data (XML, JSON, RDF, XML Schema, XQuery, SPARQL, etc.), and be able to use them to model, describe, and query data.
- Be able to discuss trade-offs between various formalisms, and between different data models.
COMP60411 does not have a specified reading list.
Links to course unit teaching materials can be found on the School of Computer Science website for current students.