Alejandro Bia: XML-TEI document encoding, structuring, rendering and transformation
ALEJANDRO BIA is Vice Dean of Business Statistics and a researcher at the Operations Research Center (CIO) of the Miguel Hernández University (Elche, Spain). He holds a PhD in Computer Science from the University of Alicante, an MSc and a BS in Computer Science from ORT University, a Diploma in Computing and Information Systems from Oxford University, and a University Expert diploma in Technological Innovation in Education from the Miguel Hernández University. He is a frequent instructor of XML-TEI workshops and seminars in several parts of the world. He has participated in more than 20 publicly funded projects, in some of them as principal investigator. From 1999 to 2004, he was Head of Research and Development of the Miguel de Cervantes Digital Library at the University of Alicante, the biggest digital library of Spanish literary works and one of the first projects to use TEI in XML format. He has been a member of the DH community since 1999, and has been elected to the TEI Council for three terms (2002-2004, 2004-2006 and 2017-2018) and to the Executive Committee of the former Association for Literary and Linguistic Computing, now EADH, for two terms (2004-2008 and 2008-2011).
Short Description of Workshop
The vast majority of humanities research data is marked up and stored as XML-TEI documents. We can say that TEI (Text Encoding Initiative) has become the de facto standard technology within the digital humanities.
This workshop aims to introduce attendees to the practical aspects of encoding XML documents marked up according to the TEI Guidelines, and then to making use of these documents by applying other technologies such as XPath, CSS, XSLT, and XQuery.
This workshop consists of a mix of talks and hands-on sessions, and the teaching approach is problem-based learning by means of exercises of increasing complexity.
It will be divided into two parts, lasting a week each, which can be taken independently.
- Week 1: provides a gentle introduction to the production of digital text documents using XML and the TEI encoding scheme. It also covers designing and validating document structures. This is an introductory-level course that combines theory and practice.
- Week 2: moves beyond encoding and introduces XPath, CSS, XSLT and XQuery. Under the title "XML-TEI document rendering and transformation", it covers producing readable renderings by means of CSS stylesheets and XSLT transformations, and selecting and querying specific data using XPath and XQuery. This is a hands-on, medium-level course.
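As a taste of the kind of exercise the two weeks build towards, the following minimal sketch encodes a tiny TEI document and then selects data from it with path expressions. It is only an illustration under stated assumptions: the sample document and the helper names `document_title`, `verse_lines` and `line_by_number` are invented for this example, and Python's standard `xml.etree.ElementTree` module supports only a limited subset of XPath, so it is not necessarily the tool used in the workshop.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; every TEI element lives in it, so path
# expressions must use a prefix bound to this URI.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A small invented TEI document (hypothetical teaching sample).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample poem</title></titleStmt>
      <publicationStmt><p>Unpublished teaching example</p></publicationStmt>
      <sourceDesc><p>Invented for illustration</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <lg type="stanza">
        <l n="1">First invented line</l>
        <l n="2">Second invented line</l>
      </lg>
    </body>
  </text>
</TEI>"""


def document_title(tei_source):
    """Return the text of the first title in the TEI header."""
    root = ET.fromstring(tei_source)
    return root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS)


def verse_lines(tei_source):
    """Return the text of every verse line (<l>) inside a line group (<lg>)."""
    root = ET.fromstring(tei_source)
    return [line.text for line in root.findall(".//tei:lg/tei:l", TEI_NS)]


def line_by_number(tei_source, number):
    """Return the text of the verse line whose n attribute equals `number`."""
    root = ET.fromstring(tei_source)
    # Attribute predicates such as [@n='2'] are part of the supported subset.
    match = root.find(f".//tei:l[@n='{number}']", TEI_NS)
    return None if match is None else match.text


if __name__ == "__main__":
    print(document_title(SAMPLE))
    for text in verse_lines(SAMPLE):
        print(text)
```

Note that the prefix `tei` is an arbitrary local choice: what matters is that it is bound to the TEI namespace URI. Omitting the prefix is a common reason why a path expression silently matches nothing.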
Since its creation, TEI development has been driven by the needs of a large user community. The introduction to the features of TEI during the workshop will give participants a starting point to develop their own TEI-based projects.
Syllabus of week 1: Introduction to XML-TEI text encoding
INTRODUCTION:
- Introduction
- Ways of representing documents electronically
- Markup basics (SGML/XML)
DOCUMENT ENCODING:
- Text markup using XML and TEI
- High level tags
- Low level tags
- Markup of different literary styles
- Special-purpose tagsets
DOCUMENT STRUCTURING:
- Controlling document structure (DTDs and Schemas)
Syllabus of week 2: XML-TEI document rendering and transformation
INTRODUCTION:
- The XML family of technologies
- Namespaces
DOCUMENT RENDERING:
- CSS stylesheets applied to XML and HTML documents
DOCUMENT TRANSFORMATION:
- Selecting nodes with XPath
- XSLT transformations
- Visual modelling of document structures
OTHER ISSUES (if time allows)