# Document processing exercises

# Document models:
- XML validation, XSD, DTD, formal model

# Document search & transformation:
- XSLT, XML, HTML representation, CSV representation

# Typography:
- boxes, glues, penalties
- optimal breaking

# Regex:
- regexes for some classic problems and a regex crossword

# XPath:
- Axes and node tests
    - Children of the root: `/child::node()`
    - Child elements of the root: `/child::*`
    - Course summary information: `/course/summary//child::*`
    - Lecture times: `/course/summary/lectures/lecture/child::*/child::text()`
    - Style attributes: `//@style` or `//attribute::style`
    - Siblings: `//content/child::*[position()>1]`
- Simple predicates
    - First paragraph of each chapter: `//chapter/content/child::paragraph[1]`
    - First paragraph ever: `(//paragraph)[1]`
    - Chapters that are not super boring: `//chapter[excitement!="Quite boring to be honest"]`
    - Chapters appearing in the correct position: `//child::chapter[position()=@number]`
    - Chapters about documents: `//child::chapter/child::content/child::paragraph[contains(text(),"document")]/../..`
    - Paragraphs starting with "In": `//child::paragraph[starts-with(text(),"In")]`
    - Fixed length excitement: `//child::chapter/child::excitement[string-length(text())<10]`
- Complicated predicates
    - No graphical content: `//child::chapter[count(child::content/child::figure|child::content/child::graph)=0]/@number`
    - Counting paragraphs: `//child::chapter[count(child::content/child::paragraph)>=4]`
    - Paragraph with bold font preceded by an image: `//child::paragraph[count(child::b)>0][count(preceding-sibling::*[1]/self::image)=1]`
    - Title appears in text: `//child::title[contains(..//child::paragraph/text(), text())]`
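
The difference between `//chapter/content/child::paragraph[1]` and `(//paragraph)[1]` is easy to miss, so here is a minimal sketch that tries both on a small document. The XML below is a hypothetical sample whose element names are borrowed from the expressions above; the real exercise file is not reproduced here. It uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath (no explicit axes such as `child::`, no parenthesised expressions, no attribute nodes), so some expressions are approximated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (an assumption, not the actual exercise file).
SAMPLE = """
<course>
  <chapter number="1">
    <title>Models</title>
    <content>
      <paragraph>In this chapter we model a document.</paragraph>
      <paragraph style="bold">Second paragraph.</paragraph>
    </content>
  </chapter>
  <chapter number="2">
    <title>Search</title>
    <content>
      <figure/>
      <paragraph>Searching comes next.</paragraph>
    </content>
  </chapter>
</course>
"""

root = ET.fromstring(SAMPLE)

# //chapter/content/child::paragraph[1]
# The predicate is applied per parent, so this yields one paragraph per chapter.
first_per_chapter = [
    p.text for p in root.findall(".//chapter/content/paragraph[1]")
]

# (//paragraph)[1]
# The parentheses make [1] apply to the whole result set.
# ElementTree has no parentheses; find() returns the first match in document order.
first_overall = root.find(".//paragraph").text

# //@style
# ElementTree cannot return attribute nodes, so select the owning elements instead.
styles = [e.get("style") for e in root.findall(".//*[@style]")]

# Chapters with no figure or graph in their content, emulating the
# count(figure|graph)=0 predicate in plain Python.
no_graphics = [
    c.get("number")
    for c in root.iter("chapter")
    if c.find("content/figure") is None and c.find("content/graph") is None
]

print(first_per_chapter)
print(first_overall)
print(styles)
print(no_graphics)
```

To evaluate the expressions exactly as written (with `child::`, `preceding-sibling::` and functions like `contains()`), a full XPath 1.0 engine is needed, for example a third-party library such as `lxml`.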