HTML Interactions with XPath and XSLT

XPath and XSLT are technologies for querying and transforming XML documents, and they can also be applied to HTML documents. The HTML standard maintained by the WHATWG (Web Hypertext Application Technology Working Group) defines how XPath 1.0 and XSLT 1.0 interact with HTML documents, so that both remain usable in a web context. Here we will explore those interactions as they are specified in the "Common infrastructure" section of the standard.

XPath and HTML Documents

XPath is a language for addressing parts of an XML document, and it is commonly used to select elements and attributes within an XML or HTML document. The HTML standard places specific requirements on implementations of XPath 1.0 that operate on HTML documents.

The key modification concerns namespace handling within XPath expressions: the standard changes how a QName (Qualified Name) in a node test is expanded into an expanded name. The QName is expanded using the namespace declarations from the expression context. If the QName has a prefix, that prefix must have a namespace declaration in the expression context, and the namespace URI bound to the prefix becomes the namespace URI of the expanded name. It is an error if the prefix has no declaration.

If the QName has no prefix and the principal node type of the axis is element, the default element namespace is used. In all other cases an unprefixed QName has a null namespace URI. The default element namespace depends on the context node:
  • If the context node is from an HTML DOM, the default element namespace is "http://www.w3.org/1999/xhtml" (the HTML namespace).
  • Otherwise, the default element namespace URI is null.
In unmodified XPath 1.0, an unprefixed name in a node test always has a null namespace URI. The adjustment lets an unprefixed test such as `p` match HTML elements, which the HTML standard places in the HTML namespace.
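The expansion rules above can be sketched in a few lines of Python. This is only an illustration of the logic, not how any browser implements it: the function name `expand_qname`, its parameters, and the use of a plain dictionary for the expression context's namespace declarations are all assumptions made for this example.

```python
from typing import Mapping, Optional, Tuple

HTML_NAMESPACE = "http://www.w3.org/1999/xhtml"


def expand_qname(
    qname: str,
    prefix_bindings: Mapping[str, str],
    *,
    context_node_is_html: bool,
    principal_type_is_element: bool = True,
) -> Tuple[Optional[str], str]:
    """Expand a node-test QName into (namespace URI or None, local name).

    Hypothetical helper: `prefix_bindings` stands in for the namespace
    declarations of the expression context.
    """
    prefix, sep, local = qname.partition(":")
    if sep:
        # Prefixed name: the prefix must be declared, otherwise it is an error.
        if prefix not in prefix_bindings:
            raise ValueError(f"undeclared namespace prefix: {prefix!r}")
        return prefix_bindings[prefix], local
    # Unprefixed name: only element node tests use the default element
    # namespace, and that namespace is non-null only for an HTML DOM.
    if principal_type_is_element and context_node_is_html:
        return HTML_NAMESPACE, qname
    return None, qname
```

With this sketch, `expand_qname("p", {}, context_node_is_html=True)` yields the HTML namespace paired with `p`, while the same call with `context_node_is_html=False` yields a null namespace, which is the behavior of unmodified XPath 1.0.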

It's important to note that this change is a willful violation of the XPath 1.0 specification. It is motivated by the desire for implementations to remain compatible with legacy content while still supporting the change HTML made to the namespace used for HTML elements.

XSLT and HTML Documents

XSLT (Extensible Stylesheet Language Transformations) is a language for transforming XML documents into other documents, including HTML. The WHATWG HTML standard adds requirements for XSLT 1.0 processors so that their output is compatible with the way HTML documents are represented in the DOM.

XSLT 1.0 processors that output to a DOM (Document Object Model) with the output method set to "html", either explicitly or through the defaulting rule in XSLT 1.0, must adhere to the following rules:
  • If the transformation program outputs an element in no namespace, the processor must change the namespace of the element to the HTML namespace before constructing the corresponding DOM element node.
  • For such an element, the processor must also ASCII-lowercase the element's local name and the names of any non-namespaced attributes on the element.
This requirement is a willful violation of the XSLT 1.0 specification. It is needed because the HTML standard changes the namespace and case-sensitivity rules of HTML in a way that would otherwise be incompatible with DOM-based XSLT transformations.
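The normalization step can be sketched as follows. This is a minimal illustration under stated assumptions, not a real XSLT processor: the function name `normalize_output_element`, the representation of attributes as a dictionary keyed by (namespace, name) pairs, and the use of `None` to mean "no namespace" are all choices made for this example. It assumes the lowercasing applies only to elements that were output in no namespace.

```python
import string
from typing import Dict, Optional, Tuple

HTML_NAMESPACE = "http://www.w3.org/1999/xhtml"

# Map only A-Z to a-z. str.lower() would also change non-ASCII characters,
# which is not what "ASCII-lowercase" means.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

AttrKey = Tuple[Optional[str], str]  # (namespace URI or None, attribute name)


def normalize_output_element(
    namespace: Optional[str],
    local_name: str,
    attributes: Dict[AttrKey, str],
) -> Tuple[Optional[str], str, Dict[AttrKey, str]]:
    """Apply the "html" output-method adjustments to one result element.

    Hypothetical helper: returns the (namespace, local name, attributes)
    that would be used to construct the DOM element node.
    """
    if namespace is not None:
        # Elements that already have a namespace are left untouched.
        return namespace, local_name, dict(attributes)

    normalized: Dict[AttrKey, str] = {}
    for (attr_ns, attr_name), value in attributes.items():
        if attr_ns is None:
            # Only non-namespaced attribute names are lowercased.
            attr_name = attr_name.translate(_ASCII_LOWER)
        normalized[(attr_ns, attr_name)] = value
    return HTML_NAMESPACE, local_name.translate(_ASCII_LOWER), normalized
```

For example, an element output as `DIV` with a `CLASS` attribute and no namespace comes back as `div` in the HTML namespace with a `class` attribute, while an SVG `foreignObject` element keeps both its namespace and its mixed-case name. Attribute values are never altered.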

Interaction with the HTML Parser and Event Handling

The WHATWG HTML specification does not specify precisely how XSLT processing interacts with the HTML parser infrastructure. For example, it does not say whether an XSLT processor acts as if it puts elements into a stack of open elements. It also does not specify how XSLT interacts with navigation, how it fits in with the event loop, or how error pages are to be handled (for instance, whether XSLT errors replace the processor's incremental output or are rendered inline).

However, two requirements are stated:
  • XSLT processors must stop parsing if they successfully complete.
  • If an XSLT transformation is aborted, the processor must update the current document readiness first to "interactive" and then to "complete".

Non-Normative Comments

The HTML specification also contains non-normative comments regarding the interaction of XSLT with HTML elements, particularly within the `<script>` element section, and the interaction of XSLT, XPath, and HTML within the `<template>` element section. These comments provide additional context and guidance for developers and implementors.

In summary, the WHATWG HTML standard adjusts XPath 1.0 and XSLT 1.0 in a small number of targeted ways so that they work with HTML documents. These adjustments are willful violations of the respective specifications, but they are necessary to support the changes HTML introduced to element namespaces and case-sensitivity rules. Other aspects of XSLT processing, such as its interaction with the parser and the event loop, are left largely unspecified.