http://www.zorba-xquery.com/modules/converters/html


This module provides functions to tidy an HTML document.
The functions in this module take an HTML document as a string, tidy it into valid XHTML, and return the result as a document node.

Function Summary

parse ($html as xs:string) as document-node()

This function tidies the given HTML string and returns a valid XHTML document node.

parse ($html as xs:string, $options as element(html-options:options)) as document-node()

This function tidies the given HTML string, using the given tidy options, and returns a valid XHTML document node.

Functions

parse#1

declare function html:parse($html as xs:string) as document-node()

This function tidies the given HTML string and returns a valid XHTML document node.

This function automatically sets the following tidy parameters:

  • output-xml=yes
  • doctype=omit
  • quote-nbsp=no
  • char-encoding=utf8
  • newline=LF
  • tidy-mark=no

Parameters

html as xs:string
the HTML string to tidy

Returns

document-node()
the tidied XHTML document node
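
Example

A minimal sketch of calling the one-argument form. The sample markup and the variable name $raw are illustrative only; the module namespace is the one given at the top of this page. The input deliberately leaves the b element unclosed, which is the kind of markup the tidy step is meant to repair.

```xquery
import module namespace html = "http://www.zorba-xquery.com/modules/converters/html";

(: Illustrative input: the <b> element is never closed. :)
let $raw := "<html><head><title>Demo</title></head><body><p>Hello <b>world</p></body></html>"

(: parse#1 applies the default tidy parameters listed above
   and returns the result as a document node. :)
return html:parse($raw)
```

Because the defaults include doctype=omit and tidy-mark=no, the result should carry neither a DOCTYPE declaration nor a generator meta element.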

parse#2

declare function html:parse($html as xs:string, $options as element(html-options:options)) as document-node()

This function tidies the given HTML string and returns a valid XHTML document node.

The second parameter lets you specify options that configure the tidy process. It is an html-options:options element that carries name/value pairs. Allowed option names and values are documented at http://tidy.sourceforge.net/docs/quickref.html.

Parameters

html as xs:string
the HTML string to tidy
options as element(html-options:options)
an element holding the name/value pairs that configure the tidy process; it must be validated against the "http://www.zorba-xquery.com/modules/converters/html-options" schema.

Returns

document-node()
the tidied XHTML document node
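
Example

A hedged sketch of calling the two-argument form. This page does not show the content model of the options element, so the child element name tidyParam and its name and value attributes are assumptions; check them against the html-options schema before relying on them. The option names and values themselves are taken from the parse#1 defaults above, with the tidy option indent added for illustration.

```xquery
import module namespace html = "http://www.zorba-xquery.com/modules/converters/html";
import schema namespace html-options = "http://www.zorba-xquery.com/modules/converters/html-options";

(: ASSUMPTION: each option is a tidyParam child element with
   "name" and "value" attributes; verify against the schema. :)
let $options :=
  validate {
    <html-options:options>
      <html-options:tidyParam name="output-xml" value="yes"/>
      <html-options:tidyParam name="doctype" value="omit"/>
      <html-options:tidyParam name="char-encoding" value="utf8"/>
      <html-options:tidyParam name="tidy-mark" value="no"/>
      <html-options:tidyParam name="indent" value="yes"/>
    </html-options:options>
  }

(: Illustrative input with an unclosed <p> and a bare <br>. :)
return html:parse("<p>Hello<br>world", $options)
```

The validate expression is there because the options must be validated against the html-options schema; the schema import makes those element declarations available and binds the html-options prefix.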