html-writing: Writing HTML from SXML

Neil Van Dyke

1 Introduction

The html-writing package provides support for writing HTML encoded as SXML. It includes programmatic foreign filter handling of non-SXML objects, and support for faux HTML within script elements.
This can be used for hand-constructed HTML, for HTML constructed by program, and for emitting HTML that has been read via the html-parsing package and then transformed.
For example:
(write-html
 '((html (head (title "My Title"))
         (body (@ (bgcolor "white"))
               (h1 "My Heading")
               (p "This is a paragraph.")
               (p "This is another paragraph."))))
 (current-output-port))
produces the output:

<html><head><title>My Title</title></head><body bgcolor="white"><h1>My Heading</h1><p>This is a paragraph.</p><p>This is another paragraph.</p></body></html>

For a complementary way of writing HTML from SXML, see the html-template package.
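
As a small illustration of HTML constructed by program, the following sketch builds an SXML list element from a list of strings with quasiquote and then writes it. The helper name items->ul is hypothetical and is not part of this package.

```racket
#lang racket/base
(require html-writing)

;; Hypothetical helper: turn a list of strings into an SXML ul element,
;; with one li child per string.
(define (items->ul items)
  `(ul ,@(map (lambda (item) `(li ,item)) items)))

;; Write the generated element to the current output port.
(write-html (items->ul '("Apples" "Pears" "Plums"))
            (current-output-port))
```

Because the SXML is ordinary list data, any list-processing code can produce it before it is handed to write-html.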

1.1 HTML5 Emphasis

This package will emphasize HTML5 behavior, now and in the future. Historically, this code was written for HTML 4.x and earlier, and for XHTML. XHTML is no longer supported. Pre-HTML5 HTML will be supported for the foreseeable future, except when that conflicts with support for HTML5.

1.2 script Element

The HTML script element is handled specially, for compatibility with HTML5 and pre-HTML5, as well as an aid for Ractive.js type="text/ractive". Specifically, when the content of the script element contains element SXML and other non-strings, those non-strings are written as what we’ll call “CDATA faux HTML”. In this CDATA faux HTML, strings are written without character entity escaping. This permits, say, the script to be some kind of template that looks like HTML and that contains bits of JavaScript. (We don’t endorse Ractive.js, but supporting script in this way seemed sensible for other reasons, and helped a Racket developer who happened to be using that framework.)
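
A minimal sketch of such a template follows. The id value and the template text are made up for illustration. Going by the description above, the strings inside the script element should be written without entity escaping, so characters such as > and & would appear literally in the output.

```racket
#lang racket/base
(require html-writing)

;; A script element whose content is SXML elements rather than one string.
;; The id and the mustache-style placeholders are illustrative only.
(define template-script
  '(script (@ (id "greeting") (type "text/ractive"))
           (p "Hello, {{name}}!")
           (p (@ (class "note")) "Shown if {{count}} > 0 && {{visible}}")))

;; The nested p elements are written as faux HTML inside the script tag.
(write-html template-script (current-output-port))
```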

2 Foreign Filters

A foreign filter is a procedure that is called by the writing procedures of this package when encountering an unrecognized object in the SXML. For one hypothetical use of this, consider the embedding of absolute URL objects in some SXML, which are converted to relative URLs at the time the HTML is written.
The foreign filter procedure is called with two arguments:
  • context, which is one of the symbols 'content, 'attribute, or 'attribute-value.

  • object, which is the foreign object.

parameter

(current-html-writing-foreign-filter) → html-writing-foreign-filter?
(current-html-writing-foreign-filter ff) → void?
  ff : html-writing-foreign-filter?
Parameter for the foreign filter to use.
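
The following sketch shows one way a filter might be installed with parameterize. It rests on an assumption that this documentation does not state: that the value returned by the filter is written in place of the foreign object. The abs-url struct and the url-filter procedure are hypothetical names used only for illustration.

```racket
#lang racket/base
(require html-writing)

;; Hypothetical foreign object: a struct standing in for a URL object.
(struct abs-url (path))

;; Hypothetical filter. ASSUMPTION: the returned value is written in place
;; of the foreign object. Anything other than an abs-url is rejected.
(define (url-filter context object)
  (cond [(abs-url? object) (abs-url-path object)]
        [else (error 'url-filter
                     "unexpected object ~s in ~s context"
                     object context)]))

;; Install the filter only for the dynamic extent of this one call.
(parameterize ([current-html-writing-foreign-filter url-filter])
  (write-html `(a (@ (href ,(abs-url "/docs/"))) "Docs")
              (current-output-port)))
```

Using parameterize keeps the filter local to one write, so other code that writes HTML continues to use whatever filter was in effect before.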

3 Low-Level Writing

This section lists some low-level writing procedures that are currently provided for historical reasons, and should be considered deprecated for now.
The two most common procedures for writing HTML from an SXML representation are the high-level write-html and xexp->html, described in the next section.

procedure

(write-html-attribute-value-part-string str out) → void?
  str : string?
  out : output-port?
Deprecated.

procedure

(write-html-attribute-value-part thing out) → void?

  thing : any/c
  out : output-port?
Deprecated.

procedure

(write-html-attributes attr-or-list out) → void?

  attr-or-list : any/c
  out : output-port?
Deprecated.

procedure

(xexp->html-attribute-value-bytes xexp) → void?

  xexp : any/c
Deprecated.

procedure

(write-html-decl thing out) → void?

  thing : any/c
  out : output-port?
Deprecated.

procedure

(write-html-pi thing out) → void?

  thing : any/c
  out : output-port?
Deprecated.

4 High-Level Writing

procedure

(write-html xexp out) → void?

  xexp : xexp?
  out : output-port?
Writes conventional HTML of the SXML xexp to output port out. The out argument is required; pass (current-output-port) to write to the current output port. HTML elements of types that are always empty are written as a start tag only, such as <br>, with no XHTML-style " />" terminator.
No inter-tag whitespace or line breaks are emitted unless they are explicit in xexp. The xexp should normally include a newline at the end of the document.
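
As a sketch of these two points, the example below writes a document to a file through an output port and includes an explicit "\n" string as the last item of the xexp. The page content, the save-page helper, and the temporary file are illustrative only.

```racket
#lang racket/base
(require html-writing racket/file)

;; The trailing "\n" string supplies the newline at the end of the document.
(define page
  '((html (head (title "Report"))
          (body (p "Done.")))
    "\n"))

;; Hypothetical helper: write an xexp to the file at path,
;; replacing any existing contents.
(define (save-page xexp path)
  (call-with-output-file path
    (lambda (out) (write-html xexp out))
    #:exists 'truncate))

;; Write to a fresh temporary file so that the example has no fixed path.
(define tmp (make-temporary-file "page-~a.html"))
(save-page page tmp)
```

Writing directly to the port avoids building the whole document as a string first.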

procedure

(xexp->html xexp) → string?

  xexp : xexp?
Yields an HTML encoding of SXML xexp as a string. For example (using the html->xexp procedure from package html-parsing, to show going full-circle):
> (xexp->html
   (html->xexp
    "<P>This is<br<b<I>bold </foo>italic</ b > text.</p>"))
  "<p>This is<br><b><i>bold italic</i></b> text.</p>"
Note that, since this procedure constructs a string, it is normally best used when the HTML is small. When encoding HTML documents of conventional size, write-html is likely more efficient.

procedure

(xexp->html-bytes xexp) → bytes?

  xexp : xexp?
Like xexp->html, but returns a byte string instead of a string.
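
A byte string can be convenient when the size of the encoded HTML in bytes is needed, for instance before sending it over a network connection. The sketch below shows only that idea. The content-length-header helper is hypothetical, and nothing here is specific to any particular web server.

```racket
#lang racket/base
(require html-writing)

;; Encode a small fragment as bytes.
(define body (xexp->html-bytes '(p "Hello")))

;; Hypothetical helper: format a header line from the byte count.
(define (content-length-header bs)
  (format "Content-Length: ~a" (bytes-length bs)))

;; Print the header line, then the encoded bytes themselves.
(write-string (content-length-header body))
(newline)
(write-bytes body)
(newline)
```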

5 Known Issues

  • Determine whether currently implemented way of handling !DOCTYPE is sufficiently SXML-compliant, before documenting it.

  • Move more to pure SXML: remove the support for character literals?

  • Move more to pure SXML: consider whether to eliminate & character entity references.

  • Many other TODO items from source code.

6 History

  • Version 4:0 — 2016-03-25
    • Restored some undocumented provides that turned out to be used by the html-template package, and added documentation for them. Renamed html->bytes to xexp->html-bytes, and html-attribute-value->bytes to xexp->html-attribute-value-bytes for naming consistency. (Thanks to Sam Tobin-Hochstadt for reporting.)

    • Documentation tweaks.

  • Version 3:5 — 2016-03-22
    • Non-CDATA SXML content of the script element is now treated as “CDATA faux HTML”, for the benefit of Ractive.js.

    • Made error message for an attribute not nested in a list say “attribute, not just an attribute name” rather than “valid foreign object”.

    • Made error-html-writing-foreign-filter say “valid SXML object” rather than “valid foreign object”.

  • Version 3:4 — 2016-03-22
    • Adding support for using SXML element syntax within script element content, to support Ractive.js type="text/ractive", even though we believe that content to be strictly string CDATA, not DOM elements. Part of the rationale is that this should not conflict with the normal HTML5 uses of script, and it is consistent with older HTML interpretations. (Suggested by Dan Prager.)

    • Removed some vestigial provides, for undocumented identifiers.

    • Organized unit tests a bit.

    • Documentation improvements, including to Known Issues.

  • Version 3:3 — 2016-03-21
    • For HTML5 compatibility, the contents of the script element are written without character entity escaping. (Thanks to Daniel Prager.)

    • Fixed bug in which a valid null inside of a list would invoke the foreign filter, rather than being ignored.

    • Added documentation about HTML5 now emphasized by this package.

  • Version 3:2 — 2016-03-02
    • Tweaked info.rkt, filenames.

    • Changed “SXML/xexp” references to “SXML”.

  • Version 3:1 — 2016-02-25
    • Fixed deps.

  • Version 3:0 — 2016-02-25
    • Moving from PLaneT to new package system.

    • Documented foreign filters.

    • Cleaning up documentation for now.

  • Version 2:0 — 2012-06-12
    • Heavy API and implementation changes (although the previous version was not really documented), including the following.

    • All out arguments are now mandatory rather than optional.

    • All foreign-filter arguments have been removed.

    • Foreign filter context value symbol renamed from 'attribute (singular) to 'attributes (plural).

    • The suffix /fixed has been removed from all identifiers, since all procedures now have fixed arguments.

    • write-html-attribute-or-list is renamed to write-html-attributes.

    • write-html-attribute-value-string is renamed to write-html-attribute-value-part-string.

    • Added html->bytes and html-attribute-value->bytes.

    • We no longer do backward-compatible XHTML empty element terminators like the string " />"; now they’re just ">".

    • In attribute values, some additional characters are now written as numeric character references: ASCII 0 through 31, and 127.

    • Got rid of the *splice* form that we had experimentally added, and philosophically switched back to SXML’s arbitrarily nested lists for splicing with generally less allocation.

    • Restored some of the handling of unnecessary list nesting in SXML/xexp, which had been removed after forking from HtmlPrag and a brief experiment with *splice* when trying to unify SXML and PLT xexprs.

    • Converted to McFly and Overeasy.

  • Version 0.1 — Version 1:0 — 2011-08-21
    • Part of forked development from HtmlPrag, with substantial changes.

7 Legal

Copyright 2004–2012, 2016 Neil Van Dyke. This program is Free Software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See http://www.gnu.org/licenses/ for details. For other licenses and consulting, please contact the author.