7 TEI Element Representation
This section documents the representations used by this library for individual TEI elements. TEI elements are translated from native XML form into raw x-expressions that satisfy any-tei-xexpr/c. Internally, this library uses TEI element structs to provide a further layer of abstraction.
Many of the functions documented in this section expose details of the current schema used by Digital Ricœur for our TEI XML documents. The details of this schema are subject to change: indeed, a major purpose of this library is to provide clients with an API that remains stable across changes to the schema. Programmers are strongly advised to use higher-level abstractions instead of the low-level operations documented in this section whenever possible.
7.1 TEI X-Expression Contracts
value
any-tei-xexpr/c
Currently, any-tei-xexpr/c and related contracts check all of Digital Ricœur’s project-specific requirements and most other requirements inherited from the TEI standard. However, ricoeur/tei does not currently implement a full XML validator. Nonetheless, values that are not valid according to all XML and DR-TEI.dtd rules should never be constructed: they may cause subtle errors with, at best, obscure error messages.
Full XML validation may be added to any-tei-xexpr/c in the future. Currently, high-level clients (like directory-corpus%) should use valid-xml-file? or directory-validate-xml for validation and ensure that (xmllint-available?) returns #true.
syntax
(tei-xexpr/c elem-name-id)
Using (tei-xexpr/c elem-name-id) produces the same contract as (dynamic-tei-xexpr/c (quote elem-name-id)), but tei-xexpr/c expands to the specific contract at compile-time, and a syntax error is raised if (quote elem-name-id) would not satisfy tei-element-name/c.
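The difference between the two forms can be sketched without this library. The following is only an analogy under stated assumptions: known-names-at-compile-time, known-names-at-run-time, static-name-check, and dynamic-name-check are hypothetical names, the list of element names is an illustrative stand-in rather than the real schema, and the forms return a symbol where the real ones produce a contract. It shows one way a macro can reject a bad name when the program is compiled, while the procedure can only report it when called.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical stand-in for the set of valid element names. The list is
;; written twice because the macro needs it at compile time and the
;; procedure needs it at run time.
(begin-for-syntax
  (define known-names-at-compile-time '(TEI text body div p pb note)))
(define known-names-at-run-time '(TEI text body div p pb note))

;; Run-time check: a bad name is only detected when this call executes.
(define (dynamic-name-check name)
  (unless (memq name known-names-at-run-time)
    (raise-argument-error 'dynamic-name-check "known element name" name))
  name)

;; Compile-time check: a bad name is a syntax error during expansion,
;; so a program containing it never starts running.
(define-syntax (static-name-check stx)
  (syntax-case stx ()
    [(_ name)
     (identifier? #'name)
     (if (memq (syntax-e #'name) known-names-at-compile-time)
         #''name
         (raise-syntax-error #f "not a known element name" stx #'name))]))

(static-name-check pb)      ; accepted during expansion
(dynamic-name-check 'note)  ; accepted when called
```

Writing (static-name-check bogus) in this sketch would stop the module from compiling, whereas (dynamic-name-check 'bogus) raises an exception only when that call is reached.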
procedure
(dynamic-tei-xexpr/c name) → flat-contract?
name : tei-element-name/c
value
7.2 Common Element Interface
procedure
(tei-element? v) → any/c
v : any/c
procedure
(content-containing-element? v) → any/c
v : any/c
procedure
(elements-only-element? v) → any/c
v : any/c
Internally, there is a distinct TEI element struct type for each type of element in Digital Ricœur’s customized TEI schema. See Formal Specification in TEI Encoding Guidelines for Digital Ricœur for a complete listing. However, the specific representations of most TEI element struct types are kept private to this library: for robustness against future changes to Digital Ricœur’s TEI schema, clients are urged to use high-level interfaces that abstract over the details of the document structure.
Every TEI element struct satisfies either content-containing-element? or elements-only-element? (but not both) depending on whether the element type of which it is an instance may ever contain textual data directly.
procedure
v : (or/c tei-element? raw-xexpr-atom/c)
For implementation details, see prop:element->plain-text.
7.2.1 Struct–X-Expression Conversion
procedure
(xexpr->tei-element xs) → tei-element?
xs : any-tei-xexpr/c
procedure
→ (and/c any-tei-xexpr/c normalized-xexpr-element/c)
e : tei-element?
Do not attempt to use this function as a substitute for write-tei-document.
procedure
e : (or/c tei-element? normalized-xexpr-atom/c)
Do not attempt to use this function as a substitute for write-tei-document.
7.2.2 Traversing TEI Element Structs
The accessors documented in this subsection are especially likely to expose brittle details that will break upon changes to Digital Ricœur’s schema for our TEI XML documents.
procedure
e : tei-element?
procedure
→ (listof (list/c symbol? string-immutable/c))
e : tei-element?
procedure
(tei-element-get-body e) → (listof (or/c tei-element? normalized-xexpr-atom/c))
e : tei-element?
procedure
(tei-get-body/elements-only e)
e : elements-only-element?
For a TEI element struct that satisfies elements-only-element?, the list returned by tei-element-get-body will never contain any strings: any insignificant whitespace inside such elements is dropped when the TEI element struct is constructed. However, the result of tei-element-get-body may still be different from tei-get-body/elements-only, as the list returned by tei-element-get-body may contain values satisfying normalized-comment/c or normalized-p-i/c.
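The distinction between the two body lists can be modeled on plain x-expressions. This is a minimal sketch, not the library's code: normalize-elements-only-body and body/elements-only are hypothetical helpers that work on raw lists rather than on TEI element structs, and the sample data is invented. The first helper drops whitespace-only strings, as the text says happens at construction time; the second keeps child elements only.

```racket
#lang racket/base
(require (only-in xml comment comment?))

;; In an x-expression, an element is a list that starts with a symbol.
(define (element? v)
  (and (pair? v) (symbol? (car v))))

(define (whitespace-string? v)
  (and (string? v) (regexp-match? #px"^\\s*$" v)))

;; Hypothetical model of construction for an elements-only element:
;; insignificant whitespace is dropped, comments are retained.
(define (normalize-elements-only-body body)
  (filter (λ (v) (not (whitespace-string? v))) body))

;; Hypothetical model of the stricter view: child elements only.
(define (body/elements-only body)
  (filter element? body))

;; Invented sample body: two elements, one comment, whitespace between.
(define sample-comment (comment " reviewed "))
(define raw-body
  (list "\n  " '(pb ((n "1"))) sample-comment "\n  " '(p () "text") "\n"))

(define normalized (normalize-elements-only-body raw-body))
(define elements (body/elements-only normalized))
```

In this sketch, normalized has three members (the pb element, the comment, and the p element), while elements has only the two elements.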
match expander
(tei-element name-pat attributes-pat body-pat)
match expander
(content-containing-element name-pat attributes-pat body-pat)
match expander
(elements-only-element name-pat attributes-pat body-pat maybe-elements-only)
maybe-elements-only =
| #:elements-only body/elements-only-pat
7.3 Specialized Element Interfaces
Functions for working with a few specific TEI element struct types are provided by this library; however, such functions are especially brittle and may change in incompatible ways, or even be removed entirely, in future versions of this library.
For most purposes, the segment interface is a much better choice than the functions documented below (which are in fact used in the implementation of tei-document-segments). However, they do serve some specific use-cases that have not yet motivated a higher-level interface: most prominently, “TEI Lint” uses these functions to generate warnings about likely numbering errors.
7.3.1 Elements with Responsible Parties
procedure
(tei-element-can-have-resp? v) → any/c
v : any/c
procedure
(tei-element-resp elem [default]) → (if default symbol? (or/c symbol? #f))
elem : tei-element-can-have-resp?
default : (or/c 'ricoeur #f) = 'ricoeur
Note that tei-element-resp only accesses the resp attribute (if any) of the specific TEI element struct elem. Actually determining the “responsible party” for an element also requires consideration of its parent elements. This resolution is performed for segments and can be accessed with segment-resp-string and segment-by-ricoeur?: the primary purpose of tei-element-resp is to implement those higher-level functions.
For implementation details, see declare-resp-field.
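To illustrate why the attribute of a single element is not enough, here is a minimal sketch of one plausible resolution rule. It rests on assumptions: the elem struct and effective-resp are hypothetical, and the rule that the nearest enclosing element with a resp value wins is a guess made for illustration, not a description of how segments resolve responsibility.

```racket
#lang racket/base

;; Hypothetical, simplified element record: a name plus an optional resp
;; symbol (#f when the element has no resp attribute of its own).
(struct elem (name resp) #:transparent)

;; path lists an element followed by its ancestors, innermost first.
;; Assumed rule: the first resp found while walking outward wins;
;; otherwise fall back to default, which may itself be #f.
(define (effective-resp path [default 'ricoeur])
  (or (for/or ([e (in-list path)])
        (elem-resp e))
      default))

;; Invented sample: a paragraph inside an editor's note inside a chapter.
(define root-div (elem 'div #f))
(define editor-note (elem 'note 'editor))
(define inner-p (elem 'p #f))
(define other-p (elem 'p #f))

(effective-resp (list inner-p editor-note root-div)) ; 'editor
(effective-resp (list other-p root-div))             ; 'ricoeur
(effective-resp (list other-p root-div) #f)          ; #f
```

Note that (elem-resp inner-p) is #f even though, under the assumed rule, the paragraph is attributed to the editor through its enclosing note.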
7.3.2 Page-break Elements
procedure
(pb-get-kind pb) → (or/c 'none 'number 'roman 'other)
pb : tei-pb?
procedure
(pb-get-numeric pb) → (maybe/c natural-number/c)
pb : tei-pb?
procedure
(pb-get-page-string pb) → (maybe/c string-immutable/c)
pb : tei-pb?
The kind returned by pb-get-kind is one of the following:
'none: The page was not numbered.
'number: The page was numbered with an Arabic numeral.
'roman: The page was numbered with a Roman numeral.
'other: The page has a “number” according to the n attribute, but the n attribute value is not in a format this library can understand.
When the kind is 'number or 'roman, pb-get-numeric returns a just value containing the page number as a Racket integer.
Unless the kind is 'none, pb-get-page-string returns a just value containing the raw string given as the n attribute.
Recall that a pb element marks the beginning of the specified page.
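The four kinds can be made concrete with a small classifier over the raw n attribute value. This is a sketch under assumptions, not the library's parsing rules: classify-n and roman->number are hypothetical helpers, the Roman numeral reader is deliberately lenient (it accepts irregular forms such as "iiii"), and plain #f stands in for the maybe/c results of the real accessors.

```racket
#lang racket/base

;; Values of the Roman digits, keyed by lowercase character.
(define roman-digit-values
  (hash #\i 1 #\v 5 #\x 10 #\l 50 #\c 100 #\d 500 #\m 1000))

;; Returns the value of a Roman numeral, or #f if str is empty or
;; contains any other character. A digit smaller than its successor
;; is subtracted; every other digit is added.
(define (roman->number str)
  (define digits
    (for/list ([c (in-string (string-downcase str))])
      (hash-ref roman-digit-values c #f)))
  (and (pair? digits)
       (andmap number? digits)
       (let loop ([ds digits] [total 0])
         (cond
           [(null? ds) total]
           [(and (pair? (cdr ds)) (< (car ds) (cadr ds)))
            (loop (cdr ds) (- total (car ds)))]
           [else (loop (cdr ds) (+ total (car ds)))]))))

;; n is the raw attribute string, or #f when the attribute is absent.
;; Returns (list kind numeric), with numeric #f when there is none.
(define (classify-n n)
  (cond
    [(not n) (list 'none #f)]
    [(regexp-match? #px"^[0-9]+$" n) (list 'number (string->number n))]
    [(roman->number n) => (λ (num) (list 'roman num))]
    [else (list 'other #f)]))

(classify-n #f)    ; '(none #f)
(classify-n "12")  ; '(number 12)
(classify-n "xiv") ; '(roman 14)
(classify-n "A-3") ; '(other #f)
```

As in the description above, only the 'number and 'roman cases carry a numeric value in this sketch.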
procedure
(tei-get-page-breaks elem) → (listof tei-pb?)
elem : tei-element?
This function is most often used with TEI document values.
7.3.3 Footnote & Endnote Elements
procedure
(tei-note-get-place note) → (or/c 'foot 'end)
note : tei-note?
procedure
(tei-note-get-n note) → string-immutable/c
note : tei-note?
procedure
(tei-note-get-transl? note) → (or/c #f 'transl)
note : tei-note?
7.3.4 Chapter & Section Elements
procedure
(div-get-n elem) → (maybe/c string-immutable/c)
elem : div?
procedure
(div-get-type elem) → div-type/c
elem : div?
value
div-type/c =
(or/c 'chapter 'part 'section 'dedication 'contents 'intro 'bibl 'ack 'index)