3.1 Working with Corpus Objects
In practice, this parameter should usually be initialized
with a directory-corpus% instance.
Note that the returned instance set does not
contain the TEI document values with which the corpus
was created.
Corpus objects generally avoid retaining their
encapsulated TEI document values after initialization.
Currently, the result of (get-instance-info-set)
always satisfies (instance-set/c plain-instance-info?),
but that is not guaranteed to be true in future versions
of this library.
For each TEI document doc, the returned hash table
will have a key of (instance-title/symbol doc)
mapped to the value (tei-document-checksum doc).
Thus, any two corpus objects that return equal?
hash tables, even across runs of the program, are guaranteed
to encapsulate the very same TEI documents.
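The comparison described above can be sketched with ordinary immutable hash tables. The title symbols and checksum values below are hypothetical placeholders, and representing a checksum as a symbol is an assumption made only for illustration; real tables would come from a corpus object rather than being written by hand.

```racket
#lang racket/base

;; Hypothetical tables: each key stands in for an
;; (instance-title/symbol doc) result and each value for a
;; (tei-document-checksum doc) result.
(define run-1
  (hash 'example-title-a 'checksum-1
        'example-title-b 'checksum-2))

;; Same mappings, written in a different order.
(define run-2
  (hash 'example-title-b 'checksum-2
        'example-title-a 'checksum-1))

;; One document has changed, so its checksum differs.
(define run-3
  (hash 'example-title-a 'checksum-1
        'example-title-b 'checksum-9))

(define same-docs? (equal? run-1 run-2))      ; #t
(define changed-docs? (equal? run-1 run-3))   ; #f
```

Because equal? on hash tables compares keys and values rather than identity, such a table could plausibly be saved by one run of a program and compared against the table from a later run.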
3.3 Deriving New Corpus Classes
Clients of this library will want to extend the corpus object system
to support additional features by implementing new classes derived
from corpus%.
There are two main points where derived classes will want to interpose
on corpus%’s initialization:
A few classes, like directory-corpus%,
will want to supply an alternate means of constructing
the full instance set of TEI documents
to be encapsulated by the corpus object.
This is easily done using standard features of the racket/class
object system, such as init and super-new, to control the
initialization of the base class.
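As a minimal sketch of this first kind of interposition, the following uses only racket/class. Both base-corpus% and list-backed-corpus% are hypothetical stand-ins invented for illustration, as are their init argument names; they do not reflect the actual initialization arguments of corpus% or directory-corpus%.

```racket
#lang racket/base
(require racket/class)

;; Hypothetical stand-in for a base class that receives its
;; documents as an initialization argument (NOT the real corpus%).
(define base-corpus%
  (class object%
    (init [docs '()])
    ;; Derived state computed from the init argument.
    (define doc-count (length docs))
    (super-new)
    (define/public (get-doc-count) doc-count)))

;; A derived class that accepts a different init argument and
;; computes the base class's `docs` argument from it.
(define list-backed-corpus%
  (class base-corpus%
    (init names)
    (super-new [docs (map string->symbol names)])))

(define c (new list-backed-corpus% [names '("a" "b" "c")]))
(send c get-doc-count) ; 3
```

The derived class declares its own init argument and passes a computed value to super-new, so code that instantiates it never supplies the base class's argument directly.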
More often, derived classes will want to use the complete
instance set of TEI documents to initialize some extended functionality:
for example, corpus% itself extends a primitive, unexported class this way
to initialize a searchable document set.
The ricoeur/tei library provides special support
for these kinds of extensions through three syntactic forms:
corpus-mixin, corpus-mixin+interface,
and define-corpus-mixin+interface.
Most clients should use define-corpus-mixin+interface,
but it is best understood as an extension of the simpler forms.
Most clients should use the higher-level corpus-mixin+interface
or define-corpus-mixin+interface, rather than using corpus-mixin
directly.
A key design consideration is that a corpus% instance does
not keep its TEI documents reachable after its initialization,
as TEI document values can be rather large.
Derived classes are urged to follow this practice:
they should initialize whatever state they need for their extended functionality,
but they should allow the TEI documents to be garbage-collected
as soon as possible.
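The difference between retaining and releasing initialization data can be sketched with plain racket/class. The classes retaining% and summarizing% are hypothetical and are not part of ricoeur/tei; the sketch only contrasts init-field, which creates a field, with init, which does not.

```racket
#lang racket/base
(require racket/class)

;; Keeps the documents: `init-field` stores `docs` in a field, so
;; they stay reachable for as long as the object does.
(define retaining%
  (class object%
    (init-field docs)
    (super-new)
    (define/public (count-docs) (length docs))))

;; Keeps only a small summary: `docs` is a plain `init` argument,
;; used during initialization and not stored in any field.
(define summarizing%
  (class object%
    (init docs)
    (define n (length docs))
    (super-new)
    (define/public (count-docs) n)))

(define r (new retaining% [docs '("a" "b")]))
(define s (new summarizing% [docs '("a" "b")]))
(send r count-docs) ; 2
(send s count-docs) ; 2
```

Both classes answer the same question, but only the second follows the practice urged here: once initialization finishes, the object holds the count and nothing else.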
Concretely, this means that corpus% does not store
the instance set of TEI documents in any field
(public or private), as an object’s fields remain reachable after initialization.
Instead, derived classes can access the instance set of TEI documents
during initialization using super-docs or super-docs-evt:
Examples:
  > (new (printing-corpus-mixin corpus%))
  (wrapper-object:...pus/plain-corpus.rkt:81:7 ...)
  interface-decl = (interface (super<%> ...)
                     interface-method-clause ...)
                 | (interface* (super<%> ...)
                     ([prop-expr val-expr] ...)
                     interface-method-clause ...)

  interface-method-clause = method-id
                          | [method-id contract-expr]
Like corpus-mixin, but evaluates to two values,
a mixin and an associated interface.
...
Most clients should use the higher-level define-corpus-mixin+interface,
rather than using corpus-mixin+interface directly.
  name-spec = base-id
            | [id-mixin id<%>]

  interface-decl* = (interface (super<%> ...)
                      interface-method-clause* ...)
                  | (interface* (super<%> ...)
                      ([prop-expr val-expr] ...)
                      interface-method-clause* ...)

  interface-method-clause* = interface-method-clause
                           | ext-method-clause

  interface-method-clause = method-id
                          | [method-id contract-expr]

  ext-method-clause = [ext-clause-part ...]

  ext-clause-part = method-definition-form ; required
                  | #:contract contract-expr
                  | #:proc proc-id
                  | with-current-decl

  method-definition-form = (define/method (method-id kw-formal ...)
                             body ...+)

  define/method = define/public
                | define/pubment
                | define/public-final

  with-current-decl = #:with-current with-current-id
                      #:else [else-body ...+]
                    | #:with-current/infer
                      #:else [else-body ...+]
If no ext-method-clause appears,
equivalent to:
The ext-method-clause variant extends the grammar of interface
and interface* to support defining functions related to
one of the interface’s methods: