pretty-expressive: a pretty expressive printer
(require pretty-expressive) | package: pretty-expressive |
This library implements a pretty expressive printer, following the algorithm presented in Porncharoenwase et al. (2023). The pretty printer is expressive, provably optimal, and practically efficient. It is similar to another library pprint, but that library employs a greedy algorithm. As a result, pretty-expressive, when compared to PPrint, is more expressive and optimal, at the cost of being less efficient.
This documentation and its structure are shamelessly copied/adapted from the PPrint library.
1 Getting Started
Pretty printing is a process for producing human readable text from structured data. Users encode the structured data together with styling choices in an abstract document, which we’ll call a doc. This doc contains printing instructions: things like text, newlines, indentation, and styling. It can also contain choices (alt) between two or more alternatives, resulting in many possible layouts for a document. The pretty printer’s job is to pick the optimal layout from among all of the choices. E.g., the one that minimizes the number of lines which not exceeding the page width limit.
Here’s a simple example of pretty printing a document encoding a fragment of code.
; Build a document
> (define doc (<> (text "while (true) {") (nest 4 (<> nl (text "f();") nl (<> (text "if (done())") (let ([exit-doc (text "exit();")]) (alt (<> space exit-doc) (nest 4 (<> nl exit-doc))))))) nl (text "}")))
It has a choice between two alternatives (alt), so it has two possible layouts. If we print it with a page width limit of 80, we get one layout, and if we print it with a page width limit of 20 we get another:
> (pretty-print doc #:page-width 80)
while (true) {
f();
if (done()) exit();
}
> (pretty-print doc #:page-width 20)
while (true) {
f();
if (done())
exit();
}
2 Documents
The library provides many functions (see Constructing Documents) for building and combining docs, which can then be printed (see Printing Documents).
3 Best Practice for Document Construction
The arguments to alt should typically have the same content, but with different formats. Although the tree size of a doc containing alt tends to blow up exponentially, the time complexity of our algorithm depends on the DAG size of the doc. As a result, provided that sub-documents are sufficiently shared, the DAG size will be small, allowing efficient pretty printing.
As an example, say we want to pretty print an S-expression with three possible styles for each “list”: horizontal style, vertical style, and argument list style. That is,
(a b c d)
could be rendered as itself or
(a b c d)
or
(a b c d)
We can construct a function to convert an S-expression to a doc:
> (define (pretty s) (match s [(list) (<+> lparen rparen)] [(list x) (<+> lparen (pretty x) rparen)] [(list x xs ...) ; Calculate all subdocuments first to share their references (define x-doc (pretty x)) (define xs-doc (map pretty xs)) (<+> lparen (alt (as-concat (cons x-doc xs-doc)) (v-concat (cons x-doc xs-doc)) (<+> x-doc space (v-concat xs-doc))) rparen)] [_ (text s)]))
We can then pretty print it:
> (define abcd-doc (pretty '("a" "b" "c" "d"))) > (pretty-print abcd-doc #:page-width 10) (a b c d)
> (pretty-print abcd-doc #:page-width 6)
(a b
c
d)
> (pretty-print abcd-doc #:page-width 4)
(a
b
c
d)
The important point is that we reuse x-doc and xs-doc across branches of alt. Had we call (pretty x) and (map pretty xs) multiple times in branches of alt, both doc construction and pretty-print would be inefficient.
4 Library Documentation
4.1 Printing Documents
procedure
(pretty-print d [ #:page-width page-width #:computation-width computation-width #:offset offset #:out out #:special special]) → void? d : doc? page-width : natural? = (current-page-width)
computation-width : (or/c #f natural?) = (current-computation-width) offset : natural? = (current-offset) out : output-port? = (current-output-port) special : (-> any/c output-port? void?) = (current-special)
The optimality objective for this pretty printing is given by default-cost-factory.
> (define doc (<$> (<+> lparen (<$> (text "'Rhoam Bosphoramus Hyrule'") (text "'Daphnes Nohansen Hyrule'")) rparen) (<+> lparen (text "'2B'") space (text "'9S'") space (text "'A2'") rparen))) > (pretty-print doc)
('Rhoam Bosphoramus Hyrule'
'Daphnes Nohansen Hyrule')
('2B' '9S' 'A2')
The offset argument is particularly helpful when there is already some preceding text printed to the screen, and we wish to pretty-printing after that.
The special argument is used for printing special results.
> (define prefix-s "values are: ")
> (begin (display prefix-s) (pretty-print (align doc) #:offset (string-length prefix-s)))
values are: ('Rhoam Bosphoramus Hyrule'
'Daphnes Nohansen Hyrule')
('2B' '9S' 'A2')
; Without #:offset, the output will not be correctly aligned.
> (begin (display prefix-s) (pretty-print (align doc)))
values are: ('Rhoam Bosphoramus Hyrule'
'Daphnes Nohansen Hyrule')
('2B' '9S' 'A2')
procedure
(pretty-format d [ #:page-width page-width #:computation-width computation-width #:offset offset #:special special]) → string? d : doc? page-width : natural? = (current-page-width)
computation-width : (or/c #f natural?) = (current-computation-width) offset : natural? = (current-offset) special : (-> any/c output-port? void?) = (current-special)
procedure
(pretty-print/factory d F [ #:offset offset #:out out #:special special]) → void? d : doc? F : cost-factory? offset : natural? = (current-offset) out : output-port? = (current-output-port) special : (-> any/c output-port? void?) = (current-special)
procedure
(pretty-format/factory d F [ #:offset offset #:special special]) → string? d : doc? F : cost-factory? offset : natural? = (current-offset) special : (-> any/c output-port? void?) = (current-special)
procedure
(pretty-print/factory/info d F [ #:offset offset #:out out #:special special]) → info? d : doc? F : cost-factory? offset : natural? = (current-offset) out : output-port? = (current-output-port) special : (-> any/c output-port? void?) = (current-special)
procedure
(pretty-format/factory/info d F [ #:offset offset #:special special]) →
string? info? d : doc? F : cost-factory? offset : natural? = (current-offset) special : (-> any/c output-port? void?) = (current-special)
struct
(struct info (tainted? cost) #:extra-constructor-name make-info) tainted? : boolean? cost : any/c
Changed in version 1.1 of package pretty-expressive: Removed the out component.
4.2 Constructing Documents
> (pretty-print (text "Portal")) Portal
DrRacket, in particular, sets up its current-output-port so that one can prints an image as a special value.
See also Best Practice for Document Construction.
> (pretty-print (<$> (text "Tears of the Kingdom") (text "Breath of the Wild") (text "Ocarina of Time")))
Tears of the Kingdom
Breath of the Wild
Ocarina of Time
> (define left-doc (<$> (text "Splatoon") (text "Nier")))
> (define right-doc (<$> (text "Automata") (text "FEZ"))) > (pretty-print (<> left-doc right-doc))
Splatoon
NierAutomata
FEZ
> (pretty-print (<+> left-doc right-doc))
Splatoon
NierAutomata
FEZ
> (pretty-print (<s> left-doc right-doc))
Splatoon
Nier Automata
FEZ
> (pretty-print (<+s> left-doc right-doc))
Splatoon
Nier Automata
FEZ
> (pretty-print (<> (text "when 1 = 2:") (nest 4 (<> nl (text "print 'oh no!'")))))
when 1 = 2:
print 'oh no!'
The increment does not affect content on the current line.
; "when 1 = 2:" is not further indented
> (pretty-print (nest 4 (<> (text "when 1 = 2:") nl (text "print 'oh no!'"))))
when 1 = 2:
print 'oh no!'
> (define subd (reset (<> (text "#<<EOF") nl (text "Zelda") nl (text "Baba is you") nl (text "EOF"))))
> (pretty-print (<> (text "when 1 = 2:") (nest 4 (<> nl (text "print ") subd))))
when 1 = 2:
print #<<EOF
Zelda
Baba is you
EOF
> (pretty-print (<> (text "a") fail)) the document fails to print
> (pretty-print (alt (<> (text "a") fail) (text "b"))) b
> (define the-comment (full (text "# this is a comment"))) > (define the-code (text "print(1)")) > (pretty-print (<> the-comment nl the-code))
# this is a comment
print(1)
> (pretty-print (<> the-comment the-code)) the document fails to print
> (pretty-print (alt (<> the-comment the-code) (<> the-comment nl the-code)))
# this is a comment
print(1)
> (pretty-print (<> the-comment nl (full (text "# this is another comment"))))
# this is a comment
# this is another comment
> (pretty-print (<> the-comment (text ""))) # this is a comment
> (define doc (<> (text "a") nl (text "b") nl (text "c"))) > (pretty-print doc)
a
b
c
> (pretty-print (flatten doc)) a b c
> (define doc2 (<> (text "a") break (text "b") break (text "c"))) > (pretty-print doc2)
a
b
c
> (pretty-print (flatten doc2)) abc
> (define doc3 (<> (text "a") hard-nl (text "b") hard-nl (text "c"))) > (pretty-print doc3)
a
b
c
> (pretty-print (flatten doc3)) the document fails to print
> (define doc4 (<> (text "a") (newline ", ") (text "b") (newline ", ") (text "c"))) > (pretty-print doc4)
a
b
c
> (pretty-print (flatten doc4)) a, b, c
4.3 Constants
4.4 Parameters
parameter
(current-page-width page-width) → void? page-width : natural?
= 80
parameter
(current-computation-width) → (or/c #f natural?)
(current-computation-width computation-width) → void? computation-width : (or/c #f natural?)
= #f
parameter
(current-offset offset) → void? offset : natural?
= 0
parameter
(current-special) → (-> any/c output-port? void?)
(current-special special) → void? special : (-> any/c output-port? void?)
= write-special
4.5 Match Expanders
Internally, a doc is either a :text, :newline, :concat, :alternatives, :align, :reset, :nest, :full, :fail, or :cost. We provide these match expanders to allow doc processing (see Processing Documents). The match expanders are illegal outside of the pattern position of the match form. Keep in mind that this list is unstable and could change across versions of the library.
syntax
(:text s len)
syntax
(:special s len)
syntax
(:newline s)
syntax
(:concat da db)
syntax
(:alternatives da db)
syntax
(:align d)
syntax
(:reset d)
syntax
(:nest n d)
syntax
(:full d)
syntax
(:fail)
syntax
(:cost c d)
5 Cost factory
Pretty printers choose an optimal layout from a document by minimizing an optimality objective. Unlike other pretty printers, which have built-in optimality objectives, pretty-expressive allows you to customize an optimality objective via the cost factory interface.
struct
(struct cost-factory (cost<=? cost+ cost-text cost-nl limit) #:extra-constructor-name make-cost-factory) cost<=? : (-> any/c any/c any/c) cost+ : (-> any/c any/c any/c) cost-text : (-> natural? natural? any/c) cost-nl : (-> natural? any/c) limit : natural?
(cost<=? a b) determines whether the cost a is less than or equal to the cost b.
(cost+ a b) combines costs a and b together to produce a new cost.
(cost-text c len) gives the cost of placing text of length len at column position c.
(cost-nl i) gives the cost of a newline followed by an indentation of i spaces.
limit is the computation width limit.
These functions should at minimum satisfy the following properties:
cost<=? should be a total order: reflexive, antisymmetric, and total.
For all costs a, b, c, and d, such that (cost<=? a b) and (cost<=? c d), cost+ should satisfy (cost<=? (cost+ a c) (cost+ b d)).
For all c, c*, and len, such that (<= c c*), cost-text should satisfy (cost<=? (cost-text c len) (cost-text c* len)).
For all i and i* such that (<= i i*), cost-nl should satisfy (cost<=? (cost-nl i) (cost-nl i*)).
cost+ should be commutative and associative.
(cost+ (cost-text c len) (cost-text (+ c len) len*)) should be equal to (cost-text c (+ len len*)).
procedure
(default-cost-factory [ #:page-width page-width #:computation-width computation-width]) → cost-factory? page-width : natural? = (current-page-width)
computation-width : (or/c #f natural?) = (current-computation-width)
Internally, this cost factory is implemented as:
(define (default-cost-factory #:page-width [page-width (current-page-width)] #:computation-width [computation-width (current-computation-width)]) (cost-factory (match-lambda** [((list b1 h1) (list b2 h2)) (cond [(= b1 b2) (<= h1 h2)] [else (< b1 b2)])]) (match-lambda** [((list b1 h1) (list b2 h2)) (list (+ b1 b2) (+ h1 h2))]) (λ (pos len) (define stop (+ pos len)) (cond [(> stop page-width) (define maxwc (max page-width pos)) (define a (- maxwc page-width)) (define b (- stop maxwc)) (list (* b (+ (* 2 a) b)) 0)] [else (list 0 0)])) (λ (i) (list 0 1)) (or computation-width (exact-floor (* page-width 1.2)))))
5.1 More cost factory examples
Consider the example in Best Practice for Document Construction. Each list can be rendered with three possible styles: horizontal style, vertical style, and argument list style.
> (pretty-print (pretty '("abc" "def" ("ghi" "jkl" "mno"))) #:page-width 15)
(abc def (ghi
jkl
mno))
Indeed, this is an optimal layout according to default-cost-factory, because it does not have any badness, and two newlines are minimal.
However, let’s say that we consider the vertical style to be not pretty. The vertical style should still be a possibility however, since it can help us avoid going over the page width limit and minimize the number of newlines in many situations. We simply would prefer other styles when all else is equal. In this case, we would prefer the output:
(abc def (ghi jkl mno))
To address this issue, we construct a new cost factory.
> (define (my-cost-factory #:page-width [page-width (current-page-width)] #:computation-width [computation-width (current-computation-width)]) (cost-factory (match-lambda** [((list b1 h1 sc1) (list b2 h2 sc2)) (cond [(= b1 b2) (cond [(= h1 h2) (<= sc1 sc2)] [else (< h1 h2)])] [else (< b1 b2)])]) (match-lambda** [((list b1 h1 sc1) (list b2 h2 sc2)) (list (+ b1 b2) (+ h1 h2) (+ sc1 sc2))]) (λ (pos len) (define stop (+ pos len)) (cond [(> stop page-width) (define maxwc (max page-width pos)) (define a (- maxwc page-width)) (define b (- stop maxwc)) (list (* b (+ (* 2 a) b)) 0 0)] [else (list 0 0 0)])) (λ (i) (list 0 1 0)) (or computation-width (exact-floor (* page-width 1.2)))))
The cost of my-cost-factory is similar to that of default-cost-factory, but it has an extra component: style cost. When all else is equal, we prefer a cost with less style cost.
We can now construct a function to convert an S-expression to a doc. It penalizes the vertical style by adding a style cost to that choice.
> (define (new-pretty s) (match s [(list) (<+> lparen rparen)] [(list x) (<+> lparen (new-pretty x) rparen)] [(list x xs ...) (define x-doc (new-pretty x)) (define xs-doc (map new-pretty xs)) (<+> lparen (alt (as-concat (cons x-doc xs-doc)) ; Add a style cost to penalize the vertical style (cost (list 0 0 1) (v-concat (cons x-doc xs-doc))) (<+> x-doc space (v-concat xs-doc))) rparen)] [_ (text s)]))
Now we can pretty print as we desired:
> (pretty-print/factory (new-pretty '("abc" "def" ("ghi" "jkl" "mno"))) (my-cost-factory #:page-width 15))
(abc def
(ghi jkl
mno))
; Three styles are still possible > (define new-abcd-doc (new-pretty '("a" "b" "c" "d"))) > (pretty-print/factory new-abcd-doc (my-cost-factory #:page-width 10)) (a b c d)
> (pretty-print/factory new-abcd-doc (my-cost-factory #:page-width 6))
(a b
c
d)
> (pretty-print/factory new-abcd-doc (my-cost-factory #:page-width 4))
(a
b
c
d)
6 Processing Documents
(require pretty-expressive/process) | |
package: pretty-expressive |
procedure
(doc-process f d) → doc?
f : procedure? d : doc?
Prefer using this function over manual matching against all match expanders. The list of match expanders could change across versions of this library, making programs that directly matches against all expanders brittle to changes. Using this function on the other hand makes doc processing stable across versions.
Bibliography
Sorawee Porncharoenwase, Justin Pombrio, and Emina Torlak. A pretty expressive printer. In Proc. Object-Oriented Programming, Systems, Languages and Applications, 2023. |