5 Refactoring Recipes

This section is not meant to be read straight through, unless you are very studious.

A green identifier like Team indicates something that is being added to define-schema. A red highlight like (foo (bar x)) is used to help you follow a piece of code as it gets relocated, possibly with small adjustments.

5.1 Elemental Recipes

5.1.1 Join -> Expression

This recipe simply rewrites a join into a self-contained expression that can be relocated in future refactorings.

; original version:
(from x TableX
      ....
      (join y TableY
            ....)
      ....)
; refactored version:
(from x TableX
      ....
      (define y
        (join y TableY #:to x
              ....))
      ....)

The refactored code is bigger! Is this a step in the wrong direction? If we stop refactoring here, you could argue that it is. The point of this recipe is to set up future refactoring. In the original version, there was an unwritten #:to x that would be lost if we relocated the code. By making #:to x explicit, we can now relocate the code using normal Racket techniques.

5.1.2 Expression -> Procedure

In this recipe, we extract any expression into a procedure. Unless you are brand new to Racket, you have done this before.

; original version:
(from x TableX
      ....
      (select (foo (bar x)
                   (baz x)))
      ....)
; refactored version:
(define (NEW-PROC x)
  (foo (bar x)
       (baz x)))
(from x TableX
      ....
      (select (NEW-PROC x))
      ....)

Assuming that foo returns a Number?, you can add a contract to NEW-PROC as follows:

(define/contract (NEW-PROC x)
  (-> (instanceof TableX) Number?)
  (foo (bar x)
       (baz x)))

Here is one more example. This time the expression is a join. Remember that you might have to use the Join -> Expression recipe first if your join does not have the #:to argument specified.

; original version:
(from x TableX
      ....
      (define y
        (join y TableY #:to x
              (join-on (.= (YID y)
                           (YID x)))))
      ....)
; refactored version:
(define/contract (Y-given-X x)
  (-> (instanceof TableX) (instanceof TableY))
  (join y TableY #:to x
        (join-on (.= (YID y)
                     (YID x)))))
(from x TableX
      ....
      (define y (Y-given-X x))
      ....)

5.1.3 Procedure -> define-schema

If you are using define-schema and you have a procedure like this

(define/contract (NEW-PROC x)
  (-> (instanceof TableX) any/c)
  (foo (bar x)
       (baz x)))

then it is a candidate for being moved into define-schema, assuming that:

It accepts one argument. (The argument is x in this example.)
That argument is an instanceof a table from your schema definition. (The table is TableX in this example.)

First we need to decide which keyword is appropriate based on what this procedure returns. Choose from:

#:property if the return value is a Scalar? (or one of its subtypes)
#:has-group if the return value is a singular grouped join. "Singular" means that adding the join to a query of TableX will not increase the number of rows that will be returned in the result set. "Grouped" means that the join contains at least one group-by clause.
#:has-one if the return value is a singular simple join. "Singular" means that adding the join to a query of TableX will not increase the number of rows that will be returned in the result set. "Simple" means that each of the join’s clauses is either a join-on or join-type clause.

If you were unable to choose a keyword, then NEW-PROC probably does not belong inside define-schema, but you can keep it as a separate procedure. Let’s just pretend that (foo ....) returns a Scalar?, so we choose the #:property keyword. We add that expression into define-schema and replace the single argument (which was x) with this as follows:

(define-schema my-schema
  ....
  (table TableX
         ....
         #:property
         [NEW-PROC
          (foo (bar this)
               (baz this))]
         ....)
  ....)

Warning: this recipe is not complete! Continue reading the following subsections.

On Strict Comparisons

If you have a strict comparison involving this, you should add a fallback if one is not already present.

; Notice that `this` is used in a strict comparison...
(.= (foo bar)
    (foo this))
; ... and surround it with a fallback:
(.= (foo bar)
    (?? (foo this) /void))

For the purposes of the Using define-schema walkthrough, you can always add the /void fallback as seen above and move on. If you are not satisfied with this hand-waving, first read the Nullability section and then "this May Be Null".

Alternatively, you don’t have to add the fallback now. If your code worked without a fallback prior to applying this recipe, it will still work without a fallback after applying this recipe. But future callers of this procedure might get an error.

On Joins

If the definition of NEW-PROC returns a join, you will have something like the following code. You can omit the #:to this if you want, because define-schema will automatically add it for you.

[NEW-PROC
 (join y TableY #:to this
       clauses ....)]

On Left Joins

I recommend that no join you add to define-schema should ever remove rows from the result set. For example, perhaps a Player #:has-one Team, but this relationship is optional (that is, a Player might have no current Team). In this case, the join should have (join-type 'left) so that callers who use this join do not accidentally filter out Players who have no current Team. If a caller really wants to convert a 'left join into an 'inner join, they can do so as follows:

(from p Player
      ; (Team p) returns a 'left join ...
      (join t (Team p)
            ; ... but we can override that here:
            (join-type 'inner))
      ....)

Note that almost every #:has-group relationship should be a left join. A group containing zero members is considered a failed join, so unless the join is a left join, those rows will be filtered from the result set.
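As a sketch of this advice, here is a grouped join written as a left join. It assumes the Team and Player tables and the PlayersG name used in the Grouped Join -> Schema Definition recipe. Like the other fragments in this section, the .... placeholders stand for code that is not shown, so it is not runnable as written.

```racket
(define-schema
  ....
  (table Team
         ....
         #:has-group
         [PlayersG
          (join playersG Player
                ; a left join keeps each Team whose group has zero Players
                (join-type 'left)
                (group-by (TeamID playersG))
                (join-on (.= (TeamID playersG)
                             (?? (TeamID this) /void))))]
         ....)
  ....)
```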

5.1.4 Join <-> Define

This recipe allows you to convert a join to a definition and back. This only works if (TableY x) returns a join?:

(from x TableX
      ....
      (join y (TableY x))
      ....)
; is almost equivalent to
(from x TableX
      ....
      (define y (TableY x))
      ....)

The preceding examples are "almost equivalent" because there is a subtle case in which they are not equivalent.

When y is joined, the join is immediately added to the query and is guaranteed to appear in the generated SQL.
When y is defined, the join is not immediately added to the query. If y is used as content in some clause that follows, the join is added to the query at that time and the two versions become equivalent. But if y is an unused definition, the join essentially does not exist and the two versions are not equivalent.
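To illustrate the subtle case, here is a sketch in which y is never used by a later clause. The select clause and foo are hypothetical placeholders and are not part of the original example.

```racket
; y is joined: the join to TableY appears in the generated SQL,
; even though no later clause mentions y
(from x TableX
      (join y (TableY x))
      (select (foo x)))

; y is defined but unused: no join to TableY is added to the query
(from x TableX
      (define y (TableY x))
      (select (foo x)))
```

If the join can change which rows are returned, the two queries can produce different results, so it is worth checking that y is actually used before converting a join into a define.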

5.2 Compound Recipes

These recipes use one or more of the Elemental Recipes.

5.2.1 Singular Join -> Schema Definition

This recipe moves a singular join into define-schema.

Caution: In this example, the single-argument procedure Team happens to share its name with the existing table Team. This name-sharing is very common with singular joins, but not required.

; current code:
(from p Player
      ....
      (join t Team
            (join-on (.= (TeamID t)
                         (TeamID p))))
      ....)
; desired code:
(from p Player
      ....
      (join t (Team p))
      ....)

First we use the Join -> Expression recipe as follows:

(from p Player
      ....
      (define t
        (join t Team #:to p
              (join-on (.= (TeamID t)
                           (TeamID p)))))
      ....)

Next we use the Expression -> Procedure recipe as follows:

(define/contract (NEW-PROC p)
  (-> (instanceof Player) (instanceof Team))
  (join t Team #:to p
        (join-on (.= (TeamID t)
                     (TeamID p)))))
(from p Player
      ....
      (define t (NEW-PROC p))
      ....)

Next we use the Procedure -> define-schema recipe to move NEW-PROC into our schema definition. We also immediately rename it to Team.

(define-schema
  ....
  (table Player
         ....
         #:has-one
         [Team
          (join t Team
                (join-on (.= (TeamID t)
                             (?? (TeamID this) /void))))]
         ....)
  ....)
(from p Player
      ....
      (define t (Team p))
      ....)

Finally we use the Join <-> Define recipe to make sure we are equivalent to our starting position:

(from p Player
      ....
      (join t (Team p))
      ....)

Singular Join Naming

As mentioned above, the single-argument procedure Team shares its name with the table Team. But this does not have to be the case. You could, for example, name the procedure CurrentTeam instead. Then the refactored code would look like this:

(define-schema
  ....
  (table Player
         ....
         #:has-one
         [CurrentTeam
          (join t Team
                (join-on (.= (TeamID t)
                             (?? (TeamID this) /void))))]
         ....)
  ....)
(from p Player
      ....
      (join t (CurrentTeam p))
      ....)

5.2.2 Grouped Join -> Schema Definition

This recipe moves a grouped join into define-schema.

; current code:
(from t Team
      ....
      (join playersG Player
            (group-by (TeamID playersG))
            (join-on (.= (TeamID playersG)
                         (TeamID t))))
      ....)
; desired code:
(from t Team
      ....
      (join playersG (PlayersG t))
      ....)

First we use the Join -> Expression recipe to produce code like this:

(from t Team
      ....
      (define playersG
        (join playersG Player #:to t
              (group-by (TeamID playersG))
              (join-on (.= (TeamID playersG)
                           (TeamID t)))))
      ....)

Next we use the Expression -> Procedure recipe to produce code like this:

(define/contract (NEW-PROC t)
  (-> (instanceof Team) (instanceof Player))
  (join playersG Player #:to t
        (group-by (TeamID playersG))
        (join-on (.= (TeamID playersG)
                     (TeamID t)))))
(from t Team
      ....
      (define playersG (NEW-PROC t))
      ....)

Next we use the Procedure -> define-schema recipe to move NEW-PROC into our schema definition. We also immediately rename it to PlayersG. My personal convention is that the name of a grouped join ends with "G".

(define-schema
  ....
  (table Team
         ....
         #:has-group
         [PlayersG
          (join playersG Player
                (group-by (TeamID playersG))
                (join-on (.= (TeamID playersG)
                             (?? (TeamID this) /void))))]
         ....)
  ....)
(from t Team
      ....
      (define playersG (PlayersG t))
      ....)

Finally we use the Join <-> Define recipe to make sure we are equivalent to our starting position:

(from t Team
      ....
      (join playersG (PlayersG t))
      ....)

And this recipe is complete.

5.2.3 Scalar -> Schema Definition

This recipe moves a scalar into define-schema.

; current code:
(from p Player
      ....
      (select (./ (ShotsMade p)
                  (ShotsTaken p)))
      ....)
; desired code:
(from p Player
      ....
      (select (ShootingPercentage p))
      ....)

First we use the Expression -> Procedure recipe as follows:

(define/contract (NEW-PROC p)
  (-> (instanceof Player) Scalar?)
  (./ (ShotsMade p)
      (ShotsTaken p)))
(from p Player
      ....
      (select (NEW-PROC p))
      ....)

Finally we use the Procedure -> define-schema recipe to move NEW-PROC into our schema definition. We also immediately rename it to ShootingPercentage.

(define-schema
  ....
  (table Player
         ....
         #:property
         [ShootingPercentage
          (./ (ShotsMade this)
              (ShotsTaken this))]
         ....)
  ....)
(from p Player
      ....
      (select (ShootingPercentage p))
      ....)

And this recipe is complete.

5.2.4 Scalar Flattening

This recipe is a special case of the Scalar -> Schema Definition recipe. This recipe says that if (TeamName (Team p)) is already defined, we can easily add a new property (TeamName p) which will be equal to the original expression.

; current code:
(from p Player
      ....
      (select (TeamName (Team p)))
      ....)
; desired code:
(from p Player
      ....
      (select (TeamName p))
      ....)

First we use the Expression -> Procedure recipe as follows:

(define/contract (NEW-PROC p)
  (-> (instanceof Player) Scalar?)
  (TeamName (Team p)))
(from p Player
      ....
      (select (NEW-PROC p))
      ....)

Finally we use the Procedure -> define-schema recipe to move NEW-PROC into our schema definition. We also immediately rename it to TeamName.

(define-schema
  ....
  (table Player
         ....
         #:property
         [TeamName
          (TeamName (Team this))]
         ....)
  ....)
(from p Player
      ....
      (select (TeamName p))
      ....)

And this recipe is complete.

5.2.5 Inline Join

This recipe moves a join inline. This is mostly used to set up further refactoring. It does not add anything to define-schema.

; current code:
(from p Player
      ....
      (join t (Team p))
      ....
      (select (TeamName t))
      ....)
; desired code:
(from p Player
      ....
      ; this code gets removed:
      (join t (Team p))
      ....
      (select (TeamName (Team p)))
      ....)

We first use the Join <-> Define recipe as follows:

(from p Player
      ....
      (define t (Team p))
      ....
      (select (TeamName t))
      ....)

Now we just use normal refactoring techniques to replace t with its definition as follows:

(from p Player
      ....
      ; this code is removed:
      (define t (Team p))
      ....
      (select (TeamName (Team p)))
      ....)

And this recipe is complete. The key point is that if we proceed to use the Expression -> Procedure recipe on (TeamName (Team p)), the resulting procedure will accept one argument which is an (instanceof Player). Applied to the original expression (TeamName t), the resulting procedure would instead have wanted an (instanceof Team).
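To make the difference concrete, here is a sketch of what the Expression -> Procedure recipe would produce in each case. The names FROM-TEAM and FROM-PLAYER are hypothetical, chosen only to tell the two procedures apart.

```racket
; extracted from the original version, where the expression was (TeamName t):
(define/contract (FROM-TEAM t)
  (-> (instanceof Team) Scalar?)
  (TeamName t))

; extracted after inlining, where the expression is (TeamName (Team p)):
(define/contract (FROM-PLAYER p)
  (-> (instanceof Player) Scalar?)
  (TeamName (Team p)))
```

The second procedure takes a Player, so the Procedure -> define-schema recipe can move it into the Player table, which is what the Scalar Flattening recipe does.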

5.2.6 Name Clarification

This recipe creates a more descriptive name for a procedure. This recipe assumes we are using define-schema.

; current code:
(from t Team
      ....
      (select (Name t))
      ....)
; desired code:
(from t Team
      ....
      (select (TeamName t))
      ....)

We want to create TeamName as an alias for Name when the argument is an (instanceof Team). First we use the Expression -> Procedure recipe to get the following code:

(define/contract (NEW-PROC t)
  (-> (instanceof Team) Scalar?)
  (Name t))
(from t Team
      ....
      (select (NEW-PROC t))
      ....)

Finally we use the Procedure -> define-schema recipe to move NEW-PROC into our schema definition. We also immediately rename it to TeamName.

(define-schema
  ....
  (table Team
         ....
         #:property
         [TeamName
          (Name this)]
         ....)
  ....)
(from t Team
      ....
      (select (TeamName t))
      ....)

And we are done.

Note that define-schema automatically sets the #:as name of each #:property, as if you had written the following:

#:property
[TeamName
 (>> (Name this)
     #:as 'TeamName)]

Be aware of this to avoid breaking any existing call sites that depend on the original name appearing in the result set. The examples in this documentation ignore this caveat because this recipe is always used on a brand new query that has no call sites yet.
