4 Filtering

The operations in this section subset a data frame, returning only the rows that satisfy a given condition.

syntax

(where df (bound-column ...) body ...)

 
  df : (or/c data-frame? grouped-data-frame?)
Returns df with only the rows for which body returns true.

Within body, each bound-column is bound to that column's value in the current row. The frame is traversed row by row, and body is evaluated once per row with those bindings.

Examples:
> (~> example-df
      (where (adult) (> adult 3))
      show)

data-frame: 2 rows x 4 columns
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│b  │b  │4    │40 │
├───┼───┼─────┼───┤
│b  │b  │5    │50 │
└───┴───┴─────┴───┘

> (~> example-df
      (where (grp juv) (and (string=? grp "b") (< juv 50)))
      show)

data-frame: 2 rows x 4 columns
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│b  │a  │3    │30 │
├───┼───┼─────┼───┤
│b  │b  │4    │40 │
└───┴───┴─────┴───┘
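
The row-by-row evaluation described for where can be pictured in plain Racket. The following is only a minimal sketch and not how the library is implemented: it models a frame as a list of hashes, and both rows and keep-rows are hypothetical names introduced for illustration.

```racket
#lang racket

;; Hypothetical stand-in for a data frame: a list of rows, each row a hash.
;; The values are illustrative and are not claimed to equal example-df.
(define rows
  (list (hash 'grp "a" 'trt "b" 'adult 1 'juv 10)
        (hash 'grp "b" 'trt "a" 'adult 3 'juv 30)
        (hash 'grp "b" 'trt "b" 'adult 4 'juv 40)
        (hash 'grp "b" 'trt "b" 'adult 5 'juv 50)))

;; keep-rows (hypothetical helper): for each row, look up the named columns,
;; pass their values to pred, and keep the row when pred returns a true value.
(define (keep-rows rows columns pred)
  (filter (λ (row)
            (apply pred
                   (for/list ([c (in-list columns)])
                     (hash-ref row c))))
          rows))

;; Analogue of (where (adult) (> adult 3))
(define adults>3
  (keep-rows rows '(adult) (λ (adult) (> adult 3))))

;; Analogue of (where (grp juv) (and (string=? grp "b") (< juv 50)))
(define b-young
  (keep-rows rows '(grp juv)
             (λ (grp juv) (and (string=? grp "b") (< juv 50)))))
```

In this model the bound-columns play the role of the lambda's parameters, and body plays the role of the lambda's body.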

syntax

(where* df (column-name ...) (match-pattern ...))

 
  df : (or/c data-frame? grouped-data-frame?)
Returns df with only the rows in which every column-name matches its corresponding match-pattern. See match.

There must be exactly as many column-names as there are match-patterns.

Example:
> (~> example-df
      (where* (grp juv) ("b" (? (λ (x) (< x 50)) _)))
      show)

data-frame: 2 rows x 4 columns
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│b  │a  │3    │30 │
├───┼───┼─────┼───┤
│b  │b  │4    │40 │
└───┴───┴─────┴───┘
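
The pairing of column-names with match-patterns can be sketched with racket/match directly. This is an illustration under the assumption that a frame is modelled as a list of hashes; it is not the library's implementation, and rows and matched are hypothetical names.

```racket
#lang racket

;; Hypothetical stand-in for a data frame: a list of rows, each row a hash.
(define rows
  (list (hash 'grp "a" 'trt "b" 'adult 1 'juv 10)
        (hash 'grp "b" 'trt "a" 'adult 3 'juv 30)
        (hash 'grp "b" 'trt "b" 'adult 4 'juv 40)
        (hash 'grp "b" 'trt "b" 'adult 5 'juv 50)))

;; Analogue of (where* (grp juv) ("b" (? (λ (x) (< x 50)) _))):
;; collect the column values of each row into a list and match that list
;; against one pattern per column, in the same order.
(define matched
  (filter (λ (row)
            (match (list (hash-ref row 'grp) (hash-ref row 'juv))
              [(list "b" (? (λ (x) (< x 50)) _)) #t] ; every pattern matched
              [_ #f]))                               ; otherwise drop the row
          rows))
```

Because one pattern is consumed per column value, the list pattern only lines up when the two counts agree, which mirrors the requirement that there be as many column-names as match-patterns.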

syntax

(deduplicate df column-name ...)

 
  df : (or/c data-frame? grouped-data-frame?)
Returns df with duplicate rows removed, so that each combination of values in the given column-names appears only once. deduplicate respects groups: if the data frame is grouped, each group is deduplicated separately rather than the frame as a whole.

Note that duplicates are determined by combinations of the given column-names, so (deduplicate df col1 col2) is not the same as (deduplicate (deduplicate df col2) col1).

Examples:
> (~> example-df
      (deduplicate trt)
      show)

data-frame: 2 rows x 4 columns
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│a  │b  │1    │10 │
├───┼───┼─────┼───┤
│b  │a  │3    │30 │
└───┴───┴─────┴───┘

> (~> example-df
      (group-with "grp")
      (deduplicate trt)
      show)

data-frame: 3 rows x 4 columns
groups: (grp)
┌───┬───┬───┬─────┐
│juv│trt│grp│adult│
├───┼───┼───┼─────┤
│10 │b  │a  │1    │
├───┼───┼───┼─────┤
│30 │a  │b  │3    │
├───┼───┼───┼─────┤
│40 │b  │b  │4    │
└───┴───┴───┴─────┘
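
To see why deduplicating on a combination of columns differs from deduplicating on each column in turn, consider the plain-Racket sketch below. It is not the library's implementation: the frame is modelled as a list of hashes, dedup-by is a hypothetical helper, and keeping the first row of each combination is an assumption chosen to agree with the example output above.

```racket
#lang racket

;; Hypothetical rows; adult doubles as a row identifier here.
(define rows
  (list (hash 'grp "a" 'trt "b" 'adult 1)
        (hash 'grp "a" 'trt "b" 'adult 2)
        (hash 'grp "b" 'trt "a" 'adult 3)
        (hash 'grp "b" 'trt "b" 'adult 4)
        (hash 'grp "b" 'trt "b" 'adult 5)))

;; dedup-by (hypothetical helper): keep the first row for each distinct
;; combination of values in the given columns. remove-duplicates keeps the
;; first occurrence and compares keys with equal?.
(define (dedup-by rows columns)
  (remove-duplicates
   rows
   #:key (λ (row)
           (for/list ([c (in-list columns)])
             (hash-ref row c)))))

;; Combination of both columns: (a b), (b a), and (b b) are all distinct.
(define combined (dedup-by rows '(grp trt)))

;; One column at a time: deduplicating on trt first discards the only
;; (b b) row, so the final result is smaller.
(define nested (dedup-by (dedup-by rows '(trt)) '(grp)))

;; In this model, deduplicating trt within each grp group keeps the same
;; rows as deduplicating on the combination of grp and trt.
(define per-group (dedup-by rows '(grp trt)))
```

Here combined keeps three rows while nested keeps two, which is the difference the note about combinations describes.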