4 Filtering
Filtering subsets a data frame, returning only the rows that satisfy a given condition.
syntax
(where df (bound-column ...) body ...)
df : (or/c data-frame? grouped-data-frame?)
Within body, each bound-column is bound to the value of that column in the current row. where iterates over the rows of df, evaluates body once per row with those bindings, and keeps the row when body returns a true value.
> (~> example-df (where (adult) (> adult 3)) show)
data-frame: 2 rows x 4 columns
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│b  │b  │4    │40 │
├───┼───┼─────┼───┤
│b  │b  │5    │50 │
└───┴───┴─────┴───┘
> (~> example-df (where (grp juv) (and (string=? grp "b") (< juv 50))) show)
data-frame: 2 rows x 4 columns
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│b  │a  │3    │30 │
├───┼───┼─────┼───┤
│b  │b  │4    │40 │
└───┴───┴─────┴───┘
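The row-wise evaluation described for where can be sketched in plain Racket, without the library. This is only an illustration of the semantics, not how where is implemented: rows and where-sketch are hypothetical names, each row is modelled as a hash, and the data contains only the four rows visible in the outputs of this section (the real example-df may hold more).

```racket
#lang racket/base

;; Hypothetical stand-in for example-df: one hash per row.
;; Only rows that appear in this section's outputs are included.
(define rows
  (list (hash 'grp "a" 'trt "b" 'adult 1 'juv 10)
        (hash 'grp "b" 'trt "a" 'adult 3 'juv 30)
        (hash 'grp "b" 'trt "b" 'adult 4 'juv 40)
        (hash 'grp "b" 'trt "b" 'adult 5 'juv 50)))

;; where-sketch : (listof hash?) (listof symbol?) procedure? -> (listof hash?)
;; Keeps each row for which pred, applied to the values of the named
;; columns in that row, returns a true value.
(define (where-sketch rows cols pred)
  (filter (λ (row)
            (apply pred (map (λ (c) (hash-ref row c)) cols)))
          rows))

;; Same condition as the second where example: grp is "b" and juv < 50.
(define kept
  (where-sketch rows '(grp juv)
                (λ (grp juv) (and (string=? grp "b") (< juv 50)))))

;; Print the adult value of every kept row.
(map (λ (row) (hash-ref row 'adult)) kept)
```

The kept rows are the ones with adult values 3 and 4, which agrees with the table printed for the second where example. Passing the column names as a list of symbols and the condition as a procedure stands in for what the where form does syntactically with bound-column and body.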
syntax
(where* df (column-name ...) (match-pattern ...))
df : (or/c data-frame? grouped-data-frame?)
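Keeps the rows in which the value of each column-name matches the match-pattern in the same position. There must be exactly as many column-names as there are match-patterns.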
> (~> example-df (where* (grp juv) ("b" (? (λ (x) (< x 50)) _))) show)
data-frame: 2 rows x 4 columns
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│b  │a  │3    │30 │
├───┼───┼─────┼───┤
│b  │b  │4    │40 │
└───┴───┴─────┴───┘
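The patterns in the where* example can be tried on their own with racket/match, assuming (as the name match-pattern suggests) that where* accepts ordinary match patterns. The helper matches? below is hypothetical and exists only for this sketch; it pairs one pattern with each value, the same way where* pairs one pattern with each column.

```racket
#lang racket/base
(require racket/match)

;; matches? : any/c any/c -> boolean?
;; Hypothetical helper: does a (grp, juv) pair satisfy the two patterns
;; used in the where* example?
(define (matches? grp juv)
  (match* (grp juv)
    ;; grp must equal "b"; juv must pass the predicate (< x 50).
    [("b" (? (λ (x) (< x 50)) _)) #t]
    ;; Any other pair of values fails to match.
    [(_ _) #f]))

(matches? "b" 30) ; both patterns match
(matches? "b" 50) ; the juv pattern fails
(matches? "a" 10) ; the grp pattern fails
```

Only the first call returns #t. A literal such as "b" is compared with equal?, and (? pred pat) matches when pred returns true and pat also matches, so the example's condition is the same as the (and (string=? grp "b") (< juv 50)) body used with where.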
syntax
(deduplicate df column-name ...)
df : (or/c data-frame? grouped-data-frame?)
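Removes duplicate rows from df. Duplicates are determined by the combination of values in the given column-names, not by each column separately, so (deduplicate df col1 col2) is not the same as (deduplicate (deduplicate df col2) col1). On a grouped data frame, rows are deduplicated within each group, as the grouped example below shows.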
> (~> example-df (deduplicate trt) show)
data-frame: 2 rows x 4 columns
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│a  │b  │1    │10 │
├───┼───┼─────┼───┤
│b  │a  │3    │30 │
└───┴───┴─────┴───┘
> (~> example-df (group-with "grp") (deduplicate trt) show)
data-frame: 3 rows x 4 columns
groups: (grp)
┌───┬───┬───┬─────┐
│juv│trt│grp│adult│
├───┼───┼───┼─────┤
│10 │b  │a  │1    │
├───┼───┼───┼─────┤
│30 │a  │b  │3    │
├───┼───┼───┼─────┤
│40 │b  │b  │4    │
└───┴───┴───┴─────┘
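The difference between deduplicating on a combination of columns and deduplicating one column at a time can be checked with a small plain-Racket sketch. It assumes that the first row of each distinct combination is the one kept, which is consistent with the outputs above but is an inference, not a documented guarantee. rows and dedup-sketch are hypothetical names, and the data contains only the four rows visible in this section's outputs.

```racket
#lang racket/base
(require racket/list)

;; Hypothetical stand-in for example-df: one hash per row.
(define rows
  (list (hash 'grp "a" 'trt "b" 'adult 1 'juv 10)
        (hash 'grp "b" 'trt "a" 'adult 3 'juv 30)
        (hash 'grp "b" 'trt "b" 'adult 4 'juv 40)
        (hash 'grp "b" 'trt "b" 'adult 5 'juv 50)))

;; dedup-sketch : (listof hash?) (listof symbol?) -> (listof hash?)
;; Keeps the first row for each distinct combination of values in cols.
(define (dedup-sketch rows cols)
  (remove-duplicates
   rows
   #:key (λ (row) (map (λ (c) (hash-ref row c)) cols))))

;; Combined: distinct (grp, trt) pairs.
(define combined (dedup-sketch rows '(grp trt)))

;; Nested: distinct trt values first, then distinct grp values.
(define nested (dedup-sketch (dedup-sketch rows '(trt)) '(grp)))

(length combined)
(length nested)
```

The combined call keeps three rows, one each for the pairs (a, b), (b, a) and (b, b). The nested call keeps only two, because deduplicating on trt alone already discards the (b, b) rows before grp is considered.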