3 Grouping and splitting
Most operations in Sawzall respect the "grouping" of a data-frame. Grouping takes an existing data-frame and converts it into a grouped one, with groups defined by the values of one or more variables; subsequent operations are then performed group by group.
3.1 Grouping
procedure
(grouped-data-frame? v) → boolean?
v : any/c
procedure
(group-with df var ...) → grouped-data-frame?
df : data-frame?
var : string?
Groups df by the variables var and returns a grouped data-frame. Grouping does not change how the data-frame is displayed with show or introspect, but the result is internally different and cannot be used with regular data-frame operators like df-select.
> (~> example-df (group-with "grp" "trt") show)
data-frame: 5 rows x 4 columns
groups: (trt grp)
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│a  │b  │1    │10 │
├───┼───┼─────┼───┤
│a  │b  │2    │20 │
├───┼───┼─────┼───┤
│b  │a  │3    │30 │
├───┼───┼─────┼───┤
│b  │b  │4    │40 │
├───┼───┼─────┼───┤
│b  │b  │5    │50 │
└───┴───┴─────┴───┘
procedure
(ungroup df) → data-frame?
df : (or/c data-frame? grouped-data-frame?)
Removes all grouping from df and returns a regular data-frame. If df is not grouped, this does nothing.
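As a rough sketch of the round trip, the following groups a small frame and then ungroups it so that df-select can be used again. The frame demo-df is a hypothetical stand-in that mirrors the table shown for example-df; it is built with make-data-frame, make-series and df-add-series! from the data-frame package, since the definition of example-df is not shown in this section.

```racket
#lang racket
(require data-frame sawzall)

;; Assumption: a stand-in for example-df, mirroring the table shown earlier.
(define demo-df (make-data-frame))
(for ([name (in-list '("grp" "trt" "adult" "juv"))]
      [data (in-list (list (vector "a" "a" "b" "b" "b")
                           (vector "b" "b" "a" "b" "b")
                           (vector 1 2 3 4 5)
                           (vector 10 20 30 40 50)))])
  (df-add-series! demo-df (make-series name #:data data)))

;; Group by two variables, then drop all grouping again.
(define grouped (group-with demo-df "grp" "trt"))
(define flat (ungroup grouped))

(grouped-data-frame? grouped) ; the grouped result satisfies the predicate
(data-frame? flat)            ; ungroup hands back a regular data-frame
(df-select flat "adult")      ; regular operators can be used after ungroup
```

Calling ungroup before reaching for operators such as df-select keeps grouped and ungrouped code paths clearly separated.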
procedure
(ungroup-once df) → (or/c data-frame? grouped-data-frame?)
df : (or/c data-frame? grouped-data-frame?)
Removes a single level of grouping from df, so the result is still grouped if more levels remain. If df is not grouped, this does nothing.
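The difference from ungroup can be sketched as follows, assuming that each variable passed to group-with contributes one level of grouping. The frame demo-df is a hypothetical stand-in built with the data-frame package, and the names two-levels, one-level and no-levels are illustrative only.

```racket
#lang racket
(require data-frame sawzall)

;; Assumption: a small stand-in frame; only the grouping columns matter here.
(define demo-df (make-data-frame))
(df-add-series! demo-df (make-series "grp" #:data (vector "a" "a" "b" "b" "b")))
(df-add-series! demo-df (make-series "trt" #:data (vector "b" "b" "a" "b" "b")))
(df-add-series! demo-df (make-series "adult" #:data (vector 1 2 3 4 5)))

;; Assumption: two grouping variables give two levels of grouping.
(define two-levels (group-with demo-df "grp" "trt"))
(define one-level (ungroup-once two-levels)) ; peel off one level
(define no-levels (ungroup-once one-level))  ; peel off the remaining level

(grouped-data-frame? one-level) ; is the frame still grouped after one step?
(grouped-data-frame? no-levels) ; and after the second step?
(ungroup-once no-levels)        ; calling it on an ungrouped frame is allowed
```

Because the return contract allows either kind of frame, code that calls ungroup-once in a loop may want to check grouped-data-frame? before assuming the result is still grouped.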
3.2 Splitting
The following operations behave similarly to their grouping counterparts above, but they work with a list of data-frames instead of a grouped data-frame, so you must use map to split by further variables or to perform operations on each piece.
These operations are also notably less performant due to the amount of copying involved.
procedure
(split-with df var) → (listof data-frame?)
df : data-frame?
var : string?
procedure
(combine df ...) → data-frame?
df : data-frame?
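A minimal sketch of the split, map, combine pattern described above is given below. The frame demo-df is a hypothetical stand-in built with the data-frame package, and the names pieces, sizes and rejoined are illustrative; the sketch also assumes that combine stacks the rows of frames that share the same columns.

```racket
#lang racket
(require data-frame sawzall)

;; Assumption: a small stand-in frame with one splitting column.
(define demo-df (make-data-frame))
(df-add-series! demo-df (make-series "grp" #:data (vector "a" "a" "b" "b" "b")))
(df-add-series! demo-df (make-series "adult" #:data (vector 1 2 3 4 5)))

;; Split into one data-frame per distinct value of "grp".
(define pieces (split-with demo-df "grp"))

;; pieces is an ordinary list, so per-piece work is done with map.
(define sizes (map df-row-count pieces))

;; combine takes the frames as separate arguments, hence apply.
(define rejoined (apply combine pieces))
(df-row-count rejoined)
```

Since every piece is a separate data-frame, this pattern trades the convenience of plain list operations for the extra copying noted above.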