5 Slicing

This operation subsets a data frame, returning only the columns selected by an expression in a small, dedicated selection language.

syntax
(slice df slice-spec)

slice-spec = string
| [string-literal ...]
| regexp
| everything
| (or slice-spec ...)
| (and slice-spec ...)
| (not slice-spec)
| (all-in string-sequence)
| (any-in string-sequence)
| (starting-with string)
| (ending-with string)
| (containing string)

   df : (or/c data-frame? grouped-data-frame?)
   string : string?
   regexp : regexp?
   string-literal : string?
   string-sequence : (sequence/c string?)

Constructs a new data-frame containing the columns of the input df that are selected by evaluating slice-spec.

slice-spec is an expression in a small language of its own. Values in this language are:

A string?, to select the single column with that name.
A regexp?, to select columns with names matching that regular expression.

The language has the following operators:

[str ...]

str : string?
Selects the columns named by each str. Brackets are not mandatory here (parentheses work too), but they are recommended for readability.
The strings supplied here must be string literals, because accepting arbitrary expressions would make syntax error messages much less helpful. To select columns named by a variable, use all-in or any-in.

syntax
everything
Selects every column in the given data-frame.

syntax
(or spec ...)
Selects the union of the given specs.

syntax
(and spec ...)
Selects the intersection of the given specs.

syntax
(not spec)
Selects the complement of the given spec, that is, every column except those it selects.
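
The set semantics of or, and, and not can be sketched in plain Racket. This is a hypothetical model, not the library's implementation: each spec is represented by the list of column names it selects, the names model-or, model-and, and model-not are invented for illustration, and the column list is taken from the example data-frame used on this page.

```racket
#lang racket

;; Hypothetical model, not the library's implementation: a spec is
;; represented by the list of column names it selects.
(define columns '("grp" "trt" "adult" "juv"))

;; or: keep a column if at least one spec selects it (union).
(define (model-or . specs)
  (filter (lambda (c) (ormap (lambda (s) (member c s)) specs)) columns))

;; and: keep a column only if every spec selects it (intersection).
(define (model-and . specs)
  (filter (lambda (c) (andmap (lambda (s) (member c s)) specs)) columns))

;; not: keep the columns the spec does not select (complement).
(define (model-not spec)
  (filter (lambda (c) (not (member c spec))) columns))

(model-or '("grp") '("trt" "grp"))            ; => '("grp" "trt")
(model-and '("trt" "adult") '("adult" "juv")) ; => '("adult")
(model-not '("trt" "grp"))                    ; => '("adult" "juv")
```

The model returns names in a fixed order only to keep the results deterministic; it makes no claim about the column order of the data-frame that slice returns.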

syntax
(all-in sequence)

sequence : (sequence/c string?)
Selects all columns whose names are in the given sequence. If a name is present in sequence but not in the data-frame, an error is raised.
The input sequence must be finite; otherwise evaluation does not terminate.

syntax
(any-in sequence)

sequence : (sequence/c string?)
Like all-in, but does not raise an error when a name in sequence is not present in the data-frame; that name is simply not selected.
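
The difference between all-in and any-in can be illustrated with a small model in plain Racket. The functions model-all-in and model-any-in are hypothetical names, the column list is taken from the example data-frame on this page, and the error message is invented; the library's actual error will differ.

```racket
#lang racket

;; Hypothetical model, not the library's implementation.
(define columns '("grp" "trt" "adult" "juv"))

;; all-in: every requested name must exist, otherwise an error is raised.
;; `names` may be any finite sequence of strings (list, vector, ...).
(define (model-all-in names)
  (for/list ([n names])
    (unless (member n columns)
      (error 'all-in "no such column: ~a" n))
    n))

;; any-in: names that are missing from the data-frame are skipped.
(define (model-any-in names)
  (for/list ([n names] #:when (member n columns))
    n))

(model-all-in '("juv" "trt"))            ; => '("juv" "trt")
(model-any-in (vector "juv" "missing"))  ; => '("juv")
;; (model-all-in '("juv" "missing")) raises an exn:fail exception
```

Because both forms take an ordinary sequence, they are the place to pass a list of column names that is computed at run time.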

syntax
(starting-with prefix)

prefix : string?
Selects columns with names beginning with the given prefix.

syntax
(ending-with suffix)

suffix : string?
Selects columns with names ending with the given suffix.

syntax
(containing substr)

substr : string?
Selects columns with names containing the given substr.
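
The three name-matching selectors behave like string predicates applied to each column name. The sketch below models them with string-prefix?, string-suffix?, and string-contains? from racket/string. It assumes plain, case-sensitive matching, which this page does not state explicitly, and the model-... names are hypothetical.

```racket
#lang racket

;; Hypothetical model, not the library's implementation.
;; Column names are taken from the example data-frame on this page.
(define columns '("grp" "trt" "adult" "juv"))

;; starting-with: keep names that begin with prefix.
(define (model-starting-with prefix)
  (filter (lambda (c) (string-prefix? c prefix)) columns))

;; ending-with: keep names that end with suffix.
(define (model-ending-with suffix)
  (filter (lambda (c) (string-suffix? c suffix)) columns))

;; containing: keep names that contain substr anywhere.
(define (model-containing substr)
  (filter (lambda (c) (string-contains? c substr)) columns))

(model-starting-with "a") ; => '("adult")
(model-ending-with "v")   ; => '("juv")
(model-containing "t")    ; => '("trt" "adult")
```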

Using everything, all-in, any-in, starting-with, ending-with, or containing outside of slice is a syntax error. The operators and, or, and not are the exception, since they keep their usual Racket meaning there.

This operation will not remove variables that a grouped data frame is grouped by, as this would destroy group invariants.

Examples:

> (~> example-df
      (slice "trt")
      show)
data-frame: 5 rows x 1 columns
┌───┐
│trt│
├───┤
│b  │
├───┤
│b  │
├───┤
│a  │
├───┤
│b  │
├───┤
│b  │
└───┘
> (~> example-df
      (slice (not ["trt" "grp"]))
      show)
data-frame: 5 rows x 2 columns
┌───┬─────┐
│juv│adult│
├───┼─────┤
│10 │1    │
├───┼─────┤
│20 │2    │
├───┼─────┤
│30 │3    │
├───┼─────┤
│40 │4    │
├───┼─────┤
│50 │5    │
└───┴─────┘
> (~> example-df
      (slice (containing "t"))
      show)
data-frame: 5 rows x 2 columns
┌───┬─────┐
│trt│adult│
├───┼─────┤
│b  │1    │
├───┼─────┤
│b  │2    │
├───┼─────┤
│a  │3    │
├───┼─────┤
│b  │4    │
├───┼─────┤
│b  │5    │
└───┴─────┘

procedure
(take-rows df beg end) → (or/c data-frame? grouped-data-frame?)
  df : (or/c data-frame? grouped-data-frame?)
  beg : exact-nonnegative-integer?
  end : exact-nonnegative-integer?

Takes the rows of df from index beg (inclusive) up to index end (exclusive), counting from zero, and returns a new data-frame with those rows.

If df is grouped, this takes rows from inside each group.
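
The row range and the per-group behavior can be modeled with plain lists. This is a hypothetical sketch, not the library's implementation: rows are represented as (list trt adult) pairs using values from the example data on this page, the range is assumed to include beg and exclude end (as the examples on this page suggest), and the names model-take-rows and model-take-rows/grouped are invented.

```racket
#lang racket

;; Hypothetical model, not the library's implementation.
;; Each row is (list trt adult), using values from the example data.
(define rows '(("b" 1) ("b" 2) ("a" 3) ("b" 4) ("b" 5)))

;; Rows at positions beg (inclusive) up to end (exclusive).
;; Like take and drop, this raises an error if the range is out of bounds.
(define (model-take-rows rs beg end)
  (take (drop rs beg) (- end beg)))

;; Grouped case: split the rows by key, then take the same range
;; inside each group and concatenate the results.
(define (model-take-rows/grouped rs key beg end)
  (append-map (lambda (group) (model-take-rows group beg end))
              (group-by key rs)))

(model-take-rows rows 0 3) ; => '(("b" 1) ("b" 2) ("a" 3))

;; One row per trt group: adult 1 for "b" and adult 3 for "a".
(model-take-rows/grouped rows first 0 1)
```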

Examples:

> (~> example-df
      (take-rows 0 3)
      show)
data-frame: 3 rows x 4 columns
┌───┬───┬─────┬───┐
│grp│trt│adult│juv│
├───┼───┼─────┼───┤
│a  │b  │1    │10 │
├───┼───┼─────┼───┤
│a  │b  │2    │20 │
├───┼───┼─────┼───┤
│b  │a  │3    │30 │
└───┴───┴─────┴───┘
> (~> example-df
      (group-with "trt")
      (take-rows 0 1)
      show)
data-frame: 2 rows x 4 columns
groups: (trt)
┌───┬───┬───┬─────┐
│juv│trt│grp│adult│
├───┼───┼───┼─────┤
│30 │a  │b  │3    │
├───┼───┼───┼─────┤
│10 │b  │a  │1    │
└───┴───┴───┴─────┘
