find_columns {datawizard} | R Documentation |
find_columns() returns column names from a data set that match a certain search pattern, while get_columns() returns the found data. data_select() is an alias for get_columns(), and data_find() is an alias for find_columns().
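As a minimal sketch of the difference between the two families, the snippet below runs the same selection once for names and once for data. It assumes the datawizard package is installed and uses the built-in iris data; data_select() is used in place of get_columns() because the text above describes it as an alias.

```r
library(datawizard)

# find_columns() gives back only the names of the matching columns
sepal_names <- find_columns(iris, select = starts_with("Sepal"))
print(sepal_names)

# data_select() (described above as an alias for get_columns())
# gives back the matching columns themselves, as a data frame
sepal_data <- data_select(iris, select = starts_with("Sepal"))
print(head(sepal_data, 3))
```

The names returned by the first call are the column names of the data frame returned by the second.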
find_columns(
  data,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)

data_find(
  data,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)

get_columns(
  data,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)

data_select(
  data,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)
data |
A data frame. |
select |
Variables that will be included when performing the required tasks. Can be either
If |
exclude |
See |
ignore_case |
Logical, if |
regex |
Logical, if |
verbose |
Toggle warnings. |
... |
Arguments passed down to other functions. Mostly not used yet. |
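The following sketch illustrates how the ignore_case and regex arguments are commonly used. The argument descriptions above are incomplete in this copy, so the behavior shown here is an assumption based on the argument names and defaults: ignore_case = TRUE is taken to relax upper/lower case matching in a select helper, and regex = TRUE is taken to treat a plain string in select as a regular expression.

```r
library(datawizard)

# Lower-case pattern in a select helper; with ignore_case = TRUE
# (assumed behavior) it should still match "Sepal.*" columns
case_insensitive <- find_columns(
  iris,
  select = starts_with("sepal"),
  ignore_case = TRUE
)
print(case_insensitive)

# A plain character string, interpreted as a regular expression
# because regex = TRUE (assumed behavior)
by_regex <- find_columns(iris, select = "Width$", regex = TRUE)
print(by_regex)
```

With the default ignore_case = FALSE, the first call would be expected to find nothing, since the iris column names start with an upper-case "S".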
Note that it is possible to either pass an entire select helper or only the pattern inside a select helper as a function argument:
foo <- function(data, pattern) {
  find_columns(data, select = starts_with(pattern))
}
foo(iris, pattern = "Sep")

foo2 <- function(data, pattern) {
  find_columns(data, select = pattern)
}
foo2(iris, pattern = starts_with("Sep"))
This means that it is also possible to use loop values as arguments or patterns:
for (i in c("Sepal", "Sp")) {
  head(iris) |>
    find_columns(select = starts_with(i)) |>
    print()
}
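A small variation on the loop idea, sketched here under the assumption that datawizard is installed: instead of printing, the matches for each loop value are collected in a named list. The name matches is purely illustrative.

```r
library(datawizard)

# Collect the matching column names for each prefix in a named list
matches <- list()
for (i in c("Sepal", "Sp")) {
  matches[[i]] <- find_columns(iris, select = starts_with(i))
}

print(matches)
```

Each list element is named after the prefix that produced it, which makes the result easy to inspect or reuse later.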
However, this behavior is limited to a "single-level function". It will not work in nested functions, like below:
inner <- function(data, arg) {
  find_columns(data, select = arg)
}
outer <- function(data, arg) {
  inner(data, starts_with(arg))
}
outer(iris, "Sep")
In this case, it is better to pass the whole select helper as the argument of outer():

outer <- function(data, arg) {
  inner(data, arg)
}
outer(iris, starts_with("Sep"))
find_columns() returns a character vector with column names that matched the pattern in select and exclude, or NULL if no matching column name was found. get_columns() returns a data frame with matching columns.
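Because the result can be NULL, calling code usually needs a guard before using it. The sketch below shows one way to do that, assuming datawizard is installed; the prefix "xyz" is chosen only because no iris column starts with it, and verbose = FALSE is used to keep the no-match case quiet.

```r
library(datawizard)

# A pattern that matches: the result is a character vector
hit <- find_columns(iris, select = ends_with("Width"))

# A pattern that matches nothing: the result is NULL
miss <- find_columns(iris, select = starts_with("xyz"), verbose = FALSE)

# Guard against NULL before using the names to subset the data
for (cols in list(hit, miss)) {
  if (is.null(cols)) {
    message("No matching columns; skipping.")
  } else {
    print(colMeans(iris[cols]))
  }
}
```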
Functions to rename stuff: data_rename(), data_rename_rows(), data_addprefix(), data_addsuffix()
Functions to reorder or remove columns: data_reorder(), data_relocate(), data_remove()
Functions to reshape, pivot or rotate data frames: data_to_long(), data_to_wide(), data_rotate()
Functions to recode data: rescale(), reverse(), categorize(), recode_values(), slide()
Functions to standardize, normalize, rank-transform: center(), standardize(), normalize(), ranktransform(), winsorize()
Split and merge data frames: data_partition(), data_merge()
Functions to find or select columns: data_select(), data_find()
Functions to filter rows: data_match(), data_filter()
# Find column names by pattern
find_columns(iris, starts_with("Sepal"))
find_columns(iris, ends_with("Width"))
find_columns(iris, regex("\\."))
find_columns(iris, c("Petal.Width", "Sepal.Length"))

# starts with "Sepal", but must not contain "Width"
find_columns(iris, starts_with("Sepal"), exclude = contains("Width"))

# find numeric columns with mean > 3.5
numeric_mean_35 <- function(x) is.numeric(x) && mean(x, na.rm = TRUE) > 3.5
find_columns(iris, numeric_mean_35)