Package 'svBase'

Title: 'SciViews::R' - Base Functions
Description: Functions to manipulate the three main classes of "data frames" for 'SciViews::R': data.frame, data.table and tibble. Allows selecting the preferred one and converting more carefully between the three, taking care of the correct presentation of row names and data.table's keys. Provides a more homogeneous way of creating these three data frames and of printing them on the R console.
Authors: Philippe Grosjean [aut, cre] (ORCID: <https://orcid.org/0000-0002-2694-9471>)
Maintainer: Philippe Grosjean <[email protected]>
License: MIT + file LICENSE
Version: 1.7.4
Built: 2026-05-17 09:46:29 UTC
Source: https://github.com/SciViews/svBase


Base Objects like Data Frames for 'SciViews::R'

Description

The {svBase} package sets up the way data frames (with objects like base R's data.frame, data.table and tibble's tbl_df) are managed in SciViews::R. The user can select the class of object used by default, and many other SciViews::R functions return that format. Conversion from one to the other is made easier, including the management of data.frame's row names or data.table's keys. Homogeneous ways to create a data frame or to print it are also provided.

Important functions

  • dtx() creates a data frame in the preferred format, while dtbl(), dtf() and dtt() force the creation of a tibble, a data.frame and a data.table, respectively. The function used to convert into the preferred format is retrieved with getOption("SciViews.as_dtx", default = as_dtt); set it with options(SciViews.as_dtx = ...).

Author(s)

Maintainer: Philippe Grosjean [email protected] (ORCID)


Create an alias (also known as) for an object, whose short help page and/or original help page can be viewed with .?obj.

Description

When a function or object is renamed, the link to its original help page is lost in R. Using aka() (also known as) with the correct alias= keeps track of the original help page, which can then be displayed with .?obj. Moreover, one can also populate a short man page with description, seealso and example, or add a short comment message that is displayed at the same time in the R console.

Usage

aka(
  obj,
  alias = NULL,
  comment = "",
  description = NULL,
  seealso = NULL,
  example = NULL,
  url = NULL
)

## S3 method for class 'aka'
print(
  x,
  hyperlink_type = getOption("hyperlink_type", default = .hyperlink_type()),
  ...
)

## S3 method for class 'aka'
str(object, ...)

Arguments

obj

An R object.

alias

The fully qualified name of the alias object whose help page should be retained, as pkg::name. If NULL (the default), use obj.

comment

A comment to place in obj (will also be printed when calling .?obj).

description

A description of the function for the short man page.

seealso

A character vector of function names in the form fun or pkg::fun.

example

A character string with code for a short example.

url

An http or https URL pointing to the help page for the function on the Internet.

x

An aka object

hyperlink_type

The type of hyperlink supported. The default value should be ok. Use "none" to suppress hyperlink, "href" for http(s):// link that opens a web page, or "help" in RStudio to open the corresponding help page in the Help tab.

...

Further arguments (not used yet)

object

An aka object

Value

The original obj with the comment attribute set or replaced with ⁠comment =⁠ plus a src attribute set to ⁠alias =⁠ and description, seealso, example, and url attributes also possibly populated. If the object is a function, its class becomes aka and function.

Examples

# Say you prefer is.true() similar to is.na() or is.null()
# but R provides isTRUE().
library(svBase)
# Also defining a short man page
is.true <- aka(isTRUE, description = "Check if an object is TRUE.",
  seealso = c("is.false", "logical"), example = c("is.true(TRUE)", "is.true(FALSE)", "is.true(NA)"))
# This way, you still have access to the right help page for is.true()
## Not run: 
.?is.true

## End(Not run)
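
The attributes listed in the Value section can be illustrated with a minimal base R sketch. The helper aka_sketch() below is hypothetical and is not svBase's implementation: it only shows that a function can carry a comment attribute, a src attribute and an extra class while remaining callable.

```r
# Hypothetical helper: NOT svBase's aka(), only an illustration of the
# attributes described in the Value section
aka_sketch <- function(obj, alias, comment = "") {
  attr(obj, "comment") <- comment    # short message attached to the object
  attr(obj, "src") <- alias          # where the original help page lives
  if (is.function(obj))
    class(obj) <- c("aka", "function")
  obj
}

is.true <- aka_sketch(isTRUE, alias = "base::isTRUE",
  comment = "Alias of isTRUE()")
is.true(TRUE)          # still callable like the original function
attr(is.true, "src")
class(is.true)
```

In this sketch, only the metadata changes; the body of the function is untouched.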

Alternate assignment (multiple and/or collect results from dplyr)

Description

These alternate assignment operators can be used to perform multiple assignment (also known as destructuring assignment). They are imported from the {zeallot} package (see the corresponding help page at zeallot::operator for a complete description). They also perform a dplyr::collect(), which retrieves results from dplyr extensions like {dtplyr} for data.tables, or {dbplyr} for databases. Finally, these two assignment operators also make sure that the preferred data frame object is returned by using default_dtx().

Usage

value %->% x

x %<-% value

## Default S3 method:
collect(x, ...)

Arguments

value

The object to be assigned.

x

A name, or a name structure for multiple (destructuring) assignment, or, for collect.default(), any object that does not have a specific dplyr::collect() method.

...

further arguments passed to the method (not used for the default one)

Details

These assignment operators are overloaded to get interesting properties in the context of {tidyverse} pipelines and to make sure they always return our preferred data frame object (data.frame, data.table, or tibble). Thus, before being assigned, value is modified by calling dplyr::collect() on it and by applying default_dtx().
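
The "transform, then assign" idea can be sketched in base R as follows. The operator %<~% is hypothetical and is not svBase's %<-%; as.data.frame() stands in for the dplyr::collect() and default_dtx() steps, and multiple assignment is not covered.

```r
# Hypothetical operator, for illustration only (not svBase's %<-%)
`%<~%` <- function(x, value) {
  name <- substitute(x)              # capture the target name unevaluated
  stopifnot(is.name(name))
  # Stand-in for dplyr::collect() + default_dtx(): any data frame
  # subclass is turned into a plain data.frame before assignment
  if (is.data.frame(value))
    value <- as.data.frame(value)
  assign(as.character(name), value, envir = parent.frame())
  invisible(value)
}

# A made-up subclass plays the role of a "lazy" result here
lazy <- structure(data.frame(x = 1:3), class = c("my_lazy", "data.frame"))
res %<~% lazy
class(res)
```

The value is converted before it is bound to the name, so the caller never sees the intermediate class.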

Value

These operators invisibly return value. collect.default() simply returns x.

Examples

# The alternate assignment operator performs three steps:
# 1) Collect results from dbplyr or dtplyr
library(dplyr)
library(data.table)
library(dtplyr)
library(svBase)
dtt <- data.table(x = 1:5, y = rnorm(5))
dtt |>
  mutate(x2 = x^2) |>
  select(x2, y) ->
  res

print(res)
class(res) # This is a data frame

dtt |>
  lazy_dt() |>
  mutate(x2 = x^2) |>
  select(x2, y) ->
  res

print(res)
class(res) # This is NOT a data frame

# Same pipeline, but assigning with %->%
dtt |>
  lazy_dt() |>
  mutate(x2 = x^2) |>
  select(x2, y) %->%
  res

print(res)
class(res) # res is the preferred data frame (data.table by default)

# 2) Convert data frame in the chosen format using default_dtx()
dtf <- data.frame(x = 1:5, y = rnorm(5))
class(dtf)
res %<-% dtf
class(res) # A data.table by default
# but it can be changed with options(SciViews.as_dtx = ...)

# 3) If the zeallot syntax is used, make multiple assignment
c(X, Y) %<-% dtf # Variables of dtf assigned to different names
X
Y

# The %->% is meant to be used in pipelines, otherwise it does the same

Coerce objects into data.trames, data.frames, data.tables, tibbles or matrices

Description

Objects are coerced into the desired class. For as_dtx(), the desired class is obtained from getOption("SciViews.as_dtx"), with a default value producing a data.trame object. If the data are grouped with dplyr::group_by(), the resulting data frame is also dplyr::ungroup()ed in the process.

Usage

as_dtx(x, ..., rownames = NULL, keep.key = TRUE, byref = FALSE)

as_dtrm(x, ..., rownames = NULL, keep.key = TRUE, byref = FALSE)

as_dtf(x, ..., rownames = NULL, keep.key = TRUE, byref = NULL)

as_dtt(x, ..., rownames = NULL, keep.key = TRUE, byref = FALSE)

as_dtbl(x, ..., rownames = NULL, keep.key = TRUE, byref = NULL)

default_dtx(x, ..., rownames = NULL, keep.key = TRUE, byref = FALSE)

## S3 method for class 'tbl_df'
as.matrix(x, row.names = NULL, optional = FALSE, ...)

as_matrix(x, rownames = NULL, ...)

Arguments

x

An object.

...

Further arguments passed to the methods (not used yet).

rownames

The name of the column with row names. If NULL, it is assessed from getOption("SciViews.dtx.rownames").

keep.key

Do we keep the data.table key into a "key" attribute or do we restore data.table or data.trame key from the attribute?

byref

If TRUE, the object is modified by reference when converted into a data.table or a data.trame (faster, but not conventional). This is FALSE by default, or NULL if the argument does not apply in the context.

row.names

Same as rownames, but for base R functions.

optional

Logical. If TRUE, setting row names and converting column names to syntactically correct names is optional.

Value

The coerced object. For as_dtx(), the coercion is determined from getOption("SciViews.as_dtx"), which must return one of the four other as_dt...() functions (as_dtrm by default). default_dtx() does the same as as_dtx() if the object is a data.trame, a data.frame, a data.table, or a tibble, but it returns the object unmodified for any other class (including subclassed data frames). This is a convenient function to force conversion only between those four object classes.

Note

Use as_matrix() instead of base::as.matrix(): it has different default arguments to better account for rownames in data.table and tibble!

Examples

# A data.frame
dtf <- dtf(
  x = 1:5,
  y = rnorm(5),
  f = letters[1:5],
  l = sample(c(TRUE, FALSE), 5, replace = TRUE))

# Convert into a tibble
(dtbl <- as_dtbl(dtf))
# Since row names are trivial (1 -> 5), a .rownames column is not added

dtf2 <- dtf
rownames(dtf2) <- letters[1:5]
dtf2

# Now, the conversion into a tibble adds .rownames
(dtbl2 <- as_dtbl(dtf2))
# and data frame row names are set again when converted back to dtf
as_dtf(dtbl2)

# It also works for conversions data.frame <-> data.table
(dtt2 <- as_dtt(dtf2))
as_dtf(dtt2)
# or data.frame <-> data.trame
(dtrm2 <- as_dtrm(dtf2))
as_dtf(dtrm2)

# It does not work when converting a tibble or a data.table into a matrix
# with as.matrix()
as.matrix(dtbl2)
# ... but as_matrix() does the job!
as_matrix(dtbl2)

# The name for row in dtrm, dtt and dtbl is in:
# (data.frame's row names are converted into a column with this name)
getOption("SciViews.dtx.rownames", default = ".rownames")

# Convert into the preferred data frame object (data.trame by default)
(dtx2 <- as_dtx(dtf2))
class(dtx2)

# The default data frame object used:
getOption("SciViews.as_dtx", default = as_dtrm)

# default_dtx() does the same as as_dtx(),
# but it also does not change other objects
# So, it is safe to use whatever the object you pass to it
(dtx2 <- default_dtx(dtf2))
class(dtx2)
# Any other object than data.trame, data.frame, data.table or tbl_df
# is not converted
res <- default_dtx(1:5)
class(res)
# No conversion if the data frame is subclassed
dtf3 <- dtf2
class(dtf3) <- c("subclassed", "data.frame")
class(default_dtx(dtf3))

# data.table keys are converted into a 'key' attribute and back
library(data.table)
setkey(dtt2, 'x')
haskey(dtt2)
key(dtt2)

(dtf3 <- as_dtf(dtt2))
attributes(dtf3)
# Key is restored when converted back into a data.table (also from a tibble)
(dtt3 <- as_dtt(dtf3))
haskey(dtt3)
key(dtt3)

# Grouped tibbles are ungrouped with as_dtbl() or as_dtx()/default_dtx()!
mtcars |> dplyr::group_by(cyl) -> mtcars_grouped
class(mtcars_grouped)
mtcars2 <- as_dtbl(mtcars_grouped)
class(mtcars2)
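
The row names round trip shown in these examples can be sketched with base R only. Both helpers below are hypothetical and are not svBase code; the column name ".rownames" is the default documented above, and only character row names are treated as non-trivial in this simplified version.

```r
rn_col <- ".rownames"  # default column name documented for this package

# Hypothetical helpers, not part of svBase
rownames_to_column <- function(df, col = rn_col) {
  rn <- attr(df, "row.names")
  if (is.character(rn)) {            # only non-trivial row names are kept
    df <- cbind(setNames(data.frame(rn, stringsAsFactors = FALSE), col), df)
    rownames(df) <- NULL             # back to automatic row names
  }
  df
}

column_to_rownames <- function(df, col = rn_col) {
  if (col %in% names(df)) {
    rownames(df) <- df[[col]]        # restore row names from the column
    df[[col]] <- NULL
  }
  df
}

df <- data.frame(x = 1:3, row.names = c("a", "b", "c"))
flat <- rownames_to_column(df)       # row names become a regular column
flat
column_to_rownames(flat)             # and are restored on the way back
```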

Force computation of a lazy tidyverse object

Description

When {dplyr} or {tidyr} verbs are applied to a data.table or a database connection, they do not output data frames but objects like dtplyr_step or tbl_sql that are called lazy data frames. The actual process is triggered by using as_dtx(), or more explicitly with dplyr::collect() which coerces the result to a tibble. If you want the default {svBase} data frame object instead, use collect_dtx(), or if you want a specific object, use one of the other variants.

Usage

collect_dtx(x, ...)

collect_dtrm(x, ...)

collect_dtf(x, ...)

collect_dtt(x, ...)

collect_dtbl(x, ...)

Arguments

x

A data.frame, data.table, tibble or a lazy data frame (dtplyr_step, tbl_sql...).

...

Arguments passed on to methods for dplyr::collect().

Value

A data frame (data.frame, data.table or tibble's tbl_df), the default version for collect_dtx().

Examples

# Assuming the default data frame for svBase is a data.table
mtcars_dtt <- as_dtt(mtcars)
library(dplyr)
library(dtplyr)
# A lazy data frame, not a "real" data frame!
mtcars_dtt |> lazy_dt() |> select(mpg:disp) |> class()
# A data frame
mtcars |> select(mpg:disp) |> class()
# A data table
mtcars_dtt |> select(mpg:disp) |> class()
# A tibble, always!
mtcars_dtt |> lazy_dt() |> select(mpg:disp) |> collect() |> class()
# The data frame object you want, default one specified for svBase
mtcars_dtt |> lazy_dt() |> select(mpg:disp) |> collect_dtx() |> class()

The data-dot mechanism

Description

The data-dot mechanism automatically injects .data = . in the call to a function when it detects that it is necessary (most of the time, when .data= is missing, or when an unnamed first argument is not suitable as .data, i.e., it is not a data.frame). This avoids having to write . "everywhere" in your functions when you use the explicit pipe operator %>.%, or with .= ... constructs. The data-dot mechanism may fail with an error message if it cannot inject . as .data=, or when . is not found. It can also be disabled by setting the variable .SciViews.implicit.data.dot to FALSE (see examples).

See Also

prepare_data_dot()

Examples

# Here is how you create a data-dot function
my_subset <- function(.data = (.), i, j) {
  # This makes it a data-dot function
  if (!prepare_data_dot(.data))
    return(recall_with_data_dot())

  # Code of the function
  # Second argument (i here) must not be a data.frame to avoid confusion
  message(".env has ", paste(names(.env), collapse = ", "))
  .data[i, j]
}
dtf1 <- data.frame(x = 1:3, y = 4:6)
my_subset(dtf1, 1, 'y')
# If .data is in '.', it can be omitted
.= dtf1
my_subset(1, 'y')

# This mechanism is potentially confusing. You can inactivate it anywhere:
.SciViews.implicit.data.dot <- FALSE
# This time next call is wrong
try(my_subset(1, 'y'))
# You must indicate '.' explicitly in that case:
my_subset(., 1, 'y')
rm(.SciViews.implicit.data.dot) # Reactivate it
my_subset(1, 'y') # Implicit again
# Note that, if you have not defined '.' and try to use it, you get
# an error:
rm(.)
try(my_subset(1, 'y'))
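
The idea behind the mechanism can be approximated in base R. The function first_rows() below is hypothetical and much simpler than prepare_data_dot() and recall_with_data_dot(): it only handles one extra argument and looks up '.' directly in the calling environment.

```r
# Hypothetical function: a simplified stand-in for the data-dot mechanism
first_rows <- function(.data = NULL, n = 2L) {
  if (!is.data.frame(.data)) {
    # An unnamed first argument that is not a data frame is taken as 'n'
    if (!is.null(.data))
      n <- .data
    # ... and '.' from the calling environment is injected as .data
    # (get() stops with an error if '.' is not found)
    .data <- get(".", envir = parent.frame())
  }
  head(.data, n)
}

. <- data.frame(x = 1:3, y = 4:6)
first_rows(., 1)   # explicit '.'
first_rows(1)      # '.' is injected, 1 is used as n
first_rows()       # '.' is injected, default n
```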

Create a data frame (data.trame, base's data.frame, data.table or tibble's tbl_df)

Description

Create a data frame (data.trame, base's data.frame, data.table or tibble's tbl_df)

Usage

dtx(..., .name_repair = c("check_unique", "unique", "universal", "minimal"))

dtrm(..., .name_repair = c("check_unique", "unique", "universal", "minimal"))

dtbl(..., .name_repair = c("check_unique", "unique", "universal", "minimal"))

dtf(..., .name_repair = c("check_unique", "unique", "universal", "minimal"))

dtt(..., .name_repair = c("check_unique", "unique", "universal", "minimal"))

Arguments

...

A set of name-value pairs. The content of the data frame. See tibble::tibble() for more details on the way dynamic-dots are processed.

.name_repair

The way problematic column names are treated, see also tibble::tibble() for details.

Value

A data frame as a data.trame object for dtrm(), a tbl_df object for dtbl(), a data.frame for dtf() or a data.table for dtt().

Note

data.trame, data.table and tibble's tbl_df do not use row names. However, you can add a column named .rownames (by default), or the name that is in getOption("SciViews.dtx.rownames"), and it will be automatically set as row names when the object is converted into a data.frame with as_dtf(). For dtf(), just create a column of this name and it is directly used as row names for the resulting data.frame object.

See Also

dtx_rows(), is_dtx(), collect_dtx()

Examples

dtrm1 <- dtrm(
  x = 1:5,
  y = rnorm(5),
  f = letters[1:5],
  l = sample(c(TRUE, FALSE), 5, replace = TRUE)
)
class(dtrm1)

dtbl1 <- dtbl(
  x = 1:5,
  y = rnorm(5),
  f = letters[1:5],
  l = sample(c(TRUE, FALSE), 5, replace = TRUE)
)
class(dtbl1)

dtf1 <- dtf(
  x = 1:5,
  y = rnorm(5),
  f = letters[1:5],
  l = sample(c(TRUE, FALSE), 5, replace = TRUE)
)
class(dtf1)

dtt1 <- dtt(
  x = 1:5,
  y = rnorm(5),
  f = letters[1:5],
  l = sample(c(TRUE, FALSE), 5, replace = TRUE))
class(dtt1)

# Using dtx(), one constructs the preferred data frame object
# (a data.trame by default, can be changed with options(SciViews.as_dtx = ...))
dtx1 <- dtx(
  x = 1:5,
  y = rnorm(5),
  f = letters[1:5],
  l = sample(c(TRUE, FALSE), 5, replace = TRUE))
class(dtx1) # data.trame by default

# Use dtx_rows() to easily create a data frame:
dtx2 <- dtx_rows(
  ~x, ~y, ~f,
   1,  3, 'a',
   2,  4, 'b'
)
dtx2
class(dtx2)

# This is how you specify row names for dtf (data.frame)
dtf(x = 1:3, y = 4:6, .rownames = letters[1:3])

Row-wise creation of a data frame

Description

The presentation of the data (see examples) is easier to read than with the traditional column-wise entry in dtx(). This can be used to enter small tables in R, but do not overuse it!

Usage

dtx_rows(...)

dtrm_rows(...)

dtf_rows(...)

dtt_rows(...)

dtbl_rows(...)

Arguments

...

Specify the structure of the data frame by using formulas for variable names like ~x for variable x. Then, use one argument per value in the data frame. It is possible to unquote with ⁠!!⁠ and to unquote-splice with ⁠!!!⁠.

Value

A data frame of class data.trame for dtrm_rows(), data.frame for dtf_rows(), data.table for dtt_rows(), tibble tbl_df for dtbl_rows() and the default object with dtx_rows().

Examples

df <- dtx_rows(
  ~x, ~y, ~group,
   1,  3,    "A",
   6,  2,    "A",
   10, 4,    "B"
)
df
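
How a row-wise specification maps to columns can be sketched in base R. The helper rows_sketch() below is hypothetical and is not svBase's dtx_rows(): it does not support unquoting and always returns a plain data.frame.

```r
# Hypothetical helper in base R, not svBase's dtx_rows()
rows_sketch <- function(...) {
  args <- list(...)
  # One-sided formulas like ~x give the column names
  is_name <- vapply(args, inherits, logical(1), what = "formula")
  col_names <- vapply(args[is_name], function(f) as.character(f[[2]]),
    character(1))
  values <- args[!is_name]
  n_col <- length(col_names)
  stopifnot(n_col > 0, length(values) > 0, length(values) %% n_col == 0)
  # Values are given row by row: every n_col-th value belongs to one column
  cols <- lapply(seq_len(n_col), function(k)
    unlist(values[seq(k, length(values), by = n_col)]))
  as.data.frame(setNames(cols, col_names), stringsAsFactors = FALSE)
}

d <- rows_sketch(
  ~x, ~y, ~group,
   1,  3,    "A",
   6,  2,    "A",
   10, 4,    "B"
)
d
```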

Find all functions in an Expression

Description

Return a character vector containing the name of all the functions in an expression or a call.

Usage

expr_funs(expr, max.names = -1L, unique = FALSE, exclude.names = NULL)

Arguments

expr

an expression or call from which the names of the functions are to be extracted.

max.names

the maximum number of names to be returned. -1 indicates no limit (other than vector size limits).

unique

a logical value which indicates whether duplicate names should be removed from the value.

exclude.names

a character vector with names to exclude, or NULL for none.

Details

The C code is adapted from the base R function do_allnames() (the latter can extract either variables, or variables + functions, but not functions alone).

Value

A character vector with the extracted function names.

Examples

# A formula where some names are simultaneously functions and variables
ff <- ~z(x, y, z, TRUE, "test", l = 4) + (y(z, x(l)) + y(2))
all.vars(ff)
all.names(ff, unique = TRUE)
expr_funs(ff)
expr_funs(ff, unique = TRUE)
expr_funs(ff, exclude.names = "~")
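
For readers who want to see the principle, function names can be collected by walking the call tree recursively. The function funs_sketch() below is a hypothetical pure R sketch, not the C code used by expr_funs(); it has no max.names, unique or exclude.names arguments, and the order of the names it returns may differ.

```r
# Hypothetical base R sketch; svBase's expr_funs() is implemented in C
# Note: calls with empty arguments, like x[, 1], are not handled here
funs_sketch <- function(expr) {
  if (is.expression(expr))
    return(unlist(lapply(expr, funs_sketch)))
  if (!is.call(expr))
    return(character(0))             # names and constants are not functions
  fun <- expr[[1]]                   # the function position of the call
  c(if (is.name(fun)) as.character(fun) else funs_sketch(fun),
    unlist(lapply(as.list(expr)[-1], funs_sketch)))
}

funs_sketch(quote(f(x, g(y)) + h(2)))
```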

Convert a string into a formula and/or change the associated environment

Description

This function is used to transform a name like 'var' into a formula ~var.

Usage

f_(x, env = parent.frame())

Arguments

x

Either a name (character string) or a formula

env

The environment to associate with the formula

Value

A formula

Examples

f_('var')
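
A comparable conversion can be written with base R only. The function f_sketch() below is hypothetical and only approximates what the description states: a name becomes a one-sided formula, and a formula gets its environment replaced.

```r
# Hypothetical equivalent written with base R only (not svBase's code)
f_sketch <- function(x, env = parent.frame()) {
  if (inherits(x, "formula")) {
    environment(x) <- env            # only change the environment
    return(x)
  }
  stopifnot(is.character(x), length(x) == 1L)
  as.formula(paste0("~", x), env = env)
}

e <- new.env()
f <- f_sketch("var", env = e)
f
identical(environment(f), e)
```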

Formula-masking interface using either standard evaluation, or formulas

Description

Formula-masking is a little bit like the data-masking used in dplyr, except that it uses formulas for non-standard evaluation of the arguments; otherwise, it uses standard evaluation. Both approaches (standard and non-standard) can thus be used within the same function. Moreover, the intention of the user and the possible non-standard evaluation of the arguments are clearer through formulas.

Usage

formula_masking(
  ...,
  .max.args = NULL,
  .must.be.named = FALSE,
  .make.names = FALSE,
  .no.se = FALSE,
  .no.se.msg = gettext("Standard evaluation is not allowed."),
  .envir = parent.frame(2L),
  .frame = parent.frame(),
  .verbose = .op$verbose
)

Arguments

...

Arguments to be processed by formula-masking.

.max.args

The maximum allowed arguments in ....

.must.be.named

If TRUE, all arguments must be named.

.make.names

If TRUE, unnamed arguments are named automatically.

.no.se

If TRUE, standard evaluation is not allowed.

.no.se.msg

The message to be used if standard evaluation is not allowed.

.envir

The environment where to expand formulas (possibly superseded by the environment attached to the first formula).

.frame

The frame where the focus in the calling stack should be set in error messages (not used yet).

.verbose

If TRUE, some additional information about formulas substitution is printed.

Value

A list with components:

  • dots: A list of arguments, either expressions (if standard evaluation was used) or expressions extracted from the right-hand side of the formulas (if formulas were used).

  • env: The environment where the expressions should be evaluated.

  • are_formulas: TRUE if formulas were used, FALSE if standard evaluation was used.

See Also

formula_select(), rlang::is_formula(), rlang::f_lhs(), rlang::f_rhs()

Examples

# TODO...
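
The structure of the returned list can be illustrated with a base R sketch. The function masking_sketch() below is hypothetical and is not svBase's implementation: it only detects formulas typed literally in the call and ignores all the dot-prefixed options.

```r
# Hypothetical base R sketch of the returned structure, not svBase's code
masking_sketch <- function(...) {
  dots <- as.list(substitute(list(...)))[-1]   # unevaluated arguments
  is_formula <- vapply(dots, function(e)
    is.call(e) && identical(e[[1]], as.name("~")), logical(1))
  are_formulas <- length(dots) > 0 && all(is_formula)
  if (are_formulas)                  # keep only the right-hand sides
    dots <- lapply(dots, function(e) e[[length(e)]])
  list(dots = dots, env = parent.frame(), are_formulas = are_formulas)
}

df <- data.frame(x = 1:3)
nse <- masking_sketch(double = ~x * 2)
eval(nse$dots$double, df, nse$env)   # x is looked up in df first
se <- masking_sketch(df$x)
eval(se$dots[[1]], envir = se$env)   # ordinary evaluation in the caller
```

With a formula, the expression is evaluated with the data frame as a mask; without one, it is evaluated as usual in the calling environment.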

Formula-select is similar to tidy-select in the tidyverse, but using formulas

Description

The formula-select interface allows arguments to be given in the tidy-select style within formulas, or as standard-evaluated arguments. It thus combines both approaches within the same function, and makes the intention clearer: standard evaluation, or non-standard evaluation when formulas are used.

Usage

formula_select(
  ...,
  .fast.allowed.funs = NULL,
  .max.args = NULL,
  .must.be.named = FALSE,
  .make.names = FALSE,
  .no.se = FALSE,
  .no.se.msg = gettext("Standard evaluation is not allowed."),
  .envir = parent.frame(2L),
  .frame = parent.frame()
)

Arguments

...

Arguments to be processed by formula-masking.

.fast.allowed.funs

A character vector of function names that are allowed for a fast treatment (usually through collapse functions). If any other function is used, the slower tidy-select mechanism is used, see tidyselect::eval_select().

.max.args

The maximum allowed arguments in ....

.must.be.named

If TRUE, all arguments must be named.

.make.names

If TRUE, unnamed arguments are named automatically.

.no.se

If TRUE, standard evaluation is not allowed.

.no.se.msg

The message to be used if standard evaluation is not allowed.

.envir

The environment where to expand formulas (possibly superseded by the environment attached to the first formula).

.frame

The frame where the focus in the calling stack should be set in error messages (not used yet).

Value

A list with components:

  • dots: the processed arguments (formulas are turned into expressions)

  • are_formulas: whether the arguments were formulas

  • env: The environment where the expressions should be evaluated.

  • fastselect: whether fast selection can be used

See Also

formula_masking(), https://tidyselect.r-lib.org/articles/syntax.html or vignette("syntax", package = "tidyselect") for a technical description of the rules of evaluation.

Examples

# TODO...

Fast (flexible and friendly) statistical functions (mainly from collapse) for matrix-like and data frame objects

Description

The fast statistical functions, or fast-flexible-friendly statistical functions, are prefixed with "f". These vectorized functions supersede the no-f functions, bringing the capacity to work smoothly on matrix-like and data frame objects. Most of them are defined in the {collapse} package.

For instance, base mean() operates on a vector, but not on a data frame. A matrix is treated as a vector and a single mean is returned. On the contrary, collapse::fmean() calculates one mean per column. It does the same for a data frame, and it usually does so quicker than base functions. There is no need for a separate function like colMeans(). Fast statistical functions also recognize grouping with collapse::fgroup_by(), sgroup_by() or dplyr::group_by() and calculate the mean by group in this case. Again, there is no need for a different function like stats::ave().

Finally, these functions also have a TRA= argument that computes, for instance, if TRA = "-", (x - f(x)) very efficiently (for instance to calculate residuals by subtracting the mean). Another particularity is the na.rm= argument that is TRUE by default, while it is FALSE by default for mean(). These are generic functions with methods for matrix, data.frame, grouped_df and a default method used for simple numeric vectors. A couple more functions are defined here, together with an alternate syntax to replace TRA= with %__f%.

Usage

list_fstat_functions()

fn(x, ...)

fna(x, ...)

x %replacef% expr

x %replace_fillf% expr

x %-f% expr

x %+f% expr

x %-+f% expr

x %/f% expr

x %/*100f% expr

x %*f% expr

x %modf% expr

x %-modf% expr

Arguments

x

A numeric vector, matrix, data frame or grouped data frame (class 'grouped_df').

...

Further arguments passed to the method, like w=, a numeric vector of (non-negative) weights that may contain missing values, or TRA=, a quoted operator indicating the transformation to perform: "replace" to get a vector of the same size as x with the results, "replace_fill" idem but also replacing missing data, "-" to subtract, "+" to add, "-+" to subtract and add the global statistic, "/" to divide, "%" to divide and multiply by 100 (percent), "*" to multiply, "%%" to take the modulus (remainder from division by the statistic) and "-%%" to subtract the modulus (i.e., to floor the data by the statistic), see collapse::TRA(). Also na.rm=, a logical indicating if missing values in x are skipped (TRUE by default). If FALSE, NA is returned when x contains any missing data. For details and other arguments, see the corresponding help page in the collapse package.

expr

The expression to evaluate as RHS of the ⁠%__f%⁠ operators.

Value

The number of all observations for fn() or the number of missing observations for fna(). list_fstat_functions() returns a list of all the known fast statistical functions.

Note

The page collapse::fast-statistical-functions gives more details. fn() counts all observations, including NAs, and fna() counts only NAs, whereas collapse::fnobs() counts non-missing observations. Instead of TRA=, one can use the %__f% functions where __ is replace, replace_fill, -, +, -+, /, /*100 for TRA="%", *, mod for TRA="%%", or -mod for TRA="-%%". See the examples.

Examples

library(collapse)
data(iris)
iris_num <- iris[, -5] # Only numerical variables
mean(iris$Sepal.Length) # OK, but mean(iris_num) does not work
colMeans(iris_num)
# Same
fmean(iris_num)
# Idem, but mean by group for all 4 numerical variables
iris |> fgroup_by(Species) |> fmean()
# Residuals (x - mean(x)) by group
iris |> fgroup_by(Species) |> fmean(TRA = "-")
# The same calculation, in a little bit more expressive way
iris |> fgroup_by(Species) %-f% fmean()
# or:
iris_num %-f% fmean(g = iris$Species)
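
To make the meaning of the TRA= transformations concrete, here is what "-" and "%" compute, written with base R functions only. This is a sketch of the arithmetic, not of how {collapse} implements it; the objects returned by the fast functions may differ in class and attributes.

```r
iris_num <- iris[, -5]

# TRA = "-" without groups: subtract each column mean from its column
centered <- sweep(as.matrix(iris_num), 2, colMeans(iris_num), "-")

# TRA = "-" with groups: subtract the group mean within each species
resid_by_group <- ave(iris$Sepal.Length, iris$Species,
  FUN = function(v) v - mean(v))

# TRA = "%": express each value as a percentage of the statistic
pct <- iris$Sepal.Length / mean(iris$Sepal.Length) * 100

colMeans(centered)                            # all (numerically) zero
tapply(resid_by_group, iris$Species, mean)    # zero within each group
mean(pct)                                     # 100 by construction
```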

Translate messages in a different language than the one in the R session

Description

Message translation in R is obtained with base::gettext() or base::ngettext(). But, there is no way to specify that one needs translated messages in a different language than the current one in R. Here are alternate functions that have an additional ⁠lang=⁠ argument allowing to do so. If the ⁠lang=⁠ argument is not provided in the call, they use the language defined in the R session. It is useful to define a different language, for instance, to keep R error and warning messages in English, but to generate translation for tables and figures in a different language in a report.

Usage

gettext_(..., domain = NULL, trim = TRUE, lang = get_sciviews_lang())

gettextf_(fmt, ..., domain = NULL, trim = TRUE, lang = get_sciviews_lang())

ngettext_(n, msg1, msg2, domain = NULL)

get_language(unset = "en")

set_language(lang, unset = get_language())

get_sciviews_lang(unset = get_language())

set_sciviews_lang(lang, unset = "en")

check_lang(lang, allow_uppercase = FALSE)

test_gettext_lang(lang = get_sciviews_lang(), n = 1)

Arguments

...

one or more character vectors.

domain

the 'domain' for the translation, a character string or NULL; see base::gettext() for more details. For ngettext_(), it should combine the domain and the lang, like "domain/lang" (e.g., "NULL/en_US" or "R-stats/fr"). This is a workaround to define the language, because base version of that function does not allow additional arguments and we have to remain compatible here.

trim

logical indicating if the white space trimming should happen.

lang

the target language (could be two lowercase letters, e.g., "en" for English, "fr" for French, "de" for German, etc.). One can also further specify variants, e.g., "en_US", or "en_GB", or even "fr_FR.UTF-8". For get_sciviews_lang() and set_sciviews_lang(), it is the secondary language. For the other functions, it is the R session language that is used by default. One can specify the alternate SciViews language globally with either the SCIVIEWS_LANG environment variable, or with the R option SciViews_lang, but it is a better practice to use set_sciviews_lang() in the R session. If missing, NULL, or "", the default is used from unset. For the SciViews language, uppercase letters are accepted, and they mean "translate more" (typically, factor and ordered levels are also translated, for instance).

fmt

a character vector of format strings, each of up to 8192 bytes.

n

a non-negative integer.

msg1

the message to be used in English for n = 1.

msg2

the message to be used in English for ⁠n = 0, 2, 3, ...⁠

unset

The default language to use if not defined yet, "en" (English) by default for regular R language, and the currently defined R language for the alternate SciViews language.

allow_uppercase

logical indicating if uppercase letters are allowed for the first two letters of the language code (FALSE by default, but should be TRUE for the SciViews language).

Details

To prepare your package for translation with these functions, you should import gettext_(), gettextf_() and ngettext_() from svBase. Then, you define gettext <- gettext_, gettextf <- gettextf_ and ngettext <- ngettext_ somewhere in your package. To prepare translation strings, you change the current directory of your R console to the base folder of the sources of your package and you issue tools::update_pkg_po(".") in R (or you include it in the tests: for an example, see tests/testthat/test-translations.R in the source of the svBase package). Then, you perform the translation for different languages with, say, poEdit, and recompile your package.

Value

A character vector with translated messages for the gettext...() functions.

test_gettext_lang() just serves to test and demonstrate the translation in a given language.

get_language() and get_sciviews_lang() return the current language. set_language() and set_sciviews_lang() return the previous language invisibly (with an attribute attr(*, "ok"), a logical indicating success).

check_lang() validates a ⁠lang=⁠ argument by returning TRUE invisibly, otherwise, it stop()s.

See Also

base::gettext(), base::gettextf(), tools::update_pkg_po().

Examples

get_language()
get_sciviews_lang()

old_lang <- set_language("fr") # Switch to French for R language
old_sv_lang <- set_sciviews_lang("fr") # Switch to French for SciViews also

# R looks for messages to be translated in gettext() calls, not gettext_()
# So, rename accordingly in your package:
gettext <- svBase::gettext_
gettextf <- svBase::gettextf_
ngettext <- svBase::ngettext_

# Retrieve strings in same language
gettext("empty model supplied", "incompatible dimensions",
 domain="R-stats", lang = "fr")

# Retrieve strings in different languages
gettext("empty model supplied", "incompatible dimensions",
  domain="R-stats", lang = "en")
gettext("empty model supplied", "incompatible dimensions",
  domain="R-stats", lang = "de")

# Try to get strings translated in an unknown language (just return the strings)
gettext("empty model supplied", "incompatible dimensions",
  domain="R-stats", lang = "xx")

# Test with some translations from the svBase package itself:
svBase::test_gettext_lang()
svBase::test_gettext_lang("fr", n = 1)
svBase::test_gettext_lang("fr", n = 2)
svBase::test_gettext_lang("en", n = 1)
svBase::test_gettext_lang("en", n = 2)

# Restore original languages
set_language(old_lang)
set_sciviews_lang(old_sv_lang)
rm(old_lang, old_sv_lang, gettext, gettextf, ngettext)

# In case you must check if a lang= argument gets a correct value:
check_lang("en")
check_lang("en_US.UTF-8")
# Only for SciViews language!
check_lang("FR", allow_uppercase = TRUE)
# But these are incorrect
try(check_lang("EN"))
try(check_lang(""))
try(check_lang(NA_character_))
try(check_lang(NULL))
try(check_lang(42))
try(check_lang(c("en", "fr")))
try(check_lang("Fr", allow_uppercase = TRUE))

Test if the object is a data frame (data.trame, data.frame, data.table or tibble)

Description

Test if the object is a data frame (data.trame, data.frame, data.table or tibble)

Usage

is_dtx(x, strict = TRUE)

is_dtrm(x, strict = TRUE)

is_dtf(x, strict = TRUE)

is_dtt(x, strict = TRUE)

is_dtbl(x, strict = TRUE)

Arguments

x

An object

strict

Should the object be strictly of the corresponding class (TRUE, the default), or could it be a subclass too (FALSE)? With strict = TRUE, the grouped_df tibbles and grouped_ts tsibbles (tibbles or tsibbles where dplyr::group_by() was applied) are also considered.

Value

These functions return TRUE if the object is of the correct class, otherwise they return FALSE. is_dtx() returns TRUE if x is a data.frame, data.table, tibble, or data.trame.
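The difference between a strict class test and one that also accepts subclasses can be sketched in base R. This is a hypothetical illustration only: is_df_sketch() and the my_df class are made-up names, it is not the svBase implementation, and it ignores the special handling of grouped tibbles.

```r
# Hypothetical helper: strict test on the class vector vs. inheritance test
is_df_sketch <- function(x, strict = TRUE) {
  if (strict) {
    identical(class(x), "data.frame") # Exactly a data.frame, no subclass
  } else {
    inherits(x, "data.frame") # A data.frame or any subclass of it
  }
}

# A made-up subclass of data.frame
sub_df <- structure(data.frame(a = 1:3), class = c("my_df", "data.frame"))

is_df_sketch(mtcars)
is_df_sketch(sub_df)
is_df_sketch(sub_df, strict = FALSE)
```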

Examples

# data(mtcars)
is_dtf(mtcars) # TRUE
is_dtx(mtcars) # Also TRUE
is_dtt(mtcars) # FALSE
is_dtbl(mtcars) # FALSE
is_dtrm(mtcars) # FALSE
# but...
is_dtt(as_dtt(mtcars)) # TRUE
is_dtx(as_dtt(mtcars)) # TRUE
is_dtbl(as_dtbl(mtcars)) # TRUE
is_dtx(as_dtbl(mtcars)) # TRUE
is_dtrm(as_dtrm(mtcars)) # TRUE
is_dtx(as_dtrm(mtcars)) # TRUE
is_dtx(as_dtbl(mtcars) |> dplyr::group_by(cyl)) # TRUE (special case)

is_dtx("some string") # FALSE

Set label (and units)

Description

Set the label, and optionally the units, attributes of an object. The label can be used for better display as plot axes labels, or as table headers in pretty-formatted R outputs. The units are usually associated with the label in axes labels for plots. cl() is a shortcut for concatenate (c()) and labelise().

Usage

labelise(x, label, units = NULL, as_labelled = FALSE, ...)

labelize(x, label, units = NULL, as_labelled = FALSE, ...)

## Default S3 method:
labelise(x, label, units = NULL, as_labelled = FALSE, ...)

## S3 method for class 'data.frame'
labelise(x, label, units = NULL, as_labelled = FALSE, self = TRUE, ...)

cl(..., label = NULL, units = NULL, as_labelled = FALSE)

unlabelise(x, ...)

unlabelize(x, ...)

## Default S3 method:
unlabelise(x, ...)

## S3 method for class 'data.frame'
unlabelise(x, ..., self = FALSE)

Arguments

x

An object.

label

The character string to set as label attribute to x.

units

The units (optional) as a character string to set for x.

as_labelled

Should the object be converted into a labelled S3 object (no by default)? If you don't make labelled objects, subsetting the data will lead to a loss of the label and units attributes for all variables. On the other hand, labelled objects are not always correctly handled by R code.

...

Further arguments: items to be concatenated in a vector using c(...) for cl(). Columns where label should be eliminated for unlabelise().

self

Do we label the data.frame itself (self = TRUE, by default) or variables within that data.frame (self = FALSE)? In the latter case, label= and units= must be either lists or character vectors of the same length as x, or be named with the names of several or all x variables. For unlabelise(), the default is self = FALSE.

Details

The same mechanism as the one used in package Hmisc is used here. However, Hmisc always adds the labelled class to an object, while here, this is optional. Setting this class makes the object print more nicely, and makes it subsettable without losing these attributes. But it conflicts with a class of the same name in package haven, used for other purposes. So, here, one can also opt not to set it, using as_labelled = FALSE.
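Why a dedicated class helps when subsetting can be illustrated with base R alone. The sketch below does not use labelise(): the keep_attrs class and its `[` method are hypothetical names that only mimic the general idea of a subsetting method that restores attributes.

```r
# Plain attributes are dropped by base subsetting
x_plain <- structure(1:10, label = "A suite of integers", units = "cm")
attr(x_plain, "label")
y_plain <- x_plain[1:3]
attr(y_plain, "label") # NULL: the label is lost

# A made-up class with a `[` method that restores the attributes
`[.keep_attrs` <- function(x, ...) {
  res <- NextMethod() # Default subsetting (drops label, units and class)
  attr(res, "label") <- attr(x, "label")
  attr(res, "units") <- attr(x, "units")
  class(res) <- class(x)
  res
}

x_kept <- structure(1:10, label = "A suite of integers", units = "cm",
  class = "keep_attrs")
y_kept <- x_kept[1:3]
attr(y_kept, "label") # The label survives subsetting
attr(y_kept, "units")
```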

Value

The x object plus a label attribute, and possibly, a units attribute.

See Also

label(), base::units()

Examples

# Labelise a vector:
x <- 1:10
x <- labelise(x, label = "A suite of integers", units = "cm")
x
# or, in a single operation:
x <- cl(1:10, label = "A suite of integers", units = "cm")
x
# Not adding the labelled class:
x <- cl(1:10, label = "Integers", units = "cm", as_labelled = FALSE)
x
# Unlabelising a labelised object
unlabelise(x)

# Labelise a data.frame
iris <- labelise(datasets::iris, "The famous iris dataset")
unlabelise(iris)
# but if you indicate self = FALSE, you can labelise variables within the
# data.frame (use a list or character vector of same length as x, or a
# named list or character vector):
iris <- labelise(iris, self = FALSE, label = list(
  Sepal.Length = "Length of the sepals",
  Petal.Length = "Length of the petals"
  ), units = c(rep("cm", 4), NA))
iris <- unlabelise(iris, self = FALSE)

Prepare a data_dot function

Description

Prepare a function that uses the data-dot mechanism. In a call to a "data-dot" function, if the argument (usually .data = (.)) is missing or is not a data frame, the function is called again after injecting . as first argument.

Usage

prepare_data_dot(x, is_top_call = TRUE)

prepare_data_dot2(x, y, is_top_call = TRUE)

recall_with_data_dot(
  call,
  arg = ".data",
  value = quote((.)),
  env = parent.frame(2L),
  abort_msg = gettextf("`%s` must be a 'data.frame'.", arg),
  abort_msg2 = gettext("Implicit data-dot (.) not permitted"),
  abort_msg3 = gettext("Data-dot mechanism activated, but no `.` object found."),
  abort_msg4 =
    gettext("{.code {deparse(call0)}} rewritten as:\n{.code {deparse(call)}}"),
  abort_msg5 = gettextf("Define `.` before this call, or provide `%s =` explicitly.",
    arg),
  abort_msg6 = gettextf("See {.help svBase::data_dot_mechanism} for more infos."),
  abort_frame = parent.frame()
)

recall_with_data_dot2(
  call,
  arg = "x",
  arg2 = "y",
  value = quote((.)),
  env = parent.frame(2L),
  abort_msg = gettextf("`%s` and `%s` must both be 'data.frame'.", arg, arg2),
  abort_msg2 = gettext("Implicit data-dot (.) not permitted"),
  abort_msg3 = gettext("Data-dot mechanism activated, but no `.` object found."),
  abort_msg4 =
    gettext("{.code {deparse(call0)}} rewritten as:\n{.code {deparse(call)}}"),
  abort_msg5 = gettextf("Define `.` before this call, or provide `%s =` explicitly.",
    arg),
  abort_msg6 = gettextf("See {.help svBase::data_dot_mechanism} for more infos."),
  abort_frame = parent.frame()
)

Arguments

x

An argument to check.

is_top_call

A logical indicating if this is a top-level call (TRUE by default) that should be focused in the call stack in case of an error.

y

A second argument.

call

A call object, usually a function call. Could be omitted, and in this case, sys.call() is invoked.

arg

The name of the argument to inject, usually '.data' (default). For recall_with_data_dot2(), it is "x" by default.

value

The value to inject, usually the symbol . (default).

env

The environment where the data-dot-injected call should be evaluated (by default, parent.frame(2L); should rarely be changed).

abort_msg

The message to use in case the '.data' argument is wrong.

abort_msg2

An additional message to append to the error message in case data-dot-injection is not permitted (when .SciViews.implicit.data.dot != TRUE, see example).

abort_msg3

An error message when . is not found.

abort_msg4

Before and after data-dot replacement message.

abort_msg5

An additional explanation when . is not found.

abort_msg6

A hint to read the documentation of the data-dot mechanism.

abort_frame

The environment to use for the error message, by default, the caller environment (should rarely be changed).

arg2

The name of the second argument, y by default.

Details

The call is not checked for being a correct function call. When sys.call() is passed as call from within a function, it should always be correct. prepare_data_dot2() and recall_with_data_dot2() are similar, but for functions whose first two arguments must be data frames, generally called x and y.
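The general idea of re-issuing a call with . injected as first argument can be sketched in base R. This is a simplified, hypothetical illustration (first_rows() is a made-up function) and not the code used by prepare_data_dot() or recall_with_data_dot(); in particular, it has no option to forbid implicit data-dot and no elaborate error messages.

```r
# Hypothetical data-dot function written with base R only
first_rows <- function(.data, n = 2L) {
  if (missing(.data) || !is.data.frame(.data)) {
    env <- parent.frame()
    dot <- get0(".", envir = env) # NULL if no `.` object is found
    if (!is.data.frame(dot))
      stop("`.data` must be a data.frame and no data frame `.` was found.")
    cl <- sys.call()
    # Rebuild the call with `.data = .` in front of the original arguments
    new_call <- as.call(c(list(cl[[1L]], .data = quote(.)), as.list(cl)[-1L]))
    return(eval(new_call, envir = env))
  }
  head(.data, n)
}

. <- data.frame(x = 1:3, y = 4:6)
first_rows(., 1L) # Explicit data frame
first_rows(1L)    # Rewritten as first_rows(.data = ., 1L)
first_rows()      # Rewritten as first_rows(.data = .)
```

The check on . before the call is rebuilt avoids an endless recursion when . exists but is not a data frame.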

Value

TRUE if the preparation is correct for prepare_data_dot(), FALSE otherwise. The result from evaluating the data-dot-injected call for recall_with_data_dot().

See Also

data_dot_mechanism

Examples

library(svBase)
# Use this (rename) function to get extra-info in the error message about the
# data-dot mechanism automatically
stop <- stop_

# Here is how you create a data-dot function
my_subset <- function(.data = (.), i, j) {
  if (!prepare_data_dot(.data))
    return(recall_with_data_dot())

  if (!is.numeric(i))
    stop("{.arg i} must be numeric") # Use this function
  if (i == 1)
    message(".env has ", paste(names(.env), collapse = ", "))
  .data[i, j]
}

dtf1 <- data.frame(x = 1:3, y = 4:6)
# The message shows the objects available in the function environment
my_subset(dtf1, 1, 'y')
# This is wrong
try(my_subset(dtf1, 'y'))
. <- dtf1
my_subset(2, 'y')
# Error message with indication that the data-dot mechanism is activated
try(my_subset('y'))
# Data-dot mechanism not activated, but dot object used
try(my_subset(., 'y'))
# Wrong .data=
try(my_subset(.data = 'y'))

# When data-dot is not permitted...
.SciViews.implicit.data.dot <- FALSE
try(my_subset(2, 'y'))
rm(.SciViews.implicit.data.dot)

# When `.` is not found...
rm(.)
try(my_subset(2, 'y'))

Process error message (replace it with a better version)

Description

The tra list indicates which (partial) error message should be replaced by a more elaborate version, using the cli::cli_abort() syntax.

Usage

process_error_msg(msg, tra, fixed = TRUE)

Arguments

msg

The original error message.

tra

The translation list, where the names are the partial error and the content is the new error message(s).

fixed

Should a fixed or a regular expression comparison be done between the names of tra and the msg? Defaults to TRUE, which is faster. In case you need different behavior for different items, you can also supply a logical vector of the same length as tra.

Value

The replaced error message.
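How a per-item fixed= flag could drive the matching can be sketched with base::grepl(). The helper below, replace_msg_sketch(), is a hypothetical name and only an assumption about the general logic (first matching item wins); it is not the actual code of process_error_msg().

```r
# Hypothetical sketch: replace a message when a name of tra matches it
replace_msg_sketch <- function(msg, tra, fixed = TRUE) {
  fixed <- rep_len(fixed, length(tra)) # One flag per item of tra
  for (i in seq_along(tra)) {
    if (grepl(names(tra)[i], msg, fixed = fixed[i]))
      return(tra[[i]]) # First matching item wins
  }
  msg # No match: the message is returned unchanged
}

tra_sketch <- list(
  "file does not exist" = "The file is missing.",           # Fixed string
  "^Directory .* not found$" = "The directory is missing."  # Regular expression
)
replace_msg_sketch("The file does not exist", tra_sketch, fixed = c(TRUE, FALSE))
replace_msg_sketch("Directory /tmp/x not found", tra_sketch, fixed = c(TRUE, FALSE))
replace_msg_sketch("Something else", tra_sketch, fixed = c(TRUE, FALSE))
```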

Examples

dir <- "/temp/dir"
file <- "~/some file.txt"
tra = list(
 "Directory not found" = c("The directory {.path {dir}} does not exist.",
   i = "Please create it first."),
 "file does not exist" = c("The file {.file {file}} does not exist.",
   i = "Check the path and try again.")
)
msg <- "The file does not exist"
try(cli::cli_abort(process_error_msg(msg, tra)))

msg <- "Directory not found on this machine"
try(cli::cli_abort(process_error_msg(msg, tra)))

Retarget a formula to a different environment

Description

Change the environment of a formula to a specified environment.

Usage

retarget(x, env = parent.frame())

Arguments

x

A formula (or a quosure), or a list of formulas.

env

The environment to which the formula(s) should be retargeted. Defaults to the parent frame.

Value

The formula or list of formulas with the new environment set.
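What changing the environment of a formula means can be shown with base R's environment<-. This sketch does not call retarget(); it only illustrates the underlying operation on a single formula, with a made-up helper build_formula().

```r
# A formula built inside a function keeps that function's environment
build_formula <- function(txt) as.formula(txt)
frm <- build_formula("y ~ x")
identical(environment(frm), globalenv()) # FALSE

# Base R equivalent of retargeting one formula to the global environment
environment(frm) <- globalenv()
identical(environment(frm), globalenv()) # TRUE
```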

Examples

# A rather inefficient way to build a formula... for the sake of the demo!
make_formula <- function(x)
  as.formula(x)
f <- make_formula("x ~ log(y) + z")
f
f <- retarget(f)
f
# OK, but the environment associated to this formula is...
# the environment of the function:
rlang::f_env(f)

# With a list of formulas (local() creates a new environment):
fl <- local(list(y ~ x^2, z ~ sqrt(cos(x^2) + sin(x^2))))
fl # Not in GlobalEnv
retarget(fl) # Retargeted to GlobalEnv

Create a section in a list (collection of functions and other objects).

Description

A section tags a list to sort its items. It is particularly useful when you create a collection of functions (or other objects) to ease the access to these functions. Sections are displayed in printed and "str"ed versions of the list and are also functions that cut the list to the section content only. get_section() is the workhorse function that does the section extraction.

Usage

section(obj, title)

## S3 method for class 'section'
print(x, ...)

## S3 method for class 'section'
str(object, ...)

get_section(x, title)

Arguments

obj

A list object.

title

The title of the section. It must match the name of the list item. For a title "My section title", the name must be "0__MY_SECTION_NAME__" that is both a syntactically correct name and something that emphasizes the entry as a title.

x

A list containing the section

...

Further arguments (not used yet)

object

A list to use for section extraction

Value

A function that is able to extract the corresponding section from the list.

Examples

#TODO...

Speedy functions (mainly from collapse and data.table) to manipulate data frames

Description

[Deprecated]

These functions are deprecated in favor of the functions whose name ends with an underscore _ (e.g., sselect() -> svTidy::select_()) in the svTidy package.

The Tidyverse defines a coherent set of tools to manipulate data frames that use non-standard evaluation and sometimes require extra care. These functions, like dplyr::mutate() or dplyr::summarise(), are defined in the {dplyr} and {tidyr} packages. The {collapse} package provides a couple of functions with a similar interface, but with different and much faster code. For instance, collapse::fselect() is similar to dplyr::select(), and collapse::fsummarise() is similar to dplyr::summarise(). Not all functions are implemented, arguments and argument names differ, and the behavior may be very different, like collapse::frename() which uses old_name = new_name, while dplyr::rename() uses new_name = old_name!

The speedy functions are all prefixed with an "s", like smutate(), and build on the work initiated in {collapse} to propose a series of functions paired with the tidy ones. So, smutate() and dplyr::mutate() are "speedy" and "tidy" counterparts and they are used in a very similar, if not identical, way. The "s" prefix is there to draw attention to their particularities. Their classes are function and speedy_fn. Avoid mixing tidy, speedy and non-tidy/speedy functions in the same pipeline.

This is a global page that presents all the speedy functions in one place. It is not meant to be a clear and detailed help page for each individual "s" function. Please refer to the help page of the corresponding non-"s" paired function for more details! You can use the {svMisc}'s .?smutate syntax to go to the help page of the non-"s" function with a message.
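The opposite argument order of dplyr::rename() and collapse::frename() mentioned above can be checked with a tiny data frame. The sketch assumes that {dplyr} and {collapse} are installed, so each call is guarded; df_demo is a made-up object.

```r
df_demo <- data.frame(a = 1:3, b = 4:6)

# dplyr: new_name = old_name
if (requireNamespace("dplyr", quietly = TRUE)) {
  print(names(dplyr::rename(df_demo, alpha = a)))
}

# collapse: old_name = new_name
if (requireNamespace("collapse", quietly = TRUE)) {
  print(names(collapse::frename(df_demo, a = alpha)))
}
```

Both calls rename column a to alpha, despite the reversed left and right hand sides.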

Usage

list_speedy_functions()

sgroup_by(.data, ...)

sungroup(.data, ...)

srename(.data, ...)

srename_with(.data, .fn, .cols = everything(), ...)

sfilter(.data, ...)

sfilter_ungroup(.data, ...)

sselect(.data, ...)

smutate(.data, ..., .keep = "all")

smutate_ungroup(.data, ..., .keep = "all")

stransmute(.data, ...)

stransmute_ungroup(.data, ...)

ssummarise(.data, ...)

sfull_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)

sleft_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)

sright_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)

sinner_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)

sbind_rows(..., .id = NULL)

scount(
  x,
  ...,
  wt = NULL,
  sort = FALSE,
  name = NULL,
  .drop = dplyr::group_by_drop_default(x),
  sort_cat = TRUE,
  decreasing = FALSE
)

stally(
  x,
  wt = NULL,
  sort = FALSE,
  name = NULL,
  sort_cat = TRUE,
  decreasing = FALSE
)

sadd_count(
  x,
  ...,
  wt = NULL,
  sort = FALSE,
  name = NULL,
  .drop = NULL,
  sort_cat = TRUE,
  decreasing = FALSE
)

sadd_tally(
  x,
  wt = NULL,
  sort = FALSE,
  name = NULL,
  sort_cat = TRUE,
  decreasing = FALSE
)

sbind_cols(
  ...,
  .name_repair = c("unique", "universal", "check_unique", "minimal")
)

sarrange(.data, ..., .by_group = FALSE)

spull(.data, var = -1, name = NULL, ...)

sdistinct(.data, ..., .keep_all = FALSE)

sdrop_na(data, ...)

sreplace_na(data, replace, ...)

spivot_longer(data, cols, names_to = "name", values_to = "value", ...)

spivot_wider(data, names_from = name, values_from = value, ...)

suncount(data, weights, .remove = TRUE, .id = NULL)

sunite(data, col, ..., sep = "_", remove = TRUE, na.rm = FALSE)

sseparate(
  data,
  col,
  into,
  sep = "[^[:alnum:]]+",
  remove = TRUE,
  convert = FALSE,
  ...
)

sseparate_rows(data, ..., sep = "[^[:alnum:].]+", convert = FALSE)

sfill(data, ..., .direction = c("down", "up", "downup", "updown"))

sextract(
  data,
  col,
  into,
  regex = "([[:alnum:]]+)",
  remove = TRUE,
  convert = FALSE,
  ...
)

Arguments

.data

A data frame (data.frame, data.table or tibble's tbl_df)

...

Arguments that depend on the context of the function and that are, most of the time, not evaluated in a standard way (cf. the tidyverse approach).

.fn

A function to use.

.cols

The list of the columns to which the transformation is applied. For the moment, only all existing columns, which means .cols = everything(), is implemented.

.keep

Which columns to keep. The default is "all", possible values are "used", "unused", or "none" (see dplyr::mutate()).

x

A data frame (data.frame, data.table or tibble's tbl_df).

y

A second data frame.

by

A list of names of the columns to use for joining the two data frames.

suffix

The suffix to the column names to use to differentiate the columns that come from the first or the second data frame. By default it is c(".x", ".y").

copy

This argument is there for compatibility with the "t" matching functions, but it is not used here.

.id

The name of the column for the origin id, either names if all other arguments are named, or numbers.

wt

Frequency weights. Can be NULL or a variable. Use data masking.

sort

If TRUE, the largest groups are shown on top.

name

The name of the new column in the output (n by default, and no existing column must have this name, or an error is generated).

.drop

Are levels with no observations dropped (TRUE by default)?

sort_cat

Are levels sorted (TRUE by default)?

decreasing

Is sorting done in decreasing order (FALSE by default)?

.name_repair

How should the name be "repaired" to avoid duplicate column names? See dplyr::bind_cols() for more details.

.by_group

Logical. If TRUE, rows are first arranged by the grouping variables, if any. FALSE by default.

var

A variable specified as a name, a positive or a negative integer (counting from the end). The default is -1 and returns last variable.

.keep_all

If TRUE keep all variables in .data.

data

A data frame, or for replace_na() a vector or a data frame.

replace

If data is a vector, a unique value to replace NAs, otherwise, a list of values, one per column of the data frame.

cols

A selection of the columns using tidy-select syntax, see tidyr::pivot_longer().

names_to

A character vector with the name or names of the columns for the names.

values_to

A string with the name of the column that receives the values.

names_from

The column or columns containing the names (use tidy selection and do not quote the names).

values_from

Idem for the column or columns that contain the values.

weights

A vector of weights to use to "uncount" data.

.remove

If TRUE, and weights is the name of a column, that column is removed from data.

col

The name (quoted or not) of the new column with the united variables.

sep

Separator to use between values for united or separated columns.

remove

If TRUE the initial columns that are separated are also removed from data.

na.rm

If TRUE, NAs are eliminated before uniting the values.

into

Name of the new column to put separated variables. Use NA for items to drop.

convert

If TRUE, resulting values are converted into numeric, integer or logical.

.direction

Direction in which to fill missing data: "down" (by default), "up", or "downup" (first down, then up), "updown" (the opposite).

regex

A regular expression used to extract the desired values (use one group with ( and ⁠)⁠ for each element of into).

Value

See corresponding "non-s" function for the full help page with indication of the return values.

Note

The ssummarise() function does not support n() as does dplyr::summarise(). You can use fn() instead, but then, you must give a variable name as argument. The fn() alternative can also be used in dplyr::summarise() for homogeneous syntax between the two. From {dplyr}, the dplyr::slice() and slice_xxx() functions are not added yet because they are not available for {dbplyr}. Also dplyr::anti_join(), dplyr::semi_join() and dplyr::nest_join() are not implemented yet. From {tidyr} tidyr::expand(), tidyr::chop(), tidyr::unchop(), tidyr::nest(), tidyr::unnest(), tidyr::unnest_longer(), tidyr::unnest_wider(), tidyr::hoist(), tidyr::pack() and tidyr::unpack() are not implemented yet.

Examples

# TODO...

Enhanced stop() and warning()

Description

stop_() is a wrapper around cli::cli_abort() that provides more control over the stop message and also provides nice formatting and glue interpolation. Unlike cli::cli_abort(), this version uses gettext() to translate the message. warning_() is similar to warning() but uses call. = FALSE by default. Finally, stop_top_call() allows tagging where an error should be reported from (see examples).

Usage

stop_(
  ...,
  call. = FALSE,
  domain = NULL,
  class = NULL,
  call = stop_top_call(2L),
  envir = parent.frame(),
  last_call = sys.call(-1L)
)

warning_(
  ...,
  call. = FALSE,
  immediate. = FALSE,
  noBreaks. = FALSE,
  domain = NULL
)

stop_top_call(nframe = 2L)

object_info(x)

Arguments

...

One or more character strings with the error or warning message(s). Name them '*' =, 'i' =, 'v' =, 'x' = or '!' = to format message items. First message item is considered to be '!' by default.

call.

Logical, whether to include the call in the warning message. Not used for stop_().

domain

See gettext(). If NA, messages will not be translated.

class

The subclass of the error condition message

call

The execution environment of a currently running function where the error should be reported from.

envir

The environment where to evaluate the glue expressions.

last_call

The last call issued by the user, used to check if a dot (.) object was invoked.

immediate.

Logical, whether to issue the warning immediately even if getOption("warn") <= 0. Note that this is not respected for condition objects.

noBreaks.

Logical, indicating that, as far as possible, the message should be output as a single line when options(warn = 1).

nframe

The number of frames to go up the call stack to find the top call for stop condition messages.

x

An R object to describe.

Value

stop_() and warning_() are invoked for their side effects; stop_() actually stops the execution of the current code. stop_top_call() returns the top call to be used for stop condition messages.
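The effect of call. = FALSE on a warning condition can be observed with base R only. In the sketch below, warn_sketch() is a hypothetical stand-in that merely changes the default of call.; it is not the code of warning_().

```r
# Hypothetical wrapper: same as warning(), but call. = FALSE by default
warn_sketch <- function(..., call. = FALSE) warning(..., call. = call.)

quiet_fun <- function() warn_sketch("just a test")
loud_fun <- function() warning("just a test")

# Capture the warning conditions instead of displaying them
cond_quiet <- tryCatch(quiet_fun(), warning = function(w) w)
cond_loud <- tryCatch(loud_fun(), warning = function(w) w)

conditionCall(cond_quiet) # NULL: no call attached to the condition
conditionCall(cond_loud)  # The call that issued the warning
conditionMessage(cond_quiet)
```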

See Also

stop(), warning(), cli::cli_abort(), gettext()

Examples

# If you want to include the error messages in the translation strings in
# your package, you have to rename `stop_()` into `stop()` and `warning_()`
# into `warning()`, because [tools::xgettext2pot()] will only pick up the
# latter ones.
stop <- stop_
warning <- warning_

# The call is not included by default now
warning("just a test")
warning("just a test", call. = TRUE)

if (FALSE) {# Avoid running code that generates errors automatically
# Correctly formatted stop messages
n <- "some text"
stop("{.var n} must be a numeric vector",
  x = "You've supplied a {.cls {class(n)}} vector.")

n <- 1:18
stop("{.var n} must be a scalar numeric:",
  i = "There {?is/are} {length(n)} element{?s}.",
  x = "Indicate a single numeric, not: {n}.")

# When issued from within a function, the function call is used in the error
test1 <- function(x) {
  stop("{.var n} must be a scalar numeric:",
    i = "There {?is/are} {length(x)} element{?s}.")
}
test1(1:3)

# If another function calls `test1()`, error is still reported from test1:

test2 <- function(x) {
  test1(x)
}
test2(1:3)

# In such a case, it is better to report the error from `test2()`.
# You can do that by stating `.__top_call__. <- TRUE` in the body of `test2()`.
test2 <- function(x) {
  .__top_call__. <- TRUE
  test1(x)
}
test2(1:3)
}# End of if(FALSE)

rm(stop, warning)

Define a function as being 'subsettable' using $ operator

Description

In case a textual argument allows for selecting the result, for instance, if plot() allows for several charts that you can choose with a type= or which= argument, making the function 'subsettable' also allows writing fun$variant(). The subsettable_type2 variant is faster for only internal implementation of various types, while subsettable_type first searches for a function with name.<generic>$type(). See examples.
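The principle behind fun$variant() can be sketched with a small S3 method for $ in base R. The class name subsettable_sketch and the function describe() are hypothetical, and this minimal version is only an assumption about the general idea, not the svBase implementation (no completion, no lookup of separate type functions).

```r
# Hypothetical `$` method: fun$name(...) becomes fun(..., type = "name")
`$.subsettable_sketch` <- function(x, name) {
  function(...) x(..., type = name)
}

describe <- structure(function(x, type = c("mean", "median")) {
  type <- match.arg(type)
  switch(type,
    mean = mean(x),
    median = median(x)
  )
}, class = c("function", "subsettable_sketch"))

describe(c(1, 2, 10), type = "median") # Usual call
describe$median(c(1, 2, 10))           # Same result with the `$` syntax
describe$mean(c(1, 2, 3))
```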

Usage

## S3 method for class 'subsettable_type'
x$name

## S3 method for class 'subsettable_type2'
x$name

## S3 method for class 'subsettable_which'
x$name

name_function_type(fun, method = NULL, type)

list_types(fun, method = NULL)

get_type(fun, method = NULL, type, stop.if.missing = TRUE)

args_type(fun, method = NULL, type)

## S3 method for class 'subsettable_type'
.DollarNames(x, pattern = "")

Arguments

x

A subsettable_type function.

name

The value to use for the ⁠type=⁠ argument.

fun

The name of the function (as a scalar string).

method

An optional method name (as a scalar string).

type

The type to select (as a scalar string).

stop.if.missing

If TRUE (default), an error is raised if the requested type is not found. If FALSE, NULL is returned instead.

pattern

A regular expression. Only matching names are returned.

Examples

# Simple selection of type with a switch inside the function itself
foo <- structure(function(x, type = c("histogram", "boxplot"), ...) {
  type <- match.arg(type, c("histogram", "boxplot"))
  switch(type,
    histogram = hist(x, ...),
    boxplot = boxplot(x, ...),
    stop("unknown type")
  )
}, class = c("function", "subsettable_type2"))
foo

# This function can be used as usual:
foo(rnorm(50), type = "histogram")
# ... but also this way:
foo$histogram(rnorm(50))
foo$boxplot(rnorm(50))

# A more complex use, where it is possible to define additional types easily.
# It also allows for completion after fun$... and completion of function
# arguments, depending on the selected type (to avoid putting all arguments
# for all types together, otherwise, it is a mess)
head2 <- structure(function(data, n = 10, ..., type = "default") {
  # This was the old (static) approach: not possible to add a new type
  # without modifying the function head2()
  #switch(type,
  #  default = `.head2$default`(data, n = n, ...),
  #  fun = `.head2$fun`(data, n = n, ...)
  #)
  # This is the new (dynamic) approach
  get_type("head2", type = type)(data, n = n, ...)
}, class = c("subsettable_type", "function", "head2"))

# We define two types for head2(): default and fun
`head2_default` <- function(data, n = 10, ...) {
  head(data, n = n)
}

# Apply a fun on head() - just an example, not necessarily useful
`head2_fun` <- function(data, n = 10, fun = summary, ...) {
  head(data, n = n) |> fun(...)
}

head2(iris)
head2(iris, type = "default") # Idem
head2$default(iris) # Idem
head2$fun(iris) # The other type, with fun = summary()
head2$fun(iris, fun = str)

# Now, the completion (e.g., in RStudio or Positron)
# 1. Type head2$ and you get the list of available types
# 2. Select "default" then hit <tab>, you get the list of args for default
# 3. Do the same but select "fun", now you get the arguments for the fun type
# 4. Just write a new `.head2_<type>` function and <type> is automatically
#    integrated!

Tidy functions (mainly from dplyr and tidyr) to manipulate data frames

Description

[Deprecated]

These functions are deprecated in favor of the functions whose name ends with an underscore _ (e.g., select() -> svTidy::select_()) in the svTidy package.

The Tidyverse defines a coherent set of tools to manipulate data frames that use non-standard evaluation and sometimes require extra care. These functions, like dplyr::mutate() or dplyr::summarise(), are defined in the {dplyr} and {tidyr} packages. When using variants, like {dtplyr} for data.table objects, or {dbplyr} to work with external databases, successive commands in a pipeline are pooled together but not computed. One has to dplyr::collect() the result to get its final form. Most of the tidy functions that have their "speedy" counterpart prefixed with "s" are listed with list_tidy_functions(). Their main usages are (excluding less used arguments, or those that are not compatible with the speedy "s" counterpart functions):

  • group_by(.data, ...)

  • ungroup(.data)

  • rename(.data, ...)

  • rename_with(.data, .fn, .cols = everything(), ...)

  • filter(.data, ...)

  • select(.data, ...)

  • mutate(.data, ..., .keep = "all")

  • transmute(.data, ...)

  • summarise(.data, ...)

  • full_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)

  • left_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)

  • right_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)

  • inner_join(x, y, by = NULL, suffix = c(".x", ".y"), copy = FALSE, ...)

  • bind_rows(..., .id = NULL)

  • bind_cols(..., .name_repair = c("unique", "universal", "check_unique", "minimal"))

  • arrange(.data, ..., .by_group = FALSE)

  • count(x, ..., wt = NULL, sort = FALSE, name = NULL)

  • tally(x, wt = NULL, sort = FALSE, name = NULL)

  • add_count(x, ..., wt = NULL, sort = FALSE, name = NULL)

  • add_tally(x, wt = NULL, sort = FALSE, name = NULL)

  • pull(.data, var = -1, name = NULL)

  • distinct(.data, ..., .keep_all = FALSE)

  • drop_na(data, ...)

  • replace_na(data, replace)

  • pivot_longer(data, cols, names_to = "name", values_to = "value")

  • pivot_wider(data, names_from = name, values_from = value)

  • uncount(data, weights, .remove = TRUE, .id = NULL)

  • unite(data, col, ..., sep = "_", remove = TRUE, na.rm = FALSE)

  • separate(data, col, into, sep = "[^[:alnum:]]+", remove = TRUE, convert = FALSE)

  • separate_rows(data, ..., sep = "[^[:alnum:].]+", convert = FALSE)

  • fill(data, ..., .direction = c("down", "up", "downup", "updown"))

  • extract(data, col, into, regex = "([[:alnum:]]+)", remove = TRUE, convert = FALSE)

plus the functions defined hereunder.

Usage

list_tidy_functions()

filter_ungroup(.data, ...)

mutate_ungroup(.data, ..., .keep = "all")

transmute_ungroup(.data, ...)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See dplyr::mutate() for more details.

...

Arguments that depend on the context of the function and that are, most of the time, not evaluated in a standard way (cf. the tidyverse approach).

.keep

Which columns to keep. The default is "all", possible values are "used", "unused", or "none" (see dplyr::mutate()).

Value

See corresponding "non-t" function for the full help page with indication of the return values. list_tidy_functions() returns a list of all the tidy(verse) functions that have their speedy "s" counterpart, see speedy_functions.

Note

The help page here is very basic and it aims mainly to list all the tidy functions. For more complete help, see the {dplyr} or {tidyr} packages. From {dplyr}, the dplyr::slice() and slice_xxx() functions are not added yet because they are not available for {dbplyr}. Also dplyr::anti_join(), dplyr::semi_join() and dplyr::nest_join() are not implemented yet. From {tidyr} tidyr::expand(), tidyr::chop(), tidyr::unchop(), tidyr::nest(), tidyr::unnest(), tidyr::unnest_longer(), tidyr::unnest_wider(), tidyr::hoist(), tidyr::pack() and tidyr::unpack() are not implemented yet.

See Also

collapse::num_vars() to easily keep only numeric columns from a data frame, collapse::fscale() for scaling and centering matrix-like objects and data frames.

Examples

# TODO...