[Maturing]

qplot_walk() helps you visually assess the distribution of your data. It uses geom_bar() (for non-double variables) or geom_histogram() (for double variables) to walk through each selected variable of a data frame.

qplot_walk(
  data,
  ...,
  cols = NULL,
  pattern = NULL,
  ignore = "character",
  remove_id = TRUE,
  relative_freq = FALSE,
  midday_change = TRUE
)

Arguments

data

An atomic vector or a data.frame object.

...

(optional) additional arguments to be passed to geom_bar() (for non-double variables) or geom_histogram() (for double variables).

cols

(optional) (only for data frames) a character object indicating column names in data for plotting. If NULL, qplot_walk() will use all columns in data. This setting only works if pattern = NULL (default: NULL).

pattern

(optional) (only for data frames) a string with a regular expression to select column names in data for plotting. This setting only works if cols = NULL (default: NULL).

ignore

(optional) (only for data frames) a character object indicating which object classes the function must ignore. This setting can be used with cols and pattern. Assign NULL to disable this behavior (default: "character").

remove_id

(optional) (only for data frames) a logical value indicating if the function must ignore column names in data that match with the regular expression "^id$|[\\._-]id$" (default: TRUE).

relative_freq

(optional) a logical value indicating if the y axis must show the relative frequency of the bins/bars instead of their counts (default: FALSE).

midday_change

(optional) a logical value indicating if the function must apply a midday change for hms variables with values greater than 22:00:00 (see the Details section to learn more) (default: TRUE).
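The interaction between pattern and ignore can be pictured with a small base R sketch. This is only an illustration under stated assumptions, not the function's actual implementation: select_cols() is a hypothetical helper, and it assumes that ignore is matched against the class of each column.

```r
# Hypothetical helper: mimics how `pattern` and `ignore` could narrow
# down the columns of a data frame (an assumption, not the real code).
select_cols <- function(data, pattern = NULL, ignore = "character") {
  selected <- names(data)

  # Keep only the column names that match the regular expression.
  if (!is.null(pattern)) {
    selected <- grep(pattern, selected, value = TRUE)
  }

  # Drop the columns that inherit from any of the ignored classes.
  if (!is.null(ignore)) {
    is_ignored <- vapply(data[selected], inherits, logical(1), what = ignore)
    selected <- selected[!is_ignored]
  }

  selected
}

# Toy data: 'iris' with 'Species' stored as a character column.
iris_chr <- datasets::iris
iris_chr$Species <- as.character(iris_chr$Species)

select_cols(iris_chr)                        # the four numeric columns
select_cols(iris_chr, pattern = "\\.Width$") # "Sepal.Width" "Petal.Width"
select_cols(iris_chr, ignore = NULL)         # all five columns
```

With ignore = NULL the character column is kept, which matches the documented way of disabling the class filter.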

Value

An invisible NULL. This function is called for its side effect (plotting) and is not meant to return a value.

Details

Requirements

This function requires the ggplot2, grDevices, and utils packages and can only run in interactive mode. The utils and grDevices packages come with a standard R installation and are typically loaded by default. Most people also run R interactively.

If you don't have ggplot2 installed, you can install it with install.packages("ggplot2"). The grDevices and utils packages ship with R and do not need to be installed separately.

Plot recovery

qplot_walk() clears all plots after it runs. For that reason, the function first emits a dialog message warning the user of this behavior before it runs. If you want to recover a single distribution plot, assign the variable vector to the data argument.

Additional arguments to geom_bar() or geom_histogram()

qplot_walk() uses ggplot2 geom_bar() (for non-double variables) or geom_histogram() (for double variables) to generate plots. If you are familiar with these functions, you can pass additional arguments to them using the ellipsis argument (...).

Note that x, y, and data arguments are reserved for qplot_walk().

Duration, Period, and difftime objects

To help with the visualization, qplot_walk() automatically converts Duration, Period, and difftime objects to hms.

Midday change

Time variables with values greater than 22:00:00 will automatically be converted to POSIXct and be attached to a two-day timeline using the midday hour as a cutting point, i.e., all values of 12:00:00 or later will be placed on day 1, and all the rest will be placed on day 2.

This is done to better represent time vectors that cross the midnight hour. You can disable this behavior by using midday_change = FALSE.

Example: Say you have a vector of time values that cross the midnight hour (e.g., an hms vector with 22:00, 23:00, 00:00, 01:00 values). If you use midday_change = FALSE, your data will be represented linearly.

00:00 01:00                                22:00 23:00
  |-----|------------------------------------|-----|------->

By using midday_change = TRUE (default), qplot_walk() will fit your data to a circular time frame of 24 hours.

             day 1                         day 2
                22:00 23:00 00:00 01:00
------------------|-----|-----|-----|---------------------->
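The same idea can be sketched in base R. This is a minimal illustration, not the function's actual code: it assumes the times are stored as seconds since midnight, and the anchor date (1970-01-01, in UTC) is an arbitrary choice made for the example.

```r
# Times of day as seconds since midnight: 22:00, 23:00, 00:00, 01:00.
times <- c(22, 23, 0, 1) * 3600

# Arbitrary anchor for day 1 (an assumption made for this sketch).
day_1 <- as.POSIXct("1970-01-01", tz = "UTC")

# Midday is the cutting point: values at or after 12:00 stay on day 1,
# earlier values are pushed 24 hours ahead, landing on day 2.
midday <- 12 * 3600
shifted <- ifelse(times >= midday, times, times + 24 * 3600)
timeline <- day_1 + shifted

# Day of the month followed by the time of day, in chronological order.
labels <- format(sort(timeline), "%d %H:%M")
labels
#> [1] "01 22:00" "01 23:00" "02 00:00" "02 01:00"
```

After the shift, the values that cross midnight end up next to each other on the axis instead of at opposite ends.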

id variables

qplot_walk() will ignore any variable whose name matches the pattern "^id$|[\\._-]id$", i.e., any variable named id or that ends with .id, _id, or -id.

You can disable this behavior using remove_id = FALSE.
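To see which names the pattern catches, you can test it with grepl(). The column names below are made up for illustration.

```r
id_pattern <- "^id$|[\\._-]id$"

# Hypothetical column names.
col_names <- c("id", "user_id", "user.id", "user-id", "width", "idle", "ID")

# TRUE marks the names that would be treated as id variables.
is_id <- grepl(id_pattern, col_names)
is_id
#> [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE

col_names[!is_id]
#> [1] "width" "idle"  "ID"
```

Note that the match is case-sensitive, so a column named ID is not removed.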

Examples

if (interactive() && requireNamespace("datasets", quietly = TRUE)) {

## Plotting a single column from 'data'

qplot_walk(datasets::iris$Sepal.Length)

## Plotting all columns from 'data'

qplot_walk(datasets::iris)

## Plotting selected columns from 'data'

qplot_walk(datasets::iris, cols = c("Petal.Length", "Petal.Width"))

## Plotting selected columns from 'data' using a name pattern

qplot_walk(datasets::iris, pattern = "\\.Width$")
}