Generate a draft validation plan in a new .R or .Rmd file using an input data table. Using this workflow, the data table will be scanned to learn about its column data and a set of starter validation steps (constituting a validation plan) will be written. It's best to use a data extract that contains at least 1000 rows and is relatively free of spurious data.

Once in the file, it's possible to tweak the validation steps to better fit the expectations to the particular domain. While column inference is used to generate reasonable validation plans, it is difficult to infer the acceptable values without domain expertise. However, using draft_validation() could get you started on floor 10 of tackling data quality issues and is in any case better than starting with an empty code editor view.

draft_validation(
  tbl,
  tbl_name = NULL,
  file_name = tbl_name,
  path = NULL,
  lang = NULL,
  output_type = c("R", "Rmd"),
  add_comments = TRUE,
  overwrite = FALSE,
  quiet = FALSE
)

Arguments

tbl

The input table. This can be a data frame, tibble, a tbl_dbi object, or a tbl_spark object.

tbl_name

A optional name to assign to the input table object. If no value is provided, a name will be generated based on whatever information is available. This table name will be displayed in the header area of the agent report generated by printing the agent or calling get_agent_report().

file_name

An optional name for the .R or .Rmd file. This should be a name without an extension. By default, this is taken from the tbl_name but if nothing is supplied for that, the name will contain the text "draft_validation_" followed by the current date and time.

path

A path can be specified here if there shouldn't be an attempt to place the generated file in the working directory.

lang

The language to use when creating comments for the automatically- generated validation steps. By default, NULL will create English ("en") text. Other options include French ("fr"), German ("de"), Italian ("it"), Spanish ("es"), Portuguese ("pt"), Turkish ("tr"), Chinese ("zh"), Russian ("ru"), Polish ("pl"), Danish ("da"), Swedish ("sv"), and Dutch ("nl").

output_type

An option for choosing what type of output should be generated. By default, this is an .R script ("R") but this could alternatively be an R Markdown document ("Rmd").

add_comments

Should there be comments that explain the features of the validation plan in the generated document? By default, this is TRUE.

overwrite

Should a file of the same name be overwritten? By default, this is FALSE.

quiet

Should the function not inform when the file is written? By default this is FALSE.

Value

Invisibly returns TRUE if the file has been written.

Function ID

1-11

See also

Examples

if (interactive()) {

# Draft validation plan for the
# `dplyr::storms` dataset
draft_validation(tbl = dplyr::storms)

}