Generate an HTML report that scours the input table data. Before calling up an agent to validate the data, it's a good idea to understand the data with some level of precision. Make this the initial step of a well-balanced data quality reporting workflow. The report is divided into several sections to make it more digestible:
- Table dimensions, duplicate row counts, column types, and reproducibility information
- A summary for each table variable and further statistics and summaries depending on the variable type
- A matrix plot that shows interactions between variables
- A set of correlation matrix plots for numerical variables
- A summary figure that shows the degree of missingness across variables
- A table that provides the head and tail rows of the dataset
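
To get a feel for the kind of information these sections summarize, a few of the quantities can be approximated by hand in base R. This is only an illustrative sketch using the built-in `mtcars` dataset; the variable names are hypothetical, and it is not how the report itself computes or presents its results.

```r
# Illustrative only: rough base R counterparts to some report sections
tbl <- mtcars

# Overview: table dimensions, duplicate row count, and column types
dims <- dim(tbl)
n_duplicate_rows <- sum(duplicated(tbl))
col_types <- vapply(tbl, function(x) class(x)[1], character(1))

# Correlations: Pearson correlation matrix for the numeric columns
is_num <- vapply(tbl, is.numeric, logical(1))
cor_pearson <- cor(tbl[, is_num, drop = FALSE], method = "pearson")

# Missingness: proportion of NA values in each column
prop_missing <- colMeans(is.na(tbl))

# Sample: head and tail rows of the dataset
sample_rows <- rbind(head(tbl, 5), tail(tbl, 5))
```

The report gathers this sort of information (and considerably more) into a single navigable document, so a manual pass like this is mainly useful for spot-checking a value.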
The output HTML report will appear in the RStudio Viewer and can also be integrated in R Markdown HTML output. If you need the output HTML as a string, it's possible to get that by using `as.character()` (e.g., `scan_data(tbl = mtcars) %>% as.character()`). The resulting HTML string is a complete HTML document where Bootstrap and jQuery are embedded within.
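
As a minimal sketch of the string conversion described above, the following saves the report as a standalone file. It assumes the pointblank package is installed; the output file name and the use of a temporary directory are arbitrary choices made for illustration.

```r
library(pointblank)

# Build the report object for a small built-in dataset
tbl_scan <- scan_data(tbl = mtcars)

# Convert the report to an HTML string (a complete HTML document)
html <- as.character(tbl_scan)

# Write the string to disk; "mtcars-scan.html" is a hypothetical file name
out_file <- file.path(tempdir(), "mtcars-scan.html")
writeLines(html, out_file)
```

The saved file can then be opened in any web browser, outside of RStudio.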
scan_data(tbl, sections = "OVICMS", navbar = TRUE, lang = NULL, locale = NULL)
Argument | Description |
---|---|
tbl | The input table. This can be a data frame, tibble, a |
sections | The sections to include in the finalized |
navbar | Should there be a navigation bar anchored to the top of the report page? By default this is |
lang | The language to use for label text in the report. By default, |
locale | An optional locale ID to use for formatting values in the report according to the locale's rules. Examples include |
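
The arguments can be combined to produce a lighter report. The sketch below assumes that each letter in the default `sections = "OVICMS"` stands for one report section, in the order the sections are described earlier (so `"O"` and `"V"` would be the overview and the variable summaries). That mapping is an assumption made for illustration.

```r
library(pointblank)

# A reduced report: only the first two section codes of the default
# "OVICMS" (assumed here to be the overview and variables sections),
# and no navigation bar at the top of the page
tbl_scan_small <- scan_data(
  tbl = mtcars,
  sections = "OV",
  navbar = FALSE
)
```

Leaving out the plot-heavy sections this way should also shorten the time needed to build the report for wider tables.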
Function ID: 1-1
Other Planning and Prep: `action_levels()`, `create_agent()`, `create_informant()`, `db_tbl()`, `draft_validation()`, `file_tbl()`, `tbl_get()`, `tbl_source()`, `tbl_store()`, `validate_rmd()`
if (interactive()) {

  # Get an HTML document that describes all of
  # the data in the `dplyr::storms` dataset
  tbl_scan <- scan_data(tbl = dplyr::storms)
}