Planning and Prep
Should you need to understand your data a bit more, use the |
|
---|---|
Thoroughly scan a table to better understand it |
|
Create a pointblank agent object |
|
Create a pointblank informant object |
|
Modify pointblank validation testing options within R Markdown documents |
|
Set action levels: failure thresholds and functions to invoke |
|
Get a table from a database |
|
Get a table from a local or remote file |
|
Define a store of tables with table-prep formulas: a table store |
|
Obtain a table-prep formula from a table store |
|
Obtain a materialized table via a table store |
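As a rough sketch of how these preparation pieces fit together, the snippet below sets failure thresholds and then creates an agent for the `small_table` dataset listed under Datasets. The function names `action_levels()` and `create_agent()` come from the pointblank package, and the threshold values and label are arbitrary choices made for illustration.

```r
library(pointblank)

# Thresholds are fractions of failing test units; the values are illustrative only.
al <- action_levels(warn_at = 0.10, stop_at = 0.25)

# Create an agent that targets the small example table and carries the thresholds.
agent <- create_agent(
  tbl = small_table,
  tbl_name = "small_table",
  label = "Planning example",
  actions = al
)
```

At this point the agent has no validation steps; it only knows which table to target and which thresholds to apply.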
|
Validation, Expectation, and Test Functions
Validation steps are either used with an agent object or, more simply, just with the table of interest. When used with an agent, each step function called works to build up a validation plan (which is executed with the |
|
Are column data less than a fixed value or data in another column? |
|
Are column data less than or equal to a fixed value or data in another column? |
|
|
Are column data equal to a fixed value or data in another column? |
|
Are column data not equal to a fixed value or data in another column? |
Are column data greater than or equal to a fixed value or data in another column? |
|
Are column data greater than a fixed value or data in another column? |
|
|
Do column data lie between two specified values or data in other columns? |
|
Do column data lie outside of two specified values or data in other columns? |
|
Are column data part of a specified set of values? |
|
Are data not part of a specified set of values? |
|
Is a set of values entirely accounted for in a column of values? |
|
Is a set of values a subset of a column of values? |
|
Are column data increasing by row? |
|
Are column data decreasing by row? |
Are column data |
|
|
Are column data not |
|
Do strings in column data match a regex pattern? |
Do column data agree with a predicate expression? |
|
Perform multiple rowwise validations for joint validity |
|
Are row data distinct? |
|
|
Do the columns contain character/string data? |
|
Do the columns contain numeric values? |
|
Do the columns contain integer values? |
|
Do the columns contain logical values? |
Do the columns contain R |
|
Do the columns contain |
|
Do the columns contain R |
|
Do one or more columns actually exist? |
|
|
Do columns in the table (and their types) match a predefined schema? |
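The two ways of using validation functions described in this section can be sketched as follows. This is a minimal example that assumes the pointblank functions `col_vals_gt()`, `col_vals_between()`, `interrogate()`, and `test_col_vals_gt()`, applied to the bundled `small_table` dataset; the columns and comparison values are arbitrary.

```r
library(pointblank)

# With an agent: each call adds a step to the validation plan.
# Nothing is checked until interrogate() is called.
agent <- create_agent(tbl = small_table, tbl_name = "small_table")
agent <- col_vals_gt(agent, columns = vars(d), value = 100)
agent <- col_vals_between(
  agent,
  columns = vars(c), left = 0, right = 10, na_pass = TRUE
)
agent <- interrogate(agent)

# Directly on the table: the test_*() variant returns a single logical value.
d_above_100 <- test_col_vals_gt(small_table, columns = vars(d), value = 100)
```

The agent form is useful when a report over many steps is wanted; the direct form suits quick checks inside scripts or conditionals.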
Information Functions
We can progressively add information to an informant object by using the collection of |
|
Add information that focuses on aspects of the data table as a whole |
|
Add information that focuses on aspects of a data table's columns |
|
Add information that focuses on some key aspect of the data table |
|
Generate a useful text 'snippet' from the target table |
|
A |
|
A |
|
A |
|
A |
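To illustrate the progressive build-up of information, here is a minimal sketch that assumes the pointblank functions `create_informant()`, `info_tabular()`, `info_columns()`, `info_snippet()`, `snip_highest()`, and `incorporate()`. The descriptions and the snippet name `max_a` are made up for the example.

```r
library(pointblank)

# Start an informant for the small example table.
informant <- create_informant(
  tbl = small_table,
  tbl_name = "small_table",
  label = "Informant example"
)

# Add table-level and column-level information.
informant <- info_tabular(informant, description = "A small table used for testing.")
informant <- info_columns(informant, columns = vars(a), info = "An integer column.")

# Define a snippet that captures the highest value in column `a`.
informant <- info_snippet(
  informant,
  snippet_name = "max_a",
  fn = snip_highest(column = "a")
)

# Evaluate the snippet against the table and fold the result into the informant.
informant <- incorporate(informant)
```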
|
Emailing
Sometimes we want to email a report of a validation because of the importance of the information contained therein. The |
|
Send email at a validation step or at the end of an interrogation |
|
Create an email object from a pointblank agent or informant |
|
Provide simple email message body components: body |
|
Provide simple email message body components: footer |
|
Logging
Logging validation failure conditions is good practice during data quality analysis. The |
|
Enable logging of failure conditions at the validation step level |
|
Agent: Interrogate and Report
If we have an agent object that has a plan (i.e., validation steps), the |
|
Given an agent that has a validation plan, perform an interrogation |
|
Get a summary report from an agent |
|
Informant: Incorporate and Report
If we have an informant object that has been loaded with information from using the |
|
Given an informant object, update and incorporate table snippets |
|
Get a table information report from an informant object |
|
Post-interrogation
The agent always has a special list called an x-list. Access that by invoking the |
|
Get the agent's x-list |
|
Collect data extracts from a validation step |
|
Sunder the data, splitting it into 'pass' and 'fail' pieces |
|
Did all of the validations fully pass? |
|
Transform a pointblank agent to a testthat test file |
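A short sketch of the post-interrogation helpers follows. It assumes the pointblank functions `get_agent_x_list()`, `all_passed()`, and `get_sundered_data()`, and it uses a deliberately strict threshold on column `d` of `small_table` so that the table splits into both 'pass' and 'fail' pieces.

```r
library(pointblank)

# Build and run a one-step plan.
agent <- create_agent(tbl = small_table, tbl_name = "small_table")
agent <- col_vals_gt(agent, columns = vars(d), value = 1000)
agent <- interrogate(agent)

# The x-list holds per-step results, e.g. counts of passing and failing units.
x <- get_agent_x_list(agent)
n_passed <- x$n_passed
n_failed <- x$n_failed

# A single TRUE/FALSE answer for the whole plan.
passed <- all_passed(agent)

# Split the input rows by whether they passed every row-based step.
pass_rows <- get_sundered_data(agent, type = "pass")
fail_rows <- get_sundered_data(agent, type = "fail")
```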
|
Object Ops
We have options for writing an agent or informant to disk with the |
|
Write a pointblank agent or informant to disk |
|
Read a pointblank agent or informant from disk |
|
Set a data table to an agent or informant |
|
Remove a data table associated with an agent or informant |
|
Set a table-reading function to an agent or informant |
|
Remove a table-reading function associated with an agent or informant |
|
Activate one or more of an agent's validation steps |
|
Deactivate one or more of an agent's validation steps |
|
Remove one or more of an agent's validation steps |
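Writing an agent to disk and reading it back might look like the sketch below. It assumes the pointblank functions `x_write_disk()`, `x_read_disk()`, and `set_tbl()`; the file name is hypothetical and a temporary directory is used so nothing is left behind.

```r
library(pointblank)

# An interrogated agent to save.
agent <- create_agent(tbl = small_table, tbl_name = "small_table")
agent <- col_vals_gt(agent, columns = vars(d), value = 100)
agent <- interrogate(agent)

# Write to a temporary directory (hypothetical file name).
out_dir <- tempdir()
x_write_disk(agent, filename = "agent-small_table.rds", path = out_dir)

# Read the agent back in a later step or session.
agent_from_disk <- x_read_disk(filename = "agent-small_table.rds", path = out_dir)

# By default the data table is not stored with the object,
# so attach a table again before interrogating a second time.
agent_from_disk <- set_tbl(agent_from_disk, tbl = small_table)
```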
|
The Multiagent
The multiagent is a group of agents, each tasked with their own interrogation to perform. As a group, they provide an interesting and informative bit of reporting that tracks the evolution of data quality checks over time. With a multiagent object, we can get a data quality report that handles changes in the target data and matches data validation steps across all agents. The predominant application is retrospective analysis of data quality for a target table. |
|
Create a pointblank multiagent object |
|
Read pointblank agents stored on disk as a multiagent |
|
Get a summary report using multiple agents |
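As an illustration of grouping agents, the sketch below interrogates the same table twice with the same step and combines the results. It assumes the pointblank function `create_multiagent()`; in practice the two agents would come from interrogations run at different times, which is simulated here with two labels.

```r
library(pointblank)

# First interrogation (stands in for an earlier run).
agent_1 <- create_agent(tbl = small_table, tbl_name = "small_table", label = "Run 1")
agent_1 <- col_vals_gt(agent_1, columns = vars(d), value = 100)
agent_1 <- interrogate(agent_1)

# Second interrogation (stands in for a later run).
agent_2 <- create_agent(tbl = small_table, tbl_name = "small_table", label = "Run 2")
agent_2 <- col_vals_gt(agent_2, columns = vars(d), value = 100)
agent_2 <- interrogate(agent_2)

# Group the agents so their matching steps can be reported side by side.
multiagent <- create_multiagent(agent_1, agent_2)
```

The resulting object is what the package's multiagent report function, `get_multiagent_report()`, takes as input.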
|
pointblank YAML
YAML files can be used in pointblank for two distinct purposes: (1) to define agents and their validation plans, and (2) to define information for tables. The |
|
Write pointblank objects to YAML files |
|
Read a pointblank YAML file to create an agent object |
|
Read a pointblank YAML file to create an informant object |
|
Get an agent from pointblank YAML and |
|
Display pointblank YAML using an agent or a YAML file |
|
Display validation expressions using pointblank YAML |
|
Get an informant from pointblank YAML and |
|
Execute all agent and informant YAML tasks |
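A round trip through YAML for the first purpose (defining an agent and its plan) can be sketched as below. It assumes the pointblank functions `yaml_write()` and `yaml_read_agent()`, and a package version that accepts a table-prep formula such as `~ small_table` for `tbl`, so that the YAML file records how to obtain the table rather than the table itself. The file name is hypothetical.

```r
library(pointblank)

# Define the agent with a table-prep formula so the plan is portable.
agent <- create_agent(
  tbl = ~ small_table,
  tbl_name = "small_table",
  label = "YAML example"
)
agent <- col_vals_gt(agent, columns = vars(d), value = 100)

# Write the plan to a YAML file in a temporary directory.
yaml_dir <- tempdir()
yaml_write(agent, filename = "agent-small_table.yml", path = yaml_dir)

# Recreate the agent from the file, then run its plan.
agent_from_yaml <- yaml_read_agent(filename = "agent-small_table.yml", path = yaml_dir)
agent_from_yaml <- interrogate(agent_from_yaml)
```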
|
Utility and Helper Functions |
|
Generate a table column schema manually or with a reference table |
|
Determine if one or more columns exist in a table |
|
Put the current date into a file name |
|
Put the current date-time into a file name |
|
The next generation of |
|
Specify a file for download from GitHub |
|
Datasets |
|
A small table that is useful for testing |
|
A SQLite version of the |