In an agent-based workflow, once interrogate() has been run, the get_data_extracts() function returns the rows that didn't pass row-based validation steps. The amount of data available in a particular extract depends on both the fraction of test units that didn't pass a validation step and the level of sampling or explicit collection from that set of units.

The availability of data extracts for each row-based validation step depends on whether extract_failed is set to TRUE within the interrogate() call (it is by default). The number of failing rows extracted depends on the collection parameters in interrogate(); the default behavior is to collect up to the first 5000 failing rows.
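As a minimal sketch of how these settings interact, the example below runs the same validation step twice: once capping the extract with get_first_n (the argument also used in the example at the end of this page) and once with extract collection switched off. The table, the threshold, and the object names are illustrative assumptions, not part of the package.

```r
library(pointblank)

# Hypothetical table: four of the six values fail `a < 6`
tbl <- dplyr::tibble(a = c(5, 7, 8, 5, 9, 12))

# Collect at most the first 2 failing rows for each step
agent_first <- create_agent(tbl = tbl) %>%
  col_vals_lt(vars(a), 6) %>%
  interrogate(extract_failed = TRUE, get_first_n = 2)

# Same validation, but with no extract collection at all
agent_none <- create_agent(tbl = tbl) %>%
  col_vals_lt(vars(a), 6) %>%
  interrogate(extract_failed = FALSE)

# Extract for step 1, limited by `get_first_n`
extract_first <- get_data_extracts(agent_first, i = 1)
nrow(extract_first)
```

Capping the extract keeps the agent object small when a step fails on many rows, while the pass/fail counts in the agent report still reflect the whole table.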

Row-based validation steps are based on the validation functions of the form col_vals_*() and also include conjointly() and rows_distinct(). Only those types of validation steps can provide data extracts.

get_data_extracts(agent, i = NULL)



agent

An agent object of class ptblank_agent. interrogate() should already have been called on it, so that the validation steps have been carried out and any rows collected from non-passing validations are available in the object.


i

The validation step number, which is assigned to each validation step in the order of definition. If NULL (the default), all data extract tables will be provided in a list object.


A list of tables if i is not provided, or a standalone table if i is given.
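The two return shapes can be compared side by side. This is a sketch under the assumption that the step being requested actually has failing rows; the table and object names are made up for illustration.

```r
library(pointblank)

# Hypothetical table: 7 and 8 fall outside the range 4 to 6
tbl <- dplyr::tibble(a = c(5, 7, 8, 5))

agent <- create_agent(tbl = tbl) %>%
  col_vals_between(vars(a), 4, 6) %>%
  col_vals_lte(vars(a), 7) %>%
  interrogate()

# `i` left as NULL: a list of extract tables
all_extracts <- get_data_extracts(agent)

# Check which entries are present before indexing into the list
names(all_extracts)

# `i` given: a single table for that validation step
step_1 <- get_data_extracts(agent, i = 1)
```

Working from the list form is convenient when looping over every step with failures; the single-step form is simpler when only one validation is of interest.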

Function ID


See also


library(pointblank)

# Create a simple table with a
# column of numerical values
tbl <- dplyr::tibble(a = c(5, 7, 8, 5))

# Create 2 simple validation steps
# that test the values within
# column `a`
agent <-
  create_agent(tbl = tbl) %>%
  col_vals_between(vars(a), 4, 6) %>%
  col_vals_lte(vars(a), 7) %>%
  interrogate(
    extract_failed = TRUE,
    get_first_n = 10
  )

# Get row sample data for those rows
# in `tbl` that did not pass the first
# validation step (`col_vals_between`)
agent %>% get_data_extracts(i = 1)
#> # A tibble: 2 x 1
#>       a
#>   <dbl>
#> 1     7
#> 2     8