This utility function can help you easily determine whether a column of a specified name is present in a table object. This function works well enough on a table object but it can also be used as part of a formula in any validation function's active argument. Using active = ~ . %>% has_columns("column_1") means that the validation step will be inactive if the target table doesn't contain a column named column_1. We can also use multiple columns in vars() so having active = ~ . %>% has_columns(vars(column_1, column_2)) in a validation step will make it inactive at interrogate() time unless the columns column_1 and column_2 are both present.

has_columns(x, columns)

Arguments

x

The table object.

columns

One or more column names that are to be checked for existence in the table x.

Value

A length-1 logical vector.

Function ID

13-2

See also

Other Utility and Helper Functions: affix_datetime(), affix_date(), col_schema(), from_github(), stop_if_not()

Examples

# The `small_table` dataset in the
# package has the columns `date_time`,
# `date`, and the `a` through `f`
# columns
small_table
#> # A tibble: 13 × 8
#>    date_time           date           a b             c      d e     f    
#>    <dttm>              <date>     <int> <chr>     <dbl>  <dbl> <lgl> <chr>
#>  1 2016-01-04 11:00:00 2016-01-04     2 1-bcd-345     3  3423. TRUE  high 
#>  2 2016-01-04 00:32:00 2016-01-04     3 5-egh-163     8 10000. TRUE  low  
#>  3 2016-01-05 13:32:00 2016-01-05     6 8-kdg-938     3  2343. TRUE  high 
#>  4 2016-01-06 17:23:00 2016-01-06     2 5-jdo-903    NA  3892. FALSE mid  
#>  5 2016-01-09 12:36:00 2016-01-09     8 3-ldm-038     7   284. TRUE  low  
#>  6 2016-01-11 06:15:00 2016-01-11     4 2-dhe-923     4  3291. TRUE  mid  
#>  7 2016-01-15 18:46:00 2016-01-15     7 1-knw-093     3   843. TRUE  high 
#>  8 2016-01-17 11:27:00 2016-01-17     4 5-boe-639     2  1036. FALSE low  
#>  9 2016-01-20 04:30:00 2016-01-20     3 5-bce-642     9   838. FALSE high 
#> 10 2016-01-20 04:30:00 2016-01-20     3 5-bce-642     9   838. FALSE high 
#> 11 2016-01-26 20:07:00 2016-01-26     4 2-dmx-010     7   834. TRUE  low  
#> 12 2016-01-28 02:51:00 2016-01-28     2 7-dmx-010     8   108. FALSE low  
#> 13 2016-01-30 11:23:00 2016-01-30     1 3-dka-303    NA  2230. TRUE  high 

# With `has_columns()` we can check for
# column existence by using it directly
# with the table; a column name can be
# verified as present by using it in
# double quotes
small_table %>% has_columns("date")
#> [1] TRUE

# Multiple column names can be supplied;
# this is `TRUE` because both columns are
# present in `small_table`
small_table %>% has_columns(c("a", "b"))
#> [1] TRUE

# It's possible to supply column names
# in `vars()` as well
small_table %>% has_columns(vars(a, b))
#> [1] TRUE

# Because column `h` isn't present, this
# returns `FALSE` (all specified columns
# need to be present to obtain `TRUE`)
small_table %>% has_columns(vars(a, h))
#> [1] FALSE

# The `has_columns()` function can be
# useful in expressions that involve the
# target table, especially if it is
# uncertain that the table will contain
# a column that's involved in a validation

# In the following agent-based validation,
# the first two steps will be 'active'
# because all columns checked for in the
# expressions are present; the third step
# is inactive because column `j` isn't
# there (without the `active` statement we
# would get an evaluation failure in the
# agent report)
agent <- 
  create_agent(
    read_fn = ~ small_table,
    tbl_name = "small_table"
  ) %>%
  col_vals_gt(
    vars(c), value = vars(a),
    active = ~ . %>% has_columns(vars(a, c))
  ) %>%
  col_vals_lt(
    vars(h), value = vars(d),
    preconditions = ~ . %>% dplyr::mutate(h = d - a),
    active = ~ . %>% has_columns(vars(a, d))
  ) %>%
  col_is_character(
    vars(j),
    active = ~ . %>% has_columns("j")
  ) %>%
  interrogate() 
#>  Step 3 is not set as active. Skipping.