Validate.stop

Validate.stop(i=None, scalar=False)

Provides a dictionary of the stopping status for each validation step.

The stopping status (stop) for a validation step is True if the fraction of failing test units meets or exceeds the threshold for the stopping level. Otherwise, the status is False.
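The comparison described above can be sketched in plain Python. This is only an illustration, not pointblank's implementation: the helper name `meets_threshold` is hypothetical, and the rule that a threshold below 1 is read as a fraction while a threshold of 1 or more is read as an absolute count is an assumption made for the sketch.

```python
def meets_threshold(n_failed: int, n_units: int, threshold: float) -> bool:
    """Return True if the failing test units meet or exceed the threshold.

    Assumption: a threshold below 1 is a fraction of all test units,
    while a threshold of 1 or more is an absolute count of failing units.
    """
    if n_units <= 0:
        raise ValueError("n_units must be positive")
    if threshold < 1:
        # Fractional threshold: compare the failing fraction
        return n_failed / n_units >= threshold
    # Absolute threshold: compare the failing count
    return n_failed >= threshold


# 4 of 7 test units failing, checked against an absolute and a fractional level
absolute_status = meets_threshold(n_failed=4, n_units=7, threshold=4)
fraction_status = meets_threshold(n_failed=4, n_units=7, threshold=0.6)
```

Under these assumptions the absolute check is met (4 >= 4), while the fractional check is not, since 4/7 is roughly 0.57 and falls short of 0.6.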

The name stop is semantic and does not imply that the validation process is halted; it is simply a status indicator that could be used to trigger a stoppage of the validation process. It sits alongside the other status indicators, warn and notify, each of which has its own threshold level.

If scalar=True is provided and i= is a single integer, the value is returned as a scalar instead of a dictionary.

Parameters

i : int | list[int] | None = None

The validation step number(s) from which the stopping status is obtained. Can be provided as a list of integers or a single integer. If None, all steps are included.

scalar : bool = False

If True and i= is a scalar, return the value as a scalar instead of a dictionary.

Returns

: dict[int, bool] | bool

A dictionary of the stopping status for each validation step or a scalar value.

Examples

In the example below, we’ll use a simple Polars DataFrame with three columns (a, b, and c). There will be three validation steps: the first step will have some failing test units, and the rest will pass completely. We’ve set thresholds for every step by using thresholds=(2, 4, 5), which means:

  • the warn threshold is 2 failing test units
  • the stop threshold is 4 failing test units
  • the notify threshold is 5 failing test units

After interrogation, the stop() method is used to determine the stop status for each validation step.

import polars as pl
import pointblank as pb

tbl = pl.DataFrame(
    {
        "a": [3, 4, 9, 7, 2, 3, 8],
        "b": [9, 8, 10, 5, 10, 6, 2],
        "c": ["a", "b", "a", "a", "b", "b", "a"]
    }
)

validation = (
    pb.Validate(data=tbl, thresholds=(2, 4, 5))
    .col_vals_gt(columns="a", value=5)
    .col_vals_lt(columns="b", value=15)
    .col_vals_in_set(columns="c", set=["a", "b"])
    .interrogate()
)

validation.stop()
{1: True, 2: False, 3: False}

The returned dictionary provides the stop status for each validation step. The first step has a True value since its 4 failing test units meet the stop threshold of 4. The second and third steps have False values since each had 0 failing test units, which is below the stop threshold.
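As a cross-check, the same result can be reproduced by hand with plain Python, without pointblank. The sketch below is only a hand calculation on the example data: the `checks` mapping and the `STOP_THRESHOLD` constant are names introduced here for illustration, and the stop threshold is treated as an absolute count of 4 failing test units.

```python
# Example data, copied from the DataFrame above
columns = {
    "a": [3, 4, 9, 7, 2, 3, 8],
    "b": [9, 8, 10, 5, 10, 6, 2],
    "c": ["a", "b", "a", "a", "b", "b", "a"],
}

# One (column, pass condition) pair per validation step
checks = {
    1: ("a", lambda v: v > 5),
    2: ("b", lambda v: v < 15),
    3: ("c", lambda v: v in {"a", "b"}),
}

STOP_THRESHOLD = 4  # absolute number of failing test units

# Count failing test units per step, then compare with the stop threshold
n_failed = {
    step: sum(not passes(v) for v in columns[col])
    for step, (col, passes) in checks.items()
}
stop_status = {step: count >= STOP_THRESHOLD for step, count in n_failed.items()}

print(n_failed)     # {1: 4, 2: 0, 3: 0}
print(stop_status)  # {1: True, 2: False, 3: False}
```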

We can also visually inspect the stop status across all steps by viewing the validation table:

validation
Pointblank Validation
2024-12-20|15:09:40
Polars   WARN 2   STOP 4   NOTIFY 5

STEP                  COLUMNS  VALUES  UNITS  PASS      FAIL      W  S  N
1  col_vals_gt()      a        5       7      3 (0.43)  4 (0.57)  ●  ●  ○
2  col_vals_lt()      b        15      7      7 (1.00)  0 (0.00)  ○  ○  ○
3  col_vals_in_set()  c        a, b    7      7 (1.00)  0 (0.00)  ○  ○  ○

2024-12-20 15:09:40 UTC   < 1 s   2024-12-20 15:09:40 UTC

We can see that there are filled yellow and red circles in the first step (far right side, in the W and S columns) indicating that the warn and stop thresholds were met. The other steps have empty yellow and red circles. This means that thresholds were ‘set but not met’ in those steps.

To check the stop status for a single validation step, we can provide the step number. We could also have the value returned as a scalar by setting scalar=True (ensuring that i= is a single integer).

validation.stop(i=1)
{1: True}

The returned dictionary contains True for step 1, indicating that the stop threshold was met in the first validation step.
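The way i= and scalar= shape the return value can be modelled without the library. The helper below, `select_status`, is hypothetical and only mimics the documented return shapes on a literal copy of the dictionary shown earlier. Returning a dictionary when scalar=True is combined with a list is an assumption, since the documentation only describes the single-integer case.

```python
from typing import Dict, List, Optional, Union


def select_status(
    statuses: Dict[int, bool],
    i: Optional[Union[int, List[int]]] = None,
    scalar: bool = False,
) -> Union[Dict[int, bool], bool]:
    """Mimic the documented return shapes for a per-step status dictionary."""
    if i is None:
        # No step given: include all steps
        return dict(statuses)
    if isinstance(i, int):
        # Single step: a bare value if scalar=True, else a one-item dictionary
        return statuses[i] if scalar else {i: statuses[i]}
    # List of steps: always a dictionary (assumption for scalar=True)
    return {step: statuses[step] for step in i}


statuses = {1: True, 2: False, 3: False}

all_steps = select_status(statuses)                      # every step
step_1_dict = select_status(statuses, i=1)               # one-item dictionary
step_1_flag = select_status(statuses, i=1, scalar=True)  # bare boolean
steps_2_3 = select_status(statuses, i=[2, 3])            # selected steps
```

The scalar form is convenient inside a condition, for example to raise an exception or skip downstream work when a particular step reaches the stop level.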