Validate.stop

Validate.stop(i=None, scalar=False)

Provides a dictionary of the stopping status for each validation step.

The stopping status (stop) for a validation step is True if the fraction of failing test units meets or exceeds the threshold for the stopping level. Otherwise, the status is False.
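The rule above can be sketched in plain Python. This is a minimal illustration and not pointblank's implementation: the helper name `level_is_met` is hypothetical, and the sketch assumes that an integer threshold is read as an absolute count of failing test units while a float between 0 and 1 is read as a fraction of all test units.

```python
from typing import Union


def level_is_met(n_failed: int, n_units: int, threshold: Union[int, float]) -> bool:
    """Hypothetical helper: True if the failures meet or exceed the threshold."""
    if isinstance(threshold, int):
        # Assumption: whole numbers are absolute counts of failing test units
        return n_failed >= threshold
    # Assumption: floats in [0, 1] are fractions of all test units
    return n_failed / n_units >= threshold


print(level_is_met(4, 7, 4))     # True: 4 failures against a count of 4
print(level_is_met(4, 7, 0.75))  # False: 4/7 is about 0.57, below 0.75
```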

The name stop is semantic: it does not imply that the validation process is halted. It is simply a status indicator that could be used to trigger a stoppage of the validation process. Here’s how it fits in with the other status indicators:

warn: the status obtained by calling warn(), least severe
stop: the status obtained by calling stop(), middle severity
notify: the status obtained by calling notify(), most severe

By default the result is a dictionary keyed by step number. If scalar=True is provided and i= is a single integer, the bare value is returned instead of a dictionary.
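How i= and scalar= shape the result can be illustrated with a small stand-in function. The name `select_status` is hypothetical and the code only mirrors the behavior described here. What the real method does when scalar=True is combined with a list is not covered, so the sketch simply returns a dictionary in that case.

```python
from typing import Dict, List, Optional, Union


def select_status(
    statuses: Dict[int, bool],
    i: Optional[Union[int, List[int]]] = None,
    scalar: bool = False,
) -> Union[Dict[int, bool], bool]:
    """Hypothetical stand-in showing how i= and scalar= shape the result."""
    if i is None:
        # No step given: report every step
        return dict(statuses)
    if isinstance(i, int):
        # A single step: a bare value only when scalar=True
        return statuses[i] if scalar else {i: statuses[i]}
    # A list of steps: a dictionary in this sketch (an assumption)
    return {step: statuses[step] for step in i}


all_steps = {1: True, 2: False, 3: False}
print(select_status(all_steps))                    # {1: True, 2: False, 3: False}
print(select_status(all_steps, i=[1, 3]))          # {1: True, 3: False}
print(select_status(all_steps, i=1))               # {1: True}
print(select_status(all_steps, i=1, scalar=True))  # True
```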

Parameters

i : int | list[int] | None = None: The validation step number(s) from which the stopping status is obtained. Can be provided as a list of integers or a single integer. If None, all steps are included.
scalar : bool = False: If True and i= is a scalar, return the value as a scalar instead of a dictionary.

Returns

dict[int, bool] | bool: A dictionary of the stopping status for each validation step or a scalar value.

Examples

In the example below, we’ll use a simple Polars DataFrame with three columns (a, b, and c). There will be three validation steps: the first will have some failing test units and the other two will pass completely. We’ve set thresholds for all of the steps by using thresholds=(2, 4, 5), which means:

the warn threshold is 2 failing test units
the stop threshold is 4 failing test units
the notify threshold is 5 failing test units

After interrogation, the stop() method is used to determine the stop status for each validation step.

import polars as pl
import pointblank as pb

tbl = pl.DataFrame(
    {
        "a": [3, 4, 9, 7, 2, 3, 8],
        "b": [9, 8, 10, 5, 10, 6, 2],
        "c": ["a", "b", "a", "a", "b", "b", "a"]
    }
)

validation = (
    pb.Validate(data=tbl, thresholds=(2, 4, 5))
    .col_vals_gt(columns="a", value=5)
    .col_vals_lt(columns="b", value=15)
    .col_vals_in_set(columns="c", set=["a", "b"])
    .interrogate()
)

validation.stop()

{1: True, 2: False, 3: False}

The returned dictionary provides the stop status for each validation step. The first step has a True value since the number of failing test units meets the threshold for the stop level. The second and third steps have False values since the number of failing test units was 0, which is below the threshold for the stop level.
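These values can be checked by hand without pointblank. The sketch below recounts the failing test units for each step directly from the column data and compares them with the thresholds, assuming (as the threshold list above states) that 2, 4, and 5 are counts of failing test units.

```python
a = [3, 4, 9, 7, 2, 3, 8]
b = [9, 8, 10, 5, 10, 6, 2]
c = ["a", "b", "a", "a", "b", "b", "a"]

# Count the failing test units for each validation step
n_failed = {
    1: sum(not (x > 5) for x in a),          # col_vals_gt(columns="a", value=5)
    2: sum(not (x < 15) for x in b),         # col_vals_lt(columns="b", value=15)
    3: sum(x not in {"a", "b"} for x in c),  # col_vals_in_set(columns="c", ...)
}
print(n_failed)  # {1: 4, 2: 0, 3: 0}

warn_at, stop_at, notify_at = 2, 4, 5

# Stop status per step: failures meet or exceed the stop threshold
stop_status = {step: n >= stop_at for step, n in n_failed.items()}
print(stop_status)  # {1: True, 2: False, 3: False}

# All three levels for step 1
step_1 = {
    "warn": n_failed[1] >= warn_at,
    "stop": n_failed[1] >= stop_at,
    "notify": n_failed[1] >= notify_at,
}
print(step_1)  # {'warn': True, 'stop': True, 'notify': False}
```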

We can also visually inspect the stop status across all steps by viewing the validation table:

validation

Pointblank Validation
2024-12-20|15:09:40  Polars  WARN 2  STOP 4  NOTIFY 5

   STEP               COLUMNS  VALUES  EVAL  UNITS  PASS    FAIL    W  S  N  EXT
1  col_vals_gt()      a        5       ✓     7      3 0.43  4 0.57  ●  ●  ○
2  col_vals_lt()      b        15      ✓     7      7 1.00  0 0.00  ○  ○  ○  —
3  col_vals_in_set()  c        a, b    ✓     7      7 1.00  0 0.00  ○  ○  ○  —

2024-12-20 15:09:40 UTC  < 1 s  2024-12-20 15:09:40 UTC

We can see that there are filled yellow and red circles in the first step (far right side, in the W and S columns) indicating that the warn and stop thresholds were met. The other steps have empty yellow and red circles. This means that thresholds were ‘set but not met’ in those steps.

If we wanted to check the stop status for a single validation step, we can provide the step number. Also, we could have the value returned as a scalar by setting scalar=True (ensuring that i= is a scalar).

validation.stop(i=1)

{1: True}

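The returned dictionary has a True value for step 1, indicating that the first validation step had the stop threshold met. Had scalar=True also been passed, the bare value True would have been returned instead of a dictionary.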
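Because stop is only a status indicator, acting on it is left to the caller. One possible pattern, sketched here with a hypothetical exception class and helper that are not part of pointblank, is to raise an error when any step reports True. The dictionary literal stands in for the value returned by validation.stop() in the example above.

```python
class ValidationStopped(Exception):
    """Hypothetical exception for a pipeline that halts on a stop status."""


def halt_if_stopped(stop_status):
    # Collect the steps whose stop status is True
    flagged = [step for step, met in stop_status.items() if met]
    if flagged:
        raise ValidationStopped(f"stop threshold met in step(s): {flagged}")


try:
    halt_if_stopped({1: True, 2: False, 3: False})
except ValidationStopped as err:
    print(err)  # stop threshold met in step(s): [1]
```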