Pointblank Validation | |||||||||||||
2025-01-20|18:17:40 DuckDBWARN0.05STOP0.1NOTIFY0.15 |
|||||||||||||
STEP | COLUMNS | VALUES | TBL | EVAL | UNITS | PASS | FAIL | W | S | N | EXT | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
#4CA64C | 1 |
|
✓ | 2000 | 2000 1.00 |
0 0.00 |
○ | ○ | ○ | — | |||
#4CA64C | 2 |
|
✓ | 2000 | 2000 1.00 |
0 0.00 |
○ | ○ | ○ | — | |||
#CF142B | 3 |
|
✓ | 2000 | 1701 0.85 |
299 0.15 |
● | ● | ○ | — | |||
#FFBF00 | 4 |
|
✓ | 2000 | 1993 1.00 |
7 0.00 |
● | ○ | ○ | — | |||
#CF142B | 5 |
|
✓ | 1 | 0 0.00 |
1 1.00 |
● | ● | ● | — | |||
2025-01-20 18:17:40 UTC< 1 s2025-01-20 18:17:40 UTC |
Set Failure Threshold Levels
Set threshold levels to better gauge adverse data quality.
import pointblank as pb
= (
validation
pb.Validate(=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
data=pb.Thresholds( # setting relative threshold defaults for all steps
thresholds=0.05, # 5% failing test units: warn threshold (yellow)
warn_at=0.10, # 10% failed test units: stop threshold (red)
stop_at=0.15 # 15% failed test units: notify threshold (blue)
notify_at
),
)="item_type", set=["iap", "ad"])
.col_vals_in_set(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
.col_vals_regex(columns="item_revenue", value=0.05)
.col_vals_gt(columns
.col_vals_gt(="session_duration",
columns=4,
value=(5, 10, 20) # setting absolute thresholds for *this* step (warn, stop, notify)
thresholds
)="end_day")
.col_exists(columns
.interrogate()
)
validation
Preview of Input Table
DuckDBRows2000Columns11 |
|||||||||||