Pointblank Validation | |||||||||||||
2025-03-07|19:46:37 DuckDBWARNING0.05ERROR0.1CRITICAL0.15 |
|||||||||||||
STEP | COLUMNS | VALUES | TBL | EVAL | UNITS | PASS | FAIL | W | E | C | EXT | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
#4CA64C | 1 |
|
✓ | 2000 | 2000 1.00 |
0 0.00 |
○ | ○ | ○ | — | |||
#4CA64C | 2 |
|
✓ | 2000 | 2000 1.00 |
0 0.00 |
○ | ○ | ○ | — | |||
#EBBC14 | 3 |
|
✓ | 2000 | 1701 0.85 |
299 0.15 |
● | ● | ○ | — | |||
#AAAAAA | 4 |
|
✓ | 2000 | 1993 1.00 |
7 0.00 |
● | ○ | ○ | — | |||
#FF3300 | 5 |
|
✓ | 1 | 0 0.00 |
1 1.00 |
● | ● | ● | — | |||
2025-03-07 19:46:37 UTC< 1 s2025-03-07 19:46:37 UTC |
Set Failure Threshold Levels
Set threshold levels to better gauge adverse data quality.
import pointblank as pb
= (
validation
pb.Validate(=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
data=pb.Thresholds( # setting relative threshold defaults for all steps
thresholds=0.05, # 5% failing test units: warning threshold (gray)
warning=0.10, # 10% failed test units: error threshold (yellow)
error=0.15 # 15% failed test units: critical threshold (red)
critical
),
)="item_type", set=["iap", "ad"])
.col_vals_in_set(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
.col_vals_regex(columns="item_revenue", value=0.05)
.col_vals_gt(columns
.col_vals_gt(="session_duration",
columns=4,
value=(5, 10, 20) # setting absolute thresholds for *this* step (W, E, C)
thresholds
)="end_day")
.col_exists(columns
.interrogate()
)
validation
Preview of Input Table
DuckDBRows2,000Columns11 |
|||||||||||