Pointblank Validation
2025-10-29 23:17:19 | DuckDB | WARNING 0.05 | ERROR 0.1 | CRITICAL 0.15

| STEP | COLUMNS | VALUES | TBL | EVAL | UNITS | PASS | FAIL | W | E | C | EXT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 col_vals_in_set() | item_type | iap, ad | | ✓ | 2000 | 2000 (1.00) | 0 (0.00) | ○ | ○ | ○ | — |
| 2 col_vals_regex() | player_id | [A-Z]{12}\d{3} | | ✓ | 2000 | 2000 (1.00) | 0 (0.00) | ○ | ○ | ○ | — |
| 3 col_vals_gt() | item_revenue | 0.05 | | ✓ | 2000 | 1701 (0.85) | 299 (0.15) | ● | ● | ○ | — |
| 4 col_vals_gt() | session_duration | 4 | | ✓ | 2000 | 1993 (1.00) | 7 (0.00) | ● | ○ | ○ | — |
| 5 col_exists() | end_day | — | | ✓ | 1 | 0 (0.00) | 1 (1.00) | ● | ● | ● | — |

2025-10-29 23:17:19 UTC | < 1 s | 2025-10-29 23:17:19 UTC

Set Failure Threshold Levels
Set threshold levels to better gauge adverse data quality.
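
To make the levels concrete, the W, E, and C marks in the report can be checked by hand from each step's failure counts. The sketch below is plain Python and does not call Pointblank; `Levels` and `check_levels` are hypothetical names, and it assumes a level is reached when the failing fraction (for thresholds below 1) or the failing count (for whole-number thresholds) is at or above the threshold.

```python
from typing import NamedTuple, Optional, Union

Threshold = Optional[Union[int, float]]


class Levels(NamedTuple):
    warning: bool
    error: bool
    critical: bool


def _reached(threshold: Threshold, n_failed: int, n_units: int) -> bool:
    """Hypothetical helper: is one threshold level reached?"""
    if threshold is None or n_units == 0:
        return False
    if threshold < 1:
        # Assumption: values below 1 are fractions of failing test units.
        return n_failed / n_units >= threshold
    # Assumption: values of 1 or more are absolute counts of failing units.
    return n_failed >= threshold


def check_levels(
    n_failed: int,
    n_units: int,
    warning: Threshold = None,
    error: Threshold = None,
    critical: Threshold = None,
) -> Levels:
    """Evaluate all three levels for a single validation step."""
    return Levels(
        _reached(warning, n_failed, n_units),
        _reached(error, n_failed, n_units),
        _reached(critical, n_failed, n_units),
    )


# Relative defaults used for every step unless overridden.
relative = dict(warning=0.05, error=0.10, critical=0.15)

# Step 3: 299 of 2000 test units failed.
print(check_levels(299, 2000, **relative))
# Step 4: 7 of 2000 failed, checked against absolute thresholds (5, 10, 20).
print(check_levels(7, 2000, warning=5, error=10, critical=20))
# Step 5: the single test unit failed.
print(check_levels(1, 1, **relative))
```

Under these assumptions the three results match the W, E, and C marks for steps 3, 4, and 5 in the report. Step 3 is worth a second look: the report rounds its failing fraction to 0.15, but 299 / 2000 is 0.1495, which is just under the critical threshold of 0.15.
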
import pointblank as pb

validation = (
    pb.Validate(
        data=pb.load_dataset(dataset="game_revenue", tbl_type="duckdb"),
        thresholds=pb.Thresholds(  # relative threshold defaults for all steps
            warning=0.05,   # 5% failing test units: warning threshold (gray)
            error=0.10,     # 10% failing test units: error threshold (yellow)
            critical=0.15,  # 15% failing test units: critical threshold (red)
        ),
    )
    .col_vals_in_set(columns="item_type", set=["iap", "ad"])
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .col_vals_gt(columns="item_revenue", value=0.05)
    .col_vals_gt(
        columns="session_duration",
        value=4,
        thresholds=(5, 10, 20),  # absolute thresholds for *this* step (W, E, C)
    )
    .col_exists(columns="end_day")
    .interrogate()
)
validation

Preview of Input Table
DuckDB | Rows 2,000 | Columns 11