Pointblank Validation
Example using a Parquet dataset. (Parquet)

| STEP | COLUMNS | VALUES | EVAL | UNITS | PASS | FAIL | W | E | C | EXT |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 col_vals_lt() | item_revenue | 200 | ✓ | 2000 | 2000 (1.00) | 0 (0.00) | — | — | — | — |
| 2 col_vals_gt() | item_revenue | 0 | ✓ | 2000 | 2000 (1.00) | 0 (0.00) | — | — | — | — |
| 3 col_vals_gt() | session_duration | 5 | ✓ | 2000 | 1982 (0.99) | 18 (0.01) | — | — | — | — |
| 4 col_vals_in_set() | item_type | iap, ad | ✓ | 2000 | 2000 (1.00) | 0 (0.00) | — | — | — | — |
| 5 col_vals_regex() | player_id | [A-Z]{12}\d{3} | ✓ | 2000 | 2000 (1.00) | 0 (0.00) | — | — | — | — |

2025-04-15 14:42:53 UTC | < 1 s | 2025-04-15 14:42:53 UTC
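Each PASS and FAIL cell in the report pairs a count of test units with its fraction of the total. As a minimal sketch of that arithmetic, the snippet below recomputes the figures for step 3 using plain Python. The helper name `pass_fail_fractions` is hypothetical, and rounding to two decimals is an assumption about how the report formats its fractions, not a statement about Pointblank's internals.

```python
def pass_fail_fractions(units: int, n_passed: int) -> tuple[int, float, float]:
    """Return (n_failed, fraction passed, fraction failed) for one step.

    Hypothetical helper for illustration; fractions are rounded to two
    decimals to mirror what the report displays (an assumption).
    """
    n_failed = units - n_passed
    f_passed = round(n_passed / units, 2)
    f_failed = round(n_failed / units, 2)
    return n_failed, f_passed, f_failed


# Counts taken from step 3 of the report: 2000 units, 1982 passing.
n_failed, f_passed, f_failed = pass_fail_fractions(units=2000, n_passed=1982)
print(n_failed, f_passed, f_failed)
```

The same calculation applied to the other four steps gives 0 failing units and a passing fraction of 1.0.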
Using Parquet Data
A Parquet dataset can be used for data validation, thanks to Ibis.
import pointblank as pb
import ibis

# Read the Parquet file as an Ibis table
game_revenue = ibis.read_parquet("data/game_revenue.parquet")

# Build the validation plan, then run it with interrogate()
validation = (
    pb.Validate(data=game_revenue, label="Example using a Parquet dataset.")
    .col_vals_lt(columns="item_revenue", value=200)
    .col_vals_gt(columns="item_revenue", value=0)
    .col_vals_gt(columns="session_duration", value=5)
    .col_vals_in_set(columns="item_type", set=["iap", "ad"])
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .interrogate()
)

# Display the validation report
validation
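To make the regex step more concrete, here is a rough sketch of what the pattern `[A-Z]{12}\d{3}` accepts, using only Python's standard `re` module. The sample IDs are made up for illustration, the helper `is_valid_player_id` is hypothetical, and the use of `re.fullmatch` (the whole string must match) is an assumption; how Pointblank anchors the pattern may depend on the backend.

```python
import re

# Same pattern as in the col_vals_regex() step: 12 uppercase letters, then 3 digits.
PLAYER_ID = re.compile(r"[A-Z]{12}\d{3}")


def is_valid_player_id(value: str) -> bool:
    """Hypothetical helper: True if the entire string matches the pattern."""
    return PLAYER_ID.fullmatch(value) is not None


# Made-up sample values, not taken from the dataset.
samples = ["ABCDEFGHIJKL123", "abcdefghijkl123", "ABCDEFGHIJKL12"]
results = {s: is_valid_player_id(s) for s in samples}
print(results)
```

Only the first sample passes here: the second uses lowercase letters and the third has two digits instead of three.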
Preview of Input Table
Parquet | Rows: 2,000 | Columns: 11