Using Parquet Data
A Parquet dataset can be validated with Pointblank by reading the file with Ibis and passing the resulting table to the validation.
Pointblank Validation
Example using a Parquet dataset. (Table type: Parquet)

|   | STEP              | COLUMNS          | VALUES         | TBL | EVAL | UNITS | PASS        | FAIL     | W | E | C | EXT |
|---|-------------------|------------------|----------------|-----|------|-------|-------------|----------|---|---|---|-----|
| 1 | col_vals_lt()     | item_revenue     | 200            |     | ✓    | 2000  | 2000 (1.00) | 0 (0.00) | — | — | — | —   |
| 2 | col_vals_gt()     | item_revenue     | 0              |     | ✓    | 2000  | 2000 (1.00) | 0 (0.00) | — | — | — | —   |
| 3 | col_vals_gt()     | session_duration | 5              |     | ✓    | 2000  | 1982 (0.99) | 18 (0.01) | — | — | — | —   |
| 4 | col_vals_in_set() | item_type        | iap, ad        |     | ✓    | 2000  | 2000 (1.00) | 0 (0.00) | — | — | — | —   |
| 5 | col_vals_regex()  | player_id        | [A-Z]{12}\d{3} |     | ✓    | 2000  | 2000 (1.00) | 0 (0.00) | — | — | — | —   |

Started: 2025-03-07 19:46:49 UTC | Duration: < 1 s | Ended: 2025-03-07 19:46:50 UTC
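Each PASS and FAIL cell in the report pairs a count with its fraction of the test units. The sketch below reproduces that arithmetic for step 3, assuming the fractions are rounded to two decimals for display. The helper name `pass_fail_summary` is hypothetical and not part of Pointblank.

```python
def pass_fail_summary(units, n_pass):
    """Return (count, fraction) pairs for passing and failing test units.

    Hypothetical helper: rounding to two decimals is an assumption made
    to match how the report displays the fractions.
    """
    n_fail = units - n_pass
    return {
        "pass": (n_pass, round(n_pass / units, 2)),
        "fail": (n_fail, round(n_fail / units, 2)),
    }


# Step 3 in the report: 1982 of 2000 rows have session_duration > 5
step_3 = pass_fail_summary(units=2000, n_pass=1982)
print(step_3)
```

Under this reading, step 3 is the only step with failing test units: 18 of the 2,000 rows have a `session_duration` that is not greater than 5.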
import pointblank as pb
import ibis

# Read the Parquet file as an Ibis table
game_revenue = ibis.read_parquet("data/game_revenue.parquet")

# Define the validation steps, then run them with interrogate()
validation = (
    pb.Validate(data=game_revenue, label="Example using a Parquet dataset.")
    .col_vals_lt(columns="item_revenue", value=200)
    .col_vals_gt(columns="item_revenue", value=0)
    .col_vals_gt(columns="session_duration", value=5)
    .col_vals_in_set(columns="item_type", set=["iap", "ad"])
    .col_vals_regex(columns="player_id", pattern=r"[A-Z]{12}\d{3}")
    .interrogate()
)

# Display the validation report
validation
Preview of Input Table
Parquet | Rows: 2,000 | Columns: 11

|      | player_id       | session_id               | session_start             | time                      | item_type | item_name | item_revenue | session_duration | start_day  | acquisition    | country       |
|------|-----------------|--------------------------|---------------------------|---------------------------|-----------|-----------|--------------|------------------|------------|----------------|---------------|
| 1    | ECPANOIXLZHF896 | ECPANOIXLZHF896-eol2j8bs | 2015-01-01 01:31:03+00:00 | 2015-01-01 01:31:27+00:00 | iap       | offer2    | 8.99         | 16.3             | 2015-01-01 | google         | Germany       |
| 2    | ECPANOIXLZHF896 | ECPANOIXLZHF896-eol2j8bs | 2015-01-01 01:31:03+00:00 | 2015-01-01 01:36:57+00:00 | iap       | gems3     | 22.49        | 16.3             | 2015-01-01 | google         | Germany       |
| 3    | ECPANOIXLZHF896 | ECPANOIXLZHF896-eol2j8bs | 2015-01-01 01:31:03+00:00 | 2015-01-01 01:37:45+00:00 | iap       | gold7     | 107.99       | 16.3             | 2015-01-01 | google         | Germany       |
| 4    | ECPANOIXLZHF896 | ECPANOIXLZHF896-eol2j8bs | 2015-01-01 01:31:03+00:00 | 2015-01-01 01:42:33+00:00 | ad        | ad_20sec  | 0.76         | 16.3             | 2015-01-01 | google         | Germany       |
| 5    | ECPANOIXLZHF896 | ECPANOIXLZHF896-hdu9jkls | 2015-01-01 11:50:02+00:00 | 2015-01-01 11:55:20+00:00 | ad        | ad_5sec   | 0.03         | 35.2             | 2015-01-01 | google         | Germany       |
| 1996 | NAOJRDMCSEBI281 | NAOJRDMCSEBI281-j2vs9ilp | 2015-01-21 01:57:50+00:00 | 2015-01-21 02:02:50+00:00 | ad        | ad_survey | 1.332        | 25.8             | 2015-01-11 | organic        | Norway        |
| 1997 | NAOJRDMCSEBI281 | NAOJRDMCSEBI281-j2vs9ilp | 2015-01-21 01:57:50+00:00 | 2015-01-21 02:22:14+00:00 | ad        | ad_survey | 1.35         | 25.8             | 2015-01-11 | organic        | Norway        |
| 1998 | RMOSWHJGELCI675 | RMOSWHJGELCI675-vbhcsmtr | 2015-01-21 02:39:48+00:00 | 2015-01-21 02:40:00+00:00 | ad        | ad_5sec   | 0.03         | 8.4              | 2015-01-10 | other_campaign | France        |
| 1999 | RMOSWHJGELCI675 | RMOSWHJGELCI675-vbhcsmtr | 2015-01-21 02:39:48+00:00 | 2015-01-21 02:47:12+00:00 | iap       | offer5    | 26.09        | 8.4              | 2015-01-10 | other_campaign | France        |
| 2000 | GJCXNTWEBIPQ369 | GJCXNTWEBIPQ369-9elq67md | 2015-01-21 03:59:23+00:00 | 2015-01-21 04:06:29+00:00 | ad        | ad_5sec   | 0.12         | 18.5             | 2015-01-14 | organic        | United States |
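As a rough illustration of what the five validation steps check on each row, the sketch below re-implements them in plain Python over four columns of the ten previewed rows. It is not how Pointblank or Ibis evaluates the checks. `ROWS`, `STEPS`, and `run_steps` are hypothetical names, and using `re.fullmatch` for the pattern is an assumption about how the regex is anchored.

```python
import re

# Four of the eleven columns, copied from the ten rows in the table preview:
# (player_id, item_type, item_revenue, session_duration)
ROWS = [
    ("ECPANOIXLZHF896", "iap", 8.99, 16.3),
    ("ECPANOIXLZHF896", "iap", 22.49, 16.3),
    ("ECPANOIXLZHF896", "iap", 107.99, 16.3),
    ("ECPANOIXLZHF896", "ad", 0.76, 16.3),
    ("ECPANOIXLZHF896", "ad", 0.03, 35.2),
    ("NAOJRDMCSEBI281", "ad", 1.332, 25.8),
    ("NAOJRDMCSEBI281", "ad", 1.35, 25.8),
    ("RMOSWHJGELCI675", "ad", 0.03, 8.4),
    ("RMOSWHJGELCI675", "iap", 26.09, 8.4),
    ("GJCXNTWEBIPQ369", "ad", 0.12, 18.5),
]
COLUMNS = ("player_id", "item_type", "item_revenue", "session_duration")

# One (label, column, predicate) triple per validation step in the report.
STEPS = [
    ("col_vals_lt", "item_revenue", lambda v: v < 200),
    ("col_vals_gt", "item_revenue", lambda v: v > 0),
    ("col_vals_gt", "session_duration", lambda v: v > 5),
    ("col_vals_in_set", "item_type", lambda v: v in {"iap", "ad"}),
    # Assumption: the whole value must match the pattern.
    ("col_vals_regex", "player_id",
     lambda v: re.fullmatch(r"[A-Z]{12}\d{3}", v) is not None),
]


def run_steps(rows, steps, columns=COLUMNS):
    """Return (label, column, units, n_pass, n_fail) for each step."""
    results = []
    for label, column, check in steps:
        idx = columns.index(column)
        # Every row is one test unit; count the rows that satisfy the check.
        n_pass = sum(1 for row in rows if check(row[idx]))
        results.append((label, column, len(rows), n_pass, len(rows) - n_pass))
    return results


for label, column, units, n_pass, n_fail in run_steps(ROWS, STEPS):
    print(f"{label}({column}): units={units} pass={n_pass} fail={n_fail}")
```

Each row counts as one test unit per step, which is why every step in the report lists 2,000 units for the 2,000-row table.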