Pointblank Validation
2025-01-31 | 16:28:20 | DuckDB

STEP | COLUMNS | VALUES | TBL | EVAL | UNITS | PASS | FAIL | W | S | N | EXT
---|---|---|---|---|---|---|---|---|---|---|---
1 | | | | ✓ | 1 | 0 (0.00) | 1 (1.00) | — | — | — | —

2025-01-31 16:28:20 UTC | < 1 s | 2025-01-31 16:28:20 UTC
Step Report: Schema Check
When a schema doesn’t match, a step report gives you the details.
Report for Validation Step 1 ✗
COLUMN SCHEMA MATCH: COMPLETE, IN ORDER, COLUMN ≠ column, DTYPE ≠ dtype, float ≠ float64

TARGET | EXPECTED | EXPECTED COLUMN | COLUMN MATCH | EXPECTED DTYPE | DTYPE MATCH
---|---|---|---|---|---
1 | 1 | date_time | ✓ | timestamp | ✗
2 | 2 | dates | ✗ | date | —
3 | 3 | a | ✓ | int64 | ✓
4 | 4 | b | ✓ | |
5 | 5 | c | ✓ | |
6 | 6 | d | ✓ | float64 | ✓
7 | 7 | e | ✓ | bool or boolean | ✓
8 | 8 | f | ✓ | str | ✗

Supplied Column Schema: [('date_time', 'timestamp'), ('dates', 'date'), ('a', 'int64'), ('b',), ('c',), ('d', 'float64'), ('e', ['bool', 'boolean']), ('f', 'str')]
import pointblank as pb

# Create a schema for the target table (`small_table` as a DuckDB table)
schema = pb.Schema(
    columns=[
        ("date_time", "timestamp"),     # this dtype doesn't match
        ("dates", "date"),              # this column name doesn't match
        ("a", "int64"),
        ("b",),                         # omit dtype to not check for it
        ("c",),                         # "" "" "" ""
        ("d", "float64"),
        ("e", ["bool", "boolean"]),     # try several dtypes (second one matches)
        ("f", "str"),                   # this dtype doesn't match
    ]
)

# Use the `col_schema_match()` validation method to perform a schema check
validation = (
    pb.Validate(data=pb.load_dataset(dataset="small_table", tbl_type="duckdb"))
    .col_schema_match(schema=schema)
    .interrogate()
)

# Display the validation report
validation

# Display the detailed report for step 1 (the schema check)
validation.get_step_report(i=1)
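To make the ✓/✗ marks in the step report easier to read, here is a minimal pure-Python sketch of the matching rules the report appears to apply: columns are compared position by position, names must match exactly, an omitted dtype is not checked, and a list of dtypes passes if any entry matches. This is not Pointblank's implementation. The `compare_schemas` helper is hypothetical, and the `target` dtypes are assumed for illustration only, chosen to be consistent with the comments in the example (they were not recoverable from the report itself).

```python
# Expected schema, copied from the example above. A 1-tuple means "don't check
# the dtype"; a list of dtypes means "any of these is acceptable".
expected = [
    ("date_time", "timestamp"),
    ("dates", "date"),
    ("a", "int64"),
    ("b",),
    ("c",),
    ("d", "float64"),
    ("e", ["bool", "boolean"]),
    ("f", "str"),
]

# HYPOTHETICAL target schema: these names and dtypes are assumptions made for
# this sketch, not values read from the DuckDB table.
target = [
    ("date_time", "timestamp(6)"),
    ("date", "date"),
    ("a", "int64"),
    ("b", "string"),
    ("c", "int64"),
    ("d", "float64"),
    ("e", "boolean"),
    ("f", "string"),
]


def compare_schemas(target, expected):
    """Compare schemas position by position.

    Returns one (column_ok, dtype_ok) pair per row. dtype_ok is None when the
    dtype was not checked: either the expected entry omitted it, or the column
    name already failed to match.
    """
    rows = []
    for (t_name, t_dtype), exp in zip(target, expected):
        column_ok = t_name == exp[0]  # case-sensitive name comparison
        if not column_ok or len(exp) == 1:
            dtype_ok = None
        else:
            allowed = exp[1] if isinstance(exp[1], list) else [exp[1]]
            dtype_ok = t_dtype in allowed  # exact dtype string comparison
        rows.append((column_ok, dtype_ok))
    return rows


rows = compare_schemas(target, expected)

# The step passes only if both schemas have the same number of columns,
# every name matches, and no checked dtype fails.
schema_matches = len(target) == len(expected) and all(
    column_ok and dtype_ok is not False for column_ok, dtype_ok in rows
)

for i, (column_ok, dtype_ok) in enumerate(rows, start=1):
    print(i, column_ok, dtype_ok)
print("schema matches:", schema_matches)
```

Under these assumptions, rows 1 and 8 fail on dtype and row 2 fails on the column name, so the single test unit of the step fails, in line with the report.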
Preview of Input Table
DuckDB | Rows: 13 | Columns: 8