Pointblank Validation
2026-04-22|21:11:59, Polars

| STEP | VALIDATION | BRIEF | EVAL | UNITS | PASS | FAIL | W | E | C | EXT |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | specially() | Recent data available (within 2 days of 2023-12-31) | ✓ | 1 | 1 (1.00) | 0 (0.00) | — | — | — | — |
| 2 | col_vals_ge() | All data points are from the last week | ✓ | 4 | 4 (1.00) | 0 (0.00) | — | — | — | — |
| 3 | specially() | Most recent data is from today | ✓ | 1 | 1 (1.00) | 0 (0.00) | — | — | — | — |
| 4 | col_vals_not_null() | No missing timestamps | ✓ | 4 | 4 (1.00) | 0 (0.00) | — | — | — | — |

Started 2026-04-22 21:11:59 UTC, duration < 1 s, finished 2026-04-22 21:11:59 UTC
Validating Data Freshness
Use date-based validations to ensure your data is current and recent.
import pointblank as pb
import polars as pl
from datetime import date, datetime, timedelta

# Create sample data with mixed freshness levels
freshness_data = pl.DataFrame({
    "data_timestamp": [
        datetime(2023, 12, 28, 10, 30),  # 3 days ago from Dec 31
        datetime(2023, 12, 29, 14, 15),  # 2 days ago
        datetime(2023, 12, 30, 9, 45),   # 1 day ago
        datetime(2023, 12, 31, 16, 20),  # Today
    ],
    "sensor_id": ["TEMP_01", "TEMP_02", "TEMP_01", "TEMP_03"],
    "reading": [22.5, 21.8, 23.1, 22.9],
    "quality_score": [0.95, 0.88, 0.92, 0.97]
})

# Assuming today is 2023-12-31, check for data freshness
current_date = date(2023, 12, 31)
freshness_cutoff = current_date - timedelta(days=2)  # Data should be within 2 days

validation = (
    pb.Validate(freshness_data)
    .specially(
        expr=lambda df: df.filter(
            pl.col("data_timestamp").dt.date() >= freshness_cutoff
        ).height > 0,
        brief=f"Recent data available (within 2 days of {current_date})"
    )
    .col_vals_ge(
        columns="data_timestamp",
        value=current_date - timedelta(days=7),  # Within last week
        brief="All data points are from the last week"
    )
    .specially(
        expr=lambda df: (
            df.select(pl.col("data_timestamp").max()).item().date() >= current_date
        ),
        brief="Most recent data is from today"
    )
    .col_vals_not_null(
        columns="data_timestamp",
        brief="No missing timestamps"
    )
    .interrogate()
)
validation

Preview of Input Table (Polars, 4 rows, 4 columns)
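To make the date arithmetic behind the first three checks easier to follow, here is a minimal sketch that reproduces them with the standard library only, using the same four timestamps as `freshness_data`. It is not how Pointblank evaluates the steps internally; the names `timestamps`, `week_cutoff`, `recent_rows`, and the three boolean flags are illustrative and do not appear in the original example.

```python
from datetime import date, datetime, timedelta

# Same timestamps as the "data_timestamp" column in freshness_data
timestamps = [
    datetime(2023, 12, 28, 10, 30),
    datetime(2023, 12, 29, 14, 15),
    datetime(2023, 12, 30, 9, 45),
    datetime(2023, 12, 31, 16, 20),
]

# "Today" is pinned to 2023-12-31, as in the example
current_date = date(2023, 12, 31)
freshness_cutoff = current_date - timedelta(days=2)  # 2023-12-29
week_cutoff = current_date - timedelta(days=7)       # 2023-12-24 (illustrative name)

# Check 1: at least one row dated on or after the 2-day cutoff
recent_rows = [ts for ts in timestamps if ts.date() >= freshness_cutoff]
has_recent_data = len(recent_rows) > 0

# Check 2: every row dated on or after the 7-day cutoff
all_within_week = all(ts.date() >= week_cutoff for ts in timestamps)

# Check 3: the newest timestamp falls on the current date (or later)
latest_is_today = max(timestamps).date() >= current_date

print(f"2-day cutoff: {freshness_cutoff}, rows on or after it: {len(recent_rows)}")
print(f"7-day cutoff: {week_cutoff}, all rows within it: {all_within_week}")
print(f"latest row is from today: {latest_is_today}")
```

Note that the 2023-12-28 reading falls outside the 2-day window, yet the first check still passes because it only requires at least one recent row. In a live pipeline, `current_date` would presumably come from `date.today()` rather than a fixed value.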