Data validation made beautiful and powerful
Pointblank is a powerful yet elegant data validation framework for Python that transforms how you ensure data quality. With its intuitive, chainable API, you can quickly validate your data against a comprehensive set of quality checks and view the results in interactive reports that make data issues immediately actionable.
Whether you’re a data scientist, data engineer, or analyst, Pointblank helps you catch data quality issues before they impact your analyses or downstream systems.
Getting Started in 30 Seconds
import pointblank as pb
validation = (
    pb.Validate(data=pb.load_dataset(dataset="small_table"))
    .col_vals_gt(columns="d", value=100)        # Validate values > 100
    .col_vals_le(columns="c", value=5)          # Validate values <= 5
    .col_exists(columns=["date", "date_time"])  # Check columns exist
    .interrogate()                              # Execute and collect results
)
# Get the validation report from the REPL with:
validation.get_tabular_report().show()
# From a notebook simply use:
validation
Real-World Example
import pointblank as pb
import polars as pl
# Load your data
sales_data = pl.read_csv("sales_data.csv")
# Create a comprehensive validation
validation = (
    pb.Validate(
        data=sales_data,
        tbl_name="sales_data",           # Name of the table for reporting
        label="Real-world example.",     # Label for the validation, appears in reports
        thresholds=(0.01, 0.02, 0.05),   # Set thresholds for warnings, errors, and critical issues
        actions=pb.Actions(              # Define actions for any threshold exceedance
            critical="Major data quality issue found in step {step} ({time})."
        ),
        final_actions=pb.FinalActions(   # Define final actions for the entire validation
            pb.send_slack_notification(
                webhook_url="https://hooks.slack.com/services/your/webhook/url"
            )
        ),
        brief=True,                      # Add automatically-generated briefs for each step
    )
    .col_vals_between(                   # Check numeric ranges with precision
        columns=["price", "quantity"],
        left=0, right=1000
    )
    .col_vals_not_null(                  # Ensure that columns ending with '_id' don't have null values
        columns=pb.ends_with("_id")
    )
    .col_vals_regex(                     # Validate patterns with regex
        columns="email",
        pattern="^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
    )
    .col_vals_in_set(                    # Check categorical values
        columns="status",
        set=["pending", "shipped", "delivered", "returned"]
    )
    .conjointly(                         # Combine multiple conditions
        lambda df: pb.expr_col("revenue") == pb.expr_col("price") * pb.expr_col("quantity"),
        lambda df: pb.expr_col("tax") >= pb.expr_col("revenue") * 0.05
    )
    .interrogate()
)
Major data quality issue found in step 7 (2025-04-16 15:03:04.685612+00:00).
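The line above is the `critical` action message with its `{step}` and `{time}` placeholders filled in. As a rough illustration of how a `thresholds=(0.01, 0.02, 0.05)` tuple could map a step's failing fraction to a severity level, here is a plain-Python sketch. It does not use Pointblank: the names `severity_for` and `render_message` are hypothetical, and it assumes a level is reached when the failing fraction meets or exceeds its threshold.

```python
from datetime import datetime, timezone

# Thresholds as in the example above: (warning, error, critical),
# each expressed as a fraction of failing test units.
THRESHOLDS = (0.01, 0.02, 0.05)
LEVELS = ("warning", "error", "critical")


def severity_for(n_failed, n_total, thresholds=THRESHOLDS):
    """Return the highest level whose threshold is met, or None.

    Assumption (not taken from the library): a level is reached when
    n_failed / n_total >= threshold.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    fraction = n_failed / n_total
    reached = None
    for level, threshold in zip(LEVELS, thresholds):
        if fraction >= threshold:
            reached = level
    return reached


def render_message(template, step, when=None):
    """Fill the {step} and {time} placeholders of an action message."""
    when = when or datetime.now(timezone.utc)
    return template.format(step=step, time=when)


template = "Major data quality issue found in step {step} ({time})."

# 8 failing rows out of 100 is above the 0.05 critical threshold.
if severity_for(n_failed=8, n_total=100) == "critical":
    print(render_message(template, step=7))
```

Under these assumptions, a step with 3 failing rows out of 100 would reach the error level but not the critical one, so the message would not be printed.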
# Get an HTML report you can share with your team
validation.get_tabular_report().show("browser")
# Get a report of failing records from a specific step
validation.get_step_report(i=3).show("browser")  # Get failing records from step 3
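To make the row-level logic of two of the steps in this example concrete, the following plain-Python sketch applies the same email pattern and the same pair of joint conditions to a few made-up rows. It does not call Pointblank: the sample rows and the helper names `email_ok` and `jointly_ok` are hypothetical, and integer amounts are used so the equality check is not affected by floating-point rounding.

```python
import re

# Same pattern string as in the col_vals_regex step.
EMAIL_PATTERN = re.compile("^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$")


def email_ok(value):
    """True when the value matches the email pattern."""
    return EMAIL_PATTERN.match(value) is not None


def jointly_ok(row):
    """True only when both conditions hold for the same row."""
    revenue_matches = row["revenue"] == row["price"] * row["quantity"]
    tax_sufficient = row["tax"] >= row["revenue"] * 0.05
    return revenue_matches and tax_sufficient


# Made-up rows for illustration only.
rows = [
    {"email": "ana@example.com", "price": 20, "quantity": 3, "revenue": 60, "tax": 4},
    {"email": "not-an-email", "price": 10, "quantity": 2, "revenue": 25, "tax": 2},
    {"email": "bo@example.org", "price": 50, "quantity": 1, "revenue": 50, "tax": 1},
]

failing_email = [r["email"] for r in rows if not email_ok(r["email"])]
failing_joint = [i for i, r in enumerate(rows) if not jointly_ok(r)]

print(failing_email)  # values that fail the regex check
print(failing_joint)  # indices of rows that fail the joint check
```

In this sketch the second row fails because its revenue does not equal price times quantity, and the third fails because its tax is below 5% of revenue; a row passes only when every condition holds for it at once.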
Join the Community
We’d love to hear from you! Connect with us:
- GitHub Issues for bug reports and feature requests
- Discord server for discussions and help
- Contributing guidelines if you’d like to help improve Pointblank
Installation
You can install Pointblank using pip:
pip install pointblank
You can also install Pointblank from Conda-Forge by using:
conda install conda-forge::pointblank
If you don’t have Polars or Pandas installed, you’ll need to install one of them to use Pointblank.
pip install "pointblank[pl]" # Install Pointblank with Polars
pip install "pointblank[pd]" # Install Pointblank with Pandas
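If you are unsure whether either library is already present, a quick check with the standard library can tell you before you pick an install command. This is a minimal sketch and not part of Pointblank; the helper name `available_dataframe_libraries` is hypothetical.

```python
from importlib.util import find_spec


def available_dataframe_libraries(candidates=("polars", "pandas")):
    """Return the candidate packages that can be imported in this environment."""
    return [name for name in candidates if find_spec(name) is not None]


found = available_dataframe_libraries()
if found:
    print("DataFrame libraries available:", ", ".join(found))
else:
    print('None found; try: pip install "pointblank[pl]"')
```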
To use Pointblank with DuckDB, MySQL, PostgreSQL, or SQLite, install Ibis with the appropriate backend:
pip install "pointblank[duckdb]" # Install Pointblank with Ibis + DuckDB
pip install "pointblank[mysql]" # Install Pointblank with Ibis + MySQL
pip install "pointblank[postgres]" # Install Pointblank with Ibis + PostgreSQL
pip install "pointblank[sqlite]" # Install Pointblank with Ibis + SQLite
Technical Details
Pointblank uses Narwhals to work with Polars and Pandas DataFrames, and integrates with Ibis for database and file format support. This architecture provides a consistent API for validating tabular data from various sources.
Contributing to Pointblank
There are many ways to contribute to the ongoing development of Pointblank. Some contributions can be simple (like fixing typos, improving documentation, filing issues for feature requests or problems, etc.) and others might take more time and care (like answering questions and submitting PRs with code changes). Just know that anything you can do to help would be very much appreciated!
Please read over the contributing guidelines for information on how to get started.
Roadmap
We’re actively working on enhancing Pointblank with:
- Additional validation methods for comprehensive data quality checks
- Advanced logging capabilities
- Messaging actions (Slack, email) for threshold exceedances
- LLM-powered validation suggestions and data dictionary generation
- JSON/YAML configuration for pipeline portability
- CLI utility for validation from the command line
- Expanded backend support and certification
- High-quality documentation and examples
If you have any ideas for features or improvements, don’t hesitate to share them with us! We are always looking for ways to make Pointblank better.
Code of Conduct
Please note that the Pointblank project is released with a contributor code of conduct.
By participating in this project you agree to abide by its terms.
📄 License
Pointblank is licensed under the MIT license.
© Posit Software, PBC.
🏛️ Governance
This project is primarily maintained by Rich Iannone. Other authors may occasionally assist with some of these duties.