CLI Interactive Demos

These CLI demos showcase practical data quality workflows that you can run directly from your terminal.

🎬 Workflow-Based Demonstrations
  • Essential validations for everyday data quality checks
  • Data exploration tools that require no Python knowledge
  • CI/CD integration patterns for automated data quality
  • Complete pipelines from exploration to production validation

Prerequisites

To follow along with these demonstrations:

pip install pointblank
pb --help  # Verify installation

Getting Started with the CLI

Learn the basics of Pointblank’s CLI and run your first validation:

Getting Started: CLI overview and your first data quality validation

Essential Data Quality Validations

See the most commonly used validation checks that catch critical data issues:

Essential Validations: Duplicate detection, null checks, and data extract debugging

Data Exploration Without Python

Discover how to profile and explore data using CLI tools that don’t require Python knowledge:

Data Exploration: Preview data, find missing values, and generate column summaries

CI/CD Integration & Automation

Learn how to integrate data quality checks into automated pipelines:

CI/CD Integration: Exit codes, pipeline integration, and automated quality gates

Complete Data Quality Workflow

Follow an end-to-end data quality pipeline combining exploration, validation, and profiling:

Complete Workflow: Full pipeline (explore → validate → profile → automate)

Getting Started

Ready to implement data quality workflows? Here’s how to get started:

1. Install and Verify

pip install pointblank
pb --help
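
In a script, it can help to confirm that the `pb` executable is on the PATH before running any checks. The following is a minimal sketch using only standard shell features; `require_cmd` is a hypothetical helper name and is not part of Pointblank.

```shell
# Hypothetical helper (not part of Pointblank): succeed only if the
# named command can be found on the PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  # Report the problem on stderr so it shows up in CI logs.
  echo "error: '$1' not found on PATH" >&2
  return 1
}

# Example use at the top of a script:
# require_cmd pb || exit 1
```

If the lookup fails, reinstalling with `pip install pointblank` in the active environment is the first thing to try.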

2. Explore Various Data Sources

# Built-in datasets
pb preview small_table

# Local files with patterns
pb preview "data/*.parquet"
pb scan sales_data.csv

# GitHub repositories (no download required)
pb preview "https://github.com/user/repo/blob/main/data.csv"
pb missing "https://raw.githubusercontent.com/user/repo/main/sales.parquet"

# Database connections
pb info "duckdb:///warehouse/analytics.ddb::customers"

3. Run Essential Validations

# Check for duplicate rows
pb validate-simple small_table --check rows-distinct

# Validate data from multiple sources
pb validate-simple "data/*.parquet" --check col-vals-not-null --column customer_id
pb validate-simple "https://github.com/user/repo/blob/main/sales.csv" --check rows-distinct

# Extract failing data for debugging
pb validate-simple small_table --check col-vals-gt --column a --value 5 --show-extract

4. Integrate with CI/CD

# Use exit codes for automation (0 = pass, 1 = fail)
pb validate-simple small_table --check rows-distinct && echo "✅ Quality checks passed"
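
The `&&` form above prints a message only on success. When a pipeline should also log failures and stop, the same exit-code convention (0 = pass, non-zero = fail) can be wrapped in a small function. This is a sketch under that assumption; `run_gate` is a hypothetical helper name, not a Pointblank command, and the commented usage lines reuse the `pb validate-simple` invocations shown earlier.

```shell
# Hypothetical helper (not part of Pointblank): run any command as a
# quality gate and report the outcome based on its exit code.
run_gate() {
  gate_label="$1"   # human-readable name for the log line
  shift             # the remaining arguments are the command to run
  if "$@"; then
    echo "PASS: ${gate_label}"
    return 0
  else
    echo "FAIL: ${gate_label}" >&2
    return 1
  fi
}

# Example use in a CI step; the first failing gate stops the script:
# run_gate "distinct rows" pb validate-simple small_table --check rows-distinct || exit 1
# run_gate "customer_id not null" pb validate-simple "data/*.parquet" --check col-vals-not-null --column customer_id || exit 1
```

Because the helper passes the command's success or failure through as its own return status, a CI runner that fails a job on a non-zero exit will treat a failed validation as a failed step.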