Importing External Schemas

Many teams already have data schemas defined in other tools: JSON Schema files for API validation, Frictionless Table Schemas for open data, dbt schema.yml files for analytics pipelines, or Pandera/Pydantic models in application code. Rather than manually rewriting these specifications as Pointblank validation steps, you can import them directly. JSON Schema and Frictionless are supported today; adapters for the other formats are planned.

The import_contract() function reads an external schema definition and produces a ContractImport object containing everything Pointblank needs to validate data: column types, constraints, and mapped validation steps. From there, a single method call gives you a Validate workflow, a Contract object, equivalent Python code, or a YAML definition.

Quick Start

The fastest path from an external schema to running validation:

import pointblank as pb
import polars as pl

# Define a JSON Schema (could also be loaded from a file)
user_schema = {
    "type": "object",
    "properties": {
        "user_id": {"type": "integer"},
        "email": {"type": "string", "format": "email"},
        "age": {"type": "integer", "minimum": 0, "maximum": 150},
        "status": {"type": "string", "enum": ["active", "inactive", "pending"]},
    },
    "required": ["user_id", "email"],
}

# Import the schema
result = pb.import_contract(user_schema, format="json_schema")

# Create sample data and validate
users = pl.DataFrame(
    {
        "user_id": [1, 2, 3, 4, 5],
        "email": [
            "alice@example.com",
            "bob@corp.io",
            "charlie@mail.org",
            "dave@startup.co",
            "eve@company.net",
        ],
        "age": [28, 34, 45, 22, 31],
        "status": ["active", "active", "inactive", "pending", "active"],
    }
)

result.to_validate(data=users).interrogate()
Pointblank Validation
2026-06-13|17:47:12
Polars

STEP  METHOD                  COLUMNS  VALUES                     UNITS  PASS      FAIL
1     col_schema_match()      -        SCHEMA                     1      1 (1.00)  0 (0.00)
2     col_vals_not_null()     user_id  -                          5      5 (1.00)  0 (0.00)
3     col_vals_not_null()     email    -                          5      5 (1.00)  0 (0.00)
4     col_vals_within_spec()  email    email                      5      5 (1.00)  0 (0.00)
5     col_vals_ge()           age      0                          5      5 (1.00)  0 (0.00)
6     col_vals_le()           age      150                        5      5 (1.00)  0 (0.00)
7     col_vals_in_set()       status   active, inactive, pending  5      5 (1.00)  0 (0.00)

Notes

Step 1 (schema_check): Schema validation passed.

Schema Comparison

   TARGET                 EXPECTED
   COLUMN   DATA TYPE     COLUMN   DATA TYPE
1  user_id  Int64      1  user_id  Int64
2  email    String     2  email    String
3  age      Int64      3  age      Int64
4  status   String     4  status   String

Supplied Column Schema:
[('user_id', 'Int64'), ('email', 'String'), ('age', 'Int64'), ('status', 'String')]

Schema Match Settings: COMPLETE, IN ORDER, COLUMN ≠ column, DTYPE ≠ dtype, float ≠ float64

That’s it. The JSON Schema minimum, maximum, enum, format, and required keywords were automatically translated into the appropriate Pointblank validation steps. Each keyword becomes a dedicated validation check: minimum becomes col_vals_ge(), maximum becomes col_vals_le(), enum becomes col_vals_in_set(), and so on. The schema’s required array generates col_vals_not_null() steps for each listed field, ensuring that null values are caught at validation time.

How It Works

The import process has three stages:

  1. Parse: the external schema is read (from a file path, a dict, or a Python object)
  2. Map: each constraint in the source format is translated to a Pointblank validation method
  3. Package: the results are stored in a ContractImport object with multiple output options

flowchart LR
    A[External Schema] --> B[import_contract]
    B --> C[ContractImport]
    C --> D[.to_validate‹data›]
    C --> E[.to_contract‹›]
    C --> F[.to_python‹›]
    C --> G[.to_yaml‹›]

The ContractImport object is your bridge between the external world and Pointblank. It doesn’t execute anything on its own. Rather, it holds the translated specification and lets you choose how to use it. This separation is intentional: you can inspect the translation results, check for any warnings, and decide how to proceed before committing to a particular output format.
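The three stages can be sketched in plain Python. This is a toy illustration only, not Pointblank's implementation: ToyImport, toy_import(), and the KEYWORD_TO_METHOD table are hypothetical names, and the table covers just three of the keywords discussed in this guide.

```python
from dataclasses import dataclass, field

# Hypothetical keyword-to-method table. It mirrors a few rows of the JSON Schema
# mapping described in this guide; it is not Pointblank's internal code.
KEYWORD_TO_METHOD = {
    "minimum": "col_vals_ge",
    "maximum": "col_vals_le",
    "enum": "col_vals_in_set",
}


@dataclass
class ToyImport:
    """Illustrative stand-in for ContractImport: it only holds results."""

    columns: list = field(default_factory=list)
    constraints: list = field(default_factory=list)
    warnings: list = field(default_factory=list)


def toy_import(schema: dict) -> ToyImport:
    result = ToyImport()
    # Stage 1 (parse): walk the already-loaded schema dict.
    for name, spec in schema.get("properties", {}).items():
        result.columns.append((name, spec.get("type")))
        # Stage 2 (map): translate each keyword that has a known counterpart.
        for keyword, value in spec.items():
            if keyword == "type":
                continue
            method = KEYWORD_TO_METHOD.get(keyword)
            if method is None:
                result.warnings.append(f"Column '{name}': '{keyword}' skipped.")
            else:
                result.constraints.append((method, name, value))
    # Stage 3 (package): hand back the container; nothing has been executed.
    return result


demo = toy_import(
    {"properties": {"age": {"type": "integer", "minimum": 0, "multipleOf": 1}}}
)
print(demo.constraints)
print(demo.warnings)
```

The point of the sketch is the shape of the result: mapped constraints and skipped ones are both recorded, and running anything against data is left to a later step.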

The ContractImport Object

After calling import_contract(), you get back a ContractImport with these key attributes and methods:

Attribute / Method    Description
.columns              List of (column_name, dtype) tuples detected from the source
.constraints          List of MappedConstraint objects (method + kwargs)
.warnings             Messages about constraints that couldn’t be translated
.coverage             Fraction of source constraints successfully mapped (0.0–1.0)
.metadata             Extra metadata (title, description) from the source
.to_validate(data)    Build a Validate object ready for .interrogate()
.to_contract(name)    Build a Contract object for pipeline use
.to_python()          Generate equivalent Python code as a string
.to_yaml()            Generate Pointblank YAML configuration
.summary()            Return a human-readable summary

Inspecting What Was Imported

Before running validation, it’s often useful to inspect what the import produced:

result = pb.import_contract(user_schema, format="json_schema")
print(result.summary())
Contract Import Summary
  Format: json_schema
  Columns detected: 4
  Constraints mapped: 6
  Coverage: 100%

If any constraints couldn’t be mapped, they appear in .warnings:

# A schema with an unmappable format
schema_with_date = {
    "type": "object",
    "properties": {
        "created_at": {"type": "string", "format": "date-time"},
    },
}
result = pb.import_contract(schema_with_date, format="json_schema")

if result.warnings:
    for w in result.warnings:
        print(f"⚠ {w}")

print(f"\nCoverage: {result.coverage:.0%}")
⚠ Column 'created_at': JSON Schema format 'date-time' has no Pointblank equivalent — skipped.

Coverage: 0%

The coverage metric tells you what fraction of the source constraints were successfully translated. A coverage of 100% means everything mapped cleanly; lower values mean some constraints were skipped (with details in warnings). This transparency is important because no translation between formats is perfect. By checking coverage and warnings before running validation, you can be confident about exactly which parts of your original schema are being enforced and which parts might need manual attention.
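As a worked example of the arithmetic, the sketch below computes coverage as mapped constraints divided by all source constraints. The formula is an assumption based on the description above, and the coverage() helper is hypothetical; the handling of a schema with no constraints at all is also a guess.

```python
def coverage(mapped: int, skipped: int) -> float:
    """Fraction of source constraints that were mapped (assumed formula)."""
    total = mapped + skipped
    # Assumption: a schema that declares no constraints counts as fully covered.
    return 1.0 if total == 0 else mapped / total


print(f"{coverage(6, 0):.0%}")  # user_schema above: 6 mapped, none skipped
print(f"{coverage(0, 1):.0%}")  # schema_with_date above: 1 skipped, none mapped
print(f"{coverage(3, 1):.0%}")  # a hypothetical schema with one skipped constraint
```

Under this reading, the date-time example reports 0% because its only constraint was skipped, not because the import failed.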

Supported Formats

Pointblank ships with adapters for two widely used tabular schema formats, JSON Schema and Frictionless Table Schema. Additional adapters (dbt, Pydantic, Pandera) are planned for future releases.

pb.list_adapters()
{'frictionless': {'class': 'FrictionlessAdapter',
  'file_extensions': ['.resource.json', '.datapackage.json'],
  'supports_import': True,
  'supports_export': True},
 'json_schema': {'class': 'JSONSchemaAdapter',
  'file_extensions': ['.schema.json'],
  'supports_import': True,
  'supports_export': True}}

JSON Schema

JSON Schema is a widely used format for describing the structure of JSON data. Because tabular data (DataFrames) can be modeled as arrays of JSON objects, JSON Schema is a natural fit for defining column-level constraints.

Constraint mapping:

JSON Schema Keyword                                  Pointblank Method
type: "integer" / "number" / "string" / "boolean"    Schema dtype check
minimum                                              col_vals_ge()
maximum                                              col_vals_le()
exclusiveMinimum                                     col_vals_gt()
exclusiveMaximum                                     col_vals_lt()
enum                                                 col_vals_in_set()
pattern                                              col_vals_regex()
format: "email"                                      col_vals_within_spec(spec="email")
format: "uri"                                        col_vals_within_spec(spec="url")
format: "ipv4" / "ipv6"                              col_vals_within_spec(spec="ipv4"/"ipv6")
const                                                col_vals_eq()
required (array of field names)                      col_vals_not_null()

Importing from a file:

# File-based import (auto-detects .schema.json extension)
result = pb.import_contract("models/user_profile.schema.json")

Importing from a dict:

product_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "Product Catalog",
    "description": "Expected structure for product data",
    "type": "object",
    "properties": {
        "sku": {"type": "string", "pattern": "^[A-Z]{3}-[0-9]{4}$"},
        "price": {"type": "number", "exclusiveMinimum": 0},
        "category": {"type": "string", "enum": ["electronics", "clothing", "food", "other"]},
        "in_stock": {"type": "boolean"},
    },
    "required": ["sku", "price", "category"],
}

result = pb.import_contract(product_schema, format="json_schema")

# Inspect what was mapped
for c in result.constraints:
    print(f"  {c.method}({c.kwargs})")
  col_vals_not_null({'columns': 'sku'})
  col_vals_regex({'columns': 'sku', 'pattern': '^[A-Z]{3}-[0-9]{4}$'})
  col_vals_not_null({'columns': 'price'})
  col_vals_gt({'columns': 'price', 'value': 0})
  col_vals_not_null({'columns': 'category'})
  col_vals_in_set({'columns': 'category', 'set': ['electronics', 'clothing', 'food', 'other']})

Each constraint from the JSON Schema has been translated into the corresponding Pointblank method call. The pattern keyword on the sku field became a col_vals_regex() step, exclusiveMinimum became col_vals_gt() (note the strict inequality), and the three required fields each generated a col_vals_not_null() step. You can iterate over result.constraints like this to verify the translation before running any validation.
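To make two of these constraints concrete, the following sketch evaluates them with the standard library alone. It does not call Pointblank; sku_ok() and price_ok() are hypothetical helpers that apply the same pattern and the same strict comparison to individual values.

```python
import re

# The pattern from the product schema: three uppercase letters, a hyphen, four digits.
SKU_PATTERN = re.compile(r"^[A-Z]{3}-[0-9]{4}$")


def sku_ok(value: str) -> bool:
    """True when the value matches the sku pattern."""
    return SKU_PATTERN.search(value) is not None


def price_ok(value: float) -> bool:
    """exclusiveMinimum: 0 is a strict comparison, so zero itself fails."""
    return value > 0


for sku in ["ABC-1234", "abc-1234", "ABCD-123", "ABC-12345"]:
    print(sku, sku_ok(sku))

for price in [0, 0.01, 19.99]:
    print(price, price_ok(price))
```

Only ABC-1234 passes the pattern, and a price of 0 fails, which is the difference between exclusiveMinimum (greater than) and minimum (greater than or equal).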

Frictionless Data Table Schema

Frictionless Data is a set of standards for describing and packaging data. The Table Schema format is particularly well-suited for tabular data validation, with explicit support for column types, constraints, and primary/foreign keys.

Constraint mapping:

Frictionless Feature             Pointblank Method
type                             Schema dtype check
constraints.required: true       col_vals_not_null()
constraints.unique: true         rows_distinct()
constraints.minimum / maximum    col_vals_ge() / col_vals_le()
constraints.enum                 col_vals_in_set()
constraints.pattern              col_vals_regex()
primaryKey                       col_vals_not_null() + rows_distinct()
foreignKeys                      ⚠ Warning (cross-table not yet supported)

Importing a standalone Table Schema:

inventory_schema = {
    "fields": [
        {"name": "item_id", "type": "integer", "constraints": {"required": True, "unique": True}},
        {"name": "name", "type": "string", "constraints": {"required": True}},
        {"name": "quantity", "type": "integer", "constraints": {"minimum": 0}},
        {"name": "warehouse", "type": "string", "constraints": {"enum": ["NYC", "LAX", "ORD"]}},
    ],
    "primaryKey": "item_id",
}

result = pb.import_contract(inventory_schema, format="frictionless")
print(result.summary())
Contract Import Summary
  Format: frictionless
  Columns detected: 4
  Constraints mapped: 7
  Coverage: 100%

Notice how the primaryKey field generates both a not-null check and a uniqueness check for item_id. This is the correct semantic interpretation: a primary key must always be present and must uniquely identify each row. The field-level constraints.required and constraints.unique also contribute their own checks, so the adapter deduplicates where appropriate.
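The primary-key semantics can be illustrated without any library. The sketch below is a simplification under stated assumptions: primary_key_problems() is a hypothetical helper, it works on a plain list of key values, and it reports only the later occurrences of a duplicate, which may differ from how Pointblank's rows_distinct() counts failing rows.

```python
def primary_key_problems(values: list) -> dict:
    """Report row positions that violate the two primary-key rules."""
    # Rule 1 (not null): every row must carry a key.
    null_rows = [i for i, v in enumerate(values) if v is None]

    # Rule 2 (distinct): no key may repeat. Only repeats are reported here.
    seen = set()
    duplicate_rows = []
    for i, v in enumerate(values):
        if v is None:
            continue
        if v in seen:
            duplicate_rows.append(i)
        seen.add(v)

    return {"null_rows": null_rows, "duplicate_rows": duplicate_rows}


print(primary_key_problems([101, 102, 102, None, 104]))
print(primary_key_problems([101, 102, 103]))
```

A key column passes only when both lists are empty, which is why the mapping produces two separate checks.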

Importing from a Data Package:

Data Packages bundle multiple resources (tables) together. You can select which resource to import by name or index:

data_package = {
    "name": "ecommerce-data",
    "resources": [
        {
            "name": "customers",
            "path": "customers.csv",
            "schema": {
                "fields": [
                    {"name": "id", "type": "integer", "constraints": {"required": True}},
                    {"name": "email", "type": "string"},
                ],
            },
        },
        {
            "name": "orders",
            "path": "orders.csv",
            "schema": {
                "fields": [
                    {"name": "order_id", "type": "integer", "constraints": {"required": True}},
                    {"name": "amount", "type": "number", "constraints": {"minimum": 0}},
                ],
            },
        },
    ],
}

# Import a specific resource by name
orders_import = pb.import_contract(data_package, format="frictionless", resource="orders")
print(f"Columns: {[name for name, _ in orders_import.columns]}")
print(f"Constraints: {len(orders_import.constraints)}")
Columns: ['order_id', 'amount']
Constraints: 2

The resource= parameter accepts either a string (the resource name) or an integer (the resource index). When omitted, the first resource in the package is used. This makes it straightforward to work with multi-table data packages where each table has its own schema definition.
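The selection rule described here (name, index, or first resource by default) could look roughly like the sketch below. select_resource() is a hypothetical helper written for illustration; how Pointblank reports a missing resource is not specified above, so the KeyError is an assumption.

```python
def select_resource(package: dict, resource=None) -> dict:
    """Pick one resource from a Data Package descriptor."""
    resources = package["resources"]
    if resource is None:
        # No selector given: fall back to the first resource.
        return resources[0]
    if isinstance(resource, int):
        # Integer selector: treat it as a position in the list.
        return resources[resource]
    # String selector: match on the resource's "name" key.
    for entry in resources:
        if entry.get("name") == resource:
            return entry
    raise KeyError(f"No resource named {resource!r}")  # assumed error behavior


package = {
    "name": "ecommerce-data",
    "resources": [
        {"name": "customers", "schema": {"fields": [{"name": "id", "type": "integer"}]}},
        {"name": "orders", "schema": {"fields": [{"name": "order_id", "type": "integer"}]}},
    ],
}

print(select_resource(package)["name"])
print(select_resource(package, "orders")["name"])
print(select_resource(package, 1)["name"])
```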

Output Options

Once you have a ContractImport, you can use it in several ways depending on your workflow.

Direct Validation with .to_validate()

The most common path is to get a Validate object, pass your data, and run it:

schema = {
    "type": "object",
    "properties": {
        "temperature": {"type": "number", "minimum": -50, "maximum": 60},
        "humidity": {"type": "number", "minimum": 0, "maximum": 100},
        "station_id": {"type": "string"},
    },
    "required": ["temperature", "humidity", "station_id"],
}

weather_data = pl.DataFrame(
    {
        "temperature": [22.5, 18.3, -5.1, 35.0, 28.7],
        "humidity": [45.0, 78.2, 30.0, 92.5, 55.0],
        "station_id": ["WX-001", "WX-002", "WX-003", "WX-001", "WX-004"],
    }
)

imported = pb.import_contract(schema, format="json_schema")
imported.to_validate(data=weather_data).interrogate()
Pointblank Validation
2026-06-13|17:47:12
Polars

STEP  METHOD               COLUMNS      VALUES  UNITS  PASS      FAIL
1     col_schema_match()   -            SCHEMA  1      1 (1.00)  0 (0.00)
2     col_vals_not_null()  temperature  -       5      5 (1.00)  0 (0.00)
3     col_vals_ge()        temperature  -50     5      5 (1.00)  0 (0.00)
4     col_vals_le()        temperature  60      5      5 (1.00)  0 (0.00)
5     col_vals_not_null()  humidity     -       5      5 (1.00)  0 (0.00)
6     col_vals_ge()        humidity     0       5      5 (1.00)  0 (0.00)
7     col_vals_le()        humidity     100     5      5 (1.00)  0 (0.00)
8     col_vals_not_null()  station_id   -       5      5 (1.00)  0 (0.00)

Notes

Step 1 (schema_check): Schema validation passed.

Schema Comparison

   TARGET                     EXPECTED
   COLUMN       DATA TYPE     COLUMN       DATA TYPE
1  temperature  Float64    1  temperature  Float64
2  humidity     Float64    2  humidity     Float64
3  station_id   String     3  station_id   String

Supplied Column Schema:
[('temperature', 'Float64'), ('humidity', 'Float64'), ('station_id', 'String')]

Schema Match Settings: COMPLETE, IN ORDER, COLUMN ≠ column, DTYPE ≠ dtype, float ≠ float64

The .to_validate() method returns a fully configured Validate object with all imported constraints already applied as validation steps. You get the familiar validation report showing pass/fail counts for each check. Because the returned object has not been interrogated yet, you can add further validation steps before calling .interrogate().

You can also pass additional arguments to the Validate constructor:

imported.to_validate(
    data=weather_data,
    tbl_name="weather_readings",
    label="Daily sensor check",
    thresholds=pb.Thresholds(warning=0.05, error=0.10),
)

Any keyword argument accepted by the Validate class can be passed through here, including tbl_name, label, thresholds, owner, and consumers. This gives you full control over how the validation is configured without needing to modify the import result.
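To see what fractional thresholds such as warning=0.05 and error=0.10 mean in practice, here is a small arithmetic sketch. It assumes a level is reached when the fraction of failing test units is at or above the threshold; threshold_level() is a hypothetical helper, not a Pointblank function, and the exact boundary behavior should be checked against the Thresholds documentation.

```python
def threshold_level(n_failed: int, n_units: int, warning: float = 0.05, error: float = 0.10) -> str:
    """Classify a step by its fraction of failing test units (assumed rule)."""
    fraction = n_failed / n_units
    # Check the more severe level first so it takes precedence.
    if fraction >= error:
        return "error"
    if fraction >= warning:
        return "warning"
    return "ok"


# 200 hypothetical sensor readings with different numbers of failures
for failed in (7, 12, 25):
    print(f"{failed}/200 = {failed / 200:.1%} -> {threshold_level(failed, 200)}")
```

With these settings, 7 failures out of 200 (3.5%) stay below both levels, 12 (6.0%) reach the warning level, and 25 (12.5%) reach the error level.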

Creating a Reusable Contract with .to_contract()

If you want to store the imported schema as a Pointblank Contract for use in pipelines or repeated validation:

imported = pb.import_contract(inventory_schema, format="frictionless")
contract = imported.to_contract(name="inventory_check", version="1.0.0", owner="warehouse-team")
print(contract)
Contract(name='inventory_check', direction='source', version='1.0.0', schema=<defined>, steps=7)

The resulting Contract can be serialized to YAML, used in Pipeline, or shared with other teams. This is particularly valuable when you want to maintain a stable contract definition that outlives the original external schema file. The Contract object carries all the metadata (version, owner, description) that makes it suitable for team workflows and CI/CD pipelines.

Generating Python Code with .to_python()

When you want to see (or save) the equivalent Pointblank Python code that would be generated from the import:

imported = pb.import_contract(user_schema, format="json_schema")
print(imported.to_python())
import pointblank as pb

validation = (
    pb.Validate(data=data)
    .col_schema_match(schema=pb.Schema(user_id="Int64", email="String", age="Int64", status="String"))
    .col_vals_not_null(columns='user_id')
    .col_vals_not_null(columns='email')
    .col_vals_within_spec(columns='email', spec='email')
    .col_vals_ge(columns='age', value=0)
    .col_vals_le(columns='age', value=150)
    .col_vals_in_set(columns='status', set=['active', 'inactive', 'pending'])
)

validation.interrogate()

The generated code is syntactically valid Python that you can copy directly into a script or notebook, provided you define the data variable it references. It uses the standard Pointblank method-chaining style, making it easy to read and modify. This is especially useful for:

  • understanding exactly what validation steps an import produces
  • generating starter code that you can then customize
  • documentation and onboarding (show teams what their schema “means” in validation terms)

Once you have the generated code, you can paste it into your project and modify it freely. Add extra validation steps, remove checks that don’t apply, or adjust parameter values. The generated code has no dependency on the original schema file, so it serves as a clean handoff point between the schema world and your Python codebase.

Generating YAML with .to_yaml()

For workflows that use Pointblank’s YAML-based validation:

imported = pb.import_contract(user_schema, format="json_schema")
print(imported.to_yaml())
validation:
  steps:
  - col_schema_match:
      schema:
        user_id: Int64
        email: String
        age: Int64
        status: String
  - col_vals_not_null:
      columns: user_id
  - col_vals_not_null:
      columns: email
  - col_vals_within_spec:
      columns: email
      spec: email
  - col_vals_ge:
      columns: age
      value: 0
  - col_vals_le:
      columns: age
      value: 150
  - col_vals_in_set:
      columns: status
      set:
      - active
      - inactive
      - pending

The YAML output follows Pointblank’s validation YAML format, with each constraint appearing as a separate step entry. You can save this output to a file for configuration-driven workflows where validation rules are managed as YAML files rather than Python code: pb.validate_yaml() checks that a configuration is well-formed, and pb.yaml_interrogate() runs it.

Exporting Contracts

The reverse operation (taking a Pointblank Contract or Validate object and writing it out in an external format) is handled by export_contract():

# Create a contract
contract = pb.Contract(
    name="sensor_data",
    schema=pb.Schema(temperature="Float64", humidity="Float64", station_id="String"),
    steps=[
        pb.Step("col_vals_ge", columns="temperature", value=-50),
        pb.Step("col_vals_le", columns="temperature", value=60),
        pb.Step("col_vals_ge", columns="humidity", value=0),
        pb.Step("col_vals_le", columns="humidity", value=100),
        pb.Step("col_vals_not_null", columns=["temperature", "humidity", "station_id"]),
    ],
)

# Export to JSON Schema
json_schema = pb.export_contract(contract, format="json_schema")
json_schema
{'$schema': 'https://json-schema.org/draft/2020-12/schema',
 'type': 'object',
 'title': 'sensor_data',
 'properties': {'temperature': {'type': 'number',
   'minimum': -50,
   'maximum': 60},
  'humidity': {'type': 'number', 'minimum': 0, 'maximum': 100},
  'station_id': {'type': 'string'}},
 'required': ['temperature', 'humidity', 'station_id']}
# Export to Frictionless Table Schema
table_schema = pb.export_contract(contract, format="frictionless")
table_schema
{'fields': [{'name': 'temperature',
   'type': 'number',
   'constraints': {'minimum': -50, 'maximum': 60, 'required': True}},
  {'name': 'humidity',
   'type': 'number',
   'constraints': {'minimum': 0, 'maximum': 100, 'required': True}},
  {'name': 'station_id', 'type': 'string', 'constraints': {'required': True}}]}

Each format produces the output structure that is native to that standard. JSON Schema export creates a valid $schema-annotated document with properties, type, and required fields. Frictionless export creates a Table Schema with fields and constraints entries. Both formats can be fed directly into tools that consume those standards, such as form validators, data catalogs, or documentation generators.

You can also write directly to a file:

pb.export_contract(contract, "output/sensor_data.schema.json", format="json_schema")
pb.export_contract(contract, "output/sensor_data.resource.json", format="frictionless")

When a destination path is provided, the output is written to that file (creating parent directories as needed) and also returned from the function. This makes it convenient to both persist the output and inspect it in the same call.
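The write-and-return behavior described above can be mimicked with the standard library. This is only a sketch of the described behavior: write_schema() is a hypothetical helper, and the indentation and encoding choices are assumptions, not details of export_contract().

```python
import json
import tempfile
from pathlib import Path


def write_schema(schema: dict, destination: str) -> dict:
    """Write a schema dict as JSON, creating parent directories, and return it."""
    path = Path(destination)
    path.parent.mkdir(parents=True, exist_ok=True)  # create "output/" if missing
    path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    return schema


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "output" / "sensor_data.schema.json"
    returned = write_schema(
        {"type": "object", "properties": {"station_id": {"type": "string"}}},
        str(target),
    )
    # Reading the file back gives the same structure that was returned.
    reloaded = json.loads(target.read_text(encoding="utf-8"))
    print(reloaded == returned)
```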

Round-Trip Fidelity

Importing a schema and then exporting it back should produce an equivalent result. This is important for workflows where you maintain schemas in an external format but want to validate with Pointblank:

# Start with a JSON Schema
original = {
    "type": "object",
    "properties": {
        "score": {"type": "integer", "minimum": 0, "maximum": 100},
        "grade": {"type": "string", "enum": ["A", "B", "C", "D", "F"]},
    },
    "required": ["score"],
}

# Import → Contract → Export
imported = pb.import_contract(original, format="json_schema")
contract = imported.to_contract(name="grades")
exported = pb.export_contract(contract, format="json_schema")

# The exported schema preserves the constraints
print(f"Original constraints on 'score': minimum={original['properties']['score']['minimum']}, "
      f"maximum={original['properties']['score']['maximum']}")
print(f"Exported constraints on 'score': minimum={exported['properties']['score'].get('minimum')}, "
      f"maximum={exported['properties']['score'].get('maximum')}")
Original constraints on 'score': minimum=0, maximum=100
Exported constraints on 'score': minimum=0, maximum=100

Round-trip fidelity is tested as part of Pointblank’s test suite. The general guarantee is that any constraint that can be expressed in both Pointblank and the target format will survive the round trip. Constraints that are unique to one format (like JSON Schema’s $ref or Pointblank’s pre= argument) may not survive, but the core numeric bounds, enum checks, null checks, and pattern constraints will always round-trip cleanly.
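If you want to check round-trip fidelity on your own schemas, a comparison along these lines is one option. It is a sketch under assumptions: round_trip_diffs() is a hypothetical helper, it compares only the keywords named above, and the exported and lossy dicts are hand-written stand-ins, not real export_contract() output.

```python
ROUND_TRIP_KEYWORDS = ("minimum", "maximum", "enum", "pattern")


def round_trip_diffs(original: dict, exported: dict) -> list:
    """List (column, keyword) pairs whose value changed or disappeared."""
    diffs = []
    for column, spec in original.get("properties", {}).items():
        exported_spec = exported.get("properties", {}).get(column, {})
        for keyword in ROUND_TRIP_KEYWORDS:
            if keyword in spec and spec[keyword] != exported_spec.get(keyword):
                diffs.append((column, keyword))
    # Null checks travel through the top-level "required" array.
    if sorted(original.get("required", [])) != sorted(exported.get("required", [])):
        diffs.append(("<schema>", "required"))
    return diffs


original = {
    "type": "object",
    "properties": {
        "score": {"type": "integer", "minimum": 0, "maximum": 100},
        "grade": {"type": "string", "enum": ["A", "B", "C", "D", "F"]},
    },
    "required": ["score"],
}
# Hand-written stand-ins for illustration only.
exported = {"$schema": "https://json-schema.org/draft/2020-12/schema", "title": "grades", **original}
lossy = {"type": "object", "properties": {"score": {"type": "integer", "minimum": 0}}, "required": ["score"]}

print(round_trip_diffs(original, exported))
print(round_trip_diffs(original, lossy))
```

Extra keys added on export, such as $schema and title, are ignored; only lost or altered constraints are reported.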

Auto-Detection

When the format is obvious from the source content, you can omit the format= parameter:

# JSON Schema: detected by presence of "$schema" or "type" + "properties"
result = pb.import_contract({"type": "object", "properties": {"x": {"type": "integer"}}})
print(f"Detected: {result.source_format}")

# Frictionless: detected by presence of "fields" list
result = pb.import_contract({"fields": [{"name": "x", "type": "integer"}]})
print(f"Detected: {result.source_format}")
Detected: json_schema
Detected: frictionless

For file-based imports, the extension is also used for detection (.schema.json maps to JSON Schema, .resource.json or .datapackage.json maps to Frictionless). Auto-detection is a convenience feature that works well for common cases. When working with ambiguous files or dict inputs that could match multiple formats, it is best to specify format= explicitly to avoid any possibility of misdetection.
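The detection rules described in this section can be written down as a small heuristic. The sketch below follows only the rules stated here (file extension for paths, key presence for dicts); detect_format() is a hypothetical helper, and the real adapter lookup may apply further checks.

```python
def detect_format(source) -> str:
    """Guess the schema format from a file path or an already-loaded dict."""
    if isinstance(source, str):
        # Paths: decide by extension.
        if source.endswith(".schema.json"):
            return "json_schema"
        if source.endswith((".resource.json", ".datapackage.json")):
            return "frictionless"
        raise ValueError(f"Cannot detect format from path {source!r}; pass format= explicitly")
    # Dicts: decide by characteristic keys.
    if "$schema" in source or ("type" in source and "properties" in source):
        return "json_schema"
    if isinstance(source.get("fields"), list):
        return "frictionless"
    raise ValueError("Cannot detect format from dict; pass format= explicitly")


print(detect_format({"type": "object", "properties": {"x": {"type": "integer"}}}))
print(detect_format({"fields": [{"name": "x", "type": "integer"}]}))
print(detect_format("models/user_profile.schema.json"))
```

In this sketch an input that matches no rule raises an error, which is the situation where an explicit format= argument is the safer choice.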

Combining Imports with Extra Checks

An imported schema gives you a baseline, but you can always add more Pointblank checks on top:

imported = pb.import_contract(user_schema, format="json_schema")

# Start from the import but add custom checks
validation = (
    imported
    .to_validate(data=users, tbl_name="enriched_check")
    .col_vals_regex(columns="email", pattern=r".*\.(com|io|org|net|co)$")
    .rows_distinct(columns_subset="user_id")
    .interrogate()
)

validation
Pointblank Validation
2026-06-13|17:47:12
Polars  enriched_check

STEP  METHOD                  COLUMNS  VALUES                     UNITS  PASS      FAIL
1     col_schema_match()      -        SCHEMA                     1      1 (1.00)  0 (0.00)
2     col_vals_not_null()     user_id  -                          5      5 (1.00)  0 (0.00)
3     col_vals_not_null()     email    -                          5      5 (1.00)  0 (0.00)
4     col_vals_within_spec()  email    email                      5      5 (1.00)  0 (0.00)
5     col_vals_ge()           age      0                          5      5 (1.00)  0 (0.00)
6     col_vals_le()           age      150                        5      5 (1.00)  0 (0.00)
7     col_vals_in_set()       status   active, inactive, pending  5      5 (1.00)  0 (0.00)
8     col_vals_regex()        email    .*\.(com|io|org|net|co)$   5      5 (1.00)  0 (0.00)
9     rows_distinct()         user_id  -                          5      5 (1.00)  0 (0.00)

Notes

Step 1 (schema_check): Schema validation passed.

Schema Comparison

   TARGET                 EXPECTED
   COLUMN   DATA TYPE     COLUMN   DATA TYPE
1  user_id  Int64      1  user_id  Int64
2  email    String     2  email    String
3  age      Int64      3  age      Int64
4  status   String     4  status   String

Supplied Column Schema:
[('user_id', 'Int64'), ('email', 'String'), ('age', 'Int64'), ('status', 'String')]

Schema Match Settings: COMPLETE, IN ORDER, COLUMN ≠ column, DTYPE ≠ dtype, float ≠ float64

This pattern works well when the external schema covers structural and type constraints, but your team has additional business rules that only make sense in the Pointblank context. The imported constraints form the foundation, and your additional .col_vals_*() or .rows_*() calls layer on top. Because .to_validate() returns a standard Validate object, you have full access to the entire Pointblank API for adding checks, setting thresholds, or attaching actions.

Migration from Other Tools

A key use case for import_contract() is migration: bringing existing validation definitions from other tools into Pointblank without manual rewriting.

Coming from JSON Schema

If your team uses JSON Schema for API validation and you want the same rules applied to DataFrames:

# Your existing JSON Schema (maybe generated by your API framework)
result = pb.import_contract("api/schemas/user.schema.json")

# Now use it for DataFrame validation in your data pipeline
validation = result.to_validate(data=raw_users_df).interrogate()

This approach is particularly powerful when your API team already maintains JSON Schema definitions for request/response validation. Those same schemas can now serve double duty: validating API payloads at the service boundary and validating the resulting DataFrames in your analytics pipeline. You get consistent enforcement across both layers without writing the rules twice.

Coming from Frictionless

If you have data packages from open data sources or research datasets:

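# Import from an existing data package descriptor
# (format= is explicit because the file name does not end in .datapackage.json)
result = pb.import_contract("data/datapackage.json", format="frictionless", resource="observations")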

# Validate the actual CSV data against the declared schema
validation = result.to_validate(data=observations_df).interrogate()

Frictionless Data Packages are common in open data portals, government datasets, and academic research repositories. By importing their Table Schemas directly, you can validate downloaded data against its declared structure without needing to manually inspect the descriptor file and rewrite each constraint. This is especially valuable when working with unfamiliar datasets where the schema descriptor is your primary documentation of what the data should contain.

Generating a Starting Point

Even if you don’t plan to keep using the external format, importing is a great way to bootstrap a Pointblank contract:

# Import from your existing schema
imported = pb.import_contract("legacy_schema.json", format="json_schema")

# Save as a YAML contract you'll maintain going forward
contract = imported.to_contract(name="my_table", version="1.0.0")
contract.to_yaml("contracts/my_table.yaml")

Now you have a Pointblank-native contract that you can extend and evolve independently of the original source. You can add new validation steps, adjust thresholds, or incorporate business rules that go beyond what the original schema format could express.

Conclusion

The contract import/export system lets you bridge the gap between external schema definitions and Pointblank’s validation engine. Rather than maintaining duplicate specifications across tools, you can keep your source of truth in whichever format suits your team and import it into Pointblank whenever you need runtime validation. The key points to remember:

  • Use pb.import_contract() to read external schemas and translate them into Pointblank checks
  • The ContractImport object gives you multiple output options: direct validation, reusable contracts, generated Python code, or YAML
  • Check .coverage and .warnings to understand how completely the translation covered your original schema
  • Use pb.export_contract() to write Pointblank contracts back to external formats for sharing with other tools
  • Combine imports with additional Pointblank-specific checks for the most thorough validation coverage

Whether you are migrating from another validation tool, bootstrapping contracts from existing schemas, or maintaining interoperability with external systems, the adapter framework gives you a clean path between external specifications and Pointblank’s validation engine. As new adapters are added in future releases, the same import_contract() interface will continue to work, so code you write today should need no changes to pick up new formats.