has_rows() function

Check whether a table has a certain number of rows.

USAGE

has_rows(count=None, *, min=None, max=None)

The has_rows() function returns a callable that, when given a table, checks whether the row count satisfies a specified condition. It is designed for use with the active= parameter of validation methods so that a validation step can be conditionally skipped when the target table is too small, too large, or empty.

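The callable supports several modes:

has_rows(): the table must have at least one row (that is, it must not be empty).
has_rows(count=N): the table must have exactly N rows.
has_rows(min=N): the table must have at least N rows.
has_rows(max=N): the table must have at most N rows.
has_rows(min=N, max=M): the table must have between N and M rows, inclusive.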

A note is attached to any skipped step in the validation report explaining the row count condition that was not met.

The callable is evaluated against the original table before any pre= processing is applied. This means the row count check is performed on the raw input data, not on a pre-processed version of it.
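The ordering described above can be illustrated with a small plain-Python model. This is only a sketch under stated assumptions, not pointblank's implementation: run_step, keep_large, and at_least_two are hypothetical names, and a table is modeled as a list of row dictionaries.

```python
from typing import Any, Callable, Optional

Table = list[dict[str, Any]]  # assumption: a table modeled as a list of row dicts


def run_step(
    table: Table,
    active: Callable[[Table], bool],
    pre: Optional[Callable[[Table], Table]] = None,
) -> str:
    """Hypothetical model of one validation step; not pointblank's code."""
    # The active= callable is evaluated on the raw table first.
    if not active(table):
        return "skipped"
    # Only an active step goes on to apply pre= before checking values.
    prepared = pre(table) if pre is not None else table
    return f"ran on {len(prepared)} row(s)"


def keep_large(rows: Table) -> Table:
    """Stand-in for a pre= function that filters the table down to one row."""
    return [r for r in rows if r["x"] > 2]


def at_least_two(rows: Table) -> bool:
    """Stand-in for has_rows(min=2)."""
    return len(rows) >= 2


raw = [{"x": 1}, {"x": 2}, {"x": 3}]
result = run_step(raw, active=at_least_two, pre=keep_large)
print(result)
```

In this model the step still runs even though the pre-processed table has a single row, because the row count condition was decided on the three-row raw table.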

Parameters

count : int | None = None

The exact number of rows the table should have. Cannot be used together with min= or max=.

min : int | None = None

The minimum number of rows (inclusive) the table should have. Can be used alone or with max=.

max : int | None = None

The maximum number of rows (inclusive) the table should have. Can be used alone or with min=.

Returns

Callable[[Any], bool]

A callable that accepts a table and returns True if the row count satisfies the specified condition, False otherwise. When the callable returns False, it stores diagnostic information that is used to generate a descriptive note in the validation report.
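To make the return value more concrete, here is a minimal sketch of a callable with this behavior. It is an illustration, not pointblank's source: the RowCountCheck class and its failure attribute are hypothetical, len() is assumed to give the row count, raising ValueError for an invalid argument combination is an assumption, and the "at most" message wording is invented to match the other messages.

```python
from typing import Any, Optional


class RowCountCheck:
    """Hypothetical stand-in for the callable returned by has_rows()."""

    def __init__(self, count: Optional[int] = None, *,
                 min: Optional[int] = None, max: Optional[int] = None) -> None:
        # Assumption: mixing count= with min=/max= raises ValueError.
        if count is not None and (min is not None or max is not None):
            raise ValueError("count= cannot be combined with min= or max=")
        self.count, self.min, self.max = count, min, max
        self.failure: Optional[str] = None  # diagnostic from the last False result

    def __call__(self, table: Any) -> bool:
        n = len(table)  # assumption: len() gives the row count
        self.failure = None
        if self.count is not None and n != self.count:
            self.failure = f"expected exactly {self.count} row(s), found {n}"
        elif self.min is not None and n < self.min:
            self.failure = f"expected at least {self.min} row(s), found {n}"
        elif self.max is not None and n > self.max:
            self.failure = f"expected at most {self.max} row(s), found {n}"
        elif (self.count is None and self.min is None
              and self.max is None and n == 0):
            # With no arguments, the only requirement is a non-empty table.
            self.failure = "table is empty"
        return self.failure is None


check = RowCountCheck(min=100)
print(check([1, 2, 3]), check.failure)
```

Keeping the diagnostic on the callable itself lets whatever invoked it build a note afterwards without re-inspecting the table.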

How It Works

When interrogate() is called, each validation step whose active= parameter is a callable will have that callable evaluated with the target table. If the callable returns False, the step is deactivated and an explanatory note is added to the validation report. The note is locale-aware: if the Validate object was created with a non-English locale=, the note will be translated accordingly.
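The flow described above could be modeled roughly as follows. This is a simplified sketch, not the library's internals: the Step dataclass, its fields, and the note text are hypothetical, and locale-aware translation of notes is not modeled.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Union


@dataclass
class Step:
    """Hypothetical validation step; not pointblank's internal type."""
    name: str
    active: Union[bool, Callable[[Any], bool]] = True
    notes: list[str] = field(default_factory=list)
    ran: bool = False


def interrogate(table: Any, steps: list[Step]) -> list[Step]:
    for step in steps:
        # A callable active= is evaluated with the target table;
        # a plain bool is used as-is.
        is_active = step.active(table) if callable(step.active) else step.active
        if not is_active:
            # Deactivate the step and record why it did not run.
            step.notes.append("Step skipped: active= check returned False.")
            continue
        step.ran = True  # a real implementation would run the validation here
    return steps


rows = [1, 2, 3]
steps = interrogate(rows, [
    Step("col_vals_gt", active=lambda t: len(t) >= 2),    # condition met
    Step("col_vals_gt", active=lambda t: len(t) == 100),  # condition not met
])
for number, step in enumerate(steps, start=1):
    print(number, step.name, step.ran, step.notes)
```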

Examples


Skip a validation step if the table is empty:

import pointblank as pb
import polars as pl

tbl = pl.DataFrame({"x": [1, 2, 3]})
empty_tbl = pl.DataFrame({"x": []})

validation = (
    pb.Validate(data=empty_tbl)
    .col_vals_gt(columns="x", value=0, active=pb.has_rows())
    .interrogate()
)

validation
STEP              COLUMNS  VALUES  TBL  EVAL  UNITS  PASS  FAIL  W  E  C  EXT
1  col_vals_gt()  x        0

Notes

Step 1 (active_check) Step skipped — Row count check failed: table is empty.

The step was skipped because the table has no rows.

Only run a step when the table has at least a minimum number of rows:

validation = (
    pb.Validate(data=tbl)
    .col_vals_gt(columns="x", value=0, active=pb.has_rows(min=100))
    .interrogate()
)

validation
STEP              COLUMNS  VALUES  TBL  EVAL  UNITS  PASS  FAIL  W  E  C  EXT
1  col_vals_gt()  x        0

Notes

Step 1 (active_check) Step skipped — Row count check failed: expected at least 100 row(s), found 3.

The step was skipped because the table has only 3 rows, which is fewer than the required minimum of 100.

You can also check for an exact count or a range:

validation = (
    pb.Validate(data=tbl)
    .col_vals_gt(columns="x", value=0, active=pb.has_rows(count=3))
    .col_vals_gt(columns="x", value=0, active=pb.has_rows(min=2, max=10))
    .col_vals_gt(columns="x", value=0, active=pb.has_rows(count=100))
    .interrogate()
)

validation
STEP              COLUMNS  VALUES  TBL  EVAL  UNITS  PASS      FAIL      W  E  C  EXT
1  col_vals_gt()  x        0                  3      3 (1.00)  0 (0.00)
2  col_vals_gt()  x        0                  3      3 (1.00)  0 (0.00)
3  col_vals_gt()  x        0

Notes

Step 3 (active_check) Step skipped — Row count check failed: expected exactly 100 row(s), found 3.

The first two steps ran because the table has exactly 3 rows (matching count=3) and falls within the range [2, 10]. The third step was skipped because 3 does not equal 100.