import pointblank as pb

validation = (
    pb.Validate(
        data=pb.load_dataset(dataset="small_table", tbl_type="pandas"),
        tbl_name="small_table",
        label="Example for the get_step_report() method",
        thresholds=(1, 0.20, 0.40)
    )
    .col_vals_lt(columns="d", value=3500)
    .col_vals_between(columns="c", left=1, right=8)
    .col_vals_gt(columns="a", value=3)
    .col_vals_regex(columns="b", pattern=r"\d-[a-z]{3}-\d{3}")
    .interrogate()
)

validation
Validate.get_step_report
Validate.get_step_report(i, columns_subset=None, header=':default:', limit=10)
Get a detailed report for a single validation step.
The get_step_report() method returns a report of what went well, or what failed spectacularly, for a given validation step. The report includes a summary of the validation step and a detailed breakdown of the interrogation results. The report is presented as a GT table object, which can be displayed in a notebook or exported to an HTML file.
The get_step_report() method is still experimental. Please report any issues you encounter in the Pointblank issue tracker.
Parameters
i : int
    The step number for which to get the report.
columns_subset : str | list[str] | Column | None = None
    The columns to display in a step report that shows errors in the input table. By default all columns are shown (None). If a subset of columns is desired, we can provide a list of column names, a string with a single column name, a Column object, or a ColumnSelector object. The last two options allow for more flexible column selection using column selector functions. Errors are raised if the column names provided don’t match any columns in the table (when provided as a string or list of strings) or if column selector expressions don’t resolve to any columns.
header : str = ':default:'
    Options for customizing the header of the step report. The default is the ":default:" value which produces a header with a standard title and set of details underneath. Aside from this default, free text can be provided for the header. This will be interpreted as Markdown text and transformed internally to HTML. You can provide one of two templating elements: {title} and {details}. The default header has the template "{title}{details}" so you can easily start from that and modify as you see fit. If you don’t want a header at all, you can set header=None to remove it entirely.
limit : int | None = 10
    The number of rows to display for those validation steps that check values in rows (the col_vals_*() validation steps). The default is 10 rows and the limit can be removed entirely by setting limit=None.
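As a rough illustration of the header option described above, the sketch below builds a custom header string that keeps both templating elements. The helper name make_header and the note text are hypothetical; only the {title} and {details} placeholders and the header= parameter come from the parameter description, and how the result renders was not checked here.

```python
def make_header(note: str) -> str:
    """Return a Markdown header template that prepends `note`.

    `make_header` is a hypothetical helper, not part of Pointblank.
    """
    # The placeholders are kept as literal text (no f-string) so they are
    # passed through unchanged for get_step_report() to fill in.
    return "*" + note + "* {title}{details}"


custom_header = make_header("Nightly check")
print(custom_header)

# Assumed usage, given an interrogated `validation` object:
# validation.get_step_report(i=1, header=custom_header)
```

Starting from the default "{title}{details}" template keeps the standard title and details intact while adding text around them.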
Returns
GT
    A GT table object that represents the detailed report for the validation step.
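Since the return value is a GT table object, one way to export a report to an HTML file is through the as_raw_html() method from Great Tables. The sketch below assumes that method is available on the returned object; save_step_report and the file name are hypothetical.

```python
from pathlib import Path


def save_step_report(report, path):
    """Write a step report to an HTML file and return the file path.

    Hypothetical helper; assumes `report` exposes as_raw_html(), as
    Great Tables GT objects do.
    """
    out = Path(path)
    out.write_text(report.as_raw_html(), encoding="utf-8")
    return out


# Assumed usage, given an interrogated `validation` object:
# save_step_report(validation.get_step_report(i=1), "step_1_report.html")
```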
Examples
Let’s create a validation plan with a few validation steps and interrogate the data. With that, we’ll have a look at the validation reporting table for the entire collection of steps and what went well or what failed.
There were four validation steps performed, where the first three steps had failing test units and the last step had no failures. Let’s get a detailed report for the first step by using the get_step_report() method.
validation.get_step_report(i=1)
[Rendered step report: "Report for Validation Step 1"; 2 / 13 test unit failures in column 6; extract of all 2 rows, with test unit failures in red.]
The report for the first step is displayed. The report includes a summary of the validation step and a detailed breakdown of the interrogation results. The report provides details on what the validation step was checking, the extent to which the test units failed, and a table that shows the failing rows of the data with the column of interest highlighted.
The second and third steps also had failing test units. Reports for those steps can be viewed by using get_step_report(i=2) and get_step_report(i=3) respectively.
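To look at several step reports in one go, those calls can be wrapped in a small loop. This is only a sketch: collect_step_reports is a hypothetical helper, and it assumes nothing beyond the get_step_report(i=...) call used in this section.

```python
def collect_step_reports(validation, steps):
    """Map each step number in `steps` to its step report.

    Hypothetical helper; it only calls validation.get_step_report(i=...).
    """
    return {i: validation.get_step_report(i=i) for i in steps}


# Assumed usage, given an interrogated `validation` object:
# reports = collect_step_reports(validation, steps=[2, 3])
# reports[2]  # shows the report for step 2 when run in a notebook
```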
The final step did not have any failing test units. A report for the final step can still be viewed by using get_step_report(i=4). The report will indicate that every test unit passed and a preview of the target table will be provided.
validation.get_step_report(i=4)
[Rendered step report: "Report for Validation Step 4"; all 13 test units passed in column 4; preview of target table.]
If you’d like to trim down the number of columns shown in the report, you can provide a subset of columns to display. For example, if you only want to see the columns a, b, and c, you can provide those column names as a list.
validation.get_step_report(i=1, columns_subset=["a", "b", "c"])
[Rendered step report: "Report for Validation Step 1"; 2 / 13 test unit failures in column 6 (not shown); extract of all 2 rows.]
If you’d like to increase or reduce the maximum number of rows shown in the report, you can provide a different value for the limit parameter. For example, if you’d like to see only up to 5 rows, you can set limit=5.
validation.get_step_report(i=3, limit=5)
[Rendered step report: "Report for Validation Step 3"; 7 / 13 test unit failures in column 3; extract of first 5 rows, with test unit failures in red.]
Step 3 actually had 7 failing test units, but only the first 5 rows are shown in the step report because of the limit=5 parameter.
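The relationship between the number of failing rows and limit comes down to a little arithmetic. The function below is a hypothetical illustration of the behavior described in this section, not Pointblank's implementation.

```python
from typing import Optional


def rows_shown(n_failing: int, limit: Optional[int] = 10) -> int:
    """Number of failing rows a step report would display."""
    # limit=None removes the cap, so every failing row is shown.
    if limit is None:
        return n_failing
    return min(n_failing, limit)


print(rows_shown(2))              # step 1: 2 failing rows, default limit of 10
print(rows_shown(7, limit=5))     # step 3: 7 failing rows, capped by limit=5
print(rows_shown(7, limit=None))  # step 3 with the cap removed
```

With the default limit of 10, a step with only 2 failing rows shows all of them, which matches the "extract of all 2 rows" wording in the step 1 report.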