Many validation methods have a columns= argument that can be used to specify the columns for validation (e.g., col_vals_gt(), col_vals_regex(), etc.). The starts_with() selector function can be used to select one or more columns that start with some specified text. So if the set of table columns consists of
[name_first, name_last, age, address]
and you want to validate columns that start with "name", you can use columns=starts_with("name"). This will select the name_first and name_last columns.
A validation step is created for every resolved column. Note that if no columns are resolved by starts_with() (or by any other expression using selector functions), the validation step cannot be evaluated during the interrogation process. Such a failure to evaluate is reported in the validation results but it won’t affect the interrogation process overall (i.e., the process won’t be halted).
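The selection rule can be modeled in plain Python. This is only a sketch of the behavior described here, not the library's implementation: resolve_starts_with() is a hypothetical helper, and it assumes that the default case_sensitive=False means letter case is ignored when comparing the prefix.

```python
def resolve_starts_with(columns, text, case_sensitive=False):
    """Return the column names that start with `text`, in table order.

    Hypothetical helper for illustration only; not part of the library.
    """
    if case_sensitive:
        return [c for c in columns if c.startswith(text)]
    # Assumption: a case-insensitive match compares lowercased names.
    return [c for c in columns if c.lower().startswith(text.lower())]


columns = ["name_first", "name_last", "age", "address"]

# Each resolved column would get its own validation step.
selected = resolve_starts_with(columns, "name")
print(selected)  # ['name_first', 'name_last']

# With the default setting, letter case is ignored.
print(resolve_starts_with(columns, "NAME"))  # ['name_first', 'name_last']

# A case-sensitive match on "NAME" resolves no columns, which is the
# situation where a step could not be evaluated.
print(resolve_starts_with(columns, "NAME", case_sensitive=True))  # []
```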
Parameters
text: str
The text that the column name should start with.
case_sensitive: bool = False
Whether the match against column names should be case-sensitive. The default is False.
Returns
StartsWith
A StartsWith object, which can be used to select columns that start with the specified text.
Relevant Validation Methods where starts_with() can be Used
This selector function can be used in the columns= argument of the following validation methods:
col_vals_gt()
col_vals_lt()
col_vals_ge()
col_vals_le()
col_vals_eq()
col_vals_ne()
col_vals_between()
col_vals_outside()
col_vals_in_set()
col_vals_not_in_set()
col_vals_null()
col_vals_not_null()
col_vals_regex()
col_exists()
The starts_with() selector function doesn’t need to be used in isolation. Read the next section for information on how to compose it with other column selectors for more refined ways to select columns.
Additional Flexibility through Composition with Other Column Selectors
The starts_with() function can be composed with other column selectors to create fine-grained column selections. For example, to select columns that start with "a" and end with "e", you can use the starts_with() and ends_with() functions together. The only condition is that the expressions are wrapped in the col() function, like this:
col(starts_with("a") & ends_with("e"))
There are four operators that can be used to compose column selectors:
& (and)
| (or)
- (difference)
~ (not)
The & operator selects columns that satisfy both conditions. The | operator selects columns that satisfy either condition. The - operator selects columns that satisfy the first condition but not the second. The ~ operator selects columns that don’t satisfy the condition. Any number of selector functions can be used, and the operators can be combined to create complex column selection criteria (parentheses can be used to group conditions and control the order of evaluation).
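The four operators behave like set operations over the table's column names. The sketch below models that with plain Python sets; it does not use the library's selector objects, and the column names and the in_table_order() helper are made up for illustration. Python sets have no ~ operator, so the complement is written as a difference from the full set of columns.

```python
columns = ["name", "age", "address", "amount_due", "paid_2021"]

# Stand-ins for starts_with("a") and ends_with("e").
starts_a = {c for c in columns if c.startswith("a")}
ends_e = {c for c in columns if c.endswith("e")}


def in_table_order(selected):
    """List the selected columns in their original table order."""
    return [c for c in columns if c in selected]


both = in_table_order(starts_a & ends_e)         # & (and)
either = in_table_order(starts_a | ends_e)       # | (or)
only_first = in_table_order(starts_a - ends_e)   # - (difference)
not_a = in_table_order(set(columns) - starts_a)  # ~ (not), as a complement

print(both)        # ['age', 'amount_due']
print(either)      # ['name', 'age', 'address', 'amount_due']
print(only_first)  # ['address']
print(not_a)       # ['name', 'paid_2021']
```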
Examples
Suppose we have a table with columns name, paid_2021, paid_2022, and person_id and we’d like to validate that the values in columns that start with "paid" are greater than 10. We can use the starts_with() column selector function to specify the columns that start with "paid" as the columns to validate.
From the results of the validation table we get two validation steps, one for paid_2021 and one for paid_2022. The values in both columns were all greater than 10.
We can also use the starts_with() function in combination with other column selectors (within col()) to create more complex column selection criteria (i.e., to select columns that satisfy multiple conditions). For example, to select columns that start with "paid" and match the text "2023" or "2024", we can use the & operator to combine column selectors.