In Great Tables 0.16.0, we introduced the mask= parameter in loc.body(), which lets users apply conditional styling to body cells on a per-column basis more concisely when working with a Polars DataFrame. This post demonstrates how it works and compares it with the “old-fashioned” approach. The two approaches are:
Leveraging the mask= parameter in loc.body(): Use Polars expressions for streamlined styling.
Utilizing the locations= parameter in GT.tab_style(): Pass a list of loc.body() objects.
Let’s dive in.
Preparations
We’ll use the built-in dataset gtcars to create a Polars DataFrame. Next, we’ll take the columns mfr, drivetrain, year, and hp and pivot them into a small table named df_mini, with one column per year. Finally, we’ll pass df_mini to the GT object to create a table named gt, using drivetrain as the rowname_col= and mfr as the groupname_col=.
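Leveraging the mask= parameter in loc.body()
With mask=, each style is driven by a single Polars expression passed as loc.body(mask=...). To color values above 650 red and fill missing values in the last two columns with light grey, two expressions are enough:
cs.numeric().gt(650): cs.numeric() targets numerical columns, and .gt(650) checks whether the cell value is greater than 650.
pl.nth(-2, -1).is_null(): pl.nth(-2, -1) targets the last two columns, and .is_null() identifies missing values.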
Did you notice that Polars selectors and expressions let us identify columns dynamically at runtime? This is a killer feature when working with pivoted tables, where the column names come from the data itself.
The mask= parameter acts as syntactic sugar, streamlining the process and removing the need to loop through columns manually.
Using mask= Independently
mask= should not be used in combination with the columns= or rows= arguments. Attempting to do so will raise a ValueError.
Utilizing the locations= parameter in GT.tab_style()
A more “old-fashioned” approach involves passing a list of loc.body() objects to the locations= parameter in GT.tab_style():
(
    gt.tab_style(
        style=style.text(color="red"),
        locations=[loc.body(columns=col, rows=pl.col(col).gt(650)) for col in year_cols],
    ).tab_style(
        style=style.fill(color="lightgrey"),
        locations=[loc.body(columns=col, rows=pl.col(col).is_null()) for col in year_cols[-2:]],
    )
)
This approach, though functional, demands additional effort:
Explicitly preparing the column names in advance.
Specifying the columns= and rows= arguments for each loc.body() in the loop.
While effective, it is more verbose and takes more work to write than the first approach.
Wrapping up
With the introduction of the mask= parameter in loc.body(), users can now style the table body in a more vectorized manner, akin to using df.apply() in Pandas, which improves the overall user experience.
We extend our gratitude to @igorcalabria for suggesting this feature in #389 and providing an insightful explanation of its utility. A special thanks to @henryharbeck for providing the second approach.
We hope you enjoy this new functionality as much as we do! Have ideas to make Great Tables even better? Share them with us via GitHub Issues. We’re always amazed by the creativity of our users! See you at the next great table.