----------------------------------------------------------------------
This is the API documentation for the great_tables library.
----------------------------------------------------------------------

## Table Creation

All tables created in Great Tables begin by using `GT()`. With this class, we supply the input data table and some basic options for creating a stub and row groups (with the `rowname_col=` and `groupname_col=` arguments). All GT methods are documented on their own pages.

GT(data: 'Any', rowname_col: 'str | None' = None, groupname_col: 'str | None' = None, auto_align: 'bool' = True, id: 'str | None' = None, locale: 'str | None' = None)

Create a **Great Tables** object.

The `GT()` class creates the `GT` object when provided with tabular data. Using this class is the first step in a typical **Great Tables** workflow. Once we have this object, we can take advantage of numerous methods to get the desired display table for publication.

There are a few table structuring options we can consider at this stage. We can choose to create a table stub containing row labels through the use of the `rowname_col=` argument. Further to this, row groups can be created with the `groupname_col=` argument. Both arguments take the name of a column in the input table data. Typically, the data in the `groupname_col=` column will consist of categorical text whereas the data in the `rowname_col=` column will often contain unique labels (perhaps being unique across the entire table or unique only within the different row groups).

Parameters
----------
data
    A DataFrame object.
rowname_col
    The column name in the input `data=` table to use as row labels to be placed in the table stub.
groupname_col
    The column name in the input `data=` table to use as group labels for generation of row groups.
auto_align
    Optionally have column data be aligned depending on the content contained in each column of the input `data=`.
id
    By default (with `None`) the table ID will be a random, ten-letter string as generated through internal use of the `random_id()` function. A custom table ID can be used here by providing a string.
locale
    An optional locale identifier that can be set as the default locale for all functions that take a `locale` argument. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Returns
-------
GT
    A GT object is returned.

Examples
--------
Let's use the `exibble` dataset for the next few examples, where we'll learn how to make simple output tables with the `GT()` class. The most basic thing to do is to just use `GT()` with the dataset as the input.

```{python}
from great_tables import GT, exibble

GT(exibble)
```

This dataset has the `row` and `group` columns. The former contains unique values that are ideal for labeling rows, and this often happens in what is called the 'stub' (a reserved area that serves to label rows). With the `GT()` class, we can immediately place the contents of the `row` column into the stub column. To do this, we use the `rowname_col=` argument with the appropriate column name.

```{python}
from great_tables import GT, exibble

GT(exibble, rowname_col="row")
```

This sets up a table with a stub: the row labels are placed within the stub column, and a vertical dividing line has been placed on its right-hand side.

The `group` column contains categorical values that are ideal for grouping rows. We can use the `groupname_col=` argument to place these values into row groups.

```{python}
from great_tables import GT, exibble

GT(exibble, rowname_col="row", groupname_col="group")
```

By default, values in the body of a table (and their column labels) are automatically aligned. The alignment is governed by the types of values in a column. If you'd like to disable this form of auto-alignment, the `auto_align=False` option can be taken.
```{python}
from great_tables import GT, exibble

GT(exibble, rowname_col="row", auto_align=False)
```

What you'll get from that is center-alignment of all table body values and all column labels. Note that row labels in the stub are still left-aligned: `auto_align=` has no effect on alignment within the table stub.

Whichever way you generate the initial table object, you can modify it with a huge variety of methods to further customize the presentation. Formatting body cells is commonly done with the family of formatting methods (e.g., `fmt_number()`, `fmt_date()`, etc.). The package supports formatting with internationalization ('i18n' features) and so locale-aware methods all come with a `locale=` argument. To avoid having to use that argument repeatedly, the `GT()` class has its own `locale=` argument. Setting a locale there makes it available to every locale-aware method called on the table. Here's an example of how that works in practice when setting `locale="fr"` in `GT()` prior to using formatting methods:

```{python}
from great_tables import GT, exibble

(
    GT(exibble, rowname_col="row", locale="fr")
    .fmt_currency(columns="currency")
    .fmt_scientific(columns="num")
    .fmt_date(columns="date", date_style="day_month_year")
)
```

In this example, the `fmt_currency()`, `fmt_scientific()`, and `fmt_date()` methods understand that the locale for this table is `"fr"` (French), so the appropriate formatting for that locale is apparent in the `currency`, `num`, and `date` columns.

## Major structural table parts

A table can contain a few useful components for conveying additional information. These include a header (with a title and subtitle), a footer (with source notes), and additional areas for labels (row group labels, column spanner labels, the stubhead label). We can perform styling on targeted table locations with the [`tab_style()`](`great_tables.GT.tab_style`) method.
tab_header(self: 'GTSelf', title: 'str | Text', subtitle: 'str | Text | None' = None, preheader: 'str | list[str] | None' = None) -> 'GTSelf' Add a table header. We can add a table header to the output table that contains a title and even a subtitle with the `tab_header()` method. A table header is an optional table component that is positioned above the column labels. We have the flexibility to use Markdown or HTML formatting for the header's title and subtitle with the [`md()`](`great_tables.md`) and [`html()`](`great_tables.html`) helper functions. Parameters ---------- title Text to be used in the table title. We can elect to use the [`md()`](`great_tables.md`) and [`html()`](`great_tables.html`) helper functions to style the text as Markdown or to retain HTML elements in the text. subtitle Text to be used in the table subtitle. We can elect to use the [`md()`](`great_tables.md`) and [`html()`](`great_tables.html`) helper functions to style the text as Markdown or to retain HTML elements in the text. preheader Optional preheader content that is rendered above the table. Can be supplied as a list of strings. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let's use a small portion of the `gtcars` dataset to create a table. A header part can be added to the table with the `tab_header()` method. We'll add a title and the optional subtitle as well. With the [`md()`](`great_tables.md`) helper function, we can make sure the Markdown formatting is interpreted and transformed. ```{python} from great_tables import GT, md from great_tables.data import gtcars gtcars_mini = gtcars[["mfr", "model", "msrp"]].head(5) ( GT(gtcars_mini) .tab_header( title=md("Data listing from **gtcars**"), subtitle=md("`gtcars` is an R dataset") ) ) ``` We can alternatively use the [`html()`](`great_tables.html`) helper function to retain HTML elements in the text. 
```{python} from great_tables import GT, md, html from great_tables.data import gtcars gtcars_mini = gtcars[["mfr", "model", "msrp"]].head(5) ( GT(gtcars_mini) .tab_header( title=md("Data listing gtcars"), subtitle=html("From gtcars") ) ) ``` tab_spanner(self: 'GTSelf', label: 'str | BaseText', columns: 'SelectExpr' = None, spanners: 'str | list[str] | None' = None, level: 'int | None' = None, id: 'str | None' = None, gather: 'bool' = True, replace: 'bool' = False) -> 'GTSelf' Insert a spanner above a selection of column headings. This part of the table contains, at a minimum, column labels and, optionally, an unlimited number of levels for spanners. A spanner will occupy space over any number of contiguous column labels and it will have an associated label and ID value. This method allows for mapping to be defined by column names, existing spanner ID values, or a mixture of both. The spanners are placed in the order of calling `tab_spanner()` so if a later call uses the same columns in its definition (or even a subset) as the first invocation, the second spanner will be overlaid atop the first. Options exist for forcibly inserting a spanner underneath others (with `level` as space permits) and with `replace`, which allows for full or partial spanner replacement. Parameters ---------- label The text to use for the spanner label. We can optionally use the [`md()`](`great_tables.md`) and [`html()`](`great_tables.html`) helper functions to style the text as Markdown or to retain HTML elements in the text. Alternatively, units notation can be used (see [`define_units()`](`great_tables.define_units`) for details). columns The columns to target. Can either be a single column name or a series of column names provided in a list. spanners The spanners that should be spanned over, should they already be defined. One or more spanner ID values (in quotes) can be supplied here. This argument works in tandem with the `columns` argument. 
level An explicit level to which the spanner should be placed. If not provided, **Great Tables** will choose the level based on the inputs provided within `columns` and `spanners`, placing the spanner label where it will fit. The first spanner level (right above the column labels) is `0`. id The ID for the spanner. When accessing a spanner through the `spanners` argument of `tab_spanner()` the `id` value is used as the reference (and not the `label`). If an `id` is not explicitly provided here, it will be taken from the `label` value. It is advisable to set an explicit `id` value if you plan to access this cell in a later call and the label text is complicated (e.g., contains markup, is lengthy, or both). Finally, when providing an `id` value you must ensure that it is unique across all ID values set for spanner labels (the method will throw an error if `id` isn't unique). gather An option to move the specified `columns` such that they are unified under the spanner. Ordering of the moved-into-place columns will be preserved in all cases. By default, this is set to `True`. replace Should new spanners be allowed to partially or fully replace existing spanners? (This is a possibility if setting spanners at an already populated `level`.) By default, this is set to `False` and an error will occur if some replacement is attempted. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let's create a table using a small portion of the `gtcars` dataset. Over several columns (`hp`, `hp_rpm`, `trq`, `trq_rpm`, `mpg_c`, `mpg_h`) we'll use `tab_spanner()` to add a spanner with the label `"performance"`. This effectively groups together several columns related to car performance under a unifying label. 
```{python}
from great_tables import GT, md
from great_tables.data import gtcars

colnames = ["model", "hp", "hp_rpm", "trq", "trq_rpm", "mpg_c", "mpg_h"]
gtcars_mini = gtcars[colnames].head(10)

(
    GT(gtcars_mini)
    .tab_spanner(
        label="performance",
        columns=["hp", "hp_rpm", "trq", "trq_rpm", "mpg_c", "mpg_h"]
    )
)
```

One cool feature of `tab_spanner()` is its support for multiple levels, allowing you to group columns in various ways. For example, you can create three bottom spanners and a top spanner:

```{python}
(
    GT(gtcars_mini)
    .tab_spanner(
        label="hp",
        columns=["hp", "hp_rpm"],
    )
    .tab_spanner(
        label="trq",
        columns=["trq", "trq_rpm"],
    )
    .tab_spanner(
        label="mpg",
        columns=["mpg_c", "mpg_h"],
    )
    .tab_spanner(
        label="performance",
        columns=["hp", "hp_rpm", "trq", "trq_rpm", "mpg_c", "mpg_h"],
    )
)
```

Did you notice that the spanners stacked automatically? What if you want granular control over where a spanner sits in the hierarchy? **Great Tables** has you covered. By using the `level=` parameter, you can easily adjust the hierarchy of spanners. For example, by specifying `level=0` for the last call of `tab_spanner()`, you can place that spanner at the bottom level (level `0`) instead of the top level (level `2`).

```{python}
(
    GT(gtcars_mini)
    .tab_spanner(
        label="hp",
        columns=["hp", "hp_rpm"],
    )
    .tab_spanner(
        label="performance",
        columns=["hp", "hp_rpm", "trq", "trq_rpm"],
    )
    .tab_spanner(
        label="trq",
        columns=["trq", "trq_rpm"],
        level=0,
    )
)
```

We can also use Markdown formatting for the spanner label. In this example, we'll use `md("*Performance*")` to make the label italicized.

```{python}
(
    GT(gtcars_mini)
    .tab_spanner(
        label=md("*Performance*"),
        columns=["hp", "hp_rpm", "trq", "trq_rpm", "mpg_c", "mpg_h"]
    )
)
```

tab_spanner_delim(self: 'GTSelf', delim: 'str' = '.', columns: 'SelectExpr' = None, split: "Literal['first', 'last']" = 'last', limit: 'int' = -1, reverse: 'bool' = False) -> 'GTSelf'

Insert spanners by splitting column names with a delimiter.
This generates one or more spanners (and sets column labels) by splitting the column name by the specified delimiter text (`delim`) and placing the fragments from top to bottom (i.e., higher-level spanners to the column labels) or vice versa.

For example, with `delim="_"`, the three side-by-side column names `rating_1`, `rating_2`, and `rating_3` will produce a spanner labeled `"rating"` above columns labeled `"1"`, `"2"`, and `"3"`.

Parameters
----------
delim
    Delimiter for splitting. Defaults to `"."`.
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
split
    Should the delimiter splitting occur from the `"last"` instance of the `delim` character or from the `"first"`? The default here uses the `"last"` keyword, and splitting begins at the last instance of the delimiter in the column name. This option only matters when a `limit` value is applied that is less than the number of delimiter characters in a given column name (i.e., the number of splits is not the maximum possible number).
limit
    Limit for splitting. An optional limit to place on the splitting procedure. The default `-1` means that a column name will be split as many times as there are delimiter characters. In other words, the default means there is no limit. If an integer value is given to `limit` then splitting will cease at the iteration given by `limit`. This works in tandem with `split` since we can adjust the number of splits from either the right side (`split="last"`) or left side (`split="first"`) of the column name.
reverse
    Should the order of split names be reversed? By default, this is `False`.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.
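The interplay of `split` and `limit` can be easier to see with plain string operations. The sketch below is not the library's implementation: `split_name()` is a hypothetical helper written for this illustration, and it assumes the semantics described above map onto Python's `str.split()` and `str.rsplit()` with a `maxsplit` argument.

```python
def split_name(name: str, delim: str = ".", split: str = "last", limit: int = -1) -> list[str]:
    """Hypothetical helper mimicking the documented split/limit semantics."""
    if limit < 0:
        # No limit: split at every delimiter.
        return name.split(delim)
    if split == "last":
        # Count splits from the right side of the name.
        return name.rsplit(delim, maxsplit=limit)
    # Count splits from the left side of the name.
    return name.split(delim, maxsplit=limit)


name = "province.NL_ZH.pop"

print(split_name(name))                          # ['province', 'NL_ZH', 'pop']
print(split_name(name, limit=1))                 # ['province.NL_ZH', 'pop']
print(split_name(name, split="first", limit=1))  # ['province', 'NL_ZH.pop']
```

With no limit, both `split` settings give the same fragments, which is why `split` only matters once `limit` is smaller than the number of delimiters in the name.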
Examples
--------
Let's create a table that includes the column names `province.NL_ZH.pop`, `province.NL_ZH.gdp`, `province.NL_NH.pop`, and `province.NL_NH.gdp`. We can see that we have a naming system with a well-defined structure. We start with the more general to the left (`"province"`) and move to the more specific on the right (`"pop"`). If the columns are in the table in this exact order, then things are in an ideal state as the eventual spanner labels will form from this neighboring. When using `tab_spanner_delim()` here with `delim` set as `"."` we get the following table:

```{python}
import polars as pl
import polars.selectors as cs
from great_tables import GT

data = {
    "province.NL_ZH.pop": [1, 2, 3],
    "province.NL_ZH.gdp": [4, 5, 6],
    "province.NL_NH.pop": [7, 8, 9],
    "province.NL_NH.gdp": [10, 11, 12],
}

gt = GT(pl.DataFrame(data))
gt.tab_spanner_delim()
```

```{python}
gt.tab_spanner_delim(limit=1)
```

```{python}
# the name "province" repeats in the styled table,
# because the first spanner is column names
gt.tab_spanner_delim(reverse=True)
```

```{python}
from great_tables.data import towny

lil_towny = (
    pl.DataFrame(towny)
    .select("name", cs.starts_with("population"))
    .head()
)
GT(lil_towny).tab_spanner_delim(delim="_")
```

tab_stub(self: 'GTSelf', rowname_col: 'str | None' = None, groupname_col: 'str | None' = None) -> 'GTSelf'

Add a table stub, to emphasize row and group information.

Parameters
----------
rowname_col
    The column to use for row names. By default, no row names are added.
groupname_col
    The column to use for group names. By default, no group names are added.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
By default, all data is together in the body of the table.

```{python}
from great_tables import GT, exibble

GT(exibble)
```

The table stub separates row names with a vertical line, and puts group names on their own line.
```{python} GT(exibble).tab_stub(rowname_col="row", groupname_col="group") ``` tab_stubhead(self: 'GTSelf', label: 'str | Text') -> 'GTSelf' Add label text to the stubhead. Add a label to the stubhead of a table. The stubhead is the lone element that is positioned left of the column labels, and above the stub. If a stub does not exist, then there is no stubhead (so no change will be made when using this method in that case). We have the flexibility to use Markdown formatting for the stubhead label (through use of the [`md()`](`great_tables.md`) helper function). Furthermore, we can use HTML for the stubhead label so long as we also use the [`html()`](`great_tables.html`) helper function. Parameters ---------- label The text to be used as the stubhead label. We can optionally use the [`md()`](`great_tables.md`) and [`html()`](`great_tables.html`) helper functions to style the text as Markdown or to retain HTML elements in the text. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Using a small subset of the `gtcars` dataset, we can create a table with row labels. Since we have row labels in the stub (via use of `rowname_col="model"` in the `GT()` call) we have a stubhead, so, let's add a stubhead label (`"car"`) with the `tab_stubhead()` method to describe what's in the stub. ```{python} from great_tables import GT from great_tables.data import gtcars gtcars_mini = gtcars[["model", "year", "hp", "trq"]].head(5) ( GT(gtcars_mini, rowname_col="model") .tab_stubhead(label="car") ) ``` We can also use Markdown formatting for the stubhead label. In this example, we'll use `md("*Car*")` to make the label italicized. 
```{python}
from great_tables import GT, md
from great_tables.data import gtcars

(
    GT(gtcars_mini, rowname_col="model")
    .tab_stubhead(label=md("*Car*"))
)
```

tab_footnote(self: 'GTSelf', footnote: 'str | Text', locations: 'Loc | None | list[Loc | None]' = None, placement: 'PlacementOptions' = 'auto') -> 'GTSelf'

Add a table footnote.

`tab_footnote()` can make it a painless process to add a footnote to a table. There are commonly two components to a footnote: (1) a footnote mark that is attached to the targeted cell content, and (2) the footnote text itself that is placed in the table's footer area. Each unit of footnote text in the footer is linked to an element of text or otherwise through the footnote mark.

The footnote system in **Great Tables** presents footnotes in a way that matches the usual expectations, where:

1. footnote marks have a sequence, whether they are symbols, numbers, or letters
2. multiple footnotes can be applied to the same content (and marks are always presented in an ordered fashion)
3. footnote text in the footer is never exactly repeated; **Great Tables** reuses footnote marks where needed throughout the table
4. footnote marks are ordered across the table in a consistent manner (left to right, top to bottom)

Each call of `tab_footnote()` will either add a different footnote to the footer or reuse existing footnote text therein. One or more cells outside of the footer are targeted using location classes from the `loc` module (e.g., `loc.body()`, `loc.column_labels()`, etc.). You can choose to *not* attach a footnote mark by simply not specifying anything in the `locations` argument.

By default, **Great Tables** will choose which side of the text to place the footnote mark via the `placement="auto"` option. You are, however, always free to choose the placement of the footnote mark (either to the `"left"` or `"right"` of the targeted cell content).

Parameters
----------
footnote
    The text to be used in the footnote.
    We can optionally use [`md()`](`great_tables.md`) or [`html()`](`great_tables.html`) to style the text as Markdown or to retain HTML elements in the footnote text.
locations
    The cell or set of cells to be associated with the footnote. Supplying any of the location classes from the `loc` module is a useful way to target the location cells that are associated with the footnote text. These location classes are: `loc.title`, `loc.stubhead`, `loc.spanner_labels`, `loc.column_labels`, `loc.row_groups`, `loc.stub`, `loc.body`, etc. Additionally, we can enclose several location calls within a list if we wish to link the footnote text to different types of locations (e.g., body cells, row group labels, the table title, etc.).
placement
    Where to affix footnote marks to the table content. Two options for this are `"left"` or `"right"`, where the placement is either to the absolute left or right of the cell content. By default, however, this option is set to `"auto"` whereby **Great Tables** will choose a preferred left-or-right placement depending on the alignment of the cell content.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
This example table will be based on the `towny` dataset. We have a header part, with a title and a subtitle. We can choose which of these could be associated with a footnote and in this case it is the subtitle. This table has a stub with row labels and some of those labels are associated with a footnote. So long as row labels are unique, they can be easily used as row identifiers in `loc.stub()`. The third footnote is placed on the `"Density"` column label. Here, changing the order of the `tab_footnote()` calls has no effect on the final table rendering.
```{python}
import polars as pl
from great_tables import GT, loc, md
from great_tables.data import towny

towny_mini = (
    pl.from_pandas(towny)
    .filter(pl.col("csd_type") == "city")
    .select(["name", "density_2021", "population_2021"])
    .top_k(10, by="population_2021")
    .sort("population_2021", descending=True)
)

(
    GT(towny_mini, rowname_col="name")
    .tab_header(
        title=md("The 10 Largest Municipalities in `towny`"),
        subtitle="Population values taken from the 2021 census."
    )
    .fmt_integer()
    .cols_label(
        density_2021="Density",
        population_2021="Population"
    )
    .tab_footnote(
        footnote="Part of the Greater Toronto Area.",
        locations=loc.stub(rows=[
            "Toronto", "Mississauga", "Brampton", "Markham", "Vaughan"
        ])
    )
    .tab_footnote(
        footnote=md("Density is in terms of persons per {{km^2}}."),
        locations=loc.column_labels(columns="density_2021")
    )
    .tab_footnote(
        footnote="Census results made public on February 9, 2022.",
        locations=loc.subtitle()
    )
    .tab_source_note(
        source_note=md("Data taken from the `towny` dataset.")
    )
    .opt_footnote_marks(marks="letters")
)
```

tab_source_note(self: 'GTSelf', source_note: 'str | Text') -> 'GTSelf'

Add a source note citation.

Add a source note to the footer part of the table. A source note is useful for citing the data included in the table. Several can be added to the footer: simply use the `tab_source_note()` method multiple times and they will be inserted in the order provided. We can use Markdown formatting for the note, or, if the table is intended for HTML output, we can include HTML formatting.

Parameters
----------
source_note
    Text to be used in the source note. We can optionally use the [`md()`](`great_tables.md`) or [`html()`](`great_tables.html`) helper functions to style the text as Markdown or to retain HTML elements in the text.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.
Examples
--------
With three columns from the `gtcars` dataset, let's create a new table. We can use the `tab_source_note()` method to add a source note to the table footer. Here we are citing the data source but this method can be used for any text you'd prefer to display in the footer component of the table.

```{python}
from great_tables import GT
from great_tables.data import gtcars

gtcars_mini = gtcars[["mfr", "model", "msrp"]].head(5)

(
    GT(gtcars_mini, rowname_col="model")
    .tab_source_note(source_note="From edmunds.com")
)
```

tab_style(self: 'GTSelf', style: 'CellStyle | list[CellStyle]', locations: 'Loc | list[Loc]') -> 'GTSelf'

Add custom style to one or more cells.

With the `tab_style()` method we can target specific cells and apply styles to them. We do this with the combination of the `style` and `locations` arguments. The `style` argument requires use of styling classes (e.g., `style.fill(color="red")`) and the `locations` argument needs to be an expression of the cells we want to target using location targeting classes (e.g., `loc.body(columns=)`). With the available suite of styling classes, here are some of the styles we can apply:

- the background color of the cell (`style.fill()`'s `color`)
- the cell's text color, font, and size (`style.text()`'s `color`, `font`, and `size`)
- the text style (`style.text()`'s `style`), enabling the use of italics or oblique text
- the text weight (`style.text()`'s `weight`), allowing the use of thin to bold text (the degree of choice is greater with variable fonts)
- the alignment of text (`style.text()`'s `align`)
- cell borders with the `style.borders()` class

Parameters
----------
style
    The styles to use for the cells at the targeted `locations`. The `style.text()`, `style.fill()`, and `style.borders()` classes can be used here to more easily generate valid styles.
locations
    The cell or set of cells to be associated with the style. The `loc.body()` class can be used here to easily target body cell locations.
Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Let's use a small subset of the `exibble` dataset to demonstrate how to use `tab_style()` to target specific cells and apply styles to them. We'll start by creating the `exibble_sm` table (a subset of the `exibble` table) and then use `tab_style()` to apply a light cyan background color to the cells in the `num` column for the first two rows of the table. We'll then apply a larger font size to the cells in the `fctr` column for the last four rows of the table.

```{python}
from great_tables import GT, style, loc, exibble

exibble_sm = exibble[["num", "fctr", "row", "group"]]

(
    GT(exibble_sm, rowname_col="row", groupname_col="group")
    .tab_style(
        style=style.fill(color="lightcyan"),
        locations=loc.body(columns="num", rows=["row_1", "row_2"]),
    )
    .tab_style(
        style=style.text(size="22px"),
        locations=loc.body(columns=["fctr"], rows=[4, 5, 6, 7]),
    )
)
```

Let's use `exibble` once again to create a simple, two-column output table (keeping only the `num` and `currency` columns). With the `tab_style()` method (called thrice), we'll add style to the values already formatted by `fmt_number()` and `fmt_currency()`. In the `style` argument of the first two `tab_style()` calls, we can define multiple types of styling with the `style.fill()` and `style.text()` classes (enclosing these in a list). The cells to be targeted for styling require the use of `loc.body()`, which is used here with different columns being targeted. For the final `tab_style()` call, we demonstrate the use of the `style.borders()` class as the `style` argument, which is employed in conjunction with `loc.body()` to locate the row to be styled.
```{python} from great_tables import GT, style, loc, exibble ( GT(exibble[["num", "currency"]]) .fmt_number(columns="num", decimals=1) .fmt_currency(columns="currency") .tab_style( style=[ style.fill(color="lightcyan"), style.text(weight="bold") ], locations=loc.body(columns="num") ) .tab_style( style=[ style.fill(color="#F9E3D6"), style.text(style="italic") ], locations=loc.body(columns="currency") ) .tab_style( style=style.borders(sides=["top", "bottom"], weight='2px', color="red"), locations=loc.body(rows=[4]) ) ) ``` tab_options(self: 'GTSelf', container_width: 'str | None' = None, container_height: 'str | None' = None, container_padding_x: 'str | None' = None, container_padding_y: 'str | None' = None, container_overflow_x: 'str | None' = None, container_overflow_y: 'str | None' = None, table_width: 'str | None' = None, table_layout: 'str | None' = None, table_margin_left: 'str | None' = None, table_margin_right: 'str | None' = None, table_background_color: 'str | None' = None, table_additional_css: 'str | list[str] | None' = None, table_font_names: 'str | list[str] | None' = None, table_font_size: 'str | None' = None, table_font_weight: 'str | int | float | None' = None, table_font_style: 'str | None' = None, table_font_color: 'str | None' = None, table_font_color_light: 'str | None' = None, table_border_top_style: 'str | None' = None, table_border_top_width: 'str | None' = None, table_border_top_color: 'str | None' = None, table_border_bottom_style: 'str | None' = None, table_border_bottom_width: 'str | None' = None, table_border_bottom_color: 'str | None' = None, table_border_left_style: 'str | None' = None, table_border_left_width: 'str | None' = None, table_border_left_color: 'str | None' = None, table_border_right_style: 'str | None' = None, table_border_right_width: 'str | None' = None, table_border_right_color: 'str | None' = None, heading_background_color: 'str | None' = None, heading_align: 'str | None' = None, heading_title_font_size: 'str | None' = 
None, heading_title_font_weight: 'str | int | float | None' = None, heading_subtitle_font_size: 'str | None' = None, heading_subtitle_font_weight: 'str | int | float | None' = None, heading_padding: 'str | None' = None, heading_padding_horizontal: 'str | None' = None, heading_border_bottom_style: 'str | None' = None, heading_border_bottom_width: 'str | None' = None, heading_border_bottom_color: 'str | None' = None, heading_border_lr_style: 'str | None' = None, heading_border_lr_width: 'str | None' = None, heading_border_lr_color: 'str | None' = None, column_labels_background_color: 'str | None' = None, column_labels_font_size: 'str | None' = None, column_labels_font_weight: 'str | int | float | None' = None, column_labels_text_transform: 'str | None' = None, column_labels_padding: 'str | None' = None, column_labels_padding_horizontal: 'str | None' = None, column_labels_vlines_style: 'str | None' = None, column_labels_vlines_width: 'str | None' = None, column_labels_vlines_color: 'str | None' = None, column_labels_border_top_style: 'str | None' = None, column_labels_border_top_width: 'str | None' = None, column_labels_border_top_color: 'str | None' = None, column_labels_border_bottom_style: 'str | None' = None, column_labels_border_bottom_width: 'str | None' = None, column_labels_border_bottom_color: 'str | None' = None, column_labels_border_lr_style: 'str | None' = None, column_labels_border_lr_width: 'str | None' = None, column_labels_border_lr_color: 'str | None' = None, column_labels_hidden: 'bool | None' = None, row_group_background_color: 'str | None' = None, row_group_font_size: 'str | None' = None, row_group_font_weight: 'str | int | float | None' = None, row_group_text_transform: 'str | None' = None, row_group_padding: 'str | None' = None, row_group_padding_horizontal: 'str | None' = None, row_group_border_top_style: 'str | None' = None, row_group_border_top_width: 'str | None' = None, row_group_border_top_color: 'str | None' = None, 
row_group_border_bottom_style: 'str | None' = None, row_group_border_bottom_width: 'str | None' = None, row_group_border_bottom_color: 'str | None' = None, row_group_border_left_style: 'str | None' = None, row_group_border_left_width: 'str | None' = None, row_group_border_left_color: 'str | None' = None, row_group_border_right_style: 'str | None' = None, row_group_border_right_width: 'str | None' = None, row_group_border_right_color: 'str | None' = None, row_group_as_column: 'bool | None' = None, table_body_hlines_style: 'str | None' = None, table_body_hlines_width: 'str | None' = None, table_body_hlines_color: 'str | None' = None, table_body_vlines_style: 'str | None' = None, table_body_vlines_width: 'str | None' = None, table_body_vlines_color: 'str | None' = None, table_body_border_top_style: 'str | None' = None, table_body_border_top_width: 'str | None' = None, table_body_border_top_color: 'str | None' = None, table_body_border_bottom_style: 'str | None' = None, table_body_border_bottom_width: 'str | None' = None, table_body_border_bottom_color: 'str | None' = None, stub_background_color: 'str | None' = None, stub_font_size: 'str | None' = None, stub_font_weight: 'str | int | float | None' = None, stub_text_transform: 'str | None' = None, stub_border_style: 'str | None' = None, stub_border_width: 'str | None' = None, stub_border_color: 'str | None' = None, stub_row_group_font_size: 'str | None' = None, stub_row_group_font_weight: 'str | int | float | None' = None, stub_row_group_text_transform: 'str | None' = None, stub_row_group_border_style: 'str | None' = None, stub_row_group_border_width: 'str | None' = None, stub_row_group_border_color: 'str | None' = None, data_row_padding: 'str | None' = None, data_row_padding_horizontal: 'str | None' = None, summary_row_background_color: 'str | None' = None, summary_row_text_transform: 'str | None' = None, summary_row_padding: 'str | None' = None, summary_row_padding_horizontal: 'str | None' = None, 
summary_row_border_style: 'str | None' = None, summary_row_border_width: 'str | None' = None, summary_row_border_color: 'str | None' = None, grand_summary_row_background_color: 'str | None' = None, grand_summary_row_text_transform: 'str | None' = None, grand_summary_row_padding: 'str | None' = None, grand_summary_row_padding_horizontal: 'str | None' = None, grand_summary_row_border_style: 'str | None' = None, grand_summary_row_border_width: 'str | None' = None, grand_summary_row_border_color: 'str | None' = None, footnotes_marks: 'str | list[str] | None' = None, source_notes_background_color: 'str | None' = None, source_notes_font_size: 'str | None' = None, source_notes_padding: 'str | None' = None, source_notes_padding_horizontal: 'str | None' = None, source_notes_border_bottom_style: 'str | None' = None, source_notes_border_bottom_width: 'str | None' = None, source_notes_border_bottom_color: 'str | None' = None, source_notes_border_lr_style: 'str | None' = None, source_notes_border_lr_width: 'str | None' = None, source_notes_border_lr_color: 'str | None' = None, source_notes_multiline: 'bool | None' = None, source_notes_sep: 'str | None' = None, row_striping_background_color: 'str | None' = None, row_striping_include_stub: 'bool | None' = None, row_striping_include_table_body: 'bool | None' = None, quarto_disable_processing: 'bool | None' = None) -> 'GTSelf' Modify the table output options. Modify the options available in a table. These options are named by the components, the subcomponents, and the element that can be adjusted. Parameters ---------- container_width The width of the table's container. Can be specified as a single-length character with units of pixels or as a percentage. If provided as a scalar numeric value, it is assumed that the value is given in units of pixels. container_height The height of the table's container. container_padding_x The horizontal padding of the table's container. 
Can be specified as a single-length character with units of pixels or as a percentage. If provided as a scalar numeric value, it is assumed that the value is given in units of pixels. container_padding_y The vertical padding of the table's container. Same rules apply as for `container_padding_x`. container_overflow_x An option to enable scrolling in the horizontal direction when the table content overflows the container dimensions. Using `True` (the default) means that horizontal scrolling is enabled to view the entire table in those directions. With `False`, the table may be clipped if the table width or height exceeds the `container_width`. container_overflow_y An option to enable scrolling in the vertical direction when the table content overflows. Same rules apply as for `container_overflow_x`; the dependency here is that of the table height (`container_height`). table_width The width of the table. Can be specified as a string with units of pixels or as a percentage. If provided as a numeric value, it is assumed that the value is given in units of pixels. table_layout The value for the `table-layout` CSS style in the HTML output context. By default, this is `"fixed"` but another valid option is `"auto"`. table_margin_left The size of the margins on the left of the table within the container. Can be specified as a single-length value with units of pixels or as a percentage. If provided as a numeric value, it is assumed that the value is given in units of pixels. Using `table_margin_left` will overwrite any values set by `table_align`. table_margin_right The size of the margins on the right of the table within the container. Same rules apply as for `table_margin_left`. Using `table_margin_right` will overwrite any values set by `table_align`. table_background_color The background color for the table. A color name or a hexadecimal color code should be provided. table_additional_css Additional CSS that can be added to the table. 
This can be used to add any custom CSS that is not covered by the other options. table_font_names The names of the fonts used for the table. This should be provided as a list of font names. If the first font isn't available, then the next font is tried (and so on). table_font_size The font size for the table. Can be specified as a string with units of pixels or as a percentage. If provided as a numeric value, it is assumed that the value is given in units of pixels. table_font_weight The font weight of the table. Can be a text-based keyword such as `"normal"`, `"bold"`, `"lighter"`, `"bolder"`, or, a numeric value between `1` and `1000`, inclusive. Note that only variable fonts may support the numeric mapping of weight. table_font_style The font style for the table. Can be one of either `"normal"`, `"italic"`, or `"oblique"`. table_font_color The text color used throughout the table. A color name or a hexadecimal color code should be provided. table_font_color_light The text color used throughout the table when the background color is dark. A color name or a hexadecimal color code should be provided. table_border_top_style The style of the table's absolute top border. Can be one of either `"solid"`, `"dotted"`, `"dashed"`, `"double"`, `"groove"`, `"ridge"`, `"inset"`, or `"outset"`. table_border_top_width The width of the table's absolute top border. Can be specified as a string with units of pixels or as a percentage. If provided as a numeric value, it is assumed that the value is given in units of pixels. table_border_top_color The color of the table's absolute top border. A color name or a hexadecimal color code should be provided. table_border_bottom_style The style of the table's absolute bottom border. table_border_bottom_width The width of the table's absolute bottom border. table_border_bottom_color The color of the table's absolute bottom border. table_border_left_style The style of the table's absolute left border. 
table_border_left_width The width of the table's absolute left border. table_border_left_color The color of the table's absolute left border. table_border_right_style The style of the table's absolute right border. table_border_right_width The width of the table's absolute right border. table_border_right_color The color of the table's absolute right border. heading_background_color The background color for the heading. A color name or a hexadecimal color code should be provided. heading_align Controls the horizontal alignment of the heading title and subtitle. We can either use `"center"`, `"left"`, or `"right"`. heading_title_font_size The font size for the heading title element. heading_title_font_weight The font weight of the heading title. heading_subtitle_font_size The font size for the heading subtitle element. heading_subtitle_font_weight The font weight of the heading subtitle. heading_padding The amount of vertical padding to incorporate in the `heading` (title and subtitle). Can be specified as a string with units of pixels or as a percentage. If provided as a numeric value, it is assumed that the value is given in units of pixels. heading_padding_horizontal The amount of horizontal padding to incorporate in the `heading` (title and subtitle). Can be specified as a string with units of pixels or as a percentage. If provided as a numeric value, it is assumed that the value is given in units of pixels. heading_border_bottom_style The style of the header's bottom border. heading_border_bottom_width The width of the header's bottom border. If the `width` of this border is larger, then it will be the visible border. heading_border_bottom_color The color of the header's bottom border. heading_border_lr_style The style of the left and right borders of the `heading` location. heading_border_lr_width The width of the left and right borders of the `heading` location. If the `width` of this border is larger, then it will be the visible border. 
heading_border_lr_color The color of the left and right borders of the `heading` location. column_labels_background_color The background color for the column labels. A color name or a hexadecimal color code should be provided. column_labels_font_size The font size to use for all column labels. column_labels_font_weight The font weight of the table's column labels. column_labels_text_transform The text transformation for the column labels. Either of the `"uppercase"`, `"lowercase"`, or `"capitalize"` keywords can be used. column_labels_padding The amount of vertical padding to incorporate in the `column_labels` (this includes the column spanners). column_labels_padding_horizontal The amount of horizontal padding to incorporate in the `column_labels` (this includes the column spanners). column_labels_vlines_style The style of all vertical lines ('vlines') of the `column_labels`. column_labels_vlines_width The width of all vertical lines ('vlines') of the `column_labels`. column_labels_vlines_color The color of all vertical lines ('vlines') of the `column_labels`. column_labels_border_top_style The style of the top border of the `column_labels` location. column_labels_border_top_width The width of the top border of the `column_labels` location. If the `width` of this border is larger, then it will be the visible border. column_labels_border_top_color The color of the top border of the `column_labels` location. column_labels_border_bottom_style The style of the bottom border of the `column_labels` location. column_labels_border_bottom_width The width of the bottom border of the `column_labels` location. If the `width` of this border is larger, then it will be the visible border. column_labels_border_bottom_color The color of the bottom border of the `column_labels` location. column_labels_border_lr_style The style of the left and right borders of the `column_labels` location. 
column_labels_border_lr_width The width of the left and right borders of the `column_labels` location. If the `width` of this border is larger, then it will be the visible border. column_labels_border_lr_color The color of the left and right borders of the `column_labels` location. column_labels_hidden An option to hide the column labels. If providing `True` then the entire `column_labels` location won't be seen and the table header (if present) will collapse downward. row_group_background_color The background color for the row group labels. A color name or a hexadecimal color code should be provided. row_group_font_weight The font weight for all row group labels present in the table. row_group_font_size The font size to use for all row group labels. row_group_padding The amount of vertical padding to incorporate in the row group labels. row_group_border_top_style The style of the top border of the `row_group` location. row_group_border_top_width The width of the top border of the `row_group` location. If the `width` of this border is larger, then it will be the visible border. row_group_border_top_color The color of the top border of the `row_group` location. row_group_border_bottom_style The style of the bottom border of the `row_group` location. row_group_border_bottom_width The width of the bottom border of the `row_group` location. If the `width` of this border is larger, then it will be the visible border. row_group_border_bottom_color The color of the bottom border of the `row_group` location. row_group_border_left_style The style of the left border of the `row_group` location. row_group_border_left_width The width of the left border of the `row_group` location. If the `width` of this border is larger, then it will be the visible border. row_group_border_left_color The color of the left border of the `row_group` location. row_group_border_right_style The style of the right border of the `row_group` location. 
row_group_border_right_width The width of the right border of the `row_group` location. If the `width` of this border is larger, then it will be the visible border. row_group_border_right_color The color of the right border of the `row_group` location. row_group_as_column An option to render the row group labels as a column. If `True`, then the row group labels will be rendered as a column to the left of the table body. If `False`, then the row group labels will be rendered as a separate row above the grouping of rows. table_body_hlines_style The style of all horizontal lines ('hlines') in the `table_body`. table_body_hlines_width The width of all horizontal lines ('hlines') in the `table_body`. table_body_hlines_color The color of all horizontal lines ('hlines') in the `table_body`. table_body_vlines_style The style of all vertical lines ('vlines') in the `table_body`. table_body_vlines_width The width of all vertical lines ('vlines') in the `table_body`. table_body_vlines_color The color of all vertical lines ('vlines') in the `table_body`. table_body_border_top_style The style of the top border of the `table_body` location. table_body_border_top_width The width of the top border of the `table_body` location. If the `width` of this border is larger, then it will be the visible border. table_body_border_top_color The color of the top border of the `table_body` location. table_body_border_bottom_style The style of the bottom border of the `table_body` location. table_body_border_bottom_width The width of the bottom border of the `table_body` location. If the `width` of this border is larger, then it will be the visible border. table_body_border_bottom_color The color of the bottom border of the `table_body` location. stub_background_color The background color for the stub. A color name or a hexadecimal color code should be provided. stub_font_size The font size to use for all row labels present in the table stub. stub_font_weight The font weight for all row labels present in the table stub. 
stub_text_transform The text transformation for the row labels present in the table stub. stub_border_style The style of the vertical border of the table stub. stub_border_width The width of the vertical border of the table stub. stub_border_color The color of the vertical border of the table stub. stub_row_group_font_size The font size for the row group column in the stub. stub_row_group_font_weight The font weight for the row group column in the stub. stub_row_group_text_transform The text transformation for the row group column in the stub. stub_row_group_border_style The style of the vertical border of the row group column in the stub. stub_row_group_border_width The width of the vertical border of the row group column in the stub. stub_row_group_border_color The color of the vertical border of the row group column in the stub. data_row_padding The amount of vertical padding to incorporate in the body/stub rows. data_row_padding_horizontal The amount of horizontal padding to incorporate in the body/stub rows. source_notes_background_color The background color for the source notes. A color name or a hexadecimal color code should be provided. source_notes_font_size The font size to use for all source note text. source_notes_padding The amount of vertical padding to incorporate in the source notes. source_notes_padding_horizontal The amount of horizontal padding to incorporate in the source notes. source_notes_multiline An option to either put source notes in separate lines (the default, or `True`) or render them as a continuous line of text with `source_notes_sep` providing the separator (by default `" "`) between notes. source_notes_sep The separating characters between adjacent source notes when rendered as a continuous line of text (when `source_notes_multiline` is `False`). The default value is a single space character (`" "`). source_notes_border_bottom_style The style of the bottom border of the `source_notes` location. 
source_notes_border_bottom_width The width of the bottom border of the `source_notes` location. If the `width` of this border is larger, then it will be the visible border. source_notes_border_bottom_color The color of the bottom border of the `source_notes` location. source_notes_border_lr_style The style of the left and right borders of the `source_notes` location. source_notes_border_lr_width The width of the left and right borders of the `source_notes` location. If the `width` of this border is larger, then it will be the visible border. source_notes_border_lr_color The color of the left and right borders of the `source_notes` location. row_striping_background_color The background color for striped table body rows. A color name or a hexadecimal color code should be provided. row_striping_include_stub An option for whether to include the stub when striping rows. row_striping_include_table_body An option for whether to include the table body when striping rows. quarto_disable_processing Whether to disable Quarto table processing. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Using select columns from the `exibble` dataset, let's create a new table with a number of table components added. We can use this object going forward to demonstrate some of the features available in the `tab_options()` method. ```{python} from great_tables import GT, exibble, md gt_tbl = ( GT( exibble[["num", "char", "currency", "row", "group"]], rowname_col="row", groupname_col="group" ) .tab_header( title=md("Data listing from **exibble**"), subtitle=md("`exibble` is a **Great Tables** dataset.") ) .fmt_number(columns="num") .fmt_currency(columns="currency") .tab_source_note(source_note="This is only a subset of the dataset.") ) gt_tbl ``` We can modify the table width to be set as `"100%"`. In effect, this spans the table to entirely fill the content width area. 
This is done with the `table_width` option. ```{python} gt_tbl.tab_options(table_width="100%") ``` With the `table_background_color` option, we can modify the table's background color. Here, we want that to be `"lightcyan"`. ```{python} gt_tbl.tab_options(table_background_color="lightcyan") ``` The data rows of a table typically take up the most physical space but we have some control over the extent of that. With the `data_row_padding` option, it's possible to modify the top and bottom padding of data rows. We'll do just that in the following example, reducing the padding to a value of `"3px"`. ```{python} gt_tbl.tab_options(data_row_padding="3px") ``` The size of the title and the subtitle text in the header of the table can be altered with the `heading_title_font_size` and `heading_subtitle_font_size` options. Here, we'll use the `"small"` and `"x-small"` keyword values. ```{python} gt_tbl.tab_options(heading_title_font_size="small", heading_subtitle_font_size="x-small") ``` ## Formatting column data Columns of data can be formatted with the `fmt_*()` methods. We can specify the rows of these columns quite precisely with the `rows` argument. We get to apply these methods exactly once to each data cell (last call wins). Need to do custom formatting? Use the [`fmt()`](`great_tables.GT.fmt`) method and define your own formatter. The `sub_*()` methods allow you to perform substitution operations and `data_color()` provides a lot of power for colorizing body cells based on their data values. 
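The "last call wins" rule and the role of the `rows` argument can be illustrated with a small plain-Python model. This is only a sketch of the behavior described above, not the library's implementation: the `apply_formatters()` helper and the `(rows, formatter)` call tuples are hypothetical names introduced for illustration.

```python
def apply_formatters(values, calls):
    """Resolve formatting calls for one column; the last matching call wins.

    `calls` is a list of (rows, formatter) pairs, where `rows` is None
    (all rows) or a list of row indices. Hypothetical helper for illustration.
    """
    chosen = {}
    for rows, formatter in calls:
        targets = range(len(values)) if rows is None else rows
        for i in targets:
            chosen[i] = formatter  # a later call replaces an earlier one
    # Cells never targeted by any call fall back to plain string conversion.
    return [chosen[i](v) if i in chosen else str(v) for i, v in enumerate(values)]


values = [0.1111, 2.222, 33.33]
calls = [
    (None, lambda x: f"{x:.1f}"),    # analogous to formatting all rows with one decimal
    ([0, 1], lambda x: f"{x:.3f}"),  # analogous to a later call with rows=[0, 1]
]
formatted = apply_formatters(values, calls)
print(formatted)  # ['0.111', '2.222', '33.3']
```

Rows `0` and `1` are targeted by both calls, so only the second formatter is applied to them; row `2` keeps the formatting from the first call.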
fmt_number(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, decimals: 'int' = 2, n_sigfig: 'int | None' = None, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, use_seps: 'bool' = True, accounting: 'bool' = False, scale_by: 'float' = 1, compact: 'bool' = False, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf' Format numeric values. With numeric values within a table's body cells, we can perform number-based formatting so that the targeted values are rendered with a higher consideration for tabular presentation. Furthermore, there is finer control over numeric formatting with the following options: - decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice of the decimal symbol - digit grouping separators: options to enable/disable digit separators and provide a choice of separator symbol - scaling: we can choose to scale targeted values by a multiplier value - large-number suffixing: larger figures (thousands, millions, etc.) can be autoscaled and decorated with the appropriate suffixes - pattern: option to use a text pattern for decoration of the formatted values - locale-based formatting: providing a locale ID will result in number formatting specific to the chosen locale Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices. decimals The `decimals` value corresponds to the exact number of decimal places to use. A value such as `2.34` can, for example, be formatted with `0` decimal places and it would result in `"2"`. 
With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros can be removed with `drop_trailing_zeros=True`. If you always need `decimals=0`, the [`fmt_integer()`](`great_tables.GT.fmt_integer`) method should be considered. n_sigfig An option to format numbers to *n* significant figures. By default, this is `None` and thus number values will be formatted according to the number of decimal places set via `decimals`. If opting to format according to the rules of significant figures, `n_sigfig` must be a number greater than or equal to `1`. Any values passed to the `decimals` and `drop_trailing_zeros` arguments will be ignored. drop_trailing_zeros A boolean value that allows for removal of trailing zeros (those redundant zeros after the decimal mark). drop_trailing_dec_mark A boolean value that determines whether decimal marks should always appear even if there are no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By default trailing decimal marks are not shown. use_seps The `use_seps` option allows for the use of digit group separators. The type of digit group separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This setting is `True` by default. accounting Whether to use accounting style, which wraps negative numbers in parentheses instead of using a minus sign. scale_by All numeric values will be multiplied by the `scale_by` value before undergoing formatting. Since the `default` value is `1`, no values will be changed unless a different multiplier value is supplied. compact A boolean value that allows for compact formatting of numeric values. Values will be scaled and decorated with the appropriate suffixes (e.g., `1230` becomes `1.23K`, and `1230000` becomes `1.23M`). The `compact` option is `False` by default. pattern A formatting pattern that allows for decoration of the formatted value. 
The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals. sep_mark The string to use as a separator between groups of digits. For example, using `sep_mark=","` with a value of `1000` would result in a formatted value of `"1,000"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`). dec_mark The string to be used as the decimal mark. For example, using `dec_mark=","` with the value `0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`). force_sign Should the positive sign be shown for positive values (effectively showing a sign for all values except zero)? If so, use `True` for this option. The default is `False`, where only negative numbers will display a minus sign. locale An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France). Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Adapting output to a specific `locale` -------------------------------------- This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid locale ID here means separator and decimal marks will be correct for the given locale. Should any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's preferred values. Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument). Examples -------- Let's use the `exibble` dataset to create a table. 
With the `fmt_number()` method, we'll format the `num` column to have three decimal places (with `decimals=3`) and omit the use of digit separators (with `use_seps=False`). ```{python} from great_tables import GT, exibble ( GT(exibble) .fmt_number(columns="num", decimals=3, use_seps=False) ) ``` fmt_integer(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, use_seps: 'bool' = True, scale_by: 'float' = 1, accounting: 'bool' = False, compact: 'bool' = False, pattern: 'str' = '{x}', sep_mark: 'str' = ',', force_sign: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf' Format values as integers. With numeric values in one or more table columns, we can perform number-based formatting so that the targeted values are always rendered as integer values. We can have fine control over integer formatting with the following options: - digit grouping separators: options to enable/disable digit separators and provide a choice of separator symbol - scaling: we can choose to scale targeted values by a multiplier value - large-number suffixing: larger figures (thousands, millions, etc.) can be autoscaled and decorated with the appropriate suffixes - pattern: option to use a text pattern for decoration of the formatted values - locale-based formatting: providing a locale ID will result in number formatting specific to the chosen locale Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices. use_seps The `use_seps` option allows for the use of digit group separators. The type of digit group separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This setting is `True` by default. 
scale_by All numeric values will be multiplied by the `scale_by` value before undergoing formatting. Since the `default` value is `1`, no values will be changed unless a different multiplier value is supplied. accounting Whether to use accounting style, which wraps negative numbers in parentheses instead of using a minus sign. compact A boolean value that allows for compact formatting of numeric values. Values will be scaled and decorated with the appropriate suffixes (e.g., `1230` becomes `1K`, and `1230000` becomes `1M`). The `compact` option is `False` by default. pattern A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals. sep_mark The string to use as a separator between groups of digits. For example, using `sep_mark=","` with a value of `1000` would result in a formatted value of `"1,000"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`). force_sign Should the positive sign be shown for positive values (effectively showing a sign for all values except zero)? If so, use `True` for this option. The default is `False`, where only negative numbers will display a minus sign. locale An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France). Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Adapting output to a specific `locale` -------------------------------------- This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid locale ID here means separator marks will be correct for the given locale. 
Should any value be provided in `sep_mark`, it will be overridden by the locale's preferred value. Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument). Examples -------- For this example, we'll use the `exibble` dataset as the input table. With the `fmt_integer()` method, we'll format the `num` column as integer values having no digit separators (with the `use_seps=False` option). ```{python} from great_tables import GT, exibble ( GT(exibble) .fmt_integer(columns="num", use_seps=False) ) ``` fmt_scientific(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, decimals: 'int' = 2, n_sigfig: 'int | None' = None, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, scale_by: 'float' = 1, exp_style: 'str' = 'x10n', pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign_m: 'bool' = False, force_sign_n: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf' Format values to scientific notation. With numeric values in a table, we can perform formatting so that the targeted values are rendered in scientific notation, where extremely large or very small numbers can be expressed in a more practical fashion. Here, numbers are written in the form of a mantissa (`m`) and an exponent (`n`) with the construction *m* x 10^*n* or *m*E*n*. The mantissa component is a number between `1` and `10`. For instance, `2.5 x 10^9` can be used to represent the value 2,500,000,000 in scientific notation. In a similar way, 0.00000012 can be expressed as `1.2 x 10^-7`. Due to its ability to describe numbers more succinctly and its ease of calculation, scientific notation is widely employed in scientific and technical domains. 
We have fine control over the formatting task, with the following options:

- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice of the decimal symbol
- scaling: we can choose to scale targeted values by a multiplier value
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in formatting specific to the chosen locale

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
decimals
    The `decimals` value corresponds to the exact number of decimal places to use. A value such as `2.34` can, for example, be formatted with `0` decimal places and it would result in `"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros can be removed with `drop_trailing_zeros=True`.
n_sigfig
    An option to format numbers to *n* significant figures. By default, this is `None` and thus number values will be formatted according to the number of decimal places set via `decimals`. If opting to format according to the rules of significant figures, `n_sigfig` must be a number greater than or equal to `1`. Any values passed to the `decimals` and `drop_trailing_zeros` arguments will be ignored.
drop_trailing_zeros
    A boolean value that allows for removal of trailing zeros (those redundant zeros after the decimal mark).
drop_trailing_dec_mark
    A boolean value that determines whether decimal marks should always appear even if there are no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By default trailing decimal marks are not shown.
scale_by
    All numeric values will be multiplied by the `scale_by` value before undergoing formatting. Since the default value is `1`, no values will be changed unless a different multiplier value is supplied.
exp_style
    Style of formatting to use for the scientific notation formatting. By default this is `"x10n"` but other options include using a single letter (e.g., `"e"`, `"E"`, etc.), a letter followed by a `"1"` to signal a minimum digit width of one, or `"low-ten"` for using a stylized `"10"` marker.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
dec_mark
    The string to be used as the decimal mark. For example, using `dec_mark=","` with the value `0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`).
force_sign_m
    Should the plus sign be shown for positive values of the mantissa (first component)? This would effectively show a sign for all values except zero on the first numeric component of the notation. If so, use `True`. The default is `False`, where only negative numbers will display a sign.
force_sign_n
    Should the plus sign be shown for positive values of the exponent (second component)? This would effectively show a sign for all values except zero on the second numeric component of the notation. If so, use `True`. The default is `False`, where only negative numbers will display a sign.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid locale ID here means separator and decimal marks will be correct for the given locale. Should a value be provided in `dec_mark`, it will be overridden by the locale's preferred value.

Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument).

Examples
--------
For this example, we'll use the `exibble` dataset as the input table. With the `fmt_scientific()` method, we'll format the `num` column to contain values in scientific formatting.

```{python}
from great_tables import GT, exibble

(
    GT(exibble)
    .fmt_scientific(columns="num")
)
```

fmt_engineering(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, decimals: 'int' = 2, n_sigfig: 'int | None' = None, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, scale_by: 'float' = 1, exp_style: 'str' = 'x10n', pattern: 'str' = '{x}', dec_mark: 'str' = '.', force_sign_m: 'bool' = False, force_sign_n: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'

Format values to engineering notation.

With numeric values in a table, we can perform formatting so that the targeted values are rendered in engineering notation, where numbers are written in the form of a mantissa (`m`) and an exponent (`n`). When combined, the construction is either of the form *m* x 10^*n* or *m*E*n*. The mantissa is a number between `1` and `1000` and the exponent is a multiple of `3`. For example, the number `0.0000345` can be written in engineering notation as `34.50 x 10^-6`.
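The rule that the exponent must be a multiple of `3` can be sketched in a few lines of plain Python. The helper `to_engineering_parts()` below is a hypothetical name used only for illustration; it is not how **Great Tables** is implemented and is not part of its API.

```python
import math


def to_engineering_parts(value: float) -> tuple[float, int]:
    """Split a value into (mantissa, exponent) where the exponent is a multiple of 3.

    Hypothetical helper for illustration only; not part of great_tables.
    """
    if value == 0:
        return 0.0, 0
    # Power of ten of the leading digit (the scientific-notation exponent).
    sci_exponent = math.floor(math.log10(abs(value)))
    # Round that exponent down to the nearest multiple of 3.
    exponent = 3 * (sci_exponent // 3)
    # The mantissa then has one to three digits before the decimal mark.
    mantissa = value / 10**exponent
    return mantissa, exponent


m, n = to_engineering_parts(0.0000345)
print(f"{m:.2f} x 10^{n}")  # 34.50 x 10^-6
```

Rounding the exponent down (rather than to the nearest multiple) is what keeps the mantissa at or above `1`.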
This notation helps to simplify calculations and makes it easier to compare numbers that are on very different scales. Engineering notation is particularly useful as it aligns with SI prefixes (e.g., *milli-*, *micro-*, *kilo-*, *mega-*). For instance, numbers in engineering notation with exponent `-3` correspond to milli-units, while those with exponent `6` correspond to mega-units.

We have fine control over the formatting task, with the following options:

- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice of the decimal symbol
- scaling: we can choose to scale targeted values by a multiplier value
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in formatting specific to the chosen locale

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
decimals
    The `decimals` value corresponds to the exact number of decimal places to use. A value such as `2.34` can, for example, be formatted with `0` decimal places and it would result in `"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros can be removed with `drop_trailing_zeros=True`.
n_sigfig
    An option to format numbers to *n* significant figures. By default, this is `None` and thus number values will be formatted according to the number of decimal places set via `decimals`. If opting to format according to the rules of significant figures, `n_sigfig` must be a number greater than or equal to `1`. Any values passed to the `decimals` and `drop_trailing_zeros` arguments will be ignored.
drop_trailing_zeros
    A boolean value that allows for removal of trailing zeros (those redundant zeros after the decimal mark).
drop_trailing_dec_mark
    A boolean value that determines whether decimal marks should always appear even if there are no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By default trailing decimal marks are not shown.
scale_by
    All numeric values will be multiplied by the `scale_by` value before undergoing formatting. Since the default value is `1`, no values will be changed unless a different multiplier value is supplied.
exp_style
    Style of formatting to use for the engineering notation formatting. By default this is `"x10n"` but other options include using a single letter (e.g., `"e"`, `"E"`, etc.), a letter followed by a `"1"` to signal a minimum digit width of one, or `"low-ten"` for using a stylized `"10"` marker.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
dec_mark
    The string to be used as the decimal mark. For example, using `dec_mark=","` with the value `0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`).
force_sign_m
    Should the plus sign be shown for positive values of the mantissa (first component)? This would effectively show a sign for all values except zero on the first numeric component of the notation. If so, use `True`. The default is `False`, where only negative numbers will display a sign.
force_sign_n
    Should the plus sign be shown for positive values of the exponent (second component)? This would effectively show a sign for all values except zero on the second numeric component of the notation. If so, use `True`. The default is `False`, where only negative numbers will display a sign.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid locale ID here means decimal marks will be correct for the given locale. Should a value be provided in `dec_mark`, it will be overridden by the locale's preferred value.

Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument).

Examples
--------
With numeric values in a table, we can perform formatting so that the targeted values are rendered in engineering notation. For example, the number `0.0000345` can be written in engineering notation as `34.50 x 10^-6`.

```{python}
import polars as pl
from great_tables import GT

numbers_df = pl.DataFrame({
    "numbers": [0.0000345, 3450, 3450000]
})

GT(numbers_df).fmt_engineering()
```

Notice that in each case, the exponent is a multiple of `3`.

Let's define a DataFrame that contains two columns of values (one small and one large). After creating a simple table with `GT()`, we'll call `fmt_engineering()` on both columns.

```{python}
small_large_df = pl.DataFrame({
    "small": [10**-i for i in range(12, 0, -1)],
    "large": [10**i for i in range(1, 13)]
})

GT(small_large_df).fmt_engineering()
```

Notice that within the form of *m* x 10^*n*, the *n* values move in steps of 3 (away from 0), and *m* values can have 1-3 digits before the decimal.
Further to this, any values where *n* is 0 result in a display of only *m* (the first two values in the `large` column demonstrate this).

Engineering notation expresses values so that they align to certain SI prefixes. Here is a table that compares select SI prefixes and their symbols to decimal and engineering-notation representations of the key numbers.

```{python}
import polars as pl
from great_tables import GT

prefixes_df = pl.DataFrame({
    "name": [
        "peta", "tera", "giga", "mega", "kilo",
        None,
        "milli", "micro", "nano", "pico", "femto"
    ],
    "symbol": [
        "P", "T", "G", "M", "k",
        None,
        "m", "μ", "n", "p", "f"
    ],
    "decimal": [float(10**i) for i in range(15, -18, -3)],
})

prefixes_df = prefixes_df.with_columns(
    engineering=pl.col("decimal")
)

(
    GT(prefixes_df)
    .fmt_number(columns="decimal", n_sigfig=1)
    .fmt_engineering(columns="engineering")
    .sub_missing()
)
```

fmt_percent(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, decimals: 'int' = 2, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, scale_values: 'bool' = True, use_seps: 'bool' = True, accounting: 'bool' = False, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, placement: 'str' = 'right', incl_space: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'

Format values as a percentage.

With numeric values in a **gt** table, we can perform percentage-based formatting. It is assumed the input numeric values are proportional values and, in this case, the values will be automatically multiplied by `100` before decorating with a percent sign (the other case is accommodated through setting `scale_values` to `False`). For more control over percentage formatting, we can use the following options:

- percent sign placement: the percent sign can be placed after or before the values and a space can be inserted between the symbol and the value.
- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice of the decimal symbol
- digit grouping separators: options to enable/disable digit separators and provide a choice of separator symbol
- value scaling toggle: choose to disable automatic value scaling in the situation that values are already scaled coming in (and just require the percent symbol)
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in number formatting specific to the chosen locale

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
decimals
    The `decimals` value corresponds to the exact number of decimal places to use. A value such as `2.34` can, for example, be formatted with `0` decimal places and it would result in `"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros can be removed with `drop_trailing_zeros=True`.
drop_trailing_zeros
    A boolean value that allows for removal of trailing zeros (those redundant zeros after the decimal mark).
drop_trailing_dec_mark
    A boolean value that determines whether decimal marks should always appear even if there are no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By default trailing decimal marks are not shown.
scale_values
    Should the values be scaled through multiplication by 100? By default this scaling is performed since the expectation is that incoming values are usually proportional. Setting to `False` signifies that the values are already scaled and require only the percent sign when formatted.
use_seps
    The `use_seps` option allows for the use of digit group separators. The type of digit group separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This setting is `True` by default.
accounting
    Whether to use accounting style, which wraps negative numbers in parentheses instead of using a minus sign.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
sep_mark
    The string to use as a separator between groups of digits. For example, using `sep_mark=","` with a value of `1000` would result in a formatted value of `"1,000"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`).
dec_mark
    The string to be used as the decimal mark. For example, using `dec_mark=","` with the value `0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`).
force_sign
    Should the positive sign be shown for positive values (effectively showing a sign for all values except zero)? If so, use `True` for this option. The default is `False`, where only negative numbers will display a minus sign.
placement
    This option governs the placement of the percent sign. This can be either `"right"` (the default) or `"left"`.
incl_space
    An option for whether to include a space between the value and the percent sign. The default is to not introduce a space character.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid locale ID here means separator and decimal marks will be correct for the given locale. Should any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's preferred values.

Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument).

Examples
--------
Let's use the `towny` dataset as the input table. With the `fmt_percent()` method, we'll format the `pop_change_2016_2021_pct` column to display values as percentages (to two decimal places).

```{python}
from great_tables import GT
from great_tables.data import towny

towny_mini = (
    towny[["name", "pop_change_2016_2021_pct"]]
    .head(10)
)

(GT(towny_mini).fmt_percent("pop_change_2016_2021_pct", decimals=2))
```

fmt_partsper(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, to_units: 'str' = 'per-mille', symbol: 'str' = 'auto', decimals: 'int' = 2, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, scale_values: 'bool' = True, use_seps: 'bool' = True, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, incl_space: 'str | bool' = 'auto', locale: 'str | None' = None) -> 'GTSelf'

Format values as parts-per quantities.

With numeric values in a **gt** table, we can format the values so that they are rendered as parts-per quantities (per mille, ppm, ppb, etc.).
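As a rough illustration of the scaling behind a parts-per rendering, the sketch below multiplies a proportion by the power of ten matching a unit. The `PARTS_PER_FACTORS` table and the `scale_to_partsper()` helper are hypothetical names used only for illustration, not the library's implementation; the factors assume the usual definitions of per mille (1 part in 1,000), ppm (1 part in 1,000,000), and ppb (1 part in 1,000,000,000).

```python
# Hypothetical lookup table (not part of great_tables): the multiplier applied
# to a proportion for each unit keyword.
PARTS_PER_FACTORS = {
    "per-mille": 1e3,
    "ppm": 1e6,
    "ppb": 1e9,
}


def scale_to_partsper(proportion: float, to_units: str) -> float:
    """Scale a proportion (e.g., 0.0005) to the requested parts-per unit."""
    return proportion * PARTS_PER_FACTORS[to_units]


# A proportion of 0.0005 is half of one part in a thousand.
print(f"{scale_to_partsper(0.0005, 'per-mille'):.2f}")  # 0.50
```

Values that arrive already scaled would skip this multiplication, which is the situation the `scale_values` argument in the signature above is meant for.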
The following keywords are available for the `to_units` parameter:

- `"per-mille"`: Per mille (1 part in 1,000)
- `"per-myriad"`: Per myriad (1 part in 10,000)
- `"pcm"`: Per cent mille (1 part in 100,000)
- `"ppm"`: Parts per million (1 part in 1,000,000)
- `"ppb"`: Parts per billion (1 part in 1,000,000,000)
- `"ppt"`: Parts per trillion (1 part in 1,000,000,000,000)
- `"ppq"`: Parts per quadrillion (1 part in 1,000,000,000,000,000)

The function provides a lot of formatting control and we can use the following options:

- custom symbol/units: override the automatic symbol or units display with a custom choice
- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice of the decimal symbol
- digit grouping separators: options to enable/disable digit separators and provide a choice of separator symbol
- value scaling toggle: choose to disable automatic value scaling in the situation that values are already scaled coming in (and just require the appropriate symbol or unit display)
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in number formatting specific to the chosen locale

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
to_units
    A keyword that signifies the desired output quantity. This can be any from the following set: `"per-mille"`, `"per-myriad"`, `"pcm"`, `"ppm"`, `"ppb"`, `"ppt"`, or `"ppq"`.
symbol
    The symbol/units to use for the quantity. By default, this is set to `"auto"` and the appropriate symbol will be chosen based on the `to_units` keyword and the output context.
    This can be changed by supplying a string (e.g., using `symbol="ppbV"` when `to_units="ppb"`).
decimals
    The `decimals` value corresponds to the exact number of decimal places to use. A value such as `2.34` can, for example, be formatted with `0` decimal places and it would result in `"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros can be removed with `drop_trailing_zeros=True`.
drop_trailing_zeros
    A boolean value that allows for removal of trailing zeros (those redundant zeros after the decimal mark).
drop_trailing_dec_mark
    A boolean value that determines whether decimal marks should always appear even if there are no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By default trailing decimal marks are not shown.
scale_values
    Should the values be scaled through multiplication according to the keyword set in `to_units`? By default this is `True` since the expectation is that normally values are proportions. Setting to `False` signifies that the values are already scaled and require only the appropriate symbol/units when formatted.
use_seps
    The `use_seps` option allows for the use of digit group separators. The type of digit group separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This setting is `True` by default.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
sep_mark
    The string to use as a separator between groups of digits. For example, using `sep_mark=","` with a value of `1000` would result in a formatted value of `"1,000"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`).
dec_mark
    The string to be used as the decimal mark. For example, using `dec_mark=","` with the value `0.152` would result in a formatted value of `"0,152"`.
    This argument is ignored if a `locale` is supplied (i.e., is not `None`).
force_sign
    Should the positive sign be shown for positive values (effectively showing a sign for all values except zero)? If so, use `True` for this option. The default is `False`, where only negative numbers will display a minus sign.
incl_space
    An option for whether to include a space between the value and the symbol/units. The default is `"auto"`, which provides spacing dependent on the mark itself (symbols like `‰` get no space; text abbreviations like `ppm` get a space). This can be directly controlled by using either `True` or `False`.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid locale ID here means separator and decimal marks will be correct for the given locale. Should any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's preferred values.

Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument).

Examples
--------
Let's use a small dataset with proportional values and format them as parts-per-mille values.
```{python}
from great_tables import GT
import pandas as pd

df = pd.DataFrame({"x": [0.001, 0.0001, 0.00001, 0.5, -0.005]})

GT(df).fmt_partsper(columns="x", to_units="per-mille")
```

We can also format values as parts per million (ppm) using a Polars DataFrame:

```{python}
import polars as pl
from great_tables import GT

df = pl.DataFrame({"x": [0.0000015, 0.00035, 0.0001]})

GT(df).fmt_partsper(columns="x", to_units="ppm")
```

If the values are already scaled (not proportions), set `scale_values=False` and use a custom symbol:

```{python}
import polars as pl
from great_tables import GT

concentrations = pl.DataFrame({"gas": ["CO", "NO2", "O3"], "conc": [1.5, 35.0, 120.0]})

GT(concentrations).fmt_partsper(columns="conc", to_units="ppb", scale_values=False, symbol="ppbV")
```

fmt_currency(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, currency: 'str | None' = None, use_subunits: 'bool' = True, decimals: 'int | None' = None, drop_trailing_dec_mark: 'bool' = True, use_seps: 'bool' = True, accounting: 'bool' = False, scale_by: 'float' = 1, compact: 'bool' = False, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, placement: 'str' = 'left', incl_space: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'

Format values as currencies.

With numeric values in a **gt** table, we can perform currency-based formatting with the `fmt_currency()` method. This supports automatic formatting with a three-letter currency code.
We have fine control over the conversion from numeric values to currency values, where we could take advantage of the following options:

- the currency: providing a currency code or common currency name will procure the correct currency symbol and number of currency subunits
- currency symbol placement: the currency symbol can be placed before or after the values
- decimals/subunits: choice of the number of decimal places, a choice of the decimal symbol, and an option on whether to include or exclude the currency subunits (the decimal portion)
- digit grouping separators: options to enable/disable digit separators and provide a choice of separator symbol
- scaling: we can choose to scale targeted values by a multiplier value
- pattern: option to use a text pattern for decoration of the formatted currency values
- locale-based formatting: providing a locale ID will result in currency formatting specific to the chosen locale; it will also retrieve the locale's currency if none is explicitly given

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
currency
    The currency to use for the numeric value. This input can be supplied as a 3-letter currency code (e.g., `"USD"` for U.S. Dollars, `"EUR"` for the Euro currency).
use_subunits
    An option for whether the subunits portion of a currency value should be displayed. For example, with an input value of `273.81`, the default formatting will produce `"$273.81"`. Removing the subunits (with `use_subunits=False`) will give us `"$273"`.
decimals
    The `decimals` value corresponds to the exact number of decimal places to use.
    This value is optional as a currency has an intrinsic number of decimal places (i.e., the subunits). A value such as `2.34` can, for example, be formatted with `0` decimal places and if the currency used is `"USD"` it would result in `"$2"`. With `4` decimal places, the formatted value becomes `"$2.3400"`.
drop_trailing_dec_mark
    A boolean value that determines whether decimal marks should always appear even if there are no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By default trailing decimal marks are not shown.
use_seps
    The `use_seps` option allows for the use of digit group separators. The type of digit group separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This setting is `True` by default.
accounting
    Whether to use accounting style, which wraps negative numbers in parentheses instead of using a minus sign.
scale_by
    All numeric values will be multiplied by the `scale_by` value before undergoing formatting. Since the default value is `1`, no values will be changed unless a different multiplier value is supplied.
compact
    Whether to use compact formatting. This is a boolean value that, when set to `True`, will format large numbers in a more compact form (e.g., `1,000,000` becomes `1M`). This is `False` by default.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
sep_mark
    The string to use as a separator between groups of digits. For example, using `sep_mark=","` with a value of `1000` would result in a formatted value of `"1,000"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`).
dec_mark
    The string to be used as the decimal mark. For example, using `dec_mark=","` with the value `0.152` would result in a formatted value of `"0,152"`.
    This argument is ignored if a `locale` is supplied (i.e., is not `None`).
force_sign
    Should the positive sign be shown for positive values (effectively showing a sign for all values except zero)? If so, use `True` for this option. The default is `False`, where only negative numbers will display a minus sign.
placement
    The placement of the currency symbol. This can be either `"left"` (as in `"$450"`) or `"right"` (which yields `"450$"`).
incl_space
    An option for whether to include a space between the value and the currency symbol. The default is to not introduce a space character.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid locale ID here means separator and decimal marks will be correct for the given locale. Should any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's preferred values. In addition to number formatting, providing a `locale` value and not providing a `currency` allows **Great Tables** to obtain the currency code from the locale's territory.

Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument).

Examples
--------
Let's use the `exibble` dataset to create a table. With the `fmt_currency()` method, we'll format the `currency` column to display monetary values.
```{python}
from great_tables import GT, exibble

(
    GT(exibble)
    .fmt_currency(
        columns="currency",
        decimals=3,
        use_seps=False
    )
)
```

fmt_roman(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, case: 'str' = 'upper', pattern: 'str' = '{x}') -> 'GTSelf'

Format values as Roman numerals.

With numeric values in a table we can transform those to Roman numerals, rounding values as necessary.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
case
    Should Roman numerals be rendered as uppercase (`"upper"`) or lowercase (`"lower"`) letters? By default, this is set to `"upper"`.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------

Let's first create a DataFrame containing small numeric values and then introduce that to [`GT()`](`great_tables.GT`). We'll then format the `roman` column to appear as Roman numerals with the `fmt_roman()` method.
```{python}
import pandas as pd
from great_tables import GT

numbers_tbl = pd.DataFrame({"arabic": [1, 8, 24, 85], "roman": [1, 8, 24, 85]})

(
    GT(numbers_tbl, rowname_col="arabic")
    .fmt_roman(columns="roman")
)
```

fmt_bytes(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, standard: 'str' = 'decimal', decimals: 'int' = 1, n_sigfig: 'int | None' = None, drop_trailing_zeros: 'bool' = True, drop_trailing_dec_mark: 'bool' = True, use_seps: 'bool' = True, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, incl_space: 'bool' = True, locale: 'str | None' = None) -> 'GTSelf'

Format values as bytes.

With numeric values in a table, we can transform those to values of bytes with human readable units. The `fmt_bytes()` method allows for the formatting of byte sizes to either of two common representations: (1) with decimal units (powers of 1000, examples being `"kB"` and `"MB"`), and (2) with binary units (powers of 1024, examples being `"KiB"` and `"MiB"`). It is assumed the input numeric values represent the number of bytes and automatic truncation of values will occur. The numeric values will be scaled to be in the range of 1 to <1000 and then decorated with the correct unit symbol according to the standard chosen.

For more control over the formatting of byte sizes, we can use the following options:

- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice of the decimal symbol
- digit grouping separators: options to enable/disable digit separators and provide a choice of separator symbol
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in number formatting specific to the chosen locale

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
standard
    Large byte sizes can be expressed in one of two ways: (1) decimal units (powers of 1000; e.g., `"kB"` and `"MB"`), or (2) binary units (powers of 1024; e.g., `"KiB"` and `"MiB"`). The default is to use decimal units with the `"decimal"` option. The alternative is to use binary units with the `"binary"` option.
decimals
    This corresponds to the exact number of decimal places to use. A value such as `2.34` can, for example, be formatted with `0` decimal places and it would result in `"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros can be removed with `drop_trailing_zeros=True`.
drop_trailing_zeros
    A boolean value that allows for removal of trailing zeros (those redundant zeros after the decimal mark).
drop_trailing_dec_mark
    A boolean value that determines whether decimal marks should always appear even if there are no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By default trailing decimal marks are not shown.
use_seps
    The `use_seps` option allows for the use of digit group separators. The type of digit group separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This setting is `True` by default.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
sep_mark
    The string to use as a separator between groups of digits. For example, using `sep_mark=","` with a value of `1000` would result in a formatted value of `"1,000"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`).
dec_mark
    The string to be used as the decimal mark. For example, using `dec_mark=","` with the value `0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`).
force_sign
    Should the positive sign be shown for positive values (effectively showing a sign for all values except zero)? If so, use `True` for this option. The default is `False`, where only negative numbers will display a minus sign.
incl_space
    An option for whether to include a space between the value and the units. The default (`True`) is to include a space character.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Adapting output to a specific `locale`
--------------------------------------

This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid locale ID here means separator and decimal marks will be correct for the given locale. Should any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's preferred values.

Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument).

Examples
--------

Let's use a single column from the `exibble` dataset and create a new table. We'll format the `num` column to display as byte sizes in the decimal standard through use of the `fmt_bytes()` method.
```{python}
from great_tables import GT, exibble

(
    GT(exibble[["num"]])
    .fmt_bytes(columns="num", standard="decimal")
)
```

fmt_date(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, date_style: 'DateStyle' = 'iso', pattern: 'str' = '{x}', locale: 'str | None' = None) -> 'GTSelf'

Format values as dates.

Format input values to date values using one of 17 preset date styles. Input can be in the form of `date` type or as an ISO 8601 string (in the form of `YYYY-MM-DD HH:MM:SS` or `YYYY-MM-DD`).

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
date_style
    The date style to use. By default this is the short name `"iso"` which corresponds to ISO 8601 date formatting. There are 17 date styles in total.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Formatting with the `date_style=` argument
------------------------------------------

We need to supply a preset date style to the `date_style=` argument. The date styles are numerous and can handle localization to any supported locale. The following table provides a listing of all date styles and their output values (corresponding to an input date of `2000-02-29`).
|    | Date Style              | Output                         |
|----|-------------------------|--------------------------------|
| 1  | `"iso"`                 | `"2000-02-29"`                 |
| 2  | `"wday_month_day_year"` | `"Tuesday, February 29, 2000"` |
| 3  | `"wd_m_day_year"`       | `"Tue, Feb 29, 2000"`          |
| 4  | `"wday_day_month_year"` | `"Tuesday 29 February 2000"`   |
| 5  | `"month_day_year"`      | `"February 29, 2000"`          |
| 6  | `"m_day_year"`          | `"Feb 29, 2000"`               |
| 7  | `"day_m_year"`          | `"29 Feb 2000"`                |
| 8  | `"day_month_year"`      | `"29 February 2000"`           |
| 9  | `"day_month"`           | `"29 February"`                |
| 10 | `"day_m"`               | `"29 Feb"`                     |
| 11 | `"year"`                | `"2000"`                       |
| 12 | `"month"`               | `"February"`                   |
| 13 | `"day"`                 | `"29"`                         |
| 14 | `"year.mn.day"`         | `"2000/02/29"`                 |
| 15 | `"y.mn.day"`            | `"00/02/29"`                   |
| 16 | `"year_week"`           | `"2000-W09"`                   |
| 17 | `"year_quarter"`        | `"2000-Q1"`                    |

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Adapting output to a specific `locale`
--------------------------------------

This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument).

Examples
--------

Let's use the `exibble` dataset to create a simple, two-column table (keeping only the `date` and `time` columns). With the `fmt_date()` method, we'll format the `date` column to display dates formatted with the `"month_day_year"` date style.
```{python}
from great_tables import GT, exibble

exibble_mini = exibble[["date", "time"]]

(
    GT(exibble_mini)
    .fmt_date(columns="date", date_style="month_day_year")
)
```

fmt_time(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, time_style: 'TimeStyle' = 'iso', pattern: 'str' = '{x}', locale: 'str | None' = None) -> 'GTSelf'

Format values as times.

Format input values to time values using one of 5 preset time styles. Input can be in the form of `time` values, or strings in the ISO 8601 forms of `HH:MM:SS` or `YYYY-MM-DD HH:MM:SS`.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
time_style
    The time style to use. By default this is the short name `"iso"` which corresponds to how times are formatted within ISO 8601 datetime values. There are 5 time styles in total.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Formatting with the `time_style=` argument
------------------------------------------

We need to supply a preset time style to the `time_style=` argument. The time styles can handle localization to any supported locale. The following table provides a listing of all time styles and their output values (corresponding to an input time of `14:35:00`).
|    | Time Style    | Output         | Notes         |
|----|---------------|----------------|---------------|
| 1  | `"iso"`       | `"14:35:00"`   | ISO 8601, 24h |
| 2  | `"iso-short"` | `"14:35"`      | ISO 8601, 24h |
| 3  | `"h_m_s_p"`   | `"2:35:00 PM"` | 12h           |
| 4  | `"h_m_p"`     | `"2:35 PM"`    | 12h           |
| 5  | `"h_p"`       | `"2 PM"`       | 12h           |

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Adapting output to a specific `locale`
--------------------------------------

This formatting method can adapt outputs according to a provided `locale` value. Examples include `"en"` for English (United States) and `"fr"` for French (France). Note that a `locale` value provided here will override any global locale setting performed in [`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by all other methods that have a `locale` argument).

Examples
--------

Let's use the `exibble` dataset to create a simple, two-column table (keeping only the `date` and `time` columns). With the `fmt_time()` method, we'll format the `time` column to display times formatted with the `"h_m_s_p"` time style.

```{python}
from great_tables import GT, exibble

exibble_mini = exibble[["date", "time"]]

(
    GT(exibble_mini)
    .fmt_time(columns="time", time_style="h_m_s_p")
)
```

fmt_datetime(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, date_style: 'DateStyle' = 'iso', time_style: 'TimeStyle' = 'iso', format_str: 'str | None' = None, sep: 'str' = ' ', pattern: 'str' = '{x}', locale: 'str | None' = None) -> 'GTSelf'

Format values as datetimes.

Format input values to datetime values using one of 17 preset date styles and one of 5 preset time styles. Input can be in the form of `datetime` values, or strings in the ISO 8601 forms of `YYYY-MM-DD HH:MM:SS` or `YYYY-MM-DD`.

Parameters
----------
columns
    The columns to target.
    Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
date_style
    The date style to use. By default this is the short name `"iso"` which corresponds to ISO 8601 date formatting. There are 17 date styles in total.
time_style
    The time style to use. By default this is the short name `"iso"` which corresponds to how times are formatted within ISO 8601 datetime values. There are 5 time styles in total.
format_str
    A string that specifies the format of the datetime string. This is a `strftime()` format string that can be used to format date or datetime input. If `format_str=` is provided, the `date_style=` and `time_style=` arguments are ignored.
sep
    A string that separates the date and time components of the datetime string. The default is a space character (`" "`). This is ignored if `format_str=` is provided.
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France). Only relevant if `date_style=` or `time_style=` are provided.

Formatting with the `date_style=` and `time_style=` arguments
-------------------------------------------------------------

If not supplying a formatting string to `format_str=` we need to supply a preset date style to the `date_style=` argument and a preset time style to the `time_style=` argument. The date styles are numerous and can handle localization to any supported locale.
The following table provides a listing of all date styles and their output values (corresponding to an input datetime of `2000-02-29 14:35:00`).

|    | Date Style              | Output                         |
|----|-------------------------|--------------------------------|
| 1  | `"iso"`                 | `"2000-02-29"`                 |
| 2  | `"wday_month_day_year"` | `"Tuesday, February 29, 2000"` |
| 3  | `"wd_m_day_year"`       | `"Tue, Feb 29, 2000"`          |
| 4  | `"wday_day_month_year"` | `"Tuesday 29 February 2000"`   |
| 5  | `"month_day_year"`      | `"February 29, 2000"`          |
| 6  | `"m_day_year"`          | `"Feb 29, 2000"`               |
| 7  | `"day_m_year"`          | `"29 Feb 2000"`                |
| 8  | `"day_month_year"`      | `"29 February 2000"`           |
| 9  | `"day_month"`           | `"29 February"`                |
| 10 | `"day_m"`               | `"29 Feb"`                     |
| 11 | `"year"`                | `"2000"`                       |
| 12 | `"month"`               | `"February"`                   |
| 13 | `"day"`                 | `"29"`                         |
| 14 | `"year.mn.day"`         | `"2000/02/29"`                 |
| 15 | `"y.mn.day"`            | `"00/02/29"`                   |
| 16 | `"year_week"`           | `"2000-W09"`                   |
| 17 | `"year_quarter"`        | `"2000-Q1"`                    |

The time styles can also handle localization to any supported locale. The following table provides a listing of all time styles and their output values (corresponding to an input datetime of `2000-02-29 14:35:00`).

|    | Time Style    | Output         | Notes         |
|----|---------------|----------------|---------------|
| 1  | `"iso"`       | `"14:35:00"`   | ISO 8601, 24h |
| 2  | `"iso-short"` | `"14:35"`      | ISO 8601, 24h |
| 3  | `"h_m_s_p"`   | `"2:35:00 PM"` | 12h           |
| 4  | `"h_m_p"`     | `"2:35 PM"`    | 12h           |
| 5  | `"h_p"`       | `"2 PM"`       | 12h           |

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------

Let's use the `exibble` dataset to create a simple, two-column table (keeping only the `date` and `time` columns). With the `fmt_datetime()` method, we'll format the `date` column using the `"month_day_year"` date style for the date part and the `"h_m_s_p"` time style for the time part.
```{python}
from great_tables import GT, exibble

exibble_mini = exibble[["date", "time"]]

(
    GT(exibble_mini)
    .fmt_datetime(
        columns="date",
        date_style="month_day_year",
        time_style="h_m_s_p"
    )
)
```

fmt_duration(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, input_units: 'str | None' = None, output_units: 'str | list[str] | None' = None, duration_style: 'DurationStyle' = 'narrow', trim_zero_units: 'bool | list[str]' = True, max_output_units: 'int | None' = None, pattern: 'str' = '{x}', use_seps: 'bool' = True, sep_mark: 'str' = ',', force_sign: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'

Format numeric or duration values as styled time duration strings.

Format input values to time duration values whether those input values are numbers or of the `timedelta` class. We can specify which time units any numeric input values have (as weeks, days, hours, minutes, or seconds) and the output can be customized with a duration style (corresponding to narrow, wide, colon-separated, and ISO forms) and a choice of output units ranging from weeks to seconds.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
input_units
    If one or more selected columns contains numeric values (not `timedelta` values, which contain the duration units), a keyword must be provided for `input_units` for the values to be interpreted in terms of duration. The accepted units are: `"seconds"`, `"minutes"`, `"hours"`, `"days"`, and `"weeks"`. This is required for numeric columns and ignored for `timedelta` columns.
output_units
    Controls the output time units.
    The default (`None`) means that output units will be automatically chosen based on the input duration value. To control which time units are to be considered for output (before trimming with `trim_zero_units=`) we can specify a list of one or more of the following keywords: `"weeks"`, `"days"`, `"hours"`, `"minutes"`, or `"seconds"`.
duration_style
    A choice of four formatting styles for the output duration values. With `"narrow"` (the default style), duration values will be formatted with single-letter time-part units (e.g., 1.35 days will be styled as `"1d 8h 24m"`). With `"wide"`, this example value will be expanded to `"1 day 8 hours 24 minutes"` after formatting. The `"colon-sep"` style will put days, hours, minutes, and seconds in the `"([D]/)[HH]:[MM]:[SS]"` format. The `"iso"` style will produce a value that conforms to the ISO 8601 rules for duration values (e.g., 1.35 days will become `"P1DT8H24M"`).
trim_zero_units
    Provides methods to remove output time units that have zero values. By default this is `True` and duration values that might otherwise be formatted as `"0w 1d 0h 4m 19s"` with `trim_zero_units=False` are instead displayed as `"1d 4m 19s"`. Aside from using `True`/`False` we could provide a list of keywords for more precise control. These keywords are: (1) `"leading"`, to omit all leading zero-value time units (e.g., `"0w 1d"` -> `"1d"`), (2) `"trailing"`, to omit all trailing zero-value time units (e.g., `"3d 5h 0s"` -> `"3d 5h"`), and (3) `"internal"`, which removes all internal zero-value time units (e.g., `"5d 0h 33m"` -> `"5d 33m"`).
max_output_units
    If `output_units` is `None`, where the output time units are unspecified and left to be handled automatically, a numeric value provided for `max_output_units=` will be taken as the maximum number of time units to display in all output time duration values. By default, this is `None` and all possible time units will be displayed.
    This option has no effect when `duration_style="colon-sep"` (only `output_units` can be used to customize that type of duration output).
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
use_seps
    The `use_seps` option allows for the use of digit group separators. The type of digit group separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This setting is `True` by default.
sep_mark
    The string to use as a separator between groups of digits. For example, using `sep_mark=","` with a value of `1000` would result in a formatted value of `"1,000"`. This argument is ignored if a `locale` is supplied (i.e., is not `None`).
force_sign
    Should the positive sign be shown for positive values (effectively showing a sign for all values except zero)? If so, use `True` for this option. The default is `False`, where only negative numbers will display a minus sign.
locale
    An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Output units for the colon-separated duration style
---------------------------------------------------

The colon-separated duration style (enabled when `duration_style="colon-sep"`) is essentially a clock-based output format which uses the display logic of chronograph watch functionality. It will, by default, display duration values in the `(D/)HH:MM:SS` format. Any duration values greater than or equal to 24 hours will have the number of days prepended with an adjoining slash mark.
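
To make the default `(D/)HH:MM:SS` rule concrete, here is a minimal sketch in plain Python. The `colon_sep()` helper is hypothetical and is not part of **Great Tables**; it assumes non-negative whole seconds and only mirrors the rule as described here, so the library's actual output may differ in detail.

```python
def colon_sep(total_seconds: int) -> str:
    """Render non-negative whole seconds as (D/)HH:MM:SS (illustrative only)."""
    days, remainder = divmod(total_seconds, 86_400)  # 86,400 seconds per day
    hours, remainder = divmod(remainder, 3_600)
    minutes, seconds = divmod(remainder, 60)
    clock = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    # Durations of 24 hours or more get the day count and a slash in front
    return f"{days}/{clock}" if days > 0 else clock


print(colon_sep(3_661))   # 01:01:01
print(colon_sep(90_061))  # 1/01:01:01
```
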
While this output format is versatile, it can be changed somewhat with the `output_units=` option. The following combinations of output units are permitted:

- `["minutes", "seconds"]` -> `MM:SS`
- `["hours", "minutes"]` -> `HH:MM`
- `["hours", "minutes", "seconds"]` -> `HH:MM:SS`
- `["days", "hours", "minutes"]` -> `(D/)HH:MM`

Any other specialized combinations will result in the default set being used, which is `["days", "hours", "minutes", "seconds"]`.

Compatibility of formatting function with data values
-----------------------------------------------------

`fmt_duration()` is compatible with body cells that are of `int`, `float`, or `datetime.timedelta` types. Any other types of body cells are ignored during formatting.

Examples
--------

Let's create a table with duration values in seconds and format them using the default narrow style. This produces compact output with single-letter unit abbreviations, ideal for space-constrained displays.

```{python}
import pandas as pd
from great_tables import GT

df = pd.DataFrame({"duration_s": [3661, 86400, 172800, 60, 0]})

(
    GT(df)
    .fmt_duration(columns="duration_s", input_units="seconds")
)
```

Notice that zero-valued time units are automatically trimmed from the output, keeping the display clean. A value of `86400` seconds (exactly 1 day) simply shows `"1d"` rather than `"0w 1d 0h 0m 0s"`.

For reporting contexts where readability is more important than compactness, the wide style spells out the full unit names with proper singular/plural forms.

```{python}
df = pd.DataFrame({"hours": [1.5, 24.0, 0.5, 100.75]})

(
    GT(df)
    .fmt_duration(columns="hours", input_units="hours", duration_style="wide")
)
```

The colon-separated style is useful for timing data, race results, or any context where a clock-like display is expected. Days are shown with a slash prefix when the duration is 24 hours or more.
```{python}
df = pd.DataFrame({
    "event": ["Marathon", "Half Marathon", "10K", "Mile"],
    "winning_time_s": [7377, 3542, 1620, 233],
})

(
    GT(df)
    .fmt_duration(
        columns="winning_time_s",
        input_units="seconds",
        duration_style="colon-sep",
        output_units=["hours", "minutes", "seconds"],
    )
)
```

The output is zero-padded in the familiar `HH:MM:SS` format. By specifying `output_units` we control exactly which components appear in the colon-separated output.

When working with `timedelta` columns (common in Pandas when computing differences between timestamps), `fmt_duration()` automatically detects the units, so no `input_units` argument is needed.

```{python}
from datetime import datetime

events = pd.DataFrame({
    "task": ["Build", "Test suite", "Deploy", "Full pipeline"],
    "elapsed": [
        datetime(2024, 1, 1, 0, 12, 45) - datetime(2024, 1, 1, 0, 0, 0),
        datetime(2024, 1, 1, 1, 5, 30) - datetime(2024, 1, 1, 0, 0, 0),
        datetime(2024, 1, 1, 0, 3, 15) - datetime(2024, 1, 1, 0, 0, 0),
        datetime(2024, 1, 1, 1, 21, 30) - datetime(2024, 1, 1, 0, 0, 0),
    ],
})

(
    GT(events, rowname_col="task")
    .fmt_duration(columns="elapsed", duration_style="narrow")
)
```

Polars DataFrames work the same way. Here we format numeric duration values using the ISO 8601 duration style, which is useful for machine-readable output or standards-compliant reporting.

```{python}
import polars as pl
from great_tables import GT

df = pl.DataFrame({"activity": ["Flight", "Layover", "Drive"], "seconds": [14400, 5400, 1830]})

(
    GT(df)
    .fmt_duration(columns="seconds", input_units="seconds", duration_style="iso")
)
```

Polars also has native `Duration` dtype columns (created via temporal arithmetic or `timedelta` values). These are handled automatically without needing to specify `input_units`.
```{python}
from datetime import timedelta

df = pl.DataFrame({
    "segment": ["Warm-up", "Main set", "Cool-down"],
    "duration": [timedelta(minutes=10), timedelta(minutes=45, seconds=30), timedelta(minutes=5)],
})

(
    GT(df)
    .fmt_duration(columns="duration", duration_style="wide")
)
```

fmt_tf(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, tf_style: 'str' = 'true-false', pattern: 'str' = '{x}', true_val: 'str | None' = None, false_val: 'str | None' = None, na_val: 'str | None' = None, colors: 'list[str] | None' = None) -> 'GTSelf'

Format True and False values.

There can be times where boolean values are useful in a display table. You might want to express a 'yes' or 'no', a 'true' or 'false', or perhaps use pairings of complementary symbols that make sense in a table. The `fmt_tf()` method has a set of `tf_style=` presets that can be used to quickly map `True`/`False` values to strings, or to symbols like up/down or left/right arrows and open/closed shapes.

While the presets are nice, you can provide your own mappings through the `true_val=` and `false_val=` arguments. For extra customization, you can also apply color to the individual `True`, `False`, and NA mappings. Just supply a list of colors (up to a length of 3) to the `colors=` argument.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
tf_style
    The `True`/`False` mapping style to use. By default this is the short name `"true-false"` which corresponds to the words `"true"` and `"false"`. Two other `tf_style=` values produce words: `"yes-no"` and `"up-down"`.
    The remaining options involve pairs of symbols (e.g., `"check-mark"` displays a check mark for `True` and an ✗ symbol for `False`).
pattern
    A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals.
true_val
    While the choice of a `tf_style=` will typically supply the `true_val=` and `false_val=` text, we could override this and supply text for any `True` values. This doesn't need to be used in conjunction with `false_val=`.
false_val
    While the choice of a `tf_style=` will typically supply the `true_val=` and `false_val=` text, we could override this and supply text for any `False` values. This doesn't need to be used in conjunction with `true_val=`.
na_val
    None of the `tf_style` presets will replace any missing values encountered in the targeted cells. While we always have the option to use `sub_missing()` for NA replacement, we have the opportunity to handle missing values here with the `na_val=` option. This is useful because we also have the means to add color to the `na_val=` text or symbol and doing that requires that a replacement value for NAs is specified here.
colors
    Providing a list of color values to `colors=` will progressively add color to the formatted result depending on the number of colors provided. With a single color, all formatted values will be in that color. Using two colors results in `True` values being the first color, and `False` values receiving the second. With the three-color option, the final color will be given to any missing values replaced through `na_val=`.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Formatting with the `tf_style=` argument
----------------------------------------

We need to supply a preset `tf_style=` value.
The following table provides a listing of all `tf_style=` values and their output `True` and `False` values.

|    | TF Style         | Output               |
|----|------------------|----------------------|
| 1  | `"true-false"`   | `"true"` / `"false"` |
| 2  | `"yes-no"`       | `"yes"` / `"no"`     |
| 3  | `"up-down"`      | `"up"` / `"down"`    |
| 4  | `"check-mark"`   | `"✓"` / `"✗"`        |
| 5  | `"circles"`      | `"●"` / `"○"`        |
| 6  | `"squares"`      | `"■"` / `"□"`        |
| 7  | `"diamonds"`     | `"◆"` / `"◇"`        |
| 8  | `"arrows"`       | `"↑"` / `"↓"`        |
| 9  | `"triangles"`    | `"▲"` / `"▼"`        |
| 10 | `"triangles-lr"` | `"▶"` / `"◀"`        |

Examples
--------

Let's use a subset of the `sp500` dataset to create a small table containing opening and closing price data for the last few days in 2015. We added a boolean column (`dir`) where `True` indicates a price increase from opening to closing and `False` is the opposite. Using `fmt_tf()` generates up and down arrows in the `dir` column. We elect to use green upward arrows and red downward arrows (through the `colors=` option).

```{python}
from great_tables import GT
from great_tables.data import sp500
import polars as pl

sp500_mini = (
    pl.from_pandas(sp500)
    .slice(0, 5)
    .drop(["volume", "adj_close", "high", "low"])
    .with_columns(dir = pl.col("close") > pl.col("open"))
)

(
    GT(sp500_mini, rowname_col="date")
    .fmt_tf(columns="dir", tf_style="arrows", colors=["green", "red"])
    .fmt_currency(columns=["open", "close"])
    .cols_label(
        open="Opening",
        close="Closing",
        dir=""
    )
)
```

fmt_markdown(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None) -> 'GTSelf'

Format Markdown text.

Any Markdown-formatted text in the incoming cells will be transformed during render when using the `fmt_markdown()` method.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should undergo formatting.
The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let’s first create a DataFrame containing some text that is Markdown-formatted and then introduce that to [`GT()`](`great_tables.GT`). We’ll then transform the `md` column with the `fmt_markdown()` method. ```{python} import pandas as pd from great_tables import GT text_1 = """ ### This is Markdown. Markdown’s syntax is comprised entirely of punctuation characters, which punctuation characters have been carefully chosen so as to look like what they mean... assuming you’ve ever used email. """ text_2 = """ Info on Markdown syntax can be found [here](https://daringfireball.net/projects/markdown/). """ df = pd.DataFrame({"md": [text_1, text_2]}) (GT(df).fmt_markdown("md")) ``` fmt_units(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, pattern: 'str' = '{x}') -> 'GTSelf' Format measurement units. The `fmt_units()` method lets you better format measurement units in the table body. These must conform to the **Great Tables** *units notation*; as an example of this, `"J Hz^-1 mol^-1"` can be used to generate units for the *molar Planck constant*. The notation here provides several conveniences for defining units, so as long as the values to be formatted conform to this syntax, you'll obtain nicely-formatted inline units. Details pertaining to *units notation* can be found in the section entitled *How to use units notation*. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows In conjunction with `columns=`, we can specify which of their rows should undergo formatting.
The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices. pattern A formatting pattern that allows for decoration of the formatted value. The formatted value is represented by the `{x}` (which can be used multiple times, if needed) and all other characters will be interpreted as string literals. How to use units notation ------------------------- The **Great Tables** units notation involves a shorthand of writing units that feels familiar and is fine-tuned for the task at hand. Each unit is treated as a separate entity (parentheses and other symbols included) and the addition of subscript text and exponents is flexible and relatively easy to formulate. This is all best shown with examples: - `"m/s"` and `"m / s"` both render as `"m/s"` - `"m s^-1"` will appear with the `"-1"` exponent intact - `"m /s"` gives the same result, as `"/"` is equivalent to `"^-1"` - `"E_h"` will render an `"E"` with the `"h"` subscript - `"t_i^2.5"` provides a `t` with an `"i"` subscript and a `"2.5"` exponent - `"m[_0^2]"` will use overstriking to set both scripts vertically - `"g/L %C6H12O6%"` uses a chemical formula (enclosed in a pair of `"%"` characters) as a unit partial, and the formula will render correctly with subscripted numbers - Common units that are difficult to write using ASCII text may be implicitly converted to the correct characters (e.g., the `"u"` in `"ug"`, `"um"`, `"uL"`, and `"umol"` will be converted to the Greek *mu* symbol; `"degC"` and `"degF"` will render a degree sign before the temperature unit) - We can transform shorthand symbol/unit names enclosed in `":"` (e.g., `":angstrom:"`, `":ohm:"`, etc.) into proper symbols - Greek letters can be added by enclosing the letter name in `":"`; you can use lowercase letters (e.g., `":beta:"`, `":sigma:"`, etc.) and uppercase letters too (e.g., `":Alpha:"`, `":Zeta:"`, etc.)
- The components of a unit (unit name, subscript, and exponent) can be fully or partially italicized/emboldened by surrounding text with `"*"` or `"**"` Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let's use the `illness` dataset and create a new table. The `units` column happens to contain string values in *units notation* (e.g., `"x10^9 / L"`). Using the `fmt_units()` method here will improve the formatting of those measurement units. ```{python} from great_tables import GT, style, loc from great_tables.data import illness ( GT(illness, rowname_col="test") .fmt_units(columns="units") .fmt_number(columns=lambda x: x.startswith("day"), decimals=2, drop_trailing_zeros=True) .tab_header(title="Laboratory Findings for the YF Patient") .tab_spanner(label="Day", columns=lambda x: x.startswith("day")) .tab_spanner(label="Normal Range", columns=lambda x: x.startswith("norm")) .cols_label( norm_l="Lower", norm_u="Upper", units="Units" ) .opt_vertical_padding(scale=0.4) .opt_align_table_header(align="left") .tab_options(heading_padding="10px") .tab_style( locations=loc.body(columns="norm_l"), style=style.borders(sides="left") ) .opt_vertical_padding(scale=0.5) ) ``` The `constants` dataset contains values for hundreds of fundamental physical constants. We'll take a subset of values that have some molar basis and generate a new display table from that. Like the `illness` dataset, this one has a `units` column so, again, the `fmt_units()` method will be used to format those units. Here, the preference for typesetting measurement units is to have positive and negative exponents (e.g., not `" / "` but rather `" ^-1"`). 
```{python} from great_tables.data import constants import polars as pl constants_mini = ( pl.from_pandas(constants) .filter(pl.col("name").str.contains("molar")).sort("value") .with_columns( name=pl.col("name") .str.to_titlecase() .str.replace("Kpa", "kpa") .str.replace("Of", "of") ) ) ( GT(constants_mini) .cols_hide(columns=["uncert", "sf_value", "sf_uncert"]) .fmt_units(columns="units") .fmt_scientific(columns="value", decimals=3) .tab_header(title="Physical Constants Having a Molar Basis") .tab_options(column_labels_hidden=True) ) ``` fmt_image(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, height: 'str | int | None' = None, width: 'str | int | None' = None, sep: 'str' = ' ', path: 'str | Path | None' = None, file_pattern: 'str' = '{}', encode: 'bool' = True) -> 'GTSelf' Format image paths to generate images in cells. To more easily insert graphics into body cells, we can use the `fmt_image()` method. This allows for one or more images to be placed in the targeted cells. The cells need to contain some reference to an image file, either: (1) local paths to the files; (2) complete http/https URLs to the files; (3) the file names, where a common path can be provided via `path=`; or (4) a fragment of the file name, where the `file_pattern=` argument helps to compose the entire file name and `path=` provides the path information. This should be expressly used on columns that contain *only* references to image files (i.e., no image references as part of a larger block of text). Multiple images can be included per cell by separating image references by commas. The `sep=` argument allows for a common separator to be applied between images. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows In conjunction with `columns=`, we can specify which of their rows should undergo formatting.
The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices. height The height of the rendered images. width The width of the rendered images. sep In the output of images within a body cell, `sep=` provides the separator between each image. path An optional path to local image files or an HTTP/HTTPS URL. This is combined with the filenames to form the complete image paths. file_pattern The pattern to use for mapping input values in the body cells to the names of the graphics files. The string supplied should use `"{}"` in the pattern to map filename fragments to input strings. encode The option to always use Base64 encoding for image paths that are determined to be local. By default, this is `True`. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Using a small portion of the `metro` dataset, let's create a new table. We will only include a few columns and rows from that table. The `lines` column has comma-separated listings of numbers corresponding to lines served at each station. We have a directory of SVG graphics for all of these lines in the package (the path for the image directory can be accessed via `files("great_tables") / "data/metro_images"`, using the `importlib_resources` package). The filenames roughly correspond to the data in the `lines` column. The `fmt_image()` method can be used with these inputs since the `path=` and `file_pattern=` arguments allow us to compose complete and valid file locations. What you get from this are sequences of images in the table cells, taken from the referenced graphics files on disk.
```{python} from great_tables import GT from great_tables.data import metro from importlib_resources import files img_paths = files("great_tables") / "data/metro_images" metro_mini = metro[["name", "lines", "passengers"]].head(5) ( GT(metro_mini) .fmt_image( columns="lines", path=img_paths, file_pattern="metro_{}.svg" ) .fmt_integer(columns="passengers") ) ``` fmt_flag(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, height: 'str | int | float | None' = '1em', sep: 'str' = ' ', use_title: 'bool' = True) -> 'GTSelf' Generate flag icons for countries from their country codes. While it is fairly straightforward to insert images into body cells (using `fmt_image()` is one way to do it), there is often the need to incorporate specialized types of graphics within a table. One such group of graphics involves iconography representing different countries, and the `fmt_flag()` method helps with inserting a flag icon (or multiple) in body cells. To make this work seamlessly, the input cells need to contain some reference to a country, and this can be in the form of a 2- or 3-letter ISO 3166-1 country code (e.g., Egypt has the `"EG"` country code). This method will parse the targeted body cells for those codes and insert the appropriate flag graphics. Multiple flags can be included per cell by separating country codes with commas (e.g., `"GB,TT"`). The `sep=` argument allows for a common separator to be applied between flag icons. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices. height The height of the flag icons. The default value is `"1em"`. If given as a number, it is assumed to be in pixels.
sep In the output of multiple flag icons within a body cell, `sep=` provides the separator between each of the flag icons. use_title The option to include a title attribute with the country name when hovering over the flag icon. The default is `True`. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let's use the `countrypops` dataset to create a new table with flag icons. We will only include a few columns and rows from that table. The `country_code_2` column has 2-letter country codes in the format required for `fmt_flag()` and using that method transforms the codes to circular flag icons. ```{python} from great_tables import GT from great_tables.data import countrypops import polars as pl countrypops_mini = ( pl.from_pandas(countrypops) .filter(pl.col("year") == 2021) .filter(pl.col("country_name").str.starts_with("S")) .sort("country_name") .head(10) .drop(["year", "country_code_3"]) ) ( GT(countrypops_mini) .fmt_integer(columns="population") .fmt_flag(columns="country_code_2") .cols_label( country_code_2="", country_name="Country", population="Population (2021)" ) .cols_move_to_start(columns="country_code_2") ) ``` Here's another example (again using `countrypops`) where we generate a table providing populations every ten years (from 1960 onward) for the Benelux countries (`"BEL"`, `"NLD"`, and `"LUX"`). After some filtering and a pivot, the `fmt_flag()` method is used to obtain flag icons from 3-letter country codes present in the `country_code_3` column.
```{python} import polars.selectors as cs countrypops_mini = ( pl.from_pandas(countrypops) .filter(pl.col("country_code_3").is_in(["BEL", "NLD", "LUX"])) .filter((pl.col("year") % 10 == 0) & (pl.col("year") >= 1960)) .pivot("year", index = ["country_code_3", "country_name"], values="population") ) ( GT(countrypops_mini) .tab_header(title="Populations of the Benelux Countries") .tab_spanner(label="Year", columns=cs.numeric()) .fmt_integer(columns=cs.numeric()) .fmt_flag(columns="country_code_3") .cols_label( country_code_3="", country_name="Country" ) ) ``` fmt_icon(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, height: 'str | None' = None, sep: 'str' = ' ', stroke_color: 'str | None' = None, stroke_width: 'str | int | None' = None, stroke_alpha: 'float | None' = None, fill_color: 'str | dict[str, str] | None' = None, fill_alpha: 'float | None' = None, margin_left: 'str | None' = None, margin_right: 'str | None' = None) -> 'GTSelf' Use icons within a table's body cells. We can draw from a library of thousands of icons and selectively insert them into a table. The `fmt_icon()` method makes this possible by mapping input cell labels to an icon name. We are exclusively using Font Awesome icons here so the reference is the short icon name. Multiple icons can be included per cell by separating icon names with commas (e.g., `"hard-drive,clock"`). The `sep=` argument allows for a common separator to be applied between icons. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices. height The absolute height of the icon in the table cell. By default, this is set to "1em". 
sep In the output of icons within a body cell, `sep=` provides the separator between each icon. stroke_color The icon stroke is essentially the outline of the icon. The color of the stroke can be modified by applying a single color here. If not provided then the default value of `"currentColor"` is applied so that the stroke color matches that of the parent HTML element's color attribute. stroke_width The `stroke_width=` option allows for setting the width of the icon outline stroke. By default, the stroke width is very small at "1px" so a size adjustment here can sometimes be useful. If an integer value is provided then it is assumed to be in pixels. stroke_alpha The level of transparency for the icon stroke can be controlled with a decimal value between `0` and `1`. fill_color The fill color of the icon can be set with `fill_color=`; providing a single color here will change the color of the fill but not of the icon's 'stroke' or outline (use `stroke_color=` to modify that). A dictionary comprising the icon names with corresponding fill colors can alternatively be used here (e.g., `{"circle-check": "green", "circle-xmark": "red"}`). If nothing is provided then the default value of `"currentColor"` is applied so that the fill matches the color of the parent HTML element's color attribute. fill_alpha The level of transparency for the icon fill can be controlled with a decimal value between `0` and `1`. margin_left The length value for the margin that's to the left of the icon. By default, `"auto"` is used for this but if space is needed on the left-hand side then a length of `"0.2em"` is recommended as a starting point. margin_right The length value for the margin right of the icon. By default, `"auto"` is used but if space is needed on the right-hand side then a length of `"0.2em"` is recommended as a starting point. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.
Examples -------- For this first example of generating icons with `fmt_icon()`, let's make a simple DataFrame that has two columns of Font Awesome icon names. We separate multiple icons per cell with commas. By default, the icons are 1 em in height; we're going to make the icons slightly larger here (so we can see the fine details of them) by setting `height="4em"`. ```{python} import pandas as pd from great_tables import GT animals_foods_df = pd.DataFrame( { "animals": ["hippo", "fish,spider", "mosquito,locust,frog", "dog,cat", "kiwi-bird"], "foods": ["bowl-rice", "egg,pizza-slice", "burger,lemon,cheese", "carrot,hotdog", "bacon"], } ) ( GT(animals_foods_df) .fmt_icon( columns=["animals", "foods"], height="4em" ) .cols_align( align="center", columns=["animals", "foods"] ) ) ``` Let's take a few rows from the `towny` dataset and make it so the `csd_type` column contains *Font Awesome* icon names (we want only the `"city"` and `"house-chimney"` icons here). After using `fmt_icon()` to format the `csd_type` column, we get icons that are representative of the two categories of municipality for this subset of data. ```{python} import polars as pl from great_tables.data import towny towny_mini = ( pl.from_pandas(towny.loc[[323, 14, 26, 235]]) .select(["name", "csd_type", "population_2021"]) .with_columns( csd_type = pl.when(pl.col("csd_type") == "town") .then(pl.lit("house-chimney")) .otherwise(pl.lit("city")) ) ) ( GT(towny_mini) .fmt_integer(columns="population_2021") .fmt_icon(columns="csd_type") .cols_label( csd_type="", name="City/Town", population_2021="Population" ) ) ``` A fairly common thing to do with icons in tables is to indicate whether a quantity is either higher or lower than another. Up and down arrow symbols can serve as good visual indicators for this purpose. We can make use of the `"arrow-up"` and `"arrow-down"` icons here. As those strings are available in the `dir` column of the table derived from the `sp500` dataset, `fmt_icon()` can be used.
We set the `fill_color` argument with a dictionary that indicates which color should be used for each icon. ```{python} from great_tables.data import sp500 sp500_mini = ( pl.from_pandas(sp500) .head(10) .select(["date", "open", "close"]) .sort("date", descending=False) .with_columns( dir = pl.when(pl.col("close") >= pl.col("open")).then( pl.lit("arrow-up")).otherwise(pl.lit("arrow-down")) ) ) ( GT(sp500_mini, rowname_col="date") .fmt_icon( columns="dir", fill_color={"arrow-up": "green", "arrow-down": "red"} ) .cols_label( open="Opening Value", close="Closing Value", dir="" ) .opt_stylize(style=1, color="gray") ) ``` fmt_nanoplot(self: 'GTSelf', columns: 'str | None' = None, rows: 'int | list[int] | None' = None, plot_type: 'PlotType' = 'line', plot_height: 'str' = '2em', missing_vals: 'MissingVals' = 'gap', autoscale: 'bool' = False, reference_line: 'str | int | float | None' = None, reference_area: 'list[Any] | None' = None, expand_x: 'list[int] | list[float] | list[int | float] | None' = None, expand_y: 'list[int] | list[float] | list[int | float] | None' = None, options: 'dict[str, Any] | None' = None) -> 'GTSelf' Format data for nanoplot visualizations. The `fmt_nanoplot()` method is used to format data for nanoplot visualizations. This method allows for the creation of a variety of different plot types, including line, bar, and scatter plots. :::{.callout-warning} `fmt_nanoplot()` is still experimental. ::: Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices. plot_type Nanoplots can either take the form of a line plot (using `"line"`) or a bar plot (with `"bar"`). 
A line plot, by default, contains layers for a data line, data points, and a data area. With a bar plot, the always visible layer is that of the data bars. plot_height The height of the nanoplots. The default here is a sensible value of `"2em"`. missing_vals If missing values are encountered within the input data, there are four strategies available for their handling: (1) `"gap"` will show data gaps at the sites of missing data, where data lines will have discontinuities and bar plots will have missing bars; (2) `"marker"` will behave like `"gap"` but show prominent visual marks at the missing data locations; (3) `"zero"` will replace missing values with zero values; and (4) `"remove"` will remove any incoming missing values. autoscale Using `autoscale=True` will ensure that the bounds of all nanoplots produced are based on the limits of data combined from all input rows. This will result in a shared scale across all of the nanoplots (for *y*- and *x*-axis data), which is useful in those cases where the nanoplot data should be compared across rows. reference_line A reference line requires a single input to define the line. It could be a numeric value, applied to all nanoplots generated. Or, the input can be one of the following for generating the line from the underlying data: (1) `"mean"`, (2) `"median"`, (3) `"min"`, (4) `"max"`, (5) `"q1"`, (6) `"q3"`, (7) `"first"`, or (8) `"last"`. reference_area A reference area requires a list of two values for defining bottom and top boundaries (in the *y* direction) for a rectangular area. The types of values supplied are the same as those expected for `reference_line=`, which is either a numeric value or one of the following keywords for the generation of the value: (1) `"mean"`, (2) `"median"`, (3) `"min"`, (4) `"max"`, (5) `"q1"`, (6) `"q3"`, (7) `"first"`, or (8) `"last"`. The input should be a list with two elements.
expand_x Should you need to have plots expand in the *x* direction, provide one or more values to `expand_x=`. Any values provided that are outside of the range of *x*-value data provided to the plot will result in an *x*-scale expansion. expand_y Similar to `expand_x=`, one can have plots expand in the *y* direction. To make this happen, provide one or more values to `expand_y=`. Any values provided that are outside of the range of *y*-value data provided to the plot will result in a *y*-scale expansion. options By using the [`nanoplot_options()`](`great_tables.nanoplot_options`) helper function here, you can alter the layout and styling of the nanoplots in the new column. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Details ------- Nanoplots try to show individual data with reasonably good visibility. Interactivity is included as a basic feature so one can hover over the data points and vertical guides will display the value ascribed to each data point. Because **Great Tables** knows all about numeric formatting, values will be compactly formatted so as to not take up valuable real estate. While basic customization options are present in `fmt_nanoplot()`, many more opportunities for customizing nanoplots on a more granular level are possible with the aforementioned [`nanoplot_options()`](`great_tables.nanoplot_options`) helper function. With that, layers of the nanoplots can be selectively removed and the aesthetics of the remaining plot components can be modified. Examples -------- Let's create a nanoplot from a Polars DataFrame containing multiple numbers per cell. The numbers are represented here as strings, where spaces separate the values, and the same values are present in two columns: `lines` and `bars`. We will use the `fmt_nanoplot()` method twice to create a line plot and a bar plot from the data in their respective columns.
```{python} from great_tables import GT import polars as pl random_numbers_df = pl.DataFrame( { "i": range(1, 5), "lines": [ "20 23 6 7 37 23 21 4 7 16", "2.3 6.8 9.2 2.42 3.5 12.1 5.3 3.6 7.2 3.74", "-12 -5 6 3.7 0 8 -7.4", "2 0 15 7 8 10 1 24 17 13 6", ], } ).with_columns(bars=pl.col("lines")) ( GT(random_numbers_df, rowname_col="i") .fmt_nanoplot(columns="lines") .fmt_nanoplot(columns="bars", plot_type="bar") ) ``` We can always represent the input DataFrame in a different way (with list columns) and `fmt_nanoplot()` will still work. While the input data is the same as in the previous example, we'll take the opportunity here to add a reference line and a reference area to the line plot and also to the bar plot. ```{python} random_numbers_df = pl.DataFrame( { "i": range(1, 5), "lines": [ { "val": [20.0, 23.0, 6.0, 7.0, 37.0, 23.0, 21.0, 4.0, 7.0, 16.0] }, { "val": [2.3, 6.8, 9.2, 2.42, 3.5, 12.1, 5.3, 3.6, 7.2, 3.74] }, { "val": [-12.0, -5.0, 6.0, 3.7, 0.0, 8.0, -7.4] }, { "val": [2.0, 0.0, 15.0, 7.0, 8.0, 10.0, 1.0, 24.0, 17.0, 13.0, 6.0] }, ], } ).with_columns(bars=pl.col("lines")) ( GT(random_numbers_df, rowname_col="i") .fmt_nanoplot( columns="lines", reference_line="mean", reference_area=["min", "q1"] ) .fmt_nanoplot( columns="bars", plot_type="bar", reference_line="max", reference_area=["max", "median"]) ) ``` Here's an example to adjust some of the options using [`nanoplot_options()`](`great_tables.nanoplot_options`). 
```{python} from great_tables import nanoplot_options ( GT(random_numbers_df, rowname_col="i") .fmt_nanoplot( columns="lines", reference_line="mean", reference_area=["min", "q1"], options=nanoplot_options( data_point_radius=8, data_point_stroke_color="black", data_point_stroke_width=2, data_point_fill_color="white", data_line_type="straight", data_line_stroke_color="brown", data_line_stroke_width=2, data_area_fill_color="orange", vertical_guide_stroke_color="green", ), ) .fmt_nanoplot( columns="bars", plot_type="bar", reference_line="max", reference_area=["max", "median"], options=nanoplot_options( data_bar_stroke_color="gray", data_bar_stroke_width=2, data_bar_fill_color="orange", data_bar_negative_stroke_color="blue", data_bar_negative_stroke_width=1, data_bar_negative_fill_color="lightblue", reference_line_color="pink", reference_area_fill_color="bisque", vertical_guide_stroke_color="blue", ), ) ) ``` Single-value bar plots and line plots can be made with `fmt_nanoplot()`. These run in the horizontal direction, which is ideal for tabular presentation. The key thing here is that `fmt_nanoplot()` expects a column of numeric values. These plots are meant for comparison across rows so the method automatically scales the horizontal bars to facilitate this type of display. The following example shows how `fmt_nanoplot()` can be used to create single-value bar and line plots. ```{python} single_vals_df = pl.DataFrame( { "i": range(1, 6), "bars": [4.1, 1.3, -5.3, 0, 8.2], "lines": [12.44, 6.34, 5.2, -8.2, 9.23] } ) ( GT(single_vals_df, rowname_col="i") .fmt_nanoplot(columns="bars", plot_type="bar") .fmt_nanoplot(columns="lines", plot_type="line") ) ``` fmt(self: 'GTSelf', fns: 'FormatFn', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, is_substitution: 'bool' = False) -> 'GTSelf' Set a column format with a formatter function. 
The `fmt()` method provides a way to execute custom formatting functionality with raw data values in a way that can consider all output contexts. Along with the `columns` and `rows` arguments that provide some precision in targeting data cells, the `fns` argument allows you to define a function for manipulating the raw data. Parameters ---------- fns A formatting function to apply to the targeted cells. columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows In conjunction with `columns=`, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in `columns` being formatted. Alternatively, we can supply a list of row indices. is_substitution Whether the formatter is a substitution. Substitutions are run last, after other formatters. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let's use the `exibble` dataset to create a table. With the `fmt()` method, we'll add a prefix `^` and a suffix `$` to the `row` and `group` columns. ```{python} from great_tables import GT, exibble ( GT(exibble) .fmt(lambda x: f"^{x}$", columns=["row", "group"]) ) ``` sub_missing(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, missing_text: 'str | Text | None' = None) -> 'GTSelf' Substitute missing values in the table body. Wherever there is missing data (i.e., `None` values) customizable content may present better than the standard representation of missing values that would otherwise appear. The `sub_missing()` method allows for this replacement through its `missing_text=` argument. And by not supplying anything to `missing_text=`, an em dash will serve as a default indicator of missingness. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. 
rows In conjunction with `columns=`, we can specify which of their rows should be scanned for missing values. The default is all rows, resulting in all rows in all targeted columns being considered for this substitution. Alternatively, we can supply a list of row indices. missing_text The text to be used in place of missing values in the rendered table. We can optionally use the [`md()`](`great_tables.md`) or [`html()`](`great_tables.html`) helper functions to style the text as Markdown or to retain HTML elements in the text. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Using a subset of the `exibble` dataset, let's create a new table. The missing values in two selections of columns will be given different variations of replacement text (across two separate calls of `sub_missing()`). ```{python} from great_tables import GT, md, html, exibble import polars as pl import polars.selectors as cs exibble_mini = pl.from_pandas(exibble).drop("row", "group", "fctr").slice(4, 8) ( GT(exibble_mini) .sub_missing( columns=["num", "char"], missing_text="missing" ) .sub_missing( columns=cs.contains(("date", "time")) | cs.by_name("currency"), missing_text="nothing" ) ) ``` sub_zero(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, zero_text: 'str' = 'nil') -> 'GTSelf' Substitute zero values in the table body. Wherever there are numerical data that are zero in value, replacement text may be better for explanatory purposes. The `sub_zero()` method allows for this replacement through its `zero_text=` argument. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows In conjunction with `columns=`, we can specify which of their rows should be scanned for zeros.
The default is all rows, resulting in all rows in all targeted columns being considered for this substitution. Alternatively, we can supply a list of row indices. zero_text The text to be used in place of zero values in the rendered table. We can optionally use the [`md()`](`great_tables.md`) or [`html()`](`great_tables.html`) functions to style the text as Markdown or to retain HTML elements in the text. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let's generate a simple table that contains an assortment of values that could potentially undergo some substitution via the `sub_zero()` method (i.e., there are two `0` values). The ordering of the [`fmt_scientific()`](`great_tables.GT.fmt_scientific`) and `sub_zero()` calls in the example below doesn't affect the final result since any `sub_*()` method won't interfere with the formatting of the table. ```{python} from great_tables import GT import polars as pl single_vals_df = pl.DataFrame( { "i": range(1, 8), "numbers": [2.75, 0, -3.2, 8, 1e-10, 0, 2.6e9] } ) GT(single_vals_df).fmt_scientific(columns="numbers").sub_zero() ``` sub_small_vals(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, threshold: 'int | float' = 0.01, small_pattern: 'str | None' = None, sign: 'str' = '+') -> 'GTSelf' Substitute small values in the table body. Wherever there is numerical data that are very small in value, replacement text may be better for explanatory purposes. The `sub_small_vals()` method allows for this replacement through specification of a `threshold`, a `small_pattern`, and the sign of the values to be considered. The substitution will occur for those values found to be between `0` and the threshold value. This is possible for small positive and small negative values (this can be explicitly set by the `sign` option). 
Note that the interval does not include the `0` or the `threshold` value. Should you need to include zero values, use `sub_zero()`.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should be scanned for small values. The default is all rows, resulting in all rows in all targeted columns being considered for this substitution. Alternatively, we can supply a list of row indices.
threshold
    The threshold value with which values should be considered small enough for replacement.
small_pattern
    The pattern text to be used in place of the suitably small values in the rendered table. The `{x}` placeholder within the pattern will be replaced with the threshold value. If not provided, the default is `"<{x}"` for positive values and `">-{x}"` for negative values.
sign
    The sign of the numbers to be considered in the replacement. By default, we only consider positive values (`"+"`). The other option (`"-"`) can be used to consider only negative values.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Let's generate a simple, single-column table that contains an assortment of values that could potentially undergo some substitution via `sub_small_vals()`.

```{python}
from great_tables import GT
import polars as pl

single_vals_df = pl.DataFrame(
    {
        "i": range(1, 8),
        "numbers": [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
    }
)

GT(single_vals_df).fmt_number(columns="numbers").sub_small_vals()
```

We can also target small negative values by setting `sign="-"` and use a custom `small_pattern` to provide alternative replacement text.
```{python}
from great_tables import GT
import polars as pl

neg_vals_df = pl.DataFrame(
    {
        "i": range(1, 6),
        "numbers": [-0.0001, -0.005, -0.05, -1.0, -100.0]
    }
)

(
    GT(neg_vals_df)
    .fmt_number(columns="numbers")
    .sub_small_vals(sign="-", threshold=0.01, small_pattern="~0")
)
```

sub_large_vals(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, threshold: 'int | float' = 1000000000000.0, large_pattern: 'str' = '>={x}', sign: 'str' = '+') -> 'GTSelf'

Substitute large values in the table body.

Wherever there are numerical data that are very large in value, replacement text may be better for explanatory purposes. The `sub_large_vals()` method allows for this replacement through specification of a `threshold`, a `large_pattern`, and the sign (positive or negative) of the values to be considered.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should be scanned for large values. The default is all rows, resulting in all rows in all targeted columns being considered for this substitution. Alternatively, we can supply a list of row indices.
threshold
    The threshold value with which values should be considered large enough for replacement.
large_pattern
    The pattern text to be used in place of the suitably large values in the rendered table. The `{x}` placeholder within the pattern will be replaced with the threshold value.
sign
    The sign of the numbers to be considered in the replacement. By default, we only consider positive values (`"+"`). The other option (`"-"`) can be used to consider only negative values. Note that when `sign="-"` and the default `large_pattern=">={x}"` is used, the `">="` is automatically changed to `"<="`.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.
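
The interplay of `threshold`, `large_pattern`, and `sign` described in the parameters above can be modelled in a few lines of plain Python. This is only a sketch of one reading of the documentation, not the Great Tables implementation: the helper name `large_val_text` is hypothetical, the inclusive comparison is an assumption inferred from the `>=` in the default pattern, and the way the threshold is rendered in place of `{x}` (including the leading minus sign for negative values) is a simplification chosen for this illustration.

```python
def large_val_text(value, threshold=1e12, large_pattern=">={x}", sign="+"):
    """Return replacement text for a 'large' value, or None to leave it as is.

    Illustrative model only; not part of Great Tables.
    """
    if sign not in ("+", "-"):
        raise ValueError("sign must be '+' or '-'")

    if sign == "+":
        # Assumption: the comparison is inclusive, matching ">=" in the pattern.
        is_large = value >= threshold
        pattern = large_pattern
    else:
        is_large = value <= -threshold
        # Documented behaviour: the default ">=" is switched to "<=" for negatives.
        pattern = "<={x}" if large_pattern == ">={x}" else large_pattern

    if not is_large:
        return None

    # Assumption: the signed threshold is shown in general number format.
    shown = threshold if sign == "+" else -threshold
    return pattern.replace("{x}", f"{shown:g}")


values = [0.0, 10.0, 1e8, 1e9, 1e10, 1e11, 1e12]
print([large_val_text(v, threshold=1e10) for v in values])
print(large_val_text(-1e6, threshold=1000, sign="-"))
```

Values for which the helper returns `None` would keep whatever text the formatting step produced; only the remaining cells would receive the pattern text.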
Examples
--------
Let's generate a simple, single-column table that contains an assortment of values that could potentially undergo some substitution via `sub_large_vals()`.

```{python}
from great_tables import GT
import polars as pl

single_vals_df = pl.DataFrame(
    {
        "i": range(1, 8),
        "numbers": [0.0, 10.0, 1e8, 1e9, 1e10, 1e11, 1e12]
    }
)

GT(single_vals_df).fmt_number(columns="numbers").sub_large_vals(threshold=1e10)
```

Large negative values can also be targeted with `sign="-"`. Notice the `">="` in the default pattern is automatically changed to `"<="` when dealing with negative values.

```{python}
from great_tables import GT
import polars as pl

neg_vals_df = pl.DataFrame(
    {
        "i": range(1, 5),
        "numbers": [-10.0, -500.0, -1e6, -1e12]
    }
)

(
    GT(neg_vals_df)
    .fmt_number(columns="numbers")
    .sub_large_vals(threshold=1000, sign="-")
)
```

sub_values(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, values: 'list[Any] | Any | None' = None, pattern: 'str | None' = None, fn: 'Callable[..., bool] | None' = None, replacement: 'str | int | float | None' = None) -> 'GTSelf'

Substitute targeted values in the table body.

Should you need to replace specific cell values with custom text, `sub_values()` can be a good choice. We can target cells for replacement through value, regex, and custom matching rules.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which of their rows should be targeted for substitution. The default is all rows, resulting in all rows in all targeted columns being considered for this substitution. Alternatively, we can supply a list of row indices.
values
    The specific value or values that should be replaced with a `replacement` value. If `pattern` is also supplied then `values` will be ignored.
pattern
    A regex pattern that can target solely those values in character-based columns.
    If `values` is also supplied, `pattern` will take precedence.
fn
    A supplied function that operates on each cell value `x` and should return a boolean indicating whether that value should be replaced. If either of `values` or `pattern` is also supplied, `fn` will take precedence.
replacement
    The replacement value for any cell values matched by either `values`, `pattern`, or `fn`. Must be a string or numeric value.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Let's create an input table with three columns containing an assortment of values that could potentially undergo some substitution via `sub_values()`.

```{python}
from great_tables import GT
import polars as pl

tbl = pl.DataFrame(
    {
        "num_1": [-0.01, 74.0, None, 0.0, 500.0, 0.001, 84.3],
        "int_1": [1, -100000, 800, 5, None, 1, -32],
        "lett": ["A", "B", "C", "D", "E", "F", "G"],
    }
)

GT(tbl).sub_values(values=[74, 500], replacement="—")
```

For the most flexibility, use the `fn` argument. The function you provide should accept a cell value and return a boolean indicating whether it should be replaced.

```{python}
from great_tables import GT
import polars as pl

tbl = pl.DataFrame(
    {
        "num_1": [-0.01, 74.0, None, 0.0, 500.0, 0.001, 84.3],
        "int_1": [1, -100000, 800, 5, None, 1, -32],
        "lett": ["A", "B", "C", "D", "E", "F", "G"],
    }
)

(
    GT(tbl)
    .sub_values(
        fn=lambda x: isinstance(x, (int, float)) and x >= 0 and x < 50,
        replacement="small"
    )
)
```

data_color(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'RowSelectExpr' = None, palette: 'str | list[str] | None' = None, domain: 'list[str] | list[int] | list[float] | None' = None, na_color: 'str | None' = None, alpha: 'int | float | None' = None, reverse: 'bool' = False, autocolor_text: 'bool' = True, truncate: 'bool' = False) -> 'GTSelf'

Perform data cell colorization.

It's possible to add color to data cells according to their values with the `data_color()` method. There is a multitude of ways to perform data cell colorizing here:

- targeting: we can constrain which columns should receive the colorization treatment through the `columns=` argument
- color palettes: with `palette=` we could supply a list of colors composed of hexadecimal values or color names
- value domain: we can either opt to have the range of values define the domain, or, specify one explicitly with the `domain=` argument
- text autocoloring: `data_color()` will automatically recolor the foreground text to provide the best contrast (can be deactivated with `autocolor_text=False`)

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
rows
    In conjunction with `columns=`, we can specify which rows should be colored. By default, all rows in the targeted columns will be colored. Alternatively, we can provide a list of row indices.
palette
    The color palette to use. This should be a list of colors (e.g., `["#FF0000", "#00FF00", "#0000FF"]`). A ColorBrewer palette could also be used, just supply the name (reference available in the *Color palette access from ColorBrewer and viridis* section). If `None`, then a default palette will be used.
domain
    The domain of values to use for the color scheme. This can be a list of floats, integers, or strings. If `None`, then the domain will be inferred from the data values.
na_color
    The color to use for missing values. If `None`, then the default color (`"#808080"`) will be used.
alpha
    An optional, fixed alpha transparency value that will be applied to all color palette values.
reverse
    Should the colors computed operate in the reverse order? If `True` then colors that normally change from red to blue will change in the opposite direction.
autocolor_text
    Whether or not to automatically color the text of the data values.
    If `True`, then the text will be colored according to the background color of the cell.
truncate
    If `True`, then any values that fall outside of the domain will be truncated to the minimum or maximum value of the domain (will have the same color). If `False`, then any values that fall outside of the domain will be set to `NaN` and will follow the `na_color=` color.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Color palette access from ColorBrewer and viridis
-------------------------------------------------
All palettes from the ColorBrewer package can be accessed by providing the palette name in `palette=`. There are 35 available palettes:

|    | Palette Name | Colors | Category    | Colorblind Friendly |
|----|--------------|--------|-------------|---------------------|
| 1  | `"BrBG"`     | 11     | Diverging   | Yes                 |
| 2  | `"PiYG"`     | 11     | Diverging   | Yes                 |
| 3  | `"PRGn"`     | 11     | Diverging   | Yes                 |
| 4  | `"PuOr"`     | 11     | Diverging   | Yes                 |
| 5  | `"RdBu"`     | 11     | Diverging   | Yes                 |
| 6  | `"RdYlBu"`   | 11     | Diverging   | Yes                 |
| 7  | `"RdGy"`     | 11     | Diverging   | No                  |
| 8  | `"RdYlGn"`   | 11     | Diverging   | No                  |
| 9  | `"Spectral"` | 11     | Diverging   | No                  |
| 10 | `"Dark2"`    | 8      | Qualitative | Yes                 |
| 11 | `"Paired"`   | 12     | Qualitative | Yes                 |
| 12 | `"Set1"`     | 9      | Qualitative | No                  |
| 13 | `"Set2"`     | 8      | Qualitative | Yes                 |
| 14 | `"Set3"`     | 12     | Qualitative | No                  |
| 15 | `"Accent"`   | 8      | Qualitative | No                  |
| 16 | `"Pastel1"`  | 9      | Qualitative | No                  |
| 17 | `"Pastel2"`  | 8      | Qualitative | No                  |
| 18 | `"Blues"`    | 9      | Sequential  | Yes                 |
| 19 | `"BuGn"`     | 9      | Sequential  | Yes                 |
| 20 | `"BuPu"`     | 9      | Sequential  | Yes                 |
| 21 | `"GnBu"`     | 9      | Sequential  | Yes                 |
| 22 | `"Greens"`   | 9      | Sequential  | Yes                 |
| 23 | `"Greys"`    | 9      | Sequential  | Yes                 |
| 24 | `"Oranges"`  | 9      | Sequential  | Yes                 |
| 25 | `"OrRd"`     | 9      | Sequential  | Yes                 |
| 26 | `"PuBu"`     | 9      | Sequential  | Yes                 |
| 27 | `"PuBuGn"`   | 9      | Sequential  | Yes                 |
| 28 | `"PuRd"`     | 9      | Sequential  | Yes                 |
| 29 | `"Purples"`  | 9      | Sequential  | Yes                 |
| 30 | `"RdPu"`     | 9      | Sequential  | Yes                 |
| 31 | `"Reds"`     | 9      | Sequential  | Yes                 |
| 32 | `"YlGn"`     | 9      | Sequential  | Yes                 |
| 33 | `"YlGnBu"`   | 9      | Sequential  | Yes                 |
| 34 | `"YlOrBr"`   | 9      | Sequential  | Yes                 |
| 35 | `"YlOrRd"`   | 9      | Sequential  | Yes                 |

We can also use the *viridis* and associated color palettes by providing to `palette=` any of the following string values: `"viridis"`, `"plasma"`, `"inferno"`, `"magma"`, or `"cividis"`.

Examples
--------
The `data_color()` method can be used without any supplied arguments to colorize a table. Let's do this with the `exibble` dataset:

```{python}
from great_tables import GT
from great_tables.data import exibble

GT(exibble).data_color()
```

What's happened is that `data_color()` applies background colors to all cells of every column with the palette of eight colors. Numeric columns will use 'numeric' methodology for color scaling whereas string-based columns will use the 'factor' methodology. The text color undergoes an automatic modification that maximizes contrast (since `autocolor_text=True` by default).

We can target specific columns and apply color to just those columns. Let's do that and also supply `palette=` values of `"red"` and `"green"`.

```{python}
GT(exibble).data_color(
    columns=["num", "currency"],
    palette=["red", "green"]
)
```

With those options in place we see that only the numeric columns `num` and `currency` received color treatments. Moreover, the palette colors were mapped to the lower and upper limits of the data in each column; interpolated colors were used for the values in between the numeric limits of the two columns.

We can manually set the limits of the data with the `domain=` argument (which is preferable in most cases). Let's colorize just the currency column and set `domain=[0, 50]`.
Any values that are either missing or lie outside of the domain will be colorized with the `na_color=` color (so we'll set that to `"lightgray"`).

```{python}
GT(exibble).data_color(
    columns="currency",
    palette=["red", "green"],
    domain=[0, 50],
    na_color="lightgray"
)
```

## Text transformation

The `text_*()` methods take cell data that have been solidified into strings and allow for flexible transformations of those string values. Whereas the `fmt_*()` and `sub_*()` methods are phases 1 and 2 of cell data metamorphoses, the text transformation methods are the final phase, acting on strings generated by the formatting and substitution methods with no reference to the source values.

text_replace(self: 'GTSelf', pattern: 'str', replacement: 'str', locations: 'Loc | list[Loc] | None' = None) -> 'GTSelf'

Perform targeted text replacement with a regex pattern.

With `text_replace()` we can target cells in specific locations and replace text fragments matching a regular expression pattern. This operates on the already-formatted cell content (i.e., after `fmt_*()` methods have been applied).

Parameters
----------
pattern
    A regex pattern used to target text fragments in the resolved cells.
replacement
    The replacement text for any matched text fragments. Backreferences (e.g., `"\\1"`) can be used to refer to capture groups in the pattern.
locations
    The cell or set of cells to be associated with the text replacement. Supported locations include `loc.body()`, `loc.stub()`, `loc.row_groups()`, and `loc.column_labels()`. If `None`, defaults to `loc.body()`.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Use `text_replace()` to add HTML emphasis tags around text in parentheses.
```{python}
import pandas as pd
from great_tables import GT, loc

df = pd.DataFrame({"item": ["Column A (details)", "Column B (info)"], "value": [1, 2]})

(
    GT(df)
    .text_replace(
        pattern=r"\((.+?)\)",
        # Wrap the captured text in <em> tags, keeping the parentheses.
        replacement=r"(<em>\1</em>)",
        locations=loc.body(columns="item"),
    )
)
```

Replace underscores with spaces in the stub (row labels).

```{python}
from great_tables import GT, loc, exibble

(
    GT(exibble[["num", "char", "row"]].head(4), rowname_col="row")
    .text_replace(pattern="_", replacement=" ", locations=loc.stub())
)
```

text_case_when(self: 'GTSelf', *cases: 'tuple[Callable[[str], bool], str]', default: 'str | None' = None, locations: 'Loc | list[Loc] | None' = None) -> 'GTSelf'

Perform text replacements using a case-when approach.

With `text_case_when()` we supply a sequence of cases as `(predicate, replacement)` tuples. Each predicate is a function that takes the cell text (as a string) and returns `True` or `False`. The first predicate that returns `True` determines the replacement text. This is analogous to a series of if/elif statements applied to each cell.

Parameters
----------
*cases
    One or more tuples of the form `(predicate_fn, new_text)` where `predicate_fn` is a callable that accepts a string and returns a boolean, and `new_text` is the replacement string to use when the predicate is `True`.
default
    The replacement text to use when no predicate matches. If `None` (the default), unmatched cells are left unchanged.
locations
    The cell or set of cells to be associated with the text replacement. Supported locations include `loc.body()`, `loc.stub()`, `loc.row_groups()`, and `loc.column_labels()`. If `None`, defaults to `loc.body()`.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Conditionally replace cell values based on their content.
```{python}
import pandas as pd
from great_tables import GT, loc

df = pd.DataFrame({"score": [95, 72, 88, 61, 100]})

(
    GT(df)
    .fmt_number(columns="score", decimals=0)
    .text_case_when(
        (lambda x: int(x) >= 90, "A"),
        (lambda x: int(x) >= 80, "B"),
        (lambda x: int(x) >= 70, "C"),
        default="F",
        locations=loc.body(columns="score"),
    )
)
```

Use string methods in predicates to match patterns.

```{python}
from great_tables import GT, loc, exibble

(
    GT(exibble[["num", "char"]].head(4))
    .text_case_when(
        (lambda x: x.startswith("a"), "Starts with A"),
        (lambda x: len(x) > 6, "Long text"),
        default="other",
        locations=loc.body(columns="char"),
    )
)
```

text_case_match(self: 'GTSelf', *cases: 'tuple[str | list[str], str]', default: 'str | None' = None, replace: "Literal['all', 'partial']" = 'all', locations: 'Loc | list[Loc] | None' = None) -> 'GTSelf'

Perform text replacements with a switch-like approach.

With `text_case_match()` we can supply a sequence of matching cases in the form of `(old_text, new_text)` tuples. Each tuple's first element specifies text to match (either a single string or a list of strings) and the second element provides the replacement. By default, the matching is performed on the entire cell text (`replace="all"`); use `replace="partial"` for substring matching and replacement.

Parameters
----------
*cases
    One or more tuples of the form `(old_text, new_text)` where `old_text` is a string or list of strings to match, and `new_text` is the replacement string.
default
    The replacement text to use when cell values aren't matched by any of the supplied cases. If `None` (the default), unmatched cells are left unchanged.
replace
    The method for text replacement. Use `"all"` (the default) to match and replace the entire cell text, or `"partial"` to match and replace substrings within the cell text.
locations
    The cell or set of cells to be associated with the text replacement.
    Supported locations include `loc.body()`, `loc.stub()`, `loc.row_groups()`, and `loc.column_labels()`. If `None`, defaults to `loc.body()`.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Replace specific cell values in the `char` column with different text.

```{python}
from great_tables import GT, loc, exibble

(
    GT(exibble[["num", "char"]].head(4))
    .text_case_match(
        ("apricot", "APRICOT"),
        (["banana", "coconut"], "tropical fruit"),
        default="other",
        locations=loc.body(columns="char"),
    )
)
```

Use `replace="partial"` to perform substring replacements.

```{python}
from great_tables import GT, loc, exibble

(
    GT(exibble[["num", "char"]].head(4))
    .text_case_match(
        ("an", "@"),
        replace="partial",
        locations=loc.body(columns="char"),
    )
)
```

text_transform(self: 'GTSelf', locations: 'Loc | list[Loc]', fn: 'Callable[[str], str]') -> 'GTSelf'

Apply a custom text transformation to cells at specified locations.

With the `text_transform()` method we can target specific cells and apply a text transformation function to their already-formatted content. This is useful for modifying the rendered text of cells after all formatting (via `fmt_*()` methods) has been applied.

Parameters
----------
locations
    The cell or set of cells to be associated with the text transformation. Supported locations include `loc.body()`, `loc.stub()`, `loc.row_groups()`, and `loc.column_labels()`.
fn
    A function that takes a cell's text content as a string and returns the transformed string.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Let's use the `exibble` dataset to demonstrate `text_transform()`. We'll format the `num` column and then apply a text transformation to wrap the values in parentheses.
```{python}
from great_tables import GT, loc, exibble

(
    GT(exibble[["num", "char"]].head(4))
    .fmt_number(columns="num", decimals=1)
    .text_transform(
        locations=loc.body(columns="num"),
        fn=lambda x: f"({x})",
    )
)
```

Using `text_transform()` we can also convert specific cells to uppercase. Here we target only the first two rows of the `char` column.

```{python}
from great_tables import GT, loc, exibble

(
    GT(exibble[["num", "char"]].head(4))
    .text_transform(
        locations=loc.body(columns="char", rows=[0, 1]),
        fn=lambda x: x.upper(),
    )
)
```

Multiple locations can be targeted at once by passing a list. In this example, we add a prefix to all cells in both the `num` and `char` columns.

```{python}
from great_tables import GT, loc, exibble

(
    GT(exibble[["num", "char"]].head(4))
    .fmt_number(columns="num", decimals=2)
    .text_transform(
        locations=[loc.body(columns="num"), loc.body(columns="char")],
        fn=lambda x: f"~ {x}",
    )
)
```

## Modifying columns

The `cols_*()` methods allow for modifications that act on entire columns. This includes alignment of the data in columns ([`cols_align()`](`great_tables.GT.cols_align`)), hiding columns from view ([`cols_hide()`](`great_tables.GT.cols_hide`)), re-labeling the column labels ([`cols_label()`](`great_tables.GT.cols_label`)), and moving columns around (with the `cols_move*()` methods).

cols_align(self: 'GTSelf', align: 'str' = 'left', columns: 'SelectExpr' = None) -> 'GTSelf'

Set the alignment of one or more columns.

The `cols_align()` method sets the alignment of one or more columns. The `align` argument can be set to one of `"left"`, `"center"`, or `"right"` and the `columns` argument can be used to specify which columns to apply the alignment to. If `columns` is not specified, the alignment is applied to all columns.

Parameters
----------
align
    The alignment to apply. Must be one of `"left"`, `"center"`, or `"right"`.
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
    If `None`, the alignment is applied to all columns.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Let's use the `countrypops` dataset to create a small table. We can change the alignment of the `population` column with `cols_align()`. In this example, the column label and body cells of `population` will be aligned to the left.

```{python}
from great_tables import GT
from great_tables.data import countrypops

countrypops_mini = countrypops.loc[countrypops["country_name"] == "San Marino"][
    ["country_name", "year", "population"]
].tail(5)

(
    GT(countrypops_mini, rowname_col="year", groupname_col="country_name")
    .cols_align(align="left", columns="population")
)
```

cols_width(self: 'GTSelf', cases: 'dict[str, str] | None' = None, **kwargs: 'str') -> 'GTSelf'

Set the widths of columns.

Manual specifications of column widths can be performed using the `cols_width()` method. We choose which columns get specific widths. This can be in units of pixels or as percentages. Width assignments are supplied inside of a dictionary where columns are the keys and the corresponding width is the value.

Parameters
----------
cases
    A dictionary where the keys are column names and the values are the widths. Widths can be specified in pixels (e.g., `"50px"`) or as percentages (e.g., `"20%"`).
**kwargs
    Keyword arguments to specify column widths. Each keyword corresponds to a column name, with its value indicating the width in pixels or percentages.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Let's use select columns from the `exibble` dataset to create a new table. We can specify the widths of columns with `cols_width()`. This is done by specifying the exact widths for table columns in a dictionary.
In this example, we'll set the width of the `num` column to `"150px"`, the `char` column to `"100px"`, and the `date` column to `"300px"`. All other columns won't be affected (their widths will be automatically set by their content).

```{python}
import warnings
from great_tables import GT, exibble

warnings.filterwarnings("ignore")

exibble_mini = exibble[["num", "char", "date", "datetime", "row"]].head(5)

(
    GT(exibble_mini)
    .cols_width(
        cases={
            "num": "150px",
            "char": "100px",
            "date": "300px"
        }
    )
)
```

We can also specify the widths of columns as percentages. In this example, we'll set the width of the `num` column to `"20%"`, the `char` column to `"10%"`, and the `date` column to `"30%"`. Note that the percentages are relative and don't need to sum to 100%.

```{python}
(
    GT(exibble_mini)
    .cols_width(
        cases={
            "num": "20%",
            "char": "10%",
            "date": "30%"
        }
    )
)
```

We can also mix and match pixel and percentage widths. In this example, we'll set the width of the `num` column to `"150px"`, the `char` column to `"10%"`, and the `date` column to `"30%"`.

```{python}
(
    GT(exibble_mini)
    .cols_width(
        cases={
            "num": "150px",
            "char": "10%",
            "date": "30%"
        }
    )
)
```

If we set the width of all columns, the table will be forced to use the specified widths (i.e., a column width less than the content width will be honored). In this next example, we'll set widths for all columns. This is a good way to ensure that the widths you specify are fully respected (and not overridden by automatic width calculations).

```{python}
(
    GT(exibble_mini)
    .cols_width(
        cases={
            "num": "30px",
            "char": "100px",
            "date": "100px",
            "datetime": "200px",
            "row": "50px"
        }
    )
)
```

Notice that in the above example, the `num` column is very small (only `30px`) and the content overflows. When not specifying the width of all columns, the table will automatically adjust the column widths based on the content (and you wouldn't get the overflowing behavior seen in the previous example).
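
The signature of `cols_width()` accepts widths both through the `cases` dictionary and through keyword arguments, each given in pixels or as a percentage. The plain-Python sketch below models how such inputs could be gathered into one mapping and classified by unit. It is an illustration only, not the Great Tables implementation: the helper name `collect_widths` is hypothetical, letting keyword arguments override `cases` on a name clash is an assumption, and the sketch accepts only the two units named in the parameter descriptions.

```python
import re

# Matches a non-negative number followed by "px" or "%", e.g. "150px" or "12.5%".
_WIDTH_RE = re.compile(r"^(\d+(?:\.\d+)?)(px|%)$")


def collect_widths(cases=None, **kwargs):
    """Merge dictionary and keyword widths into {column: (amount, unit)}.

    Hypothetical helper for illustration; not part of Great Tables.
    """
    # Assumption: keyword arguments win when a column appears in both places.
    merged = {**(cases or {}), **kwargs}

    parsed = {}
    for column, width in merged.items():
        match = _WIDTH_RE.match(width.strip())
        if match is None:
            raise ValueError(f"unsupported width for {column!r}: {width!r}")
        parsed[column] = (float(match.group(1)), match.group(2))
    return parsed


# Mixed pixel and percentage widths, supplied in both styles.
widths = collect_widths({"num": "150px", "char": "10%"}, date="30%")
print(widths)
```

Under the same assumptions, a call such as `cols_width(num="150px", char="10%")` would describe the same widths as `cols_width(cases={"num": "150px", "char": "10%"})`.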

cols_label(self: 'GTSelf', cases: 'dict[str, str | BaseText] | None' = None, **kwargs: 'str | BaseText') -> 'GTSelf'

Relabel one or more columns.

There are three important pieces to labelling:

* Each argument has the form: {name in data} = {new label}.
* Multiple columns may be given the same label.
* Labels may use curly braces to apply special formatting, called unit notation. For example, "area ({{ft^2}})" would appear as "area (ft²)".

See [`define_units()`](`great_tables.define_units`) for details on unit notation.

Parameters
----------
cases
    A dictionary where the keys are column names and the values are the labels. Labels may use [`md()`](`great_tables.md`) or [`html()`](`great_tables.html`) helpers for formatting.
**kwargs
    Keyword arguments to specify column labels. Each keyword corresponds to a column name, with its value indicating the new label.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Notes
-----
GT always selects columns using their name in the underlying data. This means that a column's label is purely for final presentation.

Examples
--------
The example below relabels columns from the `countrypops` data to start with uppercase.

```{python}
from great_tables import GT
from great_tables.data import countrypops

countrypops_mini = countrypops.loc[countrypops["country_name"] == "Uganda"][
    ["country_name", "year", "population"]
].tail(5)

(
    GT(countrypops_mini)
    .cols_label(
        country_name="Country Name",
        year="Year",
        population="Population"
    )
)
```

Note that we supplied the name of the column as the key, and the new label as the value.

We can also use Markdown formatting for the column labels. In this example, we'll use `md("*Population*")` to make the label italicized.
```{python}
from great_tables import GT, md
from great_tables.data import countrypops

(
    GT(countrypops_mini)
    .cols_label(
        country_name="Name",
        year="Year",
        population=md("*Population*")
    )
)
```

We can also use unit notation to format the column labels. In this example, we'll use `{{cm^3 molecules^-1 s^-1}}` for part of the label for the `OH_k298` column.

```{python}
from great_tables import GT
from great_tables.data import reactions
import polars as pl

reactions_mini = (
    pl.from_pandas(reactions)
    .filter(pl.col("cmpd_type") == "mercaptan")
    .select(["cmpd_name", "OH_k298"])
)

(
    GT(reactions_mini)
    .fmt_scientific("OH_k298")
    .sub_missing()
    .cols_label(
        cmpd_name="Compound Name",
        OH_k298="OH, {{cm^3 molecules^-1 s^-1}}",
    )
)
```

cols_label_with(self: 'GTSelf', columns: 'SelectExpr' = None, fn: 'Callable[[str], str] | None' = None) -> 'GTSelf'

Relabel one or more columns using a function.

The `cols_label_with()` method allows for modification of column labels through a supplied function. By default, the function will be invoked on all column labels but this can be limited to a subset via the `columns` parameter.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list.
fn
    A function that accepts a column name as input and returns a label as output.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Notes
-----
GT always selects columns using their name in the underlying data. This means that a column's label is purely for final presentation.

Examples
--------
Let's use a subset of the `sp500` dataset to create a table.

```{python}
from great_tables import GT, md
from great_tables.data import sp500

gt = GT(sp500.head())
gt
```

We can pass `str.upper` to the `fn` parameter to convert all column labels to uppercase.
```{python}
gt.cols_label_with(fn=str.upper)
```

Another useful application is formatting column labels with `md()`, provided by **Great Tables**. For example, the following code makes the `date` and `adj_close` column labels bold using Markdown syntax.

```{python}
gt.cols_label_with(["date", "adj_close"], lambda x: md(f"**{x}**"))
```

cols_label_rotate(self: 'GTSelf', columns: 'SelectExpr' = None, dir: "Literal['sideways-lr', 'sideways-rl', 'vertical-lr']" = 'sideways-lr', align: "Literal['left', 'center', 'right'] | None" = None, padding: 'int' = 8) -> 'GTSelf'

Rotate the column label for one or more columns.

The `cols_label_rotate()` method sets the orientation of the column label text to make it flow vertically. The `dir` argument can be set to one of `"sideways-lr"`, `"sideways-rl"`, or `"vertical-lr"`, and the `columns` argument can be used to specify which columns to apply the rotation to. If `columns` is not specified, the rotation is applied to all columns.

Parameters
----------
columns
    The columns to target. Can either be a single column name or a series of column names provided in a list. If `None`, the rotation is applied to all columns.
dir
    A string that gives the direction of the text. Options: `"sideways-lr"`, `"sideways-rl"`, `"vertical-lr"`. See the Note section for information on text layout.
align
    The alignment to apply. Must be one of `"left"`, `"center"`, or `"right"`, or `None` (the default, as given in the signature). If text is laid out vertically, this affects alignment along the vertical axis.
padding
    The vertical padding to apply to the column labels.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
The example below rotates column labels such that the text is set to the left.
```{python} from great_tables import GT, style, loc, exibble exibble_sm = exibble[["num", "fctr", "row", "group"]] ( GT(exibble_sm, rowname_col="row", groupname_col="group") .cols_label_rotate(columns=["num", "fctr"]) ) ``` Other styles you provide won't override the column label rotation directives. Here we set the text to the right. ```{python} ( GT(exibble_sm, rowname_col="row", groupname_col="group") .cols_label_rotate(columns=["num", "fctr"], dir="vertical-lr") .tab_style(style=style.text(weight="bold"), locations=loc.column_labels(["fctr"])) ) ``` Labels that are restricted by the height of the stub head will wrap horizontally. ```{python} ( GT(exibble_sm, rowname_col="row", groupname_col="group") .cols_label({"fctr": "A longer description of the values in the column below"}) .cols_label_rotate(columns=["num", "fctr"], dir="sideways-lr") .tab_style( style=[style.text(weight="bold"), style.css(rule="height: 200px;")], locations=loc.column_labels(["fctr"]) ) ) ``` Note -------- The `dir` parameter uses the following keywords to alter the direction of the column label text. ##### `"sideways-lr"` For ltr scripts, content flows vertically from bottom to top. For rtl scripts, content flows vertically from top to bottom. Characters are set sideways toward the left. Overflow lines are appended to the right. ##### `"sideways-rl"` For ltr scripts, content flows vertically from top to bottom. For rtl scripts, content flows vertically from bottom to top. Characters are set sideways toward the right. Overflow lines are appended to the left. ##### `"vertical-lr"` Identical to `"sideways-rl"`, but overflow lines are appended to the right. cols_move(self: 'GTSelf', columns: 'SelectExpr', after: 'str') -> 'GTSelf' Move one or more columns. On those occasions where you need to move columns this way or that way, we can make use of the `cols_move()` method. 
While it's true that the movement of columns can be done upstream of **Great Tables**, it is much easier and less error-prone to use the method provided here. The movement procedure here takes one or more specified columns (in the `columns` argument) and places them to the right of a different column (the `after` argument). The ordering of the `columns` to be moved is preserved, as is the ordering of all other columns in the table. The columns supplied in `columns` must all exist in the table and none of them can be in the `after` argument. The `after` column must also exist and only one column should be provided here. If you need to place one or more columns at the beginning of the column series, the `cols_move_to_start()` method should be used. Similarly, if those columns to move should be placed at the end of the column series then use `cols_move_to_end()`. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. after The column after which the `columns` should be placed. This can be any column name that exists in the table. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let's use the `countrypops` dataset to create a table. We'll choose to position the `population` column after the `country_name` column by using the `cols_move()` method. ```{python} from great_tables import GT from great_tables.data import countrypops countrypops_mini = countrypops.loc[countrypops["country_name"] == "Japan"][ ["country_name", "year", "population"] ].tail(5) ( GT(countrypops_mini) .cols_move( columns="population", after="country_name" ) ) ``` cols_move_to_start(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf' Move one or more columns to the start. We can easily move a set of columns to the beginning of the column series and we only need to specify which `columns`.
It's possible to do this upstream of **Great Tables**; however, it is easier with this method and it presents less possibility for error. The ordering of the `columns` that are moved to the start is preserved (same with the ordering of all other columns in the table). The columns supplied in `columns` must all exist in the table. If you need to place one or more columns at the end of the column series, the `cols_move_to_end()` method should be used. More control is offered with the `cols_move()` method, where columns could be placed after a specific column. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- For this example, we'll use a portion of the `countrypops` dataset to create a simple table. Let's move the `year` column, which is the middle column, to the start of the column series with the `cols_move_to_start()` method. ```{python} from great_tables import GT from great_tables.data import countrypops countrypops_mini = countrypops.loc[countrypops["country_name"] == "Fiji"][ ["country_name", "year", "population"] ].tail(5) GT(countrypops_mini).cols_move_to_start(columns="year") ``` We can also move multiple columns at a time. With the same `countrypops`-based table (`countrypops_mini`), let's move both the `year` and `population` columns to the start of the column series. ```{python} GT(countrypops_mini).cols_move_to_start(columns=["year", "population"]) ``` cols_move_to_end(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf' Move one or more columns to the end. We can easily move a set of columns to the end of the column series and we only need to specify which `columns`. It's possible to do this upstream of **Great Tables**; however, it is easier with this method and it presents less possibility for error.
The ordering of the `columns` that are moved to the end is preserved (same with the ordering of all other columns in the table). Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- For this example, we'll use a portion of the `countrypops` dataset to create a simple table. Let's move the `year` column, which is the middle column, to the end of the column series with the `cols_move_to_end()` method. ```{python} from great_tables import GT from great_tables.data import countrypops countrypops_mini = countrypops.loc[countrypops["country_name"] == "Benin"][ ["country_name", "year", "population"] ].tail(5) GT(countrypops_mini).cols_move_to_end(columns="year") ``` We can also move multiple columns at a time. With the same `countrypops`-based table (`countrypops_mini`), let's move both the `year` and `country_name` columns to the end of the column series. ```{python} GT(countrypops_mini).cols_move_to_end(columns=["year", "country_name"]) ``` cols_reorder(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf' Reorder all columns in a specified order. The `cols_reorder()` method allows you to completely rearrange the column order of a table. Provide all column names in the exact order you want them to appear. This is useful when you need full control over the column layout and want to express the entire ordering in a single call, rather than using multiple `cols_move()`, `cols_move_to_start()`, or `cols_move_to_end()` calls. Every column in the table must appear exactly once in the `columns=` list. If any columns are missing or extra names are provided, a `ValueError` will be raised. Parameters ---------- columns A list of all column names in the desired display order. 
This can be a list of column name strings or a column selection expression (e.g., Polars selectors). All columns in the table must be included exactly once. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Raises ------ ValueError If the provided columns do not match all columns in the table (e.g., missing columns, extra columns, or duplicates). Examples -------- Let's use a subset of columns from the `exibble` dataset to create a table. ```{python} from great_tables import GT from great_tables.data import exibble exibble_mini = exibble[["num", "char", "fctr", "date", "time"]] GT(exibble_mini) ``` Now, let's reorder the columns so that `fctr` and `date` come first, followed by the remaining columns in a custom order: ```{python} ( GT(exibble_mini) .cols_reorder(["fctr", "date", "time", "char", "num"]) ) ``` For tables with many columns, you can use Python's iterable unpacking to build the column list programmatically. Here we use the full `exibble` dataset (9 columns) and move `fctr` to the front while pushing `num` and `char` to the end—without typing every column name in between: ```{python} # Unpack the first three column names and capture all remaining ones in `rest` # exibble.columns is: ["num", "char", "fctr", "date", "time", "datetime", "currency", "row", "group"] num, char, fctr, *rest = exibble.columns # Build the new order: fctr first, then all middle columns in their # original order, and finally char and num moved to the end ( GT(exibble) .cols_reorder([fctr, *rest, char, num]) ) ``` This unpacking technique is especially handy for wide tables where you want to pin a few columns to the start or end without manually listing every column in between. The `*rest` variable automatically adapts if columns are added to or removed from the dataset, making your table code more resilient to upstream schema changes. 
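The "every column exactly once" rule and the unpacking idiom above can be tried out in plain Python before a list is handed to `cols_reorder()`. The sketch below is only an illustration: `is_complete_ordering()` is a hypothetical helper (not part of **Great Tables**), and the column names are the nine `exibble` columns listed in the example above.

```python
# Column names as listed for `exibble` in the example above.
columns = [
    "num", "char", "fctr", "date", "time",
    "datetime", "currency", "row", "group",
]

# Peel off the first three names; `rest` collects everything else in order.
num, char, fctr, *rest = columns

# fctr first, the untouched middle columns next, then char and num at the end.
new_order = [fctr, *rest, char, num]


def is_complete_ordering(candidate, all_columns):
    """Hypothetical pre-check: True when `candidate` lists every column in
    `all_columns` exactly once (no missing, extra, or duplicated names)."""
    return sorted(candidate) == sorted(all_columns)


print(new_order)
print(is_complete_ordering(new_order, columns))         # complete ordering
print(is_complete_ordering(["fctr", "date"], columns))  # columns are missing
```

A list that passes this check is the kind of input the documentation above says `cols_reorder()` expects; a list that fails it corresponds to the cases described under Raises.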
cols_hide(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf' Hide one or more columns. The `cols_hide()` method allows us to hide one or more columns from appearing in the final output table. While it's possible and often desirable to omit columns from the input table data before introduction to the `GT()` class, there can be cases where the data in certain columns is useful (as a column reference during formatting of other columns) but the final display of those columns is not necessary. Parameters ---------- columns The columns to hide in the output display table. Can either be a single column name or a series of column names provided in a list. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- For this example, we'll use a portion of the `countrypops` dataset to create a simple table. Let's hide the `year` column with the `cols_hide()` method. ```{python} from great_tables import GT from great_tables.data import countrypops countrypops_mini = countrypops.loc[countrypops["country_name"] == "Benin"][ ["country_name", "year", "population"] ].tail(5) GT(countrypops_mini).cols_hide(columns="year") ``` Details ------- The hiding of columns is internally a rendering directive, so, all columns that are 'hidden' are still accessible and useful in any expression provided to a `rows` argument. Furthermore, the `cols_hide()` method (as with many of the methods available in **Great Tables**) can be placed anywhere in a chain of calls (acting as a promise to hide columns when the timing is right). However there's perhaps greater readability when placing this call closer to the end of such a chain. The `cols_hide()` method quietly changes the visible state of a column and doesn't yield warnings when changing the state of already-invisible columns. cols_unhide(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf' Unhide one or more columns. 
The `cols_unhide()` method allows us to unhide one or more columns so that they appear again in the final output table. This may be important in cases where the user obtains a `GT` instance with hidden columns and there is motivation to reveal one or more of those. Parameters ---------- columns The columns to unhide in the output display table. Can either be a single column name or a series of column names provided in a list. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- For this example, we'll use a portion of the `countrypops` dataset to create a simple table. We'll hide the `year` column using `cols_hide()` and then unhide it with `cols_unhide()`, so that the `year` column ends up visible in the table. ```{python} from great_tables import GT from great_tables.data import countrypops countrypops_mini = countrypops.loc[countrypops["country_name"] == "Benin"][ ["country_name", "year", "population"] ].tail(5) GT(countrypops_mini).cols_hide(columns="year").cols_unhide(columns="year") ``` cols_merge(self: 'GTSelf', columns: 'SelectExpr', hide_columns: 'SelectExpr | Literal[False]' = None, rows: 'int | list[int] | None' = None, pattern: 'str | None' = None) -> 'GTSelf' Merge data from two or more columns into a single column. This method takes input from two or more columns and allows the contents to be merged into a single column by using a pattern that specifies the arrangement. The first column in the `columns=` parameter operates as the target column (i.e., the column that will undergo mutation) whereas all following columns will be untouched. There is the option to hide the non-target columns. The formatting of values in different columns will be preserved upon merging. Parameters ---------- columns The columns for which the merging operations should be applied.
The first column name resolved will be the target column (i.e., undergo mutation) and the other columns will serve to provide input. Can be a list of column names or a selection expression, though a list is preferred here to ensure the order of columns is exactly as intended (since order matters for the `pattern=` parameter). hide_columns Any column names provided here will have their state changed to hidden (via internal use of `.cols_hide()`) if they aren't already hidden. This is convenient if the shared purpose of these specified columns is only to provide string input to the target column. To suppress any hiding of columns, `False` can be used here. By default, all columns other than the first one specified in `columns=` will be hidden. rows In conjunction with `columns=`, we can specify which of their rows should participate in the merging process. The default is all rows, resulting in all rows in `columns=` being merged. Alternatively, we can supply a list of row indices. pattern A formatting pattern that specifies the arrangement of the column values and any string literals. The pattern uses numbers (within `{}`) that correspond to the indices of columns provided in `columns=`. If two columns are provided in `columns=` and we would like to combine the cell data onto the first column, `"{0} {1}"` could be used. If a pattern isn't provided then a space-separated pattern that includes all columns will be generated automatically. The pattern can also use `<<`/`>>` to surround spans of text that will be removed if any of the contained `{}` yields a missing value. Further details are provided in the *How the pattern works* section. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.
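To make the `pattern=` rules described above more concrete, here is a small plain-Python sketch of how a pattern with `{n}` placeholders and `<<`/`>>` spans could be resolved for a single row. It is an illustration only: `resolve_pattern()` is a hypothetical helper, it is not how **Great Tables** implements `cols_merge()`, and it assumes that cell values never contain `{`, `<<`, or `>>` themselves.

```python
import re

# An innermost `<<...>>` span: one that contains no further `<<` or `>>`.
_INNERMOST_SPAN = re.compile(r"<<((?:(?!<<|>>).)*)>>", re.DOTALL)
# A `{n}` placeholder, where n indexes into the row's values.
_PLACEHOLDER = re.compile(r"\{(\d+)\}")


def resolve_pattern(pattern, values):
    """Resolve `pattern` for one row; `values[i]` is the cell for column i,
    with `None` standing in for a missing value (illustrative sketch)."""

    def fill(text):
        # Substitute every remaining `{n}` with the string form of its value.
        return _PLACEHOLDER.sub(lambda m: str(values[int(m.group(1))]), text)

    def resolve_span(match):
        inner = match.group(1)
        used = [int(i) for i in _PLACEHOLDER.findall(inner)]
        # Drop the whole span if any placeholder inside it is missing.
        if any(values[i] is None for i in used):
            return ""
        return fill(inner)

    # Work from the innermost spans outward until no spans remain.
    while True:
        pattern, n_replaced = _INNERMOST_SPAN.subn(resolve_span, pattern)
        if n_replaced == 0:
            break
    return fill(pattern)


print(resolve_pattern("{0} ({1}-{2})", [38.2, 3, 8]))
print(resolve_pattern("{0}<< ({1}-{2})>>", [38.2, 3, None]))
print(resolve_pattern("{0}<< ({1}-<<{2}>>)>>", [38.2, 3, None]))
```

Resolving innermost spans first is one way to get nesting to behave sensibly: an inner span can disappear on its own, while a missing value directly inside an outer span removes that outer span together with everything nested in it.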
Details ------- ### How the pattern works There are two types of templating for the `pattern` string: - `{` `}` for arranging single column values in a row-wise fashion - `<<` `>>` to surround spans of text that will be removed if any of the contained `{` `}` yields a missing value Integer values are placed in `{}` and those values correspond to the columns involved in the merge, in the order they are provided in the `columns=` argument. So the pattern `"{0} ({1}-{2})"` corresponds to the target column value listed first in `columns` and the second and third columns cited (formatted as a range in parentheses). With hypothetical values, this might result in the merged string `"38.2 (3-8)"`. Because some values involved in merging may be missing, it is likely that something like `"38.2 (3-None)"` would be undesirable. For such cases, placing sections of text in `<<>>` results in the entire span being eliminated if there were to be a `None` value (arising from `{}` values). We could instead opt for a pattern like `"{0}<< ({1}-{2})>>"`, which results in `"38.2"` if either column `{1}` or `{2}` has a `None` value. We can even use a more complex nesting pattern like `"{0}<< ({1}-<<{2}>>)>>"` to retain a lower limit in parentheses (where `{2}` is `None`) but remove the range altogether if `{1}` is `None`. One more thing to note here is that if `.sub_missing()` is used on values in a column, those specific values affected won't be considered truly missing by `.cols_merge()` (since they have been explicitly handled with substitute text). Examples -------- Let's use a subset of the `sp500` dataset to create a table. We'll merge the `open` & `close` columns together, and the `low` & `high` columns (putting an em dash between both).
```{python} from great_tables import GT from great_tables.data import sp500 import polars as pl sp500_mini = ( pl.from_pandas(sp500) .slice(49, 6) .select("open", "close", "low", "high") ) ( GT(sp500_mini) .fmt_number( columns=["open", "close", "low", "high"], decimals=2, use_seps=False ) .cols_merge(columns=["open", "close"], pattern="{0}—{1}") .cols_merge(columns=["low", "high"], pattern="{0}—{1}") .cols_label(open="open/close", low="low/high") ) ``` Now we'll use a portion of the `gtcars` dataset for the next example that accounts for missing values in the `pattern=` parameter. Use the `.cols_merge()` method twice to merge together the: (1) `trq` and `trq_rpm` columns, and (2) `mpg_c` & `mpg_h` columns. Given the presence of missing values, we can use patterns with `<<`/`>>` to create conditional text spans, avoiding results where any of the merged columns have missing values. ```{python} from great_tables.data import gtcars import polars.selectors as cs gtcars_pl = ( pl.from_pandas(gtcars) .filter(pl.col("year") == 2017) .select(["mfr", "model", "trq", "trq_rpm", "mpg_c", "mpg_h"]) ) ( GT(gtcars_pl) .fmt_integer(columns=[cs.starts_with("trq"), cs.starts_with("mpg")]) .cols_merge(columns=["trq", "trq_rpm"], pattern="{0}<< ({1} rpm)>>") .cols_merge(columns=["mpg_c", "mpg_h"], pattern="<<{0} city<<, {1} hwy>>>>") .cols_label(mfr="Manufacturer", model="Car Model", trq="Torque", mpg_c="MPG") ) ``` cols_merge_uncert(self: 'GTSelf', col_val: 'SelectExpr', col_uncert: 'SelectExpr', rows: 'int | list[int] | None' = None, sep: 'str' = ' +/- ', autohide: 'bool' = True) -> 'GTSelf' Merge columns to a value-with-uncertainty column. `cols_merge_uncert()` is a specialized variant of `cols_merge()`. It takes as input a base value column (`col_val`) and either: (1) a single uncertainty column, or (2) two columns representing lower and upper uncertainty bounds. These columns will be essentially merged into a single column (that of `col_val`).
What results is a column with values and associated uncertainties, and any columns specified in `col_uncert` are hidden from appearing in the output table. Parameters ---------- col_val The column that contains values for the base measurement. While column selection expressions can be used, it's recommended that a single column name be used to ensure that exactly one column is provided here. col_uncert The column or columns that contain uncertainty values. The most common case involves supplying a single column with uncertainties; these values will be combined with those in `col_val`. Less commonly, the lower and upper uncertainty bounds may be different. For that case, two columns representing the lower and upper uncertainty values away from `col_val`, respectively, should be provided as a list. rows In conjunction with `col_val`, we can specify which rows should participate in the merging process. The default is all rows. Alternatively, we can supply a list of row indices. sep The separator text that contains the uncertainty mark for a single uncertainty value. The default value of `" +/- "` indicates that an appropriate plus/minus mark will be used depending on the output context. The plus/minus symbol (±) is used in HTML output. autohide An option to automatically hide any columns specified in `col_uncert`. Any columns with their state changed to hidden will behave the same as before, they just won't be displayed in the finalized table. Defaults to `True`. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Details ------- ### Specialized NA handling This function employs specialized semantics for missing value handling that differ from the generic `cols_merge()`: 1. Missing values in `col_val` result in missing values for the merged column (e.g., `NA` + `0.1` = `NA`) 2. 
Missing values in `col_uncert` (but not `col_val`) result in base values only for the merged column (e.g., `12.0` + `NA` = `12.0`) 3. Missing values in both `col_val` and `col_uncert` result in missing values for the merged column (e.g., `NA` + `NA` = `NA`) Examples -------- Use the `exibble` dataset to create a simple, two-column table. Merge the `currency` and `num` columns together as a value with uncertainty. ```{python} from great_tables import GT from great_tables.data import exibble import polars as pl exibble_mini = ( pl.from_pandas(exibble) .select("num", "currency") .slice(0, 7) ) ( GT(exibble_mini) .fmt_number(columns="num", decimals=3, use_seps=False) .cols_merge_uncert(col_val="currency", col_uncert="num") .cols_label(currency="value + uncert.") ) ``` When there are missing values in the uncertainty column, the merged result shows only the base value. When the base value itself is missing, the entire merged cell is empty. ```{python} df = pl.DataFrame({ "measurement": [12.5, 8.3, 15.0, 9.7], "error": [0.2, None, 0.5, None], }) ( GT(df) .fmt_number(columns="error", decimals=2) .cols_merge_uncert(col_val="measurement", col_uncert="error") .cols_label(measurement="Measurement") ) ``` cols_merge_range(self: 'GTSelf', col_begin: 'SelectExpr', col_end: 'SelectExpr', rows: 'int | list[int] | None' = None, sep: 'str | None' = None, autohide: 'bool' = True, locale: 'str | None' = None) -> 'GTSelf' Merge two columns to a value range column. `cols_merge_range()` is a specialized variant of `cols_merge()`. It operates by taking two columns that constitute a range of values (`col_begin` and `col_end`) and merges them into a single column. What results is a column containing both values separated by an en dash (or a custom separator). The column specified in `col_end` is dropped from the output table. Parameters ---------- col_begin The column that contains values for the start of the range. 
While column selection expressions can be used, it's recommended that a single column name be used to ensure that exactly one column is provided here. col_end The column that contains values for the end of the range. While column selection expressions can be used, it's recommended that a single column name be used to ensure that exactly one column is provided here. rows In conjunction with `col_begin`, we can specify which rows should participate in the merging process. The default is all rows. Alternatively, we can supply a list of row indices. sep The separator text that indicates the values are ranged. If not provided, an en dash (`"–"`) will be used. You can use `"--"` for an en dash or `"---"` for an em dash. autohide An option to automatically hide the column specified as `col_end`. Any columns with their state changed to hidden will behave the same as before, they just won't be displayed in the finalized table. Defaults to `True`. locale An optional locale identifier that can be used for applying a separator pattern specific to a locale's rules. Currently reserved for future use. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Details ------- ### Specialized NA handling This function employs specialized semantics for missing value handling that differ from the generic `cols_merge()`: 1. Missing values in `col_begin` (but not `col_end`) result in a display of only the `col_end` value 2. Missing values in `col_end` (but not `col_begin`) result in a display of only the `col_begin` value 3. Missing values in both `col_begin` and `col_end` result in missing values for the merged column Examples -------- Use a subset of the `gtcars` dataset to create a table. Merge the `mpg_c` and `mpg_h` columns together as a range. 
```{python} from great_tables import GT from great_tables.data import gtcars import polars as pl gtcars_mini = ( pl.from_pandas(gtcars) .select("model", "mpg_c", "mpg_h") .slice(0, 8) ) ( GT(gtcars_mini) .cols_merge_range(col_begin="mpg_c", col_end="mpg_h") .cols_label(mpg_c="MPG") ) ``` When there are missing values, the merged result gracefully degrades: if only one side is missing, the other value is shown alone (without a separator). A custom separator can be provided via the `sep=` argument. ```{python} df = pl.DataFrame({ "city": ["NYC", "LA", "CHI", "HOU"], "temp_low": [28, 55, None, 45], "temp_high": [35, None, 50, 60], }) ( GT(df) .cols_merge_range(col_begin="temp_low", col_end="temp_high", sep=" to ") .cols_label(temp_low="Temp. Range (°F)") ) ``` cols_merge_n_pct(self: 'GTSelf', col_n: 'SelectExpr', col_pct: 'SelectExpr', rows: 'int | list[int] | None' = None, autohide: 'bool' = True) -> 'GTSelf' Merge two columns to combine counts and percentages. `cols_merge_n_pct()` is a specialized variant of `cols_merge()`. It operates by taking two columns that constitute both a count (`col_n`) and a fraction of the total population (`col_pct`) and merges them into a single column. What results is a column containing both counts and their associated percentages (e.g., `12 (23.2%)`). The column specified in `col_pct` is dropped from the output table. Parameters ---------- col_n The column that contains values for the count component. While column selection expressions can be used, it's recommended that a single column name be used to ensure that exactly one column is provided here. col_pct The column that contains values for the percentage component. While column selection expressions can be used, it's recommended that a single column name be used to ensure that exactly one column is provided here. This column should be formatted such that percentages are displayed (e.g., with `fmt_percent()`). 
rows In conjunction with `col_n`, we can specify which rows should participate in the merging process. The default is all rows. Alternatively, we can supply a list of row indices. autohide An option to automatically hide the column specified as `col_pct`. Any columns with their state changed to hidden will behave the same as before, they just won't be displayed in the finalized table. Defaults to `True`. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Details ------- ### Specialized NA and zero-value handling This function employs specialized semantics for missing value and zero-value handling: 1. Missing values in `col_n` result in missing values for the merged column (e.g., `NA` + `10.2%` = `NA`) 2. Missing values in `col_pct` (but not `col_n`) result in base values only for the merged column (e.g., `13` + `NA` = `13`) 3. Missing values in both `col_n` and `col_pct` result in missing values for the merged column (e.g., `NA` + `NA` = `NA`) 4. If a zero (`0`) value is in `col_n` then the formatted output will be `"0"` (i.e., no percentage will be shown) It is the responsibility of the user to ensure that values are correct in both the `col_n` and `col_pct` columns (this function neither generates nor recalculates values in either). Formatting of each column can be done independently in separate `fmt_number()` and `fmt_percent()` calls. Examples -------- Create a simple table with counts and percentages, then merge them. ```{python} from great_tables import GT import polars as pl df = pl.DataFrame({ "category": ["A", "B", "C"], "n": [10, 20, 30], "pct": [0.167, 0.333, 0.500], }) ( GT(df) .fmt_percent(columns="pct") .cols_merge_n_pct(col_n="n", col_pct="pct") .cols_label(n="Count (%)") ) ``` Zero values in the count column suppress the percentage display. Missing values in the percentage column result in just the count being shown, and missing counts produce empty cells. 
```{python} df = pl.DataFrame({ "item": ["Alpha", "Beta", "Gamma", "Delta"], "count": [15, 0, 8, None], "frac": [0.375, 0.0, None, 0.125], }) ( GT(df) .fmt_percent(columns="frac", decimals=1) .cols_merge_n_pct(col_n="count", col_pct="frac") .cols_label(count="N (%)") ) ``` ## Adding rows The [`summary_rows()`](`great_tables.GT.summary_rows`) function adds rows to summarize data within each row group, while [`grand_summary_rows()`](`great_tables.GT.grand_summary_rows`) summarizes across the entire table. summary_rows(self: 'GTSelf', *, fns: 'dict[str, PlExpr] | dict[str, Callable[[TblData], Any]]', fmt: 'FormatFn | None' = None, columns: 'SelectExpr' = None, groups: 'list[str] | None' = None, side: "Literal['bottom', 'top']" = 'bottom', missing_text: 'str' = '---') -> 'GTSelf' Add group-wise summary rows to the table. Add summary rows by using the table data and any suitable aggregation functions. With `summary_rows()`, the data within each row group is aggregated separately and summary rows are placed adjacent to each group. Multiple summary rows can be added via expressions given to `fns=`. You can selectively format the values in the resulting summary cells by use of formatting expressions from the `vals.fmt_*` class of functions. Note that currently all arguments are keyword-only, since the final positions may change. Parameters ---------- fns A dictionary mapping row labels to aggregation expressions. Can be either Polars expressions or callable functions that take a DataFrame subset and return aggregated results. Each key becomes the label for a summary row within each group. fmt A formatting function from the `vals.fmt_*` family (e.g., `vals.fmt_number`, `vals.fmt_currency`) to apply to the summary row values. If `None`, no formatting is applied. columns Currently, this function does not support selection by columns. If you would like to choose which columns to summarize, you can select columns within the functions given to `fns=`. 
See examples below for more explicit cases. groups The groups to target for summary row insertion. Can be a list of group IDs as strings. By default (`None`), summary rows are generated for all groups. side Should the summary rows be placed at the `"bottom"` (the default) or the `"top"` of each group? missing_text The text to be used in summary cells with no data outputs. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let's use a subset of the `gtcars` dataset to create a table with group summary rows. We'll group by manufacturer and show min and max values for horsepower and torque columns. ```{python} import polars as pl from great_tables import GT, vals from great_tables.data import gtcars gtcars_mini = ( pl.from_pandas(gtcars) .select(["mfr", "model", "hp", "trq"]) .head(12) ) ( GT(gtcars_mini, rowname_col="model", groupname_col="mfr") .summary_rows( fns={ "Min": pl.col("hp", "trq").min(), "Max": pl.col("hp", "trq").max(), }, fmt=vals.fmt_integer, ) ) ``` We can also target specific groups by using the `groups=` parameter. Here we only show summary rows for the `"Ferrari"` group: ```{python} ( GT(gtcars_mini, rowname_col="model", groupname_col="mfr") .summary_rows( fns={ "Average": pl.col("hp", "trq").mean(), }, groups=["Ferrari"], fmt=vals.fmt_number, ) ) ``` Callable functions work with pandas DataFrames. 
Each function receives the subset of data for that group: ```{python} from great_tables import GT, vals from great_tables.data import gtcars ( GT( gtcars[["mfr", "model", "hp", "trq"]].head(12), rowname_col="model", groupname_col="mfr", ) .summary_rows( fns={ "Min": lambda df: df.min(numeric_only=True), "Max": lambda df: df.max(numeric_only=True), }, fmt=vals.fmt_integer, ) ) ``` Summary rows can be placed at the top of each group using `side="top"`: ```{python} import polars as pl from great_tables import GT, vals from great_tables.data import gtcars gtcars_mini = ( pl.from_pandas(gtcars) .select(["mfr", "model", "hp", "trq"]) .head(12) ) ( GT(gtcars_mini, rowname_col="model", groupname_col="mfr") .summary_rows( fns={"Mean": pl.col("hp", "trq").mean()}, side="top", fmt=vals.fmt_number, ) ) ``` Combining group summaries with grand summary rows and styling provides a comprehensive summary view of the data. Use `loc.summary()` to style all group summary cells: ```{python} import polars as pl from great_tables import GT, vals, style, loc from great_tables.data import gtcars gtcars_mini = ( pl.from_pandas(gtcars) .select(["mfr", "model", "hp", "trq"]) .head(12) ) ( GT(gtcars_mini, rowname_col="model", groupname_col="mfr") .summary_rows( fns={ "Min": pl.col("hp", "trq").min(), "Max": pl.col("hp", "trq").max(), }, fmt=vals.fmt_integer, ) .grand_summary_rows( fns={"Overall Mean": pl.col("hp", "trq").mean()}, fmt=vals.fmt_number, ) .tab_style( style=[style.fill(color="lightyellow")], locations=loc.summary(), ) .tab_style( style=[style.fill(color="lightblue")], locations=loc.grand_summary(), ) ) ``` When groups are displayed as a column in the stub (using `row_group_as_column=True`), the summary row labels span the stub columns: ```{python} import polars as pl from great_tables import GT, vals from great_tables.data import gtcars gtcars_mini = ( pl.from_pandas(gtcars) .select(["mfr", "model", "hp", "trq"]) .head(12) ) ( GT(gtcars_mini, rowname_col="model", 
groupname_col="mfr") .tab_options(row_group_as_column=True) .summary_rows( fns={ "Min": pl.col("hp", "trq").min(), "Max": pl.col("hp", "trq").max(), }, fmt=vals.fmt_integer, ) ) ``` grand_summary_rows(self: 'GTSelf', *, fns: 'dict[str, PlExpr] | dict[str, Callable[[TblData], Any]]', fmt: 'FormatFn | None' = None, columns: 'SelectExpr' = None, side: "Literal['bottom', 'top']" = 'bottom', missing_text: 'str' = '---') -> 'GTSelf' Add grand summary rows to the table. Add grand summary rows by using the table data and any suitable aggregation functions. With grand summary rows, all of the available data in the gt table is incorporated (regardless of whether some of the data are part of row groups). Multiple grand summary rows can be added via expressions given to fns. You can selectively format the values in the resulting grand summary cells by use of formatting expressions from the `vals.fmt_*` class of functions. Note that currently all arguments are keyword-only, since the final positions may change. Parameters ---------- fns A dictionary mapping row labels to aggregation expressions. Can be either Polars expressions or callable functions that take the entire DataFrame and return aggregated results. Each key becomes the label for a grand summary row. fmt A formatting function from the `vals.fmt_*` family (e.g., `vals.fmt_number`, `vals.fmt_currency`) to apply to the summary row values. If `None`, no formatting is applied. columns Currently, this function does not support selection by columns. If you would like to choose which columns to summarize, you can select columns within the functions given to `fns=`. See examples below for more explicit cases. side Should the grand summary rows be placed at the `"bottom"` (the default) or the `"top"` of the table? missing_text The text to be used in summary cells with no data outputs. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. 
Examples -------- Let's use a subset of the `sp500` dataset to create a table with grand summary rows. We'll calculate min, max, and mean values for the numeric columns. Notice the different approaches to selecting columns to apply the aggregations to: we can use polars selectors or select the columns directly. ```{python} import polars as pl import polars.selectors as cs from great_tables import GT, vals, style, loc from great_tables.data import sp500 sp500_mini = ( pl.from_pandas(sp500) .slice(0, 7) .drop(["volume", "adj_close"]) ) ( GT(sp500_mini, rowname_col="date") .grand_summary_rows( fns={ "Minimum": pl.min("open", "high", "low", "close"), "Maximum": pl.col("open", "high", "low", "close").max(), "Average": cs.numeric().mean(), }, fmt=vals.fmt_currency, ) .tab_style( style=[ style.text(color="crimson"), style.fill(color="lightgray"), ], locations=loc.grand_summary(), ) ) ``` We can also use custom callable functions to create more complex summary calculations. Notice here that grand summary rows can be placed at the top of the table and formatted with currency notation, by passing a formatter from the `vals.fmt_*` class of functions. ```{python} from great_tables import GT, style, loc, vals from great_tables.data import gtcars def pd_median(df): return df.median(numeric_only=True) ( GT( gtcars[["mfr", "model", "hp", "trq", "mpg_c"]].head(6), rowname_col="model", ) .fmt_integer(columns=["hp", "trq", "mpg_c"]) .grand_summary_rows( fns={ "Min": lambda df: df.min(numeric_only=True), "Max": lambda df: df.max(numeric_only=True), "Median": pd_median, }, side="top", fmt=vals.fmt_integer, ) .tab_style( style=[style.text(color="crimson", weight="bold"), style.fill(color="lightgray")], locations=loc.grand_summary_stub(), ) ) ``` ## Location Targeting and Styling Classes Location targeting is a powerful feature of Great Tables. It allows for the precise selection of table locations for styling (using the `tab_style()` method). 
The styling classes allow for the specification of the styling properties to be applied to the targeted locations. LocHeader() -> None Target the table header (title and subtitle). With `loc.header()`, we can target the table header, which contains the title and the subtitle. This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Returns ------- LocHeader A LocHeader object, which is used for a `locations=` argument if specifying the header of the table. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We will style the entire table header (the 'title' and 'subtitle' parts). This can be done by using `locations=loc.header()` within [`tab_style()`](`great_tables.GT.tab_style`). ```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT(gtcars[["mfr", "model", "msrp"]].head(5)) .tab_header( title="Select Cars from the gtcars Dataset", subtitle="Only the first five cars are displayed" ) .tab_style( style=style.fill(color="lightblue"), locations=loc.header() ) .fmt_currency(columns="msrp", decimals=0) ) ``` LocTitle() -> None Target the table title. With `loc.title()`, we can target the part of the table containing the title (within the table header). This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Returns ------- LocTitle A LocTitle object, which is used for a `locations=` argument if specifying the title of the table. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We will style only the 'title' part of the table header (leaving the 'subtitle' part unaffected). This can be done by using `locations=loc.title()` within [`tab_style()`](`great_tables.GT.tab_style`).
```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT(gtcars[["mfr", "model", "msrp"]].head(5)) .tab_header( title="Select Cars from the gtcars Dataset", subtitle="Only the first five cars are displayed" ) .tab_style( style=style.text(color="blue", size="large", weight="bold"), locations=loc.title() ) .fmt_currency(columns="msrp", decimals=0) ) ``` LocSubTitle() -> None Target the table subtitle. With `loc.subtitle()`, we can target the part of the table containing the subtitle (within the table header). This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Returns ------- LocSubTitle A LocSubTitle object, which is used for a `locations=` argument if specifying the subtitle of the table. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We will style only the 'subtitle' part of the table header (leaving the 'title' part unaffected). This can be done by using `locations=loc.subtitle()` within [`tab_style()`](`great_tables.GT.tab_style`). ```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT(gtcars[["mfr", "model", "msrp"]].head(5)) .tab_header( title="Select Cars from the gtcars Dataset", subtitle="Only the first five cars are displayed" ) .tab_style( style=style.fill(color="lightblue"), locations=loc.subtitle() ) .fmt_currency(columns="msrp", decimals=0) ) ``` LocStubhead() -> None Target the stubhead. With `loc.stubhead()`, we can target the part of the table that resides both at the top of the stub and also beside the column header. This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting.
Returns ------- LocStubhead A LocStubhead object, which is used for a `locations=` argument if specifying the stubhead of the table. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. This table contains a stub (produced by setting `rowname_col="model"` in the initial `GT()` call). The stubhead is given a label by way of the [`tab_stubhead()`](`great_tables.GT.tab_stubhead`) method and this label can be styled by using `locations=loc.stubhead()` within [`tab_style()`](`great_tables.GT.tab_style`). ```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT( gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5), rowname_col="model", groupname_col="mfr" ) .tab_stubhead(label="car") .tab_style( style=style.text(color="red", weight="bold"), locations=loc.stubhead() ) .fmt_integer(columns=["hp", "trq"]) .fmt_currency(columns="msrp", decimals=0) ) ``` LocColumnHeader() -> None Target column spanners and column labels. With `loc.column_header()`, we can target the column header which contains all of the column labels and any spanner labels that are present. This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Returns ------- LocColumnHeader A LocColumnHeader object, which is used for a `locations=` argument if specifying the column header of the table. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We create spanner labels through use of the [`tab_spanner()`](`great_tables.GT.tab_spanner`) method; this gives us a column header with a mix of column labels and spanner labels. We will style the entire column header at once by using `locations=loc.column_header()` within [`tab_style()`](`great_tables.GT.tab_style`). 
```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT(gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5)) .tab_spanner( label="performance", columns=["hp", "trq"] ) .tab_spanner( label="make and model", columns=["mfr", "model"] ) .tab_style( style=[ style.text(color="white", weight="bold"), style.fill(color="steelblue") ], locations=loc.column_header() ) .fmt_integer(columns=["hp", "trq"]) .fmt_currency(columns="msrp", decimals=0) ) ``` LocSpannerLabels(ids: 'SelectExpr' = None) -> None Target spanner labels. With `loc.spanner_labels()`, we can target the cells containing the spanner labels. This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Parameters ---------- ids The ID values for the spanner labels to target. A list of one or more ID values is required. Returns ------- LocSpannerLabels A LocSpannerLabels object, which is used for a `locations=` argument if specifying the table's spanner labels. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We create two spanner labels through two separate calls of the [`tab_spanner()`](`great_tables.GT.tab_spanner`) method. In each of those, the text supplied to the `label=` argument is used as the ID value (though an ID can also be set explicitly via the `id=` argument). We will style only the spanner label having the text `"performance"` by using `locations=loc.spanner_labels(ids=["performance"])` within [`tab_style()`](`great_tables.GT.tab_style`).
```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT(gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5)) .tab_spanner( label="performance", columns=["hp", "trq"] ) .tab_spanner( label="make and model", columns=["mfr", "model"] ) .tab_style( style=style.text(color="blue", weight="bold"), locations=loc.spanner_labels(ids=["performance"]) ) .fmt_integer(columns=["hp", "trq"]) .fmt_currency(columns="msrp", decimals=0) ) ``` LocColumnLabels(columns: 'SelectExpr' = None) -> None Target column labels. With `loc.column_labels()`, we can target the cells containing the column labels. This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. If no columns are specified, all columns are targeted. Returns ------- LocColumnLabels A LocColumnLabels object, which is used for a `locations=` argument if specifying the table's column labels. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We will style all three of the column labels by using `locations=loc.column_labels()` within [`tab_style()`](`great_tables.GT.tab_style`). Note that no specification of `columns=` is needed here because we want to target all columns. ```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT(gtcars[["mfr", "model", "msrp"]].head(5)) .tab_style( style=style.text(color="blue", size="large", weight="bold"), locations=loc.column_labels() ) ) ``` LocGrandSummaryStub(rows: 'RowSelectExpr' = None) -> None Target the grand summary stub. With `loc.grand_summary_stub()` we can target the cells containing the grand summary row labels, which reside in the table stub. 
This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Parameters ---------- rows The rows to target within the grand summary stub. Can either be a single row name or a series of row names provided in a list. If no rows are specified, all grand summary rows are targeted. Note that if rows are targeted by index, top and bottom grand summary rows are indexed as one combined list starting with the top rows. Returns ------- LocGrandSummaryStub A LocGrandSummaryStub object, which is used for a `locations=` argument if specifying the table's grand summary rows' labels. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We will style the entire table grand summary stub (the row labels) by using `locations=loc.grand_summary_stub()` within [`tab_style()`](`great_tables.GT.tab_style`). ```{python} from great_tables import GT, style, loc, vals from great_tables.data import gtcars ( GT( gtcars[["mfr", "model", "hp", "trq", "mpg_c"]].head(6), rowname_col="model", ) .fmt_integer(columns=["hp", "trq", "mpg_c"]) .grand_summary_rows( fns={ "Min": lambda df: df.min(numeric_only=True), "Max": lambda x: x.max(numeric_only=True), }, side="top", fmt=vals.fmt_integer, ) .tab_style( style=[style.text(color="crimson", weight="bold"), style.fill(color="lightgray")], locations=loc.grand_summary_stub(), ) ) ``` LocStub(rows: 'RowSelectExpr' = None) -> None Target the table stub. With `loc.stub()` we can target the cells containing the row labels, which reside in the table stub. This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Parameters ---------- rows The rows to target within the stub. Can either be a single row name or a series of row names provided in a list. 
If no rows are specified, all rows are targeted. Returns ------- LocStub A LocStub object, which is used for a `locations=` argument if specifying the table's stub. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We will style the entire table stub (the row labels) by using `locations=loc.stub()` within [`tab_style()`](`great_tables.GT.tab_style`). ```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT( gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5), rowname_col="model", groupname_col="mfr" ) .tab_stubhead(label="car") .tab_style( style=[ style.text(color="crimson", weight="bold"), style.fill(color="lightgray") ], locations=loc.stub() ) .fmt_integer(columns=["hp", "trq"]) .fmt_currency(columns="msrp", decimals=0) ) ``` LocRowGroups(rows: 'RowSelectExpr' = None) -> None Target row groups. With `loc.row_groups()` we can target the cells containing the row group labels, which span across the table body. This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Parameters ---------- rows The row groups to target. Can either be a single group name or a series of group names provided in a list. If no groups are specified, all are targeted. Returns ------- LocRowGroups A LocRowGroups object, which is used for a `locations=` argument if specifying the table's row groups. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We will style all of the cells comprising the row group labels by using `locations=loc.row_groups()` within [`tab_style()`](`great_tables.GT.tab_style`). 
```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT( gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5), rowname_col="model", groupname_col="mfr" ) .tab_stubhead(label="car") .tab_style( style=[ style.text(color="crimson", weight="bold"), style.fill(color="lightgray") ], locations=loc.row_groups() ) .fmt_integer(columns=["hp", "trq"]) .fmt_currency(columns="msrp", decimals=0) ) ``` LocGrandSummary(columns: 'SelectExpr' = None, rows: 'RowSelectExpr' = None, mask: 'PlExpr | None' = None) -> None Target the data cells in grand summary rows. With `loc.grand_summary()` we can target the cells containing the grand summary data. This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows The rows to target. Can either be a single row name or a series of row names provided in a list. Note that if rows are targeted by index, top and bottom grand summary rows are indexed as one combined list starting with the top rows. Returns ------- LocGrandSummary A LocGrandSummary object, which is used for a `locations=` argument if specifying the table's grand summary rows. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We will style all of the grand summary cells by using `locations=loc.grand_summary()` within [`tab_style()`](`great_tables.GT.tab_style`). 
```{python} from great_tables import GT, style, loc, vals from great_tables.data import gtcars ( GT( gtcars[["mfr", "model", "hp", "trq", "mpg_c"]].head(6), rowname_col="model", ) .fmt_integer(columns=["hp", "trq", "mpg_c"]) .grand_summary_rows( fns={ "Min": lambda df: df.min(numeric_only=True), "Max": lambda x: x.max(numeric_only=True), }, side="top", fmt=vals.fmt_integer, ) .tab_style( style=[style.text(color="crimson", weight="bold"), style.fill(color="lightgray")], locations=loc.grand_summary(), ) ) ``` LocBody(columns: 'SelectExpr' = None, rows: 'RowSelectExpr' = None, mask: 'PlExpr | None' = None) -> None Target data cells in the table body. With `loc.body()`, we can target the data cells in the table body. This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. :::{.callout-warning} `mask=` is still experimental. ::: Parameters ---------- columns The columns to target. Can either be a single column name or a series of column names provided in a list. rows The rows to target. Can either be a single row name or a series of row names provided in a list. mask The cells to target. If the underlying wrapped DataFrame is a Polars DataFrame, you can pass a Polars expression for cell-based selection. This argument must be used exclusively and cannot be combined with the `columns=` or `rows=` arguments. Returns ------- LocBody A LocBody object, which is used for a `locations=` argument if specifying the table body. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. We will style all of the body cells by using `locations=loc.body()` within [`tab_style()`](`great_tables.GT.tab_style`). 
```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT( gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5), rowname_col="model", groupname_col="mfr" ) .tab_stubhead(label="car") .tab_style( style=[ style.text(color="darkblue", weight="bold"), style.fill(color="gainsboro") ], locations=loc.body() ) .fmt_integer(columns=["hp", "trq"]) .fmt_currency(columns="msrp", decimals=0) ) ``` LocFooter() -> None Target the table footer. With `loc.footer()` we can target the table's footer, which currently contains the source notes (and may contain a 'footnotes' location in the future). This is useful when applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting. The 'footer' location is generated by [`tab_source_note()`](`great_tables.GT.tab_source_note`). Returns ------- LocFooter A `LocFooter` object, which is used for a `locations=` argument if specifying the footer of the table. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. Add a source note (with [`tab_source_note()`](`great_tables.GT.tab_source_note`)) and style this footer section inside of [`tab_style()`](`great_tables.GT.tab_style`) with `locations=loc.footer()`. ```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT(gtcars[["mfr", "model", "msrp"]].head(5)) .tab_source_note(source_note="From edmunds.com") .tab_style( style=style.text(color="blue", size="small", weight="bold"), locations=loc.footer() ) ) ``` LocSourceNotes() -> None Target the source notes. With `loc.source_notes()`, we can target the source notes in the table. This is useful when applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and this class should be used there to perform the targeting.
The 'source_notes' location is generated by [`tab_source_note()`](`great_tables.GT.tab_source_note`). Returns ------- LocSourceNotes A `LocSourceNotes` object, which is used for a `locations=` argument if specifying the source notes. Examples -------- Let's use a subset of the `gtcars` dataset in a new table. Add a source note (with [`tab_source_note()`](`great_tables.GT.tab_source_note`)) and style the source notes section inside [`tab_style()`](`great_tables.GT.tab_style`) with `locations=loc.source_notes()`. ```{python} from great_tables import GT, style, loc from great_tables.data import gtcars ( GT(gtcars[["mfr", "model", "msrp"]].head(5)) .tab_source_note(source_note="From edmunds.com") .tab_style( style=style.text(color="blue", size="small", weight="bold"), locations=loc.source_notes() ) ) ``` CellStyleFill(color: 'str | ColumnExpr') -> None A style specification for the background fill of targeted cells. The `style.fill()` class is to be used with the `tab_style()` method, which itself allows for the setting of custom styles to one or more cells. Specifically, the call to `style.fill()` should be bound to the `style=` argument of `tab_style()`. Parameters ---------- color The color to use for the cell background fill. This can be any valid CSS color value, such as a hex code, a named color, or an RGB value. Returns ------- CellStyleFill A CellStyleFill object, which is used for the `style=` argument if specifying a cell fill value. Examples ------ See [`GT.tab_style()`](`great_tables.GT.tab_style`).
CellStyleText(color: 'str | ColumnExpr | None' = None, font: 'str | ColumnExpr | GoogleFont | None' = None, size: 'str | ColumnExpr | None' = None, align: "Literal['center', 'left', 'right', 'justify'] | ColumnExpr | None" = None, v_align: "Literal['middle', 'top', 'bottom'] | ColumnExpr | None" = None, style: "Literal['normal', 'italic', 'oblique'] | ColumnExpr | None" = None, weight: "Literal['normal', 'bold', 'bolder', 'lighter'] | ColumnExpr | None" = None, stretch: "Literal['normal', 'condensed', 'ultra-condensed', 'extra-condensed', 'semi-condensed', 'semi-expanded', 'expanded', 'extra-expanded', 'ultra-expanded'] | ColumnExpr | None" = None, decorate: "Literal['overline', 'line-through', 'underline', 'underline overline'] | ColumnExpr | None" = None, transform: "Literal['uppercase', 'lowercase', 'capitalize'] | ColumnExpr | None" = None, whitespace: "Literal['normal', 'nowrap', 'pre', 'pre-wrap', 'pre-line', 'break-spaces'] | ColumnExpr | None" = None) -> None A style specification for cell text. The `style.text()` class is to be used with the `tab_style()` method, which itself allows for the setting of custom styles to one or more cells. With it, you can specify the color of the text, the font family, the font size, the horizontal and vertical alignment of the text, and more. Parameters ---------- color The text color can be modified through the `color` argument. font The font or collection of fonts (subsequent font names are used as fallbacks). size The size of the font. Can be provided as a number that is assumed to represent `px` values (or could be wrapped in the `px()` helper function). We can also use one of the following absolute size keywords: `"xx-small"`, `"x-small"`, `"small"`, `"medium"`, `"large"`, `"x-large"`, or `"xx-large"`. align The text in a cell can be horizontally aligned through one of the following options: `"center"`, `"left"`, `"right"`, or `"justify"`.
v_align The vertical alignment of the text in the cell can be modified through the options `"middle"`, `"top"`, or `"bottom"`. style Can be one of either `"normal"`, `"italic"`, or `"oblique"`. weight The weight of the font can be modified through a text-based option such as `"normal"`, `"bold"`, `"lighter"`, `"bolder"`, or a numeric value between `1` and `1000`, inclusive. Note that only variable fonts may support the numeric mapping of weight. stretch Allows for text to either be condensed or expanded. We can use one of the following text-based keywords to describe the degree of condensation/expansion: `"ultra-condensed"`, `"extra-condensed"`, `"condensed"`, `"semi-condensed"`, `"normal"`, `"semi-expanded"`, `"expanded"`, `"extra-expanded"`, or `"ultra-expanded"`. Alternatively, we can supply percentage values from `0%` to `200%`, inclusive. Negative percentage values are not allowed. decorate Allows for a text decoration effect to be applied. Here, we can use `"overline"`, `"line-through"`, `"underline"`, or `"underline overline"`. transform Allows for the transformation of text. Options are `"uppercase"`, `"lowercase"`, or `"capitalize"`. whitespace A white-space preservation option. By default, runs of white-space will be collapsed into single spaces but several options exist to govern how white-space is collapsed and how lines might wrap at soft-wrap opportunities. The options are `"normal"`, `"nowrap"`, `"pre"`, `"pre-wrap"`, `"pre-line"`, and `"break-spaces"`. Returns ------- CellStyleText A CellStyleText object, which is used for the `style=` argument if specifying any cell text properties. Examples ------ See [`GT.tab_style()`](`great_tables.GT.tab_style`). CellStyleBorders(sides: "Literal['all', 'top', 'bottom', 'left', 'right'] | list[Literal['all', 'top', 'bottom', 'left', 'right']]" = 'all', color: 'str | ColumnExpr' = '#000000', style: 'str | ColumnExpr' = 'solid', weight: 'str | ColumnExpr' = '1px') -> None A style specification for cell borders.
The `style.borders()` class is to be used with the `tab_style()` method, which itself allows for the setting of custom styles to one or more cells. The `sides` argument is where we define which borders should be modified (e.g., `"left"`, `"right"`, etc.). With that selection, the `color`, `style`, and `weight` of the selected borders can then be modified. Parameters ---------- sides The border sides to be modified. Options include `"left"`, `"right"`, `"top"`, and `"bottom"`. For all borders surrounding the selected cells, we can use the `"all"` option. color The border `color` can be defined with any valid CSS color value, such as a hex code, a named color, or an RGB value. The default `color` value is `"#000000"` (black). style The border `style` can be one of either `"solid"` (the default), `"dashed"`, `"dotted"`, `"hidden"`, or `"double"`. weight The default value for `weight` is `"1px"`; higher values make the border more visually prominent. Returns ------- CellStyleBorders A CellStyleBorders object, which is used for the `style=` argument if specifying cell borders. Examples ------ See [`GT.tab_style()`](`great_tables.GT.tab_style`). CellStyleCss(rule: 'str') -> None A style specification for custom CSS rules. The `style.css()` class is to be used with the `tab_style()` method, which itself allows for the setting of custom styles to one or more cells. With `style.css()`, you can specify any CSS rule that you would like to apply to the targeted cells. Parameters ---------- rule The CSS rule to apply to the targeted cells. This can be any valid CSS rule, such as `background-color: red;` or `font-size: 14px;`. Returns ------- CellStyleCss A CellStyleCss object, which is used for the `style=` argument if specifying a custom CSS rule. Examples -------- See [`GT.tab_style()`](`great_tables.GT.tab_style`). ## Helper Functions An assortment of helper functions is available in the Great Tables package.
The `md()` and `html()` helper functions can be used during label creation with the `tab_header()`, `tab_spanner()`, `tab_stubhead()`, and `tab_source_note()` methods. with_id(self: 'GTSelf', id: 'str | None' = None) -> 'GTSelf' Set the id for this table. Note that this is a shortcut for the `table_id=` argument in `GT.tab_options()`. Parameters ---------- id By default (with `None`) the table ID will be a random, ten-letter string as generated through internal use of the `random_id()` function. A custom table ID can be used here by providing a string. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- The use of `with_id` is straightforward: simply pass a string to `id=` to set the table ID: ```{python} from great_tables import GT, exibble GT(exibble).with_id("your-table-id") ``` with_locale(self: 'GTSelf', locale: 'str | None' = None) -> 'GTSelf' Set the default locale for the table. Setting a default locale affects formatters like `fmt_number()` and `fmt_date()` by having them default to locale-specific features (e.g., representing one thousand as 1.000,00). Parameters ---------- locale An optional locale identifier that can be used for formatting values according to the locale's rules. Examples include `"en"` for English (United States) and `"fr"` for French (France). Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Let's create a table and set its `locale=` to `"ja"` for Japan. Then, we call `fmt_currency()` to format the `"currency"` column. Since we didn't specify a `locale=` for `fmt_currency()`, it will adopt the globally set `"ja"` locale.
```{python}
from great_tables import GT, exibble

(
    GT(exibble)
    .with_locale("ja")
    .fmt_currency(
        columns="currency",
        decimals=3,
        use_seps=False
    )
)
```

**Great Tables** internally supports many locale options. You can find the available locales in the following table:

```{python}
from great_tables.data import __x_locales

columns = ["locale", "lang_name", "lang_desc", "territory_name", "territory_desc"]
GT(__x_locales.loc[:, columns]).cols_align("right")
```

md(text: 'str') -> 'Md'

Interpret input text as Markdown-formatted text.

Markdown can be used in certain places (e.g., source notes, table title/subtitle, etc.) and we can expect it to render to HTML. There is also the [`html()`](`great_tables.html`) helper function that allows you to use raw HTML text.

Parameters
----------
text
    The text that is understood to contain Markdown formatting.

Examples
--------
See [`GT.tab_header()`](`great_tables.GT.tab_header`).

html(text: 'str') -> 'Html'

Interpret input text as HTML-formatted text.

For certain pieces of text (like in column labels or table headings) we may want to express them as raw HTML. In fact, with HTML, anything goes so it can be much more than just text. The `html()` function will guard the input HTML against escaping, so your HTML tags will come through as HTML when rendered.

Parameters
----------
text
    The text that is understood to contain HTML formatting.

Examples
--------
See [`GT.tab_header()`](`great_tables.GT.tab_header`).

FromColumn(column: 'str', na_value: 'Any | None' = None, fn: 'Callable[[Any], Any] | None' = None) -> None

Specify that a style value should be fetched from a column in the data.

Parameters
----------
column
    A column name in the data containing the styling information.
na_value
    A single value to replace any NA values in the column (currently not supported).
fn
    A callable applied to transform each value extracted from `column=`.

Examples
--------
This example demonstrates styling the `"x"` column.
Style the text color using the `"color"` column:

```{python}
import pandas as pd
import polars as pl
from great_tables import GT, from_column, loc, style, px

df = pd.DataFrame({"x": [15, 20], "color": ["red", "blue"]})

(GT(df).tab_style(style=style.text(color=from_column("color")), locations=loc.body(columns=["x"])))
```

With polars, you can pass expressions directly:

```{python}
df_polars = pl.from_pandas(df)

(
    GT(df_polars).tab_style(
        style=style.text(color=pl.col("color")), locations=loc.body(columns=["x"])
    )
)
```

Style the text size using values from the `"x"` column, with the `px()` helper function as the `fn=` parameter:

```{python}
(
    GT(df).tab_style(
        style=style.text(color=from_column("color"), size=from_column("x", fn=px)),
        locations=loc.body(columns=["x"]),
    )
)
```

google_font(name: 'str') -> 'GoogleFont'

Specify a font from the *Google Fonts* service.

The `google_font()` helper function can be used wherever a font name might be specified. There are two instances where this helper can be used:

1. `opt_table_font(font=...)` (for setting a table font)
2. `style.text(font=...)` (itself used in [`tab_style()`](`great_tables.GT.tab_style`))

Parameters
----------
name
    The name of the Google Font to use.

Returns
-------
GoogleFont
    A GoogleFont object, which contains the name of the font and methods for incorporating the font in HTML output tables.

Examples
--------
Let's use the `exibble` dataset to create a table of two columns and eight rows. We'll replace missing values with em dashes using [`sub_missing()`](`great_tables.GT.sub_missing`). For text in the time column, we will use the font called `"IBM Plex Mono"` which is available from Google Fonts. This is defined inside the `google_font()` call, itself within the [`style.text()`](`great_tables.style.text`) method that's applied to the `style=` parameter of [`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, exibble, style, loc, google_font

(
    GT(exibble[["char", "time"]])
    .sub_missing()
    .tab_style(
        style=style.text(font=google_font(name="IBM Plex Mono")),
        locations=loc.body(columns="time")
    )
)
```

We can use a subset of the `sp500` dataset to create a small table. With [`fmt_currency()`](`great_tables.GT.fmt_currency`), we can display values as monetary values. Then, we'll set a larger font size for the table and opt to use the `"Merriweather"` font by calling `google_font()` within [`opt_table_font()`](`great_tables.GT.opt_table_font`). In cases where that font may not materialize, we include two font fallbacks: `"Cochin"` and the catchall `"Serif"` group.

```{python}
from great_tables import GT, google_font
from great_tables.data import sp500

(
    GT(sp500.drop(columns=["volume", "adj_close"]).head(10))
    .fmt_currency(columns=["open", "high", "low", "close"])
    .tab_options(table_font_size="20px")
    .opt_table_font(font=[google_font(name="Merriweather"), "Cochin", "Serif"])
)
```

system_fonts(name: 'FontStackName' = 'system-ui') -> 'list[str]'

Get a themed font stack that works well across systems.

A font stack can be obtained from `system_fonts()` using one of various keywords such as `"system-ui"`, `"old-style"`, and `"humanist"` (there are 15 in total) representing a themed set of fonts. These sets comprise a font family that has been tested to work across a wide range of computer systems.

Parameters
----------
name
    The name of a font stack. Must be drawn from the set of `"system-ui"` (the default), `"transitional"`, `"old-style"`, `"humanist"`, `"geometric-humanist"`, `"classical-humanist"`, `"neo-grotesque"`, `"monospace-slab-serif"`, `"monospace-code"`, `"industrial"`, `"rounded-sans"`, `"slab-serif"`, `"antique"`, `"didone"`, and `"handwritten"`.

Returns
-------
list[str]
    A list of font names that make up the font stack.
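To make the relationship between a returned font stack (a plain list of names) and the CSS `font-family` declarations listed in this section more concrete, here is a minimal sketch. The `to_font_family()` helper is hypothetical and not part of **Great Tables**, and the list is copied by hand from the `"industrial"` stack documented in this section; the exact list returned by `system_fonts("industrial")` is assumed to be equivalent but may differ in details such as quoting.

```python
# Hypothetical helper (not part of great_tables): turn a font stack,
# i.e. a list of font names, into a CSS `font-family` declaration.
def to_font_family(fonts: list[str]) -> str:
    # Wrap names that contain spaces in single quotes, as in the CSS listings
    quoted = [f"'{name}'" if " " in name else name for name in fonts]
    return "font-family: " + ", ".join(quoted) + ";"


# Names copied by hand from the documented "industrial" stack; this is an
# assumption about what system_fonts("industrial") returns.
industrial = [
    "Bahnschrift",
    "DIN Alternate",
    "Franklin Gothic Medium",
    "Nimbus Sans Narrow",
    "sans-serif-condensed",
    "sans-serif",
]

css_rule = to_font_family(industrial)
print(css_rule)
```

The order of the list matters: a browser walks the names from left to right and uses the first font available on the system, which is why each stack ends with a generic family keyword.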
The font stacks and the individual fonts used by platform
---------------------------------------------------------

### System UI (`"system-ui"`)

```css
font-family: system-ui, sans-serif;
```

The operating system interface's default typefaces are known as system UI fonts. They contain a variety of font weights, are quite readable at small sizes, and are perfect for UI elements. These typefaces serve as a great starting point for text in data tables and so this font stack is the default for **Great Tables**.

### Transitional (`"transitional"`)

```css
font-family: Charter, 'Bitstream Charter', 'Sitka Text', Cambria, serif;
```

The Enlightenment saw the development of transitional typefaces, which combine Old Style and Modern typefaces. *Times New Roman*, a transitional typeface created for the Times of London newspaper, is among the most well-known instances of this style.

### Old Style (`"old-style"`)

```css
font-family: 'Iowan Old Style', 'Palatino Linotype', 'URW Palladio L', P052, serif;
```

Old style typefaces were created during the Renaissance and are distinguished by diagonal stress, low contrast between thick and thin strokes, and rounded serifs. *Garamond* is among the most well-known instances of an old style typeface.

### Humanist (`"humanist"`)

```css
font-family: Seravek, 'Gill Sans Nova', Ubuntu, Calibri, 'DejaVu Sans', source-sans-pro, sans-serif;
```

Low contrast between thick and thin strokes and organic, calligraphic forms are traits of humanist typefaces. These typefaces, which draw their inspiration from Renaissance calligraphy, are frequently regarded as more legible and easier to read than other sans serif typefaces.

### Geometric Humanist (`"geometric-humanist"`)

```css
font-family: Avenir, Montserrat, Corbel, 'URW Gothic', source-sans-pro, sans-serif;
```

Clean, geometric forms and consistent stroke widths are characteristics of geometric humanist typefaces.
These typefaces, which are frequently used for headlines and other display purposes, are frequently thought to be contemporary and slick in appearance. A well-known example of this classification is *Futura*. ### Classical Humanist (`"classical-humanist"`) ```css font-family: Optima, Candara, 'Noto Sans', source-sans-pro, sans-serif; ``` The way the strokes gradually widen as they approach the stroke terminals without ending in a serif is what distinguishes classical humanist typefaces. The stone carving on Renaissance-era tombstones and classical Roman capitals served as inspiration for these typefaces. ### Neo-Grotesque (`"neo-grotesque"`) ```css font-family: Inter, Roboto, 'Helvetica Neue', 'Arial Nova', 'Nimbus Sans', Arial, sans-serif; ``` Neo-grotesque typefaces are a form of sans serif that originated in the late 19th and early 20th centuries. They are distinguished by their crisp, geometric shapes and regular stroke widths. *Helvetica* is among the most well-known examples of a Neo-grotesque typeface. ### Monospace Slab Serif (`"monospace-slab-serif"`) ```css font-family: 'Nimbus Mono PS', 'Courier New', monospace; ``` Monospace slab serif typefaces are distinguished by their fixed-width letters, which are the same width irrespective of their shape, and their straightforward, geometric forms. For reports, tabular work, and technical documentation, this technique is used to simulate typewriter output. ### Monospace Code (`"monospace-code"`) ```css font-family: ui-monospace, 'Cascadia Code', 'Source Code Pro', Menlo, Consolas, 'DejaVu Sans Mono', monospace; ``` Specifically created for use in programming and other technical applications, monospace code typefaces are used in these fields. These typefaces are distinguished by their clear, readable forms and monospaced design, which ensures that all letters and characters are the same width. 
### Industrial (`"industrial"`) ```css font-family: Bahnschrift, 'DIN Alternate', 'Franklin Gothic Medium', 'Nimbus Sans Narrow', sans-serif-condensed, sans-serif; ``` The development of industrial typefaces began in the late 19th century and was greatly influenced by the industrial and technological advancements of the time. Industrial typefaces are distinguished by their strong sans serif letterforms, straightforward appearance, and use of geometric shapes and straight lines. ### Rounded Sans (`"rounded-sans"`) ```css font-family: ui-rounded, 'Hiragino Maru Gothic ProN', Quicksand, Comfortaa, Manjari, 'Arial Rounded MT', 'Arial Rounded MT Bold', Calibri, source-sans-pro, sans-serif; ``` The rounded, curved letterforms that define rounded typefaces give them a softer, friendlier appearance. The typeface's rounded edges give it a more natural and playful feel, making it appropriate for use in casual or kid-friendly designs. Since the 1950s, the rounded sans-serif design has gained popularity and is still frequently used in branding, graphic design, and other fields. ### Slab Serif (`"slab-serif"`) ```css font-family: Rockwell, 'Rockwell Nova', 'Roboto Slab', 'DejaVu Serif', 'Sitka Small', serif; ``` Slab Serif typefaces are distinguished by the thick, block-like serifs that appear at the ends of each letterform. Typically, these serifs are unbracketed, which means that they do not have any curved or tapered transitions to the letter's main stroke. ### Antique (`"antique"`) ```css font-family: Superclarendon, 'Bookman Old Style', 'URW Bookman', 'URW Bookman L', 'Georgia Pro', Georgia, serif; ``` Serif typefaces that were popular in the 19th century include antique typefaces, also referred to as Egyptians. They are distinguished by their thick, uniform stroke weight and block-like serifs. The typeface *Clarendon* is a highly regarded example of this style and *Superclarendon* is a modern take on that revered typeface. 
### Didone (`"didone"`) ```css font-family: Didot, 'Bodoni MT', 'Noto Serif Display', 'URW Palladio L', P052, Sylfaen, serif; ``` Didone typefaces, also referred to as Modern typefaces, are distinguished by their vertical stress, sharp contrast between thick and thin strokes, and hairline serifs without bracketing. The Didone style first appeared in the late 18th century and became well-known in the early 19th century. *Bodoni* and *Didot* are two of the most well-known typefaces in this category. ### Handwritten (`"handwritten"`) ```css font-family: 'Segoe Print', 'Bradley Hand', Chilanka, TSCu_Comic, casual, cursive; ``` The appearance and feel of handwriting are replicated by handwritten typefaces. Although there are a wide variety of handwriting styles, this font stack tends to use a more casual and commonplace style. In regards to these types of fonts in tables, one can say that any table having a handwritten font will evoke a feeling of gleefulness. Examples -------- Using select columns from the `exibble` dataset, let's create a table with a number of components added. Following that, we'll set a font for the entire table using the `tab_options()` method with the `table_font_names` parameter. Instead of passing a list of font names, we'll use the `system_fonts()` helper function to get a font stack. In this case, we'll use the `"industrial"` font stack. 
```{python}
from great_tables import GT, exibble, md, system_fonts

(
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
    .opt_align_table_header(align="left")
    .tab_options(table_font_names=system_fonts("industrial"))
)
```

Invoking the `system_fonts()` helper function with the `"industrial"` argument will return a list of font names that make up the font stack. This is exactly the type of input that the `table_font_names` parameter requires.

define_units(units_notation: 'str') -> 'UnitDefinitionList'

With `define_units()` you can work with a specially-crafted units notation string and emit the units as HTML (with the `.to_html()` method). This function is useful as a standalone utility and it powers the `fmt_units()` method in **Great Tables**.

Parameters
----------
units_notation
    A string of units notation.

Returns
-------
UnitDefinitionList
    A list of unit definitions.

Specification of units notation
-------------------------------
The following table demonstrates the various ways in which units can be specified in the `units_notation` string and how the input is processed by the `define_units()` function. The concluding step for display of the units in HTML is to use the `to_html()` method.
```{python}
#| echo: false
from great_tables import GT, style, loc
import polars as pl

units_tbl = pl.DataFrame(
    {
        "rule": [
            "'^' creates a superscript",
            "'_' creates a subscript",
            "subscripts and superscripts can be combined",
            "use '[_subscript^superscript]' to create an overstrike",
            "a '/' at the beginning adds the superscript '-1'",
            "hyphen is transformed to minus sign when preceding a unit",
            "'x' at the beginning is transformed to '×'",
            "ASCII terms from biology/chemistry turned into terminology forms",
            "can create italics with '*' or '_'; create bold text with '**' or '__'",
            "special symbol set surrounded by colons",
            "chemistry notation: '%C6H6%'",
        ],
        "input": [
            "m^2",
            "h_0",
            "h_0^3",
            "h[_0^3]",
            "/s",
            "-h^2",
            "x10^3 kg^2 m^-1",
            "ug",
            "*m*^**2**",
            ":permille:C",
            "g/L %C6H12O6%",
        ],
    }
).with_columns(output=pl.col("input"))

(
    GT(units_tbl)
    .fmt_units(columns="output")
    .tab_style(
        style=style.text(font="courier"),
        locations=loc.body(columns="input")
    )
)
```

Examples
--------
Let's demonstrate a use case where we utilize `define_units()` to render an equation as the subtitle in the table header, which currently doesn't accept unit notation as input. We'll start by creating a Polars DataFrame representing the calculations of the equation $y = a_2x^2 + a_1x + a_0$.
```{python}
#| code-fold: true
import polars as pl
from great_tables import GT, html, define_units

df = pl.DataFrame(
    {"x": [1, 2, 3], "a2": [2, 3, 4], "a1": [3, 4, 5], "a0": [4, 5, 6]}
).with_columns(
    y=(
        pl.col("a2").mul(pl.col("x").pow(2))
        + pl.col("a1").mul(pl.col("x"))
        + pl.col("a0")
    )
)

df
```

If we try to use unit annotations to format the equation as the subtitle in the header, it won't work as expected:

```{python}
(
    GT(df)
    .cols_label(a2="{{a_2}}", a1="{{a_1}}", a0="{{a_0}}")
    .tab_header(title="Linear Algebra", subtitle="y={{a_2}}{{x^2}}+{{a_1}}x+{{a_0}}")
)
```

To address this, we can create a small helper function, `u2html()`, which wraps a given string in `define_units()` and emits the units to HTML. Next, we can build the subtitle by applying `u2html()` to the string with unit annotations. Finally, we pass the assembled subtitle string through `html()` to ensure it renders correctly.

```{python}
def u2html(x: str) -> str:
    return define_units(x).to_html()


subtitle = (
    "y"
    + "="
    + u2html("{{a_2}}")
    + u2html("{{x^2}}")
    + "+"
    + u2html("{{a_1}}")
    + "x"
    + "+"
    + u2html("{{a_0}}")
)

(
    GT(df)
    .cols_label(a2="{{a_2}}", a1="{{a_1}}", a0="{{a_0}}")
    .tab_header(title="Linear Algebra", subtitle=html(subtitle))
)
```

nanoplot_options(data_point_radius: 'int | list[int] | None' = None, data_point_stroke_color: 'str | list[str] | None' = None, data_point_stroke_width: 'int | list[int] | None' = None, data_point_fill_color: 'str | list[str] | None' = None, data_line_type: 'str | None' = None, data_line_stroke_color: 'str | None' = None, data_line_stroke_width: 'int | None' = None, data_area_fill_color: 'str | None' = None, data_bar_stroke_color: 'str | list[str] | None' = None, data_bar_stroke_width: 'int | list[int] | None' = None, data_bar_fill_color: 'str | list[str] | None' = None, data_bar_negative_stroke_color: 'str | None' = None, data_bar_negative_stroke_width: 'int | None' = None, data_bar_negative_fill_color: 'str | None' = None, reference_line_color: 'str |
None' = None, reference_area_fill_color: 'str | None' = None, vertical_guide_stroke_color: 'str | None' = None, vertical_guide_stroke_width: 'int | None' = None, show_data_points: 'bool | None' = None, show_data_line: 'bool | None' = None, show_data_area: 'bool | None' = None, show_reference_line: 'bool | None' = None, show_reference_area: 'bool | None' = None, show_vertical_guides: 'bool | None' = None, show_y_axis_guide: 'bool | None' = None, interactive_data_values: 'bool | None' = None, y_val_fmt_fn: 'Callable[..., str] | None' = None, y_axis_fmt_fn: 'Callable[..., str] | None' = None, y_ref_line_fmt_fn: 'Callable[..., str] | None' = None, currency: 'str | None' = None) -> 'dict[str, Any]' Helper for setting the options for a nanoplot. When using `cols_nanoplot()`, the defaults for the generated nanoplots can be modified with `nanoplot_options()` within the `options=` argument. Parameters ---------- data_point_radius The `data_point_radius=` option lets you set the radius for each of the data points. By default this is set to `10`. Individual radius values can be set by using a list of numeric values; however, the list provided must match the number of data points. data_point_stroke_color The default stroke color of the data points is `"#FFFFFF"` (`"white"`). This works well when there is a visible data line combined with data points with a darker fill color. The stroke color can be modified with `data_point_stroke_color=` for all data points by supplying a single color value. With a list of colors, each data point's stroke color can be changed (ensure that the list length matches the number of data points). data_point_stroke_width The width of the outside stroke for the data points can be modified with the `data_point_stroke_width=` option. By default, a value of `4` (as in '4px') is used. data_point_fill_color By default, all data points have a fill color of `"#FF0000"` (`"red"`). 
    This can be changed for all data points by providing a different color to `data_point_fill_color=`. A list of different colors can also be supplied, so long as its length is equal to the number of data points; the fill color values will be applied in order from left to right.
data_line_type
    This can accept either `"curved"` or `"straight"`. Curved lines are recommended when the nanoplot has fewer than 30 points and the data points are evenly spaced. In most other cases, straight lines might present better.
data_line_stroke_color
    The color of the data line can be modified from its default `"#4682B4"` (`"steelblue"`) color by supplying a color to the `data_line_stroke_color=` option.
data_line_stroke_width
    The width of the connecting data line can be modified with `data_line_stroke_width=`. By default, a value of `4` (as in '4px') is used.
data_area_fill_color
    The fill color for the area that bounds the data points in a line plot. The default is `"#FF0000"` (`"red"`) but can be changed by providing a color value to `data_area_fill_color=`.
data_bar_stroke_color
    The color of the stroke used for the data bars can be modified from its default `"#3290CC"` color by supplying a color to `data_bar_stroke_color=`.
data_bar_stroke_width
    The width of the stroke used for the data bars can be modified with the `data_bar_stroke_width=` option. By default, a value of `4` (as in '4px') is used.
data_bar_fill_color
    By default, all data bars have a fill color of `"#3FB5FF"`. This can be changed for all data bars by providing a different color to `data_bar_fill_color=`. A list of different colors can also be supplied, so long as its length is equal to the number of data bars; the fill color values will be applied in order from left to right.
data_bar_negative_stroke_color
    The color of the stroke used for the data bars that have negative values. The default color is `"#CC3243"` but this can be changed by supplying a color value to the `data_bar_negative_stroke_color=` option.
data_bar_negative_stroke_width The width of the stroke used for negative value data bars. This has the same default as `data_bar_stroke_width=` with a value of `4` (as in '4px'). This can be changed by giving a numeric value to the `data_bar_negative_stroke_width=` option. data_bar_negative_fill_color By default, all negative data bars have a fill color of `"#D75A68"`. This can however be changed by providing a color value to `data_bar_negative_fill_color=`. reference_line_color The reference line will have a color of `"#75A8B0"` if it is set to appear. This color can be changed by providing a single color value to `reference_line_color=`. reference_area_fill_color If a reference area has been defined and is visible it has by default a fill color of `"#A6E6F2"`. This can be modified by declaring a color value in the `reference_area_fill_color=` option. vertical_guide_stroke_color Vertical guides appear when hovering in the vicinity of data points. Their default color is `"#911EB4"` (a strong magenta color) and a fill opacity value of `0.4` is automatically applied to this. However, the base color can be changed with the `vertical_guide_stroke_color=` option. vertical_guide_stroke_width The vertical guide's stroke width, by default, is relatively large at `12` (this is '12px'). This is modifiable by setting a different value with `vertical_guide_stroke_width=`. show_data_points By default, all data points in a nanoplot are shown but this layer can be hidden by setting `show_data_points=` to `False`. show_data_line The data line connects data points together and it is shown by default. This data line layer can be hidden by setting `show_data_line=` to `False`. show_data_area The data area layer is adjacent to the data points and the data line. It is shown by default but can be hidden with `show_data_area=False`. show_reference_line The layer with a horizontal reference line appears underneath that of the data points and the data line. 
Like vertical guides, hovering over a reference will show its value. The reference line (if available) is shown by default but can be hidden by setting `show_reference_line=` to `False`. show_reference_area The reference area appears at the very bottom of the layer stack, if it is available (i.e., defined in `cols_nanoplot()`). It will be shown in the default case but can be hidden by using `show_reference_area=False`. show_vertical_guides Vertical guides appear when hovering over data points. This hidden layer is active by default but can be deactivated by using `show_vertical_guides=False`. show_y_axis_guide The *y*-axis guide will appear when hovering over the far left side of a nanoplot. This hidden layer is active by default but can be deactivated by using `show_y_axis_guide=False`. interactive_data_values By default, numeric data values will be shown only when the user interacts with certain regions of a nanoplot. This is because the values may be numerous (i.e., clutter the display when all are visible) and it can be argued that the values themselves are secondary to the presentation. However, for some types of plots (like horizontal bar plots), a persistent display of values alongside the plot marks may be desirable. By setting `interactive_data_values=False` we can opt for always displaying the data values alongside the plot components. y_val_fmt_fn If providing a function to `y_val_fmt_fn=`, customized formatting of the *y* values associated with the data points/bars is possible. y_axis_fmt_fn A function supplied to `y_axis_fmt_fn=` will result in customized formatting of the *y*-axis label values. y_ref_line_fmt_fn Providing a function for `y_ref_line_fmt_fn=` yields customized formatting of the reference line (if present). currency If the values are to be displayed as currency values, supply either: (1) a 3-letter currency code (e.g., `"USD"` for U.S. 
    Dollars, `"EUR"` for the Euro currency), or (2) a common currency name (e.g., `"dollar"`, `"pound"`, `"yen"`, etc.).

Examples
--------
See [`fmt_nanoplot()`](`great_tables.GT.fmt_nanoplot`).

## Table options

With the `opt_*()` functions, we have an easy way to set commonly-used table options without having to use `tab_options()` directly.

opt_stylize(self: 'GTSelf', style: 'int' = 1, color: 'str' = 'blue', add_row_striping: 'bool' = True) -> 'GTSelf'

Stylize your table with a colorful look.

With the `opt_stylize()` method you can quickly style your table with a carefully curated set of background colors, line colors, and line styles. There are six styles to choose from and they largely vary in the extent of coloring applied to different table locations. Some have table borders applied, some apply darker colors to the table stub and summary sections, and some even have vertical lines. In addition to choosing a `style` preset, there are six `color` variations that each use a range of five color tints. Each of the color tints has been fine-tuned to maximize the contrast between text and its background. There are 36 combinations of `style` and `color` to choose from. For examples of each style, see the [*Premade Themes*](../get-started/table-theme-premade.qmd) section of the **Get Started** guide.

Parameters
----------
style
    Six numbered styles are available. Simply provide a number from `1` (the default) to `6` to choose a distinct look.
color
    The color scheme of the table. The default value is `"blue"`. The valid values are `"blue"`, `"cyan"`, `"pink"`, `"green"`, `"red"`, and `"gray"`.
add_row_striping
    An option to enable row striping in the table body for the style chosen.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Using select columns from the `exibble` dataset, let's create a table with a number of components added.
Following that, we'll apply a predefined style to the table using the `opt_stylize()` method.

```{python}
from great_tables import GT, exibble, md

gt_tbl = (
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
    .opt_stylize()
)

gt_tbl
```

The table has been stylized with the default style and color. The default style is `1` and the default color is `"blue"`. The resulting table style is a combination of color and border settings that are applied to the table.

We can modify the overall style and choose a different color theme by providing different values to the `style=` and `color=` arguments.

```{python}
gt_tbl.opt_stylize(style=2, color="green")
```

opt_footnote_marks(self: 'GTSelf', marks: 'str | list[str]' = 'numbers') -> 'GTSelf'

Option to modify the set of footnote marks.

Alter the footnote marks for any footnotes that may be present in the table. Either a list of marks can be provided (including Unicode characters), or a specific keyword can be used to signify a preset sequence. This method serves as a shortcut for using `tab_options(footnotes_marks=)`.

We can supply a list of strings that will represent the series of marks. The series of footnote marks is recycled when its usage goes beyond the length of the set. At each cycle, the marks are simply doubled, tripled, and so on (e.g., `*` -> `**` -> `***`). The option exists for providing keywords for certain types of footnote marks. The keywords are:

- `"numbers"`: numeric marks, they begin from 1 and these marks are not subject to recycling behavior
- `"letters"`: lowercase alphabetic marks.
  Same as using the `gt.letters()` function which produces a list of 26 lowercase letters from the Roman alphabet
- `"LETTERS"`: uppercase alphabetic marks. Same as using the `gt.LETTERS()` function which produces a list of 26 uppercase letters from the Roman alphabet
- `"standard"`: symbolic marks, four symbols in total
- `"extended"`: symbolic marks, extends the standard set by adding two more symbols, making six

Parameters
----------
marks
    Either a list of strings that will represent the series of marks or a keyword string that represents a preset sequence of marks. The valid keywords are: `"numbers"` (for numeric marks), `"letters"` and `"LETTERS"` (for lowercase and uppercase alphabetic marks), `"standard"` (for a traditional set of four symbol marks), and `"extended"` (which adds two more symbols to the standard set).

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

opt_row_striping(self: 'GTSelf', row_striping: 'bool' = True) -> 'GTSelf'

Option to add or remove row striping.

By default, a table does not have row striping enabled. However, this method allows us to easily enable or disable striped rows in the table body. It's a convenient shortcut for `tab_options(row_striping_include_table_body=)`.

Parameters
----------
row_striping
    A boolean that indicates whether row striping should be added or removed. Defaults to `True`.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Using only a few columns from the `exibble` dataset, let's create a table with a number of components added. Following that, we'll add row striping to every second row with the `opt_row_striping()` method.
```{python}
from great_tables import GT, exibble, md

(
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
    .opt_row_striping()
)
```

opt_align_table_header(self: 'GTSelf', align: 'str' = 'center') -> 'GTSelf'

Option to align the table header.

By default, an added table header will have center alignment for both the title and the subtitle elements. This method allows us to easily set the horizontal alignment of the title and subtitle to the left, right, or center by using the `align=` argument. This method serves as a convenient shortcut for `gt.tab_options(heading_align=)`.

Parameters
----------
align
    The alignment of the title and subtitle elements in the table header. Options are `"center"` (the default), `"left"`, or `"right"`.

Returns
-------
GT
    The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Examples
--------
Using select columns from the `exibble` dataset, let's create a table with a number of components added. Following that, we'll align the header contents (consisting of the title and the subtitle) to the left with the `opt_align_table_header()` method.
```{python} from great_tables import GT, exibble, md ( GT( exibble[["num", "char", "currency", "row", "group"]], rowname_col="row", groupname_col="group" ) .tab_header( title=md("Data listing from **exibble**"), subtitle=md("`exibble` is a **Great Tables** dataset.") ) .fmt_number(columns="num") .fmt_currency(columns="currency") .tab_source_note(source_note="This is only a subset of the dataset.") .opt_align_table_header(align="left") ) ``` opt_vertical_padding(self: 'GTSelf', scale: 'float' = 1.0) -> 'GTSelf' Option to scale the vertical padding of the table. This method allows us to scale the vertical padding of the table by a factor of `scale`. The default value is `1.0` and this method serves as a convenient shortcut for `gt.tab_options(heading_padding=, column_labels_padding=, data_row_padding=, row_group_padding=, source_notes_padding=)`. Parameters ---------- scale The factor by which to scale the vertical padding. The default value is `1.0`. A value less than `1.0` will reduce the padding, and a value greater than `1.0` will increase the padding. The value must be between `0` and `3`. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Using select columns from the `exibble` dataset, let's create a table with a number of components added. Following that, we'll scale the vertical padding of the table by a factor of `3` using the `opt_vertical_padding()` method. ```{python} from great_tables import GT, exibble, md gt_tbl = ( GT( exibble[["num", "char", "currency", "row", "group"]], rowname_col="row", groupname_col="group" ) .tab_header( title=md("Data listing from **exibble**"), subtitle=md("`exibble` is a **Great Tables** dataset.") ) .fmt_number(columns="num") .fmt_currency(columns="currency") .tab_source_note(source_note="This is only a subset of the dataset.") ) gt_tbl.opt_vertical_padding(scale=3) ``` Now that's a tall table! 
The overall effect of scaling the vertical padding is that the table will appear taller and there will be more buffer space between the table elements. A value of `3` is pretty extreme and is likely to be too much in most cases, so, feel free to experiment with different values when looking to increase the vertical padding. Let's go the other way (using a value less than `1`) and try to condense the content vertically with a `scale` factor of `0.5`. This will reduce the top and bottom padding globally and make the table appear more compact. ```{python} gt_tbl.opt_vertical_padding(scale=0.5) ``` A value of `0.5` provides a reasonable amount of vertical padding and the table will appear more compact. This is useful when space is limited and, in such a situation, this is a practical solution to that problem. opt_horizontal_padding(self: 'GTSelf', scale: 'float' = 1.0) -> 'GTSelf' Option to scale the horizontal padding of the table. This method allows us to scale the horizontal padding of the table by a factor of `scale`. The default value is `1.0` and this method serves as a convenient shortcut for `gt.tab_options( heading_padding_horizontal=, column_labels_padding_horizontal=, data_row_padding_horizontal=, row_group_padding_horizontal=, source_notes_padding_horizontal=)`. Parameters ---------- scale The factor by which to scale the horizontal padding. The default value is `1.0`. A value less than `1.0` will reduce the padding, and a value greater than `1.0` will increase the padding. The value must be between `0` and `3`. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Using select columns from the `exibble` dataset, let's create a table with a number of components added. Following that, we'll scale the horizontal padding of the table by a factor of `3` using the `opt_horizontal_padding()` method. 
```{python} from great_tables import GT, exibble, md gt_tbl = ( GT( exibble[["num", "char", "currency", "row", "group"]], rowname_col="row", groupname_col="group" ) .tab_header( title=md("Data listing from **exibble**"), subtitle=md("`exibble` is a **Great Tables** dataset.") ) .fmt_number(columns="num") .fmt_currency(columns="currency") .tab_source_note(source_note="This is only a subset of the dataset.") ) gt_tbl.opt_horizontal_padding(scale=3) ``` The overall effect of scaling the horizontal padding is that the table will appear wider and there will be added buffer space between the table elements. The overall look of the table will be more spacious and neighboring pieces of text will be less cramped. Let's go the other way and scale the horizontal padding of the table by a factor of `0.5` using the `opt_horizontal_padding()` method. ```{python} gt_tbl.opt_horizontal_padding(scale=0.5) ``` What you get in this case is more condensed text across the horizontal axis. This may not always be desired when cells consist mainly of text, but it could be useful when the table is more visual and the cells are filled with graphics or other non-textual elements. opt_all_caps(self: 'GTSelf', all_caps: 'bool' = True, locations: 'type[LocColumnLabels] | type[LocRowGroups] | type[LocStub] | list[type[LocColumnLabels] | type[LocRowGroups] | type[LocStub]] | str | list[str] | None' = None) -> 'GTSelf' Option to use all caps in select table locations. Sometimes an all-capitalized look is suitable for a table. By using `opt_all_caps()`, we can transform characters in the column labels, the stub, and in all row groups in this way (and there's control over which of these locations are transformed). This method serves as a convenient shortcut for `tab_options(_text_transform="uppercase", _font_size="80%", _font_weight="bolder")` (for all `locations` selected). 
Parameters ---------- all_caps Indicates whether the text transformation to all caps should be performed (`True`, the default) or reset to default values (`False`) for the `locations` targeted. locations Which locations should undergo this text transformation? By default it includes all of the `loc.column_labels`, the `loc.stub`, and the `loc.row_groups` locations. However, we could just choose one or two of those. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Using select columns from the `exibble` dataset, let's create a table with a number of components added. Following that, we'll ensure that all text in the column labels, the stub, and in all row groups is transformed to all caps using the `opt_all_caps()` method. ```{python} from great_tables import GT, exibble, loc, md ( GT( exibble[["num", "char", "currency", "row", "group"]], rowname_col="row", groupname_col="group" ) .tab_header( title=md("Data listing from **exibble**"), subtitle=md("`exibble` is a **Great Tables** dataset.") ) .fmt_number(columns="num") .fmt_currency(columns="currency") .tab_source_note(source_note="This is only a subset of the dataset.") .opt_all_caps() ) ``` `opt_all_caps()` accepts a `locations` parameter that allows us to specify which components should be transformed. 
For example, if we only want to ensure that all text in the stub and all row groups is converted to all caps: ```{python} ( GT( exibble[["num", "char", "currency", "row", "group"]], rowname_col="row", groupname_col="group" ) .tab_header( title=md("Data listing from **exibble**"), subtitle=md("`exibble` is a **Great Tables** dataset.") ) .fmt_number(columns="num") .fmt_currency(columns="currency") .tab_source_note(source_note="This is only a subset of the dataset.") .opt_all_caps(locations=[loc.stub, loc.row_groups]) ) ``` opt_table_outline(self: 'GTSelf', style: 'str' = 'solid', width: 'str' = '3px', color: 'str' = '#D3D3D3') -> 'GTSelf' Option to wrap an outline around the entire table. The `opt_table_outline()` method puts an outline of consistent `style=`, `width=`, and `color=` around the entire table. It'll write over any existing outside lines so long as the `width=` value is larger than that of the existing lines. The default value of `style=` (`"solid"`) will draw a solid outline, whereas using `"none"` will remove any present outline. Parameters ---------- style The style of the table outline. The default value is `"solid"`. The valid values are `"solid"`, `"dashed"`, `"dotted"`, and `"none"`. width The width of the table outline. The default value is `"3px"`. The value must be in pixels and it must be an integer value. color The color of the table outline, where the default is `"#D3D3D3"`. The value must be either a hexadecimal color code or a color name. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Examples -------- Using select columns from the `exibble` dataset, let's create a table with a number of components added. Following that, we'll put an outline around the entire table using the `opt_table_outline()` method. 
```{python} from great_tables import GT, exibble, md ( GT( exibble[["num", "char", "currency", "row", "group"]], rowname_col="row", groupname_col="group" ) .tab_header( title=md("Data listing from **exibble**"), subtitle=md("`exibble` is a **Great Tables** dataset.") ) .fmt_number(columns="num") .fmt_currency(columns="currency") .tab_source_note(source_note="This is only a subset of the dataset.") .opt_table_outline() ) ``` opt_table_font(self: 'GTSelf', font: 'str | list[str] | dict[str, str] | GoogleFont | None' = None, stack: 'FontStackName | None' = None, weight: 'str | int | float | None' = None, style: 'str | None' = None, add: 'bool' = True) -> 'GTSelf' Options to define font choices for the entire table. The `opt_table_font()` method makes it possible to define fonts used for an entire table. Any font names supplied in `font=` will (by default, with `add=True`) be placed before the names present in the existing font stack (i.e., they will take precedence). You can choose to base the font stack on those provided by the [`system_fonts()`](`system_fonts.md`) helper function by providing a valid keyword for a themed set of fonts. Take note that you could still have entirely different fonts in specific locations of the table. To make that possible you would need to use [`tab_style()`](`great_tables.GT.tab_style`) in conjunction with [`style.text()`](`great_tables.style.text`). Parameters ---------- font One or more font names available on the user's system. This can be provided as a string or a list of strings. Alternatively, you can specify font names using the `google_font()` helper function. The default value is `None` since you could instead opt to use `stack` to define a list of fonts. stack A name that is representative of a font stack (obtained internally via the `system_fonts()` helper function). If provided, this new stack will replace any defined fonts and any `font=` values will be prepended. style An option to modify the text style. 
Can be one of `"normal"`, `"italic"`, or `"oblique"`. weight Option to set the weight of the font. Can be a text-based keyword such as `"normal"`, `"bold"`, `"lighter"`, `"bolder"`, or a numeric value between `1` and `1000`. Please note that typefaces have varying support for the numeric mapping of weight. add Should fonts be added to the beginning of any already-defined fonts for the table? By default, this is `True` and is recommended since those fonts already present can serve as fallbacks when the fonts specified in `font` are not available. If a `stack=` value is provided, then `add` will automatically be set to `False`. Returns ------- GT The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining. Possibilities for the `stack` argument -------------------------------------- There are several themed font stacks available via the [`system_fonts()`](`system_fonts.md`) helper function. That function can be used to generate all or a segment of a list supplied to the `font=` argument. However, by using the `stack=` argument with one of the 15 keywords for the font stacks available in [`system_fonts()`](`system_fonts.md`), we can be sure that the typeface class will work across multiple computer systems. Any of the following keywords can be used with `stack=`: - `"system-ui"` - `"transitional"` - `"old-style"` - `"humanist"` - `"geometric-humanist"` - `"classical-humanist"` - `"neo-grotesque"` - `"monospace-slab-serif"` - `"monospace-code"` - `"industrial"` - `"rounded-sans"` - `"slab-serif"` - `"antique"` - `"didone"` - `"handwritten"` Examples -------- Let's use a subset of the `sp500` dataset to create a small table. With `opt_table_font()` we can add some preferred font choices for modifying the text of the entire table. Here we'll use the `"Superclarendon"` and `"Georgia"` fonts (the second font serves as a fallback). 
```{python} import polars as pl from great_tables import GT from great_tables.data import sp500 sp500_mini = pl.from_pandas(sp500).slice(0, 10).drop(["volume", "adj_close"]) ( GT(sp500_mini, rowname_col="date") .fmt_currency(use_seps=False) .opt_table_font(font=["Superclarendon", "Georgia"]) ) ``` In practice, both of these fonts are not likely to be available on all systems. The `opt_table_font()` method safeguards against this by prepending the fonts in the `font=` list to the existing font stack. This way, if both fonts are not available, the table will fall back to using the list of default table fonts. This behavior is controlled by the `add=` argument, which is `True` by default. With the `sza` dataset we'll create a two-column, eleven-row table. Within `opt_table_font()`, the `stack=` argument will be supplied with the "rounded-sans" font stack. This sets up a family of fonts with rounded, curved letterforms that should be locally available in different computing environments. ```{python} from great_tables.data import sza sza_mini = ( pl.from_pandas(sza) .filter((pl.col("latitude") == "20") & (pl.col("month") == "jan")) .drop_nulls() .drop(["latitude", "month"]) ) ( GT(sza_mini) .opt_table_font(stack="rounded-sans") .opt_all_caps() ) ``` opt_css(self: 'GTSelf', css: 'str', add: 'bool' = True, allow_duplicates: 'bool' = False) -> 'GTSelf' Option to add custom CSS for the table. `opt_css()` makes it possible to add extra CSS rules to a table. This CSS will be added after the compiled CSS that Great Tables generates automatically when the object is transformed to an HTML output table. If you want to set CSS styles on a specific table location, use `tab_style()` with `style.css()` instead. Parameters ---------- css The CSS to include as part of the rendered table's ` ``` ```{python} # | echo: false # | output: asis print(":::{.grid}") for ii in range(1, 7): gt_html = gt_ex.opt_stylize(style=ii).as_raw_html() print( ":::{.g-col-lg-4 .g-col-12 .shrink-example}", f"
{ii}
", gt_html, ":::", sep="\n\n" ) print(":::") ``` ## `opt_*()` convenience methods This section shows the different `opt_*()` methods available. They serve as convenience methods for common `~~GT.tab_options()` tasks. ### Align table header ```{python} gt_ex.opt_align_table_header(align="left") ``` The title and subtitle are now left-aligned rather than centered, which works well for tables embedded in text-heavy documents. ### Make text ALL CAPS ```{python} gt_ex.opt_all_caps() ``` Column labels and row group labels are rendered in uppercase, giving the table a more formal, structured appearance. ### Reduce or expand padding ```{python} gt_ex.opt_vertical_padding(scale=0.3) ``` Reducing vertical padding creates a more compact table that fits more data into less vertical space. ```{python} gt_ex.opt_horizontal_padding(scale=3) ``` Increasing horizontal padding adds breathing room between columns, which improves readability when columns contain long values. ### Set table outline ```{python} gt_ex.opt_table_outline() ``` The `opt_*()` methods give you quick access to common styling patterns without needing to remember the specific `~~GT.tab_options()` argument names. For full control, you can always drop down to `~~GT.tab_options()` directly, but these convenience methods cover the most frequent customization needs in just a single method call. ### Nanoplots :::{.callout-warning} `~~GT.fmt_nanoplot()` is still experimental. ::: Nanoplots are tiny plots you can use in your table. They are simple by design, mainly because there isn't a lot of space to work with. With that simplicity, however, you do get a set of very succinct data visualizations that adapt nicely to the amount of data you feed into them. 
The main features of nanoplots include the following: - interactivity: you can hover over data and other elements to show values - choice of line and bar charting - you can annotate plots with a reference line and/or area - plenty of easy-to-use options for composing your plots ## A simple line-based nanoplot Let's make some simple plots with a Polars DataFrame. Here we are using strings of space-separated numbers to define data values for each cell in the `numbers` column. The `~~GT.fmt_nanoplot()` method understands that these are input values for a line plot (the default type of nanoplot). ```{python} from great_tables import GT import polars as pl random_numbers_df = pl.DataFrame( { "example": ["Row " + str(x) for x in range(1, 5)], "numbers": [ "20 23 6 7 37 23 21 4 7 16", "2.3 6.8 9.2 2.42 3.5 12.1 5.3 3.6 7.2 3.74", "-12 -5 6 3.7 0 8 -7.4", "2 0 15 7 8 10 1 24 17 13 6", ], } ) GT(random_numbers_df).fmt_nanoplot(columns="numbers") ``` This looks a lot like the familiar sparklines you might see in tables where space for plots is limited. The input values, strings of space-separated values, can be considered here as *y* values and they are evenly spaced along the imaginary *x* axis. Hovering over (or touching) the values is something of a treat! You might notice that: - data values are automatically formatted for you in a compact fashion - the plot elements also display pertinent values This sort of interactivity is baked into the rendered SVG graphics that `~~GT.fmt_nanoplot()` generates from your data and selection of options. Polars lets us express 'lists-of-values-per-cell' in different ways and **Great Tables** is pretty good at understanding different column *dtypes*. So, you can alternatively create the same table as above with the following code. 
```python random_numbers_df = pl.DataFrame( { "example": ["Row " + str(x) for x in range(1, 5)], "numbers": [ { "val": [20, 23, 6, 7, 37, 23, 21, 4, 7, 16] }, { "val": [2.3, 6.8, 9.2, 2.42, 3.5, 12.1, 5.3, 3.6, 7.2, 3.74] }, { "val": [-12, -5, 6, 3.7, 0, 8, -7.4] }, { "val": [2, 0, 15, 7, 8, 10, 1, 24, 17, 13, 6] }, ], } ) GT(random_numbers_df).fmt_nanoplot(columns="numbers") ``` Both forms of the `numbers` column in the two DataFrames look the same to `~~GT.fmt_nanoplot()`. The key for the list of values (here, `"val"`) can be anything as long as it's repeated down the column. So the choice is yours on how you want to prepare those column values. ## The reference line and the reference area You can insert two additional things which may be useful: a reference line and a reference area. You can define them either through literal values or via keywords (these are: `"mean"`, `"median"`, `"min"`, `"max"`, `"q1"`, `"q3"`, `"first"`, or `"last"`). Here's a reference line that corresponds to the mean data value of each nanoplot: ```{python} GT(random_numbers_df).fmt_nanoplot(columns="numbers", reference_line="mean") ``` This example uses a reference area that bounds the minimum value to the median value: ```{python} GT(random_numbers_df).fmt_nanoplot(columns="numbers", reference_area=["min", "median"]) ``` As an added touch, you don't need to worry about the order of the keywords provided to `reference_area=` (which could be potentially problematic if providing a literal value and a keyword). ## Using `autoscale=` to have a common *y*-axis scale across plots There are lots of options. Like, if you want to ensure that the scale is shared across all of the nanoplots (so you can better get a sense of overall magnitude), you can set `autoscale=` to `True`: ```{python} GT(random_numbers_df).fmt_nanoplot(columns="numbers", autoscale=True) ``` If you hover along or touch the left side of any of the plots above, you'll see that each *y* scale runs from `-12.0` to `37.0`. 
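To see where that shared range comes from, here is a minimal sketch in plain Python. It is not the library's internal code; it only parses the same space-separated strings used in the `numbers` column above and finds the overall minimum and maximum. The helper name `shared_y_range` is hypothetical and exists purely for illustration.

```python
# Space-separated value strings, copied from the `numbers` column above.
number_strings = [
    "20 23 6 7 37 23 21 4 7 16",
    "2.3 6.8 9.2 2.42 3.5 12.1 5.3 3.6 7.2 3.74",
    "-12 -5 6 3.7 0 8 -7.4",
    "2 0 15 7 8 10 1 24 17 13 6",
]


def shared_y_range(rows: list[str]) -> tuple[float, float]:
    """Hypothetical helper: overall (min, max) across every row's values."""
    # Flatten all rows into one list of floats before taking the extremes.
    values = [float(token) for row in rows for token in row.split()]
    return min(values), max(values)


y_min, y_max = shared_y_range(number_strings)
print(y_min, y_max)  # -12.0 37.0
```

The lowest value (`-12`) sits in the third row and the highest (`37`) in the first, which is why no single row spans the full scale on its own.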
Using `autoscale=True` is very useful if you want to compare the magnitudes of values across rows in addition to their trends. It won't, however, make much sense if the overall magnitudes of values vary wildly across rows (e.g., comparing changing currency values or stock prices over time). ## Using the `nanoplot_options()`{.qd-no-link} helper function There are many options for customization. You can radically change the look of a collection of nanoplots with the `nanoplot_options()` helper function. You invoke that function in the `options=` argument of `~~GT.fmt_nanoplot()`. You can modify the sizes and colors of different elements, decide which elements are even present, and much more! Here's an example where a line-based nanoplot retains all of its elements, but the overall appearance is greatly altered. ```{python} from great_tables import nanoplot_options ( GT(random_numbers_df) .fmt_nanoplot( columns="numbers", options=nanoplot_options( data_point_radius=8, data_point_stroke_color="black", data_point_stroke_width=2, data_point_fill_color="white", data_line_type="straight", data_line_stroke_color="brown", data_line_stroke_width=2, data_area_fill_color="orange", vertical_guide_stroke_color="green", ), ) ) ``` As can be seen, you have a lot of fine-grained control over the look of a nanoplot. ## Making nanoplots with bars using `plot_type="bar"` We don't just support line plots in `~~GT.fmt_nanoplot()`; we also have the option to show bar plots. The only thing you need to change is the value of the `plot_type=` argument to `"bar"`: ```{python} GT(random_numbers_df).fmt_nanoplot(columns="numbers", plot_type="bar") ``` An important difference between line plots and bar plots is that the bars project from a zero line. Notice that some negative values in the bar-based nanoplot appear red and radiate downward from the gray zero line. 
Using `plot_type="bar"` still allows us to supply a reference line and a reference area with `reference_line=` and `reference_area=`. The `autoscale=` option works here as well. We also have a set of options just for bar plots available inside `nanoplot_options()`. Here's an example where we use all of the aforementioned customization possibilities: ```{python} ( GT(random_numbers_df) .fmt_nanoplot( columns="numbers", plot_type="bar", autoscale=True, reference_line="min", reference_area=[0, "max"], options=nanoplot_options( data_bar_stroke_color="gray", data_bar_stroke_width=2, data_bar_fill_color="orange", data_bar_negative_stroke_color="blue", data_bar_negative_stroke_width=1, data_bar_negative_fill_color="lightblue", reference_line_color="pink", reference_area_fill_color="bisque", vertical_guide_stroke_color="blue", ), ) ) ``` The customized bars use orange fills, gray strokes, and a pink reference line. Negative values are styled separately with blue strokes and light blue fills, making it easy to distinguish positive and negative trends at a glance. ## Horizontal bar and line plots Single-value bar plots, running in the horizontal direction, can be made by simply invoking `~~GT.fmt_nanoplot()` on a column of numeric values. These plots are meant for comparison across rows so the method automatically scales the horizontal bars to facilitate this type of display. Here's a simple example that uses `plot_type="bar"` on the `numbers` column that contains a single numeric value in every cell. ```{python} single_vals_df = pl.DataFrame( { "example": ["Row " + str(x) for x in range(1, 5)], "numbers": [2.75, 0, -3.2, 8] } ) GT(single_vals_df).fmt_nanoplot(columns="numbers", plot_type="bar") ``` This, interestingly enough, works with the `"line"` type of nanoplot. 
The result is akin to a lollipop plot: ```{python} GT(single_vals_df).fmt_nanoplot(columns="numbers") ``` You get to customize the line and the data point marker with the latter display of single values, and that's a plus. Nonetheless, it is more common to see horizontal bar plots in tables and the extra customization of negative values makes that form of presentation more advantageous. ## Line plots with paired *x* and *y* values Aside from a single stream of *y* values, we can plot pairs of *x* and *y* values. This works only for the `"line"` type of plot. We can set up a column of Polars `struct` values in a DataFrame to have this input data prepared for `~~GT.fmt_nanoplot()`. Notice that the dictionary values in the enclosed list must have the `"x"` and `"y"` keys. Further to this, the list lengths for each of `"x"` and `"y"` must match (i.e., to make valid pairs of *x* and *y*). ```{python} weather_2 = pl.DataFrame( { "station": ["Station " + str(x) for x in range(1, 4)], "temperatures": [ { "x": [6.1, 8.0, 10.1, 10.5, 11.2, 12.4, 13.1, 15.3], "y": [24.2, 28.2, 30.2, 30.5, 30.5, 33.1, 33.5, 32.7], }, { "x": [7.1, 8.2, 10.3, 10.75, 11.25, 12.5, 13.5, 14.2], "y": [18.2, 18.1, 20.3, 20.5, 21.4, 21.9, 23.1, 23.3], }, { "x": [6.3, 7.1, 10.3, 11.0, 12.07, 13.1, 15.12, 16.42], "y": [15.2, 17.77, 21.42, 21.63, 25.23, 26.84, 27.2, 27.44], }, ] } ) ( GT(weather_2) .fmt_nanoplot( columns="temperatures", plot_type="line", expand_x=[5, 16], expand_y=[10, 40], options=nanoplot_options( show_data_area=False, show_data_line=False ) ) ) ``` The options for removing the *data area* and the *data line* (through the corresponding `show_*` arguments of `nanoplot_options()`) make the finalized nanoplots look somewhat like scatter plots. Nanoplots bring data visualization directly into your table cells, giving readers an immediate visual sense of trends and distributions without leaving the tabular format. 
Between line plots, bar charts, reference annotations, and extensive customization through `nanoplot_options()`, you can tailor these compact visualizations to match your data and your presentation style. ## Advanced Topics ### Column Selection Many **Great Tables** methods accept a `columns=` argument for targeting specific columns. Rather than limiting you to a simple list of column names, the package supports a flexible selection system that includes positional indexing, pattern-matching functions, and Polars selectors. This page demonstrates each of these approaches. ## Selection Options The `columns=` argument for methods like `~~GT.tab_spanner()`, `~~GT.cols_move()`, and `~~GT.tab_style()` allows a range of options for selecting columns. The simplest approach is just a list of strings with the exact column names. However, we can specify columns using any of the following: * a single string column name. * an integer for the column's position. * a list of strings or integers. * a **Polars** selector. * a function that takes a string and returns `True` or `False`. ```{python} from great_tables import GT from great_tables.data import exibble lil_exibble = exibble[["num", "char", "fctr", "date", "time"]].head(4) gt_ex = GT(lil_exibble) gt_ex ``` This five-column table will serve as the basis for demonstrating each selection approach. ## Using integers We can use a list of strings or integers to select columns by name or position, respectively. ```{python} gt_ex.cols_move_to_start(columns=["date", 1, -1]) ``` Note the code above moved the following columns: * The string `"date"` matched the column of the same name. * The integer `1` matched the second column (this is similar to list indexing). * The integer `-1` matched the last column. Moreover, the order of the list defines the order of selected columns. In this case, `"date"` was the first entry, so it's the very first column in the new table. 
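The resolution rules for a mixed list of names and positions can be sketched in plain Python. This is only an illustration of the behavior described above, not the library's implementation: the helper names `resolve_columns` and `move_to_start` are hypothetical, and dropping repeated selections is an assumption made to keep the sketch simple.

```python
def resolve_columns(all_columns: list[str], selection: list) -> list[str]:
    """Hypothetical helper: turn a mix of names and positions into names."""
    resolved = []
    for item in selection:
        # Integers index into the column list (negative values count from
        # the end); strings must match an existing column name exactly.
        name = all_columns[item] if isinstance(item, int) else item
        if name not in all_columns:
            raise ValueError(f"Unknown column: {name!r}")
        if name not in resolved:  # assumption: ignore repeated selections
            resolved.append(name)
    return resolved


def move_to_start(all_columns: list[str], selection: list) -> list[str]:
    """Hypothetical helper: selected columns first, the rest in original order."""
    selected = resolve_columns(all_columns, selection)
    return selected + [col for col in all_columns if col not in selected]


columns = ["num", "char", "fctr", "date", "time"]
print(resolve_columns(columns, ["date", 1, -1]))  # ['date', 'char', 'time']
print(move_to_start(columns, ["date", 1, -1]))
# ['date', 'char', 'time', 'num', 'fctr']
```

The selection order drives the output order, so `"date"` leads even though it is the fourth column in the original data.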
## Using **Polars** selectors When using a **Polars** DataFrame, you can select columns using [**Polars** selectors](https://pola-rs.github.io/polars/py-polars/html/reference/selectors.html). The example below uses **Polars** selectors to move all columns that start with `"c"` or `"f"` to the start of the table. ```{python} import polars as pl import polars.selectors as cs pl_df = pl.from_pandas(lil_exibble) GT(pl_df).cols_move_to_start(columns=cs.starts_with("c") | cs.starts_with("f")) ``` In general, selection should match the behaviors of the **Polars** `DataFrame.select()` method. ```{python} pl_df.select(cs.starts_with("c") | cs.starts_with("f")).columns ``` See the [Selectors page in the polars docs](https://pola-rs.github.io/polars/py-polars/html/reference/selectors.html) for more information on this. ## Using functions A function can be used to select columns. It should take a column name as a string and return `True` or `False`. ```{python} gt_ex.cols_move_to_start(columns=lambda x: "c" in x) ``` These selection methods work consistently across all **Great Tables** methods that accept a `columns=` argument. Whether you prefer explicit column names, positional indexing, Polars selectors, or custom functions, you can choose the approach that best fits your workflow and data. ### Row Selection Just as you can target specific columns, **Great Tables** also provides flexible ways to select rows. The `rows=` argument appears in formatting methods, location specifiers, and styling calls, allowing you to apply operations to a precise subset of your data. This page covers each of the available selection mechanisms. ## Selection Options Location and formatter functions (e.g. `loc.body()` and `~~GT.fmt_number()`) can be applied to specific rows, using the `rows=` argument. Rows may be specified using any of the following: * None (the default), to select everything. * an integer for the row's position. * a list of integers. * a **Polars** selector for filtering. 
* a function that takes a DataFrame and returns a boolean Series. The following sections will use a subset of the `exibble` data to demonstrate these options. ```{python} from great_tables import GT, exibble, loc, style lil_exibble = exibble[["num", "char", "currency"]].head(3) gt_ex = GT(lil_exibble) ``` ## Using integers Use a single integer, or a list of integers, to select rows by position. ```{python} gt_ex.fmt_currency("currency", rows=0, decimals=1) ``` Notice that a dollar sign (`$`) was only added to the first row (index `0` in Python). Indexing works the same as selecting items from a Python list. Negative integers select relative to the final row. ```{python} gt_ex.fmt_currency("currency", rows=[0, -1], decimals=1) ``` The first and last rows now show currency formatting, while the middle row remains unchanged. Negative indices count backward from the end, just as with Python lists. ## Using polars expressions The `rows=` argument accepts Polars expressions, which return a boolean Series indicating which rows to operate on. For example, the code below formats the `num` column, but only in rows where `currency` is less than 40. ```{python} import polars as pl gt_polars = GT(pl.from_pandas(lil_exibble)) gt_polars.fmt_integer("num", rows=pl.col("currency") < 40) ``` Here's a more realistic example, which highlights the row with the highest value for `currency`. ```{python} import polars.selectors as cs gt_polars.tab_style( style.fill("yellow"), loc.body( columns=cs.all(), rows=pl.col("currency") == pl.col("currency").max() ) ) ``` The row with the maximum currency value is highlighted with a yellow background. Using expressions for row selection keeps the logic declarative and close to the styling call. ## Using a function Since libraries like `pandas` don't have lazy expressions, the `rows=` argument also accepts a function for selecting rows. The function should take a DataFrame and return a boolean Series. 
Here's the same example as the previous polars section, but with pandas data, and a lambda for selecting rows. ```{python} gt_ex.fmt_integer("num", rows=lambda D: D["currency"] < 40) ``` Here's the styling example from the previous polars section. ```{python} import polars.selectors as cs gt_ex.tab_style( style.fill("yellow"), loc.body( columns=lambda colname: True, rows=lambda D: D["currency"] == D["currency"].max() ) ) ``` Whether you prefer integer indexing for quick positional access, Polars expressions for declarative filtering, or functions for compatibility with pandas, the `rows=` argument adapts to your data workflow. Combined with column selection, these tools give you fine-grained control over exactly which cells your formatting and styling operations affect. ### Location Selection The `loc` module is what connects your styling intentions to specific parts of the table. Each location specifier identifies a region of the table (such as the header, body, stub, or footer) and many of them also support targeting specific columns or rows within that region. This page provides a comprehensive overview of all available location specifiers and how to use them effectively. ## Overview Great Tables uses the `loc` module to specify locations for styling in `~~GT.tab_style()`. Some location specifiers also allow selecting specific columns and rows of data. For example, you might style a particular row name, group, column, or spanner label. The table below shows the different location specifiers, along with the types of column or row selection they allow. 
```{python}
# | echo: false
import polars as pl
from great_tables import GT

data = [
    ["header", "loc.header()", "composite"],
    ["", "loc.title()", ""],
    ["", "loc.subtitle()", ""],
    ["boxhead", "loc.column_header()", "composite"],
    ["", "loc.spanner_labels()", "columns"],
    ["", "loc.column_labels()", "columns"],
    ["row stub", "loc.stub()", "rows"],
    ["", "loc.row_groups()", "rows"],
    # ["", "loc.summary_stub()", "rows"],
    ["", "loc.grand_summary_stub()", "rows"],
    ["table body", "loc.body()", "columns and rows"],
    # ["", "loc.summary_rows()", "columns and rows"],
    ["", "loc.grand_summary_rows()", "columns and rows"],
    ["footer", "loc.footer()", "composite"],
    ["", "loc.source_notes()", ""],
]

df = pl.DataFrame(data, schema=["table part", "name", "selection"], orient="row")

GT(df)
```

Note that composite specifiers are ones that target multiple locations. For example, `loc.header()` specifies both `loc.title()` and `loc.subtitle()`.

## Setting up data

The examples below will use this small dataset to show selecting different locations, as well as specific rows and columns within a location (where supported).

```{python}
import polars as pl
import polars.selectors as cs

from great_tables import GT, loc, style, exibble

pl_exibble = pl.from_pandas(exibble)[[0, 1, 4], ["num", "char", "group"]]

pl_exibble
```

This small three-row, three-column dataset gives us enough structure to demonstrate row and column targeting without cluttering the output.

## Simple locations

Simple locations don't take any arguments.

For example, styling the title uses `loc.title()`.

```{python}
(
    GT(pl_exibble)
    .tab_header("A title", "A subtitle")
    .tab_style(
        style.fill("yellow"),
        loc.title(),
    )
)
```

Only the title receives the yellow fill; the subtitle and the rest of the table remain unstyled. Simple locations are useful when you want precise control over a single element.

## Composite locations

Composite locations target multiple simple locations.
For example, `loc.header()` includes both `loc.title()` and `loc.subtitle()`.

```{python}
(
    GT(pl_exibble)
    .tab_header("A title", "A subtitle")
    .tab_style(
        style.fill("yellow"),
        loc.header(),
    )
)
```

Both the title and subtitle are filled with yellow because `loc.header()` targets the entire header region. Composite locations are a convenient shorthand when you want the same style on all sub-parts.

## Body columns, rows and mask

Use `columns=` and `rows=` in `loc.body()` to style specific cells in the table body.

```{python}
(
    GT(pl_exibble).tab_style(
        style.fill("yellow"),
        loc.body(
            columns=cs.starts_with("cha"),
            rows=pl.col("char").str.contains("a"),
        ),
    )
)
```

Alternatively, use `mask=` in `loc.body()` to apply conditional styling to rows on a per-column basis.

```{python}
(
    GT(pl_exibble).tab_style(
        style.fill("yellow"),
        loc.body(mask=cs.string().str.contains("p")),
    )
)
```

This is discussed in detail in [Styling the Table Body](./11-styling-the-table-body.qmd).

## Column labels

Locations like `loc.spanner_labels()` and `loc.column_labels()` can select specific column and spanner labels.

You can use name strings, index positions, or polars selectors.

```{python}
GT(pl_exibble).tab_style(
    style.fill("yellow"),
    loc.column_labels(
        cs.starts_with("cha"),
    ),
)
```

However, note that `loc.spanner_labels()` currently only accepts a list of string names.

## Row and group names

Row and group names in `loc.stub()` and `loc.row_groups()` may be specified three ways:

* by name
* by index
* by polars expression

```{python}
gt = GT(pl_exibble).tab_stub(
    rowname_col="char",
    groupname_col="group",
)

gt.tab_style(style.fill("yellow"), loc.stub())
```

All row labels in the stub are highlighted in yellow.

```{python}
gt.tab_style(style.fill("yellow"), loc.stub("banana"))
```

Only the `"banana"` row label is styled, demonstrating name-based targeting.
```{python}
gt.tab_style(style.fill("yellow"), loc.stub(["apricot", 2]))
```

You can mix names and integer indices in a list to target multiple specific rows at once.

### Groups by name and position

Note that for specifying row groups, the group corresponding to the group name or row number in the original data is used.

For example, the code below styles the group corresponding to the row at index 1 (i.e., the second row) in the data.

```{python}
gt.tab_style(
    style.fill("yellow"),
    loc.row_groups(1),
)
```

Since the second row (starting with "banana") is in "grp_a", that is the group that gets styled.

This means you can use a polars expression to select groups:

```{python}
gt.tab_style(
    style.fill("yellow"),
    loc.row_groups(pl.col("group") == "grp_b"),
)
```

You can also specify group names using a string (or list of strings).

```{python}
gt.tab_style(
    style.fill("yellow"),
    loc.row_groups("grp_b"),
)
```

The `loc` module provides a complete vocabulary for addressing any part of your table. By combining location specifiers with column selectors, row filters, and Polars expressions, you can apply styles to exactly the right cells. For more details on styling itself, see [Styling the Table Body](./11-styling-the-table-body.qmd) and [Styling the Whole Table](./12-styling-the-whole-table.qmd).

### Exporting and Saving Tables

Once you have built a table, you need to get it into its final destination. That might be a notebook cell, a standalone HTML file, a LaTeX document, or an image file for inclusion in a report or presentation. **Great Tables** provides several export methods to cover these use cases, each with options to control the output format.

## Displaying Tables

In most notebook environments (Jupyter, Quarto, Marimo), simply placing a `GT` object as the last expression in a cell will render the table automatically. However, you can also use the `GT.show()` method for explicit control over where the table is displayed.
```{python}
from great_tables import GT
from great_tables.data import exibble

gt_tbl = (
    GT(exibble.head(3)[["num", "char", "currency"]])
    .tab_header(title="Example Table", subtitle="A small demonstration")
    .fmt_currency(columns="currency")
    .fmt_number(columns="num", decimals=2)
)

gt_tbl.show()
```

The `target=` argument controls the display destination. The available options are:

- `"auto"` (the default): displays inline in a notebook if possible, otherwise opens a browser window.
- `"notebook"`: forces inline notebook display.
- `"browser"`: opens the table in your default web browser. This is particularly useful when working in the console or when you want to see the full styled output that some IDEs may suppress.

```python
# Open in a browser window (useful when running from a script or console)
gt_tbl.show(target="browser")
```

## Getting HTML as a String

The `GT.as_raw_html()` method returns the table as an HTML string. This is useful for embedding tables in web applications, email templates, or custom HTML documents.

```{python}
html_str = gt_tbl.as_raw_html()

# Show the first 200 characters to see the structure
print(html_str[:200])
```

The method accepts several arguments that control the output format.

### Inline CSS for Email

Email clients typically strip `