----------------------------------------------------------------------
This is the API documentation for the great_tables library.
----------------------------------------------------------------------
## Table Creation
All tables created in Great Tables begin by using `GT()`. With this class, we supply the input data table and some basic options for creating a stub and row groups (with the `rowname_col=` and `groupname_col=` arguments). All GT methods are documented on their own pages.
GT(data: 'Any', rowname_col: 'str | None' = None, groupname_col: 'str | None' = None, auto_align: 'bool' = True, id: 'str | None' = None, locale: 'str | None' = None)
Create a **Great Tables** object.
The `GT()` class creates the `GT` object when provided with tabular data. Using this class is
the first step in a typical **Great Tables** workflow. Once we have this object, we can
take advantage of numerous methods to get the desired display table for publication.
There are a few table structuring options we can consider at this stage. We can choose to create
a table stub containing row labels through the use of the `rowname_col=` argument. Further to
this, row groups can be created with the `groupname_col=` argument. Both arguments take the name
of a column in the input table data. Typically, the data in the `groupname_col=` column will
consist of categorical text whereas the data in the `rowname_col=` column will often contain
unique labels (perhaps being unique across the entire table or unique only within the different
row groups).
Parameters
----------
data
A DataFrame object, such as a Pandas or Polars DataFrame.
rowname_col
The column name in the input `data=` table to use as row labels to be placed in the table
stub.
groupname_col
The column name in the input `data=` table to use as group labels for generation of row
groups.
auto_align
Optionally have column data be aligned depending on the content contained in each column of
the input `data=`.
id
By default (with `None`) the table ID will be a random, ten-letter string as generated
through internal use of the `random_id()` function. A custom table ID can be used here by
providing a string.
locale
An optional locale identifier that can be set as the default locale for all functions that
take a `locale` argument. Examples include `"en"` for English (United States) and `"fr"`
for French (France).
Returns
-------
GT
A GT object is returned.
Examples
--------
Let's use the `exibble` dataset for the next few examples, where we'll learn how to make simple
output tables with the `GT()` class. The most basic thing to do is to just use `GT()` with the
dataset as the input.
```{python}
from great_tables import GT, exibble
GT(exibble)
```
This dataset has the `row` and `group` columns. The former contains unique values that are ideal
for labeling rows, and this often happens in what is called the 'stub' (a reserved area that
serves to label rows). With the `GT()` class, we can immediately place the contents of the `row`
column into the stub column. To do this, we use the `rowname_col=` argument with the appropriate
column name.
```{python}
from great_tables import GT, exibble
GT(exibble, rowname_col="row")
```
This sets up a table with a stub: the row labels are placed within the stub column, and a
vertical dividing line has been placed on the right-hand side.
The `group` column contains categorical values that are ideal for grouping rows. We can use the
`groupname_col=` argument to place these values into row groups.
```{python}
from great_tables import GT, exibble
GT(exibble, rowname_col="row", groupname_col="group")
```
By default, values in the body of a table (and their column labels) are automatically aligned.
The alignment is governed by the types of values in a column. If you'd like to disable this form
of auto-alignment, the `auto_align=False` option can be taken.
```{python}
from great_tables import GT, exibble
GT(exibble, rowname_col="row", auto_align=False)
```
What you'll get from that is center-alignment of all table body values and all column labels.
Note that row labels in the stub are still left-aligned, since `auto_align=` has no effect on
alignment within the table stub.
Whichever way you generate the initial table object, you can modify it with a huge variety
of methods to further customize the presentation. Formatting body cells is commonly done with
the family of formatting methods (e.g., `fmt_number()`, `fmt_date()`, etc.). The package
supports formatting with internationalization ('i18n' features) and so locale-aware methods
all come with a `locale=` argument. To avoid having to use that argument repeatedly, the `GT()`
class has its own `locale=` argument. Setting a locale in that will make it available globally.
Here's an example of how that works in practice when setting `locale="fr"` in `GT()` prior to
using formatting methods:
```{python}
from great_tables import GT, exibble
(
    GT(exibble, rowname_col="row", locale="fr")
    .fmt_currency(columns="currency")
    .fmt_scientific(columns="num")
    .fmt_date(columns="date", date_style="day_month_year")
)
```
In this example, the `fmt_currency()`, `fmt_scientific()`, and `fmt_date()` methods understand
that the locale for this table is `"fr"` (French), so the appropriate formatting for that locale
is apparent in the `currency`, `num`, and `date` columns.
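The way a table-level default can stand in for a repeated `locale=` argument can be pictured with a small plain-Python sketch. It does not use or reproduce the **Great Tables** internals: `TableSketch` and `resolve_locale()` are hypothetical names, and the rule that an explicit method-level locale takes precedence over the table default is an assumption made for illustration.

```python
class TableSketch:
    """Hypothetical stand-in for a table object that stores a default locale."""

    def __init__(self, locale=None):
        self.locale = locale

    def resolve_locale(self, locale=None):
        # Assumption: an explicit per-method locale wins; otherwise
        # fall back to the default given when the table was created.
        return locale if locale is not None else self.locale


table = TableSketch(locale="fr")

default_choice = table.resolve_locale()       # falls back to the table default
explicit_choice = table.resolve_locale("de")  # explicit value is used as-is
print(default_choice, explicit_choice)
```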
## Major structural table parts
A table can contain a few useful components for conveying additional information. These include a header (with a title and subtitle), a footer (with source notes), and additional areas for labels (row group labels, column spanner labels, the stubhead label). We can perform styling on targeted table locations with the [`tab_style()`](`great_tables.GT.tab_style`) method.
tab_header(self: 'GTSelf', title: 'str | Text', subtitle: 'str | Text | None' = None, preheader: 'str | list[str] | None' = None) -> 'GTSelf'
Add a table header.
We can add a table header to the output table that contains a title and even a subtitle with the
`tab_header()` method. A table header is an optional table component that is positioned above
the column labels. We have the flexibility to use Markdown or HTML formatting for the header's
title and subtitle with the [`md()`](`great_tables.md`) and [`html()`](`great_tables.html`)
helper functions.
Parameters
----------
title
Text to be used in the table title. We can elect to use the [`md()`](`great_tables.md`) and
[`html()`](`great_tables.html`) helper functions to style the text as Markdown or to retain
HTML elements in the text.
subtitle
Text to be used in the table subtitle. We can elect to use the [`md()`](`great_tables.md`)
and [`html()`](`great_tables.html`) helper functions to style the text as Markdown or to
retain HTML elements in the text.
preheader
Optional preheader content that is rendered above the table. Can be supplied as a list
of strings.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use a small portion of the `gtcars` dataset to create a table. A header part can be added
to the table with the `tab_header()` method. We'll add a title and the optional subtitle as
well. With the [`md()`](`great_tables.md`) helper function, we can make sure the Markdown
formatting is interpreted and transformed.
```{python}
from great_tables import GT, md
from great_tables.data import gtcars
gtcars_mini = gtcars[["mfr", "model", "msrp"]].head(5)
(
    GT(gtcars_mini)
    .tab_header(
        title=md("Data listing from **gtcars**"),
        subtitle=md("`gtcars` is an R dataset")
    )
)
```
We can alternatively use the [`html()`](`great_tables.html`) helper function to retain HTML
elements in the text.
```{python}
from great_tables import GT, md, html
from great_tables.data import gtcars
gtcars_mini = gtcars[["mfr", "model", "msrp"]].head(5)
(
    GT(gtcars_mini)
    .tab_header(
        title=md("Data listing <strong>gtcars</strong>"),
        subtitle=html("From <span style='color:red;'>gtcars</span>")
    )
)
```
tab_spanner(self: 'GTSelf', label: 'str | BaseText', columns: 'SelectExpr' = None, spanners: 'str | list[str] | None' = None, level: 'int | None' = None, id: 'str | None' = None, gather: 'bool' = True, replace: 'bool' = False) -> 'GTSelf'
Insert a spanner above a selection of column headings.
The column labels part of a table contains, at a minimum, column labels and, optionally, any
number of levels for spanners. A spanner will occupy space over any number of contiguous column
labels and it will have an associated label and ID value. This method allows for mapping to be
defined by column names, existing spanner ID values, or a mixture of both.
The spanners are placed in the order of calling `tab_spanner()`, so if a later call uses the same
columns in its definition (or even a subset) as an earlier call, the later spanner will be
overlaid atop the earlier one. Options exist for placing a spanner at a specific level (with
`level`, as space permits) and for full or partial replacement of existing spanners (with
`replace`).
Parameters
----------
label
The text to use for the spanner label. We can optionally use the [`md()`](`great_tables.md`)
and [`html()`](`great_tables.html`) helper functions to style the text as Markdown or to
retain HTML elements in the text. Alternatively, units notation can be used (see
[`define_units()`](`great_tables.define_units`) for details).
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
spanners
The spanners that should be spanned over, should they already be defined. One or more
spanner ID values (in quotes) can be supplied here. This argument works in tandem with the
`columns` argument.
level
An explicit level to which the spanner should be placed. If not provided, **Great Tables**
will choose the level based on the inputs provided within `columns` and `spanners`, placing
the spanner label where it will fit. The first spanner level (right above the column labels)
is `0`.
id
The ID for the spanner. When accessing a spanner through the `spanners` argument of
`tab_spanner()` the `id` value is used as the reference (and not the `label`). If an `id`
is not explicitly provided here, it will be taken from the `label` value. It is advisable to
set an explicit `id` value if you plan to reference this spanner in a later call and the label
text is complicated (e.g., contains markup, is lengthy, or both). Finally, when providing
an `id` value you must ensure that it is unique across all ID values set for spanner labels
(the method will throw an error if `id` isn't unique).
gather
An option to move the specified `columns` such that they are unified under the spanner.
Ordering of the moved-into-place columns will be preserved in all cases. By default, this
is set to `True`.
replace
Should new spanners be allowed to partially or fully replace existing spanners? (This is a
possibility if setting spanners at an already populated `level`.) By default, this is set to
`False` and an error will occur if some replacement is attempted.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's create a table using a small portion of the `gtcars` dataset. Over several columns (`hp`,
`hp_rpm`, `trq`, `trq_rpm`, `mpg_c`, `mpg_h`) we'll use `tab_spanner()` to add a spanner with
the label `"performance"`. This effectively groups together several columns related to car
performance under a unifying label.
```{python}
from great_tables import GT, md
from great_tables.data import gtcars
colnames = ["model", "hp", "hp_rpm", "trq", "trq_rpm", "mpg_c", "mpg_h"]
gtcars_mini = gtcars[colnames].head(10)
(
    GT(gtcars_mini)
    .tab_spanner(
        label="performance",
        columns=["hp", "hp_rpm", "trq", "trq_rpm", "mpg_c", "mpg_h"]
    )
)
```
One cool feature of `tab_spanner()` is its support for multiple levels, allowing you to group
columns in various ways. For example, you can create three bottom spanners and a top spanner:
```{python}
(
    GT(gtcars_mini)
    .tab_spanner(
        label="hp",
        columns=["hp", "hp_rpm"],
    )
    .tab_spanner(
        label="trq",
        columns=["trq", "trq_rpm"],
    )
    .tab_spanner(
        label="mpg",
        columns=["mpg_c", "mpg_h"],
    )
    .tab_spanner(
        label="performance",
        columns=["hp", "hp_rpm", "trq", "trq_rpm", "mpg_c", "mpg_h"],
    )
)
```
Did you notice that the spanners stacked automatically? What if you want granular control to
specify a spanner in a specific hierarchy? **Great Tables** has you covered. By using the `level=`
parameter, you can easily adjust the hierarchy of spanners. For example, by specifying `level=0`
for the last call of `tab_spanner()`, you can place that spanner at the bottom level (level `0`)
instead of the top level (level `2`).
```{python}
(
    GT(gtcars_mini)
    .tab_spanner(
        label="hp",
        columns=["hp", "hp_rpm"],
    )
    .tab_spanner(
        label="performance",
        columns=["hp", "hp_rpm", "trq", "trq_rpm"],
    )
    .tab_spanner(
        label="trq",
        columns=["trq", "trq_rpm"],
        level=0,
    )
)
```
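To make the default stacking concrete, here is a plain-Python sketch of one rule that is consistent with the examples above: a new spanner lands one level above the highest existing spanner it shares a column with. This is an illustration only and an assumption about the behavior, not the **Great Tables** implementation; `next_spanner_level()` is a hypothetical helper.

```python
def next_spanner_level(existing, columns):
    """Return the level a new spanner over `columns` would take by default.

    `existing` is a list of (level, columns) pairs for spanners already added.
    Assumed rule: go one level above the highest spanner sharing a column.
    """
    overlapping = [level for level, cols in existing if set(cols) & set(columns)]
    return max(overlapping, default=-1) + 1


# Mirror the example above: "hp" first, then "performance" over four columns.
spanners = []
for cols in (["hp", "hp_rpm"], ["hp", "hp_rpm", "trq", "trq_rpm"]):
    spanners.append((next_spanner_level(spanners, cols), cols))

# Without `level=0`, a "trq" spanner would stack above "performance".
trq_default_level = next_spanner_level(spanners, ["trq", "trq_rpm"])
print([level for level, _ in spanners], trq_default_level)
```

Under this rule the `"trq"` spanner would default to level `2`, which is why the example passes `level=0` to place it directly above the column labels.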
We can also use Markdown formatting for the spanner label. In this example, we'll use
`md("*Performance*")` to make the label italicized.
```{python}
(
    GT(gtcars_mini)
    .tab_spanner(
        label=md("*Performance*"),
        columns=["hp", "hp_rpm", "trq", "trq_rpm", "mpg_c", "mpg_h"]
    )
)
```
tab_spanner_delim(self: 'GTSelf', delim: 'str' = '.', columns: 'SelectExpr' = None, split: "Literal['first', 'last']" = 'last', limit: 'int' = -1, reverse: 'bool' = False) -> 'GTSelf'
Insert spanners by splitting column names with a delimiter.
This generates one or more spanners (and sets column labels) by splitting each column name at
the specified delimiter text (`delim`) and placing the fragments from top to bottom (i.e.,
higher-level spanners to the column labels) or vice versa.
For example, with `delim="_"`, the three side-by-side column names `rating_1`, `rating_2`, and
`rating_3` will produce a spanner labeled `"rating"` above columns labeled `"1"`, `"2"`, and `"3"`.
Parameters
----------
delim
The delimiter text used for splitting. Defaults to `"."`.
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
split
Should the delimiter splitting occur from the `"last"` instance of the `delim` character or
from the `"first"`? The default here uses the `"last"` keyword, and splitting begins at the
last instance of the delimiter in the column name. This option only has some consequence
when there is a `limit` value applied that is less than the number of delimiter characters
for a given column name (i.e., number of splits is not the maximum possible number).
limit
Limit for splitting. An optional limit to place on the splitting procedure. The default `-1`
means that a column name will be split as many times as there are delimiter characters.
In other words, the default means there is no limit. If an integer value is given to `limit`
then splitting will cease at the iteration given by `limit`. This works in tandem with `split`
since we can adjust the number of splits from either the right side (`split="last"`) or
left side (`split="first"`) of the column name.
reverse
Should the order of split names be reversed? By default, this is `False`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's create a table that includes the column names `province.NL_ZH.pop`, `province.NL_ZH.gdp`,
`province.NL_NH.pop`, and `province.NL_NH.gdp`. We can see that we have a naming system that has
a well-defined structure. We start with the more general to the left (`"province"`) and move to
the more specific on the right (`"pop"`). If the columns are in the table in this exact order,
then things are in an ideal state as the eventual spanner labels will form from neighboring columns.
When using `tab_spanner_delim()` here with `delim` set as `"."` we get the following table:
```{python}
import polars as pl
import polars.selectors as cs
from great_tables import GT
data = {
    "province.NL_ZH.pop": [1, 2, 3],
    "province.NL_ZH.gdp": [4, 5, 6],
    "province.NL_NH.pop": [7, 8, 9],
    "province.NL_NH.gdp": [10, 11, 12],
}
gt = GT(pl.DataFrame(data))
gt.tab_spanner_delim()
```
```{python}
gt.tab_spanner_delim(limit=1)
```
```{python}
# The name "province" repeats in the rendered table because reversing
# the fragment order makes it the column label for every column.
gt.tab_spanner_delim(reverse=True)
```
```{python}
from great_tables.data import towny
lil_towny = (
    pl.DataFrame(towny)
    .select("name", cs.starts_with("population"))
    .head()
)
GT(lil_towny).tab_spanner_delim(delim="_")
```
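The interplay of `delim`, `split`, `limit`, and `reverse` can be explored on plain strings before building a table. The sketch below models the documented semantics with Python's `str.split()` and `str.rsplit()`; `split_column_name()` is a hypothetical helper written for illustration, not the function **Great Tables** uses internally.

```python
def split_column_name(name, delim=".", split="last", limit=-1, reverse=False):
    """Split a column name into fragments, ordered from top spanner to column label."""
    if split == "last":
        # Count splits from the right-hand side of the name.
        parts = name.rsplit(delim, limit)
    elif split == "first":
        # Count splits from the left-hand side of the name.
        parts = name.split(delim, limit)
    else:
        raise ValueError('split must be "first" or "last"')
    # Reversing flips which fragment ends up as the column label.
    return parts[::-1] if reverse else parts


name = "province.NL_ZH.pop"
print(split_column_name(name))                            # no limit: every delimiter splits
print(split_column_name(name, limit=1))                   # one split, counted from the right
print(split_column_name(name, split="first", limit=1))    # one split, counted from the left
print(split_column_name(name, reverse=True))              # fragment order flipped
```

With no limit, `split` makes no difference because every delimiter is used; it only matters once `limit` is smaller than the number of delimiters in the name.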
tab_stub(self: 'GTSelf', rowname_col: 'str | None' = None, groupname_col: 'str | None' = None) -> 'GTSelf'
Add a table stub, to emphasize row and group information.
Parameters
----------
rowname_col
The column to use for row names. By default, no row names are added.
groupname_col
The column to use for group names. By default, no group names are added.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
By default, all data is together in the body of the table.
```{python}
from great_tables import GT, exibble
GT(exibble)
```
The table stub separates row names with a vertical line, and puts group names
on their own line.
```{python}
GT(exibble).tab_stub(rowname_col="row", groupname_col="group")
```
tab_stubhead(self: 'GTSelf', label: 'str | Text') -> 'GTSelf'
Add label text to the stubhead.
Add a label to the stubhead of a table. The stubhead is the lone element that is positioned
left of the column labels, and above the stub. If a stub does not exist, then there is no
stubhead (so no change will be made when using this method in that case). We have the
flexibility to use Markdown formatting for the stubhead label (through use of the
[`md()`](`great_tables.md`) helper function). Furthermore, we can use HTML for the stubhead
label so long as we also use the [`html()`](`great_tables.html`) helper function.
Parameters
----------
label
The text to be used as the stubhead label. We can optionally use the
[`md()`](`great_tables.md`) and [`html()`](`great_tables.html`) helper functions to style
the text as Markdown or to retain HTML elements in the text.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using a small subset of the `gtcars` dataset, we can create a table with row labels. Since
we have row labels in the stub (via use of `rowname_col="model"` in the `GT()` call) we have
a stubhead, so, let's add a stubhead label (`"car"`) with the `tab_stubhead()` method to
describe what's in the stub.
```{python}
from great_tables import GT
from great_tables.data import gtcars
gtcars_mini = gtcars[["model", "year", "hp", "trq"]].head(5)
(
    GT(gtcars_mini, rowname_col="model")
    .tab_stubhead(label="car")
)
```
We can also use Markdown formatting for the stubhead label. In this example, we'll use
`md("*Car*")` to make the label italicized.
```{python}
from great_tables import GT, md
from great_tables.data import gtcars
(
    GT(gtcars_mini, rowname_col="model")
    .tab_stubhead(label=md("*Car*"))
)
```
tab_footnote(self: 'GTSelf', footnote: 'str | Text', locations: 'Loc | None | list[Loc | None]' = None, placement: 'PlacementOptions' = 'auto') -> 'GTSelf'
Add a table footnote.
`tab_footnote()` can make it a painless process to add a footnote to a table. There are commonly
two components to a footnote: (1) a footnote mark that is attached to the targeted cell content,
and (2) the footnote text itself that is placed in the table's footer area. Each unit of
footnote text in the footer is linked to an element of text or otherwise through the footnote
mark.
The footnote system in **Great Tables** presents footnotes in a way that matches the usual
expectations, where:
1. footnote marks have a sequence, whether they are symbols, numbers, or letters
2. multiple footnotes can be applied to the same content (and marks are always presented in an
ordered fashion)
3. footnote text in the footer is never exactly repeated; **Great Tables** reuses footnote marks
where needed throughout the table
4. footnote marks are ordered across the table in a consistent manner (left to right, top to
bottom)
Each call of `tab_footnote()` will either add a different footnote to the footer or reuse
existing footnote text therein. One or more cells outside of the footer are targeted using
location classes from the `loc` module (e.g., `loc.body()`, `loc.column_labels()`, etc.). You
can choose to *not* attach a footnote mark by simply not specifying anything in the `locations`
argument.
By default, **Great Tables** will choose which side of the text to place the footnote mark via
the `placement="auto"` option. You are, however, always free to choose the placement of the
footnote mark (either to the `"left"` or `"right"` of the targeted cell content).
Parameters
----------
footnote
The text to be used in the footnote. We can optionally use [`md()`](`great_tables.md`) or
[`html()`](`great_tables.html`) to style the text as Markdown or to retain HTML elements in
the footnote text.
locations
The cell or set of cells to be associated with the footnote. Supplying any of the location
classes from the `loc` module is a useful way to target the location cells that are
associated with the footnote text. These location classes are: `loc.title`, `loc.stubhead`,
`loc.spanner_labels`, `loc.column_labels`, `loc.row_groups`, `loc.stub`, `loc.body`, etc.
Additionally, we can enclose several location calls within a list if we wish to link the
footnote text to different types of locations (e.g., body cells, row group labels, the table
title, etc.).
placement
Where to affix footnote marks to the table content. Two options for this are `"left"` or
`"right"`, where the placement is either to the absolute left or right of the cell content.
By default, however, this option is set to `"auto"` whereby **Great Tables** will choose a
preferred left-or-right placement depending on the alignment of the cell content.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
This example table will be based on the `towny` dataset. We have a header part, with a title and
a subtitle. We can choose which of these could be associated with a footnote and in this case it
is the `"subtitle"`. This table has a stub with row labels and some of those labels are
associated with a footnote. So long as row labels are unique, they can be easily used as row
identifiers in `loc.stub()`. Another footnote is placed on the `"Density"` column label. Here,
changing the order of the `tab_footnote()` calls has no effect on the final table rendering.
```{python}
import polars as pl
from great_tables import GT, loc, md
from great_tables.data import towny
towny_mini = (
    pl.from_pandas(towny)
    .filter(pl.col("csd_type") == "city")
    .select(["name", "density_2021", "population_2021"])
    .top_k(10, by="population_2021")
    .sort("population_2021", descending=True)
)
(
    GT(towny_mini, rowname_col="name")
    .tab_header(
        title=md("The 10 Largest Municipalities in `towny`"),
        subtitle="Population values taken from the 2021 census."
    )
    .fmt_integer()
    .cols_label(
        density_2021="Density",
        population_2021="Population"
    )
    .tab_footnote(
        footnote="Part of the Greater Toronto Area.",
        locations=loc.stub(rows=[
            "Toronto", "Mississauga", "Brampton", "Markham", "Vaughan"
        ])
    )
    .tab_footnote(
        footnote=md("Density is in terms of persons per {{km^2}}."),
        locations=loc.column_labels(columns="density_2021")
    )
    .tab_footnote(
        footnote="Census results made public on February 9, 2022.",
        locations=loc.subtitle()
    )
    .tab_source_note(
        source_note=md("Data taken from the `towny` dataset.")
    )
    .opt_footnote_marks(marks="letters")
)
```
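The four footnote rules listed in the description (sequenced marks, ordered marks per cell, reuse of identical footnote text, and assignment in reading order) can be sketched in plain Python. This is a simplified model for illustration, not the **Great Tables** implementation: `assign_footnote_marks()` and the location strings are hypothetical, and the targets are assumed to arrive already sorted in reading order.

```python
from string import ascii_lowercase


def assign_footnote_marks(targets, marks=ascii_lowercase):
    """Assign marks to (location, footnote_text) pairs given in reading order.

    Note: this sketch supports at most len(marks) distinct footnote texts.
    """
    index_by_text = {}        # each distinct footnote text gets one mark index
    indices_by_location = {}  # mark indices attached to each targeted location
    for location, text in targets:
        # Reuse the existing index when the same text was seen before.
        index = index_by_text.setdefault(text, len(index_by_text))
        indices_by_location.setdefault(location, set()).add(index)
    # Marks attached to one location are always presented in order.
    marks_by_location = {
        location: [marks[i] for i in sorted(indices)]
        for location, indices in indices_by_location.items()
    }
    footer = [(marks[i], text) for text, i in index_by_text.items()]
    return marks_by_location, footer


targets = [
    ("subtitle", "Census results made public on February 9, 2022."),
    ("column_label:density_2021", "Density is in terms of persons per km^2."),
    ("stub:Toronto", "Part of the Greater Toronto Area."),
    ("stub:Mississauga", "Part of the Greater Toronto Area."),
]
marks_by_location, footer = assign_footnote_marks(targets)
for mark, text in footer:
    print(mark, text)
```

Both stub entries share a single mark because their footnote text is identical, so the footer in this sketch holds three entries rather than four.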
tab_source_note(self: 'GTSelf', source_note: 'str | Text') -> 'GTSelf'
Add a source note citation.
Add a source note to the footer part of the table. A source note is useful for citing the data
included in the table. Several can be added to the footer; simply use the `tab_source_note()`
method multiple times and they will be inserted in the order provided. We can use Markdown
formatting for the note, or, if the table is intended for HTML output, we can include HTML
formatting.
Parameters
----------
source_note
Text to be used in the source note. We can optionally use the [`md()`](`great_tables.md`) or
[`html()`](`great_tables.html`) helper functions to style the text as Markdown or to retain
HTML elements in the text.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
With three columns from the `gtcars` dataset, let's create a new table. We can use the
`tab_source_note()` method to add a source note to the table footer. Here we are citing the
data source but this method can be used for any text you'd prefer to display in the footer
component of the table.
```{python}
from great_tables import GT
from great_tables.data import gtcars
gtcars_mini = gtcars[["mfr", "model", "msrp"]].head(5)
(
    GT(gtcars_mini, rowname_col="model")
    .tab_source_note(source_note="From edmunds.com")
)
```
tab_style(self: 'GTSelf', style: 'CellStyle | list[CellStyle]', locations: 'Loc | list[Loc]') -> 'GTSelf'
Add custom style to one or more cells.
With the `tab_style()` method we can target specific cells and apply styles to them. We do this
with the combination of the `style` and `locations` arguments. The `style` argument requires use
of styling classes (e.g., `style.fill(color="red")`) and the `locations` argument needs to be an
expression of the cells we want to target using location targeting classes (e.g.,
`loc.body(columns=<column_name>)`). With the available suite of styling classes, here are some
of the styles we can apply:
- the background color of the cell (`style.fill()`'s `color`)
- the cell's text color, font, and size (`style.text()`'s `color`, `font`, and `size`)
- the text style (`style.text()`'s `style`), enabling the use of italics or oblique text
- the text weight (`style.text()`'s `weight`), allowing the use of thin to bold text (the degree
of choice is greater with variable fonts)
- the alignment of text (`style.text()`'s `align`)
- cell borders with the `style.borders()` class
Parameters
----------
style
The styles to use for the cells at the targeted `locations`. The `style.text()`,
`style.fill()`, and `style.borders()` classes can be used here to more easily generate valid
styles.
locations
The cell or set of cells to be associated with the style. The `loc.body()` class can be used
here to easily target body cell locations.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use a small subset of the `exibble` dataset to demonstrate how to use `tab_style()` to
target specific cells and apply styles to them. We'll start by creating the `exibble_sm` table
(a subset of the `exibble` table) and then use `tab_style()` to apply a light cyan background
color to the cells in the `num` column for the first two rows of the table. We'll then apply a
larger font size to the cells in the `fctr` column for the last four rows of the table.
```{python}
from great_tables import GT, style, loc, exibble
exibble_sm = exibble[["num", "fctr", "row", "group"]]
(
    GT(exibble_sm, rowname_col="row", groupname_col="group")
    .tab_style(
        style=style.fill(color="lightcyan"),
        locations=loc.body(columns="num", rows=["row_1", "row_2"]),
    )
    .tab_style(
        style=style.text(size="22px"),
        locations=loc.body(columns=["fctr"], rows=[4, 5, 6, 7]),
    )
)
```
Let's use `exibble` once again to create a simple, two-column output table (keeping only the
`num` and `currency` columns). With the `tab_style()` method (called thrice), we'll add style to
the values already formatted by `fmt_number()` and `fmt_currency()`. In the `style` argument of
the first two `tab_style()` calls, we can define multiple types of styling with the
`style.fill()` and `style.text()` classes (enclosing these in a list). The cells to be targeted
for styling require the use of `loc.body()`, which is used here with different columns being
targeted. For the final `tab_style()` call, we demonstrate the use of the `style.borders()` class
as the `style` argument, which is employed in conjunction with `loc.body()` to locate the row to
be styled.
```{python}
from great_tables import GT, style, loc, exibble
(
    GT(exibble[["num", "currency"]])
    .fmt_number(columns="num", decimals=1)
    .fmt_currency(columns="currency")
    .tab_style(
        style=[
            style.fill(color="lightcyan"),
            style.text(weight="bold")
        ],
        locations=loc.body(columns="num")
    )
    .tab_style(
        style=[
            style.fill(color="#F9E3D6"),
            style.text(style="italic")
        ],
        locations=loc.body(columns="currency")
    )
    .tab_style(
        style=style.borders(sides=["top", "bottom"], weight="2px", color="red"),
        locations=loc.body(rows=[4])
    )
)
```
tab_options(self: 'GTSelf', container_width: 'str | None' = None, container_height: 'str | None' = None, container_padding_x: 'str | None' = None, container_padding_y: 'str | None' = None, container_overflow_x: 'str | None' = None, container_overflow_y: 'str | None' = None, table_width: 'str | None' = None, table_layout: 'str | None' = None, table_margin_left: 'str | None' = None, table_margin_right: 'str | None' = None, table_background_color: 'str | None' = None, table_additional_css: 'str | list[str] | None' = None, table_font_names: 'str | list[str] | None' = None, table_font_size: 'str | None' = None, table_font_weight: 'str | int | float | None' = None, table_font_style: 'str | None' = None, table_font_color: 'str | None' = None, table_font_color_light: 'str | None' = None, table_border_top_style: 'str | None' = None, table_border_top_width: 'str | None' = None, table_border_top_color: 'str | None' = None, table_border_bottom_style: 'str | None' = None, table_border_bottom_width: 'str | None' = None, table_border_bottom_color: 'str | None' = None, table_border_left_style: 'str | None' = None, table_border_left_width: 'str | None' = None, table_border_left_color: 'str | None' = None, table_border_right_style: 'str | None' = None, table_border_right_width: 'str | None' = None, table_border_right_color: 'str | None' = None, heading_background_color: 'str | None' = None, heading_align: 'str | None' = None, heading_title_font_size: 'str | None' = None, heading_title_font_weight: 'str | int | float | None' = None, heading_subtitle_font_size: 'str | None' = None, heading_subtitle_font_weight: 'str | int | float | None' = None, heading_padding: 'str | None' = None, heading_padding_horizontal: 'str | None' = None, heading_border_bottom_style: 'str | None' = None, heading_border_bottom_width: 'str | None' = None, heading_border_bottom_color: 'str | None' = None, heading_border_lr_style: 'str | None' = None, heading_border_lr_width: 'str | None' = None, 
heading_border_lr_color: 'str | None' = None, column_labels_background_color: 'str | None' = None, column_labels_font_size: 'str | None' = None, column_labels_font_weight: 'str | int | float | None' = None, column_labels_text_transform: 'str | None' = None, column_labels_padding: 'str | None' = None, column_labels_padding_horizontal: 'str | None' = None, column_labels_vlines_style: 'str | None' = None, column_labels_vlines_width: 'str | None' = None, column_labels_vlines_color: 'str | None' = None, column_labels_border_top_style: 'str | None' = None, column_labels_border_top_width: 'str | None' = None, column_labels_border_top_color: 'str | None' = None, column_labels_border_bottom_style: 'str | None' = None, column_labels_border_bottom_width: 'str | None' = None, column_labels_border_bottom_color: 'str | None' = None, column_labels_border_lr_style: 'str | None' = None, column_labels_border_lr_width: 'str | None' = None, column_labels_border_lr_color: 'str | None' = None, column_labels_hidden: 'bool | None' = None, row_group_background_color: 'str | None' = None, row_group_font_size: 'str | None' = None, row_group_font_weight: 'str | int | float | None' = None, row_group_text_transform: 'str | None' = None, row_group_padding: 'str | None' = None, row_group_padding_horizontal: 'str | None' = None, row_group_border_top_style: 'str | None' = None, row_group_border_top_width: 'str | None' = None, row_group_border_top_color: 'str | None' = None, row_group_border_bottom_style: 'str | None' = None, row_group_border_bottom_width: 'str | None' = None, row_group_border_bottom_color: 'str | None' = None, row_group_border_left_style: 'str | None' = None, row_group_border_left_width: 'str | None' = None, row_group_border_left_color: 'str | None' = None, row_group_border_right_style: 'str | None' = None, row_group_border_right_width: 'str | None' = None, row_group_border_right_color: 'str | None' = None, row_group_as_column: 'bool | None' = None, table_body_hlines_style: 'str | 
None' = None, table_body_hlines_width: 'str | None' = None, table_body_hlines_color: 'str | None' = None, table_body_vlines_style: 'str | None' = None, table_body_vlines_width: 'str | None' = None, table_body_vlines_color: 'str | None' = None, table_body_border_top_style: 'str | None' = None, table_body_border_top_width: 'str | None' = None, table_body_border_top_color: 'str | None' = None, table_body_border_bottom_style: 'str | None' = None, table_body_border_bottom_width: 'str | None' = None, table_body_border_bottom_color: 'str | None' = None, stub_background_color: 'str | None' = None, stub_font_size: 'str | None' = None, stub_font_weight: 'str | int | float | None' = None, stub_text_transform: 'str | None' = None, stub_border_style: 'str | None' = None, stub_border_width: 'str | None' = None, stub_border_color: 'str | None' = None, stub_row_group_font_size: 'str | None' = None, stub_row_group_font_weight: 'str | int | float | None' = None, stub_row_group_text_transform: 'str | None' = None, stub_row_group_border_style: 'str | None' = None, stub_row_group_border_width: 'str | None' = None, stub_row_group_border_color: 'str | None' = None, data_row_padding: 'str | None' = None, data_row_padding_horizontal: 'str | None' = None, summary_row_background_color: 'str | None' = None, summary_row_text_transform: 'str | None' = None, summary_row_padding: 'str | None' = None, summary_row_padding_horizontal: 'str | None' = None, summary_row_border_style: 'str | None' = None, summary_row_border_width: 'str | None' = None, summary_row_border_color: 'str | None' = None, grand_summary_row_background_color: 'str | None' = None, grand_summary_row_text_transform: 'str | None' = None, grand_summary_row_padding: 'str | None' = None, grand_summary_row_padding_horizontal: 'str | None' = None, grand_summary_row_border_style: 'str | None' = None, grand_summary_row_border_width: 'str | None' = None, grand_summary_row_border_color: 'str | None' = None, footnotes_marks: 'str | list[str] | 
None' = None, source_notes_background_color: 'str | None' = None, source_notes_font_size: 'str | None' = None, source_notes_padding: 'str | None' = None, source_notes_padding_horizontal: 'str | None' = None, source_notes_border_bottom_style: 'str | None' = None, source_notes_border_bottom_width: 'str | None' = None, source_notes_border_bottom_color: 'str | None' = None, source_notes_border_lr_style: 'str | None' = None, source_notes_border_lr_width: 'str | None' = None, source_notes_border_lr_color: 'str | None' = None, source_notes_multiline: 'bool | None' = None, source_notes_sep: 'str | None' = None, row_striping_background_color: 'str | None' = None, row_striping_include_stub: 'bool | None' = None, row_striping_include_table_body: 'bool | None' = None, quarto_disable_processing: 'bool | None' = None) -> 'GTSelf'
Modify the table output options.
Modify the options available in a table. These options are named by the components, the
subcomponents, and the element that can be adjusted.
Parameters
----------
container_width
The width of the table's container. Can be specified as a string
with units of pixels or as a percentage. If provided as a numeric
value, it is assumed that the value is given in units of pixels.
container_height
The height of the table's container.
container_padding_x
The horizontal padding of the table's container. Can be specified as a string
with units of pixels or as a percentage. If provided as a numeric
value, it is assumed that the value is given in units of pixels.
container_padding_y
The vertical padding of the table's container. Same rules apply as for
`container_padding_x`.
container_overflow_x
An option to enable scrolling in the horizontal direction when the table content overflows
the container dimensions. Using `True` (the default) means that horizontal scrolling is
enabled to view the entire table in that direction. With `False`, the table may be clipped
if the table width exceeds the `container_width`.
container_overflow_y
An option to enable scrolling in the vertical direction when the table content overflows.
Same rules apply as for `container_overflow_x`; the dependency here is that of the table
height (`container_height`).
table_width
The width of the table. Can be specified as a string with units of pixels or as a
percentage. If provided as a numeric value, it is assumed that the value is given in
units of pixels.
table_layout
The value for the `table-layout` CSS style in the HTML output context. By default, this
is `"fixed"` but another valid option is `"auto"`.
table_margin_left
The size of the margins on the left of the table within the container. Can be
specified as a string with units of pixels or as a percentage. If
provided as a numeric value, it is assumed that the value is given in units of pixels.
Using `table_margin_left` will overwrite any values set by `table_align`.
table_margin_right
The size of the margins on the right of the table within the container. Same rules apply
as for `table_margin_left`. Using `table_margin_right` will overwrite any values set by
`table_align`.
table_background_color
The background color for the table. A color name or a hexadecimal color code should be
provided.
table_additional_css
Additional CSS that can be added to the table. This can be used to add any custom CSS
that is not covered by the other options.
table_font_names
The names of the fonts used for the table. This should be provided as a list of font
names. If the first font isn't available, then the next font is tried (and so on).
table_font_size
The font size for the table. Can be specified as a string with units of pixels or as a
percentage. If provided as a numeric value, it is assumed that the value is given in
units of pixels.
table_font_weight
The font weight of the table. Can be a text-based keyword such as `"normal"`, `"bold"`,
`"lighter"`, `"bolder"`, or, a numeric value between `1` and `1000`, inclusive. Note that
only variable fonts may support the numeric mapping of weight.
table_font_style
The font style for the table. Can be one of either `"normal"`, `"italic"`, or `"oblique"`.
table_font_color
The text color used throughout the table. A color name or a hexadecimal color code should be
provided.
table_font_color_light
The text color used throughout the table when the background color is dark. A color name or
a hexadecimal color code should be provided.
table_border_top_style
The style of the table's absolute top border. Can be one of either `"solid"`, `"dotted"`,
`"dashed"`, `"double"`, `"groove"`, `"ridge"`, `"inset"`, or `"outset"`.
table_border_top_width
The width of the table's absolute top border. Can be specified as a string with units of
pixels or as a percentage. If provided as a numeric value, it is assumed that the value is
given in units of pixels.
table_border_top_color
The color of the table's absolute top border. A color name or a hexadecimal color code
should be provided.
table_border_bottom_style
The style of the table's absolute bottom border.
table_border_bottom_width
The width of the table's absolute bottom border.
table_border_bottom_color
The color of the table's absolute bottom border.
table_border_left_style
The style of the table's absolute left border.
table_border_left_width
The width of the table's absolute left border.
table_border_left_color
The color of the table's absolute left border.
table_border_right_style
The style of the table's absolute right border.
table_border_right_width
The width of the table's absolute right border.
table_border_right_color
The color of the table's absolute right border.
heading_background_color
The background color for the heading. A color name or a hexadecimal color code should be
provided.
heading_align
Controls the horizontal alignment of the heading title and subtitle. We can either use
`"center"`, `"left"`, or `"right"`.
heading_title_font_size
The font size for the heading title element.
heading_title_font_weight
The font weight of the heading title.
heading_subtitle_font_size
The font size for the heading subtitle element.
heading_subtitle_font_weight
The font weight of the heading subtitle.
heading_padding
The amount of vertical padding to incorporate in the `heading` (title and subtitle). Can be
specified as a string with units of pixels or as a percentage. If provided as a numeric
value, it is assumed that the value is given in units of pixels.
heading_padding_horizontal
The amount of horizontal padding to incorporate in the `heading` (title and subtitle). Can
be specified as a string with units of pixels or as a percentage. If provided as a numeric
value, it is assumed that the value is given in units of pixels.
heading_border_bottom_style
The style of the header's bottom border.
heading_border_bottom_width
The width of the header's bottom border. If the `width` of this border is larger, then it
will be the visible border.
heading_border_bottom_color
The color of the header's bottom border.
heading_border_lr_style
The style of the left and right borders of the `heading` location.
heading_border_lr_width
The width of the left and right borders of the `heading` location. If the `width` of this
border is larger, then it will be the visible border.
heading_border_lr_color
The color of the left and right borders of the `heading` location.
column_labels_background_color
The background color for the column labels. A color name or a hexadecimal color code should
be provided.
column_labels_font_size
The font size to use for all column labels.
column_labels_font_weight
The font weight of the table's column labels.
column_labels_text_transform
The text transformation for the column labels. Either of the `"uppercase"`, `"lowercase"`,
or `"capitalize"` keywords can be used.
column_labels_padding
The amount of vertical padding to incorporate in the `column_labels` (this includes the
column spanners).
column_labels_padding_horizontal
The amount of horizontal padding to incorporate in the `column_labels` (this includes the
column spanners).
column_labels_vlines_style
The style of all vertical lines ('vlines') of the `column_labels`.
column_labels_vlines_width
The width of all vertical lines ('vlines') of the `column_labels`.
column_labels_vlines_color
The color of all vertical lines ('vlines') of the `column_labels`.
column_labels_border_top_style
The style of the top border of the `column_labels` location.
column_labels_border_top_width
The width of the top border of the `column_labels` location. If the `width` of this border
is larger, then it will be the visible border.
column_labels_border_top_color
The color of the top border of the `column_labels` location.
column_labels_border_bottom_style
The style of the bottom border of the `column_labels` location.
column_labels_border_bottom_width
The width of the bottom border of the `column_labels` location. If the `width` of this
border is larger, then it will be the visible border.
column_labels_border_bottom_color
The color of the bottom border of the `column_labels` location.
column_labels_border_lr_style
The style of the left and right borders of the `column_labels` location.
column_labels_border_lr_width
The width of the left and right borders of the `column_labels` location. If the `width` of
this border is larger, then it will be the visible border.
column_labels_border_lr_color
The color of the left and right borders of the `column_labels` location.
column_labels_hidden
An option to hide the column labels. If providing `True` then the entire `column_labels`
location won't be seen and the table header (if present) will collapse downward.
row_group_background_color
The background color for the row group labels. A color name or a hexadecimal color code
should be provided.
row_group_font_weight
The font weight for all row group labels present in the table.
row_group_font_size
The font size to use for all row group labels.
row_group_padding
The amount of vertical padding to incorporate in the row group labels.
row_group_border_top_style
The style of the top border of the `row_group` location.
row_group_border_top_width
The width of the top border of the `row_group` location. If the `width` of this border is
larger, then it will be the visible border.
row_group_border_top_color
The color of the top border of the `row_group` location.
row_group_border_bottom_style
The style of the bottom border of the `row_group` location.
row_group_border_bottom_width
The width of the bottom border of the `row_group` location. If the `width` of this border
is larger, then it will be the visible border.
row_group_border_bottom_color
The color of the bottom border of the `row_group` location.
row_group_border_left_style
The style of the left border of the `row_group` location.
row_group_border_left_width
The width of the left border of the `row_group` location. If the `width` of this border is
larger, then it will be the visible border.
row_group_border_left_color
The color of the left border of the `row_group` location.
row_group_border_right_style
The style of the right border of the `row_group` location.
row_group_border_right_width
The width of the right border of the `row_group` location. If the `width` of this border is
larger, then it will be the visible border.
row_group_border_right_color
The color of the right border of the `row_group` location.
row_group_as_column
An option to render the row group labels as a column. If `True`, then the row group labels
will be rendered as a column to the left of the table body. If `False`, then the row group
labels will be rendered as a separate row above the grouping of rows.
table_body_hlines_style
The style of all horizontal lines ('hlines') in the `table_body`.
table_body_hlines_width
The width of all horizontal lines ('hlines') in the `table_body`.
table_body_hlines_color
The color of all horizontal lines ('hlines') in the `table_body`.
table_body_vlines_style
The style of all vertical lines ('vlines') in the `table_body`.
table_body_vlines_width
The width of all vertical lines ('vlines') in the `table_body`.
table_body_vlines_color
The color of all vertical lines ('vlines') in the `table_body`.
table_body_border_top_style
The style of the top border of the `table_body` location.
table_body_border_top_width
The width of the top border of the `table_body` location. If the `width` of this border is
larger, then it will be the visible border.
table_body_border_top_color
The color of the top border of the `table_body` location.
table_body_border_bottom_style
The style of the bottom border of the `table_body` location.
table_body_border_bottom_width
The width of the bottom border of the `table_body` location. If the `width` of this border
is larger, then it will be the visible border.
table_body_border_bottom_color
The color of the bottom border of the `table_body` location.
stub_background_color
The background color for the stub. A color name or a hexadecimal color code should be
provided.
stub_font_size
The font size to use for all row labels present in the table stub.
stub_font_weight
The font weight for all row labels present in the table stub.
stub_text_transform
The text transformation for the row labels present in the table stub.
stub_border_style
The style of the vertical border of the table stub.
stub_border_width
The width of the vertical border of the table stub.
stub_border_color
The color of the vertical border of the table stub.
stub_row_group_font_size
The font size for the row group column in the stub.
stub_row_group_font_weight
The font weight for the row group column in the stub.
stub_row_group_text_transform
The text transformation for the row group column in the stub.
stub_row_group_border_style
The style of the vertical border of the row group column in the stub.
stub_row_group_border_width
The width of the vertical border of the row group column in the stub.
stub_row_group_border_color
The color of the vertical border of the row group column in the stub.
data_row_padding
The amount of vertical padding to incorporate in the body/stub rows.
data_row_padding_horizontal
The amount of horizontal padding to incorporate in the body/stub rows.
source_notes_background_color
The background color for the source notes. A color name or a hexadecimal color code should
be provided.
source_notes_font_size
The font size to use for all source note text.
source_notes_padding
The amount of vertical padding to incorporate in the source notes.
source_notes_padding_horizontal
The amount of horizontal padding to incorporate in the source notes.
source_notes_multiline
An option to either put source notes in separate lines (the default, or `True`) or render
them as a continuous line of text with `source_notes_sep` providing the separator (by
default `" "`) between notes.
source_notes_sep
The separating characters between adjacent source notes when rendered as a continuous line
of text (when `source_notes_multiline` is `False`). The default value is a single space
character (`" "`).
source_notes_border_bottom_style
The style of the bottom border of the `source_notes` location.
source_notes_border_bottom_width
The width of the bottom border of the `source_notes` location. If the `width` of this border
is larger, then it will be the visible border.
source_notes_border_bottom_color
The color of the bottom border of the `source_notes` location.
source_notes_border_lr_style
The style of the left and right borders of the `source_notes` location.
source_notes_border_lr_width
The width of the left and right borders of the `source_notes` location. If the `width` of
this border is larger, then it will be the visible border.
source_notes_border_lr_color
The color of the left and right borders of the `source_notes` location.
row_striping_background_color
The background color for striped table body rows. A color name or a hexadecimal color code
should be provided.
row_striping_include_stub
An option for whether to include the stub when striping rows.
row_striping_include_table_body
An option for whether to include the table body when striping rows.
quarto_disable_processing
Whether to disable Quarto table processing.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using select columns from the `exibble` dataset, let's create a new table with a number of table
components added. We can use this object going forward to demonstrate some of the features
available in the `tab_options()` method.
```{python}
from great_tables import GT, exibble, md
gt_tbl = (
GT(
exibble[["num", "char", "currency", "row", "group"]],
rowname_col="row",
groupname_col="group"
)
.tab_header(
title=md("Data listing from **exibble**"),
subtitle=md("`exibble` is a **Great Tables** dataset.")
)
.fmt_number(columns="num")
.fmt_currency(columns="currency")
.tab_source_note(source_note="This is only a subset of the dataset.")
)
gt_tbl
```
We can set the table width to `"100%"`. In effect, this makes the table span the entire
content width. This is done with the `table_width` option.
```{python}
gt_tbl.tab_options(table_width="100%")
```
With the `table_background_color` option, we can modify the table's background color. Here, we
want that to be `"lightcyan"`.
```{python}
gt_tbl.tab_options(table_background_color="lightcyan")
```
The data rows of a table typically take up the most physical space but we have some control over
the extent of that. With the `data_row_padding` option, it's possible to modify the top and
bottom padding of data rows. We'll do just that in the following example, reducing the padding
to a value of `"3px"`.
```{python}
gt_tbl.tab_options(data_row_padding="3px")
```
The size of the title and the subtitle text in the header of the table can be altered with the
`heading_title_font_size` and `heading_subtitle_font_size` options. Here, we'll use the
`"small"` and `"x-small"` keyword values.
```{python}
gt_tbl.tab_options(heading_title_font_size="small", heading_subtitle_font_size="x-small")
```
## Formatting column data
Columns of data can be formatted with the `fmt_*()` methods. We can specify the rows of these columns quite precisely with the `rows` argument. Only one formatter applies to each data cell: if several `fmt_*()` calls target the same cell, the last call wins. Need to do custom formatting? Use the [`fmt()`](`great_tables.GT.fmt`) method and define your own formatter. The `sub_*()` methods allow you to perform substitution operations and `data_color()` provides a lot of power for colorizing body cells based on their data values.
fmt_number(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, decimals: 'int' = 2, n_sigfig: 'int | None' = None, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, use_seps: 'bool' = True, accounting: 'bool' = False, scale_by: 'float' = 1, compact: 'bool' = False, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'
Format numeric values.
With numeric values within a table's body cells, we can perform number-based formatting so that
the targeted values are rendered with a higher consideration for tabular presentation.
Furthermore, there is finer control over numeric formatting with the following options:
- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice
of the decimal symbol
- digit grouping separators: options to enable/disable digit separators and provide a choice of
separator symbol
- scaling: we can choose to scale targeted values by a multiplier value
- large-number suffixing: larger figures (thousands, millions, etc.) can be autoscaled and
decorated with the appropriate suffixes
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in number formatting specific to
the chosen locale
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
decimals
The `decimals` value corresponds to the exact number of decimal places to use. A value such
as `2.34` can, for example, be formatted with `0` decimal places and it would result in
`"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros
can be removed with `drop_trailing_zeros=True`. If you always need `decimals = 0`, the
[`fmt_integer()`](`great_tables.GT.fmt_integer`) method should be considered.
n_sigfig
An option to format numbers to *n* significant figures. By default, this is `None` and thus
number values will be formatted according to the number of decimal places set via
`decimals`. If opting to format according to the rules of significant figures, `n_sigfig`
must be a number greater than or equal to `1`. Any values passed to the `decimals` and
`drop_trailing_zeros` arguments will be ignored.
drop_trailing_zeros
A boolean value that allows for removal of trailing zeros (those redundant zeros after the
decimal mark).
drop_trailing_dec_mark
A boolean value that determines whether decimal marks should always appear even if there are
no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By
default trailing decimal marks are not shown.
use_seps
The `use_seps` option allows for the use of digit group separators. The type of digit group
separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This
setting is `True` by default.
accounting
Whether to use accounting style, which wraps negative numbers in parentheses instead of
using a minus sign.
scale_by
All numeric values will be multiplied by the `scale_by` value before undergoing formatting.
Since the default value is `1`, no values will be changed unless a different multiplier
value is supplied.
compact
A boolean value that allows for compact formatting of numeric values. Values will be scaled
and decorated with the appropriate suffixes (e.g., `1230` becomes `1.23K`, and `1230000`
becomes `1.23M`). The `compact` option is `False` by default.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
sep_mark
The string to use as a separator between groups of digits. For example, using `sep_mark=","`
with a value of `1000` would result in a formatted value of `"1,000"`. This argument is
ignored if a `locale` is supplied (i.e., is not `None`).
dec_mark
The string to be used as the decimal mark. For example, using `dec_mark=","` with the value
`0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a
`locale` is supplied (i.e., is not `None`).
force_sign
Should the positive sign be shown for positive values (effectively showing a sign for all
values except zero)? If so, use `True` for this option. The default is `False`, where only
negative numbers will display a minus sign.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid
locale ID here means separator and decimal marks will be correct for the given locale. Should
any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's
preferred values.
Note that a `locale` value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
Let's use the `exibble` dataset to create a table. With the `fmt_number()` method, we'll format
the `num` column to have three decimal places (with `decimals=3`) and omit the use of digit
separators (with `use_seps=False`).
```{python}
from great_tables import GT, exibble
(
GT(exibble)
.fmt_number(columns="num", decimals=3, use_seps=False)
)
```
fmt_integer(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, use_seps: 'bool' = True, scale_by: 'float' = 1, accounting: 'bool' = False, compact: 'bool' = False, pattern: 'str' = '{x}', sep_mark: 'str' = ',', force_sign: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'
Format values as integers.
With numeric values in one or more table columns, we can perform number-based formatting so that
the targeted values are always rendered as integer values.
We can have fine control over integer formatting with the following options:
- digit grouping separators: options to enable/disable digit separators and provide a choice of
separator symbol
- scaling: we can choose to scale targeted values by a multiplier value
- large-number suffixing: larger figures (thousands, millions, etc.) can be autoscaled and
decorated with the appropriate suffixes
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in number formatting specific to
the chosen locale
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
use_seps
The `use_seps` option allows for the use of digit group separators. The type of digit group
separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This
setting is `True` by default.
scale_by
All numeric values will be multiplied by the `scale_by` value before undergoing formatting.
Since the default value is `1`, no values will be changed unless a different multiplier
value is supplied.
accounting
Whether to use accounting style, which wraps negative numbers in parentheses instead of
using a minus sign.
compact
A boolean value that allows for compact formatting of numeric values. Values will be scaled
and decorated with the appropriate suffixes (e.g., `1230` becomes `1K`, and `1230000`
becomes `1M`). The `compact` option is `False` by default.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
sep_mark
The string to use as a separator between groups of digits. For example, using `sep_mark=","`
with a value of `1000` would result in a formatted value of `"1,000"`. This argument is
ignored if a `locale` is supplied (i.e., is not `None`).
force_sign
Should the positive sign be shown for positive values (effectively showing a sign for all
values except zero)? If so, use `True` for this option. The default is `False`, where only
negative numbers will display a minus sign.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid
locale ID here means separator marks will be correct for the given locale. Should any value be
provided in `sep_mark`, it will be overridden by the locale's preferred value.
Note that a `locale` value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
For this example, we'll use the `exibble` dataset as the input table. With the `fmt_integer()`
method, we'll format the `num` column as integer values having no digit separators (with the
`use_seps=False` option).
```{python}
from great_tables import GT, exibble
(
GT(exibble)
.fmt_integer(columns="num", use_seps=False)
)
```
fmt_scientific(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, decimals: 'int' = 2, n_sigfig: 'int | None' = None, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, scale_by: 'float' = 1, exp_style: 'str' = 'x10n', pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign_m: 'bool' = False, force_sign_n: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'
Format values to scientific notation.
With numeric values in a table, we can perform formatting so that the targeted values are
rendered in scientific notation, where extremely large or very small numbers can be expressed in
a more practical fashion. Here, numbers are written in the form of a mantissa (`m`) and an
exponent (`n`) with the construction *m* x 10^*n* or *m*E*n*. The mantissa component is a number
between `1` and `10`. For instance, `2.5 x 10^9` can be used to represent the value
2,500,000,000 in scientific notation. In a similar way, 0.00000012 can be expressed as
`1.2 x 10^-7`. Due to its ability to describe numbers more succinctly and its ease of
calculation, scientific notation is widely employed in scientific and technical domains.
We have fine control over the formatting task, with the following options:
- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice
of the decimal symbol
- scaling: we can choose to scale targeted values by a multiplier value
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in formatting specific to the
chosen locale
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
decimals
The `decimals` value corresponds to the exact number of decimal places to use. A value such
as `2.34` can, for example, be formatted with `0` decimal places and it would result in
`"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros
can be removed with `drop_trailing_zeros=True`.
n_sigfig
An option to format numbers to *n* significant figures. By default, this is `None` and thus
number values will be formatted according to the number of decimal places set via
`decimals`. If opting to format according to the rules of significant figures, `n_sigfig`
must be a number greater than or equal to `1`. Any values passed to the `decimals` and
`drop_trailing_zeros` arguments will be ignored.
drop_trailing_zeros
A boolean value that allows for removal of trailing zeros (those redundant zeros after the
decimal mark).
drop_trailing_dec_mark
A boolean value that determines whether decimal marks should always appear even if there are
no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By
default trailing decimal marks are not shown.
scale_by
All numeric values will be multiplied by the `scale_by` value before undergoing formatting.
Since the default value is `1`, no values will be changed unless a different multiplier
value is supplied.
exp_style
Style of formatting to use for the scientific notation formatting. By default this is
`"x10n"` but other options include using a single letter (e.g., `"e"`, `"E"`, etc.), a
letter followed by a `"1"` to signal a minimum digit width of one, or `"low-ten"` for using
a stylized `"10"` marker.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
dec_mark
The string to be used as the decimal mark. For example, using `dec_mark=","` with the value
`0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a
`locale` is supplied (i.e., is not `None`).
force_sign_m
Should the plus sign be shown for positive values of the mantissa (first component)? This
would effectively show a sign for all values except zero on the first numeric component of
the notation. If so, use `True` (the default for this is `False`), where only negative
numbers will display a sign.
force_sign_n
Should the plus sign be shown for positive values of the exponent (second component)? This
would effectively show a sign for all values except zero on the second numeric component of
the notation. If so, use `True` (the default for this is `False`), where only negative
numbers will display a sign.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid
locale ID here means separator and decimal marks will be correct for the given locale. Should
a value be provided in `dec_mark`, it will be overridden by the locale's preferred value.
Note that a `locale` value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
For this example, we'll use the `exibble` dataset as the input table. With the
`fmt_scientific()` method, we'll format the `num` column to contain values in scientific
formatting.
```{python}
from great_tables import GT, exibble
(
GT(exibble)
.fmt_scientific(columns="num")
)
```
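The *m* x 10^*n* decomposition described for this method can be sketched in plain Python. The helper `to_scientific_sketch()` below is hypothetical (it is not how **Great Tables** is implemented); it only shows how a mantissa with magnitude between `1` and `10` and an integer exponent can be derived from a value.

```python
import math


def to_scientific_sketch(value: float, decimals: int = 2) -> str:
    """Hypothetical helper: write value as m x 10^n with 1 <= |m| < 10."""
    if value == 0:
        return f"{0:.{decimals}f}"
    exponent = math.floor(math.log10(abs(value)))
    mantissa = value / 10.0**exponent
    # Rounding can push the mantissa up to 10 (e.g., 9.99999 -> 10.00), so renormalize
    if abs(round(mantissa, decimals)) >= 10:
        mantissa /= 10
        exponent += 1
    return f"{mantissa:.{decimals}f} x 10^{exponent}"


print(to_scientific_sketch(2.5e9))       # 2.50 x 10^9
print(to_scientific_sketch(0.00000012))  # 1.20 x 10^-7
```

The two printed values match the worked examples in the method description (`2.5 x 10^9` and `1.2 x 10^-7`), shown here with the default of two decimal places.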
fmt_engineering(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, decimals: 'int' = 2, n_sigfig: 'int | None' = None, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, scale_by: 'float' = 1, exp_style: 'str' = 'x10n', pattern: 'str' = '{x}', dec_mark: 'str' = '.', force_sign_m: 'bool' = False, force_sign_n: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'
Format values to engineering notation.
With numeric values in a table, we can perform formatting so that the targeted values are
rendered in engineering notation, where numbers are written in the form of a mantissa (`m`) and
an exponent (`n`). When combined the construction is either of the form *m* x 10^*n* or *m*E*n*.
The mantissa is a number between `1` and `1000` and the exponent is a multiple of `3`. For
example, the number `0.0000345` can be written in engineering notation as `34.50 x 10^-6`. This
notation helps to simplify calculations and make it easier to compare numbers that are on very
different scales.
Engineering notation is particularly useful as it aligns with SI prefixes (e.g., *milli-*,
*micro-*, *kilo-*, *mega-*). For instance, numbers in engineering notation with exponent `-3`
correspond to milli-units, while those with exponent `6` correspond to mega-units.
We have fine control over the formatting task, with the following options:
- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice
of the decimal symbol
- scaling: we can choose to scale targeted values by a multiplier value
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in formatting specific to the
chosen locale
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
decimals
The `decimals` value corresponds to the exact number of decimal places to use. A value such
as `2.34` can, for example, be formatted with `0` decimal places and it would result in
`"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros
can be removed with `drop_trailing_zeros=True`.
n_sigfig
An option to format numbers to *n* significant figures. By default, this is `None` and thus
number values will be formatted according to the number of decimal places set via
`decimals`. If opting to format according to the rules of significant figures, `n_sigfig`
must be a number greater than or equal to `1`. Any values passed to the `decimals` and
`drop_trailing_zeros` arguments will be ignored.
drop_trailing_zeros
A boolean value that allows for removal of trailing zeros (those redundant zeros after the
decimal mark).
drop_trailing_dec_mark
A boolean value that determines whether decimal marks should always appear even if there are
no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By
default trailing decimal marks are not shown.
scale_by
All numeric values will be multiplied by the `scale_by` value before undergoing formatting.
Since the default value is `1`, no values will be changed unless a different multiplier
value is supplied.
exp_style
Style of formatting to use for the engineering notation formatting. By default this is
`"x10n"` but other options include using a single letter (e.g., `"e"`, `"E"`, etc.), a
letter followed by a `"1"` to signal a minimum digit width of one, or `"low-ten"` for using
a stylized `"10"` marker.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
dec_mark
The string to be used as the decimal mark. For example, using `dec_mark=","` with the value
`0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a
`locale` is supplied (i.e., is not `None`).
force_sign_m
Should the plus sign be shown for positive values of the mantissa (first component)? This
would effectively show a sign for all values except zero on the first numeric component of
the notation. If so, use `True` (the default for this is `False`), where only negative
numbers will display a sign.
force_sign_n
Should the plus sign be shown for positive values of the exponent (second component)? This
would effectively show a sign for all values except zero on the second numeric component of
the notation. If so, use `True` (the default for this is `False`), where only negative
numbers will display a sign.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid
locale ID here means decimal marks will be correct for the given locale. Should a value be
provided in `dec_mark`, it will be overridden by the locale's preferred value.
Note that a `locale` value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
With numeric values in a table, we can perform formatting so that the targeted values are
rendered in engineering notation. For example, the number `0.0000345` can be written in
engineering notation as `34.50 x 10^-6`.
```{python}
import polars as pl
from great_tables import GT
numbers_df = pl.DataFrame({
"numbers": [0.0000345, 3450, 3450000]
})
GT(numbers_df).fmt_engineering()
```
Notice that in each case, the exponent is a multiple of `3`.
Let's define a DataFrame that contains two columns of values (one small and one large). After
creating a simple table with `GT()`, we'll call `fmt_engineering()` on both columns.
```{python}
small_large_df = pl.DataFrame({
"small": [10**-i for i in range(12, 0, -1)],
"large": [10**i for i in range(1, 13)]
})
GT(small_large_df).fmt_engineering()
```
Notice that within the form of *m* x 10^*n*, the *n* values move in steps of 3 (away from 0),
and *m* values can have 1-3 digits before the decimal mark. Further to this, any value where *n*
is 0 is displayed as *m* alone (the first two values in the `large` column demonstrate this).
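The stepping of *n* in multiples of 3 can be reproduced with a few lines of plain Python. This is only an illustrative sketch: `to_engineering_sketch()` is a hypothetical helper, not the library's implementation, and it does not renormalize a mantissa that rounds up to `1000`.

```python
import math


def to_engineering_sketch(value: float, decimals: int = 2) -> str:
    """Hypothetical helper: write value as m x 10^n where n is a multiple of 3."""
    if value == 0:
        return f"{0:.{decimals}f}"
    # Largest multiple of 3 that does not exceed the base-10 exponent
    exponent = 3 * math.floor(math.log10(abs(value)) / 3)
    mantissa = value / 10.0**exponent
    if exponent == 0:
        # Mirrors the behavior described above: only m is shown when n is 0
        return f"{mantissa:.{decimals}f}"
    return f"{mantissa:.{decimals}f} x 10^{exponent}"


print(to_engineering_sketch(0.0000345))  # 34.50 x 10^-6
print(to_engineering_sketch(3450))       # 3.45 x 10^3
print(to_engineering_sketch(100))        # 100.00
```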
Engineering notation expresses values so that they align to certain SI prefixes. Here is a table
that compares select SI prefixes and their symbols to decimal and engineering-notation
representations of the key numbers.
```{python}
import polars as pl
from great_tables import GT
prefixes_df = pl.DataFrame({
"name": [
"peta", "tera", "giga", "mega", "kilo",
None,
"milli", "micro", "nano", "pico", "femto"
],
"symbol": [
"P", "T", "G", "M", "k",
None,
"m", "μ", "n", "p", "f"
],
"decimal": [float(10**i) for i in range(15, -18, -3)],
})
prefixes_df = prefixes_df.with_columns(
engineering=pl.col("decimal")
)
(
GT(prefixes_df)
.fmt_number(columns="decimal", n_sigfig=1)
.fmt_engineering(columns="engineering")
.sub_missing()
)
```
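The correspondence between engineering exponents and SI prefixes can also be expressed as a simple lookup. The sketch below is an assumption-laden illustration in plain Python: the `SI_PREFIXES` mapping and the `si_prefix_for()` helper are hypothetical names, with the prefix names and symbols copied from the table above.

```python
import math
from typing import Optional, Tuple

# Prefix names and symbols taken from the table above, keyed by exponent
SI_PREFIXES = {
    15: ("peta", "P"),
    12: ("tera", "T"),
    9: ("giga", "G"),
    6: ("mega", "M"),
    3: ("kilo", "k"),
    -3: ("milli", "m"),
    -6: ("micro", "μ"),
    -9: ("nano", "n"),
    -12: ("pico", "p"),
    -15: ("femto", "f"),
}


def si_prefix_for(value: float) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: look up the SI prefix for a value's engineering exponent."""
    if value == 0:
        return None
    exponent = 3 * math.floor(math.log10(abs(value)) / 3)
    # Exponent 0 (and anything outside the table) has no prefix in this sketch
    return SI_PREFIXES.get(exponent)


print(si_prefix_for(3450))       # ('kilo', 'k')
print(si_prefix_for(0.0000345))  # ('micro', 'μ')
print(si_prefix_for(42))         # None
```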
fmt_percent(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, decimals: 'int' = 2, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, scale_values: 'bool' = True, use_seps: 'bool' = True, accounting: 'bool' = False, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, placement: 'str' = 'right', incl_space: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'
Format values as a percentage.
With numeric values in a **gt** table, we can perform percentage-based formatting. It is assumed
the input numeric values are proportional values and, in this case, the values will be
automatically multiplied by `100` before decorating with a percent sign (the other case is
accommodated through setting `scale_values` to `False`). For more control over percentage
formatting, we can use the following options:
- percent sign placement: the percent sign can be placed after or before the values and a space
can be inserted between the symbol and the value.
- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice
of the decimal symbol
- digit grouping separators: options to enable/disable digit separators and provide a choice of
separator symbol
- value scaling toggle: choose to disable automatic value scaling in the situation that values
are already scaled coming in (and just require the percent symbol)
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in number formatting specific to
the chosen locale
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
decimals
The `decimals` value corresponds to the exact number of decimal places to use. A value such
as `2.34` can, for example, be formatted with `0` decimal places and it would result in
`"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros
can be removed with `drop_trailing_zeros=True`.
drop_trailing_zeros
A boolean value that allows for removal of trailing zeros (those redundant zeros after the
decimal mark).
drop_trailing_dec_mark
A boolean value that determines whether decimal marks should always appear even if there are
no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By
default trailing decimal marks are not shown.
scale_values
Should the values be scaled through multiplication by 100? By default this scaling is
performed since the expectation is that incoming values are usually proportional. Setting to
`False` signifies that the values are already scaled and require only the percent sign when
formatted.
use_seps
The `use_seps` option allows for the use of digit group separators. The type of digit group
separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This
setting is `True` by default.
accounting
Whether to use accounting style, which wraps negative numbers in parentheses instead of
using a minus sign.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
sep_mark
The string to use as a separator between groups of digits. For example, using `sep_mark=","`
with a value of `1000` would result in a formatted value of `"1,000"`. This argument is
ignored if a `locale` is supplied (i.e., is not `None`).
dec_mark
The string to be used as the decimal mark. For example, using `dec_mark=","` with the value
`0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a
`locale` is supplied (i.e., is not `None`).
force_sign
Should the positive sign be shown for positive values (effectively showing a sign for all
values except zero)? If so, use `True` for this option. The default is `False`, where only
negative numbers will display a minus sign.
placement
This option governs the placement of the percent sign. This can be either `"right"` (the
default) or `"left"`.
incl_space
An option for whether to include a space between the value and the percent sign. The default
is to not introduce a space character.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid
locale ID here means separator and decimal marks will be correct for the given locale. Should
any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's
preferred values.
Note that a `locale` value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
Let’s use the `towny` dataset as the input table. With the `fmt_percent()` method, we'll format
the `pop_change_2016_2021_pct` column to display values as percentages (to two decimal
places).
```{python}
from great_tables import GT
from great_tables.data import towny
towny_mini = (
towny[["name", "pop_change_2016_2021_pct"]]
.head(10)
)
(GT(towny_mini).fmt_percent("pop_change_2016_2021_pct", decimals=2))
```
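To see how `scale_values`, `placement`, and `incl_space` interact, here is a minimal plain-Python sketch. The helper `format_percent_sketch()` is hypothetical and only imitates the documented options for non-negative values; it is not the library's implementation and ignores locale handling.

```python
def format_percent_sketch(
    value: float,
    decimals: int = 2,
    scale_values: bool = True,
    placement: str = "right",
    incl_space: bool = False,
) -> str:
    """Hypothetical helper: mimic the documented percent options in plain Python."""
    if scale_values:
        value = value * 100  # proportions such as 0.125 become 12.5
    number = f"{value:,.{decimals}f}"
    space = " " if incl_space else ""
    if placement == "left":
        return f"%{space}{number}"
    return f"{number}{space}%"


print(format_percent_sketch(0.125))                                       # 12.50%
print(format_percent_sketch(12.5, scale_values=False, incl_space=True))  # 12.50 %
print(format_percent_sketch(0.125, decimals=1, placement="left"))        # %12.5
```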
fmt_partsper(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, to_units: 'str' = 'per-mille', symbol: 'str' = 'auto', decimals: 'int' = 2, drop_trailing_zeros: 'bool' = False, drop_trailing_dec_mark: 'bool' = True, scale_values: 'bool' = True, use_seps: 'bool' = True, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, incl_space: 'str | bool' = 'auto', locale: 'str | None' = None) -> 'GTSelf'
Format values as parts-per quantities.
With numeric values in a **gt** table, we can format the values so that they are rendered as
parts-per quantities (per mille, ppm, ppb, etc.). The following keywords are available for
the `to_units` parameter:
- `"per-mille"`: Per mille (1 part in 1,000)
- `"per-myriad"`: Per myriad (1 part in 10,000)
- `"pcm"`: Per cent mille (1 part in 100,000)
- `"ppm"`: Parts per million (1 part in 1,000,000)
- `"ppb"`: Parts per billion (1 part in 1,000,000,000)
- `"ppt"`: Parts per trillion (1 part in 1,000,000,000,000)
- `"ppq"`: Parts per quadrillion (1 part in 1,000,000,000,000,000)
The function provides a lot of formatting control and we can use the following options:
- custom symbol/units: override the automatic symbol or units display with a custom choice
- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice
of the decimal symbol
- digit grouping separators: options to enable/disable digit separators and provide a choice of
separator symbol
- value scaling toggle: choose to disable automatic value scaling in the situation that values
are already scaled coming in (and just require the appropriate symbol or unit display)
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in number formatting specific to
the chosen locale
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
to_units
A keyword that signifies the desired output quantity. This can be any from the following
set: `"per-mille"`, `"per-myriad"`, `"pcm"`, `"ppm"`, `"ppb"`, `"ppt"`, or `"ppq"`.
symbol
The symbol/units to use for the quantity. By default, this is set to `"auto"` and the
appropriate symbol will be chosen based on the `to_units` keyword and the output context.
This can be changed by supplying a string (e.g., using `symbol="ppbV"` when
`to_units="ppb"`).
decimals
The `decimals` value corresponds to the exact number of decimal places to use. A value such
as `2.34` can, for example, be formatted with `0` decimal places and it would result in
`"2"`. With `4` decimal places, the formatted value becomes `"2.3400"`. The trailing zeros
can be removed with `drop_trailing_zeros=True`.
drop_trailing_zeros
A boolean value that allows for removal of trailing zeros (those redundant zeros after the
decimal mark).
drop_trailing_dec_mark
A boolean value that determines whether decimal marks should always appear even if there are
no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By
default trailing decimal marks are not shown.
scale_values
Should the values be scaled through multiplication according to the keyword set in
`to_units`? By default this is `True` since the expectation is that normally values are
proportions. Setting to `False` signifies that the values are already scaled and require
only the appropriate symbol/units when formatted.
use_seps
The `use_seps` option allows for the use of digit group separators. The type of digit group
separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This
setting is `True` by default.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
sep_mark
The string to use as a separator between groups of digits. For example, using `sep_mark=","`
with a value of `1000` would result in a formatted value of `"1,000"`. This argument is
ignored if a `locale` is supplied (i.e., is not `None`).
dec_mark
The string to be used as the decimal mark. For example, using `dec_mark=","` with the value
`0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a
`locale` is supplied (i.e., is not `None`).
force_sign
Should the positive sign be shown for positive values (effectively showing a sign for all
values except zero)? If so, use `True` for this option. The default is `False`, where only
negative numbers will display a minus sign.
incl_space
An option for whether to include a space between the value and the symbol/units. The default
is `"auto"` which provides spacing dependent on the mark itself (symbols like `‰` get no
space; text abbreviations like `ppm` get a space). This can be directly controlled by using
either `True` or `False`.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid
locale ID here means separator and decimal marks will be correct for the given locale. Should
any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's
preferred values.
Note that a `locale` value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
Let's use a small dataset with proportional values and format them as parts-per-mille values.
```{python}
from great_tables import GT
import pandas as pd
df = pd.DataFrame({"x": [0.001, 0.0001, 0.00001, 0.5, -0.005]})
GT(df).fmt_partsper(columns="x", to_units="per-mille")
```
We can also format values as parts per million (ppm) using a Polars DataFrame:
```{python}
import polars as pl
from great_tables import GT
df = pl.DataFrame({"x": [0.0000015, 0.00035, 0.0001]})
GT(df).fmt_partsper(columns="x", to_units="ppm")
```
If the values are already scaled (not proportions), set `scale_values=False` and use a custom
symbol:
```{python}
import polars as pl
from great_tables import GT
concentrations = pl.DataFrame({"gas": ["CO", "NO2", "O3"], "conc": [1.5, 35.0, 120.0]})
GT(concentrations).fmt_partsper(columns="conc", to_units="ppb", scale_values=False, symbol="ppbV")
```
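The scaling implied by each `to_units` keyword follows directly from the "1 part in N" definitions listed earlier. The sketch below spells this out in plain Python; `PARTS_PER_FACTORS` and `scale_partsper_sketch()` are hypothetical names used for illustration only and are not part of **Great Tables**.

```python
# Multipliers implied by the "1 part in N" definitions listed for `to_units`
PARTS_PER_FACTORS = {
    "per-mille": 10**3,
    "per-myriad": 10**4,
    "pcm": 10**5,
    "ppm": 10**6,
    "ppb": 10**9,
    "ppt": 10**12,
    "ppq": 10**15,
}


def scale_partsper_sketch(
    value: float, to_units: str = "per-mille", scale_values: bool = True
) -> float:
    """Hypothetical helper: scale a proportion to the requested parts-per unit."""
    if to_units not in PARTS_PER_FACTORS:
        raise ValueError(f"Unknown to_units keyword: {to_units!r}")
    # With scale_values=False the value is assumed to be scaled already
    return value * PARTS_PER_FACTORS[to_units] if scale_values else value


print(f"{scale_partsper_sketch(0.5, 'per-mille'):.2f}")                  # 500.00
print(f"{scale_partsper_sketch(0.00035, 'ppm'):.2f}")                    # 350.00
print(f"{scale_partsper_sketch(120.0, 'ppb', scale_values=False):.2f}")  # 120.00
```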
fmt_currency(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, currency: 'str | None' = None, use_subunits: 'bool' = True, decimals: 'int | None' = None, drop_trailing_dec_mark: 'bool' = True, use_seps: 'bool' = True, accounting: 'bool' = False, scale_by: 'float' = 1, compact: 'bool' = False, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, placement: 'str' = 'left', incl_space: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'
Format values as currencies.
With numeric values in a **gt** table, we can perform currency-based formatting with the
`fmt_currency()` method. This supports automatic formatting with a three-letter currency
code. We have fine control over the conversion from numeric values to currency values, where we
could take advantage of the following options:
- the currency: providing a currency code or common currency name will procure the correct
currency symbol and number of currency subunits
- currency symbol placement: the currency symbol can be placed before or after the values
- decimals/subunits: choice of the number of decimal places, a choice of the decimal symbol,
and an option on whether to include or exclude the currency subunits (the decimal portion)
- digit grouping separators: options to enable/disable digit separators and provide a choice of
separator symbol
- scaling: we can choose to scale targeted values by a multiplier value
- pattern: option to use a text pattern for decoration of the formatted currency values
- locale-based formatting: providing a locale ID will result in currency formatting specific to
the chosen locale; it will also retrieve the locale's currency if none is explicitly given
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
currency
The currency to use for the numeric value. This input can be supplied as a 3-letter currency
code (e.g., `"USD"` for U.S. Dollars, `"EUR"` for the Euro currency).
use_subunits
An option for whether the subunits portion of a currency value should be displayed. For
example, with an input value of `273.81`, the default formatting will produce `"$273.81"`.
Removing the subunits (with `use_subunits=False`) will give us `"$273"`.
decimals
The `decimals` value corresponds to the exact number of decimal places to use. This value
is optional as a currency has an intrinsic number of decimal places (i.e., the subunits).
A value such as `2.34` can, for example, be formatted with `0` decimal places and if the
currency used is `"USD"` it would result in `"$2"`. With `4` decimal places, the formatted
value becomes `"$2.3400"`.
drop_trailing_dec_mark
A boolean value that determines whether decimal marks should always appear even if there are
no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By
default trailing decimal marks are not shown.
use_seps
The `use_seps` option allows for the use of digit group separators. The type of digit group
separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This
setting is `True` by default.
accounting
Whether to use accounting style, which wraps negative numbers in parentheses instead of
using a minus sign.
scale_by
All numeric values will be multiplied by the `scale_by` value before undergoing formatting.
Since the default value is `1`, no values will be changed unless a different multiplier
value is supplied.
compact
Whether to use compact formatting. This is a boolean value that, when set to `True`, will
format large numbers in a more compact form (e.g., `1,000,000` becomes `1M`). This is
`False` by default.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
sep_mark
The string to use as a separator between groups of digits. For example, using `sep_mark=","`
with a value of `1000` would result in a formatted value of `"1,000"`. This argument is
ignored if a `locale` is supplied (i.e., is not `None`).
dec_mark
The string to be used as the decimal mark. For example, using `dec_mark=","` with the value
`0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a
`locale` is supplied (i.e., is not `None`).
force_sign
Should the positive sign be shown for positive values (effectively showing a sign for all
values except zero)? If so, use `True` for this option. The default is `False`, where only
negative numbers will display a minus sign.
placement
The placement of the currency symbol. This can be either `"left"` (as in `"$450"`) or
`"right"` (which yields `"450$"`).
incl_space
An option for whether to include a space between the value and the currency symbol. The
default is to not introduce a space character.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid
locale ID here means separator and decimal marks will be correct for the given locale. Should
any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's
preferred values. In addition to number formatting, providing a `locale` value and not providing
a `currency` allows **Great Tables** to obtain the currency code from the locale's territory.
Note that a `locale` value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
Let's use the `exibble` dataset to create a table. With the `fmt_currency()` method, we'll
format the `currency` column to display monetary values.
```{python}
from great_tables import GT, exibble
(
GT(exibble)
.fmt_currency(
columns="currency",
decimals=3,
use_seps=False
)
)
```
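To picture how `sep_mark=`, `dec_mark=`, `placement=`, `incl_space=`, and `force_sign=` combine, here is a minimal pure-Python sketch. It is not the **Great Tables** implementation: the helper name `format_currency_sketch()` is hypothetical, and writing the sign in front of the whole string is an assumption made for illustration.

```python
def format_currency_sketch(
    value,
    symbol="$",
    decimals=2,
    sep_mark=",",
    dec_mark=".",
    placement="left",
    incl_space=False,
    force_sign=False,
):
    """Illustrative only: a hypothetical helper, not part of great_tables."""
    # Format the magnitude with Python's default "," and "." marks first.
    digits = f"{abs(value):,.{decimals}f}"
    # Swap in the requested marks; the placeholder keeps one replacement
    # from clobbering the other (e.g., sep_mark="." with dec_mark=",").
    digits = digits.replace(",", "\0").replace(".", dec_mark).replace("\0", sep_mark)
    gap = " " if incl_space else ""
    if placement == "left":
        text = f"{symbol}{gap}{digits}"
    else:
        text = f"{digits}{gap}{symbol}"
    # Assumption: the sign is written in front of the whole string.
    if value < 0:
        return "-" + text
    if value > 0 and force_sign:
        return "+" + text
    return text


print(format_currency_sketch(450, decimals=0))
print(format_currency_sketch(450, decimals=0, placement="right"))
print(format_currency_sketch(1234.5, symbol="€", sep_mark=".", dec_mark=",",
                             placement="right", incl_space=True))
```

The first two calls reproduce the `"$450"` and `"450$"` placements described for `placement=`.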
fmt_roman(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, case: 'str' = 'upper', pattern: 'str' = '{x}') -> 'GTSelf'
Format values as Roman numerals.
With numeric values in a table we can transform those to Roman numerals, rounding values
as necessary.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
case
Should Roman numerals be rendered as uppercase (`"upper"`) or lowercase (`"lower"`)
letters? By default, this is set to `"upper"`.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's first create a DataFrame containing small numeric values and then introduce that to
[`GT()`](`great_tables.GT`). We'll then format the `roman` column to appear as Roman numerals
with the `fmt_roman()` method.
```{python}
import pandas as pd
from great_tables import GT
numbers_tbl = pd.DataFrame({"arabic": [1, 8, 24, 85], "roman": [1, 8, 24, 85]})
(
GT(numbers_tbl, rowname_col="arabic")
.fmt_roman(columns="roman")
)
```
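For readers curious about the conversion itself, the following is a minimal sketch of Roman numeral formatting with `case=` and `pattern=` handling. The function `to_roman_sketch()` is hypothetical and is not how **Great Tables** is implemented; in particular, the use of Python's built-in `round()` and the refusal of values below 1 are assumptions of this sketch.

```python
# Value/letter pairs in descending order, including subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman_sketch(value, case="upper", pattern="{x}"):
    """Illustrative only: a hypothetical helper, not part of great_tables."""
    # Assumption: round to the nearest integer with Python's round().
    number = round(value)
    if number < 1:
        raise ValueError("this sketch only covers values that round to 1 or more")
    numeral = ""
    for amount, letters in ROMAN_PAIRS:
        # Take as many copies of the current symbol as fit, keep the remainder.
        count, number = divmod(number, amount)
        numeral += letters * count
    if case == "lower":
        numeral = numeral.lower()
    # Every "{x}" in the pattern is replaced by the formatted value.
    return pattern.replace("{x}", numeral)


print([to_roman_sketch(n) for n in [1, 8, 24, 85]])
print(to_roman_sketch(85, case="lower", pattern="({x})"))
```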
fmt_bytes(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, standard: 'str' = 'decimal', decimals: 'int' = 1, n_sigfig: 'int | None' = None, drop_trailing_zeros: 'bool' = True, drop_trailing_dec_mark: 'bool' = True, use_seps: 'bool' = True, pattern: 'str' = '{x}', sep_mark: 'str' = ',', dec_mark: 'str' = '.', force_sign: 'bool' = False, incl_space: 'bool' = True, locale: 'str | None' = None) -> 'GTSelf'
Format values as bytes.
With numeric values in a table, we can transform those to values of bytes with human readable
units. The `fmt_bytes()` method allows for the formatting of byte sizes to either of two common
representations: (1) with decimal units (powers of 1000, examples being `"kB"` and `"MB"`), and
(2) with binary units (powers of 1024, examples being `"KiB"` and `"MiB"`). It is assumed the
input numeric values represent the number of bytes and automatic truncation of values will
occur. The numeric values will be scaled to be in the range of 1 to <1000 and then decorated
with the correct unit symbol according to the standard chosen. For more control over the
formatting of byte sizes, we can use the following options:
- decimals: choice of the number of decimal places, option to drop trailing zeros, and a choice
of the decimal symbol
- digit grouping separators: options to enable/disable digit separators and provide a choice of
separator symbol
- pattern: option to use a text pattern for decoration of the formatted values
- locale-based formatting: providing a locale ID will result in number formatting specific to
the chosen locale
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
standard
The form of expressing large byte sizes is divided between: (1) decimal units (powers of
1000; e.g., `"kB"` and `"MB"`), and (2) binary units (powers of 1024; e.g., `"KiB"` and
`"MiB"`). The default is to use decimal units with the `"decimal"` option. The alternative
is to use binary units with the `"binary"` option.
decimals
This corresponds to the exact number of decimal places to use. A value such as `2.34` can,
for example, be formatted with `0` decimal places and it would result in `"2"`. With `4`
decimal places, the formatted value becomes `"2.3400"`. The trailing zeros can be removed
with `drop_trailing_zeros=True`.
drop_trailing_zeros
A boolean value that allows for removal of trailing zeros (those redundant zeros after the
decimal mark).
drop_trailing_dec_mark
A boolean value that determines whether decimal marks should always appear even if there are
no decimal digits to display after formatting (e.g., `23` becomes `23.` if `False`). By
default trailing decimal marks are not shown.
use_seps
The `use_seps` option allows for the use of digit group separators. The type of digit group
separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This
setting is `True` by default.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
sep_mark
The string to use as a separator between groups of digits. For example, using `sep_mark=","`
with a value of `1000` would result in a formatted value of `"1,000"`. This argument is
ignored if a `locale` is supplied (i.e., is not `None`).
dec_mark
The string to be used as the decimal mark. For example, using `dec_mark=","` with the value
`0.152` would result in a formatted value of `"0,152"`. This argument is ignored if a
`locale` is supplied (i.e., is not `None`).
force_sign
Should the positive sign be shown for positive values (effectively showing a sign for all
values except zero)? If so, use `True` for this option. The default is `False`, where only
negative numbers will display a minus sign.
incl_space
An option for whether to include a space between the value and the units. The
default (`True`) is to include a space character.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). The use of a valid
locale ID here means separator and decimal marks will be correct for the given locale. Should
any values be provided in `sep_mark` or `dec_mark`, they will be overridden by the locale's
preferred values.
Note that a `locale` value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
Let's use a single column from the `exibble` dataset and create a new table. We'll format the
`num` column to display as byte sizes in the decimal standard through use of the `fmt_bytes()`
method.
```{python}
from great_tables import GT, exibble
(
GT(exibble[["num"]])
.fmt_bytes(columns="num", standard="decimal")
)
```
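The difference between the `"decimal"` and `"binary"` standards comes down to the scaling base (1000 versus 1024) and the unit symbols. The sketch below illustrates that scaling in plain Python. The helper `format_bytes_sketch()` is hypothetical, the unit lists beyond the `"kB"`/`"MB"` and `"KiB"`/`"MiB"` examples are the usual SI and IEC symbols, and the rounding details are assumptions rather than a description of the library's internals.

```python
import math

DECIMAL_UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def format_bytes_sketch(n_bytes, standard="decimal", decimals=1, incl_space=True):
    """Illustrative only: a hypothetical helper, not part of great_tables."""
    if standard == "decimal":
        base, units = 1000, DECIMAL_UNITS
    else:
        base, units = 1024, BINARY_UNITS
    # Inputs are treated as whole byte counts, so fractional parts are truncated.
    value = float(math.trunc(n_bytes))
    index = 0
    # Divide by the base until the value drops below it (or units run out).
    while abs(value) >= base and index < len(units) - 1:
        value /= base
        index += 1
    text = f"{value:.{decimals}f}"
    # Mimic dropping trailing zeros and a trailing decimal mark.
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    gap = " " if incl_space else ""
    return f"{text}{gap}{units[index]}"


print(format_bytes_sketch(1500))
print(format_bytes_sketch(1536, standard="binary"))
print(format_bytes_sketch(1_000_000))
```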
fmt_date(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, date_style: 'DateStyle' = 'iso', pattern: 'str' = '{x}', locale: 'str | None' = None) -> 'GTSelf'
Format values as dates.
Format input values to date values using one of 17 preset date styles. Input can be in the form
of `date` values or ISO 8601 strings (in the form of `YYYY-MM-DD HH:MM:SS` or `YYYY-MM-DD`).
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
date_style
The date style to use. By default this is the short name `"iso"` which corresponds to
ISO 8601 date formatting. There are 17 date styles in total.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Formatting with the `date_style=` argument
-----------------------------------------
We need to supply a preset date style to the `date_style=` argument. The date styles are
numerous and can handle localization to any supported locale. The following table provides a
listing of all date styles and their output values (corresponding to an input date of
`2000-02-29`).
| | Date Style | Output |
|----|-----------------------|-------------------------|
| 1 | `"iso"` | `"2000-02-29"` |
| 2 | `"wday_month_day_year"`| `"Tuesday, February 29, 2000"` |
| 3 | `"wd_m_day_year"` | `"Tue, Feb 29, 2000"` |
| 4 | `"wday_day_month_year"`| `"Tuesday 29 February 2000"` |
| 5 | `"month_day_year"` | `"February 29, 2000"` |
| 6 | `"m_day_year"` | `"Feb 29, 2000"` |
| 7 | `"day_m_year"` | `"29 Feb 2000"` |
| 8 | `"day_month_year"` | `"29 February 2000"` |
| 9 | `"day_month"` | `"29 February"` |
| 10 | `"day_m"` | `"29 Feb"` |
| 11 | `"year"` | `"2000"` |
| 12 | `"month"` | `"February"` |
| 13 | `"day"` | `"29"` |
| 14 | `"year.mn.day"` | `"2000/02/29"` |
| 15 | `"y.mn.day"` | `"00/02/29"` |
| 16 | `"year_week"` | `"2000-W09"` |
| 17 | `"year_quarter"` | `"2000-Q1"` |
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). Note that a `locale`
value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
Let's use the `exibble` dataset to create a simple, two-column table (keeping only the `date`
and `time` columns). With the `fmt_date()` method, we'll format the `date` column to display
dates formatted with the `"month_day_year"` date style.
```{python}
from great_tables import GT, exibble
exibble_mini = exibble[["date", "time"]]
(
GT(exibble_mini)
.fmt_date(columns="date", date_style="month_day_year")
)
```
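To make the relationship between a date style name and its output more concrete, here is a small standard-library sketch that reproduces a handful of rows from the date style table for English output. The helper `style_date_sketch()` and its hard-coded month names are hypothetical; treating `"year_week"` as the ISO week number is an assumption that happens to match the table entry for `2000-02-29`.

```python
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def style_date_sketch(d, date_style="iso"):
    """Illustrative only: a hypothetical helper covering six of the styles."""
    month = MONTHS[d.month - 1]
    # Assumption: the week in "year_week" is the ISO 8601 week number.
    iso_year, iso_week, _ = d.isocalendar()
    outputs = {
        "iso": d.isoformat(),
        "month_day_year": f"{month} {d.day}, {d.year}",
        "m_day_year": f"{month[:3]} {d.day}, {d.year}",
        "day_month_year": f"{d.day} {month} {d.year}",
        "year_week": f"{iso_year}-W{iso_week:02d}",
        "year_quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
    }
    return outputs[date_style]


leap_day = date(2000, 2, 29)
for style in ["iso", "month_day_year", "year_week", "year_quarter"]:
    print(style, "->", style_date_sketch(leap_day, style))
```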
fmt_time(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, time_style: 'TimeStyle' = 'iso', pattern: 'str' = '{x}', locale: 'str | None' = None) -> 'GTSelf'
Format values as times.
Format input values to time values using one of 5 preset time styles. Input can be in the form
of `time` values, or strings in the ISO 8601 forms of `HH:MM:SS` or `YYYY-MM-DD HH:MM:SS`.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
time_style
The time style to use. By default this is the short name `"iso"` which corresponds to how
times are formatted within ISO 8601 datetime values. There are 5 time styles in total.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Formatting with the `time_style=` argument
-----------------------------------------
We need to supply a preset time style to the `time_style=` argument. The time styles can handle
localization to any supported locale. The following table provides a listing of all time styles
and their output values (corresponding to an input time of `14:35:00`).
| | Time Style | Output | Notes |
|----|---------------|---------------------------------|---------------|
| 1 | `"iso"` | `"14:35:00"` | ISO 8601, 24h |
| 2 | `"iso-short"` | `"14:35"` | ISO 8601, 24h |
| 3 | `"h_m_s_p"` | `"2:35:00 PM"` | 12h |
| 4 | `"h_m_p"` | `"2:35 PM"` | 12h |
| 5 | `"h_p"` | `"2 PM"` | 12h |
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Adapting output to a specific `locale`
--------------------------------------
This formatting method can adapt outputs according to a provided `locale` value. Examples
include `"en"` for English (United States) and `"fr"` for French (France). Note that a `locale`
value provided here will override any global locale setting performed in
[`GT()`](`great_tables.GT`)'s own `locale` argument (it is settable there as a value received by
all other methods that have a `locale` argument).
Examples
--------
Let's use the `exibble` dataset to create a simple, two-column table (keeping only the `date`
and `time` columns). With the `fmt_time()` method, we'll format the `time` column to display
times formatted with the `"h_m_s_p"` time style.
```{python}
from great_tables import GT, exibble
exibble_mini = exibble[["date", "time"]]
(
GT(exibble_mini)
.fmt_time(columns="time", time_style="h_m_s_p")
)
```
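The five time styles differ only in which components are shown and whether a 12-hour clock is used. The following sketch reproduces the table outputs for English using the standard library. The helper `style_time_sketch()` is hypothetical and is not the library's implementation.

```python
from datetime import time


def style_time_sketch(t, time_style="iso"):
    """Illustrative only: a hypothetical helper, not part of great_tables."""
    # Convert to a 12-hour clock: 0 -> 12 AM, 13 -> 1 PM, and so on.
    hour12 = t.hour % 12 or 12
    period = "AM" if t.hour < 12 else "PM"
    outputs = {
        "iso": f"{t.hour:02d}:{t.minute:02d}:{t.second:02d}",
        "iso-short": f"{t.hour:02d}:{t.minute:02d}",
        "h_m_s_p": f"{hour12}:{t.minute:02d}:{t.second:02d} {period}",
        "h_m_p": f"{hour12}:{t.minute:02d} {period}",
        "h_p": f"{hour12} {period}",
    }
    return outputs[time_style]


afternoon = time(14, 35, 0)
for style in ["iso", "iso-short", "h_m_s_p", "h_m_p", "h_p"]:
    print(style, "->", style_time_sketch(afternoon, style))
```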
fmt_datetime(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, date_style: 'DateStyle' = 'iso', time_style: 'TimeStyle' = 'iso', format_str: 'str | None' = None, sep: 'str' = ' ', pattern: 'str' = '{x}', locale: 'str | None' = None) -> 'GTSelf'
Format values as datetimes.
Format input values to datetime values using one of 17 preset date styles and one of 5 preset
time styles. Input can be in the form of `datetime` values, or strings in the ISO 8601 forms of
`YYYY-MM-DD HH:MM:SS` or `YYYY-MM-DD`.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
date_style
The date style to use. By default this is the short name `"iso"` which corresponds to
ISO 8601 date formatting. There are 17 date styles in total.
time_style
The time style to use. By default this is the short name `"iso"` which corresponds to how
times are formatted within ISO 8601 datetime values. There are 5 time styles in total.
format_str
A string that specifies the format of the datetime string. This is a `strftime()` format
string that can be used to format date or datetime input. If `format_str=` is provided, the
`date_style=` and `time_style=` arguments are ignored.
sep
A string that separates the date and time components of the datetime string. The default is
a space character (`" "`). This is ignored if `format_str=` is provided.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Only relevant if `date_style=` or `time_style=` are provided.
Formatting with the `date_style=` and `time_style=` arguments
-------------------------------------------------------------
If not supplying a formatting string to `format_str=` we need to supply a preset date style to
the `date_style=` argument and a preset time style to the `time_style=` argument. The date
styles are numerous and can handle localization to any supported locale. The following table
provides a listing of all date styles and their output values (corresponding to an input date of
`2000-02-29 14:35:00`).
| | Date Style | Output |
|----|-----------------------|-------------------------|
| 1 | `"iso"` | `"2000-02-29"` |
| 2 | `"wday_month_day_year"`| `"Tuesday, February 29, 2000"` |
| 3 | `"wd_m_day_year"` | `"Tue, Feb 29, 2000"` |
| 4 | `"wday_day_month_year"`| `"Tuesday 29 February 2000"` |
| 5 | `"month_day_year"` | `"February 29, 2000"` |
| 6 | `"m_day_year"` | `"Feb 29, 2000"` |
| 7 | `"day_m_year"` | `"29 Feb 2000"` |
| 8 | `"day_month_year"` | `"29 February 2000"` |
| 9 | `"day_month"` | `"29 February"` |
| 10 | `"day_m"` | `"29 Feb"` |
| 11 | `"year"` | `"2000"` |
| 12 | `"month"` | `"February"` |
| 13 | `"day"` | `"29"` |
| 14 | `"year.mn.day"` | `"2000/02/29"` |
| 15 | `"y.mn.day"` | `"00/02/29"` |
| 16 | `"year_week"` | `"2000-W09"` |
| 17 | `"year_quarter"` | `"2000-Q1"` |
The time styles can also handle localization to any supported locale. The following table
provides a listing of all time styles and their output values (corresponding to an input time of
`2000-02-29 14:35:00`).
| | Time Style | Output | Notes |
|----|---------------|---------------------------------|---------------|
| 1 | `"iso"` | `"14:35:00"` | ISO 8601, 24h |
| 2 | `"iso-short"` | `"14:35"` | ISO 8601, 24h |
| 3 | `"h_m_s_p"` | `"2:35:00 PM"` | 12h |
| 4 | `"h_m_p"` | `"2:35 PM"` | 12h |
| 5 | `"h_p"` | `"2 PM"` | 12h |
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use the `exibble` dataset to create a simple, two-column table (keeping only the `date`
and `time` columns). With the `fmt_datetime()` method, we'll format the `date` column as a
datetime, using the `"month_day_year"` date style for the date part and the `"h_m_s_p"` time
style for the time part.
```{python}
from great_tables import GT, exibble
exibble_mini = exibble[["date", "time"]]
(
GT(exibble_mini)
.fmt_datetime(
columns="date",
date_style="month_day_year",
time_style="h_m_s_p"
)
)
```
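The interplay between `sep=` and `format_str=` can be sketched with the standard library's `strftime()`. In this hypothetical helper, `style_datetime_sketch()`, the date and time parts use the `"iso"` styles and are joined by `sep`, unless a `strftime()` format string is given, in which case `sep` is ignored. This mirrors the parameter descriptions but is not the library's code.

```python
from datetime import datetime


def style_datetime_sketch(moment, sep=" ", format_str=None):
    """Illustrative only: a hypothetical helper, not part of great_tables."""
    if format_str is not None:
        # A format string takes over completely; `sep` plays no role.
        return moment.strftime(format_str)
    date_part = moment.strftime("%Y-%m-%d")  # the "iso" date style
    time_part = moment.strftime("%H:%M:%S")  # the "iso" time style
    return f"{date_part}{sep}{time_part}"


moment = datetime(2000, 2, 29, 14, 35, 0)
print(style_datetime_sketch(moment))
print(style_datetime_sketch(moment, sep="T"))
print(style_datetime_sketch(moment, sep="T", format_str="%d/%m/%Y %H:%M"))
```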
fmt_duration(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, input_units: 'str | None' = None, output_units: 'str | list[str] | None' = None, duration_style: 'DurationStyle' = 'narrow', trim_zero_units: 'bool | list[str]' = True, max_output_units: 'int | None' = None, pattern: 'str' = '{x}', use_seps: 'bool' = True, sep_mark: 'str' = ',', force_sign: 'bool' = False, locale: 'str | None' = None) -> 'GTSelf'
Format numeric or duration values as styled time duration strings.
Format input values to time duration values whether those input values are numbers or of the
`timedelta` class. We can specify which time units any numeric input values have (as weeks,
days, hours, minutes, or seconds) and the output can be customized with a duration style
(corresponding to narrow, wide, colon-separated, and ISO forms) and a choice of output units
ranging from weeks to seconds.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
input_units
If one or more selected columns contains numeric values (not `timedelta` values, which
contain the duration units), a keyword must be provided for `input_units` for the values to
be interpreted in terms of duration. The accepted units are: `"seconds"`, `"minutes"`,
`"hours"`, `"days"`, and `"weeks"`. This is required for numeric columns and ignored for
`timedelta` columns.
output_units
Controls the output time units. The default (`None`) means that output units will be
automatically chosen based on the input duration value. To control which time units are to
be considered for output (before trimming with `trim_zero_units=`) we can specify a list of
one or more of the following keywords: `"weeks"`, `"days"`, `"hours"`, `"minutes"`, or
`"seconds"`.
duration_style
A choice of four formatting styles for the output duration values. With `"narrow"` (the
default style), duration values will be formatted with single-letter time-part units (e.g.,
1.35 days will be styled as `"1d 8h 24m"`). With `"wide"`, this example value will be
expanded to `"1 day 8 hours 24 minutes"` after formatting. The `"colon-sep"` style will put
days, hours, minutes, and seconds in the `"([D]/)[HH]:[MM]:[SS]"` format. The `"iso"` style
will produce a value that conforms to the ISO 8601 rules for duration values (e.g., 1.35
days will become `"P1DT8H24M"`).
trim_zero_units
Provides methods to remove output time units that have zero values. By default this is
`True` and duration values that might otherwise be formatted as `"0w 1d 0h 4m 19s"` with
`trim_zero_units=False` are instead displayed as `"1d 4m 19s"`. Aside from using
`True`/`False` we could provide a list of keywords for more precise control. These keywords
are: (1) `"leading"`, to omit all leading zero-value time units (e.g., `"0w 1d"` ->
`"1d"`), (2) `"trailing"`, to omit all trailing zero-value time units (e.g., `"3d 5h 0s"`
-> `"3d 5h"`), and (3) `"internal"`, which removes all internal zero-value time units
(e.g., `"5d 0h 33m"` -> `"5d 33m"`).
max_output_units
If `output_units` is `None`, where the output time units are unspecified and left to be
handled automatically, a numeric value provided for `max_output_units=` will be taken as the
maximum number of time units to display in all output time duration values. By default, this
is `None` and all possible time units will be displayed. This option has no effect when
`duration_style="colon-sep"` (only `output_units` can be used to customize that type of
duration output).
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
use_seps
The `use_seps` option allows for the use of digit group separators. The type of digit group
separator is set by `sep_mark` and overridden if a locale ID is provided to `locale`. This
setting is `True` by default.
sep_mark
The string to use as a separator between groups of digits. For example, using `sep_mark=","`
with a value of `1000` would result in a formatted value of `"1,000"`. This argument is
ignored if a `locale` is supplied (i.e., is not `None`).
force_sign
Should the positive sign be shown for positive values (effectively showing a sign for all
values except zero)? If so, use `True` for this option. The default is `False`, where only
negative numbers will display a minus sign.
locale
An optional locale identifier that can be used for formatting values according the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Output units for the colon-separated duration style
---------------------------------------------------
The colon-separated duration style (enabled when `duration_style="colon-sep"`) is essentially a
clock-based output format which uses the display logic of chronograph watch functionality. It
will, by default, display duration values in the `(D/)HH:MM:SS` format. Any duration values
greater than or equal to 24 hours will have the number of days prepended with an adjoining slash
mark. While this output format is versatile, it can be changed somewhat with the `output_units=`
option. The following combinations of output units are permitted:
- `["minutes", "seconds"]` -> `MM:SS`
- `["hours", "minutes"]` -> `HH:MM`
- `["hours", "minutes", "seconds"]` -> `HH:MM:SS`
- `["days", "hours", "minutes"]` -> `(D/)HH:MM`
Any other specialized combinations will result in the default set being used, which is
`["days", "hours", "minutes", "seconds"]`.
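As a rough illustration of the default `(D/)HH:MM:SS` layout, the sketch below breaks a non-negative number of seconds into days, hours, minutes, and seconds, and only prepends the day count (with its slash) when the duration is 24 hours or more. The helper `colon_sep_sketch()` is hypothetical and does not attempt to cover negative values or the alternative `output_units=` combinations.

```python
def colon_sep_sketch(total_seconds):
    """Illustrative only: a hypothetical helper for non-negative durations."""
    days, remainder = divmod(int(total_seconds), 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    clock = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    # The day count and slash appear only for durations of 24 hours or more.
    return f"{days}/{clock}" if days else clock


print(colon_sep_sketch(7377))
print(colon_sep_sketch(90061))
```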
Compatibility of formatting function with data values
-----------------------------------------------------
`fmt_duration()` is compatible with body cells that are of `int`, `float`, or
`datetime.timedelta` types. Any other types of body cells are ignored during formatting.
Examples
--------
Let's create a table with duration values in seconds and format them using the default narrow
style. This produces compact output with single-letter unit abbreviations, ideal for
space-constrained displays.
```{python}
import pandas as pd
from great_tables import GT
df = pd.DataFrame({"duration_s": [3661, 86400, 172800, 60, 0]})
(
GT(df)
.fmt_duration(columns="duration_s", input_units="seconds")
)
```
Notice that zero-valued time units are automatically trimmed from the output, keeping the
display clean. A value of `86400` seconds (exactly 1 day) simply shows `"1d"` rather than
`"0w 1d 0h 0m 0s"`.
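The trimming behavior can be sketched in plain Python. The hypothetical helper `narrow_duration_sketch()` below splits a non-negative number of seconds into weeks, days, hours, minutes, and seconds, and then optionally drops every zero-valued unit. Showing `"0s"` for a zero duration is an assumption of this sketch, not a statement about the library's output.

```python
# Unit suffixes paired with their length in seconds, largest first.
UNITS = [("w", 604800), ("d", 86400), ("h", 3600), ("m", 60), ("s", 1)]


def narrow_duration_sketch(total_seconds, trim_zero_units=True):
    """Illustrative only: a hypothetical helper for non-negative durations."""
    remaining = int(total_seconds)
    parts = []
    for suffix, size in UNITS:
        count, remaining = divmod(remaining, size)
        parts.append((count, suffix))
    if trim_zero_units:
        # Assumption: fall back to "0s" when every unit is zero.
        parts = [p for p in parts if p[0] != 0] or [parts[-1]]
    return " ".join(f"{count}{suffix}" for count, suffix in parts)


print(narrow_duration_sketch(86400))
print(narrow_duration_sketch(86400, trim_zero_units=False))
print(narrow_duration_sketch(116640))  # 1.35 days expressed in seconds
```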
For reporting contexts where readability is more important than compactness, the wide style
spells out the full unit names with proper singular/plural forms.
```{python}
df = pd.DataFrame({"hours": [1.5, 24.0, 0.5, 100.75]})
(
GT(df)
.fmt_duration(columns="hours", input_units="hours", duration_style="wide")
)
```
The colon-separated style is useful for timing data, race results, or any context where a
clock-like display is expected. Days are shown with a slash prefix when the duration is 24 hours
or more.
```{python}
df = pd.DataFrame({
"event": ["Marathon", "Half Marathon", "10K", "Mile"],
"winning_time_s": [7377, 3542, 1620, 233],
})
(
GT(df)
.fmt_duration(
columns="winning_time_s",
input_units="seconds",
duration_style="colon-sep",
output_units=["hours", "minutes", "seconds"],
)
)
```
The output is zero-padded in the familiar `HH:MM:SS` format. By specifying `output_units` we
control exactly which components appear in the colon-separated output.
When working with `timedelta` columns (common in Pandas when computing differences between
timestamps), `fmt_duration()` automatically detects the units, so no `input_units` argument is
needed.
```{python}
from datetime import datetime
events = pd.DataFrame({
"task": ["Build", "Test suite", "Deploy", "Full pipeline"],
"elapsed": [
datetime(2024, 1, 1, 0, 12, 45) - datetime(2024, 1, 1, 0, 0, 0),
datetime(2024, 1, 1, 1, 5, 30) - datetime(2024, 1, 1, 0, 0, 0),
datetime(2024, 1, 1, 0, 3, 15) - datetime(2024, 1, 1, 0, 0, 0),
datetime(2024, 1, 1, 1, 21, 30) - datetime(2024, 1, 1, 0, 0, 0),
],
})
(
GT(events, rowname_col="task")
.fmt_duration(columns="elapsed", duration_style="narrow")
)
```
Polars DataFrames work the same way. Here we format numeric duration values using the ISO 8601
duration style, which is useful for machine-readable output or standards-compliant reporting.
```{python}
import polars as pl
from great_tables import GT
df = pl.DataFrame({"activity": ["Flight", "Layover", "Drive"], "seconds": [14400, 5400, 1830]})
(
GT(df)
.fmt_duration(columns="seconds", input_units="seconds", duration_style="iso")
)
```
Polars also has native `Duration` dtype columns (created via temporal arithmetic or
`timedelta` values). These are handled automatically without needing to specify `input_units`.
```{python}
from datetime import timedelta
df = pl.DataFrame({
"segment": ["Warm-up", "Main set", "Cool-down"],
"duration": [timedelta(minutes=10), timedelta(minutes=45, seconds=30), timedelta(minutes=5)],
})
(
GT(df)
.fmt_duration(columns="duration", duration_style="wide")
)
```
fmt_tf(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, tf_style: 'str' = 'true-false', pattern: 'str' = '{x}', true_val: 'str | None' = None, false_val: 'str | None' = None, na_val: 'str | None' = None, colors: 'list[str] | None' = None) -> 'GTSelf'
Format True and False values.
There can be times where boolean values are useful in a display table. You might want to express
a 'yes' or 'no', a 'true' or 'false', or, perhaps use pairings of complementary symbols that
make sense in a table. The `fmt_tf()` method has a set of `tf_style=` presets that can be used
to quickly map `True`/`False` values to strings or to symbols such as up/down or left/right
arrows and open/closed shapes.
While the presets are nice, you can provide your own mappings through the `true_val=` and
`false_val=` arguments. For extra customization, you can also apply color to the individual
`True`, `False`, and NA mappings. Just supply a list of colors (up to a length of 3) to the
`colors=` argument.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
tf_style
The `True`/`False` mapping style to use. By default this is the short name `"true-false"`
which corresponds to the words `"true"` and `"false"`. Two other `tf_style=` values produce
words: `"yes-no"` and `"up-down"`. The remaining options involve pairs of symbols (e.g.,
`"check-mark"` displays a check mark for `True` and an ✗ symbol for `False`).
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
true_val
While the choice of a `tf_style=` will typically supply the `true_val=` and `false_val=`
text, we could override this and supply text for any `True` values. This doesn't need to be
used in conjunction with `false_val=`.
false_val
While the choice of a `tf_style=` will typically supply the `true_val=` and `false_val=`
text, we could override this and supply text for any `False` values. This doesn't need to be
used in conjunction with `true_val=`.
na_val
None of the `tf_style` presets will replace any missing values encountered in the targeted
cells. While we always have the option to use `sub_missing()` for NA replacement, we have
the opportunity to handle missing values here with the `na_val=` option. This is useful because
we also have the means to add color to the `na_val=` text or symbol and doing that requires
that a replacement value for NAs is specified here.
colors
Providing a list of color values to `colors=` will progressively add color to the formatted
result depending on the number of colors provided. With a single color, all formatted values
will be in that color. Using two colors results in `True` values being the first color, and
`False` values receiving the second. With the three-color option, the final color will be
given to any missing values replaced through `na_val=`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Formatting with the `tf_style=` argument
----------------------------------------
We need to supply a preset `tf_style=` value. The following table provides a listing of all
`tf_style=` values and their output `True` and `False` values.
| | TF Style | Output |
|----|-----------------|-------------------------|
| 1 | `"true-false"` | `"true"` / `"false"` |
| 2 | `"yes-no"` | `"yes"` / `"no"` |
| 3 | `"up-down"` | `"up"` / `"down"` |
| 4 | `"check-mark"` | `"✓"` / `"✗"` |
| 5 | `"circles"` | `"●"` / `"○"` |
| 6 | `"squares"` | `"■"` / `"□"` |
| 7 | `"diamonds"` | `"◆"` / `"◇"` |
| 8 | `"arrows"` | `"↑"` / `"↓"` |
| 9 | `"triangles"` | `"▲"` / `"▼"` |
| 10 | `"triangles-lr"`| `"▶"` / `"◀"` |
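To make the interplay of `tf_style=`, `true_val=`, `false_val=`, `na_val=`, and `colors=` concrete, here is a small pure-Python model of the behavior described above. It is only a sketch: `TF_STYLES`, `tf_text()`, and `tf_color()` are hypothetical names made up for illustration, and the handling of missing values when fewer than three colors are given is an assumption rather than something this page states.

```python
# Illustrative model of the documented `fmt_tf()` rules; this is not the library's code.
TF_STYLES = {
    "true-false": ("true", "false"),
    "yes-no": ("yes", "no"),
    "up-down": ("up", "down"),
    "check-mark": ("✓", "✗"),
    "circles": ("●", "○"),
    "squares": ("■", "□"),
    "diamonds": ("◆", "◇"),
    "arrows": ("↑", "↓"),
    "triangles": ("▲", "▼"),
    "triangles-lr": ("▶", "◀"),
}


def tf_text(value, tf_style="true-false", true_val=None, false_val=None, na_val=None):
    """Return the display text for one cell (hypothetical helper)."""
    preset_true, preset_false = TF_STYLES[tf_style]
    if value is None:
        # Presets never replace missing values; only `na_val=` does.
        return na_val
    if value:
        return true_val if true_val is not None else preset_true
    return false_val if false_val is not None else preset_false


def tf_color(value, colors):
    """Pick a color following the one-, two-, and three-color rules (hypothetical helper)."""
    if not colors:
        return None
    if len(colors) == 1:
        # A single color applies to every formatted value.
        return colors[0]
    if value is None:
        # Assumption: missing values are only colored when a third color exists.
        return colors[2] if len(colors) >= 3 else None
    return colors[0] if value else colors[1]
```

With these helpers, `tf_text(True, "arrows")` gives the upward arrow and `tf_color(False, ["green", "red"])` gives `"red"`, which mirrors the green/red arrows example in this section.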
Examples
--------
Let's use a subset of the `sp500` dataset to create a small table containing opening and closing
price data for the last few days in 2015. We added a boolean column (`dir`) where `True`
indicates a price increase from opening to closing and `False` is the opposite. Using `fmt_tf()`
generates up and down arrows in the `dir` column. We elect to use green upward arrows and red
downward arrows (through the `colors=` option).
```{python}
from great_tables import GT
from great_tables.data import sp500
import polars as pl
sp500_mini = (
pl.from_pandas(sp500)
.slice(0, 5)
.drop(["volume", "adj_close", "high", "low"])
.with_columns(dir = pl.col("close") > pl.col("open"))
)
(
GT(sp500_mini, rowname_col="date")
.fmt_tf(columns="dir", tf_style="arrows", colors=["green", "red"])
.fmt_currency(columns=["open", "close"])
.cols_label(
open="Opening",
close="Closing",
dir=""
)
)
```
fmt_markdown(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None) -> 'GTSelf'
Format Markdown text.
Any Markdown-formatted text in the incoming cells will be transformed during render when using
the `fmt_markdown()` method.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let’s first create a DataFrame containing some text that is Markdown-formatted and then introduce
that to [`GT()`](`great_tables.GT`). We’ll then transform the `md` column with the
`fmt_markdown()` method.
```{python}
import pandas as pd
from great_tables import GT
text_1 = """
### This is Markdown.
Markdown’s syntax is comprised entirely of
punctuation characters, which punctuation
characters have been carefully chosen so as
to look like what they mean... assuming
you’ve ever used email.
"""
text_2 = """
Info on Markdown syntax can be found
[here](https://daringfireball.net/projects/markdown/).
"""
df = pd.DataFrame({"md": [text_1, text_2]})
(GT(df).fmt_markdown("md"))
```
fmt_units(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, pattern: 'str' = '{x}') -> 'GTSelf'
Format measurement units.
The `fmt_units()` method lets you better format measurement units in the table body. These must
conform to the **Great Tables** *units notation*; as an example of this, `"J Hz^-1 mol^-1"` can
be used to generate units for the *molar Planck constant*. The notation here provides several
conveniences for defining units, so as long as the values to be formatted conform to this
syntax, you'll obtain nicely-formatted inline units. Details pertaining to *units notation* can
be found in the section entitled *How to use units notation*.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
pattern
A formatting pattern that allows for decoration of the formatted value. The formatted value
is represented by the `{x}` (which can be used multiple times, if needed) and all other
characters will be interpreted as string literals.
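As a rough illustration of how a `pattern=` string decorates an already-formatted value, the following sketch assumes plain text substitution of the `{x}` marker. The helper name `apply_pattern()` is hypothetical and the library's internals may differ.

```python
def apply_pattern(formatted: str, pattern: str = "{x}") -> str:
    """Decorate an already-formatted value (hypothetical helper)."""
    # Every occurrence of the literal marker "{x}" is replaced with the formatted value;
    # all other characters in the pattern are kept as string literals.
    return pattern.replace("{x}", formatted)


bracketed = apply_pattern("m/s", "[{x}]")
repeated = apply_pattern("5", "{x} of {x}")
```

Here `bracketed` is `"[m/s]"` and `repeated` is `"5 of 5"`, showing that `{x}` may appear more than once.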
How to use units notation
-------------------------
The **Great Tables** units notation involves a shorthand of writing units that feels familiar
and is fine-tuned for the task at hand. Each unit is treated as a separate entity (parentheses
and other symbols included) and the addition of subscript text and exponents is flexible and
relatively easy to formulate. This is all best shown with examples:
- `"m/s"` and `"m / s"` both render as `"m/s"`
- `"m s^-1"` will appear with the `"-1"` exponent intact
- `"m /s"` gives the same result, as `"/"` is equivalent to `"^-1"`
- `"E_h"` will render an `"E"` with the `"h"` subscript
- `"t_i^2.5"` provides a `t` with an `"i"` subscript and a `"2.5"` exponent
- `"m[_0^2]"` will use overstriking to set both scripts vertically
- `"g/L %C6H12O6%"` uses a chemical formula (enclosed in a pair of `"%"` characters) as a unit
partial, and the formula will render correctly with subscripted numbers
- Common units that are difficult to write using ASCII text may be implicitly converted to the
correct characters (e.g., the `"u"` in `"ug"`, `"um"`, `"uL"`, and `"umol"` will be converted to
the Greek *mu* symbol; `"degC"` and `"degF"` will render a degree sign before the temperature
unit)
- We can transform shorthand symbol/unit names enclosed in `":"` (e.g., `":angstrom:"`,
`":ohm:"`, etc.) into proper symbols
- Greek letters can be added by enclosing the letter name in `":"`; you can use lowercase letters
(e.g., `":beta:"`, `":sigma:"`, etc.) and uppercase letters too (e.g., `":Alpha:"`, `":Zeta:"`,
etc.)
- The components of a unit (unit name, subscript, and exponent) can be fully or partially
italicized/emboldened by surrounding text with `"*"` or `"**"`
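The first few rules in the list above (whitespace-separated units, `_` for subscripts, `^` for exponents, and a leading `/` standing in for `^-1`) can be modeled with a few lines of Python. This is a deliberately simplified sketch and not the parser used by **Great Tables**: `parse_unit()` and `parse_units()` are hypothetical names, and bracketed scripts, chemical formulas, symbol shorthands, and styling markers are not handled.

```python
import re
from typing import Dict, List, Optional

# One unit token: a name, then an optional "_subscript" and an optional "^exponent".
UNIT_RE = re.compile(r"^(?P<name>[^_^]+)(?:_(?P<sub>[^_^]+))?(?:\^(?P<exp>[^_^]+))?$")


def parse_unit(token: str) -> Dict[str, Optional[str]]:
    """Split one whitespace-delimited unit into name, subscript, and exponent."""
    inverted = len(token) > 1 and token.startswith("/")
    body = token[1:] if inverted else token
    match = UNIT_RE.match(body)
    if match is None:
        raise ValueError(f"unsupported unit token: {token!r}")
    parts = match.groupdict()
    if inverted:
        if parts["exp"] is not None:
            raise ValueError("this sketch does not combine '/' with an explicit exponent")
        # A leading "/" is shorthand for an exponent of -1.
        parts["exp"] = "-1"
    return parts


def parse_units(notation: str) -> List[Dict[str, Optional[str]]]:
    """Parse a whole units string, treating each unit as a separate entity."""
    return [parse_unit(token) for token in notation.split()]
```

Under this model, `parse_units("m s^-1")` and `parse_units("m /s")` produce identical results, and `parse_unit("t_i^2.5")` separates the `"i"` subscript from the `"2.5"` exponent.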
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use the `illness` dataset and create a new table. The `units` column happens to contain
string values in *units notation* (e.g., `"x10^9 / L"`). Using the `fmt_units()` method here
will improve the formatting of those measurement units.
```{python}
from great_tables import GT, style, loc
from great_tables.data import illness
(
GT(illness, rowname_col="test")
.fmt_units(columns="units")
.fmt_number(columns=lambda x: x.startswith("day"), decimals=2, drop_trailing_zeros=True)
.tab_header(title="Laboratory Findings for the YF Patient")
.tab_spanner(label="Day", columns=lambda x: x.startswith("day"))
.tab_spanner(label="Normal Range", columns=lambda x: x.startswith("norm"))
.cols_label(
norm_l="Lower",
norm_u="Upper",
units="Units"
)
.opt_vertical_padding(scale=0.4)
.opt_align_table_header(align="left")
.tab_options(heading_padding="10px")
.tab_style(
locations=loc.body(columns="norm_l"),
style=style.borders(sides="left")
)
.opt_vertical_padding(scale=0.5)
)
```
The `constants` dataset contains values for hundreds of fundamental physical constants. We'll
take a subset of values that have some molar basis and generate a new display table from that.
Like the `illness` dataset, this one has a `units` column so, again, the `fmt_units()` method
will be used to format those units. Here, the preference for typesetting measurement units is to
have positive and negative exponents (e.g., not `"<unit_1> / <unit_2>"` but rather
`"<unit_1> <unit_2>^-1"`).
```{python}
from great_tables.data import constants
import polars as pl
import polars.selectors as cs
constants_mini = (
pl.from_pandas(constants)
.filter(pl.col("name").str.contains("molar")).sort("value")
.with_columns(
name=pl.col("name")
.str.to_titlecase()
.str.replace("Kpa", "kpa")
.str.replace("Of", "of")
)
)
(
GT(constants_mini)
.cols_hide(columns=["uncert", "sf_value", "sf_uncert"])
.fmt_units(columns="units")
.fmt_scientific(columns="value", decimals=3)
.tab_header(title="Physical Constants Having a Molar Basis")
.tab_options(column_labels_hidden=True)
)
```
fmt_image(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, height: 'str | int | None' = None, width: 'str | int | None' = None, sep: 'str' = ' ', path: 'str | Path | None' = None, file_pattern: 'str' = '{}', encode: 'bool' = True) -> 'GTSelf'
Format image paths to generate images in cells.
To more easily insert graphics into body cells, we can use the `fmt_image()` method. This allows
for one or more images to be placed in the targeted cells. The cells need to contain some
reference to an image file, either: (1) local paths to the files; (2) complete http/https URLs to the
files; (3) the file names, where a common path can be provided via `path=`; or (4) a fragment of
the file name, where the `file_pattern=` argument helps to compose the entire file name and
`path=` provides the path information. This should be expressly used on columns that contain
*only* references to image files (i.e., no image references as part of a larger block of text).
Multiple images can be included per cell by separating image references by commas. The `sep=`
argument allows for a common separator to be applied between images.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
height
The height of the rendered images.
width
The width of the rendered images.
sep
In the output of images within a body cell, `sep=` provides the separator between each
image.
path
An optional path to local image files or an HTTP/HTTPS URL.
This is combined with the filenames to form the complete image paths.
file_pattern
The pattern to use for mapping input values in the body cells to the names of the graphics
files. The string supplied should use `"{}"` in the pattern to map filename fragments to
input strings.
encode
The option to always use Base64 encoding for image paths that are determined to be local. By
default, this is `True`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using a small portion of the `metro` dataset, let's create a new table. We will only include a few
columns and rows from that table. The `lines` column has comma-separated listings of numbers
corresponding to lines served at each station. We have a directory of SVG graphics for all of
these lines in the package (the path for the image directory can be accessed via
`files("great_tables") / "data/metro_images"`, using the `importlib_resources` package). The
filenames roughly correspond to the data in the `lines` column. The `fmt_image()` method can
be used with these inputs since the `path=` and `file_pattern=` arguments allow us to compose
complete and valid file locations. What you get from this are sequences of images in the table
cells, taken from the referenced graphics files on disk.
```{python}
from great_tables import GT
from great_tables.data import metro
from importlib_resources import files
img_paths = files("great_tables") / "data/metro_images"
metro_mini = metro[["name", "lines", "passengers"]].head(5)
(
GT(metro_mini)
.fmt_image(
columns="lines",
path=img_paths,
file_pattern="metro_{}.svg"
)
.fmt_integer(columns="passengers")
)
```
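To see how `path=` and `file_pattern=` might combine with comma-separated cell values to form complete file locations, consider the following pure-Python sketch. The helper `compose_image_refs()` is hypothetical, and both the whitespace trimming and the `/` joining are assumptions made for illustration; the method itself may resolve paths differently.

```python
from typing import List, Optional


def compose_image_refs(
    cell: str, path: Optional[str] = None, file_pattern: str = "{}"
) -> List[str]:
    """Expand one cell value into a list of image locations (hypothetical helper)."""
    refs = []
    for fragment in cell.split(","):
        # Assumption: whitespace around each comma-separated fragment is ignored.
        filename = file_pattern.format(fragment.strip())
        if path is None:
            refs.append(filename)
        else:
            # Plain joining with "/" covers directory paths and http/https prefixes alike.
            refs.append(path.rstrip("/") + "/" + filename)
    return refs


refs = compose_image_refs("1, 3, 14", path="data/metro_images", file_pattern="metro_{}.svg")
```

In this sketch, `refs` holds three entries, the first being `"data/metro_images/metro_1.svg"`.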
fmt_flag(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, height: 'str | int | float | None' = '1em', sep: 'str' = ' ', use_title: 'bool' = True) -> 'GTSelf'
Generate flag icons for countries from their country codes.
While it is fairly straightforward to insert images into body cells (using `fmt_image()` is one
way to do it), there is often the need to incorporate specialized types of graphics within a table.
One such group of graphics involves iconography representing different countries, and the
`fmt_flag()` method helps with inserting a flag icon (or multiple) in body cells. To make this
work seamlessly, the input cells need to contain some reference to a country, and this can be in
the form of a 2- or 3-letter ISO 3166-1 country code (e.g., Egypt has the `"EG"` country code).
This method will parse the targeted body cells for those codes and insert the appropriate flag
graphics.
Multiple flags can be included per cell by separating country codes with commas (e.g.,
`"GB,TT"`). The `sep=` argument allows for a common separator to be applied between flag icons.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
height
The height of the flag icons. The default value is `"1em"`. If given as a number, it is
assumed to be in pixels.
sep
In the output of multiple flag icons within a body cell, `sep=` provides the separator
between each of the flag icons.
use_title
The option to include a title attribute with the country name when hovering over the flag
icon. The default is `True`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use the `countrypops` dataset to create a new table with flag icons. We will only include
a few columns and rows from that table. The `country_code_2` column has 2-letter country codes
in the format required for `fmt_flag()` and using that method transforms the codes to circular
flag icons.
```{python}
from great_tables import GT
from great_tables.data import countrypops
import polars as pl
countrypops_mini = (
pl.from_pandas(countrypops)
.filter(pl.col("year") == 2021)
.filter(pl.col("country_name").str.starts_with("S"))
.sort("country_name")
.head(10)
.drop(["year", "country_code_3"])
)
(
GT(countrypops_mini)
.fmt_integer(columns="population")
.fmt_flag(columns="country_code_2")
.cols_label(
country_code_2="",
country_name="Country",
population="Population (2021)"
)
.cols_move_to_start(columns="country_code_2")
)
```
Here's another example (again using `countrypops`) where we generate a table providing
populations every ten years for the Benelux countries (`"BEL"`, `"NLD"`, and `"LUX"`). After
some filtering and a pivot, the `fmt_flag()` method is used to obtain flag icons from 3-letter
country codes present in the `country_code_3` column.
```{python}
import polars.selectors as cs
countrypops_mini = (
pl.from_pandas(countrypops)
.filter(pl.col("country_code_3").is_in(["BEL", "NLD", "LUX"]))
.filter((pl.col("year") % 10 == 0) & (pl.col("year") >= 1960))
.pivot("year", index = ["country_code_3", "country_name"], values="population")
)
(
GT(countrypops_mini)
.tab_header(title="Populations of the Benelux Countries")
.tab_spanner(label="Year", columns=cs.numeric())
.fmt_integer(columns=cs.numeric())
.fmt_flag(columns="country_code_3")
.cols_label(
country_code_3="",
country_name="Country"
)
)
```
fmt_icon(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, height: 'str | None' = None, sep: 'str' = ' ', stroke_color: 'str | None' = None, stroke_width: 'str | int | None' = None, stroke_alpha: 'float | None' = None, fill_color: 'str | dict[str, str] | None' = None, fill_alpha: 'float | None' = None, margin_left: 'str | None' = None, margin_right: 'str | None' = None) -> 'GTSelf'
Use icons within a table's body cells.
We can draw from a library of thousands of icons and selectively insert them into a table. The
`fmt_icon()` method makes this possible by mapping input cell labels to an icon name. We are
exclusively using Font Awesome icons here so the reference is the short icon name. Multiple
icons can be included per cell by separating icon names with commas (e.g.,
`"hard-drive,clock"`). The `sep=` argument allows for a common separator to be applied between
icons.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
height
The absolute height of the icon in the table cell. By default, this is set to `"1em"`.
sep
In the output of icons within a body cell, `sep=` provides the separator between each icon.
stroke_color
The icon stroke is essentially the outline of the icon. The color of the stroke can be
modified by applying a single color here. If not provided then the default value of
`"currentColor"` is applied so that the stroke color matches that of the parent HTML
element's color attribute.
stroke_width
The `stroke_width=` option allows for setting the width of the icon outline stroke. By
default, the stroke width is very small at `"1px"` so a size adjustment here can sometimes be
useful. If an integer value is provided then it is assumed to be in pixels.
stroke_alpha
The level of transparency for the icon stroke can be controlled with a decimal value between
`0` and `1`.
fill_color
The fill color of the icon can be set with `fill_color=`; providing a single color here will
change the color of the fill but not of the icon's 'stroke' or outline (use `stroke_color=`
to modify that). A dictionary comprising the icon names with corresponding fill colors can
alternatively be used here (e.g., `{"circle-check": "green", "circle-xmark": "red"}`). If
nothing is provided then the default value of `"currentColor"` is applied so that the fill
matches the color of the parent HTML element's color attribute.
fill_alpha
The level of transparency for the icon fill can be controlled with a decimal value between
`0` and `1`.
margin_left
The length value for the margin that's to the left of the icon. By default, `"auto"` is
used for this but if space is needed on the left-hand side then a length of `"0.2em"` is
recommended as a starting point.
margin_right
The length value for the margin right of the icon. By default, `"auto"` is used but if
space is needed on the right-hand side then a length of `"0.2em"` is recommended as a
starting point.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
For this first example of generating icons with `fmt_icon()`, let's make a simple DataFrame that
has two columns of Font Awesome icon names. We separate multiple icons per cell with commas. By
default, the icons are 1 em in height; we're going to make the icons slightly larger here (so we
can see the fine details of them) by setting `height="4em"`.
```{python}
import pandas as pd
from great_tables import GT
animals_foods_df = pd.DataFrame(
{
"animals": ["hippo", "fish,spider", "mosquito,locust,frog", "dog,cat", "kiwi-bird"],
"foods": ["bowl-rice", "egg,pizza-slice", "burger,lemon,cheese", "carrot,hotdog", "bacon"],
}
)
(
GT(animals_foods_df)
.fmt_icon(
columns=["animals", "foods"],
height="4em"
)
.cols_align(
align="center",
columns=["animals", "foods"]
)
)
```
Let's take a few rows from the `towny` dataset and make it so the `csd_type` column contains
*Font Awesome* icon names (we want only the `"city"` and `"house-chimney"` icons here). After
using `fmt_icon()` to format the `csd_type` column, we get icons that are representative of the
two categories of municipality for this subset of data.
```{python}
import polars as pl
from great_tables.data import towny
towny_mini = (
pl.from_pandas(towny.loc[[323, 14, 26, 235]])
.select(["name", "csd_type", "population_2021"])
.with_columns(
csd_type = pl.when(pl.col("csd_type") == "town")
.then(pl.lit("house-chimney"))
.otherwise(pl.lit("city"))
)
)
(
GT(towny_mini)
.fmt_integer(columns="population_2021")
.fmt_icon(columns="csd_type")
.cols_label(
csd_type="",
name="City/Town",
population_2021="Population"
)
)
```
A fairly common thing to do with icons in tables is to indicate whether a quantity is either
higher or lower than another. Up and down arrow symbols can serve as good visual indicators for
this purpose. We can make use of the `"arrow-up"` and `"arrow-down"` icons here. As those
strings are available in the `dir` column of the table derived from the `sp500` dataset,
`fmt_icon()` can be used. We set the `fill_color` argument with a dictionary that indicates
which color should be used for each icon.
```{python}
from great_tables.data import sp500
sp500_mini = (
pl.from_pandas(sp500)
.head(10)
.select(["date", "open", "close"])
.sort("date", descending=False)
.with_columns(
dir = pl.when(pl.col("close") >= pl.col("open")).then(
pl.lit("arrow-up")).otherwise(pl.lit("arrow-down"))
)
)
(
GT(sp500_mini, rowname_col="date")
.fmt_icon(
columns="dir",
fill_color={"arrow-up": "green", "arrow-down": "red"}
)
.cols_label(
open="Opening Value",
close="Closing Value",
dir=""
)
.opt_stylize(style=1, color="gray")
)
```
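The two accepted forms of `fill_color=` (a single color or a dictionary keyed by icon name) can be pictured with the following sketch. The helpers `resolve_fill()` and `resolve_cell()` are hypothetical, and the fallback to `"currentColor"` for icons that are absent from the dictionary is an assumption, not documented behavior.

```python
from typing import Dict, List, Tuple, Union


def resolve_fill(icon: str, fill_color: Union[str, Dict[str, str], None] = None) -> str:
    """Return the fill color for one icon name (hypothetical helper)."""
    if fill_color is None:
        # Nothing provided: inherit the color of the parent HTML element.
        return "currentColor"
    if isinstance(fill_color, dict):
        # Assumption: icons missing from the dictionary fall back to the default.
        return fill_color.get(icon, "currentColor")
    return fill_color


def resolve_cell(
    cell: str, fill_color: Union[str, Dict[str, str], None] = None
) -> List[Tuple[str, str]]:
    """Pair every comma-separated icon name in a cell with its fill color."""
    names = [name.strip() for name in cell.split(",")]
    return [(name, resolve_fill(name, fill_color)) for name in names]


arrow_colors = {"arrow-up": "green", "arrow-down": "red"}
up_fill = resolve_fill("arrow-up", arrow_colors)
```

Here `up_fill` is `"green"`, matching the dictionary passed to `fill_color=` in the example above.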
fmt_nanoplot(self: 'GTSelf', columns: 'str | None' = None, rows: 'int | list[int] | None' = None, plot_type: 'PlotType' = 'line', plot_height: 'str' = '2em', missing_vals: 'MissingVals' = 'gap', autoscale: 'bool' = False, reference_line: 'str | int | float | None' = None, reference_area: 'list[Any] | None' = None, expand_x: 'list[int] | list[float] | list[int | float] | None' = None, expand_y: 'list[int] | list[float] | list[int | float] | None' = None, options: 'dict[str, Any] | None' = None) -> 'GTSelf'
Format data for nanoplot visualizations.
The `fmt_nanoplot()` method is used to format data for nanoplot visualizations. This method
allows for the creation of two different plot types: line plots and bar plots.
:::{.callout-warning}
`fmt_nanoplot()` is still experimental.
:::
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in targeted columns being
formatted. Alternatively, we can supply a list of row indices.
plot_type
Nanoplots can either take the form of a line plot (using `"line"`) or a bar plot (with
`"bar"`). A line plot, by default, contains layers for a data line, data points, and a data
area. With a bar plot, the always visible layer is that of the data bars.
plot_height
The height of the nanoplots. The default here is a sensible value of `"2em"`.
missing_vals
If missing values are encountered within the input data, there are four strategies
available for their handling: (1) `"gap"` will show data gaps at the sites of missing data,
where data lines will have discontinuities and bar plots will have missing bars; (2)
`"marker"` will behave like `"gap"` but show prominent visual marks at the missing data
locations; (3) `"zero"` will replace missing values with zero values; and (4) `"remove"`
will remove any incoming missing values.
autoscale
Using `autoscale=True` will ensure that the bounds of all nanoplots produced are based on
the limits of data combined from all input rows. This will result in a shared scale across
all of the nanoplots (for *y*- and *x*-axis data), which is useful in those cases where the
nanoplot data should be compared across rows.
reference_line
A reference line requires a single input to define the line. It could be a numeric value,
applied to all nanoplots generated. Or, the input can be one of the following for generating
the line from the underlying data: (1) `"mean"`, (2) `"median"`, (3) `"min"`, (4) `"max"`,
(5) `"q1"`, (6) `"q3"`, (7) `"first"`, or (8) `"last"`.
reference_area
A reference area requires a list of two values for defining bottom and top boundaries (in
the *y* direction) for a rectangular area. The types of values supplied are the same as
those expected for `reference_line=`, which is either a numeric value or one of the
following keywords for the generation of the value: (1) `"mean"`, (2) `"median"`, (3)
`"min"`, (4) `"max"`, (5) `"q1"`, (6) `"q3"`, (7) `"first"`, or (8) `"last"`. Input should be
a list with two elements.
expand_x
Should you need to have plots expand in the *x* direction, provide one or more values to
`expand_x=`. Any values provided that are outside of the range of *x*-value data provided to
the plot will result in an *x*-scale expansion.
expand_y
Similar to `expand_x=`, one can have plots expand in the *y* direction. To make this happen,
provide one or more values to `expand_y=`. Any values provided that are outside of the
range of *y*-value data will result in a *y*-scale expansion.
options
By using the [`nanoplot_options()`](`great_tables.nanoplot_options`) helper function here,
you can alter the layout and styling of the nanoplots in the new column.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Details
-------
Nanoplots try to show individual data with reasonably good visibility. Interactivity is included
as a basic feature so one can hover over the data points and vertical guides will display the
value ascribed to each data point. Because **Great Tables** knows all about numeric formatting,
values will be compactly formatted so as to not take up valuable real estate.
While basic customization options are present in `fmt_nanoplot()`, many more opportunities for
customizing nanoplots on a more granular level are possible with the aforementioned
[`nanoplot_options()`](`great_tables.nanoplot_options`) helper function. With that, layers of
the nanoplots can be selectively removed and the aesthetics of the remaining plot components can
be modified.
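The four `missing_vals=` strategies differ in what happens to the data before plotting. The sketch below models only that data-level effect with a hypothetical `handle_missing()` helper; the visual marks drawn for `"marker"` are outside its scope, and none of this is the library's own code.

```python
from typing import List, Optional


def handle_missing(
    values: List[Optional[float]], missing_vals: str = "gap"
) -> List[Optional[float]]:
    """Apply one of the documented missing-value strategies (hypothetical helper)."""
    if missing_vals in ("gap", "marker"):
        # Positions stay empty; "marker" would additionally draw a visual mark there.
        return list(values)
    if missing_vals == "zero":
        return [0 if v is None else v for v in values]
    if missing_vals == "remove":
        return [v for v in values if v is not None]
    raise ValueError(f"unknown strategy: {missing_vals!r}")


series = [4.0, None, 7.5, None, 2.0]
zero_filled = handle_missing(series, "zero")
removed = handle_missing(series, "remove")
```

Note that `"remove"` shortens the series (three values remain here) while `"zero"` keeps its length.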
Examples
--------
Let's create a nanoplot from a Polars DataFrame containing multiple numbers per cell. The
numbers are represented here as strings, where spaces separate the values, and the same values
are present in two columns: `lines` and `bars`. We will use the `fmt_nanoplot()` method twice
to create a line plot and a bar plot from the data in their respective columns.
```{python}
from great_tables import GT
import polars as pl
random_numbers_df = pl.DataFrame(
{
"i": range(1, 5),
"lines": [
"20 23 6 7 37 23 21 4 7 16",
"2.3 6.8 9.2 2.42 3.5 12.1 5.3 3.6 7.2 3.74",
"-12 -5 6 3.7 0 8 -7.4",
"2 0 15 7 8 10 1 24 17 13 6",
],
}
).with_columns(bars=pl.col("lines"))
(
GT(random_numbers_df, rowname_col="i")
.fmt_nanoplot(columns="lines")
.fmt_nanoplot(columns="bars", plot_type="bar")
)
```
We can always represent the input DataFrame in a different way (with list columns) and
`fmt_nanoplot()` will still work. While the input data is the same as in the previous example,
we'll take the opportunity here to add a reference line and a reference area to the line plot
and also to the bar plot.
```{python}
random_numbers_df = pl.DataFrame(
{
"i": range(1, 5),
"lines": [
{ "val": [20.0, 23.0, 6.0, 7.0, 37.0, 23.0, 21.0, 4.0, 7.0, 16.0] },
{ "val": [2.3, 6.8, 9.2, 2.42, 3.5, 12.1, 5.3, 3.6, 7.2, 3.74] },
{ "val": [-12.0, -5.0, 6.0, 3.7, 0.0, 8.0, -7.4] },
{ "val": [2.0, 0.0, 15.0, 7.0, 8.0, 10.0, 1.0, 24.0, 17.0, 13.0, 6.0] },
],
}
).with_columns(bars=pl.col("lines"))
(
GT(random_numbers_df, rowname_col="i")
.fmt_nanoplot(
columns="lines",
reference_line="mean",
reference_area=["min", "q1"]
)
.fmt_nanoplot(
columns="bars",
plot_type="bar",
reference_line="max",
reference_area=["max", "median"])
)
```
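For intuition about what the `reference_line=` and `reference_area=` keywords resolve to, here is a sketch that computes them for a single row of data with the standard library. The helpers `reference_value()` and `reference_bounds()` are hypothetical; the quartile method (inclusive) and the sorting of the two area bounds are assumptions, so values for `"q1"` and `"q3"` may not match the rendered plots exactly.

```python
import statistics
from typing import List, Tuple, Union


def reference_value(values: List[float], ref: Union[str, int, float]) -> float:
    """Resolve a numeric value or keyword into a y-value (hypothetical helper)."""
    if isinstance(ref, (int, float)):
        # A plain number is used as-is for every nanoplot.
        return float(ref)
    if ref in ("q1", "q3"):
        # Assumption: inclusive quartiles; the library's quantile method may differ.
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return float(q1 if ref == "q1" else q3)
    keywords = {
        "mean": statistics.mean,
        "median": statistics.median,
        "min": min,
        "max": max,
        "first": lambda v: v[0],
        "last": lambda v: v[-1],
    }
    return float(keywords[ref](values))


def reference_bounds(values: List[float], area: List[Union[str, int, float]]) -> Tuple[float, float]:
    """Resolve a two-element `reference_area=` into (bottom, top)."""
    # Assumption: the two inputs may be given in either order.
    bottom, top = sorted(reference_value(values, bound) for bound in area)
    return bottom, top


row = [20.0, 23.0, 6.0, 7.0, 37.0, 23.0, 21.0, 4.0, 7.0, 16.0]
line_y = reference_value(row, "mean")
area_y = reference_bounds(row, ["max", "median"])
```

For the first row of the example data, the `"mean"` line sits at 16.4 and the `["max", "median"]` area spans from 18.0 to 37.0.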
Here's an example to adjust some of the options using
[`nanoplot_options()`](`great_tables.nanoplot_options`).
```{python}
from great_tables import nanoplot_options
(
GT(random_numbers_df, rowname_col="i")
.fmt_nanoplot(
columns="lines",
reference_line="mean",
reference_area=["min", "q1"],
options=nanoplot_options(
data_point_radius=8,
data_point_stroke_color="black",
data_point_stroke_width=2,
data_point_fill_color="white",
data_line_type="straight",
data_line_stroke_color="brown",
data_line_stroke_width=2,
data_area_fill_color="orange",
vertical_guide_stroke_color="green",
),
)
.fmt_nanoplot(
columns="bars",
plot_type="bar",
reference_line="max",
reference_area=["max", "median"],
options=nanoplot_options(
data_bar_stroke_color="gray",
data_bar_stroke_width=2,
data_bar_fill_color="orange",
data_bar_negative_stroke_color="blue",
data_bar_negative_stroke_width=1,
data_bar_negative_fill_color="lightblue",
reference_line_color="pink",
reference_area_fill_color="bisque",
vertical_guide_stroke_color="blue",
),
)
)
```
Single-value bar plots and line plots can be made with `fmt_nanoplot()`. These run in the
horizontal direction, which is ideal for tabular presentation. The key thing here is that
`fmt_nanoplot()` expects a column of numeric values. These plots are meant for comparison
across rows so the method automatically scales the horizontal bars to facilitate this type of
display. The following example shows how `fmt_nanoplot()` can be used to create single-value bar
and line plots.
```{python}
single_vals_df = pl.DataFrame(
{
"i": range(1, 6),
"bars": [4.1, 1.3, -5.3, 0, 8.2],
"lines": [12.44, 6.34, 5.2, -8.2, 9.23]
}
)
(
GT(single_vals_df, rowname_col="i")
.fmt_nanoplot(columns="bars", plot_type="bar")
.fmt_nanoplot(columns="lines", plot_type="line")
)
```
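The idea of a shared scale (as used for single-value plots and when `autoscale=True`), together with the effect of `expand_x=`/`expand_y=` values, can be sketched as follows. This is a simplified model with hypothetical helpers `shared_bounds()` and `to_fraction()`; the actual plot geometry in **Great Tables** may be computed differently.

```python
from typing import List, Optional, Tuple


def shared_bounds(
    rows: List[List[float]], expand: Optional[List[float]] = None
) -> Tuple[float, float]:
    """Pool data from all rows (plus any expansion values) into one (low, high) range."""
    pooled = [value for row in rows for value in row] + list(expand or [])
    return min(pooled), max(pooled)


def to_fraction(value: float, bounds: Tuple[float, float]) -> float:
    """Position of a value within the shared range, from 0.0 (low) to 1.0 (high)."""
    low, high = bounds
    return (value - low) / (high - low)


bars = [[4.1], [1.3], [-5.3], [0], [8.2]]
bounds = shared_bounds(bars)
wider = shared_bounds(bars, expand=[10])
unchanged = shared_bounds(bars, expand=[0])
```

Only expansion values that fall outside the data range change the bounds: `wider` extends to 10, while `unchanged` equals `bounds`.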
fmt(self: 'GTSelf', fns: 'FormatFn', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, is_substitution: 'bool' = False) -> 'GTSelf'
Set a column format with a formatter function.
The `fmt()` method provides a way to execute custom formatting functionality with raw data
values in a way that can consider all output contexts.
Along with the `columns` and `rows` arguments that provide some precision in targeting data
cells, the `fns` argument allows you to define a function for manipulating the raw data.
Parameters
----------
fns
A formatting function to apply to the targeted cells.
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should undergo
formatting. The default is all rows, resulting in all rows in `columns` being formatted.
Alternatively, we can supply a list of row indices.
is_substitution
Whether the formatter is a substitution. Substitutions are run last, after other formatters.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use the `exibble` dataset to create a table. With the `fmt()` method, we'll add a prefix
`^` and a suffix `$` to the `row` and `group` columns.
```{python}
from great_tables import GT, exibble
(
GT(exibble)
.fmt(lambda x: f"^{x}$", columns=["row", "group"])
)
```
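A formatter passed to `fns=` does not have to be a lambda. As a minimal sketch, and assuming the function is called with one raw cell value at a time, a named function can be written and checked on its own before it is handed to `fmt()`. The name `wrap_caret_dollar()` is hypothetical.

```python
def wrap_caret_dollar(x) -> str:
    """Formatter candidate for `fns=`: takes one raw value, returns its display text."""
    return f"^{x}$"


# Trying the formatter on a few plain values before using it in a table.
sample_values = ["row_1", "grp_a", 3]
previews = [wrap_caret_dollar(value) for value in sample_values]
```

The resulting `previews` list is `["^row_1$", "^grp_a$", "^3$"]`; the same function could then be supplied as `.fmt(wrap_caret_dollar, columns=["row", "group"])`.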
sub_missing(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, missing_text: 'str | Text | None' = None) -> 'GTSelf'
Substitute missing values in the table body.
Wherever there is missing data (i.e., `None` values), customizable content may present better
than the standard representation of missing values that would otherwise appear. The
`sub_missing()` method allows for this replacement through its `missing_text=` argument.
If nothing is supplied to `missing_text=`, an em dash serves as the default indicator of
missingness.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should be scanned for
missing values. The default is all rows, resulting in all rows in all targeted columns being
considered for this substitution. Alternatively, we can supply a list of row indices.
missing_text
The text to be used in place of missing values in the rendered table. We can optionally use
the [`md()`](`great_tables.md`) or [`html()`](`great_tables.html`) helper functions to style
the text as Markdown or to retain HTML elements in the text.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using a subset of the `exibble` dataset, let's create a new table. The missing values in two
selections of columns will be given different variations of replacement text (across two
separate calls of `sub_missing()`).
```{python}
from great_tables import GT, md, html, exibble
import polars as pl
import polars.selectors as cs
exibble_mini = pl.from_pandas(exibble).drop("row", "group", "fctr").slice(4, 8)
(
GT(exibble_mini)
.sub_missing(
columns=["num", "char"],
missing_text="missing"
)
.sub_missing(
columns=cs.contains(("date", "time")) | cs.by_name("currency"),
missing_text="nothing"
)
)
```
sub_zero(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, zero_text: 'str' = 'nil') -> 'GTSelf'
Substitute zero values in the table body.
Wherever there are numerical data that are zero in value, replacement text may be better for
explanatory purposes. The `sub_zero()` method allows for this replacement through its
`zero_text=` argument.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should be scanned for
zeros. The default is all rows, resulting in all rows in all targeted columns being
considered for this substitution. Alternatively, we can supply a list of row indices.
zero_text
The text to be used in place of zero values in the rendered table. We can optionally use the
[`md()`](`great_tables.md`) or [`html()`](`great_tables.html`) functions to style the text
as Markdown or to retain HTML elements in the text.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's generate a simple table that contains an assortment of values that could potentially
undergo some substitution via the `sub_zero()` method (i.e., there are two `0` values). The
ordering of the [`fmt_scientific()`](`great_tables.GT.fmt_scientific`) and `sub_zero()` calls
in the example below doesn't affect the final result since any `sub_*()` method won't interfere
with the formatting of the table.
```{python}
from great_tables import GT
import polars as pl
single_vals_df = pl.DataFrame(
{
"i": range(1, 8),
"numbers": [2.75, 0, -3.2, 8, 1e-10, 0, 2.6e9]
}
)
GT(single_vals_df).fmt_scientific(columns="numbers").sub_zero()
```
sub_small_vals(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, threshold: 'int | float' = 0.01, small_pattern: 'str | None' = None, sign: 'str' = '+') -> 'GTSelf'
Substitute small values in the table body.
Wherever there are numerical data that are very small in value, replacement text may be better
for explanatory purposes. The `sub_small_vals()` method allows for this replacement through
specification of a `threshold`, a `small_pattern`, and the sign of the values to be considered.
The substitution will occur for those values found to be between `0` and the threshold value.
This is possible for small positive and small negative values (this can be explicitly set by the
`sign` option). Note that the interval does not include the `0` or the `threshold` value.
Should you need to include zero values, use `sub_zero()`.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should be scanned for
small values. The default is all rows, resulting in all rows in all targeted columns being
considered for this substitution. Alternatively, we can supply a list of row indices.
threshold
The threshold value with which values should be considered small enough for replacement.
small_pattern
The pattern text to be used in place of the suitably small values in the rendered table.
The `{x}` placeholder within the pattern will be replaced with the threshold value. If not
provided, the default is `"<{x}"` for positive values and `">-{x}"` for negative values.
sign
The sign of the numbers to be considered in the replacement. By default, we only consider
positive values (`"+"`). The other option (`"-"`) can be used to consider only negative
values.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's generate a simple, single-column table that contains an assortment of values that could
potentially undergo some substitution via `sub_small_vals()`.
```{python}
from great_tables import GT
import polars as pl
single_vals_df = pl.DataFrame(
{
"i": range(1, 8),
"numbers": [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
}
)
GT(single_vals_df).fmt_number(columns="numbers").sub_small_vals()
```
We can also target small negative values by setting `sign="-"` and use a custom
`small_pattern` to provide alternative replacement text.
```{python}
from great_tables import GT
import polars as pl
neg_vals_df = pl.DataFrame(
{
"i": range(1, 6),
"numbers": [-0.0001, -0.005, -0.05, -1.0, -100.0]
}
)
(
GT(neg_vals_df)
.fmt_number(columns="numbers")
.sub_small_vals(sign="-", threshold=0.01, small_pattern="~0")
)
```
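The interval rule described for `sub_small_vals()` (strictly between `0` and the threshold, on the side chosen by `sign`) can be sketched in plain Python. This is a minimal illustration of the documented behavior, not the library's implementation: `small_val_text` is a hypothetical helper, and rendering the threshold with `str()` is an assumption (the library may format it differently).

```python
def small_val_text(value, threshold=0.01, small_pattern=None, sign="+"):
    """Return replacement text if `value` counts as small, otherwise None."""
    # Documented defaults: "<{x}" for positive values, ">-{x}" for negative values.
    if small_pattern is None:
        small_pattern = "<{x}" if sign == "+" else ">-{x}"
    # The interval excludes both 0 and the threshold itself.
    if sign == "+":
        is_small = 0 < value < threshold
    else:
        is_small = -threshold < value < 0
    if not is_small:
        return None
    # Assumption: the threshold is inserted using its plain string form.
    return small_pattern.replace("{x}", str(threshold))


for v in [0.0001, 0.01, 0.0, -0.005]:
    print(v, small_val_text(v), small_val_text(v, sign="-"))
```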
sub_large_vals(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, threshold: 'int | float' = 1000000000000.0, large_pattern: 'str' = '>={x}', sign: 'str' = '+') -> 'GTSelf'
Substitute large values in the table body.
Wherever there are numerical data that are very large in value, replacement text may be better
for explanatory purposes. The `sub_large_vals()` method allows for this replacement through
specification of a `threshold`, a `large_pattern`, and the sign (positive or negative) of the
values to be considered.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should be scanned for
large values. The default is all rows, resulting in all rows in all targeted columns being
considered for this substitution. Alternatively, we can supply a list of row indices.
threshold
The threshold value with which values should be considered large enough for replacement.
large_pattern
The pattern text to be used in place of the suitably large values in the rendered table.
The `{x}` placeholder within the pattern will be replaced with the threshold value.
sign
The sign of the numbers to be considered in the replacement. By default, we only consider
positive values (`"+"`). The other option (`"-"`) can be used to consider only negative
values. Note that when `sign="-"` and the default `large_pattern=">={x}"` is used, the
`">="` is automatically changed to `"<="`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's generate a simple, single-column table that contains an assortment of values that could
potentially undergo some substitution via `sub_large_vals()`.
```{python}
from great_tables import GT
import polars as pl
single_vals_df = pl.DataFrame(
{
"i": range(1, 8),
"numbers": [0.0, 10.0, 1e8, 1e9, 1e10, 1e11, 1e12]
}
)
GT(single_vals_df).fmt_number(columns="numbers").sub_large_vals(threshold=1e10)
```
Large negative values can also be targeted with `sign="-"`. Notice the `">="` in the default
pattern is automatically changed to `"<="` when dealing with negative values.
```{python}
from great_tables import GT
import polars as pl
neg_vals_df = pl.DataFrame(
{
"i": range(1, 5),
"numbers": [-10.0, -500.0, -1e6, -1e12]
}
)
(
GT(neg_vals_df)
.fmt_number(columns="numbers")
.sub_large_vals(threshold=1000, sign="-")
)
```
sub_values(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'int | list[int] | None' = None, values: 'list[Any] | Any | None' = None, pattern: 'str | None' = None, fn: 'Callable[..., bool] | None' = None, replacement: 'str | int | float | None' = None) -> 'GTSelf'
Substitute targeted values in the table body.
Should you need to replace specific cell values with custom text, `sub_values()` can be a good
choice. We can target cells for replacement through value, regex, and custom matching rules.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which of their rows should be targeted for
substitution. The default is all rows, resulting in all rows in all targeted columns being
considered for this substitution. Alternatively, we can supply a list of row indices.
values
The specific value or values that should be replaced with a `replacement` value. If
`pattern` is also supplied then `values` will be ignored.
pattern
A regex pattern that can target solely those values in character-based columns. If `values`
is also supplied, `pattern` will take precedence.
fn
A supplied function that operates on each cell value `x` and should return a boolean
indicating whether that value should be replaced. If either of `values` or `pattern` is also
supplied, `fn` will take precedence.
replacement
The replacement value for any cell values matched by either `values`, `pattern`, or `fn`.
Must be a string or numeric value.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's create an input table with three columns containing an assortment of values that could
potentially undergo some substitution via `sub_values()`.
```{python}
from great_tables import GT
import polars as pl
tbl = pl.DataFrame(
{
"num_1": [-0.01, 74.0, None, 0.0, 500.0, 0.001, 84.3],
"int_1": [1, -100000, 800, 5, None, 1, -32],
"lett": ["A", "B", "C", "D", "E", "F", "G"],
}
)
GT(tbl).sub_values(values=[74, 500], replacement="—")
```
For the most flexibility, use the `fn` argument. The function you provide should accept a cell
value and return a boolean indicating whether it should be replaced.
```{python}
from great_tables import GT
import polars as pl
tbl = pl.DataFrame(
{
"num_1": [-0.01, 74.0, None, 0.0, 500.0, 0.001, 84.3],
"int_1": [1, -100000, 800, 5, None, 1, -32],
"lett": ["A", "B", "C", "D", "E", "F", "G"],
}
)
(
GT(tbl)
.sub_values(
fn=lambda x: isinstance(x, (int, float)) and x >= 0 and x < 50,
replacement="small"
)
)
```
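The precedence among `values`, `pattern`, and `fn` described in the parameter list can be summarized with a small plain-Python sketch. `should_replace` is a hypothetical helper written only for illustration; in particular, using `re.search()` for the regex test is an assumption about how matching is performed.

```python
import re


def should_replace(x, values=None, pattern=None, fn=None):
    """Decide whether cell value `x` is targeted, following the documented precedence."""
    # `fn` takes precedence over both `pattern` and `values`.
    if fn is not None:
        return bool(fn(x))
    # `pattern` takes precedence over `values` and only applies to strings.
    if pattern is not None:
        return isinstance(x, str) and re.search(pattern, x) is not None
    # `values` may be a single value or a list of values.
    if values is not None:
        candidates = values if isinstance(values, list) else [values]
        return x in candidates
    return False


print(should_replace(74.0, values=[74, 500]))
print(should_replace("A", values=["A"], pattern="^B"))
```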
data_color(self: 'GTSelf', columns: 'SelectExpr' = None, rows: 'RowSelectExpr' = None, palette: 'str | list[str] | None' = None, domain: 'list[str] | list[int] | list[float] | None' = None, na_color: 'str | None' = None, alpha: 'int | float | None' = None, reverse: 'bool' = False, autocolor_text: 'bool' = True, truncate: 'bool' = False) -> 'GTSelf'
Perform data cell colorization.
It's possible to add color to data cells according to their values with the `data_color()`
method. There is a multitude of ways to perform data cell colorizing here:
- targeting: we can constrain which columns should receive the colorization treatment through
the `columns=` argument
- color palettes: with `palette=` we could supply a list of colors composed of hexadecimal
values or color names
- value domain: we can either opt to have the range of values define the domain, or, specify
one explicitly with the `domain=` argument
- text autocoloring: `data_color()` will automatically recolor the foreground text to provide
the best contrast (can be deactivated with `autocolor_text=False`)
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
In conjunction with `columns=`, we can specify which rows should be colored. By default,
all rows in the targeted columns will be colored. Alternatively, we can provide a list
of row indices.
palette
The color palette to use. This should be a list of colors (e.g., `["#FF0000", "#00FF00",
"#0000FF"]`). A ColorBrewer palette can also be used by supplying its name (see the
*Color palette access from ColorBrewer and viridis* section). If `None`, then a default
palette will be used.
domain
The domain of values to use for the color scheme. This can be a list of floats, integers, or
strings. If `None`, then the domain will be inferred from the data values.
na_color
The color to use for missing values. If `None`, then the default color (`"#808080"`) will be
used.
alpha
An optional, fixed alpha transparency value that will be applied to all color palette
values.
reverse
Should the colors computed operate in the reverse order? If `True` then colors that normally
change from red to blue will change in the opposite direction.
autocolor_text
Whether or not to automatically color the text of the data values. If `True`, then the text
will be colored according to the background color of the cell.
truncate
If `True`, then any values that fall outside of the domain will be truncated to the
minimum or maximum value of the domain (will have the same color). If `False`, then any
values that fall outside of the domain will be set to `NaN` and will follow the `na_color=`
color.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Color palette access from ColorBrewer and viridis
-------------------------------------------------
All palettes from the ColorBrewer package can be accessed by providing the palette name in
`palette=`. There are 35 available palettes:
| | Palette Name | Colors | Category | Colorblind Friendly |
|----|-------------------|---------|-------------|---------------------|
| 1 | `"BrBG"` | 11 | Diverging | Yes |
| 2 | `"PiYG"` | 11 | Diverging | Yes |
| 3 | `"PRGn"` | 11 | Diverging | Yes |
| 4 | `"PuOr"` | 11 | Diverging | Yes |
| 5 | `"RdBu"` | 11 | Diverging | Yes |
| 6 | `"RdYlBu"` | 11 | Diverging | Yes |
| 7 | `"RdGy"` | 11 | Diverging | No |
| 8 | `"RdYlGn"` | 11 | Diverging | No |
| 9 | `"Spectral"` | 11 | Diverging | No |
| 10 | `"Dark2"` | 8 | Qualitative | Yes |
| 11 | `"Paired"` | 12 | Qualitative | Yes |
| 12 | `"Set1"` | 9 | Qualitative | No |
| 13 | `"Set2"` | 8 | Qualitative | Yes |
| 14 | `"Set3"` | 12 | Qualitative | No |
| 15 | `"Accent"` | 8 | Qualitative | No |
| 16 | `"Pastel1"` | 9 | Qualitative | No |
| 17 | `"Pastel2"` | 8 | Qualitative | No |
| 18 | `"Blues"` | 9 | Sequential | Yes |
| 19 | `"BuGn"` | 9 | Sequential | Yes |
| 20 | `"BuPu"` | 9 | Sequential | Yes |
| 21 | `"GnBu"` | 9 | Sequential | Yes |
| 22 | `"Greens"` | 9 | Sequential | Yes |
| 23 | `"Greys"` | 9 | Sequential | Yes |
| 24 | `"Oranges"` | 9 | Sequential | Yes |
| 25 | `"OrRd"` | 9 | Sequential | Yes |
| 26 | `"PuBu"` | 9 | Sequential | Yes |
| 27 | `"PuBuGn"` | 9 | Sequential | Yes |
| 28 | `"PuRd"` | 9 | Sequential | Yes |
| 29 | `"Purples"` | 9 | Sequential | Yes |
| 30 | `"RdPu"` | 9 | Sequential | Yes |
| 31 | `"Reds"` | 9 | Sequential | Yes |
| 32 | `"YlGn"` | 9 | Sequential | Yes |
| 33 | `"YlGnBu"` | 9 | Sequential | Yes |
| 34 | `"YlOrBr"` | 9 | Sequential | Yes |
| 35 | `"YlOrRd"` | 9 | Sequential | Yes |
We can also use the *viridis* and associated color palettes by providing to `palette=` any of
the following string values: `"viridis"`, `"plasma"`, `"inferno"`, `"magma"`, or `"cividis"`.
Examples
--------
The `data_color()` method can be used without any supplied arguments to colorize a table. Let's
do this with the `exibble` dataset:
```{python}
from great_tables import GT
from great_tables.data import exibble
GT(exibble).data_color()
```
What's happened is that `data_color()` applies background colors to all cells of every column
with the palette of eight colors. Numeric columns will use 'numeric' methodology for color
scaling whereas string-based columns will use the 'factor' methodology. The text color undergoes
an automatic modification that maximizes contrast (since `autocolor_text=True` by default).
We can target specific columns and apply color to just those columns. Let's do that and also
supply `palette=` values of `"red"` and `"green"`.
```{python}
GT(exibble).data_color(
columns=["num", "currency"],
palette=["red", "green"]
)
```
With those options in place we see that only the numeric columns `num` and `currency` received
color treatments. Moreover, the palette colors were mapped to the lower and upper limits of the
data in each column; interpolated colors were used for the values in between the numeric limits
of the two columns.
We can manually set the limits of the data with the `domain=` argument (which is preferable in
most cases). Let's colorize just the currency column and set `domain=[0, 50]`. Any values that
are either missing or lie outside of the domain will be colorized with the `na_color=` color
(so we'll set that to `"lightgray"`).
```{python}
GT(exibble).data_color(
columns="currency",
palette=["red", "green"],
domain=[0, 50],
na_color="lightgray"
)
```
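To make the roles of `domain=`, `na_color=`, and `truncate=` concrete, here is a rough plain-Python sketch of mapping a value onto a two-color palette. It is an illustration under stated assumptions, not the library's algorithm: `color_for` and `hex_to_rgb` are hypothetical helpers, the interpolation is simple linear RGB (the library may interpolate differently), and the hex codes are the standard CSS values for `"red"`, `"green"`, and `"lightgray"`.

```python
def hex_to_rgb(h):
    """Convert '#RRGGBB' to an (r, g, b) tuple of integers."""
    h = h.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def color_for(value, domain, low="#FF0000", high="#008000",
              na_color="#D3D3D3", truncate=False):
    """Pick a background color for `value` within `domain` (a [min, max] pair)."""
    lo, hi = domain
    if value is None:
        return na_color
    if value < lo or value > hi:
        if not truncate:
            # Out-of-domain values follow the `na_color=` color.
            return na_color
        # With truncation, clamp to the nearest domain limit instead.
        value = min(max(value, lo), hi)
    t = (value - lo) / (hi - lo)
    # Assumption: linear interpolation of each RGB channel.
    rgb = tuple(round(a + t * (b - a)) for a, b in zip(hex_to_rgb(low), hex_to_rgb(high)))
    return "#{:02X}{:02X}{:02X}".format(*rgb)


for v in [0, 25, 50, 65100, None]:
    print(v, color_for(v, [0, 50]), color_for(v, [0, 50], truncate=True))
```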
## Text transformation
The `text_*()` methods take cell data that are solidified into strings and allow for flexible transformations of those string values. Whereas the `fmt_*()` and `sub_*()` methods are phases 1 and 2 of cell data metamorphoses, the text transformation methods are the final phase, acting on strings generated by formatting and substitution methods with no reference to the source values.
text_replace(self: 'GTSelf', pattern: 'str', replacement: 'str', locations: 'Loc | list[Loc] | None' = None) -> 'GTSelf'
Perform targeted text replacement with a regex pattern.
With `text_replace()` we can target cells in specific locations and replace text fragments
matching a regular expression pattern. This operates on the already-formatted cell content
(i.e., after `fmt_*()` methods have been applied).
Parameters
----------
pattern
A regex pattern used to target text fragments in the resolved cells.
replacement
The replacement text for any matched text fragments. Backreferences (e.g., `"\\1"`)
can be used to refer to capture groups in the pattern.
locations
The cell or set of cells to be associated with the text replacement. Supported locations
include `loc.body()`, `loc.stub()`, `loc.row_groups()`, and `loc.column_labels()`. If
`None`, defaults to `loc.body()`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that
we can facilitate method chaining.
Examples
--------
Use `text_replace()` to add HTML emphasis tags around text in parentheses.
```{python}
import pandas as pd
from great_tables import GT, loc
df = pd.DataFrame({"item": ["Column A (details)", "Column B (info)"], "value": [1, 2]})
(
GT(df)
.text_replace(
pattern=r"\((.+?)\)",
replacement=r"<em>(\1)</em>",
locations=loc.body(columns="item"),
)
)
```
Replace underscores with spaces in the stub (row labels).
```{python}
from great_tables import GT, loc, exibble
(
GT(exibble[["num", "char", "row"]].head(4), rowname_col="row")
.text_replace(pattern="_", replacement=" ", locations=loc.stub())
)
```
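Since `pattern` and `replacement` follow regular-expression conventions with backreferences, a replacement can be prototyped on plain strings with Python's `re.sub()` before it is used in a table. This sketch assumes `text_replace()` treats patterns and backreferences the same way as Python's `re` module; the cell strings are made up for illustration.

```python
import re

# Made-up cell text standing in for already-formatted cell content.
cells = ["Column A (details)", "Column B (info)"]

# Capture the text inside parentheses and wrap the whole group in <em> tags.
emphasized = [re.sub(r"\((.+?)\)", r"<em>(\1)</em>", cell) for cell in cells]

print(emphasized)
```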
text_case_when(self: 'GTSelf', *cases: 'tuple[Callable[[str], bool], str]', default: 'str | None' = None, locations: 'Loc | list[Loc] | None' = None) -> 'GTSelf'
Perform text replacements using a case-when approach.
With `text_case_when()` we supply a sequence of cases as `(predicate, replacement)` tuples.
Each predicate is a function that takes the cell text (as a string) and returns `True` or
`False`. The first predicate that returns `True` determines the replacement text. This is
analogous to a series of if/elif statements applied to each cell.
Parameters
----------
*cases
One or more tuples of the form `(predicate_fn, new_text)` where `predicate_fn` is a
callable that accepts a string and returns a boolean, and `new_text` is the replacement
string to use when the predicate is `True`.
default
The replacement text to use when no predicate matches. If `None` (the default),
unmatched cells are left unchanged.
locations
The cell or set of cells to be associated with the text replacement. Supported locations
include `loc.body()`, `loc.stub()`, `loc.row_groups()`, and `loc.column_labels()`. If
`None`, defaults to `loc.body()`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that
we can facilitate method chaining.
Examples
--------
Conditionally replace cell values based on their content.
```{python}
import pandas as pd
from great_tables import GT, loc
df = pd.DataFrame({"score": [95, 72, 88, 61, 100]})
(
GT(df)
.fmt_number(columns="score", decimals=0)
.text_case_when(
(lambda x: int(x) >= 90, "A"),
(lambda x: int(x) >= 80, "B"),
(lambda x: int(x) >= 70, "C"),
default="F",
locations=loc.body(columns="score"),
)
)
```
Use string methods in predicates to match patterns.
```{python}
from great_tables import GT, loc, exibble
(
GT(exibble[["num", "char"]].head(4))
.text_case_when(
(lambda x: x.startswith("a"), "Starts with A"),
(lambda x: len(x) > 6, "Long text"),
default="other",
locations=loc.body(columns="char"),
)
)
```
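The first-match rule described for `text_case_when()` behaves like an if/elif chain. A minimal plain-Python sketch of that rule follows; `case_when` is a hypothetical helper used only to illustrate the ordering, not the library's code.

```python
def case_when(text, *cases, default=None):
    """Return the replacement for the first predicate that is True."""
    for predicate, new_text in cases:
        if predicate(text):
            return new_text
    # With no match, fall back to `default`, or keep the text unchanged.
    return text if default is None else default


grade_cases = (
    (lambda x: int(x) >= 90, "A"),
    (lambda x: int(x) >= 80, "B"),
    (lambda x: int(x) >= 70, "C"),
)

print([case_when(s, *grade_cases, default="F") for s in ["95", "72", "88", "61", "100"]])
```

Because the cases are checked in order, `"95"` yields `"A"` even though it also satisfies the later, looser predicates.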
text_case_match(self: 'GTSelf', *cases: 'tuple[str | list[str], str]', default: 'str | None' = None, replace: "Literal['all', 'partial']" = 'all', locations: 'Loc | list[Loc] | None' = None) -> 'GTSelf'
Perform text replacements with a switch-like approach.
With `text_case_match()` we can supply a sequence of matching cases in the form of
`(old_text, new_text)` tuples. Each tuple's first element specifies text to match (either a
single string or a list of strings) and the second element provides the replacement. By
default, the matching is performed on the entire cell text (`replace="all"`); use
`replace="partial"` for substring matching and replacement.
Parameters
----------
*cases
One or more tuples of the form `(old_text, new_text)` where `old_text` is a string or
list of strings to match, and `new_text` is the replacement string.
default
The replacement text to use when cell values aren't matched by any of the supplied
cases. If `None` (the default), unmatched cells are left unchanged.
replace
The method for text replacement. Use `"all"` (the default) to match and replace the
entire cell text, or `"partial"` to match and replace substrings within the cell text.
locations
The cell or set of cells to be associated with the text replacement. Supported locations
include `loc.body()`, `loc.stub()`, `loc.row_groups()`, and `loc.column_labels()`. If
`None`, defaults to `loc.body()`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that
we can facilitate method chaining.
Examples
--------
Replace specific cell values in the `char` column with different text.
```{python}
from great_tables import GT, loc, exibble
(
GT(exibble[["num", "char"]].head(4))
.text_case_match(
("apricot", "APRICOT"),
(["banana", "coconut"], "tropical fruit"),
default="other",
locations=loc.body(columns="char"),
)
)
```
Use `replace="partial"` to perform substring replacements.
```{python}
from great_tables import GT, loc, exibble
(
GT(exibble[["num", "char"]].head(4))
.text_case_match(
("an", "@"),
replace="partial",
locations=loc.body(columns="char"),
)
)
```
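The difference between `replace="all"` and `replace="partial"` can be sketched with two small plain-Python functions. Both are hypothetical helpers for illustration only; how the library combines several cases in partial mode is not stated above, so applying each case in order is an assumption of this sketch.

```python
def match_all(text, *cases, default=None):
    """Whole-cell matching: replace the entire text when it equals a case."""
    for old_text, new_text in cases:
        candidates = [old_text] if isinstance(old_text, str) else old_text
        if text in candidates:
            return new_text
    return text if default is None else default


def match_partial(text, *cases):
    """Substring matching: replace every occurrence of each case within the text."""
    for old_text, new_text in cases:
        candidates = [old_text] if isinstance(old_text, str) else old_text
        for candidate in candidates:
            text = text.replace(candidate, new_text)
    return text


fruits = ["apricot", "banana", "coconut", "durian"]
print([match_all(f, ("apricot", "APRICOT"), (["banana", "coconut"], "tropical fruit"),
                 default="other") for f in fruits])
print([match_partial(f, ("an", "@")) for f in fruits])
```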
text_transform(self: 'GTSelf', locations: 'Loc | list[Loc]', fn: 'Callable[[str], str]') -> 'GTSelf'
Apply a custom text transformation to cells at specified locations.
With the `text_transform()` method we can target specific cells and apply a text
transformation function to their already-formatted content. This is useful for modifying the
rendered text of cells after all formatting (via `fmt_*()` methods) has been applied.
Parameters
----------
locations
The cell or set of cells to be associated with the text transformation. Supported
locations include `loc.body()`, `loc.stub()`, `loc.row_groups()`, and
`loc.column_labels()`.
fn
A function that takes a cell's text content as a string and returns the transformed
string.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use the `exibble` dataset to demonstrate `text_transform()`. We'll format the `num`
column and then apply a text transformation to wrap the values in parentheses.
```{python}
from great_tables import GT, loc, exibble
(
GT(exibble[["num", "char"]].head(4))
.fmt_number(columns="num", decimals=1)
.text_transform(
locations=loc.body(columns="num"),
fn=lambda x: f"({x})",
)
)
```
Using `text_transform()` we can also convert specific cells to uppercase. Here we target only
the first two rows of the `char` column.
```{python}
from great_tables import GT, loc, exibble
(
GT(exibble[["num", "char"]].head(4))
.text_transform(
locations=loc.body(columns="char", rows=[0, 1]),
fn=lambda x: x.upper(),
)
)
```
Multiple locations can be targeted at once by passing a list. In this example, we add a
prefix to all cells in both the `num` and `char` columns.
```{python}
from great_tables import GT, loc, exibble
(
GT(exibble[["num", "char"]].head(4))
.fmt_number(columns="num", decimals=2)
.text_transform(
locations=[loc.body(columns="num"), loc.body(columns="char")],
fn=lambda x: f"~ {x}",
)
)
```
## Modifying columns
The `cols_*()` methods allow for modifications that act on entire columns. This includes alignment of the data in columns ([`cols_align()`](`great_tables.GT.cols_align`)), hiding columns from view ([`cols_hide()`](`great_tables.GT.cols_hide`)), re-labeling the column labels ([`cols_label()`](`great_tables.GT.cols_label`)), and moving columns around (with the `cols_move*()` methods).
cols_align(self: 'GTSelf', align: 'str' = 'left', columns: 'SelectExpr' = None) -> 'GTSelf'
Set the alignment of one or more columns.
The `cols_align()` method sets the alignment of one or more columns. The `align` argument
can be set to one of `"left"`, `"center"`, or `"right"` and the `columns` argument can be
used to specify which columns to apply the alignment to. If `columns` is not specified, the
alignment is applied to all columns.
Parameters
----------
align
The alignment to apply. Must be one of `"left"`, `"center"`, or `"right"`.
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list. If `None`, the alignment is applied to all columns.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use the `countrypops` dataset to create a small table. We can change the alignment of the
`population` column with `cols_align()`. In this example, the column label and body cells of
`population` will be aligned to the left.
```{python}
from great_tables import GT
from great_tables.data import countrypops
countrypops_mini = countrypops.loc[countrypops["country_name"] == "San Marino"][
["country_name", "year", "population"]
].tail(5)
(
GT(countrypops_mini, rowname_col="year", groupname_col="country_name")
.cols_align(align="left", columns="population")
)
```
cols_width(self: 'GTSelf', cases: 'dict[str, str] | None' = None, **kwargs: 'str') -> 'GTSelf'
Set the widths of columns.
Manual specifications of column widths can be performed using the `cols_width()` method. We
choose which columns get specific widths. This can be in units of pixels or as percentages.
Width assignments are supplied inside of a dictionary where columns are the keys and the
corresponding width is the value.
Parameters
----------
cases
A dictionary where the keys are column names and the values are the widths. Widths can be
specified in pixels (e.g., `"50px"`) or as percentages (e.g., `"20%"`).
**kwargs
Keyword arguments to specify column widths. Each keyword corresponds to a column name, with
its value indicating the width in pixels or percentages.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use select columns from the `exibble` dataset to create a new table. We can specify the
widths of columns with `cols_width()`. This is done by specifying the exact widths for table
columns in a dictionary. In this example, we'll set the width of the `num` column to `"150px"`,
the `char` column to `"100px"`, and the `date` column to `"300px"`. All other columns won't be
affected (their widths will be automatically set by their content).
```{python}
import warnings
from great_tables import GT, exibble
warnings.filterwarnings("ignore")
exibble_mini = exibble[["num", "char", "date", "datetime", "row"]].head(5)
(
GT(exibble_mini)
.cols_width(
cases={
"num": "150px",
"char": "100px",
"date": "300px"
}
)
)
```
We can also specify the widths of columns as percentages. In this example, we'll set the width
of the `num` column to `"20%"`, the `char` column to `"10%"`, and the `date` column to `"30%"`.
Note that the percentages are relative and don't need to sum to 100%.
```{python}
(
GT(exibble_mini)
.cols_width(
cases={
"num": "20%",
"char": "10%",
"date": "30%"
}
)
)
```
We can also mix and match pixel and percentage widths. In this example, we'll set the width of
the `num` column to `"150px"`, the `char` column to `"10%"`, and the `date` column to `"30%"`.
```{python}
(
GT(exibble_mini)
.cols_width(
cases={
"num": "150px",
"char": "10%",
"date": "30%"
}
)
)
```
If we set the width of all columns, the table will be forced to use the specified widths (i.e.,
a column width less than the content width will be honored). In this next example, we'll set
widths for all columns. This is a good way to ensure that the widths you specify are fully
respected (and not overridden by automatic width calculations).
```{python}
(
GT(exibble_mini)
.cols_width(
cases={
"num": "30px",
"char": "100px",
"date": "100px",
"datetime": "200px",
"row": "50px"
}
)
)
```
Notice that in the above example, the `num` column is very small (only `30px`) and the content
overflows. When not specifying the width of all columns, the table will automatically adjust the
column widths based on the content (and you wouldn't get the overflowing behavior seen in the
previous example).
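When widths come from numeric settings, it can help to build the `cases` dictionary programmatically and check the units before calling `cols_width()`. The helper below is a hypothetical convenience (not part of Great Tables) that converts bare numbers to pixel strings and accepts only the two unit forms described above.

```python
def px_widths(widths):
    """Turn {"col": 150} into {"col": "150px"}; pass "px" and "%" strings through."""
    cases = {}
    for column, width in widths.items():
        if isinstance(width, (int, float)):
            # Bare numbers are interpreted as pixels in this sketch.
            cases[column] = f"{width:g}px"
        elif isinstance(width, str) and width.endswith(("px", "%")):
            cases[column] = width
        else:
            raise ValueError(f"unsupported width for {column!r}: {width!r}")
    return cases


cases = px_widths({"num": 150, "char": "10%", "date": "30%"})
print(cases)
```

The resulting dictionary could then be passed as `GT(exibble_mini).cols_width(cases=cases)`.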
cols_label(self: 'GTSelf', cases: 'dict[str, str | BaseText] | None' = None, **kwargs: 'str | BaseText') -> 'GTSelf'
Relabel one or more columns.
There are three important pieces to labelling:
* Each argument has the form: {name in data} = {new label}.
* Multiple columns may be given the same label.
* Labels may use curly braces to apply special formatting, called unit notation.
For example, "area ({{ft^2}})" would appear as "area (ft²)".
See [`define_units()`](`great_tables.define_units`) for details on unit notation.
Parameters
----------
cases
A dictionary where the keys are column names and the values are the labels. Labels may use
[`md()`](`great_tables.md`) or [`html()`](`great_tables.html`) helpers for formatting.
**kwargs
Keyword arguments to specify column labels. Each keyword corresponds to a column name, with
its value indicating the new label.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Notes
-----
GT always selects columns using their name in the underlying data. This means that a column's
label is purely for final presentation.
Examples
--------
The example below relabels columns from the `countrypops` dataset so that each label starts with an uppercase letter.
```{python}
from great_tables import GT
from great_tables.data import countrypops
countrypops_mini = countrypops.loc[countrypops["country_name"] == "Uganda"][
["country_name", "year", "population"]
].tail(5)
(
GT(countrypops_mini)
.cols_label(
country_name="Country Name",
year="Year",
population="Population"
)
)
```
Note that we supplied the name of the column as the key, and the new label as the value.
We can also use Markdown formatting for the column labels. In this example, we'll use
`md("*Population*")` to make the label italicized.
```{python}
from great_tables import GT, md
from great_tables.data import countrypops
(
GT(countrypops_mini)
.cols_label(
country_name="Name",
year="Year",
population=md("*Population*")
)
)
```
We can also use unit notation to format the column labels. In this example, we'll use
`{{cm^3 molecules^-1 s^-1}}` for part of the label for the `OH_k298` column.
```{python}
from great_tables import GT
from great_tables.data import reactions
import polars as pl
reactions_mini = (
pl.from_pandas(reactions)
.filter(pl.col("cmpd_type") == "mercaptan")
.select(["cmpd_name", "OH_k298"])
)
(
GT(reactions_mini)
.fmt_scientific("OH_k298")
.sub_missing()
.cols_label(
cmpd_name="Compound Name",
OH_k298="OH, {{cm^3 molecules^-1 s^-1}}",
)
)
```
cols_label_with(self: 'GTSelf', columns: 'SelectExpr' = None, fn: 'Callable[[str], str] | None' = None) -> 'GTSelf'
Relabel one or more columns using a function.
The `cols_label_with()` method modifies column labels through a supplied function. By default,
the function is invoked on all column labels, but this can be limited to a subset via the
`columns` parameter.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
fn
A function that accepts a column name as input and returns a label as output.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Notes
-----
GT always selects columns using their name in the underlying data. This means that a column's
label is purely for final presentation.
Examples
--------
Let's use a subset of the `sp500` dataset to create a gt table.
```{python}
from great_tables import GT, md
from great_tables.data import sp500
gt = GT(sp500.head())
gt
```
We can pass `str.upper` to the `fn` parameter to convert all column labels to uppercase.
```{python}
gt.cols_label_with(fn=str.upper)
```
A common use case is wrapping labels with `md()`, provided by **Great Tables**, to format them.
For example, the following code makes the `date` and `adj_close` column labels bold using
Markdown syntax.
```{python}
gt.cols_label_with(["date", "adj_close"], lambda x: md(f"**{x}**"))
```
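Any callable that maps a column name to a label can be passed as `fn`. As a minimal sketch, the hypothetical helper `prettify()` below (not part of **Great Tables**) turns snake_case names into title-cased labels. The dictionary comprehension only previews the labels it would produce for a few `sp500` column names.

```python
def prettify(name: str) -> str:
    """Turn a snake_case column name into a title-cased label (hypothetical helper)."""
    return name.replace("_", " ").title()


# Preview the labels for a few column names
preview = {col: prettify(col) for col in ["date", "open", "adj_close"]}
print(preview)  # {'date': 'Date', 'open': 'Open', 'adj_close': 'Adj Close'}
```

Passing the helper along as `gt.cols_label_with(fn=prettify)` would apply the same transformation to every column label.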
cols_label_rotate(self: 'GTSelf', columns: 'SelectExpr' = None, dir: "Literal['sideways-lr', 'sideways-rl', 'vertical-lr']" = 'sideways-lr', align: "Literal['left', 'center', 'right'] | None" = None, padding: 'int' = 8) -> 'GTSelf'
Rotate the column label for one or more columns.
The `cols_label_rotate()` method sets the orientation of the column label text to make it flow
vertically. The `dir` argument can be set to one of `"sideways-lr"`, `"sideways-rl"`, or
`"vertical-lr"`, and the `columns` argument can be used to specify which columns to apply the
rotation to. If `columns` is not specified, the rotation is applied to all columns.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list. If `None`, the rotation is applied to all columns.
dir
A string that gives the direction of the text. Options: `"sideways-lr"`, `"sideways-rl"`,
`"vertical-lr"`. See note for information on text layout.
align
The alignment to apply. Must be one of `"left"`, `"center"`, or `"right"`, or `None` (the
default). If text is laid out vertically, this affects alignment along the vertical axis.
padding
The vertical padding to apply to the column labels.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
The example below rotates column labels such that the text is set to the left.
```{python}
from great_tables import GT, style, loc, exibble
exibble_sm = exibble[["num", "fctr", "row", "group"]]
(
GT(exibble_sm, rowname_col="row", groupname_col="group")
.cols_label_rotate(columns=["num", "fctr"])
)
```
Other styles you provide won't override the column label rotation directives. Here we set the
text to the right.
```{python}
(
GT(exibble_sm, rowname_col="row", groupname_col="group")
.cols_label_rotate(columns=["num", "fctr"], dir="vertical-lr")
.tab_style(style=style.text(weight="bold"), locations=loc.column_labels(["fctr"]))
)
```
Labels that are restricted by the height of the stub head will wrap horizontally.
```{python}
(
GT(exibble_sm, rowname_col="row", groupname_col="group")
.cols_label({"fctr": "A longer description of the values in the column below"})
.cols_label_rotate(columns=["num", "fctr"], dir="sideways-lr")
.tab_style(
style=[style.text(weight="bold"), style.css(rule="height: 200px;")],
locations=loc.column_labels(["fctr"])
)
)
```
Note
--------
The `dir` parameter uses the following keywords to alter the direction of the column label text.
##### `"sideways-lr"`
For ltr scripts, content flows vertically from bottom to top. For rtl scripts, content flows
vertically from top to bottom. Characters are set sideways toward the left. Overflow lines are
appended to the right.
##### `"sideways-rl"`
For ltr scripts, content flows vertically from top to bottom. For rtl scripts, content flows
vertically from bottom to top. Characters are set sideways toward the right. Overflow lines are
appended to the left.
##### `"vertical-lr"`
Identical to `"sideways-rl"`, but overflow lines are appended to the right.
cols_move(self: 'GTSelf', columns: 'SelectExpr', after: 'str') -> 'GTSelf'
Move one or more columns.
On those occasions where we need to move columns this way or that way, we can make use of the
`cols_move()` method. While it's true that the movement of columns can be done upstream of
**Great Tables**, it is much easier and less error prone to use the method provided here. The
movement procedure here takes one or more specified columns (in the `columns` argument) and
places them to the right of a different column (the `after` argument). The ordering of the
`columns` to be moved is preserved, as is the ordering of all other columns in the table.
The columns supplied in `columns` must all exist in the table and none of them can be in the
`after` argument. The `after` column must also exist and only one column should be provided
here. If you need to place one or more columns at the beginning of the column series, the
`cols_move_to_start()` method should be used. Similarly, if those columns to move should be
placed at the end of the column series then use `cols_move_to_end()`.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
after
The column after which the `columns` should be placed. This can be any column name that
exists in the table.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use the `countrypops` dataset to create a table. We'll choose to position the `population`
column after the `country_name` column by using the `cols_move()` method.
```{python}
from great_tables import GT
from great_tables.data import countrypops
countrypops_mini = countrypops.loc[countrypops["country_name"] == "Japan"][
["country_name", "year", "population"]
].tail(5)
(
GT(countrypops_mini)
.cols_move(
columns="population",
after="country_name"
)
)
```
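To make the ordering rule concrete, here is a plain-Python sketch of the behavior described above. The helper `move_after()` is hypothetical and is not how **Great Tables** implements `cols_move()`; raising `ValueError` for invalid input is also an assumption made for illustration.

```python
def move_after(all_cols, columns, after):
    """Return a new column order with `columns` placed right after `after`."""
    columns = [columns] if isinstance(columns, str) else list(columns)

    # Every moved column and the `after` column must exist (assumed error type)
    missing = [c for c in [*columns, after] if c not in all_cols]
    if missing:
        raise ValueError(f"Columns not found in table: {missing}")
    if after in columns:
        raise ValueError("`after` cannot be one of the columns being moved")

    # Remove the moved columns, then re-insert them just after `after`
    remaining = [c for c in all_cols if c not in columns]
    idx = remaining.index(after) + 1
    return remaining[:idx] + columns + remaining[idx:]


order = move_after(["country_name", "year", "population"], "population", "country_name")
print(order)  # ['country_name', 'population', 'year']
```

The moved columns keep the order in which they were supplied, and all other columns keep their relative order.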
cols_move_to_start(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf'
Move one or more columns to the start.
We can easily move a set of columns to the beginning of the column series and we only need to
specify which `columns`. It's possible to do this upstream of **Great Tables**, however, it is
easier with this method and it presents less possibility for error. The ordering of the
`columns` that are moved to the start is preserved (same with the ordering of all other columns
in the table).
The columns supplied in `columns` must all exist in the table. If you need to place one or more
columns at the end of the column series, the `cols_move_to_end()` method should be used. More
control is offered with the `cols_move()` method, where columns could be placed after a specific
column.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
For this example, we'll use a portion of the `countrypops` dataset to create a simple table.
Let's move the `year` column, which is the middle column, to the start of the column series with
the `cols_move_to_start()` method.
```{python}
from great_tables import GT
from great_tables.data import countrypops
countrypops_mini = countrypops.loc[countrypops["country_name"] == "Fiji"][
["country_name", "year", "population"]
].tail(5)
GT(countrypops_mini).cols_move_to_start(columns="year")
```
We can also move multiple columns at a time. With the same `countrypops`-based table
(`countrypops_mini`), let's move both the `year` and `population` columns to the start of the
column series.
```{python}
GT(countrypops_mini).cols_move_to_start(columns=["year", "population"])
```
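The resulting order can be reasoned about with plain lists. The sketch below uses two hypothetical helpers, `move_to_start()` and `move_to_end()`, that mirror the documented behavior of `cols_move_to_start()` and its counterpart `cols_move_to_end()`. They are illustrations only, not the library's implementation.

```python
def move_to_start(all_cols, columns):
    """Place `columns` first, keeping the relative order of everything else."""
    columns = [columns] if isinstance(columns, str) else list(columns)
    return columns + [c for c in all_cols if c not in columns]


def move_to_end(all_cols, columns):
    """Place `columns` last, keeping the relative order of everything else."""
    columns = [columns] if isinstance(columns, str) else list(columns)
    return [c for c in all_cols if c not in columns] + columns


cols = ["country_name", "year", "population"]
print(move_to_start(cols, ["year", "population"]))  # ['year', 'population', 'country_name']
print(move_to_end(cols, "year"))  # ['country_name', 'population', 'year']
```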
cols_move_to_end(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf'
Move one or more columns to the end.
We can easily move a set of columns to the end of the column series and we only need to
specify which `columns`. It's possible to do this upstream of **Great Tables**, however, it is
easier with this method and it presents less possibility for error. The ordering of the
`columns` that are moved to the end is preserved (same with the ordering of all other columns in
the table).
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
For this example, we'll use a portion of the `countrypops` dataset to create a simple table.
Let's move the `year` column, which is the middle column, to the end of the column series with
the `cols_move_to_end()` method.
```{python}
from great_tables import GT
from great_tables.data import countrypops
countrypops_mini = countrypops.loc[countrypops["country_name"] == "Benin"][
["country_name", "year", "population"]
].tail(5)
GT(countrypops_mini).cols_move_to_end(columns="year")
```
We can also move multiple columns at a time. With the same `countrypops`-based table
(`countrypops_mini`), let's move both the `year` and `country_name` columns to the end of the
column series.
```{python}
GT(countrypops_mini).cols_move_to_end(columns=["year", "country_name"])
```
cols_reorder(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf'
Reorder all columns in a specified order.
The `cols_reorder()` method allows you to completely rearrange the column order of a table.
Provide all column names in the exact order you want them to appear. This is useful when you
need full control over the column layout and want to express the entire ordering in a single
call, rather than using multiple `cols_move()`, `cols_move_to_start()`, or `cols_move_to_end()`
calls.
Every column in the table must appear exactly once in the `columns=` list. If any columns are
missing or extra names are provided, a `ValueError` will be raised.
Parameters
----------
columns
A list of all column names in the desired display order. This can be a list of column name
strings or a column selection expression (e.g., Polars selectors). All columns in the table
must be included exactly once.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Raises
------
ValueError
If the provided columns do not match all columns in the table (e.g., missing columns,
extra columns, or duplicates).
Examples
--------
Let's use a subset of columns from the `exibble` dataset to create a table.
```{python}
from great_tables import GT
from great_tables.data import exibble
exibble_mini = exibble[["num", "char", "fctr", "date", "time"]]
GT(exibble_mini)
```
Now, let's reorder the columns so that `fctr` and `date` come first, followed by the remaining
columns in a custom order:
```{python}
(
GT(exibble_mini)
.cols_reorder(["fctr", "date", "time", "char", "num"])
)
```
For tables with many columns, you can use Python's iterable unpacking to build the column list
programmatically. Here we use the full `exibble` dataset (9 columns) and move `fctr` to the
front while pushing `num` and `char` to the end—without typing every column name in between:
```{python}
# Unpack the first three column names and capture all remaining ones in `rest`
# exibble.columns is: ["num", "char", "fctr", "date", "time", "datetime", "currency", "row", "group"]
num, char, fctr, *rest = exibble.columns
# Build the new order: fctr first, then all middle columns in their
# original order, and finally char and num moved to the end
(
GT(exibble)
.cols_reorder([fctr, *rest, char, num])
)
```
This unpacking technique is especially handy for wide tables where you want to pin a few columns
to the start or end without manually listing every column in between. The `*rest` variable
automatically adapts if columns are added to or removed from the dataset, making your table code
more resilient to upstream schema changes.
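Because every column must appear exactly once, it can help to check a programmatically built list before handing it to `cols_reorder()`. The helper `check_reorder()` below is a hypothetical pre-flight check written for this illustration. The wording of its error message is an assumption and may differ from the message raised by the library.

```python
from collections import Counter


def check_reorder(table_cols, new_order):
    """Raise ValueError unless `new_order` lists every table column exactly once."""
    counts = Counter(new_order)
    duplicates = sorted(c for c, n in counts.items() if n > 1)
    missing = [c for c in table_cols if c not in counts]
    extra = sorted(c for c in counts if c not in table_cols)
    if duplicates or missing or extra:
        raise ValueError(
            f"Invalid column order: missing={missing}, extra={extra}, duplicates={duplicates}"
        )
    return list(new_order)


table_cols = ["num", "char", "fctr", "date", "time"]
print(check_reorder(table_cols, ["fctr", "date", "time", "char", "num"]))

try:
    check_reorder(table_cols, ["fctr", "date", "date", "char"])
except ValueError as err:
    print(err)
```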
cols_hide(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf'
Hide one or more columns.
The `cols_hide()` method allows us to hide one or more columns from appearing in the final
output table. While it's possible and often desirable to omit columns from the input table data
before introduction to the `GT()` class, there can be cases where the data in certain columns is
useful (as a column reference during formatting of other columns) but the final display of those
columns is not necessary.
Parameters
----------
columns
The columns to hide in the output display table. Can either be a single column name or a
series of column names provided in a list.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
For this example, we'll use a portion of the `countrypops` dataset to create a simple table.
Let's hide the `year` column with the `cols_hide()` method.
```{python}
from great_tables import GT
from great_tables.data import countrypops
countrypops_mini = countrypops.loc[countrypops["country_name"] == "Benin"][
["country_name", "year", "population"]
].tail(5)
GT(countrypops_mini).cols_hide(columns="year")
```
Details
-------
The hiding of columns is internally a rendering directive, so, all columns that are 'hidden' are
still accessible and useful in any expression provided to a `rows` argument. Furthermore, the
`cols_hide()` method (as with many of the methods available in **Great Tables**) can be placed
anywhere in a chain of calls (acting as a promise to hide columns when the timing is right).
However, there's perhaps greater readability when placing this call closer to the end of such a
chain. The `cols_hide()` method quietly changes the visible state of a column and doesn't yield
warnings when changing the state of already-invisible columns.
cols_unhide(self: 'GTSelf', columns: 'SelectExpr') -> 'GTSelf'
Unhide one or more columns.
The `cols_unhide()` method allows us to unhide one or more columns so that they appear in the
final output table. This may be important in cases where the user obtains a `GT` instance with
hidden columns and there is motivation to reveal one or more of those.
Parameters
----------
columns
The columns to unhide in the output display table. Can either be a single column name or a
series of column names provided in a list.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
For this example, we'll use a portion of the `countrypops` dataset to create a simple table.
We'll hide the `year` column using `cols_hide()` and then unhide it with `cols_unhide()`,
ensuring that the `year` column remains visible in the table.
```{python}
from great_tables import GT
from great_tables.data import countrypops
countrypops_mini = countrypops.loc[countrypops["country_name"] == "Benin"][
["country_name", "year", "population"]
].tail(5)
GT(countrypops_mini).cols_hide(columns="year").cols_unhide(columns="year")
```
cols_merge(self: 'GTSelf', columns: 'SelectExpr', hide_columns: 'SelectExpr | Literal[False]' = None, rows: 'int | list[int] | None' = None, pattern: 'str | None' = None) -> 'GTSelf'
Merge data from two or more columns into a single column.
This method takes input from two or more columns and allows the contents to be merged into a
single column by using a pattern that specifies the arrangement. The first column in the
`columns=` parameter operates as the target column (i.e., the column that will undergo mutation)
whereas all following columns will be untouched. There is the option to hide the non-target
columns. The formatting of values in different columns will be preserved upon merging.
Parameters
----------
columns
The columns for which the merging operations should be applied. The first column name
resolved will be the target column (i.e., undergo mutation) and the other columns will serve
to provide input. Can be a list of column names or a selection expression, though a list is
preferred here to ensure the order of columns is exactly as intended (since order matters
for the `pattern=` parameter).
hide_columns
Any column names provided here will have their state changed to hidden (via internal use
of `.cols_hide()`) if they aren't already hidden. This is convenient if the shared purpose
of these specified columns is only to provide string input to the target column. To
suppress any hiding of columns, `False` can be used here. By default, all columns other
than the first one specified in `columns=` will be hidden.
rows
In conjunction with `columns=`, we can specify which of their rows should participate in
the merging process. The default is all rows, so every row in `columns=` is merged.
Alternatively, we can supply a list of row indices.
pattern
A formatting pattern that specifies the arrangement of the column values and any string
literals. The pattern uses numbers (within `{}`) that correspond to the indices of columns
provided in `columns=`. If two columns are provided in `columns=` and we would like to
combine the cell data onto the first column, `"{0} {1}"` could be used. If a pattern isn't
provided then a space-separated pattern that includes all columns will be generated
automatically. The pattern can also use `<<`/`>>` to surround spans of text that will be
removed if any of the contained `{}` yields a missing value. Further details are provided in
the *How the pattern works* section.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Details
-------
### How the pattern works
There are two types of templating for the `pattern` string:
- `{` `}` for arranging single column values in a row-wise fashion
- `<<` `>>` to surround spans of text that will be removed if any of the contained `{` `}`
yields a missing value
Integer values are placed in `{}` and those values correspond to the columns involved in the
merge, in the order they are provided in the `columns=` argument. So the pattern
`"{0} ({1}-{2})"` corresponds to the target column value listed first in `columns` and the
second and third columns cited (formatted as a range in parentheses). With hypothetical values,
this might result as the merged string `"38.2 (3-8)"`.
Because some values involved in merging may be missing, it is likely that something like
`"38.2 (3-None)"` would be undesirable. For such cases, placing sections of text in `<<>>`
results in the entire span being eliminated if there were to be a `None` value (arising from
`{}` values). We could instead opt for a pattern like `"{0}<< ({1}-{2})>>"`, which results in
`"38.2"` if either column `{1}` or `{2}` has a `None` value. We can even use a more complex
nesting pattern like `"{0}<< ({1}-<<{2}>>)>>"` to retain a lower limit in parentheses (where
`{2}` is `None`) but remove the range altogether if `{1}` is `None`.
One more thing to note here is that if `.sub_missing()` is used on values in a column, those
specific values affected won't be considered truly missing by `.cols_merge()` (since they have
been explicitly handled with substitute text).
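The pattern rules above can be sketched in plain Python. The function `merge_row()` below is a hypothetical, simplified model of how a single row might be resolved. It is not the implementation used by `.cols_merge()`, and it assumes that cell values are plain strings (or `None`) that don't themselves contain `{`, `<<`, or `>>`.

```python
import re

# An innermost `<<...>>` span: no further `<<` or `>>` inside it
INNERMOST = re.compile(r"<<((?:(?!<<|>>).)*)>>", re.DOTALL)
PLACEHOLDER = re.compile(r"\{(\d+)\}")


def merge_row(pattern, values):
    """Resolve `pattern` for one row; `values` holds one cell per merged column."""

    def fill(text):
        # Substitute each `{i}` with the value of the i-th column
        return PLACEHOLDER.sub(lambda m: str(values[int(m.group(1))]), text)

    def resolve_span(match):
        inner = match.group(1)
        used = [int(i) for i in PLACEHOLDER.findall(inner)]
        # Drop the whole span if any referenced value is missing
        if any(values[i] is None for i in used):
            return ""
        return fill(inner)

    # Work from the innermost spans outward so that nesting is respected
    while INNERMOST.search(pattern):
        pattern = INNERMOST.sub(resolve_span, pattern)
    return fill(pattern)


print(merge_row("{0} ({1}-{2})", ["38.2", "3", None]))           # 38.2 (3-None)
print(merge_row("{0}<< ({1}-{2})>>", ["38.2", "3", None]))       # 38.2
print(merge_row("{0}<< ({1}-<<{2}>>)>>", ["38.2", "3", None]))   # 38.2 (3-)
print(merge_row("{0}<< ({1}-<<{2}>>)>>", ["38.2", None, "8"]))   # 38.2
```

Resolving the innermost spans first is what lets a missing `{2}` remove only its own span while the enclosing span, which depends on `{1}`, is kept.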
Examples
--------
Let's use a subset of the `sp500` dataset to create a table. We'll merge the `open` & `close`
columns together, and the `low` & `high` columns (putting an em dash between both).
```{python}
from great_tables import GT
from great_tables.data import sp500
import polars as pl
sp500_mini = (
pl.from_pandas(sp500)
.slice(49, 6)
.select("open", "close", "low", "high")
)
(
GT(sp500_mini)
.fmt_number(
columns=["open", "close", "low", "high"],
decimals=2,
use_seps=False
)
.cols_merge(columns=["open", "close"], pattern="{0}—{1}")
.cols_merge(columns=["low", "high"], pattern="{0}—{1}")
.cols_label(open="open/close", low="low/high")
)
```
Now we'll use a portion of the `gtcars` dataset for the next example, which accounts for missing values in
the `pattern=` parameter. Use the `.cols_merge()` method twice to merge together the: (1) `trq`
and `trq_rpm` columns, and (2) `mpg_c` & `mpg_h` columns. Given the presence of missing values,
we can use patterns with `<<`/`>>` to create conditional text spans, avoiding results where
any of the merged columns have missing values.
```{python}
from great_tables.data import gtcars
import polars.selectors as cs
gtcars_pl = (
pl.from_pandas(gtcars)
.filter(pl.col("year") == 2017)
.select(["mfr", "model", "trq", "trq_rpm", "mpg_c", "mpg_h"])
)
(
GT(gtcars_pl)
.fmt_integer(columns=[cs.starts_with("trq"), cs.starts_with("mpg")])
.cols_merge(columns=["trq", "trq_rpm"], pattern="{0}<< ({1} rpm)>>")
.cols_merge(columns=["mpg_c", "mpg_h"], pattern="<<{0} city<</{1} hwy>>>>")
.cols_label(mfr="Manufacturer", model="Car Model", trq="Torque", mpg_c="MPG")
)
```
cols_merge_uncert(self: 'GTSelf', col_val: 'SelectExpr', col_uncert: 'SelectExpr', rows: 'int | list[int] | None' = None, sep: 'str' = ' +/- ', autohide: 'bool' = True) -> 'GTSelf'
Merge columns to a value-with-uncertainty column.
`cols_merge_uncert()` is a specialized variant of `cols_merge()`. It takes as input a base
value column (`col_val`) and either: (1) a single uncertainty column, or (2) two columns
representing lower and upper uncertainty bounds. These columns will be essentially merged into a
single column (that of `col_val`). What results is a column with values and associated
uncertainties, and any columns specified in `col_uncert` are hidden from appearing in the output
table.
Parameters
----------
col_val
The column that contains values for the base measurement. While column selection
expressions can be used, it's recommended that a single column name be used to ensure that
exactly one column is provided here.
col_uncert
The column or columns that contain uncertainty values. The most common case involves
supplying a single column with uncertainties; these values will be combined with those in
`col_val`. Less commonly, the lower and upper uncertainty bounds may be different. For that
case, two columns representing the lower and upper uncertainty values away from `col_val`,
respectively, should be provided as a list.
rows
In conjunction with `col_val`, we can specify which rows should participate in the merging
process. The default is all rows. Alternatively, we can supply a list of row indices.
sep
The separator text that contains the uncertainty mark for a single uncertainty value. The
default value of `" +/- "` indicates that an appropriate plus/minus mark will be used
depending on the output context. The plus/minus symbol (±) is used in HTML output.
autohide
An option to automatically hide any columns specified in `col_uncert`. Any columns with
their state changed to hidden will behave the same as before, they just won't be displayed
in the finalized table. Defaults to `True`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Details
-------
### Specialized NA handling
This function employs specialized semantics for missing value handling that differ from the
generic `cols_merge()`:
1. Missing values in `col_val` result in missing values for the merged column (e.g.,
`NA` + `0.1` = `NA`)
2. Missing values in `col_uncert` (but not `col_val`) result in base values only for the
merged column (e.g., `12.0` + `NA` = `12.0`)
3. Missing values in both `col_val` and `col_uncert` result in missing values for the merged
column (e.g., `NA` + `NA` = `NA`)
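The three rules can be summarized in a few lines of plain Python. The helper `merge_uncert()` below is a hypothetical model for the single-uncertainty-column case only. It is not the library's implementation, and it leaves the `" +/- "` separator as literal text rather than converting it to a plus/minus symbol.

```python
def merge_uncert(val, uncert, sep=" +/- "):
    """Combine one base value with one uncertainty value for a single cell."""
    if val is None:
        # Rules 1 and 3: a missing base value gives a missing result
        return None
    if uncert is None:
        # Rule 2: a missing uncertainty gives the base value only
        return str(val)
    return f"{val}{sep}{uncert}"


rows = [(12.0, 0.1), (12.0, None), (None, 0.1), (None, None)]
for val, uncert in rows:
    print(repr(merge_uncert(val, uncert)))
```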
Examples
--------
Use the `exibble` dataset to create a simple, two-column table. Merge the `currency` and `num`
columns together as a value with uncertainty.
```{python}
from great_tables import GT
from great_tables.data import exibble
import polars as pl
exibble_mini = (
pl.from_pandas(exibble)
.select("num", "currency")
.slice(0, 7)
)
(
GT(exibble_mini)
.fmt_number(columns="num", decimals=3, use_seps=False)
.cols_merge_uncert(col_val="currency", col_uncert="num")
.cols_label(currency="value + uncert.")
)
```
When there are missing values in the uncertainty column, the merged result shows only the base
value. When the base value itself is missing, the entire merged cell is empty.
```{python}
df = pl.DataFrame({
"measurement": [12.5, 8.3, 15.0, 9.7],
"error": [0.2, None, 0.5, None],
})
(
GT(df)
.fmt_number(columns="error", decimals=2)
.cols_merge_uncert(col_val="measurement", col_uncert="error")
.cols_label(measurement="Measurement")
)
```
cols_merge_range(self: 'GTSelf', col_begin: 'SelectExpr', col_end: 'SelectExpr', rows: 'int | list[int] | None' = None, sep: 'str | None' = None, autohide: 'bool' = True, locale: 'str | None' = None) -> 'GTSelf'
Merge two columns to a value range column.
`cols_merge_range()` is a specialized variant of `cols_merge()`. It operates by taking two
columns that constitute a range of values (`col_begin` and `col_end`) and merges them into a
single column. What results is a column containing both values separated by an en dash (or a
custom separator). The column specified in `col_end` is hidden in the output table (unless `autohide=False`).
Parameters
----------
col_begin
The column that contains values for the start of the range. While column selection
expressions can be used, it's recommended that a single column name be used to ensure that
exactly one column is provided here.
col_end
The column that contains values for the end of the range. While column selection
expressions can be used, it's recommended that a single column name be used to ensure that
exactly one column is provided here.
rows
In conjunction with `col_begin`, we can specify which rows should participate in the
merging process. The default is all rows. Alternatively, we can supply a list of row
indices.
sep
The separator text that indicates the values are ranged. If not provided, an en dash
(`"–"`) will be used. You can use `"--"` for an en dash or `"---"` for an em dash.
autohide
An option to automatically hide the column specified as `col_end`. Any columns with their
state changed to hidden will behave the same as before, they just won't be displayed in
the finalized table. Defaults to `True`.
locale
An optional locale identifier that can be used for applying a separator pattern specific to
a locale's rules. Currently reserved for future use.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Details
-------
### Specialized NA handling
This function employs specialized semantics for missing value handling that differ from the
generic `cols_merge()`:
1. Missing values in `col_begin` (but not `col_end`) result in a display of only the
`col_end` value
2. Missing values in `col_end` (but not `col_begin`) result in a display of only the
`col_begin` value
3. Missing values in both `col_begin` and `col_end` result in missing values for the merged
column
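As a rough illustration of these rules, the hypothetical helper `merge_range()` below resolves a single cell in plain Python. It is a sketch of the documented behavior under the assumption that values are already formatted, not the code used by `cols_merge_range()`.

```python
def merge_range(begin, end, sep="–"):
    """Combine the two ends of a range for a single cell."""
    if begin is None and end is None:
        # Rule 3: both ends missing gives a missing result
        return None
    if begin is None:
        # Rule 1: only the end value is shown
        return str(end)
    if end is None:
        # Rule 2: only the begin value is shown
        return str(begin)
    return f"{begin}{sep}{end}"


rows = [(28, 35), (55, None), (None, 50), (None, None)]
for low, high in rows:
    print(repr(merge_range(low, high, sep=" to ")))
```

When only one side is present, no separator is emitted, so the cell never ends up with a dangling dash.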
Examples
--------
Use a subset of the `gtcars` dataset to create a table. Merge the `mpg_c` and `mpg_h` columns
together as a range.
```{python}
from great_tables import GT
from great_tables.data import gtcars
import polars as pl
gtcars_mini = (
pl.from_pandas(gtcars)
.select("model", "mpg_c", "mpg_h")
.slice(0, 8)
)
(
GT(gtcars_mini)
.cols_merge_range(col_begin="mpg_c", col_end="mpg_h")
.cols_label(mpg_c="MPG")
)
```
When there are missing values, the merged result gracefully degrades: if only one side is
missing, the other value is shown alone (without a separator). A custom separator can be
provided via the `sep=` argument.
```{python}
df = pl.DataFrame({
"city": ["NYC", "LA", "CHI", "HOU"],
"temp_low": [28, 55, None, 45],
"temp_high": [35, None, 50, 60],
})
(
GT(df)
.cols_merge_range(col_begin="temp_low", col_end="temp_high", sep=" to ")
.cols_label(temp_low="Temp. Range (°F)")
)
```
cols_merge_n_pct(self: 'GTSelf', col_n: 'SelectExpr', col_pct: 'SelectExpr', rows: 'int | list[int] | None' = None, autohide: 'bool' = True) -> 'GTSelf'
Merge two columns to combine counts and percentages.
`cols_merge_n_pct()` is a specialized variant of `cols_merge()`. It operates by taking two
columns that constitute both a count (`col_n`) and a fraction of the total population
(`col_pct`) and merges them into a single column. What results is a column containing both
counts and their associated percentages (e.g., `12 (23.2%)`). The column specified in
`col_pct` is hidden in the output table (unless `autohide=False`).
Parameters
----------
col_n
The column that contains values for the count component. While column selection expressions
can be used, it's recommended that a single column name be used to ensure that exactly one
column is provided here.
col_pct
The column that contains values for the percentage component. While column selection
expressions can be used, it's recommended that a single column name be used to ensure that
exactly one column is provided here. This column should be formatted such that percentages
are displayed (e.g., with `fmt_percent()`).
rows
In conjunction with `col_n`, we can specify which rows should participate in the merging
process. The default is all rows. Alternatively, we can supply a list of row indices.
autohide
An option to automatically hide the column specified as `col_pct`. Any columns with their
state changed to hidden will behave the same as before, they just won't be displayed in
the finalized table. Defaults to `True`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Details
-------
### Specialized NA and zero-value handling
This function employs specialized semantics for missing value and zero-value handling:
1. Missing values in `col_n` result in missing values for the merged column (e.g.,
`NA` + `10.2%` = `NA`)
2. Missing values in `col_pct` (but not `col_n`) result in base values only for the merged
column (e.g., `13` + `NA` = `13`)
3. Missing values in both `col_n` and `col_pct` result in missing values for the merged
column (e.g., `NA` + `NA` = `NA`)
4. If a zero (`0`) value is in `col_n` then the formatted output will be `"0"` (i.e., no
percentage will be shown)
It is the responsibility of the user to ensure that values are correct in both the `col_n` and
`col_pct` columns (this function neither generates nor recalculates values in either).
Formatting of each column can be done independently in separate `fmt_number()` and
`fmt_percent()` calls.
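The four rules above can be sketched in plain Python. This is only an illustration of the stated semantics, not the library's implementation: `merge_n_pct` is a hypothetical helper that works on already-formatted strings, and it assumes a zero count is rendered as the string `"0"`.

```python
from typing import Optional


def merge_n_pct(n: Optional[str], pct: Optional[str]) -> Optional[str]:
    """Combine formatted count and percentage strings (hypothetical helper)."""
    if n is None:
        # Rules 1 and 3: a missing count always gives a missing result.
        return None
    if n == "0":
        # Rule 4: a zero count suppresses the percentage.
        return "0"
    if pct is None:
        # Rule 2: a missing percentage leaves only the count.
        return n
    return f"{n} ({pct})"


# Illustrative inputs, one per case described in the list above.
examples = [("12", "23.2%"), (None, "10.2%"), ("13", None), (None, None), ("0", "0.0%")]
results = [merge_n_pct(n, pct) for n, pct in examples]
```

Here `results` holds one merged value per input pair, with `None` standing in for a missing cell.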
Examples
--------
Create a simple table with counts and percentages, then merge them.
```{python}
from great_tables import GT
import polars as pl
df = pl.DataFrame({
"category": ["A", "B", "C"],
"n": [10, 20, 30],
"pct": [0.167, 0.333, 0.500],
})
(
GT(df)
.fmt_percent(columns="pct")
.cols_merge_n_pct(col_n="n", col_pct="pct")
.cols_label(n="Count (%)")
)
```
Zero values in the count column suppress the percentage display. Missing values in the
percentage column result in just the count being shown, and missing counts produce empty cells.
```{python}
df = pl.DataFrame({
"item": ["Alpha", "Beta", "Gamma", "Delta"],
"count": [15, 0, 8, None],
"frac": [0.375, 0.0, None, 0.125],
})
(
GT(df)
.fmt_percent(columns="frac", decimals=1)
.cols_merge_n_pct(col_n="count", col_pct="frac")
.cols_label(count="N (%)")
)
```
## Adding rows
The [`summary_rows()`](`great_tables.GT.summary_rows`) function adds rows to summarize data within each row group, while [`grand_summary_rows()`](`great_tables.GT.grand_summary_rows`) summarizes across the entire table.
summary_rows(self: 'GTSelf', *, fns: 'dict[str, PlExpr] | dict[str, Callable[[TblData], Any]]', fmt: 'FormatFn | None' = None, columns: 'SelectExpr' = None, groups: 'list[str] | None' = None, side: "Literal['bottom', 'top']" = 'bottom', missing_text: 'str' = '---') -> 'GTSelf'
Add group-wise summary rows to the table.
Add summary rows by using the table data and any suitable aggregation functions. With
`summary_rows()`, the data within each row group is aggregated separately and summary rows are
placed adjacent to each group. Multiple summary rows can be added via expressions given to
`fns=`. You can selectively format the values in the resulting summary cells by use of
formatting expressions from the `vals.fmt_*` class of functions.
Note that currently all arguments are keyword-only, since the final positions may change.
Parameters
----------
fns
A dictionary mapping row labels to aggregation expressions. Can be either Polars expressions
or callable functions that take a DataFrame subset and return aggregated results. Each key
becomes the label for a summary row within each group.
fmt
A formatting function from the `vals.fmt_*` family (e.g., `vals.fmt_number`,
`vals.fmt_currency`) to apply to the summary row values. If `None`, no formatting is
applied.
columns
Currently, this function does not support selection by columns. If you would like to choose
which columns to summarize, you can select columns within the functions given to `fns=`.
See examples below for more explicit cases.
groups
The groups to target for summary row insertion. Can be a list of group IDs as strings. By
default (`None`), summary rows are generated for all groups.
side
Should the summary rows be placed at the `"bottom"` (the default) or the `"top"` of each
group?
missing_text
The text to be used in summary cells with no data outputs.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use a subset of the `gtcars` dataset to create a table with group summary rows. We'll
group by manufacturer and show min and max values for horsepower and torque columns.
```{python}
import polars as pl
from great_tables import GT, vals
from great_tables.data import gtcars
gtcars_mini = (
pl.from_pandas(gtcars)
.select(["mfr", "model", "hp", "trq"])
.head(12)
)
(
GT(gtcars_mini, rowname_col="model", groupname_col="mfr")
.summary_rows(
fns={
"Min": pl.col("hp", "trq").min(),
"Max": pl.col("hp", "trq").max(),
},
fmt=vals.fmt_integer,
)
)
```
We can also target specific groups by using the `groups=` parameter. Here we only show
summary rows for the `"Ferrari"` group:
```{python}
(
GT(gtcars_mini, rowname_col="model", groupname_col="mfr")
.summary_rows(
fns={
"Average": pl.col("hp", "trq").mean(),
},
groups=["Ferrari"],
fmt=vals.fmt_number,
)
)
```
Callable functions work with pandas DataFrames. Each function receives the subset of data
for that group:
```{python}
from great_tables import GT, vals
from great_tables.data import gtcars
(
GT(
gtcars[["mfr", "model", "hp", "trq"]].head(12),
rowname_col="model",
groupname_col="mfr",
)
.summary_rows(
fns={
"Min": lambda df: df.min(numeric_only=True),
"Max": lambda df: df.max(numeric_only=True),
},
fmt=vals.fmt_integer,
)
)
```
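To make the "subset of data for that group" idea concrete, the following sketch mimics the per-group call with a plain pandas `groupby()`. It is an illustration under assumptions, not a description of how `summary_rows()` works internally; the small `cars` frame and its values are made up for the example.

```python
import pandas as pd

# Hypothetical data: two groups identified by the "mfr" column.
cars = pd.DataFrame(
    {
        "mfr": ["Ford", "Ferrari", "Ferrari"],
        "model": ["GT", "458 Speciale", "458 Spider"],
        "hp": [647, 597, 562],
        "trq": [550, 398, 398],
    }
)

# The same kind of callable that could be passed in `fns=`.
min_fn = lambda df: df.min(numeric_only=True)

# Call the function once per group, each time with only that group's rows.
per_group = {mfr: min_fn(subset) for mfr, subset in cars.groupby("mfr", sort=False)}
```

Each value in `per_group` is a pandas Series indexed by the numeric column names, which is the shape of result a summary row needs (one value per column).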
Summary rows can be placed at the top of each group using `side="top"`:
```{python}
import polars as pl
from great_tables import GT, vals
from great_tables.data import gtcars
gtcars_mini = (
pl.from_pandas(gtcars)
.select(["mfr", "model", "hp", "trq"])
.head(12)
)
(
GT(gtcars_mini, rowname_col="model", groupname_col="mfr")
.summary_rows(
fns={"Mean": pl.col("hp", "trq").mean()},
side="top",
fmt=vals.fmt_number,
)
)
```
Combining group summaries with grand summary rows and styling provides a comprehensive
summary view of the data. Use `loc.summary()` to style all group summary cells:
```{python}
import polars as pl
from great_tables import GT, vals, style, loc
from great_tables.data import gtcars
gtcars_mini = (
pl.from_pandas(gtcars)
.select(["mfr", "model", "hp", "trq"])
.head(12)
)
(
GT(gtcars_mini, rowname_col="model", groupname_col="mfr")
.summary_rows(
fns={
"Min": pl.col("hp", "trq").min(),
"Max": pl.col("hp", "trq").max(),
},
fmt=vals.fmt_integer,
)
.grand_summary_rows(
fns={"Overall Mean": pl.col("hp", "trq").mean()},
fmt=vals.fmt_number,
)
.tab_style(
style=[style.fill(color="lightyellow")],
locations=loc.summary(),
)
.tab_style(
style=[style.fill(color="lightblue")],
locations=loc.grand_summary(),
)
)
```
When groups are displayed as a column in the stub (using `row_group_as_column=True`),
the summary row labels span the stub columns:
```{python}
import polars as pl
from great_tables import GT, vals
from great_tables.data import gtcars
gtcars_mini = (
pl.from_pandas(gtcars)
.select(["mfr", "model", "hp", "trq"])
.head(12)
)
(
GT(gtcars_mini, rowname_col="model", groupname_col="mfr")
.tab_options(row_group_as_column=True)
.summary_rows(
fns={
"Min": pl.col("hp", "trq").min(),
"Max": pl.col("hp", "trq").max(),
},
fmt=vals.fmt_integer,
)
)
```
grand_summary_rows(self: 'GTSelf', *, fns: 'dict[str, PlExpr] | dict[str, Callable[[TblData], Any]]', fmt: 'FormatFn | None' = None, columns: 'SelectExpr' = None, side: "Literal['bottom', 'top']" = 'bottom', missing_text: 'str' = '---') -> 'GTSelf'
Add grand summary rows to the table.
Add grand summary rows by using the table data and any suitable aggregation functions. With
grand summary rows, all of the available data in the table is incorporated (regardless of
whether some of the data are part of row groups). Multiple grand summary rows can be added via
expressions given to `fns=`. You can selectively format the values in the resulting grand summary
cells by use of formatting expressions from the `vals.fmt_*` class of functions.
Note that currently all arguments are keyword-only, since the final positions may change.
Parameters
----------
fns
A dictionary mapping row labels to aggregation expressions. Can be either Polars
expressions or callable functions that take the entire DataFrame and return aggregated
results. Each key becomes the label for a grand summary row.
fmt
A formatting function from the `vals.fmt_*` family (e.g., `vals.fmt_number`,
`vals.fmt_currency`) to apply to the summary row values. If `None`, no formatting
is applied.
columns
Currently, this function does not support selection by columns. If you would like to choose
which columns to summarize, you can select columns within the functions given to `fns=`.
See examples below for more explicit cases.
side
Should the grand summary rows be placed at the `"bottom"` (the default) or the `"top"` of
the table?
missing_text
The text to be used in summary cells with no data outputs.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's use a subset of the `sp500` dataset to create a table with grand summary rows. We'll
calculate min, max, and mean values for the numeric columns. Notice the different
approaches to selecting columns to apply the aggregations to: we can use polars selectors
or select the columns directly.
```{python}
import polars as pl
import polars.selectors as cs
from great_tables import GT, vals, style, loc
from great_tables.data import sp500
sp500_mini = (
pl.from_pandas(sp500)
.slice(0, 7)
.drop(["volume", "adj_close"])
)
(
GT(sp500_mini, rowname_col="date")
.grand_summary_rows(
fns={
"Minimum": pl.min("open", "high", "low", "close"),
"Maximum": pl.col("open", "high", "low", "close").max(),
"Average": cs.numeric().mean(),
},
fmt=vals.fmt_currency,
)
.tab_style(
style=[
style.text(color="crimson"),
style.fill(color="lightgray"),
],
locations=loc.grand_summary(),
)
)
```
We can also use custom callable functions to create more complex summary calculations.
Notice here that grand summary rows can be placed at the top of the table and formatted
as integers by passing a formatter from the `vals.fmt_*` class of functions.
```{python}
from great_tables import GT, style, loc, vals
from great_tables.data import gtcars
def pd_median(df):
return df.median(numeric_only=True)
(
GT(
gtcars[["mfr", "model", "hp", "trq", "mpg_c"]].head(6),
rowname_col="model",
)
.fmt_integer(columns=["hp", "trq", "mpg_c"])
.grand_summary_rows(
fns={
"Min": lambda df: df.min(numeric_only=True),
"Max": lambda df: df.max(numeric_only=True),
"Median": pd_median,
},
side="top",
fmt=vals.fmt_integer,
)
.tab_style(
style=[style.text(color="crimson", weight="bold"), style.fill(color="lightgray")],
locations=loc.grand_summary_stub(),
)
)
```
## Location Targeting and Styling Classes
Location targeting is a powerful feature of Great Tables. It allows for the precise selection of table locations for styling (using the `tab_style()` method). The styling classes allow for the specification of the styling properties to be applied to the targeted locations.
LocHeader() -> None
Target the table header (title and subtitle).
With `loc.header()`, we can target the table header which contains the title and the subtitle.
This is useful for applying custom styling with the
[`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and
this class should be used there to perform the targeting.
Returns
-------
LocHeader
A LocHeader object, which is used for a `locations=` argument if specifying the header of
the table.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We will style the entire table header
(the 'title' and 'subtitle' parts). This can be done by using `locations=loc.header()` within
[`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(gtcars[["mfr", "model", "msrp"]].head(5))
.tab_header(
title="Select Cars from the gtcars Dataset",
subtitle="Only the first five cars are displayed"
)
.tab_style(
style=style.fill(color="lightblue"),
locations=loc.header()
)
.fmt_currency(columns="msrp", decimals=0)
)
```
LocTitle() -> None
Target the table title.
With `loc.title()`, we can target the part of the table containing the title (within the table
header). This is useful for applying custom styling with the
[`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and
this class should be used there to perform the targeting.
Returns
-------
LocTitle
A LocTitle object, which is used for a `locations=` argument if specifying the title of the
table.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We will style only the 'title' part
of the table header (leaving the 'subtitle' part unaffected). This can be done by using
`locations=loc.title()` within [`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(gtcars[["mfr", "model", "msrp"]].head(5))
.tab_header(
title="Select Cars from the gtcars Dataset",
subtitle="Only the first five cars are displayed"
)
.tab_style(
style=style.text(color="blue", size="large", weight="bold"),
locations=loc.title()
)
.fmt_currency(columns="msrp", decimals=0)
)
```
LocSubTitle() -> None
Target the table subtitle.
With `loc.subtitle()`, we can target the part of the table containing the subtitle (within the table
header). This is useful for applying custom styling with the
[`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and
this class should be used there to perform the targeting.
Returns
-------
LocSubTitle
A LocSubTitle object, which is used for a `locations=` argument if specifying the subtitle
of the table.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We will style only the 'subtitle'
part of the table header (leaving the 'title' part unaffected). This can be done by using
`locations=loc.subtitle()` within [`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(gtcars[["mfr", "model", "msrp"]].head(5))
.tab_header(
title="Select Cars from the gtcars Dataset",
subtitle="Only the first five cars are displayed"
)
.tab_style(
style=style.fill(color="lightblue"),
locations=loc.subtitle()
)
.fmt_currency(columns="msrp", decimals=0)
)
```
LocStubhead() -> None
Target the stubhead.
With `loc.stubhead()`, we can target the part of the table that resides both at the top of the
stub and also beside the column header. This is useful for applying custom styling with the
[`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and
this class should be used there to perform the targeting.
Returns
-------
LocStubhead
A LocStubhead object, which is used for a `locations=` argument if specifying the stubhead
of the table.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. This table contains a stub (produced
by setting `rowname_col="model"` in the initial `GT()` call). The stubhead is given a label by
way of the [`tab_stubhead()`](`great_tables.GT.tab_stubhead`) method and this label can be
styled by using `locations=loc.stubhead()` within [`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(
gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5),
rowname_col="model",
groupname_col="mfr"
)
.tab_stubhead(label="car")
.tab_style(
style=style.text(color="red", weight="bold"),
locations=loc.stubhead()
)
.fmt_integer(columns=["hp", "trq"])
.fmt_currency(columns="msrp", decimals=0)
)
```
LocColumnHeader() -> None
Target column spanners and column labels.
With `loc.column_header()`, we can target the column header which contains all of the column
labels and any spanner labels that are present. This is useful for applying custom styling with
the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument
and this class should be used there to perform the targeting.
Returns
-------
LocColumnHeader
A LocColumnHeader object, which is used for a `locations=` argument if specifying the column
header of the table.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We create spanner labels through
use of the [`tab_spanner()`](`great_tables.GT.tab_spanner`) method; this gives us a column
header with a mix of column labels and spanner labels. We will style the entire column header at
once by using `locations=loc.column_header()` within
[`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5))
.tab_spanner(
label="performance",
columns=["hp", "trq"]
)
.tab_spanner(
label="make and model",
columns=["mfr", "model"]
)
.tab_style(
style=[
style.text(color="white", weight="bold"),
style.fill(color="steelblue")
],
locations=loc.column_header()
)
.fmt_integer(columns=["hp", "trq"])
.fmt_currency(columns="msrp", decimals=0)
)
```
LocSpannerLabels(ids: 'SelectExpr' = None) -> None
Target spanner labels.
With `loc.spanner_labels()`, we can target the cells containing the spanner labels. This is
useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method.
That method has a `locations=` argument and this class should be used there to perform the
targeting.
Parameters
----------
ids
The ID values for the spanner labels to target. A list of one or more ID values is required.
Returns
-------
LocSpannerLabels
A LocSpannerLabels object, which is used for a `locations=` argument if specifying the
table's spanner labels.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We create two spanner labels through
two separate calls of the [`tab_spanner()`](`great_tables.GT.tab_spanner`) method. In each of
those, the text supplied to the `label=` argument is used as the ID value (though an ID can also
be explicitly set via the `id=` argument). We will style only the spanner label having the text
`"performance"` by using `locations=loc.spanner_labels(ids=["performance"])` within
[`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5))
.tab_spanner(
label="performance",
columns=["hp", "trq"]
)
.tab_spanner(
label="make and model",
columns=["mfr", "model"]
)
.tab_style(
style=style.text(color="blue", weight="bold"),
locations=loc.spanner_labels(ids=["performance"])
)
.fmt_integer(columns=["hp", "trq"])
.fmt_currency(columns="msrp", decimals=0)
)
```
LocColumnLabels(columns: 'SelectExpr' = None) -> None
Target column labels.
With `loc.column_labels()`, we can target the cells containing the column labels. This is useful
for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That
method has a `locations=` argument and this class should be used there to perform the targeting.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list. If no columns are specified, all columns are targeted.
Returns
-------
LocColumnLabels
A LocColumnLabels object, which is used for a `locations=` argument if specifying the
table's column labels.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We will style all three of the column
labels by using `locations=loc.column_labels()` within
[`tab_style()`](`great_tables.GT.tab_style`). Note that no specification of `columns=` is needed
here because we want to target all columns.
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(gtcars[["mfr", "model", "msrp"]].head(5))
.tab_style(
style=style.text(color="blue", size="large", weight="bold"),
locations=loc.column_labels()
)
)
```
LocGrandSummaryStub(rows: 'RowSelectExpr' = None) -> None
Target the grand summary stub.
With `loc.grand_summary_stub()` we can target the cells containing the grand summary row labels,
which reside in the table stub. This is useful for applying custom styling with the
[`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and
this class should be used there to perform the targeting.
Parameters
----------
rows
The rows to target within the grand summary stub. Can either be a single row name or a
series of row names provided in a list. If no rows are specified, all grand summary rows
are targeted. Note that if rows are targeted by index, top and bottom grand summary rows
are indexed as one combined list starting with the top rows.
Returns
-------
LocGrandSummaryStub
A LocGrandSummaryStub object, which is used for a `locations=` argument if specifying the
table's grand summary rows' labels.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We will style the entire table grand
summary stub (the row labels) by using `locations=loc.grand_summary_stub()` within
[`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc, vals
from great_tables.data import gtcars
(
GT(
gtcars[["mfr", "model", "hp", "trq", "mpg_c"]].head(6),
rowname_col="model",
)
.fmt_integer(columns=["hp", "trq", "mpg_c"])
.grand_summary_rows(
fns={
"Min": lambda df: df.min(numeric_only=True),
"Max": lambda x: x.max(numeric_only=True),
},
side="top",
fmt=vals.fmt_integer,
)
.tab_style(
style=[style.text(color="crimson", weight="bold"), style.fill(color="lightgray")],
locations=loc.grand_summary_stub(),
)
)
```
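The note on index-based targeting in the `rows` description can be pictured with plain lists. This sketch only illustrates the stated ordering rule (top rows first, then bottom rows); the row labels are hypothetical and no library code is involved.

```python
# Hypothetical labels; only the ordering rule comes from the `rows` description.
top_rows = ["Min", "Max"]  # grand summary rows added with side="top"
bottom_rows = ["Mean"]     # grand summary rows added with side="bottom"

# Index-based targeting sees one combined list that starts with the top rows.
combined = top_rows + bottom_rows

# Indices 0 and 2 would therefore refer to "Min" and "Mean".
targeted = [combined[i] for i in [0, 2]]
```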
LocStub(rows: 'RowSelectExpr' = None) -> None
Target the table stub.
With `loc.stub()` we can target the cells containing the row labels, which reside in the table
stub. This is useful for applying custom styling with the
[`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and
this class should be used there to perform the targeting.
Parameters
----------
rows
The rows to target within the stub. Can either be a single row name or a series of row names
provided in a list. If no rows are specified, all rows are targeted.
Returns
-------
LocStub
A LocStub object, which is used for a `locations=` argument if specifying the table's stub.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We will style the entire table stub
(the row labels) by using `locations=loc.stub()` within
[`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(
gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5),
rowname_col="model",
groupname_col="mfr"
)
.tab_stubhead(label="car")
.tab_style(
style=[
style.text(color="crimson", weight="bold"),
style.fill(color="lightgray")
],
locations=loc.stub()
)
.fmt_integer(columns=["hp", "trq"])
.fmt_currency(columns="msrp", decimals=0)
)
```
LocRowGroups(rows: 'RowSelectExpr' = None) -> None
Target row groups.
With `loc.row_groups()` we can target the cells containing the row group labels, which span
across the table body. This is useful for applying custom styling with the
[`tab_style()`](`great_tables.GT.tab_style`) method. That method has a `locations=` argument and
this class should be used there to perform the targeting.
Parameters
----------
rows
The row groups to target. Can either be a single group name or a series of group names
provided in a list. If no groups are specified, all are targeted.
Returns
-------
LocRowGroups
A LocRowGroups object, which is used for a `locations=` argument if specifying the table's
row groups.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We will style all of the cells
comprising the row group labels by using `locations=loc.row_groups()` within
[`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(
gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5),
rowname_col="model",
groupname_col="mfr"
)
.tab_stubhead(label="car")
.tab_style(
style=[
style.text(color="crimson", weight="bold"),
style.fill(color="lightgray")
],
locations=loc.row_groups()
)
.fmt_integer(columns=["hp", "trq"])
.fmt_currency(columns="msrp", decimals=0)
)
```
LocGrandSummary(columns: 'SelectExpr' = None, rows: 'RowSelectExpr' = None, mask: 'PlExpr | None' = None) -> None
Target the data cells in grand summary rows.
With `loc.grand_summary()` we can target the cells containing the grand summary data.
This is useful for applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`)
method. That method has a `locations=` argument and this class should be used there to perform
the targeting.
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
The rows to target. Can either be a single row name or a series of row names provided in a
list. Note that if rows are targeted by index, top and bottom grand summary rows are indexed
as one combined list starting with the top rows.
Returns
-------
LocGrandSummary
A LocGrandSummary object, which is used for a `locations=` argument if specifying the
table's grand summary rows.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We will style all of the grand
summary cells by using `locations=loc.grand_summary()` within
[`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc, vals
from great_tables.data import gtcars
(
GT(
gtcars[["mfr", "model", "hp", "trq", "mpg_c"]].head(6),
rowname_col="model",
)
.fmt_integer(columns=["hp", "trq", "mpg_c"])
.grand_summary_rows(
fns={
"Min": lambda df: df.min(numeric_only=True),
"Max": lambda x: x.max(numeric_only=True),
},
side="top",
fmt=vals.fmt_integer,
)
.tab_style(
style=[style.text(color="crimson", weight="bold"), style.fill(color="lightgray")],
locations=loc.grand_summary(),
)
)
```
LocBody(columns: 'SelectExpr' = None, rows: 'RowSelectExpr' = None, mask: 'PlExpr | None' = None) -> None
Target data cells in the table body.
With `loc.body()`, we can target the data cells in the table body. This is useful for applying
custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a
`locations=` argument and this class should be used there to perform the targeting.
:::{.callout-warning}
`mask=` is still experimental.
:::
Parameters
----------
columns
The columns to target. Can either be a single column name or a series of column names
provided in a list.
rows
The rows to target. Can either be a single row name or a series of row names provided in a
list.
mask
The cells to target. If the underlying wrapped DataFrame is a Polars DataFrame,
you can pass a Polars expression for cell-based selection. This argument must be used
exclusively and cannot be combined with the `columns=` or `rows=` arguments.
Returns
-------
LocBody
A LocBody object, which is used for a `locations=` argument if specifying the table body.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. We will style all of the body cells
by using `locations=loc.body()` within [`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(
gtcars[["mfr", "model", "hp", "trq", "msrp"]].head(5),
rowname_col="model",
groupname_col="mfr"
)
.tab_stubhead(label="car")
.tab_style(
style=[
style.text(color="darkblue", weight="bold"),
style.fill(color="gainsboro")
],
locations=loc.body()
)
.fmt_integer(columns=["hp", "trq"])
.fmt_currency(columns="msrp", decimals=0)
)
```
LocFooter() -> None
Target the table footer.
With `loc.footer()` we can target the table's footer, which currently contains the source notes
(and may contain a 'footnotes' location in the future). This is useful when applying custom
styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a
`locations=` argument and this class should be used there to perform the targeting. The 'footer'
location is generated by [`tab_source_note()`](`great_tables.GT.tab_source_note`).
Returns
-------
LocFooter
A `LocFooter` object, which is used for a `locations=` argument if specifying the footer of
the table.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. Add a source note (with
[`tab_source_note()`](`great_tables.GT.tab_source_note`)) and style this footer section inside of
[`tab_style()`](`great_tables.GT.tab_style`) with `locations=loc.footer()`.
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(gtcars[["mfr", "model", "msrp"]].head(5))
.tab_source_note(source_note="From edmunds.com")
.tab_style(
style=style.text(color="blue", size="small", weight="bold"),
locations=loc.footer()
)
)
```
LocSourceNotes() -> None
Target the source notes.
With `loc.source_notes()`, we can target the source notes in the table. This is useful when
applying custom styling with the [`tab_style()`](`great_tables.GT.tab_style`) method. That method has a
`locations=` argument and this class should be used there to perform the targeting. The
'source_notes' location is generated by
[`tab_source_note()`](`great_tables.GT.tab_source_note`).
Returns
-------
LocSourceNotes
A `LocSourceNotes` object, which is used for a `locations=` argument if specifying the
source notes.
Examples
--------
Let's use a subset of the `gtcars` dataset in a new table. Add a source note (with
[`tab_source_note()`](`great_tables.GT.tab_source_note`)) and style the source notes section
inside [`tab_style()`](`great_tables.GT.tab_style`) with `locations=loc.source_notes()`.
```{python}
from great_tables import GT, style, loc
from great_tables.data import gtcars
(
GT(gtcars[["mfr", "model", "msrp"]].head(5))
.tab_source_note(source_note="From edmunds.com")
.tab_style(
style=style.text(color="blue", size="small", weight="bold"),
locations=loc.source_notes()
)
)
```
CellStyleFill(color: 'str | ColumnExpr') -> None
A style specification for the background fill of targeted cells.
The `style.fill()` class is to be used with the `tab_style()` method, which itself allows for
the setting of custom styles to one or more cells. Specifically, the call to `style.fill()`
should be bound to the `style=` argument of `tab_style()`.
Parameters
----------
color
The color to use for the cell background fill. This can be any valid CSS color value, such
as a hex code, a named color, or an RGB value.
Returns
-------
CellStyleFill
A CellStyleFill object, which is used for the `style=` argument if specifying a cell fill
value.
Examples
--------
See [`GT.tab_style()`](`great_tables.GT.tab_style`).
CellStyleText(color: 'str | ColumnExpr | None' = None, font: 'str | ColumnExpr | GoogleFont | None' = None, size: 'str | ColumnExpr | None' = None, align: "Literal['center', 'left', 'right', 'justify'] | ColumnExpr | None" = None, v_align: "Literal['middle', 'top', 'bottom'] | ColumnExpr | None" = None, style: "Literal['normal', 'italic', 'oblique'] | ColumnExpr | None" = None, weight: "Literal['normal', 'bold', 'bolder', 'lighter'] | ColumnExpr | None" = None, stretch: "Literal['normal', 'condensed', 'ultra-condensed', 'extra-condensed', 'semi-condensed', 'semi-expanded', 'expanded', 'extra-expanded', 'ultra-expanded'] | ColumnExpr | None" = None, decorate: "Literal['overline', 'line-through', 'underline', 'underline overline'] | ColumnExpr | None" = None, transform: "Literal['uppercase', 'lowercase', 'capitalize'] | ColumnExpr | None" = None, whitespace: "Literal['normal', 'nowrap', 'pre', 'pre-wrap', 'pre-line', 'break-spaces'] | ColumnExpr | None" = None) -> None
A style specification for cell text.
The `style.text()` class is to be used with the `tab_style()` method, which itself allows for
the setting of custom styles to one or more cells. With it, you can specify the color of the
text, the font family, the font size, and the horizontal and vertical alignment of the text and
more.
Parameters
----------
color
The text color can be modified through the `color` argument.
font
The font or collection of fonts (subsequent font names are used as fallbacks).
size
The size of the font. Can be provided as a number that is assumed to represent `px` values
(or could be wrapped in the `px()` helper function). We can also use one of the following
absolute size keywords: `"xx-small"`, `"x-small"`, `"small"`, `"medium"`, `"large"`,
`"x-large"`, or `"xx-large"`.
align
The text in a cell can be horizontally aligned through one of the following options:
`"center"`, `"left"`, `"right"`, or `"justify"`.
v_align
The vertical alignment of the text in the cell can be modified through the options
`"middle"`, `"top"`, or `"bottom"`.
style
Can be one of either `"normal"`, `"italic"`, or `"oblique"`.
weight
The weight of the font can be modified through a text-based option such as `"normal"`,
`"bold"`, `"lighter"`, `"bolder"`, or a numeric value between `1` and `1000`, inclusive.
Note that only variable fonts may support the numeric mapping of weight.
stretch
Allows for text to either be condensed or expanded. We can use one of the following
text-based keywords to describe the degree of condensation/expansion: `"ultra-condensed"`,
`"extra-condensed"`, `"condensed"`, `"semi-condensed"`, `"normal"`, `"semi-expanded"`,
`"expanded"`, `"extra-expanded"`, or `"ultra-expanded"`. Alternatively, we can supply
percentage values from `0%` to `200%`, inclusive. Negative percentage values are not
allowed.
decorate
Allows for a text decoration effect to be applied. Here, we can use `"overline"`,
`"line-through"`, `"underline"`, or `"underline overline"`.
transform
Allows for the transformation of text. Options are `"uppercase"`, `"lowercase"`, or
`"capitalize"`.
whitespace
A white-space preservation option. By default, runs of white-space will be collapsed into
single spaces but several options exist to govern how white-space is collapsed and how lines
might wrap at soft-wrap opportunities. The options are `"normal"`, `"nowrap"`, `"pre"`,
`"pre-wrap"`, `"pre-line"`, and `"break-spaces"`.
Returns
-------
CellStyleText
A CellStyleText object, which is used for the `style=` argument of `tab_style()` when
specifying any cell text properties.
Examples
--------
See [`GT.tab_style()`](`great_tables.GT.tab_style`).
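As a rough illustration of how the arguments above relate to CSS, the sketch below maps each `style.text()` argument name onto the CSS property it most plausibly corresponds to. The `ARG_TO_CSS` table and the `text_style_to_css()` helper are hypothetical names written for this illustration. They are not part of **Great Tables**, and the library's own rendering may differ in detail.

```python
# Hypothetical mapping from `style.text()` argument names to CSS properties.
# This is an illustration only, not the library's implementation.
ARG_TO_CSS = {
    "color": "color",
    "font": "font-family",
    "size": "font-size",
    "align": "text-align",
    "v_align": "vertical-align",
    "style": "font-style",
    "weight": "font-weight",
    "stretch": "font-stretch",
    "decorate": "text-decoration",
    "transform": "text-transform",
    "whitespace": "white-space",
}


def text_style_to_css(**kwargs: str) -> str:
    """Build an inline CSS string from `style.text()`-like keyword arguments."""
    declarations = []
    for arg, value in kwargs.items():
        if arg not in ARG_TO_CSS:
            raise ValueError(f"unknown text style argument: {arg!r}")
        declarations.append(f"{ARG_TO_CSS[arg]}: {value};")
    return " ".join(declarations)


css = text_style_to_css(color="navy", weight="bold", transform="uppercase")
print(css)
```

Reading the argument list this way can help when choosing between `style.text()` and a hand-written rule passed to `style.css()`.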
CellStyleBorders(sides: "Literal['all', 'top', 'bottom', 'left', 'right'] | list[Literal['all', 'top', 'bottom', 'left', 'right']]" = 'all', color: 'str | ColumnExpr' = '#000000', style: 'str | ColumnExpr' = 'solid', weight: 'str | ColumnExpr' = '1px') -> None
A style specification for cell borders.
The `style.borders()` class is to be used with the `tab_style()` method, which itself allows
for the setting of custom styles to one or more cells. The `sides` argument is where we define
which borders should be modified (e.g., `"left"`, `"right"`, etc.). With that selection, the
`color`, `style`, and `weight` of the selected borders can then be modified.
Parameters
----------
sides
The border sides to be modified. Options include `"left"`, `"right"`, `"top"`, and
`"bottom"`. For all borders surrounding the selected cells, we can use the `"all"` option.
color
The border `color` can be defined with any valid CSS color value, such as a hex code, a
named color, or an RGB value. The default `color` value is `"#000000"` (black).
style
The border `style` can be one of either `"solid"` (the default), `"dashed"`, `"dotted"`,
`"hidden"`, or `"double"`.
weight
The default value for `weight` is `"1px"` and higher values will become more visually
prominent.
Returns
-------
CellStyleBorders
A CellStyleBorders object, which is used for the `style=` argument of `tab_style()` when
specifying cell borders.
Examples
--------
See [`GT.tab_style()`](`great_tables.GT.tab_style`).
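To make the role of `sides` concrete, here is a minimal sketch of how a side selection could expand into per-side CSS declarations. The `border_declarations()` helper is hypothetical and only mirrors the defaults shown in the signature above. It is not how **Great Tables** necessarily builds its output.

```python
def border_declarations(sides="all", color="#000000", style="solid", weight="1px"):
    """Expand a `sides` selection into one CSS declaration per border side."""
    all_sides = ["top", "bottom", "left", "right"]

    # Accept either a single side name or a list of side names.
    requested = [sides] if isinstance(sides, str) else list(sides)

    # "all" is shorthand for every side of the cell.
    if "all" in requested:
        requested = all_sides

    unknown = [side for side in requested if side not in all_sides]
    if unknown:
        raise ValueError(f"unknown border sides: {unknown}")

    return [f"border-{side}: {weight} {style} {color};" for side in requested]


print(border_declarations())
print(border_declarations(["top", "bottom"], color="red"))
```

Passing a list such as `["top", "bottom"]` targets only those two sides, whereas the default `"all"` covers every side of the selected cells.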
CellStyleCss(rule: 'str') -> None
A style specification for custom CSS rules.
The `style.css()` class is to be used with the `tab_style()` method, which itself allows for
the setting of custom styles to one or more cells. With `style.css()`, you can specify any CSS
rule that you would like to apply to the targeted cells.
Parameters
----------
rule
The CSS rule to apply to the targeted cells. This can be any valid CSS rule, such as
`background-color: red;` or `font-size: 14px;`.
Returns
-------
CellStyleCss
A CellStyleCss object, which is used for the `style=` argument of `tab_style()` when
specifying a custom CSS rule.
Examples
--------
See [`GT.tab_style()`](`great_tables.GT.tab_style`).
## Helper Functions
An assortment of helper functions is available in the Great Tables package. The `md()` and `html()` helper functions can be used during label creation with the `tab_header()`, `tab_spanner()`, `tab_stubhead()`, and `tab_source_note()` methods.
with_id(self: 'GTSelf', id: 'str | None' = None) -> 'GTSelf'
Set the id for this table.
Note that this is a shortcut for the `table_id=` argument in `GT.tab_options()`.
Parameters
----------
id
By default (with `None`) the table ID will be a random, ten-letter string as generated
through internal use of the `random_id()` function. A custom table ID can be used here by
providing a string.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using `with_id()` is straightforward; pass a string to `id=` to set the table ID:
```{python}
from great_tables import GT, exibble
GT(exibble).with_id("your-table-id")
```
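When `id=None`, the description above says a random, ten-letter string is generated. The following sketch shows one way such a fallback could work. Both `random_id_sketch()` and `resolve_table_id()` are hypothetical stand-ins written for this illustration, and the library's internal `random_id()` function may be implemented differently.

```python
import random
import string
from typing import Optional


def random_id_sketch(n: int = 10) -> str:
    """Return a random string of `n` lowercase ASCII letters."""
    return "".join(random.choices(string.ascii_lowercase, k=n))


def resolve_table_id(id: Optional[str] = None) -> str:
    """Use the supplied ID if there is one, otherwise fall back to a random ID."""
    return random_id_sketch() if id is None else id


print(resolve_table_id("your-table-id"))  # the custom ID is kept as-is
print(resolve_table_id())                 # a random ten-letter string
```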
with_locale(self: 'GTSelf', locale: 'str | None' = None) -> 'GTSelf'
Set the default locale for the table.
Setting a default locale affects formatters like `fmt_number()` and `fmt_date()` by having them
default to locale-specific features (e.g., representing one thousand as `1.000,00`).
Parameters
----------
locale
An optional locale identifier that can be used for formatting values according to the locale's
rules. Examples include `"en"` for English (United States) and `"fr"` for French (France).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Let's create a table and set its `locale=` to `"ja"` for Japan. Then, we call `fmt_currency()`
to format the `"currency"` column. Since we didn't specify a `locale=` for `fmt_currency()`,
it will adopt the globally set `"ja"` locale.
```{python}
from great_tables import GT, exibble
(
GT(exibble)
.with_locale("ja")
.fmt_currency(
columns="currency",
decimals=3,
use_seps=False
)
)
```
**Great Tables** internally supports many locale options. You can find the available locales in
the following table:
```{python}
from great_tables.data import __x_locales
columns = ["locale", "lang_name", "lang_desc", "territory_name", "territory_desc"]
GT(__x_locales.loc[:, columns]).cols_align("right")
```
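To see what "locale-specific features" means for numbers, the sketch below formats one thousand with two different sets of separators. The `SEPARATORS` table and `format_number()` helper are hypothetical and cover only two locales chosen for illustration. **Great Tables** draws its separators from its own locale data rather than from a table like this one.

```python
# Hypothetical separator table: (grouping mark, decimal mark) per locale.
# Only two locales are included, purely for illustration.
SEPARATORS = {
    "en": (",", "."),
    "de": (".", ","),
}


def format_number(value: float, locale: str = "en", decimals: int = 2) -> str:
    """Format `value` with the grouping and decimal marks of `locale`."""
    group_mark, decimal_mark = SEPARATORS[locale]

    # Start from Python's default formatting, e.g. "1,000.00".
    text = f"{value:,.{decimals}f}"

    # Replace both marks in a single pass so the swaps don't interfere.
    return text.translate(str.maketrans({",": group_mark, ".": decimal_mark}))


print(format_number(1000, locale="en"))
print(format_number(1000, locale="de"))
```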
md(text: 'str') -> 'Md'
Interpret input text as Markdown-formatted text.
Markdown can be used in certain places (e.g., source notes, table title/subtitle, etc.) and we
can expect it to render to HTML. There is also the [`html()`](`great_tables.html`) helper
function that allows you to use raw HTML text.
Parameters
----------
text
The text that is understood to contain Markdown formatting.
Examples
--------
See [`GT.tab_header()`](`great_tables.GT.tab_header`).
html(text: 'str') -> 'Html'
Interpret input text as HTML-formatted text.
For certain pieces of text (like in column labels or table headings) we may want to express them
as raw HTML. In fact, with HTML, anything goes so it can be much more than just text. The
`html()` function will guard the input HTML against escaping, so, your HTML tags will come
through as HTML when rendered.
Parameters
----------
text
The text that is understood to contain HTML formatting.
Examples
--------
See [`GT.tab_header()`](`great_tables.GT.tab_header`).
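The idea of guarding input against escaping can be sketched with the standard library alone. In the hypothetical example below, plain strings are escaped before rendering while text wrapped in a marker class passes through untouched. `RawHtml` and `render_label()` are illustrative names, not the library's API.

```python
import html as html_lib


class RawHtml:
    """Hypothetical marker for text that should be emitted without escaping."""

    def __init__(self, text: str) -> None:
        self.text = text


def render_label(label) -> str:
    """Escape plain strings; pass marked HTML through unchanged."""
    if isinstance(label, RawHtml):
        return label.text
    return html_lib.escape(label)


plain = render_label("<b>Revenue</b>")
raw = render_label(RawHtml("<b>Revenue</b>"))

print(plain)  # the tags are escaped and would display literally
print(raw)    # the tags survive and would render as bold text
```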
FromColumn(column: 'str', na_value: 'Any | None' = None, fn: 'Callable[[Any], Any] | None' = None) -> None
Specify that a style value should be fetched from a column in the data.
Parameters
----------
column
A column name in the data containing the styling information.
na_value
A single value to replace any NA values in the column (currently not supported).
fn
A callable applied to transform each value extracted from `column=`.
Examples
--------
This example demonstrates styling the `"x"` column.
Style the text color using the `"color"` column:
```{python}
import pandas as pd
import polars as pl
from great_tables import GT, from_column, loc, style, px
df = pd.DataFrame({"x": [15, 20], "color": ["red", "blue"]})
(GT(df).tab_style(style=style.text(color=from_column("color")), locations=loc.body(columns=["x"])))
```
With polars, you can pass expressions directly:
```{python}
df_polars = pl.from_pandas(df)
(
GT(df_polars).tab_style(
style=style.text(color=pl.col("color")), locations=loc.body(columns=["x"])
)
)
```
Style the text size using values from the `"x"` column, with the
`px()` helper function as the `fn=` parameter:
```{python}
(
GT(df).tab_style(
style=style.text(color=from_column("color"), size=from_column("x", fn=px)),
locations=loc.body(columns=["x"]),
)
)
```
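Conceptually, `fn=` is applied to every value pulled from the named column before the value is used as a style. The sketch below mimics that behavior on a plain list of row dictionaries. `px_sketch()` and `resolve_from_column()` are hypothetical helpers written for this illustration, and `px_sketch()` only assumes that `px()` appends a `px` suffix.

```python
def px_sketch(x) -> str:
    """Stand-in for a pixel helper: turn a number into a CSS pixel value."""
    return f"{x}px"


def resolve_from_column(rows, column, fn=None):
    """Fetch one style value per row from `column`, optionally transformed by `fn`."""
    values = [row[column] for row in rows]
    return values if fn is None else [fn(value) for value in values]


rows = [{"x": 15, "color": "red"}, {"x": 20, "color": "blue"}]

colors = resolve_from_column(rows, "color")
sizes = resolve_from_column(rows, "x", fn=px_sketch)

print(colors)
print(sizes)
```

Each row therefore gets its own text color and font size, which is why the two rows in the rendered table above look different.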
google_font(name: 'str') -> 'GoogleFont'
Specify a font from the *Google Fonts* service.
The `google_font()` helper function can be used wherever a font name might be specified. There
are two instances where this helper can be used:
1. `opt_table_font(font=...)` (for setting a table font)
2. `style.text(font=...)` (itself used in [`tab_style()`](`great_tables.GT.tab_style`))
Parameters
----------
name
The name of the Google Font to use.
Returns
-------
GoogleFont
A GoogleFont object, which contains the name of the font and methods for incorporating the
font in HTML output tables.
Examples
--------
Let's use the `exibble` dataset to create a table of two columns and eight rows. We'll replace
missing values with em dashes using [`sub_missing()`](`great_tables.GT.sub_missing`). For text
in the time column, we will use the font called `"IBM Plex Mono"` which is available from Google
Fonts. This is defined inside the `google_font()` call, itself within the
[`style.text()`](`great_tables.style.text`) method that's applied to the `style=` parameter of
[`tab_style()`](`great_tables.GT.tab_style`).
```{python}
from great_tables import GT, exibble, style, loc, google_font
(
GT(exibble[["char", "time"]])
.sub_missing()
.tab_style(
style=style.text(font=google_font(name="IBM Plex Mono")),
locations=loc.body(columns="time")
)
)
```
We can use a subset of the `sp500` dataset to create a small table. With
[`fmt_currency()`](`great_tables.GT.fmt_currency`), we can display values as monetary values.
Then, we'll set a larger font size for the table and opt to use the `"Merriweather"` font by
calling `google_font()` within [`opt_table_font()`](`great_tables.GT.opt_table_font`). In cases
where that font may not materialize, we include two font fallbacks: `"Cochin"` and the catchall
`"Serif"` group.
```{python}
from great_tables import GT, google_font
from great_tables.data import sp500
(
GT(sp500.drop(columns=["volume", "adj_close"]).head(10))
.fmt_currency(columns=["open", "high", "low", "close"])
.tab_options(table_font_size="20px")
.opt_table_font(font=[google_font(name="Merriweather"), "Cochin", "Serif"])
)
```
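For HTML output, a font served by Google Fonts is typically pulled in with a CSS `@import` statement. The sketch below builds such a statement using the public Google Fonts CSS endpoint. The `google_font_import()` helper is hypothetical, and whether **Great Tables** emits exactly this statement is an assumption.

```python
from urllib.parse import quote_plus


def google_font_import(name: str) -> str:
    """Build a CSS @import statement for a font hosted on Google Fonts."""
    # Spaces in the family name are encoded as "+" in the query string.
    family = quote_plus(name)
    url = f"https://fonts.googleapis.com/css2?family={family}&display=swap"
    return f"@import url('{url}');"


import_stmt = google_font_import("IBM Plex Mono")
print(import_stmt)
```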
system_fonts(name: 'FontStackName' = 'system-ui') -> 'list[str]'
Get a themed font stack that works well across systems.
A font stack can be obtained from `system_fonts()` using one of various keywords such as
`"system-ui"`, `"old-style"`, and `"humanist"` (there are 15 in total) representing a themed set
of fonts. These sets comprise a font family that has been tested to work across a wide range of
computer systems.
Parameters
----------
name
The name of a font stack. Must be drawn from the set of `"system-ui"` (the default),
`"transitional"`, `"old-style"`, `"humanist"`, `"geometric-humanist"`,
`"classical-humanist"`, `"neo-grotesque"`, `"monospace-slab-serif"`, `"monospace-code"`,
`"industrial"`, `"rounded-sans"`, `"slab-serif"`, `"antique"`, `"didone"`, and
`"handwritten"`.
Returns
-------
list[str]
A list of font names that make up the font stack.
The font stacks and the individual fonts used by platform
---------------------------------------------------------
### System UI (`"system-ui"`)
```css
font-family: system-ui, sans-serif;
```
The operating system interface's default typefaces are known as system UI fonts. They contain a
variety of font weights, are quite readable at small sizes, and are perfect for UI elements.
These typefaces serve as a great starting point for text in data tables and so this font stack
is the default for **Great Tables**.
### Transitional (`"transitional"`)
```css
font-family: Charter, 'Bitstream Charter', 'Sitka Text', Cambria, serif;
```
The Enlightenment saw the development of transitional typefaces, which combine Old Style and
Modern typefaces. *Times New Roman*, a transitional typeface created for the Times of London
newspaper, is among the most well-known instances of this style.
### Old Style (`"old-style"`)
```css
font-family: 'Iowan Old Style', 'Palatino Linotype', 'URW Palladio L', P052, serif;
```
Old style typefaces were created during the Renaissance and are distinguished by diagonal
stress, a lack of contrast between thick and thin strokes, and rounded serifs. *Garamond* is
among the most well-known instances of an old style typeface.
### Humanist (`"humanist"`)
```css
font-family: Seravek, 'Gill Sans Nova', Ubuntu, Calibri, 'DejaVu Sans', source-sans-pro, sans-serif;
```
Low contrast between thick and thin strokes and organic, calligraphic forms are traits of
humanist typefaces. These typefaces, which draw their inspiration from Renaissance calligraphy,
are frequently regarded as more legible than other sans serif typefaces.
### Geometric Humanist (`"geometric-humanist"`)
```css
font-family: Avenir, Montserrat, Corbel, 'URW Gothic', source-sans-pro, sans-serif;
```
Clean, geometric forms and consistent stroke widths are characteristics of geometric humanist
typefaces. These typefaces, which are often used for headlines and other display purposes,
are generally thought to be contemporary and slick in appearance. A well-known example of this
classification is *Futura*.
### Classical Humanist (`"classical-humanist"`)
```css
font-family: Optima, Candara, 'Noto Sans', source-sans-pro, sans-serif;
```
The way the strokes gradually widen as they approach the stroke terminals without ending in a
serif is what distinguishes classical humanist typefaces. The stone carving on Renaissance-era
tombstones and classical Roman capitals served as inspiration for these typefaces.
### Neo-Grotesque (`"neo-grotesque"`)
```css
font-family: Inter, Roboto, 'Helvetica Neue', 'Arial Nova', 'Nimbus Sans', Arial, sans-serif;
```
Neo-grotesque typefaces are a form of sans serif that originated in the late 19th and early 20th
centuries. They are distinguished by their crisp, geometric shapes and regular stroke widths.
*Helvetica* is among the most well-known examples of a Neo-grotesque typeface.
### Monospace Slab Serif (`"monospace-slab-serif"`)
```css
font-family: 'Nimbus Mono PS', 'Courier New', monospace;
```
Monospace slab serif typefaces are distinguished by their fixed-width letters, which are the
same width irrespective of their shape, and their straightforward, geometric forms. For reports,
tabular work, and technical documentation, this technique is used to simulate typewriter output.
### Monospace Code (`"monospace-code"`)
```css
font-family: ui-monospace, 'Cascadia Code', 'Source Code Pro', Menlo, Consolas, 'DejaVu Sans Mono', monospace;
```
Monospace code typefaces are created specifically for programming and other technical
applications. These typefaces are distinguished by their clear, readable forms and monospaced
design, which ensures that all letters and characters are the same width.
### Industrial (`"industrial"`)
```css
font-family: Bahnschrift, 'DIN Alternate', 'Franklin Gothic Medium', 'Nimbus Sans Narrow', sans-serif-condensed, sans-serif;
```
The development of industrial typefaces began in the late 19th century and was greatly
influenced by the industrial and technological advancements of the time. Industrial typefaces
are distinguished by their strong sans serif letterforms, straightforward appearance, and use of
geometric shapes and straight lines.
### Rounded Sans (`"rounded-sans"`)
```css
font-family: ui-rounded, 'Hiragino Maru Gothic ProN', Quicksand, Comfortaa, Manjari, 'Arial Rounded MT', 'Arial Rounded MT Bold', Calibri, source-sans-pro, sans-serif;
```
The rounded, curved letterforms that define rounded typefaces give them a softer, friendlier
appearance. The typeface's rounded edges give it a more natural and playful feel, making it
appropriate for use in casual or kid-friendly designs. Since the 1950s, the rounded sans-serif
design has gained popularity and is still frequently used in branding, graphic design, and other
fields.
### Slab Serif (`"slab-serif"`)
```css
font-family: Rockwell, 'Rockwell Nova', 'Roboto Slab', 'DejaVu Serif', 'Sitka Small', serif;
```
Slab Serif typefaces are distinguished by the thick, block-like serifs that appear at the ends
of each letterform. Typically, these serifs are unbracketed, which means that they do not have
any curved or tapered transitions to the letter's main stroke.
### Antique (`"antique"`)
```css
font-family: Superclarendon, 'Bookman Old Style', 'URW Bookman', 'URW Bookman L', 'Georgia Pro', Georgia, serif;
```
Serif typefaces that were popular in the 19th century include antique typefaces, also referred
to as Egyptians. They are distinguished by their thick, uniform stroke weight and block-like
serifs. The typeface *Clarendon* is a highly regarded example of this style and *Superclarendon*
is a modern take on that revered typeface.
### Didone (`"didone"`)
```css
font-family: Didot, 'Bodoni MT', 'Noto Serif Display', 'URW Palladio L', P052, Sylfaen, serif;
```
Didone typefaces, also referred to as Modern typefaces, are distinguished by their vertical
stress, sharp contrast between thick and thin strokes, and hairline serifs without bracketing.
The Didone style first appeared in the late 18th century and became well-known in the early
19th century. *Bodoni* and *Didot* are two of the most well-known typefaces in this category.
### Handwritten (`"handwritten"`)
```css
font-family: 'Segoe Print', 'Bradley Hand', Chilanka, TSCu_Comic, casual, cursive;
```
The appearance and feel of handwriting are replicated by handwritten typefaces. Although there
are a wide variety of handwriting styles, this font stack tends to use a more casual and
commonplace style. In regards to these types of fonts in tables, one can say that any table
having a handwritten font will evoke a feeling of gleefulness.
Examples
--------
Using select columns from the `exibble` dataset, let's create a table with a number of
components added. Following that, we'll set a font for the entire table using the
`tab_options()` method with the `table_font_names` parameter. Instead of passing a list of font
names, we'll use the `system_fonts()` helper function to get a font stack. In this case, we'll
use the `"industrial"` font stack.
```{python}
from great_tables import GT, exibble, md, system_fonts
(
GT(
exibble[["num", "char", "currency", "row", "group"]],
rowname_col="row",
groupname_col="group"
)
.tab_header(
title=md("Data listing from **exibble**"),
subtitle=md("`exibble` is a **Great Tables** dataset.")
)
.fmt_number(columns="num")
.fmt_currency(columns="currency")
.tab_source_note(source_note="This is only a subset of the dataset.")
.opt_align_table_header(align="left")
.tab_options(table_font_names=system_fonts("industrial"))
)
```
Invoking the `system_fonts()` helper function with the `"industrial"` argument will return a
list of font names that make up the font stack. This is exactly the type of input that the
`table_font_names` parameter requires.
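A list of font names corresponds to a CSS `font-family` value once the names are joined and any multi-word names are quoted. The sketch below performs that conversion for the `"industrial"` stack. The list is copied by hand from the listing above rather than obtained from `system_fonts()`, and `font_stack_to_css()` is a hypothetical helper.

```python
def font_stack_to_css(fonts):
    """Join font names into a CSS `font-family` value, quoting names with spaces."""
    quoted = [f"'{name}'" if " " in name else name for name in fonts]
    return ", ".join(quoted)


# The "industrial" stack, transcribed from the listing in this section.
industrial = [
    "Bahnschrift",
    "DIN Alternate",
    "Franklin Gothic Medium",
    "Nimbus Sans Narrow",
    "sans-serif-condensed",
    "sans-serif",
]

css_value = font_stack_to_css(industrial)
print(f"font-family: {css_value};")
```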
define_units(units_notation: 'str') -> 'UnitDefinitionList'
With `define_units()` you can work with a specially-crafted units notation string and emit the
units as HTML (with the `.to_html()` method). This function is useful as a standalone utility
and it powers the `fmt_units()` method in **Great Tables**.
Parameters
----------
units_notation : str
A string of units notation.
Returns
-------
UnitDefinitionList
A list of unit definitions.
Specification of units notation
-------------------------------
The following table demonstrates the various ways in which units can be specified in the
`units_notation` string and how the input is processed by the `define_units()` function. The
concluding step for display of the units in HTML is to use the `to_html()` method.
```{python}
#| echo: false
from great_tables import GT, style, loc
import polars as pl
units_tbl = pl.DataFrame(
{
"rule": [
"'^' creates a superscript",
"'_' creates a subscript",
"subscripts and superscripts can be combined",
"use '[_subscript^superscript]' to create an overstrike",
"a '/' at the beginning adds the superscript '-1'",
"hyphen is transformed to minus sign when preceding a unit",
"'x' at the beginning is transformed to '×'",
"ASCII terms from biology/chemistry turned into terminology forms",
"can create italics with '*' or '_'; create bold text with '**' or '__'",
"special symbol set surrounded by colons",
"chemistry notation: '%C6H6%'",
],
"input": [
"m^2",
"h_0",
"h_0^3",
"h[_0^3]",
"/s",
"-h^2",
"x10^3 kg^2 m^-1",
"ug",
"*m*^**2**",
":permille:C",
"g/L %C6H12O6%",
],
}
).with_columns(output=pl.col("input"))
(
GT(units_tbl)
.fmt_units(columns="output")
.tab_style(
style=style.text(font="courier"),
locations=loc.body(columns="input")
)
)
```
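To give a feel for the first three rules in the table, here is a deliberately simplified sketch that handles only a single unit token with an optional subscript and superscript. The `unit_token_to_html()` helper is hypothetical, and its plain `<sub>`/`<sup>` output is an assumption made for readability. The HTML actually emitted by `define_units(...).to_html()` is likely richer than this.

```python
import re

# One base, then an optional "_subscript", then an optional "^superscript".
TOKEN_PATTERN = re.compile(r"([^_^]+)(?:_([^_^]+))?(?:\^([^_^]+))?")


def unit_token_to_html(token: str) -> str:
    """Convert tokens like 'm^2', 'h_0', or 'h_0^3' into simple HTML markup."""
    match = TOKEN_PATTERN.fullmatch(token)
    if match is None:
        raise ValueError(f"unsupported token: {token!r}")

    base, sub, sup = match.groups()
    html = base
    if sub is not None:
        html += f"<sub>{sub}</sub>"
    if sup is not None:
        html += f"<sup>{sup}</sup>"
    return html


for token in ["m^2", "h_0", "h_0^3"]:
    print(token, "->", unit_token_to_html(token))
```

The remaining rules (overstrikes, leading `/`, symbol sets, chemistry notation, and so on) are outside the scope of this sketch.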
Examples
--------
Let’s demonstrate a use case where we utilize `define_units()` to render an equation as
the subtitle in the table header, which currently doesn’t accept unit notation as input.
We'll start by creating a Polars DataFrame representing the calculations of the equation
$y= a_2x^2 + a_1x + a_0$.
```{python}
#| code-fold: true
import polars as pl
from great_tables import GT, html, define_units
df = pl.DataFrame(
{"x": [1, 2, 3], "a2": [2, 3, 4], "a1": [3, 4, 5], "a0": [4, 5, 6]}
).with_columns(
y=(
pl.col("a2").mul(pl.col("x").pow(2))
+ pl.col("a1").mul(pl.col("x"))
+ pl.col("a0")
)
)
df
```
If we try to use unit annotations to format the equation as the subtitle in the header, it
won’t work as expected:
```{python}
(
GT(df)
.cols_label(a2="{{a_2}}", a1="{{a_1}}", a0="{{a_0}}")
.tab_header(title="Linear Algebra", subtitle="y={{a_2}}{{x^2}}+{{a_1}}x+{{a_0}}")
)
```
To address this, we can create a small helper function, `u2html()`, which wraps a given string
in `define_units()` and emits the units to HTML. Next, we can build the subtitle by applying
`u2html()` to the string with unit annotations. Finally, we pass the assembled subtitle string
through `html()` to ensure it renders correctly.
```{python}
def u2html(x: str) -> str:
return define_units(x).to_html()
subtitle = (
"y"
+ "="
+ u2html("{{a_2}}")
+ u2html("{{x^2}}")
+ "+"
+ u2html("{{a_1}}")
+ "x"
+ "+"
+ u2html("{{a_0}}")
)
(
GT(df)
.cols_label(a2="{{a_2}}", a1="{{a_1}}", a0="{{a_0}}")
.tab_header(title="Linear Algebra", subtitle=html(subtitle))
)
```
nanoplot_options(data_point_radius: 'int | list[int] | None' = None, data_point_stroke_color: 'str | list[str] | None' = None, data_point_stroke_width: 'int | list[int] | None' = None, data_point_fill_color: 'str | list[str] | None' = None, data_line_type: 'str | None' = None, data_line_stroke_color: 'str | None' = None, data_line_stroke_width: 'int | None' = None, data_area_fill_color: 'str | None' = None, data_bar_stroke_color: 'str | list[str] | None' = None, data_bar_stroke_width: 'int | list[int] | None' = None, data_bar_fill_color: 'str | list[str] | None' = None, data_bar_negative_stroke_color: 'str | None' = None, data_bar_negative_stroke_width: 'int | None' = None, data_bar_negative_fill_color: 'str | None' = None, reference_line_color: 'str | None' = None, reference_area_fill_color: 'str | None' = None, vertical_guide_stroke_color: 'str | None' = None, vertical_guide_stroke_width: 'int | None' = None, show_data_points: 'bool | None' = None, show_data_line: 'bool | None' = None, show_data_area: 'bool | None' = None, show_reference_line: 'bool | None' = None, show_reference_area: 'bool | None' = None, show_vertical_guides: 'bool | None' = None, show_y_axis_guide: 'bool | None' = None, interactive_data_values: 'bool | None' = None, y_val_fmt_fn: 'Callable[..., str] | None' = None, y_axis_fmt_fn: 'Callable[..., str] | None' = None, y_ref_line_fmt_fn: 'Callable[..., str] | None' = None, currency: 'str | None' = None) -> 'dict[str, Any]'
Helper for setting the options for a nanoplot.
When using `fmt_nanoplot()`, the defaults for the generated nanoplots can be modified with
`nanoplot_options()` within the `options=` argument.
Parameters
----------
data_point_radius
The `data_point_radius=` option lets you set the radius for each of the data points. By
default this is set to `10`. Individual radius values can be set by using a list of numeric
values; however, the list provided must match the number of data points.
data_point_stroke_color
The default stroke color of the data points is `"#FFFFFF"` (`"white"`). This works well when
there is a visible data line combined with data points with a darker fill color. The stroke
color can be modified with `data_point_stroke_color=` for all data points by supplying a
single color value. With a list of colors, each data point's stroke color can be changed
(ensure that the list length matches the number of data points).
data_point_stroke_width
The width of the outside stroke for the data points can be modified with the
`data_point_stroke_width=` option. By default, a value of `4` (as in '4px') is used.
data_point_fill_color
By default, all data points have a fill color of `"#FF0000"` (`"red"`). This can be changed
for all data points by providing a different color to `data_point_fill_color=`. And, a list
of different colors can be supplied so long as the length is equal to the number of data
points; the fill color values will be applied in order of left to right.
data_line_type
This can accept either `"curved"` or `"straight"`. Curved lines are recommended when the
nanoplot has fewer than 30 points and data points are evenly spaced. In most other cases,
straight lines might present better.
data_line_stroke_color
The color of the data line can be modified from its default `"#4682B4"` (`"steelblue"`)
color by supplying a color to the `data_line_stroke_color=` option.
data_line_stroke_width
The width of the connecting data line can be modified with `data_line_stroke_width=`. By
default, a value of `4` (as in '4px') is used.
data_area_fill_color
The fill color for the area that bounds the data points in a line plot. The default is
`"#FF0000"` (`"red"`) but can be changed by providing a color value to
`data_area_fill_color=`.
data_bar_stroke_color
The color of the stroke used for the data bars can be modified from its default `"#3290CC"`
color by supplying a color to `data_bar_stroke_color=`.
data_bar_stroke_width
The width of the stroke used for the data bars can be modified with the
`data_bar_stroke_width=` option. By default, a value of `4` (as in '4px') is used.
data_bar_fill_color
By default, all data bars have a fill color of `"#3FB5FF"`. This can be changed for all data
bars by providing a different color to `data_bar_fill_color=`. And, a list of different
colors can be supplied so long as the length is equal to the number of data bars; the fill
color values will be applied in order of left to right.
data_bar_negative_stroke_color
The color of the stroke used for the data bars that have negative values. The default color
is `"#CC3243"` but this can be changed by supplying a color value to the
`data_bar_negative_stroke_color=` option.
data_bar_negative_stroke_width
The width of the stroke used for negative value data bars. This has the same default as
`data_bar_stroke_width=` with a value of `4` (as in '4px'). This can be changed by giving a
numeric value to the `data_bar_negative_stroke_width=` option.
data_bar_negative_fill_color
By default, all negative data bars have a fill color of `"#D75A68"`. This can however be
changed by providing a color value to `data_bar_negative_fill_color=`.
reference_line_color
The reference line will have a color of `"#75A8B0"` if it is set to appear. This color can
be changed by providing a single color value to `reference_line_color=`.
reference_area_fill_color
If a reference area has been defined and is visible it has by default a fill color of
`"#A6E6F2"`. This can be modified by declaring a color value in the
`reference_area_fill_color=` option.
vertical_guide_stroke_color
Vertical guides appear when hovering in the vicinity of data points. Their default color is
`"#911EB4"` (a strong magenta color) and a fill opacity value of `0.4` is automatically
applied to this. However, the base color can be changed with the
`vertical_guide_stroke_color=` option.
vertical_guide_stroke_width
The vertical guide's stroke width, by default, is relatively large at `12` (this is '12px').
This is modifiable by setting a different value with `vertical_guide_stroke_width=`.
show_data_points
By default, all data points in a nanoplot are shown but this layer can be hidden by setting
`show_data_points=` to `False`.
show_data_line
The data line connects data points together and it is shown by default. This data line layer
can be hidden by setting `show_data_line=` to `False`.
show_data_area
The data area layer is adjacent to the data points and the data line. It is shown by default
but can be hidden with `show_data_area=False`.
show_reference_line
The layer with a horizontal reference line appears underneath that of the data points and
the data line. Like vertical guides, hovering over a reference line will show its value. The
reference line (if available) is shown by default but can be hidden by setting
`show_reference_line=` to `False`.
show_reference_area
The reference area appears at the very bottom of the layer stack, if it is available (i.e.,
defined in `fmt_nanoplot()`). It will be shown in the default case but can be hidden by
using `show_reference_area=False`.
show_vertical_guides
Vertical guides appear when hovering over data points. This hidden layer is active by
default but can be deactivated by using `show_vertical_guides=False`.
show_y_axis_guide
The *y*-axis guide will appear when hovering over the far left side of a nanoplot. This
hidden layer is active by default but can be deactivated by using `show_y_axis_guide=False`.
interactive_data_values
By default, numeric data values will be shown only when the user interacts with certain
regions of a nanoplot. This is because the values may be numerous (i.e., clutter the display
when all are visible) and it can be argued that the values themselves are secondary to the
presentation. However, for some types of plots (like horizontal bar plots), a persistent
display of values alongside the plot marks may be desirable. By setting
`interactive_data_values=False` we can opt for always displaying the data values alongside
the plot components.
y_val_fmt_fn
If providing a function to `y_val_fmt_fn=`, customized formatting of the *y* values
associated with the data points/bars is possible.
y_axis_fmt_fn
A function supplied to `y_axis_fmt_fn=` will result in customized formatting of the *y*-axis
label values.
y_ref_line_fmt_fn
Providing a function for `y_ref_line_fmt_fn=` yields customized formatting of the reference
line (if present).
currency
If the values are to be displayed as currency values, supply either: (1) a 3-letter currency
code (e.g., `"USD"` for U.S. Dollars, `"EUR"` for the Euro currency), or (2) a common
currency name (e.g., `"dollar"`, `"pound"`, `"yen"`, etc.).
Examples
--------
See [`fmt_nanoplot()`](`great_tables.GT.fmt_nanoplot`).
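Several options above accept either a single value or a list whose length must match the number of data points. The sketch below shows one way that rule could be enforced. `expand_option()` is a hypothetical helper written for this illustration, and the library's own validation may behave differently.

```python
def expand_option(value, n_points: int, name: str) -> list:
    """Return one option value per data point, checking list lengths."""
    if isinstance(value, list):
        if len(value) != n_points:
            raise ValueError(
                f"`{name}=` has {len(value)} values but there are {n_points} data points"
            )
        return list(value)

    # A single value is reused for every data point.
    return [value] * n_points


radii = expand_option(10, 4, "data_point_radius")
fills = expand_option(["red", "red", "red", "blue"], 4, "data_point_fill_color")

print(radii)
print(fills)
```

With list input the values apply in order from left to right, so the last data point in this sketch would be the one filled in blue.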
## Table Options
With the `opt_*()` functions, we have an easy way to set commonly-used table options without having to use `tab_options()` directly.
opt_stylize(self: 'GTSelf', style: 'int' = 1, color: 'str' = 'blue', add_row_striping: 'bool' = True) -> 'GTSelf'
Stylize your table with a colorful look.
With the `opt_stylize()` method you can quickly style your table with a carefully curated set of
background colors, line colors, and line styles. There are six styles to choose from and they
largely vary in the extent of coloring applied to different table locations. Some have table
borders applied, some apply darker colors to the table stub and summary sections, and some even
have vertical lines. In addition to choosing a `style` preset, there are six `color` variations
that each use a range of five color tints. Each of the color tints has been fine-tuned to
maximize the contrast between text and its background. There are 36 combinations of `style` and
`color` to choose from. For examples of each style, see the
[*Premade Themes*](../get-started/table-theme-premade.qmd) section of the **Get Started**
guide.
Parameters
----------
style
Six numbered styles are available. Simply provide a number from `1` (the default) to `6` to
choose a distinct look.
color
The color scheme of the table. The default value is `"blue"`. The valid values are `"blue"`,
`"cyan"`, `"pink"`, `"green"`, `"red"`, and `"gray"`.
add_row_striping
An option to enable row striping in the table body for the style chosen.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using select columns from the `exibble` dataset, let's create a table with a number of
components added. Following that, we'll apply a predefined style to the table using the
`opt_stylize()` method.
```{python}
from great_tables import GT, exibble, md
gt_tbl = (
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
    .opt_stylize()
)
gt_tbl
```
The table has been stylized with the default style and color. The default style is `1` and the
default color is `"blue"`. The resulting table style is a combination of color and border
settings that are applied to the table.
We can modify the overall style and choose a different color theme by providing different values
to the `style=` and `color=` arguments.
```{python}
gt_tbl.opt_stylize(style=2, color="green")
```
opt_footnote_marks(self: 'GTSelf', marks: 'str | list[str]' = 'numbers') -> 'GTSelf'
Option to modify the set of footnote marks.
Alter the footnote marks for any footnotes that may be present in the table. Either a list
of marks can be provided (including Unicode characters), or, a specific keyword could be
used to signify a preset sequence. This method serves as a shortcut for using
`tab_options(footnotes_marks=)`.
We can supply a list of strings that will represent the series of marks. The series of footnote
marks is recycled when its usage goes beyond the length of the set. At each cycle, the marks
are simply doubled, tripled, and so on (e.g., `*` -> `**` -> `***`). Alternatively, a keyword can
be provided for certain preset types of footnote marks. The keywords are:
- `"numbers"`: numeric marks, they begin from 1 and these marks are not subject to recycling
behavior
- `"letters"`: lowercase alphabetic marks. Same as using the `gt.letters()` function which
produces a list of 26 lowercase letters from the Roman alphabet
- `"LETTERS"`: uppercase alphabetic marks. Same as using the `gt.LETTERS()` function which
produces a list of 26 uppercase letters from the Roman alphabet
- `"standard"`: symbolic marks, four symbols in total
- `"extended"`: symbolic marks, extends the standard set by adding two more symbols, making
six
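The recycling rule described above can be sketched in plain Python. This is only an illustration
of the doubling and tripling behavior, not the library's internal implementation; the
`recycled_mark()` helper and the two-mark list are hypothetical.

```python
def recycled_mark(index: int, marks: list[str]) -> str:
    """Return the mark for a zero-based footnote index, repeating marks on each new cycle."""
    cycle, position = divmod(index, len(marks))
    return marks[position] * (cycle + 1)


# A made-up set of two marks, used for five footnotes
custom_marks = ["*", "†"]
print([recycled_mark(i, custom_marks) for i in range(5)])
# ['*', '†', '**', '††', '***']
```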
Parameters
----------
marks
Either a list of strings that will represent the series of marks or a keyword string
that represents a preset sequence of marks. The valid keywords are: `"numbers"` (for
numeric marks), `"letters"` and `"LETTERS"` (for lowercase and uppercase alphabetic
marks), `"standard"` (for a traditional set of four symbol marks), and `"extended"`
(which adds two more symbols to the standard set).
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
opt_row_striping(self: 'GTSelf', row_striping: 'bool' = True) -> 'GTSelf'
Option to add or remove row striping.
By default, a table does not have row striping enabled. However, this method allows us to easily
enable or disable striped rows in the table body. It's a convenient shortcut for
`tab_options(row_striping_include_table_body=)`.
Parameters
----------
row_striping
A boolean that indicates whether row striping should be added or removed. Defaults to
`True`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using only a few columns from the `exibble` dataset, let's create a table with a number of
components added. Following that, we'll add row striping to every second row with the
`opt_row_striping()` method.
```{python}
from great_tables import GT, exibble, md
(
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
    .opt_row_striping()
)
```
opt_align_table_header(self: 'GTSelf', align: 'str' = 'center') -> 'GTSelf'
Option to align the table header.
By default, an added table header will have center alignment for both the title and the subtitle
elements. This method allows us to easily set the horizontal alignment of the title and subtitle
to the left, right, or center by using the `align=` argument. This method serves as a
convenient shortcut for `gt.tab_options(heading_align=)`.
Parameters
----------
align
The alignment of the title and subtitle elements in the table header. Options are `"center"`
(the default), `"left"`, or `"right"`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using select columns from the `exibble` dataset, let's create a table with a number of
components added. Following that, we'll align the header contents (consisting of the title and
the subtitle) to the left with the `opt_align_table_header()` method.
```{python}
from great_tables import GT, exibble, md
(
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
    .opt_align_table_header(align="left")
)
```
opt_vertical_padding(self: 'GTSelf', scale: 'float' = 1.0) -> 'GTSelf'
Option to scale the vertical padding of the table.
This method allows us to scale the vertical padding of the table by a factor of `scale`. The
default value is `1.0` and this method serves as a convenient shortcut for
`gt.tab_options(heading_padding=, column_labels_padding=,
data_row_padding=, row_group_padding=, source_notes_padding=)`.
Parameters
----------
scale
The factor by which to scale the vertical padding. The default value is `1.0`. A value
less than `1.0` will reduce the padding, and a value greater than `1.0` will increase the
padding. The value must be between `0` and `3`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using select columns from the `exibble` dataset, let's create a table with a number of
components added. Following that, we'll scale the vertical padding of the table by a factor of
`3` using the `opt_vertical_padding()` method.
```{python}
from great_tables import GT, exibble, md
gt_tbl = (
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
)
gt_tbl.opt_vertical_padding(scale=3)
```
Now that's a tall table! The overall effect of scaling the vertical padding is that the table
will appear taller and there will be more buffer space between the table elements. A value of
`3` is pretty extreme and is likely to be too much in most cases, so, feel free to experiment
with different values when looking to increase the vertical padding.
Let's go the other way (using a value less than `1`) and try to condense the content vertically
with a `scale` factor of `0.5`. This will reduce the top and bottom padding globally and make
the table appear more compact.
```{python}
gt_tbl.opt_vertical_padding(scale=0.5)
```
A value of `0.5` still leaves a reasonable amount of vertical padding while making the table
noticeably more compact, which is a practical solution when space is limited.
opt_horizontal_padding(self: 'GTSelf', scale: 'float' = 1.0) -> 'GTSelf'
Option to scale the horizontal padding of the table.
This method allows us to scale the horizontal padding of the table by a factor of `scale`. The
default value is `1.0` and this method serves as a convenient shortcut for `gt.tab_options(
heading_padding_horizontal=, column_labels_padding_horizontal=,
data_row_padding_horizontal=, row_group_padding_horizontal=,
source_notes_padding_horizontal=)`.
Parameters
----------
scale
The factor by which to scale the horizontal padding. The default value is `1.0`. A value
less than `1.0` will reduce the padding, and a value greater than `1.0` will increase the
padding. The value must be between `0` and `3`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using select columns from the `exibble` dataset, let's create a table with a number of
components added. Following that, we'll scale the horizontal padding of the table by a factor of
`3` using the `opt_horizontal_padding()` method.
```{python}
from great_tables import GT, exibble, md
gt_tbl = (
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
)
gt_tbl.opt_horizontal_padding(scale=3)
```
The overall effect of scaling the horizontal padding is that the table will appear wider and
there will be added buffer space between the table elements. The overall look of the table will
be more spacious and neighboring pieces of text will be less cramped.
Let's go the other way and scale the horizontal padding of the table by a factor of `0.5` using
the `opt_horizontal_padding()` method.
```{python}
gt_tbl.opt_horizontal_padding(scale=0.5)
```
What you get in this case is more condensed text across the horizontal axis. This may not always
be desired when cells consist mainly of text, but it could be useful when the table is more
visual and the cells are filled with graphics or other non-textual elements.
opt_all_caps(self: 'GTSelf', all_caps: 'bool' = True, locations: 'type[LocColumnLabels] | type[LocRowGroups] | type[LocStub] | list[type[LocColumnLabels] | type[LocRowGroups] | type[LocStub]] | str | list[str] | None' = None) -> 'GTSelf'
Option to use all caps in select table locations.
Sometimes an all-capitalized look is suitable for a table. By using `opt_all_caps()`, we can
transform characters in the column labels, the stub, and in all row groups in this way (and
there's control over which of these locations are transformed). This method serves as a
convenient shortcut for `tab_options(<location>_text_transform="uppercase",
<location>_font_size="80%", <location>_font_weight="bolder")` (for all `locations` selected).
Parameters
----------
all_caps
Indicates whether the text transformation to all caps should be performed (`True`, the
default) or reset to default values (`False`) for the `locations` targeted.
locations
Which locations should undergo this text transformation? By default it includes all of
the `loc.column_labels`, the `loc.stub`, and the `loc.row_groups` locations. However, we
could just choose one or two of those.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using select columns from the `exibble` dataset, let's create a table with a number of
components added. Following that, we'll ensure that all text in the column labels, the stub, and
in all row groups is transformed to all caps using the `opt_all_caps()` method.
```{python}
from great_tables import GT, exibble, loc, md
(
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
    .opt_all_caps()
)
```
`opt_all_caps()` accepts a `locations` parameter that allows us to specify which components
should be transformed. For example, if we only want to ensure that all text in the stub and all
row groups is converted to all caps:
```{python}
(
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
    .opt_all_caps(locations=[loc.stub, loc.row_groups])
)
```
opt_table_outline(self: 'GTSelf', style: 'str' = 'solid', width: 'str' = '3px', color: 'str' = '#D3D3D3') -> 'GTSelf'
Option to wrap an outline around the entire table.
The `opt_table_outline()` method puts an outline of consistent `style=`, `width=`, and `color=`
around the entire table. It'll write over any existing outside lines so long as the `width=`
value is larger than that of the existing lines. The default value of `style=` (`"solid"`) will draw
a solid outline, whereas using `"none"` will remove any present outline.
Parameters
----------
style
The style of the table outline. The default value is `"solid"`. The valid values are
`"solid"`, `"dashed"`, `"dotted"`, and `"none"`.
width
The width of the table outline. The default value is `"3px"`. The value must be in pixels
and it must be an integer value.
color
The color of the table outline, where the default is `"#D3D3D3"`. The value must be either a
hexadecimal color code or a color name.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Examples
--------
Using select columns from the `exibble` dataset, let's create a table with a number of
components added. Following that, we'll put an outline around the entire table using the
`opt_table_outline()` method.
```{python}
from great_tables import GT, exibble, md
(
    GT(
        exibble[["num", "char", "currency", "row", "group"]],
        rowname_col="row",
        groupname_col="group"
    )
    .tab_header(
        title=md("Data listing from **exibble**"),
        subtitle=md("`exibble` is a **Great Tables** dataset.")
    )
    .fmt_number(columns="num")
    .fmt_currency(columns="currency")
    .tab_source_note(source_note="This is only a subset of the dataset.")
    .opt_table_outline()
)
```
opt_table_font(self: 'GTSelf', font: 'str | list[str] | dict[str, str] | GoogleFont | None' = None, stack: 'FontStackName | None' = None, weight: 'str | int | float | None' = None, style: 'str | None' = None, add: 'bool' = True) -> 'GTSelf'
Options to define font choices for the entire table.
The `opt_table_font()` method makes it possible to define fonts used for an entire table. Any
font names supplied in `font=` will (by default, with `add=True`) be placed before the names
present in the existing font stack (i.e., they will take precedence). You can choose to base the
font stack on those provided by the [`system_fonts()`](`system_fonts.md`) helper function by
providing a valid keyword for a themed set of fonts. Take note that you could still have
entirely different fonts in specific locations of the table. To make that possible you would
need to use [`tab_style()`](`great_tables.GT.tab_style`) in conjunction with
[`style.text()`](`great_tables.style.text`).
Parameters
----------
font
One or more font names available on the user's system. This can be provided as a string or
a list of strings. Alternatively, you can specify font names using the `google_font()`
helper function. The default value is `None` since you could instead opt to use `stack` to
define a list of fonts.
stack
A name that is representative of a font stack (obtained internally via the
`system_fonts()` helper function). If provided, this new stack will replace any defined fonts
and any `font=` values will be prepended.
style
An option to modify the text style. Can be one of either `"normal"`, `"italic"`, or
`"oblique"`.
weight
Option to set the weight of the font. Can be a text-based keyword such as `"normal"`,
`"bold"`, `"lighter"`, `"bolder"`, or, a numeric value between `1` and `1000`. Please note
that typefaces have varying support for the numeric mapping of weight.
add
Should fonts be added to the beginning of any already-defined fonts for the table? By
default, this is `True` and is recommended since those fonts already present can serve as
fallbacks when everything specified in `font` is not available. If a `stack=` value is
provided, then `add` will automatically be set to `False`.
Returns
-------
GT
The GT object is returned. This is the same object that the method is called on so that we
can facilitate method chaining.
Possibilities for the `stack` argument
--------------------------------------
There are several themed font stacks available via the [`system_fonts()`](`system_fonts.md`)
helper function. That function can be used to generate all or a segment of a list supplied to
the `font=` argument. However, using the `stack=` argument with one of the 15 keywords for the
font stacks available in [`system_fonts()`](`system_fonts.md`), we could be sure that the
typeface class will work across multiple computer systems. Any of the following keywords can be
used with `stack=`:
- `"system-ui"`
- `"transitional"`
- `"old-style"`
- `"humanist"`
- `"geometric-humanist"`
- `"classical-humanist"`
- `"neo-grotesque"`
- `"monospace-slab-serif"`
- `"monospace-code"`
- `"industrial"`
- `"rounded-sans"`
- `"slab-serif"`
- `"antique"`
- `"didone"`
- `"handwritten"`
Examples
--------
Let's use a subset of the `sp500` dataset to create a small table. With `opt_table_font()` we
can add some preferred font choices for modifying the text of the entire table. Here we'll use
the `"Superclarendon"` and `"Georgia"` fonts (the second font serves as a fallback).
```{python}
import polars as pl
from great_tables import GT
from great_tables.data import sp500
sp500_mini = pl.from_pandas(sp500).slice(0, 10).drop(["volume", "adj_close"])
(
    GT(sp500_mini, rowname_col="date")
    .fmt_currency(use_seps=False)
    .opt_table_font(font=["Superclarendon", "Georgia"])
)
```
In practice, neither of these fonts is likely to be available on all systems. The
`opt_table_font()` method safeguards against this by prepending the fonts in the `font=` list to
the existing font stack. This way, if neither font is available, the table will fall back to
using the list of default table fonts. This behavior is controlled by the `add=` argument, which
is `True` by default.
With the `sza` dataset we'll create a two-column, eleven-row table. Within `opt_table_font()`,
the `stack=` argument will be supplied with the `"rounded-sans"` font stack. This sets up a family
of fonts with rounded, curved letterforms that should be locally available in different
computing environments.
```{python}
from great_tables.data import sza
sza_mini = (
    pl.from_pandas(sza)
    .filter((pl.col("latitude") == "20") & (pl.col("month") == "jan"))
    .drop_nulls()
    .drop(["latitude", "month"])
)
(
    GT(sza_mini)
    .opt_table_font(stack="rounded-sans")
    .opt_all_caps()
)
```
opt_css(self: 'GTSelf', css: 'str', add: 'bool' = True, allow_duplicates: 'bool' = False) -> 'GTSelf'
Option to add custom CSS for the table.
`opt_css()` makes it possible to add extra CSS rules to a table. This CSS will be added after
the compiled CSS that Great Tables generates automatically when the object is transformed to an
HTML output table.
If you want to set CSS styles on a specific table location, use `tab_style()` with `style.css()`
instead.
Parameters
----------
css
The CSS to include as part of the rendered table's `<style>` element.
```{python}
# | echo: false
# | output: asis
print(":::{.grid}")
for ii in range(1, 7):
    gt_html = gt_ex.opt_stylize(style=ii).as_raw_html()
    print(
        ":::{.g-col-lg-4 .g-col-12 .shrink-example}",
        f"{ii}",
        gt_html,
        ":::",
        sep="\n\n"
    )
print(":::")
```
## `opt_*()` convenience methods
This section shows the different `opt_*()` methods available. They serve as convenience methods for
common `~~GT.tab_options()` tasks.
### Align table header
```{python}
gt_ex.opt_align_table_header(align="left")
```
The title and subtitle are now left-aligned rather than centered, which works well for tables
embedded in text-heavy documents.
### Make text ALL CAPS
```{python}
gt_ex.opt_all_caps()
```
Column labels and row group labels are rendered in uppercase, giving the table a more formal,
structured appearance.
### Reduce or expand padding
```{python}
gt_ex.opt_vertical_padding(scale=0.3)
```
Reducing vertical padding creates a more compact table that fits more data into less vertical space.
```{python}
gt_ex.opt_horizontal_padding(scale=3)
```
Increasing horizontal padding adds breathing room between columns, which improves readability when
columns contain long values.
### Set table outline
```{python}
gt_ex.opt_table_outline()
```
The `opt_*()` methods give you quick access to common styling patterns without needing to remember
the specific `~~GT.tab_options()` argument names. For full control, you can always drop down to
`~~GT.tab_options()` directly, but these convenience methods cover the most frequent customization
needs in just a single method call.
### Nanoplots
:::{.callout-warning}
`~~GT.fmt_nanoplot()` is still experimental.
:::
Nanoplots are tiny plots you can use in your table. They are simple by design, mainly because there
isn't a lot of space to work with. With that simplicity, however, you do get a set of very succinct
data visualizations that adapt nicely to the amount of data you feed into them. The main features of
nanoplots include the following:
- interactivity: you can hover over data and other elements to show values
- choice of line and bar charting
- you can annotate plots with a reference line and/or area
- plenty of easy-to-use options for composing your plots
## A simple line-based nanoplot
Let's make some simple plots with a Polars DataFrame. Here we are using lists to define data values
for each cell in the `numbers` column. The `~~GT.fmt_nanoplot()` method understands that these are
input values for a line plot (the default type of nanoplot).
```{python}
from great_tables import GT
import polars as pl
random_numbers_df = pl.DataFrame(
    {
        "example": ["Row " + str(x) for x in range(1, 5)],
        "numbers": [
            "20 23 6 7 37 23 21 4 7 16",
            "2.3 6.8 9.2 2.42 3.5 12.1 5.3 3.6 7.2 3.74",
            "-12 -5 6 3.7 0 8 -7.4",
            "2 0 15 7 8 10 1 24 17 13 6",
        ],
    }
)
GT(random_numbers_df).fmt_nanoplot(columns="numbers")
```
This looks a lot like the familiar sparklines you might see in tables where space for plots is
limited. The input values, strings of space-separated values, can be considered here as *y* values
and they are evenly spaced along the imaginary *x* axis.
Hovering over (or touching) the values is something of a treat! You might notice that:
- data values are automatically formatted for you in a compact fashion
- the plot elements also display pertinent values
This sort of interactivity is baked into the rendered SVG graphics that `~~GT.fmt_nanoplot()`
generates from your data and selection of options.
Polars lets us express 'lists-of-values-per-cell' in different ways and **Great Tables** is pretty
good at understanding different column *dtypes*. So, you can alternatively create the same table as
above with the following code.
```python
random_numbers_df = pl.DataFrame(
    {
        "example": ["Row " + str(x) for x in range(1, 5)],
        "numbers": [
            {"val": [20, 23, 6, 7, 37, 23, 21, 4, 7, 16]},
            {"val": [2.3, 6.8, 9.2, 2.42, 3.5, 12.1, 5.3, 3.6, 7.2, 3.74]},
            {"val": [-12, -5, 6, 3.7, 0, 8, -7.4]},
            {"val": [2, 0, 15, 7, 8, 10, 1, 24, 17, 13, 6]},
        ],
    }
)
GT(random_numbers_df).fmt_nanoplot(columns="numbers")
```
Both forms of the `numbers` column in the two DataFrames look the same to `~~GT.fmt_nanoplot()`. The
key for the list of values (here, `"val"`) can be anything as long as it's repeated down the column.
So the choice is yours on how you want to prepare those column values.
## The reference line and the reference area
You can insert two additional things which may be useful: a reference line and a reference area. You
can define them either through literal values or via keywords (these are: `"mean"`, `"median"`,
`"min"`, `"max"`, `"q1"`, `"q3"`, `"first"`, or `"last"`). Here's a reference line that corresponds
to the mean data value of each nanoplot:
```{python}
GT(random_numbers_df).fmt_nanoplot(columns="numbers", reference_line="mean")
```
This example uses a reference area that bounds the minimum value to the median value:
```{python}
GT(random_numbers_df).fmt_nanoplot(columns="numbers", reference_area=["min", "median"])
```
As an added touch, you don't need to worry about the order of the keywords provided to
`reference_area=` (which could be potentially problematic if providing a literal value and a
keyword).
## Using `autoscale=` to have a common *y*-axis scale across plots
Many options are available. For example, if you want to ensure that the scale is shared across all
of the nanoplots (so you can get a better sense of overall magnitude), you can set `autoscale=` to
`True`:
```{python}
GT(random_numbers_df).fmt_nanoplot(columns="numbers", autoscale=True)
```
If you hover along or touch the left side of any of the plots above, you'll see that each *y* scale
runs from `-12.0` to `37.0`. Using `autoscale=True` is very useful if you want to compare the
magnitudes of values across rows in addition to their trends. It won't, however, make much sense if
the overall magnitudes of values vary wildly across rows (e.g., comparing changing currency values
or stock prices over time).
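The shared range quoted above can be checked by hand. This plain-Python sketch (independent of
**Great Tables**) pools the values from all four rows and takes the overall minimum and maximum:

```python
rows = [
    "20 23 6 7 37 23 21 4 7 16",
    "2.3 6.8 9.2 2.42 3.5 12.1 5.3 3.6 7.2 3.74",
    "-12 -5 6 3.7 0 8 -7.4",
    "2 0 15 7 8 10 1 24 17 13 6",
]

# Pool every value across rows, then find the common y-axis limits
values = [float(v) for row in rows for v in row.split()]
shared_range = (min(values), max(values))

print(shared_range)  # (-12.0, 37.0)
```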
## Using the `nanoplot_options()`{.qd-no-link} helper function
There are many options for customization. You can radically change the look of a collection of
nanoplots with the `nanoplot_options()` helper function. With that function, you invoke it in the
`options=` argument of `~~GT.fmt_nanoplot()`. You can modify the sizes and colors of different
elements, decide which elements are even present, and much more! Here's an example where a
line-based nanoplot retains all of its elements, but the overall appearance is greatly altered.
```{python}
from great_tables import nanoplot_options
(
    GT(random_numbers_df)
    .fmt_nanoplot(
        columns="numbers",
        options=nanoplot_options(
            data_point_radius=8,
            data_point_stroke_color="black",
            data_point_stroke_width=2,
            data_point_fill_color="white",
            data_line_type="straight",
            data_line_stroke_color="brown",
            data_line_stroke_width=2,
            data_area_fill_color="orange",
            vertical_guide_stroke_color="green",
        ),
    )
)
```
As can be seen, you have a lot of fine-grained control over the look of a nanoplot.
## Making nanoplots with bars using `plot_type="bar"`
`~~GT.fmt_nanoplot()` doesn't just support line plots; it can also show bar plots. The only thing
you need to change is the value of the `plot_type=` argument, setting it to `"bar"`:
```{python}
GT(random_numbers_df).fmt_nanoplot(columns="numbers", plot_type="bar")
```
An important difference between line plots and bar plots is that the bars project from a zero line.
Notice that some negative values in the bar-based nanoplot appear red and radiate downward from the
gray zero line.
Using `plot_type="bar"` still allows us to supply a reference line and a reference area with
`reference_line=` and `reference_area=`. The `autoscale=` option works here as well. We also have a
set of options just for bar plots available inside `nanoplot_options()`. Here's an example where we
use all of the aforementioned customization possibilities:
```{python}
(
    GT(random_numbers_df)
    .fmt_nanoplot(
        columns="numbers",
        plot_type="bar",
        autoscale=True,
        reference_line="min",
        reference_area=[0, "max"],
        options=nanoplot_options(
            data_bar_stroke_color="gray",
            data_bar_stroke_width=2,
            data_bar_fill_color="orange",
            data_bar_negative_stroke_color="blue",
            data_bar_negative_stroke_width=1,
            data_bar_negative_fill_color="lightblue",
            reference_line_color="pink",
            reference_area_fill_color="bisque",
            vertical_guide_stroke_color="blue",
        ),
    )
)
```
The customized bars use orange fills, gray strokes, and a pink reference line. Negative values are
styled separately with blue strokes and light blue fills, making it easy to distinguish positive and
negative trends at a glance.
## Horizontal bar and line plots
Single-value bar plots, running in the horizontal direction, can be made by simply invoking
`~~GT.fmt_nanoplot()` on a column of numeric values. These plots are meant for comparison across
rows so the method automatically scales the horizontal bars to facilitate this type of display.
Here's a simple example that uses `plot_type="bar"` on the `numbers` column that contains a single
numeric value in every cell.
```{python}
single_vals_df = pl.DataFrame(
    {
        "example": ["Row " + str(x) for x in range(1, 5)],
        "numbers": [2.75, 0, -3.2, 8]
    }
)
GT(single_vals_df).fmt_nanoplot(columns="numbers", plot_type="bar")
```
Interestingly enough, this also works with the `"line"` type of nanoplot. The result is akin to a
lollipop plot:
```{python}
GT(single_vals_df).fmt_nanoplot(columns="numbers")
```
You get to customize the line and the data point marker with the latter display of single values,
and that's a plus. Nonetheless, it is more common to see horizontal bar plots in tables and the
extra customization of negative values makes that form of presentation more advantageous.
## Line plots with paired *x* and *y* values
Aside from a single stream of *y* values, we can plot pairs of *x* and *y* values. This works only
for the `"line"` type of plot. We can set up a column of Polars `struct` values in a DataFrame to
have this input data prepared for `~~GT.fmt_nanoplot()`. Notice that the dictionary values in the
enclosed list must have the `"x"` and `"y"` keys. Further to this, the list lengths for each of
`"x"` and `"y"` must match (i.e., to make valid pairs of *x* and *y*).
```{python}
weather_2 = pl.DataFrame(
    {
        "station": ["Station " + str(x) for x in range(1, 4)],
        "temperatures": [
            {
                "x": [6.1, 8.0, 10.1, 10.5, 11.2, 12.4, 13.1, 15.3],
                "y": [24.2, 28.2, 30.2, 30.5, 30.5, 33.1, 33.5, 32.7],
            },
            {
                "x": [7.1, 8.2, 10.3, 10.75, 11.25, 12.5, 13.5, 14.2],
                "y": [18.2, 18.1, 20.3, 20.5, 21.4, 21.9, 23.1, 23.3],
            },
            {
                "x": [6.3, 7.1, 10.3, 11.0, 12.07, 13.1, 15.12, 16.42],
                "y": [15.2, 17.77, 21.42, 21.63, 25.23, 26.84, 27.2, 27.44],
            },
        ]
    }
)
(
    GT(weather_2)
    .fmt_nanoplot(
        columns="temperatures",
        plot_type="line",
        expand_x=[5, 16],
        expand_y=[10, 40],
        options=nanoplot_options(
            show_data_area=False,
            show_data_line=False
        )
    )
)
```
The options for removing the *data area* and the *data line* (through the corresponding `show_*`
arguments of `nanoplot_options()`) make the finalized nanoplots look somewhat like scatter plots.
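The input requirements for paired values (both `"x"` and `"y"` keys present, with lists of equal length) can be checked before the data reaches `fmt_nanoplot()`. The following is a minimal sketch in plain Python; `check_xy_pairs()` is a hypothetical helper written for illustration and is not part of **Great Tables**.

```python
def check_xy_pairs(cells):
    """Return a list of problems found in a list of {"x": [...], "y": [...]} dicts."""
    problems = []
    for i, cell in enumerate(cells):
        # Both keys must be present before lengths can be compared
        missing = {"x", "y"} - set(cell)
        if missing:
            problems.append(f"entry {i}: missing key(s) {sorted(missing)}")
        elif len(cell["x"]) != len(cell["y"]):
            problems.append(
                f"entry {i}: {len(cell['x'])} x values but {len(cell['y'])} y values"
            )
    return problems


good = {"x": [6.1, 8.0, 10.1], "y": [24.2, 28.2, 30.2]}
bad = {"x": [7.1, 8.2, 10.3], "y": [18.2, 18.1]}  # one y value is missing

print(check_xy_pairs([good, bad]))
```

An empty list means every entry forms valid pairs. Running a check like this first can make a mismatched entry easier to locate than an error raised during rendering.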
Nanoplots bring data visualization directly into your table cells, giving readers an immediate
visual sense of trends and distributions without leaving the tabular format. Between line plots, bar
charts, reference annotations, and extensive customization through `nanoplot_options()`, you can
tailor these compact visualizations to match your data and your presentation style.
## Advanced Topics
### Column Selection
Many **Great Tables** methods accept a `columns=` argument for targeting specific columns. Rather
than limiting you to a simple list of column names, the package supports a flexible selection system
that includes positional indexing, pattern-matching functions, and Polars selectors. This page
demonstrates each of these approaches.
## Selection Options
The `columns=` argument for methods like `~~GT.tab_spanner()`, `~~GT.cols_move()`, and
`~~GT.tab_style()` allows a range of options for selecting columns.
The simplest approach is just a list of strings with the exact column names. However, we can specify
columns using any of the following:
* a single string column name.
* an integer for the column's position.
* a list of strings or integers.
* a **Polars** selector.
* a function that takes a string and returns `True` or `False`.
```{python}
from great_tables import GT
from great_tables.data import exibble
lil_exibble = exibble[["num", "char", "fctr", "date", "time"]].head(4)
gt_ex = GT(lil_exibble)
gt_ex
```
This five-column table will serve as the basis for demonstrating each selection approach.
## Using integers
We can use a list of strings or integers to select columns by name or position, respectively.
```{python}
gt_ex.cols_move_to_start(columns=["date", 1, -1])
```
Note that the code above moved the following columns:
* The string `"date"` matched the column of the same name.
* The integer `1` matched the second column (this is similar to list indexing).
* The integer `-1` matched the last column.
Moreover, the order of the list defines the order of selected columns. In this case, `"date"` was
the first entry, so it's the very first column in the new table.
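The resolution rules described above can be sketched in plain Python. This is only an illustration of the behavior, assuming the five column names of `lil_exibble`; `resolve_columns()` and `move_to_start()` are hypothetical helpers, not the library's implementation.

```python
def resolve_columns(all_columns, selection):
    """Translate a mix of names and integer positions into column names."""
    resolved = []
    for item in selection:
        # Integers index into the column list, so negatives count from the end
        name = all_columns[item] if isinstance(item, int) else item
        if name not in all_columns:
            raise KeyError(f"unknown column: {name!r}")
        if name not in resolved:
            resolved.append(name)
    return resolved


def move_to_start(all_columns, selection):
    """Place the selected columns first, in selection order, then the rest."""
    selected = resolve_columns(all_columns, selection)
    rest = [name for name in all_columns if name not in selected]
    return selected + rest


columns = ["num", "char", "fctr", "date", "time"]
print(move_to_start(columns, ["date", 1, -1]))
```

The selection `["date", 1, -1]` resolves to `date`, `char`, and `time`, in that order, and the remaining columns keep their original relative order.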
## Using **Polars** selectors
When using a **Polars** DataFrame, you can select columns using
[**Polars** selectors](https://pola-rs.github.io/polars/py-polars/html/reference/selectors.html).
The example below uses **Polars** selectors to move all columns that start with `"c"` or `"f"` to
the start of the table.
```{python}
import polars as pl
import polars.selectors as cs
pl_df = pl.from_pandas(lil_exibble)
GT(pl_df).cols_move_to_start(columns=cs.starts_with("c") | cs.starts_with("f"))
```
In general, selection should match the behavior of the **Polars** `DataFrame.select()` method.
```{python}
pl_df.select(cs.starts_with("c") | cs.starts_with("f")).columns
```
See the
[Selectors page in the polars docs](https://pola-rs.github.io/polars/py-polars/html/reference/selectors.html)
for more information on this.
## Using functions
A function can be used to select columns. It should take a column name as a string and return `True`
or `False`.
```{python}
gt_ex.cols_move_to_start(columns=lambda x: "c" in x)
```
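To predict which columns a function will pick, you can apply it to the column names yourself. The following is a minimal sketch in plain Python, assuming the column names of `lil_exibble`; the predicate names are illustrative only.

```python
columns = ["num", "char", "fctr", "date", "time"]


def contains_c(name: str) -> bool:
    """Same test as the lambda above: does the name contain the letter c?"""
    return "c" in name


def ends_with_e(name: str) -> bool:
    """A second predicate, built on an ordinary string method."""
    return name.endswith("e")


# A predicate is called once per column name; names returning True are selected
print([name for name in columns if contains_c(name)])
print([name for name in columns if ends_with_e(name)])
```

Because a predicate only sees the column name, it cannot select on data type or cell contents; for that, a **Polars** selector is the better fit.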
These selection methods work consistently across all **Great Tables** methods that accept a
`columns=` argument. Whether you prefer explicit column names, positional indexing, Polars
selectors, or custom functions, you can choose the approach that best fits your workflow and data.
### Row Selection
Just as you can target specific columns, **Great Tables** also provides flexible ways to select
rows. The `rows=` argument appears in formatting methods, location specifiers, and styling calls,
allowing you to apply operations to a precise subset of your data. This page covers each of the
available selection mechanisms.
## Selection Options
Location and formatter functions (e.g. `loc.body()` and `~~GT.fmt_number()`) can be applied to
specific rows, using the `rows=` argument.
Rows may be specified using any of the following:
* None (the default), to select everything.
* an integer for the row's position.
* a list of integers.
* a **Polars** expression for filtering.
* a function that takes a DataFrame and returns a boolean Series.
The following sections will use a subset of the `exibble` data, to demonstrate these options.
```{python}
from great_tables import GT, exibble, loc, style
lil_exibble = exibble[["num", "char", "currency"]].head(3)
gt_ex = GT(lil_exibble)
```
## Using integers
Use a single integer, or a list of integers, to select rows by position.
```{python}
gt_ex.fmt_currency("currency", rows=0, decimals=1)
```
Notice that a dollar sign (`$`) was only added to the first row (index `0` in Python).
Indexing works the same as selecting items from a Python list. This means negative integers select
relative to the final row.
```{python}
gt_ex.fmt_currency("currency", rows=[0, -1], decimals=1)
```
The first and last rows now show currency formatting, while the middle row remains unchanged.
Negative indices count backward from the end, just as with Python lists.
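The list-like indexing described above can be sketched in plain Python. `normalize_rows()` is a hypothetical helper that shows how `None`, a single integer, or a list of integers would map onto row positions; it is an illustration under that assumption, not the library's code.

```python
def normalize_rows(rows, n_rows):
    """Convert None, an int, or a list of ints into non-negative row positions."""
    if rows is None:
        # The default selects every row
        return list(range(n_rows))
    if isinstance(rows, int):
        rows = [rows]
    # range(n_rows)[i] follows list-indexing rules, including negative
    # positions and an IndexError for out-of-range values
    return [range(n_rows)[i] for i in rows]


print(normalize_rows(0, 3))
print(normalize_rows([0, -1], 3))
print(normalize_rows(None, 3))
```

With three rows, `[0, -1]` resolves to positions `0` and `2`, matching the first and last rows formatted in the example above.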
## Using polars expressions
The `rows=` argument accepts Polars expressions that evaluate to a boolean Series indicating which
rows to operate on.
For example, the code below formats the `num` column, but only in rows where `currency` is less than 40.
```{python}
import polars as pl
gt_polars = GT(pl.from_pandas(lil_exibble))
gt_polars.fmt_integer("num", rows=pl.col("currency") < 40)
```
Here's a more realistic example, which highlights the row with the highest value for currency.
```{python}
import polars.selectors as cs
gt_polars.tab_style(
style.fill("yellow"),
loc.body(
columns=cs.all(),
rows=pl.col("currency") == pl.col("currency").max()
)
)
```
The row with the maximum currency value is highlighted with a yellow background. Using expressions
for row selection keeps the logic declarative and close to the styling call.
## Using a function
Since libraries like `pandas` don't have lazy expressions, the `rows=` argument also accepts a
function for selecting rows. The function should take a DataFrame and return a boolean series.
Here's the same example as the previous polars section, but with pandas data, and a lambda for
selecting rows.
```{python}
gt_ex.fmt_integer("num", rows=lambda D: D["currency"] < 40)
```
Here's the styling example from the previous polars section.
```{python}
gt_ex.tab_style(
style.fill("yellow"),
loc.body(
columns=lambda colname: True,
rows=lambda D: D["currency"] == D["currency"].max()
)
)
```
Whether you prefer integer indexing for quick positional access, Polars expressions for declarative
filtering, or functions for compatibility with pandas, the `rows=` argument adapts to your data
workflow. Combined with column selection, these tools give you fine-grained control over exactly
which cells your formatting and styling operations affect.
### Location Selection
The `loc` module is what connects your styling intentions to specific parts of the table. Each
location specifier identifies a region of the table (such as the header, body, stub, or footer) and
many of them also support targeting specific columns or rows within that region. This page provides
a comprehensive overview of all available location specifiers and how to use them effectively.
## Overview
Great Tables uses the `loc` module to specify locations for styling in `~~GT.tab_style()`. Some
location specifiers also allow selecting specific columns and rows of data.
For example, you might style a particular row name, group, column, or spanner label.
The table below shows the different location specifiers, along with the types of column or row
selection they allow.
```{python}
# | echo: false
import polars as pl
from great_tables import GT
data = [
["header", "loc.header()", "composite"],
["", "loc.title()", ""],
["", "loc.subtitle()", ""],
["boxhead", "loc.column_header()", "composite"],
["", "loc.spanner_labels()", "columns"],
["", "loc.column_labels()", "columns"],
["row stub", "loc.stub()", "rows"],
["", "loc.row_groups()", "rows"],
# ["", "loc.summary_stub()", "rows"],
["", "loc.grand_summary_stub()", "rows"],
["table body", "loc.body()", "columns and rows"],
# ["", "loc.summary_rows()", "columns and rows"],
["", "loc.grand_summary_rows()", "columns and rows"],
["footer", "loc.footer()", "composite"],
["", "loc.source_notes()", ""],
]
df = pl.DataFrame(data, schema=["table part", "name", "selection"], orient="row")
GT(df)
```
Note that composite specifiers are ones that target multiple locations. For example, `loc.header()`
specifies both `loc.title()` and `loc.subtitle()`.
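The relationship between composite and simple specifiers can be written down as a small lookup. This sketch is based solely on the rows of the table above and uses plain strings rather than real `loc` objects; `COMPOSITES` and `expand()` are hypothetical names used for illustration.

```python
# Composite specifier -> the simple locations listed beneath it in the table above
COMPOSITES = {
    "loc.header()": ["loc.title()", "loc.subtitle()"],
    "loc.column_header()": ["loc.spanner_labels()", "loc.column_labels()"],
    "loc.footer()": ["loc.source_notes()"],
}


def expand(location):
    """Return the simple locations a specifier targets (itself if not composite)."""
    return COMPOSITES.get(location, [location])


print(expand("loc.header()"))
print(expand("loc.body()"))
```

Styling a composite location is therefore equivalent to styling each of its simple locations with the same style.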
## Setting up data
The examples below will use this small dataset to show selecting different locations, as well as
specific rows and columns within a location (where supported).
```{python}
import polars as pl
import polars.selectors as cs
from great_tables import GT, loc, style, exibble
pl_exibble = pl.from_pandas(exibble)[[0, 1, 4], ["num", "char", "group"]]
pl_exibble
```
This small three-row, three-column dataset gives us enough structure to demonstrate row and column
targeting without cluttering the output.
## Simple locations
Simple locations don't take any arguments.
For example, styling the title uses `loc.title()`.
```{python}
(
GT(pl_exibble)
.tab_header("A title", "A subtitle")
.tab_style(
style.fill("yellow"),
loc.title(),
)
)
```
Only the title receives the yellow fill; the subtitle and the rest of the table remain unstyled.
Simple locations are useful when you want precise control over a single element.
## Composite locations
Composite locations target multiple simple locations.
For example, `loc.header()` includes both `loc.title()` and `loc.subtitle()`.
```{python}
(
GT(pl_exibble)
.tab_header("A title", "A subtitle")
.tab_style(
style.fill("yellow"),
loc.header(),
)
)
```
Both the title and subtitle are filled with yellow because `loc.header()` targets the entire header
region. Composite locations are a convenient shorthand when you want the same style on all
sub-parts.
## Body columns, rows and mask
Use `columns=` and `rows=` in `loc.body()` to style specific cells in the table body.
```{python}
(
GT(pl_exibble).tab_style(
style.fill("yellow"),
loc.body(
columns=cs.starts_with("cha"),
rows=pl.col("char").str.contains("a"),
),
)
)
```
Alternatively, use `mask=` in `loc.body()` to apply conditional styling to rows on a per-column
basis.
```{python}
(
GT(pl_exibble).tab_style(
style.fill("yellow"),
loc.body(mask=cs.string().str.contains("p")),
)
)
```
This is discussed in detail in [Styling the Table Body](./11-styling-the-table-body.qmd).
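To make "per-column basis" concrete, the sketch below reproduces the idea in plain Python: a test is evaluated for every cell in each string column, and each passing cell is selected independently of the others in its row. The data values are assumed stand-ins for `pl_exibble`, and `mask_cells()` is a hypothetical helper, not how `loc.body()` is implemented.

```python
# Assumed stand-in for the three-row pl_exibble data
table = {
    "num": [0.1111, 2.222, 5550.0],
    "char": ["apricot", "banana", "eggplant"],
    "group": ["grp_a", "grp_a", "grp_b"],
}


def mask_cells(table, predicate):
    """Return (row, column) pairs for string cells where predicate is true."""
    cells = []
    for column, values in table.items():
        # Only string columns take part, mirroring the cs.string() selector
        if not all(isinstance(v, str) for v in values):
            continue
        cells.extend((row, column) for row, v in enumerate(values) if predicate(v))
    return cells


print(mask_cells(table, lambda v: "p" in v))
```

Under these assumed values, two cells in `char` and all three cells in `group` are selected, since every group label contains the letter "p". A `rows=` filter, by contrast, would select the same rows for every targeted column.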
## Column labels
Locations like `loc.spanner_labels()` and `loc.column_labels()` can select specific column and
spanner labels.
You can use name strings, index position, or polars selectors.
```{python}
GT(pl_exibble).tab_style(
style.fill("yellow"),
loc.column_labels(
cs.starts_with("cha"),
),
)
```
However, note that `loc.spanner_labels()` currently only accepts a list of string names.
## Row and group names
Row and group names in `loc.stub()` and `loc.row_groups()` may be specified three ways:
* by name
* by index
* by polars expression
```{python}
gt = GT(pl_exibble).tab_stub(
rowname_col="char",
groupname_col="group",
)
gt.tab_style(style.fill("yellow"), loc.stub())
```
All row labels in the stub are highlighted in yellow.
```{python}
gt.tab_style(style.fill("yellow"), loc.stub("banana"))
```
Only the `"banana"` row label is styled, demonstrating name-based targeting.
```{python}
gt.tab_style(style.fill("yellow"), loc.stub(["apricot", 2]))
```
You can mix names and integer indices in a list to target multiple specific rows at once.
### Groups by name and position
Note that row groups can be specified either by group name or by row number in the original data.
A row number selects the group that the row belongs to.
For example, the code below styles the group corresponding to the row at index 1 (i.e., the second
row) in the data.
```{python}
gt.tab_style(
style.fill("yellow"),
loc.row_groups(1),
)
```
Since the second row (starting with "banana") is in "grp_a", that is the group that gets styled.
This means you can use a polars expression to select groups:
```{python}
gt.tab_style(
style.fill("yellow"),
loc.row_groups(pl.col("group") == "grp_b"),
)
```
You can also specify group names using a string (or list of strings).
```{python}
gt.tab_style(
style.fill("yellow"),
loc.row_groups("grp_b"),
)
```
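The row-to-group lookup used in this section can be sketched in plain Python. The group labels are assumed to match the three rows of `pl_exibble`, and `groups_for()` is a hypothetical helper for illustration only.

```python
groups = ["grp_a", "grp_a", "grp_b"]  # group label of each row, in data order


def groups_for(selection, groups):
    """Resolve row positions and/or group names to a set of group names."""
    if isinstance(selection, (int, str)):
        selection = [selection]
    resolved = set()
    for item in selection:
        # An integer refers to a row; the group that row belongs to is selected
        resolved.add(groups[item] if isinstance(item, int) else item)
    return resolved


print(groups_for(1, groups))
print(groups_for("grp_b", groups))
```

Row index `1` resolves to `grp_a` because that is the group of the second row, while the string `"grp_b"` names the group directly.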
The `loc` module provides a complete vocabulary for addressing any part of your table. By combining
location specifiers with column selectors, row filters, and Polars expressions, you can apply styles
to exactly the right cells. For more details on styling itself, see
[Styling the Table Body](./11-styling-the-table-body.qmd) and
[Styling the Whole Table](./12-styling-the-whole-table.qmd).
### Exporting and Saving Tables
Once you have built a table, you need to get it into its final destination. That might be a notebook
cell, a standalone HTML file, a LaTeX document, or an image file for inclusion in a report or
presentation. **Great Tables** provides several export methods to cover these use cases, each with
options to control the output format.
## Displaying Tables
In most notebook environments (Jupyter, Quarto, Marimo), simply placing a `GT` object as the last
expression in a cell will render the table automatically. However, you can also use the
`~~GT.show()` method for explicit control over where the table is displayed.
```{python}
from great_tables import GT
from great_tables.data import exibble
gt_tbl = (
GT(exibble.head(3)[["num", "char", "currency"]])
.tab_header(title="Example Table", subtitle="A small demonstration")
.fmt_currency(columns="currency")
.fmt_number(columns="num", decimals=2)
)
gt_tbl.show()
```
The `target=` argument controls the display destination. The available options are:
- `"auto"` (the default): displays inline in a notebook if possible, otherwise opens a browser
window.
- `"notebook"`: forces inline notebook display.
- `"browser"`: opens the table in your default web browser. This is particularly useful when working
in the console or when you want to see the full styled output that some IDEs may suppress.
```python
# Open in a browser window (useful when running from a script or console)
gt_tbl.show(target="browser")
```
## Getting HTML as a String
The `~~GT.as_raw_html()` method returns the table as an HTML string. This is useful for embedding
tables in web applications, email templates, or custom HTML documents.
```{python}
html_str = gt_tbl.as_raw_html()
# Show the first 200 characters to see the structure
print(html_str[:200])
```
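As one way to use the returned string, the sketch below places it inside a hand-written HTML page and saves the page to disk using only the standard library. It assumes the string is an HTML fragment; the `html_str` value here is a short stand-in rather than real `as_raw_html()` output, and the page title, heading, and file name are arbitrary choices.

```python
from pathlib import Path
import tempfile

# Stand-in for the string returned by gt_tbl.as_raw_html()
html_str = "<div><table><tr><td>0.11</td></tr></table></div>"

# Wrap the fragment in a minimal page; the surrounding markup is up to you
page = f"""<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Report</title></head>
<body>
<h1>Example report</h1>
{html_str}
</body>
</html>
"""

out_path = Path(tempfile.gettempdir()) / "report.html"
out_path.write_text(page, encoding="utf-8")
print(out_path)
```

The same pattern applies to web frameworks and templating engines: the table string is inserted wherever the template expects markup.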
The method accepts several arguments that control the output format.
### Inline CSS for Email
Email clients typically strip `