GT.fmt_nanoplot

GT.fmt_nanoplot(
    self,
    columns=None,
    rows=None,
    plot_type='line',
    plot_height='2em',
    missing_vals='marker',
    autoscale=False,
    reference_line=None,
    reference_area=None,
    expand_x=None,
    expand_y=None,
    options=None,
)

Format data for nanoplot visualizations.

The fmt_nanoplot() method formats the data in the targeted columns as nanoplots. The plot_type= parameter selects between the two documented plot types: line plots and bar plots.

Warning

fmt_nanoplot() is still experimental.

Parameters

columns : str | None = None: The columns to target. Can either be a single column name or a series of column names provided in a list.
rows : int | list[int] | None = None: In conjunction with columns=, we can specify which of their rows should undergo formatting. The default is all rows, resulting in all rows in targeted columns being formatted. Alternatively, we can supply a list of row indices.
plot_type : PlotType = 'line': Nanoplots can either take the form of a line plot (using "line") or a bar plot (with "bar"). A line plot, by default, contains layers for a data line, data points, and a data area. With a bar plot, the always visible layer is that of the data bars.
plot_height : str = '2em': The height of the nanoplots. The default here is a sensible value of "2em".
missing_vals : MissingVals = 'marker': If missing values are encountered within the input data, there are four strategies available for their handling: (1) "gap" will show data gaps at the sites of missing data, where data lines will have discontinuities and bar plots will have missing bars; (2) "marker" will behave like "gap" but show prominent visual marks at the missing data locations; (3) "zero" will replace missing values with zero values; and (4) "remove" will remove any incoming missing values.
autoscale : bool = False: Using autoscale=True will ensure that the bounds of all nanoplots produced are based on the limits of data combined from all input rows. This will result in a shared scale across all of the nanoplots (for y- and x-axis data), which is useful in those cases where the nanoplot data should be compared across rows.
reference_line : str | int | float | None = None: A reference line requires a single input to define the line. It could be a numeric value, applied to all nanoplots generated. Or, the input can be one of the following for generating the line from the underlying data: (1) "mean", (2) "median", (3) "min", (4) "max", (5) "q1", (6) "q3", (7) "first", or (8) "last".
reference_area : list[Any] | None = None: A reference area requires a list of two values that define the bottom and top boundaries (in the y direction) of a rectangular area. The values follow the same rules as those for reference_line=: each is either a numeric value or one of the following keywords, which generate the value from the underlying data: (1) "mean", (2) "median", (3) "min", (4) "max", (5) "q1", (6) "q3", (7) "first", or (8) "last".
expand_x : list[int] | list[float] | list[int | float] | None = None: Should you need to have plots expand in the x direction, provide one or more values to expand_x=. Any provided value that lies outside the range of the x-value data given to the plot will result in an x-scale expansion.
expand_y : list[int] | list[float] | list[int | float] | None = None: Similar to expand_x=, one can have plots expand in the y direction. To make this happen, provide one or more values to expand_y=. Any provided value that lies outside the range of the y-value data will result in a y-scale expansion.
options : dict[str, Any] | None = None: By using the nanoplot_options() helper function here, you can alter the layout and styling of the nanoplots in the targeted columns.
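To make the missing_vals= and expand_y= descriptions above concrete, here is a minimal plain-Python sketch of what each option means for a single row of values. It is not Great Tables' implementation: handle_missing() and expanded_range() are hypothetical helpers written only for illustration, and they mirror the documented behavior rather than the library's internals.

```python
from typing import List, Optional, Sequence, Tuple

Value = Optional[float]


def handle_missing(values: Sequence[Value], strategy: str = "marker") -> List[Value]:
    """Hypothetical helper mirroring the four documented missing_vals strategies."""
    if strategy in ("gap", "marker"):
        # Both keep the hole in the series; per the description, "marker"
        # differs only in that the plot also draws a visible mark there.
        return list(values)
    if strategy == "zero":
        # Missing entries are replaced with zero.
        return [0.0 if v is None else v for v in values]
    if strategy == "remove":
        # Missing entries are dropped, so the series gets shorter.
        return [v for v in values if v is not None]
    raise ValueError(f"unknown strategy: {strategy!r}")


def expanded_range(
    values: Sequence[Value], expand: Optional[Sequence[float]] = None
) -> Tuple[float, float]:
    """Hypothetical helper: the data range, widened by any expand values outside it."""
    present = [v for v in values if v is not None]
    candidates = present + list(expand or [])
    return min(candidates), max(candidates)


row = [2.0, None, 15.0, 7.0]
print(handle_missing(row, "zero"))    # [2.0, 0.0, 15.0, 7.0]
print(handle_missing(row, "remove"))  # [2.0, 15.0, 7.0]
print(expanded_range(row, expand=[0.0, 10.0]))  # (0.0, 15.0)
```

In the last call only 0.0 widens the range, because 10.0 already lies inside the data range of 2.0 to 15.0.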

Returns

GT: The GT object is returned. This is the same object that the method is called on so that we can facilitate method chaining.

Details

Nanoplots try to show individual data with reasonably good visibility. Interactivity is included as a basic feature: hovering over the data points displays vertical guides that show the value ascribed to each point. Because Great Tables knows all about numeric formatting, values will be compactly formatted so as to not take up valuable real estate.

While basic customization options are present in fmt_nanoplot(), many more opportunities for customizing nanoplots on a more granular level are possible with the aforementioned nanoplot_options() helper function. With that, layers of the nanoplots can be selectively removed and the aesthetics of the remaining plot components can be modified.

Examples

Let’s create a nanoplot from a Polars DataFrame containing multiple numbers per cell. The numbers are represented here as strings, where spaces separate the values, and the same values are present in two columns: lines and bars. We will use the fmt_nanoplot() method twice to create a line plot and a bar plot from the data in their respective columns.

from great_tables import GT
import polars as pl

random_numbers_df = pl.DataFrame(
    {
        "i": range(1, 5),
        "lines": [
            "20 23 6 7 37 23 21 4 7 16",
            "2.3 6.8 9.2 2.42 3.5 12.1 5.3 3.6 7.2 3.74",
            "-12 -5 6 3.7 0 8 -7.4",
            "2 0 15 7 8 10 1 24 17 13 6",
        ],
    }
).with_columns(bars=pl.col("lines"))

(
    GT(random_numbers_df, rowname_col="i")
    .fmt_nanoplot(columns="lines")
    .fmt_nanoplot(columns="bars", plot_type="bar")
)

[Rendered table: rows 1 to 4, with a line nanoplot in the lines column and a bar nanoplot in the bars column]
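As a rough illustration of the string input format used in this example, the sketch below splits one cell on whitespace and converts each token to a number. The parse_values() helper is hypothetical and written only for this illustration; the parsing that fmt_nanoplot() performs internally may differ.

```python
from typing import List


def parse_values(cell: str) -> List[float]:
    """Hypothetical helper: turn a space-separated cell into a list of floats."""
    # str.split() with no argument splits on any run of whitespace.
    return [float(token) for token in cell.split()]


cells = [
    "20 23 6 7 37 23 21 4 7 16",
    "-12 -5 6 3.7 0 8 -7.4",
]

for cell in cells:
    print(parse_values(cell))
```

Each cell may hold a different number of values, which is why the rows in the example do not all have the same length.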

We can always represent the input DataFrame in a different way (here, each cell is a dictionary holding a list of values under the "val" key) and fmt_nanoplot() will still work. While the input data is the same as in the previous example, we’ll take the opportunity here to add a reference line and a reference area to both the line plot and the bar plot.

random_numbers_df = pl.DataFrame(
    {
        "i": range(1, 5),
        "lines": [
            { "val": [20.0, 23.0, 6.0, 7.0, 37.0, 23.0, 21.0, 4.0, 7.0, 16.0] },
            { "val": [2.3, 6.8, 9.2, 2.42, 3.5, 12.1, 5.3, 3.6, 7.2, 3.74] },
            { "val": [-12.0, -5.0, 6.0, 3.7, 0.0, 8.0, -7.4] },
            { "val": [2.0, 0.0, 15.0, 7.0, 8.0, 10.0, 1.0, 24.0, 17.0, 13.0, 6.0] },
        ],
    }
).with_columns(bars=pl.col("lines"))

(
    GT(random_numbers_df, rowname_col="i")
    .fmt_nanoplot(
        columns="lines",
        reference_line="mean",
        reference_area=["min", "q1"]
    )
    .fmt_nanoplot(
        columns="bars",
        plot_type="bar",
        reference_line="max",
        reference_area=["max", "median"],
    )
)

[Rendered table: rows 1 to 4, with line and bar nanoplots that each include a reference line and a reference area]
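The keywords accepted by reference_line= and reference_area= each stand for a single number derived from a row's data. The sketch below shows one way to picture that mapping in plain Python. The reference_value() helper is hypothetical, and the quartile method in particular is an assumption: this documentation does not say how Great Tables computes "q1" and "q3", so the values it draws for those two keywords may differ slightly.

```python
import statistics
from typing import List, Union


def reference_value(values: List[float], ref: Union[str, int, float]) -> float:
    """Hypothetical helper: resolve a reference keyword (or number) for one row."""
    if isinstance(ref, (int, float)):
        # A numeric input is used as-is for every nanoplot.
        return float(ref)
    if ref in ("q1", "q3"):
        # Assumption: inclusive quartiles; the library's method is not documented here.
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return q1 if ref == "q1" else q3
    lookup = {
        "mean": statistics.fmean,
        "median": statistics.median,
        "min": min,
        "max": max,
        "first": lambda v: v[0],
        "last": lambda v: v[-1],
    }
    return lookup[ref](values)


row = [20.0, 23.0, 6.0, 7.0, 37.0, 23.0, 21.0, 4.0, 7.0, 16.0]
print(reference_value(row, "median"))  # 18.0
print(reference_value(row, "min"))     # 4.0
print(reference_value(row, "max"))     # 37.0
print(reference_value(row, "last"))    # 16.0
```

Under this reading, reference_area=["min", "q1"] shades the band between the row's minimum and its first quartile, and each row gets its own band because the values are computed per row.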

Here’s an example that adjusts some of the options using nanoplot_options().

from great_tables import nanoplot_options

(
    GT(random_numbers_df, rowname_col="i")
    .fmt_nanoplot(
        columns="lines",
        reference_line="mean",
        reference_area=["min", "q1"],
        options=nanoplot_options(
            data_point_radius=8,
            data_point_stroke_color="black",
            data_point_stroke_width=2,
            data_point_fill_color="white",
            data_line_type="straight",
            data_line_stroke_color="brown",
            data_line_stroke_width=2,
            data_area_fill_color="orange",
            vertical_guide_stroke_color="green",
        ),
    )
    .fmt_nanoplot(
        columns="bars",
        plot_type="bar",
        reference_line="max",
        reference_area=["max", "median"],
        options=nanoplot_options(
            data_bar_stroke_color="gray",
            data_bar_stroke_width=2,
            data_bar_fill_color="orange",
            data_bar_negative_stroke_color="blue",
            data_bar_negative_stroke_width=1,
            data_bar_negative_fill_color="lightblue",
            reference_line_color="pink",
            reference_area_fill_color="bisque",
            vertical_guide_stroke_color="blue",
        ),
    )
)

[Rendered table: rows 1 to 4, with line and bar nanoplots styled through nanoplot_options()]

Single-value bar plots and line plots can be made with fmt_nanoplot(). These run in the horizontal direction, which is ideal for tabular presentation. The key requirement is that the targeted column holds a single numeric value in each cell. These plots are meant for comparison across rows, so the method automatically scales the horizontal bars to facilitate this type of display. The following example shows how fmt_nanoplot() can be used to create single-value bar and line plots.

single_vals_df = pl.DataFrame(
    {
        "i": range(1, 6),
        "bars": [4.1, 1.3, -5.3, 0, 8.2],
        "lines": [12.44, 6.34, 5.2, -8.2, 9.23]
    }
)

(
    GT(single_vals_df, rowname_col="i")
    .fmt_nanoplot(columns="bars", plot_type="bar")
    .fmt_nanoplot(columns="lines", plot_type="line")
)

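[Rendered table: rows 1 to 5, with a single-value horizontal bar plot in the bars column and a single-value line plot in the lines column]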
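To see why a shared scale makes single values comparable across rows, the sketch below computes common bounds for the bars column and expresses each value as a fraction of the largest magnitude. Both helpers are hypothetical and only illustrate the idea; how Great Tables actually lays out its horizontal bars is not specified in this documentation.

```python
from typing import List, Sequence, Tuple


def shared_bounds(values: Sequence[float]) -> Tuple[float, float]:
    """Hypothetical helper: one (low, high) pair used for every row."""
    return min(values), max(values)


def relative_lengths(values: Sequence[float]) -> List[float]:
    """Hypothetical helper: each value as a signed fraction of the largest magnitude."""
    largest = max(abs(v) for v in values)
    if largest == 0:
        # All values are zero, so every bar has zero length.
        return [0.0 for _ in values]
    return [v / largest for v in values]


bars = [4.1, 1.3, -5.3, 0, 8.2]
print(shared_bounds(bars))        # (-5.3, 8.2)
print(relative_lengths(bars)[0])  # 0.5
```

Because every row is measured against the same bounds, the bar for 4.1 comes out half as long as the bar for 8.2, and the negative value extends in the opposite direction.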