Create a comprehensive data summary table with visualizations.
The gt_plt_summary() function takes a DataFrame and generates a summary table showing key statistics and a visual representation for each column. Each row displays the column type, the percentage of missing data, descriptive statistics (mean, median, and standard deviation), and a small plot suited to the data type: a histogram for numeric and datetime columns, and a categorical bar chart for string columns.
Inspired by the Observable team and the observablehq/SummaryTable function: https://observablehq.com/@observablehq/summary-table
Parameters
df:IntoDataFrame
A DataFrame to summarize. Can be any DataFrame type that you would pass into a GT.
title:str | None=None
Optional title for the summary table. If None, defaults to “Summary Table”.
show_desc_stats:bool=True
Whether to show the Mean, Median, and SD columns. Set to False to hide them.
add_mode:bool=False
Whether to add a Mode column to the table.
interactivity:bool=True
Whether the graphs in the Plot Overview column are interactive. Interactivity here means the hover CSS and tooltip code applied to the graphs.
new_color_mapping:dict | None=None
A dictionary that maps data types (string, numeric, datetime, boolean, and other) to their corresponding color codes in hexadecimal format.
Returns
:GT
A GT object containing the summary table with columns for Type, Column name, Plot Overview, Missing percentage, Mean, Median, and Standard Deviation.
The datatype (dtype) of each column in your DataFrame determines its classified type in the summary table. Keep in mind that pandas and polars can handle datatypes differently, especially when null values are present.
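As a small illustration of the null-value caveat, the sketch below uses pandas only and shows how a missing value changes the inferred dtype of a column. It does not call gt_plt_summary() and makes no claim about how the function classifies each of these dtypes. The variable names are hypothetical.

```python
import pandas as pd

# The same kind of data, with and without a missing value.
ints_complete = pd.Series([1, 2, 3])
ints_with_null = pd.Series([1, 2, None])
bools_with_null = pd.Series([True, False, None])

# Requesting a nullable dtype explicitly keeps integers as integers.
nullable_ints = pd.Series([1, 2, None], dtype="Int64")

print(ints_complete.dtype)    # int64
print(ints_with_null.dtype)   # float64: the None becomes NaN
print(bools_with_null.dtype)  # object: no longer a plain bool column
print(nullable_ints.dtype)    # Int64: the missing value is stored as <NA>
```

Inspecting `df.dtypes` before building the summary is one way to spot columns whose dtype changed because of nulls. Polars, by contrast, keeps an integer dtype when a column contains nulls, which is one source of the differences mentioned above.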