gdtest_tbl_preview — Synthetic Package Details

#286 gdtest_tbl_preview OK INIT

Table preview showcase with diverse table types and options.

Table preview showcase exercising tbl_preview() across twelve user-guide pages. The first six cover basic dict/list-of-dicts previews, Pandas DataFrames, Polars DataFrames, missing-value highlighting (None/NaN/Inf), column subsets and wide tables, and a full-options page. The remaining six cover text-heavy tables, TSV, JSONL, Parquet, and Feather/Arrow IPC files, and in-memory PyArrow tables. Tests badge rendering (Table/Pandas/Polars), head/tail splitting, dark-mode CSS, HTML escaping, and side-by-side comparison with raw DataFrame output.


Build Mode

○ No great-docs.yml

This package has no pre-supplied config. It tests the full great-docs init → great-docs build pipeline from scratch, relying entirely on auto-detection of the package layout, docstring style, and exports.

Dimensions

A1 B1 C1 D1 M2 G1

A1: Flat layout (layout)

B1: Explicit __all__ (exports)

C1: Functions only (objects)

D1: NumPy (docstrings)

M2: Numbered UG files (user_guide)

G1: README.md (landing)

Source Files

📁 gdtest_tbl_preview/

📄 __init__.py

"""Sample data generators for table preview demos."""

__version__ = "0.1.0"
__all__ = [
    "sample_scores",
    "sample_inventory",
    "sample_wide",
    "sample_missing",
    "sample_types",
]

from .data import (
    sample_scores,
    sample_inventory,
    sample_wide,
    sample_missing,
    sample_types,
)

📄 data.py

"""Functions that generate sample data for preview demos."""

from __future__ import annotations


def sample_scores(n: int = 20) -> dict[str, list]:
    """
    Generate a student scores dataset.

    Parameters
    ----------
    n
        Number of rows.

    Returns
    -------
    dict[str, list]
        Column-oriented dict with name, subject, score, grade, and
        pass/fail columns.

    Examples
    --------
    >>> data = sample_scores(5)
    >>> len(data["name"])
    5
    """
    import random
    random.seed(42)
    names = ["Alice", "Bob", "Charlie", "Diana", "Eve",
             "Frank", "Grace", "Hank", "Iris", "Jack"]
    subjects = ["Math", "Science", "English", "History", "Art"]
    grades = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "D", "F"]
    rows_name = [random.choice(names) for _ in range(n)]
    rows_subj = [random.choice(subjects) for _ in range(n)]
    rows_score = [round(random.uniform(40, 100), 1) for _ in range(n)]
    rows_grade = [random.choice(grades) for _ in range(n)]
    rows_pass = [s >= 60.0 for s in rows_score]
    return {
        "name": rows_name,
        "subject": rows_subj,
        "score": rows_score,
        "grade": rows_grade,
        "passed": rows_pass,
    }


def sample_inventory(n: int = 30) -> dict[str, list]:
    """
    Generate a product inventory dataset.

    Parameters
    ----------
    n
        Number of rows.

    Returns
    -------
    dict[str, list]
        Column-oriented dict with product, category, price, stock,
        and rating columns.

    Examples
    --------
    >>> data = sample_inventory(10)
    >>> len(data["product"])
    10
    """
    import random
    random.seed(99)
    products = [
        "Widget", "Gadget", "Doohickey", "Thingamajig",
        "Gizmo", "Whatchamacallit", "Contraption", "Apparatus",
    ]
    categories = ["Electronics", "Tools", "Kitchen", "Garden", "Office"]
    rows_prod = [random.choice(products) for _ in range(n)]
    rows_cat = [random.choice(categories) for _ in range(n)]
    rows_price = [round(random.uniform(5.0, 200.0), 2) for _ in range(n)]
    rows_stock = [random.randint(0, 500) for _ in range(n)]
    rows_rating = [round(random.uniform(1.0, 5.0), 1) for _ in range(n)]
    return {
        "product": rows_prod,
        "category": rows_cat,
        "price": rows_price,
        "stock": rows_stock,
        "rating": rows_rating,
    }


def sample_wide(n_rows: int = 15, n_cols: int = 20) -> dict[str, list]:
    """
    Generate a wide dataset with many columns.

    Parameters
    ----------
    n_rows
        Number of rows.
    n_cols
        Number of columns.

    Returns
    -------
    dict[str, list]
        Column-oriented dict with columns named ``col_001``
        through ``col_{n_cols:03d}``.

    Examples
    --------
    >>> data = sample_wide(5, 8)
    >>> len(data)
    8
    """
    import random
    random.seed(7)
    return {
        f"col_{i+1:03d}": [round(random.gauss(0, 1), 3) for _ in range(n_rows)]
        for i in range(n_cols)
    }


def sample_missing(n: int = 15) -> dict[str, list]:
    """
    Generate a dataset riddled with missing values.

    Parameters
    ----------
    n
        Number of rows.

    Returns
    -------
    dict[str, list]
        Column-oriented dict where roughly a quarter of the values in
        ``alpha``, ``beta``, and ``delta`` are ``None`` and roughly a
        fifth of the values in ``gamma`` are ``float('nan')``.

    Examples
    --------
    >>> data = sample_missing(10)
    >>> None in data["alpha"]
    True
    """
    import random
    random.seed(13)

    def _maybe_none(val):
        return None if random.random() < 0.25 else val

    return {
        "alpha": [_maybe_none(random.choice(["foo", "bar", "baz"])) for _ in range(n)],
        "beta": [_maybe_none(round(random.gauss(50, 15), 2)) for _ in range(n)],
        "gamma": [
            float("nan") if random.random() < 0.2 else random.randint(1, 100)
            for _ in range(n)
        ],
        "delta": [_maybe_none(random.choice([True, False])) for _ in range(n)],
    }


def sample_types() -> dict[str, list]:
    """
    Generate a dataset that exercises many Python types.

    Returns
    -------
    dict[str, list]
        Six rows with int, float, bool, string, and large-number
        columns, with ``None``, NaN, and Inf values mixed in.

    Examples
    --------
    >>> data = sample_types()
    >>> len(data["integer"])
    6
    """
    return {
        "integer": [0, 1, -42, 1_000_000, 2**31, None],
        "floating": [0.0, 3.14, -2.718, 1e10, float("inf"), float("nan")],
        "boolean": [True, False, True, False, None, True],
        "text": ["hello", "world", "", "café", "<b>bold</b>", None],
        "big_number": [10**18, 10**15, 10**12, 10**9, 10**6, 10**3],
    }

📁 user_guide/

📄 01-basic-preview.qmd

---
title: Basic Preview
---

## Default Settings

The simplest way to use `tbl_preview()` — pass a column-oriented
dict and let the defaults do the work.

```{python}
from great_docs import tbl_preview
from gdtest_tbl_preview import sample_scores

tbl_preview(sample_scores(20))
```

## From a List of Dicts

You can also pass a list of row dicts:

```{python}
rows = [
    {"city": "Tokyo", "pop_m": 37.4, "country": "Japan"},
    {"city": "Delhi", "pop_m": 32.9, "country": "India"},
    {"city": "Shanghai", "pop_m": 29.2, "country": "China"},
    {"city": "São Paulo", "pop_m": 22.4, "country": "Brazil"},
    {"city": "Mexico City", "pop_m": 21.8, "country": "Mexico"},
]
tbl_preview(rows)
```
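
The two input shapes carry the same information. As a minimal sketch (the helper `rows_to_columns` is hypothetical and not part of Great Docs), a list of row dicts can be pivoted into the column-oriented form used in the first example:

```python
def rows_to_columns(rows):
    """Pivot a list of row dicts into a column-oriented dict.

    Hypothetical helper for illustration; keys missing from a row
    are filled with None.
    """
    columns = {}
    # Collect column names in first-seen order.
    for row in rows:
        for key in row:
            columns.setdefault(key, [])
    # Append one value per column for every row.
    for row in rows:
        for key, values in columns.items():
            values.append(row.get(key))
    return columns


rows = [
    {"city": "Tokyo", "pop_m": 37.4},
    {"city": "Delhi", "pop_m": 32.9},
]
columns = rows_to_columns(rows)
```

Filling absent keys with `None` is a design choice of this sketch; it keeps every column the same length.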

## With a Caption

```{python}
tbl_preview(
    sample_scores(12),
    caption="Student Performance — Fall 2025",
)
```

📄 02-pandas-tables.qmd

---
title: Pandas Tables
---

## Pandas DataFrame

Pass a Pandas DataFrame directly. The preview auto-detects the
library and shows a **Pandas** badge.

```{python}
import pandas as pd
from great_docs import tbl_preview

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Diana", "Eve",
             "Frank", "Grace", "Hank", "Iris", "Jack",
             "Kate", "Leo", "Mia", "Noah", "Olivia"],
    "department": ["Eng", "Sales", "Eng", "HR", "Sales",
                   "Eng", "HR", "Sales", "Eng", "HR",
                   "Sales", "Eng", "HR", "Sales", "Eng"],
    "salary": [95000, 72000, 88000, 65000, 78000,
              105000, 62000, 81000, 92000, 58000,
              74000, 110000, 67000, 83000, 97000],
    "years": [5, 3, 7, 2, 4, 10, 1, 6, 8, 3, 4, 12, 2, 5, 9],
})

tbl_preview(df)
```

## Custom Head and Tail

Show 8 rows from the top and 3 from the bottom:

```{python}
tbl_preview(df, n_head=8, n_tail=3)
```
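
To make the head/tail split concrete, here is a minimal sketch of which row indices such a preview would keep. The helper `preview_indices` is hypothetical, and the rule that every row is shown once `n_head + n_tail` covers the whole table is an assumption, not confirmed behavior of `tbl_preview()`.

```python
def preview_indices(n_rows, n_head, n_tail):
    """Return (head, tail) lists of zero-based row indices to display.

    Hypothetical helper; assumes all rows are shown (with an empty
    tail) when the head and tail together cover the whole table.
    """
    if n_head + n_tail >= n_rows:
        return list(range(n_rows)), []
    head = list(range(n_head))
    tail = list(range(n_rows - n_tail, n_rows))
    return head, tail


# 15 rows, 8 from the top and 3 from the bottom.
head, tail = preview_indices(15, n_head=8, n_tail=3)
```

With these inputs the sketch keeps rows 0 to 7 and 12 to 14, leaving four rows hidden in the middle.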

## Show All Rows

```{python}
tbl_preview(df, show_all=True)
```

📄 03-polars-tables.qmd

---
title: Polars Tables
---

## Polars DataFrame

Polars DataFrames are detected automatically and show a blue
**Polars** badge with precise dtype labels.

```{python}
import polars as pl
from great_docs import tbl_preview

df = pl.DataFrame({
    "id": range(1, 26),
    "value": [x * 1.1 for x in range(1, 26)],
    "category": ["A", "B", "C", "D", "E"] * 5,
    "flag": [True, False] * 12 + [True],
})

tbl_preview(df)
```

## Head Only (No Tail)

```{python}
tbl_preview(df, n_head=10, n_tail=0)
```

📄 04-missing-values.qmd

---
title: Missing Values
---

## Highlighted Missing Values

By default, `None` and `NaN` values are highlighted in red:

```{python}
from great_docs import tbl_preview
from gdtest_tbl_preview import sample_missing

tbl_preview(sample_missing(15))
```
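
What counts as "missing" here can be sketched in plain Python. The helper `is_missing` below is hypothetical and only illustrates the `None`/`NaN` check described above; how `tbl_preview()` detects these values internally is not shown in this guide.

```python
import math


def is_missing(value):
    """Return True for None and float NaN (hypothetical helper)."""
    if value is None:
        return True
    # Only floats can be NaN; the isinstance guard keeps math.isnan
    # from raising TypeError on strings.
    return isinstance(value, float) and math.isnan(value)


column = [1.5, None, float("nan"), 0.0, "", float("inf")]
missing_count = sum(is_missing(v) for v in column)
```

In this sketch, empty strings and infinities are deliberately not treated as missing.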

## Without Highlighting

Turn off missing-value highlighting with `highlight_missing=False`:

```{python}
tbl_preview(sample_missing(15), highlight_missing=False)
```

## Mixed Python Types

Inf, NaN, None, empty strings, HTML-unsafe characters, and large
numbers:

```{python}
from gdtest_tbl_preview import sample_types

tbl_preview(sample_types(), show_all=True)
```
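
The `<b>bold</b>` cell is the HTML-unsafe case: it should appear as literal text rather than as bold markup. A minimal sketch of that kind of escaping uses the standard library's `html.escape`; whether `tbl_preview()` calls this exact function is an assumption.

```python
import html

cell = "<b>bold</b>"
# Replace &, <, > and quotes with HTML entities so a browser shows
# the text literally instead of interpreting it as markup.
escaped = html.escape(cell)
```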

📄 05-column-options.qmd

---
title: Column Options
---

## Column Subset

Select and reorder columns with the `columns` parameter:

```{python}
from great_docs import tbl_preview
from gdtest_tbl_preview import sample_inventory

data = sample_inventory(25)
tbl_preview(data, columns=["product", "price", "rating"])
```
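
For a column-oriented dict, selecting and reordering amounts to rebuilding the dict in the requested key order. The helper `select_columns` below is a hypothetical illustration, not the Great Docs implementation:

```python
def select_columns(data, columns):
    """Return a new dict with only `columns`, in the requested order.

    Hypothetical helper; raises KeyError for unknown column names.
    """
    return {name: data[name] for name in columns}


data = {
    "product": ["Widget", "Gadget"],
    "category": ["Tools", "Office"],
    "price": [9.99, 24.5],
    "rating": [4.1, 3.7],
}
subset = select_columns(data, ["rating", "product"])
```

Because Python dicts preserve insertion order, the order of the `columns` list determines the order of the result.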

## Wide Table

A table with 20 columns overflows and scrolls horizontally:

```{python}
from gdtest_tbl_preview import sample_wide

tbl_preview(sample_wide(12, 20))
```

## No Row Numbers

```{python}
tbl_preview(
    sample_inventory(10),
    show_row_numbers=False,
)
```

## No Dtype Labels

```{python}
tbl_preview(
    sample_inventory(10),
    show_dtypes=False,
)
```

📄 06-all-options.qmd

---
title: All Options
---

## Minimal Chrome

Turn off every optional element — no row numbers, no dtypes,
no dimension badges:

```{python}
from great_docs import tbl_preview
from gdtest_tbl_preview import sample_scores

tbl_preview(
    sample_scores(8),
    show_row_numbers=False,
    show_dtypes=False,
    show_dimensions=False,
    show_all=True,
)
```

## Full Chrome with Caption

Everything enabled plus a caption:

```{python}
tbl_preview(
    sample_scores(50),
    n_head=10,
    n_tail=5,
    caption="Top & bottom of the class roster",
)
```

## Custom Column Width

Restrict columns to a 120px maximum width and set a minimum table width with `min_tbl_width`:

```{python}
tbl_preview(
    sample_scores(15),
    max_col_width=120,
    min_tbl_width=400,
)
```

## Side-by-Side Comparison

Default Pandas output vs. `tbl_preview()` on the same data:

::: {layout-ncol=2}

```{python}
#| echo: false
import pandas as pd
df = pd.DataFrame(sample_scores(10))
df
```

```{python}
#| echo: false
tbl_preview(df)
```

:::

📄 07-text-heavy-tables.qmd

---
title: Text-Heavy Tables
---

## Long Strings (Default Width)

Cells with very long text are capped at `max_col_width` (250px
by default) and show an ellipsis instead of wrapping.

```{python}
from great_docs import tbl_preview

data = {
    "id": [1, 2, 3, 4, 5],
    "title": [
        "A short title",
        "A moderately long title that tests mid-range widths",
        "This title is intentionally very long so that it will definitely exceed the maximum column width and trigger text-overflow ellipsis behavior in the rendered table cell",
        "Brief",
        "Another extremely verbose title string that goes on and on to stress-test the truncation and overflow handling in the preview table renderer",
    ],
    "status": ["draft", "published", "review", "archived", "published"],
}

tbl_preview(data, show_all=True)
```

## Descriptions and Paragraphs

Real-world data often has paragraph-length text in columns.

```{python}
data = {
    "package": ["NumPy", "Pandas", "Polars", "Great Tables", "Pointblank"],
    "description": [
        "Fundamental package for scientific computing with Python. Provides N-dimensional arrays, linear algebra, Fourier transforms, and random number generation.",
        "Powerful data structures for data analysis, time series, and statistics. Built on NumPy with labeled axes, automatic alignment, and rich I/O.",
        "Lightning-fast DataFrame library in Rust with a Python API. Lazy evaluation, multi-threaded queries, and Apache Arrow memory format.",
        "Build beautiful, publication-quality tables in Python. Supports Polars and Pandas DataFrames with fine-grained styling, formatting, and export.",
        "Data validation library for Python. Define expectations, validate data, and generate detailed reports with table-level and column-level checks.",
    ],
    "version": ["1.26.0", "2.2.0", "0.20.0", "0.15.0", "0.14.0"],
}

tbl_preview(data, show_all=True)
```

## Narrow Max Width (120px)

Force aggressive truncation with a tight `max_col_width`:

```{python}
tbl_preview(data, show_all=True, max_col_width=120)
```

## Wide Max Width (500px)

Allow generous room — long text is still capped, but more is visible:

```{python}
tbl_preview(data, show_all=True, max_col_width=500)
```

## Mixed Short and Long Columns

Short numeric/code columns alongside verbose text — each column
gets its own computed width.

```{python}
data = {
    "code": ["E001", "E002", "E003", "W001", "W002", "I001", "I002", "E004"],
    "severity": ["error", "error", "error", "warning", "warning", "info", "info", "error"],
    "message": [
        "Undefined variable: foobar",
        "Type mismatch: expected int, got str in argument `count` of function process_batch()",
        "Division by zero in expression total / n_items where n_items evaluates to 0",
        "Unused import: os (imported but never referenced in module)",
        "Variable `tmp` assigned on line 42 but never used anywhere in the function body",
        "Module docstring missing: consider adding a module-level docstring",
        "Line too long: 127 characters (max 120). Consider breaking this into multiple lines for readability",
        "Syntax error: unexpected token ) at position 34 in expression parse(input))",
    ],
    "line": [12, 45, 78, 3, 42, 1, 99, 34],
}

tbl_preview(data, show_all=True)
```

📄 08-tsv-files.qmd

---
title: TSV Files
---

## Read a TSV File

`tbl_preview()` auto-detects `.tsv` and `.tab` files and reads
them with tab-delimited parsing.

```{python}
#| echo: false
import pathlib

tsv_path = pathlib.Path('assets/cities.tsv')
tsv_path.parent.mkdir(parents=True, exist_ok=True)
tsv_path.write_text(
    'city\tcountry\tpopulation\tarea_km2\n'
    'Tokyo\tJapan\t13960000\t2194\n'
    'Delhi\tIndia\t11030000\t1484\n'
    'Shanghai\tChina\t24870000\t6341\n'
    'São Paulo\tBrazil\t12330000\t1521\n'
    'Mexico City\tMexico\t9210000\t1485\n'
    'Cairo\tEgypt\t9540000\t3085\n'
    'Mumbai\tIndia\t12440000\t603\n'
    'Beijing\tChina\t21540000\t16411\n'
)
```

```{python}
from great_docs import tbl_preview

tbl_preview('assets/cities.tsv', show_all=True)
```

The badge shows **TSV** and the header reports the correct
row and column counts.
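
As a rough cross-check of what tab-delimited parsing yields, the same kind of content can be read with the standard library's `csv` module. This sketch does not claim to mirror how `tbl_preview()` reads the file; the inline sample text is a shortened stand-in for `cities.tsv`.

```python
import csv
import io

tsv_text = (
    "city\tcountry\tpopulation\n"
    "Tokyo\tJapan\t13960000\n"
    "Delhi\tIndia\t11030000\n"
)

# csv.DictReader with a tab delimiter yields one dict per data row;
# every value arrives as a string.
reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
rows = list(reader)
n_rows, n_cols = len(rows), len(reader.fieldnames)
```

The header line supplies the column names, so it is not counted as a data row.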

📄 09-jsonl-files.qmd

---
title: JSONL Files
---

## Read a JSONL File

Newline-delimited JSON (`.jsonl` / `.ndjson`) is a common
format for streaming data and log records.

```{python}
#| echo: false
import pathlib, json

records = [
    {'timestamp': '2025-01-15T08:30:00', 'level': 'INFO', 'module': 'auth', 'message': 'User login successful'},
    {'timestamp': '2025-01-15T08:31:12', 'level': 'WARNING', 'module': 'db', 'message': 'Slow query detected (3.2s)'},
    {'timestamp': '2025-01-15T08:32:45', 'level': 'ERROR', 'module': 'api', 'message': 'Request timeout on /v2/users'},
    {'timestamp': '2025-01-15T08:33:01', 'level': 'INFO', 'module': 'cache', 'message': 'Cache miss for key user:42'},
    {'timestamp': '2025-01-15T08:34:20', 'level': 'DEBUG', 'module': 'auth', 'message': 'Token refresh for session abc123'},
    {'timestamp': '2025-01-15T08:35:55', 'level': 'ERROR', 'module': 'db', 'message': 'Connection pool exhausted'},
    {'timestamp': '2025-01-15T08:36:10', 'level': 'INFO', 'module': 'api', 'message': 'Health check passed'},
    {'timestamp': '2025-01-15T08:37:30', 'level': 'WARNING', 'module': 'auth', 'message': 'Failed login attempt from 192.168.1.100'},
]

jsonl_path = pathlib.Path('assets/server_logs.jsonl')
jsonl_path.parent.mkdir(parents=True, exist_ok=True)
jsonl_path.write_text('\n'.join(json.dumps(r) for r in records) + '\n')
```

```{python}
from great_docs import tbl_preview

tbl_preview('assets/server_logs.jsonl', show_all=True)
```
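
In a JSONL file each line is a complete JSON document, so parsing reduces to one `json.loads` call per line. A minimal sketch with the standard library, using a shortened stand-in for the log records above (not necessarily how `tbl_preview()` reads the file):

```python
import json

jsonl_text = (
    '{"level": "INFO", "module": "auth"}\n'
    '{"level": "ERROR", "module": "db"}\n'
)

# Parse each non-empty line independently.
records = [
    json.loads(line)
    for line in jsonl_text.splitlines()
    if line.strip()
]
```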

## NDJSON Extension

The `.ndjson` extension is treated identically:

```{python}
#| echo: false
import shutil
shutil.copy('assets/server_logs.jsonl', 'assets/server_logs.ndjson')
```

```{python}
tbl_preview('assets/server_logs.ndjson', show_all=True)
```

📄 10-parquet-files.qmd

---
title: Parquet Files
---

## Read a Parquet File

Apache Parquet is a columnar storage format popular in data
engineering workflows.

```{python}
#| echo: false
import polars as pl, pathlib

df = pl.DataFrame({
    'product': ['Widget', 'Gadget', 'Gizmo', 'Doohickey', 'Thingamajig'],
    'category': ['Electronics', 'Tools', 'Kitchen', 'Garden', 'Office'],
    'price': [29.99, 49.50, 12.00, 8.75, 199.99],
    'in_stock': [True, False, True, True, False],
    'rating': [4.5, 3.8, 4.9, 4.2, 2.1],
})

pq_path = pathlib.Path('assets/products.parquet')
pq_path.parent.mkdir(parents=True, exist_ok=True)
df.write_parquet(str(pq_path))
```

```{python}
from great_docs import tbl_preview

tbl_preview('assets/products.parquet', show_all=True)
```

The badge shows **Parquet** and dtype labels are preserved
from the original Polars schema.

📄 11-feather-arrow-files.qmd

---
title: Feather & Arrow IPC Files
---

## Feather File

Feather (Apache Arrow IPC format) is fast for local analytics.

```{python}
#| echo: false
import polars as pl, pathlib

df = pl.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve', 'Frank'],
    'department': ['Engineering', 'Marketing', 'Engineering', 'Sales', 'Marketing', 'Sales'],
    'salary': [95000, 72000, 105000, 68000, 88000, 71000],
    'years': [5, 3, 8, 2, 6, 4],
})

feather_path = pathlib.Path('assets/employees.feather')
feather_path.parent.mkdir(parents=True, exist_ok=True)
df.write_ipc(str(feather_path))
```

```{python}
from great_docs import tbl_preview

tbl_preview('assets/employees.feather', show_all=True)
```

## Arrow IPC Extension

Files with `.arrow` or `.ipc` extensions are also read as
Arrow IPC, but get the **Arrow** badge instead of Feather:

```{python}
#| echo: false
import shutil
shutil.copy('assets/employees.feather', 'assets/employees.arrow')
```

```{python}
tbl_preview('assets/employees.arrow', show_all=True)
```
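
The extension-to-badge pairs stated across this user guide can be summarized as a small lookup. The table and the helper `badge_for` are a hypothetical illustration; the actual detection logic in Great Docs is not shown here, and extensions whose badge the guide does not name are left out.

```python
from pathlib import Path

# Pairs taken from the user-guide text; the lookup itself is an
# illustrative assumption, not Great Docs' implementation.
BADGES = {
    ".tsv": "TSV",
    ".parquet": "Parquet",
    ".feather": "Feather",
    ".arrow": "Arrow",
    ".ipc": "Arrow",
}


def badge_for(path):
    """Return the badge label for a file path, or None if unknown."""
    return BADGES.get(Path(path).suffix.lower())
```

For example, `badge_for("assets/employees.feather")` gives `"Feather"`, while the copied `assets/employees.arrow` gives `"Arrow"` even though the bytes are identical.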

📄 12-arrow-tables.qmd

---
title: PyArrow Tables
---

## In-Memory Arrow Table

`tbl_preview()` also accepts a `pyarrow.Table` directly —
no file needed.

```{python}
import pyarrow as pa
from great_docs import tbl_preview

tbl = pa.table({
    'city': ['Tokyo', 'Delhi', 'Shanghai', 'São Paulo', 'Mexico City',
             'Cairo', 'Mumbai', 'Beijing', 'Dhaka', 'Osaka'],
    'country': ['Japan', 'India', 'China', 'Brazil', 'Mexico',
                'Egypt', 'India', 'China', 'Bangladesh', 'Japan'],
    'population_m': [13.96, 11.03, 24.87, 12.33, 9.21,
                     9.54, 12.44, 21.54, 8.91, 2.75],
    'area_km2': [2194, 1484, 6341, 1521, 1485,
                 3085, 603, 16411, 306, 225],
})

tbl_preview(tbl, show_all=True)
```

## Arrow Table with Typed Columns

PyArrow preserves rich type information — booleans, dates,
decimals — which `tbl_preview()` maps to short dtype labels.

```{python}
import pyarrow as pa
from datetime import date

tbl = pa.table({
    'event': ['Launch', 'Update', 'Hotfix', 'Deprecation'],
    'date': [date(2025, 1, 15), date(2025, 3, 1), date(2025, 3, 12), date(2025, 6, 30)],
    'critical': [True, False, True, False],
    'affected_users': [50000, 12000, 8500, 2000],
})

tbl_preview(tbl, show_all=True)
```

📄 README.md

# gdtest-tbl-preview

A showcase site demonstrating the `tbl_preview()` function from
Great Docs. Each user-guide page exercises a different combination
of data sources, table shapes, and display options.

📄 great-docs.yml

{}