nokap v0.1.0

Skills

A skill is a package of structured files that teaches an AI coding agent how to work with a specific tool or framework. This project ships four skills, each shown in full below. Install a skill in your agent and it will be able to run commands, edit configuration, write content, and troubleshoot problems without step-by-step guidance from you.

Any agent — install all with npx:

npx skills add https://posit-dev.github.io/nokap/

CLI — install all skills in a project:

great-docs skill install nokap

Codex / OpenCode

Tell the agent to fetch these skill files:
https://posit-dev.github.io/nokap/.well-known/agent-skills/nokap/SKILL.md
https://posit-dev.github.io/nokap/.well-known/agent-skills/nokap-web-capture/SKILL.md
https://posit-dev.github.io/nokap/.well-known/agent-skills/nokap-html-capture/SKILL.md
https://posit-dev.github.io/nokap/.well-known/agent-skills/nokap-pdf/SKILL.md

Or browse the skill files below.

SKILL.md

---
name: nokap
description: >
  Capture screenshots and PDFs from web pages using headless Chrome via CDP.
  Use when writing Python code that captures, screenshots, or renders web
  pages, local HTML files, or raw HTML strings to images or PDFs.
license: MIT
compatibility: Requires Python >=3.10 and Chrome/Chromium installed on the system.
---

# nokap

Capture screenshots and PDFs from web pages using headless Chrome via CDP.

## Installation

```bash
pip install nokap
```

Chrome or Chromium must be installed separately. nokap auto-discovers the
browser binary, or you can set `CHROME_PATH`.
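
If auto-discovery does not find your browser, one option is to locate a binary yourself and export `CHROME_PATH` before the first capture. The sketch below uses only the standard library. The `find_chrome` helper and its candidate names are illustrative assumptions, not nokap's own discovery logic.

```python
import os
import shutil

# Common executable names. This list is an assumption, not nokap's search order.
CANDIDATES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
)


def find_chrome(candidates=CANDIDATES):
    """Return the first candidate found on PATH, or None if none is found."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


chrome = find_chrome()
if chrome and "CHROME_PATH" not in os.environ:
    # Assumption: nokap reads CHROME_PATH when the browser first starts,
    # so set it before the first capture call.
    os.environ["CHROME_PATH"] = chrome
```

If `find_chrome()` returns `None`, `nokap doctor` is the documented way to check what nokap itself can see.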

## Decision Table

| Need | Use |
|------|-----|
| Screenshot a URL | `nokap.webshot(url, "out.png")` |
| Screenshot a local HTML file | `nokap.webshot("page.html", "out.png")` |
| Screenshot raw HTML string | `nokap.from_html(html_str, "out.png")` |
| Capture specific element | `nokap.webshot(url, "out.png", selector="table")` |
| High-resolution capture (2x) | `nokap.webshot(url, "out.png", zoom=2)` |
| Add padding around element | `nokap.webshot(url, "out.png", selector="h1", expand=10)` |
| Full-page PDF | `nokap.webshot(url, "out.pdf")` |
| Element-bounded PDF | `nokap.webshot(url, "out.pdf", selector="table")` |
| Capture from HTML to PDF | `nokap.from_html(html_str, "out.pdf", selector="table")` |
| Clean up browser process | `nokap.close()` |
| CLI screenshot | `nokap webshot URL file.png` |
| CLI from HTML file | `nokap from-html file.html out.png` |
| Check Chrome availability | `nokap doctor` or `nokap info` |

## Core API

### `webshot(url, file, *, ...)`

Main capture function. Output format determined by file extension (`.png`,
`.jpg`, `.webp`, `.pdf`).

Key parameters:

- `url`: URL or local file path (auto-converted to `file://`)
- `file`: Output path (default: `"webshot.png"`)
- `selector`: CSS selector to crop to element's bounding box
- `cliprect`: Explicit `(x, y, width, height)` clip rectangle
- `expand`: Padding around selector (int for all sides, or 4-tuple)
- `zoom`: Scale factor for raster images (>1 = higher resolution)
- `delay`: Seconds to wait after page load (default: 0.2)
- `vwidth` / `vheight`: Viewport dimensions (default: 992×744)
- `useragent`: Custom User-Agent string
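
Because the output format follows the file extension, it can help to validate the output path before calling `webshot()`. This is a minimal sketch: the `output_format` helper is hypothetical, and it only knows the four extensions listed above. Whether nokap accepts other spellings (such as `.jpeg`) or uppercase extensions is not covered here.

```python
from pathlib import Path

# Extensions listed in this guide; anything else is treated as unsupported here.
FORMATS = {".png": "png", ".jpg": "jpeg", ".webp": "webp", ".pdf": "pdf"}


def output_format(file):
    """Return the format implied by the output path's extension."""
    suffix = Path(file).suffix.lower()
    try:
        return FORMATS[suffix]
    except KeyError:
        raise ValueError(f"unsupported output extension: {suffix!r}") from None


print(output_format("out.png"))        # png
print(output_format("shots/page.jpg")) # jpeg
```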

### `from_html(html, file, *, selector="html", encoding="utf-8", **kwargs)`

Render an HTML string to image or PDF. Writes HTML to a temp file and calls
`webshot()`. Accepts all `webshot()` keyword arguments.

### `close()`

Explicitly close the module-level browser. Called automatically at exit, but
use this in long-running processes or after batch captures.

## Gotchas

1. The module name is `nokap`, not `no-kap` or `no_kap`.
2. `selector` and `cliprect` are mutually exclusive: never pass both.
3. `zoom` only affects raster images (PNG/JPEG/WebP), not PDF output. PDFs are vector and always sharp.
4. `from_html()` defaults to `selector="html"` (full document), while `webshot()` defaults to `selector=None` (viewport only).
5. The browser singleton auto-starts on first call. Call `nokap.close()` to free resources in long-running scripts.
6. Local file paths are auto-converted to `file://` URLs: no manual conversion needed.
7. `expand` uses CSS pixel units and applies padding around the selector bounding box.
8. For wide elements (tables), nokap auto-detects intrinsic width and widens the viewport to avoid clipping.
9. Chrome must be installed separately: nokap does not bundle a browser.
10. Set `CHROME_PATH` environment variable if Chrome is not in a standard location.

## Error Handling

All nokap errors inherit from `NokapError`:

```python
import nokap

try:
    nokap.webshot("https://example.com", "out.png", selector="#missing")
except nokap.SelectorError:
    print("Element not found")
except nokap.PageLoadTimeout:
    print("Page took too long")
except nokap.ChromeNotFoundError:
    print("Install Chrome first")
except nokap.NokapError as e:
    print(f"Capture failed: {e}")
```

## Resources

- [Full documentation](https://posit-dev.github.io/nokap/)
- [llms.txt](https://posit-dev.github.io/nokap/llms.txt): API overview for LLMs
- [llms-full.txt](https://posit-dev.github.io/nokap/llms-full.txt): Complete reference
- [Source code](https://github.com/posit-dev/nokap)

SKILL.md

---
name: nokap-web-capture
description: >
  Capture screenshots from live web pages using nokap. Use when writing code
  that screenshots URLs, configures viewports, handles page load timing, or
  captures specific page elements from remote sites.
license: MIT
compatibility: Requires Python >=3.10, Chrome/Chromium, and network access for remote URLs.
---

# nokap: Web Page Capture

Capture screenshots and PDFs from live web pages (HTTP/HTTPS URLs) or local
HTML files served via `file://`.

## Basic Capture

```python
import nokap

# Screenshot a web page
nokap.webshot("https://example.com", "page.png")

# JPEG output (determined by extension)
nokap.webshot("https://example.com", "page.jpg")

# WebP output
nokap.webshot("https://example.com", "page.webp")
```

## Viewport Configuration

Control the browser viewport to simulate different devices or layouts:

```python
# Desktop (default: 992×744)
nokap.webshot("https://example.com", "desktop.png")

# Mobile viewport
nokap.webshot("https://example.com", "mobile.png", vwidth=375, vheight=812)

# Tablet viewport
nokap.webshot("https://example.com", "tablet.png", vwidth=768, vheight=1024)

# Wide viewport for dashboards
nokap.webshot("https://example.com", "wide.png", vwidth=1920, vheight=1080)
```

## High-Resolution Captures

Use `zoom` to produce Retina/HiDPI screenshots:

```python
# 2x resolution (double pixel density)
nokap.webshot("https://example.com", "retina.png", zoom=2)

# 3x for very sharp icons/logos
nokap.webshot("https://example.com", "logo.png", selector=".logo", zoom=3)
```

The output image dimensions are multiplied by the zoom factor (e.g., a 992px
viewport at zoom=2 produces a 1984px-wide image).
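
The arithmetic can be checked with a few lines of plain Python. The `zoomed_size` helper is illustrative only, and rounding fractional results is an assumption; nokap may round differently for non-integer zoom factors.

```python
def zoomed_size(vwidth=992, vheight=744, zoom=1):
    """Pixel dimensions of a viewport capture at the given zoom factor."""
    return round(vwidth * zoom), round(vheight * zoom)


print(zoomed_size(zoom=2))            # (1984, 1488)
print(zoomed_size(375, 812, zoom=3))  # (1125, 2436)
```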

## Element Selection

Target specific page elements with CSS selectors:

```python
# Capture just the header
nokap.webshot("https://example.com", "header.png", selector="header")

# Capture a specific element by ID
nokap.webshot("https://example.com", "chart.png", selector="#main-chart")

# Capture by class
nokap.webshot("https://example.com", "nav.png", selector=".navbar")

# Add padding around the element
nokap.webshot("https://example.com", "card.png", selector=".card", expand=20)

# Asymmetric padding: (top, right, bottom, left)
nokap.webshot(
    "https://example.com", "hero.png",
    selector=".hero",
    expand=(10, 20, 10, 20),
)
```
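
To reason about how `expand` changes the captured region, the padding can be modelled as growing the element's bounding box. This is a sketch under assumptions: `normalize_expand` and `padded_box` are hypothetical helpers, and nokap's actual handling at page edges (for example clamping to the document bounds) may differ.

```python
def normalize_expand(expand):
    """Return padding as a (top, right, bottom, left) tuple of CSS pixels."""
    if isinstance(expand, int):
        return (expand,) * 4
    values = tuple(expand)
    if len(values) != 4:
        raise ValueError("expand must be an int or a 4-tuple")
    return values


def padded_box(x, y, width, height, expand):
    """Grow a bounding box (x, y, width, height) by the given padding."""
    top, right, bottom, left = normalize_expand(expand)
    return (x - left, y - top, width + left + right, height + top + bottom)


print(padded_box(100, 50, 400, 300, 20))                # (80, 30, 440, 340)
print(padded_box(100, 50, 400, 300, (10, 20, 10, 20)))  # (80, 40, 440, 320)
```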

## Page Load Timing

Control when the capture happens relative to page load:

```python
# Default: 0.2s delay after load event
nokap.webshot("https://example.com", "page.png")

# Longer delay for heavy JavaScript (SPAs, charts, animations)
nokap.webshot("https://example.com", "dashboard.png", delay=3.0)

# Minimal delay for static pages
nokap.webshot("https://example.com", "static.png", delay=0)

# Very long delay for pages with lazy-loaded content
nokap.webshot("https://example.com", "lazy.png", delay=5.0)
```

## Custom User-Agent

Spoof the browser identity for responsive sites or bot detection:

```python
# Mobile User-Agent for responsive content
nokap.webshot(
    "https://example.com", "mobile.png",
    vwidth=375, vheight=812,
    useragent="Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) ...",
)
```

## Explicit Clip Rectangles

Capture an exact pixel region (alternative to selectors):

```python
# Capture a 400×300 region starting at (100, 50)
nokap.webshot(
    "https://example.com", "region.png",
    cliprect=(100, 50, 400, 300),
)
```

Note: `cliprect` and `selector` are mutually exclusive.
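
When the clip options come from user input or configuration, a small guard can reject the invalid combination before any browser work starts. The `clip_kwargs` helper below is hypothetical; this guide does not state what nokap itself does when both arguments are passed.

```python
def clip_kwargs(selector=None, cliprect=None):
    """Build capture keyword arguments, rejecting selector and cliprect together."""
    if selector is not None and cliprect is not None:
        raise ValueError("pass either selector or cliprect, not both")
    kwargs = {}
    if selector is not None:
        kwargs["selector"] = selector
    if cliprect is not None:
        x, y, width, height = cliprect  # unpacking also checks the tuple length
        kwargs["cliprect"] = (x, y, width, height)
    return kwargs


print(clip_kwargs(cliprect=(100, 50, 400, 300)))
# {'cliprect': (100, 50, 400, 300)}
```

The result can then be unpacked into a capture call, for example `nokap.webshot(url, "region.png", **clip_kwargs(cliprect=(100, 50, 400, 300)))`.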

## Batch Captures

For multiple captures, the browser stays alive between calls:

```python
import nokap

urls = [
    "https://example.com/page1",
    "https://example.com/page2",
    "https://example.com/page3",
]

for i, url in enumerate(urls):
    nokap.webshot(url, f"page_{i}.png")

# Clean up when done
nokap.close()
```
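
In longer batches a single failing page should not abort the run or leave the browser open. One possible pattern is sketched below. `capture_all` is a hypothetical helper that takes the capture and close functions as arguments, so it does not depend on nokap internals.

```python
def capture_all(urls, capture, close, pattern="page_{i}.png"):
    """Capture every URL, collect failures, and always release the browser."""
    written, failed = [], []
    try:
        for i, url in enumerate(urls):
            out = pattern.format(i=i)
            try:
                capture(url, out)
                written.append(out)
            except Exception as exc:  # narrow to nokap.NokapError in real code
                failed.append((url, exc))
    finally:
        # Runs even if the loop is interrupted, so the browser is not leaked.
        close()
    return written, failed
```

With nokap this would be called as `written, failed = capture_all(urls, nokap.webshot, nokap.close)`.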

## Local File Capture

Local HTML files work the same way; nokap auto-converts the path to a `file://` URL:

```python
# Path string
nokap.webshot("report.html", "report.png")

# pathlib.Path
from pathlib import Path
nokap.webshot(Path("output/chart.html"), "chart.png")
```

## CLI Usage

```bash
# Basic screenshot
nokap webshot https://example.com page.png

# With options
nokap webshot https://example.com hero.png -s "h1" -z 2 -e 10

# Mobile viewport
nokap webshot https://example.com mobile.png --vwidth 375 --vheight 812

# Longer delay for SPAs
nokap webshot https://example.com app.png -d 3.0
```

## Gotchas

1. The `delay` parameter counts from the page load event, not from navigation start. JavaScript that runs after `load` may need a longer delay.
2. A selector that matches nothing raises `SelectorError`. If it matches several elements, the first match is captured, so prefer a selector that identifies exactly one element.
3. `zoom` multiplies the pixel dimensions, so a 992×744 viewport at zoom=2 produces a 1984×1488 image file.
4. Wide elements (e.g., tables) are auto-detected. nokap temporarily widens the viewport to capture the full natural width without horizontal clipping.
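5. nokap cannot drive login flows for pages behind authentication. Pre-authenticate outside nokap, for example by saving the rendered HTML and capturing the local file. The `useragent` parameter only sets the User-Agent header and does not carry session cookies.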

## Resources

- [Full documentation](https://posit-dev.github.io/nokap/)
- [Screenshots guide](https://posit-dev.github.io/nokap/user-guide/screenshots.html)
- [Selectors guide](https://posit-dev.github.io/nokap/user-guide/selectors-and-clipping.html)

SKILL.md

---
name: nokap-html-capture
description: >
  Capture screenshots and PDFs from raw HTML strings or local HTML files using
  nokap. Use when rendering HTML output from packages like Great Tables, Plotly,
  or custom report generators to images or PDFs.
license: MIT
compatibility: Requires Python >=3.10 and Chrome/Chromium installed on the system.
---

# nokap: HTML Capture

Render raw HTML strings or local HTML files to images (PNG, JPEG, WebP) or PDFs.
This is the primary integration point for packages that generate HTML and need
to convert it to a visual format.

## `from_html()`: Core Function

```python
import nokap

html = "<h1>Hello World</h1><p>Rendered by nokap</p>"
nokap.from_html(html, "output.png")
```

Key differences from `webshot()`:

- Accepts an HTML string directly (no need to write a file manually)
- Defaults to `selector="html"` (captures the full document, not just viewport)
- Manages temp file creation and cleanup internally

## Capturing Generated HTML

### Great Tables Integration

```python
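import great_tables as gt
import nokap
import pandas as pd

# Example data; substitute your own DataFrame
df = pd.DataFrame({"region": ["North", "South"], "sales": [120, 95]})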

# Build a table
table = gt.GT(df).tab_header(title="Sales Report")

# Get the HTML and capture it
html = table.as_raw_html()
nokap.from_html(html, "table.png", selector="table", zoom=2, expand=10)
```

### Plotly Integration

```python
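import pandas as pd
import plotly.express as px
import nokap

# Example data; substitute your own DataFrame
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})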

fig = px.scatter(df, x="x", y="y", title="My Chart")
html = fig.to_html(include_plotlyjs="cdn")
nokap.from_html(html, "chart.png", selector=".plotly-graph-div", delay=1.0)
```

### Custom HTML

```python
import nokap

html = """
<!DOCTYPE html>
<html>
<head>
  <style>
    .card { padding: 20px; border: 1px solid #ddd; border-radius: 8px; }
  </style>
</head>
<body>
  <div class="card">
    <h2>Status: Active</h2>
    <p>All systems operational.</p>
  </div>
</body>
</html>
"""
nokap.from_html(html, "card.png", selector=".card", expand=5)
```

## Selector Strategies for HTML Capture

| HTML Structure | Selector | Notes |
|----------------|----------|-------|
| Full document | `"html"` (default) | Captures everything |
| A single table | `"table"` | Element-bounded capture |
| Table by class | `"table.gt_table"` | Great Tables default class |
| Div by ID | `"#my-chart"` | Plotly containers, custom widgets |
| First match | `".card"` | Takes first matching element |
| Body content | `"body"` | Excludes html margins |

## Controlling Output Size

```python
# Natural width: nokap auto-detects wide elements (tables, charts)
# and expands the viewport to avoid clipping
nokap.from_html(wide_table_html, "table.png", selector="table")

# Force a specific viewport width
nokap.from_html(html, "narrow.png", vwidth=600)

# High-resolution output
nokap.from_html(html, "sharp.png", selector="table", zoom=2)

# Padding around element
nokap.from_html(html, "padded.png", selector=".widget", expand=15)

# Asymmetric padding: (top, right, bottom, left)
nokap.from_html(html, "padded.png", selector=".widget", expand=(5, 10, 5, 10))
```

## HTML to PDF

```python
# Element-bounded PDF (sized exactly to the element)
nokap.from_html(html, "table.pdf", selector="table")

# Full-page PDF with standard paper size
nokap.from_html(html, "report.pdf", selector="html", page_size="letter")

# Landscape PDF
nokap.from_html(html, "wide.pdf", selector="html", landscape=True)

# PDF with CSS backgrounds printed
nokap.from_html(html, "styled.pdf", selector="table", print_background=True)
```

## Encoding

```python
# Default UTF-8
nokap.from_html(html, "out.png")

# Explicit encoding for non-UTF-8 content
nokap.from_html(html, "out.png", encoding="latin-1")
```

nokap auto-injects a `<meta charset>` tag if one is not present in the HTML.

## Batch HTML Capture

```python
import nokap

# generate_table() and datasets are placeholders for your own code
tables = [generate_table(data) for data in datasets]

for i, html in enumerate(tables):
    nokap.from_html(html, f"table_{i}.png", selector="table", zoom=2)

nokap.close()
```

## Capturing Local HTML Files

For existing HTML files on disk, use `webshot()` with a file path:

```python
import nokap

# String path
nokap.webshot("report.html", "report.png", selector="table")

# pathlib.Path
from pathlib import Path
nokap.webshot(Path("output/chart.html"), "chart.png")
```

To read a file and use `from_html()` instead (useful when you want to
modify the HTML first):

```python
from pathlib import Path
import nokap

html = Path("template.html").read_text()
html = html.replace("{{TITLE}}", "My Report")
nokap.from_html(html, "report.png")
```

## CLI for HTML Files

```bash
# Render a local HTML file
nokap from-html report.html report.png

# With selector and zoom
nokap from-html data.html table.png -s "table" -z 2

# With padding
nokap from-html gt_table.html table.png -s "table" -e 10

# To PDF
nokap from-html invoice.html invoice.pdf -s "table"
```

## Gotchas

1. `from_html()` defaults to `selector="html"`; it captures the full document, not just the viewport. This differs from `webshot()` which defaults to `selector=None` (viewport-only).
2. The HTML is written to a temp file and loaded via `file://`. External resources using relative paths won't resolve unless you use absolute URLs or inline styles.
3. For HTML with external CSS/JS (CDN links), add `delay=1.0` or more to allow resources to load.
4. nokap auto-detects wide elements (tables wider than viewport) and expands the viewport. You generally don't need to set `vwidth` manually for tables.
5. `zoom` does not affect PDF output. PDFs are vector format and always render at full resolution.
6. If your HTML references local images with relative paths, use `webshot()` with the HTML file path instead of `from_html()` with the string, so relative paths resolve correctly.
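
If you need `from_html()` (for example because the HTML is generated in memory) but the markup still uses relative paths, one workaround is to insert a `<base>` tag pointing at the directory the resources live in. This is a sketch, not a nokap feature: `with_base_href` is a hypothetical helper, and it assumes nokap loads the HTML otherwise unchanged and that Chrome is allowed to read the referenced local files.

```python
import re
from pathlib import Path


def with_base_href(html, base_dir):
    """Insert a <base> tag so relative URLs resolve against base_dir."""
    href = Path(base_dir).resolve().as_uri() + "/"
    tag = f'<base href="{href}">'
    # Place the tag right after the opening <head> tag when there is one.
    new_html, count = re.subn(
        r"(<head\b[^>]*>)",
        lambda m: m.group(1) + tag,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    # For fragments without a <head>, prepend the tag instead.
    return new_html if count else tag + html


page = "<html><head><title>Report</title></head><body><img src='logo.png'></body></html>"
print(with_base_href(page, "assets"))
```

The result can be passed to `nokap.from_html()` in place of the original string.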

## Resources

- [Full documentation](https://posit-dev.github.io/nokap/)
- [Screenshots guide](https://posit-dev.github.io/nokap/user-guide/screenshots.html)
- [Quick start](https://posit-dev.github.io/nokap/user-guide/quick-start.html)

SKILL.md

---
name: nokap-pdf
description: >
  Generate PDFs from web pages or HTML using nokap. Use when writing code that
  produces PDF output from URLs or HTML strings, configures paper sizes, margins,
  or creates element-bounded PDFs sized exactly to content.
license: MIT
compatibility: Requires Python >=3.10 and Chrome/Chromium installed on the system.
---

# nokap: PDF Generation

Generate PDFs from web pages, local HTML files, or raw HTML strings. nokap
supports two modes: full-page PDFs with standard paper dimensions, and
element-bounded PDFs sized exactly to a selected element.

## Two PDF Modes

| Mode | When to use | How to trigger |
|------|-------------|----------------|
| Full-page PDF | Reports, articles, printable pages | `webshot(url, "out.pdf")` |
| Element-bounded PDF | Tables, charts, widgets for embedding | `webshot(url, "out.pdf", selector="table")` |

The mode is automatically selected based on whether a `selector` (other than
`"html"`) or `cliprect` is provided.
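
The selection rule can be written down as a small predicate. The `pdf_mode` function below is only a model of the rule as described in this section, not nokap's implementation.

```python
def pdf_mode(selector=None, cliprect=None):
    """Return the PDF mode implied by the capture arguments."""
    if cliprect is not None:
        return "element-bounded"
    if selector is not None and selector != "html":
        return "element-bounded"
    return "full-page"


print(pdf_mode())                  # full-page
print(pdf_mode(selector="html"))   # full-page
print(pdf_mode(selector="table"))  # element-bounded
```

Note that `from_html()` defaults to `selector="html"`, so it produces a full-page PDF unless you pass a different selector.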

## Full-Page PDF

Standard paper-sized PDF, like printing from the browser:

```python
import nokap

# Default: letter size, 0.5" margins
nokap.webshot("https://example.com", "page.pdf")

# A4 paper
nokap.webshot("https://example.com", "page.pdf", page_size="a4")

# Landscape orientation
nokap.webshot("https://example.com", "wide.pdf", landscape=True)

# Custom margins (inches): single value for all sides
nokap.webshot("https://example.com", "tight.pdf", margins=0.25)

# Asymmetric margins: (top, right, bottom, left) in inches
nokap.webshot("https://example.com", "report.pdf", margins=(1.0, 0.75, 1.0, 0.75))

# Print CSS backgrounds (colors, images)
nokap.webshot("https://example.com", "styled.pdf", print_background=True)
```

### Available Paper Sizes

| Size | Dimensions (inches) |
|------|---------------------|
| `"letter"` | 8.5 × 11 |
| `"legal"` | 8.5 × 14 |
| `"tabloid"` | 11 × 17 |
| `"ledger"` | 17 × 11 |
| `"a3"` | 11.7 × 16.5 |
| `"a4"` | 8.27 × 11.7 |
| `"a5"` | 5.83 × 8.27 |
| `"a6"` | 4.13 × 5.83 |

The `PaperSize` type is a `Literal` union of these values.
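
For layouts specified in millimetres, the table above can be converted with plain arithmetic. The dictionary below copies the table's values, and `paper_size_mm` is a hypothetical helper. Because the inch values are rounded, the ISO sizes come out slightly off their nominal millimetre dimensions.

```python
# Width × height in inches, copied from the table above.
PAPER_SIZES_IN = {
    "letter": (8.5, 11),
    "legal": (8.5, 14),
    "tabloid": (11, 17),
    "ledger": (17, 11),
    "a3": (11.7, 16.5),
    "a4": (8.27, 11.7),
    "a5": (5.83, 8.27),
    "a6": (4.13, 5.83),
}
MM_PER_INCH = 25.4


def paper_size_mm(name):
    """Return (width, height) in millimetres for a lowercase paper size name."""
    try:
        width, height = PAPER_SIZES_IN[name]
    except KeyError:
        raise ValueError(f"unknown paper size: {name!r} (names are lowercase)") from None
    return round(width * MM_PER_INCH, 1), round(height * MM_PER_INCH, 1)


print(paper_size_mm("letter"))  # (215.9, 279.4)
```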

## Element-Bounded PDF

Creates a PDF sized exactly to the selected element. The text remains
selectable and the output is vector (resolution-independent). Perfect for
embedding tables or charts in presentations.

```python
import nokap

# PDF sized to a table element
nokap.webshot("https://example.com", "table.pdf", selector="table")

# PDF of a chart with padding
nokap.webshot("https://example.com", "chart.pdf", selector="#chart", expand=10)

# From raw HTML (Great Tables, Plotly, etc.)
nokap.from_html(table_html, "table.pdf", selector="table")

# With CSS backgrounds (important for styled tables)
nokap.from_html(table_html, "table.pdf", selector="table", print_background=True)
```

### When to Use Element-Bounded PDFs

- Embedding a single table in a LaTeX document or presentation
- Creating vector assets from HTML widgets
- Producing PDFs of exact content dimensions for design workflows
- Great Tables output for reports and slide decks

## HTML to PDF

```python
import nokap

# Full-page PDF from HTML string
html = "<html><body><h1>Report</h1><p>Content here.</p></body></html>"
nokap.from_html(html, "report.pdf", page_size="letter", margins=1.0)

# Element-bounded PDF from HTML
table_html = "<table>...</table>"
nokap.from_html(table_html, "table.pdf", selector="table", expand=5)

# Great Tables to PDF
import great_tables as gt

table = gt.GT(df).tab_header(title="Q4 Results")
nokap.from_html(table.as_raw_html(), "results.pdf", selector="table")
```

## PDF from Local Files

```python
import nokap

# Render a local HTML file to PDF
nokap.webshot("invoice.html", "invoice.pdf")

# With options
nokap.webshot("report.html", "report.pdf", page_size="a4", margins=0.75)

# Element from a local file
nokap.webshot("dashboard.html", "chart.pdf", selector="#revenue-chart")
```

## CLI for PDF Generation

```bash
# Full-page PDF
nokap webshot https://example.com page.pdf

# A4 paper size
nokap webshot https://example.com doc.pdf --page-size a4

# Landscape with backgrounds
nokap webshot report.html report.pdf --landscape --print-background

# Element-bounded PDF from HTML file
nokap from-html data.html table.pdf -s "table" -e 5
```

## Gotchas

1. `zoom` is ignored for PDF output. PDFs are vector format and always render at full resolution regardless of zoom setting.
2. `selector="html"` triggers full-page PDF mode (with paper dimensions). Any other selector triggers element-bounded mode.
3. Element-bounded PDFs use `print_background=True` by default (different from full-page PDFs which default to `False`).
4. Full-page PDF margins are in **inches**, while `expand` (for element-bounded PDFs) is in **CSS pixels**.
5. For full-page PDFs of HTML with dark backgrounds or colored table cells, set `print_background=True`; otherwise the colors won't appear in the PDF. Element-bounded PDFs already enable it by default (see item 3).
6. The `margins` parameter only applies to full-page PDFs. For element-bounded PDFs, use `expand` to add spacing.
7. Paper size names are lowercase strings: `"letter"`, `"a4"`, `"legal"` (not `"Letter"` or `"A4"`).
8. Element-bounded PDFs are true vector PDFs with selectable text, not rasterized images wrapped in a PDF.
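
Since `margins` and `expand` use different units, converting between them helps when you want matching spacing in both PDF modes. CSS defines one inch as 96 CSS pixels. The helper names below are illustrative.

```python
CSS_PX_PER_INCH = 96  # fixed ratio defined by CSS


def inches_to_css_px(inches):
    """Convert a margin in inches to the nearest whole CSS pixel."""
    return round(inches * CSS_PX_PER_INCH)


def css_px_to_inches(px):
    """Convert a padding in CSS pixels to inches."""
    return px / CSS_PX_PER_INCH


print(inches_to_css_px(0.5))  # 48
print(css_px_to_inches(24))   # 0.25
```

For example, the default 0.5" full-page margin corresponds to `expand=48` on an element-bounded PDF.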

## Decision Table

| Need | Code |
|------|------|
| Standard report PDF | `webshot(url, "out.pdf")` |
| A4 with narrow margins | `webshot(url, "out.pdf", page_size="a4", margins=0.25)` |
| Landscape PDF | `webshot(url, "out.pdf", landscape=True)` |
| Table as vector PDF | `webshot(url, "out.pdf", selector="table")` |
| Table PDF with styling | `from_html(html, "out.pdf", selector="table", print_background=True)` |
| Chart element as PDF | `webshot(url, "out.pdf", selector="#chart", expand=10)` |
| Printable with backgrounds | `webshot(url, "out.pdf", print_background=True)` |

## Resources

- [Full documentation](https://posit-dev.github.io/nokap/)
- [PDF generation guide](https://posit-dev.github.io/nokap/user-guide/pdf-generation.html)
- [llms-full.txt](https://posit-dev.github.io/nokap/llms-full.txt): complete reference

Developed by Rich Iannone and Richard Iannone.
Site created with Great Docs.