----------------------------------------------------------------------
This is the API documentation for the nokap library.
----------------------------------------------------------------------


## Core Functions

The main screenshot and PDF capture API


webshot(url: 'str | Path', file: 'str | Path' = 'webshot.png', *, vwidth: 'int' = 992, vheight: 'int' = 744, selector: 'str | list[str] | None' = None, cliprect: 'tuple[float, float, float, float] | None' = None, expand: 'int | tuple[int, int, int, int]' = 0, delay: 'float' = 0.2, zoom: 'float' = 1, useragent: 'str | None' = None, page_size: 'PaperSize' = 'letter', margins: 'float | tuple[float, float, float, float]' = 0.5, landscape: 'bool' = False, print_background: 'bool' = False) -> 'Path'

Take a screenshot or PDF of a web page.

Parameters
----------
url
    URL to capture. Can be an http/https URL, a `file://` URL, or a local
    file path (automatically converted to a `file://` URL).
file
    Output file path. The format is determined by extension:
    `.png`, `.jpg`/`.jpeg`, `.webp` for images; `.pdf` for PDF.
vwidth
    Viewport width in pixels.
vheight
    Viewport height in pixels.
selector
    CSS selector(s) to capture. For images, the screenshot is cropped to
    the element's bounding box. For PDFs, produces an element-bounded PDF
    sized to fit the element. Mutually exclusive with `cliprect=`.
cliprect
    Explicit clip rectangle as (x, y, width, height) in CSS pixels.
    Mutually exclusive with `selector=`.
expand
    Pixels to expand around the selector bounding box.
    Single int for all sides, or (top, right, bottom, left) tuple.
delay
    Seconds to wait after page load before capturing.
zoom
    Zoom/scale factor for raster images (PNG, JPEG, WebP). Values > 1
    produce higher resolution images. Ignored for PDF output since PDFs
    are vector format and always render at full resolution.
useragent
    Custom User-Agent string.
page_size
    Paper size for PDF output (e.g., `"letter"`, `"a4"`).
margins
    Margins in inches for PDF output. Single float or 4-tuple.
landscape
    Whether to use landscape orientation for PDF output.
print_background
    Whether to print CSS backgrounds in PDF output.

Returns
-------
Path
    The absolute path to the output file.

from_html(html: 'str', file: 'str | Path' = 'webshot.png', *, selector: 'str' = 'html', encoding: 'str' = 'utf-8', **kwargs: 'Any') -> 'Path'

Take a screenshot or PDF from an HTML string.

This is the primary integration point for packages like `great-tables`
that generate HTML and need to convert it to an image or PDF.

For PDF output with a selector (other than `"html"`), produces an
element-bounded PDF sized to fit the selected element, with selectable
text preserved. This is useful for embedding tables in presentations.

Parameters
----------
html
    The HTML content to render.
file
    Output file path. Format determined by extension (`.png`, `.jpg`, `.webp`
    for images; `.pdf` for PDF).
selector
    CSS selector to capture (default: `"html"` for full page). When a
    specific selector is used with PDF output, produces a tightly-bounded
    PDF. Wide elements (e.g., tables) are automatically detected and
    rendered at their natural width.
encoding
    Character encoding for the HTML file.
**kwargs
    Additional arguments passed to `webshot()` (e.g., zoom, expand,
    delay, vwidth, vheight).

Returns
-------
Path
    The absolute path to the output file.

close() -> 'None'

Explicitly close the module-level browser and CDP connection.

Call this when you're done taking screenshots to clean up Chrome processes.
If not called, cleanup happens automatically at interpreter exit.


## Browser Management

Chrome process lifecycle and discovery


Chrome(path: 'str | None' = None, headless: 'bool' = True, extra_args: 'list[str] | None' = None, timeout: 'float' = 20.0) -> 'None'

Manages a headless Chrome browser process.

Launches Chrome with remote debugging enabled and provides the WebSocket
URL for CDP communication.

Parameters
----------
path
    Path to Chrome executable. If None, auto-detected via `find_chrome()`.
headless
    Whether to run in headless mode.
extra_args
    Additional command-line arguments to pass to Chrome.
timeout
    Maximum seconds to wait for Chrome to start and report its WS URL.

find_chrome() -> 'str'

Locate the Chrome or Chromium binary on the system.

Search order:
1. CHROME_PATH environment variable
2. Platform-specific known locations

Returns the path to the Chrome executable.

Raises
------
RuntimeError
    If Chrome cannot be found.


## Session

Low-level browser tab and navigation control


Session(cdp: 'SyncCDP', width: 'int' = 992, height: 'int' = 744) -> 'None'

A CDP session representing a single browser tab.

Provides methods for navigation, viewport control, JavaScript evaluation,
and DOM queries needed for screenshot capture.

Parameters
----------
cdp
    The SyncCDP connection to use.
width
    Initial viewport width in pixels.
height
    Initial viewport height in pixels.


## Errors

Exception hierarchy for error handling


NokapError

Base exception for all nokap errors.

CDPError(message: 'str', error_data: 'dict[str, Any] | None' = None) -> 'None'

Error returned by the Chrome DevTools Protocol.

ChromeNotFoundError(message: 'str | None' = None) -> 'None'

Raised when Chrome/Chromium cannot be found on the system.

ChromeStartError(message: 'str', stderr: 'str' = '') -> 'None'

Raised when Chrome fails to start or report its DevTools URL.

NavigationError(url: 'str', reason: 'str' = '') -> 'None'

Raised when page navigation fails.

PageLoadTimeout(url: 'str', timeout: 'float') -> 'None'

Raised when a page does not finish loading within the timeout.

SelectorError(selector: 'str') -> 'None'

Raised when a CSS selector matches no elements.


## Types

Type definitions for configuration


Literal(*args, **kwargs)


----------------------------------------------------------------------
This is the CLI documentation for the package.
----------------------------------------------------------------------

## CLI: nokap

```
Usage: nokap [OPTIONS] COMMAND [ARGS]...

  Screenshots and PDFs from web pages. Powered by headless Chrome.

  nokap captures web pages and HTML content as PNG, JPEG, WebP, or PDF using a
  headless Chrome browser. It can target specific elements via CSS selectors,
  add padding, and produce element-bounded PDFs sized exactly to the content.

  Examples:
    nokap webshot https://example.com screenshot.png
    nokap from-html table.html table.pdf -s "table" -e 10
    nokap doctor

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  batch      Capture multiple URLs/files from a JSON manifest.
  doctor     Run a full diagnostic: find Chrome, launch it, and test a...
  from-html  Render an HTML file to an image or PDF.
  info       Display system info and whether a compatible browser is found.
  webshot    Take a screenshot or PDF of a URL or local file.
```

### nokap webshot

```
Usage: nokap webshot [OPTIONS] URL [FILE]

  Take a screenshot or PDF of a URL or local file.

  URL can be an http/https URL or a local file path. `FILE` is the output path
  (default: `webshot.png`). The output format is determined by the file
  extension: `.png`, `.jpg`, `.webp` for raster images; `.pdf` for PDF.

  When a selector is provided, only the matching element is captured (cropped
  to its bounding box). For PDF output with a selector, the paper is sized
  exactly to the element, producing a tight-fit vector PDF.

  Examples:
    nokap webshot https://example.com page.png
    nokap webshot https://example.com hero.png -s "h1" -z 2
    nokap webshot page.html table.png -s "table" -e 10 -z 2
    nokap webshot https://example.com mobile.png --vwidth 375 --vheight 812
    nokap webshot https://example.com doc.pdf --page-size a4
    nokap webshot report.html report.pdf --landscape --print-background
    nokap webshot https://example.com shot.png -d 2.0

Options:
  --vwidth INTEGER      Viewport width in pixels.
  --vheight INTEGER     Viewport height in pixels.
  -s, --selector TEXT   CSS selector to capture.
  -e, --expand INTEGER  Pixels to expand around selector.
  -d, --delay FLOAT     Seconds to wait after page load.
  -z, --zoom FLOAT      Zoom/scale factor.
  --useragent TEXT      Custom User-Agent string.
  --page-size TEXT      Paper size for PDF (e.g., letter, a4).
  --landscape           Use landscape orientation for PDF.
  --print-background    Print CSS backgrounds in PDF.
  --help                Show this message and exit.
```

### nokap from-html

```
Usage: nokap from-html [OPTIONS] HTML_FILE [FILE]

  Render an HTML file to an image or PDF.

  `HTML_FILE` is a path to an HTML file to render. `FILE` is the output path
  (default: `webshot.png`). Unlike 'nokap webshot', this command defaults to
  `selector="html"` so the entire document is captured regardless of viewport
  height.

  This is the natural choice for HTML generated by packages like Great Tables,
  Plotly, or custom report builders. Combine with a selector and `.pdf`
  extension to produce element-bounded vector PDFs.

  Examples:
    nokap from-html report.html report.png
    nokap from-html data.html table.png -s "table" -z 2
    nokap from-html chart.html chart.png -s "#chart" -e 20
    nokap from-html gt_table.html table.pdf -s "table" -e 5
    nokap from-html invoice.html invoice.pdf

Options:
  -s, --selector TEXT   CSS selector to capture.
  --vwidth INTEGER      Viewport width in pixels.
  --vheight INTEGER     Viewport height in pixels.
  -e, --expand INTEGER  Pixels to expand around selector.
  -d, --delay FLOAT     Seconds to wait after page load.
  -z, --zoom FLOAT      Zoom/scale factor.
  --help                Show this message and exit.
```

### nokap info

```
Usage: nokap info [OPTIONS]

  Display system info and whether a compatible browser is found.

  Shows the nokap version, Python version, platform, and whether a
  Chrome/Chromium binary can be located. This is a quick sanity check that
  does NOT launch Chrome. Use 'nokap doctor' for a full end-to-end diagnostic
  that actually tests captures.

  Examples:
    nokap info

Options:
  --help  Show this message and exit.
```

### nokap doctor

```
Usage: nokap doctor [OPTIONS]

  Run a full diagnostic: find Chrome, launch it, and test a capture.

  Goes beyond 'nokap info' by actually launching headless Chrome, creating a
  tab, rendering a test page, and capturing both a PNG screenshot and an
  element-bounded PDF. Reports timing for each step so you can identify
  bottlenecks in CI or slow environments.

  Steps performed:

  1. Locate Chrome/Chromium binary
  2. Launch headless Chrome and connect via WebSocket
  3. Render test HTML and capture as PNG
  4. Render test HTML and capture as element-bounded PDF

  Examples:
    nokap doctor              # Run full diagnostic
    CHROME_PATH=/usr/bin/chromium nokap doctor

Options:
  --help  Show this message and exit.
```

### nokap batch

```
Usage: nokap batch [OPTIONS] MANIFEST

  Capture multiple URLs/files from a JSON manifest.

  `MANIFEST` is a path to a JSON file containing an array of capture jobs.
  Each job object requires a 'file' key (output filename) and either a 'url'
  or 'html' key (source). All other keys map to `webshot()` parameters and
  override the command-line defaults.

  Chrome is launched once and reused across all jobs, making batch processing
  significantly faster than individual 'nokap webshot' calls. Jobs that fail
  are reported individually without stopping the batch.

  Supported job keys:

    'url', 'html', 'file', 'selector', 'expand', 'zoom', 'delay', 'vwidth',
    'vheight', 'page_size', 'landscape', 'print_background'

  Example manifest (captures.json):

  [
    {"url": "https://example.com", "file": "homepage.png"},
    {"url": "report.html", "file": "report.pdf", "selector": "table"},
    {"html": "<h1>Hello</h1>", "file": "hello.png", "zoom": 2}
  ]

  Examples:
    nokap batch captures.json -o output/
    nokap batch tables.json -o images/ -s "table" -z 2 -e 5
    nokap batch pages.json -o out/ -d 0

Options:
  -o, --output-dir PATH  Directory to write output files.
  -d, --delay FLOAT      Default delay (seconds).
  -z, --zoom FLOAT       Default zoom factor.
  -s, --selector TEXT    Default CSS selector for all entries.
  -e, --expand INTEGER   Default expand (pixels).
  --help                 Show this message and exit.
```
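
The override rule described in the help text (job keys win over command-line defaults) can be sketched in a few lines of Python. This is an illustration only: `resolve_job` and `SUPPORTED_KEYS` are hypothetical names, and the strict "exactly one of `url` or `html`" check is an assumption about how such validation might work, not a description of what `nokap batch` does internally.

```python
# Keys that the `nokap batch` help text lists as supported in a job object.
SUPPORTED_KEYS = {
    "url", "html", "file", "selector", "expand", "zoom", "delay",
    "vwidth", "vheight", "page_size", "landscape", "print_background",
}


def resolve_job(job: dict, defaults: dict) -> dict:
    """Merge command-line defaults into one manifest job; job keys win."""
    unknown = set(job) - SUPPORTED_KEYS
    if unknown:
        raise ValueError(f"unsupported job keys: {sorted(unknown)}")
    if "file" not in job:
        raise ValueError("job is missing the required 'file' key")
    # Assumption: a job names exactly one source, either 'url' or 'html'.
    if ("url" in job) == ("html" in job):
        raise ValueError("job needs exactly one of 'url' or 'html'")
    # Later mappings take precedence, so values from the job override defaults.
    return {**defaults, **job}
```

With defaults of `{"zoom": 2.0, "selector": "table"}`, a job that sets its own `"zoom"` keeps that value while still inheriting the default selector.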


----------------------------------------------------------------------
This is the User Guide documentation for the package.
----------------------------------------------------------------------


## Getting Started

# Introduction

Learn what nokap is, why it exists, and how it works under the hood. This page provides a high-level overview of nokap's design philosophy and architecture, giving you the foundational context you'll need before diving into screenshots, PDFs, and the CLI. By the end, you'll understand how nokap communicates with Chrome and why that approach keeps things fast and dependency-free.

## What Is nokap?

**nokap** is a lightweight Python package that captures screenshots and PDFs from web pages using headless Chrome via the [Chrome DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/) (CDP). It communicates directly with Chrome over WebSockets, so no Selenium, Playwright, or browser driver binaries are needed.

## Why Use nokap?

If you need to:

- Convert HTML tables or reports to images for embedding in documents
- Generate PDFs from web pages with precise control over page size and margins
- Capture specific elements using CSS selectors
- Produce retina-quality (2×, 3×) screenshots
- Automate web page captures without heavyweight browser automation frameworks

...then nokap is the right tool.

## How It Works

nokap's architecture is intentionally simple:

1. **Launch**: Finds and starts headless Chrome with a random debugging port
2. **Connect**: Opens a WebSocket to Chrome's CDP endpoint
3. **Capture**: Creates a tab, navigates, waits, then calls `Page.captureScreenshot` or `Page.printToPDF`
4. **Cleanup**: Closes the tab; Chrome stays running for reuse until `nokap.close()` or process exit

The entire package has a single runtime dependency: [`websockets`](https://pypi.org/project/websockets/).
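
To make the capture step more concrete, the sketch below builds the kind of JSON messages a CDP client sends over the WebSocket, without sending anything. The method names (`Emulation.setDeviceMetricsOverride`, `Page.navigate`, `Page.captureScreenshot`) are real CDP commands, but the helper names and the exact sequence are assumptions made for illustration; nokap's actual message flow is not documented here and may differ (for example, it may also attach session identifiers).

```python
import json
from itertools import count

_ids = count(1)  # every CDP request carries a unique integer id


def cdp_message(method: str, **params) -> str:
    """Serialize one CDP command as a JSON text frame."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


def screenshot_commands(url: str, width: int = 992, height: int = 744) -> list:
    """Hypothetical command sequence for a viewport-sized PNG capture."""
    return [
        # Set the viewport size before loading the page.
        cdp_message("Emulation.setDeviceMetricsOverride",
                    width=width, height=height,
                    deviceScaleFactor=1, mobile=False),
        # Load the page; a real client would then wait for the load event.
        cdp_message("Page.navigate", url=url),
        # Ask Chrome for the image, returned as base64 data in the response.
        cdp_message("Page.captureScreenshot", format="png"),
    ]


for frame in screenshot_commands("https://example.com"):
    print(frame)
```

For PDF output, the last command would be `Page.printToPDF` instead, as noted in the Capture step.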

## Key Features

| Feature | Details |
|---------|---------|
| **Image formats** | PNG, JPEG, WebP |
| **PDF generation** | Configurable page size, margins, orientation |
| **CSS selectors** | Capture specific elements (or union of multiple) |
| **Zoom/scale** | Produce retina-quality (2×, 3×) images |
| **Expand/padding** | Add whitespace around captured elements |
| **Local HTML** | Render HTML strings or local `.html` files |
| **Viewport control** | Set width/height for responsive layouts |
| **Custom User-Agent** | Override the browser UA string |
| **CLI** | `nokap webshot` and `nokap from-html` commands |
| **Auto-cleanup** | Chrome process managed via `atexit` |
| **Jupyter-safe** | Works in notebooks (no event loop conflicts) |

## Next Steps

You now have a clear picture of what nokap is, how it communicates with Chrome, and what it can do. The architecture is intentionally minimal (a direct WebSocket connection to Chrome's DevTools Protocol), which keeps things fast, predictable, and free of heavyweight dependencies.

- [Installation](01-installation.qmd): Get nokap set up on your system
- [Quick Start](02-quick-start.qmd): Your first screenshot in 30 seconds

Once nokap is installed and Chrome is discoverable on your system, you'll be ready to capture your first screenshot in just a few lines of code.


# Installation

This guide covers installing nokap and setting up Chrome on your system. You'll find instructions for installing from PyPI or from source, along with details on how nokap discovers Chrome automatically. If you're running in a container or CI environment, the platform-specific notes at the end will help you get things working smoothly.

## Requirements

nokap requires:

- **Python ≥ 3.10**
- **Chrome or Chromium** installed on the system

## Installing from PyPI

```bash
pip install nokap
```

This installs nokap and its single runtime dependency (`websockets`).

## Installing from Source

```bash
git clone https://github.com/posit-dev/nokap.git
cd nokap
pip install -e ".[dev]"
```

The `[dev]` extra includes testing and linting tools (pytest, ruff, pyright).

## Chrome Setup

nokap needs a system-installed Chrome or Chromium browser. It does **not** download or bundle a browser.

### Automatic Discovery

nokap searches for Chrome in platform-specific locations:

:::{.callout-note collapse="true" title="Platform search paths"}

**macOS**

- `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`
- `/Applications/Chromium.app/Contents/MacOS/Chromium`
- `/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary`
- `/Applications/Brave Browser.app/Contents/MacOS/Brave Browser`
- `/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge`

**Linux**

Searches `PATH` for: `google-chrome`, `google-chrome-stable`, `chromium-browser`, `chromium`, `microsoft-edge`

**Windows**

Searches `Program Files`, `Program Files (x86)`, and `LOCALAPPDATA` for:

- `Google\Chrome\Application\chrome.exe`
- `Microsoft\Edge\Application\msedge.exe`
- `Chromium\Application\chrome.exe`

:::

### Manual Override

Set the `CHROME_PATH` environment variable to use a specific browser:

```bash
export CHROME_PATH="/usr/bin/chromium-browser"
```
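
The search order (environment override first, then known locations) can be sketched as follows for the Linux case. This is a simplified illustration, not nokap's implementation: `locate_browser` is a hypothetical name, the `env` and `which` parameters exist only to make the sketch easy to test, and returning `CHROME_PATH` without validating it is an assumption.

```python
import os
import shutil
from typing import Callable, Mapping, Optional, Sequence

# Executable names searched on PATH under Linux, per the list above.
LINUX_NAMES = (
    "google-chrome", "google-chrome-stable",
    "chromium-browser", "chromium", "microsoft-edge",
)


def locate_browser(
    names: Sequence[str] = LINUX_NAMES,
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return a browser path: CHROME_PATH first, then the first name on PATH."""
    override = env.get("CHROME_PATH")
    if override:
        return override
    for name in names:
        found = which(name)
        if found:
            return found
    raise RuntimeError("Chrome/Chromium not found; set CHROME_PATH")
```

In real code, prefer `nokap.find_chrome()`, which also covers the macOS and Windows locations.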

### CI Environments

On Ubuntu CI runners (GitHub Actions), install Chrome with:

```yaml
- name: Install Chrome
  uses: browser-actions/setup-chrome@v1
  with:
    chrome-version: stable
```

## Verifying the Installation

```{python}
#| eval: false
import nokap

# Check that Chrome can be found
path = nokap.find_chrome()
print(f"Chrome found at: {path}")
```

## Dependencies

| Package | Role | Required? |
|---------|------|-----------|
| `websockets` | Async WebSocket client for CDP | Yes (runtime) |
| `click` | CLI framework | Yes (runtime) |
| `Pillow` | Image post-processing (expand/crop) | Optional (`pip install nokap[extra]`) |

## Next Steps

nokap is now installed and Chrome is discoverable on your system. The setup is intentionally lightweight: just one runtime dependency and a system Chrome installation. Everything else (browser launch, WebSocket connection, tab management) is handled internally.

- [Quick Start](02-quick-start.qmd): Take your first screenshot

With installation complete, you're ready to capture your first screenshot. The Quick Start guide walks you through the three main input modes in under a minute.


# Quick Start

This guide walks you through taking your first screenshot with nokap in under a minute. We'll cover the three main input modes (capturing a live URL, a local HTML file, and raw HTML strings) so you can choose the approach that fits your workflow. Each example is self-contained and ready to copy into a script or notebook.

## Your First Screenshot

The simplest use of nokap is capturing a web page as a PNG:

```{python}
#| eval: false
import nokap

nokap.webshot("https://example.com", "example.png")
```

This launches headless Chrome (if not already running), navigates to the URL, captures the page at the default 992×744 viewport, and saves it as a PNG file.

## Screenshot a Local HTML File

You can also capture local HTML files:

```{python}
#| eval: false
nokap.webshot("page.html", "page.png")
```

nokap automatically converts local paths to `file://` URLs.
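
If you want to see what such a conversion looks like, here is a minimal sketch using only the standard library. The function name `to_url` is hypothetical, and the rule for telling URLs from paths (checking for an `http`, `https`, or `file` scheme) is an assumption rather than nokap's documented logic.

```python
from pathlib import Path
from urllib.parse import urlparse


def to_url(target) -> str:
    """Return URLs unchanged; turn local paths into absolute file:// URLs."""
    text = str(target)
    if urlparse(text).scheme in {"http", "https", "file"}:
        return text
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(text).resolve().as_uri()


print(to_url("https://example.com"))
print(to_url("page.html"))
```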

## From an HTML String

The `from_html()` function is designed for rendering HTML content directly, which is great for libraries that generate HTML:

```{python}
#| eval: false
html = """
<html>
<body style="font-family: system-ui; padding: 2rem;">
  <h1>Hello, nokap!</h1>
  <p>This is rendered from a string.</p>
</body>
</html>
"""

nokap.from_html(html, "hello.png")
```

## Capture a Specific Element

Use a CSS selector to capture just one element:

```{python}
#| eval: false
nokap.webshot(
    "https://example.com",
    "heading.png",
    selector="h1"
)
```

## Generate a PDF

Change the output extension to `.pdf`:

```{python}
#| eval: false
nokap.webshot("https://example.com", "page.pdf")
```

PDF options like page size, margins, and orientation are available:

```{python}
#| eval: false
nokap.webshot(
    "https://example.com",
    "page.pdf",
    page_size="a4",
    landscape=True,
    print_background=True,
)
```

## Retina Quality

Use `zoom` for higher resolution output:

```{python}
#| eval: false
# 2× resolution (retina)
nokap.webshot("https://example.com", "retina.png", zoom=2)
```

## Using the CLI

nokap also provides a command-line interface:

```bash
# Screenshot a URL
nokap webshot https://example.com screenshot.png

# Capture a specific element with 2× zoom
nokap webshot https://example.com header.png -s "h1" -z 2

# Render an HTML file
nokap from-html report.html report.png --selector "table" --expand 10
```

## Cleanup

Chrome runs in the background for reuse. Clean up explicitly when done:

```{python}
#| eval: false
nokap.close()
```

This is called automatically at process exit, but explicit cleanup is good practice in long-running applications.
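
In a long-running application, one way to guarantee the cleanup runs even when a capture raises is a small context manager. The helper below is a hypothetical convenience wrapper, not part of nokap; it accepts any zero-argument callable so the sketch runs without Chrome.

```python
from contextlib import contextmanager


@contextmanager
def cleanup_after(close):
    """Call `close` when the block ends, whether or not it raised."""
    try:
        yield
    finally:
        close()
```

You would use it as `with cleanup_after(nokap.close): ...`, placing your `nokap.webshot()` calls inside the block.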

## Next Steps

You've seen the essentials: capturing URLs, local files, and HTML strings as images or PDFs, targeting specific elements with selectors, and using the CLI for shell-based workflows. These building blocks cover the majority of day-to-day use cases.

- [Screenshots](10-screenshots.qmd): Deep dive into image capture options
- [PDF Generation](12-pdf-generation.qmd): Full PDF configuration
- [The CLI](13-cli.qmd): Command-line usage reference

Each of the guides above expands on what you've learned here, covering every parameter in detail. Start with whichever matches your immediate need. (You don't need to read them in order.)


## Core Concepts

# Screenshots

A deep dive into image capture options: formats, viewport, zoom, delay, and more. nokap gives you fine-grained control over how pages are rendered and captured, from choosing between PNG, JPEG, and WebP output to configuring viewport dimensions and device scale factor. This page covers every screenshot-related parameter so you can produce exactly the output you need.

## Image Formats

nokap supports three image formats, determined by the output file extension:

| Extension | Format | Notes |
|-----------|--------|-------|
| `.png` | PNG | Default. Lossless, supports transparency |
| `.jpg` / `.jpeg` | JPEG | Lossy compression, smaller files |
| `.webp` | WebP | Modern format, excellent compression |

```{python}
#| eval: false
import nokap

nokap.webshot("https://example.com", "page.png")   # PNG
nokap.webshot("https://example.com", "page.jpg")   # JPEG
nokap.webshot("https://example.com", "page.webp")  # WebP
```
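
Because the format follows from the extension, it can help to validate output names before starting a long run. The lookup below is a sketch under assumptions: `output_format` is a hypothetical helper, and treating extensions case-insensitively is a guess about nokap's behavior. The `.pdf` entry is included because the same rule selects PDF output.

```python
from pathlib import Path

# Extension-to-format table taken from the documentation above.
FORMATS = {
    ".png": "png",
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
    ".webp": "webp",
    ".pdf": "pdf",
}


def output_format(file) -> str:
    """Map an output filename to a format name, or raise ValueError."""
    suffix = Path(file).suffix.lower()  # assumption: case-insensitive match
    try:
        return FORMATS[suffix]
    except KeyError:
        raise ValueError(f"unsupported output extension: {suffix!r}") from None
```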

## Viewport Size

Control the browser viewport dimensions with `vwidth` and `vheight`:

```{python}
#| eval: false
# Mobile viewport
nokap.webshot("https://example.com", "mobile.png", vwidth=375, vheight=812)

# Desktop widescreen
nokap.webshot("https://example.com", "wide.png", vwidth=1920, vheight=1080)
```

The defaults are `992×744` pixels, matching common screenshot tool conventions.

## Zoom / Scale Factor

The `zoom` parameter scales the output resolution without changing the viewport layout:

```{python}
#| eval: false
# Standard resolution
nokap.webshot("https://example.com", "1x.png", zoom=1)

# Retina (2×)
nokap.webshot("https://example.com", "2x.png", zoom=2)

# Triple resolution
nokap.webshot("https://example.com", "3x.png", zoom=3)
```

Setting `zoom=2` produces an image with twice the viewport's width and height in pixels, which is ideal for HiDPI displays.

:::{.callout-note}
The `zoom` parameter applies only to raster image output (PNG, JPEG, WebP). It is ignored for PDF output since PDFs are vector format and always render at full resolution. See [PDF Generation](12-pdf-generation.qmd) for details on vector PDF output.
:::

In general, `zoom=2` is a good default for images destined for documentation or presentations, balancing file size with crisp rendering on modern screens.
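
As a quick worked example of the arithmetic, the helper below estimates the pixel dimensions of a viewport-sized capture. `output_size` is a hypothetical name, and rounding to the nearest pixel is an assumption; Chrome's exact rounding for fractional zoom values may differ.

```python
def output_size(vwidth: int = 992, vheight: int = 744, zoom: float = 1.0):
    """Estimate the pixel size of a viewport capture at a given zoom."""
    return round(vwidth * zoom), round(vheight * zoom)


print(output_size(zoom=2))       # default viewport at 2x: (1984, 1488)
print(output_size(375, 812, 3))  # mobile viewport at 3x: (1125, 2436)
```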

## Delay

The `delay` parameter adds a wait (in seconds) after the page load event fires, before capturing. This is useful for pages with animations or lazy-loaded content:

```{python}
#| eval: false
# Wait 1 second for animations to complete
nokap.webshot("https://example.com", "animated.png", delay=1.0)

# No delay (capture immediately after load)
nokap.webshot("https://example.com", "instant.png", delay=0)
```

The default delay is `0.2` seconds.

## Custom User-Agent

Override the browser's User-Agent string:

```{python}
#| eval: false
nokap.webshot(
    "https://example.com",
    "bot.png",
    useragent="MyBot/1.0"
)
```

## From HTML Strings

The `from_html()` function writes an HTML string to a temp file and screenshots it:

```{python}
#| eval: false
html = "<h1 style='color: navy;'>Generated Report</h1>"
nokap.from_html(html, "report.png")
```

All `webshot()` options are available as keyword arguments:

```{python}
#| eval: false
nokap.from_html(
    html,
    "report.png",
    zoom=2,
    vwidth=800,
    delay=0.5,
)
```

## Next Steps

You now have a solid foundation for capturing raster screenshots with nokap, from choosing formats and viewport sizes to controlling resolution with zoom and timing with delay. These options cover the majority of screenshot workflows, but nokap offers much more.

- [Selectors and Clipping](11-selectors-and-clipping.qmd): Capture specific page elements
- [PDF Generation](12-pdf-generation.qmd): Generate PDFs with full layout control

With formats, dimensions, and timing under your belt, the next step is learning to target specific elements on a page rather than capturing the entire viewport.


# Selectors and Clipping

Capture specific elements using CSS selectors, expand padding, or define manual clip regions. Rather than screenshotting an entire page and cropping afterward, nokap can target individual DOM elements directly, producing tightly framed images with no post-processing required. This page explains how to use single selectors, combine multiple selectors, add padding around elements, and specify exact pixel-level clip rectangles.

## CSS Selectors

The most common way to target a specific part of a page is with a CSS selector. When you pass a `selector`, nokap locates the matching element in the DOM, computes its bounding box, and captures only that region. This works for both image and PDF output.

```{python}
#| eval: false
import nokap

# Capture just the main heading
nokap.webshot("https://example.com", "heading.png", selector="h1")

# Capture by class
nokap.webshot("https://example.com", "nav.png", selector=".navigation")

# Capture by ID
nokap.webshot("https://example.com", "hero.png", selector="#hero-section")
```

## Multiple Selectors

Sometimes the content you need spans multiple elements that aren't wrapped in a single container. In those cases, pass a list of selectors and nokap computes the union bounding box (the smallest rectangle that encompasses all matched elements):

```{python}
#| eval: false
# Capture header and footer together
nokap.webshot(
    "page.html",
    "combined.png",
    selector=["header", "footer"]
)
```

Elements with zero width or height (invisible elements) are automatically excluded from the union calculation. This makes it safe to include selectors that might not be visible on every page.
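
The union calculation is simple to express in code. The sketch below assumes each bounding box is an `(x, y, width, height)` tuple in CSS pixels; `union_box` is a hypothetical name, and raising `ValueError` when nothing is visible is an illustrative choice rather than nokap's documented behavior.

```python
def union_box(boxes):
    """Smallest rectangle containing every box with non-zero width and height."""
    # Skip invisible elements, mirroring the exclusion rule described above.
    visible = [b for b in boxes if b[2] > 0 and b[3] > 0]
    if not visible:
        raise ValueError("no visible elements to capture")
    left = min(x for x, _, _, _ in visible)
    top = min(y for _, y, _, _ in visible)
    right = max(x + w for x, _, w, _ in visible)
    bottom = max(y + h for _, y, _, h in visible)
    return (left, top, right - left, bottom - top)


header = (0, 0, 992, 80)
footer = (0, 900, 992, 60)
hidden = (10, 10, 0, 0)  # zero-size element, ignored
print(union_box([header, footer, hidden]))  # (0, 0, 992, 960)
```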

## Expand / Padding

Capturing an element right at its bounding box edge can feel cramped. The `expand` parameter adds breathing room (in pixels) around the element, giving you a more polished result without needing to modify the page's CSS:

```{python}
#| eval: false
# Add 10px padding on all sides
nokap.webshot("page.html", "padded.png", selector="table", expand=10)
```

For asymmetric padding, pass a 4-tuple of `(top, right, bottom, left)`:

```{python}
#| eval: false
# 20px top/bottom, 10px left/right
nokap.webshot(
    "page.html",
    "padded.png",
    selector="table",
    expand=(20, 10, 20, 10),
)
```

A small expand value (5–10px) is usually enough to prevent visual crowding. Larger values are useful when the surrounding whitespace is part of the design intent.
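
To see how `expand` changes the captured region, here is a sketch of the geometry. The helper names are hypothetical, and the sketch does not clamp the result to the page edges, which a real implementation might do.

```python
def normalize_expand(expand):
    """Turn an int or a (top, right, bottom, left) tuple into a 4-tuple."""
    if isinstance(expand, int):
        return (expand, expand, expand, expand)
    top, right, bottom, left = expand
    return (top, right, bottom, left)


def expand_box(box, expand=0):
    """Grow an (x, y, width, height) box outward by the expand amounts."""
    x, y, width, height = box
    top, right, bottom, left = normalize_expand(expand)
    return (x - left, y - top, width + left + right, height + top + bottom)


table = (50, 100, 400, 300)
print(expand_box(table, 10))                # (40, 90, 420, 320)
print(expand_box(table, (20, 10, 20, 10)))  # (40, 80, 420, 340)
```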

## Clip Rectangle

When you need pixel-perfect control over the captured region (or when the area of interest doesn't correspond to any single DOM element), use `cliprect`. It takes a tuple of `(x, y, width, height)` in CSS pixels, specifying an exact rectangle on the page:

```{python}
#| eval: false
# Capture a 400×300 region starting at (50, 100)
nokap.webshot(
    "https://example.com",
    "region.png",
    cliprect=(50, 100, 400, 300),
)
```

:::{.callout-warning}
`selector` and `cliprect` are mutually exclusive. Passing both raises an error.
:::

## Selectors with from_html()

When rendering HTML strings with `from_html()`, the default selector is `"html"`, which captures the entire rendered page. For most use cases (especially when working with generated HTML from tools like Great Tables), you'll want to override this with a more specific selector to capture just the element of interest:

```{python}
#| eval: false
html = """
<div style="padding: 2rem;">
  <table id="data-table">
    <tr><th>Name</th><th>Value</th></tr>
    <tr><td>Alpha</td><td>100</td></tr>
    <tr><td>Beta</td><td>200</td></tr>
  </table>
  <p>Footer text (not captured)</p>
</div>
"""

nokap.from_html(html, "table.png", selector="#data-table", expand=5)
```

## Automatic Width Detection

A common frustration when capturing wide content (like a table with many columns) is that the default viewport width clips the element. nokap solves this automatically: when it detects that an element's width fills or exceeds the viewport, it temporarily widens the viewport to allow the element to render at its natural/intrinsic width. The result is a capture that shows the full content regardless of how wide it is.

This behavior uses a two-pass heuristic to distinguish between two types of elements:

- **Intrinsic-width elements** (tables, code blocks): These have a fixed natural width determined by their content. If the viewport constrains them, nokap widens the viewport and captures the full content.
- **Fluid-layout elements** (`<main>`, `<div>` with percentage widths): These grow with the viewport and have no fixed intrinsic width. nokap detects this pattern and reverts to the original viewport to avoid producing output with unnecessary whitespace on both sides.

No manual `vwidth` configuration is needed in either case. The detection works for both image and PDF output, so wide tables are always captured completely whether you're producing a PNG or an element-bounded PDF.
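
The two-pass idea can be illustrated with a small simulation. Everything here is an assumption made for the sake of the sketch: `choose_viewport_width` is a hypothetical function, `measure` stands in for "render at this viewport width and report the element's width", and the probe factor of 4 is arbitrary. nokap's real heuristic may use different thresholds and measurements.

```python
import math


def choose_viewport_width(measure, vwidth=992, probe_factor=4):
    """Pick a viewport width using a two-pass measurement.

    `measure(width)` returns the element's rendered width at that viewport.
    """
    # Pass 1: does the element fill or exceed the current viewport?
    if measure(vwidth) < vwidth:
        return vwidth  # it fits; keep the original viewport
    # Pass 2: widen the viewport and measure again.
    probe = vwidth * probe_factor
    widened = measure(probe)
    if widened >= probe:
        return vwidth  # fluid layout: it grew with the viewport, so revert
    # Intrinsic width: the element stopped growing at its natural size.
    return max(vwidth, math.ceil(widened))


wide_table = lambda width: min(1500, width)  # natural width of 1500px
fluid_main = lambda width: width             # always fills the viewport
small_card = lambda width: 300               # fixed, narrow element

print(choose_viewport_width(wide_table))  # 1500
print(choose_viewport_width(fluid_main))  # 992
print(choose_viewport_width(small_card))  # 992
```

One limitation of this simplified version is that an element wider than the probe viewport would be misclassified as fluid.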

## Selector Errors

When a selector doesn't match any element in the rendered page, nokap raises a `SelectorError` immediately rather than producing a blank or broken capture. This makes it easy to catch typos or selectors that depend on content that hasn't loaded yet:

```{python}
#| eval: false
import nokap
from nokap import SelectorError

try:
    nokap.webshot("page.html", "out.png", selector="#nonexistent")
except SelectorError as e:
    print(f"Selector failed: {e}")
    # "No element matches selector: '#nonexistent'"
```

## Next Steps

With selectors, expand, clip rectangles, and automatic width detection, you can precisely control what gets captured, from a single heading to a wide data table with dozens of columns. These tools eliminate the need for post-processing crops and ensure your output matches exactly what you see in the browser.

- [PDF Generation](12-pdf-generation.qmd): Generate PDFs with page layout options
- [Integration with Great Tables](20-great-tables-integration.qmd): Use nokap with GT

Selectors become especially powerful when combined with PDF output, where a targeted selector produces a compact, element-sized PDF rather than a full page. Head to the PDF guide to see how these concepts come together.


# PDF Generation

Generate PDFs with configurable page size, margins, orientation, and background printing. nokap uses Chrome's built-in print-to-PDF functionality through the DevTools Protocol, giving you access to the same rendering engine used for on-screen display. This page walks through the available PDF options: standard paper sizes, margin control, orientation, background printing, and element-bounded PDFs.

## Basic PDF Capture

nokap determines the output format from the file extension. Any `webshot()` or `from_html()` call with a `.pdf` output extension generates a PDF instead of a raster screenshot; no separate function or mode switch is required:

```{python}
#| eval: false
import nokap

nokap.webshot("https://example.com", "page.pdf")
```

## Page Size

For full-page PDFs, the `page_size` parameter controls the paper dimensions. This affects how content is paginated and how much fits on each page:

```{python}
#| eval: false
nokap.webshot("https://example.com", "a4.pdf", page_size="a4")
nokap.webshot("https://example.com", "letter.pdf", page_size="letter")
```

Available page sizes:

| Size | Dimensions (inches) |
|------|-------------------|
| `letter` | 8.5 × 11 |
| `legal` | 8.5 × 14 |
| `tabloid` | 11 × 17 |
| `ledger` | 17 × 11 |
| `a0` | 33.1 × 46.8 |
| `a1` | 23.4 × 33.1 |
| `a2` | 16.5 × 23.4 |
| `a3` | 11.7 × 16.5 |
| `a4` | 8.27 × 11.7 |
| `a5` | 5.83 × 8.27 |
| `a6` | 4.13 × 5.83 |

## Margins

Set page margins in inches. A single value applies to all sides:

```{python}
#| eval: false
# 1-inch margins on all sides
nokap.webshot("https://example.com", "page.pdf", margins=1.0)

# No margins
nokap.webshot("https://example.com", "page.pdf", margins=0)
```

For asymmetric margins, pass a 4-tuple of `(top, right, bottom, left)`:

```{python}
#| eval: false
nokap.webshot(
    "https://example.com",
    "page.pdf",
    margins=(1.0, 0.75, 1.0, 0.75),  # top, right, bottom, left
)
```

The default margin is `0.5` inches on all sides.

## Orientation

By default, PDFs render in portrait orientation. Use `landscape=True` to rotate the page, which is useful for wide content like data tables or charts that benefit from extra horizontal space:

```{python}
#| eval: false
nokap.webshot("https://example.com", "wide.pdf", landscape=True)
```
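
To see how page size, margins, and orientation interact, the following sketch computes the area left for content on each page. It is an illustration only: the helper name is made up, the sizes are a subset of the table above, and it assumes that `landscape=True` simply swaps the page's width and height.

```python
# Paper sizes in inches (width, height), copied from the page-size table.
PAPER_SIZES = {
    "letter": (8.5, 11.0),
    "legal": (8.5, 14.0),
    "tabloid": (11.0, 17.0),
    "a3": (11.7, 16.5),
    "a4": (8.27, 11.7),
    "a5": (5.83, 8.27),
}


def printable_area(page_size="letter", margins=0.5, landscape=False):
    """Return the (width, height) in inches left for content (illustrative)."""
    width, height = PAPER_SIZES[page_size]
    if landscape:
        # Assumption: landscape swaps the page's width and height.
        width, height = height, width
    if isinstance(margins, (int, float)):
        top = right = bottom = left = float(margins)
    else:
        top, right, bottom, left = margins
    return (round(width - left - right, 2), round(height - top - bottom, 2))
```

With the defaults (`letter`, 0.5-inch margins, portrait) this leaves 7.5 × 10 inches for content; switching to landscape with 1-inch margins leaves 9 × 6.5 inches.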

## Print Background

Chrome's print-to-PDF follows the same convention as physical printing: CSS background colors and images are omitted by default to save ink. For web content where backgrounds are part of the design (styled tables, colored headings, etc.), enable them explicitly:

```{python}
#| eval: false
nokap.webshot(
    "https://example.com",
    "styled.pdf",
    print_background=True,
)
```

## A Note on Scale

Unlike raster screenshots where `zoom` controls pixel density, PDFs are a vector format: text and graphics are stored as scalable geometry, not pixels. This means PDFs always render at the highest possible resolution regardless of any scale setting.

:::{.callout-note}
The `zoom` parameter is ignored for PDF output. Use `zoom` only for raster screenshots (PNG, JPEG, WebP) where it controls output pixel density and sharpness on HiDPI displays.
:::
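
For raster output, a quick way to reason about `zoom` is to treat it as a multiplier on the captured CSS dimensions. The helper below is a sketch under that assumption (the function name is hypothetical, and nokap may round fractional sizes differently):

```python
def raster_pixel_size(css_width, css_height, zoom=1.0):
    """Estimate output pixel dimensions for a raster capture (illustrative)."""
    # Assumption: zoom acts as a device scale factor on CSS pixels.
    return (round(css_width * zoom), round(css_height * zoom))
```

Under this assumption, the default 992×744 viewport captured at `zoom=2` yields a 1984×1488 image.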

## Element-Bounded PDF

One of nokap's most powerful features is the ability to produce a PDF that's sized exactly to a specific element rather than a standard paper size. When you combine a `.pdf` output with a `selector`, nokap produces a tightly-bounded PDF with the paper dimensions matched to the element's bounding box. The text remains fully selectable and searchable (unlike a raster screenshot embedded in a PDF).

This is the feature to reach for when you need to place a table or chart into a presentation (Keynote, PowerPoint, Google Slides) at vector quality, or when you want a compact PDF that contains only the content you care about.

```{python}
#| eval: false
# Capture just the table as a tight-fit PDF
nokap.webshot("report.html", "table.pdf", selector="table", expand=10)
```

Under the hood, the element-bounded PDF works by:

1. Measuring the element's bounding box in CSS pixels
2. Setting the PDF paper size to exactly match those dimensions (converted to inches)
3. Injecting print-specific CSS to isolate the element and hide surrounding content
4. Rendering via Chrome's `Page.printToPDF` with zero page margins

The result is a single-page PDF where the page boundary coincides with the element boundary (plus any `expand` padding you specify).
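
Steps 1 and 2 come down to a unit conversion. The sketch below shows the arithmetic, assuming the standard CSS ratio of 96 pixels per inch; the helper name is hypothetical and this is not nokap's internal code.

```python
CSS_PX_PER_INCH = 96  # CSS defines 1in as exactly 96px


def element_paper_size(width_px, height_px, expand=0):
    """Paper (width, height) in inches for an element box plus padding.

    `expand` is either a single int or a (top, right, bottom, left) tuple,
    mirroring the documented `expand` parameter.
    """
    if isinstance(expand, int):
        top = right = bottom = left = expand
    else:
        top, right, bottom, left = expand
    total_width_px = width_px + left + right
    total_height_px = height_px + top + bottom
    return (total_width_px / CSS_PX_PER_INCH, total_height_px / CSS_PX_PER_INCH)
```

For example, a 940×460px table with `expand=10` would need a 10 × 5 inch page.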

### With `from_html()`

For packages that generate HTML programmatically (like Great Tables, Plotly, or custom report builders), `from_html()` is the natural entry point. Pass the HTML string and a selector to extract just the relevant element as a vector PDF:

```{python}
#| eval: false
from great_tables import GT, exibble

html = GT(exibble).as_raw_html(make_page=True, all_important=True)

# Tight PDF of just the table (useful for slides)
nokap.from_html(html, "table.pdf", selector="table", expand=5)
```

### Expand / Padding

The `expand` parameter adds whitespace around the element in the PDF: the same amount on every side for a single value, or per-side amounts for a 4-tuple. This is purely a visual affordance. It gives the content some breathing room so it doesn't feel jammed against the edges when placed into a slide or document:

```{python}
#| eval: false
# No padding: PDF edges touch the table
nokap.from_html(html, "tight.pdf", selector="table")

# 20px breathing room on all sides
nokap.from_html(html, "padded.pdf", selector="table", expand=20)

# Asymmetric padding
nokap.from_html(html, "asym.pdf", selector="table", expand=(10, 20, 10, 20))
```

### Backgrounds

Element-bounded PDFs include CSS backgrounds by default. This is intentional: when you're exporting a styled table or chart, the background colors, borders, and shading are typically essential to the visual design. Full-page PDFs follow Chrome's printing convention and omit backgrounds unless you explicitly set `print_background=True`.

### Wide Tables

A common challenge with table-to-PDF export is that wide tables get clipped when they exceed the viewport width. nokap handles this automatically: it detects when an element's natural width is being constrained by the viewport, temporarily widens the viewport to let the content expand, then sizes the PDF paper to fit the full element. No manual `vwidth` adjustment is needed: even very wide tables with dozens of columns render completely.

```{python}
#| eval: false
import pandas as pd
from great_tables import GT

# A 20-column table that's wider than the default 992px viewport
df = pd.DataFrame({f"col_{i}": [f"val_{i}"] for i in range(20)})
html = GT(df).as_raw_html(make_page=True, all_important=True)

# Automatically expands to fit all columns
nokap.from_html(html, "wide_table.pdf", selector="table", expand=5)
```

## Full-Page PDF vs Element-Bounded PDF

To summarize the two PDF modes and when to reach for each:

| | Full-Page PDF | Element-Bounded PDF |
|--|--|--|
| **Trigger** | `.pdf` output, no selector | `.pdf` output + `selector` |
| **Paper size** | Standard (letter, A4, etc.) | Sized to element |
| **Content** | Entire page | Just the selected element |
| **Backgrounds** | Off by default | On by default |
| **Use case** | Print a document | Export a table/chart for slides |

Both modes produce true vector PDFs with selectable text. The choice comes down to whether you want a traditional paginated document or a compact, element-sized artifact for embedding elsewhere.
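
The trigger rules in the table can be expressed as a small classifier. This is an illustrative sketch, not nokap's dispatch code: the function name is hypothetical, and the case-insensitive extension matching and the `ValueError` for unknown extensions are assumptions.

```python
from pathlib import Path

RASTER_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def capture_mode(file, selector=None):
    """Classify a capture by output extension and selector (illustrative)."""
    suffix = Path(file).suffix.lower()  # assumption: case-insensitive match
    if suffix in RASTER_EXTENSIONS:
        return "raster"
    if suffix == ".pdf":
        # A selector switches PDF output from full-page to element-bounded.
        return "element-bounded pdf" if selector else "full-page pdf"
    raise ValueError(f"unsupported output extension: {suffix!r}")
```

So `capture_mode("table.pdf", selector="table")` falls into the element-bounded branch, while `capture_mode("report.pdf")` is a full-page PDF.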

## PDF from HTML Strings

Just as with screenshots, you can generate a full-page PDF directly from an HTML string using `from_html()`. This is convenient for generating documents from templates or programmatic HTML without writing to a temporary file yourself:

```{python}
#| eval: false
html = """
<html>
<body>
  <h1>Invoice #1234</h1>
  <table>
    <tr><td>Widget</td><td>$10.00</td></tr>
    <tr><td>Gadget</td><td>$25.00</td></tr>
    <tr><th>Total</th><th>$35.00</th></tr>
  </table>
</body>
</html>
"""

nokap.from_html(html, "invoice.pdf", page_size="a4", margins=1.0)
```

## Complete Example

Here's a full-page PDF capture with all common options specified explicitly, showing how the parameters work together:

```{python}
#| eval: false
nokap.webshot(
    "report.html",
    "report.pdf",
    page_size="a4",
    margins=(1.0, 0.75, 1.0, 0.75),
    landscape=False,
    print_background=True,
)
```

And an element-bounded PDF workflow for exporting a styled table to a presentation:

```{python}
#| eval: false
from great_tables import GT, exibble

html = GT(exibble).as_raw_html(make_page=True, all_important=True)
nokap.from_html(html, "table_for_slides.pdf", selector="table", expand=10)
```

## Next Steps

This page covered both full-page and element-bounded PDF generation, from standard paper sizes and margins to the tightly fitted, selector-driven PDFs that are ideal for embedding in presentations. Because PDFs are vector format, you get crisp text at any zoom level without worrying about pixel density.

- [The CLI](13-cli.qmd): Generate PDFs from the command line
- [Integration with Great Tables](20-great-tables-integration.qmd): Export tables as images or PDFs

All of the PDF options covered here are also available via the command-line interface, making it easy to integrate nokap into scripts and automation pipelines.


# The CLI

nokap includes a command-line interface for capturing screenshots and PDFs without writing Python. The CLI exposes the same capabilities as the Python API through four commands, making it easy to integrate nokap into shell scripts, Makefiles, and CI pipelines. All screenshot and PDF options are available as command-line flags with sensible defaults.

## Overview

| Command | Purpose |
|---------|---------|
| `nokap webshot` | Capture a URL or local file as an image or PDF |
| `nokap from-html` | Render an HTML file to an image or PDF |
| `nokap doctor` | Run full diagnostics (launch Chrome, test captures) |
| `nokap batch` | Process multiple capture jobs from a JSON manifest |

Two informational entry points are also available: the `nokap info` command (quick system check) and the `nokap --version` flag.

## Installation

The CLI is installed automatically with the package:

```bash
pip install nokap
nokap --version
```

## `nokap webshot`

The primary command. It takes a URL (or local file path) and produces a screenshot or PDF, depending on the output file extension. This is the command you'll reach for most often, whether you're grabbing a quick screenshot from the terminal or building capture steps into a CI pipeline.

```bash
nokap webshot [OPTIONS] URL [FILE]
```

### Arguments

| Argument | Description | Default |
|----------|-------------|---------|
| `URL` | URL or local file path to capture | *(required)* |
| `FILE` | Output file path | `webshot.png` |

The format is determined by the file extension: `.png`, `.jpg`/`.jpeg`, or `.webp` for raster images; `.pdf` for PDF. If you omit `FILE`, it defaults to `webshot.png` in the current directory.

### Options

| Option | Short | Description | Default |
|--------|-------|-------------|---------|
| `--vwidth` | | Viewport width in pixels | `992` |
| `--vheight` | | Viewport height in pixels | `744` |
| `--selector` | `-s` | CSS selector to capture | *(full page)* |
| `--expand` | `-e` | Pixels to expand around selector | `0` |
| `--delay` | `-d` | Seconds to wait after page load | `0.2` |
| `--zoom` | `-z` | Zoom/scale factor | `1.0` |
| `--useragent` | | Custom User-Agent string | *(Chrome default)* |
| `--page-size` | | Paper size for PDF output | `letter` |
| `--landscape` | | Use landscape orientation (PDF) | *(flag)* |
| `--print-background` | | Print CSS backgrounds (PDF) | *(flag)* |

### Examples

Capture a basic screenshot (defaults to PNG at 992×744 viewport):

```bash
nokap webshot https://example.com screenshot.png
```

Target a specific element with a CSS selector. nokap will automatically crop the output to just that element's bounding box:

```bash
nokap webshot https://example.com header.png -s "h1"
```

Combine a selector with zoom and expand for a presentation-ready capture with padding and Retina resolution:

```bash
nokap webshot page.html table.png -s "table" -z 2 -e 10
```

Simulate a mobile viewport by setting custom dimensions:

```bash
nokap webshot https://example.com mobile.png --vwidth 375 --vheight 812
```

Generate a PDF. The output format is determined entirely by the file extension:

```bash
nokap webshot https://example.com doc.pdf --page-size a4
```

Landscape orientation with CSS backgrounds printed (useful for styled dashboards or reports):

```bash
nokap webshot report.html report.pdf --landscape --print-background
```

Add a delay for pages with lazy-loaded content or animations that need time to settle:

```bash
nokap webshot https://example.com animated.png -d 2.0
```

Capture a local HTML file (paths are resolved relative to the current directory):

```bash
nokap webshot ./build/index.html output.png
```
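
If you drive the CLI from a Python script (for example, in a build step), it can help to assemble the argument list programmatically. The helper below is a hypothetical sketch that only uses the long flags from the options table above; it is not part of nokap.

```python
import shlex


def webshot_command(
    url,
    file="webshot.png",
    *,
    selector=None,
    expand=None,
    delay=None,
    zoom=None,
    page_size=None,
    landscape=False,
    print_background=False,
):
    """Build the argv list for `nokap webshot` (hypothetical helper)."""
    argv = ["nokap", "webshot"]
    # Value options are only emitted when set, so the CLI defaults apply.
    for flag, value in [
        ("--selector", selector),
        ("--expand", expand),
        ("--delay", delay),
        ("--zoom", zoom),
        ("--page-size", page_size),
    ]:
        if value is not None:
            argv += [flag, str(value)]
    # Boolean flags take no value.
    if landscape:
        argv.append("--landscape")
    if print_background:
        argv.append("--print-background")
    argv += [str(url), str(file)]
    return argv


argv = webshot_command("page.html", "table.png", selector="table", zoom=2, expand=10)
print(shlex.join(argv))
# nokap webshot --selector table --expand 10 --zoom 2 page.html table.png
```

The resulting list can be passed directly to `subprocess.run(argv, check=True)`, which avoids shell-quoting problems with selectors that contain spaces or `#`.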

## `nokap from-html`

Renders an HTML file to an image or PDF. This is the CLI equivalent of `nokap.from_html()` in Python. The key difference from `webshot` is that `from-html` defaults to `selector="html"` (captures the full rendered document rather than the viewport), making it the natural choice for HTML files generated by packages like Great Tables or report builders.

```bash
nokap from-html [OPTIONS] HTML_FILE [FILE]
```

### Arguments

| Argument | Description | Default |
|----------|-------------|---------|
| `HTML_FILE` | Path to an HTML file to render | *(required)* |
| `FILE` | Output file path | `webshot.png` |

### Options

| Option | Short | Description | Default |
|--------|-------|-------------|---------|
| `--selector` | `-s` | CSS selector to capture | `html` |
| `--vwidth` | | Viewport width in pixels | `992` |
| `--vheight` | | Viewport height in pixels | `744` |
| `--expand` | `-e` | Pixels to expand around selector | `0` |
| `--delay` | `-d` | Seconds to wait after page load | `0.2` |
| `--zoom` | `-z` | Zoom/scale factor | `1.0` |

### Examples

Render a full HTML page to PNG. Since the default selector is `html`, the entire document is captured regardless of viewport height:

```bash
nokap from-html report.html report.png
```

Capture just a specific element from the page. This is especially useful when your HTML file contains a styled table surrounded by boilerplate:

```bash
nokap from-html data.html table.png -s "table" -z 2
```

Add breathing room around the element with `--expand`:

```bash
nokap from-html chart.html chart.png -s "#chart" -e 20
```

Generate a PDF from an HTML file. When combined with a selector, nokap produces an element-bounded PDF sized exactly to the element (perfect for embedding tables in presentations):

```bash
nokap from-html invoice.html invoice.pdf
nokap from-html gt_table.html table.pdf -s "table" -e 5
```

### Typical Workflow: Great Tables → PDF

A common pattern is generating an HTML table with Great Tables, then using `from-html` to convert it to a tight-fit PDF for slides:

```bash
# Step 1: Generate HTML with Python
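python -c "
from pathlib import Path
from great_tables import GT, exibble
# as_raw_html(make_page=True) returns a complete HTML document as a string
Path('table.html').write_text(GT(exibble).as_raw_html(make_page=True), encoding='utf-8')
"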

# Step 2: Convert to element-bounded PDF
nokap from-html table.html table.pdf -s "table" -e 10
```

## Output and Exit Codes

On success, both `webshot` and `from-html` print the absolute path to the output file on stdout and exit with code `0`. This makes them easy to compose with other shell tools:

```bash
$ nokap webshot https://example.com /tmp/shot.png
/tmp/shot.png
```

You can pass the output path to another command with command substitution:

```bash
open "$(nokap webshot page.html screenshot.png)"
```

On error, a descriptive message is printed to stderr and the exit code is `1`:

```bash
$ nokap webshot page.html out.png -s "#missing"
Error: No element matches selector: '#missing'
```

Common errors include missing selectors (`SelectorError`), navigation failures (`NavigationError`), and Chrome not being found (`ChromeNotFoundError`). See the [Error Handling](21-error-handling.qmd) guide for the full taxonomy.
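
The same stdout/exit-code contract can be consumed from Python. The following is a minimal sketch: `run_capture` is a hypothetical wrapper, raising `RuntimeError` is a choice made for this example, and the demo uses a stand-in command so the snippet runs even without nokap installed.

```python
import subprocess
import sys
from pathlib import Path


def run_capture(argv):
    """Run a capture command and return the output path it prints on stdout."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        # On failure the CLI writes its message to stderr and exits non-zero.
        message = result.stderr.strip() or f"exit code {result.returncode}"
        raise RuntimeError(message)
    return Path(result.stdout.strip())


# Stand-in command that mimics a successful capture; in practice argv would
# be something like ["nokap", "webshot", "page.html", "out.png"].
demo_argv = [sys.executable, "-c", "print('/tmp/shot.png')"]
print(run_capture(demo_argv))
```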

## Global Options

```bash
nokap --version   # Show version
nokap --help      # Show top-level help
nokap webshot --help  # Command-specific help
```

Every command supports `--help` for its own usage details.

## `nokap doctor`

The `doctor` command runs a full end-to-end diagnostic of your nokap installation. While `nokap info` only checks whether Chrome can be *found*, `doctor` goes further: it actually launches headless Chrome, opens a tab, renders test content, and produces both a raster screenshot and an element-bounded PDF. Each step is timed, making it invaluable for diagnosing slow environments (CI runners, Docker containers) or permission issues.

```bash
nokap doctor
```

### What It Tests

| Step | What happens | Catches |
|------|-------------|---------|
| 1. Find Chrome | Locates the Chrome/Chromium binary | Missing install, bad `CHROME_PATH` |
| 2. Launch Chrome | Starts headless Chrome, connects via WebSocket | Permission errors, port conflicts, sandboxing issues |
| 3. Test PNG | Renders HTML and captures a PNG screenshot | Rendering failures, codec issues |
| 4. Test PDF | Renders HTML and captures an element-bounded PDF | Print-to-PDF failures, paper sizing bugs |

### Sample Output

```
nokap doctor
========================================
nokap version: 0.1.0
Python: 3.12.0
Platform: macos (macOS-14.0-arm64)

1. Finding Chrome... OK (0ms)
   Path: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
2. Launching headless Chrome... OK (210ms)
   WebSocket: ws://127.0.0.1:51234/devtools/browser/abc123
3. Test capture (HTML → PNG)... OK (320ms, 8.6 KB)
4. Test capture (HTML → element PDF)... OK (290ms, 7.2 KB)

All checks passed.
```

### When to Use

Run `doctor` when:

- **Setting up CI**: Confirm that Chrome is available and captures work before wiring up the full pipeline. If step 2 takes more than a few seconds, your container may need a warm-up step or more resources.
- **After system updates**: macOS or Linux updates can change Chrome's binary path or sandbox permissions.
- **Debugging slow captures**: The per-step timing makes it easy to tell whether the bottleneck is Chrome startup, rendering, or PDF generation.
- **First-time setup**: After installing nokap in a new environment, one `nokap doctor` confirms everything works end-to-end.

If any step fails, `doctor` prints a descriptive error message and exits with code `1`:

```bash
$ nokap doctor
...
2. Launching headless Chrome... FAIL
   ChromeStartError: Chrome process exited immediately (exit code 1)
```

## `nokap batch`

The `batch` command processes multiple capture jobs from a single JSON manifest file. Chrome is launched once and kept alive across all jobs, so you only pay the startup cost once, which makes this significantly faster than running individual `nokap webshot` commands in a loop. This is ideal for documentation builds, automated report generation, or any workflow where you need to capture many pages with potentially different settings.

```bash
nokap batch [OPTIONS] MANIFEST
```

### Arguments

| Argument | Description | Default |
|----------|-------------|---------|
| `MANIFEST` | Path to a JSON file with an array of capture jobs | *(required)* |

### Options

| Option | Short | Description | Default |
|--------|-------|-------------|---------|
| `--output-dir` | `-o` | Directory to write output files | `.` |
| `--delay` | `-d` | Default delay for all jobs (seconds) | `0.2` |
| `--zoom` | `-z` | Default zoom factor for all jobs | `1.0` |
| `--selector` | `-s` | Default CSS selector for all jobs | *(none)* |
| `--expand` | `-e` | Default expand for all jobs (pixels) | `0` |

Command-line options set *defaults* for all jobs. Individual jobs in the manifest can override any default by specifying the same key. This lets you set baseline settings once and override per-job where needed.
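
Conceptually, the effective settings for a job are the command-line defaults with the job's own keys layered on top. A minimal sketch of that precedence (the function name is hypothetical, and the real merge may differ in detail):

```python
def effective_job(defaults, job):
    """Combine command-line defaults with one manifest job (illustrative)."""
    # Later entries win, so keys set in the job override the defaults.
    return {**defaults, **job}


defaults = {"selector": "table", "zoom": 2, "expand": 5}
job = {"url": "summary.html", "file": "summary.png", "zoom": 1}
merged = effective_job(defaults, job)
```

Here `merged` keeps `selector="table"` and `expand=5` from the defaults but uses the job's own `zoom` of `1`.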

### Manifest Format

The manifest is a JSON array where each object describes one capture job. Every job must have:

- `"file"`: output filename (relative to `--output-dir`)
- `"url"` or `"html"`: the source to capture

All other keys are optional and map directly to `webshot()` / `from_html()` parameters:

```json
[
  {"url": "https://example.com", "file": "homepage.png"},
  {"url": "https://example.com/pricing", "file": "pricing.png", "selector": "#plans"},
  {"url": "report.html", "file": "report.pdf", "selector": "table", "expand": 10},
  {"html": "<h1>Generated Title</h1>", "file": "title.png", "zoom": 2},
  {"url": "dashboard.html", "file": "dash.pdf", "landscape": true, "print_background": true}
]
```

Available job keys: `url`, `html`, `file`, `selector`, `expand`, `zoom`, `delay`, `vwidth`, `vheight`, `page_size`, `landscape`, `print_background`.
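
Manifests are often generated by a script rather than written by hand. The sketch below builds one and checks each job against the rules described in this section. The helper names are hypothetical, and rejecting a job that has both `url` and `html` is an assumption of this example rather than documented behavior.

```python
import json

ALLOWED_KEYS = {
    "url", "html", "file", "selector", "expand", "zoom", "delay",
    "vwidth", "vheight", "page_size", "landscape", "print_background",
}


def validate_job(job):
    """Raise ValueError if a job breaks the manifest rules (illustrative)."""
    if "file" not in job:
        raise ValueError("job is missing 'file'")
    # Assumption: exactly one source key is expected per job.
    if ("url" in job) == ("html" in job):
        raise ValueError("job needs exactly one of 'url' or 'html'")
    unknown = set(job) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")


def build_manifest(jobs):
    """Validate every job and return the manifest as a JSON string."""
    for job in jobs:
        validate_job(job)
    return json.dumps(jobs, indent=2)


manifest = build_manifest([
    {"url": "https://example.com", "file": "homepage.png"},
    {"html": "<h1>Generated Title</h1>", "file": "title.png", "zoom": 2},
])
```

Writing `manifest` to a file such as `captures.json` gives you something you can pass straight to `nokap batch`.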

### Examples

Process a manifest and write all output to a directory:

```bash
nokap batch captures.json -o output/
```

Apply a default selector and zoom to every job (individual jobs can still override):

```bash
nokap batch tables.json -o images/ -s "table" -z 2 -e 5
```

Zero delay for static HTML files where no lazy-loading or animations occur (speeds up batch processing):

```bash
nokap batch pages.json -o out/ -d 0
```

### Error Handling

Jobs that fail don't stop the batch. Each failure is reported inline and a summary is printed at the end:

```
  [1/4] https://example.com → output/example.png ... OK
  [2/4] https://bad-host.invalid → output/bad.png ... FAIL
       NavigationError: net::ERR_NAME_NOT_RESOLVED
  [3/4] report.html → output/report.pdf ... OK
  [4/4] (html string) → output/title.png ... OK

Done: 3/4 succeeded, 1 failed.
```

The exit code is `1` if any job fails, `0` if all succeed. This makes it easy to use in CI: the pipeline will flag partial failures without aborting mid-batch.
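
If you write your own batch loop in Python, the same continue-on-failure behavior can be reproduced with a few lines. This is a sketch of the pattern, not the CLI's implementation: `run_batch` is a hypothetical helper, and `capture` stands for any callable that takes a job dict and raises on failure.

```python
def run_batch(jobs, capture):
    """Run every job, collecting failures instead of stopping at the first.

    Returns 1 if any job failed and 0 otherwise, mirroring the exit codes
    described above.
    """
    failures = []
    total = len(jobs)
    for index, job in enumerate(jobs, start=1):
        try:
            capture(job)
            print(f"  [{index}/{total}] {job['file']} ... OK")
        except Exception as exc:
            # Record the failure and keep going with the remaining jobs.
            failures.append((job, exc))
            print(f"  [{index}/{total}] {job['file']} ... FAIL")
            print(f"       {type(exc).__name__}: {exc}")
    succeeded = total - len(failures)
    print(f"Done: {succeeded}/{total} succeeded, {len(failures)} failed.")
    return 1 if failures else 0
```

In a script, `sys.exit(run_batch(jobs, capture))` would propagate the result to the calling shell or CI runner.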

### Real-World Example: Documentation Build

A documentation project might maintain a manifest of all screenshots that need regenerating:

```json
[
  {"url": "gt_summary.html", "file": "summary_table.pdf", "selector": "table", "expand": 8},
  {"url": "gt_revenue.html", "file": "revenue_table.pdf", "selector": "table", "expand": 8},
  {"url": "http://localhost:4321/charts", "file": "chart_overview.png", "selector": "#main", "zoom": 2},
  {"url": "http://localhost:4321/dashboard", "file": "dashboard.png", "vwidth": 1440, "delay": 1.0}
]
```

Then in a Makefile:

```makefile
screenshots:
	nokap batch docs/captures.json -o docs/images/ -d 0.5
```

This regenerates all documentation screenshots in one pass, with Chrome staying warm across all four captures.

## Next Steps

The CLI gives you access to all of nokap's capture capabilities without writing Python, which is useful for shell scripts, CI pipelines, Makefiles, and quick one-off captures from the terminal. Every option available in the Python API has a corresponding command-line flag.

- [Integration with Great Tables](20-great-tables-integration.qmd): Automate table screenshots
- [Error Handling](21-error-handling.qmd): Handle errors gracefully

If you're building automation around nokap, the error handling guide will help you write robust scripts that recover gracefully from common failure modes like missing selectors or Chrome startup issues.


## Advanced Usage

# Error Handling

nokap provides a clear exception hierarchy so you can catch and respond to specific failure modes. Every exception inherits from a common `NokapError` base class, which means you can handle all nokap errors with a single except clause or drill down into specific error types for more targeted recovery. This page documents each exception, when it's raised, and how to handle it gracefully in your code.

## Exception Hierarchy

All nokap exceptions inherit from `NokapError`, making it easy to catch any nokap-specific error:

```
NokapError
├── ChromeNotFoundError   # Chrome binary not found
├── ChromeStartError      # Chrome failed to start
├── ConnectionError_      # WebSocket connection failed
├── NavigationError       # Page navigation failed
├── PageLoadTimeout       # Page didn't load in time
├── SelectorError         # CSS selector matched nothing
└── CDPError              # Chrome DevTools Protocol error
```

## Catching All Errors

```{python}
#| eval: false
import nokap
from nokap import NokapError

try:
    nokap.webshot("https://example.com", "out.png")
except NokapError as e:
    print(f"nokap error: {e}")
```

## Specific Error Handling

### ChromeNotFoundError

Raised when nokap cannot locate a Chrome or Chromium binary:

```{python}
#| eval: false
from nokap import ChromeNotFoundError

try:
    nokap.webshot("https://example.com", "out.png")
except ChromeNotFoundError:
    print("Please install Chrome or set CHROME_PATH")
```

**Fix:** Install Chrome/Chromium, or set the `CHROME_PATH` environment variable.

### ChromeStartError

Raised when Chrome is found but fails to start (e.g., missing shared libraries on Linux):

```{python}
#| eval: false
from nokap import ChromeStartError

try:
    nokap.webshot("https://example.com", "out.png")
except ChromeStartError as e:
    print(f"Chrome failed to start: {e}")
    print(f"Chrome stderr: {e.stderr}")
```

**Fix:** Check that Chrome runs from the command line. On Linux CI, you may need `--no-sandbox` (nokap sets this automatically).

### NavigationError

Raised when page navigation fails (e.g., DNS resolution failure, invalid URL):

```{python}
#| eval: false
from nokap import NavigationError

try:
    nokap.webshot("https://nonexistent.invalid", "out.png")
except NavigationError as e:
    print(f"Failed to navigate to: {e.url}")
    print(f"Reason: {e.reason}")
```

### PageLoadTimeout

Raised when a page doesn't finish loading within the timeout period:

```{python}
#| eval: false
from nokap import PageLoadTimeout

try:
    nokap.webshot("https://slow-site.example.com", "out.png")
except PageLoadTimeout as e:
    print(f"Timed out after {e.timeout}s loading: {e.url}")
```

### SelectorError

Raised when a CSS selector matches no elements on the page:

```{python}
#| eval: false
from nokap import SelectorError

try:
    nokap.webshot("page.html", "out.png", selector="#missing")
except SelectorError as e:
    print(f"Selector not found: {e.selector}")
```

**Fix:** Check that the selector matches an element in the loaded page. Use browser DevTools to verify.

### CDPError

Raised when Chrome returns an error response to a CDP command:

```{python}
#| eval: false
from nokap import CDPError

try:
    nokap.webshot("https://example.com", "out.png")
except CDPError as e:
    print(f"CDP protocol error: {e}")
```
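
The **Fix** notes in this section can be collected into a lookup so that a script prints an actionable hint next to the error. The sketch below is illustrative: `FIX_HINTS` and `fix_hint` are hypothetical names, and the lookup goes by class name so the snippet does not need nokap imported.

```python
# Hints taken from the "Fix" notes in this section, keyed by exception name.
FIX_HINTS = {
    "ChromeNotFoundError": (
        "Install Chrome/Chromium, or set the CHROME_PATH environment variable."
    ),
    "ChromeStartError": "Check that Chrome runs from the command line.",
    "SelectorError": (
        "Check that the selector matches an element in the loaded page."
    ),
}


def fix_hint(exc):
    """Return a suggested fix for an exception, or None if there is none."""
    # Walk the class hierarchy so subclasses inherit their parent's hint.
    for cls in type(exc).__mro__:
        hint = FIX_HINTS.get(cls.__name__)
        if hint is not None:
            return hint
    return None
```

Inside an `except NokapError as e:` block, you might print `fix_hint(e)` after the error message whenever it is not `None`.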

## Recovery Pattern

nokap automatically recovers from stale connections. If Chrome crashes between calls, the next `webshot()` or `from_html()` call will relaunch Chrome transparently:

```{python}
#| eval: false
import nokap

# First call launches Chrome
nokap.webshot("https://example.com", "first.png")

# Simulate Chrome dying
nokap.close()

# Next call relaunches automatically
nokap.webshot("https://example.com", "second.png")  # Works fine
```

## Next Steps

With a clear understanding of nokap's exception hierarchy, you can write capture code that fails gracefully, whether that means retrying on a timeout, logging a missing selector, or alerting when Chrome can't be found. The automatic recovery behavior means most transient failures resolve themselves without intervention.

- [Performance Tips](22-performance-tips.qmd): Optimize for high-throughput capture

Now that you know how to handle errors, the performance guide will help you maximize throughput when processing many pages or running in batch pipelines.


# Performance Tips

Optimize nokap for batch processing and high-throughput screenshot capture. nokap is designed to be fast by default: it reuses a single Chrome instance across calls and communicates over a low-overhead WebSocket connection. There are additional strategies, though, that can dramatically improve throughput when processing many pages. This page covers browser reuse, batch workflows, viewport tuning, and other techniques for getting the most out of nokap.

## Browser Reuse

nokap keeps Chrome running between calls via a module-level singleton. The first call pays the startup cost (~200ms); subsequent calls skip browser launch entirely:

```{python}
#| eval: false
import nokap

# First call: launches Chrome (~200ms overhead)
nokap.webshot("page1.html", "out1.png")

# Second call: reuses Chrome (just navigation + capture)
nokap.webshot("page2.html", "out2.png")
```

**Tip:** Don't call `nokap.close()` between sequential captures.

## Batch Processing

For many screenshots, keep Chrome alive across the entire batch:

```{python}
#| eval: false
import nokap
from pathlib import Path

html_files = Path("reports/").glob("*.html")

for html_file in html_files:
    out = html_file.with_suffix(".png")
    nokap.webshot(html_file, out, selector="table", zoom=2)

# Clean up once at the end
nokap.close()
```

## Minimize Delay

The default `delay=0.2` adds 200ms per capture. For static HTML content (no animations, no lazy loading), set it to zero:

```{python}
#| eval: false
nokap.from_html(html, "out.png", selector="table", delay=0)
```

## Viewport Optimization

A smaller viewport means less content to render. If you're capturing a small element with a selector, a narrow viewport is fine:

```{python}
#| eval: false
# Only capturing a table so no need for a wide viewport
nokap.from_html(html, "table.png", selector="table", vwidth=600, vheight=400)
```

## Choose the Right Format

| Format | Speed | File Size | Use When |
|--------|-------|-----------|----------|
| PNG | Fast | Large | Need lossless quality or transparency |
| JPEG | Fast | Small | Photos, complex backgrounds |
| WebP | Fast | Smallest | Modern pipelines that support WebP |
| PDF | Fast | Varies | Need vector/print output |

## Connection Recovery

If Chrome becomes unresponsive, nokap automatically restarts it on the next call. This adds one-time latency but requires no manual intervention:

```{python}
#| eval: false
# If Chrome crashed, this transparently relaunches
nokap.webshot("https://example.com", "recovery.png")
```

## Memory Considerations

Each `webshot()` or `from_html()` call creates and closes a browser tab. This prevents memory buildup from accumulated page state. For very long batch jobs, Chrome's memory usage stays stable.

## Timing Summary

| Operation | Typical Duration |
|-----------|-----------------|
| Chrome launch (first call) | ~200ms |
| Tab creation + navigation | ~50ms |
| Page load (local HTML) | ~20ms |
| Screenshot capture | ~30ms |
| PDF generation | ~50ms |
| Tab cleanup | ~10ms |

For local HTML content with `delay=0`, expect ~100ms per capture after the initial Chrome launch.
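
The figures in the table can be combined into a back-of-the-envelope estimate for a whole batch. The sketch below simply adds up the typical durations; the helper name is hypothetical, and real timings will vary with hardware and page complexity.

```python
# Typical durations in milliseconds, taken from the timing table.
CHROME_LAUNCH_MS = 200
TAB_AND_NAVIGATION_MS = 50
PAGE_LOAD_LOCAL_MS = 20
SCREENSHOT_MS = 30
PDF_MS = 50
TAB_CLEANUP_MS = 10


def estimate_batch_seconds(n_captures, delay=0.2, pdf=False):
    """Rough wall-clock estimate for n sequential captures of local HTML."""
    capture_ms = PDF_MS if pdf else SCREENSHOT_MS
    per_capture_ms = (
        TAB_AND_NAVIGATION_MS
        + PAGE_LOAD_LOCAL_MS
        + capture_ms
        + TAB_CLEANUP_MS
        + delay * 1000  # the configured delay is paid on every capture
    )
    # Chrome is launched once; every capture after that reuses it.
    return (CHROME_LAUNCH_MS + n_captures * per_capture_ms) / 1000
```

By this estimate, 100 PNG captures take roughly 11 seconds with `delay=0` and roughly 31 seconds with the default `delay=0.2`, which shows how much of the total the delay can account for.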

## Next Steps

With browser reuse, minimal delay, and the right format choices, nokap can process hundreds of captures efficiently. The key insight is that Chrome stays warm between calls: you only pay the startup cost once, and each subsequent capture is just a tab open, navigate, capture, and close cycle.

You now have all the tools to build high-throughput capture pipelines. Combine what you've learned here with the error handling patterns from the previous page to build robust batch workflows that recover from transient failures without losing progress.


# AI Agent Integration

nokap ships with first-class support for AI coding agents through two complementary systems: an **MCP server** that lets agents capture screenshots and PDFs directly, and **agent skills** that give agents structured knowledge about how to use nokap effectively in code they write for you.

## MCP Server

The [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) server exposes nokap's capture capabilities as tools that AI agents can call directly. This means an agent can take a screenshot, generate a PDF, or verify system readiness without you needing to write any code.

### Installation

Install nokap with the MCP extra:

```bash
pip install "nokap[mcp]"
```

### Available Tools

| Tool | Purpose |
|------|---------|
| `screenshot_url` | Capture a screenshot from a URL or local HTML file |
| `screenshot_html` | Capture a screenshot from a raw HTML string |
| `generate_pdf` | Generate a full-page or element-bounded PDF |
| `doctor` | Verify Chrome is installed and captures work |

### Connecting to the Server

#### Claude Desktop / Claude Code

Add to your MCP configuration (`claude_desktop_config.json` or project `.mcp.json`):

```json
{
  "mcpServers": {
    "nokap": {
      "command": "python",
      "args": ["-m", "mcp.run", "nokap.mcp:server"]
    }
  }
}
```

#### VS Code (GitHub Copilot)

Add to `.vscode/mcp.json` in your project:

```json
{
  "servers": {
    "nokap": {
      "command": "python",
      "args": ["-m", "mcp.run", "nokap.mcp:server"]
    }
  }
}
```
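
If the configuration file already lists other servers, the nokap entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge using only the standard library; `add_nokap_server` is a hypothetical helper, and the server entry is copied verbatim from the snippets above.

```python
import json
from pathlib import Path

# Server entry copied from the configuration snippets above.
NOKAP_ENTRY = {"command": "python", "args": ["-m", "mcp.run", "nokap.mcp:server"]}


def add_nokap_server(config_path: Path, top_level_key: str) -> dict:
    """Merge the nokap entry into a JSON config, keeping other servers.

    top_level_key is "mcpServers" for the Claude configs and "servers"
    for .vscode/mcp.json, matching the two snippets above.
    """
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault(top_level_key, {})["nokap"] = dict(NOKAP_ENTRY)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_nokap_server(Path(".vscode/mcp.json"), "servers")` would update a VS Code project, and `add_nokap_server(Path(".mcp.json"), "mcpServers")` a Claude Code project. The sketch assumes that an existing file contains valid JSON.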

### Tool Reference

#### `screenshot_url`

Captures a web page or local HTML file as an image:

```
screenshot_url(
  url: str,              # URL or local file path (required)
  file: str,             # Output path (default: "screenshot.png")
  selector: str,         # CSS selector to capture specific element
  vwidth: int,           # Viewport width (default: 992)
  vheight: int,          # Viewport height (default: 744)
  zoom: float,           # Scale factor for HiDPI (default: 1)
  expand: int,           # Padding around selector in pixels (default: 0)
  delay: float,          # Seconds to wait after load (default: 0.2)
)
```
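
Agents normally invoke this tool through their MCP client, but it can help to see what travels over the wire. The sketch below builds a JSON-RPC 2.0 `tools/call` request in the shape defined by the MCP specification; `tool_call_request` is a hypothetical helper, the argument names come from the signature above, and the argument values are made up for illustration.

```python
import json


def tool_call_request(request_id: int, tool: str, **arguments) -> str:
    """Build an MCP `tools/call` JSON-RPC 2.0 request as a JSON string."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


# Capture only the <h1> element at 2x scale (illustrative values).
request = tool_call_request(
    1,
    "screenshot_url",
    url="https://example.com",
    file="example.png",
    selector="h1",
    zoom=2,
)
```

Only `url` is required; any parameter left out of `arguments` falls back to the default listed in the signature.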

#### `screenshot_html`

Renders raw HTML to an image, which is ideal for capturing output from packages like Great Tables or Plotly:

```
screenshot_html(
  html: str,             # HTML content to render (required)
  file: str,             # Output path (default: "screenshot.png")
  selector: str,         # CSS selector (default: "html")
  vwidth: int,           # Viewport width (default: 992)
  vheight: int,          # Viewport height (default: 744)
  zoom: float,           # Scale factor (default: 1)
  expand: int,           # Padding in pixels (default: 0)
  delay: float,          # Wait after load (default: 0.2)
)
```

#### `generate_pdf`

Produces PDFs in two modes: full-page (standard paper) or element-bounded (sized to content):

```
generate_pdf(
  source: str,           # URL, file path, or raw HTML (required)
  file: str,             # Output path (default: "output.pdf")
  selector: str,         # Element selector for bounded PDF
  page_size: str,        # Paper size: letter, a4, legal, etc.
  landscape: bool,       # Landscape orientation (default: false)
  margins: float,        # Margins in inches (default: 0.5)
  print_background: bool,# Print CSS backgrounds (default: false)
  expand: int,           # Padding for element-bounded PDF (default: 0)
  vwidth: int,           # Viewport width (default: 992)
  delay: float,          # Wait after load (default: 0.2)
)
```
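
Which of the two modes applies depends on whether `selector` is supplied. As a rough illustration, the sketch below classifies a set of `generate_pdf` arguments; `pdf_mode` is a hypothetical helper, and treating `expand` without a selector as an error is this sketch's own choice, based only on `expand` being documented as padding for element-bounded PDFs.

```python
def pdf_mode(arguments: dict) -> str:
    """Classify a generate_pdf argument set by the mode it selects."""
    if arguments.get("selector"):
        return "element-bounded"
    if arguments.get("expand"):
        # expand is documented only as padding for element-bounded PDFs,
        # so setting it without a selector is probably a mistake.
        raise ValueError("expand is set but no selector was given")
    return "full-page"


# Illustrative argument sets for the two modes.
report = {"source": "report.html", "file": "report.pdf", "page_size": "a4", "landscape": True}
table = {"source": "report.html", "file": "table.pdf", "selector": "table", "expand": 10}
```

Here `pdf_mode(report)` yields `"full-page"` and `pdf_mode(table)` yields `"element-bounded"`.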

#### `doctor`

Checks system readiness. Takes no parameters. It returns the Chrome path and version, and runs a test capture to confirm everything works.

### Prompts

The MCP server also provides prompt templates for common workflows:

| Prompt | Purpose |
|--------|---------|
| `capture_strategy` | Get recommendations on format, selector, and settings for a specific use case |
| `batch_capture` | Generate a batch-capture script tailored to your data source |

### Resources

| URI | Description |
|-----|-------------|
| `nokap://capabilities` | JSON summary of supported formats, paper sizes, and features |

## Agent Skills

nokap publishes [agent skills](https://agentskills.io/), which are structured context files that tell AI coding agents how to use the package correctly. Skills are complementary to the MCP server: while MCP lets agents *call* nokap tools directly, skills help agents write *better nokap code* for you.

### Installing Skills

If you use nokap in a project and want your AI agent to have context about it:
 
```bash
# Auto-detect your agent and install
great-docs skill install nokap

# Or target a specific agent
great-docs skill install nokap --agent copilot
great-docs skill install nokap --agent claude

# Or use the standard protocol
npx skills add https://posit-dev.github.io/nokap/
```

### Available Skills

nokap provides four focused skills, each covering a different aspect of the package:

| Skill | Focus | When it activates |
|-------|-------|-------------------|
| `nokap` | Primary overview: decision table, API summary, gotchas | Any nokap usage |
| `nokap-web-capture` | Capturing from live URLs: viewports, timing, selectors | Web page screenshots |
| `nokap-html-capture` | Rendering HTML strings: Great Tables, Plotly integration | `from_html()` usage |
| `nokap-pdf` | PDF generation: paper sizes, element-bounded PDFs | PDF output |

### What Skills Contain

Each skill is a concise cheat sheet that encodes:

- **Decision tables**: mapping common tasks to the right function and parameters
- **Gotchas**: mistakes agents make without guidance (e.g., `zoom` is ignored for PDFs)
- **Code examples**: patterns for the most common workflows
- **Boundaries**: what nokap can and cannot do

### Multi-Skill Mode

The four skills are designed to work together. An agent connected to a project with all four installed can quickly determine:

1. Which function to call (`webshot` vs `from_html`)
2. What format to use (PNG vs PDF)
3. How to configure selectors, zoom, and timing
4. What gotchas to avoid
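
The first, second, and fourth decisions can be sketched in a few lines of Python. This is an illustration of the reasoning the skills encode, not code shipped with nokap: `plan_capture` is a hypothetical helper, and lowercasing the extension before matching is this sketch's own choice. The extension-to-format mapping follows the `webshot()` documentation.

```python
from pathlib import Path

# Image extensions listed in the webshot() documentation.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def plan_capture(source_is_html: bool, file: str, zoom: float = 1) -> dict:
    """Pick the function and format for a capture and flag known gotchas."""
    suffix = Path(file).suffix.lower()
    if suffix == ".pdf":
        output_format = "pdf"
    elif suffix in IMAGE_EXTENSIONS:
        output_format = "image"
    else:
        raise ValueError(f"unsupported output extension: {suffix!r}")

    warnings = []
    if output_format == "pdf" and zoom != 1:
        warnings.append("zoom is ignored for PDF output")

    return {
        # HTML strings go to from_html(); URLs and file paths go to webshot().
        "function": "from_html" if source_is_html else "webshot",
        "format": output_format,
        "warnings": warnings,
    }
```

For instance, `plan_capture(True, "table.pdf", zoom=2)` selects `from_html`, reports the `pdf` format, and warns that `zoom` will have no effect.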

## Using Both Together

The MCP server and skills serve different purposes and work best in combination:

| Scenario | Best approach |
|----------|--------------|
| Agent needs to take a screenshot right now | MCP tool: `screenshot_url` |
| Agent is writing a script that captures images | Skill: guides correct API usage |
| Agent needs to verify Chrome works | MCP tool: `doctor` |
| Agent is deciding between PNG and PDF | Skill: decision tables + MCP prompt: `capture_strategy` |
| Agent needs to generate batch capture code | MCP prompt: `batch_capture` + skill: code patterns |

The MCP server handles *execution* while skills handle *knowledge*. Together they give agents everything they need to work with nokap effectively.