Reference

Store

Vector storage backends for storing and retrieving chunks

store.BaseStore

Abstract base class for vector stores.
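As a rough illustration of what an abstract vector-store base class provides, the sketch below defines a tiny interface and an in-memory implementation. Everything in it is an assumption made for illustration: `SketchStore`, `InMemoryStore`, `insert`, `retrieve`, and `cosine` are hypothetical names and are not taken from this package's API.

```python
import math
from abc import ABC, abstractmethod


class SketchStore(ABC):
    """Hypothetical interface; method names are illustrative only."""

    @abstractmethod
    def insert(self, text: str, vector: list[float]) -> None:
        """Store a piece of text together with its embedding vector."""

    @abstractmethod
    def retrieve(self, query: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        """Return up to top_k (text, score) pairs, best match first."""


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryStore(SketchStore):
    """Toy backend that keeps rows in a Python list."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def insert(self, text: str, vector: list[float]) -> None:
        self._rows.append((text, vector))

    def retrieve(self, query: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        scored = [(text, cosine(query, vec)) for text, vec in self._rows]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


store = InMemoryStore()
store.insert("cats", [1.0, 0.0])
store.insert("dogs", [0.0, 1.0])
hits = store.retrieve([0.9, 0.1], top_k=1)
```

The point of the abstract base is that calling code depends only on the shared methods, so a database-backed store can replace the in-memory one without changes elsewhere.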

store.DuckDBStore

A vector store backed by DuckDB.

store.ChromaDBStore

A vector store backed by ChromaDB.

store.OpenAIStore

A vector store backed by OpenAI’s Vector Store API.

store.PostgreSQLStore

A vector store backed by a PostgreSQL database with pgvector.

Embedding

Embedding providers for generating vector representations

embedding.EmbeddingProvider

Interface for embedding function providers.

embedding.EmbedInputType

Specifies the type of input being embedded.

embedding.EmbeddingOpenAI

Creates an embedding function provider backed by OpenAI’s embedding models.

embedding.EmbeddingCohere

Creates an embedding function provider backed by Cohere’s embedding models.

embedding.EmbeddingSentenceTransformers

Creates an embedding function provider backed by sentence-transformers models.

Chunker

Text chunking utilities for splitting documents

chunker.BaseChunker

Base class for chunkers.

chunker.MarkdownChunker

Chunk Markdown documents into overlapping segments at semantic boundaries.
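To make "overlapping segments at semantic boundaries" concrete, here is a minimal sketch of one possible strategy. It is not the algorithm `MarkdownChunker` uses. It assumes blank lines mark the boundaries, packs whole blocks greedily up to a character budget, and repeats trailing blocks at the start of the next chunk. The function name `chunk_markdown` and its parameters are hypothetical.

```python
def chunk_markdown(text: str, max_chars: int = 200, overlap_blocks: int = 1) -> list[str]:
    """Greedy block packing with block-level overlap (illustrative only)."""
    # Treat blank-line-separated blocks (headings, paragraphs) as boundaries.
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    chunks: list[str] = []
    start = 0
    while start < len(blocks):
        end = start + 1  # always take at least one block, even if oversized
        size = len(blocks[start])
        # Add whole blocks while the chunk stays within the budget.
        while end < len(blocks) and size + 2 + len(blocks[end]) <= max_chars:
            size += 2 + len(blocks[end])
            end += 1
        chunks.append("\n\n".join(blocks[start:end]))
        if end == len(blocks):
            break
        # Step back to repeat trailing blocks, but always advance by one.
        start = max(end - overlap_blocks, start + 1)
    return chunks


doc = "# Title\n\nFirst paragraph.\n\n## Section\n\nSecond paragraph."
chunks = chunk_markdown(doc, max_chars=30)
```

Because cuts fall only between blocks, no heading or paragraph is split mid-sentence, and the repeated block gives each chunk some context from its predecessor.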

Utilities

Utility functions for reading and scraping content

read.read_as_markdown()

Read a Markdown file from a URI and return its content as a string.

scrape.find_links()

Discover hyperlinks starting from one or many documents and return them as URLs.

Chunk

Chunk data types

chunk.Chunk

A segment of text extracted from a document.

chunk.MarkdownChunk

A chunk extracted from a Markdown document.

chunk.RetrievedChunk

A chunk returned from a retrieval operation with associated metrics.

chunk.Metric

A named metric value associated with a retrieved chunk.
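The relationship between a retrieved chunk and its named metrics can be pictured with a small data-structure sketch. The classes below are hypothetical stand-ins, not the package's definitions: the field names `text`, `metrics`, `name`, and `value`, and the helper `metric()`, are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SketchMetric:
    """Hypothetical named score, e.g. a similarity or a keyword-match score."""
    name: str
    value: float


@dataclass
class SketchRetrievedChunk:
    """Hypothetical chunk that carries the metrics produced during retrieval."""
    text: str
    metrics: list[SketchMetric] = field(default_factory=list)

    def metric(self, name: str) -> Optional[float]:
        """Look up a metric by name; return None if it was not recorded."""
        for m in self.metrics:
            if m.name == name:
                return m.value
        return None


hit = SketchRetrievedChunk(
    text="Vector stores index embeddings.",
    metrics=[SketchMetric("similarity", 0.82), SketchMetric("bm25", 4.1)],
)
similarity = hit.metric("similarity")
```

Keeping metrics as named values rather than a single score lets a result carry several signals at once, for example when vector and keyword search are combined.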

Document

Document types for unchunked and chunked content

document.Document

A document containing text content to be chunked and indexed.

document.ChunkedDocument

A document with an attached sequence of chunks.

document.MarkdownDocument

A Markdown document with source tracking.

document.ChunkedMarkdownDocument

A Markdown document with an attached sequence of chunks.

Types

Protocol types for type checking compatibility

types.ChunkLike

Any chunk-like object (chonkie, raghilda, or custom).

types.ChunkedDocumentLike

Any chunked document-like object.

types.DocumentLike

Any document-like object.

types.ChunkerLike

Any chunker-like object (chonkie, raghilda, or custom).

types.IntoChunk

Any object that can be converted into a Chunk via to_chunk().

types.IntoDocument

Any object that can be converted into a Document via to_document().
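The "Like" and "Into" types above are described as protocols, which in Python means structural typing: an object qualifies by having the right methods, not by inheriting from a particular class. The sketch below shows the idea with `typing.Protocol`. The classes `SketchChunk`, `SketchIntoChunk`, and `Note` are hypothetical and are not the package's definitions; only the `to_chunk()` method name comes from the description above.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass
class SketchChunk:
    """Hypothetical stand-in for a chunk type."""
    text: str


@runtime_checkable
class SketchIntoChunk(Protocol):
    """Structural type: anything that exposes a to_chunk() method."""

    def to_chunk(self) -> SketchChunk: ...


@dataclass
class Note:
    """A custom type that never inherits from SketchIntoChunk."""
    body: str

    def to_chunk(self) -> SketchChunk:
        return SketchChunk(text=self.body)


def as_chunk(obj: SketchIntoChunk) -> SketchChunk:
    """Accept any object that satisfies the protocol."""
    return obj.to_chunk()


chunk = as_chunk(Note(body="hello"))
is_into_chunk = isinstance(Note(body="x"), SketchIntoChunk)
```

A static type checker accepts `Note` wherever `SketchIntoChunk` is expected. With `runtime_checkable`, `isinstance` also works, although it only checks that the method exists and does not check its signature.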