Reference
Store
Vector storage backends for storing and retrieving chunks
- store.BaseStore: Abstract base class for vector stores.
- store.DuckDBStore: A vector store backed by DuckDB.
- store.ChromaDBStore: A vector store backed by ChromaDB.
- store.OpenAIStore: A vector store backed by OpenAI’s Vector Store API.
- store.PostgreSQLStore: A store backed by a PostgreSQL database with pgvector.
Embedding
Embedding providers for generating vector representations
- embedding.EmbeddingProvider: Interface for embedding function providers.
- embedding.EmbedInputType: Specifies the type of input being embedded.
- embedding.EmbeddingOpenAI: Creates an embedding function provider backed by OpenAI’s embedding models.
- embedding.EmbeddingCohere: Creates an embedding function provider backed by Cohere’s embedding models.
- embedding.EmbeddingSentenceTransformers: Creates an embedding function provider backed by sentence-transformers models.
Chunker
Text chunking utilities for splitting documents
- chunker.BaseChunker: Base class for chunkers.
- chunker.MarkdownChunker: Chunk Markdown documents into overlapping segments at semantic boundaries.
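To illustrate what "overlapping segments at semantic boundaries" can mean in practice, here is a minimal sketch in plain Python. It is not the MarkdownChunker implementation: the function name `chunk_markdown`, its `chunk_size` and `overlap` parameters, and the choice of blank lines and headings as boundaries are all assumptions made for illustration.

```python
import re


def chunk_markdown(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Hypothetical sketch: split text into overlapping segments, cutting at
    blank lines or heading starts when one falls inside the size budget."""
    # Candidate cut points: start and end of text, the position after each
    # run of blank lines, and the start of each Markdown heading.
    boundaries = {0, len(text)}
    for match in re.finditer(r"\n\s*\n|\n(?=#{1,6} )", text):
        boundaries.add(match.end())
    cuts = sorted(boundaries)

    chunks = []
    start = 0
    while start < len(text):
        target = start + chunk_size
        # Prefer the last boundary within the budget; otherwise cut hard.
        candidates = [b for b in cuts if start < b <= target]
        end = candidates[-1] if candidates else min(target, len(text))
        chunks.append(text[start:end])
        if end >= len(text):
            break
        # Step back by `overlap` characters, but always move forward.
        start = max(end - overlap, start + 1)
    return chunks


doc = "# A\n\npara one.\n\n## B\n\npara two."
for piece in chunk_markdown(doc, chunk_size=20, overlap=0):
    print(repr(piece))
```

In this sketch the overlap is measured in characters and the start of each following chunk is not snapped to a boundary; a real chunker may handle both differently.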
Utilities
Utility functions for reading and scraping content
- read.read_as_markdown(): Read a markdown file from a URI and return its content as a string.
- scrape.find_links(): Discover hyperlinks starting from one or many documents and return them as URLs.
Chunk
Chunk data types
- chunk.Chunk: A segment of text extracted from a document.
- chunk.MarkdownChunk: A chunk extracted from a Markdown document.
- chunk.RetrievedChunk: A chunk returned from a retrieval operation with associated metrics.
- chunk.Metric: A named metric value associated with a retrieved chunk.
Document
Document types for unchunked and chunked content
- document.Document: A document containing text content to be chunked and indexed.
- document.ChunkedDocument: A document with an attached sequence of chunks.
- document.MarkdownDocument: A Markdown document with source tracking.
- document.ChunkedMarkdownDocument: A Markdown document with an attached sequence of chunks.
Types
Protocol types for type checking compatibility
- types.ChunkLike: Any chunk-like object (chonkie, raghilda, or custom).
- types.ChunkedDocumentLike: Any chunked document-like object.
- types.DocumentLike: Any document-like object.
- types.ChunkerLike: Any chunker-like object (chonkie, raghilda, or custom).
- types.IntoChunk: Any object that can be converted into a Chunk via to_chunk().
- types.IntoDocument: Any object that can be converted into a Document via to_document().
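The conversion protocols above follow the structural-typing pattern, which can be sketched with Python's `typing.Protocol`. This is an illustration only: the `Chunk` stand-in, the `Note` class, and the `as_chunk` helper are hypothetical, and only the `to_chunk()` method name comes from the listing above. The library's actual definitions may differ.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass
class Chunk:
    """Hypothetical stand-in for chunk.Chunk; the real class may carry more fields."""
    text: str


@runtime_checkable
class IntoChunk(Protocol):
    """Anything exposing to_chunk() satisfies this protocol; no inheritance needed."""

    def to_chunk(self) -> Chunk: ...


@dataclass
class Note:
    """A custom type (hypothetical) that opts in by defining to_chunk()."""
    body: str

    def to_chunk(self) -> Chunk:
        return Chunk(text=self.body)


def as_chunk(obj: object) -> Chunk:
    """Hypothetical helper: accept a Chunk or anything convertible to one."""
    if isinstance(obj, Chunk):
        return obj
    if isinstance(obj, IntoChunk):
        return obj.to_chunk()
    raise TypeError(f"cannot convert {type(obj).__name__} to Chunk")


print(as_chunk(Note("hello")))
```

The appeal of this design is that third-party or user-defined types can interoperate without subclassing anything from the library: defining the conversion method is enough.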