store.BaseStore

Abstract base class for vector stores.

Usage

store.BaseStore()

A store holds documents and their embeddings and retrieves relevant chunks by similarity search.

Subclasses must implement all abstract methods to provide a concrete storage backend. The following implementations are listed:

raghilda.store.DuckDBStore: local storage with embedding and BM25 search.
raghilda.store.ChromaDBStore: local storage using ChromaDB.
raghilda.store.OpenAIStore: hosted storage using OpenAI’s Vector Store API.
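
To illustrate what implementing the abstract methods involves, here is a minimal sketch. It does not import raghilda: SketchBaseStore and InMemoryStore are hypothetical stand-ins that mirror only some of the method names listed on this page, and the dict-based document shape keyed by "origin" is an assumption.

```python
from abc import ABC, abstractmethod


class SketchBaseStore(ABC):
    """Hypothetical stand-in for the abstract interface; not raghilda's class."""

    @abstractmethod
    def upsert(self, document): ...

    @abstractmethod
    def retrieve(self, text, top_k): ...

    @abstractmethod
    def size(self): ...


class InMemoryStore(SketchBaseStore):
    """Toy backend keeping documents in a dict keyed by origin (assumed key)."""

    def __init__(self):
        self._docs = {}

    def upsert(self, document):
        # Insert a new document or replace the one with the same origin.
        self._docs[document["origin"]] = document

    def retrieve(self, text, top_k):
        # Naive substring match standing in for real similarity search.
        needle = text.lower()
        hits = [d for d in self._docs.values() if needle in d["content"].lower()]
        return hits[:top_k]

    def size(self):
        # Counts documents, not chunks.
        return len(self._docs)


store = InMemoryStore()
store.upsert({"origin": "a.md", "content": "Vector stores hold embeddings."})
store.upsert({"origin": "a.md", "content": "Vector stores hold embeddings and text."})
store.upsert({"origin": "b.md", "content": "BM25 is a ranking function."})
```

Because the second call reuses the origin "a.md", the toy store ends up with two documents. Instantiating SketchBaseStore directly raises TypeError, which is how Python enforces that every abstract method is implemented.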

Methods

Name	Description
connect()	Connect to an existing store.
create()	Create a new store.
ingest()	Prepare and upsert a stream of documents.
retrieve()	Retrieve the most similar chunks to the given text.
size()	Count the number of documents in the store.
upsert()	Upsert a document into the store.

connect()

Connect to an existing store.

Usage

connect(*args, **kwargs)

Returns

BaseStore: A connected store instance.

create()

Create a new store.

Usage

create(*args, **kwargs)

Returns

BaseStore: A newly created store instance.

ingest()

Prepare and upsert a stream of documents.

Usage

ingest(documents, *, prepare=None, max_workers=1)

Documents are consumed lazily from the input stream and submitted incrementally. After prepare is applied, recent origins are checked for duplicates as the stream is consumed; only non-empty string origins are considered. Duplicate detection is best effort: when a duplicate is encountered, ValueError is raised once any writes already in flight have completed. No rollback is attempted, so documents written before the error remain in the store.

Returns

IngestSummary: Aggregate counts for inserted, replaced, and skipped documents. Call upsert() directly when per-document WriteResult values are needed.
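
The streaming behaviour described above can be sketched as follows. This is not raghilda's implementation: ingest_sketch, its upsert callback, the window size, and the dict document shape are all assumptions, and the sketch is single-threaded, so it ignores max_workers and in-flight writes.

```python
from collections import deque


def ingest_sketch(documents, upsert, *, prepare=None, window=1000):
    """Consume documents lazily, raising ValueError on a recently seen origin.

    `upsert`, `window`, and the dict document shape are assumptions made
    for this sketch only.
    """
    # A bounded window keeps memory flat, which is why detection is best effort:
    # a duplicate further back than `window` documents goes unnoticed.
    recent = deque(maxlen=window)
    written = 0
    for doc in documents:  # pulls one item at a time from the stream
        if prepare is not None:
            doc = prepare(doc)
        origin = doc.get("origin")
        if isinstance(origin, str) and origin:
            if origin in recent:
                # Earlier writes stay in place: no rollback.
                raise ValueError(f"duplicate origin: {origin!r}")
            recent.append(origin)
        upsert(doc)
        written += 1
    return written


stored = []
docs = iter(
    [{"origin": "a.md"}, {"origin": "b.md"}, {"origin": "a.md"}, {"origin": "c.md"}]
)
try:
    ingest_sketch(docs, stored.append)
    error = None
except ValueError as exc:
    error = str(exc)
```

In this run the first two documents are written before the duplicate "a.md" stops the stream, and the fourth document is never pulled from the iterator.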

retrieve()

Retrieve the most similar chunks to the given text.

Usage

retrieve(text, top_k, *args, **kwargs)

Parameters

text: str: The query text to search for.
top_k: int: The maximum number of chunks to return.

Returns

Sequence[RetrievedChunk]: The most similar chunks, ordered by relevance.
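
As a rough illustration of what "most similar, ordered by relevance" can mean, the sketch below ranks chunks by cosine similarity and keeps at most top_k of them. The functions cosine and top_k_chunks, the two-dimensional vectors, and the (text, vector) chunk shape are assumptions for this example; each backend may score and rank differently.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec, chunks, top_k):
    """Return up to top_k (score, text) pairs, highest score first."""
    scored = ((cosine(query_vec, vec), text) for text, vec in chunks)
    return heapq.nlargest(top_k, scored, key=lambda pair: pair[0])


# Toy embeddings; a real store would compute these with an embedding model.
chunks = [
    ("about cats", [1.0, 0.0]),
    ("about dogs", [0.0, 1.0]),
    ("cats and dogs", [1.0, 1.0]),
]
results = top_k_chunks([1.0, 0.1], chunks, top_k=2)
```

Note that top_k is an upper bound: asking for more chunks than exist simply returns all of them.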

size()

Count the number of documents in the store.

Usage

size()

Returns

int: The number of documents (not chunks) in the store.

upsert()

Upsert a document into the store.

Usage

upsert(document, *, skip_if_unchanged=True)

Insert or replace a document in the store.

Parameters

document: Document: The document to upsert.
skip_if_unchanged: bool = True: If True (default), skip the write when the existing document for the same identity key already has identical content and chunk metadata. This helps avoid unnecessary embedding work.
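
One way to picture the skip_if_unchanged check is a fingerprint comparison, sketched below. This is an assumption about the general technique, not raghilda's code: fingerprint, upsert_sketch, the dict-backed store, the use of "origin" as the identity key, and the outcome strings are all hypothetical.

```python
import hashlib


def fingerprint(document):
    """Hash the content and chunk texts (assumed document shape)."""
    h = hashlib.sha256()
    h.update(document["content"].encode("utf-8"))
    for chunk in document["chunks"]:
        h.update(b"\x00")  # separator so chunk boundaries affect the hash
        h.update(chunk.encode("utf-8"))
    return h.hexdigest()


def upsert_sketch(store, document, *, skip_if_unchanged=True):
    """Return 'inserted', 'replaced', or 'skipped' for a dict-backed store."""
    key = document["origin"]  # assumed identity key
    new_fp = fingerprint(document)
    existing = store.get(key)
    if existing is not None and skip_if_unchanged and existing["fp"] == new_fp:
        return "skipped"  # identical content, so no re-embedding is needed
    store[key] = {"fp": new_fp, "document": document}
    return "inserted" if existing is None else "replaced"


store = {}
doc = {"origin": "a.md", "content": "hello", "chunks": ["hello"]}
outcomes = [
    upsert_sketch(store, doc),
    upsert_sketch(store, doc),
    upsert_sketch(store, doc, skip_if_unchanged=False),
    upsert_sketch(store, {**doc, "content": "hello!", "chunks": ["hello!"]}),
]
```

The second call is skipped because nothing changed, the third forces a rewrite, and the fourth replaces the document because its content differs.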