store.BaseStore
Abstract base class for vector stores.
Usage
store.BaseStore()
A store is responsible for storing documents and their embeddings, and for retrieving relevant chunks through similarity search.
Subclasses must implement all abstract methods to provide a concrete storage backend:
- raghilda.store.DuckDBStore: local storage with embedding and BM25 search.
- raghilda.store.ChromaDBStore: local storage using ChromaDB.
- raghilda.store.OpenAIStore: hosted storage using OpenAI’s Vector Store API.
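The subclassing contract can be pictured with a small stand-in. This sketch does not import raghilda: `SketchStore` and `InMemoryStore` are hypothetical names, and only two of the listed methods are modelled, with simplified signatures. It shows how Python's `abc` module refuses to instantiate a class until every abstract method has an implementation.

```python
from abc import ABC, abstractmethod


class SketchStore(ABC):
    """Hypothetical stand-in for an abstract store; not raghilda's actual class."""

    @abstractmethod
    def upsert(self, document: str) -> None:
        """Insert or replace a document."""

    @abstractmethod
    def size(self) -> int:
        """Return the number of stored documents."""


class InMemoryStore(SketchStore):
    """Hypothetical backend that keeps documents in a dict keyed by their text."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def upsert(self, document: str) -> None:
        # Writing the same text twice replaces the entry instead of adding one.
        self._docs[document] = document

    def size(self) -> int:
        return len(self._docs)


def try_instantiate_base() -> bool:
    """Return True if the abstract base can be instantiated (it should not be)."""
    try:
        SketchStore()
    except TypeError:
        return False
    return True
```

A subclass that leaves out any abstract method fails in the same way as the base class, so a missing method surfaces at construction time and not at the first call.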
Methods
| Name | Description |
|---|---|
| connect() | Connect to an existing store. |
| create() | Create a new store. |
| ingest() | Prepare and upsert a stream of documents. |
| retrieve() | Retrieve the most similar chunks to the given text. |
| size() | Count the number of documents in the store. |
| upsert() | Upsert a document into the store. |
connect()
Connect to an existing store.
Usage
connect(*args, **kwargs)
Returns
BaseStore - A connected store instance.
create()
Create a new store.
Usage
create(*args, **kwargs)
Returns
BaseStore - A newly created store instance.
ingest()
Prepare and upsert a stream of documents.
Usage
ingest(documents, *, prepare=None, max_workers=1)
Inputs are consumed lazily and submitted incrementally. After prepare is applied, recent non-empty string origins are checked for duplicates as the stream is consumed. Duplicate detection is best effort: a duplicate raises ValueError when encountered, after any writes already in flight complete. No rollback is attempted.
Returns
IngestSummary - Aggregate counts for inserted, replaced, and skipped documents. Call upsert() directly when per-document WriteResult values are needed.
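The ingest behaviour described above can be approximated in a few lines. This is a single-worker sketch under stated assumptions, not raghilda's implementation: documents are modelled as plain dicts with an `origin` key, the `upsert` callback returns one of the strings `"inserted"`, `"replaced"`, or `"skipped"`, the summary is a plain dict, and the `window` parameter (how many recent origins are remembered) is hypothetical.

```python
from collections import deque
from typing import Callable, Iterable, Optional


def ingest_sketch(
    documents: Iterable[dict],
    upsert: Callable[[dict], str],
    prepare: Optional[Callable[[dict], dict]] = None,
    window: int = 1000,  # assumption: how many recent origins are remembered
) -> dict:
    """Consume documents lazily, upsert each one, and tally the outcomes."""
    recent = deque(maxlen=window)  # bounded, so detection is best effort
    summary = {"inserted": 0, "replaced": 0, "skipped": 0}
    for doc in documents:  # the iterable is pulled one item at a time
        if prepare is not None:
            doc = prepare(doc)
        origin = doc.get("origin")
        if isinstance(origin, str) and origin:  # only non-empty string origins
            if origin in recent:
                # Earlier writes stay in place; nothing is rolled back.
                raise ValueError(f"duplicate origin: {origin!r}")
            recent.append(origin)
        summary[upsert(doc)] += 1
    return summary
```

Because the window is bounded, a duplicate that arrives after its first occurrence has been evicted goes unnoticed, which is one way "best effort" detection can play out. Documents written before the ValueError remain in the store.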
retrieve()
Retrieve the most similar chunks to the given text.
Usage
retrieve(text, top_k, *args, **kwargs)
Parameters
text: str - The query text to search for.
top_k: int - The maximum number of chunks to return.
Returns
Sequence[RetrievedChunk] - The most similar chunks, ordered by relevance.
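As an illustration of what "the most similar chunks, ordered by relevance" can mean, here is a minimal sketch that ranks chunks by cosine similarity between embedding vectors. It is an assumption for illustration only: concrete backends may use other scoring (the DuckDB backend, for example, also lists BM25), and `cosine` and `retrieve_sketch` are hypothetical helpers that take precomputed vectors instead of query text.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_sketch(query_vec, chunks, top_k):
    """Return at most top_k (text, score) pairs, best match first."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Note that `top_k` is an upper bound: when the store holds fewer chunks than requested, the result is simply shorter.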
size()
Count the number of documents in the store.
Usage
size()
Returns
int - The number of documents (not chunks) in the store.
upsert()
Upsert a document into the store.
Usage
upsert(document, *, skip_if_unchanged=True)
Insert or replace a document in the store.
Parameters
document: Document - The document to upsert.
skip_if_unchanged: bool = True - If True (default), skip the write when the existing document for the same identity key already has identical content and chunk metadata. This helps avoid unnecessary embedding work.
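One way to picture the `skip_if_unchanged` check is a fingerprint comparison per identity key. The sketch below is an assumption about how such a check could work, not the library's code: `UpsertSketch`, its explicit `key`, `content`, and `chunk_metadata` arguments, the SHA-256 fingerprint, and the string return values are all hypothetical.

```python
import hashlib
import json


class UpsertSketch:
    """Hypothetical store that tracks one fingerprint per identity key."""

    def __init__(self):
        self._fingerprints = {}
        self.embed_calls = 0  # counts the expensive work we try to avoid

    @staticmethod
    def _fingerprint(content, chunk_metadata):
        # Serialize deterministically so equal inputs hash to the same value.
        payload = json.dumps([content, chunk_metadata], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def upsert(self, key, content, chunk_metadata, *, skip_if_unchanged=True):
        new = self._fingerprint(content, chunk_metadata)
        old = self._fingerprints.get(key)
        if skip_if_unchanged and old == new:
            return "skipped"
        self.embed_calls += 1  # stand-in for chunking and embedding
        self._fingerprints[key] = new
        return "inserted" if old is None else "replaced"
```

Passing `skip_if_unchanged=False` forces the write even when nothing changed, which in this sketch means the embedding counter still increases.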