Developers

Daniel Falbel (Author, Posit)
Tomasz Kalinowski (Author, Posit)
Posit Software, PBC (Copyright holder, funder)

Meta

Requires: Python >=3.11, <3.14
Provides-Extra: test, examples, chromadb, sentence-transformers, postgres

raghilda

RAG made simple.

raghilda is a Python package for implementing Retrieval-Augmented Generation (RAG) workflows. It provides a complete solution with sensible defaults while remaining transparent rather than acting as a black box.

Installation

pip install raghilda

Or install from GitHub:

pip install git+https://github.com/posit-dev/raghilda.git

Key Steps

raghilda handles the complete RAG pipeline:

  1. Document Processing — Convert documents to Markdown using MarkItDown
  2. Text Chunking — Split text at semantic boundaries (headings, paragraphs, sentences)
  3. Embedding — Generate vector representations via OpenAI or other providers
  4. Storage — Store chunks and embeddings in DuckDB, ChromaDB, or OpenAI Vector Stores
  5. Retrieval — Find relevant chunks using similarity search or BM25
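To make step 2 concrete, here is a minimal sketch of chunking at semantic boundaries, written in plain Python. It is not raghilda's implementation: the function name `chunk_markdown` and the `max_chars` parameter are hypothetical, and sentence-level splitting is left out for brevity. It splits at headings first and falls back to paragraph boundaries only when a section is too long.

```python
import re


def chunk_markdown(text: str, max_chars: int = 200) -> list[str]:
    """Split Markdown into chunks, preferring heading, then paragraph boundaries.

    Hypothetical helper for illustration only; not part of raghilda.
    """
    # Split just before every ATX heading line ("# ", "## ", ...), so each
    # section starts a new piece. Heading-like lines inside code fences are
    # not treated specially in this sketch.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks: list[str] = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_chars:
            chunks.append(section)
            continue
        # Section is too long: pack paragraphs (separated by blank lines)
        # greedily into chunks of at most max_chars characters.
        current = ""
        for para in re.split(r"\n\s*\n", section):
            para = para.strip()
            if not para:
                continue
            candidate = f"{current}\n\n{para}" if current else para
            # A single oversized paragraph is kept whole rather than cut mid-text.
            if len(candidate) <= max_chars or not current:
                current = candidate
            else:
                chunks.append(current)
                current = para
        if current:
            chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "# Intro\n\nShort intro.\n\n# Details\n\nFirst paragraph here.\n\nSecond paragraph here."
    for piece in chunk_markdown(sample, max_chars=40):
        print(repr(piece))
```

Splitting at headings first keeps each chunk tied to a single topic, which tends to make retrieved chunks easier to interpret than fixed-width windows.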

Usage

from raghilda.store import DuckDBStore
from raghilda.embedding import EmbeddingOpenAI
from raghilda.scrape import find_links
from raghilda.read import read_as_markdown
from raghilda.chunker import MarkdownChunker

# Create a store with embeddings
store = DuckDBStore.create(
    location="chatlas.db",
    embed=EmbeddingOpenAI(),
)

# Find and index pages from the chatlas documentation
links = find_links("https://posit-dev.github.io/chatlas/")
chunker = MarkdownChunker()

for link in links:
    document = read_as_markdown(link)
    chunked_document = chunker.chunk(document)
    store.upsert(chunked_document)

# Retrieve relevant chunks
chunks = store.retrieve("How do I stream a response?", top_k=5)
for chunk in chunks:
    print(chunk.text)
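
For intuition about what a `top_k` similarity search returns, the sketch below ranks chunks by cosine similarity between a query vector and each chunk's embedding. It is a conceptual illustration under stated assumptions, not how raghilda or DuckDB perform retrieval: the helper names `cosine_similarity` and `top_k_chunks` are hypothetical, and the two-dimensional vectors are hand-made stand-ins for real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: list[float],
    indexed: list[tuple[str, list[float]]],
    top_k: int = 5,
) -> list[str]:
    """Return the texts of the top_k chunks most similar to the query vector.

    Hypothetical helper: a brute-force scan over (text, embedding) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    # Highest similarity first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


if __name__ == "__main__":
    # Toy 2-D "embeddings", invented for this example.
    indexed = [
        ("Streaming responses", [1.0, 0.0]),
        ("Installation", [0.0, 1.0]),
        ("Stream API reference", [0.9, 0.1]),
    ]
    print(top_k_chunks([1.0, 0.0], indexed, top_k=2))
```

A brute-force scan like this is linear in the number of chunks; a real store would typically rely on the database's own search facilities instead.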