A Markdown document with source tracking.
document.MarkdownDocument()
MarkdownDocument extends Document with markdown-specific semantics for content that has a known origin (e.g., a URL or file path). Recording the origin supports citation and provenance tracking in RAG applications.
Parameter Attributes
content: str
-
origin: Optional[str] = None
-
attributes: Optional[dict[str, Any]] = None
-
Examples
from raghilda.document import MarkdownDocument
# Create from content directly
doc = MarkdownDocument(
    content="# Hello World\n\nThis is a test document.",
    origin="https://example.com/hello.md",
)
print(f"Document from: {doc.origin}")
print(f"Content length: {len(doc.content)} characters")
Document from: https://example.com/hello.md
Content length: 39 characters
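To illustrate the citation use case mentioned above, here is a minimal sketch of turning origins into numbered source lines. It does not import raghilda: `SketchDocument` and `cite` are hypothetical names introduced for illustration, and the stand-in class only mirrors the three documented attributes.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchDocument:
    """Hypothetical stand-in mirroring the documented attributes; not raghilda's class."""
    content: str
    origin: Optional[str] = None
    attributes: Optional[dict[str, Any]] = None


def cite(docs: list[SketchDocument]) -> list[str]:
    """Build one numbered source line per document, in the order given."""
    lines = []
    for number, doc in enumerate(docs, start=1):
        # Documents without an origin still get a line, marked as unknown.
        source = doc.origin if doc.origin is not None else "unknown source"
        lines.append(f"[{number}] {source}")
    return lines


retrieved = [
    SketchDocument("# Hello World", origin="https://example.com/hello.md"),
    SketchDocument("# Notes"),
]
citations = cite(retrieved)
print("\n".join(citations))
```

With a real MarkdownDocument, the same idea should apply by reading `doc.origin` in place of the stand-in's field.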
Methods
| Name | Description |
| --- | --- |
| from_any() | Convert any document-like or IntoDocument object to a MarkdownDocument. |
| to_chunked() | Return a ChunkedMarkdownDocument with the same fields and chunks. |
from_any()
Convert any document-like or IntoDocument object to a MarkdownDocument.
from_any(doc, origin=None)
This conversion only accepts unchunked inputs. If the source object already carries chunks, use ChunkedMarkdownDocument.from_any() instead.
Parameters
doc: Union[DocumentLike, IntoDocument]
-
An object that implements the DocumentLike protocol or has a to_document() method.
origin: Optional[str] = None
-
Optional origin to set if the source object doesn’t have one.
Returns
MarkdownDocument
-
A raghilda MarkdownDocument instance.
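The behaviour described above (convert via to_document() when available, reject inputs that already carry chunks, and use `origin` only as a fallback) can be sketched as follows. This is an illustration under stated assumptions, not raghilda's implementation: `SketchMarkdownDocument`, `sketch_from_any`, and `Page` are hypothetical names, and detecting chunked inputs through a `chunks` attribute is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchMarkdownDocument:
    """Hypothetical stand-in; not raghilda's MarkdownDocument."""
    content: str
    origin: Optional[str] = None
    attributes: Optional[dict[str, Any]] = None


def sketch_from_any(doc: Any, origin: Optional[str] = None) -> SketchMarkdownDocument:
    # Objects exposing to_document() are converted first (the IntoDocument case).
    if hasattr(doc, "to_document"):
        doc = doc.to_document()
    # Assumption: a chunked input is recognisable by a non-empty `chunks` attribute.
    if getattr(doc, "chunks", None):
        raise ValueError("input already carries chunks; use the chunked conversion")
    # Keep the source's own origin; fall back to the argument only if it has none.
    existing = getattr(doc, "origin", None)
    return SketchMarkdownDocument(
        content=doc.content,
        origin=existing if existing is not None else origin,
        attributes=getattr(doc, "attributes", None),
    )


class Page:
    """Hypothetical document-like object with content and an optional origin."""

    def __init__(self, content: str, origin: Optional[str] = None):
        self.content = content
        self.origin = origin


with_origin = sketch_from_any(Page("# A", origin="a.md"), origin="fallback.md")
without_origin = sketch_from_any(Page("# B"), origin="fallback.md")
print(with_origin.origin)
print(without_origin.origin)
```

In this sketch the first call keeps `a.md` because the source object already has an origin, while the second call picks up `fallback.md`.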
to_chunked()
Return a ChunkedMarkdownDocument with the same fields and chunks.