MarkdownChunkerStrategy
Markdown-aware chunking strategy that respects heading boundaries.
Usage
MarkdownChunkerStrategy()
Parameters
max_chunk_size: int = 1000
Maximum size of each chunk in characters.
overlap_size: int = 200
Number of overlapping characters between chunks.
preserve_code_blocks: bool = True
Whether to keep code blocks intact.
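As a rough illustration of how max_chunk_size and overlap_size relate, the sketch below implements a plain sliding character window. It is not the library's implementation: sliding_chunks is a hypothetical helper written for this example, and it ignores heading boundaries and preserve_code_blocks entirely.

```python
def sliding_chunks(text: str, max_chunk_size: int = 1000, overlap_size: int = 200) -> list[str]:
    """Hypothetical helper: fixed-size character windows that share overlap_size characters."""
    if overlap_size >= max_chunk_size:
        raise ValueError("overlap_size must be smaller than max_chunk_size")

    # Each new window starts this many characters after the previous one.
    step = max_chunk_size - overlap_size
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + max_chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(2000))
windows = sliding_chunks(sample)
# The tail of one window is repeated at the head of the next.
print(len(windows), windows[0][-200:] == windows[1][:200])
```

With the default values, each window advances by 800 characters, so consecutive chunks repeat 200 characters of context.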
Methods
| Name | Description |
|---|---|
| merge_undersized_fragments() | Merge chunks that are too small to stand alone. |
| split_by_heading_hierarchy() | Split content by markdown heading hierarchy. |
merge_undersized_fragments()
Merge chunks that are too small to stand alone.
Usage
merge_undersized_fragments(chunks)
split_by_heading_hierarchy()
Split content by markdown heading hierarchy.
Usage
split_by_heading_hierarchy(content)
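The two methods suggest a split-then-merge pipeline. The sketch below shows one way such a pipeline could work, under stated assumptions: split_by_headings and merge_small are hypothetical stand-ins rather than the library's methods, the min_size threshold is invented for the example (the documented signature shows no such parameter), and fence tracking is deliberately simplistic.

```python
import re

# ATX headings only ("# Title" through "###### Title"); Setext headings are not handled.
HEADING_RE = re.compile(r"^#{1,6}\s+\S")


def split_by_headings(content: str) -> list[str]:
    """Hypothetical stand-in: start a new chunk at each heading outside fenced code."""
    chunks, current, in_fence = [], [], False
    for line in content.splitlines():
        # Naive fence tracking: any fence marker toggles the state.
        if line.lstrip().startswith(("```", "~~~")):
            in_fence = not in_fence
        if not in_fence and HEADING_RE.match(line) and any(l.strip() for l in current):
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if any(l.strip() for l in current):
        chunks.append("\n".join(current))
    return chunks


def merge_small(chunks: list[str], min_size: int = 40) -> list[str]:
    """Hypothetical stand-in: fold chunks shorter than min_size into a neighbor."""
    merged = []
    for chunk in chunks:
        if merged and len(merged[-1]) < min_size:
            # The previous chunk is too small, so append this one to it.
            merged[-1] = merged[-1] + "\n" + chunk
        else:
            merged.append(chunk)
    # A small trailing chunk has no successor, so fold it backwards.
    if len(merged) > 1 and len(merged[-1]) < min_size:
        last = merged.pop()
        merged[-1] = merged[-1] + "\n" + last
    return merged


doc = (
    "# Title\n\nIntro paragraph with enough text to stand alone as a chunk.\n\n"
    "## Setup\n\n~~~bash\n# not a heading\npip install example\n~~~\n\n"
    "## Note\n\nTiny."
)
pieces = split_by_headings(doc)
print(len(pieces), len(merge_small(pieces)))
```

The comment line inside the fenced block starts with `#` but does not trigger a split, and the short final section is folded into the section before it.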