markdown_to_xml()

Parse CommonMark/GFM and render as XML.

Usage

markdown_to_xml(
    text,
    *,
    hardbreaks=False,
    smart=False,
    normalize=False,
    sourcepos=False,
    unsafe=False,
    footnotes=False,
    extensions=(),
    options=0
)

Produces an XML representation of the parsed abstract syntax tree (AST). Each Markdown construct is represented as a named element (e.g., <paragraph>, <strong>, <code_block>), with text content preserved in <text> elements.

The XML output conforms to the CommonMark DTD and includes an XML declaration and DOCTYPE. This format is useful for debugging, AST inspection, or piping into XML-based toolchains.

Parameters

text: str

The Markdown string to parse and render. Must be a Python str.

hardbreaks: bool = False

Render soft line breaks as hard breaks in the AST. By default, this is False.

smart: bool = False

Convert straight quotes to curly quotes, -- to en-dashes, --- to em-dashes, and ... to ellipses. By default, this is False.

normalize: bool = False

Consolidate adjacent text nodes in the parsed AST. By default, this is False.

sourcepos: bool = False

Include sourcepos attributes on elements indicating their line/column positions in the original Markdown source. By default, this is False.

unsafe: bool = False

Allow raw HTML to pass through. By default, this is False.

footnotes: bool = False

Enable footnote syntax parsing. By default, this is False.

extensions: Sequence[str] = ()

A sequence of GFM extension names to enable. Valid names are "table", "strikethrough", "autolink", "tagfilter", and "tasklist".

options: int = 0

An integer bitmask of Options flags (e.g., Options.SMART | Options.UNSAFE). Merged via OR with any boolean keyword arguments set to True. Defaults to 0.

Keyword Arguments vs. Options Flags

The boolean keyword arguments (smart, unsafe, hardbreaks, etc.) are convenience shortcuts for the most common Options flags. When both styles are provided, they are merged via OR. See markdown_to_html() for a detailed explanation.
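The OR-merge described above can be sketched without the library. This is only an illustration: DemoOptions, its bit values, and merge_options() are hypothetical stand-ins, not the library's actual Options class or internals.

```python
from enum import IntFlag


class DemoOptions(IntFlag):
    # Hypothetical flags and bit values, for illustration only.
    # The library's real Options values may differ.
    HARDBREAKS = 1
    SMART = 2
    UNSAFE = 4


def merge_options(options=0, *, hardbreaks=False, smart=False, unsafe=False):
    """OR the flag for every keyword that is True into the bitmask."""
    merged = DemoOptions(options)
    if hardbreaks:
        merged |= DemoOptions.HARDBREAKS
    if smart:
        merged |= DemoOptions.SMART
    if unsafe:
        merged |= DemoOptions.UNSAFE
    return merged


# A flag passed in the bitmask and a keyword set to True end up in one mask.
combined = merge_options(DemoOptions.SMART, unsafe=True)
print(combined == (DemoOptions.SMART | DemoOptions.UNSAFE))
```

Because OR is idempotent, passing the same setting both ways (for example smart=True together with a SMART flag) gives the same mask as passing it once.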

Returns

str

The rendered XML string, including the XML declaration and DOCTYPE.

Examples

Inspect the AST structure of a simple document:

from multimark import markdown_to_xml

print(markdown_to_xml("**hello**"))

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <strong>
      <text xml:space="preserve">hello</text>
    </strong>
  </paragraph>
</document>
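Because the result is ordinary XML, it can be loaded with a standard parser for AST inspection. The sketch below uses Python's xml.etree.ElementTree on the output shown above, pasted in as a string literal so it runs without the library. The outline() helper is a hypothetical name introduced for this example.

```python
import xml.etree.ElementTree as ET

# XML copied from the "**hello**" example above.
XML_OUTPUT = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <strong>
      <text xml:space="preserve">hello</text>
    </strong>
  </paragraph>
</document>
"""


def outline(xml_text):
    """Return (depth, element name) pairs in document order."""
    # Parse from bytes so the XML encoding declaration is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    rows = []

    def walk(element, depth):
        # ElementTree reports tags as "{namespace}name"; keep only the name.
        rows.append((depth, element.tag.split("}", 1)[-1]))
        for child in element:
            walk(child, depth + 1)

    walk(root, 0)
    return rows


for depth, name in outline(XML_OUTPUT):
    print("  " * depth + name)
```

Every element lives in the http://commonmark.org/xml/1.0 namespace, so lookups such as find() or iter() need the namespaced tag (or a namespace map) rather than the bare element name.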

Add source position information:

markdown_to_xml("# Title\n\nBody text\n", sourcepos=True)

'<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE document SYSTEM "CommonMark.dtd">\n<document sourcepos="1:1-3:9" xmlns="http://commonmark.org/xml/1.0">\n  <heading sourcepos="1:1-1:7" level="1">\n    <text sourcepos="1:3-1:7" xml:space="preserve">Title</text>\n  </heading>\n  <paragraph sourcepos="3:1-3:9">\n    <text sourcepos="3:1-3:9" xml:space="preserve">Body text</text>\n  </paragraph>\n</document>\n'
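Each sourcepos value has the form startline:startcolumn-endline:endcolumn. As a rough sketch of how the attributes could be consumed, the code below parses the output shown above (pasted in as a literal) and converts each attribute into integer pairs. parse_sourcepos() and positions() are hypothetical helpers, not part of the library.

```python
import xml.etree.ElementTree as ET

# XML copied from the sourcepos=True example above.
XML_OUTPUT = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE document SYSTEM "CommonMark.dtd">\n'
    '<document sourcepos="1:1-3:9" xmlns="http://commonmark.org/xml/1.0">\n'
    '  <heading sourcepos="1:1-1:7" level="1">\n'
    '    <text sourcepos="1:3-1:7" xml:space="preserve">Title</text>\n'
    '  </heading>\n'
    '  <paragraph sourcepos="3:1-3:9">\n'
    '    <text sourcepos="3:1-3:9" xml:space="preserve">Body text</text>\n'
    '  </paragraph>\n'
    '</document>\n'
)


def parse_sourcepos(value):
    """Split 'L1:C1-L2:C2' into ((L1, C1), (L2, C2))."""
    start, end = value.split("-")
    return tuple(tuple(int(n) for n in part.split(":")) for part in (start, end))


def positions(xml_text):
    """Return (element name, parsed sourcepos) for every element that has one."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    result = []
    for element in root.iter():
        raw = element.get("sourcepos")
        if raw is None:
            continue  # Skip elements without a sourcepos attribute.
        name = element.tag.split("}", 1)[-1]
        result.append((name, parse_sourcepos(raw)))
    return result


for name, (start, end) in positions(XML_OUTPUT):
    print(name, start, end)
```

In this sample the heading spans 1:1-1:7 while its text child starts at column 3, after the "# " marker.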