Output Formats

This page explores the five output formats that multimark can produce from a single Markdown input, showing what each format looks like and when to reach for it.

Five Renderers, One Parser

All five output renderers share the same underlying parser. The Markdown is parsed once into an abstract syntax tree (AST), and then a format-specific renderer walks the tree to produce output. This means that parsing behavior (extensions, options) is consistent regardless of which renderer you choose.

The renderers are markdown_to_html, markdown_to_latex, markdown_to_man, markdown_to_xml, and markdown_to_commonmark.

Each is a standalone function that you call with a Markdown string and optional parameters.
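
Because the five functions share one naming pattern, a caller can select a renderer by format name. The following is a minimal sketch of that idea, not part of multimark: get_renderer and stand_in are hypothetical names introduced here, and the stand-in object exists only so the snippet runs without the library installed.

```python
from types import SimpleNamespace

# Format names taken from the function names used on this page.
FORMATS = ("html", "latex", "man", "xml", "commonmark")


def get_renderer(module, fmt):
    """Return module.markdown_to_<fmt>, rejecting unknown format names."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    return getattr(module, f"markdown_to_{fmt}")


# Hypothetical stand-in for the real module; it does NOT parse Markdown.
stand_in = SimpleNamespace(
    markdown_to_html=lambda text, **options: f"<p>{text.strip()}</p>\n",
)

render = get_renderer(stand_in, "html")
rendered = render("Hello\n")
```

With the library installed, passing the imported multimark module in place of stand_in should hand back the real function under the same lookup.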

HTML

HTML is the most common output format. It produces well-formed fragments suitable for embedding in web pages, documentation systems, or email templates.

from multimark import markdown_to_html

doc = """\
## Features

Multimark is **fast** and supports:

1. CommonMark
2. GFM extensions
3. Multiple output formats
"""

print(markdown_to_html(doc))

<h2>Features</h2>
<p>Multimark is <strong>fast</strong> and supports:</p>
<ol>
<li>CommonMark</li>
<li>GFM extensions</li>
<li>Multiple output formats</li>
</ol>

The HTML renderer supports one parameter that the others do not: sourcepos. When enabled, each block-level element receives a data-sourcepos attribute giving its position in the original Markdown source in the form start_line:start_column-end_line:end_column. This is useful for building editors or tools that map rendered output back to source positions.

print(markdown_to_html("# Title\n\nParagraph\n", sourcepos=True))

<h1 data-sourcepos="1:1-1:7">Title</h1>
<p data-sourcepos="3:1-3:9">Paragraph</p>
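
To show how a tool might map rendered output back to source lines, the sketch below reads the data-sourcepos attributes with the standard library's HTMLParser. SourcePosCollector and tag_at_line are hypothetical helpers written for this illustration, and the HTML is the output shown above pasted in as a literal rather than produced by calling multimark.

```python
from html.parser import HTMLParser


class SourcePosCollector(HTMLParser):
    """Record (tag, start_line, start_col, end_line, end_col) per element."""

    def __init__(self):
        super().__init__()
        self.positions = []

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get("data-sourcepos")
        if value is None:
            return  # element carries no source position
        start, end = value.split("-")
        start_line, start_col = (int(n) for n in start.split(":"))
        end_line, end_col = (int(n) for n in end.split(":"))
        self.positions.append((tag, start_line, start_col, end_line, end_col))


def tag_at_line(positions, line):
    """Return the first recorded tag whose line range covers `line`, else None."""
    for tag, start_line, _, end_line, _ in positions:
        if start_line <= line <= end_line:
            return tag
    return None


# The rendered output from the example above, as a literal string.
rendered_html = (
    '<h1 data-sourcepos="1:1-1:7">Title</h1>\n'
    '<p data-sourcepos="3:1-3:9">Paragraph</p>\n'
)
collector = SourcePosCollector()
collector.feed(rendered_html)
collector.close()
```

Here tag_at_line(collector.positions, 3) finds the paragraph, while line 2 (the blank line between the two blocks) maps to no element.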

LaTeX

The LaTeX renderer produces fragments suitable for inclusion in a .tex document. Headings map to sectioning commands, inline formatting maps to text commands, and code blocks use verbatim environments.

from multimark import markdown_to_latex

doc = """\
# Introduction

This is **bold** and *italic* with `inline code`.

> A block quote.
"""

print(markdown_to_latex(doc))

\section{Introduction}

This is \textbf{bold} and \emph{italic} with \texttt{inline code}.

\begin{quote}
A block quote.

\end{quote}

Line Wrapping

The width parameter controls line wrapping in the LaTeX output. A value of 0 (the default) disables wrapping entirely. Any positive value wraps lines at approximately that column.

long_para = "This is a sentence that demonstrates line wrapping. " * 5
print(markdown_to_latex(long_para, width=60))

This is a sentence that demonstrates line wrapping. This is
a sentence that demonstrates line wrapping. This is a
sentence that demonstrates line wrapping. This is a sentence
that demonstrates line wrapping. This is a sentence that
demonstrates line wrapping.

Wrapping makes the LaTeX source more readable and produces cleaner diffs in version control.

Man Pages

The groff man page renderer produces output compatible with the man command and related roff-based toolchains. Formatting uses font-switching escapes (such as \f[B]) and standard man macros (such as .SS and .PP).

from multimark import markdown_to_man

doc = """\
## Synopsis

**multimark** - Python bindings for cmark-gfm

## Description

Converts *Markdown* to multiple output formats.
"""

print(markdown_to_man(doc))

.SS
Synopsis
.PP
\f[B]multimark\f[] \- Python bindings for cmark\-gfm
.SS
Description
.PP
Converts \f[I]Markdown\f[] to multiple output formats.

Like LaTeX, the man renderer supports width for line wrapping. Setting this to 72 or 78 matches common man page formatting conventions.

XML

The XML renderer produces a structured representation of the parsed AST. Every Markdown construct becomes a named XML element, making this format ideal for debugging, testing, or feeding into XML-based toolchains.

from multimark import markdown_to_xml

simple = "A **bold** word.\n"
print(markdown_to_xml(simple))

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text xml:space="preserve">A </text>
    <strong>
      <text xml:space="preserve">bold</text>
    </strong>
    <text xml:space="preserve"> word.</text>
  </paragraph>
</document>

The output includes an XML declaration and DOCTYPE reference. Each element directly corresponds to an AST node type: <paragraph>, <strong>, <text>, <softbreak>, and so on.
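
Since the output is ordinary namespaced XML, standard tooling can query it. The sketch below parses the output shown above (pasted in as a literal, not generated by calling multimark) with xml.etree.ElementTree. The variable names and the cm prefix are arbitrary choices for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <document> element in the output above.
NS = {"cm": "http://commonmark.org/xml/1.0"}

xml_output = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text xml:space="preserve">A </text>
    <strong>
      <text xml:space="preserve">bold</text>
    </strong>
    <text xml:space="preserve"> word.</text>
  </paragraph>
</document>
"""

# Parse from bytes so the encoding declaration is honored.
root = ET.fromstring(xml_output.encode("utf-8"))

# Node types in document order, with the namespace part of each tag stripped.
node_types = [element.tag.split("}", 1)[1] for element in root.iter()]

# Text of every <text> node that sits directly inside a <strong> node.
strong_text = [node.text for node in root.findall(".//cm:strong/cm:text", NS)]
```

Every element lives in the CommonMark namespace, so path queries need the prefix mapping; an unprefixed search for strong would match nothing.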

Inspecting Document Structure

The XML format is particularly valuable when you need to understand exactly how the parser interprets a given input. Ambiguous constructs or edge cases become immediately clear in the AST representation.

# How does the parser handle nested emphasis?
tricky = "***bold and italic***\n"
print(markdown_to_xml(tricky))

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <emph>
      <strong>
        <text xml:space="preserve">bold and italic</text>
      </strong>
    </emph>
  </paragraph>
</document>

CommonMark (Normalization)

The CommonMark renderer parses Markdown and re-renders it in a canonical form. Different but semantically equivalent inputs produce identical output after normalization.

from multimark import markdown_to_commonmark

# These three inputs all mean the same thing
inputs = [
    "_hello_ **world**\n",
    "*hello* __world__\n",
    "*hello* **world**\n",
]

for md in inputs:
    print(repr(markdown_to_commonmark(md)))

'*hello* **world**\n'
'*hello* **world**\n'
'*hello* **world**\n'

All three inputs normalize to the same canonical form. This is useful for enforcing consistent formatting across a document or codebase, and for testing that transformations preserve semantics.
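
One way to put this to work is to group inputs by their canonical form, so that equivalent variants end up in the same bucket. The sketch below takes the normalizer as a parameter; group_by_canonical and stand_in_normalize are hypothetical names, and the stand-in only rewrites underscore emphasis markers so the snippet runs without multimark. In real use, markdown_to_commonmark would be passed instead.

```python
from collections import defaultdict


def group_by_canonical(inputs, normalize):
    """Map each canonical form to the original inputs that produce it."""
    groups = defaultdict(list)
    for text in inputs:
        groups[normalize(text)].append(text)
    return dict(groups)


def stand_in_normalize(text):
    """Hypothetical stand-in: rewrites underscore emphasis markers only."""
    return text.replace("__", "**").replace("_", "*")


inputs = [
    "_hello_ **world**\n",
    "*hello* __world__\n",
    "*hello* **world**\n",
    "*goodbye*\n",
]
groups = group_by_canonical(inputs, stand_in_normalize)
```

The three equivalent spellings share one key, and the unrelated fourth input gets its own.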

Round-Trip Stability

A key property of the CommonMark renderer is round-trip stability: normalizing an already-normalized document produces the same output. In other words, normalization is idempotent, so running it repeatedly is safe.

original = "- First\n- Second\n- Third\n"
once = markdown_to_commonmark(original)
twice = markdown_to_commonmark(once)

print(f"Stable: {once == twice}")
print(once)

Stable: True
  - First
  - Second
  - Third
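
The same check can be run over a whole set of samples, for instance in a test suite. The sketch below is a hedged illustration: unstable_inputs is a hypothetical helper, and the two stand-in normalizers are invented here only to show a passing and a failing case without requiring multimark. In practice markdown_to_commonmark would be passed as the normalize argument.

```python
def unstable_inputs(samples, normalize):
    """Return the samples for which a second normalization pass changes the text."""
    failures = []
    for text in samples:
        once = normalize(text)
        if normalize(once) != once:
            failures.append(text)
    return failures


def stable_stand_in(text):
    """Hypothetical idempotent normalizer: trims and ends with one newline."""
    return text.strip() + "\n"


def unstable_stand_in(text):
    """Hypothetical non-idempotent normalizer: indents further on every pass."""
    return "  " + text


samples = ["- First\n- Second\n", "# Title\n\n\n"]
stable_failures = unstable_inputs(samples, stable_stand_in)
unstable_failures = unstable_inputs(samples, unstable_stand_in)
```

An empty result means every sample reached a fixed point after one pass.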

Choosing a Format

The right renderer depends on your downstream pipeline. Here is a quick decision guide.

When to use each format

Use HTML when you are producing web content, documentation, or any output that will be displayed in a browser.

Use LaTeX when you are generating input for pdflatex, xelatex, or any TeX-based typesetting system.

Use Man when you are writing Unix manual pages for command-line tools.

Use XML when you need to inspect the AST, run XPath queries, or feed structured data into an XML pipeline.

Use CommonMark when you want to normalize formatting, enforce style consistency, or verify round-trip stability of your Markdown processing pipeline.