import textwrap
from yaml12 import parse_yaml
first_example_text = textwrap.dedent(
    """\
    title: A Modern YAML parser written in Rust
    properties: [correct, safe, fast, simple]
    score: 9.5
    categories:
      - yaml
      - python
      - example
    settings:
      note: >
        This is a folded block
        that turns line breaks
        into spaces.
      note_literal: |
        This is a literal block
        that keeps
        line breaks.
    """
)
YAML in 2 Minutes
Here’s a short introduction to YAML for Python users. YAML is a data serialization format designed to be easy for humans to read and write.
Think of YAML as “JSON with comments and nicer multiline strings.” yaml12 parses YAML 1.2 (the modern specification that removes some of YAML 1.1’s surprising eager conversions) into plain Python objects.
YAML has three building blocks: scalars (single values), sequences (ordered collections), and mappings (key/value pairs). JSON is a subset of YAML 1.2, so all valid JSON is also valid YAML and parses the same way.
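As a small illustration of the JSON-subset point, the sketch below parses a JSON document with Python's standard `json` module. The document text is made-up sample data, and no YAML parser is called here; going by the statement above, feeding the same text to `parse_yaml` should give an equal value.

```python
import json

# Illustrative JSON text; by the subset rule it is also a YAML 1.2 document.
json_text = (
    '{"title": "example", "tags": ["yaml", "python"], '
    '"score": 9.5, "draft": false, "owner": null}'
)

# What a JSON parser returns for it, as plain Python objects.
value = json.loads(json_text)

expected = {
    "title": "example",
    "tags": ["yaml", "python"],
    "score": 9.5,
    "draft": False,
    "owner": None,
}
assert value == expected
```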
A first example
title: A Modern YAML parser written in Rust
properties: [correct, safe, fast, simple]
score: 9.5
categories:
  - yaml
  - python
  - example
settings:
  note: >
    This is a folded block
    that turns line breaks
    into spaces.
  note_literal: |
    This is a literal block
    that keeps
    line breaks.
Let’s parse that with yaml12:
doc = parse_yaml(first_example_text)
assert doc == {
"title": "A Modern YAML parser written in Rust",
"properties": ["correct", "safe", "fast", "simple"],
"score": 9.5,
"categories": ["yaml", "python", "example"],
"settings": {
"note": "This is a folded block that turns line breaks into spaces.\n",
"note_literal": "This is a literal block\nthat keeps\nline breaks.\n",
},
}
Collections
There are two collection types: sequences and mappings.
Sequences: YAML’s ordered collections
A sequence is a list of items. Each item begins with - at the parent indent.
- cat
- dog
→ ["cat", "dog"]
Sequences become lists in Python.
JSON-style arrays work too:
[cat, dog]
→ same result
Anything belonging to one of the sequence entries is indented at least one space past the dash:
- name: cat
  toys: [string, box]
- name: dog
  toys: [ball, bone]
parses to:
[
{"name": "cat", "toys": ["string", "box"]},
{"name": "dog", "toys": ["ball", "bone"]},
]
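Because the result is an ordinary list of dicts, regular Python idioms apply to it. The sketch below writes the documented result out by hand rather than calling the parser; the variable names are only illustrative.

```python
# The documented parse result, written out by hand (no parser call here).
pets = [
    {"name": "cat", "toys": ["string", "box"]},
    {"name": "dog", "toys": ["ball", "bone"]},
]

# Sequences are lists, so comprehensions and indexing work as usual.
names = [pet["name"] for pet in pets]

# Mappings are dicts, so entries can be re-keyed by any field.
toys_by_name = {pet["name"]: pet["toys"] for pet in pets}
first_dog_toy = toys_by_name["dog"][0]
```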
Mappings: key/value pairs
A mapping is a set of key: value pairs at the same indent:
foo: 1
bar: true
→ {"foo": 1, "bar": True}
Mappings become dicts in Python.
A key at its indent owns anything indented more:
settings:
  debug: true
  max_items: 3
parses to {"settings": {"debug": True, "max_items": 3}}.
JSON-style objects work too:
{a: true}
→ {"a": True}
Scalars
All nodes that are not collections are scalars; these are the leaf values of a YAML document.
Scalars can come in three forms: block, quoted, or plain.
Block scalars
| starts a literal block that keeps newlines; > starts a folded block that joins lines with spaces (blank lines and more-indented lines keep their line breaks). Block scalars always become strings.
|
  hello
  world
→ "hello\nworld\n"
>
  hello
  world
→ "hello world\n"
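To make the difference between the two block styles concrete, here is a minimal sketch of both rules in plain Python. The function names are hypothetical and this is not how yaml12 implements them; it assumes the block's content lines are already de-indented, with no leading or trailing blank lines and no more-indented lines.

```python
def literal_block(lines):
    """Literal style (|): keep every line break, end with one newline."""
    return "\n".join(lines) + "\n"


def folded_block(lines):
    """Folded style (>): join adjacent text lines with a space.

    Simplified sketch: a run of N blank lines between two text lines
    becomes N newlines; more-indented lines are not handled.
    """
    parts = []
    blank_run = 0
    for line in lines:
        if line == "":
            blank_run += 1
            continue
        if parts:
            # Blank lines keep their breaks; otherwise fold to one space.
            parts.append("\n" * blank_run if blank_run else " ")
        parts.append(line)
        blank_run = 0
    return "".join(parts) + "\n"


literal_result = literal_block(["hello", "world"])
folded_result = folded_block(["hello", "world"])
```

With the two-line input above, `literal_result` matches the `|` example and `folded_result` matches the `>` example.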
Quoted scalars
Quoted scalars always become strings. Double quotes interpret escapes (\n, \t, \\, \"). Single quotes are literal and do not interpret escapes, except for '' which is parsed as a single '.
["line\nbreak", "quote: \"here\""]
→ ["line\nbreak", 'quote: "here"']
['line\nbreak', 'quote: ''here''']
→ ["line\\nbreak", "quote: 'here'"]
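The two quoting rules can be sketched in a few lines of Python. The helpers below are hypothetical and cover only the escapes listed above (`\n`, `\t`, `\\`, `\"`); the full YAML escape set is larger, and this is not yaml12's implementation.

```python
# Escapes named in the text above; anything else is left untouched here.
_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def unquote_single(token):
    """Single quotes: no escapes, except '' which stands for one '."""
    return token[1:-1].replace("''", "'")


def unquote_double(token):
    """Double quotes: interpret the backslash escapes listed above."""
    body = token[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Unknown escapes are passed through unchanged in this sketch.
            out.append(_ESCAPES.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Raw strings keep the backslashes exactly as they appear in the YAML text.
double_result = unquote_double(r'"line\nbreak"')
single_result = unquote_single(r"'line\nbreak'")
```

Here `double_result` contains a real newline, while `single_result` still contains a backslash followed by `n`.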
Plain (unquoted) scalars
Plain (unquoted) nodes can resolve to one of five types: string, int, float, bool, or null.
- true/false -> True/False
- null, ~, or empty -> None
- numbers: signed, decimal, scientific, hex (0x), octal (0o), .inf, .nan -> int or float
- everything else stays a string (yes, no, on, off and other aliases remain strings in YAML 1.2)
[true, 123, 4.5e2, 0x10, .inf, yes]
→ [True, 123, 450.0, 16, float("inf"), "yes"]
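The resolution rules listed above can be approximated with a handful of regular expressions. This is a minimal sketch of those rules only, under the assumption that just the lowercase spellings shown in the list are recognized; `resolve_plain` is a hypothetical helper, not yaml12's resolver.

```python
import math
import re

_INT = re.compile(r"[-+]?[0-9]+")
_HEX = re.compile(r"0x[0-9a-fA-F]+")
_OCT = re.compile(r"0o[0-7]+")
_FLOAT = re.compile(r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?")
_INF = re.compile(r"[-+]?\.inf")
_NAN = re.compile(r"\.nan")


def resolve_plain(text):
    """Map the text of a plain scalar to bool, None, int, float, or str."""
    if text == "true":
        return True
    if text == "false":
        return False
    if text in ("", "~", "null"):
        return None
    if _INT.fullmatch(text):
        return int(text)
    if _HEX.fullmatch(text):
        return int(text, 16)  # int() accepts the 0x prefix with base 16
    if _OCT.fullmatch(text):
        return int(text, 8)  # int() accepts the 0o prefix with base 8
    if _FLOAT.fullmatch(text):
        return float(text)
    if _INF.fullmatch(text):
        return -math.inf if text.startswith("-") else math.inf
    if _NAN.fullmatch(text):
        return math.nan
    # Everything else, including yes/no/on/off, stays a string.
    return text


resolved = [resolve_plain(t) for t in ["true", "123", "4.5e2", "0x10", ".inf", "yes"]]
```

The integer pattern is tried before the float pattern because a bare run of digits would match both.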
End-to-end example
doc:
  pets:
    - cat
    - dog
  numbers: [1, 2.5, 0x10, .inf, null]
  integers: [1, 2, 3, 0x10, null]
  flags: {enabled: true, label: on}
  literal: |
    hello
    world
  folded: >
    hello
    world
  quoted:
    - "line\nbreak"
    - 'quote: ''here'''
  plain: [yes, no]
  mixed: [won't simplify, 123, true]
Python result (parse_yaml() with defaults):
end_to_end_text = textwrap.dedent(
    """\
    doc:
      pets:
        - cat
        - dog
      numbers: [1, 2.5, 0x10, .inf, null]
      integers: [1, 2, 3, 0x10, null]
      flags: {enabled: true, label: on}
      literal: |
        hello
        world
      folded: >
        hello
        world
      quoted:
        - "line\\nbreak"
        - 'quote: ''here'''
      plain: [yes, no]
      mixed: [won't simplify, 123, true]
    """
)
parsed = parse_yaml(end_to_end_text)
assert parsed == {
"doc": {
"pets": ["cat", "dog"],
"numbers": [1, 2.5, 16, float("inf"), None],
"integers": [1, 2, 3, 16, None],
"flags": {"enabled": True, "label": "on"},
"literal": "hello\nworld\n",
"folded": "hello world\n",
"quoted": ["line\nbreak", "quote: 'here'"],
"plain": ["yes", "no"],
"mixed": ["won't simplify", 123, True],
}
}
Quick notes
- Indentation defines structure for collections. Sibling elements share an indent; children are indented more. YAML 1.2 forbids tabs; use spaces.
- All JSON is valid YAML.
- Sequences stay Python lists; there is no vector “simplification.”
- Block scalars (|, >) always produce strings.
- Booleans are only true/false; null maps to None.
- Numbers can be signed, scientific, hex (0x), octal (0o), .inf, and .nan.
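Since tabs are not allowed for indentation, a quick pre-check can save a confusing parse error. The helper below is a hypothetical sketch, not part of yaml12: it reports the 1-based numbers of lines whose leading whitespace contains a tab.

```python
def lines_with_tab_indent(text):
    """Return 1-based line numbers whose indentation contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever precedes the first non-blank char.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


# Illustrative input: the third line is indented with a tab.
sample = "settings:\n  debug: true\n\tmax_items: 3\n"
tab_lines = lines_with_tab_indent(sample)
```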
These essentials cover most YAML you’ll run into in practice. If you encounter tags, anchors, or non-string mapping keys, the advanced guide walks through those in detail.
Comments
Comments start with # and run to the end of the line. They must be separated from values by whitespace and can sit on their own line or at line ends. yaml12 ignores them.
→ {"title": "example", "items": ["a", "b"]}
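The whitespace rule for comments can be sketched as follows. `strip_comment` is a hypothetical helper that works on one line at a time and deliberately ignores quoted and block scalars, where a `#` is ordinary text; a real parser such as yaml12 handles those cases, so treat this only as an illustration of the rule.

```python
def strip_comment(line):
    """Drop a trailing comment from one line of YAML text.

    Simplified sketch: a # starts a comment only at the start of the
    line or right after a space or tab. Quoted and block scalars are
    not considered.
    """
    for i, ch in enumerate(line):
        if ch == "#" and (i == 0 or line[i - 1] in " \t"):
            return line[:i].rstrip()
    return line


# Illustrative lines; the last one has no whitespace before the #.
inline = strip_comment("title: example  # inline comment")
whole_line = strip_comment("# a comment on its own line")
not_a_comment = strip_comment("url: http://example.com/#frag")
```

The last call returns its input unchanged because the `#` there is not preceded by whitespace.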