Overview

frontmatter extracts and parses structured metadata (YAML or TOML) from the beginning of text documents. Front matter is a common pattern in which metadata is placed at the top of a document, separated from the main content by delimiter fences. It appears in Quarto and R Markdown documents, static site generators, documentation systems, content management tools, and even Python and R scripts.

Features

  • Fast C++ parsing for optimal performance
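  • Multiple formats supported: YAML and TOML front matter, comment-wrapped front matter in R and Python files, and PEP 723 Python metadata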
  • Flexible parser integration - use default parsers or provide your own
  • Graceful handling of invalid front matter

Installation

You can install frontmatter from CRAN with:

install.packages("frontmatter")

To install the latest development version from posit-dev.r-universe.dev:

# install.packages("pak")

pak::repo_add("https://posit-dev.r-universe.dev")
pak::pak("frontmatter")

Or you can install the development version from GitHub:

# install.packages("pak")
pak::pak("posit-dev/frontmatter")

Usage

Basic Usage

---
title: My Document
date: 2024-01-01
tags:
  - tutorial
  - R
---

Document content starts here.

With this text stored in text_yaml:

str(parse_front_matter(text_yaml))
#> List of 2
#>  $ data:List of 3
#>   ..$ title: chr "My Document"
#>   ..$ date : chr "2024-01-01"
#>   ..$ tags : chr [1:2] "tutorial" "R"
#>  $ body: chr "Document content starts here."
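
The call above assumes the document is already held in a single character string named text_yaml. A minimal sketch of one way to build that string and pick fields out of the result follows; the variable name result is illustrative, and the call is guarded so the sketch also runs where frontmatter is not installed:

```r
# Build the example document as one newline-separated string.
text_yaml <- paste(
  c(
    "---",
    "title: My Document",
    "date: 2024-01-01",
    "tags:",
    "  - tutorial",
    "  - R",
    "---",
    "",
    "Document content starts here."
  ),
  collapse = "\n"
)

if (requireNamespace("frontmatter", quietly = TRUE)) {
  result <- frontmatter::parse_front_matter(text_yaml)
  print(result$data$title)  # metadata fields are ordinary list elements
  print(result$body)        # the remaining document text
}
```

Keeping the metadata in a plain list means fields can be read with the usual $ and [[ accessors.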

Parse from File

result <- read_front_matter("document.md")
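
As a self-contained sketch of the file-based route, the following writes a small document to a temporary file and reads it back. It assumes read_front_matter() returns the same data/body list as parse_front_matter(), which is not shown explicitly here; path is an illustrative name.

```r
# Write a small Markdown document with YAML front matter to a temp file.
path <- tempfile(fileext = ".md")
writeLines(
  c("---", "title: My Document", "---", "", "Document content starts here."),
  path
)

if (requireNamespace("frontmatter", quietly = TRUE)) {
  result <- frontmatter::read_front_matter(path)
  str(result)  # assumed to mirror the parse_front_matter() structure
}
```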

TOML Front Matter

+++
title = 'My Document'
count = 42
+++

Content here

With this text stored in text_toml:

str(parse_front_matter(text_toml))
#> List of 2
#>  $ data:List of 2
#>   ..$ title: chr "My Document"
#>   ..$ count: int 42
#>  $ body: chr "Content here"

Comment-Wrapped Formats

For R and Python files, front matter can be wrapped in comments:

# ---
# title: My Analysis
# author: Data Scientist
# ---

library(dplyr)
# Analysis code...

With this text stored in text_r:

str(parse_front_matter(text_r))
#> List of 2
#>  $ data:List of 2
#>   ..$ title : chr "My Analysis"
#>   ..$ author: chr "Data Scientist"
#>  $ body: chr "library(dplyr)\n# Analysis code..."

Roxygen-style comments are also supported:

#' ---
#' title: My Function
#' ---
#'
#' Documentation here

With this text stored in text_roxy:

str(parse_front_matter(text_roxy))
#> List of 2
#>  $ data:List of 1
#>   ..$ title: chr "My Function"
#>  $ body: chr "Documentation here"

PEP 723 Python Metadata

# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "requests<3",
# ]
# ///

import requests

With this text stored in text_py:

str(parse_front_matter(text_py))
#> List of 2
#>  $ data:List of 2
#>   ..$ requires-python: chr ">=3.11"
#>   ..$ dependencies   :List of 1
#>   .. ..$ : chr "requests<3"
#>  $ body: chr "import requests"

Custom Parsers

# Get raw YAML without parsing
str(parse_front_matter(text_yaml, parse_yaml = identity))
#> List of 2
#>  $ data: chr "title: My Document\ndate: 2024-01-01\ntags:\n  - tutorial\n  - R\n"
#>  $ body: chr "Document content starts here."

# Use a custom parser that adds metadata
custom_parser <- function(x) {
  data <- yaml12::parse_yaml(x)
  data$.parsed_with <- "{frontmatter}"
  data
}

str(parse_front_matter(text_yaml, parse_yaml = custom_parser))
#> List of 2
#>  $ data:List of 4
#>   ..$ title       : chr "My Document"
#>   ..$ date        : chr "2024-01-01"
#>   ..$ tags        : chr [1:2] "tutorial" "R"
#>   ..$ .parsed_with: chr "{frontmatter}"
#>  $ body: chr "Document content starts here."

Default Parsers

YAML 1.1 Support

To use YAML 1.1 parsing (via the yaml package) instead of the default YAML 1.2, set either:

  • The R option: options(frontmatter.parse_yaml.spec = "1.1")
  • The environment variable: FRONTMATTER_PARSE_YAML_SPEC=1.1

The option takes precedence over the environment variable.

---
# In YAML 1.1, 'yes' is parsed as TRUE
enabled: yes
---

Content

With this text stored in text_yaml11:

# Default (YAML 1.2): 'yes' is a string
parse_front_matter(text_yaml11)$data
#> $enabled
#> [1] "yes"

# With YAML 1.1: 'yes' is boolean TRUE
rlang::with_options(
  frontmatter.parse_yaml.spec = "1.1",
  parse_front_matter(text_yaml11)$data
)
#> $enabled
#> [1] TRUE
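
The precedence rule stated above (option first, then environment variable, then the YAML 1.2 default) can be pictured with a small base-R sketch. resolve_yaml_spec() is a hypothetical helper written for illustration, not a function exported by frontmatter, and the package's actual lookup may differ in detail:

```r
# Hypothetical helper: decide which YAML spec applies.
resolve_yaml_spec <- function() {
  opt <- getOption("frontmatter.parse_yaml.spec")
  if (!is.null(opt)) {
    return(as.character(opt))  # the R option wins when it is set
  }
  env <- Sys.getenv("FRONTMATTER_PARSE_YAML_SPEC", unset = "")
  if (nzchar(env)) {
    return(env)                # otherwise fall back to the environment variable
  }
  "1.2"                        # default when neither is set
}

# Demonstrate the precedence, then restore the option and clear the variable.
old <- options(frontmatter.parse_yaml.spec = NULL)
Sys.setenv(FRONTMATTER_PARSE_YAML_SPEC = "1.1")
from_env <- resolve_yaml_spec()     # "1.1": only the variable is set

options(frontmatter.parse_yaml.spec = "1.2")
from_option <- resolve_yaml_spec()  # "1.2": the option overrides the variable

options(old)
Sys.unsetenv("FRONTMATTER_PARSE_YAML_SPEC")
```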

Error Handling

For incomplete front matter, such as an opening fence with no closing fence, data is NULL and body is the original content, unchanged:

text <- "---\nNot valid front matter"
str(parse_front_matter(text))
#> List of 2
#>  $ data: NULL
#>  $ body: chr "---\nNot valid front matter"
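
A simplified base-R sketch of the rule behind this result: an opening fence only counts as front matter when a matching closing fence follows. split_yaml_fences() is a hypothetical helper for illustration; it handles only plain --- fences, returns the raw metadata text instead of parsing it, and is not the package's C++ implementation.

```r
# Hypothetical helper: split text on YAML fences without parsing the metadata.
split_yaml_fences <- function(text) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]

  # The opening fence must be the very first line.
  if (length(lines) == 0 || lines[1] != "---") {
    return(list(data = NULL, body = text))
  }

  # Look for a closing fence after the opening one.
  closing <- which(lines[-1] == "---")
  if (length(closing) == 0) {
    # Incomplete front matter: leave the text untouched.
    return(list(data = NULL, body = text))
  }

  end <- closing[1] + 1L  # index of the closing fence within `lines`
  list(
    data = paste(lines[seq_len(end - 1L)[-1]], collapse = "\n"),
    body = paste(lines[-seq_len(end)], collapse = "\n")
  )
}

str(split_yaml_fences("---\nNot valid front matter"))
```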

Invalid front matter is handled by the parsing function. For example, invalid YAML will likely result in an error from the YAML parser. Use a custom parser if you need to handle such cases gracefully.
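
One way to get that graceful behavior is to wrap a parser in tryCatch() so a parse error becomes a warning and NULL metadata. This is a sketch, not a package feature: with_fallback() and safe_yaml are hypothetical names, and how parse_front_matter() treats a NULL returned by a custom parser is an assumption here.

```r
# Hypothetical helper: wrap a parser so that a parse error becomes a warning
# and a NULL result instead of stopping the whole call.
with_fallback <- function(parser) {
  force(parser)
  function(x) {
    tryCatch(
      parser(x),
      error = function(e) {
        warning(
          "Could not parse front matter: ", conditionMessage(e),
          call. = FALSE
        )
        NULL
      }
    )
  }
}

# Guarded so the sketch also runs where the packages are not installed.
if (requireNamespace("frontmatter", quietly = TRUE) &&
    requireNamespace("yaml12", quietly = TRUE)) {
  safe_yaml <- with_fallback(yaml12::parse_yaml)
  bad <- "---\ntitle: [unclosed\n---\nBody text"
  str(frontmatter::parse_front_matter(bad, parse_yaml = safe_yaml))
}
```

Returning NULL keeps a batch job running; a stricter variant could re-throw the error with the file name attached.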

Performance

The package uses C++11 for optimal performance:

  • Single-pass parsing
  • Minimal string copying
  • Efficient fence detection and validation

The package is designed for high-throughput processing of many documents.
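
For the many-documents case, a batch run might look like the sketch below. It assumes read_front_matter() returns a list with a data element, as parse_front_matter() does; the directory, file names, and the docs and titles variables are illustrative.

```r
# Create two small documents in a temporary directory.
dir <- file.path(tempdir(), "frontmatter-demo")
dir.create(dir, showWarnings = FALSE)
writeLines(c("---", "title: First", "---", "", "Body one"), file.path(dir, "a.md"))
writeLines(c("---", "title: Second", "---", "", "Body two"), file.path(dir, "b.md"))

paths <- list.files(dir, pattern = "\\.md$", full.names = TRUE)

# Guarded so the sketch also runs where frontmatter is not installed.
if (requireNamespace("frontmatter", quietly = TRUE)) {
  docs <- lapply(paths, frontmatter::read_front_matter)
  # Pull one metadata field out of every document.
  titles <- vapply(docs, function(doc) doc$data$title, character(1))
  print(data.frame(file = basename(paths), title = titles))
}
```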

Acknowledgments

This package was inspired by the simplematter JavaScript package.

Thanks also to Yihui Xie for the implementation in xfun::yaml_body().