Formatter

Air is first and foremost a formatter of R code. On this page, you’ll find details about what a formatter is, why you’d want to use one, and you’ll learn about how Air makes decisions on how to format your R code.

What’s a formatter?

A formatter is in charge of the layout of your R code. Formatters do not change the meaning of code; instead they ensure that whitespace, newlines, and other punctuation conform to a set of rules and standards, such as:

  • Making sure your code is indented with the appropriate amount of leading whitespace depending on the context. By default, Air uses an indentation of 2 spaces. You will see this indentation in pipelines:

    data |>
      ggplot(aes(x, y)) +
      geom_point()

    as well as expanded (i.e. vertically laid out) function calls:

    list(
      foo = 1,
      bar = 2
    )
  • Preventing your code from overflowing a given line width. By default, we use a line width of 80 characters. It does so by splitting lines of code that have become too long over multiple lines. For instance, let’s say that we’ve set the line width to be extremely small and these expressions would overflow:

    data |> select(foo)
    
    foo <- function(bar = 1, baz = 2) {
      list(bar, baz)
    }

    To respect the very small line width, Air would switch these expressions from a horizontal layout (called “flat”) to a vertical one (called “expanded”):

    data |>
      select(foo)
    
    foo <- function(
      bar = 1,
      baz = 2
    ) {
      list(
        bar,
        baz
      )
    }
  • Standardizing the whitespace around code elements. Have you ever had difficulties deciphering very dense code?

    1+2:3*(4/5)

    Air reformats this expression to:

    1 + 2:3 * (4 / 5)

In general, a formatter takes over the whitespace in your code and moves elements around to respect style conventions and maximize readability.

How does a formatter help your workflow?

By using a formatter it might seem like you’re rescinding control over the layout of your code. And indeed you are! However, putting Air in charge of styling your code has substantial advantages.

First, it automatically forces you to write legible code that is neither too wide nor too narrow, with proper breathing room around syntactic elements. Having a formatter as a companion significantly improves the process of writing code as you no longer have to think about style as much - the formatter does that for you!

Second, it reduces friction when working in a team. By agreeing to use a formatter in a project, collaborators no longer have to discuss styling and layout issues. Code sent to you by a colleague will adhere to the standards that you’re used to. Code review no longer has to be about style nitpicks and can focus on the substance of the changes instead.

How does Air decide how to format your code?

Air tries to strike a balance between enforcing rigid rules and allowing authors some control over the layout. Our main source of styling rules is the Tidyverse style guide, but we occasionally deviate from these.

There is a trend among modern formatters of being opinionated. Air certainly fits this trend and provides very few configuration options, mostly the indent style (spaces versus tabs), the indent width, and the line width. However, Air also puts code authors in charge of certain aspects of the layout through the notion of persistent line breaks.

In general, Air is in control of deciding where to put vertical space (line breaks) in your code. For instance if you write:

list(foo,
bar)

Air will figure out that this expression fits on a single line without exceeding the line width. It will discard the line break and reformat to:

list(foo, bar)

However there are very specific places at which you can enforce a line break, i.e. make it persistent.

  • Before the very first argument in a function call. This:

    list(
    foo, bar)

    gets formatted as:

    list(
      foo,
      bar
    )
  • Before the very first right-hand side expression in a pipeline. This:

    data |>
    select(foo) |> filter(!bar)

    gets formatted as:

    data |>
      select(foo) |>
      filter(!bar)

A persistent line break will never be removed by Air. But you can remove it manually. Taking the last example, if you join the first lines like this:

list(foo,
  bar
)

1 + 2 +
  3

Air will recognize that you’ve removed the persistent line break, and reformat as:

list(foo, bar)

1 + 2 + 3

The goal of this feature is to strike a balance between being opinionated and recognizing that users often know when taking up more vertical space results in more readable output.

Autobracing

To encourage more consistent, readable, and portable code, Air will autobrace the following elements:

  • If statements

  • For, while, and repeat loops

  • Function definitions

Autobracing is the process of wrapping the body of these code elements with { } if braces don’t already exist.

If statements

Air will autobrace if statements if:

  • Any existing part of the if statement spans multiple lines

  • Any existing part of the if statement is already braced

  • The if statement is nested, i.e. there is an else if {

  • The if statement exceeds the line length

For example, the following will all be autobraced:

if (condition)
  a

# Becomes:
if (condition) {
  a
}
if (condition) a else { b }

# Becomes:
if (condition) {
  a
} else {
  b
}
if (condition) a else if (condition2) b else c

# Becomes:
if (condition) {
  a
} else if (condition2) {
  b
} else {
  c
}

Simple if statements that don’t hit any of the autobracing criteria mentioned above are allowed to stay on one line as long as they are also in value position, as opposed to effect position.

  • Top level if statements are in effect position.

  • If statements that are direct children of {} are in effect position, unless the if statement is the last child of the {} expression list, in which case it is in value position (because it is the returned value from that scope).

  • Otherwise, the if statement is in value position.

This if statement is at top level, putting it in effect position, and would be autobraced:

if (condition) a else b

# Becomes:
if (condition) {
  a
} else {
  b
}

These if statements are children of {} (but aren’t the last child!), putting them in effect position, and would be autobraced:

fn <- function(x) {
  if (condition) stop("oh no")
  if (condition) return(1)
  if (condition) x <- 1
  x + 1
}

# Becomes:
fn <- function(x) {
  if (condition) {
    stop("oh no")
  }
  if (condition) {
    return(1)
  }
  if (condition) {
    x <- 1
  }
  x + 1
}

It’s particularly important for code involving stop(), return(), and <- to be easily readable on their own line because they cause side effects that affect control flow or state.

These if statements are in value position, and would not be autobraced:

x <- if (condition) 1 else 2

x <- x %||% if (condition) 1 else 2

list(a = if (condition) 1 else 2)

function(
  a,
  optional = if (is.null(a)) 1 else 2
) {
}

# If statement is the last expression of the `{}` scope
map(xs, function(x) {
  if (is.null(x)) 1 else 2
})

Portability

It is particularly important to autobrace multiline if statements for portability, which is the ability to copy and paste that if statement into any context and have it still parse. Consider the following if statement:

fn <- function(a) {
  if (is.null(a))
    1
  else
    2
}

This parses and runs correctly while the if statement is nested within the {} braces of the function. But if you’re testing this code and you copy and paste it out of the function, then it no longer parses:

if (is.null(a))
  1
else
  2

If you try and run this, then you’ll see an error like Error: unexpected 'else'. This is particularly annoying when you’re working inside a debugger. Most R debuggers allow you to pause inside functions and highlight and run chunks of that function. If you’re paused inside fn() and try to highlight and run the if statement, then it will confusingly fail to parse. Autobracing multiline if statements avoids this problem entirely.

For, while, and repeat loops

Air unconditionally autobraces the body of all R loops. This is mostly for consistency with if statements, as it is fairly uncommon to see braceless loops in practice.

for (i in 1:5) x <- x + i

# Becomes:
for (i in 1:5) {
  x <- x + i
}
while (x < 5) x <- x + 1

# Becomes:
while (x < 5) {
  x <- x + 1
}

Function definitions

Air will autobrace the body of a function definition if:

  • Any existing part of the function definition spans multiple lines

  • The function definition exceeds the line length

fn <- function(a, b)
  a + b

# Becomes:
fn <- function(a, b) {
  a + b
}
fn <- function(
  a,
  b
) a + b

# Becomes:
fn <- function(
  a,
  b
) {
  a + b
}
fn <- function(a_really_long_variable_name, another_really_long_name) a_really_long_variable_name + another_really_long_name

# Becomes:
fn <- function(
  a_really_long_variable_name,
  another_really_long_name
) {
  a_really_long_variable_name + another_really_long_name
}

Short function definitions are allowed on one line and will not be autobraced. These are all allowed by Air:

add_one <- function(x) x + 1

map_lgl(xs, function(x) is.logical(x) && length(x) == 1L && !is.na(x))

# This includes anonymous functions
map_lgl(xs, \(x) is.list(x) && length(x) == 0L)

With persistent line breaks

Autobracing is particularly useful as a code rewriting tool when combined with persistent line breaks. Consider:

result <- map_lgl(xs, function(x) is.logical(x) && length(x) == 1L && !is.na(x))

This may be easier to read if it spanned across multiple lines. You could manually rework this, or you could let Air help you! There are two places you could put a persistent line break depending on what your desired final result is:

# Adding a line break before `xs` expands the call
result <- map_lgl(
  xs, function(x) is.logical(x) && length(x) == 1L && !is.na(x))

# Becomes:
result <- map_lgl(
  xs,
  function(x) is.logical(x) && length(x) == 1L && !is.na(x)
)
# Adding a line break before `is.logical(x)` forces autobracing
result <- map_lgl(xs, function(x)
    is.logical(x) && length(x) == 1L && !is.na(x))

# Becomes:
result <- map_lgl(xs, function(x) {
  is.logical(x) && length(x) == 1L && !is.na(x)
})

Comments

Air generally avoids moving your comments. However, when Air autobraces code, it may have to adjust them. This generally works quite well for most code, but is impossible to do perfectly. It is possible that you will have to adjust the placement of your comments after Air runs.

For example, leading comments on autobraced elements are generally placed in a way that you’d expect:

if (condition)
  # My comment
  a

# Becomes:
if (condition) {
  # My comment
  a
}

But trailing comments might need manual adjustment:

if (condition)
  a # My comment

# Becomes:
if (condition) {
  a
} # My comment

# You may want to adjust it to:
if (condition) {
  a # My comment
}

In general, prefer leading comments over trailing comments for readability and to have the highest chance of Air placing it in the correct location when comment adjustment is required.

Disabling formatting

Skip comments

Air supports two special comments to disable formatting, # fmt: skip and # fmt: skip file.

# fmt: skip skips formatting for the following syntax node (including all of its children).

# This skips formatting for `list()` and its arguments, retaining the manual alignment
# fmt: skip
list(
  dollar = "USA",
  yen    = "Japan",
  yuan   = "China"
)

# This skips formatting for `tribble()` and its arguments
# fmt: skip
tribble(
  ~x, ~y,
   1,  2,
)

# This skips formatting for `graph_from_literal()` and its arguments,
# retaining the formatting of the domain specific language
# fmt: skip
igraph::graph_from_literal(Alice +--+ Bob)

# fmt: skip file skips the entire file. This must be placed at the top of the file before the first non-comment R code, otherwise the comment is ignored. This is useful for generated files, or as an alternative to exclude for individual files that you know you’d like to exclude from formatting.

# This is a generated file. Don't modify by hand!
# fmt: skip file

# This won't be formatted
generated_function_signature <- function(which, might, overflow, the, line, width, but, we, dont, really, care) {
    body
}

# Neither will this
another_generated_function_signature <- function(which, might, overflow, the, line, width, but, we, dont, really, care) {
    body
}

For generated files we also support the convention of a comment starting with Generated by at the top of the file:

# Generated by myPackage: don't modify by hand

Skip configuration

Skip comments are useful for disabling formatting for one-off function calls, but sometimes you may find yourself repeatedly using functions from a domain specific language (DSL) that doesn’t follow conventional formatting rules. For example, the igraph package contains a DSL for constructing a graph from a literal representation:

igraph::graph_from_literal(A +-+ B +---+ C ++ D + E)

Air would format this as:

igraph::graph_from_literal(A + -+B + ---+C + +D + E)

If you use graph_from_literal() often, it would be annoying to add # fmt: skip comments at every call site. Instead, air.toml supports a skip field that allows you to specify function names that you never want formatting for. Specifying this would retain the original formatting of the graph_from_literal() call, even without a # fmt: skip comment:

skip = ["graph_from_literal"]

One particularly common use of this is skip = ["tribble"], which you can use to avoid formatting of all tibble::tribble() calls. In the long term, Air may provide more sophisticated tooling for formatting using a specified alignment.

When does a formatter run?

The most convenient way to run a formatter is when you save a file. Most editors don’t enable this by default, so check out the Editors section to learn how to configure your editor to format on save. Once enabled, your code will be reformatted every time you save! This prevents your code from ever diverging from the standards set by Air.

Air also provides a CLI tool. It makes it easy to reformat whole directories such as an R project or package. If you’re incorporating Air into an existing project, you may want to use the CLI tool to run air format once on the entire project, commit those results, and then rely on per-file format on save from there on out.

Note that in all of these cases, the formatter will only run if your code is syntactically correct. If the formatter is seemingly no longer working, check your code for syntax errors.