Formatter
Air is first and foremost a formatter of R code. On this page, you’ll find details about what a formatter is, why you’d want to use one, and how Air decides how to format your R code.
What’s a formatter?
A formatter is in charge of the layout of your R code. Formatters do not change the meaning of code; instead they ensure that whitespace, newlines, and other punctuation conform to a set of rules and standards, such as:
Making sure your code is indented with the appropriate amount of leading whitespace depending on the context. By default, Air uses an indentation of 2 spaces. You will see this indentation in pipelines:
data |>
  ggplot(aes(x, y)) +
  geom_point()
as well as expanded (i.e. vertically laid out) function calls:
list(
  foo = 1,
  bar = 2
)
Preventing your code from overflowing a given line width. By default, Air uses a line width of 80 characters. It enforces this by splitting lines of code that have become too long across multiple lines. For instance, let’s say that we’ve set the line width to be extremely small and these expressions would overflow:
data |> select(foo)

foo <- function(bar = 1, baz = 2) {
  list(bar, baz)
}
To respect the very small line width, Air would switch these expressions from a horizontal layout (called “flat”) to a vertical one (called “expanded”):
data |>
  select(foo)

foo <- function(
  bar = 1,
  baz = 2
) {
  list(
    bar,
    baz
  )
}
Standardizing the whitespace around code elements. Have you ever had difficulties deciphering very dense code?
1+2:3*(4/5)
Air reformats this expression to:
1 + 2:3 * (4 / 5)
In general, a formatter takes over the whitespace in your code and moves elements around to respect style conventions and maximize readability.
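The claim that a formatter does not change the meaning of code can be checked with base R alone. The following is a small sketch, not part of Air: it parses the dense and the reformatted expression from the example above and compares the results. The variable names are made up for illustration.

```r
# Parse the dense and the reformatted versions of the same expression.
dense <- parse(text = "1+2:3*(4/5)", keep.source = FALSE)[[1]]
spaced <- parse(text = "1 + 2:3 * (4 / 5)", keep.source = FALSE)[[1]]

# Whitespace is not part of the parsed expression, so both are the same call.
same_tree <- identical(dense, spaced)

# Both therefore evaluate to the same numbers.
same_value <- identical(eval(dense), eval(spaced))

print(same_tree)
print(same_value)
```

Both checks should report that the two versions are indistinguishable to R; only the layout seen by a human reader differs.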
How does a formatter help your workflow?
By using a formatter it might seem like you’re relinquishing control over the layout of your code. And indeed you are! However, putting Air in charge of styling your code has substantial advantages.
First, it automatically forces you to write legible code that is neither too wide nor too narrow, with proper breathing room around syntactic elements. Having a formatter as a companion significantly improves the process of writing code, as you no longer have to think about style as much: the formatter does that for you!
Second, it reduces friction when working in a team. By agreeing to use a formatter in a project, collaborators no longer have to discuss styling and layout issues. Code sent to you by a colleague will adhere to the standards that you’re used to. Code review no longer has to be about style nitpicks and can focus on the substance of the changes instead.
How does Air decide how to format your code?
Air tries to strike a balance between enforcing rigid rules and allowing authors some control over the layout. Our main source of styling rules is the Tidyverse style guide, but we occasionally deviate from it.
There is a trend among modern formatters of being opinionated. Air certainly fits this trend and provides very few configuration options, mostly the indent style (spaces versus tabs), the indent width, and the line width. However, Air also puts code authors in charge of certain aspects of the layout through the notion of persistent line breaks.
In general, Air is in control of deciding where to put vertical space (line breaks) in your code. For instance, if you write:
list(foo,
bar)
Air will figure out that this expression fits on a single line without exceeding the line width. It will discard the line break and reformat to:
list(foo, bar)
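The choice between the flat and the expanded layout can be pictured with a toy R function. This is only a sketch of the idea under simplifying assumptions (a single call, arguments given as strings); `layout_call()` and its parameters are hypothetical and are not Air's implementation or API.

```r
# Hypothetical helper, not part of Air: lay a call out flat if it fits within
# the line width, otherwise expand it with one argument per line.
layout_call <- function(fn, args, line_width = 80, indent = 2) {
  flat <- paste0(fn, "(", paste(args, collapse = ", "), ")")
  if (nchar(flat) <= line_width) {
    return(flat)
  }
  pad <- strrep(" ", indent)
  paste0(fn, "(\n", paste0(pad, args, collapse = ",\n"), "\n)")
}

# Fits within the default width, so it stays flat.
cat(layout_call("list", c("foo", "bar")), "\n")

# With a very small line width, the same call is expanded.
cat(layout_call("list", c("foo", "bar"), line_width = 10), "\n")
```

A real formatter has to make this decision recursively for nested expressions, which this sketch deliberately ignores.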
However, there are very specific places at which you can enforce a line break, i.e. make it persistent.
Before the very first argument in a function call. This:
list(
  foo, bar)
gets formatted as:
list(
  foo,
  bar
)
Before the very first right-hand side expression in a pipeline. This:
data |>
  select(foo) |> filter(!bar)
gets formatted as:
data |>
  select(foo) |>
  filter(!bar)
A persistent line break will never be removed by Air. But you can remove it manually. For example, if you join the first lines like this:
list(foo,
  bar
)

1 + 2 +
  3
Air will recognize that you’ve removed the persistent line break, and reformat as:
list(foo, bar)
1 + 2 + 3
The goal of this feature is to strike a balance between being opinionated and recognizing that users often know when taking up more vertical space results in more readable output.
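To make the first rule concrete, here is a rough sketch of how one could detect a line break before the first argument of a call using R's own parse data. `has_leading_break()` is a hypothetical helper written for this page; it only looks at the first opening parenthesis in the input and says nothing about how Air detects persistent line breaks internally.

```r
# Hypothetical helper, not part of Air: is there a line break between the first
# `(` in `src` and the token that follows it?
has_leading_break <- function(src) {
  exprs <- parse(text = src, keep.source = TRUE)
  tokens <- utils::getParseData(exprs)

  # Keep only actual tokens (not grouping nodes), in source order.
  tokens <- tokens[tokens$terminal, ]
  tokens <- tokens[order(tokens$line1, tokens$col1), ]

  open <- which(tokens$token == "'('")[1]
  tokens$line1[open + 1] > tokens$line1[open]
}

# Line break right after `(`: this is the persistent position.
print(has_leading_break("list(\n  foo, bar)"))

# Line break later in the argument list: not a persistent position.
print(has_leading_break("list(foo,\n  bar)"))
```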
Autobracing
To encourage more consistent, readable, and portable code, Air will autobrace the following elements:
If statements
For, while, and repeat loops
Function definitions
Autobracing is the process of wrapping the body of these code elements with { } if braces don’t already exist.
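Adding braces changes how the code is written but not what it computes. The sketch below, using made-up function names, compares a braceless function with its braced counterpart in plain R.

```r
# Braceless and braced versions of the same function definition.
add_bare <- function(a, b) a + b
add_braced <- function(a, b) {
  a + b
}

# The braced body is wrapped in a call to `{`, so the parsed bodies differ...
bodies_differ <- !identical(body(add_bare), body(add_braced))

# ...but `{` returns the value of its last expression, so the results agree.
results_agree <- identical(add_bare(1, 2), add_braced(1, 2))

print(bodies_differ)
print(results_agree)
```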
If statements
Air will autobrace if statements if:
Any existing part of the if statement spans multiple lines
Any existing part of the if statement is already braced
The if statement is nested, i.e. there is an else if
The if statement exceeds the line length
For example, the following will all be autobraced:
if (condition)
  a

# Becomes:
if (condition) {
  a
}

if (condition) a else { b }

# Becomes:
if (condition) {
  a
} else {
  b
}

if (condition) a else if (condition2) b else c

# Becomes:
if (condition) {
  a
} else if (condition2) {
  b
} else {
  c
}
Simple if statements that don’t hit any of the autobracing criteria mentioned above are allowed to stay on one line as long as they are also in value position, as opposed to effect position.
Top level if statements are in effect position.
If statements that are direct children of {} are in effect position, unless the if statement is the last child of the {} expression list, in which case it is in value position (because it is the returned value from that scope).
Otherwise, the if statement is in value position.
This if statement is at top level, putting it in effect position, and would be autobraced:
if (condition) a else b

# Becomes:
if (condition) {
  a
} else {
  b
}
These if statements are children of {} (but aren’t the last child!), putting them in effect position, and would be autobraced:
fn <- function(x) {
  if (condition) stop("oh no")
  if (condition) return(1)
  if (condition) x <- 1
  x + 1
}

# Becomes:
fn <- function(x) {
  if (condition) {
    stop("oh no")
  }
  if (condition) {
    return(1)
  }
  if (condition) {
    x <- 1
  }
  x + 1
}
It’s particularly important for code involving stop(), return(), and <- to be easily readable on their own line because they cause side effects that affect control flow or state.
These if statements are in value position, and would not be autobraced:
x <- if (condition) 1 else 2

x <- x %||% if (condition) 1 else 2

list(a = if (condition) 1 else 2)

function(
  a,
  optional = if (is.null(a)) 1 else 2
) {
}

# If statement is the last expression of the `{}` scope
map(xs, function(x) {
  if (is.null(x)) 1 else 2
})
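The distinction between value position and effect position follows from how R evaluates if statements and {} blocks. The runnable sketch below illustrates both cases; all names in it (`pick()`, `bump()`, `count`) are invented for the example.

```r
condition <- TRUE

# Value position: the result of the if statement is used directly.
x <- if (condition) 1 else 2

# The last expression inside `{}` is the value of that scope, so an if
# statement placed there is also in value position.
pick <- function(flag) {
  if (flag) "yes" else "no"
}

# Effect position: the value of the if statement is discarded, and only its
# side effect (here, an assignment) matters.
count <- 0
bump <- function(flag) {
  if (flag) {
    count <<- count + 1
  }
  "done"
}

picked <- pick(TRUE)
status <- bump(TRUE)

print(x)
print(picked)
print(status)
print(count)
```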
Portability
It is particularly important to autobrace multiline if statements for portability, which is the ability to copy and paste that if statement into any context and have it still parse. Consider the following if statement:
fn <- function(a) {
  if (is.null(a))
    1
  else
    2
}
This parses and runs correctly while the if statement is nested within the {} braces of the function. But if you’re testing this code and you copy and paste it out of the function, then it no longer parses:
if (is.null(a))
  1
else
  2
If you try to run this, then you’ll see an error like Error: unexpected 'else'. This is particularly annoying when you’re working inside a debugger. Most R debuggers allow you to pause inside functions and highlight and run chunks of that function. If you’re paused inside fn() and try to highlight and run the if statement, then it will confusingly fail to parse. Autobracing multiline if statements avoids this problem entirely.
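The portability problem can be reproduced without a debugger by handing the same text to R's parser in different contexts. This is a minimal sketch; `parses()` is a helper defined here for illustration.

```r
# Returns TRUE if `code` parses as standalone, top-level R code.
parses <- function(code) {
  tryCatch(
    {
      parse(text = code, keep.source = FALSE)
      TRUE
    },
    error = function(e) FALSE
  )
}

braceless <- "if (is.null(a))\n  1\nelse\n  2"
braced <- "if (is.null(a)) {\n  1\n} else {\n  2\n}"

# Inside `{}`, the braceless form parses.
nested_ok <- parses(paste0("{\n", braceless, "\n}"))

# At top level, R treats `if (is.null(a)) 1` as complete at the line break,
# so the following `else` is unexpected.
top_level_ok <- parses(braceless)

# The braced form parses at top level as well.
braced_ok <- parses(braced)

print(c(nested = nested_ok, top_level = top_level_ok, braced = braced_ok))
```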
For, while, and repeat loops
Air unconditionally autobraces the body of all R loops. This is mostly for consistency with if statements, as it is fairly uncommon to see braceless loops in practice.
for (i in 1:5) x <- x + i

# Becomes:
for (i in 1:5) {
  x <- x + i
}

while (x < 5) x <- x + 1

# Becomes:
while (x < 5) {
  x <- x + 1
}
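As a quick sanity check, the braceless and braced loops compute the same thing. The sketch below also shows one plausible reason why bracing loops costs little: in R a loop evaluates to NULL, so it is only ever run for its effects. That motivation is an observation about R, not a statement from Air's authors.

```r
# Braceless loop.
x <- 0
for (i in 1:5) x <- x + i
bare_total <- x

# Braced loop, as Air would write it.
x <- 0
for (i in 1:5) {
  x <- x + i
}
braced_total <- x

# The loop itself evaluates to NULL; only its side effects are useful.
loop_value <- for (i in 1:5) i

print(bare_total)
print(braced_total)
print(is.null(loop_value))
```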
Function definitions
Air will autobrace the body of a function definition if:
Any existing part of the function definition spans multiple lines
The function definition exceeds the line length
fn <- function(a, b)
  a + b

# Becomes:
fn <- function(a, b) {
  a + b
}

fn <- function(
  a,
  b
) a + b

# Becomes:
fn <- function(
  a,
  b
) {
  a + b
}

fn <- function(a_really_long_variable_name, another_really_long_name) a_really_long_variable_name + another_really_long_name

# Becomes:
fn <- function(
  a_really_long_variable_name,
  another_really_long_name
) {
  a_really_long_variable_name + another_really_long_name
}
Short function definitions are allowed on one line and will not be autobraced. These are all allowed by Air:
add_one <- function(x) x + 1

map_lgl(xs, function(x) is.logical(x) && length(x) == 1L && !is.na(x))

# This includes anonymous functions
map_lgl(xs, \(x) is.list(x) && length(x) == 0L)
With persistent line breaks
Autobracing is particularly useful as a code rewriting tool when combined with persistent line breaks. Consider:
result <- map_lgl(xs, function(x) is.logical(x) && length(x) == 1L && !is.na(x))
This might be easier to read if it spanned multiple lines. You could manually rework this, or you could let Air help you! There are two places you could put a persistent line break depending on what your desired final result is:
# Adding a line break before `xs` expands the call
result <- map_lgl(
  xs, function(x) is.logical(x) && length(x) == 1L && !is.na(x))

# Becomes:
result <- map_lgl(
  xs,
  function(x) is.logical(x) && length(x) == 1L && !is.na(x)
)

# Adding a line break before `is.logical(x)` forces autobracing
result <- map_lgl(xs, function(x)
  is.logical(x) && length(x) == 1L && !is.na(x))

# Becomes:
result <- map_lgl(xs, function(x) {
  is.logical(x) && length(x) == 1L && !is.na(x)
})
Disabling formatting
Skip comments
Air supports two special comments to disable formatting, # fmt: skip and # fmt: skip file.
# fmt: skip skips formatting for the following syntax node (including all of its children).
# This skips formatting for `list()` and its arguments, retaining the manual alignment
# fmt: skip
list(
  dollar = "USA",
  yen    = "Japan",
  yuan   = "China"
)

# This skips formatting for `tribble()` and its arguments
# fmt: skip
tribble(
  ~x, ~y,
   1,  2,
)

# This skips formatting for `graph_from_literal()` and its arguments,
# retaining the formatting of the domain specific language
# fmt: skip
igraph::graph_from_literal(Alice +--+ Bob)
# fmt: skip file skips the entire file. This must be placed at the top of the file before the first non-comment R code, otherwise the comment is ignored. This is useful for generated files, or as an alternative to exclude for individual files that you know you’d like to exclude from formatting.
# This is a generated file. Don't modify by hand!
# fmt: skip file

# This won't be formatted
generated_function_signature <- function(which, might, overflow, the, line, width, but, we, dont, really, care) {
  body
}

# Neither will this
another_generated_function_signature <- function(which, might, overflow, the, line, width, but, we, dont, really, care) {
  body
}
For generated files we also support the convention of a comment starting with Generated by at the top of the file:
# Generated by myPackage: don't modify by hand
Skip configuration
Skip comments are useful for disabling formatting for one-off function calls, but sometimes you may find yourself repeatedly using functions from a domain specific language (DSL) that doesn’t follow conventional formatting rules. For example, the igraph package contains a DSL for constructing a graph from a literal representation:
igraph::graph_from_literal(A +-+ B +---+ C ++ D + E)
Air would format this as:
igraph::graph_from_literal(A + -+B + ---+C + +D + E)
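That output looks odd, but it reflects how R itself reads the literal: there is no `+-+` operator, only a binary `+` followed by unary operators. The sketch below inspects a single edge with base R. It shows R's parse of the expression and makes no claim about Air's internals.

```r
# How R's parser reads the `+-+` "edge" from the graph literal.
edge <- quote(A +-+ B)

# The outer call is a binary `+` with two operands...
outer_op <- as.character(edge[[1]])
n_operands <- length(edge) - 1

# ...and the right operand is `B` wrapped in unary `+` and then unary `-`.
right <- edge[[3]]
right_is_unary_minus <- identical(right[[1]], as.name("-")) && length(right) == 2
inner_is_unary_plus <- identical(right[[2]], quote(+B))

print(outer_op)
print(n_operands)
print(right_is_unary_minus)
print(inner_is_unary_plus)
```

A formatter that spaces binary operators but keeps unary operators attached to their operand will therefore pull the DSL's edge symbols apart, which is why skipping these calls is useful.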
If you use graph_from_literal() often, it would be annoying to add # fmt: skip comments at every call site. Instead, air.toml supports a skip field that allows you to specify function names that you never want formatting for. Specifying this would retain the original formatting of the graph_from_literal() call, even without a # fmt: skip comment:
skip = ["graph_from_literal"]
One particularly common use of this is skip = ["tribble"], which you can use to avoid formatting of all tibble::tribble() calls. In the long term, Air may provide more sophisticated tooling for formatting using a specified alignment.
When does a formatter run?
The most convenient way to run a formatter is when you save a file. Most editors don’t enable this by default, so check out the Editors section to learn how to configure your editor to format on save. Once enabled, your code will be reformatted every time you save! This prevents your code from ever diverging from the standards set by Air.
Air also provides a CLI tool. It makes it easy to reformat whole directories such as an R project or package. If you’re incorporating Air into an existing project, you may want to use the CLI tool to run air format once on the entire project, commit those results, and then rely on per-file format on save from there on out.
Note that in all of these cases, the formatter will only run if your code is syntactically correct. If the formatter is seemingly no longer working, check your code for syntax errors.
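One quick way to look for a syntax error is to ask R to parse the code. The helper below is a hypothetical sketch, not an Air feature: it uses R's own parse() as a stand-in, which may not agree with Air's parser in every case.

```r
# Hypothetical helper: returns NULL if `code` parses, otherwise the parser's
# error message.
syntax_error <- function(code) {
  tryCatch(
    {
      parse(text = code, keep.source = FALSE)
      NULL
    },
    error = function(e) conditionMessage(e)
  )
}

# Well-formed code: nothing to report.
ok <- syntax_error("list(foo, bar)")

# A missing closing parenthesis: the message points at the problem.
broken <- syntax_error("list(foo, bar")

print(ok)
print(broken)
```

For a file on disk, the same idea works with parse(file = path) in place of parse(text = code).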
Comments
Air generally avoids moving your comments. However, when Air autobraces code, it may have to adjust them. This generally works quite well for most code, but is impossible to do perfectly. It is possible that you will have to adjust the placement of your comments after Air runs.
For example, leading comments on autobraced elements are generally placed in a way that you’d expect:
But trailing comments might need manual adjustment:
In general, prefer leading comments over trailing comments for readability and to have the highest chance of Air placing them in the correct location when comment adjustment is required.