A generic function used to describe an object for use by an LLM.

Usage

btw_this(x, ...)

Arguments

x

The thing to describe.

...

Additional arguments passed down to underlying methods. Unused arguments are silently ignored.
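As a rough illustration of how a generic with per-class methods and a pass-through `...` behaves, here is a minimal base R (S3) sketch. The names `describe_this` and `max_cols` are hypothetical and are not part of btw; the package's actual generic may be built differently.

```r
# Hypothetical generic, NOT btw's implementation: dispatches on the class of `x`.
describe_this <- function(x, ...) {
  UseMethod("describe_this")
}

# Method for data frames; `max_cols` is a made-up option for this sketch.
describe_this.data.frame <- function(x, ..., max_cols = 5) {
  cols <- utils::head(names(x), max_cols)
  c(
    sprintf("data frame: %d rows x %d cols", nrow(x), ncol(x)),
    paste("columns:", paste(cols, collapse = ", "))
  )
}

# Fallback method: accepts extra arguments in `...` and ignores them.
describe_this.default <- function(x, ...) {
  paste("object of class", paste(class(x), collapse = "/"))
}

describe_this(mtcars, max_cols = 3)
describe_this(1:3, max_cols = 3)  # `max_cols` is unused here and silently ignored
```

In this sketch the second call still succeeds even though the default method has no `max_cols` argument, which mirrors the "unused arguments are silently ignored" behaviour described above.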

Value

A character vector of lines describing the object.
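Because the result is one element per line, it usually needs to be joined before it is embedded in a prompt. The sketch below uses a stand-in vector (values copied from the help-page example, not generated here) so that it runs without btw installed; `described` and `prompt_text` are illustrative names.

```r
# Stand-in for a btw_this() result: one element per line of the description.
# (Illustrative values copied from the help-page example; not produced here.)
described <- c(
  "mutate                  package:dplyr                  R Documentation",
  "",
  "Create, modify, and delete columns"
)

# Join the lines with newlines to get a single string suitable for a prompt.
prompt_text <- paste(described, collapse = "\n")

# writeLines() prints the vector one line at a time, as a reader would see it.
writeLines(described)
```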

Examples

btw_this(mtcars)
#> [1] "```json"
#> [2] "{\"n_cols\":11,\"n_rows\":32,\"groups\":[],\"class\":\"data.frame\",\"columns\":{\"mpg\":{\"variable\":\"mpg\",\"type\":\"numeric\",\"mean\":20.0906,\"sd\":6.0269,\"p0\":10.4,\"p25\":15.425,\"p50\":19.2,\"p75\":22.8,\"p100\":33.9},\"cyl\":{\"variable\":\"cyl\",\"type\":\"numeric\",\"mean\":6.1875,\"sd\":1.7859,\"p0\":4,\"p25\":4,\"p50\":6,\"p75\":8,\"p100\":8},\"disp\":{\"variable\":\"disp\",\"type\":\"numeric\",\"mean\":230.7219,\"sd\":123.9387,\"p0\":71.1,\"p25\":120.825,\"p50\":196.3,\"p75\":326,\"p100\":472},\"hp\":{\"variable\":\"hp\",\"type\":\"numeric\",\"mean\":146.6875,\"sd\":68.5629,\"p0\":52,\"p25\":96.5,\"p50\":123,\"p75\":180,\"p100\":335},\"drat\":{\"variable\":\"drat\",\"type\":\"numeric\",\"mean\":3.5966,\"sd\":0.5347,\"p0\":2.76,\"p25\":3.08,\"p50\":3.695,\"p75\":3.92,\"p100\":4.93},\"wt\":{\"variable\":\"wt\",\"type\":\"numeric\",\"mean\":3.2172,\"sd\":0.9785,\"p0\":1.513,\"p25\":2.5812,\"p50\":3.325,\"p75\":3.61,\"p100\":5.424},\"qsec\":{\"variable\":\"qsec\",\"type\":\"numeric\",\"mean\":17.8487,\"sd\":1.7869,\"p0\":14.5,\"p25\":16.8925,\"p50\":17.71,\"p75\":18.9,\"p100\":22.9},\"vs\":{\"variable\":\"vs\",\"type\":\"numeric\",\"mean\":0.4375,\"sd\":0.504,\"p0\":0,\"p25\":0,\"p50\":0,\"p75\":1,\"p100\":1},\"am\":{\"variable\":\"am\",\"type\":\"numeric\",\"mean\":0.4062,\"sd\":0.499,\"p0\":0,\"p25\":0,\"p50\":0,\"p75\":1,\"p100\":1},\"gear\":{\"variable\":\"gear\",\"type\":\"numeric\",\"mean\":3.6875,\"sd\":0.7378,\"p0\":3,\"p25\":3,\"p50\":4,\"p75\":4,\"p100\":5},\"carb\":{\"variable\":\"carb\",\"type\":\"numeric\",\"mean\":2.8125,\"sd\":1.6152,\"p0\":1,\"p25\":2,\"p50\":2,\"p75\":4,\"p100\":8}}}"
#> [3] "```"
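The per-column fields in the JSON above (`mean`, `sd`, `p0` to `p100`) match what base R gives for the same column when rounded to four decimal places. The following sketch recomputes them for `mpg`; the rounding rule is an assumption inferred from the output, and btw's own computation may differ.

```r
# Recompute the summary fields reported for `mpg` using base R only.
# Assumption: values are rounded to four decimal places; btw's internals may differ.
mpg <- mtcars$mpg

mpg_summary <- c(
  mean = round(mean(mpg), 4),
  sd = round(sd(mpg), 4),
  round(quantile(mpg, names = FALSE), 4)  # min, quartiles, max
)
names(mpg_summary)[3:7] <- c("p0", "p25", "p50", "p75", "p100")
mpg_summary
```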
btw_this(dplyr::mutate)
#>   [1] "mutate                  package:dplyr                  R Documentation"   
#>   [2] ""                                                                         
#>   [3] "Create, modify, and delete columns"                                       
#>   [4] ""                                                                         
#>   [5] "Description:"                                                             
#>   [6] ""                                                                         
#>   [7] "     ‘mutate()’ creates new columns that are functions of existing"       
#>   [8] "     variables. It can also modify (if the name is the same as an"        
#>   [9] "     existing column) and delete columns (by setting their value to"      
#>  [10] "     ‘NULL’)."                                                            
#>  [11] ""                                                                         
#>  [12] "Usage:"                                                                   
#>  [13] ""                                                                         
#>  [14] "     mutate(.data, ...)"                                                  
#>  [15] "     "                                                                    
#>  [16] "     ## S3 method for class 'data.frame'"                                 
#>  [17] "     mutate("                                                             
#>  [18] "       .data,"                                                            
#>  [19] "       ...,"                                                              
#>  [20] "       .by = NULL,"                                                       
#>  [21] "       .keep = c(\"all\", \"used\", \"unused\", \"none\"),"               
#>  [22] "       .before = NULL,"                                                   
#>  [23] "       .after = NULL"                                                     
#>  [24] "     )"                                                                   
#>  [25] "     "                                                                    
#>  [26] "Arguments:"                                                               
#>  [27] ""                                                                         
#>  [28] "   .data: A data frame, data frame extension (e.g. a tibble), or a lazy"  
#>  [29] "          data frame (e.g. from dbplyr or dtplyr). See _Methods_,"        
#>  [30] "          below, for more details."                                       
#>  [31] ""                                                                         
#>  [32] "     ...: <‘data-masking’> Name-value pairs. The name gives the name of"  
#>  [33] "          the column in the output."                                      
#>  [34] ""                                                                         
#>  [35] "          The value can be:"                                              
#>  [36] ""                                                                         
#>  [37] "            • A vector of length 1, which will be recycled to the"        
#>  [38] "              correct length."                                            
#>  [39] ""                                                                         
#>  [40] "            • A vector the same length as the current group (or the"      
#>  [41] "              whole data frame if ungrouped)."                            
#>  [42] ""                                                                         
#>  [43] "            • ‘NULL’, to remove the column."                              
#>  [44] ""                                                                         
#>  [45] "            • A data frame or tibble, to create multiple columns in the"  
#>  [46] "              output."                                                    
#>  [47] ""                                                                         
#>  [48] "     .by: *[Experimental]*"                                               
#>  [49] ""                                                                         
#>  [50] "          <‘tidy-select’> Optionally, a selection of columns to group"    
#>  [51] "          by for just this operation, functioning as an alternative to"   
#>  [52] "          ‘group_by()’. For details and examples, see ?dplyr_by."         
#>  [53] ""                                                                         
#>  [54] "   .keep: Control which columns from ‘.data’ are retained in the"         
#>  [55] "          output. Grouping columns and columns created by ‘...’ are"      
#>  [56] "          always kept."                                                   
#>  [57] ""                                                                         
#>  [58] "            • ‘\"all\"’ retains all columns from ‘.data’. This is the"    
#>  [59] "              default."                                                   
#>  [60] ""                                                                         
#>  [61] "            • ‘\"used\"’ retains only the columns used in ‘...’ to create"
#>  [62] "              new columns. This is useful for checking your work, as it"  
#>  [63] "              displays inputs and outputs side-by-side."                  
#>  [64] ""                                                                         
#>  [65] "            • ‘\"unused\"’ retains only the columns _not_ used in ‘...’"  
#>  [66] "              to create new columns. This is useful if you generate new"  
#>  [67] "              columns, but no longer need the columns used to generate"   
#>  [68] "              them."                                                      
#>  [69] ""                                                                         
#>  [70] "            • ‘\"none\"’ doesn't retain any extra columns from ‘.data’."  
#>  [71] "              Only the grouping variables and columns created by ‘...’"   
#>  [72] "              are kept."                                                  
#>  [73] ""                                                                         
#>  [74] ".before, .after: <‘tidy-select’> Optionally, control where new columns"   
#>  [75] "          should appear (the default is to add to the right hand side)."  
#>  [76] "          See ‘relocate()’ for more details."                             
#>  [77] ""                                                                         
#>  [78] "Value:"                                                                   
#>  [79] ""                                                                         
#>  [80] "     An object of the same type as ‘.data’. The output has the"           
#>  [81] "     following properties:"                                               
#>  [82] ""                                                                         
#>  [83] "        • Columns from ‘.data’ will be preserved according to the"        
#>  [84] "          ‘.keep’ argument."                                              
#>  [85] ""                                                                         
#>  [86] "        • Existing columns that are modified by ‘...’ will always be"     
#>  [87] "          returned in their original location."                           
#>  [88] ""                                                                         
#>  [89] "        • New columns created through ‘...’ will be placed according to"  
#>  [90] "          the ‘.before’ and ‘.after’ arguments."                          
#>  [91] ""                                                                         
#>  [92] "        • The number of rows is not affected."                            
#>  [93] ""                                                                         
#>  [94] "        • Columns given the value ‘NULL’ will be removed."                
#>  [95] ""                                                                         
#>  [96] "        • Groups will be recomputed if a grouping variable is mutated."   
#>  [97] ""                                                                         
#>  [98] "        • Data frame attributes are preserved."                           
#>  [99] ""                                                                         
#> [100] "Useful mutate functions:"                                                 
#> [101] ""                                                                         
#> [102] "        • ‘+’, ‘-’, ‘log()’, etc., for their usual mathematical"          
#> [103] "          meanings"                                                       
#> [104] ""                                                                         
#> [105] "        • ‘lead()’, ‘lag()’"                                              
#> [106] ""                                                                         
#> [107] "        • ‘dense_rank()’, ‘min_rank()’, ‘percent_rank()’,"                
#> [108] "          ‘row_number()’, ‘cume_dist()’, ‘ntile()’"                       
#> [109] ""                                                                         
#> [110] "        • ‘cumsum()’, ‘cummean()’, ‘cummin()’, ‘cummax()’, ‘cumany()’,"   
#> [111] "          ‘cumall()’"                                                     
#> [112] ""                                                                         
#> [113] "        • ‘na_if()’, ‘coalesce()’"                                        
#> [114] ""                                                                         
#> [115] "        • ‘if_else()’, ‘recode()’, ‘case_when()’"                         
#> [116] ""                                                                         
#> [117] "Grouped tibbles:"                                                         
#> [118] ""                                                                         
#> [119] "     Because mutating expressions are computed within groups, they may"   
#> [120] "     yield different results on grouped tibbles. This will be the case"   
#> [121] "     as soon as an aggregating, lagging, or ranking function is"          
#> [122] "     involved. Compare this ungrouped mutate:"                            
#> [123] ""                                                                         
#> [124] "     starwars %>%"                                                        
#> [125] "       select(name, mass, species) %>%"                                   
#> [126] "       mutate(mass_norm = mass / mean(mass, na.rm = TRUE))"               
#> [127] "     "                                                                    
#> [128] "     With the grouped equivalent:"                                        
#> [129] ""                                                                         
#> [130] "     starwars %>%"                                                        
#> [131] "       select(name, mass, species) %>%"                                   
#> [132] "       group_by(species) %>%"                                             
#> [133] "       mutate(mass_norm = mass / mean(mass, na.rm = TRUE))"               
#> [134] "     "                                                                    
#> [135] "     The former normalises ‘mass’ by the global average whereas the"      
#> [136] "     latter normalises by the averages within species levels."            
#> [137] ""                                                                         
#> [138] "Methods:"                                                                 
#> [139] ""                                                                         
#> [140] "     This function is a *generic*, which means that packages can"         
#> [141] "     provide implementations (methods) for other classes. See the"        
#> [142] "     documentation of individual methods for extra arguments and"         
#> [143] "     differences in behaviour."                                           
#> [144] ""                                                                         
#> [145] "     Methods available in currently loaded packages: no methods found."   
#> [146] ""                                                                         
#> [147] "See Also:"                                                                
#> [148] ""                                                                         
#> [149] "     Other single table verbs: ‘arrange()’, ‘filter()’, ‘reframe()’,"     
#> [150] "     ‘rename()’, ‘select()’, ‘slice()’, ‘summarise()’"                    
#> [151] ""                                                                         
#> [152] "Examples:"                                                                
#> [153] ""                                                                         
#> [154] "     # Newly created variables are available immediately"                 
#> [155] "     starwars %>%"                                                        
#> [156] "       select(name, mass) %>%"                                            
#> [157] "       mutate("                                                           
#> [158] "         mass2 = mass * 2,"                                               
#> [159] "         mass2_squared = mass2 * mass2"                                   
#> [160] "       )"                                                                 
#> [161] "     "                                                                    
#> [162] "     # As well as adding new variables, you can use mutate() to"          
#> [163] "     # remove variables and modify existing variables."                   
#> [164] "     starwars %>%"                                                        
#> [165] "       select(name, height, mass, homeworld) %>%"                         
#> [166] "       mutate("                                                           
#> [167] "         mass = NULL,"                                                    
#> [168] "         height = height * 0.0328084 # convert to feet"                   
#> [169] "       )"                                                                 
#> [170] "     "                                                                    
#> [171] "     # Use across() with mutate() to apply a transformation"              
#> [172] "     # to multiple columns in a tibble."                                  
#> [173] "     starwars %>%"                                                        
#> [174] "       select(name, homeworld, species) %>%"                              
#> [175] "       mutate(across(!name, as.factor))"                                  
#> [176] "     # see more in ?across"                                               
#> [177] "     "                                                                    
#> [178] "     # Window functions are useful for grouped mutates:"                  
#> [179] "     starwars %>%"                                                        
#> [180] "       select(name, mass, homeworld) %>%"                                 
#> [181] "       group_by(homeworld) %>%"                                           
#> [182] "       mutate(rank = min_rank(desc(mass)))"                               
#> [183] "     # see `vignette(\"window-functions\")` for more details"             
#> [184] "     "                                                                    
#> [185] "     # By default, new columns are placed on the far right."              
#> [186] "     df <- tibble(x = 1, y = 2)"                                          
#> [187] "     df %>% mutate(z = x + y)"                                            
#> [188] "     df %>% mutate(z = x + y, .before = 1)"                               
#> [189] "     df %>% mutate(z = x + y, .after = x)"                                
#> [190] "     "                                                                    
#> [191] "     # By default, mutate() keeps all columns from the input data."       
#> [192] "     df <- tibble(x = 1, y = 2, a = \"a\", b = \"b\")"                    
#> [193] "     df %>% mutate(z = x + y, .keep = \"all\") # the default"             
#> [194] "     df %>% mutate(z = x + y, .keep = \"used\")"                          
#> [195] "     df %>% mutate(z = x + y, .keep = \"unused\")"                        
#> [196] "     df %>% mutate(z = x + y, .keep = \"none\")"                          
#> [197] "     "                                                                    
#> [198] "     # Grouping ----------------------------------------"                 
#> [199] "     # The mutate operation may yield different results on grouped"       
#> [200] "     # tibbles because the expressions are computed within groups."       
#> [201] "     # The following normalises `mass` by the global average:"            
#> [202] "     starwars %>%"                                                        
#> [203] "       select(name, mass, species) %>%"                                   
#> [204] "       mutate(mass_norm = mass / mean(mass, na.rm = TRUE))"               
#> [205] "     "                                                                    
#> [206] "     # Whereas this normalises `mass` by the averages within species"     
#> [207] "     # levels:"                                                           
#> [208] "     starwars %>%"                                                        
#> [209] "       select(name, mass, species) %>%"                                   
#> [210] "       group_by(species) %>%"                                             
#> [211] "       mutate(mass_norm = mass / mean(mass, na.rm = TRUE))"               
#> [212] "     "                                                                    
#> [213] "     # Indirection ----------------------------------------"              
#> [214] "     # Refer to column names stored as strings with the `.data` pronoun:" 
#> [215] "     vars <- c(\"mass\", \"height\")"                                     
#> [216] "     mutate(starwars, prod = .data[[vars[[1]]]] * .data[[vars[[2]]]])"    
#> [217] "     # Learn more in ?rlang::args_data_masking"                           
#> [218] "     "                                                                    
btw_this("{dplyr}")
#>   [1] "# Introduction to dplyr {#introduction-to-dplyr .title .toc-ignore}"                
#>   [2] ""                                                                                   
#>   [3] "When working with data you must:"                                                   
#>   [4] ""                                                                                   
#>   [5] "-   Figure out what you want to do."                                                
#>   [6] ""                                                                                   
#>   [7] "-   Describe those tasks in the form of a computer program."                        
#>   [8] ""                                                                                   
#>   [9] "-   Execute the program."                                                           
#>  [10] ""                                                                                   
#>  [11] "The dplyr package makes these steps fast and easy:"                                 
#>  [12] ""                                                                                   
#>  [13] "-   By constraining your options, it helps you think about your data"               
#>  [14] "    manipulation challenges."                                                       
#>  [15] ""                                                                                   
#>  [16] "-   It provides simple \"verbs\", functions that correspond to the most"            
#>  [17] "    common data manipulation tasks, to help you translate your thoughts"            
#>  [18] "    into code."                                                                     
#>  [19] ""                                                                                   
#>  [20] "-   It uses efficient backends, so you spend less time waiting for the"             
#>  [21] "    computer."                                                                      
#>  [22] ""                                                                                   
#>  [23] "This document introduces you to dplyr's basic set of tools, and shows"              
#>  [24] "you how to apply them to data frames. dplyr also supports databases via"            
#>  [25] "the dbplyr package, once you've installed, read `vignette(\"dbplyr\")` to"          
#>  [26] "learn more."                                                                        
#>  [27] ""                                                                                   
#>  [28] "::: {#data-starwars .section .level2}"                                              
#>  [29] "## Data: starwars"                                                                  
#>  [30] ""                                                                                   
#>  [31] "To explore the basic data manipulation verbs of dplyr, we'll use the"               
#>  [32] "dataset `starwars`. This dataset contains 87 characters and comes from"             
#>  [33] "the [Star Wars API](https://swapi.dev), and is documented in `?starwars`"           
#>  [34] ""                                                                                   
#>  [35] "::: {#cb1 .sourceCode}"                                                             
#>  [36] "``` {.sourceCode .r}"                                                               
#>  [37] "dim(starwars)"                                                                      
#>  [38] "#> [1] 87 14"                                                                       
#>  [39] "starwars"                                                                           
#>  [40] "#> # A tibble: 87 × 14"                                                             
#>  [41] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#>  [42] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#>  [43] "#> 1 Luke Sky…    172    77 blond      fair       blue            19   male  mascu…"
#>  [44] "#> 2 C-3PO        167    75 <NA>       gold       yellow         112   none  mascu…"
#>  [45] "#> 3 R2-D2         96    32 <NA>       white, bl… red             33   none  mascu…"
#>  [46] "#> 4 Darth Va…    202   136 none       white      yellow          41.9 male  mascu…"
#>  [47] "#> # ℹ 83 more rows"                                                                
#>  [48] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#>  [49] "#> #   vehicles <list>, starships <list>"                                           
#>  [50] "```"                                                                                
#>  [51] ":::"                                                                                
#>  [52] ""                                                                                   
#>  [53] "Note that `starwars` is a tibble, a modern reimagining of the data"                 
#>  [54] "frame. It's particularly useful for large datasets because it only"                 
#>  [55] "prints the first few rows. You can learn more about tibbles at"                     
#>  [56] "<https://tibble.tidyverse.org>; in particular you can convert data"                 
#>  [57] "frames to tibbles with `as_tibble()`."                                              
#>  [58] ":::"                                                                                
#>  [59] ""                                                                                   
#>  [60] "::: {#single-table-verbs .section .level2}"                                         
#>  [61] "## Single table verbs"                                                              
#>  [62] ""                                                                                   
#>  [63] "dplyr aims to provide a function for each basic verb of data"                       
#>  [64] "manipulation. These verbs can be organised into three categories based"             
#>  [65] "on the component of the dataset that they work with:"                               
#>  [66] ""                                                                                   
#>  [67] "-   Rows:"                                                                          
#>  [68] "    -   `filter()` chooses rows based on column values."                            
#>  [69] "    -   `slice()` chooses rows based on location."                                  
#>  [70] "    -   `arrange()` changes the order of the rows."                                 
#>  [71] "-   Columns:"                                                                       
#>  [72] "    -   `select()` changes whether or not a column is included."                    
#>  [73] "    -   `rename()` changes the name of columns."                                    
#>  [74] "    -   `mutate()` changes the values of columns and creates new"                   
#>  [75] "        columns."                                                                   
#>  [76] "    -   `relocate()` changes the order of the columns."                             
#>  [77] "-   Groups of rows:"                                                                
#>  [78] "    -   `summarise()` collapses a group into a single row."                         
#>  [79] ""                                                                                   
#>  [80] "::: {#the-pipe .section .level3}"                                                   
#>  [81] "### The pipe"                                                                       
#>  [82] ""                                                                                   
#>  [83] "All of the dplyr functions take a data frame (or tibble) as the first"              
#>  [84] "argument. Rather than forcing the user to either save intermediate"                 
#>  [85] "objects or nest functions, dplyr provides the `%>%` operator from"                  
#>  [86] "magrittr. `x %>% f(y)` turns into `f(x, y)` so the result from one step"            
#>  [87] "is then \"piped\" into the next step. You can use the pipe to rewrite"              
#>  [88] "multiple operations so that they read left-to-right, top-to-bottom"                 
#>  [89] "(reading the pipe operator as \"then\")."                                           
#>  [90] ":::"                                                                                
#>  [91] ""                                                                                   
#>  [92] "::: {#filter-rows-with-filter .section .level3}"                                    
#>  [93] "### Filter rows with `filter()`"                                                    
#>  [94] ""                                                                                   
#>  [95] "`filter()` allows you to select a subset of rows in a data frame. Like"             
#>  [96] "all single verbs, the first argument is the tibble (or data frame). The"            
#>  [97] "second and subsequent arguments refer to variables within that data"                
#>  [98] "frame, selecting rows where the expression is `TRUE`."                              
#>  [99] ""                                                                                   
#> [100] "For example, we can select all characters with light skin color and brown"          
#> [101] "eyes with:"                                                                         
#> [102] ""                                                                                   
#> [103] "::: {#cb2 .sourceCode}"                                                             
#> [104] "``` {.sourceCode .r}"                                                               
#> [105] "starwars %>% filter(skin_color == \"light\", eye_color == \"brown\")"               
#> [106] "#> # A tibble: 7 × 14"                                                              
#> [107] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [108] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [109] "#> 1 Leia Org…    150    49 brown      light      brown             19 fema… femin…"
#> [110] "#> 2 Biggs Da…    183    84 black      light      brown             24 male  mascu…"
#> [111] "#> 3 Padmé Am…    185    45 brown      light      brown             46 fema… femin…"
#> [112] "#> 4 Cordé        157    NA brown      light      brown             NA <NA>  <NA>  "
#> [113] "#> # ℹ 3 more rows"                                                                 
#> [114] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [115] "#> #   vehicles <list>, starships <list>"                                           
#> [116] "```"                                                                                
#> [117] ":::"                                                                                
#> [118] ""                                                                                   
#> [119] "This is roughly equivalent to this base R code:"                                    
#> [120] ""                                                                                   
#> [121] "::: {#cb3 .sourceCode}"                                                             
#> [122] "``` {.sourceCode .r}"                                                               
#> [123] "starwars[starwars$skin_color == \"light\" & starwars$eye_color == \"brown\", ]"     
#> [124] "```"                                                                                
#> [125] ":::"                                                                                
#> [126] ":::"                                                                                
#> [127] ""                                                                                   
#> [128] "::: {#arrange-rows-with-arrange .section .level3}"                                  
#> [129] "### Arrange rows with `arrange()`"                                                  
#> [130] ""                                                                                   
#> [131] "`arrange()` works similarly to `filter()` except that instead of"                   
#> [132] "filtering or selecting rows, it reorders them. It takes a data frame,"              
#> [133] "and a set of column names (or more complicated expressions) to order by."           
#> [134] "If you provide more than one column name, each additional column will be"           
#> [135] "used to break ties in the values of preceding columns:"                             
#> [136] ""                                                                                   
#> [137] "::: {#cb4 .sourceCode}"                                                             
#> [138] "``` {.sourceCode .r}"                                                               
#> [139] "starwars %>% arrange(height, mass)"                                                 
#> [140] "#> # A tibble: 87 × 14"                                                             
#> [141] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [142] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [143] "#> 1 Yoda          66    17 white      green      brown            896 male  mascu…"
#> [144] "#> 2 Ratts Ty…     79    15 none       grey, blue unknown           NA male  mascu…"
#> [145] "#> 3 Wicket S…     88    20 brown      brown      brown              8 male  mascu…"
#> [146] "#> 4 Dud Bolt      94    45 none       blue, grey yellow            NA male  mascu…"
#> [147] "#> # ℹ 83 more rows"                                                                
#> [148] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [149] "#> #   vehicles <list>, starships <list>"                                           
#> [150] "```"                                                                                
#> [151] ":::"                                                                                
#> [152] ""                                                                                   
#> [153] "Use `desc()` to order a column in descending order:"                                
#> [154] ""                                                                                   
#> [155] "::: {#cb5 .sourceCode}"                                                             
#> [156] "``` {.sourceCode .r}"                                                               
#> [157] "starwars %>% arrange(desc(height))"                                                 
#> [158] "#> # A tibble: 87 × 14"                                                             
#> [159] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [160] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [161] "#> 1 Yarael P…    264    NA none       white      yellow            NA male  mascu…"
#> [162] "#> 2 Tarfful      234   136 brown      brown      blue              NA male  mascu…"
#> [163] "#> 3 Lama Su      229    88 none       grey       black             NA male  mascu…"
#> [164] "#> 4 Chewbacca    228   112 brown      unknown    blue             200 male  mascu…"
#> [165] "#> # ℹ 83 more rows"                                                                
#> [166] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [167] "#> #   vehicles <list>, starships <list>"                                           
#> [168] "```"                                                                                
#> [169] ":::"                                                                                
#> [170] ":::"                                                                                
#> [171] ""                                                                                   
#> [172] "::: {#choose-rows-using-their-position-with-slice .section .level3}"                
#> [173] "### Choose rows using their position with `slice()`"                                
#> [174] ""                                                                                   
#> [175] "`slice()` lets you index rows by their (integer) locations. It allows"              
#> [176] "you to select, remove, and duplicate rows."                                         
#> [177] ""                                                                                   
#> [178] "We can get characters from row numbers 5 through 10."                               
#> [179] ""                                                                                   
#> [180] "::: {#cb6 .sourceCode}"                                                             
#> [181] "``` {.sourceCode .r}"                                                               
#> [182] "starwars %>% slice(5:10)"                                                           
#> [183] "#> # A tibble: 6 × 14"                                                              
#> [184] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [185] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [186] "#> 1 Leia Org…    150    49 brown      light      brown             19 fema… femin…"
#> [187] "#> 2 Owen Lars    178   120 brown, gr… light      blue              52 male  mascu…"
#> [188] "#> 3 Beru Whi…    165    75 brown      light      blue              47 fema… femin…"
#> [189] "#> 4 R5-D4         97    32 <NA>       white, red red               NA none  mascu…"
#> [190] "#> # ℹ 2 more rows"                                                                 
#> [191] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [192] "#> #   vehicles <list>, starships <list>"                                           
#> [193] "```"                                                                                
#> [194] ":::"                                                                                
#> [195] ""                                                                                   
#> [196] "It is accompanied by a number of helpers for common use cases:"                     
#> [197] ""                                                                                   
#> [198] "-   `slice_head()` and `slice_tail()` select the first or last rows."               
#> [199] ""                                                                                   
#> [200] "::: {#cb7 .sourceCode}"                                                             
#> [201] "``` {.sourceCode .r}"                                                               
#> [202] "starwars %>% slice_head(n = 3)"                                                     
#> [203] "#> # A tibble: 3 × 14"                                                              
#> [204] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [205] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [206] "#> 1 Luke Sky…    172    77 blond      fair       blue              19 male  mascu…"
#> [207] "#> 2 C-3PO        167    75 <NA>       gold       yellow           112 none  mascu…"
#> [208] "#> 3 R2-D2         96    32 <NA>       white, bl… red               33 none  mascu…"
#> [209] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [210] "#> #   vehicles <list>, starships <list>"                                           
#> [211] "```"                                                                                
#> [212] ":::"                                                                                
#> [213] ""                                                                                   
#> [214] "-   `slice_sample()` randomly selects rows. Use the option prop to"                 
#> [215] "    choose a certain proportion of the cases."                                      
#> [216] ""                                                                                   
#> [217] "::: {#cb8 .sourceCode}"                                                             
#> [218] "``` {.sourceCode .r}"                                                               
#> [219] "starwars %>% slice_sample(n = 5)"                                                   
#> [220] "#> # A tibble: 5 × 14"                                                              
#> [221] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [222] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [223] "#> 1 Ayla Sec…    178  55   none       blue       hazel             48 fema… femin…"
#> [224] "#> 2 Bossk        190 113   none       green      red               53 male  mascu…"
#> [225] "#> 3 San Hill     191  NA   none       grey       gold              NA male  mascu…"
#> [226] "#> 4 Luminara…    170  56.2 black      yellow     blue              58 fema… femin…"
#> [227] "#> # ℹ 1 more row"                                                                  
#> [228] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [229] "#> #   vehicles <list>, starships <list>"                                           
#> [230] "starwars %>% slice_sample(prop = 0.1)"                                              
#> [231] "#> # A tibble: 8 × 14"                                                              
#> [232] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [233] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [234] "#> 1 Qui-Gon …    193    89 brown      fair       blue              92 male  mascu…"
#> [235] "#> 2 Jango Fe…    183    79 black      tan        brown             66 male  mascu…"
#> [236] "#> 3 Jocasta …    167    NA white      fair       blue              NA fema… femin…"
#> [237] "#> 4 Zam Wese…    168    55 blonde     fair, gre… yellow            NA fema… femin…"
#> [238] "#> # ℹ 4 more rows"                                                                 
#> [239] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [240] "#> #   vehicles <list>, starships <list>"                                           
#> [241] "```"                                                                                
#> [242] ":::"                                                                                
#> [243] ""                                                                                   
#> [244] "Use `replace = TRUE` to perform a bootstrap sample. If needed, you can"             
#> [245] "weight the sample with the `weight_by` argument."                                   
#> [246] ""                                                                                   
#> [247] "-   `slice_min()` and `slice_max()` select rows with lowest or highest"             
#> [248] "    values of a variable. Note that we first must choose only the values"           
#> [249] "    which are not NA."                                                              
#> [250] ""                                                                                   
#> [251] "::: {#cb9 .sourceCode}"                                                             
#> [252] "``` {.sourceCode .r}"                                                               
#> [253] "starwars %>%"                                                                       
#> [254] "  filter(!is.na(height)) %>%"                                                       
#> [255] "  slice_max(height, n = 3)"                                                         
#> [256] "#> # A tibble: 3 × 14"                                                              
#> [257] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [258] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [259] "#> 1 Yarael P…    264    NA none       white      yellow            NA male  mascu…"
#> [260] "#> 2 Tarfful      234   136 brown      brown      blue              NA male  mascu…"
#> [261] "#> 3 Lama Su      229    88 none       grey       black             NA male  mascu…"
#> [262] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [263] "#> #   vehicles <list>, starships <list>"                                           
#> [264] "```"                                                                                
#> [265] ":::"                                                                                
#> [266] ":::"                                                                                
#> [267] ""                                                                                   
#> [268] "::: {#select-columns-with-select .section .level3}"                                 
#> [269] "### Select columns with `select()`"                                                 
#> [270] ""                                                                                   
#> [271] "Often you work with large datasets with many columns but only a few are"            
#> [272] "actually of interest to you. `select()` allows you to rapidly zoom in on"           
#> [273] "a useful subset using operations that usually only work on numeric"                 
#> [274] "variable positions:"                                                                
#> [275] ""                                                                                   
#> [276] "::: {#cb10 .sourceCode}"                                                            
#> [277] "``` {.sourceCode .r}"                                                               
#> [278] "# Select columns by name"                                                           
#> [279] "starwars %>% select(hair_color, skin_color, eye_color)"                             
#> [280] "#> # A tibble: 87 × 3"                                                              
#> [281] "#>   hair_color skin_color  eye_color"                                              
#> [282] "#>   <chr>      <chr>       <chr>    "                                              
#> [283] "#> 1 blond      fair        blue     "                                              
#> [284] "#> 2 <NA>       gold        yellow   "                                              
#> [285] "#> 3 <NA>       white, blue red      "                                              
#> [286] "#> 4 none       white       yellow   "                                              
#> [287] "#> # ℹ 83 more rows"                                                                
#> [288] "# Select all columns between hair_color and eye_color (inclusive)"                  
#> [289] "starwars %>% select(hair_color:eye_color)"                                          
#> [290] "#> # A tibble: 87 × 3"                                                              
#> [291] "#>   hair_color skin_color  eye_color"                                              
#> [292] "#>   <chr>      <chr>       <chr>    "                                              
#> [293] "#> 1 blond      fair        blue     "                                              
#> [294] "#> 2 <NA>       gold        yellow   "                                              
#> [295] "#> 3 <NA>       white, blue red      "                                              
#> [296] "#> 4 none       white       yellow   "                                              
#> [297] "#> # ℹ 83 more rows"                                                                
#> [298] "# Select all columns except those from hair_color to eye_color (inclusive)"         
#> [299] "starwars %>% select(!(hair_color:eye_color))"                                       
#> [300] "#> # A tibble: 87 × 11"                                                             
#> [301] "#>   name     height  mass birth_year sex   gender homeworld species films vehicles"
#> [302] "#>   <chr>     <int> <dbl>      <dbl> <chr> <chr>  <chr>     <chr>   <lis> <list>  "
#> [303] "#> 1 Luke Sk…    172    77       19   male  mascu… Tatooine  Human   <chr> <chr>   "
#> [304] "#> 2 C-3PO       167    75      112   none  mascu… Tatooine  Droid   <chr> <chr>   "
#> [305] "#> 3 R2-D2        96    32       33   none  mascu… Naboo     Droid   <chr> <chr>   "
#> [306] "#> 4 Darth V…    202   136       41.9 male  mascu… Tatooine  Human   <chr> <chr>   "
#> [307] "#> # ℹ 83 more rows"                                                                
#> [308] "#> # ℹ 1 more variable: starships <list>"                                           
#> [309] "# Select all columns ending with color"                                             
#> [310] "starwars %>% select(ends_with(\"color\"))"                                          
#> [311] "#> # A tibble: 87 × 3"                                                              
#> [312] "#>   hair_color skin_color  eye_color"                                              
#> [313] "#>   <chr>      <chr>       <chr>    "                                              
#> [314] "#> 1 blond      fair        blue     "                                              
#> [315] "#> 2 <NA>       gold        yellow   "                                              
#> [316] "#> 3 <NA>       white, blue red      "                                              
#> [317] "#> 4 none       white       yellow   "                                              
#> [318] "#> # ℹ 83 more rows"                                                                
#> [319] "```"                                                                                
#> [320] ":::"                                                                                
#> [321] ""                                                                                   
#> [322] "There are a number of helper functions you can use within `select()`,"              
#> [323] "like `starts_with()`, `ends_with()`, `matches()` and `contains()`. These"           
#> [324] "let you quickly match larger blocks of variables that meet some"                    
#> [325] "criterion. See `?select` for more details."                                         
#> [326] ""                                                                                   
#> [327] "You can rename variables with `select()` by using named arguments:"                 
#> [328] ""                                                                                   
#> [329] "::: {#cb11 .sourceCode}"                                                            
#> [330] "``` {.sourceCode .r}"                                                               
#> [331] "starwars %>% select(home_world = homeworld)"                                        
#> [332] "#> # A tibble: 87 × 1"                                                              
#> [333] "#>   home_world"                                                                    
#> [334] "#>   <chr>     "                                                                    
#> [335] "#> 1 Tatooine  "                                                                    
#> [336] "#> 2 Tatooine  "                                                                    
#> [337] "#> 3 Naboo     "                                                                    
#> [338] "#> 4 Tatooine  "                                                                    
#> [339] "#> # ℹ 83 more rows"                                                                
#> [340] "```"                                                                                
#> [341] ":::"                                                                                
#> [342] ""                                                                                   
#> [343] "But because `select()` drops all the variables not explicitly mentioned,"           
#> [344] "it's not that useful. Instead, use `rename()`:"                                     
#> [345] ""                                                                                   
#> [346] "::: {#cb12 .sourceCode}"                                                            
#> [347] "``` {.sourceCode .r}"                                                               
#> [348] "starwars %>% rename(home_world = homeworld)"                                        
#> [349] "#> # A tibble: 87 × 14"                                                             
#> [350] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [351] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [352] "#> 1 Luke Sky…    172    77 blond      fair       blue            19   male  mascu…"
#> [353] "#> 2 C-3PO        167    75 <NA>       gold       yellow         112   none  mascu…"
#> [354] "#> 3 R2-D2         96    32 <NA>       white, bl… red             33   none  mascu…"
#> [355] "#> 4 Darth Va…    202   136 none       white      yellow          41.9 male  mascu…"
#> [356] "#> # ℹ 83 more rows"                                                                
#> [357] "#> # ℹ 5 more variables: home_world <chr>, species <chr>, films <list>,"            
#> [358] "#> #   vehicles <list>, starships <list>"                                           
#> [359] "```"                                                                                
#> [360] ":::"                                                                                
#> [361] ":::"                                                                                
#> [362] ""                                                                                   
#> [363] "::: {#add-new-columns-with-mutate .section .level3}"                                
#> [364] "### Add new columns with `mutate()`"                                                
#> [365] ""                                                                                   
#> [366] "Besides selecting sets of existing columns, it's often useful to add new"           
#> [367] "columns that are functions of existing columns. This is the job of"                 
#> [368] "`mutate()`:"                                                                        
#> [369] ""                                                                                   
#> [370] "::: {#cb13 .sourceCode}"                                                            
#> [371] "``` {.sourceCode .r}"                                                               
#> [372] "starwars %>% mutate(height_m = height / 100)"                                       
#> [373] "#> # A tibble: 87 × 15"                                                             
#> [374] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [375] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [376] "#> 1 Luke Sky…    172    77 blond      fair       blue            19   male  mascu…"
#> [377] "#> 2 C-3PO        167    75 <NA>       gold       yellow         112   none  mascu…"
#> [378] "#> 3 R2-D2         96    32 <NA>       white, bl… red             33   none  mascu…"
#> [379] "#> 4 Darth Va…    202   136 none       white      yellow          41.9 male  mascu…"
#> [380] "#> # ℹ 83 more rows"                                                                
#> [381] "#> # ℹ 6 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [382] "#> #   vehicles <list>, starships <list>, height_m <dbl>"                           
#> [383] "```"                                                                                
#> [384] ":::"                                                                                
#> [385] ""                                                                                   
#> [386] "We can't see the height in meters we just calculated, but we can fix"               
#> [387] "that using a select command."                                                       
#> [388] ""                                                                                   
#> [389] "::: {#cb14 .sourceCode}"                                                            
#> [390] "``` {.sourceCode .r}"                                                               
#> [391] "starwars %>%"                                                                       
#> [392] "  mutate(height_m = height / 100) %>%"                                              
#> [393] "  select(height_m, height, everything())"                                           
#> [394] "#> # A tibble: 87 × 15"                                                             
#> [395] "#>   height_m height name     mass hair_color skin_color eye_color birth_year sex  "
#> [396] "#>      <dbl>  <int> <chr>   <dbl> <chr>      <chr>      <chr>          <dbl> <chr>"
#> [397] "#> 1     1.72    172 Luke S…    77 blond      fair       blue            19   male "
#> [398] "#> 2     1.67    167 C-3PO      75 <NA>       gold       yellow         112   none "
#> [399] "#> 3     0.96     96 R2-D2      32 <NA>       white, bl… red             33   none "
#> [400] "#> 4     2.02    202 Darth …   136 none       white      yellow          41.9 male "
#> [401] "#> # ℹ 83 more rows"                                                                
#> [402] "#> # ℹ 6 more variables: gender <chr>, homeworld <chr>, species <chr>,"             
#> [403] "#> #   films <list>, vehicles <list>, starships <list>"                             
#> [404] "```"                                                                                
#> [405] ":::"                                                                                
#> [406] ""                                                                                   
#> [407] "`dplyr::mutate()` is similar to the base `transform()`, but allows you"             
#> [408] "to refer to columns that you've just created:"                                      
#> [409] ""                                                                                   
#> [410] "::: {#cb15 .sourceCode}"                                                            
#> [411] "``` {.sourceCode .r}"                                                               
#> [412] "starwars %>%"                                                                       
#> [413] "  mutate("                                                                          
#> [414] "    height_m = height / 100,"                                                       
#> [415] "    BMI = mass / (height_m^2)"                                                      
#> [416] "  ) %>%"                                                                            
#> [417] "  select(BMI, everything())"                                                        
#> [418] "#> # A tibble: 87 × 16"                                                             
#> [419] "#>     BMI name       height  mass hair_color skin_color eye_color birth_year sex  "
#> [420] "#>   <dbl> <chr>       <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr>"
#> [421] "#> 1  26.0 Luke Skyw…    172    77 blond      fair       blue            19   male "
#> [422] "#> 2  26.9 C-3PO         167    75 <NA>       gold       yellow         112   none "
#> [423] "#> 3  34.7 R2-D2          96    32 <NA>       white, bl… red             33   none "
#> [424] "#> 4  33.3 Darth Vad…    202   136 none       white      yellow          41.9 male "
#> [425] "#> # ℹ 83 more rows"                                                                
#> [426] "#> # ℹ 7 more variables: gender <chr>, homeworld <chr>, species <chr>,"             
#> [427] "#> #   films <list>, vehicles <list>, starships <list>, height_m <dbl>"             
#> [428] "```"                                                                                
#> [429] ":::"                                                                                
#> [430] ""                                                                                   
#> [431] "If you only want to keep the new variables, use `.keep = \"none\"`:"                
#> [432] ""                                                                                   
#> [433] "::: {#cb16 .sourceCode}"                                                            
#> [434] "``` {.sourceCode .r}"                                                               
#> [435] "starwars %>%"                                                                       
#> [436] "  mutate("                                                                          
#> [437] "    height_m = height / 100,"                                                       
#> [438] "    BMI = mass / (height_m^2),"                                                     
#> [439] "    .keep = \"none\""                                                               
#> [440] "  )"                                                                                
#> [441] "#> # A tibble: 87 × 2"                                                              
#> [442] "#>   height_m   BMI"                                                                
#> [443] "#>      <dbl> <dbl>"                                                                
#> [444] "#> 1     1.72  26.0"                                                                
#> [445] "#> 2     1.67  26.9"                                                                
#> [446] "#> 3     0.96  34.7"                                                                
#> [447] "#> 4     2.02  33.3"                                                                
#> [448] "#> # ℹ 83 more rows"                                                                
#> [449] "```"                                                                                
#> [450] ":::"                                                                                
#> [451] ":::"                                                                                
#> [452] ""                                                                                   
#> [453] "::: {#change-column-order-with-relocate .section .level3}"                          
#> [454] "### Change column order with `relocate()`"                                          
#> [455] ""                                                                                   
#> [456] "Use a similar syntax as `select()` to move blocks of columns at once:"
#> [457] ""                                                                                   
#> [458] "::: {#cb17 .sourceCode}"                                                            
#> [459] "``` {.sourceCode .r}"                                                               
#> [460] "starwars %>% relocate(sex:homeworld, .before = height)"                             
#> [461] "#> # A tibble: 87 × 14"                                                             
#> [462] "#>   name       sex   gender homeworld height  mass hair_color skin_color eye_color"
#> [463] "#>   <chr>      <chr> <chr>  <chr>      <int> <dbl> <chr>      <chr>      <chr>    "
#> [464] "#> 1 Luke Skyw… male  mascu… Tatooine     172    77 blond      fair       blue     "
#> [465] "#> 2 C-3PO      none  mascu… Tatooine     167    75 <NA>       gold       yellow   "
#> [466] "#> 3 R2-D2      none  mascu… Naboo         96    32 <NA>       white, bl… red      "
#> [467] "#> 4 Darth Vad… male  mascu… Tatooine     202   136 none       white      yellow   "
#> [468] "#> # ℹ 83 more rows"                                                                
#> [469] "#> # ℹ 5 more variables: birth_year <dbl>, species <chr>, films <list>,"            
#> [470] "#> #   vehicles <list>, starships <list>"                                           
#> [471] "```"                                                                                
#> [472] ":::"                                                                                
#> [473] ":::"                                                                                
#> [474] ""                                                                                   
#> [475] "::: {#summarise-values-with-summarise .section .level3}"                            
#> [476] "### Summarise values with `summarise()`"                                            
#> [477] ""                                                                                   
#> [478] "The last verb is `summarise()`. It collapses a data frame to a single"              
#> [479] "row."                                                                               
#> [480] ""                                                                                   
#> [481] "::: {#cb18 .sourceCode}"                                                            
#> [482] "``` {.sourceCode .r}"                                                               
#> [483] "starwars %>% summarise(height = mean(height, na.rm = TRUE))"                        
#> [484] "#> # A tibble: 1 × 1"                                                               
#> [485] "#>   height"                                                                        
#> [486] "#>    <dbl>"                                                                        
#> [487] "#> 1   175."                                                                        
#> [488] "```"                                                                                
#> [489] ":::"                                                                                
#> [490] ""                                                                                   
#> [491] "It's not that useful until we learn the `group_by()` verb below."                   
#> [492] ":::"                                                                                
#> [493] ""                                                                                   
#> [494] "::: {#commonalities .section .level3}"                                              
#> [495] "### Commonalities"                                                                  
#> [496] ""                                                                                   
#> [497] "You may have noticed that the syntax and function of all these verbs are"           
#> [498] "very similar:"                                                                      
#> [499] ""                                                                                   
#> [500] "-   The first argument is a data frame."                                            
#> [501] ""                                                                                   
#> [502] "-   The subsequent arguments describe what to do with the data frame."              
#> [503] "    You can refer to columns in the data frame directly without using"              
#> [504] "    `$`."                                                                           
#> [505] ""                                                                                   
#> [506] "-   The result is a new data frame."
#> [507] ""                                                                                   
#> [508] "Together these properties make it easy to chain together multiple simple"           
#> [509] "steps to achieve a complex result."                                                 
#> [510] ""                                                                                   
#> [511] "These five functions provide the basis of a language of data"                       
#> [512] "manipulation. At the most basic level, you can only alter a tidy data"              
#> [513] "frame in five useful ways: you can reorder the rows (`arrange()`), pick"            
#> [514] "observations and variables of interest (`filter()` and `select()`), add"            
#> [515] "new variables that are functions of existing variables (`mutate()`), or"            
#> [516] "collapse many values to a summary (`summarise()`)."                                 
#> [517] ":::"                                                                                
#> [518] ":::"                                                                                
#> [519] ""                                                                                   
#> [520] "::: {#combining-functions-with .section .level2}"                                   
#> [521] "## Combining functions with `%>%`"                                                  
#> [522] ""                                                                                   
#> [523] "The dplyr API is functional in the sense that function calls don't have"            
#> [524] "side-effects. You must always save their results. This doesn't lead to"             
#> [525] "particularly elegant code, especially if you want to do many operations"            
#> [526] "at once. You either have to do it step-by-step:"                                    
#> [527] ""                                                                                   
#> [528] "::: {#cb19 .sourceCode}"                                                            
#> [529] "``` {.sourceCode .r}"                                                               
#> [530] "a1 <- group_by(starwars, species, sex)"                                             
#> [531] "a2 <- select(a1, height, mass)"                                                     
#> [532] "a3 <- summarise(a2,"                                                                
#> [533] "  height = mean(height, na.rm = TRUE),"                                             
#> [534] "  mass = mean(mass, na.rm = TRUE)"                                                  
#> [535] ")"                                                                                  
#> [536] "```"                                                                                
#> [537] ":::"                                                                                
#> [538] ""                                                                                   
#> [539] "Or if you don't want to name the intermediate results, you need to wrap"            
#> [540] "the function calls inside each other:"                                              
#> [541] ""                                                                                   
#> [542] "::: {#cb20 .sourceCode}"                                                            
#> [543] "``` {.sourceCode .r}"                                                               
#> [544] "summarise("                                                                         
#> [545] "  select("                                                                          
#> [546] "    group_by(starwars, species, sex),"                                              
#> [547] "    height, mass"                                                                   
#> [548] "  ),"                                                                               
#> [549] "  height = mean(height, na.rm = TRUE),"                                             
#> [550] "  mass = mean(mass, na.rm = TRUE)"                                                  
#> [551] ")"                                                                                  
#> [552] "#> Adding missing grouping variables: `species`, `sex`"                             
#> [553] "#> `summarise()` has grouped output by 'species'. You can override using the"       
#> [554] "#> `.groups` argument."                                                             
#> [555] "#> # A tibble: 41 × 4"                                                              
#> [556] "#> # Groups:   species [38]"                                                        
#> [557] "#>   species  sex   height  mass"                                                   
#> [558] "#>   <chr>    <chr>  <dbl> <dbl>"                                                   
#> [559] "#> 1 Aleena   male      79    15"                                                   
#> [560] "#> 2 Besalisk male     198   102"                                                   
#> [561] "#> 3 Cerean   male     198    82"                                                   
#> [562] "#> 4 Chagrian male     196   NaN"                                                   
#> [563] "#> # ℹ 37 more rows"                                                                
#> [564] "```"                                                                                
#> [565] ":::"                                                                                
#> [566] ""                                                                                   
#> [567] "This is difficult to read because the order of the operations is from"              
#> [568] "inside to out. Thus, the arguments are a long way away from the"                    
#> [569] "function. To get around this problem, dplyr provides the `%>%` operator"            
#> [570] "from magrittr. `x %>% f(y)` turns into `f(x, y)` so you can use it to"              
#> [571] "rewrite multiple operations so that they read left-to-right,"
#> [572] "top-to-bottom (reading the pipe operator as \"then\"):"                             
#> [573] ""                                                                                   
#> [574] "::: {#cb21 .sourceCode}"                                                            
#> [575] "``` {.sourceCode .r}"                                                               
#> [576] "starwars %>%"                                                                       
#> [577] "  group_by(species, sex) %>%"                                                       
#> [578] "  select(height, mass) %>%"                                                         
#> [579] "  summarise("                                                                       
#> [580] "    height = mean(height, na.rm = TRUE),"                                           
#> [581] "    mass = mean(mass, na.rm = TRUE)"                                                
#> [582] "  )"                                                                                
#> [583] "```"                                                                                
#> [584] ":::"                                                                                
#> [585] ":::"                                                                                
#> [586] ""                                                                                   
#> [587] "::: {#patterns-of-operations .section .level2}"                                     
#> [588] "## Patterns of operations"                                                          
#> [589] ""                                                                                   
#> [590] "The dplyr verbs can be classified by the type of operations they"                   
#> [591] "accomplish (we sometimes speak of their **semantics**, i.e., their"                 
#> [592] "meaning). It's helpful to have a good grasp of the difference between"              
#> [593] "select and mutate operations."                                                      
#> [594] ""                                                                                   
#> [595] "::: {#selecting-operations .section .level3}"                                       
#> [596] "### Selecting operations"                                                           
#> [597] ""                                                                                   
#> [598] "One of the appealing features of dplyr is that you can refer to columns"            
#> [599] "from the tibble as if they were regular variables. However, the"                    
#> [600] "syntactic uniformity of referring to bare column names hides semantic"
#> [601] "differences across the verbs. A column symbol supplied to `select()`"               
#> [602] "does not have the same meaning as the same symbol supplied to"                      
#> [603] "`mutate()`."                                                                        
#> [604] ""                                                                                   
#> [605] "Selecting operations expect column names and positions. Hence, when you"            
#> [606] "call `select()` with bare variable names, they actually represent their"            
#> [607] "own positions in the tibble. The following calls are completely"                    
#> [608] "equivalent from dplyr's point of view:"                                             
#> [609] ""                                                                                   
#> [610] "::: {#cb22 .sourceCode}"                                                            
#> [611] "``` {.sourceCode .r}"                                                               
#> [612] "# `name` represents the integer 1"                                                  
#> [613] "select(starwars, name)"                                                             
#> [614] "#> # A tibble: 87 × 1"                                                              
#> [615] "#>   name          "                                                                
#> [616] "#>   <chr>         "                                                                
#> [617] "#> 1 Luke Skywalker"                                                                
#> [618] "#> 2 C-3PO         "                                                                
#> [619] "#> 3 R2-D2         "                                                                
#> [620] "#> 4 Darth Vader   "                                                                
#> [621] "#> # ℹ 83 more rows"                                                                
#> [622] "select(starwars, 1)"                                                                
#> [623] "#> # A tibble: 87 × 1"                                                              
#> [624] "#>   name          "                                                                
#> [625] "#>   <chr>         "                                                                
#> [626] "#> 1 Luke Skywalker"                                                                
#> [627] "#> 2 C-3PO         "                                                                
#> [628] "#> 3 R2-D2         "                                                                
#> [629] "#> 4 Darth Vader   "                                                                
#> [630] "#> # ℹ 83 more rows"                                                                
#> [631] "```"                                                                                
#> [632] ":::"                                                                                
#> [633] ""                                                                                   
#> [634] "By the same token, this means that you cannot refer to variables from"              
#> [635] "the surrounding context if they have the same name as one of the"                   
#> [636] "columns. In the following example, `height` still represents 2, not 5:"             
#> [637] ""                                                                                   
#> [638] "::: {#cb23 .sourceCode}"                                                            
#> [639] "``` {.sourceCode .r}"                                                               
#> [640] "height <- 5"                                                                        
#> [641] "select(starwars, height)"                                                           
#> [642] "#> # A tibble: 87 × 1"                                                              
#> [643] "#>   height"                                                                        
#> [644] "#>    <int>"                                                                        
#> [645] "#> 1    172"                                                                        
#> [646] "#> 2    167"                                                                        
#> [647] "#> 3     96"                                                                        
#> [648] "#> 4    202"                                                                        
#> [649] "#> # ℹ 83 more rows"                                                                
#> [650] "```"                                                                                
#> [651] ":::"                                                                                
#> [652] ""                                                                                   
#> [653] "One useful subtlety is that this only applies to bare names and to"                 
#> [654] "selecting calls like `c(height, mass)` or `height:mass`. In all other"              
#> [655] "cases, the columns of the data frame are not put in scope. This allows"             
#> [656] "you to refer to contextual variables in selection helpers:"                         
#> [657] ""                                                                                   
#> [658] "::: {#cb24 .sourceCode}"                                                            
#> [659] "``` {.sourceCode .r}"                                                               
#> [660] "name <- \"color\""                                                                  
#> [661] "select(starwars, ends_with(name))"                                                  
#> [662] "#> # A tibble: 87 × 3"                                                              
#> [663] "#>   hair_color skin_color  eye_color"                                              
#> [664] "#>   <chr>      <chr>       <chr>    "                                              
#> [665] "#> 1 blond      fair        blue     "                                              
#> [666] "#> 2 <NA>       gold        yellow   "                                              
#> [667] "#> 3 <NA>       white, blue red      "                                              
#> [668] "#> 4 none       white       yellow   "                                              
#> [669] "#> # ℹ 83 more rows"                                                                
#> [670] "```"                                                                                
#> [671] ":::"                                                                                
#> [672] ""                                                                                   
#> [673] "These semantics are usually intuitive. But note the subtle difference:"             
#> [674] ""                                                                                   
#> [675] "::: {#cb25 .sourceCode}"                                                            
#> [676] "``` {.sourceCode .r}"                                                               
#> [677] "name <- 5"                                                                          
#> [678] "select(starwars, name, identity(name))"                                             
#> [679] "#> # A tibble: 87 × 2"                                                              
#> [680] "#>   name           skin_color "                                                    
#> [681] "#>   <chr>          <chr>      "                                                    
#> [682] "#> 1 Luke Skywalker fair       "                                                    
#> [683] "#> 2 C-3PO          gold       "                                                    
#> [684] "#> 3 R2-D2          white, blue"                                                    
#> [685] "#> 4 Darth Vader    white      "                                                    
#> [686] "#> # ℹ 83 more rows"                                                                
#> [687] "```"                                                                                
#> [688] ":::"                                                                                
#> [689] ""                                                                                   
#> [690] "In the first argument, `name` represents its own position `1`. In the"              
#> [691] "second argument, `name` is evaluated in the surrounding context and"                
#> [692] "represents the fifth column."                                                       
#> [693] ""                                                                                   
#> [694] "For a long time, `select()` understood only column positions."
#> [695] "Starting with dplyr 0.6, it also understands column names. This"
#> [696] "makes it a bit easier to program with `select()`:"                                  
#> [697] ""                                                                                   
#> [698] "::: {#cb26 .sourceCode}"                                                            
#> [699] "``` {.sourceCode .r}"                                                               
#> [700] "vars <- c(\"name\", \"height\")"                                                    
#> [701] "select(starwars, all_of(vars), \"mass\")"                                           
#> [702] "#> # A tibble: 87 × 3"                                                              
#> [703] "#>   name           height  mass"                                                   
#> [704] "#>   <chr>           <int> <dbl>"                                                   
#> [705] "#> 1 Luke Skywalker    172    77"                                                   
#> [706] "#> 2 C-3PO             167    75"                                                   
#> [707] "#> 3 R2-D2              96    32"                                                   
#> [708] "#> 4 Darth Vader       202   136"                                                   
#> [709] "#> # ℹ 83 more rows"                                                                
#> [710] "```"                                                                                
#> [711] ":::"                                                                                
#> [712] ":::"                                                                                
#> [713] ""                                                                                   
#> [714] "::: {#mutating-operations .section .level3}"                                        
#> [715] "### Mutating operations"                                                            
#> [716] ""                                                                                   
#> [717] "Mutate semantics are quite different from selection semantics. Whereas"             
#> [718] "`select()` expects column names or positions, `mutate()` expects *column"           
#> [719] "vectors*. We will set up a smaller tibble to use for our examples."                 
#> [720] ""                                                                                   
#> [721] "::: {#cb27 .sourceCode}"                                                            
#> [722] "``` {.sourceCode .r}"                                                               
#> [723] "df <- starwars %>% select(name, height, mass)"                                      
#> [724] "```"                                                                                
#> [725] ":::"                                                                                
#> [726] ""                                                                                   
#> [727] "When we use `select()`, the bare column names stand for their own"                  
#> [728] "positions in the tibble. For `mutate()` on the other hand, column"                  
#> [729] "symbols represent the actual column vectors stored in the tibble."                  
#> [730] "Consider what happens if we give a string or a number to `mutate()`:"               
#> [731] ""                                                                                   
#> [732] "::: {#cb28 .sourceCode}"                                                            
#> [733] "``` {.sourceCode .r}"                                                               
#> [734] "mutate(df, \"height\", 2)"                                                          
#> [735] "#> # A tibble: 87 × 5"                                                              
#> [736] "#>   name           height  mass `\"height\"`   `2`"                                
#> [737] "#>   <chr>           <int> <dbl> <chr>      <dbl>"                                  
#> [738] "#> 1 Luke Skywalker    172    77 height         2"                                  
#> [739] "#> 2 C-3PO             167    75 height         2"                                  
#> [740] "#> 3 R2-D2              96    32 height         2"                                  
#> [741] "#> 4 Darth Vader       202   136 height         2"                                  
#> [742] "#> # ℹ 83 more rows"                                                                
#> [743] "```"                                                                                
#> [744] ":::"                                                                                
#> [745] ""                                                                                   
#> [746] "`mutate()` gets length-1 vectors that it interprets as new columns in"              
#> [747] "the data frame. These vectors are recycled so they match the number of"             
#> [748] "rows. That's why it doesn't make sense to supply expressions like"                  
#> [749] "`\"height\" + 10` to `mutate()`. This amounts to adding 10 to a string!"            
#> [750] "The correct expression is:"                                                         
#> [751] ""                                                                                   
#> [752] "::: {#cb29 .sourceCode}"                                                            
#> [753] "``` {.sourceCode .r}"                                                               
#> [754] "mutate(df, height + 10)"                                                            
#> [755] "#> # A tibble: 87 × 4"                                                              
#> [756] "#>   name           height  mass `height + 10`"                                     
#> [757] "#>   <chr>           <int> <dbl>         <dbl>"                                     
#> [758] "#> 1 Luke Skywalker    172    77           182"                                     
#> [759] "#> 2 C-3PO             167    75           177"                                     
#> [760] "#> 3 R2-D2              96    32           106"                                     
#> [761] "#> 4 Darth Vader       202   136           212"                                     
#> [762] "#> # ℹ 83 more rows"                                                                
#> [763] "```"                                                                                
#> [764] ":::"                                                                                
#> [765] ""                                                                                   
#> [766] "In the same way, you can unquote values from the context if these values"           
#> [767] "represent a valid column. They must be either length 1 (they then get"              
#> [768] "recycled) or have the same length as the number of rows. In the"                    
#> [769] "following example we create a new vector that we add to the data frame:"            
#> [770] ""                                                                                   
#> [771] "::: {#cb30 .sourceCode}"                                                            
#> [772] "``` {.sourceCode .r}"                                                               
#> [773] "var <- seq(1, nrow(df))"                                                            
#> [774] "mutate(df, new = var)"                                                              
#> [775] "#> # A tibble: 87 × 4"                                                              
#> [776] "#>   name           height  mass   new"                                             
#> [777] "#>   <chr>           <int> <dbl> <int>"                                             
#> [778] "#> 1 Luke Skywalker    172    77     1"                                             
#> [779] "#> 2 C-3PO             167    75     2"                                             
#> [780] "#> 3 R2-D2              96    32     3"                                             
#> [781] "#> 4 Darth Vader       202   136     4"                                             
#> [782] "#> # ℹ 83 more rows"                                                                
#> [783] "```"                                                                                
#> [784] ":::"                                                                                
#> [785] ""                                                                                   
#> [786] "A case in point is `group_by()`. While you might think it has select"               
#> [787] "semantics, it actually has mutate semantics. This is quite handy as it"             
#> [788] "allows to group by a modified column:"                                              
#> [789] ""                                                                                   
#> [790] "::: {#cb31 .sourceCode}"                                                            
#> [791] "``` {.sourceCode .r}"                                                               
#> [792] "group_by(starwars, sex)"                                                            
#> [793] "#> # A tibble: 87 × 14"                                                             
#> [794] "#> # Groups:   sex [5]"                                                             
#> [795] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [796] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [797] "#> 1 Luke Sky…    172    77 blond      fair       blue            19   male  mascu…"
#> [798] "#> 2 C-3PO        167    75 <NA>       gold       yellow         112   none  mascu…"
#> [799] "#> 3 R2-D2         96    32 <NA>       white, bl… red             33   none  mascu…"
#> [800] "#> 4 Darth Va…    202   136 none       white      yellow          41.9 male  mascu…"
#> [801] "#> # ℹ 83 more rows"                                                                
#> [802] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [803] "#> #   vehicles <list>, starships <list>"                                           
#> [804] "group_by(starwars, sex = as.factor(sex))"                                           
#> [805] "#> # A tibble: 87 × 14"                                                             
#> [806] "#> # Groups:   sex [5]"                                                             
#> [807] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [808] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <fct> <chr> "
#> [809] "#> 1 Luke Sky…    172    77 blond      fair       blue            19   male  mascu…"
#> [810] "#> 2 C-3PO        167    75 <NA>       gold       yellow         112   none  mascu…"
#> [811] "#> 3 R2-D2         96    32 <NA>       white, bl… red             33   none  mascu…"
#> [812] "#> 4 Darth Va…    202   136 none       white      yellow          41.9 male  mascu…"
#> [813] "#> # ℹ 83 more rows"                                                                
#> [814] "#> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [815] "#> #   vehicles <list>, starships <list>"                                           
#> [816] "group_by(starwars, height_binned = cut(height, 3))"                                 
#> [817] "#> # A tibble: 87 × 15"                                                             
#> [818] "#> # Groups:   height_binned [4]"                                                   
#> [819] "#>   name      height  mass hair_color skin_color eye_color birth_year sex   gender"
#> [820] "#>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> "
#> [821] "#> 1 Luke Sky…    172    77 blond      fair       blue            19   male  mascu…"
#> [822] "#> 2 C-3PO        167    75 <NA>       gold       yellow         112   none  mascu…"
#> [823] "#> 3 R2-D2         96    32 <NA>       white, bl… red             33   none  mascu…"
#> [824] "#> 4 Darth Va…    202   136 none       white      yellow          41.9 male  mascu…"
#> [825] "#> # ℹ 83 more rows"                                                                
#> [826] "#> # ℹ 6 more variables: homeworld <chr>, species <chr>, films <list>,"             
#> [827] "#> #   vehicles <list>, starships <list>, height_binned <fct>"                      
#> [828] "```"                                                                                
#> [829] ":::"                                                                                
#> [830] ""                                                                                   
#> [831] "This is why you can't supply a column name to `group_by()`. This amounts"           
#> [832] "to creating a new column containing the string recycled to the number of"           
#> [833] "rows:"                                                                              
#> [834] ""                                                                                   
#> [835] "::: {#cb32 .sourceCode}"                                                            
#> [836] "``` {.sourceCode .r}"                                                               
#> [837] "group_by(df, \"month\")"                                                            
#> [838] "#> # A tibble: 87 × 4"                                                              
#> [839] "#> # Groups:   \"month\" [1]"                                                       
#> [840] "#>   name           height  mass `\"month\"`"                                       
#> [841] "#>   <chr>           <int> <dbl> <chr>    "                                         
#> [842] "#> 1 Luke Skywalker    172    77 month    "                                         
#> [843] "#> 2 C-3PO             167    75 month    "                                         
#> [844] "#> 3 R2-D2              96    32 month    "                                         
#> [845] "#> 4 Darth Vader       202   136 month    "                                         
#> [846] "#> # ℹ 83 more rows"                                                                
#> [847] "```"                                                                                
#> [848] ":::"                                                                                
#> [849] ":::"                                                                                
#> [850] ":::"                                                                                

# Files ----
btw_this("./") # list files in the current working directory
#> [1] "| path | type | size | modification_time |\n|------|------|------|-------------------|\n| btw-package.html | file |  6.2K | 2025-03-10 17:34:39 |\n| btw.html | file | 13.52K | 2025-03-10 17:34:39 |\n| btw_register_tools.html | file | 10.55K | 2025-03-10 17:34:39 |\n| index.html | file |  7.02K | 2025-03-10 17:34:38 |"
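
# Rendering the listing ----
# The listing above is a single string with embedded "\n" characters, so it
# prints on one line. A minimal sketch, assuming only base R: writeLines()
# shows the markdown table row by row. `listing` is a hypothetical, shortened
# stand-in for the value returned by btw_this("./"), not real package output.
listing <- paste(
  "| path | type | size | modification_time |",
  "|------|------|------|-------------------|",
  "| index.html | file |  7.02K | 2025-03-10 17:34:38 |",
  sep = "\n"
)
writeLines(listing)
#> | path | type | size | modification_time |
#> |------|------|------|-------------------|
#> | index.html | file |  7.02K | 2025-03-10 17:34:38 |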